Podcast: Engineering Leadership: Balancing Autonomy, Growth, and Culture with Michael Gray

Article originally posted on InfoQ.

Transcript

Shane Hastie: Good day folks. This is Shane Hastie for the InfoQ Engineering Culture podcast. Today I’m sitting down with Michael Gray. Michael, welcome. Thanks for taking the time to talk to us today.

Michael Gray: Thanks for having me, Shane.

Shane Hastie: My normal starting point with these conversations is, who’s Michael?

Introductions [00:59]

Michael Gray: Who’s Michael? That’s a good question. I’m Michael. I’m a principal engineer at ClearBank. Been there for getting on for three years now, which makes me one of the veterans actually ’cause we’re only seven years old, so I’ve nearly been there for 50% of its lifetime. Before that I worked at Vanquis Bank, also in the finance sector, and before that I was in a completely different industry, which is security surveillance. So it’s quite a contrast and they’re quite different industries with very different challenges.

Yes, so that’s my work history. What do I enjoy? Software. I enjoy systems, thinking about the bigger picture, understanding the problems that systems have, helping to think about how we might solve some of those problems, hence why I’m in a principal engineer role. It’s very much the bigger picture for ClearBank and how everything joins together. I’m pretty passionate about engineering culture, which is why we’re here, I think, Shane, so we’re here to discuss that a bit. A safe environment for everybody to work in, how we can create that great culture people can excel in: I think that’s a huge part of successful engineering teams and companies that we don’t talk about enough. We focus a lot on the technology and not on the people and culture side.

Shane Hastie: As a principal engineer, what is your contribution to culture?

Contribution to culture [02:12]

Michael Gray: Quite a lot, I guess. We’re leaders in an organization, which very much means that we have to lead by example. How we behave is seen by most of the engineering floor, so we make sure we’re communicating clearly and safely, and making sure people feel heard. How we behave and interact influences how others do, so we very much hold ourselves to a high standard there. We also own processes, how we make decisions being one example that we’ve put in place, and I believe that’s the big contribution to culture: making sure that the right people are making the right decisions, that they feel empowered to make decisions themselves when appropriate, and that those kinds of things are really clear.

As a group, we do a lot of mentoring across ClearBank. If you’ve seen one of my previous talks, I talk quite passionately about mentoring, and I think it’s something we don’t do enough in technology at the moment. We’ve grown as an industry very rapidly over the last few years, which means we’ve had a lot more people coming in who’ve been promoted very quickly and maybe haven’t had the support they need to grow into the roles they’re currently in. So mentoring, I think, is a huge part of it, something the more senior folk in companies especially should be embracing more than they currently are. Yes, so a couple of flavors there, I guess.

Shane Hastie: So let’s dig into some of those things. The first one maybe we can talk about is decision-making. We want our engineering teams to be as autonomous as possible. We know from the motivation research that autonomy, mastery, and purpose are things that drive us in the knowledge worker environment, but we also have to, in ClearBank’s case, live within a regulated environment, and many organizations do have strict constraints. How do we balance that?

Balancing Autonomy and Regulation [04:09]

Michael Gray: Yes, it is a tough problem. Maybe I can take ClearBank back in time, from where it was to where we are now, and how we originally dealt with it versus how we deal with it now. When I joined nearly three years ago, we had a centralized function called architecture review, which was the senior folk in ClearBank coming together to discuss architectural decisions. ClearBank was smaller then, but even when I joined, that wasn’t scaling very well. How it was run was that we had minutes and they were stored in a wiki. That was kind of the extent of where we were at that point in time. I thought this was a bit of madness, really, because those people weren’t close enough to some of the problems we were discussing and making decisions on to really know the consequences of them. So we’ve shifted, in recent years, to what Andrew Harmel-Law has described as the Architecture Advice Process.

I’m sure a lot of people have seen his article on the ThoughtWorks site, and he’s got his book out, of which I reviewed the first couple of chapters, but then I was a bit rubbish, so I don’t think I’ve been asked to review the rest, to be fair, but that’s on me. The first two chapters were great. So we transitioned to the Architecture Advice Process, and that heavily leverages architecture decision records as the mechanism for storing decisions that are immutable; they’re a point in time. We’re quite strict about how well these are written, especially for decision scopes, which I’ll come onto, a layer we’ve put on top of this. But when you think about it from a regulator’s perspective, yes, the architecture review served the purpose, we had the minutes and we could record the decisions, but they missed a lot of the nuance and the detail. I actually think ADRs serve that purpose a lot more clearly than the minutes from the meetings did. So that’s great.
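
As a rough illustration of the format (a minimal sketch in the widely used Nygard style, not ClearBank’s actual template; every name and detail here is made up), an ADR records the context, the decision, and its consequences as a point-in-time document:

```text
ADR-042: Publish payment status changes as events

Status: Accepted (2024-03-01)

Context:
Downstream teams poll the payments API for status changes, which adds
load and delays notification. The change impacts multiple teams, so
under decision scopes this would be a domain-level decision.

Decision:
Publish payment status changes to the shared message bus; consumers
subscribe instead of polling.

Consequences:
+ Near-real-time notification and reduced API load.
- Consumers must handle duplicate and out-of-order events.

(ADRs are immutable: if this changes later, a new ADR supersedes it.)
```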

Another problem we had was that, because of this centralized function, people didn’t feel like they could make decisions, and when you’re a growing company, that’s a problem. Since I joined two and a half years ago, technology and product have grown from maybe 150 people to 310, something like that. It didn’t really scale very well at 150, never mind 310. People felt like they couldn’t make decisions, they needed to go to the centralized function, continuously looking upwards to somebody else to make decisions for them. So we introduced something called decision scopes, which are very simple; it’s all about impact: team, domain and enterprise. For team-based decisions, what we ask everybody to ask themselves is, does this decision just impact my team? If it does, it’s your decision to make: discuss it within your team and move forward with it. We don’t need to know, you know better than we do, but please write it down in an ADR, because then at least it’s recorded and auditable, and that keeps everybody happy.

Domain is how we’re organized at ClearBank. A couple of teams in a particular domain may be changing something to do with, I don’t know, the way we process payments, something slightly nuanced that may impact multiple teams, let’s put it that way. Again, it’s about impact: have a discussion with the people it’s going to impact in your domain and write it down. The third one is enterprise, and that’s a bigger deal. That’s where we do have a centralized forum, which we call the Architecture Advisory Forum. That’s where we create a quorum of everybody we need to approve these decisions: principal engineers, security, data. Who else have we got in there? Basically everybody that needs to be there so we can tick that box.

That forum serves two purposes, really. One, it means everybody can have input into these enterprise decisions, because it’s going to impact them, and I think that’s really important. Before, it was a closed forum; now anybody can attend the Architecture Advisory Forum if they like, so they can come in, have an opinion and feel like they’ve had input. And I truly believe that means these decisions have a bigger impact, because people feel like they’ve contributed; it was partly their decision, they’ve been consulted. The other side of the forum is, the clue’s in the name, architecture advisory. People can bring things they would like opinions on, should they want to. Sometimes that’s an opinion from the forum, but sometimes it’s just that we’ve got that quorum of people and you can find the right people you need to speak to, to move what you’re trying to achieve forward. So it also creates that open culture.

The final piece, which I mentioned earlier, is that anybody can attend. What I was really pleased to see, maybe a couple of months ago, was one of our directors of engineering encouraging one of our associate software engineers to present. We sponsored Code First Girls and took on a cohort, and an associate software engineer from that cohort, who had been at ClearBank for maybe 10 months and was pretty new to software and the industry, felt safe to share and present an enterprise decision that they wanted to be part of in that forum. I thought that was brilliant, and I thought it showed that the forum is safe and that people feel they can contribute. That was quite a lot, but that’s a summary of the topic.

Shane Hastie: Very clear decision scopes and clarification, and you’re talking about that Architecture Advisory Forum. You made a point there of the safety, the ability for very junior people to step up and talk there. How do you avoid this becoming a bureaucratic rubber stamp?

Finding the balance between architectural advice and overburdensome bureaucracy [09:25]

Michael Gray: So we have had challenges with that. I spoke about it at QCon because somebody in the audience asked me this very awkward question, because it’s a tough problem. For example, we have a problem at the moment with too many services for the number of people, if we’re quite honest, that kind of microservices evolution where everybody said everything needs to be 100 lines. We suffer from a bit of that and we’re trying to bring it back together. But that’s quite a contentious point, and it’s something we’ve not solved through the forum because we couldn’t get everybody to agree. And I think this one depends, because for that decision to make an impact, you have to get buy-in. Because it didn’t get approved and agreed by everybody, we probably wouldn’t have had enough buy-in anyway, so we’ve taken a different tack: more of an influencing approach in the particular areas and teams where we think we’re going to have the most impact. As a group of principal engineers, we’ve done that.

In other cases, there are decisions that we’re going to have to dictate. We’re a bank, so sometimes it’s regulatory stuff that we need to do, and therefore some decisions we just have to make. That’s maybe a little bit of a challenge we’ve got now; it’s not super clear. Normally it ends up with us having to go there and say, “We need to do this because of X. This is for information and not for debate”. To be fair, when we have done that, people do still give input and quite often actually suggest improvements, but it’s maybe a bit of a gray area and one of the challenges with this way of working. I still think the benefits outweigh that.

Shane Hastie: Yes, thank you for that. As you say, these are sometimes gray and difficult areas that are not easy to figure out, and I think our audience will appreciate the openness there that hey, we haven’t really got it all working perfectly, but it’s better than it could be. So thank you for that. Another thing that we were chatting about before we started recording was the pressure for efficiency. What’s happening in the industry today? So stepping outside of ClearBank but looking at the bigger picture.

Industry pressure on efficiency compromising quality and continuous improvement [11:31]

Michael Gray: So there’s a general push in the industry; it’s all about doing more with less. I think that’s the phrase a lot of people are using. And you see the big tech companies: they’re making layoffs and their share prices are going up, okay? But I think they were in a different position to most, because they all purposefully over-hired, because they wanted to stop their competitors having the talent in the market. That’s why they also paid so much.

But doing more with less, generally, is feeling like a squeeze on teams, and it’s all about value, efficiency. There’s a pressure that teams are becoming feature factories, and from my perspective, what that’s doing in the industry is squeezing teams, which means they don’t have space to continuously improve. It’s all about the next feature, the next feature, the next feature. And I think that’s pretty negative, to be honest with you, and I think we’re going to feel the pain of that in maybe three or four years’ time, because we’ve stopped continuously improving, because we’ve maybe got investors that need to see that we’re being super efficient, that kind of thing.

Yes, and we’ve removed the space for people to improve, or just refactor a bit of code, or update a little bit of documentation, or make that call about the process that feels a little bit broken that they want to do something about. While we’ve got this pressure, I just think it’s having a negative impact in the longer term.

Shane Hastie: So how do we resolve it? How do we, shall we say, push back? How do we challenge this?

Michael Gray: Yes, it’s interesting. I think it depends. In a regulated industry, there are plenty of frameworks that you can use to your advantage. There are risk frameworks: if there are risks, and you’re regulated, and something could cause detriment or maybe be reportable to a regulator, that’s definitely something you can use to your advantage. Make it really clear why certain things need to be done. Another side is to leverage the processes that you’ve got that maybe you felt you didn’t need to use in the past. The other thing is, I think leadership has a big role to play in this, and that’s leading by example. If it looks like we, as leaders, are absolutely flat out and don’t have time to do anything, then people are going to feel like they should be operating in the same kind of way. So maybe even if you are completely flat out, make sure it’s not perceived that way by the rest of the engineering floor. Make sure you’re preaching that culture of continuous improvement and encouraging it.

There are certainly things we as leadership need to make sure we’re doing: influencing the right stakeholders, influencing senior people, potentially the C-suite in the organization, to make sure they understand the value of this continuous improvement mindset. I think about ClearBank: ClearBank has got to where it is today because of the continuous improvement mindset, and it’s really important we don’t lose it. We’ve created space to build a really solid technology platform. Why do customers come to us? Because we never go down and we have a really solid technology platform compared to our competitors. That’s because we’ve continuously improved it and not needed to ask for permission to do so. So those are a few things I think we can do, but we do also have to acknowledge the macroeconomic climate is what it is, that sometimes we’re going to have to compromise on our ideals and pick our battles, and make sure we’re improving the bits we absolutely think we need to versus the bits that are going to have less impact.

Shane Hastie: You made the point about ClearBank growing from, you said, about 150 people when you joined nearly three years ago to over 300 today. What’s the impact of that sort of growth, and how do you maintain the culture that you want through this growth?

Maintaining culture during rapid growth [15:16]

Michael Gray: Yes, in my QCon talk, I talked about powerboats and alignment. That was my metaphor, powerboats versus an oil tanker. Really it was in reference to incumbent banks versus ClearBank, why we had an advantage against them, and how we try to avoid becoming an oil tanker and moving slowly. For us, part of it is about clear boundaries and ownership: what people own, and therefore can make decisions on and make changes to. So we try to make sure that’s really clear. Interactions between those boundaries are something we continuously focus on. The more awkward the interactions between those boundaries, the less autonomy teams have to move quickly and make good decisions, so that’s certainly an area principal engineers focus on, because we get that wider picture of the organization.

To maintain the culture, the answer is it takes more effort. You have to put work into it. When everybody’s in an office together, it happens more naturally, but we’ve been through COVID and a lot of us are actually remote now versus being in an office together. We have to put more effort into making sure we’re continuously learning and creating the environment for it. One of the things we have (Sophie Western, among a few other people at ClearBank, does a brilliant job running it, to be honest) is our own internal meetup that happens every Friday. It’s called Tech It Easy, and again, similar to the Architecture Advisory Forum, anyone can come, anyone can talk, whether it’s for five minutes or 10 minutes or anything. And it doesn’t have to be on tech, it doesn’t have to be on ClearBank. That’s, again, a safe space that’s been created that anybody can go to and spend some time sharing and discussing something they’re interested in.

Maintaining that culture: we talked about the Architecture Advice Process, and I think that’s about keeping decisions as close as possible to the people who know the decisions and their impacts best. And leading by example; I spoke about that earlier as well.

Shane Hastie: How do we grow our engineering talent, within organizations but also as an industry?

Growing talent as an industry [17:16]

Michael Gray: Earlier we talked about the current economy, whereby it’s do more with less, and previously it was grow, grow, grow. So I think we’re in an interesting situation at the moment, where we’ve grown, grown, grown, and we’ve got a lot of people that are fairly inexperienced in the industry, if we’re honest, because they’ve probably been in the industry five, eight years maybe. So I think there are a lot of missing skills at the moment. So what do we need to do to grow that? I talked about mentoring earlier; 100% we need to be doing more of that, especially from the more experienced people. I’ve always found mentoring to be quite a high-impact thing to be doing, because if you think about it, while you’re growing somebody, that somebody tends to be in another team. If you grow them, they’re in a team with other people, and they’re likely going to be sharing their learnings with the team they’re in.

Selfishly, I’ve also found it quite a good way to influence and have impact in an organization: upskill that person, and suddenly how their team tests and changes the software becomes better. It makes them ask the right questions. So while some people may perceive mentoring as a one-on-one thing, I think it has a much wider impact than just the individual. And we’ve talked about creating safe spaces to grow and learn, definitely. Psychological safety, I think, has been a theme through what I’ve been saying in this podcast so far.

Shane Hastie: So you’ve been in the industry a while. You have got into the role of a principal engineer. What advice would you have for yourself, five, six, seven years ago?

Advice for aspiring principal engineers [18:59]

Michael Gray: I did a talk on this actually, which was roughly a theme of me with my head in my hands in frustration versus how I’ve learned to navigate people and organizations over time. I think my biggest piece of advice, and somebody once said this to me, would be: not everybody sees the world the same way as you. I was fairly set in my ways, and I think that’s quite true of a lot of engineers. You have your perspective, and it can be quite black and white. Then you start realizing there’s a bigger world out there, and it’s not just about technology; it’s about where the company wants to invest, where the company wants to grow, your customers. It’s not just about technical perfection. There are so many different inputs into the decisions we need to make. Becoming aware of those earlier in your career, I think, makes you a much more rounded individual and much more valuable to your company.

And another thing I talked about in that talk was timing and when to challenge things. Earlier in my career, I would challenge everything all the time. That’s wrong. “I’ve been to a conference, we should be doing this”. “Well, this context is important”. “Why? Why are we doing it this way?” The other thing is timing: if I challenge it now, are people at 100%, are they likely to hear me? Is it going to make an impact if I challenge it at this point in time? Is there a pause in the project? That means we’re going to have a retrospective, and you can ask a question there rather than saying, “I think this is wrong, why don’t we try …?” So there are different ways of introducing ideas a little more softly, rather than being set in your ways and telling people stuff’s wrong, that kind of thing. That doesn’t have an impact, believe it or not, and you lose a lot of your influence and just become a bit of white noise. Definitely that.

Being more patient is another one. I think what I’ve just described is a symptom of not being very patient and not accepting that change takes time, because it does. So you need to set your horizons a little bit longer than tomorrow. Yes, I think that’s a summary of the biggest piece of feedback I would’ve given myself.

Shane Hastie: Mike, some really good advice. Advice to your younger self, advice for our audience, a whole lot of stuff to think about here. If people do want to continue the conversation, where do they find you?

Michael Gray: So you can find me on LinkedIn, and also on Twitter, or X as it’s now called, @mikegraycodes. And you can find me on Mastodon as well; there I’m just @mikegray. As for the “codes” part: I don’t actually write that much code anymore as a principal engineer, I do a bit but not that much, so I’ve dropped the codes part of the handle there.

Shane Hastie: Thank you so much for taking the time to talk to us today, Mike.

Michael Gray: No problem. Thanks for having me, Shane.


Leveraging the Transformer Architecture for Music Recommendation on YouTube

Article originally posted on InfoQ.

Google has described an approach to using transformer models, the architecture that ignited the current generative AI boom, for music recommendation. The approach, currently being applied experimentally on YouTube, aims to build a recommender that can understand sequences of user actions while listening to music, to better predict user preferences based on their context.

A recommender leverages the information conveyed by different user actions, such as listening to, skipping, or liking a piece, which is then used to recommend items the user is likely to be interested in.

A typical scenario where current music recommenders fail, say Google researchers, is when a user’s context changes, e.g., from home listening to gym listening. This context change can produce a shift in their music preferences towards a different genre or rhythm, e.g., from relaxing to upbeat music. Trying to take such contextual changes into account makes the task of recommendation systems much harder, since they need to understand user actions in the user’s current context.

This is where the transformer architecture may help, they believe, since it is especially suited to making sense of sequences of input data, as shown in NLP and, more generally, by large language models (LLMs). Google researchers are confident that the transformer architecture may show the same ability to make sense of sequences of user actions, based on the user’s context, as it does of language.

The self-attention layers capture the relationship between words of text in a sentence, which suggests that they might be able to resolve the relationship between user actions as well. The attention layers in transformers learn attention weights between the pieces of input (tokens), which are akin to word relationships in the input sentence.
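
For intuition only, here is a minimal NumPy sketch of scaled dot-product self-attention, the generic mechanism those layers implement. This is not Google’s code, and the matrices below are random stand-ins for learned weights:

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention over a sequence of token embeddings.

    X: (seq_len, d_model) token embeddings, e.g. one row per user action.
    Wq, Wk, Wv: projection matrices, (d_model, d_k), learned in a real model.
    """
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])          # pairwise relevance of tokens
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax: the attention weights
    return weights @ V                               # context-aware token vectors

# Toy usage: 5 "actions", 8-dimensional embeddings
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 8))
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)  # shape (5, 8)
```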

Google researchers aim to adapt the transformer architecture from generative models to understanding sequential user actions based on the current user context. This understanding is then blended with personalized ranking models to produce a recommendation. To explain how user actions may have different meanings depending on the context, the researchers depict a user listening to music at the gym who might prefer more upbeat music. They would normally skip that kind of music when at home, so the skip action should get a lower attention weight when they are at the gym. In other words, the recommender applies different attention weights depending on the user’s current context rather than only their global listening history.

We still utilize their previous music listening, while recommending upbeat music that is close to their usual music listening. In effect, we are learning which previous actions are relevant in the current task of ranking music, and which actions are irrelevant.

As a short summary of how it works, Google’s transformer-based recommender follows the typical structure of a recommendation system and comprises three phases: retrieving items from a corpus or library, ranking them based on user actions, and filtering them to show a reduced selection to the user. While ranking items, the system combines a transformer with an existing ranking model. Each track is associated with a vector called a track embedding, which is used both by the transformer and by the ranking model. Signals associated with user actions and track metadata are projected onto vectors of the same length, so they can be manipulated just like track embeddings. For example, when providing inputs to the transformer, the user-action embedding and the music-track embedding are simply added together to generate a token. Finally, the output of the transformer is combined with that of the ranking model using a multi-layer neural network.
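
To make that structure concrete, here is a rough PyTorch sketch of the ranking step as described: tokens formed by adding track and action embeddings, a transformer encoder over the action sequence, and a small MLP that blends the transformer output with the existing ranker’s score. All names, dimensions, and layer choices are illustrative assumptions, not Google’s implementation:

```python
import torch
import torch.nn as nn

class TransformerRanker(nn.Module):
    """Illustrative sketch: token = track embedding + action embedding;
    transformer output blended with an existing ranking score via an MLP."""

    def __init__(self, num_tracks, num_actions, d_model=64):
        super().__init__()
        self.track_emb = nn.Embedding(num_tracks, d_model)
        self.action_emb = nn.Embedding(num_actions, d_model)  # listen/skip/like...
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        # Blend the transformer summary with the existing ranker's scalar score
        self.blend = nn.Sequential(
            nn.Linear(d_model + 1, 32), nn.ReLU(), nn.Linear(32, 1)
        )

    def forward(self, track_ids, action_ids, ranker_score):
        # (batch, seq) -> (batch, seq, d_model); adding embeddings forms tokens
        tokens = self.track_emb(track_ids) + self.action_emb(action_ids)
        encoded = self.encoder(tokens)   # attention over the action history
        summary = encoded.mean(dim=1)    # pool the sequence to one vector
        return self.blend(torch.cat([summary, ranker_score], dim=-1))

model = TransformerRanker(num_tracks=10_000, num_actions=4)
tracks = torch.randint(0, 10_000, (2, 20))  # 2 users, 20 recent tracks each
actions = torch.randint(0, 4, (2, 20))
score = torch.rand(2, 1)                    # existing ranking model's output
print(model(tracks, actions, score).shape)  # torch.Size([2, 1])
```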

According to Google’s researchers, initial experiments show an improvement in the recommender, measured as a reduction in skip rate and an increase in the time users spend listening to music.


Presentation: Everything Is a Plugin: How the Backstage Architecture Helps Platform Teams at Spotify and Beyond Spread Ownership and Deliver Value

Article originally posted on InfoQ.

Transcript

Nilsson: I am Pia. I work at Spotify. I lead developer experience at Spotify, as well as leading Backstage since its inception in 2017. I joined Spotify in 2016, having been an engineer for 14 years. I joined as an engineering manager for one of the teams there in the platform organization. I was so excited to join Spotify. I was very impressed with the autonomous engineering culture, thrilled to work in that exciting, world-leading audio space that Spotify is in.

This is the reason I’m on this stage: I have never struggled so much in my entire life to add value quickly. Leading the Backstage team, which of course didn’t exist when I joined, has been a healing journey for me as well, because Backstage is trying to solve many of the challenges that I personally struggled with in the beginning.

Lewis: My name is Mike Lewis. I am the tech lead for Backstage at Spotify, which means I get to work with the amazing team that we have at Spotify working on Backstage. I get to think about Backstage all day, which is so fun. I’ve been at Spotify about 5-and-a-half years now. When I joined Spotify, I was working in the premium mission, working on things related to spotify.com, and checkout, things like that. I was using Backstage every day and seeing the value that we get from Backstage at Spotify. When the opportunity came up to join the Backstage team and start working on it myself, I jumped at the chance.

Nilsson: We are here to speak to you about how we use our developer portal Backstage’s plugin architecture to change the ways of working for our 3,000 engineers. I think just that sentence is important, at least to me. It’s not all about technology, although that’s the heart of it. It’s technology in order to change the ways of working in a meaningful way.

Backstage Journey

Before we get into the plugin architecture, and why it matters so much to us, I think it’s important for you to know a little about what kind of challenges we were facing at Spotify back in 2017, when we were really starting to talk about these problems. These are actual clouds that I took from our ecosystem. Imagine them a little smaller back in 2017, but you can understand it was a similar challenge.

This is what was facing any engineer joining Spotify in terms of scale. Every single dot here is a component. You see the backend services. Every single dot is an actual component. Every single line is a component using another component. This is the scale that you are meeting today at Spotify. Back in 2017, it was starting to hurt our productivity quite a bit. All of our productivity metrics were trending in the wrong direction. One of the metrics we were measuring is number of days it takes for your 10th pull request, if you are a newly onboarded person.

That’s just one of the crude metrics we were using. We were up to 60 days, which absolutely wasn’t our target number. As I said, those were trending in the wrong direction. Adding to this complexity and scale, I’m sure many of you have heard about Conway’s Law: systems tend to look like the org chart they were created in. At Spotify, we have this autonomous engineering culture, which is just beautiful. It’s just wonderful to work in. People are very excited. They are passionate. They own what they build. It’s lovely.

The backside of that is that people were also, back in 2017, expected to deploy their code, for example, and do all kinds of infra for their code all by themselves. Many of these data endpoints and backend services were created in whatever way that particular team felt was the right thing to do. Entering into that scene as I was doing, that’s what made me struggle so much. I could never extrapolate a learning from one place to the next because it was entirely different. Documentation was in different places. Services were written in different languages, and of course, different libraries didn’t work together. It was like the Wild West, but in a very loving way.

Mike and I, we worked for the platform organization. Of course, many of you know, platform organizations, our job is to fix this ecosystem for the engineers that are building the actual business value, so that they can do it much faster and with more happiness, so that it becomes greater and faster for the company. We were thinking hard about this particular problem, what are we going to do to help our engineers through this Wild West place they are in? I happened to lead, at that time, this little team that owned the backend component catalog. It was a tiny system.

Only the backend engineers cared about it, nobody else knew about it. They owned this backend component catalog. Having that reality, and being faced with this problem, I believe very much was the embryo of why we realized: ok, we’re seeing this problem of scale and complexity for all engineering disciplines. However, the backend engineers are just slightly happier, because they have this component catalog, at least. What if we actually create the catalog for everyone? That’s the little embryo of why we started the Backstage idea.

I think, over the quarters, we were doing all of these engineering surveys. The top two problems here were being surfaced to us. Engineers were calling out their top productivity blockers: difficulty finding things, and context switching. People were very aware of these two problems, and they are connected. As probably many of you understand, if you have problems finding things, you’ve got to pull people into a meeting, you’ve got to Slack them, you’ve got to tap them on the shoulder. Then I will do that to you, and you will do that to me, and all the other seven teams I’m integrating with. You can imagine the amount of meetings our engineers were running around in, trying to be helpful all the time. That we bundled into the world of context switching; that’s what it means in our context. These were the top two problems, according to the engineering surveys we were running.

Then the platform organization, we thought, there is actually a third problem as well. If you remember these clouds, we were so unable to help engineers. We were seeing it. We were seeing our metrics trending in the wrong way. We were seeing people couldn’t be productive within 60 days of onboarding. We couldn’t do that much, because of all these dots, where there were no standards. Naively, Spotify had the opinion that standards are boring, and the opposite of freedom. In our little team, since we were feeling this pain, we created this little slogan saying, standards set you free. Then we rode around on horses here at Spotify, like, “Standards set you free”. Which is a little bit of a joyous way of saying: no, actually, if you don’t have standards for your software ecosystem, you’re totally tied down to all the boring stuff.

Because you are going to have to set up your CI/CD deployment systems. You are going to build your build pipelines. There are going to be no templates. You’re going to invent it all over again, across 500 teams. Of course, we were seeing a lot of duplication everywhere. That’s the third productivity blocker that we were discussing. Simply put, this is what Spotify was looking for a solution for. Also, for all of you who are contemplating using a developer portal, this is what we used with our upper management to help them understand what we were trying to solve. We’re trying to solve speed, scale, and chaos control. People can relate to that, because they see that it is a little bit of chaos.

Backstage – A Single Pane of Glass

Here it is. This is what Backstage looks like internally for us, what it is: a single pane of glass for your infrastructure, that’s what it is. Simply put, it’s a homepage. Even simpler than that, it’s a catalog for your infrastructure, all of it. All of you, I’m sure, know that any system that some kind of platform organization, like what I’m representing, puts together and offers to our engineers internally is only as useful as its adoption. It can be as beautiful as can be, but if three teams use it, it’s just very expensive.

How do we make sure that Backstage was used? That’s where infrastructure as code comes into play. The metadata on each component needs to be in the repositories. Hence, the ownership of that metadata needs to be transferred to the owning team of that component. I think that’s a small but important engineering best practice that I’m sure all of you know about. That is the basis for why I believe our catalog happened to actually stay relevant, stay useful to our engineering population. Then, of course, we want Backstage to be more than only components. We want to add more kinds of functionality, such as measuring standards across the fleet. There are so many: monitoring, CI/CD. In order to do that, that’s where the plugin ecosystem comes in.
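To make that practice concrete, here is a hedged sketch of the kind of per-component metadata Backstage reads. In practice this lives as a catalog-info.yaml file in the component’s own repository; it is shown here as the equivalent typed object, and the component and team names are made up for illustration:

```typescript
import { Entity } from '@backstage/catalog-model';

// Typed equivalent of a component's catalog-info.yaml descriptor. Because
// this descriptor lives in the component's repository, the owning team also
// owns the metadata: infrastructure as code.
const playlistService: Entity = {
  apiVersion: 'backstage.io/v1alpha1',
  kind: 'Component',
  metadata: {
    name: 'playlist-service', // hypothetical component name
    description: 'Serves playlist data to clients',
  },
  spec: {
    type: 'service',
    owner: 'team-playlists', // hypothetical owning team
    lifecycle: 'production',
  },
};

export default playlistService;
```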

Key Backstage Engineering Best Practices

I wanted to call out these two engineering best practices, for anyone trying to figure out why Backstage was really successful for Spotify. Of course, we have been asking ourselves that question for very many years. I think it’s as simple as this: infrastructure as code. Then, for today’s talk, we’re going to focus on the second one, which is the distributed code ownership of the plugin architecture. Extensibility through the plugin architecture is what enables this distributed code ownership.

That is the key: decentralize the decision making to the team actually owning the expertise. Instead of having, in our example, my little team that builds the Backstage developer portal trying to figure out how to build all of these new functionalities into the portal so that it would become meaningful for all of the 3000 engineers. It goes without saying that that can never happen, that will never scale. A plugin architecture is a must for us.

Backstage Extensibility Architecture

Now we’re going to deep dive into the plugin architecture structure, to give you a broad view of what it is.

Lewis: Extensibility is just so important for Backstage, and it’s important at Spotify, and it’s important elsewhere, too, because Backstage is open source. It’s used by thousands of companies around the world. It’s important in both of those contexts. I want to tell you a little bit about how extensibility works in Backstage and how it’s changed over the years, and what we’ve learned along the way. First, I think it’s important that we cover the high-level architecture of Backstage. How many folks are familiar with the full stack web architecture of building Node.js, React web apps, just JavaScript on the frontend? I’ll try and keep it fairly generic. At a high level, Backstage is a web based React frontend. There is a horizontally scalable backend written in Node.js. That backend is talking to some infrastructure, a database, potentially a cache.

Then the backend is also talking to some third-party APIs as well. Those lines of communication between those things are usually HTTP, although other things can be supported at a plugin level, too. There are some logos there to represent this tech stack. In that context, what’s a plugin? A plugin is just a bundle of TypeScript code that’s been compiled into JavaScript and published to npm for public packages, or kept private for adopters that don’t want to share the package; if it’s just for internal use, there’s no need to publish it. It’s generally for use in either the frontend or the backend, although sometimes there are isomorphic packages in the mix too. The standard is frontend or backend.
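As a minimal sketch of what such a bundle can look like, here is a toy backend plugin written against Backstage’s newer backend system (anticipating the framework services described below). The plugin id and route are hypothetical, and real plugins are of course larger:

```typescript
import {
  coreServices,
  createBackendPlugin,
} from '@backstage/backend-plugin-api';
import Router from 'express-promise-router';

// A backend plugin is a TypeScript package that registers itself with the
// framework and declares which core services it depends on.
export const greetingPlugin = createBackendPlugin({
  pluginId: 'greeting', // hypothetical plugin id
  register(env) {
    env.registerInit({
      deps: {
        httpRouter: coreServices.httpRouter,
        logger: coreServices.logger,
      },
      async init({ httpRouter, logger }) {
        const router = Router();
        router.get('/hello', (_req, res) => res.json({ greeting: 'hello' }));
        httpRouter.use(router); // mounted under this plugin's route prefix
        logger.info('greeting plugin initialized');
      },
    });
  },
});
```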

For those who are familiar with full stack web development, there’s maybe an argument to be made that we’ve done it at that point. React in the frontend and Express in the backend are already pretty extensible, or at least composable. You can render components from other places. You can bring middlewares into your Express app, and you’ve done it. You’ve extended your app with new backend functionality via the middlewares, and new frontend functionality with your React components.

Extensibility by default is not enough for a couple of reasons. The first is, if you’re building an extension in a model where all you have is React on the frontend and Express in the backend, that’s hard work. There are no shared solutions for the things that you need to get started quickly and get working building on your plugin. There are also many more decisions to make. Every time you need to decide how you’re going to manage your database connection, or logging, or any of those things, you’ve got to start from scratch and make a decision. It’s more cognitive load. It slows you down, so you’re less efficient.

Also, the results, the plugins that get built in that ecosystem are less consistent, which is bad for people using them. That’s on the plugin builder side. What about adopters, people using Backstage? In that world, Backstage adopters have a lot of fiddly wiring to do. They have to wire everything up themselves in the backend to run the middleware in the Express app and provide it with its dependencies. By the same token in the frontend, they need to render the React components at the right time and provide those dependencies too. It’s a lot of work.

I think of it like this: what we want in a plugin system is a power plug that you just plug in, and it’s done, you’re finished. What we had instead is like having to wire a plug yourself. It’s a lot more work, and it just is not efficient to work in that way. The last thing to mention here, actually, is that we want to encourage contributions. Backstage is an open platform. We want people to contribute plugins and plugin modules to that platform. If it’s hard to build plugins and hard to build plugin modules, then that’s discouraging people from doing that.

How do we start encouraging them? The first way is, I think of it like a tool library. If you haven’t encountered one of these in the wild, outside the world of software, a tool library is a place where you can go and borrow the tools that you need to do a task in the real world. I’m not talking about software here, I’m talking about actual saws and drills and things. This means you don’t have to own every single tool that you need to do DIY things at home, you can just go and borrow them. You don’t have to figure out which one’s the best one, you just take the one that they’ve got, use it. When you’re done with it, you bring it back.

This analogy that I’m trying to draw here is between a tool library in the physical world, and the Backstage tool library, which is something we’ve built: a collection of core services in the backend and the frontend that provides you with a bunch of capabilities that you can leverage to do things more efficiently. To get stuff done. To bake in some sensible decisions so you can just get productive quickly building your plugin. I’m not going to go through this whole list. Just to give an example, I think database management is a really cool system in Backstage, or as cool as a database management system can be. The way that it works is a plugin owner gets a connection to a database. They don’t know anything about where that database has come from; it’s just been configured by the adopter of Backstage.

Actually, behind the scenes, adopters can configure a single database and share that connection with all different plugins, and they all get access to a database within that single database instance. Or, each plugin can have its own database. That system of configuration and separation of databases is entirely abstracted away from plugin owners. All they have to do is just get the database connection and use it, and job done. That’s the tool library. That’s the first way that we are making life easier for people building extensions or people using them.
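A hedged sketch of what this looks like from a plugin’s point of view, assuming the newer backend system; the plugin id and table name are made up for illustration:

```typescript
import {
  coreServices,
  createBackendPlugin,
} from '@backstage/backend-plugin-api';

export const visitsPlugin = createBackendPlugin({
  pluginId: 'visits', // hypothetical plugin id
  register(env) {
    env.registerInit({
      deps: { database: coreServices.database },
      async init({ database }) {
        // The framework hands the plugin a ready-configured connection.
        // Whether it comes from one shared database instance or a dedicated
        // one is the adopter's choice and invisible here: getClient()
        // returns a Knex client scoped to this plugin's logical database.
        const knex = await database.getClient();
        if (!(await knex.schema.hasTable('visits'))) {
          await knex.schema.createTable('visits', table => {
            table.string('path').primary();
            table.integer('count').notNullable().defaultTo(0);
          });
        }
      },
    });
  },
});
```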

The second way is more focused on adopters. This screenshot is showing to an extent the old world of Backstage, the way things used to be when you were adopting plugins as an adopter. You’ll see a lot of documentation pages like this if you look on the Backstage docs today. What you’ve got here is like that plug wiring analogy that I talked about. You have to pull in specific lines of code to your backend instance, and put them in in the right place to add the middleware and make everything work.

When plugins do different things, you have to put subtly different lines of code in. When things change, you have to adapt your code to address those changes. That’s a lot of work. It’s like wiring a plug every time. Over the last year, we’ve been working towards a solution, in particular, the maintainers have been working towards a solution called declarative integration, which takes away the need for this manual wiring, and instead makes it possible to install a plugin or a plugin module just by adding the package via yarn, which is the standard package manager for JavaScript code bases.

This is a solution that’s in alpha right now, particularly immature in the frontend, but also in the backend, it’s still pretty new. We’re not recommending folks migrate to this yet. It’s under active development, and it’s really adding a lot of value in the ecosystem. We’re going to show you a little bit more about that. I’ll be doing a bit more of that during the demo.
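Under that declarative model, the adopter’s backend entry point can shrink to something like the following sketch. The packages shown are real Backstage packages, but given the alpha status the exact wiring should be treated as illustrative:

```typescript
// packages/backend/src/index.ts
import { createBackend } from '@backstage/backend-defaults';

const backend = createBackend();

// Each feature is installed by adding its package; no manual wiring of
// routers, middleware, or dependencies is required.
backend.add(import('@backstage/plugin-app-backend'));
backend.add(import('@backstage/plugin-catalog-backend'));

backend.start();
```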

What’s next? I want to talk a bit about the mental model for Backstage, because it’s not as simple as core frameworks, and then some plugins that sit on top of the core frameworks. I think what’s so powerful about this extensibility model, and what I’ve thought was so cool, since I started working on Backstage is this idea of nested extensibility. We don’t just have a core framework that plugins sit on top of and can extend. Instead, we’ve built a system that allows the plugin owners individually to allow other people to extend their plugins with plugin modules.

An example of the way this is really powerful is if you have a generic plugin, which is providing some shared functionality in Backstage, you can have plugin modules, which offer the direct connections to specific upstream APIs. For example, you might have a system that pulls in org data to the Backstage catalog that Pia mentioned earlier, and then a plugin module that knows how to fetch that org data specifically from an LDAP server. Adopters can write their own ones of those to pull org data from their own custom org data providers. We can provide additional ones in open source to support whatever integrations people need to use.

A key concept with all of this is the notion of importable points of extension. I’ll show some code for this on the next slide, particularly to cover the importable bit. I want to just cover this example to talk about how extensibility works in this nested model. You’ll see on the left-hand side, we’ve got the core framework, that foundational bit from the previous diagram, and just an example core service, the HTTP router in the backend, which is the thing that routes requests coming from the frontend to different parts of the backend, different middlewares. We’ve got some arrows here, that’s pointing out the fact that the core framework is providing an HTTP router as a point of extensibility. The catalog plugin in this case, is extending that point of extensibility with a specific middleware. All of this is happening between the plugin and the framework without any interaction from an adopter.

That’s the bottom layer; going up to the middle layer of plugins, that same thing is replicated between plugins and plugin modules. An individual plugin like the catalog can export an extension point. In this case, the example is the catalog processing extension point, which controls how entities are processed as they come in from sources of data. Plugin modules can add additional ones of those via that extension point. This extensibility is nested. That’s where I think the power really comes from. I said I’d show some code that corresponds to this. Same topic, importable points of extension.

This is heavily abbreviated and simplified code. What’s happening is we’re importing the core services extension points from the core framework in the top section of code. We’re grabbing the HTTP router, and we’re adding our plugin’s middleware to that router. By the same token, in the bottom slice of code, we’re importing the catalog processing extension point from the catalog plugin. Then we’re using that to add an entity provider which provides us with the entity data that we need.
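In the same abbreviated spirit as the slide, a module extending the catalog’s processing extension point might look roughly like this. The planets provider is a hypothetical stand-in, and the alpha import path may differ across Backstage releases:

```typescript
import { createBackendModule } from '@backstage/backend-plugin-api';
import { catalogProcessingExtensionPoint } from '@backstage/plugin-catalog-node/alpha';
import type {
  EntityProvider,
  EntityProviderConnection,
} from '@backstage/plugin-catalog-node';

// Hypothetical entity provider, reduced to a stub for illustration.
class PlanetsEntityProvider implements EntityProvider {
  private connection?: EntityProviderConnection;

  getProviderName() {
    return 'planets';
  }

  async connect(connection: EntityProviderConnection) {
    this.connection = connection;
    // A real provider would now schedule fetches from its upstream API and
    // apply the resulting entities via this.connection.applyMutation().
  }
}

// Nested extensibility: this module extends the catalog plugin through the
// extension point that the catalog itself exports.
export const catalogModulePlanets = createBackendModule({
  pluginId: 'catalog', // the plugin being extended
  moduleId: 'planets', // hypothetical module id
  register(env) {
    env.registerInit({
      deps: { catalog: catalogProcessingExtensionPoint },
      async init({ catalog }) {
        catalog.addEntityProvider(new PlanetsEntityProvider());
      },
    });
  },
});
```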

Extensibility Examples in Backstage (Access Control)

I’m going to cover just one example of extensibility in Backstage today. This is a concrete example of how extensibility works, in one particular case. The case I want to talk about is access control, which is a relatively new system in Backstage. Access control or authorization. You might have heard or seen this notion before, that access control as a concept is the product of decision making and enforcement. Decision making is whether a given user can perform a given operation or access a given resource.

Enforcement is ensuring that the system operates within that constraint, the constraint represented by the decision. If you have those two things, you’re done with access control. That’s the whole access control job done. When we think about that model, how can we find the right point to introduce extensibility? I think the first thing to think about is, who’s responsible for each of those bits? In this case, I’m asserting that the individual plugins are responsible for enforcement. Because Backstage is so extensible and so generic, and we have no control over how plugins manage their resources or operations, that enforcement part has to rest with plugin owners. They have to decide how to enforce the access control restrictions that exist. Conversely, the decision rests with Backstage adopters who own Backstage instances, because they may have very different requirements, from instance to instance, about how access control is managed.

Some adopters might have strict regulatory requirements that limit which users can see which entities. Conversely, other adopters might have much more transparent cultures where they want everyone to see everything. The framework there is just making the point that the Backstage framework is stitching that all together. Given that model, and I think you probably see what’s coming, where the extensibility lies is with the decision. We want the enforcement to be the same every time, every plugin. We want plugin owners to implement that enforcement consistently and in a single way inside their plugin. Decision making, we want to defer entirely to adopters. In the case of access control, we’ve introduced one point of extensibility, where you can replace what we call the policy with arbitrary code that decides what decision to make in any given circumstance, or with a configurable solution that lets you manage the decisions that you want to make, case by case, in the UI.
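As a hedged sketch of what owning the decision looks like for an adopter, here is a deliberately simple permission policy. The admin group reference is made up, and the exact types and import paths have shifted across Backstage releases, so treat this as illustrative rather than definitive:

```typescript
import {
  AuthorizeResult,
  PolicyDecision,
} from '@backstage/plugin-permission-common';
import {
  PermissionPolicy,
  PolicyQuery,
} from '@backstage/plugin-permission-node';
import { BackstageIdentityResponse } from '@backstage/plugin-auth-node';

// The adopter owns the decision: deny catalog entity deletion to everyone
// outside a (hypothetical) admin group, and allow everything else.
// Enforcement of the decision stays inside the individual plugins.
export class ExamplePolicy implements PermissionPolicy {
  async handle(
    request: PolicyQuery,
    user?: BackstageIdentityResponse,
  ): Promise<PolicyDecision> {
    if (request.permission.name === 'catalog.entity.delete') {
      const isAdmin = user?.identity.ownershipEntityRefs.includes(
        'group:default/admins', // hypothetical group ref
      );
      return {
        result: isAdmin ? AuthorizeResult.ALLOW : AuthorizeResult.DENY,
      };
    }
    return { result: AuthorizeResult.ALLOW };
  }
}
```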

Demo (Extensibility)

Let’s do a demo. What I want to show you is what the process looks like for adding and configuring an extension in both the Backstage backend and frontend. Here, we’ve got the Backstage backend running. Doesn’t matter what’s on the screen right now. The Node.js backend is running in this terminal. In this terminal, the Backstage frontend is running. I’ve got the frontend and backend running locally. You can see here, this is the Backstage instance running with that frontend and backend.

This is a generic instance of Backstage. It’s pretty close to what you’d get if you scaffolded a brand-new Backstage app from scratch. The only caveat is I have replaced the standard backend and frontend systems which require that manual wiring still with the new backend and frontend systems which provide that declarative integration solution. How do I add features and capabilities to this Backstage instance, in that model? You’ll see, we have a bunch of different kinds of resource available here. I can browse around and see the different ones.

There’s a guest user, and there’s me. We can see some groups and all that, and so on. We haven’t got any resources in the system. Let’s say I want to add some of the resources that I have in my software ecosystem. The resources that I want to add are planet resources, because my company is somehow in charge of planets, or partly because I was sitting in a room with a planets poster when I was working on this and got inspired. We’ve got an API running here, which provides me information about the planets in the solar system. You can see we’ve got planet names and we’ve got a planet image as well. We have this running as an API too, so that’s available to query over here in this terminal window. If I want to add those resources to my Backstage catalog, there’s actually only one step that I have to do because I already have a module written, which provides an entity provider in the right extension point to load those entities into the catalog.

If I switch to the backend package in my Backstage instance, and I add the package name, which I believe is catalog-backend-module-planets. There we go. Let’s stay in the terminal for now. You’ll see that the backend has actually already restarted. You’ll see that since it restarted, it’s now discovered a new backend module, and you’ll see that it’s now got this log line, refreshing planet resources. I’m leveraging the scheduling system built into Backstage in the core services to run this every 5 seconds. Now I’m refreshing the planet resources from that API every 5 seconds, and persisting them into the catalog.

Let me go back to the browser, and have a look at this tab. If we refresh and look at the kind, we’ll now see there’s resource kind, and we have all of the planets showing up here. I can click into one of them and see the details. Now let’s say I want to add some frontend functionalities to this too. I’m going to do the same thing. If I go over to my terminal again, and switch to the app package, and add the frontend planets module. This time, we’re going to have to restart the frontend to catch those changes. You see, it’s still refreshing those plugins for us in the backend.

Once I restart this and add one piece of config, this piece of config here, to control the order of the entity cards that appear on the screen. Once that’s done, we should be able to browse to one of these planets, maybe refresh to catch that config change. You see the planet image appears as a new card. The thing that I want to highlight here is that there are no code changes necessary to my Backstage instance at all, apart from that configuration change. All I’ve done is add modules, and they’ve stitched in automatically and provided their functionality to Backstage with none of that wiring required. I just plugged the plug into the socket. We’re hoping that’s going to be stable in the next year or so, and people are going to be able to benefit from that consistently when they adopt Backstage.

Extensibility Value Proposition

What’s the effect of all this extensibility? What benefits do we get from extensibility? Firstly, focusing on Spotify. Because all the user-facing functionality in Backstage, including core things like the catalog, is built as plugins, it’s really easy to parallelize work. Teams can work on features independently without having to coordinate or collaborate to get things done; they can just work on their features without having to talk to each other. They can also have distributed ownership of those features. The catalog team can work independently from the scaffolder team, and folks building new functionality on top of it can also work totally independently. That distributed ownership model is really powerful for allowing us to match our investments in different areas to the level of importance of that part of the system.

The other thing that we get from this is consistency. Because everything is built on this Backstage foundation, expertise is transferable between plugins. If a person moves from team to team, they can contribute more quickly because it’s still a Backstage plugin. What about outside Spotify? Firstly, all that same stuff. These things are true both inside and outside of Spotify. It’s still easy to parallelize, both in open source and in other adopters’ instances of Backstage; those benefits of minimizing coordination, transferable expertise, and distributed ownership are all still true. The bonus that we get in the world outside of Spotify is that the tech stack, the standards, the choices that we’ve made about how Backstage fits together at Spotify don’t have to be mirrored at other organizations in order for Backstage to be valuable.

They can pick the plugins that they want to use, or even build their own, to compose the perfect developer platform for their own needs. That’s very different from the one where we built a fixed Backstage at Spotify, and then tried to get everyone to use it, because in that situation, you have to convince everyone that they should work exactly like you work.

Key Takeaways

I want to pick out some key takeaways from this. These are technical takeaways that seem important to me when we’re thinking about how to build extensibility models into other software. The first is, when we’re reducing repetition in systems like these, we should be reducing it by persona, thinking about who is writing what code. When I think about Backstage specifically, I think about moving code from adopter instances into plugins, and moving code from plugins into the framework. The framework you write once; plugins get written plugin by plugin; and adopters are the most numerous group, with the most instances. The more we can push things up into plugins, and then into the framework, the more the overall repetition is reduced. The other thing is, use the framework that you’re building.

Take an extensibility solution where you have some core systems that aren’t built with the extensibility model, and then you’re trying to extend those things with a set of extensions that have different capabilities. It’s much harder to get that extensibility model right, because you’re not leveraging it yourself, especially in your core team; it’s just getting used separately by a separate group of people. Conversely, if you build your core systems with that extensibility model, you guarantee that the extensibility model is powerful and fit for purpose. Nested extensibility is just such a powerful concept for us. I really think it can apply elsewhere too. Making sure that you can have extensions that are themselves extensible is so powerful for making sure that you’re enabling the maximum amount of flexibility in your system.

The ROI of Backstage

Nilsson: Just some finishing words on the ROI of Backstage. We’ve gotten this question a lot since 2020, when we decided to open source. It’s a very important and good question. As I mentioned at the beginning, we were measuring our own developer productivity through the number of days it took until the 10th pull request, which is not a fantastic metric. It lacks a lot of nuance. We stole it from Meta and figured it would do. Now we have evolved a bit. These are some of the metrics we are measuring developer productivity with at Spotify.

One should read these numbers like this. At Spotify, we have 100% adoption of Backstage, so we divided the population into the 50% who are more frequent users and the 50% who are less frequent. These numbers are for the more frequent users, the top 50%. Some of them, of course, aren’t using Backstage all that much. This cohort is still 2.3 times more active in our GitHub repo. They have a 17% shorter cycle time. They create twice as many code changes. The list goes on. You can read more about it on our blog; we published all of our metrics there. They’re also, we want to believe, a little bit happier. The last one here: they are 5% more likely to stay with us. Going back to the beginning here, I was pretty unhappy when I joined Spotify and realized I had such a difficult time adding value.

That makes me also feel that we’re doing something good for the engineering community at Spotify, and for the world: that they actually want to stick around a little longer if they use Backstage. This is open source. We’re not just standing here talking about something that we happen to have and nobody else does. This is very accessible to all of you. You can go and use it today. One thing I want to leave you with is, if you do recognize these challenges that we were having, with scale and speed, with scale slowing you down, these are universal problems. Let’s solve them globally. Why not join the open community where you have thousands of other adopters that have similar challenges to the ones your organization may have? I think that’s going to speed all of us up.

Resources

If you want to know more, we run these webinars twice a year, where we release a bunch of new products, as we have a commercial leg to Backstage as well, next to the open source, of course. Check it out if you’re interested.

See more presentations with transcripts

Subscribe for MMS Newsletter

By signing up, you will receive updates about our latest information.

  • This field is for validation purposes and should be left unchanged.


Presentation: Everything Is a Plugin: How the Backstage Architecture Helps Platform Teams at Spotify and Beyond Spread Ownership and Deliver Value

MMS Founder
MMS Pia Nilsson Mike Lewis

Article originally posted on InfoQ. Visit InfoQ

Transcript

Nilsson: I am Pia. I work at Spotify. I lead developer experience at Spotify, as well as leading Backstage since its inception in 2017. I joined Spotify in 2016, having been an engineer for 14 years. I joined as an engineering manager for one of the teams there in the platform organization. I was so excited to join Spotify. I was very impressed with the autonomous engineering culture, thrilled to work in that exciting, world-leading audio space that Spotify is in.

This is the reason for being on this stage, that I have never struggled so much in my entire life to add value quickly. Leading the Backstage team back then, which of course didn’t exist when I joined, is a healing journey for me, as well, because Backstage is trying to solve many of the challenges that I personally struggled with in the beginning.

Lewis: My name is Mike Lewis. I am the tech lead for Backstage at Spotify, which means I get to work with the amazing team that we have at Spotify working on Backstage. I get to think about Backstage all day, which is so fun. I’ve been at Spotify about 5-and-a-half years now. When I joined Spotify, I was working in the premium mission, working on things related to spotify.com, and checkout, things like that. I was using Backstage every day and seeing the value that we get from Backstage at Spotify. When the opportunity came up to join the Backstage team and start working on it myself, I jumped at the chance.

Nilsson: We are here to speak to you about how we use our developer portal Backstage plugin architecture, to change the ways of working for our 3000 engineers. I think just that sentence is important, at least to me. It’s not all about technology, although that’s the heart of it. It’s technology in order to change the ways of working in a meaningful way.

Backstage Journey

Before we get into the plugin architecture, and why it matters so much to us, I think it’s important for you to know some little thing about what kind of challenges we were facing at Spotify, back in 2017, when we were really starting to talk about these problems. These are accurate clouds that I took from our ecosystem. Imagine them a little smaller back in 2017, but you can understand it was a similar challenge.

This is what was facing any engineer joining Spotify in terms of scale. Every single dot here is a component. You see the backend services. Every single dot is an actual component. Every single line is a component using another component. This is the scale that you are meeting today at Spotify. Back in 2017, it was starting to hurt our productivity quite a bit. All of our productivity metrics were trending in the wrong direction. One of the metrics we were measuring is number of days it takes for your 10th pull request, if you are a newly onboarded person.

That’s just one of the crude metrics we were using. We were up to 60 days, which absolutely wasn’t our target number. As I said, those are trending in the wrong direction. Adding to this complexity in scale, I’m sure many of you have heard about Conway’s Law. Conway’s Law is that the systems tend to look like the org chart they were created in. At Spotify, we have this autonomous engineering culture, which is just beautiful. It’s just wonderful to work in. People are very excited. They are passionate. They own what they build. It’s lovely.

The backside of that is that people were also, back in 2017, expected to deploy their code, for example, and do all kinds of infra for their code all by themselves. Many of these data endpoints and backend services, they were all created in some way that that particular team felt was the right thing to do. Entering into that scene as I was doing, that’s what made me struggle so much. I could never extrapolate a learning from one place to the next because it was entirely different. Documentation were in different places. Services were written in different languages, of course, different libraries didn’t work together. It was like the Wild West, but in a very loving way.

Mike and I, we worked for the platform organization. Of course, many of you know, platform organizations, our job is to fix this ecosystem for the engineers that are building the actual business value, so that they can do it much faster and with more happiness, so that it becomes greater and faster for the company. We were thinking hard about this particular problem, what are we going to do to help our engineers through this Wild West place they are in? I happened to lead, at that time, this little team that owned the backend component catalog. It was a tiny system.

Only the backend engineers cared about it, nobody else knew about it. They owned this backend component catalog. Having that reality, and being faced with this problem, I believe very much was the embryo to why we realized like, ok, we’re seeing this problem, of course, of scale and complexity for all engineering disciplines. However, the backend engineers, they are just a slightly bit happier, because they have this component catalog, at least. What if we actually create the catalog for everyone? That’s the little embryo of why we started the Backstage idea.

I think, over the quarters, we were doing all of these engineering surveys. The top two problems here were being surfaced to us. Engineers were calling out top productivity blockers, difficult to find things, and context switching. These were sort of people who were very aware of these two problems, and they are connected. As probably many of you understand that if you have problems finding things, you got to pull people into a meeting, you got to Slack them, you got to tap them on the shoulder. Then I will do that to you, and you will do that to me, and all the other seven teams I’m integrating with. You can imagine the amount of meetings that our engineers were running around trying to be helpful all the time. That we bundled into the world of context switching, that’s what it means for our context. These were the top two problems, according to the engineering service we were running.

Then the platform organization, we thought, there is actually a third problem as well. If you remember these clouds, we were so unable to help engineers. We were seeing it. We were seeing our metrics trending in the wrong way. We were seeing people couldn’t be productive within 60 days of onboarding. We couldn’t do that much, because of all these dots, where there were no standards. Naively, Spotify had the opinion that standards are boring, and it’s the opposite of freedom. In our little team, since we were feeling this pain, we created this little slogan saying, standards set you free. Then we rode around on horses here at Spotify, and like, “Standards set your free.” Which is like a little bit of a joyous way of saying like, no, actually, if you don’t have standards for your software ecosystem, you’re totally tied down to all the boring stuff.

Because you are going to have to set up your CI/CD deployment systems. You are going to build your build pipelines. There are going to be no templates. You’re going to invent it over again, over 500 teams. Of course, we were seeing a lot of duplication everywhere. That’s the third productivity blocker that we were discussing. Simply put, this is what Spotify was looking for a solution for. Also, for all of you who are contemplating using a developer portal, this is what we used with our upper management to help them understand what are you trying to solve. We’re trying to solve speed, scale, and chaos control. People can relate to that, because they see that it is a little bit of chaos.

Backstage – A Single Pane of Glass

Here it is. This is what Backstage looks internally for us, what it is, a single pane of glass for your infrastructure, that’s what it is. It’s simply put, a homepage. Even simpler than that, it’s a catalog for your infrastructure, all of it. All of you I’m sure know, any system that some kind of platform organization, like what I’m representing, puts together and offers to our engineers internally, is only as useful as its adoption. It can be as beautiful as can be, but if three teams use it, it’s just very expensive.

How do we make sure that Backstage was used? That’s where infrastructure as code comes into play. The metadata on each component needs to be in the repositories. Hence, the ownership of that metadata needs to be transferred to the owning team of that component. I think that’s a little small, but important engineering best practice that I’m sure all of you know about. That is the basis for why I believe our catalog happened to actually stay relevant, stay useful to our engineering population. Then, of course, we want Backstage to be more than only components. We want to add more kinds of functionality, such as measuring standards across the fleet. There are so many, monitoring and CI/CD. In order to do that, that’s where the plugin ecosystem comes in.

Key Backstage Engineering Best Practices

I wanted to say, these two engineering best practices that I really think, if one tries to figure out like why was Backstage successful, really, for Spotify? Of course, we have been asking ourselves that question for very many years. I think it’s as simple as this, infrastructure as code. Then, for today’s talk, we’re going to focus on the second one, which is the distributed code ownership of the plugin architecture. Extensibility is the plugin architecture that enables this distributed code ownership.

That is key to distribute it, to decentralize the decision making to the team actually owning the expertise. Instead of having, in our example, my little team building the Backstage developer portal, trying to figure out how to build all of these new functionalities into the Backstage portal so that it would become meaningful for all of the 3000 engineers. It goes without saying that that can never happen, that will never scale. It’s like it’s a must to have a plugin architecture for us.

Backstage Extensibility Architecture

Now we’re going to deep dive into the plugin architecture structure, to give you a broad view of what it is.

Lewis: Extensibility is just so important for Backstage, and it’s important at Spotify, and it’s important elsewhere, too, because Backstage is open source. It’s used by thousands of companies around the world. It’s important in both of those contexts. I want to tell you a little bit about how extensibility works in Backstage and how it’s changed over the years, and what we’ve learned along the way. First, I think it’s important that we cover the high-level architecture of Backstage. How many folks are familiar with the full stack web architecture of building Node.js, React web apps, just JavaScript on the frontend? I’ll try and keep it fairly generic. At a high level, Backstage is a web based React frontend. There is a horizontally scalable backend written in Node.js. That backend is talking to some infrastructure, a database, potentially a cache.

Then the backend is also talking to some third-party APIs as well. Those lines of communication between those things are usually HTTP, although other things can be supported at a plugin level, too. There are some logos there to represent this tech stack. In that context, what’s a plugin? A plugin is just a bundle of TypeScript code that’s been compiled into JavaScript, published to npm for public packages, or even private for adopters that don’t want to share that package, and it’s just used for private internal use, there’s no need to publish it. It’s generally for use in either the frontend or the backend, although sometimes there’s isomorphic packages in the mix too. The standard is frontend or backend.

For those who are familiar with full stack web development, there’s maybe an argument to be made that we’ve done it at that point. React in the frontend and Express in the backend are already pretty extensible, or at least composable. You can render components from other places. You can bring middlewares into your Express app, and you’ve done it. You’ve extended your app with new backend functionality via the middlewares, and new frontend functionality with your React components.

Extensibility by default is not enough for a couple of reasons. The first is, if you’re building an extension in a model where all you have is React on the frontend and Express in the backend, that’s hard work. There are no shared solutions for the things that you need to get started quickly and get working building on your plugin. There are also much more decisions to make. Every time you need to decide how you’re going to manage your database connection, or logging, or any of those things, you got to start from scratch, make a decision. It’s more cognitive load. It slows you down, so you’re less efficient.

Also, the results, the plugins that get built in that ecosystem are less consistent, which is bad for people using them. That’s on the plugin builder side. What about adopters, people using Backstage? In that world, Backstage adopters have a lot of fiddly wiring to do. They have to wire everything up themselves in the backend to run the middleware in the Express app and provide it with its dependencies. By the same token in the frontend, they need to render the React components at the right time and provide those dependencies too. It’s a lot of work.

I think of it like, what we want in a plugin system is we want like a power plug that you just plug in, and it’s done, you’re finished. What that’s like is like having to wire a plug. It’s a lot more work, and it just is not efficient to work in that way. The last thing to mention here, actually, is that we want to encourage contributions. Backstage is an open platform. We want people to contribute plugins and plugin modules to that platform. If it’s hard to build plugins and hard to build plugin modules, then that’s discouraging people from doing that.

How do we start encouraging them? The first way is, I think of it like a tool library. If you haven’t encountered one of these in the wild, outside the world of software, a tool library is a place where you can go and borrow the tools that you need to do a task in the real world. I’m not talking about software here, I’m talking about actual saws and drills and things. This means you don’t have to own every single tool that you need to do DIY things at home, you can just go and borrow them. You don’t have to figure out which one’s the best one, you just take the one that they’ve got, use it. When you’re done with it, you bring it back.

This analogy that I’m trying to draw here is between a tool library in the physical world, and the Backstage tool library, which is something we’ve built, which is a collection of core services in the backend and the frontend that provides you with a bunch of capabilities that you can leverage to do things more efficiently. To get stuff done. To bake in some sensible decision so you can just get productive quickly building your plugin. I’m not going to go through this whole list. Just to give an example, I think database management is a really cool system in Backstage, or as cool as a database management system can be. The way that it works is a plugin owner gets a connection to a database. They don’t know anything about where that database has come from, it’s just been configured by the adopter of Backstage.

Actually, behind the scenes, adopters can configure a single database and share that connection with all different plugins, and they all get access to a database within that single database instance. Or, each plugin can have its own database. That system of configuration and separation of databases is entirely abstracted away from plugin owners. All they have to do is just get the database connection and use it, and job done. That’s the tool library. That’s the first way that we are making life easier for people building extensions or people using them.

The second way is more focused on adopters. This screenshot is showing to an extent the old world of Backstage, the way things used to be when you were adopting plugins as an adopter. You’ll see a lot of documentation pages like this if you look on the Backstage docs today. What you’ve got here is like that plug wiring analogy that I talked about. You have to pull in specific lines of code to your backend instance, and put them in in the right place to add the middleware and make everything work.

When plugins do different things, you have to put subtly different lines of code in. When things change, you have to adapt your code to address those changes. That’s a lot of work. It’s like wearing a plug every time. Over the last year, we’ve been working towards a solution, in particular, the maintainers have been working towards a solution called declarative integration, which takes away the need for this manual wiring, and instead makes it possible to install a plugin or a plugin module just by adding the package via yarn, which is the standard package manager for JavaScript code bases.

This is a solution that’s in alpha right now, particularly immature in the frontend, but also in the backend, it’s still pretty new. We’re not recommending folks migrate to this yet. It’s under active development, and it’s really adding a lot of value in the ecosystem. We’re going to show you a little bit more about that. I’ll be doing a bit more of that during the demo.

What’s next? I want to talk a bit about the mental model for Backstage, because it’s not as simple as core frameworks, and then some plugins that sit on top of the core frameworks. I think what’s so powerful about this extensibility model, and what I’ve thought was so cool, since I started working on Backstage is this idea of nested extensibility. We don’t just have a core framework that plugins sit on top of and can extend. Instead, we’ve built a system that allows the plugin owners individually to allow other people to extend their plugins with plugin modules.

An example of the way this is really powerful is if you have a generic plugin, which is providing some shared functionality in Backstage, you can have plugin modules, which offer the direct connections to specific upstream APIs. For example, you might have a system that pulls in org data to the Backstage catalog that Pia mentioned earlier, and then a plugin module that knows how to fetch that org data specifically from an LDAP server. Adopters can write their own ones of those to pull org data from their own custom org data providers. We can provide additional ones in open source to support whatever integrations people need to use.

A key concept with all of this is the notion of importable points of extension. I’ll show some code for this on the next slide, particularly to cover the importable bit. I want to just cover this example to talk about how extensibility works in this nested model. You’ll see on the left-hand side, we’ve got the core framework, that foundational bit from the previous diagram, and just an example core service, the HTTP router in the backend, which is the thing that routes requests coming from the frontend to different parts of the backend, different middlewares. We’ve got some arrows here, that’s pointing out the fact that the core framework is providing an HTTP router as a point of extensibility. The catalog plugin in this case, is extending that point of extensibility with a specific middleware. All of this is happening between the plugin and the framework without any interaction from an adopter.

That’s the bottom layer, going to the middle layer of plugins. That same thing is replicated between plugins and plugin modules. An individual plugin like the catalog can export an extension point. In this case, the example is the catalog processing extension point which controls how entities are processed, as they come in from sources of data. Plugin modules can add additional ones of those via that extension point. This extensibility is nested. That’s where I think the power really comes from. I said I drew some code that corresponds to this. Same topic, importable points of extension.

This is heavily abbreviated and simplified code. What’s happening is we’re importing the core services extension points from the core framework in the top section of code. We’re grabbing the HTTP router, and we’re adding our plugins middleware to that router. By the same token, in the bottom slice of code, we’re importing the catalog processing extension point from the catalog plugin. Then we’re using that to add an entity provider which provides us with the entity data that we need.

Extensibility Examples in Backstage (Access Control)

I’m going to cover just one example of extensibility in Backstage today. This is a concrete example of how extensibility works, in one particular case. The case I want to talk about is access control, which is a relatively new system in Backstage. Access control or authorization. You might have heard or seen this notion before, that access control as a concept is the product of decision making and enforcement. Decision making is whether a given user can perform a given operation or access a given resource.

Enforcement is ensuring that the system operates within that constraint, the constraint represented by the decision. If you have those two things, you’re done with access control. That’s the whole access control job done. When we think about that model, how can we find the right point to introduce extensibility? I think the first thing to think about is, who’s responsible for each of those bits? In this case, I’m asserting that the individual plugins are responsible for enforcement. Because Backstage is so extensible and so generic, and we’re not in any control of how plugins manage their resources or operations, that enforcement part has to rest with plugin owners. They have to decide how to enforce the access control restrictions that exist. Conversely, the decision rests with Backstage adopters who own Backstage instances, because they may have very different requirements, from instance to instance, about how access control is managed.

Some adopters might have strict regulatory requirements that limit which users can see which entities. Conversely, other adopters might have much more transparent cultures where they want everyone to see everything. Then the framework there is just making the point that the Backstage framework is stitching that all together. Given that model, and I think you probably see what’s coming where the extensibility lies, is with the decision. We want the enforcement to be the same every time, every plugin. We want plugin owners to implement that enforcement consistently and in a single way inside their plugin. Decision making, we want to defer entirely to adopters. In the case of access control, we’ve introduced one point of extensibility, where you can replace what we call the policy with arbitrary code that decides what decision to make in any given circumstance, or with a configurable solution that lets you manage the decisions that you want to make, case by case, in the UI.

Demo (Extensibility)

Let’s do a demo. What I want to show you is what the process looks like for adding and configuring an extension in both the Backstage backend and frontend. Here, we’ve got the Backstage backend running. Doesn’t matter what’s on the screen right now. The Node.js backend is running in this terminal. In this terminal, the Backstage frontend is running. I’ve got the frontend and backend running locally. You can see here, this is the Backstage instance running with that frontend and backend.

This is a generic instance of Backstage. It’s pretty close to what you’d get if you scaffolded a brand-new Backstage app from scratch. The only caveat is I have replaced the standard backend and frontend systems which require that manual wiring still with the new backend and frontend systems which provide that declarative integration solution. How do I add features and capabilities to this Backstage instance, in that model? You’ll see, we have a bunch of different kinds of resource available here. I can browse around and see the different ones.

There’s a guest user, and there’s me. We can see some groups and all that, and so on. We haven’t got any resources in the system. Let’s say I want to add some of the resources that I have in my software ecosystem. The resources that I want to add are planet resources, because my company is somehow in charge of planets, or partly because I was sitting in a room with a planets poster when I was working on this and got inspired. We’ve got an API running here, which provides me information about the planets in the solar system. You can see we’ve got planet names and we’ve got a planet image as well. We have this running as an API too, so that’s available to query over here in this terminal window. If I want to add those resources to my Backstage catalog, there’s actually only one step that I have to do because I already have a module written, which provides an entity provider in the right extension point to load those entities into the catalog.

I switch to the backend package in my Backstage instance and add the package, which I believe is catalog-backend-module-planets. There we go. Let's stay in the terminal for now. You'll see that the backend has actually already restarted, that since restarting it has discovered a new backend module, and that it's now got this log line, refreshing planet resources. I'm leveraging the scheduling system built into the Backstage core services to run this every 5 seconds. So I'm now refreshing the planet resources from that API every 5 seconds and persisting them into the catalog.
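For context, a module like that can be quite small. The sketch below shows roughly what such a catalog-backend-module-planets package might contain under the new backend system; the planets API URL, the entity shape, and the PlanetsEntityProvider class are illustrative assumptions, not the code from the demo. It assumes Node 18+ for the global fetch.

```typescript
// Hypothetical sketch of a catalog backend module that mirrors a planets
// API into the catalog on a 5-second schedule.
import {
  coreServices,
  createBackendModule,
} from '@backstage/backend-plugin-api';
import {
  EntityProvider,
  EntityProviderConnection,
} from '@backstage/plugin-catalog-node';
import { catalogProcessingExtensionPoint } from '@backstage/plugin-catalog-node/alpha';

class PlanetsEntityProvider implements EntityProvider {
  private connection?: EntityProviderConnection;

  getProviderName(): string {
    return 'planets';
  }

  async connect(connection: EntityProviderConnection): Promise<void> {
    this.connection = connection;
  }

  // Fetch the planets and replace this provider's entities in the catalog.
  async refresh(): Promise<void> {
    const response = await fetch('http://localhost:8000/planets'); // hypothetical API
    const planets = (await response.json()) as Array<{ name: string }>;
    await this.connection?.applyMutation({
      type: 'full',
      entities: planets.map(p => ({
        entity: {
          apiVersion: 'backstage.io/v1alpha1',
          kind: 'Resource',
          metadata: {
            name: p.name.toLowerCase(),
            annotations: {
              // Entity providers must declare a managing location.
              'backstage.io/managed-by-location': 'url:http://localhost:8000/planets',
              'backstage.io/managed-by-origin-location': 'url:http://localhost:8000/planets',
            },
          },
          spec: { type: 'planet', owner: 'guests' },
        },
        locationKey: 'planets',
      })),
    });
  }
}

export const catalogModulePlanets = createBackendModule({
  pluginId: 'catalog',
  moduleId: 'planets',
  register(env) {
    env.registerInit({
      deps: {
        catalog: catalogProcessingExtensionPoint,
        scheduler: coreServices.scheduler,
      },
      async init({ catalog, scheduler }) {
        const provider = new PlanetsEntityProvider();
        catalog.addEntityProvider(provider);
        // The core scheduler service drives the 5-second refresh loop.
        await scheduler.scheduleTask({
          id: 'refresh-planet-resources',
          frequency: { seconds: 5 },
          timeout: { seconds: 30 },
          fn: () => provider.refresh(),
        });
      },
    });
  },
});
```

With the module published, adding the package to the backend, exactly as in the demo, is the only step: the new backend system discovers the module and wires it in.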

Let me go back to the browser and have a look at this tab. If we refresh and look at the kind, we'll now see there's a resource kind, and we have all of the planets showing up here. I can click into one of them and see the details. Now let's say I want to add some frontend functionality to this too. I'm going to do the same thing: I go over to my terminal again, switch to the app package, and add the frontend planets module. This time, we're going to have to restart the frontend to catch those changes. You can see it's still refreshing those planet resources for us in the backend.

Once I restart this and add one piece of config, this piece of config here that controls the order of the entity cards that appear on the screen, we should be able to browse to one of these planets, maybe refresh to catch that config change, and you see the planet image appears as a new card. The thing that I want to highlight here is that there are no code changes necessary, apart from that configuration change, to my Backstage instance at all. All I've done is add modules, and they've stitched themselves in automatically and provided their functionality to Backstage with none of that wiring required. I just plugged the plug into the socket. We're hoping that's going to be stable in the next year or so, and people are going to be able to benefit from it consistently when they adopt Backstage.

Extensibility Value Proposition

What's the effect of all this extensibility? What benefits do we get from it? Firstly, focusing on Spotify. Because all the user-facing functionality in Backstage, including core things like the catalog, is built as plugins, it's really easy to parallelize work. Teams can work on their features independently, without having to coordinate with each other to get things done. They can also have distributed ownership of those features: the catalog team can work independently from the scaffolder team, and folks building new functionality on top can also work totally independently. That distributed ownership model is really powerful for allowing us to match our investment in each area to the importance of that part of the system.

The other thing that we get from this is consistency. Because everything is built on the same Backstage foundation, expertise is transferable between plugins. If a person moves from team to team, they can contribute more quickly, because it's still a Backstage plugin. What about outside Spotify? Firstly, all of that same stuff: these things are true both inside and outside of Spotify. It's still easy to parallelize, both in open source and in other adopters' instances of Backstage, and those benefits of minimized coordination, transferable expertise, and distributed ownership all still hold. The bonus that we get in the world outside of Spotify is that the tech stack, the standards, and the choices that we've made about how Backstage fits together at Spotify don't have to be mirrored at other organizations in order for Backstage to be valuable.

They can pick the plugins that they want to use, or even build their own, to compose the perfect developer platform for their own needs. That's very different from a world where we built a fixed Backstage at Spotify and then tried to get everyone to use it, because in that situation you have to convince everyone that they should work exactly like you work.

Key Takeaways

I want to pick out some key takeaways from this. These are technical takeaways that seem important to me when we're thinking about how to build extensibility models into other software. The first is, when we're reducing repetition in systems like these, we should be reducing it by persona, thinking about who is writing what code. When I think about Backstage specifically, I think about moving code from adopter instances into plugins, and moving code from plugins into the framework, because the framework is written once, plugins are written plugin by plugin, and adopters, the most numerous group, have the most instances. The more we can push things up into plugins, and then into the framework, the more the overall repetition is reduced. The other thing is, use the framework that you're building.

In an extensibility solution where some core systems aren't built with the extensibility model, and you're trying to extend those things with a set of extensions that have different capabilities, it's much harder to get that extensibility model right, because you're not leveraging it yourself: it's just getting used separately by a separate group of people. Conversely, if you build your core systems with that extensibility model, you guarantee that the model is powerful and fit for purpose. Nested extensibility has been such a powerful concept for us, and I really think it can apply elsewhere too. Making sure that extensions are themselves extensible does a great deal to enable the maximum amount of flexibility in your system.
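As a rough illustration of nested extensibility, the sketch below shows a variant of the earlier planets module, itself an extension of the catalog, exposing its own extension point that further modules can depend on. The PlanetsExtensionPoint name and its interface are invented, though createBackendModule, createExtensionPoint, and registerExtensionPoint are real parts of @backstage/backend-plugin-api.

```typescript
// A sketch of nested extensibility: a catalog module (an extension)
// exposes its own extension point, so other modules can extend it.
import {
  createBackendModule,
  createExtensionPoint,
} from '@backstage/backend-plugin-api';

// Invented interface: lets third-party modules contribute planet sources.
export interface PlanetsExtensionPoint {
  addPlanetSource(url: string): void;
}

export const planetsExtensionPoint =
  createExtensionPoint<PlanetsExtensionPoint>({ id: 'catalog.planets' });

export const catalogModulePlanets = createBackendModule({
  pluginId: 'catalog',
  moduleId: 'planets',
  register(env) {
    const sources: string[] = [];

    // The module registers an implementation of its own extension point;
    // another module can depend on planetsExtensionPoint and call it.
    env.registerExtensionPoint(planetsExtensionPoint, {
      addPlanetSource(url: string) {
        sources.push(url);
      },
    });

    env.registerInit({
      deps: {},
      async init() {
        // By init time, all extending modules have registered their
        // sources, and the provider can be configured from them.
        console.log(`planet sources: ${sources.join(', ')}`);
      },
    });
  },
});
```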

The ROI of Backstage

Nilsson: Just some finishing words on the ROI of Backstage. We've gotten this question a lot since 2020, when we decided to open source. It's a very important and good question. As I mentioned, at the beginning we were measuring our own developer productivity through the number of days it took until the 10th pull request, which is not a fantastic metric; it lacks a lot of nuance. We stole it from Meta and figured it would do. Now we have evolved a bit. These are some of the metrics we measure developer productivity with at Spotify.

One should read these numbers like this. At Spotify, we have 100% adoption of Backstage, so we divided everyone into the 50% who are more frequent users and the 50% who are less frequent. These numbers are for the more frequent half; some of them, of course, still aren't using Backstage all that much. This cohort is 2.3 times more active in our GitHub repo. They have a 17% shorter cycle time. They create twice as many code changes. The list goes on; you can read more about it on our blog, where we published all of our metrics. They're also, we want to believe, a little bit happier: the last one here, they are 5% more likely to stay with us. Going back to the beginning, I was pretty unhappy when I joined Spotify and realized I had such a difficult time adding value.

That makes me feel that we're doing something good for the engineering community at Spotify, and for the world, if people actually want to stick around a little longer when they use Backstage. This is open source. We're not just standing here talking about something that we happen to have and nobody else does; it's very accessible to all of you, and you can go and use it today. One thing I want to leave you with is, if you recognize these challenges of scale and speed, of scale slowing you down, these are universal problems. Let's solve them globally. Why not join the open community, where thousands of other adopters have challenges similar to the ones your organization may have? I think that's going to speed all of us up.

Resources

If you want to know more, we run these biyearly webinars where we release a bunch of new products, as we also have a commercial leg to Backstage alongside the open source, of course. Check them out if you're interested.



The Value of Using Timeless Testing Tools

MMS Founder
MMS Ben Linders

Article originally posted on InfoQ. Visit InfoQ

According to Benjamin Bischoff, developers find new tools much more interesting than old ones, as they offer an opportunity to learn new technologies and approaches and to expand their tool belt. Using tools that have been around for decades, however, can save time and budget. When evaluating tools, it is more important to understand the problem to be solved than to jump straight into the tools.

Benjamin Bischoff will give a talk about the value of using timeless testing tools at QA Challenge Accepted. This conference will take place on September 28 in Sofia, Bulgaria.

In his talk, he will discuss biases that play a role when favouring a new tool:

For example, there is the “sunk cost fallacy”. This means that you are tempted to keep using a new tool once you have invested a lot of time and money in it. This can be training time as well as budget spent on the tool itself or on learning materials. It can therefore happen that you close your eyes to the disadvantages of a tool and use it against your better judgement.

According to Bischoff, familiarity is the main reason for using tools that have been around for decades. If the tools are already familiar, their use can save a lot of training time and budget, and also help to achieve goals more quickly. Also, there is a deeper understanding of the potential issues with these tools, which is not the case with new tools:

New tools may suddenly show limitations when they are run in production, which can lead to considerable additional work.

To find suitable tools, Bischoff suggests gathering thorough information and creating proofs of concept to find out which tool is ideal in which situation:

When we were looking for a tool for our API tests, we created and compared three different proofs of concept with different tools. My position at the time was that I was not a fan of the Karate framework. In the end, however, I was convinced by how quickly we achieved a good result with this framework. Looking back, this was simply a prejudice on my part because I simply didn’t have enough knowledge about this tool. That changed during the proof of concept phase.

According to Bischoff, it's helpful to always bring a healthy dose of scepticism and not get carried away by flashy, shiny hype. In the end, it's not about adopting things at the drop of a hat, but rather analysing precisely what you need and how you can get the most out of the technology.

InfoQ interviewed Benjamin Bischoff about testing challenges and using tools to address them and asked him about his experience with AI tools.

InfoQ: What main testing challenges have you faced?

Benjamin Bischoff: I have faced several different testing challenges, especially technical challenges in the sense of “How can we map this requirement as tests?” and “What technical options are there for this type of test?” Of course, this also includes tools.

For example, we needed a tool that would simplify the testing of APIs for us. The Karate framework proved its worth here, as it has a simple syntax but is very flexible thanks to its extension options. For UI-based end-to-end tests, Selenium fulfilled our criteria as a remote browser control. It was important that both desktop and mobile platforms could be controlled and that features such as websites with multiple tabs and windows worked. We also wanted a solution that would accurately simulate users from the outside and not run inside the browser like many other solutions.

On the other hand, there have also been challenges due to prejudices against certain roles, e.g. test automation engineers or QA engineers and their goals and intentions. But these problems have disappeared with time and experience.

InfoQ: What tools do you use to address these challenges, and how do you use them?

Bischoff: In our UI-based end-to-end test strategy, we mainly use our in-house Selenium test framework. For me, this technology still has many advantages over the hyped new tools, mainly that it uses the W3C WebDriver standard, so all browsers are supported out-of-the-box. It does not need any custom browser versions, nor does it have to run within the JavaScript context. That means that it behaves like a real user – and that's what we want.
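For illustration, here is a minimal sketch of this style of outside-in, W3C WebDriver-based remote browser control, using the selenium-webdriver Node.js bindings; the grid URL and the page under test are placeholders, not details from Bischoff's in-house framework.

```typescript
// Minimal sketch of remote browser control over the W3C WebDriver
// protocol. Any standards-compliant browser works without custom builds,
// because the protocol, not an in-browser script, drives the session.
import { Builder, By, until } from 'selenium-webdriver';

async function checkLoginPage(): Promise<void> {
  const driver = await new Builder()
    .usingServer('http://selenium-grid.internal:4444/wd/hub') // hypothetical grid
    .forBrowser('firefox')
    .build();
  try {
    await driver.get('https://example.com/login'); // placeholder page
    await driver.findElement(By.name('user')).sendKeys('test-user');
    // Driving the browser from the outside also covers multi-window flows:
    const handles = await driver.getAllWindowHandles();
    await driver.switchTo().window(handles[0]);
    await driver.wait(until.titleContains('Login'), 5_000);
  } finally {
    await driver.quit();
  }
}

checkLoginPage().catch(err => {
  console.error(err);
  process.exit(1);
});
```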

We also use Bash and Make in the CI/CD context, for example. We have a lot of experience with these tools and can achieve our goals quickly.

Of course, we also regularly look at new technologies and approaches and consider whether and how we can integrate them into our test strategy. In terms of test frameworks, for example, we have evaluated Playwright and Cypress, but in the end decided against them for the reasons mentioned before. If we find interesting tools in the future, we have no qualms about replacing existing solutions with them; however, they must offer considerable added value that outweighs the resources invested.

InfoQ: What have you experienced using AI tools?

Bischoff: I use AI tools on a daily basis, both professionally and privately. For development and automation, I often use GitHub Copilot, Google Gemini, and ChatGPT. If you already have experience in a certain area, these tools can be really helpful, but you need to know exactly what you need and how to express it. More importantly, you need to know when a proposed solution doesn't make sense or might be hallucinated, and express this in a follow-up prompt.

The more feedback you give, the more useful the ideas and solutions you get from AI tools become. I always say that you have to treat AI tools like advanced rubber ducks.



Elastic Returns to Open Source: Will the Community Follow?

MMS Founder
MMS Renato Losio

Article originally posted on InfoQ. Visit InfoQ

In a surprising move for both the open-source and Elastic communities, Shay Banon, founder and CEO of Elastic, recently announced that Elasticsearch and Kibana will once again be open source. The two products will soon be licensed under the AGPL, an OSI-approved license.

Just over three years ago, Elastic relicensed their main products from Apache 2.0 to a dual-license model under the Server Side Public License (SSPL) and the new Elastic License, neither of which is an OSI-approved open-source license. This change prompted AWS to fork Elasticsearch, leading to the creation of OpenSearch, which continues to operate under the Apache 2.0 license. Banon explains the goal of this latest change:

We never stopped believing and behaving like an open-source community after we changed the license. But being able to use the term Open Source, by using AGPL, an OSI-approved license, removes any questions, or fud, people might have.

Sometimes referred to as the “server-side GPL,” the AGPL was approved by the OSI in 2008. It requires that the source code of all modified versions of the software be made available to all users who interact with it over a network, offering protection against cloud service providers running modified versions as a service without sharing their changes. Banon adds:

We have people that really like ELv2 (a BSD-inspired license). We have people that have SSPL approved (through MongoDB using it). Which is why we are simply adding another option, and not removing anything. (…) We chose AGPL, vs another license, because we hope our work with OSI will help to have more options in the Open Source licensing world.

While Luc van Donkersgoed, principal engineer at PostNL, described this as one of the weirdest press releases ever, Peter Zaitsev, open-source advocate, writes:

I wonder though if community trust can be repaired as quickly? Can we count on Elastic to stick to Open Source this time or is the license likely to be changed to serve the need of the moment?

On HackerNews, Adrian Cockcroft, tech advisor and formerly VP at AWS, references an article he wrote in 2018 about the Open Distro for Elasticsearch and comments:

At the time we didn’t think a new license made sense, as AGPL is sufficient to block AWS from using the code, but the core of the issue was that AWS wanted to contribute security features to the open source project and Elastic wanted to keep security as an enterprise feature, so rejected all the approaches AWS made at the time.

Lars Larsson, field CTO at Elastisys, comments:

I find it hard to believe that the community will flock back to Elasticsearch: When Elastic closed the source, a lot of companies and individuals saw their contributions to the Apache 2 codebase all of a sudden locked into only creating value for Elastic. This burns the community, just like when Hashicorp took all their previously-open products and closed them up.

Guido Iaquinti, CTO and co-founder at SafetyClerk, agrees:

Trust is something that takes a long time to build but can be shattered in an instant. Only time will tell, but for now, I see no reason why people shouldn’t continue to stick with OpenSearch.

In the article, Banon acknowledges that the community might experience confusion and surprise and attempts to address the main questions. He denies that the 2021 license change was a mistake and wants to dispel concerns that the AGPL is not a true open-source license.



Uno Platform 5.3 Released, Including JetBrains Rider Official Support

MMS Founder
MMS Arthur Casals

Article originally posted on InfoQ. Visit InfoQ

A few weeks ago, Uno released version 5.3 of their multi-platform UI framework for .NET developers. The highlight of the new release is the official support for JetBrains Rider. Other relevant features include an improved Hot Reload experience, two new UI controls, new font options, and support for SkiaSharp 3 Previews.

The Uno Platform acts as a bridge that lets WinUI and UWP apps run natively on iOS, macOS, Android, Linux, and WebAssembly. It is built over different technologies (the Xamarin Native Stack, Mono-WASM, and Skia, depending on the target platform), allowing the creation of single-codebase C# and XAML applications with responsive design and pixel-perfect control.

With the new release, the Uno Platform introduces official support for JetBrains Rider through a new extension on the JetBrains marketplace. According to the company, this addresses a request from the community, bringing productivity advantages to developers using Rider:

Rider support has been a top request in our recent developer surveys […]. Starting this release, the development experience Rider developers have is on par with that of Visual Studio and VS Code developers. What this means in practice is that you can enjoy the full set of developer productivity enhancers such as C# and XAML Hot Reload for Uno Platform apps and debugging.

A full list of supported scenarios for developers using the new extension can be found here.

Another relevant feature of this release is the improved Hot Reload experience. “Hot Reload” refers to the ability to instantly see the effects of any code change in a project. With the new release, a new visual indicator provides feedback every time Hot Reload is triggered, helping the developer track the state of their changes. The new feature is available on all IDEs for all targets that Uno Platform supports, except for WinAppSDK (which has its own Hot Reload indicators).

The new release also introduces support for Open Sans as the new default font on all platforms. This is a relevant change, since the default font on WinUI is Segoe UI, which is not supported on macOS, Linux, or in browsers running on those systems. As a result of this feature, Uno now also supports variable fonts through the use of a font manifest, which allows specifying where a single variable font file can be used (the Web) and where multiple files may be used (Skia Desktop).

Other features in this release include two new UI controls: ItemsView (available for desktop applications) and SelectorBar (available for Skia Desktop targets). It also includes support for SkiaSharp 3 Previews.

Uno Platform is open source (Apache 2.0) and available on GitHub. The list of supported platforms includes Windows, iOS, macOS, Android, and Linux. It can be used with Visual Studio Code, Visual Studio 2022 for Windows (17.8 or later), and JetBrains Rider (2024.2 or later). The new release supports .NET 8 and later (up to .NET 9 Preview 6). At this point, Uno itself is built using .NET 9, which is part of the team’s effort to be ready to support .NET 9 on day 0 of its official release.
