Month: February 2025

MMS • Courtney Nash
Article originally posted on InfoQ. Visit InfoQ

Transcript
Shane Hastie: Good day folks. This is Shane Hastie for the InfoQ Engineering Culture Podcast. Today I’m sitting down with Courtney Nash. Courtney, welcome. Thanks for taking the time to talk to us.
Courtney Nash: Hi Shane. Thanks so much for having me. I am an abashed lover of podcasts, and so I’m also very excited to get the chance to finally be on yours.
Shane Hastie: Thank you so much. My normal starting point with these conversations is who’s Courtney?
Introductions [00:56]
Courtney Nash: Fair question. I have been in the industry for a long time in various different roles. My most known, to some people, stint was as an editor for O’Reilly Media for almost 10 years. I chaired the Velocity Conference and that sent me down the path that I would say I’m currently on, early days of DevOps and that whole development in the industry, which turned into SRE. I was managing the team of editors, one of whom was smart enough to see the writing on the wall that maybe there should be an SRE book or three or four out there. And through that time at O’Reilly, I focused a lot on what you focus on, actually, on people and systems and culture.
I have a background in cognitive neuroscience, in cognitive science and human factors studies. And that collided with all of the technology and DevOps work when I met John Allspaw and a few other folks who are now really leading the charge on trying to bring concepts around learning from incidents and resilience engineering to our industry.
And so the tail end of that journey for me ended up working at a startup where I was researching software failures, really, for a company that was focusing on products around Kubernetes and Kafka, because they always work as intended. And along the way I started looking at public incident reports and collecting those and reading those. And then at some point I turned around and realized I had thousands and thousands of these things in a very shoddy ad hoc database that I still to this day maintain by myself, possibly questionable. But that turned into what’s called The VOID, which has been the bulk of my work for the last four or five years. And that’s a large database of public incident reports.
Just recently we’ve had some pretty notable ones that folks may have paid attention to. Things like when Facebook went down in 2021 and they couldn’t get into their data center. Ideally companies write up these software failure reports, software incident reports, and I’ve been scooping those up into a database and essentially doing research on that for the past few years and trying to bring a data-driven perspective to our beliefs and practices around incident response and incident analysis. That’s the VOID. And most recently just produced some work that I spoke at QCon about, which is how we all got connected, on what I found about how automation is involved in software incidents from the database that we have available to us in The VOID.
Shane Hastie: Can we dig into that? The title of your talk was exploring the Unintended Consequences of Automation in Software. What are some of those and where do they come from?
Research into unintended consequences [03:43]
Courtney Nash: Yes. I’m going to flip your question and talk about where they come from and then talk about what some of them are. A really common through line for my work and other people in this space, resilience engineering, learning from incidents, is that we’re really not the first to look at some of this through this lens. There have been a lot of researchers and technologists looking at incidents in other domains, particularly safety-critical domains, so things like aviation, healthcare, power plants, power grids, that type of thing. A lot of this came out of Three Mile Island.
I would say the modern discipline that we know of now as resilience engineering married with other ones that have been around even longer like human factors research and that type of thing really started looking at systems level views of incidents. In this case pretty significant accidents like threatening the life and wellbeing of humans.
There were a lot of high consequence, high tempo scenarios and a huge body of research already exists on that. And so what I was trying to do with a lot of the work I’m doing with The VOID is pull that information as a through line into what we’re doing. Because some of this research is really evergreen just because it’s software systems or technology there’s a lot of commonalities in what folks have already learned from these other domains.
In particular, automated cockpits, automation in aviation environments is where a lot of the inspiration for my work came from. And also, you may or may not have noticed that our industry is super excited about AI right now. And so I thought I’m not going to go fully tackle AI head on yet because I think we haven’t still learned from things that we could about automation, so I’m hoping to start back a little ways and from first principles.
Some of that research really talks about literally what I called my talk. Unintended Consequences of Automation. And some of this research in aviation and automated cockpits had found that automating these human computer environments had a lot of unexpected consequences. The people who designed those systems had these specific outcomes in mind. And we have the same set of beliefs in the work that we do in the technology industry.
Humans are good at these things and computers are good at these things so why don’t we just assign the things that humans are good at to the humans and yada yada. This comes from an older concept from the ’50s called HABA-MABA (humans-are-better-at/machines-are-better-at) from a psychologist named Paul Fitts. If anyone’s ever heard of the Fitts list, that’s where this comes from.
Adding automation changes the nature of the work [06:15]
But that’s not actually how these kinds of systems work. You can’t just divide up the work that cleanly. It’s such a tempting notion. It feels good and it feels right, and it also means, oh, we can just give the crappy work, as it were, to the computers and that’ll free us up. But the nature of these kinds of systems, these complex distributed systems, you can’t slice and dice them. That’s not how they work. And so that’s not how we work in those systems with machines, but we design our tools and our systems and our automation still from that fundamental belief.
That’s where this myth comes from and these unintended consequences. Some of the research we came across is that adding automation into these systems actually changes the nature of human work. This is really the key one. It’s not that it replaces work and we’re freed up to go off and do all of these other things, but it actually changes the nature of the work that we have to do.
And on top of that, it makes it harder for us to impact a system when it’s not doing what it’s supposed to be doing, an automated system, because we don’t actually have access to the internal machination of what’s happening. And so you could apply this logic to AI, but you could back this logic all the way up to just what is your CI/CD doing? Or when you have auto-scaling across a fleet of Kubernetes pods and it’s not doing what you think it’s doing, you don’t actually have access to what it was doing or should have been doing or why it’s now doing what it’s doing.
It actually makes the work that humans have to do harder and it changes the nature of the work that they’re doing to interact with these systems. And then just recently some really modern research from Microsoft Research in Cambridge and Carnegie Mellon actually looked at this with AI, and how having AI in a system can degrade people’s critical thinking skills and abilities, depending on how much people trust it.
There’s some really nice modern research that I can also add too. Some of the stuff people are like, “Oh, it came out in 1983”, and I’m like, “Yes, but it’s still actually right”. Which is what’s crazy. We see these unintended consequences in software systems just constantly. I went in to The VOID report and really just read as many as I could that looked like they had some form of automation in them. We looked for things that included self-healing or auto-scaling or auto config. There’s a lot of different things we looked for, but we found a lot of these unintended consequences where software automation either caused problems and then humans had to step in to figure that out.
The other thing, the other unintended consequence is that sometimes automation makes it even harder to solve a problem than it would’ve been were it not involved in the system. I feel like the Facebook one is one of the more well-known versions of that, where they literally couldn’t get into their own data center. Amazon in 2021 had one like that as well for AWS, where they had a resource exhaustion situation that then wouldn’t allow them to actually access the logs to figure out what was going on.
The myth comes from this separation of human and computer duties. And then the kinds of unintended consequences we see are humans having to step into an environment that they’re not familiar with to try to fix something that they don’t understand why or how it’s going wrong yet. And then sometimes that thing actually makes it harder to even do their job, all of which are the same phenomenon we saw in research in those other domains. It’s just now we’re actually being able to see it in our own software systems. That’s the very long-winded answer to your question.
Shane Hastie: If I think of our audience, the technical practitioners who are building these tools, building these automation products, what does this mean to them?
The impact on engineering [10:16]
Courtney Nash: This is a group I really like to talk to. I like to talk to the people who are building the tools, and then I like to talk to the people who think those tools are going to solve all their problems, not always the same people. A lot of people who are building these are building it for their own teams, they’re cobbling together monitoring solutions and other things and trying to make them work. It’s not even that they necessarily have some vendor product, although that is certainly increasingly a thing in this space. I was just talking to someone else about this. We have armies of user experience researchers out there, people whose job is to make sure that the consumer end of the things that these companies build work for them and are intuitive and do what they want. And we don’t really do that for our internal tools or for our developer tools.
And it is a unique skill set, I would say, to be able to do that. And a lot of the time, as I learned recently on another podcast, it tends to fall on the shoulders of staff engineers. Who’s making sure the internal tooling works? You may be so lucky as to have a platform team or something like that. But in particular, the more people can be aware of that myth, the HABA-MABA Fitts list, the better. I had this belief myself about automating things and handing work to computers. And just to preface this, I’m not anti-automation. I’m not saying don’t do it, it’s terrible, we should just go back to rocks and sticks. I’m a big fan of it in a lot of ways, but I’m a fan of it when the designers of it understand the potential for some of those unintended consequences.
And instead of thinking about replacing work that humans might do, think about augmenting that work. How do we make it easier for us to do these kinds of jobs? That might be writing code, that might be deploying it, that might be tackling incidents when they come up. The fancy, nerdy academic jargon for this is joint cognitive systems. Think in those terms instead of replacement, or functional allocation, another good nerdy academic term, where we give the machines these pieces and we give the humans those pieces.
How do we have a joint system where that automation is really supporting the work of the humans in this complex system? And in particular, how do you allow them to troubleshoot that, to introspect that, to actually understand and to have even maybe the very nerdy versions of this research lay out possible ways of thinking about what can these computers do to help us? How can we help them help us? What does that joint cognitive system really look like?
And the bottom line answer is it’s more work for the designers of the automation, and that’s not always something you have the time or the luxury for. But if you can step out of the box of I’m just going to replace work you do, knowing that’s not really how it works, to how can these tools augment what our people are doing? That’s what I think is important for those people.
And the next question people always ask me is, “Cool, who’s doing it?” And my answer up until recently was, “Nobody”. Record scratch. I wish. However, I have seen some work from Honeycomb, which is an observability tooling vendor, that is very much along these lines. I’m not paid by Honeycomb, I’m not employed by or on staff at Honeycomb. This is me as an independent third party finally seeing this in the wild. And I don’t know what that’s going to look like. I don’t know how that’s going to play out, but I’m watching a company that makes tooling for engineers think about this and think about how do we do this? That gives me hope, and I hope it also shows other people that Courtney is not just spouting off all this academic nonsense, but that it’s possible. It’s just definitely a very different way of approaching especially developer or SRE types of tooling.
Shane Hastie: My mind went to observability when you were describing that.
Courtney Nash: Yes.
Shane Hastie: What does it look like in practice? If I am one of those SREs in the organization, what do I do given an incident’s likely to happen, something’s going to go wrong? Is it just add in more logs and observability or what is it?
Practical application [14:40]
Courtney Nash: Yes and no. I think of course it’s always very annoyingly bespoke and contextually specific to a given organization and a given incident. But this is why the learning from incidents community is so entwined with all of this because if instead of looking for just technical action item fixes out of your incidents, you’re looking at what did we learn about why people made the decisions they made at the time. Another nerdy research concept called local rationality, but if you go back and look at these incidents from the perspective of trying to learn from the incident, not just about what technically happened, but what happened socio-technically with your teams, were there pressures from other parts of the organization?
All of these things, I would say SREs investing in learning from incidents are going to figure out A, how to better support those people when things go wrong. It’s like, what couldn’t we get access to or what information didn’t we have at the time? What made it harder to solve this problem? But also, what did people do when that happened that made things work better? And did they work around tools? What was that? What didn’t they know? What couldn’t they know that could our tooling tell them, perhaps?
And so that’s why I think you see so many learning from incident people and so many resilience engineering people all talking around this topic because I can’t just come to you and say, “You should do X”, because I have no idea how your team’s structured, what the economic and temporal pressures are on that team. The local context is so important and the people who build those systems and the people who then have to manage them when they go wrong are going to be able to figure out what the systemic things going on are, and especially if it’s lack of access to what X, Y, or Z was doing. Going back, looking at what made it hard for people and also what natural adaptations they themselves took on to make it work or to solve the problem.
And again, it’s like product management and it’s like user experience. You’re not going to just silver bullet this problem. You’re going to be fine-tuning and figuring out what it is that can give you that either control or visibility or what have you. There is no product out there that does that for you. Sorry, product people. That’s the reason investing in learning from their incidents is going to help them the most I would biasedly offer.
Shane Hastie: We’re talking in the realm of socio-technical systems. Where does the socio come in? What are the human elements here?
The human aspects [17:14]
Courtney Nash: Well, we built these systems. Let’s just start with that. And the same premise of designing automation, we design all kinds of things for all kinds of outcomes and aren’t prepared for all of the unexpected outcomes. I think that the human element, for me, in this particular context, software is built by people, software is maintained by people. The through line from all of this other research I’ve brought up is that if you want to have a resilient or a reliable organization, the people are the source of that. You can’t engineer five nines, you can’t slap reliability on stuff. It is people who make our systems work on the day-to-day basis. And we are, I would argue, actively as an industry working against that truth right now.
For me, there’s a lot of socio in complex systems, but that’s the nut of it. The real crux of the situation is that we are largely either unaware of or unwilling to look closely at how important people are to keeping things running and building and moving. If you take these ironies or unexpected consequences of automation and scale them up in the way that we are currently contemplating with AI, we have a real problem with, I believe, the maintainability, the reliability, the resilience of our systems.
And it won’t be apparent immediately. It won’t be, oh shoot, that was bad, we’ll just roll that back. That’s not the case. And I’m seeing this talking to people about interviewing junior engineers. There is a base of knowledge that humans have that is built up from direct contact with these systems, and automated systems can’t have that yet. It certainly doesn’t exist in the world we live in, despite all the hype we might be told. I am most worried about the erosion of expertise in these complex systems. For me, that’s the most important part of the socio part of the socio-technical system, other than how we treat people. And those are also related, I’d argue.
Shane Hastie: If I’m a technical leader in an organization, what do I do? How do I make sure we don’t fall into that trap?
Listen to your people [19:36]
Courtney Nash: Listen to your people. You’re going to have an immense amount of pressure to bring AI into your systems. Some of it is very real and warranted and you’re not going to be able to ignore it. You’re not going to be able to put a lid on it and set it aside. Faced with probably a lot of pressure to bring AI and bring more automation, those types of things, I think the most important thing for leaders to do is listen to the people who are using those tools, who are being asked to bring those into their work and their workflow. Also find the people who seem to be wizards at it already. Why are some people really good at this? And tap into that. Try to figure out where those sources of expertise and knowledge with these new ways of doing are coming from.
And again, I ask people all the time: if you have a product company, let’s say you work at a company that produces something. Maybe you work for a big distributed systems company, but it still ships a product, like Netflix or Apple or whatever. Do you A/B test stuff before you release it? Why don’t you do that with new stuff on your engineering side? Think about how much planning and effort goes into a migration or moving from one technology to another.
We could go monolith to microservices, we could go pick your digital transformation. How long did that take you? And how much care did you put into that? Maybe some of it was too long or too bureaucratic or what have you, but I would argue that we tend to YOLO internal developer technology way faster and way looser than we do with the things that actually make us money, or at least the things that are perceived to make us money.
And the more that leaders of technical teams can listen to their people, the better. Roll things out in a way that lets you decide what success looks like. Integrating AI tools into your team, for example, what does that look like? Could you lay down some ground rules for what that looks like? And if you’re not doing that in two months or three months or four months, what do your people think you should be doing? I feel like it’s the same age-old argument about developer experience, but I think the stakes are a little higher because we’re rushing so fast into this.
Technical leaders, listen to your people, use the same tactics you use for rolling out lots of high stakes, high consequences things, and don’t just hope it works. Have some ground rules for what that should look like and be willing to reevaluate that and rethink how you should approach it. But I’m not a technical leader, so they might balk at that advice. And I understand that.
Shane Hastie: If I can swing back to The VOID, to this repository that you’ve built up over years. You identified some of the unintended consequences of automation as something that’s coming up. Are there other trends that you can see or point us towards that you’ve seen in that data?
Trends from the VOID data [22:31]
Courtney Nash: Some of the earliest work I did was really trying to myth-bust some things that I thought I had always had a hunch were not helping us and were hurting us as an industry, but I didn’t have the data for it. The canonical one is MTTR. I wouldn’t call this a trend, except in that everybody’s doing it. But using the data we have in The VOID to show that things like duration or severity of incidents are extremely volatile, not terribly statistically reliable. And so trying to help give teams ammunition against these ideas that I think are actually harmful, they can actually have pretty gnarly consequences in terms of the way that metrics are assigned to team performance, incentivization of really weird behaviors and things that I think just on the whole aren’t helping people manage very complex high stakes environments.
I’ve long thought that MTTR was problematic, but once I got my hands on the data, and I have a strong background in statistics, I was able to demonstrate that it’s not really a very useful metric. It’s still widely used in the industry, though. I would say it’s an uphill battle that I have definitely not, I don’t even want to say won, because I don’t see it that way, but I do believe that we have some really unique data to counteract a lot of these common beliefs, like the fact that severity actually is not correlated with duration.
There’s a lot of argument on teams about how we should assign severity, what severity needs to be. And again, there are these Goodhart’s law things, where the second you make it a metric, it becomes a target, and then all these perverse behaviors come out of that. Those are some of the past things that we’ve done.
I would say the one trend that I haven’t chased yet, or that I don’t have the data for in any way yet is I really do think that companies that invest in learning from their incidents have some form of a competitive advantage.
Again, this is a huge hunch. It’s a lot like, I think, where Dr. Nicole Forsgren was in the early days of DevOps and the DORA stuff, where they were like, we have these theories about organizational performance and developer efficiency and performance, and they collected a huge amount of data over time towards those theories. I really do believe that there is a competitive advantage to organizations that invest in learning from their incidents because it gets at all these things that we’ve been talking about. But like I said, if you want to talk trends, I think that’s one, but I don’t have the data for it yet.
Shane Hastie: You’re telling me a lot of really powerful interesting stuff here. If people want to continue the conversation, where do they find you?
Courtney Nash: Thevoid.community, which is quite possibly the weirdest URL, but domain names are hard these days. That is the easiest way to find all of my past research. There are links to a podcast and a newsletter there. I’m also on all the social things, obviously, and speaking at a few events this year. And just generally that’s the best spot. I post a lot on LinkedIn, I will say, and I’m surprised by that. I didn’t use to be much of a LinkedIn person, but I’ve actually found that the community discussing these topics there is very lively. If you’re looking for any current commentary, I would actually say that, strangely, I can’t believe I’m saying this, but The VOID on LinkedIn is probably the best place to find us.
Shane Hastie: You also mentioned, when we were talking earlier, an online community for resilience engineering. Tell us a little bit about that.
Courtney Nash: There’ve been a few fits and starts to try to make this happen within the tech industry. There is a Resilience Engineering Association. Again, the notion of resilience engineering long precedes us as technology and software folks. That organization exists, but recently a group of folks have put together a Resilience in Software Foundation and there’s a Slack group that’s associated with that.
There’s a few things that are emerging specific to our industry, which I really appreciate because sometimes it is really hard to go read all this other wonky research and then you’ve asked these questions even just today in this podcast, okay, but me as an SRE manager, what does that mean for me? There’s definitely some community starting to build around that and resilience in software, which The VOID has been involved with as well. And I think it’s going to be a great resource for the tech community.
Shane Hastie: Thank you so much.

MMS • Pierre Pureur, Kurt Bittner
Article originally posted on InfoQ. Visit InfoQ

Key Takeaways
- Selling yourself and your stakeholders on doing architectural experiments is hard, despite the significant benefits of this approach; you like to think that your decisions are good but when it comes to architecture, you don’t know what you don’t know.
- Stakeholders don’t like to spend money on things they see as superfluous, and they usually see running experiments as simply “playing around”. You have to show them that experimentation saves money in the long run by making better-informed decisions.
- These better decisions also reduce the overall amount of work you need to do by reducing costly rework.
- You may think that you are already experimenting by doing Proofs of Concept (POCs). Architectural experiments and POCs have different purposes. A POC helps validate that a business opportunity is worth pursuing, while an architectural experiment tests some parts of the solution to validate that it will support business goals.
- Sometimes, architectural experiments need to be run in the customer’s environment because there is no way to simulate real-world conditions. This sounds frightening, but techniques can be used to roll back the experiments quickly if they start to go badly.
As we stated in a previous article, being wrong is sometimes inevitable in software architecting; if you are never wrong, you are not challenging yourself enough, and you are not learning. The essential thing is to test our decisions as much as possible with experiments that challenge our assumptions and to construct the system in such a way that when our decisions are incorrect the system does not fail catastrophically.
Architectural experimentation sounds like a great idea, yet it does not seem to be used very frequently. In this article, we will explore some of the reasons why teams don’t use this powerful tool more often, and what they can do about leveraging that tool for successful outcomes.
First, selling architectural experimentation to yourself is hard
After all, you probably already feel that you don’t have enough time to do the work you need to do, so how are you going to find time to run experiments?
You need to experiment for a simple reason: you don’t know what the solution needs to be because you don’t know what you don’t know. This is an uncomfortable feeling that no one really wants to talk about. Bringing these issues into the open stimulates healthy discussions that shape the architecture, but before you can have them you need data.
One of the forces to overcome in these discussions is confirmation bias, or the belief that you already know what the solution is. Experimentation helps you to challenge your assumptions to reach a better solution. The problem is, as the saying goes, “the truth will set you free, but first it will make you miserable”. Examples of this include:
- Experimentation may expose that solutions that have worked for you in the past may not work for the system you are working on now.
- It may expose you to the fact that some “enterprise standards” won’t work for your problem, forcing you to explain why you aren’t using them.
- It may expose that some assertions made by “experts” or important stakeholders are not true.
Let’s consider a typical situation: you have made a commitment to deliver an MVP, although the scope is usually at least a little “flexible” or “elastic”; the scope is always a compromise. But the scope is also, usually, optimistic, and you rarely have the resources to confidently achieve it. From an architectural perspective you have to make decisions, but you don’t have enough information to be completely confident in them; you are making a lot of assumptions.
You could, and usually do, hope that your architectural decisions are correct and simply focus on delivering the MVP. If you are wrong, the failure could be catastrophic. If you are willing to take this risk you may want to keep your resumé updated.
Your alternative is to take out an “insurance policy” of sorts by running experiments that will tell you whether your decisions are correct without resorting to catastrophic failure. Like an insurance policy, you will spend a small amount to protect yourself, but you will prevent a much greater loss.
Next, selling stakeholders on architectural experimentation is a challenge
As we mentioned in an earlier article, getting stakeholder buy-in for architectural decisions is important – they control the money, and if they think you’re not spending it wisely they’ll cut you off. Stakeholders are, typically, averse to having you do work they don’t think has value, so you have to sell them on why you are spending time running architectural experiments.
Architectural experimentation is important for two reasons: For functional requirements, MVPs are essential to confirm that you understand what customers really need. Architectural experiments do the same for technical decisions that support the MVP; they confirm that you understand how to satisfy the quality attribute requirements for the MVP.
Architectural experiments are also important because they help to reduce the cost of the system over time. This has two parts: you will reduce the cost of developing the system by finding better solutions, earlier, and by not going down technology paths that won’t yield the results you want. Experimentation also pays for itself by reducing the cost of maintaining the system over time by finding more robust solutions.
Ultimately running experiments is about saving money – reducing the cost of development by spending less on developing solutions that won’t work or that will cost too much to support. You can’t run experiments on every architectural decision and eliminate the cost of all unexpected changes, but you can run experiments to reduce the risk of being wrong about the most critical decisions. While stakeholders may not understand the technical aspects of your experiments, they can understand the monetary value.
Of course running experiments is not free – they take time and money away from developing things that stakeholders want. But, like an insurance policy whose premiums cost relatively little yet protect you from much greater losses, experiments protect you from the effects of costly mistakes.
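To make the insurance analogy concrete, here is a minimal back-of-the-envelope sketch in Python. Every number in it is invented purely for illustration; substitute your own estimates of experiment cost, the probability that an untested decision is wrong, and the cost of the resulting rework.

```python
# Hypothetical expected-cost comparison for running an architectural experiment.
# All figures are illustrative placeholders, not data from any real project.

experiment_cost = 15_000                 # engineer-time to design and run the experiment
p_decision_wrong = 0.30                  # estimated chance the untested decision is wrong
rework_cost_if_wrong = 250_000           # cost of late rework or an outage if it is wrong
residual_risk_after_experiment = 0.05    # chance the experiment still misses the problem

expected_cost_without = p_decision_wrong * rework_cost_if_wrong
expected_cost_with = experiment_cost + residual_risk_after_experiment * rework_cost_if_wrong

print(f"Expected cost without experiment: {expected_cost_without:,.0f}")
print(f"Expected cost with experiment:    {expected_cost_with:,.0f}")
# With these placeholder numbers, the experiment "premium" is far cheaper
# than the expected loss it insures against.
```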
Selling them on the need to do experiments can be especially challenging because it raises questions, in their minds anyway, about whether you know what you are doing. Aren’t you supposed to have all the answers already?
The reality is that you don’t know everything you would like to know; developing software is a field that requires lifelong learning: technology is always changing, creating new opportunities and new trade-offs in solutions. Even when technology is relatively static, the problems you are trying to solve, and therefore their solutions, are always changing as well. No one can know everything and so experimentation is essential. As a result, the value of knowledge and experience is not in knowing everything up-front but in being able to ask the right questions.
You also never have enough time or money to run architectural experiments
Every software development effort we have ever been involved in has struggled to find the time and money to deliver the full scope of the initiative, as envisioned by stakeholders. Assuming this is true for you and your teams, how can you possibly add experimentation to the mix?
The short answer is that not everything the stakeholders “want” is useful or necessary. The challenge is to find out what is useful and necessary before you spend time developing it. Investing in requirements reviews turns out not to be very useful; in many cases, the requirement sounds like a good idea until the stakeholders or customers actually see it.
This is where MVPs can help improve architectural decisions by identifying functionality that doesn’t need to be supported by the architecture, which doubly reduces work. Using MVPs to figure out work that doesn’t need to be done makes room to run experiments about both value and architecture. Identifying scope and architectural work that isn’t necessary “pays” for the experiments that help to identify the work that isn’t needed.
For example, some MVP experiments will reveal that a “must do” requirement isn’t really needed, and some architectural experiments will reveal that a complex and costly solution can be replaced with something much simpler to develop and support. Architectural decisions related to that work are also eliminated.
The same is true for architectural experiments: they may reveal that a complex solution isn’t needed because a simpler one exists, or perhaps that an anticipated problem will never occur. Those experiments reduce the work needed to deliver the solution.
Experiments sometimes reveal unanticipated scope when they uncover a new customer need, or that an anticipated architectural solution needs more work. On the whole, however, we have found that reductions in scope identified by experiments outweigh the time and money increases.
At the start of the development work, of course, you won’t have any experiments to inform your decisions. You’re going to have to take it on faith that experimentation will identify extra work to pay for those first experiments; after that, the supporting evidence will be clear.
Then you think you’re already running architectural experiments, but you’re not
You may be running POCs and believe that you are running architectural experiments. POCs can be useful but they are not the same as architectural experiments or even MVPs. In our experience, POCs are, at best, interesting demonstrations of an idea, but they lack the rigor needed to test a hypothesis. MVPs and architectural experiments are intensely focused on what they are testing and how.
Some people may feel that because they run integration, system, regression, or load tests, they are running architectural experiments. Testing is important, but it comes too late to avoid over-investing based on potentially incorrect decisions. Testing usually only occurs once the solution is built, whereas experimentation occurs early to inform decisions about whether the team should continue down a particular path. In addition, testing verifies the characteristics of a system, but it is not designed to explicitly test hypotheses, which is a fundamental aspect of experimentation.
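To make the distinction tangible, here is a minimal sketch of what stating and checking an explicit hypothesis might look like. The run_load_test helper, the simulated latency distribution, and the 250 ms threshold are all hypothetical stand-ins; the point is only that the pass/fail criterion is written down before the experiment runs.

```python
import random
import statistics

# Hypothetical stand-in for a real load test against a candidate architecture.
def run_load_test(requests: int = 1_000) -> list[float]:
    # A real experiment would drive traffic at a prototype and record observed
    # latencies; here we simply simulate plausible samples in milliseconds.
    return [random.lognormvariate(4.0, 0.5) for _ in range(requests)]

# Hypothesis (stated up front): "the proposed design keeps p99 latency under
# 250 ms at this load". The threshold is the pre-agreed pass/fail criterion.
P99_THRESHOLD_MS = 250.0

latencies_ms = run_load_test()
p99 = statistics.quantiles(latencies_ms, n=100)[98]  # 99th percentile

if p99 <= P99_THRESHOLD_MS:
    print(f"Hypothesis supported: p99 = {p99:.1f} ms, continue down this path")
else:
    print(f"Hypothesis rejected: p99 = {p99:.1f} ms, revisit the decision")
```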
Finally, you can’t get the feedback you need without exposing customers to the experiments
Some conditions under which you need to evaluate your decisions can’t be simulated; only real-world conditions will expose potentially flawed assumptions. In these cases, you will need to run experiments directly with customers.
This sounds scary, and it can be, but your alternative is to make a decision and hope for the best. In this case, you are still exposing the customer to a potentially severe risk, but without the careful controls of an experiment. In some sense, people do this all the time without knowing it, when they assume that their decisions are correct without testing them, but the consequences can be catastrophic.
Experimentation allows us to be explicit about what hypothesis we are evaluating with our experiment and limits the impact of the experiment by focusing on specific evaluation criteria. Explicit experimentation helps us to devise ways to quickly abort the experiment if it starts to fail. For this, we may use techniques that support reliable, fast releases, with the ability to roll back, or techniques like A/B testing.
As an example, consider the case where you want to evaluate whether an LLM-based chatbot can reduce the cost of staffing a call center. As an experiment, you could deploy the chatbot to a subset of your customers to see if it can correctly answer their questions. If it does, call center volume should go down, but you should also evaluate customer satisfaction to make sure that they are not simply giving up in frustration and going to a competitor with better support. If the chatbot is not effective, it can be easily turned off while you evaluate your next decision.
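As a rough sketch of the controls such an experiment might use, the snippet below deterministically routes a small fraction of customers to the chatbot and rolls the experiment back if a guardrail metric drops below a floor. The rollout percentage, satisfaction floor, and flag names are invented for illustration; a real deployment would use your own feature-flag and metrics infrastructure.

```python
import hashlib

# Hypothetical experiment configuration: start small and define abort criteria up front.
CHATBOT_ROLLOUT_PERCENT = 5   # fraction of customers in the experiment arm
CSAT_FLOOR = 4.0              # abort if satisfaction falls below this (1-5 scale)
chatbot_enabled = True        # global kill switch

def in_experiment(customer_id: str) -> bool:
    """Deterministically bucket customers so the same person always gets the same arm."""
    bucket = int(hashlib.sha256(customer_id.encode()).hexdigest(), 16) % 100
    return chatbot_enabled and bucket < CHATBOT_ROLLOUT_PERCENT

def evaluate_guardrails(csat_experiment: float, csat_control: float) -> bool:
    """Return True if the experiment may continue, False if it should be rolled back."""
    return csat_experiment >= CSAT_FLOOR and csat_experiment >= csat_control - 0.2

# Example: nightly check against (hypothetical) measured satisfaction scores.
if not evaluate_guardrails(csat_experiment=3.6, csat_control=4.3):
    chatbot_enabled = False  # roll back: all traffic returns to human agents
```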
Conclusion
In a perfect world, we wouldn’t need to experiment; we would have perfect information and all of our decisions would be correct. Unfortunately, that isn’t reality.
Experiments are paid for by reducing the cost, in money and time, of undoing bad decisions. They are an insurance policy that costs a little up-front but reduces the cost of the unforeseeable. In software architecture, the unforeseeable is usually related to unexpected behavior in a system, either because of unexpected customer behavior, including loads or volumes of transactions, or because of interactions between different parts of the system.
Using architectural experimentation isn’t easy despite some very significant benefits. You need to sell yourself first on the idea, then sell it to your stakeholders, and neither of these is an easy sell. Running architectural experiments requires time and probably money, and both of these are usually in short supply when attempting to deliver an MVP. But in the end, experimentation leads to better outcomes overall: lower-cost systems that are more resilient and sustainable.
Microsoft Launches Visual Studio 2022 v17.13 with AI-Powered Enhancements and Improved Debugging

MMS • Robert Krzaczynski
Article originally posted on InfoQ. Visit InfoQ

Microsoft has released Visual Studio 2022 v17.13, introducing significant improvements in AI-assisted development, debugging, productivity, and cloud integration. This update focuses on refining workflows, enhancing code management, and improving the overall developer experience.
One of the features in this release is GitHub Copilot Free, which provides 2,000 code completions and 50 chat requests per month at no cost. Copilot has also been improved with AI-powered feature search, enhanced multi-file editing, and shortcut expansions, making it easier to navigate and optimize code. These AI-powered improvements are already receiving positive feedback from developers. Hugo Augusto, an IT consultant, commented:
Adding AI directly inside VS is the biggest addition Microsoft has made in a while. I’m surprised every day at how good the suggestions are and how it understands the context of the source to provide those suggestions.
Another user shared their experience with GitHub Copilot Free, emphasizing how much it has improved their workflow:
I have been playing around with GitHub Copilot Free, and I have to say, it’s been a game-changer for my workflow. The advanced debugging features in Visual Studio 2022 v17.13 are also nice.
Alongside AI improvements, Visual Studio 2022 v17.13 introduces new productivity features. Developers can now set default file encoding, use a more accessible horizontal scrollbar, and quickly navigate recent files in Code Search. There is also an option to indent wrapped lines for better readability.
Debugging and diagnostics have also seen major enhancements. AI-generated thread summaries in Parallel Stacks simplify debugging complex applications, while the profiler now unifies async stacks for .NET profiling and introduces color-coded CPU swim lanes for easier performance analysis. IEnumerable Visualizer has been updated with syntax highlighting and Copilot-powered inline chat, making LINQ query debugging more efficient.
For Git users, this version allows developers to add comments directly on pull requests from within Visual Studio. Additionally, AI-powered commit suggestions help catch potential issues early, ensuring higher code quality before merging.
Furthermore, web and cloud developers can now integrate .NET Aspire with Azure Functions for easier serverless application development. Docker Compose introduces scaling support, offering more control over containerized environments. In addition, front-end developers can extract HTML into Razor components, improving code structure and maintainability.
Moreover, database developers using SQL projects can now take advantage of SDK-style project support in SSDT, improving debugging and schema comparison. Visual Studio also preserves font preferences across themes, ensuring a consistent interface.
More information about the features can be found in the release notes.

MMS • RSS
Posted on mongodb google news. Visit mongodb google news

Summary
MongoDB Inc (MDB, Financial), a leading developer data platform company, announced on February 27, 2025, that its CEO, Dev Ittycheria, and Interim CFO, Serge Tanjga, will present at the Morgan Stanley Technology, Media & Telecom Conference in San Francisco, CA. The presentation is scheduled for March 6, 2025, at 11:30 a.m. Pacific Time. A live webcast will be available on MongoDB’s investor relations website, with a replay accessible for a limited time.
Positive Aspects
- MongoDB’s participation in a prestigious conference highlights its industry relevance and leadership.
- The live webcast and replay availability provide accessibility to a broader audience, enhancing investor engagement.
- MongoDB’s mission to empower innovators aligns with current market trends towards digital transformation.
Negative Aspects
- The press release does not provide specific details on the topics to be covered in the presentation.
- There is no mention of new product launches or strategic partnerships, which could have added more excitement.
Financial Analyst Perspective
From a financial analyst’s viewpoint, MongoDB’s participation in the Morgan Stanley conference is a strategic move to showcase its growth potential and innovation capabilities to investors and industry peers. The company’s focus on empowering developers and its extensive customer base in over 100 countries are strong indicators of its market position. However, the lack of detailed financial updates or new strategic initiatives in the press release may leave some investors seeking more concrete information on future growth prospects.
Market Research Analyst Perspective
As a market research analyst, MongoDB’s engagement in the conference underscores its commitment to maintaining a strong presence in the technology sector. The company’s emphasis on a unified developer data platform positions it well in the competitive landscape of modern application development. The absence of specific presentation topics, however, suggests that MongoDB may be focusing on reinforcing its existing strategies rather than introducing groundbreaking changes at this event.
FAQ
Q: When and where is MongoDB presenting?
A: MongoDB is presenting at the Morgan Stanley Technology, Media & Telecom Conference in San Francisco, CA, on March 6, 2025, at 11:30 a.m. Pacific Time.
Q: Who will be presenting from MongoDB?
A: CEO Dev Ittycheria and Interim CFO Serge Tanjga will be presenting.
Q: How can I access the presentation?
A: A live webcast will be available on MongoDB’s investor relations website, with a replay accessible for a limited time.
Q: What is MongoDB’s mission?
A: MongoDB’s mission is to empower innovators to create, transform, and disrupt industries by unleashing the power of software and data.
Read the original press release here.
This article, generated by GuruFocus, is designed to provide general insights and is not tailored financial advice. Our commentary is rooted in historical data and analyst projections, utilizing an impartial methodology, and is not intended to serve as specific investment guidance. It does not formulate a recommendation to purchase or divest any stock and does not consider individual investment objectives or financial circumstances. Our objective is to deliver long-term, fundamental data-driven analysis. Be aware that our analysis might not incorporate the most recent, price-sensitive company announcements or qualitative information. GuruFocus holds no position in the stocks mentioned herein.
Article originally posted on mongodb google news. Visit mongodb google news

MMS • RSS
Posted on mongodb google news. Visit mongodb google news
By Gloria Methri
Lombard Odier, a global Swiss private bank, has partnered with MongoDB to modernise its banking technology systems further. As part of the collaboration, MongoDB has accelerated the modernisation of Lombard Odier’s systems and applications with generative AI, reducing technical complexity and accelerating the bank’s innovation journey.
The generative AI-assisted modernisation initiative has enabled Lombard Odier to:
- Migrate code 50 to 60 times quicker than previous migrations
- Move applications from legacy relational databases to MongoDB twenty times faster, leveraging generative AI
- Automate repetitive tasks with AI tooling to accelerate the pace of innovation, reducing project times from days to hours.
Delivering seamless digital experiences to private and institutional customers while driving cost efficiencies is a major challenge across the banking industry. With digitisation accelerating and AI afoot, Lombard Odier is evolving its systems and integrating new technologies to give its clients better service and experience.
The bank’s GX Program—a seven-year initiative designed to modernise Lombard Odier’s banking application architecture to respond to market developments—launched in 2020 with the goal of enabling quicker innovation, reducing potential service disruption, and improving customer experiences.
Building on its 10-year relationship with MongoDB, Lombard Odier chose MongoDB as the data platform for its transformation initiative. The bank initially decided to develop its portfolio management system (PMS) on MongoDB. The bank’s largest application, with thousands of users, PMS manages shares, bonds, exchange-traded funds, and other financial instruments. MongoDB’s ability to scale was key to this system migration, as this system is used to monitor investments, make investment decisions, and generate portfolio statements. It is also the engine that runs Lombard Odier’s online banking application “MyLO,” which is used by the bank’s customers.
The bank engaged with MongoDB to co-build a Modernisation Factory—a service that helps customers eliminate barriers like time, cost, and risk frequently associated with legacy applications and eliminate technical debt that has accumulated over time—to expedite a secure and efficient modernisation. MongoDB’s Modernisation Factory team worked with Lombard Odier to create customisable generative AI tooling, including scripts and prompts tailored for the bank’s unique tech stack. This accelerated the modernisation process by automating integration testing and code generation for seamless deployment.
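To give a sense of what this kind of relational-to-document migration involves, here is a deliberately simplified sketch using a hypothetical positions table and the standard PyMongo driver. It is not Lombard Odier's schema or MongoDB's generative AI tooling, only the general shape of the code such tooling helps produce.

```python
import sqlite3
from pymongo import MongoClient

# Hypothetical legacy relational source: one flat row per portfolio position.
legacy = sqlite3.connect("legacy_pms.db")
rows = legacy.execute(
    "SELECT portfolio_id, instrument, instrument_type, quantity, currency FROM positions"
).fetchall()

# Reshape rows into one document per portfolio, the typical document-model step.
portfolios: dict[str, dict] = {}
for portfolio_id, instrument, instrument_type, quantity, currency in rows:
    doc = portfolios.setdefault(portfolio_id, {"_id": portfolio_id, "positions": []})
    doc["positions"].append({
        "instrument": instrument,
        "type": instrument_type,
        "quantity": quantity,
        "currency": currency,
    })

# Load the documents into MongoDB.
client = MongoClient("mongodb://localhost:27017")
client["pms"]["portfolios"].insert_many(list(portfolios.values()))
```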
“To enhance Lombard Odier’s business strategy, we developed a technology platform that draws on the latest technological innovations to facilitate employees’ day-to-day work, and provide clients with individualised investment perspectives,” said Geoffroy De Ridder, Head of Technology and Operations at Lombard Odier. “We chose MongoDB because it offers us a cloud-agnostic database platform and an AI modernisation approach, which helps to automate time-consuming tasks, accelerate the upgrade of existing applications, and migrate them at a faster rate than ever before. Having up to date technology has made a big impact on our employees and customers while proving to be fast, cost-effective, and reducing maintenance overheads.”
Article originally posted on mongodb google news. Visit mongodb google news

MMS • RSS
Posted on mongodb google news. Visit mongodb google news

-
Optimizing Business Decisions with AI: Evolutionary Algorithms in Actions
Date: Saturday, March 1, 11:00 AM to 1:00 PM
Place: CIE at IIIT Hyderabad
-
Global AI Bootcamp 2025 Hyderabad – 1 March 2025
Date: Saturday, March 1, 9:00 AM to 2:00 PM
Place: Building 3, Microsoft office, Microsoft Campus, ISB Rd, Gachibowli
-
DSA Roadmap: for Software Engineering in 2024
Date: Saturday, March 1, 11:00 AM to 5:30 PM
Place: CIE at IIIT Hyderabad
-
Grafana & Friends Hyderabad X CloudBuilders March Meetup
Date: Saturday, March 8, 10:00 AM to 1:00 PM IST
Place: CloudBuilders, Hyderabad
-
Date: Saturday, March 8, 9:00 AM to 2:00 PM
Place: Building 3, Microsoft office, Microsoft Campus, ISB Rd, Gachibowli
Future Events/Conferences:
-
Grafana & Friends Hyderabad X Cloud Builders March Meetup
Date: Saturday, March 8, 10:00 AM to 1:00 PM
Place: Cloud Builders, IndiQube Peral 1st Floor, Beside Rolling Hills and Ramky Towers, Mindspace Rd, P Janardhan Reddy Nagar
-
MongoDB x TensorFlow: Supercharging AI with MongoDB: RAG, Search & Security
Date: Saturday, March 8, 10:00 AM to 02:30 PM
Place: EPAM-SALARPURIA SATTVA KNOWLEDGE CITY, Hyderabad
-
Date: Saturday, March 15, 9:30 AM to 4:00 PM
Place: CIE at IIIT Hyderabad
Experience Hyderabad’s vibrant tech community in person. See you there! Follow me on bluesky @ksridhar02
Article originally posted on mongodb google news. Visit mongodb google news

MMS • RSS
Posted on nosqlgooglealerts. Visit nosqlgooglealerts

IBM has announced its intent to acquire DataStax, a leading data platform provider. This strategic acquisition significantly boosts IBM’s AI data platform by integrating advanced vector capabilities critical for powering retrieval-augmented generation (RAG) applications. It positions IBM to help businesses leverage value from vast volumes of unstructured data, an area where IBM lacks a strong foothold. DataStax brings expertise to IBM in distributed databases capable of spanning multiple regions, an essential capability for enabling seamless global AI and data fabric deployments. Also, this acquisition strengthens IBM’s commitment to advancing open-source initiatives with DataStax’s support for the Apache Cassandra database and Langflow, a low-code tool for AI development.
What It Means
IBM has made numerous acquisitions over the years, but this one stands out as one of the most strategic moves to enhance its data platform, primarily focusing on AI. While IBM has previously acquired database companies, integrating them into its stack has often been slow. The success of this acquisition will hinge on how quickly and seamlessly it integrates with IBM’s watsonx AI platform. This acquisition positions IBM to better compete in the AI space in several key ways by adding:
- Enhanced support for unstructured data management at scale. While IBM supports unstructured data management with its Db2 offering, it has historically lagged in providing comprehensive and scalable solutions. This acquisition addresses that gap, enabling IBM to offer a more robust suite of AI data capabilities. Apache Cassandra, a distributed wide-column NoSQL database, is designed to handle massive volumes of semistructured data at scale, empowering IBM to deliver a more robust and scalable data platform for AI applications.
- Strengthened vector capabilities for RAG applications. IBM has lagged in providing the critical vector capabilities that are now essential for powering RAG applications. Built on Apache Cassandra, Astra DB delivers high-performance advanced vector capabilities vital for AI-driven workloads requiring rapid retrieval of high-dimensional data. Recognized as a Leader in The Forrester Wave™: Vector Databases, Q3 2024, DataStax has comprehensive, advanced capabilities. Integrating Astra DB with IBM watsonx.data will significantly enhance its vector capabilities, positioning IBM for greater success in the evolving AI landscape.
- Enablement for globally distributed data AI environments. DataStax delivers a cloud-native database as a service that simplifies deployment and management and provides a globally distributed data infrastructure ensuring flexibility across multicloud and multiregional environments. As the demand for distributed data continues to rise, this capability significantly enhances IBM’s ability to empower AI-driven solutions on a global scale.
- Middleware capabilities for IBM watsonx.ai with Langflow. In April 2024, DataStax acquired Logspace, the creator of Langflow — a graphical low-code platform that empowers users to visually design and manage AI workflows. Langflow offers seamless integration with diverse AI models and provides Python-based customization. This acquisition extends the IBM watsonx platform by adding dynamic middleware capabilities, streamlining the creation of advanced generative AI applications more efficiently.
- Expanded data fabric capabilities with a scalable data platform. IBM has a viable data fabric solution with its IBM Cloud Pak for Data and watsonx.data offerings. With this acquisition, IBM is poised to enhance its data fabric capabilities, supporting both structured and unstructured data at scale while integrating advanced vector capabilities. This expansion is also likely to help IBM deploy AI agents at scale, strengthening its position in the AI-driven data landscape.
For more insights, book time with me via an inquiry or guidance session.

MMS • RSS
Posted on mongodb google news. Visit mongodb google news
NEW YORK, Feb. 27, 2025 /PRNewswire/ — MongoDB, Inc. (NASDAQ: MDB) today announced that Chief Executive Officer, Dev Ittycheria, and Interim Chief Financial Officer, Serge Tanjga, will present at the Morgan Stanley Technology, Media & Telecom Conference in San Francisco, CA.
The MongoDB presentation is scheduled for Thursday, March 6, 2025, at 11:30 a.m. Pacific Time (2:30 p.m. Eastern Time). A live webcast of the presentation will be available on the Events page of the MongoDB investor relations website at https://investors.mongodb.com/news-events/events. A replay of the webcast will also be available for a limited time.
About MongoDB
Headquartered in New York, MongoDB’s mission is to empower innovators to create, transform, and disrupt industries by unleashing the power of software and data. Built by developers, for developers, MongoDB’s developer data platform is a database with an integrated set of related services that allow development teams to address the growing requirements for today’s wide variety of modern applications, all in a unified and consistent user experience. MongoDB has tens of thousands of customers in over 100 countries. The MongoDB database platform has been downloaded hundreds of millions of times since 2007, and there have been millions of builders trained through MongoDB University courses. To learn more, visit mongodb.com.
Investor Relations
Brian Denyeau
ICR for MongoDB
646-277-1251
ir@mongodb.com
Media Relations
MongoDB PR
press@mongodb.com
View original content to download multimedia: https://www.prnewswire.com/news-releases/mongodb-inc-to-present-at-the-morgan-stanley-technology-media–telecom-conference-302387900.html
SOURCE MongoDB, Inc.
Article originally posted on mongodb google news. Visit mongodb google news

MMS • RSS
Posted on nosqlgooglealerts. Visit nosqlgooglealerts

IBM recently announced plans to acquire DataStax, a leading provider of NoSQL database solutions and AI tools. This strategic acquisition aims to accelerate production AI and NoSQL data capabilities at scale—developments that carry particular significance for the federal government’s homeland security community.
The homeland security sector relies heavily on robust, secure, and scalable data infrastructure to support critical missions across border security, emergency management, cybersecurity, and intelligence gathering. DataStax’s expertise with Apache Cassandra®, which powers mission-critical applications for major enterprises like FedEx, Capital One, and Verizon, as well as databases for the Department of Veterans Affairs and the Defense Information Systems Agency, offers compelling capabilities for homeland security agencies that similarly require high-availability, fault-tolerant systems that can handle massive data volumes without downtime.
Chet Kapoor, Chairman & CEO at DataStax, stated, “We have long said that there is no AI without data, and this vision will now be amplified with IBM.” The integration of DataStax’s hybrid vector database technology with IBM’s watsonx platform presents significant opportunities for homeland security applications. Retrieval Augmented Generation (RAG) techniques, which allow large language models (LLMs) to access external knowledge bases before generating responses, could dramatically improve intelligence analysis, threat assessment, and situational awareness capabilities within homeland security operations.
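For readers unfamiliar with the mechanics, here is a minimal, generic sketch of the retrieval step in RAG: embed a query, find the closest passages in a knowledge base, and prepend them to the prompt sent to an LLM. The toy bag-of-words embedding and in-memory passage list are placeholders; a production system would use a purpose-built embedding model and a vector database such as the ones discussed here.

```python
import math
from collections import Counter

# Toy "embedding": a bag-of-words vector. A real RAG pipeline would use a learned
# embedding model and a vector store instead of this in-memory list.
def embed(text: str) -> Counter:
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Placeholder knowledge base of policy passages (invented for illustration).
knowledge_base = [
    "Port of entry screening procedures were updated in January.",
    "Cyber incident reporting must occur within 72 hours.",
    "Emergency management exercises are scheduled quarterly.",
]
kb_vectors = [(passage, embed(passage)) for passage in knowledge_base]

query = "How quickly must a cyber incident be reported?"
query_vec = embed(query)

# Retrieve the top-2 most similar passages and build an augmented prompt.
top = sorted(kb_vectors, key=lambda pv: cosine(query_vec, pv[1]), reverse=True)[:2]
context = "\n".join(passage for passage, _ in top)
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
print(prompt)  # this augmented prompt would then be sent to the LLM
```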
Since 2020, IBM and DataStax have collaborated to serve customers including T-Mobile and The Home Depot. DataStax’s recent introduction of a Hyper-Converged Database (HCD) and Mission Control has further advanced Cassandra’s deployment in cloud-native environments, particularly on IBM OpenShift. For homeland security agencies navigating complex cloud migration strategies, this integration could provide secure, containerized database solutions that meet federal compliance requirements while maintaining operational flexibility.
As homeland security agencies continue to implement zero trust architecture in accordance with federal mandates, DataStax’s scale-out capabilities combined with IBM’s enterprise security features could provide a foundation for secure, compartmentalized data access that aligns with zero trust principles while maintaining the performance needed for time-sensitive homeland security applications.
“We are immensely excited about the value that our combined technologies can bring to our clients and the opportunity we have to continue advancing open-source excellence and innovation across critical areas in data and AI,” wrote Ritika Gunnar, General Manager, Data and AI, IBM.
The planned acquisition represents not just a business transaction, but a potential acceleration of AI and data capabilities that could significantly enhance the technological toolkit available to those safeguarding our nation’s security. As the acquisition moves forward, homeland security leaders should closely monitor how these combined technologies might be leveraged to address their most pressing data and AI challenges.
IBM’s acquisition of DataStax is subject to the close of the transaction and regulatory approval.
IBM is a Government Technology and Services Coalition Mentor Partner.