Presentation: Building Your First Platform Team in a Fast Growing Startup

MMS Founder
MMS Jessica Andersson

Transcript

Andersson: Once upon a time, there was a startup. This startup was running code in production, because that’s how we get our product out. They have been running for a couple of years, and they started to identify the need for all the things DevOps. They knew that they wanted to keep developer focus on delivering software value, and they knew that they wanted to do all them things cloud. Recognize this story? I think this is common for a lot of people.

They realized that the solution was to invest into a base platform and platform engineering in order to solve the problem of CI/CD, runtime, and observability. It’s been evolving ever since. I’m here to tell you that you need a platform regardless of the size of your organization. I would even go as far as to say, you already have a platform, you just don’t know it yet. You need to treat your platform as a product. It’s no different than delivering the software product that your company is probably surviving on. Trust is a currency, and you need to treat it as such, because otherwise you will fail. Being a small company, the tradeoffs that you make will be the key to your success.

Background

I’m Jessica. I’m the product area lead for developer experience or engineering enablement at Kognic, the startup that I’m working at. I’m also a CNCF ambassador and a speaker at conferences. Before joining Kognic, I was a platform engineer working at another company and delivering platform as a service to more than 40 teams globally across the world. We were focusing mainly on Kubernetes as a platform and logging as a service. We had other things that we provided as well. I first heard of Kognic from a friend of mine who worked there when he sent me a job ad with a motivation, like, “You do cloud and Kubernetes. How about this one?” Who can resist that? It read, “Wanted, Head of DevOps”. Let’s give it a chance. I like my friend. He’s nice. I start reading the ad and I realize what they describe is something that actually sounds fairly similar to what I’m already doing.

I set up a meeting with our co-founder, Daniel, and we met up in a cafe looking like this, actually. We took a cup of coffee and we talked about it. I told him about all the things that I'd been doing to empower and enable different development teams to solve their needs more efficiently. He told me about their situation and how far they had gotten, that they had some software running, but they realized that they needed to invest more into making it continue to run smoothly over time, and solve some of the common needs of how to operate it in production. I told him my other hot take, the one I don't dare to put in text: I think a DevOps team is a glorified operations team, and I don't work with operations. There's nothing wrong with doing that, but I feel that I really enjoy being a platform team more, because I can affect a lot of people and I can try to make their lives easier.

Empowering and Enabling Product Teams

As a platform team, making people's lives easier, someone wrote on Twitter, "Being a platform engineer is the closest that I will probably ever be to becoming a 10x developer". This was all the rage when I was looking for this job. We also know that by empowering and enabling our product teams, they can focus on the things that they care about a lot, which is delivering product value. We tried to look at the needs for this head of DevOps, the reason why this ad was out. What Daniel described as the need of the company was that they needed to do quick iterations because they didn't really know exactly what the product would end up being in the long run. This is a startup. You're trying to figure out what is the perfect fit for my product. They wanted to do quick iterations. They also wanted to have a lot of flexibility in changing direction if they discovered this is not the way we want to go. They wanted to maintain the developer focus on delivering value. They didn't want all the developers to understand everything about Kubernetes.

I don’t want them to have to do that either because there’s a lot of things to know. They also knew that they wanted to keep a cloud native modern tech stack. We saw on the hype scale that cloud native was right up there with generative AI. We also talked about developers and operations versus DevOps. I’ve done a talk only about this thing previously. I think the main thing is that when you have a situation where you have a developer and operations team, as we had at my previous company, and you transform that into DevOps, you have several reasons for doing so. We discovered that having the developers focusing only on pushing out new features and code changes was very disconnected from operating them in production, because we started to see a lot of issues such as failures in production. It took a long time to get new things pushed out because operations were busy firefighting. They were pushing back when developers wanted to deploy more things.

Operations had a problem getting the feedback back to the developers and prioritize the fixes for solving the things failing in production. It was more or less just throwing things over the wall and hoping it all works out. It did not work out.

At my previous company, we made a large effort to transform into DevOps and have all the product teams work with an end-to-end full application lifecycle workflow. When we talk about end-to-end ownership of the full application lifecycle, we can also talk about how we get there. We get there through having empowered product teams. If you haven’t read Empowered Product Teams by Marty Cagan, it’s still a really good book, and there’s a lot of great ideas in it. You don’t have to read all of it. There’s a lot of blog posts that summarize some of the main points, or talk to someone smart around you that actually read it. That also works for me. Check it out if you haven’t. Marty Cagan describes empowered product teams as being about ordinary people delivering extraordinary products. You want to take any product team and empower them so that they can focus on delivering great products.

Difference between, as I mentioned, the developers pushing features and empowered product teams can be described as, product teams, they are cross-functional. They have all the functionality or all the skillsets that they need in order to deliver their part, their slice of the product. They might have product managers, they might have designers, engineering, whatnot, they need to deliver their slice. They are also measured by outcomes and not output. Output is, I did a thing, yes, me. Outcome is, I made an impact. There’s a difference in that. We want to optimize for making good outcomes rather than a lot of output. They’re also empowered to figure out the best way to solve the problems that they’ve been asked to solve. This is a quote from the blog post that describes exactly this. It says that solving problems in ways our customers love, yet work for our business. It’s not all about only focusing on making customers happy, about doing it in such a way that the business can succeed, because we’re all here to earn money.

Very much related to this is “Team Topologies” by Matthew Skelton and Manuel Pais. I will focus on it because I think this is strongly related and this is something that we looked at on how to structure our teams. Stream-aligned teams have also a slice of the cake. They have like a you build it, you run it thing. They have a segment of the business domain, and they’re responsible for that, end-to-end. Then you have the enabling team. I said, I offer engineering enablement. There’s a reason why it’s here. They work on trying to help and unblock the stream-aligned teams. They are there to figure out the capabilities that we need to make in order to improve the life of the stream-aligned teams.

Then we have the platform team, and they are supposed to build a compelling internal product to accelerate delivery by the stream-aligned teams. We’re here to empower and enable our stream-aligned teams to deliver good outcomes to create business value. As a small company and a small organization, I argue that you probably can’t afford to have both an enabling team and a platform team. In our case, we decided to combine these two. We decided that we should have both engineering enablement and platform engineering within the same function.

Given what I told you about empowered product teams and trying to focus on good outcomes, do you think we should have a DevOps team or a platform team? It’s a given. We’re going for the platform team. That’s why I’m here. Back to the coffee shop, me and Daniel, we talked for longer than the time we had set aside. We’re both big talkers. In the end, we thought that this might be something and we decided to give it a go. In June 2020, I joined Kognic with the mission of starting up a platform engineering team and trying to solve the problem of empowering the product teams to deliver more value.

By the time we had the platform team hired, because there was new hiring going on, the engineering team had grown to 31 people. Four of these were working in the platform team. That means that about 13% of our headcount was dedicated to platform and internal tooling. The reason why I tell you this is not because 13 is a magic number; I just thought it could be nice to put it in numbers and tell you what we were really doing. Being a small company, this is 13% of the capacity, whatever you want to call it, that we had to deliver new value, and we thought that we could actually gain it back through more empowerment.

Implicit Platform

We had a team. We could start with a platform. We didn’t start from scratch, because, as I said, we were running code in production. It turns out, if you’re running code in production, you already have an implicit platform. The first thing that I had to do was try to figure out what is already here, what do we have right now. This platform, it exists, but it’s less structured and less intentional. This is what happens when you don’t do an intentional effort in trying to build your platform. There were some really good things in place, but a lot of it had happened in the way of, we need something for this, let’s do it, and then go back to the things that I’m really supposed to be working on. Meaning that we had Kubernetes running, but it had not been upgraded since it was started. Yes, that’s true. We had other things that were working good enough to solve the immediate need, but maybe not good enough to empower the teams fully. We had a lot of things in place.

They were done with a best-effort, good-enough approach, and we needed to turn this around. I’m going to show you what we had in our implicit platform. This is not me telling you this is what you should run in any way, but I want you to have this in mind. We used Google Cloud Platform as our cloud provider. We used GKE, managed Kubernetes running on top of it, for our runtime. We had CircleCI for our CI. We were writing code in TypeScript, Scala, and Python. We had bash scripts for deploying to production. We also had InfluxDB and Grafana for observability. You can see that there’s nothing here about logs because I don’t think we worked with logs at that point in time.

A Base Platform

What do I mean when I say platform? Because this is what we were hired to fix. This is where we started out and what we wanted to build. I define this as a base platform. This is, for me, a new term. Focus on solving the basic needs. We’re talking about CI/CD. You want to build, package, and distribute your code. You want to have it run somewhere in what’s equivalent to production for you. You want to have the observability to be able to know in case something goes wrong so that you can operate and maintain your applications. Without these three, it’s really hard to have something continuously deployed to production and keep working in production. I see those as the bare necessities of platform. With this concept in mind, we want to take our implicit platform and turn it into an intentional platform.

Our platform, it changed a tiny bit, not much. You can see it’s basically the same picture that I showed you before. We took Google Cloud and we introduced infrastructure as code to make sure that we have resources created in the same way and that it’s easy to upgrade when we want to make changes to how we utilize them. We improved the separation of concerns for cloud resources. There was a lot of reusing the same database instances for staging and production and other things. We also took Kubernetes and we upgraded it and applied security patches, and then continuously maintained it. Kubernetes does several releases per year, so it’s a lot of work to keep up. CircleCI, we were running a monorepo where all the code was, for TypeScript, Scala, and Python. We broke it apart and we reduced the build times a lot. We went from 40-plus minutes to less than 5 minutes.

We also introduced GitHub Actions in order to have smaller, more efficient jobs because our developers really felt those were easy to integrate with. We didn’t even provide it as a platform. It was just something they started using and then we adopted it. We removed InfluxDB and replaced it with OpenTelemetry and automagic instrumentation of applications. When I say automagic, I really mean automagic. If you haven’t looked at it, it’s as good as they say, and I didn’t believe it until we tried it. Then we removed the bash scripts for deployments and we introduced Argo CD and GitOps for better version control and easier upgrades and rollbacks of applications.
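
To make the GitOps part concrete, this is roughly what an Argo CD Application manifest looks like. It is a minimal sketch with made-up names; the repository URL, paths, and namespaces are placeholders, not Kognic's actual setup.

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: my-service                # hypothetical application name
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/example-org/deployment-config.git   # placeholder Git repository
    targetRevision: main
    path: my-service              # directory containing the Kubernetes manifests
  destination:
    server: https://kubernetes.default.svc
    namespace: my-service
  syncPolicy:
    automated:                    # keep the cluster in sync with what is in Git
      prune: true                 # remove resources that were deleted from Git
      selfHeal: true              # revert manual changes made directly in the cluster

With a manifest like this, a rollback is essentially reverting a commit in the Git repository, which is the version control benefit described above.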

Platform as a Product

How did we go about getting to this place? We treated our platform as a product. It’s an important part. I think the first thing you need to do is to understand your market. This is your empowered product teams. What do they need to do? How do they work today? What pain points do they have? You need to understand, what would be the good direction to go with this? You need to iterate and validate your solutions. You can’t just go away for a year and do something and come back and deploy it, and hope everyone is happy. Because you need to always constantly work together with the teams in order to make sure that you have something good. If you’re new to working with a product mindset, I can recommend looking at something that is called a Double Diamond, where you have two different phases. One is you go broad for problem discovery, and then you narrow down on a problem solution.

Then you go broad on a solution discovery and then narrow down on a solution decision, and then you iterate on that. When we look at our platform team, we do similar things as our empowered product teams, meaning that we try to be cross-functional. We try to figure out, what capabilities do we need in our team in order to deliver the things that we need? We are looking at cloud infrastructure. We need that skill. We also need containers and Kubernetes skills because there’s a lot to learn about those.

Observability is a big thing. It’s good to have that skill. Also, the enablement and the teaching. You need to teach your empowered product teams to adopt and work with these services that you provide. You need to be able to show them new ways of working. You need people that can actually both teach and communicate with other people. Communication is actually one of the bullets that we had in job ads for skills we’re looking for. You also need product management skill combined into this team. Obviously, since we’re doing product. If you want to learn more about working with product thinking for platform teams, I can recommend checking out this talk by Samantha Coffman. It was at KubeCon + CloudNativeCon EU in Paris. The recording is up on YouTube. Check it out. She does a really good description of it and she has really concrete examples of what it means to figure out what the real problem is rather than fixing the symptoms.

Finding the Right Problems

Talking about figuring out what the real problems are, remember what we wanted to achieve? We wanted to achieve quick iterations, flexibility to change direction, and maintain a developer focus on delivering product value. Given that, we wanted to figure out what is holding us back from achieving that. Let’s start with understanding the market. Autonomous teams and empowered product teams, I have trouble separating those terms. Where does one end, where does another start? Autonomous teams is something that we have talked a lot about at Kognic. One thing that it says is that autonomous teams can deliver value, beginning to end, with minimum supervision. It’s similar to empowered product teams: do all the things that you need to solve the problem, but also with the caveat of minimum supervision. They’re in charge of defining the daily tasks and the work processes that they need. It’s very similar.

If we think about autonomy and the free choice, we can’t force our product teams to use our platform because then we are removing the autonomy and the freedom to choose. As a platform team, it’s very important that we try to make it enticing and something they want to use. We can have our platform as the default setting, but maybe we can’t hinder them from choosing something else. To achieve that, we want to create a paved road for the developers. What does that even mean? What is a paved road? We want to empower and enable our product teams or our stream-aligned teams to deliver valuable outcomes. For every decision and every action they need to take that doesn’t build towards that, we can view that as a cost that takes away from the outcomes that they are able to deliver. If we provide a paved road, something that is easy to follow, they don’t have to make a decision of, how do I want to deploy to production? They can just follow the already paved road.

Then we can solve the basic needs of building, running, and operating applications. We allow our teams to focus on the things that make a difference, and we reduce that cost. This paved road should be effortless. It should not cost them energy to stay on the paved road because then your paved road is not that great. As a platform team, I mentioned Kubernetes has a lot of upgrades. How many of you do migrations continuously, because I feel like we are always in a migration from one thing to another? If we as a platform team make those migrations feel cumbersome or heavy to do, people will start going like, “Maybe I don’t have to migrate. Maybe I can do something else. Maybe I don’t have to follow this thing that platform is now forcing me to do again”. You need to make those things effortless so that people can stay on the paved road without spending more energy.

Is autonomy limitless? Are there boundaries to what you’re allowed to do as an autonomous team? Who is responsible for setting those limits? If everyone decides to go for their own solution, it will be really hard to be a platform team. Like you say, this is how you should deploy your applications. Then the team goes like, I think I have a better way. There are two reasons for that: either your platform is shit or your culture is shit. Both of those are things that you should try to figure out as soon as possible so you can address them. I also think that there are occasions where autonomy is good, but if you have people running around freely, just doing the things that they find really interesting, it will be very costly in the long run, because at some point you will have to do something about it. With a very diverse codebase, it’s super hard to handle that as a platform team and as an organization. The longer you wait to figure out where the limits for your autonomy go, the harder it will be to address it once you decide to do it.

There are things that might be good to put as not optional for the paved road. When I talk about that, I usually think about compliance and I think about security. Everyone loves compliance and security. I’m sure you do, because I know that I do. A paved road or a platform is something that can really help you figure those things out. If you make it easy for the teams to be compliant and be secure by building it natively into your platform, you can reduce the load on them to do so, and they will be able to focus on the things that they want to do, like valuable outcomes. I think there are situations where the paved road might not be optional and that you can build it into the platform in order to solve that.

Back to finding the right problem. If we want to build a paved road that enables quick iterations and flexibility to change direction, while allowing product teams to focus on product value and stay empowered, then we need to figure out what is holding us back from doing so right now. We need to figure out what the right problems are. We knew that with our limited resources, being a small team, 31 people, 4 people in platform, we needed to figure out what we wanted to focus on and be intentional with what we invest in. We want to take our implicit platform and apply strategy in order to make it intentional. We wanted to reduce the pain points and the time sinks, and improve developer experience to increase our ability to deliver product value.

Problem Statements

I have some problem statements that we can use as a tool when asking ourselves what we should focus on. The first thing is like, teams are blocked from performing the tasks that they need to do. Maybe they have to wait for someone to help them in order to move forward. This is, of course, bad. I’m only listing bad things here. The second one could be that the tasks the teams perform take a long time and they are hindered from moving on until they’re done. The third thing is the tasks teams perform are unreliable and prone to failure. I’m going to give you three examples of where this applied to us. The first one was DNS. DNS was not failing, but DNS was blocking. When our teams wanted to deploy a new service and they wanted to attach a DNS record to it, they had to go and ask the one or two people who could create a DNS record and give it back to them. They were blocked from moving on until they got that support.

Something that was taking a very long time, I mentioned before, we had a monorepository with a lot of long builds. You had to wait for the build to build your package so you could deploy to production. We had build times of over 40 minutes. This was taking a lot of time and hindering people from moving forward. When it comes to unreliable and failures, we had deploying to production with bash scripts. Because there were a lot of hidden functions within this bash script that were not clear, it was a black box to the developers, and it failed several times a week. It was becoming painful. The team members were not sure how to figure it out themselves. They couldn’t know for sure, even if it seemed to go fine, if the change had actually made it out to production. It was prone to errors. It was unreliable.

This was something that they were not able to solve themselves. They were hindered from moving forward. We looked at these tasks and we figured out what we should focus on. Hint: we took all three of those and tried to tackle them. We tried to look at our implicit platform. We tried to figure out where can we streamline it, where can we upgrade it, and where can we improve it in order to remove those pain points and time sinks? When we have tackled how to solve those problems, we also need to figure out how to roll this out to the teams, how we can get them started using it, and how we can gain adoption of the platform.

Trust is Currency

Which nicely leads me to the next section, which says, trust. Gaining adoption is closely related to the amount of trust your product teams have in you as a platform team. As eng says, trust is currency and you should treat it as such. You have to gain some currency before you can spend it. Credibility is a currency and it is what you earn and spend as a platform team. When we talk about trust, trust goes up and trust goes down. When I say up, I mean up in the organization. You have to keep your trust with leadership because they are the ones that decide to continue to invest into your platform team. If you look at the budget, you’re just a cost. If you’re unlucky, you also get the cloud bill on your part of the budget, and then it looks like you’re very expensive. You need to build trust with your organization that you are actually improving things so that you can keep doing it. It’s already been talked about, and DORA metrics are something that you can work with in order to show some kind of improvement and show what value you deliver.

This link goes to the “Accelerate” book, which is written by Dr. Nicole Forsgren, Jez Humble, and Gene Kim. If we think about DORA metrics, there are four metrics: deployment frequency, how often do you get new things into production? Lead time for changes, how long does it take from when you start working on something until it actually reaches an end user? Mean time to recovery, if you have a failure, how quickly do you recover? Change failure rate, how often does a change lead to failure? Those four metrics are something that you measure your empowered product teams on, but they can be a nice indicator of your effect as a platform team. If you think about down, you want to have trust with your product teams in order to gain adoption of your platform.

If they don’t trust you to solve their pain points, then you need to figure out why they don’t trust you and what you can do to change that. I would suggest starting with something easy but painful or time consuming. Make that work so much easier for them, and then go on to the next thing. Start small, start building up credibility, because when you have built some trust, people will start coming to you and then you will have an easier time understanding their point of view. For us, something that we did, the DNS thing, we introduced external-dns into Kubernetes, which means that you can use Kubernetes configuration in order to allocate a DNS record. This was very easy for the developers to understand how to use and it was very quick for them to start using it as well, meaning that from one day to another, basically, they were no longer blocked by anyone when wanting to change DNS.
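
To illustrate what that Kubernetes configuration can look like, here is a minimal sketch of a Service using the external-dns hostname annotation. The service name and domain are placeholders, not Kognic's real setup, and it assumes external-dns is already deployed in the cluster.

apiVersion: v1
kind: Service
metadata:
  name: my-service                # hypothetical service name
  annotations:
    # external-dns watches for this annotation and creates the DNS record
    external-dns.alpha.kubernetes.io/hostname: my-service.example.com
spec:
  type: LoadBalancer
  selector:
    app: my-service
  ports:
    - port: 80
      targetPort: 8080

The developer ships this alongside the rest of the service's configuration, and the DNS record is created without waiting for anyone.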

Once you have tackled some of the small things, you can go on to the bigger things, and then you will probably spend some credits so you can earn them back again. Based on my experience, these are some of the things that you can do in order to earn or spend the credits. When we talk about earning credits, we can talk about removing pain points, and really anything that is painful for developers will do. As a platform team, it’s good to be approachable and helpful. You want people to reach out to you so you can learn about what they are doing. Something that we do for this is that in Slack, we have a team channel, team platform engineering, in which we have a user group that is a goalkeeper, the platform engineering goalkeeper. Teams know that they can ping this goalkeeper with questions regarding the platform and get help figuring out how they can solve something. In case something breaks and they need help understanding what went wrong, they can do that.

If they want help understanding how they can utilize some part of the platform, they can do that. By being very approachable and helpful, and by approachable, I mean there are no stupid questions, we know this. Also make sure that they understand it. Be nice. Be kind. If someone comes to you with a question and you go like, here’s a link to wiki. Do you think they will ask you again? They will probably be like, no, they don’t want to help. If you go like, we wrote this part about it but if there’s anything that’s unclear, please let me know and we can work on it together. That’s more approachable. That is something that makes people want to come back and ask again. You can still give them the link to wiki because like, check the documentation. You can do it in a kind way so people want to reach out again.

You want to be proactive. You want to be able to fix some things before people have to ask, especially if it’s something that you really should know, like something is broken in your platform. It would be nice if you know it before some developers come and ask you, why is this thing not working anymore? You need to understand the team perspective, like, where do they come from? What do they know? What do they not know? What do they want to achieve? In spending credits, we have enforcing processes. Sometimes you can’t avoid it, like compliance, like security. It costs you credits. In my experience, teams really don’t like when you tell them you have to do this just because. Be mindful of how you enforce your processes.

Also, blocking processes. Empowered teams, they know that they are empowered. They know they’re supposed to be. If you take that away from them, you’re blocking them from being empowered, they’re not going to like it. Migrations, we can’t get away from it, but depending on how you perform them, it will cost you credits. It might even cost you credits even if you do it well. Assumptions, I know everyone working on my platform team is really smart and capable at what they’re doing. I also know that there are several times when we made assumptions of what the teams needed and how they wanted it to work, and we were wrong. It’s very easy to take your world view and project it on something where it doesn’t really fit. Make sure that you validate your assumptions and understand the team perspective in combination with your assumptions. Otherwise, you might be spending credits.

I want to tell you, this could be our trust credit score over time, from June 2020 up until now. Please don’t mind the time warp. It was hard. I don’t have any numbers either, because we haven’t been tracking this, but it could look like this. On the y-axis, we have the trust credits. On the x-axis, we have the time. You can see that we took a really big drop in trust. This is when we were making an assumption, enforcing a process, and not talking to the teams to anchor or understand their perspective before migrating everyone to a new way of working. I’m talking about how we introduced Argo CD and GitOps instead of a bash script for deploying into Kubernetes.

For us, it was clear that everyone wants to work with GitOps, because you have it version controlled. It’s very nice. You have control over what’s running. You can follow the trail of everything. It’s easy to do rollbacks and all the things. We knew this. This was clear. The whole industry is talking about how this is a good way of working, but we did not anchor it with the teams. We did not understand how they viewed working with the bash script and interacting with it. We took something away from them, we forced something on them, and we did not make them understand why.

In the long run, we actually managed to gain back some trust on this, because this change and the new process that we enforced proved itself time after time. In the long run, we gained more trust than we spent. I would rather not have that dip if I were doing it again, because I think it could have been avoided, and I think we could have mitigated it by spending more time understanding the developers and making sure we had anchored the change with them before performing it. In the long run, great investment.

Small Teams and Tradeoffs

Speaking of investment, as a small team, you will have to do tradeoffs. The link goes to the CNCF landscape. The CNCF landscape is a map of all the open-source projects that is under the Cloud Native Computing Foundation umbrella. I don’t know the number, but if you zoom in, it looks like this. There’s a lot of project icons, and they are structured into different areas. Being a small team, you will not be able to use all these tools. You need to figure out what you need for your use case. You need to be mindful of what you take on, because if you take on too many things, it will be hard for you to maintain the speed, and it will be hard for you to adapt once you have to really work with the business value. Let’s say you’re working and you have some slack time, and so you go like, “What should we do now? How about this cool thing over here that I found? I think this would be a really nice feature for our product teams. I think they would really love it. Let’s do that”.

Then you start. Then you make everyone start using it. Then you get a new version, and you have to migrate everyone. Then suddenly the business comes and says like, we need you to fix this thing, because we need to add this capability into our platform. You go like, but we are working on this nice-to-have thing over here. You have must-haves that you will not be able to address because suddenly you have filled all your time with nice-to-haves. Be mindful of what you take on. Be mindful that the open-source community is both a power and a risk, because there’s a risk of drowning in a lot of things, but there’s also the power of standing on the shoulders of other people and utilizing what they already have done. Have reservations, but make use of the things that you must have. Ask yourself, what can you live without? What do I not need?

For us, we realized we can live without a service mesh. A service mesh is a dedicated infrastructure layer for facilitating service-to-service communication within Kubernetes or other things. It’s really nice. You can get these fancy maps where you can see the graph of services talking to each other and all the things. You can do network policies, all them things. Really nice, but not a must for us. We don’t need it. In a similar way, we don’t need mutual TLS between applications in our Kubernetes cluster, because that’s not the main concern for us right now. Caveat, I really love Backstage.io as a project, but we don’t need a developer portal. It can be extremely nice to have. It can solve many issues that you have, but as a small company, we don’t have those pain points that motivate people to start using Backstage.

We don’t need to invest into a developer portal. Another one is a design system: clear standards and component libraries that you can reuse for the frontend. Starting out, we did not want to invest into this because we didn’t see the need. Actually, in the last year, we have started to invest into a design system. It’s really valuable. We started out with the components that were mostly used throughout the application, and we started by standardizing those. Not every component is in the design system, but the ones that are, are used a lot, which is really nice for our frontend developers and our designers, who can collaborate on how they want the standard to work. Starting out, ask yourself, what things can you live without? What is on your nice-to-have list, but maybe not worth investing into?

Summary

With this knowledge, if you want to get started on your own paved road, what should your paved road contain? When you know what your business needs are, what your team needs are, how you can continuously build trust with your organization and your teams, and what tradeoffs you’re willing to make, then you’re ready to start paving your own road for empowered product teams. Do remember, you already have a platform. Start investing strategically into it. You should treat your platform as a product and unleash new capabilities. Trust is a currency, and you use it to gain adoption from your product teams. Tradeoffs are a key to success. Pick the right ones, and you can win again and again.

Questions and Answers

Participant 1: You started with your personal journey, and I appreciate that a lot. Forgive me for saying so. It seems to me like the deal was already in place when you joined the new company. You didn’t have to fight for a consensus. You didn’t have to convince the company to start building a platform team. Myself, I’m in a different situation. I’m still fighting that battle, and I am up against exactly the things you were describing: we are very small as a company, maybe we just need a bigger DevOps team. Based on my experience, based on my reading, it seems to me like to win these arguments, what one would have to do is small POCs that prove value, but you also need a little bit of help from the top down. I need to manage upwards. I need to manage downwards. I’m looking for some advice, basically.

Andersson: Eb was talking about change, how to drive change without having the authority. He talked about finding allies for your cause, and I think finding allies in the leadership is what you need to do. Maybe not someone directly in your line, if you have trouble convincing them. Find someone else in leadership that will listen to you and that can be an ally for you. Then we had a talk about bits and bots and something else around DevOps, and she talked about how the different team structures can look. I think she made a really good case for why a DevOps team is not the way to go. She had a really nice dumpster picture on the DevOps team and everything. Check that talk out if you haven’t. I think you can use that as a motivation for why a big DevOps team is not the solution.

Then I think, yes, it really helped to have our co-founder, Daniel, convinced before joining the company. He knew they needed to change something. He wasn’t sure how to approach it. Talking together, we could come to a shared vision of what that would look like, which was very useful.

Participant 2: Luckily, for us, we’re on the right path towards this, so we just started something like this. I’m trying to know, did you have something like a roadmap from the beginning? How long did it take you to achieve this?

Andersson: I’m a really bad product manager because I never had a roadmap that I successfully maintained, unfortunately. Luckily, no one really blamed me for it either. What we did mainly was, it took a little bit over half a year from me joining before we actually had a team in place. There was a lot of recruitment and getting to know the company first. That’s what I spent the first part of my time on. Then when the team joined, we started to remove the blockers, because there were some things that the teams were not able to do. Those were often quicker fixes, so we started with that. Within a year from June 2020, we had changed the DNS thing and we had changed the GitOps thing, but we had not started on the monorepo and the build times. Half of the things I’ve told you now were basically within the first year. The second half spread out over the last three years, but also all the things that I did not mention happened in the last three years.

Participant 3: If I’m a developer within your company, what does the platform look like? Is it a website? Is it documentation, API? If I want to deploy something on your platform as a developer, where do I go? How does it work on a day-to-day basis?

Andersson: As a developer wanting to use our platform, it’s back to the keynote thing. If you want to introduce a Kanban for a team that never worked with Agile, or Jira, or one of these things, start simple. They used an Excel sheet. We used mainly GitHub repositories where it was like, this is how you can copy-paste code to get started. It’s not a fancy platform in any way. It’s more like, here’s where you put your Kubernetes configuration and here’s what you can copy-paste to get started. Here’s where you can clone a GitHub repository to get that base application. It’s a little bit crude still, but it’s still making it more streamlined and everything. Boilerplates, we are currently working on rolling that out. It takes a while. Boilerplates are part of the fancy part of the platform. Bare necessities is just, make it run.


Google DeepMind Unveils Gemini Robotics

MMS Founder
MMS Daniel Dominguez

Google DeepMind has introduced Gemini Robotics, an advanced AI model designed to enhance robotics by integrating vision, language, and action. This innovation, based on the Gemini 2.0 framework, aims to make robots smarter and more capable, particularly in real-world settings.

One of the key features of Gemini Robotics is its embodied reasoning, which allows robots to understand and react to their environment in a more human-like way. This capability is crucial for robots to adapt quickly in dynamic and unpredictable environments. Gemini Robotics enables robots to perform a wider range of tasks with greater precision and adaptability, which are significant advancements in robotic dexterity.

Google DeepMind is also partnering with Apptronik to develop the next generation of humanoid robots, which have the potential to work alongside humans in various environments, including homes and offices. The concept of steerability is emphasized, referring to the responsiveness of robots to human commands and environmental changes, enhancing their versatility and ease of use.

Safety and ethics are top priorities, with measures such as collision avoidance and force limitation integrated into the AI models. The ASIMOV dataset, inspired by Isaac Asimov’s Three Laws of Robotics, aims to improve safety in robotic actions, ensuring robots operate ethically and safely around humans.

Comments from various sources reflect excitement and optimism, highlighting its adaptability and generalization and calling it a step toward genuine usefulness in robotics, moving beyond mere automation.

Educator and business leader Patrick Egbunonu posted on X:

Imagine robots intuitively packing lunchboxes, handling delicate items, or assembling products efficiently—without extensive custom programming.

Others note its impressive dexterity and instruction-following, suggesting it could be a pivotal advancement. Web discussions, like those on Reddit, draw parallels to a ChatGPT moment for robotics, though some argue it needs broader consumer access to truly revolutionize the field. 

User ogMackBlack shared on Reddit:

The ChatGPT moment in robotics, to me at least, will be the moment regular people like us will be able to purchase them robots for personal use or have Gemini taking control of physical stuff autonomously at home via an app.

Google DeepMind’s work expands the capabilities of robotics technology, pushing its development forward. While experts recognize its potential to connect cognitive processing with physical action, some remain skeptical about its immediate real-world impact, especially when compared to high-profile demonstrations from competitors like Tesla’s Optimus.


Azure Database for MySQL Trigger for Azure Functions in Public Preview

MMS Founder
MMS Steef-Jan Wiggers

Microsoft has recently introduced a public preview of Azure Database for MySQL trigger for Azure Functions. With these triggers, developers can build solutions that track changes in MySQL tables and automatically trigger Azure Functions when rows are created, updated, or deleted.

Azure Functions is Microsoft’s serverless computing offering. It allows developers to build and run event-driven code without managing infrastructure. Within functions, triggers and bindings are defined. Triggers define how a function runs and can pass data into it. At the same time, bindings connect tasks to resources, allowing input and output data handling – a setup that enables flexibility without hardcoding access to services.

Azure Functions has several triggers such as Queue, Timer, Event Grid, Cosmos DB, and Azure SQL. Microsoft has introduced another one for Azure Database for MySQL in preview, whose bindings monitor the user table for changes (inserts, updates) and invoke the function with updated row data. The Azure Database for MySQL input and output bindings were available in a public preview earlier.

Sai Kondapalli, a program manager at Microsoft, writes in a Tech Community blog post:

Similar to the Azure Database for MySQL Input and Output bindings for Azure Functions, a connection string for the MySQL database is stored in the application settings of the Azure Function to trigger the function when a change is detected on the tables.

For the trigger to work, it is necessary to alter the table structure to enable change tracking on existing Azure Database for MySQL tables that will use trigger bindings for an Azure Function. The table alteration looks like this:

ALTER TABLE employees
ADD COLUMN az_func_updated_at TIMESTAMP 
DEFAULT CURRENT_TIMESTAMP 
ON UPDATE CURRENT_TIMESTAMP;

According to the documentation, the Azure MySQL trigger bindings use the “az_func_updated_at” column data to monitor the user table for changes. Based on the employees table, the C# function would look like this:

using System.Collections.Generic;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.MySql;
using Microsoft.Extensions.Logging; 

namespace EmployeeSample.Function
{
    public static class EmployeesTrigger
    {
        [FunctionName(nameof(EmployeesTrigger))]
        public static void Run(
            [MySqlTrigger("Employees", "MySqlConnectionString")]
            IReadOnlyList<MySqlChange> changes,
            ILogger logger)
        {
            foreach (MySqlChange change in changes)
            {
                Employee employee = change.Item;
                logger.LogInformation($"Change operation: {change.Operation}");
                logger.LogInformation($"EmployeeId: {employee.employeeId}, FirstName: {employee.FirstName}, LastName: {employee.LastName}, Company: {employee.Company}, Department: {employee.Department}, Role: {employee.Role}");
            }
        }
    }
}

With the Azure Database for MySQL trigger, developers could build solutions that enable real-time analytics by automatically updating dashboards and triggering alerts with new data. This would allow automated workflows with seamless integration into other Azure services for MySQL data processing. Additionally, it enhances compliance and auditing by monitoring sensitive tables for unauthorized changes and logging updates for security purposes.

While Azure Database for MySQL triggers for Azure Functions offers powerful automation capabilities, developers should consider:

  • Scalability: High-frequency updates may lead to function execution bottlenecks. Implementing batching or filtering logic can mitigate performance concerns.
  • Supported Plans: The feature is currently only available on premium and dedicated Azure Function plans.
  • Compatibility: Ensure that the MySQL version used is compatible with Azure’s bindings and trigger mechanisms.

Microsoft’s investments in MySQL include bindings and triggers in Functions, as well as supporting a newer version of MySQL for Azure database offering, resiliency, migration, and developer experience, as announced at Ignite.

Lastly, developers can find examples of the Azure Database for MySQL Triggers in a GitHub repository.

Project Leyden Ships Third Option for Faster Application Start with JEP 483 in Java 24

MMS Founder
MMS Karsten Silz

In Java 24, JEP 483, Ahead-of-Time Class Loading & Linking, under the auspices of Project Leyden, starts Java applications like Spring PetClinic up to 40% faster without code changes or new application constraints. It needs a training run to build a cache file that ships with the application. With GraalVM Native Image and CRaC, applications start 95-99% faster but face more constraints. Since JVM initialization is very expensive, Leyden plans more improvements.

JEP 483 extends Java’s Class-Data Sharing (CDS). On every startup, the JVM processes the same Java classes from the application, libraries, and the JDK the same way. CDS stores the results of reading and parsing those classes in a read-only cache file. JEP 483 adds loaded and linked classes to that cache and calls it “AOT cache.”

The training run only records the AOT configuration; creating the AOT cache is a separate step. This example uses a Java compiler benchmark picked by Leyden:

java -XX:AOTMode=record -XX:AOTConfiguration=app.aotconf -cp JavacBenchApp.jar JavacBenchApp 50
java -XX:AOTMode=create -XX:AOTConfiguration=app.aotconf -XX:AOTCache=app.aot -cp JavacBenchApp.jar

The AOT cache app.aot file is then ready to use:

java -XX:AOTCache=app.aot -cp JavacBenchApp.jar JavacBenchApp 50

On an Apple M1 MacBook Pro, the resulting 23 MBytes AOT cache leads to a 26% faster startup. The more classes an application loads, the higher the potential speed-up from the AOT cache. That is why frameworks like Spring Boot may especially benefit from JEP 483.

Project Leyden may combine the two steps for the AOT cache creation in the future. The Quarkus framework already does that today.

The training run could be a production run, but should at least mirror production as much as possible. Using the AOT cache requires the same JDK version, operating system, CPU architecture (such as Intel x64 or ARM), class path, and Java module options as the training run, though additional classes can be used. JEP 483 cannot cache classes from user-defined class loaders and does not work with JVMTI agents that rewrite class files using ClassFileLoadHook or call the AddToBootstrapClassLoaderSearch or AddToSystemClassLoaderSearch APIs.

GraalVM Native Image is an AOT compiler that moves compilation and as much initialization as possible to build time. It produces native executables that start instantly, use less RAM, and are smaller and more secure. But these executables also have principal constraints that do not affect most applications, need longer build times, have a more expensive troubleshooting process, and require more configuration. GraalVM started in Oracle Labs, but its two Java compilers may join OpenJDK.

The OpenJDK project, Coordinated Restore at Checkpoint (CRaC), takes an application memory snapshot during a training run and uses it later, similar to how JEP 483 creates and uses the AOT cache. But unlike JEP 483, CRaC only runs on Linux and requires all files and network connections to be closed before taking a snapshot and then re-opened after restoring it. That’s why it needs support from the JDK and the Java framework. While most frameworks support CRaC, only two downstream distributions of OpenJDK, Azul and Bellsoft, do. And the CRaC memory snapshot may pose security risks, as it contains passwords and credentials in clear text and is susceptible to hacking attacks.

Introduced in June 2020, Project Leyden has the goal “to improve the startup time, time to peak performance, and footprint of Java programs.” Initially, Leyden wanted to introduce the “concept of static images to the Java Platform,” such as from GraalVM Native Image, but after two years with no public activity, it instead pivoted to optimizing the JIT compiler. JEP 483 is the first shipping result of that pivot.

In an October 2024 blog post, Juergen Hoeller, senior staff engineer and Spring Framework project lead at Broadcom, spoke of a “strategic alignment with GraalVM and Project Leyden.” JEP 483 appears to prove that: Spring and Spring Boot are the only Java frameworks mentioned, and the Spring PetClinic sample application is one of the two examples. Oracle’s Per Minborg, consulting member of technical staff, Java Core Libraries, also gave a joint presentation with Spring team member Sébastien Deleuze from Broadcom in October 2024, where unreleased improvements reduced the PetClinic startup time even further.

InfoQ reached out to learn how some Java frameworks plan to support JEP 483. Here are their answers in alphabetical order of the framework name. Some answers were edited for brevity and clarity.

The Helidon team shared a blog post with benchmarks of JEP 483, CRaC, and GraalVM Native Image. It used an application in the two Helidon flavors: Helidon SE and Helidon MP. The GraalVM Native Image speed-up below uses Profile-Guided Optimization (PGO), which also requires a training run.

Application Type | JEP 483 Speed-Up | CRaC Speed-Up | GraalVM Native Image Speed-Up
Helidon SE       | 67%              | 95%           | 98%
Helidon MP       | 62%              | 98%           | 98%

Max Rydahl Andersen, Distinguished Engineer at Red Hat, and Sanne Grinovero, Quarkus founding engineer and senior principal software engineer at Red Hat, said the following about Quarkus:

We’re glad to see Project Leyden progressing. Quarkus fully supports JEP 483 since it’s integrated into the Java VM. The biggest challenge is the training run, which can be complex – especially in containerized environments.

To simplify this, we’ve made it possible to “boot” Quarkus just before the first request and then package applications with the AOT cache. This follows a similar approach to our AppCDS support.

If your JVM supports it, you can try it with:

mvn package -DskipTests -Dquarkus.package.jar.appcds.enabled=true -Dquarkus.package.jar.appcds.use-aot=true

Then run:

cd target/quarkus-app/
java -XX:AOTCache=app.aot -jar quarkus-run.jar

This makes it easy to get the AOT cache, as long as you are aware of the limitations around the JDK, OS, and architecture.

This provides a noticeable boost in startup time. However, project Leyden is not complete yet, and we’re looking forward to several improvements which are not available yet.

As an example, early previews of Leyden had a significant tradeoff: While it started more efficiently, the memory consumption was also higher. And since Quarkus users care about memory, we didn’t want to recommend using it until such aspects were addressed. The Quarkus team is working very closely with the Red Hat engineers working on OpenJDK, so we are confident that such aspects are being addressed. In fact, memory consumption has already improved significantly compared to the early days, and more improvements are scheduled.

Support for custom class loaders is another big ticket on our wishlist. Speeding up classes loaded by the system class loader is great, as that accelerates the JDK initialization. But application code and Quarkus extensions are loaded by a custom class loader, so only a subset of the application currently benefits from Leyden. We’ll keep working both on our side and in collaboration with the OpenJDK team to push this further.

We’re also exploring ways to make it more practical for containerized environments, where a training run isn’t always a natural fit.

So yes, Quarkus supports Leyden and the AOT cache introduced in JEP 483, but we’re just at the beginning of a longer journey of improvements.

Sébastien Deleuze from Spring had the following to say:

The Spring team is excited that Java 24 exposes the first benefits of Project Leyden to the JVM ecosystem for wider consumption. The AOT Cache is going to supercharge CDS that is already supported by Spring Boot. We are looking forward to further evolution likely to come in future Java versions.

The Micronaut team has not responded to our request to provide a statement.

MongoDB Acquires Voyage AI to Enhance AI-Powered Search and Retrieval

MMS Founder
MMS RSS

MongoDB, a database for contemporary apps, has announced the acquisition of Voyage AI, a company specializing in embedding and reranking models for AI-powered applications. This integration will strengthen MongoDB’s database capabilities by improving information retrieval accuracy within AI applications. Businesses often face challenges with AI-generated inaccuracies, particularly in critical fields including healthcare, finance, and legal services. Voyage AI’s technology addresses this issue by ensuring that AI models extract precise and relevant data, reducing the risk of incorrect or misleading outputs. The company’s models, recognized for their high performance, will help organizations apply AI more effectively across specialized domains, including legal and financial documents, enterprise knowledge bases, and unstructured data.


MongoDB plans to integrate Voyage AI’s retrieval capabilities into its database platform, allowing businesses to build more reliable AI applications. According to MongoDB CEO Dev Ittycheria, this acquisition redefines the role of databases in AI by enabling trustworthy and meaningful AI-driven solutions. Voyage AI’s technology will remain accessible through its platform, AWS Marketplace, and Azure Marketplace, with additional integrations expected later this year. This acquisition reinforces MongoDB’s commitment to advancing AI applications by providing businesses with enhanced data retrieval and accuracy, making AI solutions more practical for real-world use cases.


Presentation: Recommender and Search Ranking Systems in Large Scale Real World Applications

MMS Founder
MMS Moumita Bhattacharya

Transcript

Bhattacharya: We’re going to talk about large-scale recommender and search systems. Initially, I’ll motivate why we need recommendation and search systems. Then I’ll further motivate by giving one example use case from Netflix for both recommendations and search. Then identify some common components between these ranking systems, our search and recommendation system. What usually ends up happening for a successful deployment of a large-scale recommendation or search system. Then, finally, wrap it up with some key takeaways.

Motivation

I think it’s no secret that most of the product, especially B2C product, have in-built search and recommendation systems. Whether it’s video streaming services such as Netflix, music streaming services such as Spotify, e-commerce platforms such as Etsy, Amazon, usually have some form of recommendations and some form of search systems. The catalogs in each of these products are ever-growing. The user base is growing. The complexity of these models and this architecture and these overall systems keeps growing.

This whole talk is about, with examples, trying to motivate what it takes to build one of these large recommendation or search systems at scale in production. The reality of any B2C, business-to-consumer product is that, depending on the product, there could be 100 million-plus users, like Netflix has more than 280 million users, and 100 million-plus products. In general, ranking for that many users and that many products at an admissible latency is almost impossible. There are some tricks that we use in industry to still keep the relevance of what items we are showing to our users, while being realistic about the time it takes to render the service.

Typically, any ranking system, whether it’s a recommendation system or a search system, we break down into two steps. One is candidate set selection, oftentimes also referred to as first pass ranking, wherein you take all the items for a user and, instead of those millions of items, narrow them down to hundreds or thousands of items. That’s called candidate set selection. You are selecting candidates that you can then rank. In this kind of set selection, we typically try to retain recall. We want to ensure that it’s a high-recall system. Then, once these hundreds or thousands of items are selected, we have a more complex machine learning model that does second pass ranking. That leads to the final set of recommendations, or results for a search query, that then gets shown to the user. Beyond this stratification of first pass and second pass ranker, there are many more things that need to be considered.
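
To make the two-step structure concrete, here is a minimal Python sketch; the cheap_scorer and ranking_model objects are placeholders assumed for illustration, not any production system:

import heapq

def first_pass(user_id, catalog, cheap_scorer, k=1000):
    # Candidate set selection / first pass ranking: a cheap heuristic or
    # lightweight model narrows millions of items down to hundreds or
    # thousands, optimizing for recall.
    return heapq.nlargest(k, catalog, key=lambda item: cheap_scorer(user_id, item))

def second_pass(user_id, candidates, ranking_model, n=20):
    # Second pass ranking: a heavier ML model scores only the shortlisted
    # candidates and produces the final top-n list shown to the user.
    return heapq.nlargest(n, candidates, key=lambda item: ranking_model.predict(user_id, item))

# final_results = second_pass(user_id, first_pass(user_id, catalog, cheap_scorer), ranking_model)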

Here is an overview of certain components that we think about and look into irrespective of whether it’s a search system or a recommendation system. First is the first pass ranking that I showed before. The second one is the second pass ranking. For the first pass ranking, typically, depending on the latency requirement, one needs to decide whether we can have a machine learning model, or whether we build some rules and heuristics, like for a query, you can have lexical components that retrieve candidates. Versus for recommendation, you can have a simple non-personalized model that can retrieve candidates. The second pass ranking is where usually a lot of heavy machine learning algorithms are deployed. There, again, there are many subcomponents: what should be the data for training the model? What are the features, architecture, objectives, rewards? I’m going to go into much more detail on some example second pass ranking.

Then there is a whole system of offline evaluation. What kind of metrics should we use? Should we use ranking metrics? Should we use human in the loop, like human annotation for quality assessment, and so on? Then there is the aspect of biases where, when we deploy a system, all our users are seeing the results from that system. There is a selection bias that creeps in. How we typically address that is by adding some explore data. How can we set up this explore data while not hurting the performance of the model? Then there is the component of inference where, once the model is trained, we want to deploy it and then do the inference within acceptable latency and throughput, and with an acceptable compute cost, GPU, and so on. Ultimately, any experience in a B2C product, we are typically A/B testing as well.

Then in the A/B test, we need to think about the metrics. I wanted to show this slide first. If you want to take away one thing, you can just take away this, that these are the different things you need to think about and consider. During this talk, I’m going to focus on the second pass ranking, offline evaluation, and inference setup with some examples. Just in case you were not able to see some of the details in the previous slide, here are the sub-bullet points where data, feature, model architecture, evaluation metric, explore data, all these things are crucial for any of these ranking systems.

Recommendation: A Netflix Use Case

Now let’s take a recommendation use case from Netflix. On Netflix, when a user comes to Netflix, usually there is a lot to watch from, and we often hear in our user research, it sometimes feels overwhelming. How do we find what I want to watch in the moment? Netflix oftentimes has so many recommendations on the homepage. It just feels overwhelming. One approach that our members do take is they often go to search, and then they will type something. For example, here, show me a stand-up comedy. The results from these search ranking systems are either personalized or just relevant to the query. I think 60% of our users are on TV, and typing query is still a very tedious task. Just to motivate the search use case, most of the discovery still happens on homepage on Netflix, but 20% to 30% of discovery happens from search, second only to the homepage. There is a nice paper that talks about the challenges of search in the context of Netflix linked here.

Usually, when a member comes to search, there are three different types of member intent. Either the member directly knows what they want, so typing, Stranger Things, and then you specifically want Stranger Things. Versus you know something, but you don’t know exactly, so that’s find intent. Then there is explore intent where you type as broad things like, it’s a Saturday evening, show me comedy titles, or something like that. Depending on this different intent, how the search ranking system responds are different.

Going back to the aspect of a member coming from homepage to search and having to type this long query on a TV remote, which is very tedious. What if we can anticipate what the member is about to search and update the recommendation before the member needs to start typing? That’s why I’m referring to this particular example as a recommendation use case, even though it is after you click on the search page. Internally, we refer to it as pre-query, but in industry it is often also referred to as a no-query system. This is a personalized recommendation canvas, which is also trying to capture search intent for a member. Here, let me motivate the purpose of this canvas a little bit more. On a Thursday evening, Miss M is trying to decide whether she goes to Netflix, HBO, Hulu, and so on.

Then she comes to Netflix because she heard that they have good Korean content. There is a model that understands this member’s long-term preference, but in the moment, she recently heard that Netflix has some Korean content that is really good from her friend, and her intent changed. What she did is she browsed on the homepage with some horror titles, and then she browsed on the homepage with some Korean titles. Now she still didn’t find the title that she wants to start watching on homepage. Now she went on search. In this moment, if you’re able to understand this user’s long-term preference, but also the short-term intent, that is, she’s looking for a Korean movie, and before she has to search Korean horror or something, we can just update the recommendation to show her a mix of Korean and Korean horror movies on the no-query, pre-query canvas. We can really capture her intent without the struggle of typing.

If you imagine, to build a system like this, there is, of course, modeling consideration, but a large part of it is also software and infrastructural consideration, and that’s what I’m going to try to highlight. This is anticipatory search because we want to anticipate before the user has to search based on the in-session signal and browsing behavior that the member did in that current session. Overall, pre-query recommendation needs this kind of approach where it not only learns from long-term preference, but also utilizes short-term preference. We have seen in industry that being able to leverage browsing signals in the session, it’s able to help the model capture user short-term intent, while the research question there is, how do we balance the short-term and long-term intent and not make the whole recommendation just Korean horror movies for this member?

There are some advantages of these kinds of in-session signals. One is, as you can imagine, freshness, where if the model is aware of the user in-the-moment behavior, then it will not go into a filter bubble of only showing a certain taste, so you can break out of that filter bubble. It can help inject diversity. Of course, it’ll introduce novelty because you’re not showing the same old long-term preference to the member. Make it easy for findability because you’re training the model or you’re tuning the model to be attuned to the user’s short-term intent. It also helps user and title cold starting. Ultimately, how we call it in Netflix is it sparks member joy. We see in our real experience, so this is a real production model, that it ultimately reduces abandoned session. In the machine learning literature, there is a ton of research of how do we trade off between user long-term interest with short-term interest.

In chronological order of research, from many years ago to more recent: earlier we used to use Markov chain, Markovian methods, then there is reinforcement learning, with some papers that try to use reinforcement learning. Then, more recently, there are a lot of transformer and sequence models that capture the user’s long-term preference history while also adding some short-term intent as a part of the sequence, and balance the tradeoff. I’m not going into details about these previous studies, but some common considerations if you want to explore this area are: what sequence length to consider. How far back in the history should we go to capture user long-term and short-term interest? What is the internal ordering of actions? In the context of e-commerce, for example, purchase is the most valuable action, add to cart is a little less, click might be much less informative than purchase, and so on for the different types of action.

What is the solution that we built? I’ll go into the model itself later, but first I wanted to show the infrastructure overview. A member comes to Netflix and the client tells the server that the member is on the pre-query canvas, fetch the recommendation. In the meantime, as this JIT, just-in-time, server request call happens, we are also in parallel accessing every engagement that the member did elsewhere on the product. There has to be a data source that can tell us, one second ago the member thumbed up a title and two seconds ago the member clicked on a title. That information needs to come just in time to be sent to the server. We also, of course, need to train future models, so we also set up logging.

Ultimately, this server then makes a real-time call with the in-session signals as well as the historical information to the online model, which is hosted somewhere. This online model was trained previously and has been hosted, but it’s capable of taking these real-time signals to make the prediction. Ultimately, this model then returns a ranked list of results within a very acceptable latency. In this case, the latency is lower than 40 milliseconds, and ultimately sends the results to a client. In the process, we are also saving the server engagement, client engagement into logging so that the offline model training can happen in the future. There is a paper in RecSys on this particular topic. If you’re more interested, feel free to dig deeper. That is the overall architecture.
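
As a rough sketch of that request flow, here is what the just-in-time call might look like; the signal_store, feature_store, model, and log objects, and the 40 millisecond budget, are illustrative assumptions, not Netflix’s actual APIs:

import time

LATENCY_BUDGET_MS = 40  # illustrative budget, per the talk

def handle_prequery_request(member_id, signal_store, feature_store, model, log):
    start = time.monotonic()
    # In-session signals (recent thumbs up/down, clicks, browsing) are fetched
    # just in time; a real system would fetch them in parallel with the call.
    in_session = signal_store.recent_events(member_id)
    long_term = feature_store.profile_features(member_id)
    # The hosted online model combines long-term preference with the
    # short-term intent carried by the in-session signals.
    ranked = model.rank(member_id, long_term, in_session)
    log.record(member_id, in_session, ranked)  # kept for future offline training
    elapsed_ms = (time.monotonic() - start) * 1000.0
    if elapsed_ms > LATENCY_BUDGET_MS:
        log.warn("pre-query ranking took %.1f ms, over budget" % elapsed_ms)
    return ranked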

Here are some considerations that we had to think through when implementing a system like that. Actually, one of the key things for a system like that was this just-in-time server call. We really have to make the server call, or access the model, when the member is on that canvas. We have to return the result before the member even realizes, because we want to take all the in-session signals from the browsing the member did on the product in that session to the model. Because otherwise we lose the context. Let’s say, in the Korean horror movie context, the member is seeing a Korean horror movie and immediately goes to search; if you’re not aware of the interaction that the member did on the homepage, then we will not really be able to capture the member’s intent. The recommendations will not be relevant to that short-term intent of the member. Here are some considerations. The server call pattern is the most important thing we needed to figure out in this work.

More interestingly, different platforms, I don’t know if that’s the case for other companies, but in this particular case, different platforms had different server call patterns. How do you figure that out and work together with engineers and infra teams to change the service call pattern, and make sure that the latency of the model and the end-to-end latency are within acceptable bounds, so that the member doesn’t realize that so much action is happening within such few milliseconds? Of course, the throughput SLA becomes even more important depending on the platform, depending on the region, and so on. Because we want to do absolute real-time inference to capture the user’s in-session browsing signals, we had to remove caching. Any kind of caching had to be removed, or the TTL had to be reduced a lot. These three components really differentiated a work like this from more traditional recommendation, where you can prefetch the recommendation for a member and do offline computation.

The infrastructural and software constraints are much more lenient in a more traditional prefetching recommendation, whereas this system has to be really real-time. Then, of course, the regular things like logging. We need to make sure client-side and server-side logging is correctly done. Near real-time browsing signals are available through some data source, or there is a Kafka stream or something, and we also make sure those streams have very low latency so that the real-time browsing signals can become available to the model during inference time without much delay. Ultimately, then comes the model, which needs to be able to handle these long-term and short-term preferences and be able to predict relevant recommendations. There is a reason why I did a priority listing like that. The first three components are really more important than the model itself in this particular case, which is server call, latency, and caching.

What is the model? The model itself in this case is a multi-task learning, deep learning architecture, which is very similar to a traditional content-based recommendation model where we have a bunch of different types of signals that go into the model for training. It’s, I think, a few-layered deep learning model with some residual connections and some real sequence information of the user. There is a profile context. That is where the user is from, country, language, and so on. Then there is video-related data as well, things like tenure of the title, how new or old the title is, and so on. Then there is the synopsis and other information about the video itself. Then, more importantly, there is video and profile information, so engagement data. Those are really powerful signals: whether the member had thumbed up a title in the past, or is this a re-watch versus a new discovery, a new title that the member is discovering? In this particular work, there was this addition of browsing signals that we had added.

This is where the short-term member intent is being captured, where in real time we know whether the member did a My List add on this title, or thumbed up this title, or thumbed down some title. Negative signals are also super important. That immediately feeds into the model during inference time, letting the model then trade off between short-term and long-term. We do have some architectural considerations here to trade off between short-term and long-term. Unfortunately, that’s something I could not talk about. I’m just giving you this thought that it’s important to trade off between short-term and long-term in this model architecture. Overall, with this improvement of absolute real-time inference as well as the model architecture incorporating in-session browsing signals, offline we saw over 6% improvement, and it is currently 100% in production. For all Netflix users, when you go to the pre-query canvas, this is the model that shows you your recommendations.
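
As a hypothetical sketch of how those feature groups might be assembled into a single model input (field names are invented for illustration, mirroring the groups named in the talk):

def build_model_input(profile, video, engagement, browsing_events):
    # Hypothetical field names; they only mirror the feature groups named in
    # the talk: profile context, video metadata, engagement history, and the
    # real-time browsing signals that carry short-term intent.
    return {
        "profile_context": {"country": profile.country, "language": profile.language},
        "video_features": {"tenure_days": video.tenure_days, "synopsis": video.synopsis},
        "engagement": {"thumbed_up": engagement.thumbed_up, "is_rewatch": engagement.is_rewatch},
        "browsing_signals": [(e.action, e.title_id) for e in browsing_events[-50:]],
    }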

Here is a real example. This was the initial recommendation when the user session started. This is on a test user. Then the member went and did a bunch of browsing on the homepage, and browsed shows and movies with a woman in the lead role. Then they came back to the no-query or pre-query page, and their recommendation immediately within that session got updated to have shows like Emily in Paris, New Girl, and so on, which have a woman in the lead role. Then they again went back to the category page or homepage and browsed some shows related to cooking or baking. Ultimately, in the same session, when they went back to the search page, their recommendation immediately changed. You can see it’s a combination of all three. It’s not just turning the whole thing into baking shows or something. This is where the tradeoff between short-term and long-term preference comes into play. You want to capture what the member is doing in the session, but don’t want to overpower the whole recommendation with that short-term intent only.

Challenges and Other Considerations

What were some of the challenges and other considerations that we took into account? Something that I alluded to in the previous slide is that the filter bubble and concentration effect is a big problem, and still an open question in the space of recommendation and search: how do we make sure that when we understand a member’s need, we are not saying this is the only need you have, so that the whole page, the whole product, gets overwhelmed with that one taste profile or one kind of recommendation? Here, the short-term, long-term tradeoff is important, but also explore-exploit or reinforcement learning; these are areas that are usually explored to break out of the filter bubble and avoid the concentration effect. Because this is such a real-time system, as you would imagine, depending on the latency and the region the model is served in, sometimes there are increased timeouts, which leads to an increased error rate. What we don’t want is a member to see an empty page.

There was a lot of infrastructural debugging and work we had to do to make sure that increased timeouts and error rates did not affect the experience, which included multi-GPU inference, but also thinking about how we deploy the model and additional considerations like feature computation, whether there is some caching for some of the features that don’t need to be real-time, and so on. Overall, we also want to be careful about not making the recommendation too dynamic. We do want to capture the user’s short-term intent and hence update the recommendation in real time, but we also don’t want to completely move the floor from under the member’s feet by changing the page every time the member comes into that part of the product. We want to have a tradeoff between how much we are changing versus how much we are keeping constant. Because it’s such a real-time system and because it’s so dynamic, it is more difficult to debug.

Also, it becomes more susceptible to network unreliability, which can ultimately cause a degraded user experience. Another important thing depends on how you are building these short-term signals, the browsing signals. Some of these signals are very sparse, like how many times do you actually thumb up a show when you enjoy something on Netflix, or on Hulu, or somewhere? Signals like thumbs up, My List add, or thumbs down are usually very sparse. Typically, we need to do something in the model to generalize these signals that are otherwise very sparse, and make sure the model is not over-anchoring on one signal versus another. That was my recommendation use case.

Defining Ranking – How and When It Was Right

Participant 1: You mentioned ranking, and I’m assuming that after you computed and you had a list of things, you had to rank them based on, because that page is limited, so you can only show so much. How did you guys go about defining that ranking? How did you know it was right or when did you know it was right?

Bhattacharya: This is where that happens. When we train the model, that’s the example that I shared, the deep learning model with short-term, long-term intent and so on. Then we have offline evaluation. With offline evaluation, we evaluate for ranking. Some metrics for ranking are NDCG and MRR. What ranking typically means is the model generates a likelihood of something. In this case, let’s say the likelihood of playing a title. Then we order that likelihood in decreasing order and cut it off. Let’s say top 20, if you just want to show the member the top 20 titles, we rank the probability scores and then take the top 20 probability scores for a given context. In this case, let’s imagine the context is just the profile ID.

Then we take that as top-k, and then we use some metric, for example, NDCG or MRR, to evaluate how good the model is doing. There’s something called a golden test set here, where usually we would build a temporally independent dataset, temporally independent of the training data, to evaluate how good the model is doing. That’s the offline metric. Then we go to the A/B test, which tells us whether what we saw offline is what our members are seeing. The A/B test gives us a real test.
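
For reference, NDCG and MRR over a top-k list can be computed roughly as follows; this is a generic sketch of the standard metrics, not the exact evaluation pipeline described in the talk:

import math

def dcg_at_k(relevances, k):
    # Discounted cumulative gain over the top-k ranked items.
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances[:k]))

def ndcg_at_k(relevances, k):
    # Normalize by the DCG of the ideal (perfectly sorted) ordering.
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0

def mrr(relevances):
    # Reciprocal rank of the first relevant item in the ranked list.
    for i, rel in enumerate(relevances):
        if rel > 0:
            return 1.0 / (i + 1)
    return 0.0

# relevances are ordered by the model's predicted likelihood, e.g. [0, 1, 0, 1]
# print(ndcg_at_k([0, 1, 0, 1], k=20), mrr([0, 1, 0, 1]))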

Balancing What Is Happening During Searches vs. Tagging with Metadata

Participant 2: As customers are changing their language about how they’re searching, as well as the metadata that’s associated with all of the content that’s available inside of Netflix, it seems like there is this constant change, as you maybe had missed some metadata, because what was being pulled back in terms of recall and precision wasn’t matching actually what the customer’s language was trying to represent. How are you all trying to balance how things are tagged with metadata versus what is taking place during searches?

Bhattacharya: We usually try to incorporate some of that metadata in the model as features so that the correspondence between the query, or the user engagement with other titles, and the metadata is used for the model to learn the connection. Usually the metadata is static, but the query is dynamic. When the query comes in, depending on the title and metadata that the model thinks is the right relevant result for that query, it gets pulled into the top-k ranking. In general, there are also certain lexical matches and certain filters as well as guardrails in the actual production system. There is some boosting or some lexical matching that happens as well to make sure the model does not surface something that is completely irrelevant to the query or the context.

Search Use Case: A Netflix Use Case

The next use case is a search use case. Although it is a search use case, it’s actually a search and recommendation use case. We built this model called UniCoRn, Unified Contextual Recommender. I’ll get to why it is called UniCoRn in a couple of slides. Similar to what I already motivated, many products, especially B2C products, have both search and recommendation use cases. In the context of Netflix, here is an example where we have traditional search, which is a query-to-video ranker. For example, if you type P-A-R-I, we want to make sure the model is able to show you Emily in Paris or some kind of P-A-R-I related titles.

Then there is the purely recommendations use case, which is what we saw in the previous slide; the example is no-query or pre-query. Then there are other kinds of recommendations, such as title-title recommendation, video-video recommendation. In the context of e-commerce there is a similar canvas, and in the context of Netflix it’s the more-like-this canvas, wherein you click on a title, here, Emily in Paris, and you see other titles that are similar to it. That’s a recommendation use case as well.

The overarching thesis for this work was, can we build a single model for both search and recommendation tasks? Is it possible? Do we need different bespoke models, or can we build one model? The answer is, yes, we can build one model, we don’t need different models, because both of them, the search and recommendation tasks, are two sides of the same coin. They are ultimately ranking tasks, and ultimately, we want, for a given context, the top-k results that are relevant to the context. Part of this example is how we went about identifying what the differences between search and recommendation tasks are, and how we built one model.

What are the differences between search and recommendation tasks, typically? The most important difference is the context itself. When you think about search in the context, we think about query. We type a query, we see results. Query is in the context. Whereas when we think about recommendation, we usually think about the person, it’s usually personalized, so profile ID. Similarly, for more like this, or video-video, or title-title recommendation, the context is the title itself. You are looking at Emily in Paris, you want to see similar shows to Emily in Paris. The next big difference is the data itself, which is a manifestation of the product. They’re in different parts of the product. The data that is collected based on the engagement are different.

For example, when you go to search, you type a query, you see the results, you engage with the result, you start watching it, you start purchasing it. Versus when you go on the homepage, you are seeing the recommendation, which is a laid-back recommendation, and then you engage with it. The data, how it’s being logged, and what the user engagement is, are different. Similarly, the third difference would be candidate set retrieval itself, where for a query, you might want to make sure that there is lexical relevance. For personalization, a purely recommendation task, the candidate set itself, the first pass, could be different. Ultimately, to the previous question, there is usually canvas-specific or product-specific business logic that puts guardrails on what is allowed on that part of the product. What we do is first identify what these differences are, and then we set a goal to combine these differences.

Overall, the goal is to develop a single contextual recommender system that can serve all search and recommendation canvases. Not two different models, five different models, just one model that is aware of all these different contexts. What are the benefits? I think the first benefit is these different tasks learn from each other. When you go on Netflix, if you’re typing, Stranger Things, the result that you see versus on more like this, when you click on Stranger Things, the recommendations that you see, what we see from our members is they don’t want different results for the same context on different parts of the canvas. Or, do they? We want the model to learn this information. We want to leverage these different tasks for benefiting the other tasks.

Then, innovation applied to one task can be immediately scaled to other tasks. The most important benefit is, instead of five models, now we have to maintain one model. That means much reduced tech debt and much lower maintenance cost. Engineering cost reduces, and PagerDuty, on-calls, become easier because instead of debugging five models and their issues, you’re debugging one model. It’s an overall pretty big win-win.

How do we go about doing it? Essentially, we unify the differences. The first important difference was context. Instead of having a small context, training one model, and gathering data and features for that small context, we expand the context. Then we do the same things, gathering data and features for the whole context. Instead of just having the query, or just the profile ID, as context, we build a model that has this large context: query, country, entity, in the context of Netflix the video ID, and a task type. The task type is telling the model that this is a search task, this is a more-like-this task, this is a pre-query task, and so on. In a way, we are injecting this information in the data while giving all the information this particular task needs as one dataset. Then, in this particular case, in the context of Netflix, entity refers to more than just videos; we also have out-of-catalog videos.

For example, we often get queries like Game of Thrones, and we have to tell our users, we don’t have Game of Thrones. First, to tell our users, we need to identify what Game of Thrones is. It is an out-of-catalog entity. Similarly, person, like people searching Tom Cruise. We need to understand, what is Tom Cruise? It’s not a title. It’s a person. Similarly, genre, and so on. An example of context for a specific task would be, for search, the context is query, country, language, and the task is search. For title-title recommendation, the context is the source video ID, in our example the ID of Emily in Paris, then country, language, and the task is title-title recommendation. They’re different tasks. Then the data, which is what we have logged in different parts of the product, we merge all together while adding this context, this task type, as a part of the data collection. Ultimately, we know which engagement is coming from which task, or which part of the product is associated with which task, but we let the model learn those tradeoffs.

Finally, the target itself, whether it’s a video ranker, whether it’s an entity ranker, like now on Netflix, we also have games, whether it’s a game ranker, so we unify that as well and make the model rank the same entity for all these different tasks.
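
As a hypothetical illustration of what a merged training record with the expanded context, task type, and unified target might look like (field names and values are made up, not Netflix's schema):

# Each logged engagement becomes one record in a single merged dataset.
# The expanded context plus an explicit task type is what lets one model
# serve search, pre-query, and title-title recommendation.
search_example = {
    "context": {"query": "pari", "entity_id": None, "country": "US",
                "language": "en", "task_type": "search"},
    "target_entity": "emily_in_paris",   # entity being ranked (video, game, person, ...)
    "label": 1,                          # positive engagement, e.g. a play
}

more_like_this_example = {
    "context": {"query": None, "entity_id": "12345",  # source title, e.g. Emily in Paris
                "country": "US", "language": "en", "task_type": "more_like_this"},
    "target_entity": "another_title",
    "label": 0,                          # impression without engagement
}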

Here’s a setup. We basically build a multi-task learning model, but multi-task via model sharing. Actually, I’m not sure here if people have built multi-task learning model. Typically, we would have different parts of the objective. Let’s say an example would be, train a model to learn play, thumbs up, and click. There are three parts of the objective, and we are asking the model to learn all the three objectives and learn the tradeoff between these objectives. Whereas in our case, we did the multi-task through data, where we mix all the data, with the context and with the task type tagged to the data, and we’re asking the model to learn the tradeoff between these different tasks from the data itself without explicitly calling out the objectives. Similar to the previous example use case of recommendation that I showed, here also there are different types of features, the big one being the entity features, which is basically the video features or the game features.

Now a big difference compared to a traditional recommendation or search system is that here we have context features, which are much larger. We have query-based features. We have profile features. We have video ID features. We have task-specific features as well. Because the context is so broad, this information has to be expanded. Then we have context and entity features. Among all these different types, when it’s a numeric feature, it gets fed into the model in a different way, versus if it’s a categorical feature, we have the corresponding embeddings in the model. Then, ultimately, the model is a similar architecture to the previous one, which is a large deep learning model with a lot of residual connections, some sequence features, and so on. Ultimately, the target or the objective of this model is to learn the probability of positive engagement for a given profile, context, and title, because we are ultimately ranking the titles.

Let’s take an example. This same model, when a user comes to Netflix and types a query, P-A-R-I, the same model takes that context query and does create all these features and ultimately generates the likelihood of all the videos that are relevant to this query, P-A-R-I. Then the same model, when it’s used on more like this canvas, when a user clicks on Emily in Paris, it generates all these features for the context 12345. Let’s say that’s the ID of Emily in Paris, and generates the likelihood of all the titles in our catalog that are similar to Emily in Paris. That’s the power of unifying this whole model where even though the product itself are in different parts of canvases of the product, we are just using the same infra, same ML model to make inference and ultimately generate rank list of given tasks for a given context.

How is this magic happening? Here are some of the hypotheses based on a lot of ablation studies that we have done. I think the key benefit of an effort like this is that each task benefits from the others, each of the auxiliary tasks. In this case, search, as one of the tasks, is benefiting from all these different recommendation tasks. This model replaced four different ML models. We were able to sunset and deprecate four different ML models and replace them with one model. Clearly, there was benefit from one task to another task. The task type as a context was very important. Then the features specific to these different tasks were allowing the model to learn tradeoffs between these different tasks. Another key important thing is, how do we handle these different contexts and the missingness of these different contexts? We took an approach of imputing the missing context.

For example, in the context of more like this, we don’t really have a query, but we can think of some heuristic and impute query. Also, things like feature crossing, which is a specific ML architecture consideration helped. Then, with this unification, we were able to achieve either a lift or parity in performance for different tasks. As a first step, we wanted to just at least be able to replace the four different models and not take a hit in performance. Then, once we were able to do that, we brought in all sorts of innovation, which was immediately applicable to four different parts of the product rather than one place. Here’s an example where, ultimately, we replace initially pre-query search and more like this canvas with this one model. Then we also brought in personalization in it. This is a traditional UniCoRn model. Then we took a user foundation model that is trained separately and merged with this UniCoRn model.
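
A small sketch of the missing-context imputation idea, assuming a hypothetical heuristic of falling back to the source title's name when there is no query (the heuristic and title_index helper are invented for illustration):

def impute_context(context, title_index):
    # Hypothetical heuristic: for tasks with no query (e.g. more like this),
    # fall back to the source title's name so query-based features still fire.
    ctx = dict(context)
    if ctx.get("query") is None and ctx.get("entity_id") is not None:
        ctx["query"] = title_index.title_name(ctx["entity_id"]).lower()
    return ctx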

Then, ultimately, immediately we were able to bring personalization to pre-query, to search, and more like this. In the previous world where we had three different models for three different tasks, we would have to bring in these similar features to three different models. Instead of taking three quarters, we ended up doing it in one quarter. Again, there is a recent paper on this work in RecSys. Feel free to take a look. Offline, we got an improvement of 7% and 10% lift on search and recommendation tasks by combining these. That makes the point that these different tasks are benefiting from each other.

This is a redundant slide, just the way I showed you before, that we are able to merge a personalization signal, a separate TF graph, into the UniCoRn model to bring personalization to all canvases. Here is an example, after we deployed UniCoRn in production and then we deployed the personalized version of UniCoRn. I usually don’t watch kids shows on my profile, so I clicked on s as a query, and I don’t expect to see kids shows. Before personalization, I was getting some kids shows here, Simon, Sahara. Then, after the personalization model was pushed, all those kids shows disappeared and these were very relevant personalized titles for me for the very broad query, s. Go give it a try, because currently Netflix production search, more like this, and entity suggestion are being powered by this specific model, UniCoRn.

Considerations

What were the considerations? In addition to the other infra considerations that I shared in my previous use case, here, because we are merging search and recommendation, a very big consideration is how we handle the tradeoff between personalization and relevance. What does relevance mean here? Relevance to the context. If you type s and on Netflix you see a lot of titles that start with a, I think you’ll find it a pretty bad experience. If you’re typing s, you would expect things to have s, so that’s lexical relevance. Similarly, if you’re typing a genre like romance and you start seeing a lot of action movies in the results, it will be irrelevant. Even though it might be very personally relevant to you, in the context of the query it’s irrelevant. We want to make sure that we trade off between personalization and relevance pretty well.
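
One simple way to picture that tradeoff is a late blend of a relevance score and a personalization score, with weights that favor relevance when an explicit query is present. This is only an illustrative sketch with invented weights, not the architecture actually used in UniCoRn:

def blended_score(relevance_score, personalization_score, query):
    # Weight relevance more heavily for explicit queries so that typing "s"
    # still surfaces titles that actually match "s"; personalization mostly
    # reorders among comparably relevant results.
    w = 0.8 if query else 0.3   # illustrative weights, not production values
    return w * relevance_score + (1.0 - w) * personalization_score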

Then, because the query is real-time and all these engagements are real-time, we want to make sure that our model is really good but is not hurting latency. We don’t want our member to wait around for 5 minutes after typing a query. In fact, the latency consideration is very strict, something around 40 milliseconds to 100 milliseconds, P50. Similarly, depending on the region in which the Netflix app has been opened, throughput becomes important. Handling missing context is important for this particular case because we are expanding the context quite a lot. Features specific to the context and, ultimately, what kind of task-specific retrieval logic we have become important. In this case, one thing to note is we just unify the second pass ranker and not the first pass ranker. The retrieval, or the retrieval logic, remains different for different tasks.

Some additional considerations. In general, when you’re building ranking systems, in addition to everything I showed, there are things like negative sampling. What should the negative sampling be? Should you look at random negative sampling or should you look at impressions as negative sampling? Overall sample weighting, is one action more important than another action? Then, a very important thing is the cost of productization. Even though it’s a winning experience during the A/B test, we might not be able to productize it because it’s too expensive, if we end up training it on more GPUs than the company can support. With multi-GPU training, and even for inference if GPUs are used, the cost of productization becomes a very critical thing to consider. Then, ultimately, during online A/B testing, what kind of metrics do we look at? How do we analyze and tell a story of what really is happening? Debugging what the members are really liking in an experience becomes very important.
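
As a sketch of the negative sampling and sample weighting choices mentioned here, one could build training rows along these lines; the action weights and ratios are invented for illustration:

import random

ACTION_WEIGHT = {"play": 1.0, "my_list_add": 0.5, "click": 0.2}  # illustrative weights

def build_training_rows(engagements, impressions, catalog, random_neg_ratio=0.5):
    # Positive rows, weighted by how informative the action is assumed to be.
    rows = [{"item": item, "label": 1, "weight": ACTION_WEIGHT.get(action, 0.1)}
            for item, action in engagements]
    # Impressed-but-not-engaged items usually make stronger negatives than
    # items sampled at random from the catalog.
    rows += [{"item": item, "label": 0, "weight": 1.0} for item in impressions]
    rows += [{"item": random.choice(catalog), "label": 0, "weight": 0.5}
             for _ in range(int(len(impressions) * random_neg_ratio))]
    return rows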

Key Takeaways

Overall, it’s beneficial to identify common components among production ranking systems, because then we can really unify those differences and reduce tech debt, improve efficiency, and have less on-call issues. A single model aware of diverse contexts can perform and improve both search and recommendation tasks. The key advantages in consolidating these tech stacks and model, is that these different tasks can benefit from one another. It reduces tech debt, and higher innovation velocity. Real-time in-session signals are important to capture member short-term intent, while we want to be sure of also trading off with long-term interest.

Overall, infrastructural considerations are just as important as the model itself. I know, machine learning modeling is really the cool or the sexy part, but in a real production model, infrastructure becomes even more important. Oftentimes, I’ve seen that being the bottleneck rather than the model itself. Latency, throughput, and training and inference time efficiency are all super critical considerations when we build something at scale for real-time production use by members.

Questions and Answers

Participant 3: You mentioned a lot about how having one model can benefit the search and recommendation system. Besides the considerations around training and CPU, what were the real drawbacks of having to condense four different models into just one?

Bhattacharya: The first point here is the biggest drawback, or something to keep in mind. We spent a bunch of time trying to ensure that personalization is not overpowering relevance, because recommendation tasks typically over-anchor on personalization, whereas a search task is more relevance-oriented. In how we merged in this particular context, in the same picture, the left-hand side is bringing in personalization and the right-hand side is the relevance server. How we merge these two is very important. If we do some of these merges very early on, it could hurt relevance. If you don’t merge with the right architectural considerations, then it might not bring in the personalization for the relevant queries. Going back to this, I think the first one is the personalization-relevance tradeoff, which is a difficult thing to achieve and requires a bunch of experimentation.

Then, in general, a bigger model helps, but bigger models come with higher latency. How do we handle that? We have a few tricks that we used to address latency, which I cannot share because we haven’t publicly written them up in the paper. Latency becomes a big consideration and can be one of the blockers to being able to combine.

Participant 4: In terms of unifying the models between search and recommender system, the number of entities in context of Netflix is limited to a number of genres, then movie titles, and the person. Let’s say if it was something like X, social media platform, where the entities would be of unlimited number, will the approach of unifying those models still scale in terms of those kind of applications?

Bhattacharya: I think that’s where this disclaimer, that this is a second pass ranker unification. Prior to Netflix, I was in Etsy, which is an e-commerce platform where the catalog size was much bigger than Netflix catalog size. We usually do first pass ranking and then second pass ranking. This unification is a second pass ranker. I believe this would scale to any other product application. As long as we have first pass ranking, which retrieves the right set of candidate and has high recall, then usually the second pass ranking, the candidate set size is much smaller. To also be able to unify the first pass or the retrieval phase, actually, there are a few papers now with generative retrieval, but this world did not focus on that.


Java News Roundup: Jakarta NoSQL 1.0, Spring 7.0-M3, Maven 4.0-RC3, LangChain4j 1.0-beta2

MMS Founder
MMS RSS

This week’s Java roundup for March 10th, 2025 features news highlighting: OpenJDK JEPs targeted and proposed to target for JDK 25; the release of Jakarta NoSQL 1.0; the third milestone release of Spring Framework 7.0; the third release candidate of Maven 4.0; and the second beta release of LangChain4j 1.0.

OpenJDK

JEP 502, Stable Values (Preview), has been elevated from Proposed to Target to Targeted for JDK 25. Formerly known as Computed Constants (Preview), this JEP introduces the concept of computed constants, defined as immutable value holders that are initialized at most once. This offers the performance and safety benefits of final fields, while offering greater flexibility as to the timing of initialization.

JEP 503, Remove the 32-bit x86 Port, has been elevated from Candidate to Proposed to Target for JDK 25. This JEP proposes to “remove the source code and build support for the 32-bit x86 port.” This feature is a follow-up from JEP 501, Deprecate the 32-bit x86 Port for Removal, to be delivered in the upcoming release of JDK 24. The review is expected to conclude on March 18, 2025.

JDK 24

Build 36 remains the current build in the JDK 24 early-access builds. Further details may be found in the release notes.

JDK 25

Build 14 of the JDK 25 early-access builds was also made available this past week featuring updates from Build 13 that include fixes for various issues. More details on this release may be found in the release notes.

For JDK 24 and JDK 25, developers are encouraged to report bugs via the Java Bug Database.

GlassFish

GlassFish 7.0.23, the twenty-third maintenance release, delivers bug fixes, dependency upgrades and improvements such as: SSH managed node connections on both Linux and Windows environments; and support for the org.glassfish.envPreferredToProperties system property that, when set to true, allows environment variables to take precedence when resolving variable references in JVM options. Further details on this release may be found in the release notes.

Jakarta EE

In his weekly Hashtag Jakarta EE blog, Ivar Grimstad, Jakarta EE Developer Advocate at the Eclipse Foundation, provided an update on Jakarta EE 11, writing:

Jakarta NoSQL 1.0 has passed its release review and is now publicly available. This is a major milestone for the project. Congrats to the team!

The Jakarta EE 11 Web Profile is as good as ready for the release review ballot to start. The final version of the TCK has been staged, and Eclipse GlassFish passes it on both JDK 17 and JDK 21. I expect the ballot to start early next week, as soon as all the materials have been gathered.

The release of the Jakarta NoSQL 1.0 specification features notable changes such as: an improved Template interface that increases productivity on NoSQL operations; the removal of the Document, Key-Value and Column Family APIs as they are now maintained in the Jakarta Data specifications; and the addition of new annotations, @MappedSuperclass, @Embeddable, @Inheritance, @DiscriminatorColumn and @DiscriminatorValue for improved support of NoSQL databases. More details on this release may be found in the changelog.

The road to Jakarta EE 11 included four milestone releases, the release of the Core Profile in December 2024, and the potential for release candidates as necessary before the GA releases of the Web Profile in 1Q 2025 and the Platform in 2Q 2025.

Spring Framework

The third milestone release of Spring Framework 7.0.0 delivers bug fixes, improvements in documentation, dependency upgrades and new features such as: first-class support for registering an instance of the GenericApplicationContext class via the new BeanRegistrar interface; and support for the Java Optional class with null-safety and Elvis operators defined in the Spring Expression Language (SpEL). Further details on this release may be found in the release notes.

Similarly, the release of Spring Framework 6.2.4 and 6.1.18 ship with bug fixes, improvements in documentation, dependency upgrades and new features such as: avoid unnecessary CGLIB processing on classes annotated with @Configuration that do not declare, or inherit, any instance-level methods annotated with @Bean; and improvements to the BeanFactory and ObjectProvider interfaces to select only one default candidate among non-default candidates if the bean name is volatile or not visible to application. More details on these releases may be found in the release notes for version 6.2.4 and version 6.1.18.

The second milestone release of Spring Data 2025.0.0, also known as Spring Data 3.5.0, provides new features such as: Interface Projections are now properly throwing a NullPointerException if a getter method return value is null even if the method is defined to return a non-nullable value; and allow the use of bean validation callbacks in reactive flows with the Spring Data MongoDB ValidatingEntityCallback and ReactiveValidatingEntityCallback classes. Further details on this release may be found in the release notes.

Four days after the release of version 0.4.0, the release of Spring gRPC 0.5.0 provides notable changes such as: the addition of a Spring Boot compatibility check workflow; and a fix in the docs.yml workflow to add the package command. More details on this release may be found in the release notes.

Open Liberty

IBM has released version 25.0.0.3-beta of Open Liberty featuring compliance with FIPS 140-3, Security Requirements for Cryptographic Modules, for the IBM SDK, Java Technology Edition 8.

LangChain4j

The second beta release of LangChain4j 1.0.0 provides notable changes such as: a migration to using the Java HttpClient class as a first step towards decoupling modules from the OkHttpClient class; and support for the OpenAI Java Library. Breaking changes include: removal of the deprecated generate() and onNext()/onComplete() methods in the ChatLanguageModel and TokenStream interfaces, respectively. Further details on this release may be found in the release notes.

Micrometer

The third milestone release of Micrometer Metrics 1.15.0 delivers bug fixes, dependency upgrades and new features such as: allow the TimedAspect and CountedAspect classes to inject a Java Function interface to create tags based on method result; and improvements to the OtlpMetricsSender interface that removes a possible inconsistency where the sender could be given an instance of the OtlpConfig interface that differs from the one passed to the OtlpMeterRegistry class. More details on this release may be found in the release notes.

The third milestone release of Micrometer Tracing 1.5.0 provides notable dependency upgrades such as: Micrometer Metrics 1.14.5; Zipkin Brave 6.1.0; and Testcontainers for Java 1.20.6. Further details on this release may be found in the release notes.

Piranha Cloud

The release of Piranha 25.3.0 delivers bug fixes, dependency upgrades, improvements in documentation and notable changes such as: support for JDK 24 in the experimental workflow; and various Jakarta EE Core Profile TCK certifications for the Piranha Core Profile. More details on this release may be found in the release notes, documentation and issue tracker.

Project Reactor

The first milestone release of Project Reactor 2025.0.0 provides dependency upgrades to reactor-core 3.8.0-M1, reactor-netty 1.3.0-M1, reactor-pool 1.2.0-M1. There was also a realignment to version 2025.0.0-M1 with the reactor-addons 3.5.2, reactor-kotlin-extensions 1.2.3 and reactor-kafka 1.3.23 artifacts that remain unchanged. Further details on this release may be found in the release notes.

Similarly, Project Reactor 2024.0.4, the fourth maintenance release, provides dependency upgrades to reactor-core 3.7.4 and reactor-netty 1.2.4. There was also a realignment to version 2024.0.4 with the reactor-addons 3.5.2, reactor-pool 1.1.2, reactor-kotlin-extensions 1.2.3 and reactor-kafka 1.3.23 artifacts that remain unchanged. More details on this release may be found in the changelog.

Maven

The third release candidate of Maven 4.0.0 ships with notable changes such as: a migration from the Java EE 8 javax.inject package to Maven Dependency Injection; support for the ${project.rootDirectory} property in GitHub repositories; improved validation error messages; and removal of direct support for the ${project.baseUri} property in the DefaultModelValidator class. Further details on this release may be found in the release notes.


Swiss National Bank Trims Holdings in MongoDB, Inc. (NASDAQ:MDB) – MarketBeat

MMS Founder
MMS RSS

Swiss National Bank cut its stake in MongoDB, Inc. (NASDAQ:MDB) by 4.1% in the fourth quarter, according to its most recent filing with the Securities and Exchange Commission. The firm owned 208,700 shares of the company’s stock after selling 9,000 shares during the period. Swiss National Bank owned approximately 0.28% of MongoDB, worth $48,587,000 as of its most recent filing with the Securities and Exchange Commission.

Several other hedge funds and other institutional investors have also recently modified their holdings of MDB. Jennison Associates LLC boosted its stake in shares of MongoDB by 23.6% during the 3rd quarter. Jennison Associates LLC now owns 3,102,024 shares of the company’s stock worth $838,632,000 after buying an additional 592,038 shares during the last quarter. Raymond James Financial Inc. acquired a new position in MongoDB during the fourth quarter valued at approximately $90,478,000. Amundi increased its position in shares of MongoDB by 86.2% during the 4th quarter. Amundi now owns 693,740 shares of the company’s stock worth $172,519,000 after purchasing an additional 321,186 shares during the last quarter. Assenagon Asset Management S.A. raised its stake in shares of MongoDB by 11,057.0% during the 4th quarter. Assenagon Asset Management S.A. now owns 296,889 shares of the company’s stock worth $69,119,000 after purchasing an additional 294,228 shares in the last quarter. Finally, Avala Global LP acquired a new stake in shares of MongoDB in the 3rd quarter valued at approximately $47,960,000. Institutional investors and hedge funds own 89.29% of the company’s stock.

Insider Activity

In other news, CAO Thomas Bull sold 169 shares of the company’s stock in a transaction dated Thursday, January 2nd. The shares were sold at an average price of $234.09, for a total transaction of $39,561.21. Following the transaction, the chief accounting officer now owns 14,899 shares in the company, valued at approximately $3,487,706.91. This represents a 1.12% decrease in their ownership of the stock. The sale was disclosed in a document filed with the Securities & Exchange Commission. Also, CFO Michael Lawrence Gordon sold 1,245 shares of the company’s stock in a transaction that occurred on Thursday, January 2nd. The stock was sold at an average price of $234.09, for a total value of $291,442.05. Following the transaction, the chief financial officer now directly owns 79,062 shares in the company, valued at approximately $18,507,623.58. This trade represents a 1.55% decrease in their ownership of the stock. Insiders have sold 44,314 shares of company stock valued at $11,642,583 in the last ninety days. 3.60% of the stock is owned by company insiders.

MongoDB Stock Performance

Shares of MDB stock traded up $7.68 on Monday, hitting $193.05. The company had a trading volume of 2,039,974 shares, compared to its average volume of 1,683,732. MongoDB, Inc. has a twelve month low of $173.13 and a twelve month high of $387.19. The firm has a market cap of $14.38 billion, a P/E ratio of -70.46 and a beta of 1.30. The stock’s 50-day moving average is $256.72 and its two-hundred day moving average is $272.55.

MongoDB (NASDAQ:MDB) last released its quarterly earnings results on Wednesday, March 5th. The company reported $0.19 earnings per share (EPS) for the quarter, missing the consensus estimate of $0.64 by ($0.45). MongoDB had a negative return on equity of 12.22% and a negative net margin of 10.46%. The firm had revenue of $548.40 million for the quarter, compared to the consensus estimate of $519.65 million. During the same quarter in the previous year, the business posted $0.86 EPS. Equities research analysts predict that MongoDB, Inc. will post -1.78 EPS for the current year.

Analyst Ratings Changes

Several research firms have issued reports on MDB. China Renaissance initiated coverage on shares of MongoDB in a research report on Tuesday, January 21st. They set a “buy” rating and a $351.00 price target on the stock. Monness Crespi & Hardt upgraded MongoDB from a “sell” rating to a “neutral” rating in a report on Monday, March 3rd. Canaccord Genuity Group lowered their price target on MongoDB from $385.00 to $320.00 and set a “buy” rating on the stock in a research report on Thursday, March 6th. Truist Financial reduced their target price on MongoDB from $400.00 to $300.00 and set a “buy” rating for the company in a research note on Thursday, March 6th. Finally, Barclays dropped their price target on MongoDB from $330.00 to $280.00 and set an “overweight” rating on the stock in a research note on Thursday, March 6th. One analyst has rated the stock with a sell rating, seven have given a hold rating and twenty-three have issued a buy rating to the company’s stock. According to data from MarketBeat.com, the company presently has an average rating of “Moderate Buy” and an average target price of $319.87.


MongoDB Profile


MongoDB, Inc., together with its subsidiaries, provides a general purpose database platform worldwide. The company provides MongoDB Atlas, a hosted multi-cloud database-as-a-service solution; MongoDB Enterprise Advanced, a commercial database server for enterprise customers to run in the cloud, on-premises, or in a hybrid environment; and Community Server, a free-to-download version of its database, which includes the functionality that developers need to get started with MongoDB.


[Chart: Institutional Ownership by Quarter for MongoDB (NASDAQ:MDB)]



Atria Investments Inc Sells 686 Shares of MongoDB, Inc. (NASDAQ:MDB) – Defense World

MMS Founder
MMS RSS

Atria Investments Inc decreased its stake in MongoDB, Inc. (NASDAQ:MDB) by 31.5% during the 4th quarter, according to the company in its most recent disclosure with the SEC. The firm owned 1,489 shares of the company’s stock after selling 686 shares during the period. Atria Investments Inc’s holdings in MongoDB were worth $347,000 as of its most recent filing with the SEC.

A number of other hedge funds and other institutional investors have also recently added to or reduced their stakes in MDB. Jennison Associates LLC increased its holdings in MongoDB by 23.6% in the 3rd quarter. Jennison Associates LLC now owns 3,102,024 shares of the company’s stock worth $838,632,000 after acquiring an additional 592,038 shares during the last quarter. Geode Capital Management LLC boosted its position in MongoDB by 2.9% during the third quarter. Geode Capital Management LLC now owns 1,230,036 shares of the company’s stock worth $331,776,000 after purchasing an additional 34,814 shares during the period. Westfield Capital Management Co. LP increased its stake in shares of MongoDB by 1.5% in the third quarter. Westfield Capital Management Co. LP now owns 496,248 shares of the company’s stock worth $134,161,000 after purchasing an additional 7,526 shares in the last quarter. Holocene Advisors LP raised its position in shares of MongoDB by 22.6% in the third quarter. Holocene Advisors LP now owns 362,603 shares of the company’s stock valued at $98,030,000 after purchasing an additional 66,730 shares during the period. Finally, Assenagon Asset Management S.A. lifted its stake in shares of MongoDB by 11,057.0% during the 4th quarter. Assenagon Asset Management S.A. now owns 296,889 shares of the company’s stock valued at $69,119,000 after buying an additional 294,228 shares in the last quarter. 89.29% of the stock is currently owned by hedge funds and other institutional investors.

Analyst Ratings Changes

A number of research firms recently weighed in on MDB. Monness Crespi & Hardt upgraded shares of MongoDB from a “sell” rating to a “neutral” rating in a research note on Monday, March 3rd. The Goldman Sachs Group reduced their price target on MongoDB from $390.00 to $335.00 and set a “buy” rating for the company in a research report on Thursday, March 6th. Guggenheim upgraded MongoDB from a “neutral” rating to a “buy” rating and set a $300.00 price objective on the stock in a research note on Monday, January 6th. Bank of America cut their target price on MongoDB from $420.00 to $286.00 and set a “buy” rating for the company in a research note on Thursday, March 6th. Finally, China Renaissance started coverage on MongoDB in a research report on Tuesday, January 21st. They set a “buy” rating and a $351.00 price target on the stock. One research analyst has rated the stock with a sell rating, seven have assigned a hold rating and twenty-three have given a buy rating to the company’s stock. Based on data from MarketBeat, MongoDB currently has an average rating of “Moderate Buy” and an average target price of $319.87.



Insiders Place Their Bets

In related news, Director Dwight A. Merriman sold 885 shares of the firm’s stock in a transaction dated Tuesday, February 18th. The stock was sold at an average price of $292.05, for a total value of $258,464.25. Following the sale, the director now directly owns 83,845 shares of the company’s stock, valued at $24,486,932.25. This trade represents a 1.04% decrease in their ownership of the stock. The transaction was disclosed in a document filed with the Securities & Exchange Commission. Also, Director Hope F. Cochran sold 1,175 shares of the business’s stock in a transaction dated Tuesday, December 17th. The shares were sold at an average price of $266.99, for a total value of $313,713.25. Following the transaction, the director now owns 17,570 shares in the company, valued at approximately $4,691,014.30. This represents a 6.27% decrease in their position. Insiders have sold 44,314 shares of company stock valued at $11,642,583 in the last ninety days. Company insiders own 3.60% of the company’s stock.

MongoDB Stock Performance

MDB stock opened at $185.37 on Monday. The firm has a market cap of $13.80 billion, a P/E ratio of -67.65 and a beta of 1.30. The firm’s fifty day moving average is $256.72 and its 200 day moving average is $272.55. MongoDB, Inc. has a 52 week low of $173.13 and a 52 week high of $387.19.

MongoDB (NASDAQ:MDB) last announced its quarterly earnings data on Wednesday, March 5th. The company reported $0.19 earnings per share (EPS) for the quarter, missing analysts’ consensus estimates of $0.64 by ($0.45). The company had revenue of $548.40 million for the quarter, compared to the consensus estimate of $519.65 million. MongoDB had a negative net margin of 10.46% and a negative return on equity of 12.22%. During the same period last year, the business posted $0.86 earnings per share. As a group, analysts forecast that MongoDB, Inc. will post -1.78 EPS for the current year.

MongoDB Profile


MongoDB, Inc., together with its subsidiaries, provides a general purpose database platform worldwide. The company provides MongoDB Atlas, a hosted multi-cloud database-as-a-service solution; MongoDB Enterprise Advanced, a commercial database server for enterprise customers to run in the cloud, on-premises, or in a hybrid environment; and Community Server, a free-to-download version of its database, which includes the functionality that developers need to get started with MongoDB.


[Chart: Institutional Ownership by Quarter for MongoDB (NASDAQ:MDB)]



Receive News & Ratings for MongoDB Daily – Enter your email address below to receive a concise daily summary of the latest news and analysts’ ratings for MongoDB and related companies with MarketBeat.com’s FREE daily email newsletter.

Subscribe for MMS Newsletter

By signing up, you will receive updates about our latest information.

  • This field is for validation purposes and should be left unchanged.

Steward Partners Investment Advisory LLC Increases Stock Position in MongoDB, Inc …

MMS Founder
MMS RSS

Steward Partners Investment Advisory LLC grew its position in MongoDB, Inc. (NASDAQ:MDB) by 12.9% in the 4th quarter, according to the company in its most recent Form 13F filing with the Securities and Exchange Commission (SEC). The fund owned 1,168 shares of the company’s stock after purchasing an additional 133 shares during the period. Steward Partners Investment Advisory LLC’s holdings in MongoDB were worth $272,000 at the end of the most recent reporting period.

Several other institutional investors and hedge funds also recently made changes to their positions in the company. Janney Montgomery Scott LLC purchased a new stake in MongoDB during the third quarter worth about $861,000. Principal Financial Group Inc. increased its holdings in MongoDB by 2.7% during the third quarter. Principal Financial Group Inc. now owns 6,095 shares of the company’s stock worth $1,648,000 after buying an additional 160 shares during the last quarter. Atria Investments Inc increased its holdings in MongoDB by 6.6% during the third quarter. Atria Investments Inc now owns 2,175 shares of the company’s stock worth $588,000 after buying an additional 135 shares during the last quarter. Versor Investments LP purchased a new stake in MongoDB during the third quarter worth about $404,000. Finally, GSA Capital Partners LLP increased its holdings in MongoDB by 38.0% during the third quarter. GSA Capital Partners LLP now owns 1,598 shares of the company’s stock worth $432,000 after buying an additional 440 shares during the last quarter. Hedge funds and other institutional investors own 89.29% of the company’s stock.

Insider Buying and Selling

In related news, CFO Michael Lawrence Gordon sold 5,000 shares of MongoDB stock in a transaction on Monday, December 16th. The shares were sold at an average price of $267.85, for a total value of $1,339,250.00. Following the sale, the chief financial officer now owns 80,307 shares in the company, valued at $21,510,229.95. The trade was a 5.86% decrease in their ownership of the stock. The transaction was disclosed in a legal filing with the SEC. Also, insider Cedric Pech sold 287 shares of the business’s stock in a transaction on Thursday, January 2nd. The shares were sold at an average price of $234.09, for a total transaction of $67,183.83. Following the sale, the insider now owns 24,390 shares in the company, valued at approximately $5,709,455.10. This trade represents a 1.16% decrease in their position. Insiders have sold a total of 49,314 shares of company stock valued at $12,981,833 over the last three months. 3.60% of the stock is owned by company insiders.

Analysts Set New Price Targets


A number of analysts recently issued reports on the stock. Loop Capital decreased their price objective on shares of MongoDB from $400.00 to $350.00 and set a “buy” rating for the company in a report on Monday, March 3rd. Morgan Stanley cut their target price on shares of MongoDB from $350.00 to $315.00 and set an “overweight” rating for the company in a research report on Thursday, March 6th. Truist Financial cut their target price on shares of MongoDB from $400.00 to $300.00 and set a “buy” rating for the company in a research report on Thursday, March 6th. KeyCorp lowered shares of MongoDB from a “strong-buy” rating to a “hold” rating in a research report on Wednesday, March 5th. Finally, China Renaissance began coverage on shares of MongoDB in a research report on Tuesday, January 21st. They set a “buy” rating and a $351.00 target price for the company. One analyst has rated the stock with a sell rating, seven have issued a hold rating and twenty-three have assigned a buy rating to the stock. According to data from MarketBeat.com, MongoDB currently has an average rating of “Moderate Buy” and a consensus price target of $319.87.


MongoDB Price Performance

Shares of MDB opened at $185.37 on Friday. The firm’s 50 day simple moving average is $256.72 and its 200 day simple moving average is $272.06. The company has a market capitalization of $13.80 billion, a P/E ratio of -67.65 and a beta of 1.30. MongoDB, Inc. has a 1-year low of $173.13 and a 1-year high of $387.19.

MongoDB (NASDAQ:MDB) last posted its quarterly earnings results on Wednesday, March 5th. The company reported $0.19 EPS for the quarter, missing the consensus estimate of $0.64 by ($0.45). The business had revenue of $548.40 million for the quarter, compared to analysts’ expectations of $519.65 million. MongoDB had a negative return on equity of 12.22% and a negative net margin of 10.46%. During the same period last year, the business posted $0.86 earnings per share. On average, equities analysts anticipate that MongoDB, Inc. will post -1.78 earnings per share for the current year.

MongoDB Company Profile


MongoDB, Inc., together with its subsidiaries, provides a general purpose database platform worldwide. The company provides MongoDB Atlas, a hosted multi-cloud database-as-a-service solution; MongoDB Enterprise Advanced, a commercial database server for enterprise customers to run in the cloud, on-premises, or in a hybrid environment; and Community Server, a free-to-download version of its database, which includes the functionality that developers need to get started with MongoDB.


Want to see what other hedge funds are holding MDB? Visit HoldingsChannel.com to get the latest 13F filings and insider trades for MongoDB, Inc. (NASDAQ:MDB).

[Chart: Institutional Ownership by Quarter for MongoDB (NASDAQ:MDB)]



Receive News & Ratings for MongoDB Daily – Enter your email address below to receive a concise daily summary of the latest news and analysts’ ratings for MongoDB and related companies with MarketBeat.com’s FREE daily email newsletter.

Subscribe for MMS Newsletter

By signing up, you will receive updates about our latest information.

  • This field is for validation purposes and should be left unchanged.