Presentation: gRPC Migration Automation at LinkedIn

MMS Founder
MMS Karthik Ramgopal Min Chen

Article originally posted on InfoQ. Visit InfoQ

Transcript

Ramgopal: What do we do today at LinkedIn? We use a lot of REST, and we built this framework called Rest.li. We didn’t like REST as it exists in the standard open-source stuff, so we wanted to build a framework around it. It’s primarily Java based, uses JSON, and uses the familiar HTTP verbs. We thought it was a great idea, and it has worked fairly well for us. It primarily powers interactions between our microservices, between our client applications and our frontends, and our externalized endpoints. We have an external API program; lots of partners use those APIs to build applications on LinkedIn. Over the years, we have ended up using Rest.li in over 50,000 endpoints. It’s a huge number, and it keeps growing.

Rest.li Overview

How does Rest.li work? We have this thing called Pegasus Data Language (PDL). It is a data schema language: you author your schemas in PDL, and the schema describes the shape of your data. From it, we generate what are called record template classes, a code-generated, programming-language-friendly binding for interacting with your schemas. Then we have these resource classes. Rest.li is weird here: traditionally in RPC frameworks, you start IDL first and write your service definitions in that IDL.

We somehow thought that starting in Java was a great idea, so you write these Java classes, you annotate them, and those are your resource classes. From them, we generate the IDL. It’s kind of reversed, and it is clunky. This IDL is in the form of JSON. All these things combine to generate the type-safe request builders, which is how clients actually call the service. These are generated Java classes, just syntactic sugar over HTTP and REST under the hood, but this is how clients interact with the server. The client also uses the record template bindings. We also generate human-readable documentation, which you can explore to understand what APIs exist.
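For readers who have not seen Rest.li, a minimal sketch of such an annotated resource class is shown below, loosely based on the open-source Rest.li Greetings tutorial; the Greeting class is assumed to be a record template generated from a Pegasus schema.

    import com.linkedin.restli.server.annotations.RestLiCollection;
    import com.linkedin.restli.server.resources.CollectionResourceTemplate;

    // A Rest.li "resource" is a plain Java class with annotations; the framework
    // derives the JSON IDL and the type-safe request builders from it.
    @RestLiCollection(name = "greetings", namespace = "com.example.greetings")
    public class GreetingsResource extends CollectionResourceTemplate<Long, Greeting> {

      // Handles GET /greetings/{id}; Greeting is the record template class
      // code-generated from the Pegasus (PDL) schema (not shown here).
      @Override
      public Greeting get(Long id) {
        return new Greeting().setId(id).setMessage("Hello, world");
      }
    }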

Gaps in Rest.li

Gaps in Rest.li. Back in the day, it was pretty straightforward synchronous communication. Over the years, we’ve needed support for streaming. We’ve needed support for deferred responses. We’ve also needed support for deadlines, because our stack has grown deeper and we want to set deadlines from the top. Rest.li does not support any of these things. We also excessively use reflection, string interpolation, and URI encoding. Originally, this was done for simplicity and flexibility, but this heavily hurts performance. We also have service stubs declared as Java classes. I talked about this a bit before. It’s not great. We have very poor support for non-Java servers and clients. LinkedIn historically has been a Java shop, but of late, we are starting to use a lot of other programming languages.

For our mobile apps, for example, we use a lot of Objective-C, Swift on iOS, Java, Kotlin on Android. We’ve built clients for those. Then on the website, we use JavaScript and TypeScript, we’ve built clients for those. On the server side, we are using a lot of Go in our compute stack, which is Kubernetes based, as well as some of our observability pieces. We are using Python for our AI stuff, especially with generative AI, online serving, and stuff. We’re using C++ and Rust for our lower-level infrastructure. Being Java only is really hurting us. The cost of building support for each and every programming language is prohibitively expensive. Although we open sourced it, it’s not been that well adopted. We are pretty much the only ones contributing to it and maintaining it, apart from a few enthusiasts. It’s not great.

Why gRPC?

Why gRPC? We get bidirectional streaming and not just unidirectional streaming. We have support for deferred responses and deadlines. The cool part is we can also take all these features and plug them into higher level abstractions like GraphQL, for example. It works really well. We have excellent out of the box performance. We did a bunch of benchmarking internally as well, instead of trusting what the gRPC folks told us. We have declarative service stubs, which means no more writing clunky Java classes. You write your RPC service definitions in one place, and you’re done with it. We have great support for multiple programming languages.

At least all the programming languages we are interested in are really well supported. Of course, it has an excellent open-source community. Google throws its weight behind it. A lot of other companies also use it. We are able to reduce our infrastructure support costs and actually focus on things that matter to us without worrying about this stuff.
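As a small illustration of the deadline and streaming support described above, here is a hedged grpc-java sketch; ProfileServiceGrpc and the request/response types are hypothetical stand-ins for whatever protoc would generate from a migrated service's .proto.

    import io.grpc.ManagedChannel;
    import io.grpc.ManagedChannelBuilder;
    import java.util.Iterator;
    import java.util.concurrent.TimeUnit;

    public class GrpcFeaturesDemo {
      public static void main(String[] args) {
        ManagedChannel channel = ManagedChannelBuilder
            .forAddress("profiles.example.com", 8443)
            .useTransportSecurity()
            .build();

        // Hypothetical generated stub; ProfileServiceGrpc would come from protoc.
        ProfileServiceGrpc.ProfileServiceBlockingStub stub =
            ProfileServiceGrpc.newBlockingStub(channel)
                // Deadlines propagate down the call tree, something Rest.li lacked.
                .withDeadlineAfter(500, TimeUnit.MILLISECONDS);

        // Unary call with a deadline.
        Profile profile = stub.getProfile(
            GetProfileRequest.newBuilder().setMemberId(42L).build());

        // Server-streaming call: the blocking stub returns an Iterator of messages.
        Iterator<Profile> connections = stub.streamConnections(
            StreamConnectionsRequest.newBuilder().setMemberId(42L).build());
        while (connections.hasNext()) {
          System.out.println(connections.next().getFirstName());
        }

        channel.shutdown();
      }
    }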

Automation

All this is great. We’re like, let’s move to gRPC. We go to execs, and they’re like, “This is a lot of code to move, and it’s going to take a lot of humans to move it.” They’re like, “No, it’s too expensive. Just don’t do it. The ROI is not there.” In order to justify the ROI, we had to come up with a way to reduce this cost, because three calendar years involving so many developers, it’s going to hurt the business a lot. We decided to automate it. How are we automating it?

Right now, what we have is Rest.li, essentially, and in the future, we want to have pure gRPC. What we need is a bridge between the two worlds. We actually named this bridge Alcantara. Bridged gRPC is the intermediary state where we serve both gRPC and Rest.li, and allow for a smooth transition from the pure Rest.li world to the pure gRPC world. Here we are actually talking gRPC over the wire, although we are serving Rest.li. We have a few phases here. In stage 1, we have our automated migration infrastructure. We have this bridged gRPC mode where our Rest.li resources are wrapped with a gRPC layer, and we are serving both Rest.li and gRPC. In stage 2, we start moving the clients from Rest.li to gRPC using a configuration flag, gradually shifting traffic, and we can slowly start to retire our Rest.li clients. Finally, once all the clients are migrated over, we can retire the server’s Rest.li path and serve pure gRPC.

gRPC Bridged Mode

This is the bridged mode at a high level, where we are essentially able to have Rest.li and gRPC running side by side. It’s an intermediary stepping stone. More importantly, it unlocks new gRPC features for new endpoints or evolutions of existing endpoints, without requiring the services to do a full migration. We use the gRPC protocol over the wire. We also serve Rest.li for folks who still want to talk Rest.li before the migration finishes. We have a few principles here. We wanted to use codegen for performance, to ensure that the overhead of the bridge was as low as possible.

We also wanted to ensure that this was completely hidden away in infrastructure without requiring any manual work on the part of the application developers, either on the server or the client. We wanted to decouple client adoption and server adoption so that we could ramp them independently. Obviously, the server needs to go before the client, but we wanted the ramps to be as decoupled as possible. We also wanted to do a gradual client traffic ramp to ensure that if there are any issues, bugs, or problems introduced by the bridging infrastructure, they are caught early on, instead of a big bang where we take the site down, which would be very bad.

Bridge Mode (Deep Dive)

Chen: Next I will go deeper into what the bridge mode really looks like. It includes two parts, server side and client side, as we mentioned. We have to decouple them, because people are still developing and services are running in production; we cannot disrupt them. I will talk about server migration first. Before the server migration, each server exposes one endpoint, which is Rest.li. That is the initial state. After running the bridge, we get the bridge mode server. In this bridge mode server, inside the same JVM, we autogenerate a gRPC service. This gRPC service delegates to the Rest.li service to complete each task. Zooming into this gRPC service, it does the following steps to do the conversion.

First we convert from proto to PDL, which is our Pegasus data model, then translate the gRPC request to a Rest.li request, make an in-process call to the Rest.li resource to complete the task, get the response back, and translate it back to gRPC. This is a code snippet for the autogenerated gRPC service, in this case a GET call. Internally, we fill in the blanks with what we just discussed: a gRPC request comes in, we translate it to a Rest.li request, make an in-process call to the Rest.li resource to do the job, and afterwards translate the Rest.li response to a gRPC response. Remember, this is an in-process call. Why in-process? Because otherwise you make a remote call and pay for an extra network hop. That is the tradeoff we thought about there.
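The slide code is not reproduced in the transcript, but a generated bridge method would look roughly like the following sketch; every class name here (GreetingsGrpcBridge, GreetingRequestBridge, GreetingsResource, and so on) is a hypothetical stand-in for the autogenerated artifacts being described.

    import io.grpc.stub.StreamObserver;

    // Autogenerated bridge: lives in the same JVM as the Rest.li resource and
    // serves the gRPC endpoint while delegating to the existing business logic.
    public class GreetingsGrpcBridge extends GreetingsServiceGrpc.GreetingsServiceImplBase {

      private final GreetingsResource restliResource;      // existing Rest.li resource
      private final GreetingRequestBridge requestBridge;    // proto -> Pegasus request converter
      private final GreetingResponseBridge responseBridge;  // Pegasus -> proto response converter

      public GreetingsGrpcBridge(GreetingsResource restliResource,
                                 GreetingRequestBridge requestBridge,
                                 GreetingResponseBridge responseBridge) {
        this.restliResource = restliResource;
        this.requestBridge = requestBridge;
        this.responseBridge = responseBridge;
      }

      @Override
      public void get(GetGreetingRequest grpcRequest,
                      StreamObserver<GetGreetingResponse> responseObserver) {
        // 1. Translate the gRPC (proto) request into the Rest.li/Pegasus form.
        Long id = requestBridge.toRestliKey(grpcRequest);

        // 2. In-process call to the existing Rest.li resource -- no extra network hop.
        Greeting restliGreeting = restliResource.get(id);

        // 3. Translate the Rest.li response back into the gRPC (proto) response.
        responseObserver.onNext(responseBridge.toProto(restliGreeting));
        responseObserver.onCompleted();
      }
    }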

Under the Hood: PDL Migration

Now I need to talk a little bit about what we are doing under the hood in the automated migration framework. As we discussed before, there are two parts in Rest.li: PDL, which is the data model, and the IDL, which is the API. Under the hood, we do two migrations. First, we built tooling to do the PDL migration; we coined a term for it, protoforming. In the PDL migration, you start with Pegasus, that is PDL. We built a pegasus-to-proto schema translator to produce a proto schema. The proto schema becomes your source of truth, because after the migration developers work in a native gRPC development environment, not with PDL anymore.

The source of truth is the proto schema, and developers evolve it there. From the source-of-truth proto schema, we use the protoc compiler to compile all the different multi-language artifacts for the proto. Then, because developers work on the proto but the underlying service is still Rest.li, how does this work? When people evolve the proto, we have a proto-to-pegasus reverse translator to get back to Pegasus. Then all our Pegasus tooling continues to work and produces all the Pegasus artifacts, so no business logic is impacted. That is the schema part, but there is also a data part. For the data part, we built an interop generator to create a bridge between Pegasus data and proto data, so that this Pegasus data bridge can work with both the proto binding and the Pegasus binding to do the data conversion. That is what protoforming for PDL is under the hood.

There is a complication here, because there is not complete feature parity between Pegasus PDL and the proto spec. There are several gaps. For example, in Pegasus we have includes, which is like inheritance in a certain way; it is a weird inheritance, more like an embedded macro. Proto doesn’t have this. We also have required and optional: in Pegasus, people define which fields are required and which are optional, and we all know proto3 got rid of this. We also have custom defaults, so that people can specify a default value for each field.

Proto doesn’t have that either. We also have unions without aliases, meaning you can have a union of different types without an alias name for each member; proto requires named aliases for all of them. We also have custom types such as fixed: fixed means you have defined, say, a UUID with a fixed size. We can also define a typeref, which means you can have an alias for another type. None of these things exist in proto. And, importantly, Pegasus also allows cyclic imports, but proto doesn’t allow this.

How do we bridge all these feature parity gaps? The solution is to introduce Pegasus custom options. Here’s an example. This is a simple greeting Pegasus model, with a required field and an optional field. After data model protoforming, we generate a proto. As you can see, all the fields have Pegasus options defined on them, for example required and optional. We use those options in our infra to generate a Pegasus validator that mimics the behavior we had before protoforming, for instance throwing a required-field exception if a field is not specified.
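As a rough illustration of what such a generated proto might look like, here is a sketch; the option names (pegasus_required, pegasus_default) are illustrative only, since LinkedIn's actual custom options are internal.

    syntax = "proto3";

    package com.example.greetings;

    import "google/protobuf/descriptor.proto";

    // Illustrative custom options; the real Pegasus options used by the
    // protoforming tooling are internal and may differ.
    extend google.protobuf.FieldOptions {
      bool pegasus_required = 50001;
      string pegasus_default = 50002;
    }

    message Greeting {
      int64 id = 1 [(pegasus_required) = true];
      // Optional in the Pegasus schema, with a custom default carried as an option
      // so the generated Pegasus validator can reproduce the old behavior.
      string message = 2 [(pegasus_default) = "Hello"];
      string tone = 3;
    }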

That is how we use our custom Pegasus options to bridge the gap for required and optional. We have other Pegasus validators for cyclic imports and for fixed sizes; they all use our custom option hints. Note that all of these classes are autogenerated, and we define the interfaces so that everything is generated consistently. Now the schemas are bridged and mapped together, and you need to bridge the data. For the data model bridge, we introduced a Pegasus bridge interface, and for each data model we autogenerate a Pegasus data bridge, the bridge between the proto object and the Pegasus object.
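A minimal sketch of what such a generated data bridge could look like follows; the interface and the Greeting/GreetingProto types are hypothetical, while RecordTemplate is Rest.li's base class for Pegasus bindings and Message is protobuf's.

    import com.google.protobuf.Message;
    import com.linkedin.data.template.RecordTemplate;

    // Hypothetical bridge contract: one autogenerated implementation per data model.
    public interface PegasusProtoBridge<P extends RecordTemplate, M extends Message> {
      M toProto(P pegasusRecord);   // Pegasus binding -> proto binding
      P toPegasus(M protoMessage);  // proto binding -> Pegasus binding
    }

    // Autogenerated implementation for the Greeting model (sketch).
    class GreetingDataBridge implements PegasusProtoBridge<Greeting, GreetingProto> {
      @Override
      public GreetingProto toProto(Greeting pegasusRecord) {
        return GreetingProto.newBuilder()
            .setId(pegasusRecord.getId())
            .setMessage(pegasusRecord.getMessage())
            .build();
      }

      @Override
      public Greeting toPegasus(GreetingProto protoMessage) {
        return new Greeting()
            .setId(protoMessage.getId())
            .setMessage(protoMessage.getMessage());
      }
    }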

Under the Hood: API Migration

Now let’s talk about API migration. The API is described by the IDL we use in Rest.li, which is basically the API contract. We have a similar flow for IDL protoforming. You start with the Pegasus IDL. We built a rest.li-to-grpc service translator to translate the IDL into a proto service definition. That service proto becomes your source of truth later on, because after this migration your source code doesn’t see the IDL anymore; it is not in source control. What developers see is the proto. From this proto, the gRPC compiler gives you all the compiled gRPC service artifacts and the service stubs. When developers evolve the proto service, the underlying Rest.li resource is still what executes, so we built a proto-to-rest.li service backporter to backport the service API change to the Rest.li service signature, so that developers can evolve their Rest.li resource to make the feature change.

Afterwards, all the Pegasus plugins kick in to generate the derived IDL. Every artifact works the same as before, and your clients are not impacted at all when people evolve the service. Of course, we also built a service interop generator to bridge the requests and responses, because the application interacts through Rest.li requests while over the wire we are sending gRPC requests; we need a bridge for both the request and the response. The bridged gRPC service makes an in-process call to the evolved Rest.li resource to complete the task. That is the whole flow of how API migration works under the hood. Everything is completely automated; there is nothing developers need to do. The only time they get involved is when they need to change an endpoint’s functionality. Then they go evolve the Rest.li resource to fill in the business logic.

I’ll give you some examples of how this API protoforming works. This is the IDL, pure JSON generated from the Java Rest.li resource, which shows what my Rest.li endpoint supports. And this is the auto-translated gRPC service proto. Each method is wrapped in a request and a response message, and also carries Pegasus custom options that help us later generate the Pegasus request-response bridge and the client bridge.

To give an example, similar to the data model bridge, we generate a request bridge to translate a Pegasus GetGreetingRequest into a proto GetGreetingRequest. The response is the same. It is bidirectional: the autogenerated response bridge goes from Pegasus to proto and from proto to Pegasus. It has to be bidirectional because, as I illustrated before, one direction is used on the server and the other direction is used on the client, and they have to work together.

gRPC Client Migration

Now that I’ve finished talking about server migration, let’s talk a little bit about client migration, because when a server migrates, clients that are still Rest.li need to keep working. They are decoupled. For client migration, remember, before the migration our server is purely Rest.li. How does the client work? We have a factory-generated Rest.li client that makes a curli call to the REST endpoint. Once we do the server bridge, the server exposes two endpoints, running side by side inside the same JVM. For the client bridge, we introduce a facet we call the Rest.liOverGrpcClient. This facet looks at the service to see whether it is migrated or not. If it is not migrated, the call goes through the old path: the Rest.li client goes through curli.

If the service is migrated, this client bridge class goes through the same pdl-to-proto and rest.li-to-grpc conversion, sends a gRPC request over the wire, and on the way back the gRPC response goes through the grpc-to-rest.li response converter, back to your client. This way the client bridge handles both kinds of traffic seamlessly. This is what the client bridge looks like. It exposes only one execute method: you give it a Pegasus request, it converts that Rest.li request to a gRPC request and sends it over the wire using the gRPC stub call. In gRPC everything promotes type safety, so you need to use the client stub to actually make the call. Then, on the way back, we use the gRPC-response-to-Rest.li-response bridge to get the response back to your client. We call that the Pegasus client bridge.
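A rough sketch of such a client-side facet is shown below; the class names, the converters, and the migration flag lookup are hypothetical stand-ins for LinkedIn's real generated code and configuration system.

    // Hypothetical client-side facet: routes each request over either the legacy
    // Rest.li path or the new gRPC path, based on a per-service ramp flag.
    public class RestliOverGrpcClient {

      private final RestliGreetingsClient legacyRestliClient;   // existing Rest.li client (assumed)
      private final GreetingsServiceGrpc.GreetingsServiceBlockingStub grpcStub;
      private final GreetingRequestBridge requestBridge;        // Pegasus -> proto request converter
      private final GreetingResponseBridge responseBridge;      // proto -> Pegasus response converter
      private final MigrationConfig migrationConfig;            // ramp flag source (assumed)

      public RestliOverGrpcClient(RestliGreetingsClient legacyRestliClient,
                                  GreetingsServiceGrpc.GreetingsServiceBlockingStub grpcStub,
                                  GreetingRequestBridge requestBridge,
                                  GreetingResponseBridge responseBridge,
                                  MigrationConfig migrationConfig) {
        this.legacyRestliClient = legacyRestliClient;
        this.grpcStub = grpcStub;
        this.requestBridge = requestBridge;
        this.responseBridge = responseBridge;
        this.migrationConfig = migrationConfig;
      }

      // Single entry point: callers keep handing us a Pegasus/Rest.li request.
      public Greeting execute(GetGreetingRestliRequest restliRequest) {
        if (!migrationConfig.isGrpcRampedFor("greetings")) {
          // Not migrated (or ramped back down): old path, plain Rest.li over HTTP.
          return legacyRestliClient.get(restliRequest);
        }
        // Migrated: convert the Rest.li request to proto, call the typed gRPC
        // stub over the wire, and convert the proto response back to Pegasus.
        GetGreetingRequest grpcRequest = requestBridge.toProto(restliRequest);
        GetGreetingResponse grpcResponse = grpcStub.get(grpcRequest);
        return responseBridge.toPegasus(grpcResponse);
      }
    }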

Bridged Principles

To sum up and repeat why we use the bridge, there are several principles we highlight here. The first principle, as you can see, is that we use codegen a lot. Why codegen? Because of performance. You could use reflection instead, and in Rest.li we suffered a lot from that, so we do all the codegen for the bridge for performance. Second, there is no manual work for your server or your client: run the migration tool, people don’t change their code, and traffic can switch automatically.

Another important part is that we decoupled the server and client gRPC switches, because we have to handle people continually developing. While the server migration is in progress, clients are still sending traffic, and we cannot disrupt any business logic, so it is very important to decouple the server and client switches. Finally, we allow a gradual client traffic ramp, shifting traffic from Rest.li to gRPC, so that we get immediate feedback; if we see degradation or regression, we can ramp back down and not disrupt business traffic.

Pilot Proof (Profiles)

In theory that all works and everything looks fine, but does it really work in real life? We did a pilot. We picked the most complicated service endpoint running at LinkedIn: profiles. Everybody uses it, because it is the endpoint that serves all LinkedIn member profiles, and it has the highest QPS. We ran the bridge automation on this endpoint and then benchmarked it side by side against the previous Rest.li path. As you can see, client-server latency in bridge mode not only didn’t degrade, it actually got better, because protobuf encoding is more efficient than Rest.li’s Pegasus encoding. There is a slight increase in memory usage, which is understandable because we are running both paths in the same JVM. We verified this works before rolling out the mass migration.

Mass Migration

It is relatively simple to run this on one MP service endpoint. But think about it: we have 50,000 endpoints and 2,000 services running at LinkedIn, so doing this as a mass migration without disruption is a very challenging task. Before I go on, I want to talk about the challenges here. Unlike most companies in the industry, LinkedIn doesn’t use a monorepo; we use multi-repo. What that means is a decentralized development environment and deployment. Each service is in its own repo, which has its pros: flexible development, everybody decoupled from everybody else, nobody dependent on another team’s repo. But it also causes issues, and it is the most difficult part of our mass migration, because, first, every repo has to have a green build.

The repo has to build successfully before I can run the migration; otherwise I don’t know whether my migration caused the build failure or the build was already failing. Deployment matters too, because after migrating I need to deploy so that I can ramp traffic. If people don’t deploy their code, the migrated code has no effect and I cannot shift traffic, so deployment freshness also needs to be there.

Also, at this scale we need a way to track what is migrated and what is not, what the progress is, and what the errors are, so a migration tracker needs to be in place. Last but not least, we need up-to-date dependencies, because the different repos depend on each other. You need to bring dependencies up to date correctly so that our latest infrastructure changes are applied.

Those are all the prerequisites that need to be finished before we can start the mass migration. This diagram shows how much complexity multi-repo brings to our mass migration. As we said, PDL is the data model and IDL is the API model. With PDL, just like with proto, each MP, each repo’s service, can define its data models; another service can import that PDL into its repo and compose another PDL that is in turn imported by yet another repo; and the API IDL also depends on your PDLs.

To make things more complicated, in Rest.li people can define a PDL in one API repo and then define the implementation in another repo; the implementation repo depends on the API repo. All these data model and service dependencies create complicated dependency ordering, which we have to consider up to 20 levels deep, because you have to protoform the PDLs a repo depends on before you protoform that repo; otherwise you cannot import the generated protos. That is why we had to figure out this complicated dependency ordering, to make sure we migrate repos in the right order, in the right sequence.

To ensure all the target changes propagate to all our repos, we built a high-level automation called the grpc-migration-orchestrator. With the grpc-migration-orchestrator we are eating our own dogfood: we developed that service using gRPC itself. It consists of several components, the most important being the dashboard. Everybody knows we need a dashboard to track which MPs are migrated and which are ready to migrate, and which state they are in: PDL protoforming, IDL protoforming, post-processing, and so on.

We have a database to track that, and a dashboard to display it. There is a gRPC service endpoint to perform all the actions and execute the pipeline, going through PDL and IDL and working with our migration tool, and a job runner framework to run the long-running jobs. There are additional complications beyond the PDL and IDL migrations: because you expose a new endpoint, we need to allocate a port, so we talk to the porter tool to assign a gRPC port for you. We also need to talk to our acl-tool to enable authentication and authorization for the new endpoint, the ACL migration.

We also set up the service discovery configuration for the new endpoint, so that load balancing and service discovery work for the migrated gRPC endpoint the same as they did for the Rest.li endpoint. Those are all the components. We built this orchestrator to handle all of it automatically.

Besides that, we built a dry-run process to simulate the mass migration so that we can discover bugs preemptively, instead of finding them during the real mass migration when developers would get stuck. This dry-run workflow includes both automated and manual work. On the automated side, we have a planner that does offline data mining to figure out the list of repos to migrate, based on the dependencies. Then we run the automation framework, the infra code, on a remote developer cluster.

After that, we analyze the logs and aggregate the errors into categories. Finally, for each daily run we capture a snapshot of the execution on a remote drive so that we can analyze and reproduce issues easily. Every day a regression report is generated from the daily run; developers can pick up this report to see regressions and fix bugs, and this becomes a [inaudible 00:31:29] every day. This is the dry-run process we developed to give us more confidence before we kick off the mass migration across the whole company. As we mentioned before, the gRPC bridge is just a stepping stone to get to the end state without disrupting the running business. Our end goal is to go gRPC native.

gRPC Native

Ramgopal: We first want to get the client off the bridge. The reason is that once you have moved all the traffic to gRPC via the switch we described earlier, you should be able to clean up all the Rest.li code and directly use gRPC. On the server, it’s decoupled, because we have the gRPC facet. Whether you rewrite the facet or it’s still using Rest.li under the hood doesn’t really matter, because the client is talking gRPC. We want to go from what’s on the left to what’s on the right. It looks fairly straightforward, but it’s not, because of all the differences we described before.

We also want to get the server off the bridge. A prerequisite for this is, of course, that all the client traffic gets shifted. Once the client traffic gets shifted, we have to do a few things. We have to delete those ugly looking Pegasus options, because they no longer make sense once all the traffic is gRPC. We have to ensure that all the logic they enabled in terms of validation stays in place, either in application code or elsewhere. We also have to replace the in-process Rest.li call we were making with the ported-over business logic behind that Rest.li resource, because otherwise that business logic is not going to execute. We need to delete all the Rest.li artifacts, like the old schemas, the old resources, the request builders; all of that needs to be gone. There are a few problems, like we described here.

We spoke about some of the differences between Pegasus and proto, which we tried to patch over with options. There are also differences in the binding layer. For example, the Rest.li code uses mutable bindings where you have getters and setters. gRPC bindings, though, are immutable, at least in Java. It’s a huge paradigm shift. We also have to change a lot of code, 20 million lines of code, which is pretty huge. What we’ve seen in our local experimentation is that an AST (abstract syntax tree) based code mod is simply not enough; there are so many nuances that the accuracy rate is horrible. It’s not going to work. Of course, you can ask humans to do this themselves, but as I mentioned before, that wouldn’t fly. It’s very expensive. So we are taking the help of generative AI.
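To illustrate the binding paradigm shift with the same hypothetical Greeting example used earlier: the Pegasus record template is mutated in place with setters, while the protobuf Java binding is immutable and every construction or change goes through a builder.

    class BindingStyles {
      static void demo() {
        // Rest.li / Pegasus binding: mutable, getter/setter style.
        Greeting restliGreeting = new Greeting();
        restliGreeting.setMessage("Hello");
        restliGreeting.setTone(Tone.FRIENDLY);

        // gRPC / protobuf Java binding: immutable; construction goes through a builder.
        GreetingProto protoGreeting = GreetingProto.newBuilder()
            .setMessage("Hello")
            .setTone(GreetingProto.Tone.FRIENDLY)
            .build();

        // "Modifying" an existing proto message means building a new one from it.
        GreetingProto louder = protoGreeting.toBuilder()
            .setMessage("HELLO")
            .build();
      }
    }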

We are essentially using both foundation models as well as fine-tuned models in order to ask AI, we just wave our hands and ask AI to do the migration for us. We have already used this for a few internal migrations. For example, just like we built Rest.li, long ago, we built another framework called Deco for functionality similar to GraphQL. Right now, we’ve realized that we don’t like Deco, we want to go to GraphQL. We’ve used this framework to actually try and migrate a lot of our code off Deco to GraphQL.

In the same way on the offline side, we were using Pig quite a bit, and right now, we’re going to Hive and we’re going to Spark. We are using a lot of this generative AI based system in order to do this migration. Is it perfect? No, it’s about 70%, 80% accurate. We are continuously improving the pipeline, as well as adopting newer models to increase the efficacy of this. We are still in the process of doing this.

Key Takeaways

What are the key takeaways? We’ve essentially built an automation framework, which right now is taking us to bridge mode, and we are working on plans to get off the bridge and do the second step. We are essentially switching from Rest.li to gRPC under the hood, without interrupting the business. We are undertaking a huge scope, 50,000 endpoints across 2,000 services, and we are doing this in a compressed period of 2 to 3 quarters with a small infrastructure team, instead of spreading the pain across the entire company over 2 to 3 years.

Questions and Answers

Participant 1: Did this cause any incidents, and how much harder was it to debug this process? Because it sounds like it adds a lot of complexity to transfer data.

Ramgopal: Is this causing incidents? How much complexity does it add to debug issues because of this?

We’ve had a few incidents so far. Obviously, any change like this without any incidents would be almost magical; in the real world, there’s no magic. Debugging actually has been pretty good, because a lot of these automation pieces we showed are all code which you can step through. Often, you get really nice traces, which you can look at and try to understand what happened. What we did not show, which we’ve also automated, is that we’ve automated a lot of the alerts and the dashboards as part of the migration.

All the alerts we were having on the Rest.li side you would also get on the gRPC side in terms of error rates and stuff. We’ve also instrumented and added a lot of logging and tracing. You can look at the dashboards and know exactly what went wrong, where. We also have this tool which helps you associate a timeline of when major changes happened. For example, if a service is having errors after we started routing gRPC traffic to it, we have the timeline analyzer to know, correlationally, this is what happened, so most likely this is the problem. Our MTTR, MTTD, has actually been largely unaffected by this change.

Chen: We also built dark cluster infrastructure along with this, so that when a team migrates, they can set up a dark cluster to duplicate traffic before the real ramp.

Participant 2: In your schema, or in your migration process, you went from the PDL, you generated gRPC resources, and said, that’s the source of truth, but then you generated PDL again. Is it because you want to change your gRPC resources and then have these changes flow back to PDL, so that you only have to change this one flow and get both out of it?

Ramgopal: Yes, because we will still have a few services and a few clients using Rest.li, because they haven’t undergone the migration yet, and you still want them to see new changes to the APIs.




PyTorch 2.5 Release Includes Support for Intel GPUs

MMS Founder
MMS Anthony Alford

Article originally posted on InfoQ. Visit InfoQ

The PyTorch Foundation recently released PyTorch version 2.5, which contains support for Intel GPUs. The release also includes several performance enhancements, such as the FlexAttention API, TorchInductor CPU backend optimizations, and a regional compilation feature which reduces compilation time. Overall, the release contains 4095 commits since PyTorch 2.4.

The Intel GPU support was previewed at the recent PyTorch conference. Intel engineers Eikan Wang and Min Jean Cho described the PyTorch changes made to support the hardware. This included generalizing the PyTorch runtime and device layers, which makes it easier to integrate new hardware backends. Intel-specific backends were also implemented for torch.compile and torch.distributed. According to Kismat Singh, Intel’s VP of engineering for AI frameworks:

We have added support for Intel client GPUs in PyTorch 2.5 and that basically means that you’ll be able to run PyTorch on the Intel laptops and desktops that are built using the latest Intel processors. We think it’s going to unlock 40 million laptops and desktops for PyTorch users this year and we expect the number to go to around 100 million by the end of next year.

The release includes a new FlexAttention API which makes it easier for PyTorch users to experiment with different attention mechanisms in their models. Typically, researchers who want to try a new attention variant need to hand-code it directly from PyTorch operators. However, this could result in “slow runtime and CUDA OOMs.” The new API supports writing these instead with “a few lines of idiomatic PyTorch code.” The compiler then converts these to an optimized kernel “that doesn’t materialize any extra memory and has performance competitive with handwritten ones.”

Several performance improvements have been released in beta status. A new Fused Flash Attention backend provides “up to 75% speed-up over FlashAttentionV2” for NVIDIA H100 GPUs. A regional compilation feature for torch.compile reduces the need for full model compilation; instead, repeated nn.Modules, such as Transformer layers, are compiled once and reused. This can reduce compilation latency while incurring only a few percent performance degradation. There are also several optimizations to the TorchInductor CPU backend.

Flight Recorder, a new debugging tool for stuck jobs, was also included in the release. Stuck jobs can occur during distributed training, and could have many root causes, including data starvation, network issues, or software bugs. Flight Recorder uses an in-memory circular buffer to capture diagnostic info. When it detects a stuck job, it dumps the diagnostics to a file; the data can then be analyzed using a script of heuristics to identify the root cause.

In discussions about the release on Reddit, many users were glad to see support for Intel GPUs, calling it a “game changer.” Another user wrote:

Excited to see the improvements in torch.compile, especially the ability to reuse repeated modules to speed up compilation. That could be a game-changer for large models with lots of similar components. The FlexAttention API also looks really promising – being able to implement various attention mechanisms with just a few lines of code and get near-handwritten performance is huge. Kudos to the PyTorch team and contributors for another solid release!

The PyTorch 2.5 code and release notes are available on GitHub.



MongoDB, Inc. (NASDAQ:MDB) Shares Sold by Raymond James & Associates – MarketBeat

MMS Founder
MMS RSS

Posted on mongodb google news. Visit mongodb google news

Raymond James & Associates lowered its position in MongoDB, Inc. (NASDAQ:MDB) by 21.3% in the 3rd quarter, according to its most recent 13F filing with the SEC. The firm owned 44,082 shares of the company’s stock after selling 11,908 shares during the period. Raymond James & Associates owned about 0.06% of MongoDB worth $11,918,000 at the end of the most recent quarter.

Several other hedge funds and other institutional investors have also made changes to their positions in the stock. Sunbelt Securities Inc. lifted its holdings in shares of MongoDB by 155.1% in the 1st quarter. Sunbelt Securities Inc. now owns 125 shares of the company’s stock valued at $45,000 after acquiring an additional 76 shares during the last quarter. Diversified Trust Co raised its stake in shares of MongoDB by 19.0% during the first quarter. Diversified Trust Co now owns 4,111 shares of the company’s stock valued at $1,474,000 after acquiring an additional 657 shares in the last quarter. Sumitomo Mitsui Trust Holdings Inc. raised its stake in shares of MongoDB by 2.3% during the first quarter. Sumitomo Mitsui Trust Holdings Inc. now owns 182,727 shares of the company’s stock valued at $65,533,000 after acquiring an additional 4,034 shares in the last quarter. Azzad Asset Management Inc. ADV increased its holdings in MongoDB by 650.2% during the first quarter. Azzad Asset Management Inc. ADV now owns 5,214 shares of the company’s stock valued at $1,870,000 after buying an additional 4,519 shares during the period. Finally, BluePath Capital Management LLC purchased a new position in MongoDB during the first quarter valued at approximately $203,000. Institutional investors own 89.29% of the company’s stock.

Insiders Place Their Bets

In other MongoDB news, CFO Michael Lawrence Gordon sold 5,000 shares of the stock in a transaction that occurred on Monday, October 14th. The shares were sold at an average price of $290.31, for a total value of $1,451,550.00. Following the sale, the chief financial officer now directly owns 80,307 shares of the company’s stock, valued at approximately $23,313,925.17. This represents a 0.00% decrease in their position. The sale was disclosed in a document filed with the SEC. Also, Director Dwight A. Merriman sold 1,385 shares of the stock in a transaction on Tuesday, October 15th. The shares were sold at an average price of $287.82, for a total transaction of $398,630.70. Following the completion of the transaction, the director now directly owns 89,063 shares in the company, valued at $25,634,112.66. This trade represents a 0.00% decrease in their ownership of the stock. Insiders have sold a total of 23,281 shares of company stock worth $6,310,411 in the last ninety days. 3.60% of the stock is owned by corporate insiders.

Wall Street Analysts Forecast Growth

Several research firms have issued reports on MDB. Scotiabank lifted their price target on shares of MongoDB from $250.00 to $295.00 and gave the stock a “sector perform” rating in a report on Friday, August 30th. Mizuho lifted their price target on shares of MongoDB from $250.00 to $275.00 and gave the stock a “neutral” rating in a report on Friday, August 30th. Truist Financial lifted their price target on shares of MongoDB from $300.00 to $320.00 and gave the stock a “buy” rating in a report on Friday, August 30th. Piper Sandler lifted their price target on shares of MongoDB from $300.00 to $335.00 and gave the stock an “overweight” rating in a report on Friday, August 30th. Finally, Royal Bank of Canada reaffirmed an “outperform” rating and issued a $350.00 price target on shares of MongoDB in a report on Friday, August 30th. One research analyst has rated the stock with a sell rating, five have assigned a hold rating, twenty have issued a buy rating and one has given a strong buy rating to the company. According to MarketBeat.com, the stock presently has a consensus rating of “Moderate Buy” and an average target price of $337.96.


MongoDB Price Performance

Shares of MDB stock traded up $3.03 on Tuesday, reaching $275.21. 657,269 shares of the company were exchanged, compared to its average volume of 1,442,306. The company has a market cap of $20.19 billion, a price-to-earnings ratio of -96.86 and a beta of 1.15. The business has a 50-day moving average price of $272.87 and a 200 day moving average price of $280.83. The company has a debt-to-equity ratio of 0.84, a quick ratio of 5.03 and a current ratio of 5.03. MongoDB, Inc. has a 1 year low of $212.74 and a 1 year high of $509.62.

MongoDB (NASDAQ:MDB) last announced its quarterly earnings data on Thursday, August 29th. The company reported $0.70 earnings per share (EPS) for the quarter, beating the consensus estimate of $0.49 by $0.21. MongoDB had a negative net margin of 12.08% and a negative return on equity of 15.06%. The company had revenue of $478.11 million during the quarter, compared to analysts’ expectations of $465.03 million. During the same quarter in the prior year, the company earned ($0.63) EPS. MongoDB’s revenue was up 12.8% compared to the same quarter last year. Equities research analysts forecast that MongoDB, Inc. will post -2.39 earnings per share for the current fiscal year.

MongoDB Profile


MongoDB, Inc., together with its subsidiaries, provides a general purpose database platform worldwide. The company provides MongoDB Atlas, a hosted multi-cloud database-as-a-service solution; MongoDB Enterprise Advanced, a commercial database server for enterprise customers to run in the cloud, on-premises, or in a hybrid environment; and Community Server, a free-to-download version of its database, which includes the functionality that developers need to get started with MongoDB.


Institutional Ownership by Quarter for MongoDB (NASDAQ:MDB)



Article originally posted on mongodb google news. Visit mongodb google news



The HackerNoon Newsletter: Tesla’s Good Quarter (10/28/2024)

MMS Founder
MMS RSS

Posted on mongodb google news. Visit mongodb google news

How are you, hacker?

🪐 What’s happening in tech today, October 28, 2024?

The HackerNoon Newsletter brings the HackerNoon homepage straight to your inbox. On this day, Cuban Missile Crisis Ends in 1962, St. Louis’s Gateway Arch is Completed in 1965, Italy Invades Greece in 1940, and we present you with these top quality stories.

From Timing is Important for Startups and Product Launches to Tesla’s Good Quarter, let’s dive right in.

By @docligot [ 5 Min read ] Generative AI threatens journalism, education, and creativity by spreading misinformation, displacing jobs, and eroding quality. Read More.

By @companyoftheweek [ 2 Min read ] This week, HackerNoon features MongoDB — a popular NoSQL database designed for scalability, flexibility, and performance. Read More.

By @nebojsaneshatodorovic [ 3 Min read ] The Anti-AI Movement Is No Joke. Can you think of a movie where AI is good? Read More.

By @pauldebahy [ 4 Min read ] Many great startups and products were launched too early or too late, impacting their potential for success and leaving them vulnerable to competitors. Read More.

By @sheharyarkhan [ 4 Min read ] Tesla’s Q3 results couldn’t have come at a better time as Wall Street was becoming increasingly concerned with the company’s direction. Read More.

🧑‍💻 What happened in your world this week?

It’s been said that writing can help consolidate technical knowledge, establish credibility, and contribute to emerging community standards. Feeling stuck? We got you covered ⬇️⬇️⬇️

ANSWER THESE GREATEST INTERVIEW QUESTIONS OF ALL TIME

We hope you enjoy this worth of free reading material. Feel free to forward this email to a nerdy friend who’ll love you for it. See you on Planet Internet! With love,
The HackerNoon Team ✌️

Article originally posted on mongodb google news. Visit mongodb google news



Meet MongoDB: HackerNoon Company of the Week

MMS Founder
MMS RSS

Posted on mongodb google news. Visit mongodb google news

We are back with another Company of the Week feature! Every week, we share an awesome tech brand from our tech company database, making their evergreen mark on the internet. This unique HackerNoon database ranks S&P 500 companies and top startups of the year alike.

This week, we are proud to present MongoDB! You may have caught a mention of MongoDB during the pilot of the hit TV show “Silicon Valley,” but the NoSQL database product is not only popular with budding startups. In fact, you’d be amazed to learn that MongoDB powers some of the largest enterprises in the world, including prior HackerNoon company of the week Bosch which uses MongoDB Atlas when dealing with a vast amount of data in its workflows.



MongoDB HackerNoon Targeted Ads

For all you programmers out there, you may have seen MongoDB appear across our website recently, especially if you were browsing any coding or programming-related tag on HackerNoon. Well, that’s because MongoDB and HackerNoon partnered together to highlight some amazing features from MongoDB to help you develop your own RAG or cloud document database for artificial intelligence. How’s that for cool!?

In effect, you could use the same technology that is being utilized by the likes of Bosch, Novo Nordisk, Wells Fargo, and even Toyota, to create or innovate on your products!

Learn How You Can Advertise to Your Specific Niche on HackerNoon

Meet MongoDB: #FunFacts

The founders of MongoDB — Dwight Merriman, Eliot Horowitz, and Kevin Ryan — previously worked with the online advertising company Doubleclick which was acquired by Google in the late 2000s for a whopping $3.1 billion.

Using their knowledge of the limitations of relational databases, all three got together the same year Doubleclick was sold to found MongoDB whose mission is to empower innovators to create, transform, and disrupt industries by unleashing the power of software and data.

Today, MongoDB’s Atlas is one of the most loved vector databases in the world, and the company, which is listed on Nasdaq, consistently ranks as one of the best places to work in. MongoDB is also a leader in cloud database management systems, as ranked by Gartner.

How’s that for interesting 😏

Share Your Company’s Story via HackerNoon

That’s all this week, folks! Stay Creative, Stay Iconic.

The HackerNoon Team

Article originally posted on mongodb google news. Visit mongodb google news



Article: Securing Cell-Based Architecture in Modern Applications

MMS Founder
MMS Stefania Chaplin

Article originally posted on InfoQ. Visit InfoQ

Key Takeaways

  • Despite its benefits, cell-based architecture introduces significant security challenges.
  • Permissions are essential, and strong authorization and authentication methods are required.
  • All data must be encrypted in transit; mutual TLS (mTLS) can help.
  • Adopting a centralized cell and service registry and API gateway can help track configurations and improve logging and monitoring.
  • Cell health is vital. Maintaining cell health allows each cell to run smoothly and reliably, maintaining the system’s overall integrity and security.
This article is part of the “Cell-Based Architectures: How to Build Scalable and Resilient Systems” article series. In this series we present a journey of discovery and provide a comprehensive overview and in-depth analysis of many key aspects of cell-based architectures, as well as practical advice for applying this approach to existing and new architectures.

Cell-based architecture is becoming increasingly popular in the fast-evolving world of software development. The concept is inspired by the design principles of a ship’s bulkheads, where separate watertight compartments allow for isolated failures. By applying this concept to software, we create an architecture that divides applications into discrete, manageable components known as cells. Each cell operates independently, communicating with others through well-defined interfaces and protocols.

Cell-based technologies are popular because they provide us with an architecture that is modular, flexible, and scalable. They help engineers rapidly scale while improving development efficiency and enhancing maintainability. However, despite these impressive feats, cell-based technology introduces significant security challenges.

Isolation and Containment

Each cell must operate in a sandboxed environment to prevent unauthorized access to the underlying system or other cells. Containers like Docker or virtual machines (VMs) are often used to enforce isolation. By leveraging sandboxing, even if a cell is compromised, the attacker cannot easily escalate privileges or access other parts of the system.

Permissions and access control mechanisms guarantee that cells can only interact with approved entities. Role-based access control (RBAC) assigns permissions based on roles assigned to users or entities.

On the other hand, attribute-based access control (ABAC) considers multiple attributes like user role, location, and time to make access decisions.
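As a minimal sketch of the difference, not tied to any particular framework: an RBAC check consults only roles, while an ABAC check combines several attributes; the specific attributes below are illustrative.

    import java.time.LocalTime;
    import java.util.Set;

    public class AccessPolicies {

      // RBAC: the decision depends only on the roles granted to the caller.
      public static boolean rbacAllows(Set<String> callerRoles, String requiredRole) {
        return callerRoles.contains(requiredRole);
      }

      // ABAC: the decision combines several attributes -- role, caller location,
      // and time of day in this toy example.
      public static boolean abacAllows(Set<String> callerRoles,
                                       String callerRegion,
                                       LocalTime now) {
        boolean hasRole = callerRoles.contains("billing-admin");
        boolean trustedRegion = callerRegion.equals("eu-west-1");
        boolean businessHours = !now.isBefore(LocalTime.of(8, 0))
            && !now.isAfter(LocalTime.of(18, 0));
        return hasRole && trustedRegion && businessHours;
      }

      public static void main(String[] args) {
        System.out.println(rbacAllows(Set.of("billing-admin"), "billing-admin"));            // true
        System.out.println(abacAllows(Set.of("billing-admin"), "eu-west-1", LocalTime.of(9, 30))); // true
      }
    }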

Network segmentation is another crucial strategy. Organizations can minimize the attack surface and restrict attackers’ lateral movement by creating isolated network zones for different cells. Micro-segmentation takes this a step further by creating fine-grained security zones within the data center, providing even greater control over network traffic. Enforcing strict access controls and monitoring traffic within each segment enhances security at the cell level, helping meet compliance and regulatory requirements by guaranteeing a robust and secure architecture.

Zero-Trust Security

In a cell-based architecture, adopting a zero-trust approach means treating every interaction between cells as potentially risky, regardless of origin. This approach requires constantly verifying each cell’s identity and applying strict access controls. Trust is always earned, never assumed.

Zero trust involves explicitly checking each cell’s identity and actions using detailed data points. It means limiting access so cells only have the permissions they need (least privilege access) and creating security measures that assume a breach could already be happening.

To implement zero trust, enforce strong authentication and authorization for all cells and devices. Use network segmentation and micro-segmentation to isolate and contain any potential breaches. Employ advanced threat detection and response tools to quickly spot and address threats within the architecture. This comprehensive strategy ensures robust security in a dynamic environment.

Authentication and Authorization

Strong authentication mechanisms like OAuth and JWT (JSON Web Tokens) verify cells’ identities and enforce strict access controls. OAuth is a widely used framework for token-based authorization. It allows secure resource access without sharing credentials, which is particularly useful in distributed systems. This framework lets cells grant limited access to their resources to other cells or services, reducing the risk of credential exposure.

JWTs, on the other hand, are self-contained tokens that carry claims about the user or system, such as identity and permissions. They provide a compact and secure way to transmit information between cells. Signed with a cryptographic algorithm, JWTs ensure data authenticity and integrity. When a cell receives a JWT, it can verify the token’s signature and decode its payload to authenticate the sender and authorize access based on the claims in the token.

Using OAuth for authorization and JWTs for secure information transmission achieves precise access control in cell-based architectures. As a result, cells only access the resources they are permitted to use, minimizing the risk of unauthorized access. Furthermore, these mechanisms support scalability and flexibility. Cells can dynamically issue and validate tokens without needing a centralized authentication system, enhancing the overall security and efficiency of the architecture and making it robust and adaptable.
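For example, a cell receiving a JWT can verify the signature and read the claims before authorizing the call. The sketch below assumes the open-source jjwt library (io.jsonwebtoken, 0.11.x API) and an HMAC key distributed out of band; a production setup would more likely validate RS256 tokens against the issuer's published public keys.

    import io.jsonwebtoken.Claims;
    import io.jsonwebtoken.JwtException;
    import io.jsonwebtoken.Jwts;
    import javax.crypto.SecretKey;
    import javax.crypto.spec.SecretKeySpec;

    public class JwtVerifier {

      private final SecretKey key;

      public JwtVerifier(byte[] sharedSecret) {
        // HS256 key; in production this would come from a secrets manager.
        this.key = new SecretKeySpec(sharedSecret, "HmacSHA256");
      }

      // Returns the verified claims, or throws if the token is forged or expired.
      public Claims verify(String token) {
        try {
          return Jwts.parserBuilder()
              .setSigningKey(key)
              .build()
              .parseClaimsJws(token)
              .getBody();
        } catch (JwtException e) {
          throw new SecurityException("Rejected token: " + e.getMessage(), e);
        }
      }

      // Example authorization decision based on a claim carried in the token.
      public boolean canWriteOrders(String token) {
        Object scope = verify(token).get("scope");
        return scope != null && scope.toString().contains("orders:write");
      }
    }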

Encryption

Encryption ensures that only the intended recipient can read the data. Hashing algorithms verify that the data hasn’t been altered during transmission, and certificates with public-key cryptography confirm the identities of the entities involved. All data exchanged between cells should be encrypted using strong protocols like TLS. Using encryption prevents eavesdropping and tampering and keeps the data confidential, intact, and authenticated. It also protects sensitive information from unauthorized access, ensures data integrity, and verifies the identities of the communicating parties.

Organizations should follow best practices to implement TLS effectively. It’s crucial to ensure the TLS implementation is always up to date and robust by managing certificates properly, renewing them before they expire, and revoking them if compromised. Additional security measures include enabling Perfect Forward Secrecy (PFS) to keep session keys secure, even if the server’s private key is compromised. In order to avoid using deprecated protocols, it’s essential to check and update configurations regularly.

Mutual TLS (mTLS)

mTLS (mutual TLS) boosts security by ensuring both the client and server authenticate each other. Unlike standard TLS, which only authenticates the server, mTLS requires both sides to present and verify certificates, confirming their identities. Each cell must present a valid certificate from a trusted Certificate Authority (CA), and the receiving cell verifies this before establishing a connection. This two-way authentication process ensures that only trusted and verified cells can communicate, significantly reducing the risk of unauthorized access.

In addition to verifying identities, mTLS also protects data integrity and confidentiality. The encrypted communication channel created by mTLS prevents eavesdropping and tampering, which is crucial in cell-based architectures where sensitive data flows between many components. Implementing mTLS involves managing certificates for all cells. Certificates must be securely stored, regularly updated, and properly revoked if compromised. Organizations can leverage automated tools and systems to assist in managing and renewing these certificates.
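Below is a minimal JSSE sketch of what requiring client certificates looks like on the server side; the keystore paths and passwords are placeholders, and a real deployment would automate issuance and rotation as discussed above.

    import java.io.FileInputStream;
    import java.security.KeyStore;
    import javax.net.ssl.KeyManagerFactory;
    import javax.net.ssl.SSLContext;
    import javax.net.ssl.SSLServerSocket;
    import javax.net.ssl.TrustManagerFactory;

    public class MtlsServer {
      public static void main(String[] args) throws Exception {
        char[] password = "changeit".toCharArray(); // placeholder

        // Server's own certificate and private key.
        KeyStore keyStore = KeyStore.getInstance("PKCS12");
        try (FileInputStream in = new FileInputStream("server-keystore.p12")) {
          keyStore.load(in, password);
        }
        KeyManagerFactory kmf =
            KeyManagerFactory.getInstance(KeyManagerFactory.getDefaultAlgorithm());
        kmf.init(keyStore, password);

        // CA certificates used to verify *client* certificates.
        KeyStore trustStore = KeyStore.getInstance("PKCS12");
        try (FileInputStream in = new FileInputStream("cell-ca-truststore.p12")) {
          trustStore.load(in, password);
        }
        TrustManagerFactory tmf =
            TrustManagerFactory.getInstance(TrustManagerFactory.getDefaultAlgorithm());
        tmf.init(trustStore);

        SSLContext sslContext = SSLContext.getInstance("TLSv1.3");
        sslContext.init(kmf.getKeyManagers(), tmf.getTrustManagers(), null);

        SSLServerSocket serverSocket = (SSLServerSocket)
            sslContext.getServerSocketFactory().createServerSocket(8443);
        // This is the line that turns TLS into mutual TLS: the handshake fails
        // unless the client presents a certificate signed by the trusted CA.
        serverSocket.setNeedClientAuth(true);

        System.out.println("mTLS server listening on 8443");
        // serverSocket.accept() ... handle connections
      }
    }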

Overall, mTLS ensures robust security by establishing mutual authentication, data integrity, and confidentiality. It provides an additional layer of security to help maintain the trustworthiness and reliability of your system and prevent unauthorized access in cell-based architectures.

API Gateway

An API gateway is a vital intermediary, providing centralized control over API interactions while simplifying the system and boosting reliability. By centralizing API management, organizations can achieve better control, more robust security, efficient resource usage, and improved visibility across their architecture.

An API Gateway can be one of the best options for cell-based architecture for implementing the cell router. A single entry point for all API interactions with a given cell reduces the complexity of direct communication between numerous microservices and reduces the surface area exposed to external agents. Centralized routing makes updating, scaling, and ensuring consistent and reliable API access easier. The API gateway handles token validation, such as OAuth and JWT, verifying the identities of communicating cells. It can also implement mutual TLS (mTLS) to authenticate the client and server. Only authenticated and authorized requests can access the system, maintaining data integrity and confidentiality.

The API gateway can enforce rate limiting to control the number of requests a client can make within a specific time frame, preventing abuse and ensuring fair resource usage. It is critical for protecting against denial-of-service (DoS) attacks and managing system load. The gateway also provides comprehensive logging and monitoring capabilities. These capabilities offer valuable insights into traffic patterns, performance metrics, and potential security threats, which allows for proactive identification and resolution of issues while maintaining system robustness and efficiency. Effective logging and monitoring, facilitated by the API gateway, are crucial for incident response and overall system health.
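Rate limiting at the gateway can be as simple as a token bucket per client. The sketch below is a simplified, single-node, in-memory version; a real gateway would typically back this with a distributed store so limits hold across gateway instances.

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    // Simplified in-memory token bucket, one bucket per client id.
    public class TokenBucketRateLimiter {

      private static final class Bucket {
        double tokens;
        long lastRefillNanos;
        Bucket(double tokens, long now) { this.tokens = tokens; this.lastRefillNanos = now; }
      }

      private final double capacity;        // max burst size
      private final double refillPerSecond; // sustained requests per second
      private final Map<String, Bucket> buckets = new ConcurrentHashMap<>();

      public TokenBucketRateLimiter(double capacity, double refillPerSecond) {
        this.capacity = capacity;
        this.refillPerSecond = refillPerSecond;
      }

      public boolean allow(String clientId) {
        long now = System.nanoTime();
        Bucket bucket = buckets.computeIfAbsent(clientId, id -> new Bucket(capacity, now));
        synchronized (bucket) {
          // Refill tokens in proportion to the time elapsed since the last request.
          double elapsedSeconds = (now - bucket.lastRefillNanos) / 1_000_000_000.0;
          bucket.tokens = Math.min(capacity, bucket.tokens + elapsedSeconds * refillPerSecond);
          bucket.lastRefillNanos = now;
          if (bucket.tokens >= 1.0) {
            bucket.tokens -= 1.0;
            return true; // request admitted
          }
          return false;  // over the limit: the gateway would return HTTP 429
        }
      }
    }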

Service Mesh

A service mesh helps manage the communication between services. It handles the complexity of how cells communicate, enforcing robust security policies like mutual TLS (mTLS). Data is encrypted in transit, and the client and server are verified during every transaction. Only authorized services can interact, significantly reducing the risk of unauthorized access and data breaches.

They also allow for detailed access control policies, precisely regulating which services can communicate with each other, further strengthening the security of the architecture. Beyond security, a service mesh enhances the visibility and resilience of cell-based architectures by providing consistent logging, tracing, and monitoring of all service interactions. This centralized view enables the real-time detection of anomalies, potential threats, and performance issues, facilitating quick and effective incident responses.

The mesh automatically takes care of retries, load balancing, and circuit breaking. Communication remains reliable even under challenging conditions, maintaining the availability and integrity of services. A service mesh simplifies security management by applying security and operational policies consistently across all services without requiring changes to the application code, making it easier to enforce compliance and protect against evolving threats in a dynamic and scalable environment. Service meshes secure communication and enhance the overall robustness of cell-based architectures.
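
A mesh sidecar applies these behaviors through configuration rather than application code. For readers who want to see what is being automated, the hypothetical Java sketch below shows the retry-with-circuit-breaker logic that the mesh takes off the application's hands; thresholds and backoff values are illustrative assumptions.

    import java.util.concurrent.Callable;
    import java.util.concurrent.atomic.AtomicInteger;

    // Hypothetical sketch of retry + circuit-breaker behavior that a service
    // mesh sidecar normally applies transparently per route.
    public class ResilientCall {
        private static final int MAX_RETRIES = 3;
        private static final int FAILURE_THRESHOLD = 5;
        private final AtomicInteger consecutiveFailures = new AtomicInteger();

        public <T> T call(Callable<T> remoteCall) throws Exception {
            if (consecutiveFailures.get() >= FAILURE_THRESHOLD) {
                // Circuit is open: fail fast instead of piling load on an unhealthy cell.
                throw new IllegalStateException("circuit open: downstream cell unhealthy");
            }
            Exception last = null;
            for (int attempt = 1; attempt <= MAX_RETRIES; attempt++) {
                try {
                    T result = remoteCall.call();
                    consecutiveFailures.set(0);    // success closes the circuit again
                    return result;
                } catch (Exception e) {
                    last = e;
                    consecutiveFailures.incrementAndGet();
                    Thread.sleep(100L * attempt);  // simple backoff between retries
                }
            }
            throw last;
        }
    }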

DoorDash recently shared that it implemented zone-aware routing using its Envoy-based service mesh. The solution allowed the company to efficiently direct traffic within the same availability zone (AZ), minimizing more expensive cross-AZ data transfers.

Centralized Registry

A centralized registry is the backbone for managing service discovery, configurations, and health status in cell-based architectures. By keeping an up-to-date repository of all cell and service instances and their metadata, we can ensure that only registered and authenticated services can interact. This centralization strengthens security by preventing unauthorized access and minimizing the risk of rogue services infiltrating the system. Moreover, it enforces uniform security policies and configurations across all services. Consistency allows best practices to be applied and reduces the likelihood of configuration errors that could lead to vulnerabilities.

In addition to enhancing access control and configuration consistency, a centralized registry significantly improves monitoring and incident response capabilities. It provides real-time visibility into the operational status and health of services, allowing the rapid identification and isolation of compromised or malfunctioning cells. Such a proactive approach is crucial for containing potential security breaches and mitigating their impact on the overall system.

The ability to audit changes within a centralized registry supports compliance with regulatory requirements and aids forensic investigations. Maintaining detailed logs of service registrations, updates, and health checks strengthens the security posture of cell-based architectures. Through such oversight, cell-based architectures remain resilient and reliable against evolving threats.
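
The sketch below shows, in plain Java, the kind of bookkeeping a registry performs: instances, metadata, heartbeats, and health status. The CellRegistry class is a hypothetical in-memory stand-in; production systems would back this with a replicated store such as Consul, etcd, or ZooKeeper and authenticate callers at registration time.

    import java.time.Instant;
    import java.util.Map;
    import java.util.Optional;
    import java.util.concurrent.ConcurrentHashMap;

    // Hypothetical in-memory registry for illustration only.
    public class CellRegistry {
        public record CellInstance(String cellId, String endpoint,
                                   Map<String, String> metadata,
                                   Instant lastHeartbeat, boolean healthy) {}

        private final Map<String, CellInstance> instances = new ConcurrentHashMap<>();

        // Only registered cells become discoverable; registration is where
        // certificates or tokens would be verified in a real system.
        public void register(String cellId, String endpoint, Map<String, String> metadata) {
            instances.put(cellId, new CellInstance(cellId, endpoint, metadata, Instant.now(), true));
        }

        public void recordHeartbeat(String cellId, boolean healthy) {
            instances.computeIfPresent(cellId, (id, c) ->
                    new CellInstance(id, c.endpoint(), c.metadata(), Instant.now(), healthy));
        }

        // Discovery only ever returns healthy, registered instances.
        public Optional<CellInstance> discover(String cellId) {
            return Optional.ofNullable(instances.get(cellId)).filter(CellInstance::healthy);
        }
    }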

Cell Health

Keeping cells healthy allows each cell to run smoothly and reliably, which, in turn, maintains the system’s overall integrity and security. Continuous health monitoring provides real-time insights into how each cell performs, tracking important metrics like response times, error rates, and resource use. With automated health checks, the system can quickly detect any anomalies, failures, or deviations from expected performance. Early detection allows for proactive measures, such as isolating or shutting down compromised cells before they can affect the broader system, thus preventing potential security breaches and ensuring the stability and reliability of services.

Maintaining cell health also directly supports dynamic scaling and resilience, essential for strong security. Healthy cells allow the architecture to scale efficiently to meet demand while keeping security controls consistent. When a cell fails health checks, automated systems can quickly replace or scale up new cells with the proper configurations and security policies, minimizing downtime and ensuring continuous protection. This responsive approach to cell health management reduces the risk of cascading failures. It improves the system’s ability to recover quickly from incidents, thereby minimizing the impact of security threats and maintaining the overall security posture of the architecture.
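
As a rough illustration of the automated health-check loop described above, the following Java sketch probes a hypothetical /health endpoint on each cell and flags failing cells for isolation. The endpoint path, interval, and isolation step are assumptions; a real system would update the registry and router, and trigger replacement of the cell.

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;
    import java.time.Duration;
    import java.util.List;
    import java.util.concurrent.Executors;
    import java.util.concurrent.ScheduledExecutorService;
    import java.util.concurrent.TimeUnit;

    // Hypothetical health-check loop for a set of cells.
    public class CellHealthMonitor {
        private final HttpClient client = HttpClient.newBuilder()
                .connectTimeout(Duration.ofSeconds(2))
                .build();
        private final ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();

        public void start(List<String> cellBaseUrls) {
            scheduler.scheduleAtFixedRate(() -> cellBaseUrls.forEach(this::probe), 0, 15, TimeUnit.SECONDS);
        }

        private void probe(String baseUrl) {
            try {
                HttpRequest request = HttpRequest.newBuilder(URI.create(baseUrl + "/health"))
                        .timeout(Duration.ofSeconds(2))
                        .GET()
                        .build();
                HttpResponse<Void> response = client.send(request, HttpResponse.BodyHandlers.discarding());
                if (response.statusCode() != 200) {
                    isolate(baseUrl); // unhealthy response: take the cell out of rotation
                }
            } catch (Exception e) {
                isolate(baseUrl);     // timeouts and connection errors also count as failures
            }
        }

        private void isolate(String baseUrl) {
            // In a real system this would update the registry/router so traffic
            // drains away from the cell and a replacement is provisioned.
            System.out.println("Isolating unhealthy cell: " + baseUrl);
        }
    }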

Infrastructure as Code

Infrastructure as Code (IaC) enables consistent, repeatable, and automated infrastructure management. By defining infrastructure through code, teams can enforce standardized security policies and configurations across all cells from the start, embedding best practices and compliance requirements into every deployment.

Tools like Terraform or AWS CloudFormation automate provisioning and configuration processes, significantly reducing the risk of human error, a common source of security vulnerabilities. A consistent setup helps maintain a uniform security posture, making it easier to systematically identify and address potential weaknesses.

IaC also enhances security through version control and auditability. All infrastructure configurations are stored in version-controlled repositories. Teams can track changes, review configurations, and revert to previous states if needed. This transparency and traceability are critical for compliance audits and incident response, providing a clear history of infrastructure changes and deployments.

IaC facilitates rapid scaling and recovery by enabling quick and secure provisioning of new cells or environments. Even as the architecture grows and evolves, security controls are consistently applied. In short, IaC streamlines infrastructure management and embeds security into the core of cell-based architectures, boosting their resilience and robustness against threats.

Conclusion

Securing cell-based architecture is essential to fully capitalize on its benefits while minimizing risks. To achieve this, comprehensive security measures must be put in place. Organizations can start by isolating and containing cells using sandbox environments and strict access control mechanisms like role-based and attribute-based access control. Network segmentation and micro-segmentation are crucial — they minimize attack surfaces and restrict the lateral movement of threats.

Adopting a zero-trust approach is vital. It ensures that each cell’s identity is continuously verified. Robust authentication mechanisms such as OAuth and JWT, together with encrypted communication through TLS and mTLS, protect data integrity and confidentiality. A service mesh handles secure, reliable interactions between services. Meanwhile, a centralized registry ensures that only authenticated services can communicate, boosting monitoring and incident response capabilities.

API gateways offer centralized control over API interactions, ensuring consistent security between cells. Continuous health monitoring and Infrastructure as Code (IaC) further enhance security by automating and standardizing infrastructure management, allowing rapid scaling and recovery.

By integrating these strategies, organizations can create a robust security framework that allows cell-based architectures to operate securely and efficiently in today’s dynamic technology landscape.

This article is part of the “Cell-Based Architectures: How to Build Scalable and Resilient Systems” article series. In this series we present a journey of discovery and provide a comprehensive overview and in-depth analysis of many key aspects of cell-based architectures, as well as practical advice for applying this approach to existing and new architectures.



Google Cloud Cost Attribution Solution to Enhance Cloud Cost Visibility

MMS Founder
MMS Steef-Jan Wiggers

Article originally posted on InfoQ. Visit InfoQ

Google recently announced the launch of the Google Cloud Cost Attribution Solution, a set of tools designed to enhance cost management through improved metadata and labeling practices.

The company released the solution for customers who view managing cloud costs as a complex challenge. It allows organizations to identify the specific teams, projects, or services driving their expenses, enabling more informed financial decisions. Google states that the solution offers valuable insights for cost optimization, whether an organization is new to Google Cloud or a long-time user.

The solution’s core is labels—key-value pairs that act as customizable metadata tags attached to cloud resources such as virtual machines and storage buckets. These labels provide granular visibility into costs by enabling users to break down expenses by service, team, project, or environment (development, production, etc.).

Tim Twarog, a FinOps SME, writes in an earlier LinkedIn blog post on GCP cost allocation strategies:

Tagging and labeling resources is a key strategy for cost allocation in GCP. Tags are key-value pairs that can be attached to resources. They allow for more granular cost tracking and reporting.

The tool allows businesses to:

  • Generate detailed cost reports: Break down expenses by department, service, or environment.
  • Make data-driven decisions: Optimize resource allocation and justify future cloud investments.
  • Customize reporting: Tailor reports to the specific needs of any organization.

In addition, the solution addresses the challenge of label governance with proactive and reactive approaches. The proactive approach integrates labeling policies into infrastructure-as-code tools like Terraform, ensuring all new resources are labeled adequately. The reactive approach identifies unlabeled resources, raises near real-time alerts when resources are created or modified without the proper labels, and automates the application of correct labels to ensure comprehensive cost visibility.

(Source: Google Cloud Blog Post)

Like Google, other cloud providers such as Microsoft and AWS offer tools that give their customers granular visibility into cloud spending, along with the ability to optimize resource usage and make data-driven decisions to manage costs effectively.

Microsoft’s Azure, for instance, offers cost management tools that include budgeting, cost analysis, and recommendations for cost optimization, which integrate with other Azure services and provide detailed cost breakdowns. Meanwhile, AWS Cost Explorer allows AWS customers to visualize, understand, and manage their AWS costs and usage over time. It offers thorough insights and reports to identify cost drivers and optimize spending.

Lastly, the company will soon incorporate tag support in the solution, allowing users to organize resources across projects and implement fine-grained access control through IAM conditions.



Spring News Roundup: Release Candidates for Spring Boot, Security, Auth Server, Modulith

MMS Founder
MMS Michael Redlich

Article originally posted on InfoQ. Visit InfoQ

There was a flurry of activity in the Spring ecosystem during the week of October 21st, 2024, highlighting first release candidates of: Spring Boot, Spring Security, Spring Authorization Server, Spring Integration, Spring Modulith, Spring Batch, Spring AMQP, Spring for Apache Kafka and Spring for Apache Pulsar.

Spring Boot

The first release candidate of Spring Boot 3.4.0 delivers bug fixes, improvements in documentation, dependency upgrades and many new features such as: improved support for the Spring Framework ClientHttpRequestFactory interface with new builders and additional customization; and support for ARM and x86 architectures in the Paketo Buildpack for Spring Boot. More details on this release may be found in the release notes.

Similarly, the releases of Spring Boot 3.3.5 and 3.2.11 provide improvements in documentation, dependency upgrades and resolutions to notable bug fixes such as: removal of printing the stack trace to the error stream from the driverClassIsLoadable() method, defined in the DataSourceProperties class, upon exception, as it was determined to be unnecessary; and an instance of the ArtemisConnectionFactoryFactory class failing when building a native image. More details on these releases may be found in the release notes for version 3.3.5 and version 3.2.11.

Spring Framework

In conjunction with the release of Spring Boot 3.4.0-RC1, the third release candidate of Spring Framework 6.2.0 ships with bug fixes, improvements in documentation and new features such as: removal of the proxyTargetAware attribute in the new @MockitoSpyBean annotation as it was deemed unnecessary; and a refactor of the retrieve() method, defined in the RestClient interface, to execute a request and extract the response as necessary, eliminating the need for two method calls in this workflow. More details on this release may be found in the release notes.
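
For context, a typical call through the RestClient fluent API looks like the following minimal sketch (the URL is a placeholder): the chained retrieve() and body() calls execute the request and extract the response.

    import org.springframework.web.client.RestClient;

    public class RestClientExample {
        public static void main(String[] args) {
            // Minimal sketch of the RestClient fluent API in Spring Framework 6.x.
            RestClient restClient = RestClient.create();
            String body = restClient.get()
                    .uri("https://example.org/accounts/1")
                    .retrieve()              // executes the request
                    .body(String.class);     // and extracts the response body
            System.out.println(body);
        }
    }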

Spring Security

The first release candidate of Spring Security 6.4.0 delivers bug fixes, dependency upgrades and new features such as: support for Passkeys; a new authorize() method that replaces the now-deprecated check() method, defined in the AuthorizationManager interface, that returns an instance of the AuthorizationResult interface; and a refactored publishAuthorizationEvent(Supplier, T, AuthorizationDecision) method, defined in the AuthorizationEventPublisher interface, that now accepts an instance of the AuthorizationResult interface, replacing the AuthorizationDecision class in the parameter list. More details on this release may be found in the release notes and what’s new page.

Similarly, versions 6.3.4, 6.2.7 and 5.8.15 of Spring Security have been released featuring build updates, dependency upgrades and resolutions to notable bug fixes such as: erasures of credentials on custom instances of the AuthenticationManager interface despite the eraseCredentialsAfterAuthentication field being set to false; and methods annotated with @PostFilter are processed twice by the PostFilterAuthorizationMethodInterceptor class. More details on these releases may be found in the release notes for version 6.3.4, version 6.2.7 and version 5.8.15.

Spring Authorization Server

The first release candidate of Spring Authorization Server 1.4.0 provides dependency upgrades and new features such as: a replacement of its own DelegatingAuthenticationConverter class with the similar class of the same name defined in Spring Security; and a simplification of configuring a dedicated authorization server using the with() method, defined in the Spring Security HttpSecurity class. More details on this release may be found in the release notes.

Similarly, versions 1.3.3 and 1.2.7 of Spring Authorization Server have been released featuring dependency upgrades and a fix for more efficiently registering AOT contributions with subclasses defined from the JdbcOAuth2AuthorizationService class. More details on these releases may be found in the release notes for version 1.3.3 and version 1.2.7.

Spring for GraphQL

Versions 1.3.3 and 1.2.9 of Spring for GraphQL have been released, shipping with bug fixes, improvements in documentation, dependency upgrades and new features such as: the ability to set a timeout value for server-side events; and a change so that methods annotated with @BatchMapping pass the localContext field, defined in the GraphQL for Java DataFetcherResult class, from an instance of the BatchLoaderEnvironment class. More details on these releases may be found in the release notes for version 1.3.3 and version 1.2.9.

Spring Integration

The first release candidate of Spring Integration 6.4.0 delivers bug fixes, improvements in documentation, dependency upgrades and new features such as: an instance of the RedisLockRegistry class may now be configured via an instance of the Spring Framework TaskScheduler interface for renewal of automatic locks in the store; and a migration of Python scripts to support Python 3 and GraalPy. More details on this release may be found in the release notes and what’s new page.

Similarly, versions 6.3.5 and 6.2.10 of Spring Integration have been released with bug fixes, dependency upgrades and new features: the addition of a new property, idleBetweenTries, to the aforementioned RedisLockRegistry class to specify a duration to sleep in between lock attempts; and improved support for the JUnit @Nested annotation when using the @SpringIntegrationTest annotation. More details on this release may be found in the release notes for version 6.3.5 and version 6.2.10.

Spring Modulith

The first release candidate of Spring Modulith 1.3.0 provides bug fixes, improvements in documentation, dependency upgrades and new features such as: support for the Oracle database type in the JDBC event publication registry; and support for the MariaDB database driver. More details on this release may be found in the release notes.

Similarly, versions 1.2.5 and 1.1.10 of Spring Modulith have been released featuring bug fixes, dependency upgrades and various improvements in the reference documentation. More details on this release may be found in the release notes for version 1.2.5 and version 1.1.10.

Spring Batch

The first release candidate of Spring Batch 5.2.0 ships with bug fixes, improvements in documentation, dependency upgrades and a new feature that allows the CompositeItemReader class to be subclassed to allow generics to be less strict. More details on this release may be found in the release notes.

Spring AMQP

The first release candidate of Spring AMQP 3.2.0 provides improvements in documentation, dependency upgrades and new features such as: additional Open Telemetry semantic tags that are exposed via an instance of the RabbitTemplate class and @RabbitListener annotation; and a new method, getBeforePublishPostProcessors(), in the RabbitTemplate class that complements the existing addBeforePublishPostProcessors() method that allows developers to dynamically access and modify these processors. More details on this release may be found in the release notes.

Spring for Apache Kafka

The first release candidate of Spring for Apache Kafka 3.3.0 delivers bug fixes, improvements in documentation, dependency upgrades and new features such as: a new KafkaMetricsSupport class for improved metrics support; and the ability to use the Java @Override annotation on the createAdmin() method, defined in the KafkaAdmin class, to use another type of class implementing the Apache Kafka Admin interface. This release also provides full integration with Spring Boot 3.4.0-RC1. More details on this release may be found in the release notes.

Spring for Apache Pulsar

The first release candidate for Spring for Apache Pulsar 1.2.0 provides dependency upgrades and improvements such as: ensuring that all uses of the toLowerCase() and toUpperCase() methods, defined in the Java String class, specify an instance of the Java Locale class with a default of Locale.ROOT; and a new warning log, for increased awareness, when lambda producer customizers are used. More details on this release may be found in the release notes.

Similarly, versions 1.1.5 and 1.0.11 of Spring for Apache Pulsar have been released featuring dependency upgrades and the aforementioned use of the toLowerCase() and toUpperCase() methods. More details on this release may be found in the release notes for version 1.1.5 and version 1.0.11.



Rhymes AI Unveils Aria: Open-Source Multimodal Model with Development Resources

MMS Founder
MMS Robert Krzaczynski

Article originally posted on InfoQ. Visit InfoQ

Rhymes AI has introduced Aria, an open-source multimodal native Mixture-of-Experts (MoE) model capable of processing text, images, video, and code effectively. In benchmarking tests, Aria has outperformed other open models and demonstrated competitive performance against proprietary models such as GPT-4o and Gemini-1.5. Additionally, Rhymes AI has released a codebase that includes model weights and guidance for fine-tuning and development.

Aria has several features, including multimodal native understanding and competitive performance against existing proprietary models. Rhymes AI has shared that Aria’s architecture, built from scratch using multimodal and language data, achieves state-of-the-art results across various tasks. This architecture includes a fine-grained mixture-of-experts model with 3.9 billion activated parameters per token, offering efficient processing with improved parameter utilization.

Rashid Iqbal, a machine learning engineer, raised considerations regarding Aria’s architecture:

Impressive release! Aria’s Mixture-of-Experts architecture and novel multimodal training approach certainly set it apart. However, I am curious about the practical implications of using 25.3B parameters with only 3.9B active—does this lead to increased latency or inefficiency in certain applications?
Also, while beating giants like GPT-4o and Gemini-1.5 on benchmarks is fantastic, it is crucial to consider how it performs in real-world scenarios beyond controlled tests.

In benchmarking tests, Aria has outperformed other open models, such as Pixtral-12B and Llama3.2-11B, and performs competitively against proprietary models like GPT-4o and Gemini-1.5. The model excels in areas like document understanding, scene text recognition, chart reading, and video comprehension, underscoring its suitability for complex, multimodal tasks.


(Source: https://huggingface.co/rhymes-ai/Aria)

In order to support development, Rhymes AI has released a codebase for Aria, including model weights, a technical report, and guidance for using and fine-tuning the model with various datasets. The codebase also includes best practices to streamline adoption for different applications, with support for frameworks like vLLM. All resources are made available under the Apache 2.0 license.

Aria’s efficiency extends to its hardware requirements. In response to a community question about the necessary GPU for inference, Leonardo Furia explained:

ARIA’s MoE architecture activates only 3.5B parameters during inference, allowing it to potentially run on a consumer-grade GPU like the NVIDIA RTX 4090. This makes it highly efficient and accessible for a wide range of applications.

Addressing a question from the community about whether there are plans to offer Aria via API, Rhymes AI confirmed that API support is on the roadmap for future models.

With Aria’s release, Rhymes AI encourages participation from researchers, developers, and organizations in exploring and developing practical applications for the model. This collaborative approach aims to further enhance Aria’s capabilities and explore new potential for multimodal AI integration across different fields.

For those interested in trying or training the model, it is available for free at Hugging Face.



OpenJDK News Roundup: Stream Gatherers, Scoped Values, Generational Shenandoah, ZGC Non-Gen Mode

MMS Founder
MMS Michael Redlich

Article originally posted on InfoQ. Visit InfoQ

There was a flurry of activity in the OpenJDK ecosystem during the week of October 21st, 2024, highlighting: JEPs that have been Targeted and Proposed to Target for JDK 24; and drafts that have been promoted to Candidate status. JEP 485, Stream Gatherers, is the fifth JEP confirmed for JDK 24. Four JEPs have been Proposed to Target and will be under review during the week of October 28, 2024.

Targeted

After its review had concluded, JEP 485, Stream Gatherers, has been promoted from Proposed to Target to Targeted for JDK 24. This JEP proposes to finalize this feature after two rounds of preview, namely: JEP 473: Stream Gatherers (Second Preview), delivered in JDK 23; and JEP 461, Stream Gatherers (Preview), delivered in JDK 22. This feature was designed to enhance the Stream API to support custom intermediate operations that will “allow stream pipelines to transform data in ways that are not easily achievable with the existing built-in intermediate operations.” More details on this JEP may be found in the original design document and this InfoQ news story.
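
As a quick illustration, the minimal sketch below uses one of the built-in gatherers to group a stream into fixed-size windows; custom intermediate operations can be written by implementing the Gatherer interface. On JDK 22 and 23 this is a preview feature, so --enable-preview is required there.

    import java.util.List;
    import java.util.stream.Gatherers;
    import java.util.stream.Stream;

    public class GatherersExample {
        public static void main(String[] args) {
            // windowFixed(2) is a built-in gatherer from java.util.stream.Gatherers.
            List<List<Integer>> windows = Stream.of(1, 2, 3, 4, 5)
                    .gather(Gatherers.windowFixed(2))
                    .toList();
            System.out.println(windows); // [[1, 2], [3, 4], [5]]
        }
    }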

Proposed to Target

JEP 490, ZGC: Remove the Non-Generational Mode, has been promoted from Candidate to Proposed to Target for JDK 24. This JEP proposes to remove the non-generational mode of the Z Garbage Collector (ZGC). The generational mode is now the default as per JEP 474, ZGC: Generational Mode by Default, delivered in JDK 23. By removing the non-generational mode of ZGC, it eliminates the need to maintain two modes and improves development time of new features for the generational mode. The review is expected to conclude on October 29, 2024.

JEP 487, Scoped Values (Fourth Preview), has been promoted from Candidate to Proposed to Target for JDK 24. Formerly known as Extent-Local Variables (Incubator), this JEP proposes a fourth preview, with one change, in order to gain additional experience and feedback from one round of incubation and three rounds of preview, namely: JEP 481, Scoped Values (Third Preview), delivered in JDK 23; JEP 464, Scoped Values (Second Preview), delivered in JDK 22; JEP 446, Scoped Values (Preview), delivered in JDK 21; and JEP 429, Scoped Values (Incubator), delivered in JDK 20. This feature enables sharing of immutable data within and across threads. This is preferred to thread-local variables, especially when using large numbers of virtual threads. The only change from the previous preview is the removal of the callWhere() and runWhere() methods from the ScopedValue class, which leaves the API fluent. The ability to use one or more bound scoped values is accomplished via the call() and run() methods defined in the ScopedValue.Carrier class. The review is expected to conclude on October 30, 2024.
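
A minimal sketch of the preview API after this change (compile and run with --enable-preview): a value is bound for the dynamic extent of the run() call and read back with get().

    public class ScopedValueExample {
        // ScopedValue lives in java.lang; this is a preview API.
        private static final ScopedValue<String> REQUEST_ID = ScopedValue.newInstance();

        public static void main(String[] args) {
            // where(...) returns a Carrier; run(...) binds the value for the
            // dynamic extent of the call.
            ScopedValue.where(REQUEST_ID, "req-42").run(ScopedValueExample::handle);
        }

        private static void handle() {
            System.out.println("Handling " + REQUEST_ID.get()); // prints "Handling req-42"
        }
    }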

JEP 478, Key Derivation Function API (Preview), has been promoted from Candidate to Proposed to Target for JDK 24. This JEP proposes to introduce an API for Key Derivation Functions (KDFs), cryptographic algorithms for deriving additional keys from a secret key and other data, with goals to: allow security providers to implement KDF algorithms in either Java or native code; and enable the use of KDFs in implementations of JEP 452, Key Encapsulation Mechanism. The review is expected to conclude on October 31, 2024.

JEP 404, Generational Shenandoah (Experimental), has been promoted from Candidate to Proposed to Target for JDK 24. Originally targeted for JDK 21, JEP 404 was officially removed from the final feature set due to the “risks identified during the review process and the lack of time available to perform the thorough review that such a large contribution of code requires.” The Shenandoah team decided to “deliver the best Generational Shenandoah that they can” and target a future release. The review is expected to conclude on October 30, 2024.

New JEP Candidates

JEP 495, Simple Source Files and Instance Main Methods (Fourth Preview), has been promoted from its JEP Draft 8335984 to Candidate status. This JEP proposes a fourth preview without change (except for a second name change), after three previous rounds of preview, namely: JEP 477, Implicitly Declared Classes and Instance Main Methods (Third Preview), delivered in JDK 23; JEP 463, Implicitly Declared Classes and Instance Main Methods (Second Preview), delivered in JDK 22; and JEP 445, Unnamed Classes and Instance Main Methods (Preview), delivered in JDK 21. This feature aims to “evolve the Java language so that students can write their first programs without needing to understand language features designed for large programs.” This JEP moves forward the September 2022 blog post, Paving the on-ramp, by Brian Goetz, Java language architect at Oracle. Gavin Bierman, consulting member of technical staff at Oracle, has published the first draft of the specification document for review by the Java community. More details on JEP 445 may be found in this InfoQ news story.
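
Under this preview feature, a complete program can be a single source file with an instance main method, as in the hedged example below (file name is a placeholder; enable preview features when compiling and running). The println call comes from the java.io.IO class, which the third preview made implicitly available in implicitly declared classes.

    // Entire contents of HelloCells.java under the preview feature.
    void main() {
        println("Hello, cells!");
    }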

JEP 494, Module Import Declarations (Second Preview), has been promoted from its JEP Draft 8335987 to Candidate status. This JEP proposes a second preview after the first round of preview, namely: JEP 476, Module Import Declarations (Preview), delivered in JDK 23. This feature will enhance the Java programming language with the ability to succinctly import all of the packages exported by a module, with a goal to simplify the reuse of modular libraries without requiring the importing code to be in a module itself. Changes from the first preview include: remove the restriction that a module is not allowed to declare transitive dependencies on the java.base module; revise the declaration of the java.se module to transitively require the java.base module; and allow type-import-on-demand declarations to shadow module import declarations. These changes mean that importing the java.se module will import the entire Java SE API on demand.
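
A small sketch of the preview syntax: a single module import makes every package exported by java.base available without individual imports (enable preview features when compiling and running).

    import module java.base;   // preview feature: imports all packages exported by java.base

    public class ModuleImportDemo {
        public static void main(String[] args) {
            // List and ArrayList come from java.util, available via the module import.
            List<String> cells = new ArrayList<>(List.of("cell-a", "cell-b"));
            System.out.println(cells);
        }
    }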

JEP 493, Linking Run-Time Images without JMODs, has been promoted from its JEP Draft 8333799 to Candidate status. This JEP proposes to “reduce the size of the JDK by approximately 25% by enabling the jlink tool to create custom run-time images without using the JDK’s JMOD files.” This feature must explicitly be enabled upon building the JDK. Developers are now allowed to link a run-time image from their local modules regardless where those modules exist.

JDK 24 Release Schedule

The JDK 24 release schedule, as approved by Mark Reinhold, Chief Architect, Java Platform Group at Oracle, is as follows:

  • Rampdown Phase One (fork from main line): December 5, 2024
  • Rampdown Phase Two: January 16, 2025
  • Initial Release Candidate: February 6, 2025
  • Final Release Candidate: February 20, 2025
  • General Availability: March 18, 2025

For JDK 24, developers are encouraged to report bugs via the Java Bug Database.
