Google Releases PaliGemma 2 Vision-Language Model Family

Anthony Alford

Article originally posted on InfoQ. Visit InfoQ

Google DeepMind released PaliGemma 2, a family of vision-language models (VLM). PaliGemma 2 is available in three different sizes and three input image resolutions and achieves state-of-the-art performance on several vision-language benchmarks.

PaliGemma 2 is an update of the PaliGemma family, which was released in 2024. It uses the same SigLIP-So400m vision encoder as the original PaliGemma but upgrades to the Gemma 2 LLM. The PaliGemma 2 family contains nine different models, combining LLM sizes of 2B, 9B, and 27B parameters with input image resolutions of 224x224, 448x448, and 896x896 pixels. The research team evaluated PaliGemma 2 on a variety of benchmarks, where it set new state-of-the-art results on several tasks, including optical character recognition (OCR), molecular structure recognition, and radiography report generation. According to Google:

We’re incredibly excited to see what you create with PaliGemma 2. Join the vibrant Gemma community, share your projects to the Gemmaverse, and let’s continue to explore the boundless potential of AI together. Your feedback and contributions are invaluable in shaping the future of these models and driving innovation in the field.

PaliGemma 2 is a combination of a pre-trained SigLIP-So400m image encoder and a Gemma 2 LLM. This combination is then further pre-trained on a multimodal dataset of one billion examples. Besides the pre-trained base models, Google also released variants that were fine-tuned on the Descriptions of Connected and Contrasting Images (DOCCI) dataset, a collection of images and corresponding detailed descriptions. The fine-tuned variants can generate long, detailed captions of images, which contain “more factually aligned sentences” than those produced by other VLMs.

Google created other fine-tuned versions for benchmarking purposes. The benchmark tasks included OCR, table structure recognition, molecular structure recognition, optical music score recognition, radiography report generation, and spatial reasoning. The fine-tuned PaliGemma 2 outperformed previous state-of-the-art models on most of these tasks.

The team also evaluated performance and inference speed for quantized versions of the model running on a CPU instead of a GPU. Reducing the model weights from full 32-bit to mixed-precision quantization showed “no practical quality difference.” 

In a Hacker News discussion about the model, one user wrote:

Paligemma proves easy to train and useful in fine-tuning. Its main drawback was not being able to handle multiple images without being partly retrained. This new version does not seem to support multiple images as input at once. Qwen2vl does. This is useful for vision RAG typically.

Gemma team member Glenn Cameron wrote about PaliGemma 2 on X. In response to a question about using it to control a robot surgeon, Cameron said:

I think it could be taught to generate robot commands. But I wouldn’t trust it with such high-stakes tasks…Notice the name of the model is PaLM (Pathways Language Model). The “Pa” in PaliGemma stands for “Pathways”. It is named that because it continues the line of PaLI  (Pathways Language and Image) models in a combination with the Gemma family of language models.

InfoQ previously covered Google’s work on using VLMs for robot control, including Robotics Transformer 2 (RT-2) and PaLM-E, a combination of their PaLM and Vision Transformer (ViT) models.

The PaliGemma 2 base models, fine-tuned versions, and a script for fine-tuning the base model are available on Hugging Face, which also hosts a web-based visual question answering demo of a fine-tuned PaliGemma 2 model.
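As a rough illustration, a pre-trained checkpoint can be loaded through the Hugging Face transformers library. The sketch below is a minimal example only; the model id and prompt shown are assumptions, not an official recipe, and gated checkpoints may require authentication.

    # Minimal sketch: image captioning with a PaliGemma 2 base checkpoint via
    # Hugging Face transformers. The model id is an assumed example (3B LLM,
    # 224px vision encoder).
    from transformers import AutoProcessor, PaliGemmaForConditionalGeneration
    from PIL import Image

    model_id = "google/paligemma2-3b-pt-224"
    model = PaliGemmaForConditionalGeneration.from_pretrained(model_id)
    processor = AutoProcessor.from_pretrained(model_id)

    image = Image.open("example.jpg")   # any local test image
    prompt = "caption en"               # PaliGemma-style task prefix

    inputs = processor(text=prompt, images=image, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=50)
    print(processor.decode(output_ids[0], skip_special_tokens=True))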



Nvidia Announces Arm-Powered Project Digits, Its First Personal AI Computer

Sergio De Simone

Article originally posted on InfoQ. Visit InfoQ

Capable of running models of up to 200 billion parameters, Nvidia Project Digits packs the new Nvidia GB10 Grace Blackwell Superchip, enabling developers to fine-tune and run AI models on their local machines. Starting at $3,000, Project Digits targets AI researchers, data scientists, and students, allowing them to create models on a desktop system and then deploy them on cloud or data center infrastructure.

Nvidia Grace Blackwell brings together Nvidia’s Arm-based Grace CPU and Blackwell GPU with the latest-generation CUDA cores and fifth-generation Tensor Cores connected via NVLink®-C2C. A single unit will include 128GB of unified, coherent memory and up to 4TB of NVMe storage.

According to Nvidia, Project Digits delivers up to 1 petaFLOP of 4-bit floating-point performance, which means you can expect that level of performance for inference using quantized models but not for training. Nvidia has not disclosed the system’s performance for 32-bit floating point or provided details about its memory bandwidth.
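A back-of-envelope calculation (illustrative arithmetic only, not Nvidia’s figures) shows why the 128GB of unified memory is enough to hold a 200B-parameter model once it is quantized to 4 bits:

    # Why a 200B-parameter model fits in 128GB of unified memory at 4-bit precision
    # (rough arithmetic under stated assumptions).
    params = 200e9            # 200 billion parameters
    bytes_per_param = 0.5     # 4-bit weights = half a byte per parameter
    weights_gb = params * bytes_per_param / 1e9
    print(f"~{weights_gb:.0f} GB of weights")  # ~100 GB, leaving headroom for
                                               # the KV cache and activations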

The announcement of Project Digits made some developers ponder whether it could be a preferable choice to an Nvidia RTX 5090-based system. In comparison to a 5090 GPU, Project Digits has the advantage of coming in a compact box and not requiring the large cooler used on the 5090. On the other hand, the use of low-power DDR5 memory in Project Digits seems to imply reduced bandwidth compared to the 5090’s GDDR7 memory, which further hints at Project Digits being optimized for inference. However, lacking final details, it is hard to say how the two solutions compare performance-wise.

Another interesting comparison that has been brought up is with Apple’s M4 Max-based systems, which may pack up to 196GB of memory and are thus suitable for running large LLMs for inference. Here, there seem to be more similarities between the two systems, including the use of DDR5X unified memory, so Nvidia appears to be aiming, among other things, to provide an alternative to that kind of solution.

Project Digits will run Nvidia’s own Linux distribution, DGX OS, which is based on Ubuntu and includes an Nvidia-optimized Linux kernel with out-of-the-box support for GPU Direct Storage (GDS). Nvidia says the first units will be available in May this year.



AJ Styles comments on his recovery from injury, Chelsea Green wants Matt Cardona in WWE


Posted on mongodb google news. Visit mongodb google news

– On the October 4th 2024 edition of Smackdown, AJ Styles suffered an injury in a match against Carmelo Hayes. Styles then revealed that he had suffered a “mid foot ligament sprain”. A fan recently asked Styles if he could provide an update on his injury and the response wasn’t as positive as we would have all hoped.

Chelsea Green expressed her desire to see her husband, Matt Cardona, return to WWE. She said “I want to see Matt in WWE, honestly more than anything else, anything else that I even could want out of my career. I feel guilt because first of all, he supports me like no other. He’s so happy for me. He watches everything I do. He’s at shows when I’m winning championships. But at the end of the day, I go home and I know that this was his dream. I joke with you about the fact that I googled how to be a WWE Diva, but he didn’t. He literally came out of the womb wanting to be a WWE Superstar. So I just want him so badly to come back and have that final closure, that ending that he so deserves as, I mean, he was with WWE for a very, very, very long time. I think the fans want it too. Like, I don’t want to speak for anyone, but I just, I get a lot of people asking, you know, when’s he coming back? When’s he coming back? Gosh, I would love, love, love to see him back.“

Article originally posted on mongodb google news. Visit mongodb google news



How to create realistic, safe, document-based test data for MongoDB – Security Boulevard


Posted on nosqlgooglealerts. Visit nosqlgooglealerts

An Overview of MongoDB

MongoDB is a NoSQL database platform that uses collections of documents to store data rather than tables and rows like most traditional Relational Database Management Systems (RDBMS). It derives its name from the word ‘humongous’ — ‘mongo’ for short. It is an open source database with options for free, enterprise, or fully managed Atlas cloud licenses.

Development on MongoDB began as early as 2007 with plans to release a platform-as-a-service (PaaS) product; however, the founding software company, 10gen, decided instead to pursue an open source model. In 2013, 10gen changed its name to MongoDB to unify the company with its flagship product, and the company later went public in 2017.

MongoDB was built with the intent to disrupt the database market by creating a platform that would ease the development process, scale faster, and offer greater agility than a standard RDBMS. Before MongoDB’s inception, its founders (Dwight Merriman, Kevin P. Ryan, and Eliot Horowitz) were executives and engineers at DoubleClick. They were frustrated with the difficulty of using existing database platforms to develop the applications they needed. MongoDB was born from their desire to create something better.


As of this writing, MongoDB ranks first on db-engines.com among document stores and fifth among all database management systems.

Being document-based, Mongo stores data in JSON-like documents of varying sizes that mimic how developers construct classes and objects. MongoDB’s scalability can be attributed to its ability to define clusters with hundreds of nodes and millions of documents. Its agility results from intelligent indexing, sharding across multiple machines, and workload isolation with read-only secondary nodes.
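As a brief illustration of that document model, the sketch below uses the pymongo driver against a local MongoDB instance; the connection string, database, and collection names are assumptions for the example.

    # Minimal sketch: storing schema-flexible documents and creating an index
    # with pymongo (local instance assumed).
    from pymongo import MongoClient

    client = MongoClient("mongodb://localhost:27017")
    patients = client["hospital"]["patients"]

    # Documents in the same collection need not share a schema.
    patients.insert_many([
        {"name": "Ada", "notes": "post-op check", "visits": [{"year": 2023}]},
        {"name": "Grace", "age": 52, "allergies": ["penicillin"]},
    ])

    # Indexing (and, at larger scale, sharding) is what keeps queries fast.
    patients.create_index("name")
    print(patients.count_documents({}))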


Challenges Creating Test Data in MongoDB

While the ease of creating documents to store data in MongoDB is valuable for development purposes, it entails significant challenges when attempting to create realistic test data for Mongo. Unlike traditional RDBMS platforms with predefined schemas, MongoDB works with JSON-like documents that are self-contained, each carrying its own definition. In other words, it’s schema-less. The elements of each document can develop and change without requiring conformity to the original documents, and their overall structure can vary. A field that contains a string in one document may hold an integer in another.
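The sketch below (assuming the same local pymongo setup as the earlier example) shows this kind of drift and one way to survey it, grouping documents by the BSON type of a field:

    # Sketch: the same field holding different BSON types across documents, and
    # an aggregation that counts documents per type ($type operator, MongoDB 3.4+).
    from pymongo import MongoClient

    people = MongoClient("mongodb://localhost:27017")["demo"]["people"]
    people.insert_many([
        {"name": "Ada", "age": 52},      # age stored as an integer
        {"name": "Grace", "age": "52"},  # age stored as a string
    ])

    pipeline = [{"$group": {"_id": {"$type": "$age"}, "count": {"$sum": 1}}}]
    for row in people.aggregate(pipeline):
        print(row)  # e.g. {'_id': 'int', 'count': 1} and {'_id': 'string', 'count': 1}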

The JSON file format itself introduces its own level of complexity. JSON documents have great utility because they can be used to store many types of unstructured data from healthcare records to personal profiles to drug test results. Data of this type can come in the form of physician notes, job descriptions, customer ratings, and other formats that aren’t easy to quantify and structure. What’s more, it is often in the form of nested arrays that create complex hierarchies. A high level of granularity is required to ensure data privacy when generating test data based on this data, whether through de-identification or synthesis. If that granularity isn’t achieved, the resulting test data will, at best, fail to accurately represent your production data and, at worst, leak PII into your lower environments.

A high degree of privacy paired with a high degree of utility is the gold standard when generating test data based on existing data. Already it can take days or weeks to build useful, safe test data in-house using a standard RDBMS. The variable nature of MongoDB’s document-based data extends that in-house process considerably. It’s the wild west out there, and you’d need to build a system capable of tracking every version and format of every document in your database to ensure that nothing is missed—a risky proposition.

It’s also worth noting that there aren’t many tools currently available for de-identifying and synthesizing data in MongoDB. This speaks to the challenges involved—challenges we’re gladly taking on.

Solutions for Mimicking Document-based Data with Tonic

Safely generating mock data in a document-based database like MongoDB requires best-in-class tools that can detect and locate PII across documents, mask the data according to its type (even when that type varies within the same field across different documents), and give you complete visibility so you can ensure no stone has been left unturned.
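To make the idea concrete, a type-aware masking pass might look roughly like the sketch below. This is an illustrative example only, not Tonic’s implementation or API, and the field, database, and masking rules are hypothetical.

    # Illustrative sketch (not Tonic's API): mask a field differently depending
    # on the type it holds in each document.
    import random
    from pymongo import MongoClient

    def mask_value(value):
        if isinstance(value, str):
            return "REDACTED"                     # replace strings outright
        if isinstance(value, int):
            return value + random.randint(-5, 5)  # perturb integers slightly
        return value                              # leave other types untouched

    people = MongoClient("mongodb://localhost:27017")["demo"]["people"]
    for doc in people.find({"age": {"$exists": True}}):
        people.update_one({"_id": doc["_id"]},
                          {"$set": {"age": mask_value(doc["age"])}})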

De-identifying a MongoDB collection in Tonic

At Tonic, we provide an integrated, powerful solution for generating de-identified, realistic data for your test environments in MongoDB. For companies working with data that doesn’t fit neatly into rows and columns, Tonic enables aggregating elements across documents to realistically anonymize sensitive information while providing a holistic view of all your data in all of its versions. Here are a few ways we accomplish this goal:

  • Schema-less Data Capture: For document-based data, Tonic builds a hybrid document model to capture the complexity of your data and carry it over into your lower environments. Our platform automatically scans your database to create this hybrid document, capturing all edge cases along the way, so you don’t miss a single field or instance of PII.
  • Granular NoSQL Data Masking: With Tonic, you can mask different data types using different rules, even within the same field. Regardless of how varied your unstructured data is, you can apply any combination of our growing list of algorithm-based generators to transform your data according to your specific field-level requirements.
  • Instant Output Preview: After applying the appropriate generators to your data, you can preview the masked output directly within the Tonic UI. This gives you a complete and holistic view of the data transformation process across your database.
  • Cross-database support: Achieve consistency in your test data by working with your data across database types. Tonic matches input-to-output data generated across your databases from MongoDB to PostgreSQL to Redshift to Oracle. Our platform connects natively to all of your database types to consistently and realistically de-identify your entire data ecosystem.
  • De-identify non-Mongo NoSQL data too: You can use Tonic with Mongo as a NoSQL interface, to de-identify NoSQL data stored in Couchbase or your own homegrown solutions. By using MongoDB as the go-between, Tonic is able to mask a huge variety of unstructured/NoSQL data.

We’re proud to be leading the industry in offering de-identification of semi-structured data in document-based databases. Are you ready to start safely creating mock data that mimics your MongoDB production database? Check out a recording of our June launch webinar, which includes a demo of our Mongo integration. Or better yet, contact our team, and we’ll show you the ropes live.

*** This is a Security Bloggers Network syndicated blog from Expert Insights on Synthetic Data from the Tonic.ai Blog authored by Expert Insights on Synthetic Data from the Tonic.ai Blog. Read the original post at: https://www.tonic.ai/blog/how-to-create-realistic-safe-document-based-test-data-for-mongodb



Podcast: Building Green Software with Anne Currie and Sara Bergman

Anne Currie and Sara Bergman

Article originally posted on InfoQ. Visit InfoQ

Transcript

Thomas Betts: What does it mean to be green in IT? That’s the question that begins chapter one in Building Green Software. Today I’m joined by two of the book’s authors. Anne Currie has been a passionate leader in green tech for many years. She’s a community co-chair of the Green Software Foundation, as well as lead of the GSF’s Green Software Maturity Matrix project. Sara Bergman is a senior software engineer at Microsoft. She’s an advocate for green software practices at Microsoft and externally, and an individual contributor to the Green Software Foundation. Unfortunately, their co-author, Sarah Hsu, was unable to join us today. Anne and Sara, welcome to The InfoQ Podcast.

Anne Currie: Thank you very much for having us.

Sara Bergman: Thank you for having us. Excited to be here.

Thomas Betts: I guess I should say welcome back Anne. You’ve been on before, but it’s been three or four years probably. You talked with Wes and Charles in the past, so welcome back.

What does it mean to be green in IT? [01:10]

Thomas Betts: So, let’s start with that opening question from your book: what does it mean to be green in IT? Anne, you want to start?

Anne Currie: It’s an interesting question. What does it mean to be green in IT? The O’Reilly book came out in April and is going well so far. People generally like it. It gives a good overview of what we’ll need to do. The good thing about writing a book is you learn a lot about the subject while you are writing the book. Going into the book, so when we started on chapter one, I think our view of what is green software is software that produced the minimum amount of carbon to operate it. So, without dropping any of the functionality, without dropping any of the SLAs, it’s produced less carbon, and that usually means it’s more efficient, uses fewer servers to do the job, and/or it runs when the sun’s shining or the wind’s blowing, so there’s more renewable power available.

But I think by the end, my thoughts on it had changed quite a lot. And nowadays I tend to think that green software is software that is optimized to run on renewable power because that is the power of the future. That is the only way we’re really going to get green in the future. So, all of those things, being more efficient, being more able to shift and shape work to when the sun is shining or the wind is blowing, it reduces carbon, but it is what is required to run on renewable power. That’s a little bit of a confusing answer, so Sara, what do you think?

Sara Bergman: I think it’s good… It explains also what our minds went through when we wrote the book, so that’s good. Or I don’t know, should people have a closer insight to our minds? I don’t know if that would be good or confusing, but anyway, no, I agree with your answer. I do think, because we’re on a journey, the energy transitions are moving from these fossil fuel, always-on energy sources that have dominated the energy sector, well, they’re going away slowly and being replaced by renewables. We aren’t quite there yet though, so it still matters. The other things still matters. Energy efficiency matters. It’s going to matter less, I dare say, once the energy transition is complete.

Then these shifting capabilities are going to come out in full swing. I also think hardware efficiency is fascinating. That means doing more with less, so being very mindful of the carbon debt that lands in your hand when you get a shiny new device. And there are many ways we can do this, and people are already doing it.

Fundamental principles of green computing [03:36]

Thomas Betts: I think the two of you hit on a couple different themes, and maybe we should dive into these and explain what the terms mean. So, we talked a little bit about just energy efficiency overall, but also I think, Sara, you mentioned hardware and hardware efficiency, and there’s an aspect of carbon awareness that you discuss in the book. So, what are those fundamental principles of green computing?

Energy efficiency [03:57]

Anne Currie: Well, shall I do efficiency and you do hardware efficiency and then we’ll both talk a little bit about carbon awareness? There are specialist subjects, so carbon efficiency is about producing the same functionality with fewer watts required, which really means with fewer CPU cycles because CPU is very correlated with energy use in servers. So, a more efficient system does the same stuff for you… well, uses less power to do it. So, that’s incredibly useful at times when the sun isn’t shining and the wind isn’t blowing. When you’re running off non-renewables, when you’re running off fossil fuels, you want to be using less power to produce the same output. And there are various ways you can do it.

You can do it through operational efficiency, so just being a bit cleverer about the way you use systems so they’re not, for example, over-provisioned. And you can also do it through code efficiency, so not running 10 lines of code when one would do. That’s a much more controversial and difficult subject. We’ve got a whole chapter on it in the book, but in addition to that, we’ve got hardware efficiency.

Hardware efficiency [05:03]

Sara Bergman: Yes, hardware is… In case you haven’t thought about it, it’s a very labor-intensive and carbon-intensive thing to produce, because minerals are mined all over the world, shipped to other places, and assembled in sometimes very energy-intensive ways before they land in your hand or in your server hall or in your closet where you keep servers, I don’t know. I don’t judge. But it’s a significant carbon debt, and that’s already paid. Once you get it and it’s brand new, you’ve paid the substantial… or the environment’s paid a substantial cost for that to happen. So, we want to be mindful of that. One of the best things that we can do is to hold onto it for longer, so that we amortize this cost over time and postpone the additional cost of buying a new shiny thing. And if you are a software person, and I do identify as a software person, it’s easy to say, “Well, that’s not my problem. It sounds like a hardware person’s problem”.

And they are certainly aware of it, and certainly thinking about it and working on it, but we as software people are not off the hook because the type of software that we run highly impacts the type of hardware that we have. If you just look at any enterprise software today, you could not run it on a machine that’s 30 years old, for example, because machines have massively improved. So, they kind of move in lockstep: we get better machines with more complicated software, and so forth. Also on the client side, and now I’m thinking PCs, smartphones, smart TVs, for example, there we see that the manufacturing cost in terms of carbon is actually much higher than the cost of the use time. So, it’s very, very important to be mindful of that cost. And you can do that in multiple ways, and that’s a whole chapter in itself, but I think that’s the short version.

Anne Currie: Sometimes that carbon that you have to pay out upfront, a lot of the terms in this whole area are very confusing, sometimes it’s called embodied carbon, sometimes it’s called embedded carbon, but it is basically the same. It’s the theoretical amount of carbon that was emitted into the atmosphere when your device was created.

Sara Bergman: And it’s basically… I believe it’s a borrowed expression. It comes from the building industry. They use the same terminology.

Anne Currie: I didn’t realize that. That’s good. And one of the things that’s quite interesting is that in the tech industry, when you’re talking about consumer devices like phones, it’s all embodied carbon. That’s a major issue. Huge issue with embodied carbon in phones and user devices. But embodied carbon is not the main problem in data centers, because data center hardware is really well managed compared to how you manage your devices at home: you don’t keep your devices as highly utilized, and you don’t keep them as long. In data centers, it’s mostly about the electricity used to power your servers and how that was generated. That’s the biggest cause of carbon emissions into the atmosphere.

Carbon awareness [08:05]

Sara Bergman: So, carbon awareness I think is the third thing we mentioned that we should maybe dive into. But it sounds so sci-fi. Whenever I mention it to an unsuspecting crowd, I’m like, “Yes, now we’re going to go off the deep end”. But it really isn’t super complicated. So, the thing is energy can be produced in different ways. We all know that. Most countries don’t have one source, they have multiple. So, the grid mix varies throughout the day, throughout the year and whatnot, and over the course of years as the grid evolves. And carbon awareness is adapting to those fluctuations. It’s about listening in real time: what energy do I have right now? Or listening to forecasts and then planning how you operate or what you serve to customers based on that.

Anne Currie: Yes, it’s interesting, almost everybody talks mostly about efficiency, and that’s a good place to start. It’s usually the easiest place to start. It’s not hard for people to just go through, and do fairly manual stuff to do with sorting out provisioning and cut their carbon emissions by 50%. That’s really not that difficult to do. Carbon awareness is more difficult. It requires you to design your systems, to architect your system, so you have parts of your system that, particularly CPU-heavy parts of your system, workloads that can be shifted in time that can be delayed to when the sun comes up or when the wind starts blowing again.

So, in many ways that’s very hard, but in the long term that is absolutely where all the win is, because power that’s generated from photovoltaic cells, from wind farms, if you can use it directly, is going to be 10 times cheaper, even than fossil fuels. So, if you can find a way to use that power, you will get electricity that’s 10 times cheaper, but it’s a longer-term planning exercise to work out how you’re going to produce systems that can shift work to take advantage of that.
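A minimal sketch of the workload shifting described above, assuming a made-up hourly carbon-intensity forecast rather than any particular grid data feed:

    # Pick the greenest hour to run a deferrable job, given a forecast of grid
    # carbon intensity (illustrative numbers only; real systems would read a
    # live grid-intensity feed).
    forecast = {                 # gCO2/kWh per hour from now
        0: 450, 1: 430, 2: 380,
        3: 210, 4: 190, 5: 240,  # e.g. solar generation ramps up mid-morning
    }

    def best_hour_to_run(forecast, deadline_hours):
        # Choose the lowest-intensity hour at or before the job's deadline.
        candidates = {h: g for h, g in forecast.items() if h <= deadline_hours}
        return min(candidates, key=candidates.get)

    print(best_hour_to_run(forecast, deadline_hours=5))  # -> 4 (190 gCO2/kWh)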

What can engineers do to make software more green? [09:59]

Thomas Betts: So, this is what happens with every time I talk about green software, you go into 17 different aspects that we all need to consider and it kind of becomes overwhelming. And it’s easy for someone to say, “Well, it’s embedded carbon in the iPhone, and so that’s clearly not my problem. I’m just an app developer, and so nothing I do makes a difference”, or, “It’s up to the data centers because they’re the ones that are actually plugged in, and are they running on fossil fuels or solar?” So, as the lowly developer or architect, how can you convince those people, no, what you’re doing does make a difference?

Anne Currie: Well, that’s good because it really does. AWS, Azure I think use the same terminology. They talk about it as a shared responsibility model. Even in the clouds where they’re putting tons of effort into providing services that… I know that you hear pros and cons of the cloud. And a big company is not one thing. There’s loads of things going on in their big company, but they do want to produce systems that will in the long run deliver zero carbon power because they want some of that 10X cheaper electricity too, thank you very much. So, they want us to produce systems that will run on that cheap green power, but as developers, you have to use those systems. You have to build your applications to run on those offerings that are effectively green platforms that can provide that.

If you don’t, if you run, if you just lift and shift from your data center, from your on-prem data center, for example, into the cloud and you just run on a dedicated server, you really don’t get that much benefit. You get some benefits, you don’t get that much benefit and the clouds will never be able to be green based on that. In order to be green, you have to start modernizing. You have to start using those services like spot instances, the right kind of instance type in order to get the benefit of the investment that they’re making, serverless, that kind of thing. Both sides are going to have to step up, and if either side doesn’t, you don’t end up with a very good solution.

Sara Bergman: Yes, and I think at the very least, use your consumer power. Do your research before choosing where you host your software. And continue to use your consumer power to apply necessary pressure on the topics you’re interested in. That goes for everything, of course. And don’t eat the elephant in one bite. Is that the saying?

Anne Currie: Oh yes, don’t try and eat the elephant in one bite. Yes, that is the saying.

Sara Bergman: Sorry, the two native English speakers need to correct the Swedish person. I don’t even know what the Swedish equivalent would be. Anyway, so don’t eat the elephant in one bite. Try to break it down. I think most people who have any piece of software, you know where your inefficiencies are, you know the thing you’re doing that’s kind of a little bit stupid, but you’re doing it anyways because of reasons. So, if you can find, for example, a sustainability argument to motivate doing it, I think you’re very, very likely going to be able to find other reasons for doing it as well. Maybe this is also cheaper. Maybe this also makes your service more reliable. Maybe it increases performance, lowers latency. So, bundle those arguments together. Sadly, most people, if you come to them and say, “Hey, we’re going to make this 5% greener”, they’re like, “Okay, you also have to deliver other value”.

Publicly talk about making software more green [13:23]

So, bundle it together with other things. And then what I… Maybe this is a… I don’t know if it’s a secret tip or not, but I think there’s a lot of goodwill to be had around publicly talking about this. Say you did something, you could see that it decreased your emissions and you go and talk about it. It can be a positive story. So, even if you are being carbon efficient, you’re delivering the same value for lesser carbon, that value can still be that you can talk about it externally. Say, “This is what we did”, and your customers will think, “Hey, what a nice company. I want to buy their stuff because I want to be green too”, because consumers want to be green.

Thomas Betts: Yes, a lot of corporate sustainability reports are talking about, “Here’s our carbon impact. We’re trying to go carbon neutral”. And it’s all playing with the numbers. There’s ways of manipulating it so that you look good in a report, but it is for that public campaign saying, “Hey, this is important to us and we’re making strides to reduce our footprint”. And it started with, “Okay, we’re just going to make sure the lights are turned off in the building”. And then it turns out everyone went home for COVID and they just never turned the lights back on in the building, so there, big win. But now it is looking at the, what is it, the second and third order effects. We’ve now put all of our stuff in the cloud, so we have to understand: what is the carbon impact of our software running the cloud as opposed to the data center where it’s a little easier to manage and measure?

Anne Currie: There is one downside of advertising and talking about what you’re doing, although I really like it when people do. It’s fantastic when they do. If you go into the ESG area, they are very keen on getting really accurate numbers. The trouble is that a lot of the time it’s really quite hard to get accurate numbers at the moment. And therefore they kind of go, “Well, if I can’t get the accurate numbers, I’m not even going to start”. But there are significant knock-on benefits to being green even if you don’t have the numbers. There are things you can do that are kind of proxy measures for being green and are worthwhile in and of themselves.

So, I often say, “Well, your hosting bill isn’t a perfect proxy metric for carbon emissions, but it’s not bad and it’s a really good place to start”. The very basic things that you start with to clean up your systems, like turning off stuff that you’re not using anymore or that just has low value, or right-sizing everything where you are over-provisioned, have an immediate impact on your hosting bill.

And I always think that in terms of green, if it isn’t at least a double win, it’s probably not green. So, to start with, the double win is you cut your cost. And then as you get a bit more sophisticated, the way that you then cut your carbon again is usually by using more modern resilience techniques like autoscaling. Well, that has a triple win there. And you cut your hosting costs. You cut your carbon emissions, but most importantly you get a more resilient system because autoscaling is a lot more resilient than having a cold or even a hot backup somewhere. So again, you’re looking at that double win. And if you don’t have the double win, ironically it’s probably not green because it won’t scale.

Proxy measures for how green you are [16:26]

Thomas Betts: You mentioned there were several different proxies. So, the bill is the one, what else can we look at for some sort of measure to say, “Am I green?” beyond just the bill that I’m getting?

Anne Currie: Well, I do quite like the most obvious proxy other than the bill. The bill is the nice easy one to start with. The more sophisticated one is performance, because as we said, each CPU cycle costs you time and it costs you money and it costs you carbon emissions. So, basically if you can improve the performance of your system, as Sara said, if you could just reduce those niggles, all those performance bugs that you kind of always meant to fix, but you never really got round to fixing, fix those performance bugs, you go faster and you release less carbon into the atmosphere.

Sara Bergman: The reason why that works is because you’re then cramming more work into your CPU in a shorter amount of time, meaning you might not need two instances or three instances. You can reduce down. Now, if you’re in the cloud, that’s great because someone else can use that hardware. If you’re on-prem, then you might say, “Oh, but I already have all those machines”. However, you want to think forward as well, into the future in terms of capacity planning. You may not need to buy additional servers if you’re using the ones you have more frugally. So, you might be able to maybe even turn some of them off and save them for a bit, so that’s why performance is a good metric. Don’t just speed things up for the sake of it. You speed things up so that you can reduce your footprint.

Green does not require the cloud [17:59]

Thomas Betts: I like that you brought up the on-prem versus in the cloud because I think there’s some mindset that, “Oh, I have to go to the cloud or we can’t be green”, and you can apply these same techniques in your own data center. It’s the same thing. You can run microservices yourself, you can host all those things if you need to on your own hardware. Plenty of companies do. And that’s the idea, that I shifted from… You said you don’t want to just do the straight lift and shift, that’s inefficient for all the reasons, but you also have to start thinking about: how do I design my system so that it can auto-scale up and down even though that’s on my hardware, and when it scales down, can something else run in its place, so that machine that’s still sitting there running is not completely useless, it’s actually producing some value?

Anne Currie: Well, or ideally it gets turned off. The ideal is you just turn the stuff off when you’re not using it. That is absolutely the dream. In fact, from what Sara said, obviously we’re completely in agreement on this, the way I quite like to say it is start with operational efficiency and use cost as your way of measuring that. If you could just run fewer machines, then it’s reflected in your energy bill if you’re on-prem or your hosting bill if you are in the cloud because you’ve right-sized or you’ve got rid of systems that are not doing anything. And then only after you’ve done that do you think about performance and effectively code level tuning. Because if you start doing the code level tuning before you’ve got really good at the operations, then you just end up with a whole load of machines that are only partially utilized. So, you’ve got to start with ops first. So, cost is your first proxy, performance is only a later proxy.

Sara Bergman: Yes, do it in the right order. And I think also maybe something that, if you’re not that into green software maybe, haven’t heard about them, the reason why we say, “Turn things off”, is because there’s something called energy proportionality about servers. And this was, I believe, first introduced in 2007 by two Google engineers who wrote this amazing paper, which I recommend people to read. But basically when you turn a server on, it consumes a baseline of power that’s pretty high. In their paper it’s around 50%. That was an aggregation over… Yes. Point is, it’s pretty high and it’s not doing anything.

And then as utilization increases, the curve goes up, but even if you just have an idle machine, you’re consuming a big chunk of energy, so you want to be mindful of that and have properly utilized machines or turning them off. And I think their paper originally was to propose we do fully energy proportional hardware so that the curve starts at zero and then increases with utilization. I haven’t seen any servers that can do that yet. That will be amazing, so looking forward to it. It’s very much a research area still, but we’re not there yet. So, idle servers consume lots of energy.
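As a rough illustration of that idle baseline, here is a simple linear power model with made-up numbers (not measurements from the paper):

    # Rough model of energy proportionality: power(u) = idle + (peak - idle) * u,
    # with an idle baseline of ~50% of peak (illustrative numbers only).
    def server_power(utilization, peak_watts=400, idle_fraction=0.5):
        idle = peak_watts * idle_fraction
        return idle + (peak_watts - idle) * utilization

    # Two servers at 25% utilization vs. one server at 50% doing the same work:
    print(2 * server_power(0.25))  # 500 W across two lightly loaded machines
    print(server_power(0.50))      # 300 W on one better-utilized machine

In this toy example, consolidating the same work onto one better-utilized machine saves roughly 200 W, which is why idle or barely utilized servers are the first thing to hunt down.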

Anne Currie: Again, that’s something you can do on-prem: you can set yourself a target of saying, “Averaged across my entire estate, have I got between 50 and 80% CPU utilization?” So, above 80% is problematic because you end up with all kinds of other downsides from machines that are too highly utilized, but on average, if you can achieve 50% across your estate, then that’s a great target.

Thomas Betts: Yes, I think some people will look at a metric and like, “Oh, my CPU is sitting there at like five or 10%, I’m doing great”. And that’s a mindset you have to get out of. Oh, if that’s the only thing that machine’s doing, that’s over-provisioned.

Anne Currie: Absolutely, yes.

Thomas Betts: You can get the same performance without running at 50%. You’re not introducing latency, you’re not introducing any other concerns, it’s just more of the smaller size CPU. And like you said, someone else can use that CPU for something else if you’re on the cloud.

Anne Currie: Yes, indeed. But on-prem too, it just requires a little bit of thought.

Thomas Betts: Right.

Sara Bergman: Yes, yes, you can definitely build the most sustainable data center ever yourself. There is no secret to the cloud. It takes a lot of engineering power.

Multi-tenancy is the efficiency secret sauce in the cloud [21:55]

Anne Currie: Well, there is one magic secret to the cloud, which you can do on-prem, but not very many businesses have the range of workloads to achieve it, which is multi-tenancy. So, the secret sauce of the cloud is that they’ve got such a mixture of tenants, and they all have different demand profiles, and they’re very good at making sure that you’re sharing a machine with people who have different demand profiles. So, maybe it’s the other side of the world, so they’re busy during the day and then you’ve got a different time of day. So, basically they’re saying, “Well, how do we align these demand profiles so that we never have a quiet machine and we never have an overloaded machine?” So, some companies can do that.

The most famous is probably Google. They’re one of the reasons why they’re very ahead on all of this stuff and a lot of the interesting thinking in this area came from Google was because from the get-go, from a very early stage, they were effectively their own multi-tenant. And they did quite a lot of work to become even more their own multi-tenant. And the classic is YouTube video. When they bought YouTube, that suddenly made them an excellent multi-tenant because YouTube has a very, very CPU-intensive workload associated with it, which is video transcoding and encoding. And they were always very careful. So, right from the start, there’s no SLA on that. They’ll say, “Do you know, could be done five minutes, could be done in five hours, could be done tomorrow. Just we’ll let you know when it’s done”.

So, that gives them something they can move around in time to make sure they get really high utilization on their machines. So, excellent example of finding something that effectively could be used to make themselves their own multi-tenant. So, one thing that any enterprise can do is if you can find that CPU-intensive task that you can… If you’re in the cloud, stick it in a spot instance. If you’re on-prem, you’re going to have to orchestrate that yourself in some way. Kubernetes does have a spot instance tool associated with it, so you can do it on-prem. But yes, the secret sauce of the cloud is multi-tenancy. It’s hard to manage otherwise.

Sara Bergman: Also, Google has been doing… Their YouTube processing, the media processing, has been carbon aware since 2021, I believe, which is another fun fact about them.

Shifting in time is better than shifting in space [24:12]

Thomas Betts: But that’s the thing, they have the incentive because they are at such a scale, yes, there’s the carbon benefit, there’s also the cost benefit for them. Because I’m not buying the GCP instance. Google is paying for that. Even though it’s one division of Google is paying another division of Google, like, “Please host this software for me”, there’s still a lot of costs. And so those big things that add up to multi-comma numbers, we can do this and we can get cheaper energy, like you said, if we can run this on solar somewhere else later, then let’s just do that. It also gives you the ability, I guess, to follow the sun.

We talk about that for software support, that we have people all around the globe so that we can support our users, but when you say, “I want to move this software”, it’s not I’m always going to move it to Sweden or Norway or somewhere, I’m going to move it to wherever it might make sense right now, so I don’t have to necessarily wait 18 hours. I could just pick a different place potentially?

Anne Currie: You can do, but I have to say, we are more keen on a shift in time than a shift in space. Well, it depends. If it’s something with a low amount of data associated with it, then shift it in place. There’s no problem. You’re not moving a great deal of data. But if you are moving a great deal of data, you’re often better off coming up with some cunning wheeze which will enable you to just do the work, either beforehand… Pre-caching is a big thing. You do things in advance, like a CDN does a whole lot of work in advance, or do it later. But yes, that just involves architects coming up with some clever plans for how they can do things early or late.

Sara Bergman: Because we do worry about network costs, like we’re spending carbon in the networks because, spoiler alert, networks consume a lot of energy. They have a lot of equipment that eats a lot of electricity. So, sending data on the internet for long distances, if it’s a lot of data, we’re spending carbon. So, then we’re worried that that is actually counteracting the benefit you had of moving it to a greener time zone. Because of that, again, you can’t necessarily easily see where your network is consuming electricity. That’s just harder, the way the internet works. Also, there are, of course, a lot of, especially in Europe I want to say, legal rules around where you can move data, and you have to be mindful of those things as well.

Various viewpoints and roles for green software [26:29]

Thomas Betts: Well, so you’ve brought up a couple of different things, the two of you in the last minute. So, Sara was talking about networks. Anne said, “Architects have to come up with these clever solutions”. So, the InfoQ audience is engineers, architects, DevOps, SREs. We have all these different personas. And so think about those different people and the different viewpoints, how would a software engineer think about changing their job to be more green versus an architect versus say a DevOps engineer or a platform engineer that’s responsible for deploying all this stuff and setting things up?

Anne Currie: I would say that you start with the platform. So, the platform engineers talk to the software engineers talk to the architects, and come up with a platform that’s going to work well for what they do. And it has to be a platform that in the long run the platform creators, the people who support that platform, are committed to it being a platform for the future, platform that will be able to run on top of renewable power. So, just for example, Kubernetes are working on all of this, but the clouds… There are some things that the clouds do that are the green platforms, and they talk about as green platforms, and some things that they kind of are clearly not and therefore they don’t really mention green associated with them.

So, this is again where Sara was saying you have to use your consumer power. You have to ask, you have to say, “Is that a green platform? Is that a platform that’s just going to be able to carry me all the way through to 24/7 carbon-free electricity in 2030 or 2035 or is it not? Is it going to be a dead end for me when I’m going to have to re-platform at some later point?” So, the first thing is make sure you choose the right platform. And then after that, and this is where the techies come in, where the architects come in, it’s using that platform as it was intended to be used. So, using that platform really well, which means reading all the docs, learning how to do it, and grasping that platform. Well for a platform engineer now, you need to step back and say, “Is this a green platform?” And if it’s not, what’s my plan for moving to one?

Well-architected frameworks from cloud providers [28:47]

Sara Bergman: And one place to start could be all three major cloud providers have their own well-architected frameworks and they all have a sustainability section within that, so that can be a good place to get going. I think all three of them could be a bit more comprehensive, but I think they’re a very good start. So, no matter who you use, that’s a good starting point.

Anne Currie: And it is well worth saying: have a look at the well-architected framework sustainability section for all of them, and that will give you information. You can ask your account manager, but quite often they’ll tell you stuff that’s not so useful because they don’t always know.

A lot of them have a sustainability architectural expert and you can ask to speak to them, and they will know what the green platforms are. But if you just say, “Is this a green platform?” quite often they will say, “Oh, this is already a green platform because we offset everything”. So, this is a big thing at the moment. A lot of them are saying, “Oh, well we offset everything, so we’ll pay somebody else to..”. But that is not enough. Carbon neutral is not carbon zero. It’s not getting us there. It’s the goal of 10 years ago. And all too often, when you ask a hyperscaler these days, “Is it a green platform?” they’ll say, “Yes, because we offset everything”, and that is not sufficient.

Thomas Betts: Yes. I still hear companies talking about, “We’re going carbon neutral”, or, “We made it to carbon neutral”, and it’s like, “Okay, and next”. So, the goal after that is carbon zero, and that’s for the data centers, that’s for software, that’s for hopefully every company eventually. It might not be tomorrow, but it’s a goal. You can have these long-term goals.

Anne Currie: Yes, indeed. Carbon neutral, if you’re not already there, you are quite behind the times, but it’s the first step.

Sustainable AI and LLMs [30:29]

Thomas Betts: Let’s go to something that’s not carbon neutral: machine learning, AI, LLMs. They get a lot of press for power consumption, data center consumption, CPU. It’s like the new Bitcoin mining, “I’m going to go and train an LLM”. Some of these numbers are hard to come by. I don’t know if you have looked into how much it is. And there are two aspects. There’s the training of the LLMs and the AI models, and then there’s hosting them, which can be less, but if they get used inappropriately, like any software, they can also have a significant consumption side. So, where are we at with that? What does it look like? Do you have a sense of the impact? And then what do we do about it?

Anne Currie: So, I can talk to where we are and where I think we’re likely to be. Sara is the expert in techniques to reduce the carbon associated with creating models or running inference. But from my perspective, yes, there is a great deal of power being used by LLMs at the moment. Do I think that’s a problem? Not actually. I’m going to be controversial and say no, not really, because it’s very much at the beginning. And it feels to me like it’s a gold rush, and all hardware is being used like crazy and nobody’s waiting for the sun to shine. Everybody’s running everything all the time, that kind of stuff. But fundamentally AI, both training and inference, has the potential to be something that could be shifted to when and where the sun is shining or the wind is blowing. There’s loads of potential asynchronicity about those loads.

They’re very CPU-intensive, and right now they tend to be urgent, but that is because we’re really at this stage terrible at AI. We don’t know how to do it. We don’t know how to operate it. Inference, for example, is often described as something that has to run instantly, that maybe the training you could do in advance, but inference is instant. But we’ve got tons of experience of things that need to be done instantly that are not done instantly on the internet.

For example, I mentioned CDNs earlier, content delivery networks. When you want to watch Game of Thrones, it’s a bit of an old school reference now, but in the olden days when you wanted to watch Game of Thrones, you’d say, “I want to watch Game of Thrones now, now. It has to be delivered to me”. And back in the very old days, we would’ve said, “Oh, no, it has to be delivered from the US to the UK now, instantly. That’s going to be impossible. Oh my goodness me, it’s going to overload the internet and everything’s going to collapse”.

But we learned that that wasn’t the way we needed to do things. What we would do is we’d say, “Well, everybody wants to watch Game of Thrones”, so we move it to the UK overnight and we sit it on a CDN node not very far away from where you live. And it seems like you’re getting it from the US, but you’re actually getting it from 100 yards away and it’s very quick, and it’s kind of a magic trick. You do things in advance. It’s a form of pre-caching. Because humans are not that different. They want to watch the same shows, and they want to ask the same questions of AI as well, so there’s tons of opportunity for us to get good at caching on inference, I think. And I think that’s where we’ll go.

But yes, so at the moment it’s horrendous, but I don’t think it’s as bad as Bitcoin. Well, it’s similar to Bitcoin at the moment in many ways, but it’s not innately as bad as Bitcoin. Bitcoin is purely about using power. And this is actually about achieving a result. And if you can achieve that result using less power, we will. Sara, this is actually your area rather than mine.

Sara Bergman: Yes, we seem to be talking about this a lot lately. I guess everyone in our industry is talking a lot about it lately. I think one of the good things is when I started my journey into how do we build greener software, a lot of the research that was out there was actually AI-related. And specifically on the training side, there are very smart people who’ve been thinking about this for a long time, which is good. And I think especially on the training, because if you go a few years back, that was maybe the bigger source of emission, or at least it was a bit easier to do research on perhaps, so there is a bias there. So, there are tons of things we can do. We can time shift obviously. There are a bunch of methods for making the model smaller. There are tons of research papers that describe techniques on how you can achieve that.

You can also think about on the inference side how you deploy it. If you think federated learning, are you utilizing your edge networks with moving data and the models closer to your users? And all these things that we had talked about before, like how do you do the operational efficiency for AI workloads, is not super foreign to… You use the same tools, maybe screw them in slightly different ways, but it’s very similar. So, we’re in good hands there. I think we will learn over time to apply these to the workloads, because, and this is something Anne and I have talked about before, we need to make it green. There isn’t an option where we say, “Use AI or be green”, because where we are at this point in time, that’s not a fair choice. We need to make AI green. That’s the thing. And we will definitely, as an industry, get there, I believe.

Anne Currie: Yes, I agree. It’s amazing how often we run across people going, “Oh, well, yes, the only green AI is no AI. You turn the AI off”. Well, there’s only one thing we can be really, really certain of in the tech industry, which is that AI is coming and the energy transition is coming. Both of those things need to be done at once. We cannot pick and choose there.

Sara Bergman: Yes. And I think also what’s interesting, because now we have said AI a lot, but I think what we all have meant is generative AI and LLMs. And it’s really only one branch of AI. I think we’re going to see more approaches, and they will maybe have completely different characteristics that will… And I’m curious to see how that goes. But I think specifically with large language models, we’re seeing the rise of small language models, and they often outperform… So, a specialized small language model often outperforms a more general large language model on the specific tasks it’s trained for, which is obvious when you think about it. And so I think we’re also going to see more of that. Then they also have the benefit that they can be closer to you, maybe even on your device, because they’re small. It’s right there in the name.

Thomas Betts: Yes. One of the techniques that is often used for getting the large language model to be better is to throw a larger prompt at it. And it turns out if you have a small language model that’s basically trained on what you’re prompting with every time, the analogy I liked was you can send someone to med school or whatever and they’ve spent decades going to school, or somebody wants to be a CPA and be an accountant, they only need to go to a two-year program and they learn how to be an accountant. I need the accounting LLM for my accounting software. I don’t need to ask it all about Taylor Swift. If it knows about Taylor Swift, it’s not actually a good language model for me to have in my software, and now it’s this larger thing that I’m not fully utilizing, just like the larger server that I’m not fully utilizing.

So, I like the idea of get something smaller, and you’re going to get better benefits of it. So, it kind of plays into your… There’s these co-benefits. So, smaller, faster, AI is greener AI, is greener software.

Thomas Betts: So, I have to call out, there was a talk at QCon San Francisco just last month about GitHub Copilot, and it talked about those same ideas, that they were… It was specifically talking about how to make the requests very quick because they needed to act like this is software running on your machine, in your IDE. And to make that happen, it has to be making local network calls.

So, they have it calling the local Azure instance that has the local AI model, the LLM, because they can’t afford to go across the pond to look it up because it would just take too long and then the software wouldn’t work. So, all those things are being discovered as we’re finding these large use cases for things like GitHub Copilot.

The Green Software Maturity Matrix [38:28]

Thomas Betts: I do want to give you a chance to talk about the Green Software Maturity Matrix. I mentioned it a little bit in your bio, Anne, so tell us, what is that?

Anne Currie: Well, actually that harks back to something we talked about earlier today in the podcast, that it’s very, very complicated. This is something that Sarah and Sara and I really hit writing the book. And we knew from going out and talking at conferences and things about this stuff, there’s lots of different things that you can do to be green, and there’s lots of different ways you can approach it. And there are some ways to approach it that are just vastly more successful than others. And we spent quite a lot of time thinking in the book about how we could communicate the process that you should use. So, don’t try and leap to the end. Don’t try and be Google in one leap because you’ll fail and then it won’t be green. It might be… If you set your goals too high, your bar too high, you’ll fail to achieve it, and then you’ll go, “Well, it’s impossible to be green”.

And that isn’t the case. You just need to do it in a more regulated way. So, we put a… A maturity matrix is a project… It’s quite an old-fashioned project management technique that basically says, “Well, what does good look like?” But also, more importantly, it says, “But you’re not going to get there in one go. You actually do need to get there in…”. A maturity matrix rather arbitrarily chooses five steps to get there. And if you do it through those five steps, you will almost certainly get there. If you don’t do it through those five steps, if you try and skip a step, then you will fail miserably.

So, the maturity matrix: basically, step one is just wanting to do it at all. Step two is basic operational efficiency, so turning off things that are not in use, right-sizing. Step three is high-end operational efficiency. This is like modern operational techniques, so autoscaling, using the right services properly. And only at levels four and five of the matrix do you start doing things like rewriting things in Rust. And ideally most enterprises would never have to achieve four or five, because they will adopt a platform at level three that will do four and five for them. So, that’s kind of like the idea. It should be, if you follow it, a very successful way to get modern systems that achieve 24/7 carbon-free electricity, or can run on 24/7 carbon-free electricity, by stopping you from trying too hard.

Sara Bergman: And it also, I think, nicely separates the vast area of sustainability into swim lanes. So, depending on your role, you can start with the swim lane that feels most natural to you. If you’re a product manager, you start in the product. You don’t have to start in one of the other ones. And you sort of go from there. And I think that’s also a good mental model for people because sometimes when something is new, it can be a bit overwhelming. And spreading it out, not just as five layers, but also nine different categories, really shows that, “Okay, you start with one thing. Just do this cell and then you can sort of expand from there. Just pick any cell in the first level and just start there, and sort of take it step by step”.

I always these days start my conference talks with that slide because I think it’s a good way to set the expectation, like, “This area is big. Today we’re going to focus on this because I can’t reasonably talk about everything about sustainability in IT in one hour or how long we have. So, we’re going to focus on this. Just know there is more stuff”. It also helps prevent the question, “What if I just rewrite everything in Rust?” like Anne said. Because somehow we always get asked that. Somehow I can never escape that question.

Anne Currie: Everybody wants to rewrite their systems in Rust or C. Why? Why?

Sara Bergman: The fan base is massive. Kudos to the Rust folks because you have some hardcore fans out there.

The Building Green Software book club [42:30]

Thomas Betts: We just had a Rust track at QCon, so yes, it’s popular. So, this has been a great conversation. I think we’re running a little long. I want to give you guys a minute to give a little plug for the Building Green Software Book Club. I attended the first meeting last month, I think it was. What is the book club and how can people find out more about it?

Sara Bergman: So, we’re aiming to do one chapter of the book once a month. We have done one, so so far we’re on track. I think there will maybe be a bit of a break now over the holidays, but we’ll get back on track. You can find a link out on our LinkedIn pages. You can sign up. It’s on Zoom. It’s very casual. We chat. You can ask us questions. As you might have noticed in this podcast, we’re happy to talk about stuff. So, even if there are no questions, we’ll talk anyway. People enjoyed it last time. I enjoyed it. It was very cozy. It was very nice to see people face-to-face virtually. Yes.

Thomas Betts: Well, I’ll be sure to add links to the book, to the Maturity Matrix, to the book club in the show notes. Anne and Sara, thanks again for joining me today.

Anne Currie: Thank you again for having us. It was lovely.

Sara Bergman: It was really awesome. Thank you for having us.

Thomas Betts: And listeners, we hope you’ll join us again soon for another episode of The InfoQ Podcast.


Java News Roundup: WildFly 35, Jakarta EE 11 Update, Java Operator SDK 5.0-RC1

MMS Founder
MMS Michael Redlich

Article originally posted on InfoQ. Visit InfoQ

This week’s Java roundup for January 6th, 2025 features news highlighting: the release of WildFly 35; Java Operator SDK 5.0-RC1; Spring Cloud 2023.0.5; Micronaut 4.7.4; Quarkus 3.17.6; Arquillian 1.9.3; and an update on Jakarta EE 11.

JDK 24

Build 31 of the JDK 24 early-access builds was made available this past week featuring updates from Build 30 that include fixes for various issues. Further details on this release may be found in the release notes.

JDK 25

Build 5 of the JDK 25 early-access builds was also made available this past week featuring updates from Build 4 that include fixes for various issues. More details on this release may be found in the release notes.

For JDK 24 and JDK 25, developers are encouraged to report bugs via the Java Bug Database.

Jakarta EE 11

In his weekly Hashtag Jakarta EE blog, Ivar Grimstad, Jakarta EE Developer Advocate at the Eclipse Foundation, provided an update on Jakarta EE 11, writing:

Jakarta EE Core Profile 11 was released in December. You can check out all the details on the updated Jakarta EE Core Profile 11 specification page. The next out will be Jakarta EE Web Profile 11, which will be released as soon as there is a compatible implementation that passes the refactored TCK. The Jakarta EE Platform 11 will follow after the Web Profile.

The road to Jakarta EE 11 has included four milestone releases and the release of the Core Profile, with the potential for release candidates as necessary before the GA releases of the Platform and Web Profile in 1Q 2025.

Spring Cloud

Spring Cloud 2023.0.5, codenamed Leyton, has been released featuring bug fixes and notable updates to sub-projects: Spring Cloud Kubernetes 3.1.5; Spring Cloud Function 4.1.5; Spring Cloud Stream 4.1.5; and Spring Cloud Circuit Breaker 3.1.4. This release is based upon Spring Boot 3.4.0. Further details on this release may be found in the release notes.

WildFly

The release of WildFly 35 primarily focuses on support for MicroProfile 7.0 and the updated specifications, namely: MicroProfile Telemetry 2.0; MicroProfile Open API 4.0; MicroProfile Rest Client 4.0; and MicroProfile Fault Tolerance 4.1. Along with bug fixes and dependency upgrades, other enhancements include: a refactor of the WildFlyOpenTelemetryConfig class as it had become too large and unmanageable; and the addition of profiles in the source code base for a “cleaner organization of the build and testsuite execution so the base and expansion parts can be independently built, and, more importantly, can be independently tested.” More details on this release may be found in the release notes. InfoQ will follow up with a more detailed news story.

Micronaut

The Micronaut Foundation has released version 4.7.4 of the Micronaut Framework featuring Micronaut Core 4.7.11, along with bug fixes and patch updates to the Micronaut Serialization and Micronaut Discovery Client modules. Further details on this release may be found in the release notes.

Quarkus

Quarkus 3.17.6, the fifth maintenance release (3.17.1 was skipped due to a regression), ships with bug fixes, dependency upgrades and notable resolutions to issues such as: a NullPointerException caused by the mappingToNames() method, defined in the BuildTimeConfigurationReader class, when using the SmallRye Config PropertyName class to map mapping names; and an application crash when bootstrapping with the Dev Console. More details on this release may be found in the changelog.

Java Operator SDK

The first release candidate of Java Operator SDK 5.0.0 ships with continuous improvements on new features such as: Kubernetes Server-Side Apply elevated to a first-class citizen, with a default approach for patching the status resource; and a change in responsibility whereby the EventSource interface now monitors resources and handles access to cached resources, filtering, and additional capabilities that were once maintained by the ResourceEventSource subinterface. Further details on this release may be found in the changelog.

Arquillian

A week after the release of version 1.9.2, Arquillian 1.9.3 provides dependency upgrades and improvements to the ExceptionProxy class to produce a meaningful stack trace when the exception class is missing on a client. More details on this release may be found in the release notes.


Google Expands Gemini Code Assist with Support for Atlassian, GitHub, and GitLab

MMS Founder
MMS Renato Losio

Article originally posted on InfoQ. Visit InfoQ

Google recently announced support for third-party tools in Gemini Code Assist, including Atlassian Rovo, GitHub, GitLab, Google Docs, Sentry, and Snyk. The private preview enables developers to test the integration of widely-used software tools with the personal AI assistant directly within the IDE.

Offering similar functionalities to the market leader GitHub Copilot, Gemini Code Assist provides AI-assisted application development with AI code assistance, natural language chat, code transformation, and local codebase awareness. Launching these tools in private preview integrates real-time data and external application access directly into the coding environment, enhancing functionality while reducing distractions. Ryan J. Salva, senior director at Google, and Prithpal Bhogill, group product manager at Google, write:

Recognizing the diverse tools developers use, we’re collaborating with many partners to integrate their technologies directly into Gemini Code Assist for a more comprehensive and streamlined development experience. These partners, and more, help developers stay in their coding flow while accessing information through tools that enhance the SDLC.

According to the documentation, the supported third-party tools can convert any natural language command into a parameterized API call, based on the OpenAPI standard or a YAML file provided by the user. GitHub Copilot Enterprise also includes extensions to reduce context switching. Richard Seroter, senior director and chief evangelist at Google Cloud, comments:

Google often isn’t first. There were search engines, web email, online media, and LLM-based chats before we really got in the game. But we seem to earn our way to the leaderboard over time. The latest? Gemini Code Assist isn’t the first AI-assisted IDE tool. But it’s getting pretty good!

With coding assistance being one of the most promising areas for generative AI, Salva and Bhogill add:

Code Assist currently provides developers with a natural language interface to both traditional APIs and AI Agent APIs. Partners can quickly and easily integrate to Code Assist by onboarding to our partner program. The onboarding process is as simple as providing an OpenAPI schema, a Tool config definition file, and a set of quality evals prompts used to validate and tune the integration.
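
As a rough illustration of what such an OpenAPI-backed tool definition gives the assistant to work with, the sketch below shows a hypothetical operation; the endpoint, parameter names, and prompt are invented for illustration and do not come from Google’s or any partner’s documentation.

// Hypothetical OpenAPI-style operation, expressed as a JavaScript object for brevity
// (real schemas would be JSON or YAML). All names here are invented examples.
const listIssuesOperation = {
  path: '/projects/{projectId}/issues',
  method: 'get',
  parameters: [
    { name: 'projectId', in: 'path', required: true, schema: { type: 'string' } },
    { name: 'status', in: 'query', schema: { type: 'string', enum: ['open', 'resolved'] } },
  ],
};

// A prompt like "show the open issues for the checkout project" could then be grounded
// into a parameterized call such as: GET /projects/checkout/issues?status=open
console.log(`${listIssuesOperation.method.toUpperCase()} ${listIssuesOperation.path}`);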

This is not the only recent announcement impacting Code Assist: support for Gemini 2.0 Flash is another significant one. Powered by Gemini 2.0, Code Assist now offers a larger context window, enabling it to understand more extensive enterprise codebases. According to Google, this new LLM aims to enhance productivity by providing higher-quality responses and lower latency, allowing users to “stay in an uninterrupted flow state for longer.” In “The 70% problem: Hard truths about AI-assisted coding”, Addy Osmani warns:

AI isn’t making our software dramatically better because software quality was (perhaps) never primarily limited by coding speed (…) What AI does do is let us iterate and experiment faster, potentially leading to better solutions through more rapid exploration (…) The goal isn’t to write more code faster. It’s to build better software. Used wisely, AI can help us do that. But it’s still up to us to know what “better” means and how to achieve it.

Code Assist currently supports authentication to partner APIs via the OAuth 2.0 Authorization Code grant type, with Google planning to add support for API key authentication in the future. Pricing is based on per-user, per-month licenses, with monthly or annual commitments. Licenses range from $19 USD to $54 USD per user per month. A Google form is available to request access to the private preview of Code Assist tools.


Nvidia Nemotron Models Aim to Accelerate AI Agent Development

MMS Founder
MMS Sergio De Simone

Article originally posted on InfoQ. Visit InfoQ

Nvidia introduced Llama Nemotron large language models (LLMs) and Cosmos Nemotron vision language models (VLMs) with a special emphasis on workflows powered by AI agents such as customer support, fraud detection, product supply chain optimization, and more. Models in the Nemotron family come in Nano, Super, and Ultra sizes to better fit the requirements of diverse systems.

AI agents are a new frontier of generative AI evolution, says Nvidia, aiming to create systems able to act autonomously to carry complex tasks through to completion. This requires combining language skills, as displayed by LLMs, with the ability to perceive and interact with the environment.

To be effective, many AI agents need both language skills and the ability to perceive the world and respond with the appropriate action.

That explains why the Nemotron Model family includes models derived from Meta’s LLaMA models as well as new Cosmos Nemotron VLMs that enable analyzing and responding to images and video captured in the user environment.

The availability of agents with vision capabilities, says Nvidia, could make it feasible to analyze videos from industrial cameras in a multitude of environments in real-time to help detect incidents, reduce defects, or guide humans through some course of action. Currently, according to the company, less than 1% of video from industrial cameras is watched live by humans.

According to Nvidia, the Llama Nemotron models were trained to efficiently execute a number of common agentic tasks, so a single model can be used where multiple specialized models would normally be required.

The models are pruned to reduce latency and improve compute efficiency, then retrained using a high-quality dataset with distillation and alignment methods to increase accuracy across tasks. This results in smaller models with high accuracy and throughput.

Nemotron models are optimized for distinct compute requirements, including Nano for PC application developers, Super to provide high performance on a single GPU, and Ultra, designed for data-center-scale applications.

The Nvidia Nemotron ecosystem also includes Nvidia NeMo to customize models with proprietary data, and NeMo Aligner to better align a model to follow instructions and generate human-preferred responses. Additionally, Nvidia provides Nvidia AI Blueprints as a tool to quickly create AI agents by using NIM microservices as building blocks to serve Nemotron models.

On a related note, Nvidia also announced its Cosmos world foundation models which are specially tailored to generate physics-aware videos for robotics and autonomous vehicles.


Express 5.0 Released, Focuses on Stability and Security

MMS Founder
MMS Bruno Couriol

Article originally posted on InfoQ. Visit InfoQ

The Express.js team has released version 5.0.0, 10 years after the previous major version release in 2014. The release focuses on stability and security with a view to enabling developers to write more robust Node.js applications.

Express 5 drops support for old versions of Node.js. The release note states:

This release drops support for Node.js versions before v18. This is an important change because supporting old Node.js versions has been holding back many critical performance and maintainability changes. This change also enables more stable and maintainable continuous integration (CI), adopting new language and runtime features, and dropping dependencies that are no longer required.

Following a security audit, the team decided to introduce changes to how route path matching works. To avoid regular expression Denial of Service (ReDoS) attacks, Express 5 no longer supports sub-expressions in regular expressions, for example /:foo(\d+).


app.get('/:id(\d+)', (req, res) => res.send(`ID: ${req.params.id}`)); // Express 4 sub-expression, no longer supported in Express 5

Blake Embrey, a member of the Express.js technical committee, provides an example of a regular expression (e.g., /^\/flights\/([^\/]+?)-([^\/]+?)\/?$/i) that, when matched against '/flights/' + '-'.repeat(16_000) + '/x', may take around 300ms instead of running in under a millisecond. The Express team recommends using a robust input validation library.
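
To make the scale of that slowdown concrete, here is a minimal Node.js sketch (a hypothetical timing harness, not taken from the Express project) that reproduces the pathological match; exact timings depend on hardware and Node.js version.

// Minimal sketch of the catastrophic backtracking described above; timings vary by machine.
const flightRoute = /^\/flights\/([^\/]+?)-([^\/]+?)\/?$/i;

const benign = '/flights/AMS-LHR';
const malicious = '/flights/' + '-'.repeat(16_000) + '/x';

console.time('benign');
flightRoute.test(benign);      // matches in well under a millisecond
console.timeEnd('benign');

console.time('malicious');
flightRoute.test(malicious);   // catastrophic backtracking: can take hundreds of milliseconds
console.timeEnd('malicious');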

Express 5 also requires wildcards in route paths to be explicitly named or replaced with (.*) for clarity and predictability. Thus, paths like /foo* must be updated to /foo(.*).

The syntax for optional parameters in routes also changes. Express 4’s :name? becomes {/:name} in Express 5:


app.get('/user/:id?', (req, res) => res.send(req.params.id || 'No ID')); // Express 4


app.get('/user{/:id}', (req, res) => res.send(req.params.id || 'No ID')); // Express 5

Unnamed parameters in regex capture groups can no longer be accessed by index. Parameters must now be named:


app.get('/user(s?)', (req, res) => res.send(req.params[0])); // Express 4: unnamed capture group, accessed by index


app.get('/user:plural?', (req, res) => res.send(req.params.plural)); // Express 5: the parameter must be named

Express 5 additionally enforces valid HTTP status codes, as a defensive measure against silent failures and arduous debugging sessions.


res.status(978).send('Invalid status');  // Express 4: the invalid code is sent as-is, failing silently


res.status(978).send('Invalid status');  // Express 5: throws an error for the invalid status code

Express 5 makes it easier to handle errors in async middleware and routes by automatically passing rejected promises to the error-handling middleware, removing the need for explicit try/catch blocks.


// Express 4: errors in async handlers must be caught and forwarded manually
app.get('/data', async (req, res, next) => {
  try {
    const result = await fetchData();
    res.send(result);
  } catch (err) {
    next(err);
  }
});


// Express 5: a rejected promise is automatically passed to the error-handling middleware
app.get('/data', async (req, res) => {
  const result = await fetchData();
  res.send(result);
});

While the Express team strives to keep the breaking changes minimal, the new release will require interested developers to migrate their Express code to the new version. Developers can review the migration guide available online.

Express.js is a project of the OpenJS Foundation (At-Large category). Developers are invited to read the full release note for additional technical details and examples.


Agile Alliance Joins the Project Management Institute

MMS Founder
MMS Shane Hastie

Article originally posted on InfoQ. Visit InfoQ

The Agile Alliance has officially joined the Project Management Institute (PMI), forming the PMI Agile Alliance as of December 31, 2024. The partnership aims to enhance global project management by integrating Agile principles with PMI’s resources and reach. While many celebrate the expanded opportunities for collaboration, professional development, and innovation, critics express concerns about the potential dilution of Agile values and the loss of independence for Agile Alliance.

According to the announcements, this partnership is expected to have a significant impact on the project management and Agile communities. It represents a step towards integrating traditional project management approaches with Agile methodologies, potentially reshaping how projects are managed and delivered across various industries.  There has been a lot of commentary in the agile community in particular; while some see this partnership as an opportunity for growth and integration in the project management field, others express concerns about the future direction of Agile practices and principles.

The rationale for the partnership is presented as:

  • Evolving Project Management Landscape: The partnership recognizes that modern project delivery requires fluency across various delivery practices
  • Global Reach: PMI Agile Alliance will benefit from PMI’s worldwide presence and resources
  • Expanding Agile Influence: The collaboration aims to broaden the understanding of Agile and agility beyond the tech industry and software development
  • Enhancing Project Success: By combining PMI’s structured approach with Agile Alliance’s adaptive principles, the partnership seeks to empower professionals to achieve greater project success

Financial considerations are not mentioned in the press releases, but Mike Cohn, one of the original founders of the Agile Alliance, has stated that the decline in revenues from conferences post-COVID is considered a key factor.

According to Teresa Foster, managing director of the Agile Alliance, key aspects of the partnership are: 

  • Agile Alliance will now operate as PMI Agile Alliance
  • Foster and her team will transfer to PMI
  • PMI Agile Alliance will maintain its existing membership structure and elected Board
  • The core mission, values, and principles of Agile Alliance will be preserved
  • PMI members will gain access to expanded agile thought leadership, tools, and resources, strengthening their agile mindset

Jim Highsmith, another of the Agile Manifesto signatories and a founder of the Agile Alliance, sees the partnership as a step towards “Bridging the Divide: Integrating Agile and Project Management”.

Concerns raised about the partnership include: 

  • Fear of Dilution: There is concern that the merger might dilute Agile principles or lead to a return to more traditional project management approaches
  • Loss of Independence: There are worries about the Agile Alliance losing its independent voice and becoming subsumed under PMI’s structure
  • Relevance Concerns: Some argue that Agile Alliance was already struggling to remain relevant, and this move might not address underlying issues
  • Cultural Clash: There’s apprehension about potential conflicts between PMI’s structured approach and Agile’s adaptive principles

According to Dave Westgarth, there is irony in Agilists reacting to the merger with the same fear and resistance that traditional project managers once showed towards Agile.
