MMS • Daniel Dominguez
Article originally posted on InfoQ. Visit InfoQ
Hugging Face has launched the integration of four serverless inference providers (Fal, Replicate, SambaNova, and Together AI) directly into its model pages. These providers are also integrated into Hugging Face's client SDKs for JavaScript and Python, allowing users to run inference on various models with minimal setup.
This update enables users to select their preferred inference provider, either by using their own API keys for direct access or by routing requests through Hugging Face. The integration supports different models, including DeepSeek-R1, and provides a unified interface for managing inference across providers.
Developers can access these services through the website UI, SDKs, or direct HTTP calls. The integration allows seamless switching between providers by modifying the provider name in the API call while keeping the rest of the implementation unchanged. Hugging Face also offers a routing proxy for OpenAI-compatible APIs.
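A minimal sketch of that provider switch, using the huggingface_hub Python SDK, might look like the following; the API token and max_tokens value are placeholders.

```python
# Illustrative sketch: calling a model through a serverless inference provider
# via the huggingface_hub SDK, then switching providers by changing one argument.
from huggingface_hub import InferenceClient

client = InferenceClient(provider="together", api_key="hf_xxx")  # placeholder token

completion = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-R1",
    messages=[{"role": "user", "content": "Summarize the CAP theorem in one sentence."}],
    max_tokens=200,
)
print(completion.choices[0].message)

# Switching providers only requires a different provider name;
# the rest of the call stays the same.
client = InferenceClient(provider="sambanova", api_key="hf_xxx")
```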
Rodrigo Liang, Co-Founder & CEO at SambaNova, stated:
We are excited to be partnering with Hugging Face to accelerate its Inference API. Hugging Face developers now have access to much faster inference speeds on a wide range of the best open source models.
And Zeke Sikelianos, Founding Designer at Replicate, added:
Hugging Face is the de facto home of open-source model weights, and has been a key player in making AI more accessible to the world. We use Hugging Face internally at Replicate as our weights registry of choice, and we’re honored to be among the first inference providers to be featured in this launch.
Fast and accurate AI inference is essential for many applications, especially as demand for more tokens increases with test-time compute and agentic AI. Running open-source models on SambaNova's RDU (Reconfigurable Dataflow Unit) hardware enables developers to achieve up to 10x faster inference with improved accuracy.
Billing is handled by the inference provider if a user supplies their own API key. If requests are routed through Hugging Face, charges are applied at standard provider rates with no additional markup.
MMS • Robert Krzaczynski
Article originally posted on InfoQ. Visit InfoQ
Block’s Open Source Program Office has launched Codename Goose, an open-source, non-commercial AI agent framework designed to automate tasks and integrate seamlessly with existing tools. Goose provides users with a flexible, on-machine AI assistant that can be customized through extensions, enabling developers and other professionals to enhance their productivity.
Goose is designed to integrate seamlessly with existing developer tools through extensions, which function using the Model Context Protocol (MCP). This enables users to connect with widely used platforms such as GitHub, Google Drive, and JetBrains IDEs while also allowing them to create custom integrations. The AI agent is positioned as a tool for both software engineers and other professionals looking to optimize their workflows.
Goose functions as an autonomous AI agent that can carry out complex tasks by coordinating various built-in capabilities. Users can integrate their preferred LLM providers, ensuring flexibility in how the tool is deployed. Goose is designed for easy adaptation, allowing developers to work with AI models in a way that fits their existing workflows.
The agent supports a range of engineering-related tasks, including:
- Code migrations
- Generating unit tests for software projects
- Scaffolding APIs for data retention
- Managing feature flags within applications
- Automating performance benchmarking for build commands
- Increasing test coverage above specific thresholds
As an open-source initiative, Goose has already attracted attention from industry professionals. Antonio Song, a contributor to the project, highlighted the importance of user interaction in AI tools:
Most of us will have little to no opportunity to impact AI model development itself. However, the interface through which users interact with the AI model is what truly drives users to return and find value.
Furthermore, user Lumin commented on X:
Goose takes flight. Open-source AI agents are no longer a side project—they are defining the future. Codename Goose 1.0 signals a paradigm shift: decentralized, non-commercial AI frameworks bridging intelligence and real-world execution. The AI race has been dominated by centralized models with restricted access. Goose challenges that by enabling modular AI agents that can install, execute, edit, and test with any LLM, not just a select few.
Goose is expected to evolve further as more contributors refine its capabilities. The tool’s extensibility and focus on usability suggest it could become a widely adopted resource in both engineering and non-engineering contexts.
MMS • RSS
Posted on nosqlgooglealerts. Visit nosqlgooglealerts
2024 marked a significant year for Amazon DynamoDB, with advancements in security, performance, cost-effectiveness, and integration capabilities. This year-in-review post highlights key developments that have enhanced the DynamoDB experience for our customers.
Key highlights and launches of 2024 include:
These improvements, along with numerous other updates, reflect our commitment to making DynamoDB more resilient, flexible, and cost-effective for businesses of all sizes. In the following sections, we dive deeper into each category of updates, exploring how they can benefit your applications and workflows.
Whether you’re a long-time DynamoDB user or just getting started, this post will guide you through the most impactful changes of 2024 and how they can help you build reliable, faster, and more secure applications. We’ve sorted the post by alphabetical feature areas, listing releases in reverse chronological order. Note that certain announcements may be duplicated across feature areas; the first of these duplicates will cover comprehensive details.
Over the course of 2024, we’ve also overhauled areas of the official DynamoDB documentation, so be sure to check out all of the new and modified pages, and update your browser bookmarks. Be sure to contact us at @DynamoDB or on AWS re:Post if you have questions, comments, or feature requests.
Amazon DynamoDB Accelerator (DAX)
Application integration
- December 3 – Amazon DynamoDB zero-ETL integration with Amazon SageMaker Lakehouse automates the extracting and loading of data from a DynamoDB table into SageMaker Lakehouse, an open and secure lakehouse. You can run analytics and machine learning (ML) workloads on your DynamoDB data using SageMaker Lakehouse, which provides integrated access control and open source Apache Iceberg for data interoperability and collaboration. With this launch, you now have the option to enable analytics workloads using SageMaker Lakehouse, in addition to the previously available Amazon OpenSearch Service and Amazon Redshift zero-ETL integrations. To learn more, refer to DynamoDB integrations, read the DynamoDB zero-ETL documentation with Amazon SageMaker Lakehouse, or read the Amazon SageMaker Lakehouse documentation.
- November 12 – Amazon Managed Service for Apache Flink now supports Amazon DynamoDB Streams as a source. The new connector, contributed by AWS to the Apache Flink open source project, adds Amazon DynamoDB Streams as a new source for Flink. Flink connectors are software components that move data into and out of an Amazon Managed Service for Apache Flink application. You can use the new connector to read data from a DynamoDB stream starting with Flink version 1.19. With Amazon Managed Service for Apache Flink there are no servers and clusters to manage, and there is no compute and storage infrastructure to set up. For the Flink repository for AWS connectors, refer to Amazon DynamoDB Connector. For detailed documentation and setup instructions, see DynamoDB Streams and Apache Flink.
- October 15 – AWS announces general availability of Amazon DynamoDB zero-ETL integration with Amazon Redshift. This integration allows you to perform high-performance analytics on DynamoDB data without impacting production workloads or building complex extract, transform, and load (ETL) pipelines. As data is written to DynamoDB, it becomes immediately available in Amazon Redshift, enabling holistic insights across applications, breaking data silos, and providing cost savings. You can take advantage of Amazon Redshift capabilities such as high-performance SQL, built-in ML, Spark integrations, and data sharing to enhance your analysis of DynamoDB data. To learn more, see DynamoDB zero-ETL integration with Amazon Redshift.
- September 18 – AWS Cost Management now provides purchase recommendations for Amazon DynamoDB reserved capacity. This new feature analyzes your DynamoDB usage and suggests optimal reserved capacity purchases for 1- or 3-year terms. You can customize recommendation parameters to align with your financial goals. This addition expands the recommendation capabilities of AWS Cost Explorer to include seven reservation models across various AWS services. The feature is available in most AWS Regions where DynamoDB operates, except China (Beijing, operated by Sinnet), China (Ningxia, operated by NWCD), and AWS GovCloud. For more information, or to get started with DynamoDB reserved capacity recommendations, refer to Accessing reservation recommendations.
- March 27 – Amazon DynamoDB Import from S3 now supports up to 50,000 Amazon S3 objects in a single bulk import. With the increased default service quota for import from Amazon Simple Storage Service (Amazon S3), customers who need to bulk import a large number of S3 objects can now run a single import to ingest up to 50,000 S3 objects, removing the need to consolidate S3 objects prior to running a bulk import. The new Import from S3 quotas are now effective in all Regions, including the AWS GovCloud (US) Regions. Start taking advantage of the new DynamoDB Import from S3 quotas by using the DynamoDB console, the AWS Command Line Interface (AWS CLI), or AWS APIs; a minimal Boto3 sketch of starting such an import appears after this list. For more information about DynamoDB quotas, refer to Service, account, and table quotas in Amazon DynamoDB.
- February 2 – Amazon DynamoDB zero-ETL integration with Amazon Redshift became available in the US East (N. Virginia) Region.
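As a rough illustration of the bulk import described in the March 27 item, the following Boto3 sketch starts an import from S3; the bucket, key prefix, and table definition are placeholders.

```python
# Illustrative sketch: starting a DynamoDB bulk import from Amazon S3 with Boto3.
import boto3

dynamodb = boto3.client("dynamodb")

response = dynamodb.import_table(
    S3BucketSource={"S3Bucket": "my-export-bucket", "S3KeyPrefix": "exports/2024/"},
    InputFormat="DYNAMODB_JSON",
    TableCreationParameters={
        "TableName": "ImportedOrders",
        "AttributeDefinitions": [{"AttributeName": "pk", "AttributeType": "S"}],
        "KeySchema": [{"AttributeName": "pk", "KeyType": "HASH"}],
        "BillingMode": "PAY_PER_REQUEST",
    },
)
# The import runs asynchronously; poll describe_import with the returned ARN.
print(response["ImportTableDescription"]["ImportStatus"])
```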
Developer and user experience
- October 17 – Amazon DynamoDB announces user experience enhancement to organize your tables. You can choose the favorites icon to view your favorited tables on the console’s tables page. With this update, you have a faster and more efficient way to find and work with tables that you often monitor, manage, and explore. The favorites table console experience is now available in all Regions at no additional cost. Get started with creating a DynamoDB table from the AWS Management Console.
- May 28 – Amazon DynamoDB local supports configurable maximum throughput for on-demand tables. You can use the configurable maximum throughput for on-demand tables feature for predictable cost management, protection against accidental surges in consumed resources and excessive use, and safeguarding downstream services with fixed capacity from potential overloading and performance bottlenecks. With DynamoDB local, you can develop and test your application while managing maximum on-demand table throughput, making it easier to validate the use of the supported API actions before releasing code to production; a minimal Boto3 sketch appears after this list. To get started with the latest version, refer to Deploying DynamoDB locally on your computer. Learn more in the documentation by referring to Setting Up DynamoDB Local (Downloadable Version).
- April 24 – NoSQL Workbench for Amazon DynamoDB launches a revamped operation builder user interface. The revamped operation builder interface gives you more space to explore and visualize your data, lets you manage tables with just one click, and allows direct item manipulation right from the results pane. We've added a copy feature for quick item creation, streamlined the Query and Scan filtering process, and added a seamless DynamoDB local integration. For those who prefer a different look or need better accessibility, there's a new dark mode too. All of these features come at no extra cost, no matter which Region you're using. Get started by downloading the latest version from Download NoSQL Workbench for DynamoDB. For more information about the latest updates, refer to Exploring datasets and building operations with NoSQL Workbench.
- March 14 – Amazon DynamoDB local upgrades to Jetty 12 and JDK 17. DynamoDB local (the downloadable version of DynamoDB) version 2.3.0 migrates from Jetty 11 to Jetty 12 server, and from JDK 11 to JDK 17. With this update, developers using Spring Boot 3.2.x with Jetty 12 support can use DynamoDB local to develop and test their Spring applications when working with DynamoDB. With DynamoDB local, you can develop and test applications by running DynamoDB in your local development environment without incurring any costs.
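The May 28 item above can be exercised locally with a sketch along these lines; the endpoint URL, dummy credentials, and throughput limits are illustrative assumptions.

```python
# Illustrative sketch: creating an on-demand table with a maximum throughput cap
# against DynamoDB local.
import boto3

dynamodb = boto3.client(
    "dynamodb",
    endpoint_url="http://localhost:8000",  # DynamoDB local
    region_name="us-west-2",
    aws_access_key_id="dummy",
    aws_secret_access_key="dummy",
)

dynamodb.create_table(
    TableName="Orders",
    AttributeDefinitions=[{"AttributeName": "pk", "AttributeType": "S"}],
    KeySchema=[{"AttributeName": "pk", "KeyType": "HASH"}],
    BillingMode="PAY_PER_REQUEST",
    # Cap consumption so a runaway test cannot exceed these request-unit limits.
    OnDemandThroughput={"MaxReadRequestUnits": 100, "MaxWriteRequestUnits": 50},
)
```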
Global tables
- December 2 – Amazon DynamoDB global tables previews multi-Region strong consistency. This new capability enables you to build highly available multi-Region applications with zero Recovery Point Objective (RPO). Multi-Region strong consistency makes sure applications can consistently read the latest data version from different Regions in a global table, eliminating the need for manual cross-Region consistency management. This feature is particularly beneficial for global applications with strict consistency requirements, such as user profile management, inventory tracking, and financial transaction processing. The preview is currently available in US East (N. Virginia), US East (Ohio), and US West (Oregon) Regions, with the existing global tables pricing. For more information, refer to Multi-Region strong consistency, and Global tables – multi-Region replication for DynamoDB.
- November 14 – Amazon DynamoDB reduces prices for on-demand throughput and global tables. We have made DynamoDB even more cost-effective by reducing prices for on-demand throughput by 50% and global tables by up to 67%. To learn more, see New – Amazon DynamoDB lowers pricing for on-demand throughput and global tables.
- April 30 – Amazon DynamoDB now supports an AWS FIS action to pause global table replication. This feature enhances the fully managed global tables service in DynamoDB, which automatically replicates tables across selected Regions for fast, local read and write performance. The new AWS Fault Injection Service (FIS) action enables you to simulate and observe your application’s response to a pause in Regional replication, allowing you to fine-tune monitoring and recovery processes for improved resiliency and availability. Global tables can now be tested more thoroughly to maintain proper application behavior during Regional interruptions. You can integrate this new action into your continuous integration and release testing processes by creating experiment templates in FIS, and combine it with other FIS actions for comprehensive scenario testing. The DynamoDB Pause Replication action is now available across AWS commercial Regions where FIS is available. To learn more, see Amazon DynamoDB actions.
Security
- December 13 – Amazon DynamoDB announces support for FIPS 140-3 interface VPC and Streams endpoints. FIPS-compliant endpoints help companies contracting with the federal government meet the FIPS security requirement to encrypt sensitive data in supported Regions. The new capability is available in Regions in the US and Canada, and the AWS GovCloud (US) Regions. To learn more about AWS FIPS 140-3, refer to Federal Information Processing Standard (FIPS) 140-3.
- November 18 – Amazon DynamoDB announces general availability of attribute-based access control. DynamoDB now supports ABAC for tables and indexes. ABAC is an authorization strategy that defines access permissions based on tags attached to users, roles, and AWS resources. ABAC uses tag-based conditions in your AWS Identity and Access Management (IAM) policies or other policies to allow or deny specific actions on your tables or indexes when IAM principals’ tags match the tags for the tables. Using tag-based conditions, you can also set more granular access permissions based on your organizational structures. ABAC automatically applies your tag-based permissions to new employees and changing resource structures, without rewriting policies as organizations grow. To learn more, see Using attribute-based access control with DynamoDB and Using attribute-based access control for tag-based access authorization with Amazon DynamoDB.
- September 3 – Amazon DynamoDB announces support for Attribute-Based Access Control (preview). ABAC for DynamoDB is available in limited preview in the US East (Ohio), US East (Virginia), and US West (N. California) Regions.
- May 28 – Amazon DynamoDB now supports resource-based policies in the AWS GovCloud (US) Regions. This feature enables customers in AWS GovCloud (US) to use resource-based policies.
- March 20 – Amazon DynamoDB now supports resource-based policies. This feature allows you to specify IAM principals and their allowed actions on tables, streams, and indexes. It simplifies cross-account access control and integrates with AWS IAM Access Analyzer and Block Public Access capabilities. Resource-based policies are available in AWS commercial Regions at no additional cost and can be implemented using various AWS tools. This new feature provides greater flexibility and security in managing access to DynamoDB resources across different AWS accounts. Get started with resource-based policies by using the console, AWS CLI, AWS SDK, AWS CDK, or AWS CloudFormation; a minimal Boto3 sketch appears after this list. To learn more, see Using resource-based policies for DynamoDB.
- March 19 – Amazon DynamoDB now supports AWS PrivateLink, allowing you to connect to DynamoDB over a private network without using public IP addresses. This helps you maintain compliance for your DynamoDB workloads and eliminates the need to configure firewall rules or an internet gateway. AWS PrivateLink for DynamoDB is available in AWS commercial Regions and there is an additional cost to use the feature. To learn more, see Simplify private connectivity to Amazon DynamoDB with AWS PrivateLink.
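As a sketch of the resource-based policies mentioned in the March 20 item, the following Boto3 call attaches a policy to a table; the account ID, role name, table ARN, and actions are placeholders.

```python
# Illustrative sketch: attaching a resource-based policy to a DynamoDB table,
# granting read access to a role in another account.
import json
import boto3

dynamodb = boto3.client("dynamodb")

policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"AWS": "arn:aws:iam::111122223333:role/AnalyticsRole"},
            "Action": ["dynamodb:GetItem", "dynamodb:Query"],
            "Resource": "arn:aws:dynamodb:us-east-1:111122223333:table/Orders",
        }
    ],
}

dynamodb.put_resource_policy(
    ResourceArn="arn:aws:dynamodb:us-east-1:111122223333:table/Orders",
    Policy=json.dumps(policy),
)
```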
Serverless
Documentation
- December 26 – Application integration – Published a new topic on integrating Amazon Managed Streaming for Apache Kafka (Amazon MSK) with DynamoDB. Learn how Amazon MSK integrates with DynamoDB by reading data from Apache Kafka topics and storing it in DynamoDB. For more information, refer to Integrating DynamoDB with Amazon Managed Streaming for Apache Kafka.
- November 18 – Security – Added two new permissions to the `AmazonDynamoDBReadOnlyAccess` managed policy: `dynamodb:GetAbacStatus` and `dynamodb:UpdateAbacStatus`. These permissions allow you to view the attribute-based access control (ABAC) status and enable ABAC for your AWS account in the current Region. For more information, refer to AWS managed policy: AmazonDynamoDBReadOnlyAccess.
- October 16 – Billing – Published two new topics regarding billing for global tables and billing for backups. For more information, refer to Understanding Amazon DynamoDB billing for global tables and Understanding Amazon DynamoDB billing for backups.
- October 11 – Generative AI – Published a new topic that provides information about using generative AI with DynamoDB, including examples of generative AI use cases for DynamoDB.
- September 3 – Application integration – Added documentation for account-based endpoints and the `ACCOUNT_ID_ENDPOINT_MODE` setting for SDK clients. For more information, refer to SDK support of AWS account-based endpoints.
- July 31 – Developer and user experience – Overhauled the Getting started with DynamoDB pages. We combined the AWS CLI and AWS SDK instructions into the same page as the AWS Management Console, so new users getting started with DynamoDB can choose the medium in which they interact with DynamoDB.
- July 2 – Developer and user experience – Restructured and consolidated the DynamoDB backup and restore documentation in the DynamoDB Developer Guide. For more information, refer to Backup and restore for DynamoDB.
- June 3 – DynamoDB Accelerator – Published a new best practices topic that provides you with comprehensive insights for using DAX effectively. This topic covers performance optimization, cost management, and operational best practices. For more information, refer to Prescriptive guidance to integrate DAX with DynamoDB applications.
- May 29 – Developer and user experience – Added a new topic on migrating DynamoDB tables from one account to another. For more information, refer to Migrating a DynamoDB table from one account to another.
- May 7 – Security – Updated the DynamoDB preventative security best practices pages.
- March 6 – Developer and user experience – Added a programming guide for AWS SDK for JavaScript. Learn about the AWS SDK for JavaScript, abstraction layers, configuring connection, handling errors, defining retry policies, managing keep-alive, and more. For more information, refer to Programming Amazon DynamoDB with JavaScript.
- March 5 – Developer and user experience – Created a new programming guide for AWS SDK for Java 2.x that goes in depth about high-level, low-level, and document interfaces, HTTP clients and their configuration, and error handling, and addresses the most common configuration settings that you should consider when using the SDK for Java 2.x. For more information, refer to Programming DynamoDB with AWS SDK for Java 2.x.
- February 26 – Developer and user experience – Allowed developers to use NoSQL Workbench to copy or clone tables between development environments and Regions (DynamoDB Local and DynamoDB web). For more information, refer to Cloning tables with NoSQL Workbench.
- January 11 – Developer and user experience – Created a new guide that goes in depth about both high-level and low-level libraries and addresses the most common configuration settings that you should consider when using the Python SDK. For more information, refer to Programming Amazon DynamoDB with Python and Boto3.
- January 3 – Developer and user experience – Updated Using time to live (TTL) in DynamoDB documentation, with updated code samples in Java, Python, and JavaScript.
Summary
2024 marked a year of considerable advancement for DynamoDB, with major strides in security, cost optimization, and seamless integrations. Key developments like multi-Region strong consistency, reduced pricing for on-demand throughput, and zero-ETL integrations with services like Amazon Redshift and SageMaker Lakehouse have empowered customers to build more resilient, cost-effective, and data-driven applications.
As we look towards 2025, we’re excited to see how you use these new capabilities. Whether you’re an experienced DynamoDB user or just starting your journey, there’s never been a better time to explore what’s possible. We encourage you to try out our new features, and share your experiences with us at @DynamoDB or on AWS re:Post.
Get started with the Amazon DynamoDB Developer Guide and the DynamoDB getting started guide, and join the millions of customers that push the boundaries of what’s possible with DynamoDB.
About the Author
Michael Shao is a Senior Developer Advocate on the Amazon DynamoDB team. Michael has spent over 8 years as a software engineer at Amazon Retail, with a background in designing and building scalable, resilient, and high-performing software systems. His expertise in exploratory, iterative development helps drive his passion for helping AWS customers understand complex technical topics. With a strong technical foundation in system design and building software from the ground up, Michael's goal is to empower and accelerate the success of the next generation of developers.
MMS • RSS
Posted on nosqlgooglealerts. Visit nosqlgooglealerts
Deep-pocketed investors have adopted a bearish approach towards MongoDB (MDB), and it's something market players shouldn't ignore. Our tracking of public options records at Benzinga unveiled this significant move today. The identity of these investors remains unknown, but such a substantial move in MDB usually suggests something big is about to happen.
We gleaned this information from our observations today when Benzinga’s options scanner highlighted 10 extraordinary options activities for MongoDB. This level of activity is out of the ordinary.
The general mood among these heavyweight investors is divided, with 20% leaning bullish and 80% bearish. Among these notable options, 2 are puts, totaling $61,301, and 8 are calls, amounting to $392,891.
Predicted Price Range
After evaluating the trading volumes and Open Interest, it’s evident that the major market movers are focusing on a price band between $175.0 and $280.0 for MongoDB, spanning the last three months.
Analyzing Volume & Open Interest
In terms of liquidity and interest, the mean open interest for MongoDB options trades today is 349.11 with a total volume of 3,948.00.
In the following chart, we are able to follow the development of volume and open interest of call and put options for MongoDB’s big money trades within a strike price range of $175.0 to $280.0 over the last 30 days.
MongoDB Call and Put Volume: 30-Day Overview
Largest Options Trades Observed:
| Symbol | PUT/CALL | Trade Type | Sentiment | Exp. Date | Ask | Bid | Price | Strike Price | Total Trade Price | Open Interest | Volume |
|---|---|---|---|---|---|---|---|---|---|---|---|
| MDB | CALL | TRADE | BEARISH | 02/28/25 | $14.8 | $14.6 | $14.6 | $265.00 | $102.2K | 58 | 51 |
| MDB | CALL | TRADE | BEARISH | 02/28/25 | $12.55 | $12.05 | $12.05 | $270.00 | $60.2K | 26 | 129 |
| MDB | CALL | SWEEP | BULLISH | 02/07/25 | $2.4 | $2.37 | $2.4 | $280.00 | $54.9K | 284 | 2.0K |
| MDB | CALL | TRADE | BEARISH | 03/21/25 | $34.2 | $33.05 | $33.05 | $250.00 | $46.2K | 500 | 16 |
| MDB | CALL | SWEEP | BEARISH | 02/07/25 | $28.0 | $26.95 | $28.0 | $237.50 | $42.0K | 1 | 15 |
About MongoDB
Founded in 2007, MongoDB develops a document-oriented database. The company provides both licenses and subscriptions as a service for its NoSQL database. MongoDB's database is compatible with all major programming languages and is capable of being deployed for a variety of use cases.
Where Is MongoDB Standing Right Now?
- With a trading volume of 588,072, the price of MDB is down 2.36%, reaching $266.87.
- Current RSI values indicate that the stock may be approaching overbought territory.
- Next earnings report is scheduled for 31 days from now.
What The Experts Say On MongoDB
Over the past month, 5 industry analysts have shared their insights on this stock, proposing an average target price of $320.0.
* An analyst from Cantor Fitzgerald has revised its rating downward to Overweight, adjusting the price target to $344.
* An analyst from Guggenheim upgraded its rating to Buy with a price target of $300.
* Consistent in their evaluation, an analyst from Scotiabank keeps a Sector Perform rating on MongoDB with a target price of $275.
* Consistent in their evaluation, an analyst from Barclays keeps an Overweight rating on MongoDB with a target price of $330.
* Reflecting concerns, an analyst from China Renaissance lowers its rating to Buy with a new price target of $351.
Trading options involves greater risks but also offers the potential for higher profits. Savvy traders mitigate these risks through ongoing education, strategic trade adjustments, utilizing various indicators, and staying attuned to market dynamics. Keep up with the latest options trades for MongoDB with Benzinga Pro for real-time alerts.
Market News and Data brought to you by Benzinga APIs
© 2025 Benzinga.com. Benzinga does not provide investment advice. All rights reserved.
Java News Roundup: Java Operator SDK 5.0, Open Liberty, Quarkus MCP, Vert.x, JBang, TornadoVM
MMS • Michael Redlich
Article originally posted on InfoQ. Visit InfoQ
This week’s Java roundup for January 27th, 2025 features news highlighting: the GA release of Java Operator SDK 5.0; the January 2025 release of Open Liberty; an implementation of Model Context Protocol in Quarkus; the fourth milestone release of Vert.x 5.0; and point releases of JBang 0.123.0 and TornadoVM 1.0.10.
JDK 24
Build 34 of the JDK 24 early-access builds was made available this past week featuring updates from Build 33 that include fixes for various issues. Further details on this release may be found in the release notes.
JDK 25
Build 8 of the JDK 25 early-access builds was also made available this past week featuring updates from Build 7 that include fixes for various issues. More details on this release may be found in the release notes.
For JDK 24 and JDK 25, developers are encouraged to report bugs via the Java Bug Database.
TornadoVM
TornadoVM 1.0.10 features bug fixes, compatibility enhancements, and improvements: a new command-line option, `-Dtornado.spirv.runtimes`, to select individual (Level Zero and/or OpenCL) runtimes for dispatching and managing SPIR-V; and support for multiplication of matrices using the `HalfFloat` type. Further details on this release may be found in the release notes.
Spring Framework
The first milestone release of Spring Cloud 2025.0.0, codenamed Northfields, features bug fixes and notable updates to sub-projects: Spring Cloud Kubernetes 3.3.0-M1; Spring Cloud Function 4.3.0-M1; Spring Cloud Stream 4.3.0-M1; and Spring Cloud Circuit Breaker 3.3.0-M1. This release is based upon Spring Boot 3.5.0-M1. More details on this release may be found in the release notes.
Open Liberty
IBM has released version 25.0.0.1 of Open Liberty featuring updated Open Liberty features – Batch API (`batch-1.0`), Jakarta Batch 2.0 (`batch-2.0`), Jakarta Batch 2.1 (`batch-2.1`), Java Connector Architecture Security Inflow 1.0 (`jcaInboundSecurity-1.0`), Jakarta Connectors Inbound Security 2.0 (`connectorsInboundSecurity-2.0`) – to support InstantOn; and a more simplified web module migration with the introduction of the `webModuleClassPathLoader` configuration attribute for the `enterpriseApplication` element that controls what class loader is used for the JARs referenced by a web module `Class-Path` attribute.
Quarkus
The release of Quarkus 3.18.0 provides bug fixes, dependency upgrades and notable changes such as: an integration of Micrometer into the WebSockets Next extension; support for JWT bearer client authentication in the OpenID Connect and OpenID Connect Client extensions using client assertions loaded from the filesystem; and a new extension, OpenID Connect Redis Token State Manager, to store an OIDC token state in a Redis cache datasource. Further details on this release may be found in the changelog.
The Quarkus team has also introduced their own implementation of the Model Context Protocol (MCP) featuring three servers so far: JDBC, Filesystem and JavaFX. These servers have been tested with Claude for Desktop, Model Context Protocol CLI and Goose clients. The team recommends using JBang to run these servers for ease of use, but it isn't required.
Apache Software Foundation
Maintaining alignment with Quarkus, the release of Camel Quarkus 3.18.0, composed of Camel 4.9.0 and Quarkus 3.18.0, provides resolutions to notable issues such as: the Kamelet extension unable to serialize objects from an instance of the `ClasspathResolver`, an inner class defined in the `DefaultResourceResolvers`, to bytecode; and the Debezium BOM adversely affects the unit tests from the Cassandra CQL extension driver since the release of Debezium 1.19.2.Final. More details on this release may be found in the release notes.
Infinispan
The release of Infinispan 15.1.5 features dependency upgrades and resolutions to issues such as: a `NullPointerException` due to a concurrent removal with the `DELETE` statement causing the `cache::removeAsync` statement to return `null`; and an instance of the `HotRodUpgradeContainerSSLTest` class crashing the test suite due to an instance of the `PersistenceManagerImpl` class failing to start. Further details on this release may be found in the release notes.
Java Operator SDK
The release of Java Operator SDK 5.0.0 ships with continuous improvements on new features such as: Kubernetes Server-Side Apply elevated to a first-class citizen with a default approach for patching the status resource; and a change in responsibility with the `EventSource` interface to monitor resources and handle access to cached resources, filtering, and additional capabilities that were once maintained by the `ResourceEventSource` subinterface. More details on this release may be found in the release notes.
JBang
JBang 0.123.0 provides bug fixes, improvements in documentation and new features: the options, such as `add-open` and `exports`, in a bundled `MANIFEST.MF` file are now honored; and the addition of Cursor, the AI code editor, to the list of supported IDEs. Further details on this release may be found in the release notes.
Eclipse Vert.x
The fourth release candidate of Eclipse Vert.x 5.0 delivers notable changes such as: the removal of deprecated classes – `ServiceAuthInterceptor` and `ProxyHelper` – along with two of the overloaded `addInterceptor()` methods defined in the `ServiceBinder` class; and support for the Java Platform Module System (JPMS). More details on this release may be found in the release notes and deprecations and breaking changes.
JHipster
Versions 1.26.0 and 1.25.0 of JHipster Lite (announced here and here, respectively) ship with bug fixes, dependency upgrades and new features/enhancements such as: new datasource modules for PostgreSQL, MariaDB, MySQL and MSSQL; and a restructured state ranking system for modules. Version 1.26.0 also represents the 100th release of JHipster Lite. Further details on these releases may be found in the release notes for version 1.26.0 and version 1.25.0.
MMS • Apoorva Joshi
Article originally posted on InfoQ. Visit InfoQ
Transcript
Srini Penchikala: Hi, everyone. My name is Srini Penchikala. I am the lead director for AI, ML, and the data engineering community at InfoQ website and a podcast host.
In this episode, I’ll be speaking with Apoorva Joshi, senior AI developer advocate at MongoDB. We will discuss the topic of how to develop software applications that use the large language models, or LLMs, and how to evaluate these applications. We’ll also talk about how to improve the performance of these apps with specific recommendations on what techniques can help to make these applications run faster.
Hi, Apoorva. Thank you for joining me today. Can you introduce yourself, and tell our listeners about your career and what areas have you been focusing on recently?
Apoorva Joshi: Sure, yes. Thanks for having me here, Srini. My first time on the InfoQ Podcast, so really excited to be here. I’m Apoorva. I’m a senior AI developer advocate here at MongoDB. I like to think of myself as a data scientist turned developer advocate. In my past six years or so of working, I was a data scientist working at the intersection of cybersecurity and machine learning. So applying all kinds of machine learning techniques to problems such as malware detection, phishing detection, business email compromise, that kind of stuff in the cybersecurity space.
Then about a year or so ago, I switched tracks a little bit and moved into my first role as a developer advocate. I thought it was a pretty natural transition because even in my role as a data scientist, I used to really enjoy writing about my work and sharing it with the community at conferences, webinars, that kind of thing. In this role, I think I get to do both the things that I enjoy. I’m still kind of a data scientist, but I also tend to write and talk a bit more about my work.
Another interesting dimension to my work now is also that I get to talk to a lot of customers, which is something I always wanted to do more of. Especially in the gen AI era, it’s been really interesting to talk to customers across the board, and just hear about the kind of things they’re building, what challenges they typically run into. It’s a really good experience for me to offer them my expertise, but also learn from them about the latest techniques and such.
Srini Penchikala: Thank you. Definitely with your background as a data scientist and a machine learning engineer, and obviously developer advocate working with the customers, you bring the right mix of skills and expertise that the community really needs at this time because there is so much value in the generative AI technologies, but there’s also a lot of hype.
Apoorva Joshi: Yes.
Srini Penchikala: I want this podcast to be about what our listeners should be hyped about in AI, not all about the hype out there.
Let me first start by setting the context for this discussion with a quick background on large language models. The large language models, or LLMs, have been the foundation of gen AI applications. They play a critical role in developing those apps. We are seeing LLMs being used pretty much everywhere in various business and technology use cases. Not only for the end users, customers, but also for the software engineers in terms of code generation. We can go on with so many different use cases that are helping the software development lifecycle. And also, devops engineers.
I was talking to a friend and they are using AI agents to automatically upgrade the software on different systems in their company, and automatically file JIRA tickets if there are issues. Agents are doing all this. They're able to cut down the work for these upgrades from days and weeks; the patching process is down to minutes and hours. Definitely the sky is the limit there, right?
Apoorva Joshi: Yes.
Current State of LLMs [04:18]
Srini Penchikala: What do you see? What’s the current state of LLMs? And what are you seeing in the industry, are they being used, and what use cases are they being applied today?
Apoorva Joshi: I think there’s two slightly different questions here. One is what’s the current state of LLMs, and then application.
To your first point, I’ve been really excited to see the shift from purely text generation models to models that generate other modalities, such as image, audio, and video. It’s been really impressive to see how the quality of these models has improved in the past year alone. There’s finally benchmarks and we are actually starting to see applications in the wild that use some of these other modalities. Yes, really exciting times ahead as these models become more prevalent and find their place in more mainstream applications.
Then coming to how LLMs are being applied today, like you said, agents are the hot thing right now. 2025 is also being touted as the year of AI agents. Definitely seeing that shift in my work as well. Since the past year, we’ve seen our enterprise customers move from basic RAG early or mid last year to building more advanced RAG applications using slightly more advanced techniques, such as hybrid search, parent document retrieval, and all of this to improve the context being passed to LLMs for generation.
Then now, we are also seeing folks further move on to agents, so frequently hearing things like self-querying retrieval, human in the loop agents, multi-agent architectures, and stuff like that.
Srini Penchikala: Yes. You’ve been publishing and advocating about all of these topics, especially LLM-based applications which is the focus of this podcast. We’re not going to get too much into the language models themselves.
Apoorva Joshi: Yes.
Srini Penchikala: But we’ll be talking about how those models are using applications and how we can optimize those applications. This is for all the software developers out there.
LLM-based Application Development Lifecycle [06:16]
Yes, you’ve been publishing about and advocating about how to evaluate and improve the LLM application performance. Before we get into the performance side of discussion, can you talk about what are the different steps involved in a typical LLM-based application, because different applications and different organizations may be different in terms of number of steps?
Apoorva Joshi: Sure. Yes. Thinking of the most common elements, data is the first obvious big one, because LLMs work on some tasks out of the box, but most organizations want them to work on their own data or on domain-specific use cases in industries like healthcare or legal. You need something a bit more than just a powerful language model, and that's where data becomes an important piece.
Then once you have data and you want language models to use that data to inform their responses, that’s where retrieval becomes a huge thing. Which is why things have progressed from just simple vector search or semantic search to some of these more advanced techniques, like again, hybrid search, parent document retrieval, self-querying, knowledge graphs. There’s just so much on that front as well. Then the LLM is a big piece of it if you’re building LLM-based applications.
I think one piece that a lot of companies often tend to miss is the monitoring aspect. Which is when you put your LLM applications into production, you want to be able to know if there’s regressions, performance degradations. If your application is not performing the way it should, so monitoring is the other pillar of building LLM applications.
Srini Penchikala: Sounds good. Once the developers start work on these applications, I think first thing they should probably do is the evaluation of the application.
Apoorva Joshi: Yes.
Evaluation of LLM-based Applications [08:02]
Srini Penchikala: What is the scope? What are the benchmarks? Because the metrics and service level agreements (SLAs) and response times can be different for different applications. Can you talk about evaluation of LLM-based applications, like what developers should be looking for? Are there any metrics that they should be focusing on?
Apoorva Joshi: Yes. I think anything with respect to LLMs is such a vast area because they’ve just opened up the floodgates for being used across multiple different domains and tasks. Evaluation is no different.
If you think of traditional ML models, like classification or regression models, you had very quantifiable metrics that applied to any use case. For classification, you would have accuracy, precision, and recall. Or if you were building a regression model, you had mean squared error, that kind of thing. But with LLMs, all that's out of the window. Now the responses from these models are natural language, or an image, or some generated commodity. The metrics, when it comes to LLMs, are hard to quantify.
For example, if they’re generating a piece of text for a Q&A-based application, then metrics like how coherent is the response, how factual is the response, or what is the relevance of the information provided in the response. All of these become more important metrics and these are unfortunately pretty hard to quantify.
There’s two techniques that I’m seeing in the space broadly. One is this concept of LLM as a judge. The premise there is because LLMs are good at identifying patterns and interpreting natural language, they can be also used as an evaluation mechanism for natural language responses.
The idea there is to prompt an LLM on how you wanted to go about evaluating responses for your specific task and dataset, and then use the LLM to generate some sort of scoring paradigm on your data. I’ve also seen organizations that have more advanced data science teams actually putting the time and effort into creating fine-tuned models for evaluation. But yes, that’s typically reserved for teams that have the right expertise and knowledge to build a fine-tuned model because that’s a bit more involved than prompting.
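As a rough illustration of the LLM-as-a-judge pattern described above, the sketch below builds a scoring prompt and parses a JSON verdict; the `judge_response` helper and the `llm_call` callable are hypothetical placeholders for whichever model client is used.

```python
# Illustrative sketch of LLM-as-a-judge: score an answer for faithfulness and
# relevance on a 1-5 scale. `llm_call` is a placeholder for any chat-completion
# client that takes a prompt string and returns the model's text response.
import json
from typing import Callable

JUDGE_PROMPT = """You are an impartial evaluator. Given a question, retrieved
context, and an answer, rate the answer on faithfulness to the context and
relevance to the question, each from 1 (poor) to 5 (excellent).
Respond only with JSON: {{"faithfulness": <int>, "relevance": <int>}}.

Question: {question}
Context: {context}
Answer: {answer}"""

def judge_response(question: str, context: str, answer: str,
                   llm_call: Callable[[str], str]) -> dict:
    prompt = JUDGE_PROMPT.format(question=question, context=context, answer=answer)
    raw = llm_call(prompt)
    return json.loads(raw)  # e.g. {"faithfulness": 4, "relevance": 5}
```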
Domain-specific Language Models [10:31]
Srini Penchikala: Yes. You mentioned domain-specific models. Do you see, I think this is one of my predictions, that the industry will start moving towards domain-specific language models? Like healthcare would have their own healthcare LLM, and the insurance industry would have their own insurance language model.
Apoorva Joshi: I think that’s my prediction, too. Coming from this domain, I was in cybersecurity, I used to do a lot of that. This was in the world when BERT was supposed to be a large language model. A lot of my work was also on fine-tuning those language models on cybersecurity-specific data. I think that’s going to start happening more and more.
I already see signals for that happening because let’s take the example of natural language to query. That’s a pretty common thing that folks are trying to do. I’ve seen that usually, with prompting or even something like RAG, you can achieve about, I would say, 90 to 95 percent accuracy or recall on slightly complicated tasks. But there’s a small set of tasks that are just not possible by just providing the LLM with the right information to generate responses.
For some of those cases, and more importantly for domain-specific use cases, I think we are going to pretty quickly move towards a world where there’s smaller specialized models, and then maybe an agent that’s orchestrating and helping facilitate the communication between all of them.
LLM Based Application Performance Improvements [12:02]
Srini Penchikala: Yes, definitely. I think it’s a very interesting time not only with these domain-specific models taking shape, and the RAG techniques now, you can use these base models and apply your own data on that. Plus, the agents taking care of a lot of these activities on their own, automation type of tasks. Definitely that’s really good. Thanks, Apoorva, for that.
Regarding the application performance itself, what are the high level considerations and strategies that teams should be looking at before they jump into optimizing or over-optimizing? What are the performance concerns that you see the teams are running into and what areas they should be focusing on?
Apoorva Joshi: Most times, I see teams asking about three things. There’s accuracy, latency, and cost. When I say accuracy, what I really mean is performance on metrics that apply to a particular business use case. It might not be accuracy, it might be, I don’t know, factualness or relevance. But yes, you get the drift. Because that’s how it is, because there are so many different use cases, it really comes down to first determining what your business cares about, and then coming up with metrics that resonate with that use case.
For example, if you’re building a Q&A chatbot, your evaluation parameters would be mainly faithfulness and relevance. But say you’re building a content moderation chatbot, then you care more about recall on toxicity and bias, for example. I think that’s the first big step.
Improvements here could be, again, depend on what you end up finding are the gaps of the model. Say you’re evaluating a RAG system, you would want to evaluate the different components of the system itself first, in addition to the overall evaluation of the system. When you think of RAG, there’s two components, retrieval and generation. You want to evaluate the retrieval performance separately to see if your gap lies in the retrieval strategy itself or do you need a different embedding model. Then you evaluate the generation to see what the gaps on the generation front are, to see what improvements you need to do there.
I think work backwards. Evaluate as many different components of the system as possible to identify the gaps. And then work backwards from there to try out a few different techniques to improve the performance on the accuracy side. Guardrails are an important one to make sure that the LLM is appropriately responding or not responding to sensitive or off-topic questions.
In agentic applications, I’ve seen folks also implement things like self-reflection and critiquing loops to have the LLM reflect and improve upon its own response. Or even human in the loop workflows, too. Get human feedback and incorporate that as a strategy to improve the response.
Maybe I’ll stop there to see if you have any follow-ups.
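To make the idea of evaluating the retrieval component in isolation concrete, here is a small, self-contained sketch that computes recall@k against hand-labeled relevant document IDs; the data is illustrative only.

```python
# Illustrative sketch: evaluating the retrieval component of a RAG system in
# isolation by computing recall@k against hand-labeled relevant document IDs.
def recall_at_k(retrieved_ids: list[str], relevant_ids: set[str], k: int) -> float:
    if not relevant_ids:
        return 0.0
    hits = sum(1 for doc_id in retrieved_ids[:k] if doc_id in relevant_ids)
    return hits / len(relevant_ids)

# Example: for one query, the retriever returned these documents (best first),
# and a human labeled doc-2 and doc-9 as the truly relevant ones.
print(recall_at_k(["doc-2", "doc-7", "doc-1", "doc-9"], {"doc-2", "doc-9"}, k=3))  # 0.5
```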
Choosing Right Embedding Model [15:02]
Srini Penchikala: Yes. No, that’s great. I think the follow-up is basically we can jump into some of those specific areas of the process. One of the steps is choosing the right embedding model. Some of these tools come with … I was trying out the Spring AI framework the other day. It comes with a default embedding model. What do you see there? Are there any specific criteria we should be using to pick one embedding model for one use case versus a different one for a different use case?
Apoorva Joshi: My general thumb rule would be to find a few candidate models and evaluate them for your specific use case and dataset. For text data, my recommendation would be to start from something like the Massive Text Embedding Benchmark (MTEB) on Hugging Face. It's essentially a leaderboard that shows you how different proprietary and open source embedding models perform on different tasks, such as retrieval, classification, and clustering. It also shows you the model size and dimensions.
Yes. I would say choose a few and evaluate for performance and, say latency if that’s a concern for you. Yes, there’s similar ones for multi-modal models as well. Until recently, we didn’t have good benchmarks for multi-modal, but now we have things like MME, which is a pretty good start.
Srini Penchikala: Yes. Could we talk about, real quick, about the benchmarks? When we are switching these different components of the LLM application, what standard benchmarks can we look at or run to get the results and compare?
Apoorva Joshi: I think benchmarks apply to the models themselves more than anything else. Which is why, when you're looking to choose models for your specific use case, you take that with a grain of salt, because of the tasks that are involved in a benchmark. If you look at the MMLU Benchmark, it's mostly a bunch of academic and professional examinations, but that might not necessarily be the task that you are evaluating for. I think benchmarks mostly apply to LLMs, but LLM applications are slightly different.
Srini Penchikala: You said earlier the observability or the monitoring. If you can build it into the application right from the beginning, it will definitely help us pinpoint any performance problems or any latencies.
Apoorva Joshi: Exactly.
Data Chunking Strategies [17:18]
Srini Penchikala: Another technique is how the data is divided or chunked into smaller segments. You published an article on this. Can you talk about this a little bit more, and tell us what are some of the chunking strategies for implementing the LLM apps?
Apoorva Joshi: Sure, yes. I think my disclaimer from before, with LLMs the answer starts from it depends, and then you pick and choose. I think that’s the thumb rule for anything when it comes to LLMs. Pick and choose a few, evaluate on your dataset and use case, and go from there.
Similarly for chunking, it depends on your specific data and use case. For most text, I typically suggest starting with a technique called recursive token splitting with overlap, with, say, a 200-ish token size for chunks. What this does is it has the effect of keeping paragraphs together, with some overlap at the chunk boundaries. This, combined with techniques such as parent document or contextual retrieval, could potentially work well if you're working with mostly text data. Semantic chunking is another fascinating one for text, where you try to find or align the chunk boundaries with the semantic boundaries of your text.
Then there’s semi-structured data, which is data containing a combination of text, images, tables. For that, I’ve seen folks retrieve the text and non-textual components using specialized tools. There’s one called Unstructured that I particularly like. It supports a bunch of different formats and has different specialized models for extracting components present in different types of data. Yes, I would use a tool like that.
Then once you have those different components, maybe chunk the text as you would normally do. Then, two ways to approach the non-textual components. You either maybe summarize the images and tables to get everything in the text domain, or use multi-modal embedding models to embed the non-text elements as is.
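As one concrete example of the recursive token splitting with overlap mentioned above, the sketch below uses LangChain's text splitter; the 200-token chunk size and 30-token overlap are just starting points, not a universal recommendation, and the input file name is a placeholder.

```python
# Illustrative sketch: recursive splitting with a ~200-token chunk size and
# overlapping chunk boundaries, using LangChain's text splitter.
from langchain_text_splitters import RecursiveCharacterTextSplitter

splitter = RecursiveCharacterTextSplitter.from_tiktoken_encoder(
    encoding_name="cl100k_base",
    chunk_size=200,    # ~200 tokens per chunk, as discussed above
    chunk_overlap=30,  # overlap keeps context across chunk boundaries
)

with open("document.txt") as f:  # placeholder input document
    chunks = splitter.split_text(f.read())

print(len(chunks), chunks[0][:80])
```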
Srini Penchikala: Yes, definitely. Because if we take the documents and if we chunk them into too small of segments, the context may be lost.
Apoorva Joshi: Exactly.
Srini Penchikala: If you provide a prompt, the response might not be exactly what you were looking for.
Apoorva Joshi: Right.
RAG Application Improvements [19:40]
Srini Penchikala: What are the other, especially if you’re using a RAG-based application which is probably the norm these days for all the companies … They’re all taking some kind of foundation model and ingesting their company data, incorporating it on top of it. What are the other strategies are you seeing in the RAG applications in terms of retrieval or generation steps?
Apoorva Joshi: There's a lot of them coming every single day, but I can talk about the ones I have personally experimented with. The first one would be hybrid search. This is where you combine the results from multiple different searches. It's commonly a combination of full text and vector search, but it doesn't have to be that. It could be vector and graph-based. But the general concept of that is that you're combining results from multiple different searches to get the benefits of both.
This is useful in, say ecommerce applications for example, where users might search for something very specific. Or include keywords in their natural language queries. For example, “I’m looking for size seven red Nike running shoes”. It’s a natural language query, but it has certain specific points of focus or keywords in them. An embedding model might not capture all of these details. This is where combining it with something like a full text search might make sense.
Then there’s parent document retrieval. This is where you embed and store small chunks at storage and ingest time, but you fetch the full source document or larger chunks at retrieval time. This has the effect of providing a more complete context to the LLM while generating responses. This might be useful in cases such as legal case prep or scientific research documentation chatbots where the context surrounding the user’s question can result in more rounded responses.
Finally, there's graph RAG that I've been hearing about a lot lately. This is where you structure and store your data as a knowledge graph, where the nodes can be individual documents or chunks. Edges capture which nodes are related and what the relationship between the nodes is. This is particularly common in specialized domains such as healthcare, finance, or legal, or anywhere multi-hop reasoning, root cause analysis, or causal inference is required.
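One common way to combine results from multiple searches in a hybrid setup is reciprocal rank fusion; the sketch below is a generic, store-agnostic illustration with placeholder result lists.

```python
# Illustrative sketch: reciprocal rank fusion (RRF) for hybrid search, merging
# ranked results from a full-text search and a vector search into one list.
def reciprocal_rank_fusion(result_lists: list[list[str]], k: int = 60) -> list[str]:
    scores: dict[str, float] = {}
    for results in result_lists:
        for rank, doc_id in enumerate(results):
            # Documents ranked highly in any list accumulate more score.
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

text_hits = ["shoe-123", "shoe-987", "shoe-555"]    # keyword search, best first
vector_hits = ["shoe-987", "shoe-123", "shoe-222"]  # semantic search, best first
print(reciprocal_rank_fusion([text_hits, vector_hits]))
```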
Srini Penchikala: Yes, definitely. The graph RAG has been getting a lot of attention lately. The power of knowledge graph in the RAG.
Apoorva Joshi: But that’s the thing. Going back to what you said earlier on, what’s the hype versus what people should be hyped about. I think a lot of organizations have a hard time balancing that too, because they want to be at the bleeding-edge of building these applications. But then sometimes, it might just be overkill to use the hottest technique.
Srini Penchikala: Where should development teams decide, “Hey, we started with an LLM-based application in mind, but my requirements are not a good fit?” What are those, I don’t want to call them limitations, but what are the boundaries where you say, “For now, let’s just go with the standard solution rather than bringing some LLM in to make it more complex?”
Apoorva Joshi: This is not just an LLM thing. Even having spent six years as a data scientist, a lot of times … ML in general, for the past decade or so, it’s just been a buzzword. Sometimes people just want to use it for the sake of using it. That’s where I think you need to bring a data scientist or an expert into the room and be like, “Hey, this is my use case”, and have them evaluate whether or not you even need to use machine learning, or in this case gen AI for it.
Going from traditional to gen AI, there's now more of a preference toward generative AI as well. I think at this point, the decision is, "Can I use a small language model, or just use XGBoost and get away with it? Or do I really need a RAG use case?"
But I think in general, if you want to reason and answer questions in natural language over a repository of text, then I agree, some sort of generative AI approach makes sense. But say you're basically just trying to do classification, or something like anomaly detection or regression, then just because an LLM can do it doesn't mean you should, because it might not be the most efficient thing at the end of the day.
Srini Penchikala: The traditional ML solutions are still relevant, right?
Apoorva Joshi: Yes. For some things, yes.
I do want to say the beauty of LLMs is that it’s made machine learning approachable to everyone. It’s not limited to data scientists anymore. A software engineer or PM, someone who’s not technical, they can just use these models without having to fine-tune or worry about the weights of the model. Yes, I think that results in these pros and cons, in a sense.
Srini Penchikala: Yes, you're right. These LLM models, and the applications that use them, have definitely brought that value to the masses. Now everybody can use ChatGPT or Copilot and get value out of it.
Apoorva Joshi: Yes.
Frameworks and Tools for LLM applications [25:03]
Srini Penchikala: Can you recommend any open source tools and frameworks for our audience to try out LLM applications if they want to learn about them before actually starting to use them?
Apoorva Joshi: Sure, yes. I'm trying to think what the easiest stack would be. If you're looking at strictly open source, and you don't want to put down a credit card just to experiment and build a prototype, then I think you need three things. The first is a model of some sort, whether embedding models or LLMs.
For that, I would say use something like Hugging Face. It's pretty easy to get up and running with their APIs, and you don't have to pay for it. Or if you want to go a bit deeper and try something local, Ollama has support for a whole bunch of open-source models. I like LangGraph for orchestration. It's something LangChain came up with a while ago. A lot of people think it's only an agent orchestration framework, but I have personally used it for just building control flows. I think you could even build a RAG application using LangGraph. It just gives you low-level control over the flow of your LLM application.
For vector databases, if you're looking for something quick, open source, and easy to start with, you could begin with something like Chroma or FAISS for experimentation. But of course, when you move from prototype to production, you would want to consider an enterprise-grade database such as my employer's.
Srini Penchikala: Yes, definitely. For local, just to get started, even Postgres has a vector extension called pgvector.
Apoorva Joshi: Right.
Srini Penchikala: Then there's Qdrant and others. Yes.
Apoorva Joshi: Yes.
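Pulling the pieces of that prototype stack together, here is a minimal local RAG sketch under the assumption that the chromadb and ollama Python packages are installed and a model such as "llama3" has already been pulled with Ollama; the model name and sample documents are placeholders rather than recommendations.

```python
# A minimal sketch of the local, open-source prototype stack described above:
# Chroma as the vector store (using its default embedding function) and an
# Ollama-served model for generation.
import chromadb
import ollama

client = chromadb.Client()
collection = client.get_or_create_collection("docs")
collection.add(
    documents=[
        "Hybrid search combines full-text and vector search results.",
        "Parent document retrieval returns the full source document at query time.",
    ],
    ids=["doc1", "doc2"],
)

question = "What does parent document retrieval return?"
hits = collection.query(query_texts=[question], n_results=1)
context = hits["documents"][0][0]

reply = ollama.chat(
    model="llama3",  # placeholder: any locally available model name
    messages=[{
        "role": "user",
        "content": f"Answer using only this context:\n{context}\n\nQuestion: {question}",
    }],
)
print(reply["message"]["content"])
```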
Srini Penchikala: Do you have any metrics, benchmarks, or resources that teams can look at, like, "Hey, I just want to see what the top 10 or top five LLMs are before I even start work on this?"
Apoorva Joshi: There's an LLM leaderboard similar to, what's the one you were mentioning?
Srini Penchikala: The one I mentioned is Open LLM Leaderboard.
Apoorva Joshi: There's a similar one on Hugging Face that I occasionally look at. It's called the LMSYS Chatbot Arena. It's basically a crowdsourced evaluation of different proprietary and open-source LLMs. I think that's a better thing to look at than just performance on benchmarks, because benchmarks can have data contamination.
Sometimes vendors will actually train their models on benchmark data, so certain models can end up looking better on certain tasks than they actually are. Which is why leaderboards such as the one you mentioned and LMSYS are good, because it's actual people trying these models on real-world prompts and tasks.
Srini Penchikala: Just like anything else, teams should try it out first and then see if it works for their use case and their requirements, right?
Apoorva Joshi: Yes.
Online Resources [27:58]
Srini Penchikala: Other than that, any other additional resources on LLM application performance improvements and evaluation? Any online articles or publications?
Apoorva Joshi: I follow a couple of people and read their blogs. There's this person called Eugene Yan. He's an applied scientist at Amazon. He has a blog, has written extensively about evals, and continues to do extensive research in that area. There's also a group of people in the machine learning community who wrote a white paper titled What We Learned from a Year of Building With LLMs. It's technical practitioners writing based on their experience building with LLMs over the past year. Yes, I generally follow a mix of researchers and practitioners in the community.
Srini Penchikala: Yes, I think that’s a really good discussion. Do you have any additional comments before we wrap up today’s discussion?
Apoorva Joshi: Yes. Our discussion made me realize just how important evaluation is when building any software application, but LLM applications specifically. While LLMs have made ML accessible and usable in so many different domains, what you really need on a day-to-day basis is for the model or application to perform on the use case or task you care about. I think evaluating for what you're building is key.
Srini Penchikala: Also, another key point is that your LLM mileage may vary. It all depends on what you're trying to do, and what the constraints and benchmarks are that you're working towards.
Apoorva Joshi: Exactly.
Srini Penchikala: Apoorva, thank you so much for joining this podcast. It's been great to discuss one of the most important topics in the AI space: how to evaluate LLM applications, how to measure their performance, and how to improve it. These are practical topics that everybody is interested in, not just another Hello World application or ChatGPT tutorial.
Apoorva Joshi: Yes.
Srini Penchikala: Thank you for listening to this podcast. If you'd like to learn more about AI and ML topics, check out the AI, ML, and Data Engineering community page on infoq.com. I also encourage you to listen to the recent podcasts, especially the 2024 AI and ML Trends Report we published last year, and also the 2024 Software Trends Report we published just after the new year. Thank you very much. Thanks for your time. Thanks, Apoorva.
Apoorva Joshi: Yes. Thank you so much for having me.
MMS • Robert Krzaczynski
Article originally posted on InfoQ. Visit InfoQ
OpenAI has launched ChatGPT Gov, a version of its AI-powered chatbot designed specifically for U.S. government agencies. This tailored deployment provides federal, state, and local agencies with access to OpenAI’s latest AI models while allowing them to maintain control over security, privacy, and compliance. Agencies can self-host ChatGPT Gov on Microsoft Azure’s commercial or government cloud, ensuring alignment with stringent federal cybersecurity requirements.
Kevin Weil, chief product officer at OpenAI, emphasized the importance of AI adoption in the public sector:
Enabling the public sector, especially the U.S. Federal government, to leverage ChatGPT is critical to maintaining America’s global leadership in AI. We see enormous potential for these tools to support the public sector in tackling complex challenges—from improving public health and infrastructure to strengthening national security.
The collaboration between OpenAI and Microsoft has played a significant role in bringing ChatGPT Gov to life. Reuben Cleetus, an AI leader, highlighted the significance of this partnership:
Kudos to the amazing team at OpenAI, and the collaboration with Azure and Azure OpenAI that led to the launch of ChatGPT Gov. This initiative underscores Microsoft and OpenAI’s shared commitment to supporting U.S. government agencies in leveraging advanced AI technology to enhance public services and address complex challenges.
ChatGPT Gov includes many of the same capabilities as ChatGPT Enterprise, such as GPT-4o for advanced text interpretation, coding, and analysis, as well as custom GPTs that agencies can create and share internally. Additionally, it offers an administrative console for IT teams to manage access, single sign-on (SSO), and other security settings.
The announcement has also sparked international interest in AI’s role in government services. Stan Wepundi, a founder of Chat Nation, asked whether ChatGPT Gov could be adopted outside the United States, including in Kenya and other countries. Kevin Weil responded that while the service is currently focused on the U.S. government, OpenAI plans to expand it internationally in the future.
However, some experts have raised concerns about the governance and control of AI tools like ChatGPT Gov. Arlando Velho, a strategic account director at Salesforce, commented:
This reads like more of a deployment update than a real shift in AI governance. ChatGPT Gov still relies on OpenAI’s control, offers no real sovereignty, and operates under largely the same terms as enterprise customers. Sensitive Citizen data and national security workloads demand a much higher standard in the context of AI.
With ChatGPT Gov, OpenAI aims to provide a structured, secure way for government agencies to integrate AI while maintaining oversight of data and compliance. The company continues to work with agencies to explore applications in public services, security, and research, ensuring responsible AI deployment within the public sector.
MMS • Johan Janssen
Article originally posted on InfoQ. Visit InfoQ
The Debezium project recently completed its move to the Commonhaus Foundation after consulting with the Debezium community and Red Hat, which had exclusively sponsored the project since its start in 2015.
In a blog post published in early November 2024, Chris Cranford, Principal Software Engineer at Red Hat, described their transition to the foundation, writing:
Commonhaus stands out because of its innovative governance framework and commitment to project independence. This benefits the Debezium community and its collaborators by allowing us to continue to provide the same release cadence and commitment to excellence that we have today. We are thrilled to join other prominent projects at Commonhaus, which includes Hibernate, Jackson, and Quarkus.
Cranford believes that a foundation is best suited to support Debezium's growth and adoption. Moving to a foundation makes contributions from other developers and organizations easier, helping Debezium remain the open-source leader for Change Data Capture.
The open-source Debezium project, written in Java, provides a low-latency distributed platform for Change Data Capture. Debezium can be configured to monitor databases such as MySQL, MongoDB, PostgreSQL, Oracle, IBM Db2, Apache Cassandra, and Microsoft SQL Server, collecting the change events with Kafka and Kafka Connect. Applications can subsequently use the events to react to all the inserts, updates, and deletes in the database, for example to clear or update caches or search indexes.
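As a rough sketch of how a downstream application might react to such change events, the snippet below consumes a hypothetical Kafka topic and evicts cache entries on updates and deletes. The topic name, the field layout (Debezium's default JSON envelope), and the kafka-python client are assumptions for illustration, not part of the Debezium announcement.

```python
# Minimal sketch of an application reacting to Debezium change events from
# Kafka, e.g. invalidating a cache entry whenever a database row changes.
import json
from kafka import KafkaConsumer  # pip install kafka-python

consumer = KafkaConsumer(
    "dbserver1.inventory.customers",   # hypothetical Debezium topic name
    bootstrap_servers="localhost:9092",
    value_deserializer=lambda v: json.loads(v) if v else None,
)

cache = {}  # stand-in for a real cache or search index

for message in consumer:
    if message.value is None:          # tombstone record after a delete
        continue
    payload = message.value.get("payload", message.value)
    op = payload.get("op")             # "c" = create, "u" = update, "d" = delete
    key = (payload.get("after") or payload.get("before") or {}).get("id")
    if op in ("u", "d"):
        cache.pop(key, None)           # evict the stale entry
    if op in ("c", "u"):
        cache[key] = payload["after"]  # refresh with the new row state
```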
Debezium's last major release, version 3.0 in October 2024, introduced a major change: source connectors now require a Java 17 runtime, while Debezium Server, the Debezium Operator, and the Outbox Quarkus Extension require Java 21. Work on Debezium 3.1 is ongoing, and the first alpha version was released in January 2025.
More information about Debezium can be found on GitHub, in the tutorial or in the documentation.
The non-profit Commonhaus Foundation, introduced in March 2024, provides a neutral home for open-source projects. Inspired by the late Codehaus, its focus is to provide a stable, long-term home for open-source projects, with an effective minimum of governance and simplified access to funding. Commonhaus started with the following projects at launch: Hibernate, Jackson, OpenRewrite, JBang, JReleaser, and Morphia. Since then, EasyMock, Feign, Infinispan, Objenesis, Quarkus, SDKMAN! and SlateDB have joined the foundation.
More information about the Commonhaus Foundation is available in the FAQ or by joining the community.
MMS • RSS
Posted on mongodb google news. Visit mongodb google news
Microsoft has unveiled a new platform for document databases built on the relational database system PostgreSQL. This platform is completely free to use and is part of an open-source initiative.
A New Era with DocumentDB
The popularity of document databases, which rely less on rigid database schemas, has surged in recent years. Early innovators like MongoDB championed these technologies as better suited for handling semi-structured data, often found in web-based applications. Microsoft is now joining this movement with DocumentDB, an entirely open platform.
According to a blog post by product manager Abinav Ramesh, DocumentDB has no commercial license fees, no usage restrictions, and doesn't require users to contribute back to the project. Thanks to its permissive open-source license, developers are free to modify, fork, and distribute the software however they see fit.
As part of this platform, Microsoft has created two PostgreSQL extensions specifically optimized for document databases:
- pg_documentdb_core: optimizes PostgreSQL for BSON (Binary JSON), a compact format for storing JSON documents.
- pg_documentdb_api: provides functionality for CRUD operations, queries, and index management.
Microsoft also recommends the open-source solution FerretDB for accessing this platform. FerretDB is an open-source document database that speaks the MongoDB wire protocol on top of PostgreSQL and is widely recognized in both the PostgreSQL and NoSQL communities.
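As an illustrative sketch (not taken from Microsoft's announcement), an application could talk to a FerretDB instance backed by the DocumentDB extensions using a standard MongoDB driver. The connection string, database name, and collection below are placeholders for wherever FerretDB is actually listening.

```python
# Minimal sketch of accessing a FerretDB instance (backed by the DocumentDB
# PostgreSQL extensions) through the standard MongoDB driver.
from pymongo import MongoClient  # pip install pymongo

client = MongoClient("mongodb://localhost:27017/")  # placeholder FerretDB endpoint
db = client["appdb"]

# Documents are stored as BSON by pg_documentdb_core under the hood.
db.products.insert_one({"sku": "sku_104", "name": "running shoes", "size": 7})
print(db.products.find_one({"sku": "sku_104"}))
```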
Innovations in Collaboration
FerretDB 2.0 has significantly enhanced integration with Microsoft’s new technology. Peter Farkas, co-founder and CEO of FerretDB, stated that using the PostgreSQL extension delivers “up to 20x performance improvements for certain workloads.” The introduction of the BSON data type and optimized query operations allows for more efficient data storage and processing.
Microsoft’s entry into the document database space presents a fresh challenge to established players like MongoDB. While MongoDB benefits from a large user base, Microsoft’s open-source approach could appeal to developers embarking on new projects.
With its innovative features and open model, DocumentDB could become a game-changer in the world of document databases, offering developers greater freedom, performance, and flexibility.
Article originally posted on mongodb google news. Visit mongodb google news