MongoDB, Inc. (NASDAQ:MDB) CFO Sells $3,106,797.31 in Stock – MarketBeat

MMS Founder
MMS RSS

Posted on mongodb google news. Visit mongodb google news

MongoDB, Inc. (NASDAQ:MDB) CFO Michael Lawrence Gordon sold 7,577 shares of the company’s stock in a transaction dated Monday, November 27th. The stock was sold at an average price of $410.03, for a total value of $3,106,797.31. Following the sale, the chief financial officer now owns 89,027 shares in the company, valued at approximately $36,503,740.81. The sale was disclosed in a filing with the SEC, which is available through the SEC website.

Michael Lawrence Gordon also recently made the following trade(s):

  • On Wednesday, November 22nd, Michael Lawrence Gordon sold 10,097 shares of MongoDB stock. The stock was sold at an average price of $410.03, for a total value of $4,140,072.91.
  • On Monday, November 20th, Michael Lawrence Gordon sold 21,496 shares of MongoDB stock. The stock was sold at an average price of $410.32, for a total transaction of $8,820,238.72.
  • On Monday, October 2nd, Michael Lawrence Gordon sold 7,394 shares of MongoDB stock. The shares were sold at an average price of $345.20, for a total transaction of $2,552,408.80.
  • On Thursday, September 14th, Michael Lawrence Gordon sold 5,000 shares of MongoDB stock. The shares were sold at an average price of $370.82, for a total value of $1,854,100.00.

MongoDB Price Performance

NASDAQ MDB traded up $14.26 during mid-day trading on Wednesday, reaching $420.51. 1,767,725 shares of the stock were exchanged, compared to its average volume of 1,555,912. The company has a quick ratio of 4.48, a current ratio of 4.48 and a debt-to-equity ratio of 1.29. MongoDB, Inc. has a 1-year low of $137.70 and a 1-year high of $439.00. The business’s fifty day simple moving average is $359.57 and its 200 day simple moving average is $366.44.

MongoDB (NASDAQ:MDB) last issued its quarterly earnings results on Thursday, August 31st. The company reported ($0.63) earnings per share for the quarter, topping analysts’ consensus estimates of ($0.70) by $0.07. The company had revenue of $423.79 million for the quarter, compared to analyst estimates of $389.93 million. MongoDB had a negative return on equity of 29.69% and a negative net margin of 16.21%. As a group, equities analysts forecast that MongoDB, Inc. will post -2.17 earnings per share for the current year.

Institutional Trading of MongoDB

Several hedge funds and other institutional investors have recently added to or reduced their stakes in MDB. Jennison Associates LLC raised its stake in shares of MongoDB by 101,056.3% in the 2nd quarter. Jennison Associates LLC now owns 1,988,733 shares of the company’s stock valued at $817,350,000 after acquiring an additional 1,986,767 shares during the period. 1832 Asset Management L.P. lifted its stake in shares of MongoDB by 3,283,771.0% during the fourth quarter. 1832 Asset Management L.P. now owns 1,018,000 shares of the company’s stock worth $200,383,000 after purchasing an additional 1,017,969 shares in the last quarter. Price T Rowe Associates Inc. MD boosted its position in shares of MongoDB by 13.4% in the first quarter. Price T Rowe Associates Inc. MD now owns 7,593,996 shares of the company’s stock worth $1,770,313,000 after buying an additional 897,911 shares during the period. Norges Bank purchased a new stake in shares of MongoDB during the 4th quarter valued at about $147,735,000. Finally, Champlain Investment Partners LLC purchased a new position in shares of MongoDB in the first quarter valued at $89,157,000. Hedge funds and other institutional investors own 88.89% of the company’s stock.

Wall Street Analysts Forecast Growth

Several brokerages have recently weighed in on MDB. Macquarie raised their target price on MongoDB from $434.00 to $456.00 in a research report on Friday, September 1st. Guggenheim upped their target price on shares of MongoDB from $220.00 to $250.00 and gave the company a “sell” rating in a report on Friday, September 1st. KeyCorp cut their target price on MongoDB from $495.00 to $440.00 and set an “overweight” rating on the stock in a research report on Monday, October 23rd. Tigress Financial upped their target price on MongoDB from $490.00 to $495.00 and gave the company a “buy” rating in a research report on Friday, October 6th. Finally, Wells Fargo & Company initiated coverage on MongoDB in a research report on Thursday, November 16th. They set an “overweight” rating and a $500.00 price target on the stock. One analyst has rated the stock with a sell rating, two have assigned a hold rating and twenty-four have given a buy rating to the company. According to data from MarketBeat, the company presently has a consensus rating of “Moderate Buy” and an average target price of $419.74.

About MongoDB

MongoDB, Inc. provides a general purpose database platform worldwide. The company offers MongoDB Atlas, a hosted multi-cloud database-as-a-service solution; MongoDB Enterprise Advanced, a commercial database server for enterprise customers to run in the cloud, on-premises, or in a hybrid environment; and Community Server, a free-to-download version of its database, which includes the functionality that developers need to get started with MongoDB.

This instant news alert was generated by narrative science technology and financial data from MarketBeat in order to provide readers with the fastest and most accurate reporting. This story was reviewed by MarketBeat’s editorial team prior to publication. Please send any questions or comments about this story to contact@marketbeat.com.

Article originally posted on mongodb google news. Visit mongodb google news


Should You Accumulate Mongodb Inc (MDB) Stock Thursday Morning? – InvestorsObserver

MMS Founder
MMS RSS

Posted on mongodb google news. Visit mongodb google news

Thursday, November 30, 2023 07:06 AM | InvestorsObserver Analysts

Mongodb Inc (MDB) has risen Thursday morning, with the stock gaining 3.32% in pre-market trading to $434.46.

MDB’s short-term technical score of 98 indicates that the stock has traded more bullishly over the last month than 98% of stocks on the market. In the Software – Infrastructure industry, which ranks 31 out of 146 industries, MDB ranks higher than 94% of stocks.

Mongodb Inc has risen 25.04% over the past month, having closed at $329.00 on November 2. During this period, the stock traded as low as $329.00 and as high as $412.67. MDB has an average analyst recommendation of Strong Buy. The company has an average price target of $433.46.

MDB has an Overall Score of 60.

Mongodb Inc has a Long-Term Technical rank of 79. This means that trading over the last 200 trading days has placed the company in the upper half of stocks, with 21% of the market scoring higher. In the Software – Infrastructure industry, which is number 45 by this metric, MDB ranks better than 45% of stocks.

Important Dates for Investors in MDB:

  • Mongodb Inc is set to release earnings on 2023-12-05. Over the last 12 months, the company has reported EPS of -$3.45.
  • We do not have a set dividend date for Mongodb Inc at this time.

Article originally posted on mongodb google news. Visit mongodb google news


Informatica and MongoDB Partner on Cloud-Native Solutions – ReadITQuik

MMS Founder
MMS RSS

Posted on mongodb google news. Visit mongodb google news

Aiming to facilitate a new era of cloud-native master data management (MDM) applications, Informatica and MongoDB have announced an expansion of their global partnership. Under the strategic alliance, Informatica’s leading MDM SaaS solution, paired with MongoDB Atlas, allows customers to build 360-degree, data-driven applications.

This partnership marks a pivotal step towards innovative and comprehensive data utilization. The duo has partnered on financial services, insurance, and healthcare solutions, merging the advantages of MongoDB Atlas with Informatica’s MDM and domain-specific applications. These include Customer 360, Product 360, Supplier 360, and Reference 360.

As businesses increasingly adopt cloud-native technologies, this collaboration improves their ability to derive insights, ensure data accuracy, and build resilient applications. The joint efforts reflect a commitment to pushing the boundaries of cloud-native MDM, offering enterprises a competitive edge in today’s dynamic digital landscape.

Article originally posted on mongodb google news. Visit mongodb google news


Chaos Engineering Service Azure Chaos Studio Now Generally Available

MMS Founder
MMS Sergio De Simone

Article originally posted on InfoQ. Visit InfoQ

Two years after entering public preview, reliability experimentation service Azure Chaos Studio is now generally available. Among its most recent features are experiment templates, dynamic targets, load testing faults, and more.

Chaos Studio is a fully managed service that allows users to apply chaos engineering techniques to experiment with controlled fault injection to assess the reliability of their apps.

Chaos Studio enables users to assess how applications respond to real-world disruptions like network delays, unexpected storage failures, expired secrets, or datacenter outages. Using Chaos Studio, customers can design and conduct experiments with a wide range of agent-based and service-direct faults to better understand how to proactively improve the resilience of their application.

A chaos experiment defines a sequence of actions to execute against your target resources. Additionally, a chaos experiment can define branches of actions that run in parallel with one another.

Since it became available in preview, Azure Chaos Studio has been extended with several new capabilities, including experiment templates, dynamic targets, load testing faults, and improved identity management.

Experiment templates aim to simplify the creation of experiments using pre-filled templates. For example, templates may describe an Azure Active Directory outage, an availability zone going down, or simulating all targets in a zone going down. Each template defines a number of rules specific to the fault as well as more generic ones, such as the experiment duration, allowing you to quickly run common experiments.

When selecting targets affected in an experiment, you can either list them manually, or use the new query-based dynamic targets feature, which allows you to filter targets based on Azure resource parameters including type, region, name, and others.

Load testing faults make it possible to start and stop Azure load testing, which is a service able to generate high-scale loads and simulate traffic for your applications. Azure Load Testing uses Apache JMeter to run load tests and simulate a large number of virtual users accessing your application endpoints at the same time.

To better control who can inject faults into your systems, Azure Chaos Studio has also improved identity management by introducing user-assigned managed identities and custom role assignment. This feature will allow you to create a user-assigned managed identity and explicitly assign it the permissions required to run an experiment beforehand. When creating an experiment, you assign a specific user-assigned managed identity to it and review if that identity has sufficient permissions.

If you want to start practicing chaos experiments with Azure Chaos Studio, you can head to its official documentation and have a look at the Azure Chaos Studio fault and action library.

About the Author


Applying the Analytic Hierarchy Process for Tech Decisions

MMS Founder
MMS Ben Linders

Article originally posted on InfoQ. Visit InfoQ

The analytic hierarchy process uses pairwise comparisons to score the alternatives against each criterion, and the criteria against each other, giving insight into what the best option is and why. John Riviello spoke about applying the analytic hierarchy process to decide what JavaScript framework to use at QCon New York 2023.

When a decision was needed to pivot to a new JavaScript framework for one of their web apps, Riviello came across the Analytic Hierarchy Process (AHP). There were three groups, each with strong opinions on what that new framework should be, as Riviello explained:

The VP of engineering of my organization told me that he trusted the team to make the right decision, but wanted me to explore how I would know the process we would use to arrive at that decision was effective.

Like most decision-making frameworks, you start with some options and criteria to evaluate them, Riviello said. Where the AHP differs from some other, more common processes is that it involves doing pairwise comparisons (i.e., comparing only two items to each other at a time) for each option with respect to each criterion, and also pairwise comparisons of each criterion with respect to your overall goal:

There is a scale it uses from 1 to 9 to compare two items, with 9 being the case where one item is by far superior to the other, and 1 being the two are for all intents and purposes equal.

Once you have all those comparison numbers, it has a formula to give you insights into what the best option is and why, Riviello mentioned.
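As a rough illustration of what that formula boils down to, here is a minimal Java sketch of one common approximation: normalize each column of a pairwise comparison matrix, average the rows to get priority weights, and then combine the criteria weights with the per-criterion option weights. The matrices, labels, and numbers below are hypothetical, and a full implementation would also compute a consistency ratio.

public final class AhpSketch {

    // Approximate the AHP priority vector of a pairwise comparison matrix:
    // normalize each column so it sums to 1, then average across each row.
    static double[] priorities(double[][] m) {
        int n = m.length;
        double[] colSums = new double[n];
        for (int j = 0; j < n; j++) {
            for (int i = 0; i < n; i++) {
                colSums[j] += m[i][j];
            }
        }
        double[] weights = new double[n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                weights[i] += m[i][j] / colSums[j];
            }
            weights[i] /= n;
        }
        return weights;
    }

    public static void main(String[] args) {
        // Hypothetical 1-9 judgments for three criteria (Community, Performance,
        // Developer Productivity); the lower triangle holds the reciprocals.
        double[][] criteria = {
                {1.0, 2.0, 3.0},
                {0.5, 1.0, 3.0},
                {1.0 / 3, 1.0 / 3, 1.0}
        };
        // One pairwise comparison matrix per criterion, comparing three frameworks.
        double[][][] options = {
                {{1, 3, 5}, {1.0 / 3, 1, 3}, {0.2, 1.0 / 3, 1}},
                {{1, 1, 2}, {1, 1, 2}, {0.5, 0.5, 1}},
                {{1, 0.5, 1}, {2, 1, 2}, {1, 0.5, 1}}
        };

        double[] criteriaWeights = priorities(criteria);
        double[] overall = new double[3];
        for (int c = 0; c < criteria.length; c++) {
            double[] optionWeights = priorities(options[c]);
            for (int o = 0; o < overall.length; o++) {
                overall[o] += criteriaWeights[c] * optionWeights[o];
            }
        }
        for (int o = 0; o < overall.length; o++) {
            System.out.printf("Framework %d score: %.3f%n", o + 1, overall[o]);
        }
    }
}

The option with the highest combined score wins, and the per-criterion weights show why it won.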

For their particular case, they used seven criteria to evaluate the three JavaScript frameworks they were considering: Community, Performance, Redux compatibility, Web Components support, Localization, Developer Productivity, and Hybrid App Support, Riviello explained:

After going through the exercise, it was clear Community and Performance were most important to us, with Developer Productivity being a close third, and the others not being as critical to our decision.

AHP is just a tool in the toolbox when it comes to making decisions and getting buy-in, Riviello said. Like any decision, how you go about soliciting feedback, getting approval for, and announcing the decision is critical to how well it will be received and acted upon. Riviello mentioned that the concept of nemawashi, which favors building consensus openly instead of trying to force consensus, goes along well with AHP:

The best way to leverage and communicate decisions that use AHP is to be open about the inputs and the results.

The original JavaScript framework AHP exercise took almost six hours total, but it was six hours well spent to arrive at a common decision that we would be rallying around and living with for years to come, Riviello concluded.

InfoQ interviewed John Riviello about applying the analytic hierarchy process.

InfoQ: How did you adapt the Analytic Hierarchy Process to your needs?

John Riviello: The recommended process is to collect results on the pairwise comparisons via individual surveys and then use those to determine the results. The adjustment we made was we had participants come up with their pairwise comparison numbers in advance, but instead of blindly calculating at that point, we came together and discussed our numbers for each pairwise comparison until we had an agreement on what each pairwise comparison number should be. We adopted this from how we approach applying story points to user stories.

InfoQ: What JavaScript framework did you pick?

Riviello: One JS framework became the clear winner. But I never share what that framework was, because it is important to note that our criteria were specific to our team’s needs at that time, so I’d recommend that others who are looking to follow the process to pick their own JS framework come up with their own criteria and evaluate them.

InfoQ: What came out of the retrospective on using the Analytic Hierarchy Process?

Riviello: Many groups at Comcast have used AHP by this point, and the feedback has always been favorable. Many groups capture Architecture Decision Records (ADRs), and including the charts of AHP results in ADRs can be useful for those that go back to review them to understand why a decision was made.

We’ve also had groups mention that by going through the process, they’ve learned more about other teams (for example when teams that previously did not work together much in the past had to come together to build a new system or update an existing one).

InfoQ: What advice can you give to teams that want to experiment with the Analytic Hierarchy Process?

Riviello: Give it a try! The tool I open sourced, the Analytic Hierarchy Process (AHP) Tool, seeks to make it easy to capture your AHP data and generate visual charts of your decision.

One important thing to note is that when it comes to the number of options or criteria, you do not want to have more than eight; otherwise, the number of pairwise comparisons you have to do will increase dramatically.

About the Author


AWS Announces Amazon Q, a New Generative AI–Powered Assistant

MMS Founder
MMS Daniel Dominguez

Article originally posted on InfoQ. Visit InfoQ

AWS has introduced Amazon Q, a new generative AI-powered assistant designed for professional applications. This assistant is configurable to align with your company’s requirements, facilitating conversations, issue resolution, content generation, and action-taking through the utilization of information present in your code, enterprise systems, and data repositories.

Unveiled at AWS re:Invent 2023 in the keynote address, Amazon Q will give staff the ability to ask questions related to work tasks.

“Amazon Q is a new type of generative AI-powered assistant tailored to your business that provides actionable information and advice in real time to streamline tasks, speed decision making, and spark creativity, built with rock-solid security and privacy,” said Adam Selipsky, CEO at AWS.

Amazon Q can help you get fast, relevant answers to pressing questions, solve problems, generate content, and take actions using the data and expertise found in your company’s information repositories, code, and enterprise systems. When you chat with Amazon Q, it provides immediate, relevant information and advice to help streamline tasks, speed decision-making, and help spark creativity and innovation at work.

Amazon Q also offers a range of capabilities in preview to work with AWS, such as the AWS Management Console, popular IDEs, and more, giving you expert assistance when building, deploying, and operating applications and workloads.

Additionally, contact center personnel can quickly and accurately resolve customer issues with the assistance of Amazon Q in Connect, which provides real-time recommended replies and actions.

“Amazon Q builds on AWS’s history of taking complex, expensive technologies and making them accessible to customers of all sizes and technical abilities,” said Swami Sivasubramanian, vice president of data and AI at AWS.

AI chatbots gained interest when ChatGPT and Bard entered the field of generative AI. OpenAI’s ChatGPT has sparked more competition in the market and increased interest in the development of AI products.

Copilot is well-known in the conversational AI space for incorporating GPT-4 into its chatbot features. By not depending on just one model, Amazon Q adopts a novel strategy. It is strongly linked to Amazon Bedrock and offers access to a variety of AI models, such as Titan from Amazon, Claude from Anthropic, and Llama 2 from Meta.

About the Author


Article: Going Global: A Deep Dive to Build an Internationalization Framework

MMS Founder
MMS Hemanth Murali

Article originally posted on InfoQ. Visit InfoQ

Key Takeaways

  • Internationalization (i18n) is the process in web development of ensuring that software can be adapted for different languages and regions, while localization is the actual adaptation of the software to meet those specific requirements.
  • Though JavaScript-focused i18n libraries (like i18next, react-intl, and react-i18next) are dominant tools in the field, aiding developers in efficiently handling translations and locale-specific configurations, they are only available for JavaScript-based web applications. There is a need for a language-agnostic framework for internationalization.
  • JSON is a widely-accepted format for storing translations and locale-specific configurations, allowing for easy integration and dynamic content replacement in various applications irrespective of the language and framework used.
  • Content Delivery Networks (CDN) can be strategically used to efficiently serve locale-specific configuration files, mitigating potential downsides of loading large configurations.
  • Building and integrating a custom internationalization framework with databases or data storage solutions enables dynamic and context-aware translations, enhancing the user experience for different regions and languages.

Dipping your toes into the vast ocean of web development? You’ll soon realize that the web isn’t just for English speakers — it’s global. Before you’re swamped with complaints from a user in France staring at a confusing English-only error message, let’s talk about internationalization (often abbreviated as i18n) and localization.

What’s the i18n Buzz About?

Imagine a world where your software speaks fluently to everyone, irrespective of their native tongue. That’s what internationalization and localization achieve. While brushing it off is tempting, remember that localizing your app isn’t just about translating text. It’s about offering a tailored experience that resonates with your user’s culture, region, and language preferences.

However, a snag awaits. Dive into the tool chest of i18n libraries, and you’ll notice a dominance of JavaScript-focused solutions, particularly those orbiting React (like i18next, react-intl, and react-i18next).

Venture outside this JavaScript universe, and the choices start thinning out. More so, these readily available tools often wear a one-size-fits-all tag, lacking the finesse to cater to unique use cases.

But fret not! If the shoe doesn’t fit, why not craft one yourself? Stick around, and we’ll guide you on building an internationalization framework from scratch — a solution that’s tailored to your app and versatile across languages and frameworks.

Ready to give your application a global passport? Let’s embark on this journey.

The Basic Approach

One straightforward way to grasp the essence of internationalization is by employing a function that fetches messages based on the user’s locale. Below is an example crafted in Java, which offers a basic yet effective glimpse into the process:

public class InternationalizationExample {

    public static void main(String[] args) {
        System.out.println(getWelcomeMessage(getUserLocale()));
    }

    public static String getWelcomeMessage(String locale) {
        switch (locale) {
            case "en_US":
                return "Hello, World!";
            case "fr_FR":
                return "Bonjour le Monde!";
            case "es_ES":
                return "Hola Mundo!";
            default:
                return "Hello, World!";
        }
    }

    public static String getUserLocale() {
        // This is a placeholder method. In a real-world scenario,
        // you'd fetch the user's locale from their settings or system configuration.
        return "en_US";  // This is just an example.
    }
}

In the example above, the getWelcomeMessage function returns a welcome message in the language specified by the locale. The locale is determined by the getUserLocale method. This approach, though basic, showcases the principle of serving content based on user-specific locales.

However, as we move forward, we’ll dive into more advanced techniques and see why this basic approach might not be scalable or efficient for larger applications.

Pros:

  • Extensive Coverage — Given that all translations are embedded within the code, you can potentially cater to many languages without worrying about external dependencies or missing translations.
  • No Network Calls — Translations are fetched directly from the code, eliminating the need for any network overhead or latency associated with fetching translations from an external source.
  • Easy Code Search — Since all translations are part of the source code, searching for specific translations or troubleshooting related issues becomes straightforward.
  • Readability — Developers can instantly understand the flow and the logic behind choosing a particular translation, simplifying debugging and maintenance.
  • Reduced External Dependencies — There’s no reliance on external translation services or databases, which means one less point of failure in your application.

Cons:

  • Updates Require New Versions — In the context of mobile apps or standalone applications, adding a new language or tweaking existing translations would necessitate users to download and update to the latest version of the app.
  • Redundant Code — As the number of supported languages grows, the switch or conditional statements would grow proportionally, leading to repetitive and bloated code.
  • Merge Conflicts — With multiple developers possibly working on various language additions or modifications, there’s an increased risk of merge conflicts in version control systems.
  • Maintenance Challenges — Over time, as the application scales and supports more locales, managing and updating translations directly in the code becomes cumbersome and error-prone.
  • Limited Flexibility — Adding features like pluralization, context-specific translations, or dynamically fetched translations with such a static approach is hard.
  • Performance Overhead — For high-scale applications, loading large chunks of translation data when only a tiny fraction is used can strain resources, leading to inefficiencies.

Config-Based Internationalization

Building on the previous approach, we aim to retain its advantages and simultaneously address its shortcomings. To accomplish this, we’ll transition from hard-coded string values in the codebase to a config-based setup. We’ll utilize separate configuration files for each locale, encoded in JSON format. This modular approach simplifies the addition or modification of translations without making code changes.

Here’s how a configuration might look for the English and Spanish locales:

Filename: en.json

{
    "welcome_message": "Hello, World"
}

Filename: es.json

{
    "welcome_message": "Hola, Mundo"
}

Implementation in Java:

First, we need a way to read the JSON files. This often involves utilizing a library like Jackson or GSON. For the sake of this example, we’ll use Jackson.

import com.fasterxml.jackson.core.type.TypeReference;
import com.fasterxml.jackson.databind.ObjectMapper;
import java.io.File;
import java.io.IOException;
import java.util.Map;

public class Internationalization {

    private static final String CONFIG_PATH = "/path_to_configs/";
    private final Map<String, String> translations;

    public Internationalization(String locale) throws IOException {
        ObjectMapper mapper = new ObjectMapper();
        translations = mapper.readValue(
                new File(CONFIG_PATH + locale + ".json"),
                new TypeReference<Map<String, String>>() {});
    }

    public String getTranslation(String key) {
        return translations.getOrDefault(key, "Key not found!");
    }
}

class Program {

    public static void main(String[] args) throws IOException {
        Internationalization i18n = new Internationalization(getUserLocale());
        System.out.println(i18n.getTranslation("welcome_message"));
    }

    private static String getUserLocale() {
        // This method should be implemented to fetch the user's locale.
        // For now, let's just return "en" for simplicity.
        return "en";
    }
}

In the code above, the Internationalization class reads the relevant JSON configuration for the provided locale when it is instantiated. The getTranslation method then fetches the desired translated string by its identifier.

Pros:

  • Retains all the benefits of the previous approach — It offers extensive coverage, no network calls for translations once loaded, and the code remains easily searchable and readable.
  • Dynamic Loading — Translations can be loaded dynamically based on the user’s locale. Only necessary translations are loaded, leading to potential performance benefits.
  • Scalability — It’s easier to add a new language. Simply add a new configuration file for that locale, and the application can handle it without any code changes.
  • Cleaner Code — The logic is separated from the translations, leading to cleaner, more maintainable code.
  • Centralized Management — All translations are in centralized files, making it easier to manage, review, and update. This approach provides a more scalable and cleaner way to handle internationalization, especially for larger applications.

Cons:

  • Potential for Large Config Files — As the application grows and supports multiple languages, the size of these config files can become quite large. This can introduce a lag in the initial loading of the application, especially if the config is loaded upfront.

Fetching Config from a CDN

One way to mitigate the downside of potentially large config files is to host them on a Content Delivery Network (CDN). By doing so, the application can load only the necessary config file based on the user’s locale. This ensures that the application remains fast and reduces the amount of unnecessary data the user has to download. As the user switches locales or detects a different locale, the relevant config can be fetched from the CDN as required. This provides an optimal balance between speed and flexibility in a high-scale application. For simplicity, let’s consider you’re using a basic HTTP library to fetch the config file. We’ll use the fictional HttpUtil library in this Java example:

import org.json.JSONObject;

public class InternationalizationService {

    private static final String CDN_BASE_URL = "https://cdn.example.com/locales/";

    public String getTranslatedString(String key) {
        String locale = getUserLocale();
        String configContent = fetchConfigFromCDN(locale);
        JSONObject configJson = new JSONObject(configContent);
        return configJson.optString(key, "Translation not found");
    }

    private String fetchConfigFromCDN(String locale) {
        String url = CDN_BASE_URL + locale + ".json";
        return HttpUtil.get(url);  // Assuming this method fetches content from a given URL
    }

    private String getUserLocale() {
        // Implement method to get the user's locale
        // This can be fetched from user preferences, system settings, etc.
        return "en";  // Defaulting to English for this example
    }
}

Note: The above code is a simplified example and may require error handling, caching mechanisms, and other optimizations in a real-world scenario.

The idea here is to fetch the necessary config file based on the user’s locale directly from the CDN. The user’s locale determines the URL of the config file, and once fetched, the config is parsed to get the required translation. If the key isn’t found, a default message is returned. The benefit of this approach is that the application only loads the necessary translations, ensuring optimal performance.

Pros:

  • Inherits all advantages of the previous approach.
  • Easy to organize and add translations for new locales.
  • Efficient loading due to fetching only necessary translations.

Cons:

  • The per-locale config files can still grow large, which might slow the initial load.
  • Strings must be static. Dynamic strings, or strings that require runtime computation, aren’t supported directly. This can be a limitation if you need to insert dynamic data within your translations.
  • Dependency on an external service (CDN). If the CDN fails or has issues, the application loses its ability to fetch translations.

However, to address the cons: the first can be mitigated by keeping the config files on a CDN and loading each one only as required; the second can be managed by using placeholders in the static strings and replacing them at runtime based on context; the third requires a robust error-handling mechanism and potentially some fallback strategies.

Dynamic String Handling

A more flexible solution is required for situations where parts of the translation string are dynamic. Let’s take Facebook as a real-life example. In News Feed, you may have seen custom strings to represent the “Likes” for each post. If there is only one like on a post, you may see the string “John likes your post.” If there are two likes, you may see “John and David like your post.” If there are more than two likes, you may see “John, David and 100 others like your post.” In this use case, there are several customizations required. The verbs “like” and “likes” are used based on the number of people who liked the post. How is this done?

Consider the example: “John, David and 100 other people recently reacted to your post.” Here, “David,” “John,” “100,” “people,” and “reacted” are dynamic elements.

Let’s break this down:

  • “David” and “John” could be user names fetched from some user-related methods or databases.
  • “100” could be the total number of people reacting to the post, excluding David and John, fetched from some post-related methods or databases.
  • “people” could be the plural form of the noun person when referring to a collective group.
  • “reacted” could be used when users respond to the post with the heart, care, or anger icon instead of liking it.

One way to accommodate such dynamic content is to use placeholders in our configuration files and replace them at runtime based on context.

Here’s a Java example:

Configuration File (for English locale):

{
      "oneUserAction": "{0} {1} your post",
      "twoUserAction": "{0} and {1} {2} your post",
      "multiUserAction": "{0}, {1} and {2} other {3} recently {4} to your post",
      "people": "people",
      "likeSingular": "likes",
      "likePlural": "like"
}

Configuration File (for French locale):

{
      "oneUserAction": "{0} {1} votre publication",
      "twoUserAction": "{0} et {1} {2} votre publication",
      "multiUserAction": "{0}, {1} et {2} autres {3} ont récemment {4} à votre publication",
      "people": "personnes",
      "likeSingular": "aime",
      "likePlural": "aiment"
}

Java Implementation:

import java.text.MessageFormat;
import java.util.Locale;
import java.util.ResourceBundle;

public class InternationalizationExample {

    public static void main(String[] args) {
        // Examples
        System.out.println(createMessage("David", null, 1, new Locale("en", "US"))); // One user
        System.out.println(createMessage("David", "John", 2, new Locale("en", "US"))); // Two users
        System.out.println(createMessage("David", "John", 100, new Locale("en", "US"))); // Multiple users

        // French examples
        System.out.println(createMessage("David", null, 1, new Locale("fr", "FR"))); // One user
        System.out.println(createMessage("David", "John", 2, new Locale("fr", "FR"))); // Two users
        System.out.println(createMessage("David", "John", 100, new Locale("fr", "FR"))); // Multiple users
    }

    private static String createMessage(String user1, String user2, int count, Locale locale) {
        // Load the appropriate resource bundle. This assumes the keys shown in the
        // configuration above are available to the bundle for the given locale.
        ResourceBundle messages = ResourceBundle.getBundle("MessagesBundle", locale);

        if (count == 0) {
            return ""; // No likes received
        } else if (count == 1) {
            // MessageFormat (not String.format) handles the {0}-style placeholders used above.
            return MessageFormat.format(
                  messages.getString("oneUserAction"),
                  user1,
                  messages.getString("likeSingular")
            ); // For one like, returns "David likes your post"
        } else if (count == 2) {
            return MessageFormat.format(
                  messages.getString("twoUserAction"),
                  user1,
                  user2,
                  messages.getString("likePlural")
            ); // For two likes, returns "David and John like your post"
        } else {
            return MessageFormat.format(
                  messages.getString("multiUserAction"),
                  user1,
                  user2,
                  count,
                  messages.getString("people"),
                  messages.getString("likePlural")
            ); // For more than two likes, fills the multiUserAction template with the names, the count, and the localized words
        }
    }
}

Conclusion

Developing an effective internationalization (i18n) and localization (l10n) framework is crucial for software applications, regardless of size. This approach ensures your application resonates with users in their native language and cultural context. While string translation is a critical aspect of i18n and l10n, it represents only one facet of the broader challenge of globalizing software.

Effective localization goes beyond mere translation, addressing other critical aspects such as writing direction, which varies in languages like Arabic (right-to-left), and text length or size, since languages like Tamil may feature longer words than English. By carefully tailoring these strategies to your specific localization needs, you can deliver a truly global and culturally sensitive user experience for your software.

About the Author


Presentation: PostgresML: Leveraging Postgres as a Vector Database for AI

MMS Founder
MMS Montana Low

Article originally posted on InfoQ. Visit InfoQ

Transcript

Low: Putting all of the machine learning algorithms, models, and everything else into the database didn’t always sound like a good idea to me. A lot of this is based on things that I learned working at Instacart over the last decade, trying to scale our machine learning infrastructure, our data infrastructure, and our real-time inference systems. When I got to Instacart, one of the first things I did actually was helped pull all of our product catalog data out of our monolithic Postgres database that was already hitting scalability constraints 8, 9 years ago. I moved all of that into Elasticsearch, so that we would have this beautiful, horizontally scalable, amazing natural language processing framework. That actually carried the company for about the next 5 years. It became the real heart of our data infrastructure. It allowed us to grow the business to a multibillion-dollar revenue generating enterprise. Our Elasticsearch cluster grew to several hundred nodes. It was powering several thousand stores worth of data with hundreds of thousands of personal shoppers and millions of customers using it on a regular basis. We were doing everything with that cluster. We were taking our machine learning embeddings and putting it in there. We were putting our feature store data in there. Some of the JSON blobs for our Elastic documents reached the size of megabytes.

Over time, this became slightly less tenable. Once you have the god object in your data architecture that everybody wants to put something in and get something out of, there’s all kinds of organizational problems. There’s also all kinds of scalability problems, regardless of the technology. You really need tight control over everything going on. The crucible moment, I think, for me came during the COVID pandemic, when most of the world moved from an offline grocery experience to an online grocery experience. We were already a multibillion-dollar company. Then we started doubling on a weekly basis. At that point, everything you know about scalability goes out the window, everything you know about engineering best practices goes out the window, because the company is going to die if you can’t actually scale double in a week. Then you have to do it again the next week. You find all of these microservices that had their individual feature stores based on Redis, or Cassandra, or Druid, or we’re talking directly to Snowflake, all of the concurrency issues that you thought you would slowly deal with, all come to a head.

What we did was, we had a Postgres cluster, and we had learned a lot about scaling Postgres in the intervening years since I got there: you can get pretty far with read replicas. Especially in a machine learning context, where a few seconds of data latency is actually pretty state of the art when you start thinking about Kafka and Flink streaming, you’re used to tolerating that, and to not having ACID compliant transactions. We can actually scale Postgres in a very similar manner horizontally. We get really good control of the sharding capabilities of our database, so that we can very specifically target exactly what criteria we need. We abstracted all of that logic that we built internally at Instacart into a different project called PgCat that sits in front of a massive cluster of Postgres databases to help scale it horizontally. Just know that that’s always the other half of the project that I rarely talk about, because scaling Postgres is so important when you’re talking about machine learning.

Programming

Talking about machine learning, when we think about machine learning, a lot of people in the world think that it’s this mystical, dark art. I think that’s really unfortunate. I think that we should try to take an engineering-first approach, because I think engineering is actually where many of the hard problems in deploying machine learning systems are. I think engineers are very used to dealing with systems as black boxes: you have inputs, you expect outputs. It’s just a function in the middle. A machine learning model is just a function. It just takes inputs. It just produces outputs. You don’t need to know how it works to use it. We use hundreds of APIs, hundreds of functions, hundreds of SDKs without really knowing how they work internally, and we do this very effectively. We have unit tests. We have integration tests. We have end-to-end QA. We can deal with machine learning systems very similarly and get very far. This is an example. It’s very contrived. It’s just a function that takes a birthdate and it returns a float. This is implemented in Python. I hope that I don’t have any bugs in my code. It gives you a rough idea of what we’re talking about.
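As a stand-in for the slide code (written here in Java rather than Python), a minimal version of that black-box function could look like this: a birthdate goes in, a number comes out, and callers never need to see the inside.

import java.time.LocalDate;
import java.time.temporal.ChronoUnit;

public class AgeFunction {

    // A black box: birthdate in, float out. Callers do not need to know how it works.
    public static float age(LocalDate birthdate) {
        return ChronoUnit.DAYS.between(birthdate, LocalDate.now()) / 365.25f;
    }

    public static void main(String[] args) {
        System.out.println(age(LocalDate.of(1980, 6, 15)));
    }
}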

Machine Learning

We can actually pretend that we’re a machine: we don’t understand birthdates. We don’t understand age as a concept. If we’re trying to teach a machine how to calculate an age from a birthdate, we’ll take a data-driven approach. We’ll collect some samples from the audience. Somebody is born in 1960. Somebody is born in 1980. Somebody is born in the year 2000. We can ask them, what is your age? We don’t know how they’re calculating their age. It’s some internal mystical black box that they know how old they are, and they know when they were born. They’ll actually do some computation very similar to that Python function. We don’t need to know that. We can just put their data in a lookup table, and now our function implementation doesn’t need to know anything about dates or ages. You pass in a birthdate. It will look up that birthdate in our data table and will return the answer. This is actually pretty terrible, because most birthdates won’t have an exact match in our table, and so we’ll get a not-found error, rather than actually returning a useful age. This is where machine learning comes into play. What we can do with this very meager sample of data is we want to generalize. We want to be able to tell people their age in years, given any birthdate, not just some birthdate that somebody’s already told us the answer for.
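A literal Java version of that lookup-table idea, with made-up sample birthdates, shows exactly why it falls over: any birthdate that was never observed produces a not-found error instead of an age.

import java.time.LocalDate;
import java.util.Map;

public class LookupTableAge {

    // The three samples collected from the audience; the exact dates are made up here.
    private static final Map<LocalDate, Integer> SAMPLES = Map.of(
            LocalDate.of(1960, 1, 1), 63,
            LocalDate.of(1980, 1, 1), 43,
            LocalDate.of(2000, 1, 1), 23);

    public static int age(LocalDate birthdate) {
        Integer age = SAMPLES.get(birthdate);
        if (age == null) {
            // Most birthdates were never observed, so the lookup fails.
            throw new IllegalArgumentException("not found: " + birthdate);
        }
        return age;
    }
}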

One of the simplest machine learning algorithms is linear regression. All linear regression is, is you take your three data points that you have, you put them on a plot graph. You draw a line through them. If you remember the slope-intercept form of a line from early algebra in high school, it’s y equals mx plus b. In this case, we know that every year you get 1 year older, and it happens to be the year 2023 today, so m is negative one, b is 2023. We’ve now solved the equation for the linear regression that passes through these points. You don’t need to know how linear regression actually works, or how it is actually implemented. That’s just another function call. It’ll give you this data. There’s libraries that do this, so it’s all implemented. We can now rewrite our Python function. We have a couple constants, we have m and b. You can now pass in your datetime. The year of the datetime is the only feature we care about in our very simple model. We can now multiply that by m and add b. Now we’ve actually generalized our three data points into predicting the correct age from any birthdate without really knowing anything about ages or birthdates. If you start to think about all the hidden functions in our applications, from a user perspective, we don’t know why users behave the way that they do or what they want. We can start to gather data about their behavior. That behavior can start explaining these hidden functions to us, and we can model them. Then we can actually generalize those models across populations. Machine learning is a very powerful technique when you’re dealing in a murky environment.
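Again as a stand-in for the slide code (rendered here in Java rather than Python), the rewritten function is just the two learned constants applied to the single feature:

public class AgeModel {

    // Parameters fitted to the three sample points: age = m * year + b
    private static final double M = -1.0;
    private static final double B = 2023.0;

    public static double predictAge(int birthYear) {
        // The year is the only feature this very simple model cares about.
        return M * birthYear + B;
    }

    public static void main(String[] args) {
        System.out.println(predictAge(1980)); // 43.0
    }
}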

Neural Networks, Deep Learning, and LLMs

To move this forward beyond the simplest linear regression, neural networks, deep learning, and LLMs are a much more advanced topic. This is a diagram of a very simple neural network. It has three layers. The three layers are the inputs, the hidden layer, and the output. This is just a function. It takes three inputs, it produces one output. What happens in the middle is a black box, nobody needs to know. I’ll walk through an example. Just for reference, all machine learning models only operate on math, they only take numbers. What we’ve seen lately is that LLMs, they take text. How does that work? You start with your words, and you assign all of your words an ID. A is the first word in our dictionary, it gets the ID number 1. Upon is the 74th word in our dictionary, so it gets the ID number 74, and so on and so forth, until you have all of the words in your dictionary assigned numbers. Then you multiply those numbers, you add those numbers.

Every single line in this graph represents a function very similar to linear regression. There’s hundreds of different functions and hundreds of different ways that you can implement those lines. You don’t need to know that right now. You don’t need to be a machine learning expert to know that this is just math. It’s a lot of math. There are a lot of lines, and that’s why GPUs can actually execute all of those lines in parallel: they’re all independent. What you’ll get is just some more magic numbers in an array. In the middle, in the hidden layer, you just repeat that process with more lines, more functions, more math, and you’ll get an output. The output is just some numbers, a magic number like 42. To actually understand what 42 means, we look it up in the dictionary, and we get the word time. Now we have this model that, given the three inputs once, upon, and a, is predicting that the next word in the sentence is time. This is how LLMs work. This is the magic that they do.
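To make the “it’s just math” point concrete, here is a toy forward pass with made-up weights and a four-word vocabulary. Real models have billions of weights and nonlinearities between the layers, but the shape of the computation is the same: look up word IDs, multiply through the weight matrices, and map the highest-scoring output back to a word.

import java.util.Arrays;
import java.util.List;

public class ToyForwardPass {

    // Multiply a vector by a weight matrix: out[j] = sum_i in[i] * w[i][j].
    // Every line in the network diagram is one of these multiply-and-add terms.
    static double[] matVec(double[] in, double[][] w) {
        double[] out = new double[w[0].length];
        for (int i = 0; i < in.length; i++) {
            for (int j = 0; j < out.length; j++) {
                out[j] += in[i] * w[i][j];
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // A tiny made-up dictionary mapping words to IDs and back.
        List<String> vocab = List.of("a", "once", "upon", "time");
        double[] inputIds = {1, 2, 0}; // "once upon a" as numbers

        // Made-up weights; training is the process of finding values that give right answers.
        double[][] inputToHidden = {{0.10, -0.20}, {0.05, 0.30}, {0.40, 0.01}};
        double[][] hiddenToOutput = {{0.1, 0.2, -0.3, 0.9}, {0.4, -0.1, 0.2, 0.8}};

        double[] hidden = matVec(inputIds, inputToHidden);   // the hidden layer (an embedding)
        double[] output = matVec(hidden, hiddenToOutput);

        // The highest-scoring output position maps back to a word in the dictionary.
        int best = 0;
        for (int j = 1; j < output.length; j++) {
            if (output[j] > output[best]) best = j;
        }
        System.out.println("hidden state: " + Arrays.toString(hidden));
        System.out.println("predicted next word: " + vocab.get(best));
    }
}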

For this talk, we’re going to focus on embeddings and vectors. What are embeddings and vectors? An embedding is just that hidden layer. This is some mystical intermediate representation that the model has. State of the art research doesn’t really understand why these numbers are the way that they are. They are the way that they are because that’s what gets the right answer, is basically what it boils down to. There are lots of clever ways to figure out how to generate those numbers to make sure that you are getting the right answer. Again, we don’t need to know any of that: it’s just a black box, it’s just a function.

The very cool thing is that we’ll have lots of various ways to start a story, like, once upon a, that may all be similar to us. They will also be similar in this hidden layer, in this embedding, even though they may use completely different words. It may be a completely different phrase. It may be a completely different language. As long as the model has been trained well, then the embedding, the intermediate representation of that language will be very similar. When I say similar, I just mean like, by Euclidean geometry, like it’ll be some number close to negative 3, it will be some number close to 23 in the second box. We have lots of ways to measure similarity of large arrays, you can do the Manhattan distance, since we’re in New York, or you might choose the dot product, or you might choose cosine similarity. These are all just for loops to implement these things. Again, modern processors can do a lot of those computations very quickly, and tell you how similar all of these things are.
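Those similarity measures really are just loops over two arrays; a minimal Java sketch:

public final class VectorDistances {

    // Dot product: multiply element-wise and sum.
    static double dot(double[] a, double[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) sum += a[i] * b[i];
        return sum;
    }

    // Manhattan distance: sum of absolute differences.
    static double manhattan(double[] a, double[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) sum += Math.abs(a[i] - b[i]);
        return sum;
    }

    // Cosine similarity: dot product divided by the product of the magnitudes.
    static double cosine(double[] a, double[] b) {
        return dot(a, b) / (Math.sqrt(dot(a, a)) * Math.sqrt(dot(b, b)));
    }
}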

The reason you need an embedding database, or might want an embedding database: just think about the Elasticsearch case that I mentioned before. You have hundreds of thousands, or millions, or even billions of documents, and they’re all text. In traditional search, you’re going to do keyword matching against an inverted index using English language. If somebody uses the wrong word, if they say, vanilla ice cream versus old fashioned vanilla, you may not actually get the right keyword match, and you may return the wrong product. Synonym matching is something that embeddings are very good at. You can actually take all of your natural language documents, and you run them all through a neural network like this. You generate the embedding for that document. Then you save just that embedding, just that array, in your database. Pretty much every database I know of has an array datatype for storage. It’s a pretty primitive data type. Pretty much every programming language supports arrays; even garbage collected runtime languages like Python can implement operations on arrays very quickly, especially with optimized libraries like NumPy and pandas. This is not new stuff. Even though vector databases seem very new today, people have been doing these things for decades. A lot of the functionality you need has actually been baked into hardware by Intel, and by NVIDIA, because it is such a generally useful thing to be able to add up the numbers in two arrays. Everybody needs to do it for all kinds of things.

If you have a database, and it’s full of these arrays, with millions of them, and then you have a user query coming in, you can also generate the embedding of that user query. Now you can actually calculate the distance between that user embedding array and each and every one of the arrays in your database, and you’ll get some distance value for all of those. You can sort that list and find the one with the smallest distance; that will give you the array. That array maps back to a text document, and now you have the English language document that is most similar to the English language query coming in. Again, this is just a function. You take a single input, which is English language, you produce a single output, which is English language, even though it’s all math under the hood.
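A brute-force version of that search is only a few loops; cosine similarity stands in here for whichever distance you prefer, and real systems add vector indexes so they do not have to scan every row.

import java.util.List;

public final class NearestDocument {

    record Doc(String text, double[] embedding) {}

    static double cosine(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Score every stored embedding against the query embedding and keep the most similar document.
    static Doc mostSimilar(double[] queryEmbedding, List<Doc> docs) {
        Doc best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (Doc doc : docs) {
            double score = cosine(queryEmbedding, doc.embedding());
            if (score > bestScore) {
                bestScore = score;
                best = doc;
            }
        }
        return best;
    }
}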

Embeddings

I’ve just given you a pretty high-level description of how machine learning works. You can basically forget all of that now, because if you want to generate an embedding, you don’t really need to know how any of it actually happens. We’ve created a very simple function in PostgresML. If you install PostgresML on your Postgres database, you can select the pgml.embed function, and you pass it two arguments. The first argument is the model_name. This can be any model published on Hugging Face. There are hundreds of open source models that will generate embeddings. This can be a model that you’ve trained yourself if you are a natural language processing expert. Then it, of course, takes the text that you want to create an embedding for. You can use this function both to index all of your documents and create embeddings. This is the output of that function for a single call. It’s always a vector. Again, a vector is just a massive blob of numbers. Keep in mind, though, that when you’re talking about these embedding vectors, typically they’re on the scale of 1000 floats each. If you’re talking about a million documents that you’re creating embeddings for, that’s a billion floats that you need to store. That’s gigabytes of data. These things can quickly grow large. You do need to be thoughtful about how you’re storing them, how you’re indexing them.
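Because PostgresML exposes this as plain SQL, any client can call it. Here is a hedged JDBC sketch; the connection string, credentials, and example model name are placeholders rather than anything prescribed by PostgresML, and the PostgreSQL JDBC driver is assumed to be on the classpath.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class EmbedExample {

    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://localhost:5432/postgresml", "user", "password");
             PreparedStatement stmt = conn.prepareStatement(
                "SELECT pgml.embed(?, ?)::text AS embedding")) {

            // pgml.embed(model_name, text), as described in the talk.
            stmt.setString(1, "intfloat/e5-small"); // any embedding model published on Hugging Face
            stmt.setString(2, "once upon a time");

            try (ResultSet rs = stmt.executeQuery()) {
                if (rs.next()) {
                    // The result is just a big array of floats, printed here as text.
                    String embedding = rs.getString("embedding");
                    System.out.println(embedding.substring(0, Math.min(80, embedding.length())) + "...");
                }
            }
        }
    }
}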

Storage

Postgres makes this, again, very simple. If you have a documents table, that table would normally have a text column that represents the body of the document. We can add a second column to that table to hold the embedding. In this case, the embedding is a vector with 768 elements in it. These vectors need to be sized, and it’s really nice to have a typed schema in Postgres where you can check the correctness of everything. For this particular vector column, Postgres has this really nice feature where you can say that a column is generated. In this case, the generation of this embedding column is our pgml.embed function. Anybody who inserts some text into this documents table is automatically going to create an embedding alongside that document. They don’t even need to know that they’re creating an embedding. It eliminates a lot of issues with data staleness. If they update the document, the embedding is also regenerated and updated for them.
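
A minimal sketch of that table, assuming the pgvector and pgml extensions are installed and that pgml.embed can be used inside a generated column the way the slide shows:

    -- Documents table with a generated embedding column: anyone who inserts or
    -- updates the body automatically gets a fresh 768-element embedding.
    CREATE EXTENSION IF NOT EXISTS vector;
    CREATE EXTENSION IF NOT EXISTS pgml;

    CREATE TABLE documents (
        id        BIGSERIAL PRIMARY KEY,
        body      TEXT NOT NULL,
        embedding vector(768) GENERATED ALWAYS AS (
            pgml.embed('hkunlp/instructor-xl', body)::vector(768)
        ) STORED
    );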

This is an example of a slightly newer model. It’s the Hong Kong University natural language processing instructor-xl model. At the time I made these slides, it was the leading model; it’s no longer the leading model for embeddings. These things change on a weekly basis. Being able to just swap out your model_name and regenerate all your embeddings is actually really nice. This model is interesting because it takes a prompt for how to actually generate the embedding, similar to all the prompt engineering that people are doing with other large language models. It’s worth calling out that everything is moving very quickly now. You need a lot of flexibility, and you need to constantly be updating dependencies. Python dependencies are not fun to maintain; there’s a very large operational burden there. If you’re doing machine learning on a bunch of laptops for your data scientists, it’s very hard to make sure that every data scientist in a large organization has the latest updates and that their laptops are working. On the other hand, if there’s one large central database cluster, that thing can be managed very effectively by an operational team who has complete control. Then the data scientists can still get access to the latest models and to everything they need to do their work.
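
For instruction-tuned embedding models like instructor-xl, PostgresML lets you pass extra keyword arguments as JSON. Treat the exact argument shape below as an assumption to check against your PostgresML version; the instruction text itself is just a made-up example:

    -- Embed with an instruction-style prompt (instructor family of models).
    SELECT pgml.embed(
        'hkunlp/instructor-xl',
        'best 1980s sci-fi movie',
        '{"instruction": "Represent the movie review for retrieval:"}'::jsonb
    ) AS embedding;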

PostgresML isn’t just about large language models. It has all of the classical machine learning algorithms that you expect. We have native bindings for XGBoost that are super-duper fast and that we’re very proud of. The implementation is highly optimized: when you have what is essentially a C array in memory in Postgres, in a buffer from a table, we take a pointer to that, we don’t even copy it. We pass that pointer to XGBoost, and XGBoost says, “Fine. I know what an array is.” Again, these things are very primitive. This is the goal of Apache Arrow. This is the goal of protobuf from Google. Google has Spanner, and Spanner uses a very similar concept: when you have a protobuf, it will store the data in the row in the exact protobuf format. There is no serialization; they can just copy it straight out over the wire. Serialization and going over the wire kill so many machine learning applications. They really limit the amount of data that you can bring to bear in an interactive online context.
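
For the classical side, the two entry points are pgml.train and pgml.predict. A sketch with a hypothetical movie_reviews table and rating column (the exact keyword arguments can vary slightly between PostgresML versions):

    -- Train an XGBoost regression over a relation, entirely in SQL.
    SELECT * FROM pgml.train(
        project_name  => 'review_scores',     -- hypothetical project name
        task          => 'regression',
        relation_name => 'movie_reviews',     -- hypothetical training table
        y_column_name => 'rating',            -- hypothetical label column
        algorithm     => 'xgboost'
    );

    -- Score a feature vector with the latest deployed model for that project.
    SELECT pgml.predict('review_scores', ARRAY[4.2, 0.0, 1.0]);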

The Storage Keyword, and 768

The Hong Kong University natural language processing instructor-xl model has a hidden layer in the middle of it that is 768 nodes wide. It will always have 768 floats as its intermediate hidden-layer representation. Regardless of the string size that you pass in, it’s going to have that hidden state once it’s read the entire string. That is the size of the vector that we will always be storing in this table, regardless of whether it’s a one-word input or a 1,000-word input. All of these models have what’s called a context window, which means they can only consider 512, or 2,000, tokens at a time. If you pass fewer than that, the input will get padded out with zeros or the empty word token. If you pass more than that, it will probably just get truncated. There’s a whole bunch of techniques you might want to use, like chunking: if your documents are larger than the context window of your model, you will want to split those documents up into chunks and create an embedding for each chunk. There’s lots of cool tricks you can do with embeddings.
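
Chunking can also live in SQL. This is only a naive sketch that splits on blank lines into a hypothetical document_chunks table; real chunkers usually split on token counts with some overlap:

    -- Naive paragraph-level chunking so each piece fits the model's context window.
    CREATE TABLE document_chunks (
        id          BIGSERIAL PRIMARY KEY,
        document_id BIGINT REFERENCES documents(id),
        chunk       TEXT NOT NULL,
        embedding   vector(768) GENERATED ALWAYS AS (
            pgml.embed('hkunlp/instructor-xl', chunk)::vector(768)
        ) STORED
    );

    INSERT INTO document_chunks (document_id, chunk)
    SELECT id, regexp_split_to_table(body, E'\n\n+')
    FROM documents;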

You can do all kinds of math with arrays. Let’s say you have a document and you break it up into 10 chunks, and you generate 10 different embeddings for those 10 chunks. You can actually add up all the vectors, and the vector you get by adding up those 10 vectors points in the same direction as the average vector of the entire document. Even though your model may be limited to a context window of 512, the amount of text an embedding can practically represent is unlimited. If you think about what happens when you start adding up a bunch of embeddings together, some of them will have positive numbers, some of them will have negative numbers. A lot of them will just cancel out and trend towards zero. If you add up the embeddings for all of Wikipedia, you might just get zeros across the board, meaning that the result is not special in any way. It’s equally relevant to everything, or equally close to everything. It will actually be closer to things that are also very generic or general. I think that covers the 768.
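
pgvector can do that arithmetic directly; the sum points in the same direction as the mean, so for cosine distance they are interchangeable. A sketch against the hypothetical document_chunks table above:

    -- One document-level vector per document, averaged over its chunk embeddings.
    SELECT document_id, AVG(embedding) AS document_embedding
    FROM document_chunks
    GROUP BY document_id;

pgvector also provides a + operator on vectors if you want the raw element-wise sum of two embeddings.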

GENERATED ALWAYS AS is just the Postgres syntax that tells Postgres that anytime someone inserts or updates a row in this table, it needs to run this function. This body here is actually a variable that references the body column up above. You only have to pass the body text in once; Postgres has it in memory, so it will reuse it and pass it to the model here. STORED is another option. You can have this be stored in the table physically, where it will take up space. This is good if you read things more than you write them. You don’t have to store it; it can be generated on the fly at read time. Perhaps you’re storing a lot of documents that you never need the embedding for. Generating embeddings is expensive. In that case, you only want to run the generation function if somebody is reading the embedding column. That might be a savings there.

Vector Search

This is an example of the cosine_distance operator that’s also provided by pgvector. There are three of these operators. This one will do the cosine similarity function. There’s one for the Manhattan distance. There’s one for the dot product. These three functions have different tradeoffs. Manhattan doesn’t involve any square roots: you just add up the east-west plus the north-south, and that’s your final distance. Dot product and cosine similarity have a little bit more math involved, so they’re a little bit more expensive to compute, but the distance is truer. If we actually take the hypotenuse of the triangle, that’s better than if we look at just the east-west, north-south Manhattan distance, although we’ve now had to calculate a square root. My preference is to always start with cosine_distance, which is the most flexible. If your vectors are normalized, then the dot product will give you the same accuracy for free, but not all vector spaces are fully normalized. This is the safest, highest quality answer. If you need more performance later, you can test empirically, and you really have to test empirically. There are all kinds of metrics you can get for these models that will give you a quantitative answer like, the perplexity is blah-blah-blah, and, what does that mean? I don’t know, it means it’s better than the other one, maybe? What you really have to do is actually run some queries against your database and look at the things you’re getting back.
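
In pgvector these comparisons show up as operators: <=> is cosine distance, <#> is the negative inner product (dot product), and <-> is the straight-line Euclidean distance. A tiny sketch on literal vectors, where smaller always means more similar:

    SELECT '[1,0,0]'::vector <=> '[0,1,0]'::vector AS cosine_distance,    -- 1.0
           '[1,0,0]'::vector <#> '[0,1,0]'::vector AS neg_inner_product,  -- 0.0
           '[1,0,0]'::vector <-> '[0,1,0]'::vector AS euclidean_distance; -- ~1.414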

Indexes

If you have 1,000 vectors, you can just compare them all directly: a modern CPU core or GPU core running at several gigahertz can probably compute 100,000 cosine similarities or dot products per second. If you have 100,000 documents in your database, and you have a single-core CPU, and one second is fast enough for your application, then you’re done. You don’t need anything special. If you have fewer documents, you’re done, you don’t need anything special. A lot of document collections that I see are on the order of thousands, not millions. You can just brute force the computation on a single core. If you have a GPU that has 5,000 cores, then you can do 5,000 times 100,000 in a second, which gives you 500 million documents. Half a billion is an interesting number for scalability reasons.

In this case, though, let’s say you have a million documents, and you don’t want a user query to take 10 seconds; you need a way to find your closest vectors in your dataset much more quickly. You want to avoid doing a lot of the direct comparisons. We use indexes to create shortcuts all of the time. Most people are probably familiar with keyword indexes when they’re doing text recall. We just build the list of all of the documents that contain that keyword, and we sort that list ahead of time by how many times the keyword appears in each document. Then when somebody searches for that keyword, you just go get your list and take the head of it. That gives you your documents that match. It’s much quicker than actually having to scan every single document and see if the keyword exists in it. That would be terribly slow.

Similarly, with vectors, there are indexing operations that we do. pgvector supports the IVFFlat index type. What this index type does is, let’s say you’ve got a million vectors that you want to build an index over, and in this case we’ve said we want 2,000 lists in this index. It’s going to automatically cluster all of your million vectors into 2,000 different clusters that are most compact, so that every vector in each list is most similar to all of the other vectors in that list and less similar to vectors in other lists. Then it’s going to compute the centroid of each of those lists. This is just vector math. Each list can then be represented by its centroid, and everything in the list will be closest to that centroid. Now when I do a query, and I want to find the nearest vectors to my query vector, I only have to brute force the 2,000 centroids. I can find the list from those 2,000 that has the most likely candidates closest to my input. Then, I can brute force all of the ones in that list; 1 million divided by 2,000 is 500, so that’s another 500 vector comparisons. In total, we would do 2,000 to find the list and 500 across that list. That’s 2,500 comparisons, or vector distance calculations, rather than the full million, and 2,500 is much faster than a million.
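
Building that index is one statement; a sketch assuming the documents table from earlier and cosine distance as the operator you will query with:

    -- Cluster the embeddings into 2,000 lists; queries then scan centroids first.
    CREATE INDEX documents_embedding_idx
        ON documents
        USING ivfflat (embedding vector_cosine_ops)
        WITH (lists = 2000);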

The tradeoff here is around the edge cases. If your vector is on the boundary between two lists and only slightly falls closer to one list than the other, then you might actually miss some other vectors from that other list that are also near the boundary. That’s why this is an approximate nearest neighbor search. What we can do to handle those cases, if we find that our recall is bad and we’re frequently missing vectors near the edge in our results, is turn up the number of probes. The number of probes is basically how many lists we want to actually look at or consider. We can go from one probe, just considering the 500 vectors in the very nearest list, to 10 probes, where we’ll consider roughly 5,000 vectors across 10 different lists. That will make sure that if anything is on any of the edges, we’ll get it. Again, that part of the search is 10 times more expensive, on top of the fixed cost of the 2,000 centroid comparisons up front. You can do the math. The runtimes are actually very predictable. If you’re doing this on a CPU core, it’s always the same number of floating-point operations, and so you can measure. You can ask, how much latency budget do I have? How important is it to actually get every single record back or not?
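
The probe count is just a setting; a sketch of widening the search from 1 list to 10 for the current session:

    -- Trade speed for recall: consider the 10 nearest lists instead of only 1.
    SET ivfflat.probes = 10;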

Vector_Cosine_Ops

You actually need to build these indexes with the matching operator that you will use. In this case, we are doing a cosine operator against our index, so we generate the index to match the cosine operator. If you want to use the Euclidean distance or the dot product, you would substitute that here as well. You can build as many of these indexes as you want. You could have three indexes, one for each operator, if you wanted to cover it that way. There’s a lot you can do with these indexes. You can build partial indexes in Postgres that only cover certain portions of the table. Let’s say you have some other criteria in the table, maybe some document is part of a collection, and you only want to search for documents in collection 32; you can build a partial index that only includes documents from collection 32. Actually, what we found at Instacart was that we didn’t need vector indexes, because all of our queries were scoped to very tight constraints. Given the user id, given the store that they were shopping at, given all of these extra criteria, we could winnow down the number of possible vectors to some subset far smaller than 10,000. Then we could just brute force the answer, and it would be an exact answer, and it would be very fast.
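
The partial index from the collection example is ordinary Postgres syntax layered on the same index type; collection_id is a hypothetical column here:

    -- Only index embeddings for documents in collection 32.
    CREATE INDEX documents_collection_32_embedding_idx
        ON documents
        USING ivfflat (embedding vector_cosine_ops)
        WITH (lists = 100)
        WHERE collection_id = 32;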

Calculating Distance from Pairs

What are the pairs that we’re calculating distance from?

It’s whatever you want it to be. In the clustering case, the pair is going to be two documents. In the query case, you’ll be comparing the incoming query vector to a document vector, and actually to all of the document vectors. Clustering is a very expensive operation, because you have to compare every vector to every other vector to find out which vectors are closest to each other. This is an n^2 operation on the number of vectors. Creating these indexes is a time-consuming process, so that’s something you want to plan for. Once you’ve created your index, one of the very cool things about Postgres and pgvector is that anytime you insert a new record into the table, it goes into the index, and the index is persisted to disk. If your database catches on fire or whatever, you can be assured that that data is safe and sound, as long as you’ve gotten the response back from Postgres that your transaction has completed.

Search Query Matching to Repository Items

You have to create the index before you can use it. If you run a query like this, Postgres will accept the query with or without an index on the table. Postgres has a very advanced query planner. It will look at all of the indexes on the table for a particular query, and it will use statistics about the data, the number of rows, and the different predicates in your query. In this case, the query has no predicate. If the table has no index, this will be a full table scan: it will actually do this comparison against every record. If you have an index, on the other hand, it will use the index to take the shortcut that I’ve described. That’s the magic that pgvector gives you.
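
You can check which path the planner chose with EXPLAIN. With the ivfflat index in place, the plan should show an index scan on it; without the index, a sequential scan over every row (model and table as in the earlier sketches):

    EXPLAIN
    SELECT id
    FROM documents
    ORDER BY embedding <=> pgml.embed(
        'hkunlp/instructor-xl', 'vanilla ice cream'
    )::vector(768)
    LIMIT 10;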

Clustering (n^2 pairs in a join table)

This is a very generic example. I probably could have picked something more explicit. You could say that the left-hand side is the query vector, and the right-hand side is documents.embedding; in that case, that’s what you would write, and you would be selecting from documents. That’s how you would query that table. Again, you would have to generate that user query embedding at runtime, because what the user is going to give you is a piece of text. Then you would call pgml.embed on that piece of text.

Vector Search (Summary)

There’s more work being done on pgvector right now to improve it. There’s a lot of research going on around vector indexes, and there are a lot of tradeoffs. There are at least a dozen different vector indexing algorithms, and they are all some tradeoff between build cost, runtime cost, and accuracy. IVFFlat is a pretty good first stab that makes pretty balanced tradeoffs amongst all of those things.

Common Table Expressions

This is a better example of what you might do. Let’s say we have the set of Amazon movie reviews: all the customer reviews for all of the DVDs Amazon has for sale. There are millions of these customer reviews. We can create embeddings out of all those customer reviews and store them in a table. That will be our documents table for this example, where we have millions of user reviews. Now when somebody comes in with a query, and they want to find a movie similar to that query, we can use PGML to generate the embedding for it. In this case, we’re looking for the best 1980s sci-fi movie. Some people may use the phrase ‘best 1980s sci-fi movie’ in their review, but they might also say ‘best 1970s sci-fi movie,’ or, worse, just ‘1980s sci-fi movie’. Embeddings are really magical at capturing these very subtle modifiers and nuances, and the overall sentiment, rather than over-indexing on particular keywords. The quality you can get with a recall like this is pretty cool. When you have a WITH request AS, this creates a virtual table. In Postgres, this is called a common table expression. You can begin any query with this, and the content is any other query. Whatever comes out of that query, in this case, will be an embedding. For the rest of this query (and we can chain multiple of these), that’s what the SELECT … is going to do in the next slide. We’ll have a virtual table in memory with one row in it. That row will be a single vector named embedding, and it will be generated on the fly from ‘the best 1980s sci-fi movie’ using our model.
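
Roughly what that CTE looks like on its own, using the instructor-xl model from earlier:

    -- A one-row virtual table holding the query embedding, generated on the fly.
    WITH request AS (
        SELECT pgml.embed(
            'hkunlp/instructor-xl',
            'best 1980s sci-fi movie'
        )::vector(768) AS embedding
    )
    SELECT embedding FROM request;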

Recall

Then we can actually do the cosine operator, because we’ve built an index on this table in the previous slide. This will be very fast and very efficient, and we get the full power of SQL. This example doesn’t have a limit. You should put a limit, otherwise you’re going to be pulling back 5 million documents, and Postgres will know that. It will know that there are 5 million documents and you didn’t put a limit, so it will just ignore your index, because the index isn’t going to do any good anyway: you’re going to need to calculate all 5 million distances regardless, since the cosine_distance is what you’re actually selecting. The Postgres query planner is smart and will try to help you out, but you also need to be a little bit diligent when you’re thinking about, what do I really want? What do I really need?
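
Putting the CTE and the recall together, a sketch against a hypothetical movie_reviews(review_body, embedding) table with the ivfflat index from earlier:

    WITH request AS (
        SELECT pgml.embed(
            'hkunlp/instructor-xl',
            'best 1980s sci-fi movie'
        )::vector(768) AS embedding
    )
    SELECT
        movie_reviews.review_body,
        movie_reviews.embedding <=> request.embedding AS distance
    FROM movie_reviews, request
    ORDER BY movie_reviews.embedding <=> request.embedding
    LIMIT 5;  -- without the LIMIT, the planner has no reason to use the index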

Tasks

PostgresML isn’t only limited to embeddings. There are lots of things that you can do with these large language models. Hugging Face has this concept of tasks that we’ve also adopted. Some of these tasks can be text generation, text classification, or translation from one language to another; there are several dozen of these. This is an example of using pgml.transform, which is the second function. It will also download a model from Hugging Face. In this case, we haven’t even specified the model; Hugging Face has some default model. This is a bit of a footgun, because they may change that default model, and your application may be using a new model that you haven’t tested. Be wary. You can also specify the model to this function. These functions take keyword arguments, so anything you can specify to any of these API models, there’s a way to pass that keyword argument all the way down. The output of this example here will be whatever the output of the Hugging Face model would be. In this case, Hugging Face returns some JSON; it’s going to have a sentiment key with a floating-point number. Postgres has all of the JSON operators you need, so you can then dig into that JSON object coming back.
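
A minimal transform call, leaning on whichever default model Hugging Face maps to the task (which, as noted, you probably want to pin explicitly in production):

    -- Sentiment-style classification; the output is JSON from the underlying model,
    -- typically a label and a floating-point score you can dig into with ->> and friends.
    SELECT pgml.transform(
        task   => 'text-classification',
        inputs => ARRAY['I love how simple this makes machine learning.']
    ) AS result;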

Text Generation

This is where I think the true power of SQL as a platform becomes really interesting. Think about all of the thousands of chatbots being built with LangChain today; it’s sort of like the iPhone moment when everybody was building a flashlight app back in 2008. The very standard model is, I have my user prompt that I want to chat with. You take that and you pass it to OpenAI. That’s a remote data center call, it takes several hundred milliseconds, and they have to actually run their model on it. They give you back an embedding that is no longer one of the best quality, but they’ll charge you for it anyway. Then you take that embedding and you pass it to your vector database. You look up a bunch of context that you want to use for prompt engineering with your chatbot. Maybe those are help documents that your support center has. Maybe that’s a movie catalog because this is a movie guru chatbot. You get that context, all these documents you’ve written about movies or about help for your support center. You pull the English language text back out of your database, as much as OpenAI’s models will take in their context windows. You put your prompt on that. You say, “Given this information, you are a helpful chatbot. Please respond to this prompt.” You paste all that together. You send off 30 kilobytes worth of data to OpenAI again. They run it through their text generation model and send you a response back. Then you go ahead and send that response back to the user.

With PostgresML, you can do all of that in a single query inside the database. There’s no network transit. There are no memory copies. You create your embedding in the first CTE. You retrieve all of the relevant documents in the second CTE. Then in the third query, you pass those, concatenated with your prompt, to another large language model. You can actually keep stacking and chaining models with these CTEs as much as you want. One of the things that we found very useful is that cosine similarity is OK: it’s good and fast. An XGBoost model that re-ranks the results is going to be much more accurate if rankings really matter.
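
A sketch of that single-query shape, chaining embed, retrieve, and generate; the table, columns, prompt, and default models here are hypothetical stand-ins rather than a drop-in recipe:

    WITH request AS (
        -- 1. Embed the user's question.
        SELECT pgml.embed(
            'hkunlp/instructor-xl',
            'What is the best 1980s sci-fi movie?'
        )::vector(768) AS embedding
    ),
    context AS (
        -- 2. Retrieve the most relevant reviews and concatenate them.
        SELECT string_agg(review_body, E'\n') AS text
        FROM (
            SELECT movie_reviews.review_body
            FROM movie_reviews, request
            ORDER BY movie_reviews.embedding <=> request.embedding
            LIMIT 5
        ) AS best_matches
    )
    -- 3. Generate a response with the retrieved context pasted into the prompt.
    SELECT pgml.transform(
        task   => 'text-generation',
        inputs => ARRAY[
            'You are a helpful movie guru. Context: ' || context.text ||
            E'\nQuestion: What is the best 1980s sci-fi movie?\nAnswer:'
        ]
    ) AS answer
    FROM context;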

Model Hosting on The Database

The transform is going to happen in the database. All of that text document for the prompt engineering and context will stay inside of the database. There will be a much lower network data transit cost. It’s going to be pointers in memory, or a memcpy if necessary. You’re talking about many kilobytes, potentially megabytes worth of data that doesn’t need to get sent over the wire. Compared to OpenAI and Python implementations doing the same thing, it’s usually a 10x speedup.

If you wanted to do it with PostgresML, you could have a foreign data wrapper to a different database instance. You can set up that and you can query that different database instance. If you wanted to retrieve all of your documents from Postgres into your Python application, and then send those off to OpenAI and do a hybrid approach, you can do that, absolutely, but you’re then now paying the network transit cost.

SDKs

We covered machine learning in theory. We covered SQL and how to do all of this in depth. We’ve gone a step further and created a Python SDK that implements all of these SQL queries for you. It gives you functions with the inputs and outputs that you actually want and need to know about. You connect your Python application to a Postgres database. You create a collection of documents, giving it some arbitrary name. You can then upsert documents into that collection. If you want to generate the indexes, you can do that with a single function call. We’ll enforce all of the best practices and the most efficient query patterns with this SDK. Finally, you can just do your vector search and get back the documents that you might then want to pass on to OpenAI. We’re extending this, though, to create chainable SDK function calls, so that you can do a vector search and then, instead of materializing the results immediately, chain a text-generation call onto it. You can actually do it all in a single execution to get that efficiency back at the application layer.

Scalability

It’s worth mentioning that if you try to do a bunch of LLMs inside of a single database, you’re probably going to knock that database over pretty quickly. PgCat is our other project that acts as a router. It handles sharding. It handles replication. It handles failover and load balancing across many different Postgres replicas. You might want to look into that project as well.






Informatica and MongoDB Expand Global Partnership to Enable New Class of … – Yahoo Finance

MMS Founder
MMS RSS

Posted on mongodb google news. Visit mongodb google news

As a strategic partner, Informatica’s best-of-breed Master Data Management (MDM) SaaS solution and MongoDB Atlas will enable customers to develop a new class of 360-degree data-driven applications

REDWOOD CITY, Calif., November 29, 2023–(BUSINESS WIRE)–Informatica (NYSE: INFA), an enterprise cloud data management leader, launched a new strategic partnership with MongoDB. The partnership enables customers to efficiently create a modern class of cloud-native, data-driven, industry-tailored applications powered by MongoDB Atlas and with a secure foundation of trusted data from Informatica’s market-leading, AI-powered MDM solution.

“We’re thrilled to announce the partnership with MongoDB. Given we already leverage the performance and scale of MongoDB Atlas within our cloud-native MDM SaaS solution and share a common focus on high-value, industry solutions, this partnership was a natural next step,” said Rik Tamm-Daniels, Group Vice President of Strategic Ecosystems and Technology at Informatica. “Now, as a strategic MDM partner of MongoDB, we can help customers rapidly consolidate and sunset multiple legacy applications for cloud-native ones built on a trusted data foundation that fuels their mission-critical use cases.”

Informatica and MongoDB take a customer-centric approach to meet customers’ strictest data requirements. Together, the companies have collaborated on joint solutions across financial services, insurance and healthcare to combine the benefits of MongoDB Atlas with Informatica’s MDM and domain-specific applications such as Customer 360, Product 360, Supplier 360, and Reference 360.

For example, Informatica and MongoDB help insurers around the globe realize their digital and AI strategies faster by consolidating and replacing legacy systems, and accelerating the delivery of the next generation of cloud-native business applications on a foundation of trusted data.

“For decades, function-first has been the focus of business applications, and data has been the digital exhaust of processes and activities performed by staff within those applications,” says Stewart Bond, vice president of Data Intelligence and Integration software research at IDC. “The combination of MongoDB and Informatica MDM will provide developers and organizations with the opportunity to create data-first business applications, where data will drive the function and action of users and provide more opportunities for automation.”

A survey of global data leaders that Informatica commissioned with Wakefield Research, released earlier this year, found nearly half (45%) of respondents reported that gaining more holistic/single views of customers was a priority data strategy for 2023. The same study found that a lack of a complete view and understanding of their data estates is the obstacle that data leaders most often cite (32%) as the reason they can’t execute their data strategies.

“Informatica is a valued partner of MongoDB and we’re thrilled to formalize the collaboration through our ISV program and bring innovative joint solutions to market,” said Alan Chhabra, Executive Vice President, Worldwide Partners at MongoDB. “Master data management (MDM) is a fundamental component for nearly all application workloads, particularly those in highly regulated industries with complex compliance requirements. Informatica is a leader in this space, and their MDM solution running on MongoDB Atlas ensures that joint customers can innovate more quickly, efficiently and effectively.”

Informatica was named a Leader in The Forrester Wave: Master Data Management, Q2 2023 report.

About Informatica

Informatica (NYSE: INFA), an Enterprise Cloud Data Management leader, brings data and AI to life by empowering businesses to realize the transformative power of their most critical assets. We have created a new category of software, the Informatica Intelligent Data Management Cloud™ (IDMC). IDMC is an end-to-end data management platform, powered by CLAIRE® AI, that connects, manages, and unifies data across virtually any multi-cloud or hybrid system, democratizing data and enabling enterprises to modernize and advance their business strategies. Customers in more than 100 countries, including 85 of the Fortune 100, rely on Informatica to drive data-led digital transformation. Informatica. Where data and AI come to life.

View source version on businesswire.com: https://www.businesswire.com/news/home/20231129690727/en/

Contacts

Informatica Public Relations
prteam@informatica.com

Article originally posted on mongodb google news. Visit mongodb google news
