The Knot Worldwide unifies data on single platform – Chain Store Age

MMS Founder
MMS RSS


A digital wedding products and services retailer is storing, managing and processing corporate data on one database.

Storing, managing and processing large volumes of data across multiple applications, The Knot operates a large ecosystem of application programming interfaces (APIs), which places high demands on its data storage solutions.

The company, which has nearly 35 million users across 16 countries, operates a global vendor marketplace that connects more than 4 million couples per year with 850,000 local wedding professionals. The Knot also offers customers a suite of personalized websites, tools, invitations and registry services.

To resolve issues caused by storing and managing large volumes of structured and unstructured data across multiple platforms, The Knot decided to migrate all of its disparate data to MongoDB Atlas, the managed multi-cloud database running on Amazon Web Services (AWS) that it had already been using for data storage.

The retailer now operates around 15 different applications on the MongoDB platform, which it has upgraded over the years from version 2.6 to version 7.0.

MongoDB Atlas provides access to all of The Knot’s metrics via a dashboard. Employees can review data patterns for the last 24 hours, the last seven days, and the last month. If there is a red flag in any of those charts, they can identify and fix the problem in a short period of time.

In addition, an alerting system recommends indexes and reviews the data modeling The Knot has put in place in the database, suggesting changes where needed.

“Changes in capacity happen behind the scenes without any disruption or action by the developer team,” said Vladimir Carballo, principal software engineer, The Knot Worldwide. “You just receive an email that tells you how much storage you’re now using. More importantly, from the user’s point of view it all happens with zero downtime, which translates into better-performing client applications.”

Article originally posted on mongodb google news. Visit mongodb google news



Navigating the open source landscape: Opportunities, challenges, insights – ITWeb

MMS Founder
MMS RSS


Johannes Briel, Senior IT Security Specialist – PCI QSA, QSA (P2PE), Galix.


In the ever-evolving landscape of technology, open source software stands as a pillar of innovation, collaboration and versatility. As we venture through 2023, open source presents many opportunities – and challenges – that demand a better understanding of it. From its potential to reshape industries to the intricate web of security concerns, open source software continues to shape our digital future in profound ways.

Opportunities and threats in the open source software market

2023 marked a remarkable growth for open source software, partly fuelled by the increased adoption of cloud computing. This trend creates opportunities for agile open source solutions that can deliver fast and efficient results. However, this also poses a challenge: how to implement and manage these solutions effectively in the cloud. One of the main benefits of open source is its cost-effectiveness, as it eliminates the need for expensive licences. This is especially attractive for hi-tech industries that need quick deployment. Furthermore, mature open source solutions have high reliability due to the active community behind their development. On the other hand, proprietary software users may face the risks of discontinued support and rising costs as technology changes. Therefore, it is crucial to pay close attention to the basic operations and image reliability in the open source environment, as well as to conduct rigorous security checks to ensure stability.

The versatility of open source: A multifaceted powerhouse

Open source software affects various sectors in significant ways, transforming industries and promoting innovation. In software development, open source enables collaboration and progress, creating a dynamic ecosystem where developers work together to improve codebases, generate innovation and explore possibilities. The web also relies heavily on open source, using technologies such as WordPress and Apache to support websites, web frameworks and servers. In operating systems, Linux demonstrates the power of open source, offering stability, security and customisation options for servers, embedded systems and personal computers.

Businesses of all sizes use open source databases, utilising solutions like MySQL for reliability and MongoDB for flexibility. The emerging fields of the internet of things (IoT) and artificial intelligence (AI) also benefit from open source, advancing the development of connected devices and machine learning models. The scientific community leverages open source as well, accelerating discoveries through co-operative efforts and knowledge sharing, especially in areas like bioinformatics and climate research. And in the vital area of security and privacy, open source supports vigilance, providing essential tools like Linux and OpenSSH that enable secure solutions.

Additionally, open source cultivates a culture of learning, experimentation and entrepreneurship, equipping students and start-ups with accessible tools that drive innovation and growth. As open source continues to shape the technological landscape, its multifaceted influence serves as a testament to the power of collaboration and community-driven progress.

Navigating security challenges in the open source domain

Amid the array of advantages that open source offers, it is not devoid of security concerns, necessitating vigilant consideration of several factors. The collaborative nature of open source development, while fostering innovation, can also introduce vulnerabilities if not subjected to rigorous examination and testing.

The integrity of open source software can be compromised by unvetted contributions, underscoring the importance of strict policies and practices. Furthermore, the risk of vulnerabilities looms when software remains stagnant or outdated, highlighting the need for consistent updates. Dependencies, if left unmonitored, can become gateways for vulnerabilities, potentially leading to security breaches. The prominence of these vulnerabilities underscores the importance of implementing stringent practices, such as adhering to software development life cycles or embracing DevSecOps principles, to effectively mitigate risks.

Enterprises, particularly in agile development scenarios, must exercise caution, as swift changes can inadvertently expose vulnerabilities. In this complex landscape, proactive measures are paramount to ensure that the benefits of open source are harnessed without compromising security.

Navigating the open source odyssey

In the complex field of technology, open source is dynamic, empowering software, offering many opportunities but also requiring careful attention to security and strategic decisions. As we journey through 2023 and beyond, we have a responsibility to adopt the culture of collaboration and innovation that supports open source while being aware of the challenges it poses. By doing this, we ensure that the open source ecosystem continues to shape a future where technology aligns well with human progress.

Article originally posted on mongodb google news. Visit mongodb google news



JEP 447: Refining Java Constructors for Enhanced Flexibility

MMS Founder
MMS A N M Bazlur Rahman

Article originally posted on InfoQ. Visit InfoQ

After its review concluded, JEP 447, Statements before super(…) (Preview), was delivered for JDK 22. Under Project Amber, this JEP proposes to allow statements that do not reference the instance being created to appear before super() calls in a constructor, while preserving the existing safety and initialization guarantees for constructors. Gavin Bierman, consulting member of technical staff at Oracle, has provided an initial specification of this JEP for the Java community to review and give feedback on.

Traditionally, Java constructors were required to place any explicit invocation of another constructor as the first statement. This constraint ensured top-down execution and prevented access to uninitialized fields, but it significantly limited the expressiveness and readability of constructor logic. Consider the following example:

public class PositiveBigInteger extends BigInteger {

    public PositiveBigInteger(long value) {
        super(String.valueOf(value));   // Potentially unnecessary work; BigInteger has no long
                                        // constructor, so the value is passed as a decimal string
        if (value <= 0)
            throw new IllegalArgumentException("non-positive value");
    }
}

It would be better to declare a constructor that fails fast by validating its arguments before invoking the superclass constructor. JEP 447 relaxes these restrictions, allowing statements that do not reference the instance being created to appear before an explicit constructor invocation. With this, the code above can be simplified as:

public class PositiveBigInteger extends BigInteger {

    public PositiveBigInteger(long value) {
        if (value <= 0)
            throw new IllegalArgumentException("non-positive value");
        super(String.valueOf(value));   // BigInteger has no long constructor, so pass a decimal string
    }
}

Consider another scenario where a subclass constructor needs to prepare arguments for a superclass constructor. Previously, this required auxiliary methods due to the restriction of having the superclass constructor invocation as the first statement.

public class SubClass extends SuperClass {
    public SubClass(Certificate certificate) {
        super(prepareByteArray(certificate));
    }

    private static byte[] prepareByteArray(Certificate certificate) {
        // Logic to prepare byte array from certificate
        // ...
        return byteArray;
    }
}

In this example, the prepareByteArray method processes the Certificate object before passing it to the SuperClass constructor. With JEP 447, this process becomes more streamlined and intuitive.

public class SubClass extends SuperClass {
    public SubClass(Certificate certificate) {
        // Directly include the logic to prepare byte array
        PublicKey publicKey = certificate.getPublicKey();
        if (publicKey == null) {
            throw new IllegalArgumentException("Null certificate");
        }
        byte[] byteArray = switch (publicKey) {
            case RSAPublicKey rsaKey -> rsaKey.getEncoded();
            case DSAPublicKey dsaKey -> dsaKey.getEncoded();
            default -> throw new UnsupportedOperationException("Unsupported key type");
        };
        super(byteArray);
    }
}

In this updated example, the constructor of SubClass directly includes the logic to process the Certificate object. This direct approach enhances readability and reduces the need for auxiliary methods, demonstrating the practical benefits of JEP 447 in real-world scenarios.

While JEP 447 offers greater flexibility, it preserves the essential guarantees of constructor behaviour, ensuring that subclass constructors do not interfere with superclass instantiation. This update does not require any changes to the Java Virtual Machine (JVM), relying solely on the JVM’s existing capabilities to verify and execute pre-constructor invocation code.
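
To make the preserved guarantee concrete, here is a hedged sketch (the Person and Employee classes are made up for illustration and require JDK 22 with --enable-preview) of statements the compiler still rejects before an explicit super(…) call, because they reference the instance being created:

class Person {
    private final String name;
    Person(String name) { this.name = name; }
    String getName() { return name; }
}

public class Employee extends Person {
    private final String department;

    public Employee(String name, String department) {
        // this.department = department;  // still rejected: writes a field of the instance being created
        // System.out.println(getName()); // still rejected: invokes a method on the instance being created
        if (department == null)
            throw new IllegalArgumentException("department is required"); // allowed: no use of 'this'
        super(name);
        this.department = department;     // fine once the superclass constructor has run
    }
}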

As Java continues to evolve, JEP 447 is a clear indication of the language’s ongoing adaptation to modern programming practices. It reflects a balance between introducing new features and maintaining the robustness of the Java ecosystem. For Java developers, this means an opportunity to explore more efficient coding practices while staying grounded in the language’s core principles.




MongoDB, Inc. – Consensus ‘buy’ rating and 24.8% Upside Potential

MMS Founder
MMS RSS


MongoDB, Inc., with ticker code MDB, now has 26 market analysts covering the stock. The analyst consensus points to a rating of ‘buy’. Target prices range from a low of $250.00 to a high of $515.00, giving an average target price of $456.09. Given that the stock’s previous close was $365.39, if the analysts are correct we can expect an increase in value of 24.8%. Also worth noting: the 50-day moving average now sits at $388.77 and the 200-day moving average is $346.46. The company has a market capitalization of $27.32B. The current stock price for MongoDB, Inc. is $378.45 USD.

The potential market cap would be $34,096,111,691 based on the market consensus.
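
For readers who want to check the arithmetic, here is a quick sketch of how the quoted figures fit together (the inputs are simply the numbers reported above):

public class UpsideCheck {
    public static void main(String[] args) {
        double averageTarget = 456.09;   // average analyst target price
        double previousClose = 365.39;   // previous closing price
        double marketCap = 27.32e9;      // current market capitalization in USD

        double upside = averageTarget / previousClose - 1;   // roughly 0.248, i.e. 24.8%
        double impliedMarketCap = marketCap * (1 + upside);  // roughly $34.1B

        System.out.printf("Upside: %.1f%%%n", upside * 100);
        System.out.printf("Implied market cap: $%,.0f%n", impliedMarketCap);
    }
}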

The company is not paying dividends at this time.

Other points of data to note are a P/E ratio of -, revenue per share of $22.49 and a -5.71% return on assets.

MongoDB, Inc. is a developer data platform company. Its developer data platform is an integrated set of databases and related services that allow development teams to address the growing variety of modern application requirements. Its core offerings are MongoDB Atlas and MongoDB Enterprise Advanced. MongoDB Atlas is its managed multi-cloud database-as-a-service offering that includes an integrated set of database and related services. MongoDB Atlas provides customers with a managed offering that includes automated provisioning and healing, comprehensive system monitoring, managed backup and restore, default security and other features. MongoDB Enterprise Advanced is its self-managed commercial offering for enterprise customers that can run in the cloud, on-premises or in a hybrid environment. It provides professional services to its customers, including consulting and training. It has over 40,800 customers spanning a range of industries in more than 100 countries around the world.

Article originally posted on mongodb google news. Visit mongodb google news



OpenAI Adopts Preparedness Framework for AI Safety

MMS Founder
MMS Anthony Alford

Article originally posted on InfoQ. Visit InfoQ

OpenAI recently published a beta version of their Preparedness Framework for mitigating AI risks. The framework lists four risk categories and definitions of risk levels for each, as well as defining OpenAI’s safety governance procedures.

The Preparedness Framework is part of OpenAI’s overall safety effort, and is particularly concerned with frontier risks from cutting-edge models. The core technical work in evaluating the models is handled by a dedicated Preparedness team, which assesses a model’s risk level in four categories: persuasion, cybersecurity, CBRN (chemical, biological, radiological, nuclear), and model autonomy. The framework defines risk thresholds for deciding if a model is safe for further development or deployment. The framework also defines an operational structure and process for preparedness, which includes a Safety Advisory Group (SAG) that is responsible for evaluating the evidence of potential risk and recommending risk mitigations. According to OpenAI:

We are investing in the design and execution of rigorous capability evaluations and forecasting to better detect emerging risks. In particular, we want to move the discussions of risks beyond hypothetical scenarios to concrete measurements and data-driven predictions. We also want to look beyond what’s happening today to anticipate what’s ahead…We learn from real-world deployment and use the lessons to mitigate emerging risks. For safety work to keep pace with the innovation ahead, we cannot simply do less, we need to continue learning through iterative deployment.

The framework document provides detailed definitions for the four risk levels (low, medium, high, and critical) in the four tracked categories. For example, a model with medium risk level for cybersecurity could “[increase] the productivity of operators…on key cyber operation tasks, such as developing a known exploit into an attack.” OpenAI plans to create a suite of evaluations to automatically assess a model’s risk level, both before and after any mitigations are applied. While the details of these have not been published, the framework contains illustrative examples, such as “participants in a hacking challenge…obtain a higher score from using ChatGPT.”

The governance procedures defined in the framework include safety baselines based on a model’s pre- and post-mitigation risk levels. Models with a pre-mitigation risk of high or critical will trigger OpenAI to “harden” their security; for example, by deploying the model only into a restricted environment. Models with a post-mitigation risk of high or critical will not be deployed, and models with post-mitigation scores of critical will not be developed further. The governance procedures also state that while the OpenAI leadership are by default the decision makers with regard to safety, the Board of Directors have the right to reverse decisions.
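
As a rough sketch of the decision rules described above (the enum and method below are illustrative only, not part of any OpenAI API or of the framework document itself):

public class PreparednessSketch {
    enum Risk { LOW, MEDIUM, HIGH, CRITICAL }

    // Encodes the safety baselines as summarized above.
    static String decision(Risk preMitigation, Risk postMitigation) {
        StringBuilder out = new StringBuilder();
        if (preMitigation == Risk.HIGH || preMitigation == Risk.CRITICAL) {
            out.append("harden security (e.g. restricted deployment environment); ");
        }
        if (postMitigation == Risk.CRITICAL) {
            out.append("do not develop further");
        } else if (postMitigation == Risk.HIGH) {
            out.append("do not deploy");
        } else {
            out.append("eligible for deployment");
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(decision(Risk.HIGH, Risk.MEDIUM));       // harden security; eligible for deployment
        System.out.println(decision(Risk.CRITICAL, Risk.CRITICAL)); // harden security; do not develop further
    }
}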

In a Hacker News discussion about the framework, one user commented:

I feel like the real danger of AI is that models will be used by humans to make decisions about other humans without human accountability. This will enable new kinds of systematic abuse without people in the loop, and mostly underprivileged groups will be victims because they will lack the resources to respond effectively. I didn’t see this risk addressed anywhere in their safety model.

Other AI companies have also published procedures for evaluating and mitigating AI risk. Earlier this year, Anthropic published their Responsible Scaling Policy (RSP), which includes a framework of AI Safety Levels (ASL) modeled after the Center for Disease Control’s biosafety level (BSL) protocols. In this framework, most LLMs, including Anthropic’s Claude, “appear to be ASL-2.” Google DeepMind recently published a framework for classifying AGI models, which includes a list of six autonomy levels and possible associated risks.




Presentation: Virtual Threads for Lightweight Concurrency and Other JVM Enhancements

MMS Founder
MMS Ron Pressler

Article originally posted on InfoQ. Visit InfoQ

Transcript

Pressler: My name is Ron. I’m an architect at Oracle’s Java Platform Group. That’s the team that specifies Java and develops OpenJDK. In the last several years, I’ve led an OpenJDK project called Project Loom, which has contributed some features to the JDK, and today, I’ll primarily be talking about one of them, virtual threads. Later on, I’ll mention a couple of others. Virtual threads are threads that are represented by the same 27-year-old java.lang.Thread class. They’re implemented in the JDK in user mode, rather than by the OS. You can have millions of them running in the same process. I’d like to talk about why we added the feature, and why we did it differently from some other languages.

Java Design Choices

First, to understand some of our design choices, let’s talk a bit about Java’s general philosophy and the market environment in which it operates. Java has been the world’s most popular server-side programming platform for many years. No other platform language with the possible exception of C and JavaScript has been this popular for this long. Java has an ecosystem that is both very large and quite deep, and by deep, I mean that third and fourth level library dependencies are a common thing in the Java world. James Gosling said that he designed Java as a wolf in sheep’s clothing, a very innovative runtime that to this day offers state of the art compilation, garbage collection, and low overhead observability that together make an attractive combination of productivity, performance, and tooling. All that is wrapped in a very conservative language. Why? Because this strategy has worked well for us. Java’s users seem to like it, and there are many of them. We also place a high premium on backwards compatibility, because companies trust Java with their most important programs and we want to give them a good value for their investment. Given Java’s longevity and good prospects for the future, we try to think long-term. Many times, we’d rather wait and think about whether a feature that could be attractive right now might prove to be a liability in the future.

Whenever we do add user-facing features, we always start with a problem. The problem in front of us when designing virtual threads was that of concurrency, in particular, concurrency in servers, which are Java’s bread and butter. First, a little bit about the differences between concurrency and parallelism. Parallelism is the problem of accelerating a task, that is, reducing its latency by internally splitting it into multiple subtasks and then employing multiple processing resources to cooperate on completing those subtasks, and so the one big task. Concurrency is the problem of scheduling many largely independent tasks that arrive at the application from the outside onto some set of resources. With concurrency, we’re mostly interested in throughput, or how many of these tasks, normally requests, we can complete per time unit.

Little’s Law

Like I said, we’re mostly interested in servers, and there’s a very useful equation describing the behavior of servers called Little’s Law. It is actually deeper than it may first appear because it is independent of any statistical distribution of the requests, and that’s why it’s interesting and also more useful than it may appear. Last year I gave an entire talk just about Little’s Law at a conference devoted to performance. Little’s theorem says this: take some server-like system, a system where requests arrive from the outside, are processed in it, and then are dispatched and leave. If such a system is stable, which means that the requests coming in don’t pile up in an ever-growing queue waiting to get in, then the average rate of requests times the average duration that every request spends inside our system, for some arbitrary definition of inside, is equal to the average number of requests that will be inside the system concurrently. That’s L. Because the system is stable, the average rate at which requests arrive is equal to the average rate at which they’re handled, namely, our throughput. Lambda is our throughput.

Also, the duration that each request spends in the system depends on what the system does, and how it does it, and can be thought of as a constant for the particular system. Of course, we optimize it as much as we can. Then we still reach some number W, that is characteristic of our system. Some of you might be wondering, what if that duration W depends on the load? For that, you will need to watch that other talk. Now we can think of the maximum throughput as a function of the system’s capacity, that’s the maximum, that’s the L bar, that’s the maximum number of requests that the system can handle, or can contain within it at any given time. This is very important to repeat. We see that the throughput, or the maximal throughput of the system, is a function, given that W is a constant for the system, it’s a function of the number of requests that it can process at any given time. The number of requests that the system can contain within it, that is the primary thing that determines its throughput.

Actually, an incoming request often translates to more than one concurrent server operation. Because there’s a common pattern in servers called fanout, where every incoming request splits into multiple concurrent outgoing requests to microservices or databases. We can see how the number of concurrent operations can be quite large. It’s the number of incoming requests plus all the outgoing requests. Of course, handling every request requires a mix of both I/O and processing. The question is, what is the concurrent request capacity of a server? Depending on the ratio of processing to I/O, the capacity of the server, at least as far as the hardware goes, can be around 10,000 concurrent requests per CPU core. If you have 30 cores, then your server can support about 300,000 requests within it at any one time. It can be lower than that, but it can also be higher.
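
To make the arithmetic concrete, here is a minimal sketch of Little’s Law as used above; the 100 ms average request time is an assumed figure for illustration, while the 10,000 requests per core and the 30 cores come from the talk’s estimate:

// Little's Law: L = lambda * W, so maximum throughput is lambda_max = L_max / W.
public class LittlesLaw {
    public static void main(String[] args) {
        double wSeconds = 0.1;           // assumed average time a request spends in the system (100 ms)
        int requestsPerCore = 10_000;    // capacity estimate per CPU core quoted in the talk
        int cores = 30;

        long lMax = (long) requestsPerCore * cores;  // maximum concurrent requests the server can hold
        double lambdaMax = lMax / wSeconds;          // maximum throughput in requests per second

        System.out.printf("L_max = %d concurrent requests%n", lMax);
        System.out.printf("lambda_max = %.0f requests per second%n", lambdaMax);
    }
}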

Thread-per-Request – Thread Capacity

This finally brings us to the problem. Java was one of the first languages to have threads as part of a standard library and language. The thread is the Java platform’s unit of software concurrency. The easiest approach in Java and many other languages like it is to represent the domain units of concurrency, namely the request, either incoming or outgoing, or the transaction, with the software unit of concurrency, the thread, one to one. That is the traditional style of writing servers, and it’s called thread-per-request. In that style, a request is mapped to a thread, or at least one thread, for its entire duration, because it can then split into outgoing requests in a fanout. That means that to make the best use of the hardware, we need to be able to support 10,000 threads times the number of CPU cores. Java has never specified that a Java thread must be represented by an OS thread. For a short while in the ’90s it wasn’t. Ever since all operating systems started supporting threads many years ago, the main Java implementation, nowadays called OpenJDK, has used an OS thread to represent each of its threads. The problem is that most OSes simply can’t support so many active threads.

Asynchronous Programming

What do Java programmers do? It’s not just in Java, it’s in other languages, but I’m talking primarily about Java. To get the scalability we need, going by Little’s Law, we need a great many concurrent requests, but we can’t have that many threads. Instead of representing every domain unit of concurrency, the request, with an expensive thread (expensive because we can’t have many of them), we Java programmers started representing them as asynchronous tasks. These tasks share a small thread pool. They hold onto a thread only when they’re doing processing. When they need to wait for I/O, rather than blocking and hanging on to the thread while waiting for the I/O to complete, they return the thread to the pool to be reused by other tasks. When the I/O operation completes, the next processing stage is triggered and submitted to the pool, and so on. The small number of threads that the OS can support is no longer an issue because we can have a lot of asynchronous tasks, and we get excellent scalability and good hardware utilization. This asynchronous style, which is not thread-per-request, comes at a very high cost. First, composing the different processing stages of the asynchronous task, say, you want to initiate some outgoing request, then you have to wait for it, then it gets resolved, then you need to process it, so composing these multiple stages requires a monadic DSL. We can no longer use the language’s built-in composition of loops and exceptions, because those constructs are tied to a thread. The coding style is very different and less convenient.

Second, all I/O APIs need to be separated into two disjoint worlds that perform the same operation. One world, the synchronous world, blocks and hangs onto a thread while it’s waiting for I/O. In the other world, the asynchronous world, we don’t block but switch to processing some other task. This means that if your application wants to move from the synchronous world to the asynchronous world, to get better scaling, much of it needs to be rewritten. Lastly, and no less important, we must remember that software is more than just code. It’s also about observing the program as it runs. Observability has always been very important in Java. We need it for debugging, profiling, troubleshooting. When we use a debugger, we step through the execution of a thread, not an asynchronous pipeline. Profilers gather samples and organize them by threads. When we get an exception, we get a back trace, and it gives us the context in the form of a particular thread’s stack. We lose all that with the asynchronous style of programming.
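
To make the first point concrete, here is a rough illustration of the two styles being contrasted; the service methods and the pool size are hypothetical placeholders, not anything from the talk:

import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class Styles {
    static String fetchUser(int id) { return "user-" + id; }            // stands in for a blocking I/O call
    static String fetchOrders(String user) { return user + ":orders"; } // another blocking I/O call

    public static void main(String[] args) {
        // Thread-per-request style: plain sequential code; the thread blocks during I/O.
        String blocking = fetchOrders(fetchUser(42));
        System.out.println(blocking);

        // Asynchronous style: stages composed with a monadic DSL; threads from a small
        // pool are held only while a stage is actually running.
        ExecutorService pool = Executors.newFixedThreadPool(4);
        CompletableFuture<String> async =
                CompletableFuture.supplyAsync(() -> fetchUser(42), pool)
                                 .thenApplyAsync(Styles::fetchOrders, pool);
        System.out.println(async.join());
        pool.shutdown();
    }
}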

Syntactic Coroutines

What did other programming languages do about this? Some of them, pioneered I think by C#, introduced something called syntactic coroutines in the form of async/await. These addressed the language composition problem, so you can use loops and exceptions, but they still require two separate API worlds that do the same things but are syntactically incompatible. You can use one of them in one context, another in another context, even though they do the same thing. As for observability, the software unit of concurrency is now either a thread in the blocking world, or this async/await thing, this coroutine, in the coroutine world. All tools need to learn about this new construct. Some of them have done that, some of them haven’t. Why did other languages choose async/await despite these downsides over implementing lightweight threads in user mode, as Erlang did, and Go did, and Java has now ended up doing as well? There are different reasons. Both conceptually and technically, threads and syntactic coroutines are very similar. This is how a thread’s state is managed: subroutines are state machines, and they’re connected to each other and make up one big state machine in the form of a tree. Async/await is pretty much the same thing. Yet, there are still some differences, both in how the construct is exposed to users and how it is implemented in a language.

Thread vs. Async/Await

First, there’s a subtle semantic difference between threads and async/await. They have a different default regarding scheduling points. Those are the points in the code where the scheduler is allowed to suspend a process, a linear composition of something, so either a thread or a coroutine, from running on a processor. We deschedule it and schedule another to interleave them. Those are the scheduling points. For threads, it doesn’t have to, but the scheduler is allowed to interleave another thread anywhere, except where explicitly forbidden with a critical section expressed as a lock. With async/await, it’s the opposite. The interleaving happens nowhere, except where explicitly allowed. In some languages, you need to write await, in some languages it’s implied, but it’s still statically known. Personally, I think that the former, the thread style, is a better default because the correctness of concurrent code depends on assumptions around atomicity, when we can have interleavings. The first option allows code to say, this block must be atomic, regardless of what the other subroutines that I call do. The latter is the opposite. If you change a code unit from a subroutine to an async coroutine to allow it to block and interleave, it will affect assumptions made up the call stack, and you have to go up the call stack and make changes everywhere. You can’t do it blindly. You have to make sure whether or not there are implied assumptions about atomicity there. The assumptions are implied. Regardless of your personal preference, many languages like Java already have threads. Their scheduling points can already happen anywhere. Adding yet another construct with different rules would only complicate matters. Some languages don’t have threads, or rather, they just have one thread. By default, they don’t have interleaving anywhere at all. For example, JavaScript has just one thread. All existing code in JavaScript was written under the assumption that scheduling points happen nowhere. When JavaScript added a built-in convenient syntactic construct for concurrency, they naturally preferred a model where atomicity is the default, because otherwise, it would have broken too much existing code. This is why async/await makes sense in JavaScript rather than threads.

The second difference is in the implementation. If we want to implement threads in user mode, it requires reifying the state of the thread, so that picture I showed, the state of each subroutine. We need to represent the state of each subroutine. To do it efficiently requires integrating with the low-level representation of ordinary subroutines, the stack frames of the ordinary subroutines, and that requires access to the compiler’s backend. That is a very low-level implementation detail. On the other hand, async/await is a completely separate construct, and all that’s required to implement it efficiently is control over the compiler’s frontend. The frontend can choose to compile one of those units, not to a single subroutine, but to something else. Sometimes it’s maybe several subroutines, and sometimes it’s the opposite. One of them is how Kotlin does it. The other one is how Rust does it. Maybe you compile multiple async/await coroutines into a single subroutine, or sometimes you compile a single async/await into multiple subroutines. Some languages simply don’t have control over the backend, they don’t have access to that low-level representation. They simply have no choice. To get an efficient implementation, they must use async/await despite any shortcomings it may have. For example, Kotlin. Kotlin is a language that compiles to Java bytecode, and also other platforms. It’s just a compiler frontend. It has no control over the design of the Java Virtual Machine. It cannot expose or change the required internals. It also compiles to Android and to JavaScript to other backends that it has no control over, which is why async/await made sense for Kotlin. This is also somewhat of an issue for Rust, because Rust runs on top of LLVM, and WebAssembly. The situation there is a little bit different because they do have some impact in how those are designed.

Another difference has to do with managing the stack’s memory. If we have a thread stack, we have to keep it in memory, somehow. Ordinary subroutines can recurse, or they can do dynamic dispatch. That means that we don’t know statically at compile time how big of a stack we need. That means that thread stacks can either be very large, that’s how OS threads work, they’re just very large. That’s the reason why we don’t want OS threads to begin with. Or, they need to be dynamically resizable to be efficient, which means some kind of dynamic memory allocation. In some languages, usually, low-level languages like C and C++, or Rust, or Zig, memory allocations are very carefully controlled, and they’re also relatively expensive. Having a separate syntactic construct that could forbid recursion and dynamic dispatch may suit them better. That’s what they’ve done. If you’re in one of those, you can’t have recursion or dynamic dispatch.

Context-Switching

There’s another reason that’s sometimes cited as supporting the preference of low-level languages for syntactic coroutines, and that is the cost of context switches. I think that deserves a few more words. There are other reasons for coroutines besides concurrency. For example, suppose we’re writing a game engine, and we need to update many game entities every frame. From an algorithmic point of view, it’s more of a parallelism problem than concurrency, because we want to do it to reduce the latency of the entire processing things. The entities don’t come from the outside. Syntactically, we might choose to represent each of those entities as a sequential process. Another example is generators, shown here in Python. It’s a way to write iterators in an imperative way, by representing the iterator as a process that yields one value at a time to the consumer. We have an ordinary loop. The yield means I have another value that’s ready. When we yield, we deschedule the producer and schedule the consumer. This also isn’t quite our definition of concurrency, at least not the one I used. We don’t have many independent tasks competing over resources. We have exactly two, and they’re performing a very particular dance. Still, we could use the same syntactic constructs we use for concurrency. While, in practice, we normally make direct use of coroutines, and the consumer in this case is also the scheduler. The consumer says, now I want the producer to produce another value. We can think of them as two processes, a consumer and producer composed in parallel communicating over an unbuffered channel.

To understand the total impact of context switching on the generator use case, we mostly care about the latency of the total traversal over the data. We care about latency and not quite throughput. Just again a quick computation. If we take the processing time by both the consumer and producer to be zero, because they do some very trivial computation, then the context switch between them, however much it costs, will be 100% of the total time. Its impact is 100%. It is very worthwhile to reduce it. How fast can we make it? We notice that because we only have two processes, both of their states in memory can fit in the lowest level CPU cache. Switching from one to the other can be as quick as just changing a pointer in a register. A good optimizing compiler can optimize the two processes, because they’re both known at compile time, and inline them into a single subroutine, making the scheduling just a simple jump. The cost could be reduced to essentially zero as well. Having an entire mechanism of resizable stacks here is not very helpful.

When it comes to serving transactions in servers, the situation is completely different. First of all, we’re dealing with many processes. There aren’t any particularly great compiler optimizations available. We call them megamorphic call sites. The scheduler doesn’t know which of the threads it’s going to call statically, so we can’t inline them; we have to do dynamic dispatch. Their states can’t all fit in the CPU cache. Even the most efficient hypothetical computation of just changing a pointer will take us about 60 nanoseconds because we’ll have at least one cache miss. Because we have external events and I/O involved in servers, the impact of even an expensive context switch is actually not so high. A very expensive context switch of around 1 microsecond will only noticeably affect the throughput of a system where the I/O latency is very low, let’s say 20 microseconds for I/O. That’s very good. Even then, the impact is only 5%. Reducing the cost of the context switch from humongous to zero would give us a 5% throughput benefit on extremely efficient I/O systems. We can’t reduce it to zero, because like I said, we have at least one cache miss. On the one hand, the room for optimization is not large. On the other hand, the effect of optimization is low. In practice, we’re talking less than 3% difference between the most efficient hypothetical implementation of coroutines and even a mediocre implementation of threads. Because that limited C++ coroutine model has some significant downsides, since you can’t do recursion, the question then becomes which of those use cases is more important to you, because you will need to sacrifice one or the other, so you need to choose.

For Java, the choice is simple. We already have threads. We control the backend. Memory allocations are very efficient. They’re just pointer bumps. Scaling servers is more important to us than writing efficient iterators. It’s not just that the language and its basic composition and error-handling constructs are tied to threads; the runtime features such as thread-local storage, and perhaps most important of all, the observability tools, are all already designed around threads. The entire stack of the platform, from the language through the standard library, all the way to the Java Virtual Machine, is organized around threads. Unlike JavaScript, or Python, or C#, or C++, but like Go and Erlang, we’ve opted for lightweight user-mode threads. Ultimately, we’ve chosen to represent them with the existing java.lang.Thread class. We call them virtual threads to evoke an association with virtual memory, that is, an abstraction of a plentiful resource that is implemented by selectively mapping it onto some restricted physical resource.

Continuations

The way we’ve done it is as follows. More and more of the Java platform is being written in Java, and the VM itself is rather small; it’s about a quarter of the JDK. We split the implementation of virtual threads into two, a VM part and a standard library part. It’s easy to do because threads are actually the combination of two different constructs. The first is sometimes called coroutines; just to avoid confusion, I call them delimited continuations. That is the part that’s implemented in the Java Virtual Machine. A delimited continuation is basically a task that can suspend itself and later be resumed. Suspending the task means capturing its stack. The stack in memory is now represented by new kinds of Java objects, but it’s stored in the regular Java heap together with all the other Java objects. In fact, the interaction with the garbage collector proved an interesting technical challenge that I will talk about this summer at the JVM Language Summit. For various reasons, we’ve decided not to expose continuations as a public API. The class I showed does exist in the JDK, but you can’t use it externally unless you open it up. If they were exposed, this is how they would have worked. That’s the body of the continuation up there. It prints before, then it yields, and then prints after. Its external API is basically a runnable. You call run, and rather than it running to completion, it will run until the point where we call yield. Calling yield will cause run to return. When we call run again, instead of starting afresh, it will continue from the yield point and print after. Calling yield causes run to return, and calling run causes yield to return.
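
For readers who want to see roughly what the speaker is describing, here is a sketch using the JDK-internal jdk.internal.vm.Continuation class; it is deliberately not a public API, it only compiles and runs if you export that package explicitly (for example with --add-exports java.base/jdk.internal.vm=ALL-UNNAMED), and its exact shape may change between JDK versions:

// Sketch only: jdk.internal.vm.Continuation is internal and unsupported.
import jdk.internal.vm.Continuation;
import jdk.internal.vm.ContinuationScope;

public class ContinuationSketch {
    public static void main(String[] args) {
        ContinuationScope scope = new ContinuationScope("demo");
        Continuation cont = new Continuation(scope, () -> {
            System.out.println("before");
            Continuation.yield(scope);   // suspends: the pending run() call returns here
            System.out.println("after");
        });

        cont.run();   // prints "before", then returns at the yield point
        cont.run();   // resumes after the yield and prints "after"
    }
}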

Thread = Continuation + Scheduler

Threads are just delimited continuations attached to a scheduler. The scheduler part is implemented in the Java standard libraries. In fact, in the future, we will allow users to plug in their own schedulers. The way it works is that when a thread blocks, so in this code we have an example of a low-level detail of how the locks in the java.util.concurrent.locks package are implemented, and when they need to block they call a method called park. What park does is simple: at runtime, we look at the current thread. Both the OS threads, which we now call platform threads, and virtual threads are represented by the same Thread class. If it’s a platform thread, so an OS thread, we call the OS and ask it to block our thread; that’s the Unsafe part. If it is a virtual thread, we simply yield the current continuation. When that happens, our current task just returns. In fact, the scheduler doesn’t even need to know that it’s running continuations, it’s just running tasks, it just calls run. As far as the scheduler is concerned, that task has just completed. Now, when another thread, say the thread that already had the lock, wants to release it, it needs to wake up and unblock one of the threads waiting to acquire the lock. It calls unpark on that thread. All we do is, if that thread is virtual, we take its continuation and submit it to the thread scheduler as a new task to run. The scheduler just runs it as if it were an ordinary task; it will call run again. Rather than continue from the beginning, it will continue at that yield on the left-hand side, and park would return. It’s as if park just waited and then returned. Of course, there’s more machinery to manage thread identity. In this way, code in the JDK works transparently for both virtual threads and for these platform threads. It’s all the same construct, and the same classes just work. In fact, old code that’s already compiled will also work.
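
A heavily simplified, purely conceptual sketch of that dispatch follows; this is not the actual JDK source, and the printed messages merely stand in for the real yield and resubmit machinery:

// Conceptual sketch of the park/unpark dispatch described above; not real JDK code.
public class ParkSketch {
    static void park() {
        Thread t = Thread.currentThread();
        if (t.isVirtual()) {
            // illustrative: yield the current continuation so the carrier thread can run other tasks
            System.out.println("yield the continuation of " + t);
        } else {
            // illustrative: ask the operating system to block this platform thread
            System.out.println("OS-park " + t);
        }
    }

    static void unpark(Thread t) {
        if (t.isVirtual()) {
            // illustrative: hand the thread's continuation back to the scheduler as a new task
            System.out.println("resubmit the continuation of " + t);
        } else {
            // illustrative: ask the operating system to wake the platform thread
            System.out.println("OS-unpark " + t);
        }
    }

    public static void main(String[] args) {
        park();                                          // called on the main (platform) thread
        unpark(Thread.ofVirtual().unstarted(() -> {}));  // a virtual thread's continuation would be resubmitted
    }
}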

Replacing the Foundations

We’ve made similar changes to all areas in the JDK that block. A lot of it went into the legacy I/O packages, very old ones that used to be written in C. Our implementation of continuations cannot suspend native code, because native code can have pointers into the stack, so we can’t move the stack in memory. We reimplemented those packages in Java on top of the new I/O packages. We got rid of even more native code in the process. A very large portion of the work actually went into the tooling interfaces that support debuggers and profilers. Their APIs didn’t change, so that existing tools will be able to work. We expose virtual threads as if they were platform threads. They’re just Java threads. Of course, tools will have some challenges, because if they want to visualize threads, for example, they might need to visualize a million threads. We’ve raised the house, replaced the foundation, then gently put it back down with very minimal disruption. Like I said, even existing libraries that were compiled 10 years ago can just work with virtual threads. Of course, there are caveats but, for the most part, this works.

Current Challenges with Java Threads

As expected, the throughput offered by virtual threads is exactly as predicted by Little’s Law. We have servers that create more than 3 million new threads per second. That’s fine. This is intentionally left blank. As is often the case, most of the challenge, and it’s still ongoing, wasn’t technical, but rather social, and I’d say pedagogical. We thought that adding a major new concurrency feature was a very rare opportunity to introduce a brand-new core concurrency API to Java, and we also knew that the good old thread API had accumulated quite a bit of cruft over the years. We tried several approaches, but in the end, we found that most of the thread API is never used directly. Its cruft isn’t much of a hindrance to users. Some small part of it, especially the methods that deal with the current thread, is used a lot, pretty much everywhere. We also realized that implementing the existing thread API, so turning it into an abstraction with two different implementations, won’t add any runtime overhead. I also found that when talking about Java’s new user-mode threads back when this feature was in development, and back when we still called them fibers, every time I talked about them at conferences, I kept repeating myself and explaining that fibers are just like threads. After trying a few early access releases of the JDK with a fiber API, and then a thread API, we decided to go with the thread API.

Now we’re faced with the opposite challenge, that of something I call misleading familiarity. All the old APIs just work. Technically, there’s very little that users need to learn, but there is much to unlearn. Because the high cost of OS threads was taken for granted for so many years, it’s hard for programmers to tell the difference between designs that are good in themselves, or those that have just become a habit due to the high cost of threads. For example, threads used to always be pooled, and virtual threads make pooling completely unnecessary and even counterproductive. You must never pool virtual threads, so you’re working against muscle memory here. Also, because thread pools have become so ubiquitous, they’re used for other things, such as limiting concurrency. Say, you have some resource, let’s say a database, and you want to say, I only want to make up to 20 concurrent requests to that database, or microservice. The way people used to do it is to use a thread pool of size 20, just because that was a very familiar construct to them.

The obvious thing to do is to use a semaphore. It is a construct that is designed specifically for the purpose of limiting concurrency, unlike the thread pool, because a thread pool was not designed for limiting concurrency. A thread pool, like any pool, is designed for the purpose of pooling some expensive objects. It’s very hard for people to see that the two of them actually amount to the same structure in memory. They think of a thread pool as a queue of waiting tasks and then some n, let’s say 20 of them, making progress because they obtained a thread from the pool. If you have a semaphore initialized to 20, and every single one of your tasks is a virtual thread, 20 of them will obtain the semaphore and continue running, and the others will just block and wait. Where do they wait? In a queue. It’s the same queue. It’s just that every task is now represented by a thread. It’s hard for people to see that. We try to tell people that virtual threads are not fast threads but rather scalable threads, and that replacing some n number of platform threads with n virtual threads will yield little benefit, as we saw with Little’s Law. Instead, they should represent each task as a virtual thread.
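
A minimal sketch of that pattern with JDK 21 APIs; the limit of 20 and the simulated database call are illustrative:

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;

public class LimitedConcurrency {
    // At most 20 tasks talk to the database at once; the rest block on the semaphore.
    private static final Semaphore DB_PERMITS = new Semaphore(20);

    static String queryDatabase(int id) throws InterruptedException {
        DB_PERMITS.acquire();
        try {
            Thread.sleep(50);   // stands in for the actual database call
            return "result-" + id;
        } finally {
            DB_PERMITS.release();
        }
    }

    public static void main(String[] args) {
        // One virtual thread per task: no pooling of threads, only limiting of the shared resource.
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < 10_000; i++) {
                int id = i;
                executor.submit(() -> queryDatabase(id));
            }
        }   // close() waits for all submitted tasks to finish
    }
}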

It’s hard to get people to make the switch from thinking about threads as some resource that requires managing, to think of them as business objects. I saw a question on social media, how many virtual threads should an application have? That’s a question that shows people still think of threads as a resource. They wouldn’t ask, how many strings should be used when storing a list of usernames? As many as there are usernames. The answer regarding virtual threads is similar. The number of virtual threads will always be exactly as many as there are concurrent tasks for the server to do. Those are our current challenges, because Java will probably be very popular for at least another two decades. There’s plenty of time to learn, or rather, unlearn.

Garbage Collection

I’d like to briefly talk about two other significant areas of the platform that have seen some major innovation the past few years. First, garbage collection. There has been a significant reduction in memory footprint over the past releases of the G1 garbage collector, which has been the default collector since JDK 9. The most notable change has been the introduction of a new garbage collector called ZGC. It is a low-latency garbage collector that was added a couple of years ago in JDK 15. This is a logarithmic scale graph. As you can see on the right, ZGC offers sub-millisecond pause times, worst case, for heaps up to 16 terabytes. The way it does it is that ZGC is completely concurrent, both marking and compacting; all the phases of the collection are done concurrently with the application, and nothing is done inside stop-the-world pauses. Not only that, gone are the days of GC tuning. I think many people would find that they need to change their consultancy services. ZGC requires only setting the maximal heap size, and optionally the initial heap size. Here, you can see the p99 and p99.9 latencies of an Oracle Cloud Service. I think it’s easy to see where they switched to ZGC. Modern Java applications will not see pauses for more than 100 microseconds. This is the latency for the service, not the GC; they have other things. That level of jitter is something experienced by C programs as well. The only case where you won’t see it is if you have a real-time OS. If you’re not running with a real-time kernel, you will see that level of jitter even if you write your application in C. ZGC had some tradeoffs: it was non-generational, which meant that the GC had a relatively high CPU usage. You had to pay for it, not so much in latency or throughput (it had a bit of impact on throughput), but you had to dedicate some of the CPU to it. Also, applications with very high allocation rates would sometimes see throttling. In JDK 21, which is to be released this September, we’ll have a generational ZGC that can cope with a much higher allocation rate, has a lower CPU usage, and still doesn’t require any tuning. The sufficiently good GC that I think was promised to us 50 years ago is here. Java users already use it. Still, we have some ideas for more specialized GCs in the future.
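
For reference, the configuration described really does come down to choosing the collector and the heap size; a hedged example of launcher flags as of JDK 21 (the application jar name is illustrative):

java -XX:+UseZGC -XX:+ZGenerational -Xms8g -Xmx8g -jar my-service.jar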

JDK Flight Recorder

The other capability I want to talk about is called JFR, or JDK Flight Recorder. You can think of it as eBPF, only easy to use, and prepackaged with lots of events emitted by the JDK itself. JFR writes events into thread-local in-memory buffers that are then copied to a recording file on disk. They can then be analyzed offline or consumed online, either on the same machine or remotely. It is essentially a high-performance structured logger, but because the overhead is so low compared to some popular Java loggers, it’s used for in-production, continuous profiling and/or monitoring. Events can have associated stack traces. Having big stack traces can add to the overhead. That overhead with big stack traces is going to be reduced by a factor of 10 this year.

There are over 150 built-in events produced by all layers of the JDK; they are pre-built and already there when you get the JDK, and the number is growing. Of course, libraries and applications can easily add their own. There’s a new feature in JDK 21 called JFR views, which are queries, like SQL queries, over this time-series database. You can get views for a running program or from a recording file. JFR is the future of Java observability. It is one of my personal favorite features in Java. Even though I didn’t go into the details of JIT compilers, I think that with these state-of-the-art GCs and low-overhead profiling, you can see why the combination of performance and observability makes Java so popular for important server-side software these days.
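
As an illustration of how a library or application adds its own event, here is a minimal sketch using the supported jdk.jfr API; the event name and fields are made up for the example:

import jdk.jfr.Event;
import jdk.jfr.Label;
import jdk.jfr.Name;

public class JfrSketch {
    @Name("com.example.OrderProcessed")   // hypothetical event name
    @Label("Order Processed")
    static class OrderProcessed extends Event {
        @Label("Order Id")
        long orderId;
        @Label("Amount")
        double amount;
    }

    public static void main(String[] args) {
        OrderProcessed event = new OrderProcessed();
        event.begin();                // start timing the operation
        event.orderId = 42;           // fill in the event fields
        event.amount = 99.95;
        event.end();
        if (event.shouldCommit()) {   // cheap check before writing to the thread-local buffer
            event.commit();
        }
    }
}

A recording can be started with, for example, -XX:StartFlightRecording=filename=recording.jfr on the command line.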

The Effect of Optimization

When you talked about compared to the best hypothetical? No, I said between the mediocre implementation and the best hypothetical one, the difference is like 3%. Virtual threads are not a mediocre implementation, but let’s say it’s 1%. You wouldn’t notice it in an IAM application.

Questions and Answers

Participant 1: Do you also agree that Loom is going to kill reactive programming? Brian Goetz made the assessment.

Pressler: Will virtual threads kill reactive programming? We’ll wait and see. Those who like reactive programming can continue using it.

Participant 2: [inaudible 00:46:57]

Pressler: Most libraries just work with virtual threads because virtual threads are threads. Web server containers and web server frameworks have started to adopt them. In fact, I think by now all of them have, including Tomcat and Spring. There is only one that has been built or rewritten from the ground up around virtual threads, so it has better performance for the moment; it’s called Helidon. Its scaling is as good as the hardware allows. It reaches the hypothetical limit that’s predicted by Little’s Law.

See more presentations with transcripts



It’s 2024. Why Does PostgreSQL Still Dominate? – I Programmer

MMS Founder
MMS RSS

Posted on nosqlgooglealerts. Visit nosqlgooglealerts

PostgreSQL has recently claimed the DB-Engines DBMS of the Year for 2023 award. Another confirmation of PostgreSQL’s worth.


Despite its age, PostgreSQL isn’t declining in popularity. On the contrary, it is always out in front, as DB-Engines’ official announcement confirms:

PostgreSQL is the database management system that gained more popularity in our DB-Engines Ranking within the last year than any of the other 417 monitored systems. We thus declare PostgreSQL as the DBMS of the Year 2023.

But, why?

What are the metrics, and how is that ranking calculated? DB-Engines ranks products by their current popularity according to the following methodology:

  • Number of mentions of the system on websites, measured as the number of results in search engine queries. At the moment, we use Google and Bing for this measurement. In order to count only relevant results, we are searching for the system name together with the term database, e.g. “Oracle” and “database”.
  • General interest in the system. For this measurement, we use the frequency of searches in Google Trends.
  • Frequency of technical discussions about the system. We use the number of related questions and the number of interested users on the well-known IT-related Q&A sites Stack Overflow and DBA Stack Exchange.
  • Number of job offers, in which the system is mentioned. We use the number of offers on the leading job search engines Indeed and Simply Hired.
  • Number of profiles in professional networks, in which the system is mentioned. We use the internationally most popular professional network LinkedIn.
  • Relevance in social networks. We count the number of Twitter tweets, in which the system is mentioned.
  • The criterion for becoming DBMS of the Year is having the largest increase in popularity between successive Januarys.

This is the fourth time PostgreSQL has received the DB-Engines award; 2017, 2018 and 2020 were the three previous years.
Who would have thought back in the '80s that the humble Ingres fork would become one of the most, if not the most, successful DBMSs of all time?

There are a few reasons behind its success. The first is that it is truly open source at heart, and therefore embraced by a strong and vibrant community. Postgres became open source, and thus open to contributions, once it escaped the confines of the Berkeley laboratory. It was this property that in the end let it evolve into a melting pot of the newest and greatest ideas.

Then there are the visions of pioneer Mike Stonebraker, which formed the basis of the marvel that followed. Some of these ideas, described in The Enduring Influence Of Postgres, were:

Supporting ADTs in a Database System
At the core of the Object-Relational database notion was support for ADTs, or Abstract Data Types, that went beyond the traditional ones handled by the database. These were complex objects or data that had to be stored as nested bundles, in stark contrast to the relational model's classical flattening of data to remove duplication. This began as an attempt to cater for the needs of CAD applications, which use data types such as polygons, rectangles or even fully blown objects such as circuit layout engines. The shape this takes today is JSON, JSONB or XML.

Extensible access methods for new data types
Yet another innovation was the B-Tree indexes that everyone is familiar with today, as well as the R-Tree indexes, which allowed for running two-dimensional range queries on data.

Active Databases and Rule Systems
Rules, or triggers, pioneered under Ingres were yet another construct popularized by Postgres that found its way into all the major database engines.

Log-centric Storage and Recovery
Not fond of write-ahead logging schemes, Stonebraker “unified the primary storage and historical logging into a single, simple disk-based representation”.

Support for Multiprocessors: XPRS
Sharing memory and processors to support parallel query optimization was yet another Postgres-induced novelty.

Support for a Variety of Language Models
The impedance mismatch between Object-Oriented programming languages and the declarative relational model was, and still is, one of the hottest problems of the computer industry. Instead of getting sucked into this never-ending debate, Stonebraker introduced the Object-Relational Database, totally sidestepping Object-Oriented Databases.

These were the principles PostgreSQL was built upon. However, the PostgreSQL core devs did not rest; they continued innovating and kept on adding. For instance:

  • Not just hash and B-Tree indexes; PostgreSQL has many of them, like GIN and GiST for full-text search and geospatial scenarios, SP-GiST, RUM, BRIN and Bloom.
  • Storing and querying JSON directly in the database, or JSONB if you need to index it by using a GIN index (a brief sketch follows this list).
  • Extending the core engine with PostGIS, which turns Postgres into a geospatial database, together with new datatypes and operators, handing PostgreSQL the edge over multi-million-dollar commercial counterparts.
  • Pub/sub support inside the database? Yes, PostgreSQL can do that too, with the LISTEN/NOTIFY commands.
  • User Defined Functions in programming languages like Python with PL/Python.
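
To make two of these concrete, here is a minimal sketch using the psycopg2 driver; the table, channel and connection string are made up for illustration, while the SQL itself (a JSONB column with a GIN index, plus LISTEN/NOTIFY) is standard PostgreSQL:

    import json
    import psycopg2

    conn = psycopg2.connect("dbname=demo user=demo")  # hypothetical connection string
    conn.autocommit = True
    cur = conn.cursor()

    # A JSONB column indexed with GIN supports fast containment (@>) queries.
    cur.execute("CREATE TABLE IF NOT EXISTS events (id serial PRIMARY KEY, payload jsonb)")
    cur.execute("CREATE INDEX IF NOT EXISTS events_payload_gin ON events USING gin (payload)")
    cur.execute("INSERT INTO events (payload) VALUES (%s::jsonb)",
                [json.dumps({"type": "signup", "plan": "pro"})])
    cur.execute("SELECT id, payload FROM events WHERE payload @> %s::jsonb",
                [json.dumps({"type": "signup"})])
    print(cur.fetchall())

    # Lightweight pub/sub with LISTEN/NOTIFY, on one connection just to show the
    # round trip; normally another session issues the NOTIFY.
    cur.execute("LISTEN demo_channel")
    cur.execute("NOTIFY demo_channel, 'hello'")
    conn.poll()
    while conn.notifies:
        note = conn.notifies.pop(0)
        print("notification on", note.channel, "->", note.payload)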

Years down the line, PostgreSQL 9.2 introduced yet another type, Range, which represents a range of values of some element type; version 14 went one step further by introducing 'multirange' types, which allow for non-contiguous ranges, helping developers write simpler queries for complex sequences, like specifying the ranges of time a meeting room is booked through the day.
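
A hedged sketch of that booking example, assuming PostgreSQL 14 or later (where tstzmultirange exists); the table and times are invented:

    import psycopg2

    conn = psycopg2.connect("dbname=demo user=demo")  # hypothetical connection string
    conn.autocommit = True
    cur = conn.cursor()

    # One row per room, with all booked slots for the day held in a single multirange.
    cur.execute("""
        CREATE TABLE IF NOT EXISTS room_bookings (
            room   text PRIMARY KEY,
            booked tstzmultirange
        )
    """)
    cur.execute("""
        INSERT INTO room_bookings VALUES (
            'blue-room',
            tstzmultirange(tstzrange('2024-01-08 09:00+00', '2024-01-08 10:30+00'),
                           tstzrange('2024-01-08 13:00+00', '2024-01-08 14:00+00'))
        )
        ON CONFLICT (room) DO UPDATE SET booked = EXCLUDED.booked
    """)

    # Does a proposed 10:00-11:00 meeting overlap any existing, non-contiguous booking?
    cur.execute("""
        SELECT booked && tstzmultirange(tstzrange('2024-01-08 10:00+00', '2024-01-08 11:00+00'))
        FROM room_bookings WHERE room = 'blue-room'
    """)
    print(cur.fetchone()[0])  # True: it collides with the 09:00-10:30 slot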

Then:

  • Enhanced monitoring and observability, since you can track the progress of all WAL activity
  • Functions for using regular expressions to inspect strings: regexp_count(), regexp_instr(), regexp_like(), and regexp_substr().

And so on, up to PostgreSQL version 16, which was released three months ago, heralding:

  • query performance improvements with more parallelism
  • developer experience enhancements
  • monitoring of I/O stats using pg_stat_io view
  • enhanced security features
  • improved vacuum process

But this is just the core. The rest is plugins; extensibility has to be its killer feature, as there's an extension for just about everything. This is because PostgreSQL has been built with that kind of philosophy in mind:

With extensibility as an architectural core, it is possible to be creative and stop worrying so much about discipline: you can try many extensions and let the strong succeed. Done well, the “second system” is not doomed; it benefits from the confidence, pet projects, and ambitions developed during the first system.

As a matter of fact, here at I Programmer we have been closely monitoring the PostgreSQL extension landscape, covering many new extensions that take the experience to a whole new level:

  • Distributed database? No problem. Oracle Database and IBM Db2 both provide a shared-nothing architecture (each node in the cluster has its own compute and storage) as separate features, and with Citus, PostgreSQL can do so too. Along with this, you can also scale out horizontally by adding more servers to a Citus cluster and rebalance data from the existing servers to the new ones without downtime.
  • pg_ivm
    an extension module for PostgreSQL 14 that provides an Incremental View Maintenance (IVM) feature. That means that materialized views are updated immediately after a base table is modified.
  • pgsqlite
    a pure Python module and command-line tool that makes it simple to import a SQLite database into Postgres, saving a ton of time and hassle in the process.
  • pg_later – Native Asynchronous Queries Within Postgres
    an interesting project and extension built by Tembo which enables Postgres to execute queries asynchronously. Fire your query – but don’t forget to check later for the result.
  • Hydra
    Hydra is an open-source extension that adds columnar tables to Postgres for efficient analytical reporting.
  • pg_vector
    an extension for PostgreSQL that renders it a viable alternative to specialized vector stores used in LLMs.
  • pg_bm25
    which bakes Elasticsearch FTS capabilities into PostgreSQL itself. It is true that full-text search was probably the only thing not good enough in Postgres.
  • PeerDB
    an ETL/ELT tool built for PostgreSQL that makes tasks that require streaming data from PostgreSQL to third-party systems as effortless as it gets.

PostgreSQL has also been an attractive starting point for building commercial database systems, given its permissive open source license, its robust codebase, its flexibility and its breadth of functionality. As a matter of fact, many such forks occupy a place in the DB-Engines Ranking, such as:

  • Greenplum
  • TimescaleDB
  • YugabyteDB
  • EDB Postgres

As such it’s no wonder that big corps jumped on the bandwagon in offering managed PostgreSQL instances, such as Amazon Web Services, Microsoft Azure or Google Cloud. Or offering managed adaptations of such as:

  • Azure CosmosDB
    Microsoft’s mutli-model distributed database for supporting workloads at scale, which extendes beyond NoSQL by adding support for PostgreSQL as well.
  • Amazon Aurora
    Amazon Aurora PostgreSQL is a fully managed, PostgreSQL-compatible, and ACID-compliant relational database engine.
  • Amazon RDS for PostgreSQL, which supports Trusted Language Extensions (TLE) for PostgreSQL, so that you can build high-performance extensions and safely run them on Amazon RDS using popular trusted languages, without needing AWS to certify the code.

That last part is a boon for developers working with the cloud offerings. In "Trusted Language Extensions Bring PostgreSQL Procedural UDFs To The Cloud" I explain:

PostgreSQL allows user defined functions to be written in other languages besides SQL and C. These other languages are generically called procedural languages (PLs). Procedural languages aren’t built into the PostgreSQL server; they are offered by loadable modules. That way you can extend your database with powerful features not found in SQL. For instance you can write a PL/Perl procedure to accept a string from your SQL to apply regular expressions to it in order to tokenize it.

This of course comes with security issues when the database invokes code like file system operations, or uses statements that could interact with the operating system or database server process. For that, Trusted Language Extensions (TLE) came about. They are PostgreSQL extensions that you can safely run on your DB instance. Trusted Language Extensions do not provide access to the filesystem and are designed to prevent access to unsafe resources, as their runtime environment limits the impact of any extension defect to a single database connection. TLE also gives database administrators fine-grained control over who can install extensions, and provides a permissions model for running them.
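
To illustrate the procedural-language idea from the quoted passage, here is a minimal sketch of a user-defined function written in PL/Python rather than PL/Perl. It assumes the plpython3u extension is available and that you have the privileges to create it (on managed services, this is exactly the gap TLE is meant to close); the function and query are invented for illustration:

    import psycopg2

    conn = psycopg2.connect("dbname=demo user=demo")  # hypothetical connection string
    conn.autocommit = True
    cur = conn.cursor()

    # plpython3u is an "untrusted" language, so enabling it normally needs superuser rights.
    cur.execute("CREATE EXTENSION IF NOT EXISTS plpython3u")

    # A UDF written in Python instead of SQL or C: tokenize a string with a regular expression.
    cur.execute("""
        CREATE OR REPLACE FUNCTION tokenize(txt text) RETURNS text[] AS $$
            import re
            return re.findall(r"[A-Za-z']+", txt)
        $$ LANGUAGE plpython3u
    """)

    cur.execute("SELECT tokenize('PostgreSQL''s extensibility is its killer feature')")
    print(cur.fetchone()[0])  # ["PostgreSQL's", 'extensibility', 'is', 'its', 'killer', 'feature']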

To take it a step further, DbDev, the package manager for PostgreSQL TLEs from Supabase, now supports installing them in your PostgreSQL instance, much like NPM does for JavaScript packages. Basically, instead of fiddling around trying to install your extension yourself, DbDev streamlines the process.

So there you have it. This is why PostgreSQL is the most popular database, and that's just barely scratching the surface. There's a lot more beneath the tip of the iceberg and a lot yet to come. 2024 looks very exciting.


More Information

PostgreSQL is the DBMS of the Year 2023

Related Articles

The Enduring Influence Of Postgres

PostgreSQL Is DB-Engines DBMS of the Year For 2020

 



Forsta AP Fonden Sells 6,400 Shares of MongoDB, Inc. (NASDAQ:MDB) – MarketBeat

MMS Founder
MMS RSS

Posted on mongodb google news. Visit mongodb google news


Forsta AP Fonden lessened its position in shares of MongoDB, Inc. (NASDAQ:MDB) by 30.6% during the third quarter, according to its most recent filing with the Securities and Exchange Commission. The firm owned 14,500 shares of the company’s stock after selling 6,400 shares during the period. Forsta AP Fonden’s holdings in MongoDB were worth $5,015,000 as of its most recent filing with the Securities and Exchange Commission.

A number of other hedge funds and other institutional investors have also bought and sold shares of MDB. Raymond James & Associates grew its stake in MongoDB by 32.0% in the 1st quarter. Raymond James & Associates now owns 4,922 shares of the company’s stock worth $2,183,000 after buying an additional 1,192 shares in the last quarter. PNC Financial Services Group Inc. grew its stake in MongoDB by 19.1% in the 1st quarter. PNC Financial Services Group Inc. now owns 1,282 shares of the company’s stock worth $569,000 after buying an additional 206 shares in the last quarter. MetLife Investment Management LLC purchased a new position in MongoDB in the 1st quarter worth approximately $1,823,000. Panagora Asset Management Inc. grew its stake in MongoDB by 9.8% in the 1st quarter. Panagora Asset Management Inc. now owns 1,977 shares of the company’s stock worth $877,000 after buying an additional 176 shares in the last quarter. Finally, Vontobel Holding Ltd. grew its stake in MongoDB by 100.3% in the 1st quarter. Vontobel Holding Ltd. now owns 2,873 shares of the company’s stock worth $1,236,000 after buying an additional 1,439 shares in the last quarter. Institutional investors own 88.89% of the company’s stock.

Insider Buying and Selling

In related news, Director Dwight A. Merriman sold 2,000 shares of the stock in a transaction dated Tuesday, October 10th. The shares were sold at an average price of $365.00, for a total transaction of $730,000.00. Following the sale, the director now directly owns 1,195,159 shares of the company’s stock, valued at approximately $436,233,035. The sale was disclosed in a legal filing with the Securities & Exchange Commission, which is accessible through the SEC website. Also, CEO Dev Ittycheria sold 100,500 shares of the stock in a transaction dated Tuesday, November 7th. The shares were sold at an average price of $375.00, for a total transaction of $37,687,500.00. Following the sale, the chief executive officer now directly owns 214,177 shares of the company’s stock, valued at approximately $80,316,375. That sale was also disclosed in a legal filing with the Securities & Exchange Commission. Insiders sold a total of 149,029 shares of company stock worth $57,034,511 in the last three months. Corporate insiders own 4.80% of the stock.

Analysts Set New Price Targets

MDB has been the subject of a number of research reports. Bank of America initiated coverage on shares of MongoDB in a research note on Thursday, October 12th. They issued a “buy” rating and a $450.00 price objective for the company. Scotiabank initiated coverage on shares of MongoDB in a research note on Tuesday, October 10th. They issued a “sector perform” rating and a $335.00 price objective for the company. Stifel Nicolaus reaffirmed a “buy” rating and issued a $450.00 price objective on shares of MongoDB in a research note on Monday, December 4th. TheStreet raised shares of MongoDB from a “d+” rating to a “c-” rating in a research note on Friday, December 1st. Finally, Barclays upped their price objective on shares of MongoDB from $470.00 to $478.00 and gave the company an “overweight” rating in a research note on Wednesday, December 6th. One analyst has rated the stock with a sell rating, three have assigned a hold rating and twenty-one have given a buy rating to the company’s stock. According to data from MarketBeat, the stock currently has a consensus rating of “Moderate Buy” and a consensus price target of $430.41.

Get Our Latest Stock Report on MDB

MongoDB Price Performance

Shares of MongoDB stock opened at $365.39 on Monday. The company has a current ratio of 4.74, a quick ratio of 4.74 and a debt-to-equity ratio of 1.18. The company has a 50-day moving average of $392.38 and a 200 day moving average of $380.88. MongoDB, Inc. has a 1 year low of $164.59 and a 1 year high of $442.84.

MongoDB (NASDAQ:MDB) last issued its quarterly earnings results on Tuesday, December 5th. The company reported $0.96 earnings per share (EPS) for the quarter, topping the consensus estimate of $0.51 by $0.45. The company had revenue of $432.94 million during the quarter, compared to analyst estimates of $406.33 million. MongoDB had a negative net margin of 11.70% and a negative return on equity of 20.64%. MongoDB’s quarterly revenue was up 29.8% compared to the same quarter last year. During the same period last year, the business earned ($1.23) EPS. As a group, analysts anticipate that MongoDB, Inc. will post -1.64 EPS for the current fiscal year.

About MongoDB

(Free Report)

MongoDB, Inc. provides a general-purpose database platform worldwide. The company offers MongoDB Atlas, a hosted multi-cloud database-as-a-service solution; MongoDB Enterprise Advanced, a commercial database server for enterprise customers to run in the cloud, on-premises, or in a hybrid environment; and Community Server, a free-to-download version of its database, which includes the functionality that developers need to get started with MongoDB.

Further Reading

Want to see what other hedge funds are holding MDB? Visit HoldingsChannel.com to get the latest 13F filings and insider trades for MongoDB, Inc. (NASDAQ:MDB).

Institutional Ownership by Quarter for MongoDB (NASDAQ:MDB)

This instant news alert was generated by narrative science technology and financial data from MarketBeat in order to provide readers with the fastest and most accurate reporting. This story was reviewed by MarketBeat’s editorial team prior to publication. Please send any questions or comments about this story to contact@marketbeat.com.


Article originally posted on mongodb google news. Visit mongodb google news



MongoDB vs. ScyllaDB: Performance, Scalability and Cost – The New Stack

MMS Founder
MMS RSS

Posted on mongodb google news. Visit mongodb google news


We performed an in-depth benchmarking study comparing the two databases, applying more than 133 measurements. Here are the results.

We previously compared the technical characteristics of two important NoSQL databases: the market-leading general-purpose NoSQL database MongoDB, and its performance-oriented challenger ScyllaDB. Both MongoDB and ScyllaDB promise a highly available, performant and scalable architecture. But the way they achieve these objectives is much more different than you might think at first glance.

To quantify the performance impact of these architectural differences, we performed an in-depth benchmarking study that produced more than 133 performance and scalability measurements. This article shares the high-level results.

TL;DR: ScyllaDB is best suited for applications that operate on data sets in the terabyte range and that require high (over 50 kOps) throughput while providing predictable low latency for read and write operations.

About This Benchmark

The NoSQL database landscape is continuously evolving. Over the past 15 years, it has already introduced many options and trade-offs when it comes to selecting a high-performance and scalable NoSQL database. We recently benchmarked MongoDB versus ScyllaDB to get a detailed picture of their performance, price performance and scalability capabilities under different workloads.

For creating the workloads, we used the Yahoo! Cloud Serving Benchmark (YCSB), an open source and industry-standard benchmark tool. Database benchmarking is often said to be nontransparent and to compare apples to pears. To address these challenges, this benchmark comparison was based on benchANT’s scientifically proven Benchmarking as a Service platform. The platform ensures a reproducible benchmark process (for more details, see the associated research papers on Mowgli and benchANT), which follows established guidelines for database benchmarking.

This benchmarking project was conducted by benchANT and sponsored by ScyllaDB to provide a fair, transparent and reproducible comparison of both database technologies. For this purpose, all benchmarks were carried out on the database vendors’ DBaaS offers, namely MongoDB Atlas and ScyllaDB Cloud, to ensure a comparable production-ready database deployment. Further, the applied benchmarking tool was the standard YCSB benchmark and all applied configuration options are exposed.

The DBaaS clusters ranged from three to 18 nodes, classified in three scaling sizes that are comparably priced. The benchmarking study comprised three workload types that cover read-heavy, read-update and write-heavy application domains with data set sizes from 250 GB to 10 TB. We compared a total of 133 performance measurements that range from throughput (per cost) to latencies to scalability. ScyllaDB outperformed MongoDB in 132 of 133 measurements:

  • For all the applied workloads, ScyllaDB provides higher throughput (up to 20 times) results compared to MongoDB.
  • ScyllaDB achieves P99 latencies below 10 milliseconds for insert, read and write operations for almost all scenarios. In contrast, MongoDB achieves P99 latencies below 10 ms only for certain read operations, while the MongoDB insert and update latencies are up to 68 times higher compared to ScyllaDB.
  • ScyllaDB achieves up to near-linear scalability, while MongoDB shows less efficient horizontal scalability.
  • The price-performance ratio clearly shows the strong advantage of ScyllaDB with up to 19 times better price-performance ratio depending on the workload and data set size.

To ensure full transparency and also reproducibility of the presented results, all benchmark results are publicly available on GitHub. This data contains the raw performance measurements, as well as additional metadata such as DBaaS instance details and VM details for running the YCSB instances. You can reproduce the results on your own even without the benchANT platform.

MongoDB vs. ScyllaDB Benchmark Results Overview

The complete benchmark covers three workloads: social, caching and sensor.

  • The social workload is based on the YCSB Workload B. It creates a read-heavy workload, with 95% read operations and 5% update operations. We use two shapes of this workload, which differ in terms of the request distribution patterns, namely uniform and hotspot distribution. These workloads are executed against the small database scaling size with a data set of 500 GB and against the medium scaling size with a data set of 1 TB.
  • The caching workload is based on the YCSB Workload A. It creates a read-update workload, with 50% read operations and 50% update operations. The workload is executed in two versions, which differ in terms of the request distribution patterns, namely uniform and hotspot distribution. This workload is executed against the small database scaling size with a data set of 500 GB, the medium scaling size with a data set of 1 TB and a large scaling size with a data set of 10 TB.
  • The sensor workload is based on the YCSB and its default data model but with an operation distribution of 90% insert operations and 10% read operations that simulate a real-world Internet of Things (IoT) application. The workload is executed with the latest request distribution patterns. This workload is executed against the small database scaling size with a data set of 250 GB and against the medium scaling size with a data set of 500 GB.

The following summary sections capture key insights into how MongoDB and ScyllaDB compare across different workloads and database cluster sizes. A detailed description of results for all workloads and configurations is provided in the extended benchmark report.

Performance Comparison Summary: MongoDB vs. ScyllaDB

For the social workload, ScyllaDB outperforms MongoDB with higher throughput and lower latency for all measured configurations of the social workload.

  • ScyllaDB provides up to 12 times higher throughput.
  • ScyllaDB provides significantly lower (down to 47 times) update latencies compared to MongoDB.
  • ScyllaDB provides lower read latencies, down to five times.

For the caching workload, ScyllaDB outperforms MongoDB with higher throughput and lower latency for all measured configurations of the caching workload.

  • Even a small three-node ScyllaDB cluster performs better than a large 18-node MongoDB cluster.
  • ScyllaDB provides constantly higher throughput that increases with growing data sizes to up to 20 times.
  • ScyllaDB provides significantly better update latencies (down to 68 times) compared to MongoDB.
  • ScyllaDB read latencies are also lower for all scaling sizes and request distributions, down to 2.8 times.

For the sensor workload, ScyllaDB outperforms MongoDB with higher throughput and lower latency results for the sensor workload except for the read latency in the small scaling size.

  • ScyllaDB provides constantly higher throughput that increases with growing data sizes, up to 19 times.
  • ScyllaDB provides lower (down to 20 times) update latency results compared to MongoDB.
  • MongoDB provides lower read latency for the small scaling size, but ScyllaDB provides lower read latencies for the medium scaling size.

Scalability Comparison Summary: MongoDB vs. ScyllaDB

For the social workload, ScyllaDB achieves near-linear scalability with a throughput scalability of 386% (of the theoretically possible 400%). MongoDB achieves a scaling factor of 420% (of the theoretically possible 600%) for the uniform distribution and 342% (of the theoretically possible 600%) for the hotspot distribution.

For the caching workload, ScyllaDB achieves near-linear scalability across the tests. MongoDB achieves 340% of the theoretically possible 600%, and 900% of the theoretically possible 2400%.


For the sensor workload, ScyllaDB achieves near-linear scalability with a throughput scalability of 393% of the theoretically possible 400%. MongoDB achieves a throughput scalability factor of 262% out of the theoretically possible 600%.
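
To make the percentages above concrete, here is a small sketch of how such a scaling factor can be derived; the numbers are illustrative, and it assumes the "theoretically possible" figure is simply the factor by which cluster resources grow, expressed as a percentage:

    def scaling_factor(base_throughput: float, scaled_throughput: float) -> float:
        """Measured throughput of the larger cluster as a percentage of the baseline."""
        return 100.0 * scaled_throughput / base_throughput

    def scaling_efficiency(base_throughput: float, scaled_throughput: float,
                           resource_multiple: float) -> float:
        """Share of the theoretical (linear) gain that was actually achieved."""
        theoretical = 100.0 * resource_multiple  # e.g. 4x the resources -> 400%
        return scaling_factor(base_throughput, scaled_throughput) / theoretical

    # Illustrative numbers only: 4x the resources delivering 3.86x the throughput
    # corresponds to 386% of a theoretically possible 400%, i.e. roughly 97% efficiency.
    print(scaling_factor(50_000, 193_000))         # ~386.0
    print(scaling_efficiency(50_000, 193_000, 4))  # ~0.965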


Price-Performance Results Summary: MongoDB vs. ScyllaDB

For the social workload, ScyllaDB provides five times more operations/dollars compared to MongoDB Atlas for the small scaling size and 5.7 times more operations/dollars for the medium scaling size. For the hotspot distribution, ScyllaDB provides nine times more operations/dollars for the small scaling size and 12.7 times more for the medium scaling size.

For the caching workload, ScyllaDB provides 12 to 16 times more operations/dollars compared to MongoDB Atlas for the small scaling size, and 18-20 times more operations/dollars for the medium and large scaling sizes.

For the sensor workload, ScyllaDB provides 6 to 11 times more operations/dollar compared to MongoDB Atlas. In both the caching and sensor workloads, MongoDB is able to scale its throughput with growing instance/cluster sizes, but the resulting operations/dollar decrease.
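
The report's exact price-performance formula is not spelled out here, but one plausible way to turn a throughput measurement and a DBaaS price into an operations-per-dollar figure looks like the sketch below; the hourly-price basis and the numbers are assumptions for illustration:

    def ops_per_dollar(throughput_ops_per_s: float, cluster_price_per_hour: float) -> float:
        """Sustained operations bought per dollar, assuming an hourly cluster price."""
        return throughput_ops_per_s * 3600 / cluster_price_per_hour

    # Invented numbers, not taken from the benchmark report.
    a = ops_per_dollar(throughput_ops_per_s=500_000, cluster_price_per_hour=20.0)
    b = ops_per_dollar(throughput_ops_per_s=40_000, cluster_price_per_hour=18.0)
    print(f"{a:,.0f} vs {b:,.0f} ops per dollar, a {a / b:.1f}x difference")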

Technical Nugget: Caching Workload, 12-Hour Run

In addition to the default 30-minute benchmark run, we also selected the large scaling size with the uniform distribution for a long-running benchmark of 12 hours.

For MongoDB, we selected the determined eight YCSB instances with 100 threads per YCSB instance and ran the caching workload in uniform distribution for 12 hours with a target throughput of 40 kOps/second.

The throughput results show that MongoDB provides the 40 kOps/s constantly over time as expected.

The P99 read latencies over the 12 hours show some peaks that reach 20 ms and 30 ms, and an increase in spikes after four hours of runtime. On average, the P99 read latency for the 12-hour run is 8.7 ms; for the regular 30-minute run, it is 5.7 ms.

The P99 update latencies over the 12 hours show a spiky pattern over the entire 12 hours, with peak latencies of 400 ms. On average, the P99 update latency for the 12-hour run is 163.8 ms, while for the regular 30-minute run it is 35.7 ms.


For ScyllaDB, we selected the determined 16 YCSB instances with 200 threads per YCSB instance and ran the caching workload in uniform distribution for 12 hours with a target throughput of 500 kOps/s.

The throughput results show that ScyllaDB provides the 500 kOps/s constantly over time as expected.


The P99 read latencies over the 12 hours stay constantly below 10 ms, except for one peak of 12 ms. On average, the P99 read latency for the 12-hour run is 7.8 ms.

The P99 update latencies over the 12 hours show a stable pattern over the entire 12 hours, with an average P99 latency of 3.9 ms.


Technical Nugget: Caching Workload, Insert Performance

In addition to the three defined workloads, we also measured the plain insert performance for the small scaling size (500 GB), medium scaling size (1 TB) and large scaling size (10 TB) in MongoDB and ScyllaDB. It needs to be emphasized that batch inserts were enabled for MongoDB but not for ScyllaDB (since YCSB does not support it for ScyllaDB).

The following results show that for the small scaling size, the achieved insert throughput is on a comparable level. However, for the larger data sets, ScyllaDB achieves a three times higher insert throughput in the medium-size benchmark. In the large-scale benchmark, MongoDB was not able to ingest the full 10 TB of data due to client-side errors, resulting in only 5 TB of inserted data (for more details, see Throughput Results); here ScyllaDB outperformed MongoDB by a factor of 5.

Technical Nugget: Caching Workload, Client Consistency Performance Impact

In addition to the standard benchmark configurations, we also ran the caching workload in the uniform distribution with weaker consistency settings. Namely, we enabled MongoDB to read from the secondaries (readPreference=secondaryPreferred), and for ScyllaDB we set the readConsistency to ONE.

The results show an expected increase in throughput: 23% for ScyllaDB and 14% for MongoDB. This throughput increase is lower compared to the client consistency impact for the social workload, since the caching workload is only a 50% read workload, and only the read performance benefits from the applied weaker read consistency settings. It is also possible to further increase the overall throughput by applying weaker write consistency settings.
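
The benchmark drove these settings through YCSB, but the same knobs are reachable from the ordinary Python drivers; a minimal sketch, with connection details and table names invented:

    from pymongo import MongoClient
    from cassandra import ConsistencyLevel
    from cassandra.cluster import Cluster
    from cassandra.query import SimpleStatement

    # MongoDB: allow reads to be served by secondaries instead of only the primary.
    mongo = MongoClient("mongodb+srv://user:pass@cluster.example.net",  # hypothetical URI
                        readPreference="secondaryPreferred")
    doc = mongo.ycsb.usertable.find_one({"_id": "user42"})

    # ScyllaDB (Cassandra protocol): lower the read consistency to ONE,
    # so a single replica's answer is enough.
    cluster = Cluster(["scylla-node1.example.com"])  # hypothetical contact point
    session = cluster.connect("ycsb")
    stmt = SimpleStatement("SELECT * FROM usertable WHERE y_id = %s",
                           consistency_level=ConsistencyLevel.ONE)
    row = session.execute(stmt, ["user42"]).one()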

Conclusion: Performance, Costs and Scalability

The complete benchmarking study comprises 133 performance and scalability measurements that compare MongoDB against ScyllaDB. The results show that ScyllaDB outperforms MongoDB for 132 of the 133 measurements.

For all of the applied workloads, namely caching, social and sensor, ScyllaDB provides higher throughput (up to 20 times) and better throughput scalability results compared to MongoDB. Regarding the latency results, ScyllaDB achieves P99 latencies below 10 ms for insert, read and update operations for almost all scenarios. In contrast, MongoDB only achieves P99 latencies below 10 ms for certain read operations, while the insert and update latencies are up to 68 times higher compared to ScyllaDB. These results validate the claim that ScyllaDB’s distributed architecture is able to provide predictable performance at scale (as explained in the benchANT technical comparison).

The scalability results show that both database technologies scale horizontally with growing workloads. However, ScyllaDB achieves nearly linear scalability while MongoDB shows a less efficient horizontal scalability. The ScyllaDB results were expected, to a certain degree, based on its multiprimary distributed architecture while a near-linear scalability is still an outstanding result. Also, for MongoDB the less efficient scalability results were expected due to the different distributed architecture (as explained in the benchANT technical comparison).

When it comes to price performance, the results show a clear advantage for ScyllaDB, with up to 19 times better price-performance ratio depending on the workload and data set size. Therefore, achieving comparable performance to ScyllaDB would require a significantly larger and more expensive MongoDB Atlas cluster.

In summary, this benchmarking study shows that ScyllaDB provides a great solution for applications that operate on data sets in the terabyte range and that require high throughput (over 50 kOps) and predictable low latency for read and write operations. This study does not consider the performance impact of advanced data models (such as time series or vectors) or complex operation types (aggregates or scans), which are subject to future benchmark studies. But apart from these aspects, the current results show that carrying out an in-depth benchmark before selecting a database technology will help you choose a database that significantly lowers costs and prevents future performance problems.

For complete setup and configuration details, additional results for each workload and a discussion of technical nuggets, see the extended benchmark.


Article originally posted on mongodb google news. Visit mongodb google news



Podcast: Shreya Rajpal on Guardrails for Large Language Models

MMS Founder
MMS Shreya Rajpal

Article originally posted on InfoQ. Visit InfoQ


Transcript

Roland Meertens: Welcome everyone to The InfoQ Podcast. My name is Roland Meertens and I’m your host for today. I am interviewing Shreya Rajpal, who is the CEO and Co-founder of Guardrails AI. We are talking to each other in person at the QCon San Francisco conference just after she gave the presentation called Building Guardrails for Enterprise AI Applications with large Language Models. Keep an eye on InfoQ.com for her presentation as it contains many insights into how one can add guardrails to your large language model application so you can actually make them work. During today’s interview, we will dive deeper into how this works and I hope you enjoy it and you can learn from it.

Welcome, Shreya, to The InfoQ Podcast. We are here at QCon in San Francisco. How do you enjoy the conference so far?

Shreya Rajpal: Yeah, it’s been a blast. Thanks for doing the podcast. I’ve really enjoyed the conference. I was also here last year and I had just a lot of fantastic conversations. I was really looking forward to it and I think it holds up to the standard.

Roland Meertens: All right, and you just gave your talk. How did it go? What was your talk about?

Shreya Rajpal: I think it was a pretty good talk. The audience was very engaged. I got a lot of questions at the end and they were very pertinent questions, so I enjoyed the engagement with the audience. The topic of my talk was on guardrails or the concept of building guardrails for large language model applications, especially from the lens of this open-source framework I created, which is also called Guardrails AI.

What is Guardrails AI [02:21]

Roland Meertens: What does Guardrails AI, what does it do? How can it help me out?

Shreya Rajpal: Guardrails AI essentially looks to solve the problem of reliability and safety for large language model applications. So if you’ve worked with generative AI and built applications on top of generative AI, what you’ll often end up finding is that they’re really flexible and they’re really functional, but they’re not always useful, primarily because they’re not always as reliable. So I like comparing them with traditional software APIs. So traditional software APIs tend to have a lot of correctness baked into the API because we’re in a framework or we’re in a world that’s very deterministic. Compared to that, generative AI ends up being very, very performant, but ends up being essentially not as rigorous in terms of correctness criteria. So hallucinations, for example, are a common issue that we see.

So this is the problem that Guardrails AI tends to solve. So it essentially is something that acts like a firewall around your LLM APIs and makes sure that any input that you send to the LLM, or any output that you receive from the LLM, is functionally correct, for whatever correctness might mean for you. Maybe that means not hallucinating, and then it’ll check for hallucinations. Maybe it means not having any profanity in your generated text, because you know who your audience is, and it’ll check for that. Maybe it means getting the right structured outputs. And all of those can be basically correctness criteria that are enforced.

Roland Meertens: If I, for example, ask it for a JSON document, you will guarantee me that I get correct JSON, but I assume that it can’t really check any of the content, right?

Shreya Rajpal: Oh, it does. Yeah. I think JSON correctness is something that we do and something that we do well. But in addition to that, that is how I look at it, that’s kind of like table stakes, but it can also look at each field of the JSON and make sure that’s correct. Even if you’re not generating JSON and you’re generating string output. So let’s say you have a question answering chatbot and you want to make sure that the string response that you get from your LLM is not hallucinated or doesn’t violate any rules or regulations of wherever you are, those are also functional things that can be checked and enforced.

Interfacing with your LLM [04:28]

Roland Meertens: So this is basically then like an API interface on top of the large language model?

Shreya Rajpal: I like to think of it as kind of like a shell around the LLM. So it kind of acts as a sentinel at the input of the LLM, at the output of the LLM and acts as making sure that there’s no dangerous outputs or unreliable outputs or unsecure outputs, essentially.

Roland Meertens: Nice. And is this something which you then solve with few-shot learning, or how do you then ensure its correctness?

Shreya Rajpal: In practice, how we end up doing it is a bunch of different techniques depending on the problem that we solve. So for example, for JSON correctness, et cetera, we essentially look to see, okay, here’s our expected structure, here’s what’s incorrect, and you can solve it by few-shot prompting to get the right JSON output. But depending on what the problem is, we end up using different sets of techniques. So for example, a key abstraction in our framework is this idea of a validator where a validator basically checks for a specific requirement, and you can combine all of these validators together in a guard, and that guard will basically run alongside your LLM API and make sure that there’s those guarantees that we care about. And our framework is both a template for creating your own custom validators and orchestrating them via the orchestration layer that we provide, as well as a library of many, many commonly used validators across a bunch of use cases.
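
To illustrate the validator-and-guard pattern she describes in the abstract (this is not the Guardrails AI API, just a minimal sketch of the idea, with made-up validators):

    import re
    from dataclasses import dataclass
    from typing import Callable, List

    @dataclass
    class ValidationResult:
        valid: bool
        reason: str = ""

    # A validator checks one correctness criterion on an LLM output string.
    Validator = Callable[[str], ValidationResult]

    def matches_regex(pattern: str) -> Validator:
        def check(output: str) -> ValidationResult:
            ok = re.fullmatch(pattern, output) is not None
            return ValidationResult(ok, "" if ok else f"output does not match {pattern!r}")
        return check

    def no_banned_words(banned: List[str]) -> Validator:
        def check(output: str) -> ValidationResult:
            hits = [w for w in banned if w in output.lower()]
            return ValidationResult(not hits, f"contains banned words: {hits}" if hits else "")
        return check

    def guard(llm_call: Callable[[str], str], validators: List[Validator], prompt: str) -> str:
        """Call the LLM, then run every validator over its output; raise if any check fails."""
        output = llm_call(prompt)
        failures = []
        for validate in validators:
            result = validate(output)
            if not result.valid:
                failures.append(result.reason)
        if failures:
            raise ValueError(f"guardrails tripped: {failures}")
        return output

The actual framework layers configurable policies and re-asking on top of this pass/fail core, as she describes later in the conversation.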

Some of them may be rules-based validators. So for example, we have one that makes sure that any regex pattern that you provide, you can make sure that the fields in your JSON or any string output that you get from your JSON matches that regex. We have this one that I talked about in my talk, which you can check out on InfoQ.com called Provenance. And Provenance is essentially making sure that every LLM utterance has some grounding in a source of truth that you know to be true, right? So let’s say you’re an organization that is building a chatbot. You can make sure that your chatbot only answers from the documents from your help center documents or from the documents that you know to be true, and you provide the chatbot and not from its own world model of the internet that it was trained on.

So Provenance looks at every utterance that the LLM has and checks to see where did it come from in my document and makes sure that it’s correct. And if it’s not correct, that means it was hallucinated and can be filtered out. So we have different versions of them and they use various different machine learning techniques under the hood. The simplest one basically uses embedding similarity. We have more complex ones that use LLM self-evaluation or NLI-based classification, like a natural language inference. And so depending on what the problem is, we use either code or ML models or we use external APIs to make sure that the output that you get is correct.
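
A rough sketch of the simplest provenance variant mentioned here, embedding similarity: embed each sentence of the answer, compare it against embeddings of the trusted source chunks, and flag anything without a sufficiently close match. The embedding function is left as a parameter because any sentence-embedding model would do, and the 0.75 threshold is an arbitrary placeholder:

    from typing import Callable, List
    import numpy as np

    def cosine(a: np.ndarray, b: np.ndarray) -> float:
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

    def unsupported_sentences(answer_sentences: List[str],
                              source_chunks: List[str],
                              embed: Callable[[str], np.ndarray],
                              threshold: float = 0.75) -> List[str]:
        """Return the answer sentences whose best match against the sources falls below threshold."""
        source_vectors = [embed(chunk) for chunk in source_chunks]
        flagged = []
        for sentence in answer_sentences:
            vector = embed(sentence)
            best = max(cosine(vector, sv) for sv in source_vectors)
            if best < threshold:  # no grounding found, so treat the sentence as hallucinated
                flagged.append(sentence)
        return flagged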

Roland Meertens: Just for my understanding, where do you build in these guardrails? Is this something you built into the models or do you fine tune the model, or is this something you built into essentially the beam search for the output where you say, oh, but if you generate this, this path can’t be correct? Do you do it at this level? Or do you just take the already generated whole text by your large language model and you then, in hindsight, kind of post-process it?

Shreya Rajpal: The latter. Our core assumption is that we’re very… We abstract out the model completely. So you can use an open source model, you can use a commercial model. The example I like using is that in the extreme, you can use a random string generator and we’ll check that random string generator for profanity or making sure that it matches a regex pattern or something.

Roland Meertens: There’s like worst large language model.

Shreya Rajpal: Exactly, the worst large language model. I guess it was a decision that allows developers to really be flexible and focus on more application level concerns rather than really wrangling their model itself. And so how we end up operating is that we are kind of like a sidecar that runs along your model. So any prompt that you’re sending over to your LLM can first pass through guardrails, check to see if there’s any safety concerns, et cetera. And then the output that comes back from your LLM before being sent to your application passes through guardrails.

What constitutes an error? [08:39]

Roland Meertens: So are there any trade-offs when you’re building in these guardrails? Are there some people who say, “Oh, but I like some of the errors?”

Shreya Rajpal: That’s an interesting question. I once remember chatting with someone who was like, oh, yeah, they were building an LLM application that was used by a lot of people, and they basically said, “No, actually people like using us because our system does have profanity, and it does have a lot of things that for other commercial models are filtered out via their moderation APIs.” And so there is an audience for that as well. In that case, what we end up typically seeing is that correctness means different things to different people. So for the person that I mentioned for whom profanity was a good thing, the correct response for them is a response that contains profanity. So you can essentially configure each of these to work for you. There’s no universal definition of what correctness is, just as there’s no universal use case, et cetera.

Roland Meertens: Have you already seen any applications where your guardrails have added a significant impact to the application?

Shreya Rajpal: I think some of my most exciting applications are either in chatbots or in structured data extraction. I also think that those are where most of the LLM applications today are around. So if you’re doing structured data extraction, which is you’re taking a whole chunk of unstructured data, and then from that unstructured data you’re generating some table or something, you’re generating a JSON payload that can then go into your data warehouses, like a row of data. So in that case, essentially making sure that the data you extract is correct and uses the right context and doesn’t veer too far off from historically the data that you receive from that. I think that’s a common use case.

I think the other one that I’ve seen is you’re building a chatbot and you care about some concerns in that chatbot. For example, if you’re in a regulated industry, making sure that there’s no rules that are violated, like misleading your customer about some feature of your product, et cetera, brand risk, using the right tone of voice that aligns with your brand’s communication requirements. I think that’s another common one. Checking for bias, et cetera, is another common one. So there tend to be a lot of these very diverse set of correctness criteria that people have with chatbots that we can enforce.

Enforcing bias [10:55]

Roland Meertens: So how do you enforce these things, for example, bias? Because I think that’s something which is quite hard to grasp, especially if you have only one sample instead of seeing a large overview of samples.

Shreya Rajpal: I think this is another one of those things where, depending on the application or the use case, different organizations may have different desiderata. So for example, one of the things you can check for is essentially gendered language. Are you using very gendered language, or are you using gender-neutral language when you need to be, in your press briefs, et cetera. So that is one specific way of checking bias. But our core philosophy is to take these requirements and break them down into these smaller chunks that can then be configured and put together.

Roland Meertens: I just remembered that if you have Google Photos, they at some point had this incident where someone put in gorillas and then found images of people. I think they just stopped using this keyword altogether, which is quite interesting.

Any other applications where you already saw a significant impact or do you have any concrete examples?

Shreya Rajpal: Yeah, let’s see. If you go to our open source GitHub page, I think there’s about a hundred or so projects that use Guardrails for enforcing their guarantees. I want to say most of them are around chatbots or structured data extraction. I see a lot of resume screening ones. I see a lot of making sure that you’re able to go to someone’s LinkedIn profile or look at someone’s resume and make sure that they’re the right candidate for you by looking for specific keywords and how are those keywords projected onto a resume. So I think that’s a common one. Yeah, I think those are some of the top of mind ones. Help center support, chatbots are another common use case. Analyzing contracts, et cetera. Using LLMs, I think is another one.

Roland Meertens: These are sound like applications where you absolutely need to be sure that whatever you put there is-

Shreya Rajpal: Is correct, yeah.

Roland Meertens: … is very correct. Yes.

Shreya Rajpal: Absolutely.

Roland Meertens: So what kind of questions did you get after the talk? Who were interested in this? What kind of questions were there?

Shreya Rajpal: The audience was pretty excited about a lot of the content. I think one of my favorite questions was around the cost of implementing guardrails, right? At the end of the day, there’s no free lunch. This is all compute that needs to happen at runtime, to make sure that you’re looking, at runtime, at where the risk areas of your system are and safeguarding against those risk areas, which typically requires adding some amount of latency, some amount of cost, et cetera. And so I think that was an interesting question about how do we think about the cost of implementing that?

I think we’ve done a bunch of work in making the guardrails configurable enough where you can set a policy on each guardrail to make sure that it’s a policy that allows you to say how much you care about something. Not every guardrail is pull the alarm, there’s a horrible outcome. Some of them are bad, but you just shrug and move on. Some of them are like, you take some programmatic action, some of them you do more aggressive risk mitigation, and so that is configurable, and we did a bunch of investment making sure that they’re low latency, they can be parallelized very easily, et cetera.

Priorities for content correctness [13:59]

Roland Meertens: So for example, I could say I absolutely want my output to be the right API specification, but it’s okay if one of the categories didn’t exist before, or isn’t in my prompt?

Shreya Rajpal: Absolutely. Yeah, that’s exactly right. I think a classic example I like using is that if you’re in healthcare and you’re building a healthcare support chatbot, you do not have the authorization to give medical advice to anyone who comes on. And so that’s a guardrail where the no medical advice guardrail, where you’d much rather be like, oh, I might as well not respond to this customer and let a human come in if I suspect that there’s medical advice in my output. So that’s a guardrail where you either get it right or it’s not useful to your customer at all, right? So that’s one of the ones where even if it’s slightly more expensive, you’re willing to take that on. A lot of the other ones you can, like you said, if there’s some extra fields, et cetera, that you’re typically okay with.

Roland Meertens: So what are the next steps then for Guardrails AI? What kind of things are you thinking about for the future? Do you get some requests all the time?

Shreya Rajpal: I think a common request that we get is, I think this is much less a capability thing and more just making it easy for our users to use it, where we have support for a lot of the common models, but we keep getting requests every day to support Bard or support Anthropic, et cetera. So we have a custom, like I said, a string-to-string translator where you can substitute your favorite model and use whichever one you want. But I think that’s a common one, where we just add more integrations with other models that are out there.

Roland Meertens: Is there a winning model at the moment which everybody is going for?

Shreya Rajpal: I think OpenAI typically is the one that we see most commonly. Yeah. I think some of the other ones are more around specific features, with being able to create custom guardrails with lower input involved. So like I mentioned, we have a framework for creating custom guardrails, but they’re like, okay, how do I make it easier to see what’s happening? I think better logging and visibility is another one. So a lot of exciting changes. I think a few weeks ago we released a big 0.2 release, which had a lot of these changes implemented, in addition to a lot of stability improvements, et cetera, and we have more releases to come.

Roland Meertens: And so for the fixing the errors, is this just always hand coded rules or could you also send it back to a large language model and say, oh, we got this issue, try it again, fix this?

Shreya Rajpal: Yeah, so that’s what we like to call the re-asking paradigm that we implemented. So that actually was a core design principle behind Guardrails where these models have this very fascinating ability to self-heal. If you tell them why they’re wrong, they’re often able to incorporate that feedback and correct themselves. So Guardrails basically automatically constructs a prompt for you and then sends it back and then runs verification, et cetera, all over again. This is another one of those things that I walked over in my talk, which is available for viewers as well.

Fixing your output errors [16:48]

Roland Meertens: So then do you just take the existing output and then send the output back and say, “This was wrong, fix it?” Or do you just re-ask the question and hope that it gets it correct the next time?

Shreya Rajpal: That’s a great question. So typically we work on the output level. We’ve done some prompt engineering on our end to configure how to create this prompt to get the most likely correct output. So we include the original request, we include the output. On the output, we do some optimization where we only, and this is configurable as well, where you only re-ask the incorrect parts. So often you’ll end up finding there’s a specific localized area, either like some field in the JSON, or if you have a large string or a paragraph or something, some sentences in a paragraph that are incorrect. So you only send those back for re-asking and not the whole thing, and that ends up being a little bit less expensive.

Roland Meertens: Okay. Oh, interesting. So you only queried the things which you know are wrong?

Shreya Rajpal: Right, right.

Tips to improve LLM output [17:42]

Roland Meertens: Ah, smart. Yeah, that must save a lot of money. And then in terms of correctness and safety, do you have any tips for people who are writing prompts such that you can structure them better? Or how do you normally evaluate whether a prompt is correct?

Shreya Rajpal: I think my response is that I kind of disagree with the premise of the question a little bit. I actually, I go over this in my talk, but what you end up finding a lot of times is that people invest a lot of time and energy in prompt engineering, but at the end of the day, prompts aren’t guarantees, right? First of all, the LLMs are non-deterministic. So even if you have the best prompt figured out, if you send that same prompt over 10 different times, then you’re going to see different outputs. You’re not going to get the right output.

I think the second is that the prompt isn’t a guarantee. Maybe you’re like, okay, this is what I want from you. This is the prompt communicating with the LLM. This is what I want from you. Make sure you’re not violating XYZ criteria, et cetera. There’s absolutely nothing guaranteeing that the LLM is going to respect those instructions in the prompt, so you end up getting incorrect responses still. So what we say is: safer prompts, yes, definitely the prompt is a way to prime the LLM for being more correct than normal. So you can still definitely include those instructions, don’t do XYZ, but verify. Make sure that those conditions are actually being respected, otherwise you’re opening yourself up to a world of pain.

Roland Meertens: I always find it really cute if people just put things in there like, “Oh, you’re a very nice agent. You always give the correct answer.” Ah, that will help it.

Shreya Rajpal: One of my favorite anecdotes here is from a friend of mine actually who works with LLMs and has been doing that for a few years now, which is a few years ahead of a lot of other people getting into the area, and I think one of her prompts was, a man will die if you don’t respect this constraint, which is the way she wrangled the LLM to get the right output. So people do all sorts of weird things, but our key thing, I think she ended up moving onto this verification system as well. I think at the end of the day, you need to make sure that those conditions you care about are respected, and prompting alone just isn’t sufficient.

Roland Meertens: I guess that’s the lesson we learned today is always tell your LLM that someone will die if they get the answer incorrect.

Shreya Rajpal: Absolutely.

Roland Meertens: Yeah. Interesting. All right. Thank you very much for being on the podcast and hope you enjoy QCon.

Shreya Rajpal: Yeah, absolutely. Thank you for inviting me. Yeah, excited to be here.

Roland Meertens: Thank you very much for listening to this podcast. I hope you enjoyed the conversation. As I mentioned, we’ll upload the talk on InfoQ.com sometime in the future, so keep an eye on that. Thank you again for listening, and thank you again Shreya for joining The InfoQ Podcast.

