Mobile Monitoring Solutions


Holcim’s biggest shareholder backs plan to list North American business – Investing.com

MMS Founder
MMS RSS

Posted on mongodb google news. Visit mongodb google news

Swiss billionaire Thomas Schmidheiny, the great-grandson of Holcim’s founder and a former chairman of the company, owns a stake of around 7%, according to his spokesperson.

“Mr. Schmidheiny fully supports the separation and listing of the American business, which he believes is in line with industrial logic,” the spokesperson told Reuters.

“This creates new growth prospects for both companies in the future,” he added. “Holcim has always followed industrial logic, and this transaction makes complete sense in view of the growth opportunities in the United States.”

Holcim on Sunday said it will spin off 100% of its North American operations in a New York flotation, which could value the business at $30 billion.

Article originally posted on mongodb google news. Visit mongodb google news

Subscribe for MMS Newsletter

By signing up, you will receive updates about our latest information.



Why Are MongoDB (MDB) Shares Soaring Today – The Globe and Mail

MMS Founder
MMS RSS

Posted on mongodb google news. Visit mongodb google news


What Happened:

Shares of database software company MongoDB (MDB) jumped 5.1% in the afternoon session as yields fell after the U.S. Treasury Department lowered the borrowing estimate for the first quarter of 2024. According to a press release, the Treasury Department is expected to borrow $760 billion, $55 billion lower than the $815 billion estimate provided in October 2023, due to “projections of higher net fiscal flows and a higher beginning of quarter cash balance.” 

Stocks have trended higher since late 2023 as market participants anticipated that the Fed would begin cutting rates in 2024 after recent economic data showed inflation cooling off. The first policy decision of the year will be announced on January 31, 2024, with the consensus expectation for rates to remain steady at 5.25%-5.50%.

As a reminder, the driver of a stock’s value is the sum of its future cash flows discounted back to today. With lower interest rates, investors can apply higher valuations to their stocks. No wonder so many in the investment community are optimistic about 2024. We at StockStory remain cautious, as following the crowd can lead to adverse outcomes. During times like this, it’s best to own high-quality, cash-flowing companies that can weather the ups and downs of the market.

Is now the time to buy MongoDB? Access our full analysis report here; it’s free.

What is the market telling us:

MongoDB’s shares are very volatile; over the last year, the stock has had 26 moves greater than 5%. In that context, today’s move indicates the market considers this news meaningful, but not something that would fundamentally change its perception of the business.

The biggest move we wrote about over the last year was 8 months ago, when the stock gained 16.5% on the news that the company reported a “beat and raise” quarter. First quarter results beat analysts’ expectations for revenue, gross margin, free cash flow, and earnings per share. Guidance was also strong: revenue guidance for the next quarter exceeded expectations, and full-year guidance was raised and also came in ahead of consensus. The profitability guidance was similarly impressive, as operating income guidance for the next quarter and full year came in ahead. The company touched on the current AI trend, noting that “…MongoDB’s developer data platform is well positioned to benefit from the next wave of AI applications in the years to come”. Overall, it was a strong quarter for the company, with solid results and impressive guidance.

MongoDB is up 9.2% since the beginning of the year, and at $418.92 per share it is trading close to its 52-week high of $435.23 from November 2023. Investors who bought $1,000 worth of MongoDB’s shares 5 years ago would now be looking at an investment worth $4,817.

Do you want to know what moves the stocks you care about? Add them to your StockStory watchlist and every time a stock we cover moves more than 5%, we provide you with a timely explanation straight to your inbox. It’s free and will only take you a second.

Article originally posted on mongodb google news. Visit mongodb google news

Subscribe for MMS Newsletter

By signing up, you will receive updates about our latest information.



Sam Partee on Retrieval Augmented Generation (RAG) – InfoQ

MMS Founder
MMS RSS

Posted on nosqlgooglealerts. Visit nosqlgooglealerts


Introduction [00:52]

Roland Meertens: Welcome everybody to the InfoQ podcast. My name is Roland Meertens and I’m your host for today. Today, I’m interviewing Sam Partee, who is a Principal Applied AI engineer at Redis. We are talking to each other in person at the QCon San Francisco conference, just after he gave the presentation called Generative Search: Practical Advice for Retrieval Augmented Generation. Keep an eye on InfoQ.com for his presentation, as it contains many insights into how one can enhance large language models by adding a search component for retrieval augmented generation. During today’s interview, we will dive deeper into how you can do this; I hope you enjoy it and that you can learn from the conversation.

Welcome Sam to the InfoQ podcast.

Sam Partee: Thank you.

Roland Meertens: We are recording this live at QCon in San Francisco. How do you like the conference so far?

Sam Partee: It’s awesome. I’m really glad Ian invited me to this and we’ve had a really good time. I’ve met some really interesting people. I was talking to the Sourcegraph guys earlier and really loved their demo, and there’s a lot of tech right now in the scene that even someone like me, who every single day I wake up… I went to a meetup for Weaviate last night. I still see new things, and it’s one of the coolest things about living here and being in this space, and QCon’s a great example of that.

The Redis vector offering [02:09]

Roland Meertens: Yes, it’s so cool that here everybody is working on state-of-the-art things. I think your presentation was also very much towards state-of-the-art and one of the first things people should look at if they want to set up a system with embeddings and want to set up a system with large language models, I think. Can you maybe give a summary of your talk?

Sam Partee: Yes, absolutely. So about two years ago, Redis introduced its vector offering, essentially a vector database offering. It turns Redis into a vector database. So I started at Redis around that time, and my job was not necessarily to embed the HNSW index into the database itself. There was an awesome set of engineers, DeVere Duncan and those guys, who are exceptional engineers. There was a gap compared to other vector databases in that you couldn’t use Redis with LangChain or LlamaIndex or what have you. So my job is to actually do those integrations and, on top of that, to work with customers. And so over the past two or so years I’ve been working with those integration frameworks, with those customers, with those users in open source, and one of the things that you kind of learn through doing all of that is a lot of the best practices. And that’s really what the talk is: just a lot of stuff that I’ve learned by building, and that’s essentially the content.

Roland Meertens: Okay, so if you say that you are working on vector databases, why can’t you just simply store a vector in any database?

Sam Partee: So it kind of matters what your use case is. So, for Redis for instance, let’s take that. It’s an incredible real-time platform, but if your vectors never change, if you have a static dataset of a billion embeddings, you’re way better off using something like Faiss and storing it in an S3 bucket, loading it into a Lambda function and calling it every once in a while. Just like programming languages, there’s not a one size fits all programming language. You might think Python is, but that’s just because it’s awesome. But it’s the truth that there’s no tool that fits every use case and there’s certainly no vendor that fits every use case. Redis is really good at what it does and so are a lot of other vendors and so you really just got to be able to evaluate and know your use case and evaluate based on that.

Roland Meertens: So what use cases have you carved out for what people would you recommend to maybe re-watch your talk?

Sam Partee: Yes, so specifically use cases, one thing that’s been really big, that we’ve done a lot of, is chat conversations. So long-term memory for large language models is this concept where the context window, even in the largest case, what is it, 16K? No, 32K, something like that, I think that’s GPT-4. Even then you could have a chat history that is over that 32K token limit, and in that case you need other data structures than just a vector index, like sorted sets (ZSETs) in Redis. And so there are other data structures that come into play, acting as memory buffers or things like that, that for those kinds of chat conversations end up really mattering, and they’re actually integrated into LangChain. Another one is semantic caching, which builds on what Redis has been best at for its decade-long career, ever since Salvatore wrote it.

Semantic caching is almost the next evolution of caching: instead of just a perfect one-to-one match, like you would expect from a hash, it’s more of a one-to-many, in the sense that you can have a threshold for how similar a cached item should be, and what that cached item returns is based on that similarity threshold. And so it allows these types of things like, say, the chat conversation, where if you want to ask, “Oh, what’s the last time that I said something like X?” you can now do that and have the same thing that is not only returning that conversational memory but also having it cached. And with Redis you get all that at a really high speed. And so for those use cases it ends up being really great, and there’s obviously a lot of others, but I’ll talk about some that it’s not.

So some that it’s not, that we’ve seen, that we are not necessarily the best for: Redis is an in-memory database. And we have tiering and auto-tiering, which allows you to go to NVMe, and you can have an NVMe drive, whatever, and go from memory to NVMe, and it can actually do that automatically now, which is quite fascinating to me. But even then, and even if you have those kinds of things enabled, there are cases like I mentioned where you have a product catalog that changes once every six months and it’s not a demanding QPS use case. You don’t need the latencies of Redis. You call this thing once every month to set user-based recommendations that are relatively static or something like that.

Those use cases, it’s kind of an impractical expense. And it’s not like I’m trying to down talk the place I work right now. It’s really just so that people understand why it is so good for those use cases and why it justifies, and even in that case of something like a recommendation system that is live or online, it even justifies itself in terms of return on investment. And so those types of use cases it’s really good for, but the types that are static that don’t change, it really isn’t one of the tools that you’re going to want to have in your stack unless you’re going to be doing something more traditional like caching or using it for one of its other data structures, which is also a nice side benefit that it has so many other things it’s used for. I mean, it’s its own streaming platform.
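To make the semantic-caching idea concrete, here is a minimal, library-agnostic Python sketch of the thresholded lookup Partee describes; the `embed` callable and the 0.9 threshold are placeholder assumptions, not Redis or RedisVL APIs.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Similarity score in [-1, 1]; 1.0 means identical direction."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

class SemanticCache:
    """Cache keyed by embedding similarity instead of an exact string match."""

    def __init__(self, embed, threshold: float = 0.9):
        self.embed = embed            # callable: str -> np.ndarray (assumed, not provided here)
        self.threshold = threshold    # how similar a cached prompt must be to count as a hit
        self.entries = []             # list of (embedding, response) pairs

    def get(self, prompt: str):
        query_vec = self.embed(prompt)
        best_score, best_response = 0.0, None
        for vec, response in self.entries:
            score = cosine_similarity(query_vec, vec)
            if score > best_score:
                best_score, best_response = score, response
        # Only treat it as a hit if it clears the similarity threshold.
        return best_response if best_score >= self.threshold else None

    def put(self, prompt: str, response: str) -> None:
        self.entries.append((self.embed(prompt), response))
```

In a real deployment the linear scan would be replaced by a vector index (such as the Redis FLAT or HNSW indexes discussed later in the conversation), but the thresholding logic is the same.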

Use cases for vector embeddings [08:09]

Roland Meertens: So maybe let’s break down some of this use case. You mentioned extracting the vectors from documents and you also mentioned if a vector is close enough, then you use it for caching. Maybe let’s first dive into the second part because that’s what we just talked about. So you say if a vector is close enough, does Redis then internally build up a tree to do this fast nearest neighbor search?

Sam Partee: Sure. Yes. So we have two algorithms. We have KNN, K-nearest neighbors, brute force, and you can think of this like an exhaustive search. It’s obviously a little bit better than that, but imagine just going down a list and doing that comparison. That’s a simplified view of it, but that’s called our flat index. And then we have HNSW, which is our approximate nearest neighbors index. Both of those are integrated; we’ve vendored HNSWlib, and that’s what’s included inside of Redis. It’s modified to make it work with things like CRUD operations inside of Redis. But that’s what happens when you have those vectors and they’re indexed inside of Redis: you choose. If you’re using something like RedisVL, you can pass in a dictionary configuration or a YAML file or what have you, and that chooses what index you end up using for search.

And so for the people out there that are wondering, “Which index do I use?”, because that’s always a follow-up question: if you have under a million embeddings, the KNN search is often better, because if you think about it in the list example, appending to a list is very fast and recreating that list is very fast. Doing so for a complex tree or graph-based structure is more computationally complex. And so if you don’t need the latencies of something like HNSW, if you don’t have that many documents, if you’re not at that scale, then you should use the KNN index. In the other case, if you’re above that threshold and you do need those latencies, then HNSW provides those benefits, which is why we have both and we have tons of customers using either one.
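As a rough illustration of that choice, here is a sketch using the redis-py search commands; the index name, field names, and 384-dimension figure are invented for the example, and exact module paths and parameter names may vary between client versions.

```python
import numpy as np
from redis import Redis
from redis.commands.search.field import TextField, VectorField
from redis.commands.search.indexDefinition import IndexDefinition, IndexType

r = Redis(host="localhost", port=6379)

# Rule of thumb from the talk: under roughly a million embeddings, brute-force FLAT is often fine.
NUM_DOCS = 50_000
algorithm = "FLAT" if NUM_DOCS < 1_000_000 else "HNSW"

schema = (
    TextField("content"),
    VectorField(
        "embedding",
        algorithm,
        {"TYPE": "FLOAT32", "DIM": 384, "DISTANCE_METRIC": "COSINE"},
    ),
)

r.ft("doc_idx").create_index(
    schema,
    definition=IndexDefinition(prefix=["doc:"], index_type=IndexType.HASH),
)

# Store one document as a Redis hash, packing the vector as raw float32 bytes.
vec = np.random.rand(384).astype(np.float32)
r.hset("doc:1", mapping={"content": "hello world", "embedding": vec.tobytes()})
```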

Roland Meertens: So basically then we’re just down to if you have anything stored in your database, it’s basically like a dictionary with nearest neighbor. Do you then get multiple neighbors back or do you just return what you stored in your database at this dictionary location?

Sam Partee: There are two specific data structures, Hashes and JSON documents. Hashes in Redis are like: at a key, you store a value. JSON documents: at a key, you store a JSON document. When you are doing a vector similarity search within Redis, whatever client you use, whether it’s Go or Java or what have you, or Python, what you get back in terms of a vector search is defined by the syntax of that query. And there are two major ones to know about. First are vector searches, just plain vector searches, which are, “I want a specific number of results that are semantically similar to this query embedding.” And then you have range queries, which are, “You can return as many results as you want, but they have to be within this range of vector distance away from this specific vector.” And where I said semantically earlier, it could be visual embeddings, it can be semantic embeddings, it doesn’t matter.

And so vector searches, range searches, queries, et cetera, those are the two major methodologies. It’s important to note that Redis also supports straight-up text search and other types of search features, which you can use combinatorially. So all of those are available when you run that, and it’s really defined by how you use it. So if you are particular about, let’s say it’s a recommendation system or a product catalog, again to use that example, you might say, “I only want to recommend things to the user,” there’s probably a case for this, “if they’re this similar. If they’re this similar to what is in the user’s cart or basket.” You might want to use something like a range query, right?

Roland Meertens: Yes, makes sense.

Roland Meertens: If you’re searching for, I don’t know, your cookbooks on Amazon, you don’t want to get the nearest instruction manual for cars, whatever.

Sam Partee: Yes.

Roland Meertens: Even though it’s near-

Sam Partee: Yes, sure.

Roland Meertens: … at some point there’s a cutoff.

Sam Partee: That might be a semantic similarity, or let’s say a score rather than a vector distance, one minus the distance. That might be a score of, let’s say, 0.6, right? But that’s not relevant enough to be a recommendation that’s worthwhile. And so, if there’s 700 of them that are worthwhile, you might want 700 of them, but if there’s only two, you might only want two. That’s what range queries are really good for: you might not know ahead of time how many results you want back, but you might want to say they can only be this far away, and that’s a concept that’s been around in vector search libraries for quite some time. But now you can get it back in milliseconds when you’re using Redis, which is pretty cool.
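Here is a hedged sketch of such a range query with redis-py, assuming an index like the one sketched earlier; the VECTOR_RANGE syntax shown is the RediSearch query form as I understand it, and the 0.4 radius is an arbitrary example value.

```python
import numpy as np
from redis import Redis
from redis.commands.search.query import Query

r = Redis(host="localhost", port=6379)
query_vec = np.random.rand(384).astype(np.float32)  # stand-in for a real embedding

radius = 0.4  # maximum vector distance we are willing to accept
q = (
    Query("@embedding:[VECTOR_RANGE $radius $vec]=>{$YIELD_DISTANCE_AS: dist}")
    .sort_by("dist")
    .return_fields("content", "dist")
    .dialect(2)
)
results = r.ft("doc_idx").search(
    q, query_params={"radius": radius, "vec": query_vec.tobytes()}
)
# Zero, two, or seven hundred results may come back; only the distance bound is fixed.
for doc in results.docs:
    print(doc.id, doc.dist, doc.content)
```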

Hybrid search: vector queries combined with range queries [13:02]

Roland Meertens: Yes, nice. Sounds pretty interesting. You also mentioned that you can combine this with other queries?

Sam Partee: So we often call this hybrid search. Really, hybrid search is weighted search, so I’m going to start saying filtered search for the purposes of this podcast. If you have what I’m going to call a recall set, which is what you get back after you do a vector search, you can have a pre- or post-filter. This is particular to Redis, but there are tons of other vector databases that support this, and you can do a pre- or post-filter. The pre-filter in a lot of cases is more important. Think about this example. Let’s say I’m using it as a conversational memory buffer, and this could be in LangChain, it’s implemented there too, and I only want the conversation with this user. Well, then I would use a tag filter, where the tag, it’s basically exact text search or categorical search, you can think about it that way, where some piece of that metadata in my Hash or JSON document in Redis is going to be a user’s username.

And then I can use that tag to filter all of the records that I have that are specific to that user, and then I can do a vector search. So it allows you to almost have, it’s like a schema in a way, think about it like a SQL database. It allows you to define kind of how you’re going to use it, but the benefit here is that if you don’t do something in the beginning, you can then add it later and still alter the schema of the index, adjust and grow your platform, which is a really cool thing. So the hybrid searches are really interesting. In Redis you can do it with text, full-text search like BM25. You can do it with tags, geographic… You can do polygon search now, which is really interesting. Literally just draw a polygon of coordinates, and if they’re within that polygon of coordinates, then that is where you do your vector search.

Roland Meertens: Pretty good for any mapping application, I assume.

Sam Partee: Or, like say food delivery.

Roland Meertens: Yes.

Sam Partee: I actually think I gave that example in the talk. I gave that example because Ian was in the front. He’s obviously a DoorDash guy. They’re power users of open source and it’s always fun to see how people use it.
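Below is a minimal sketch of the tag-pre-filter-plus-KNN pattern Partee describes, again with redis-py; the index name, the `user` tag field, and the other field names are assumptions for illustration, not something from the talk.

```python
import numpy as np
from redis import Redis
from redis.commands.search.query import Query

r = Redis(host="localhost", port=6379)
query_vec = np.random.rand(384).astype(np.float32)  # embedding of the current user query

# Pre-filter to one user's records, then run KNN only over that subset.
q = (
    Query("(@user:{alice})=>[KNN 5 @embedding $vec AS score]")
    .sort_by("score")
    .return_fields("content", "score")
    .dialect(2)
)
results = r.ft("chat_idx").search(q, query_params={"vec": query_vec.tobytes()})
for doc in results.docs:
    print(doc.score, doc.content)
```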

Roland Meertens: But so in terms of performance, your embeddings are represented in a certain way to make it fast to search through them?

Sam Partee: Yes.

Roland Meertens: But filters are a completely different game, right?

Sam Partee: Totally.

Roland Meertens: So is there any performance benefits to pre-filtering over post-filtering or the other way around?

Sam Partee: I always hate when I hear this answer, but it depends. If you have a pre-filter that filters it down to a really small set, then yes. But if you have a pre-filter, you can combine them with boolean operators. If you have a pre-filter that’s really complicated and does a lot of operations on each record to see whether it belongs to that set, then you can shoot yourself in the foot trying to achieve that performance benefit. And so it really depends on your query structure and your schema structure. And so, that’s not always obvious. I’ve seen, we’ll just say an e-commerce company, that had about a hundred and something combined filters in their pre-filter. Actually no, it was a post-filter for them, because they wanted to do a vector search over all the records and then do a post-filter, but it was like 140 different filters, right?

Roland Meertens: That’s a very dedicated, they want something very specific.

Sam Partee: Well, it made sense for the platform, which I obviously can’t talk about, but we found a much better way to do it and I can talk about that. Which is that ahead of time, you can just combine a lot of those fields. And so you have extra fields in your schema. You’re storing more, your memory consumption goes up, but your runtime complexity, the latency of the system goes down because it’s almost like you’re pre-computing, which is like an age-old computer science technique. So increase space complexity, decrease runtime complexity. And that really helped.
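A tiny sketch of that pre-computation trade-off: collapse several attributes into one pre-combined tag at write time so the query-time filter stays cheap. The field and tag names here are invented for illustration.

```python
def combined_facet(record: dict) -> str:
    """Precompute one tag from several attributes, e.g. 'US:apparel:instock'."""
    return ":".join([
        record["region"],
        record["category"],
        "instock" if record["in_stock"] else "oos",
    ])

record = {"region": "US", "category": "apparel", "in_stock": True}
record["facet"] = combined_facet(record)  # stored alongside the vector at index time

# At query time a single tag filter replaces many combined boolean clauses,
# e.g. a filter expression on @facet followed by the KNN clause.
```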

How to represent your documents [16:53]

Roland Meertens: Yes, perfect trade-off. Going back to the other thing you mentioned about documents, I think you mentioned two different ways that you can represent your documents in this embedding space.

Sam Partee: Yes.

Roland Meertens: Can you maybe elaborate on what the two different ways are and when you would choose one over the other?

Sam Partee: Yes, so what I was talking about here was a lot of people… It’s funny, I was talking to a great guy, I ate lunch with him and he was talking about RAG and how people just take LangChain or LlamaIndex or one of these frameworks and they use a recursive character text splitter or something and split their documents up, not caring about overlap, not caring about how many tokens they have, and chunk it up. And use those randomly as the text, raw text basically, for the embeddings, and then they run their RAG system and wonder why it’s bad. And it’s because you have filler text, you have text that isn’t relevant, you possibly have the wrong chunk size, and your embeddings possibly aren’t even relevant. So what I’m suggesting in this talk is a couple of ways, and actually a quick shout out to Jerry Liu for the diagram there. He runs LlamaIndex, great guy.

What I’m suggesting is there are two approaches I talk about. First is you take that raw text and ask an LLM to summarize it. This approach allows you to have a whole-document summary and then the chunks of that document associated with that summary. So first, you go and do a vector search over the summaries of the documents, which are often semantically richer in terms of context, which helps that vector search out. And then you can return all of the document chunks, and even then sometimes, on the client side, do either a database or local vector search on the chunks that you return after that first vector search.

And with Redis, you can also combine those two operations. Triggers and functions are awesome. People should check that out. The 7.2 release is awesome. But then the second approach is also really interesting, and it involves cases where you would like the surrounding context to be included, but your user query is often something that is found in maybe one or two sentences and includes things like maybe names or specific numbers or phrases. To use this finance example we worked on, it’s like, “the name of this mutual bond in this paragraph” or whatever it was.

What we did there was instead we split it sentence by sentence and so that when the user entered a query, it found that particular sentence through vector search, semantic search. But the context, the text that was retrieved, was a larger window around that sentence and so it had more information when you retrieved that context. And so, the first thing that people should know about this approach is that it absolutely blows up the size of your database. It makes it-

Roland Meertens: Even if I’m embedding per sentence?

Sam Partee: Yes. And you spend way more on your vector database because, think about it, you’re not only storing more text, you’re storing more vectors. And it works well for those use cases, but you have to make sure that that’s worth it, and that’s why I’m advocating for people, and this is why I made it my first slide in that section: just go try a bunch. I talk about using traditional machine learning techniques. So weird that we call it traditional now, but do like a k-fold. Try five different things and then have an eval set. Try it against an eval set. Just like we would’ve with XGBoost five years ago. It feels like everything has changed. But yes, that’s what I was talking about.

Roland Meertens: So if you are doing this sentence by sentence due to embeddings and you have the larger context around it, is there still enough uniqueness for every sentence or do these large language models then just kind of make the same vector of everything?

Sam Partee: If you have a situation where the query, or whatever’s being used as the query vector, is a lot of text, is a lot of semantic information, this is not the approach to use. But if it’s something like a one or two liner question, or one or two sentence question, it does work well. What you’re, I think getting at to, is that imagine the sentences that people write, especially in some PDFs that just don’t matter. They don’t need to be there and you’re paying for not only that embedding but the storage space. And so, this approach has drawbacks, but who’s going to go through all, I forget how many PDFs there were in that use case, but like 40,000 PDFs which ended up creating, it was like 180 million embeddings or something.

Roland Meertens: Yes, I can imagine if you use this approach on the entire arXiv database of scientific papers, then-

Sam Partee: Docsearch.redisventures.com, you can look at a semantic search app that does only abstracts, which is essentially the first approach, right? But it just doesn’t have the second layer, right? It doesn’t have that, mostly because we haven’t hosted that. It would be more expensive to host. But it does it on the summaries, which, the thing about the paper summary… It’s actually a great example, thank you for bringing that up, is that the paper summary, think about how much more information is packed into that than random sections of a paper. And so that’s why sometimes using an LLM to essentially create what seems like a paper abstract is actually a really good way of handling this, and cheaper usually.
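To make the second chunking strategy concrete, here is a rough sketch of sentence-level embedding with a wider retrieval window; the naive sentence splitter and the window size of two are illustrative choices, not a recommendation from the talk.

```python
import re

def sentence_window_chunks(document: str, window: int = 2):
    """Embed per sentence, but attach the surrounding sentences as retrieval context."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", document) if s.strip()]
    chunks = []
    for i, sentence in enumerate(sentences):
        start, end = max(0, i - window), min(len(sentences), i + window + 1)
        chunks.append({
            "embed_text": sentence,                     # what the vector is computed from
            "context": " ".join(sentences[start:end]),  # what gets handed to the LLM
        })
    return chunks

for chunk in sentence_window_chunks("First fact. Second fact about the bond. Third fact."):
    print(chunk["embed_text"], "->", chunk["context"])
```

Note the storage blow-up Partee warns about: every sentence gets its own vector plus a copy of its surrounding window.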

Hypothetical Document Embeddings (HyDE) [22:19]

Roland Meertens: I think the other thing you mentioned during your talk, which I thought was a really interesting trick is if you are having a question and answer retrieval system, that you let the large language model create a possible answer and then search for that answer in your database. Yes. What do you call this? How does this work again? Maybe you can explain this better than I just did.

Sam Partee: Oh no, actually it’s great. I wish I remembered the author’s name of that paper right now because he or she or whoever it is deserves an award and essentially the HyDE approach, it’s called Hypothetical Document Embedding, so HyDE, HyDE, like Jekyll and Hyde. People use the term hallucinations with LLMs when they make stuff up. So I’m going to use that term here even though I don’t really like it. I mentioned that in the talk. It’s just wrong information, but I’ll get off that high horse.

The idea is that you use a hallucinated answer to a question to look up the right answer, or at least I should say the right context. And so why does this work? Well, you have a question, and that question, let’s say it’s something like in the talk, what did I say? I said, what is Redis? Think about how different that question is from the actual answer, which is like, “an in-memory database, yada, yada, yada.” But a fake answer, even if it’s something like “it’s a tool for doing yada, yada, yada,” is still semantically more similar in both sentence structure and, most often, its actual semantics, so it returns a greater amount of relevant information, because the semantic representation of an answer is different from the semantic representation of a query.

Roland Meertens: Kind of like, you dress for a job you want instead of for a job you have.

Sam Partee: That’s pretty funny.

Roland Meertens: You search for the job you want. You search for the data you want, not for the data you have.

Sam Partee: Couldn’t agree more, and that’s also what’s interesting about it. I gave that hotel example. That was me messing around. I just created that app for fun, but I realized how good of an example of a HyDE example it is because it’s showing you that searching for a review with a fake generated review is so much more likely to return reviews that you want to see than saying, this is what I want in a hotel. Because that structurally and semantically is far different from a review than… Some English professors probably crying right now with the way I’m describing the English language, I guess not just English, but you get the point. It’s so much more similar to the actual reviews that you want that the query often doesn’t really represent the context you want.

Roland Meertens: I really liked it as an example also with hotels because on any hotel website, you can’t search for reviews, but you-

Sam Partee: Oh, of course not.

Roland Meertens: Yes, but it kind of makes sense to start searching for the holiday you want or others have instead of searching for the normal things you normally search for like locations, et cetera, et cetera.

Sam Partee: It was funny. I think I said that I started doing it because I actually did get mad that day at this travel website because I just couldn’t find the things I was looking for and I was like, “Why can’t I do this?” And I realize I’m a little bit further ahead in the field, I guess, than some enterprise companies are in thinking about these things because I work on it all the time I guess. But I just imagine the next few years it’s going to completely change user experience of so many things.

I’ve seen so many demos lately and obviously just hanging around SF, you talk to so many people that are creating their own company or something, and I’ve seen so many demos where they’re using me for essentially validation of ideas or something, where my mind’s just blown at how good it is, and I really do think it’s going to completely change user experience going forward.
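A minimal sketch of the HyDE flow described above; `generate`, `embed`, and `vector_search` are hypothetical stand-ins for an LLM call, an embedding model, and a vector database query, not any specific library's API.

```python
def hyde_retrieve(question: str, generate, embed, vector_search, k: int = 5):
    """Hypothetical Document Embeddings: search with a fake answer, not the question."""
    # 1. Ask the LLM to produce a plausible (possibly wrong) answer to the question.
    hypothetical_answer = generate(
        f"Write a short passage that plausibly answers: {question}"
    )
    # 2. Embed the fake answer; it is usually closer in structure and wording
    #    to the real documents than the question itself.
    query_vec = embed(hypothetical_answer)
    # 3. Retrieve real context with that embedding, then answer grounded in it.
    context_docs = vector_search(query_vec, k=k)
    prompt = (
        "Answer using only this context:\n"
        + "\n".join(context_docs)
        + f"\n\nQuestion: {question}"
    )
    return generate(prompt)
```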

Applications where vector search would be beneficial [26:10]

Roland Meertens: Do you have more applications where you think it should be used for this? This should exist?

Sam Partee: Interesting. Review data is certainly good. So look, right now we’re really good at text representations, at semantics, and the reason for that is we have a lot of that data. The next frontier is definitely multimodal. OpenAI I think has already started on this in some of their models, but one thing I was thinking about and honestly it was in creating this talk, was why can’t I talk to a slide and change the way it looks? And I can basically do that with stable diffusion. It’s on my newsletter head. The top of my newsletter is this cool scene where I said the prompt is something like the evolution of tech through time because that’s what I’m curious about.

Roland Meertens: But you still can’t interact with… Also with stable diffusion, you can give a prompt, but you can’t say, “Oh, I want this, but that make it a bit brighter or replace it.”

Sam Partee: You can refine it and you can optimize it and make it look a little better, but you’re right. It’s not an interaction. The difference with RAG and a lot of these systems like the chat experience, I’ve seen a chatbot pretty recently made by an enterprise company using Redis that is absolutely fantastic and the reason is because it’s interactive. It’s an experience that is different. And I’d imagine that in a few years you’re literally never going to call an agent on a cell phone again.

You’re actually never going to pick up the phone and call a customer service line because there will be a time and place, and maybe it’s 10 years, two years, I don’t know, I’m not Nostradamus. But it will be to the point where it’s so good, it knows you personally. It knows your information, and it’s not because it’s been trained on it. It’s because it’s injected at runtime and it knows the last thing you ordered. It knows what the previous complaints you’ve had are.

It can solve them for you by looking up company documentation and it can address them internally by saying, “Hey, product team, we should think about doing this.” That is where we’re headed to the point where they’re so helpful and it’s not because they actually know all this stuff. It’s because that combined with really careful prompt engineering and injection of accurate, relevant data makes systems that are seemingly incredibly intelligent. And I say seemingly because I’m not yet completely convinced that it’s anything more than a tool. So anybody that personified, that’s why I don’t like the word hallucinations, but it is just a tool. But this tool happens to be really, really good.

Roland Meertens: The future is bright if it can finally solve the issues that you have whenever you have to call a company on the phone.

Sam Partee: God, I hope I never have to call another agent again.

Deploying your solution [28:57]

Roland Meertens: In any case, for the last question, the thing you discussed with another participant here at the QCon conference was, if you want to run these large language models, is there any way to do it or do you have any recommendations for doing this on prem, rather than having to send everything to an external partner?

Sam Partee: That’s a good question. There’s a cool company, I think it’s out of Italy, called Prem, literally, that has a lot of these. So shout out to them, they’re great. But in general, the best way that I’ve seen companies do it is Nvidia Triton; it’s a really great tool. The pipelining, and being able to feed a Python model’s result to a C++ quantized PyTorch model and whatnot. If you’re really going to go down the route of doing it custom, going and talking to Nvidia is never a bad idea. They’re probably going to love that.

But one of the biggest things I’ve seen is that people that are doing it custom, that are actually making their own models, aren’t talking about it a whole lot. And I think that’s because it’s a big source of IP in a lot of these platforms, and that’s why people so commonly have questions about on-prem, and I do think it’s a huge open market. But personally, if you’re training models, you can use things like Determined. Shout out to Evan Sparks and HPE, but there’s a lot of ways to train models. There’s really not a lot of ways right now to use those models in the same way that you would use OpenAI’s API. There’s not a lot of ways to, say… even Triton has an HTTP API, but the way that you form the thing that you send to Triton versus what you do for OpenAI, the barrier to entry of those two things.

Roland Meertens: Yes, GRPC uses DP for this-

Sam Partee: Oh, they’re just so far apart. So the barrier to adoption for the API level tools is so low, and the barrier to adoption for on-prem is unbelievably high. And let alone, you can probably not even get a data center GPU right now. I actually saw a company recently that’s actually doing this on some AMD chips. I love AMD, but CUDA runs the world in AI right now. And if you want to run a model on prem, you got to have a CUDA enabled GPU, and they’re tough to get. So it’s a hard game right now on premise, I got to say.

Roland Meertens: They’re all sold out everywhere. Also on the Google Cloud platform, they’re sold out.

Sam Partee: Really?

Roland Meertens: Even on Hugging Face, it’s sometimes hard to get one.

Sam Partee: Lambda is another good place. I really liked their Cloud UI. Robert Brooks and Co. Over there at Lambda are awesome. So that’s another good one.

Roland Meertens: All right. Thanks for your tips, Sam.

Sam Partee: That was fun.

Roland Meertens: And thank you very much for joining the InfoQ Podcast.

Sam Partee: Of course.

Roland Meertens: Thank you very much for listening to this podcast. I hope you enjoyed the conversation. As I mentioned, we will upload the talk by Sam Partee on InfoQ.com sometime in the future, so keep an eye on that. Thank you again for listening, and thanks again to Sam for being a guest.


Subscribe for MMS Newsletter

By signing up, you will receive updates about our latest information.



How to Plan Your MongoDB Upgrade – The New Stack

MMS Founder
MMS RSS

Posted on mongodb google news. Visit mongodb google news









MongoDB 4.4 and 5.0 will reach their end of life very soon, so it’s time to update your database software. Here’s how to begin.


Jan 29th, 2024



Featured image by Al Soot on Unsplash.

MongoDB 4.4 will reach end of life (EOL) in February 2024, and MongoDB 5.0 will join it in August. If that’s not enough incentive to start planning your MongoDB upgrade, also consider that updating helps eliminate the security and compliance risks that can come from outdated software and offers features that can improve database performance, security and scalability.

Here are some of the new features in newer versions of MongoDB that might make you decide to upgrade sooner rather than later.

MongoDB 5.x

  • Live resharding for databases allows users to change shard keys as their workloads and databases evolve without incurring downtime (see the sketch after this list).
  • Sharding for time-series collections enhances scalability and performance for temporal data management.
  • MongoDB 5.3 introduced clustered collections, which store data based on the associated clustered index keys. This prioritizes query performance over write speed when specific order matters for analytical queries.
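As a rough illustration of the live resharding item above, this is roughly what changing a shard key looks like from PyMongo; the database, collection, and key names are placeholders, the command must be run against a mongos router on a sharded cluster, and you should check the MongoDB documentation for your version before trying it.

```python
from pymongo import MongoClient

client = MongoClient("mongodb://mongos-host:27017")  # must target a mongos router

# Ask the cluster to reshard the collection onto a new shard key while it stays online.
client.admin.command({
    "reshardCollection": "shop.orders",
    "key": {"customerId": "hashed"},
})
```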

MongoDB 6.x

  • Secondary and compound indexes in time-series collections boost read performance and enable new use cases such as geoindexing.
  • Enhanced change streams allow users to access both the previous and current states of modified documents, facilitating tasks like making downstream document updates and referencing deleted ones (see the sketch after this list). This also supports data definition language (DDL) operations, such as creating or dropping collections and indexes.
  • Administrators can compress and encrypt audit events before storing them on disk using a Key Management Interoperability Protocol (KMIP)-compliant key-management system.
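For the enhanced change streams item above, here is a hedged PyMongo sketch; the collection name is invented, the pre-image options follow the MongoDB 6.0 documentation as I understand it, and change streams also require a replica set or sharded cluster.

```python
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
db = client.shop

# Opt the collection in to storing document pre- and post-images.
db.create_collection("orders", changeStreamPreAndPostImages={"enabled": True})

# Watch for changes and receive both the before and after states of each document.
with db.orders.watch(
    full_document="updateLookup",
    full_document_before_change="whenAvailable",
) as stream:
    for change in stream:
        print(change.get("fullDocumentBeforeChange"), "->", change.get("fullDocument"))
```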

MongoDB 7.x

  • The ability to modify time-series data increases flexibility and control.
  • New aggregation pipeline operators and variables enable complex data transformations.
  • Wildcard indexes improve query performance for faster searches.
  • Queryable encryption maintains data confidentiality while allowing querying encrypted data.

How to Plan Your MongoDB Upgrade

Rather than giving you an overly technical guide to upgrading MongoDB, I’ll focus on some best practices that ring true for any MongoDB upgrade, whether you’re spurred on by 4.4 EOL or want to take advantage of the latest and greatest in 7.0. If you’re looking for a more technical, step-by-step walkthrough, you can watch Best Practices for Upgrading MongoDB 4.4.

1. Assess Your Current Environment

Though often overlooked, doing a comprehensive evaluation of your existing setup is essential to minimizing risks and downtime and enabling a smooth, successful upgrade.

It might sound obvious, but start by identifying which version of MongoDB you’re using. Knowing your current version is essential both for determining the gap between your existing setup and the latest version, and selecting the appropriate upgrade path.

Next, evaluate your resources and hardware. For example, do your current servers have sufficient CPU, memory and storage capacities to handle the new version efficiently? And thinking long term, will the new setup meet your future workloads and scaling needs?

Then, understand how you use MongoDB. Is it primarily used for transactional data? Analytical queries? Gaming applications? Different versions of MongoDB may be better suited for specific use cases, so it’s essential to evaluate if the new target version aligns with your database’s intended purpose.

Finally, before undertaking an upgrade, execute a thorough backup of your current data. This includes not just your database’s contents but your application data, customizations, replication configurations, indexes and security settings as well. Percona Backup for MongoDB is an open source community backup tool that can help you back up all of this data.
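As a small illustration of this first assessment step, here is one way you might confirm the running server version and the current FCV programmatically with PyMongo; the connection string is a placeholder.

```python
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")

# Server binary version, e.g. "4.4.29".
print("server version:", client.server_info()["version"])

# Feature compatibility version, which may lag behind the binary version.
fcv = client.admin.command(
    {"getParameter": 1, "featureCompatibilityVersion": 1}
)
print("featureCompatibilityVersion:", fcv["featureCompatibilityVersion"])
```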

2. Test, Test and Test Again

Before implementing an upgrade in your production environment, it’s imperative — seriously, imperative — to create a separate sandbox or staging environment that mimics your production setup. Your goal is to identify any issues, conflicts or unexpected behaviors that may arise during the transition. This allows you to catch potential problems in a safe and controlled setting and try to avoid long periods of unexpected downtime.

3. Have a Rollback Plan, Just in Case

You can do all the prep in the world, but unforeseen difficulties happen. Try to find someone with experience doing upgrades to help you, whether they’re a member of your staff or a consultant. At the very least, in case something does go wrong, it’s crucial to have a way to roll back to the previous version of your database.

A comprehensive MongoDB rollback plan typically includes:

  • Doing backups of data and configurations.
  • Documenting the current state.
  • Communicating your plans to key stakeholders.
  • Identifying your rollback trigger(s).
  • Creating detailed, documented rollback procedures.
  • Monitoring your environment.
  • Analyzing and understanding what went wrong with the upgrade.

4. Decide Whether to Use a Stable or a Development Version

Stable versions of MongoDB have undergone extensive testing and are deemed production-ready, whereas those still in development may not be fully ready for prime time. Your choice between these versions should be guided by your organization’s risk tolerance and your upgrade’s specific objectives.

5. Determine Your Upgrade Steps and Path

While the upgrade process will vary depending on the individual environment, the general recommended path for a basic MongoDB upgrade is:

  • Take a backup.
  • Download the new binaries.
  • Keep the feature compatibility version (FCV) set to the current/previous version.
  • Shut down Mongo processes in the correct order, in a rolling fashion according to your system type.
  • Replace your current binaries with the new binaries.
  • Start Mongo processes in the correct order using the rolling fashion required by your system type.
  • Wait 24-48 hours (depending on your database) to ensure there are no problems.
  • Set the FCV to the new version.

It’s best to conduct the upgrade process slowly and steadily. Progress from your current version through each major release until you’ve reached your intended version. For example, if you are on 4.4, that process would look like 4.4.1+ to 5.0 to 6.0 to 7.0. Do not jump from 4.4 to 7.0.
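Once a hop has run cleanly for the suggested 24-48 hours, raising the FCV is a single admin command. This sketch assumes a 4.4-to-5.0 hop; newer server versions may also require a confirm flag, so check the release notes for your target version.

```python
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")

# Only after the new binaries have run without problems: unlock 5.0-only features.
client.admin.command({"setFeatureCompatibilityVersion": "5.0"})

# Verify the change took effect before moving on to the next hop (5.0 -> 6.0 -> 7.0).
print(client.admin.command(
    {"getParameter": 1, "featureCompatibilityVersion": 1}
)["featureCompatibilityVersion"])
```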

Next Steps

Once you complete your upgrade, conduct some post-upgrade testing and optimization to ensure your new MongoDB database is running as it should. We’ll cover that in our next article.

If you need assistance with your MongoDB upgrade, Percona can help you create a personalized upgrade plan. Our experts will help you upgrade MongoDB, proactively identifying and mitigating blockers, incompatibilities and potential performance issues during and beyond the upgrade cycle.


Article originally posted on mongodb google news. Visit mongodb google news

Subscribe for MMS Newsletter

By signing up, you will receive updates about our latest information.



Java News Roundup: WildFly 31, Eclipse Store 1.1, Liberica NIK, Quarkus, JHipster Lite

MMS Founder
MMS Michael Redlich

Article originally posted on InfoQ. Visit InfoQ

This week’s Java roundup for January 22nd, 2024 features news highlighting: WildFly 31.0.0, Eclipse Store 1.1.0, BellSoft Liberica Native Image Kit, multiple Quarkus and JHipster Lite releases and Jakarta EE 11 updates.

OpenJDK

After its review had concluded, JEP 455, Primitive Types in Patterns, instanceof, and switch (Preview), has been promoted from Proposed to Target to Targeted for JDK 23. This JEP, under the auspices of Project Amber, proposes to enhance pattern matching by allowing primitive type patterns in all pattern contexts, and to extend instanceof and switch to work with all primitive types. Aggelos Biboudis, Principal Member of Technical Staff at Oracle, has recently published an updated draft specification for this feature.

JDK 23

Build 7 of the JDK 23 early-access builds was made available this past week featuring updates from Build 6 that include fixes for various issues. More details on this release may be found in the release notes.

JDK 22

Build 33 of the JDK 22 early-access builds was also made available this past week featuring updates from Build 32 that include fixes to various issues. Further details on this build may be found in the release notes.

For JDK 23 and JDK 22, developers are encouraged to report bugs via the Java Bug Database.

Jakarta EE 11

In his weekly Hashtag Jakarta EE blog, Ivar Grimstad, Jakarta EE Developer Advocate at the Eclipse Foundation, has provided an update on the progress of Jakarta EE 11 and beyond. As per the Jakarta EE Specification Process, the Jakarta EE Specification Committee will conduct a progress review of the planned Jakarta EE 11 release and vote on a ballot to approve. If the ballot does not pass, the release date of Jakarta EE 11 could be delayed.

Also, the Jakarta EE Working Group has been thinking beyond Jakarta EE 11 and discussing some ideas for new specifications, such as Jakarta AI. The group has created this Google Doc for the Java community to review and provide input/feedback.

Spring Framework

The Spring Framework team has disclosed that versions 6.1.3 and 6.0.16, released on January 11, 2024, addressed CVE-2024-22233, Spring Framework Server Web DoS Vulnerability, that allows an attacker to provide a specially crafted HTTP request that may cause a denial-of-service condition if the application uses Spring MVC and Spring Security 6.1.6+ or 6.2.1+ is on the classpath.

Versions 3.2.1 and 3.1.8 of Spring Shell have been released, delivering notable changes: a resolution for the command alias not working at the type level when the subcommand is empty; a split of the JLine dependencies due to issues with native image and to avoid importing classes that may not be needed in an application; and a resolution to the shell cursor not being restored in the terminal multiplexer (tmux) if the shell is hiding the cursor. Both versions build upon Spring Boot 3.2.2 and 3.1.8, respectively. More details on these releases may be found in the release notes for version 3.2.1 and version 3.1.8.

Spring Cloud Commons 4.1.1 has been released, featuring a bug fix in which implementations of the Spring Framework BeanPostProcessor interface were not registered correctly when the @LoadBalanced annotation bean was instantiated during auto-configuration. Further details on this release may be found in the release notes.

BellSoft

BellSoft has released versions 23.1.2 for JDK 21 and 23.0.3 for JDK 17 of their Liberica Native Image Kit builds as part of the Oracle Critical Patch Update for January 2024 to address several security and bug fixes. Other notable improvements include: support for AWT and JavaFX fullscreen mode; intrinsified memory copying routines on AMD64 platforms and, where available, they now use AVX instructions for better performance; and SubstrateVM monitor enter/exit routines for accelerated startup of native images.

WildFly

Red Hat has released version 31 of WildFly with application server features such as: support for MicroProfile 6.1, Hibernate ORM 6.4.2, Hibernate Search 7.0.0 and Jakarta MVC 2.1; and the ability to exchange messages from the MicroProfile Reactive Messaging 3.0 specification with Advanced Message Queuing Protocol (AMQP) 1.0. This release also introduces WildFly Glow, a command line and a set of tools to “provision a trimmed WildFly server instance that contains the server features that are required by an application.” InfoQ will follow up with a more detailed news story.

Quarkus

Red Hat has also released version 3.6.7 of Quarkus with notable changes such as: ensure that the refreshed CSRF cookie retains its original value based on the presence of the token header; dependency management for the Hibernate JPA 2 Metamodel Generator; and a resolution to entity manager issues with Spring Data JPA when using multiple persistence units. More details on this release may be found in the changelog.

Quarkus 3.2.10.Final, the tenth maintenance release in the 3.2 LTS release train, primarily delivers resolutions to CVEs such as: CVE-2023-5675, an authorization flaw with endpoints used in Quarkus RestEasy Reactive and Classic applications customized by Quarkus extensions using the annotation processor; and CVE-2023-6267, an annotation-based security flaw in which the JSON body that a resource may consume is being processed, i.e., deserialized, prior to the security constraints being evaluated and applied. Further details on this release may be found in the changelog.

Helidon

The release of Helidon 4.0.4 delivers notable changes such as: a resolution to the currentSpan() method defined in the TracerProviderHelper class throwing a NullPointerException in situations where an implementation of the TracerProvider class is null; a cleanup and simplification of the logic to determine which type of IP addresses, v4 or v6, to consider during name resolution in WebClient configuration; and security propagation is now disabled when not properly configured. More details on this release may be found in the changelog.

Micronaut

The Micronaut Foundation has released version 4.2.4 of the Micronaut Framework featuring Micronaut Core 4.2.4, bug fixes, dependency upgrades and updates to modules: Micronaut AWS, Micronaut Flyway, Micronaut JAX-RS, Micronaut JMS, Micronaut MicroStream, Micronaut MQTT and Micronaut Servlet. Further details on this release may be found in the release notes.

Hibernate

The release of Hibernate Reactive 2.2.2.Final ships with: a dependency upgrade to Hibernate ORM 6.4.2.Final; removal of unused code that caused a ClassCastException in Quarkus at start up; and new annotations, @EnableFor and @DisabledFor, to enable and disable, respectively, tests for database types. More details on this release may be found in the release notes.

The second alpha release of Hibernate Search 7.1.0 provides: compatibility with Hibernate ORM 6.4.2.Final, Lucene 9.9.1 and Elasticsearch 8.12; an integration of the Elasticsearch/OpenSearch vector search capabilities; and the ability to look up the capabilities of each field when inspecting the metamodel. Further details on this release may be found in the release notes.

Eclipse Store

The release of Eclipse Store 1.1.0 delivers new features such as: monitoring support using the Java Management Extensions (JMX) framework; integration with Spring Boot 3.x; and an implementation of JSR 107, Java Temporary Caching API (JCache). More details on this release may be found in the release notes.

Infinispan

Versions 15.0.0.Dev07 and 14.0.22.Final of Infinispan ship with dependency upgrades and resolutions to notable bug fixes such as: a flaky test failure from the testExpirationCompactionOnLogFile() method defined in the SoftIndexFileStoreFileStatsTest class; an IllegalArgumentException from within the getMembersPhysicalAddresses() method defined in the JGroupsTransport class; and a NullPointerException due to a failover of the Hot Rod Client hanging. Further details on these releases may be found in the release notes for version 15.0.0.Dev07 and version 14.0.22.

JHipster

Versions 1.3.0, 1.2.1 and 1.2.0 of JHipster Lite have been released to deliver bug fixes, dependency upgrades and new features/enhancements such as: use of the LinkedHashSet class instead of the HashSet class for improved reproducible generated code; use of Signals and Control Flow, new features of Angular 17; and support for Protocol Buffers. More details on these releases may be found in the release notes for version 1.3.0, version 1.2.1 and version 1.2.0.

Testcontainers for Java

The release of Testcontainers for Java 1.19.4 ships with bug fixes, improvements in documentation and new features such as: an enhancement in the exec command that supports setting a work directory and environmental variables; support for MySQL 8.3; and an increase of the default startup time for Selenium to 60 seconds. Further details on this release may be found in the release notes.

Gradle

The third release candidate of Gradle 8.6 provides continuous improvement in: support for custom encryption keys in the configuration cache via the GRADLE_ENCRYPTION_KEY environment variable; improvements in error and warning reporting; improvements in the Build Init Plugin to support various types of projects; and enhanced build authoring for plugin authors and build engineers to develop custom build logic. More details on this release may be found in the release notes.


Subscribe for MMS Newsletter

By signing up, you will receive updates about our latest information.



Cloud-based Database Market Growth Statistics & Future – openPR.com

MMS Founder
MMS RSS

Posted on mongodb google news. Visit mongodb google news


Global Cloud-based Database Market Growth (Status and Outlook) 2024-2030 is the latest research study released by HTF MI evaluating the market risk side analysis, highlighting opportunities, and leveraging strategic and tactical decision-making support. The report provides information on market trends and development, growth drivers, technologies, and the changing investment structure of the Global Cloud-based Database Market. Some of the key players profiled in the study are Amazon Web Services, Google, IBM, Microsoft, Oracle, Rackspace Hosting, Salesforce, Cassandra, Couchbase, MongoDB, SAP, Teradata, Alibaba, Tencent.
Get free access to sample report @ https://www.htfmarketreport.com/sample-report/4343652-global-cloud-based-database-market-growth-2

Cloud-based Database Market Overview:

The study provides a detailed outlook vital to keeping market knowledge up to date, segmented by SQL Database, NoSQL Database, Small and Medium Business, Large Enterprises, and 18+ countries across the globe, along with insights on emerging and major players. If you want to analyze different companies involved in the Cloud-based Database industry according to your targeted objective or geography, we offer customization according to your requirements.

Cloud-based Database Market: Demand Analysis & Opportunity Outlook 2029

The Cloud-based Database research study defines the market size of various segments and countries by historical years and forecasts the values for the next 6 years. The report is assembled to comprise qualitative and quantitative elements of the Cloud-based Database industry, including market share and market size (value and volume 2018-2022, and forecast to 2029) for each country concerned in the competitive marketplace. Further, the study also provides in-depth statistics about the crucial elements of Cloud-based Database, which include the drivers and restraining factors that help estimate the future growth outlook of the market.

The segments and sub-sections of the Cloud-based Database market are shown below:

The Study is segmented by the following Product/Service Type: SQL Database, NoSQL Database

Major applications/end-user industries are as follows: Small and Medium Business, Large Enterprises

Some of the key players involved in the Market are: Amazon Web Services, Google, IBM, Microsoft, Oracle, Rackspace Hosting, Salesforce, Cassandra, Couchbase, MongoDB, SAP, Teradata, Alibaba, Tencent

Important years considered in the Cloud-based Database study:
Historical year – 2018-2022; Base year – 2022; Forecast period** – 2023 to 2029 [** unless otherwise stated]

Buy Cloud-based Database research report @ https://www.htfmarketreport.com/buy-now?format=1&report=4343652

The global version of the Cloud-based Database Market report includes the following country-level analysis:
• North America (the USA, Canada, and Mexico)
• Europe (Germany, France, the United Kingdom, Netherlands, Italy, Nordic Nations, Spain, Switzerland, and the Rest of Europe)
• Asia-Pacific (China, Japan, Australia, New Zealand, South Korea, India, Southeast Asia, and the Rest of APAC)
• South America (Brazil, Argentina, Chile, Colombia, the Rest of the countries, etc.)
• the Middle East and Africa (Saudi Arabia, United Arab Emirates, Israel, Egypt, Turkey, Nigeria, South Africa, Rest of MEA)

Key Questions Answered in this Study
1) What makes the Cloud-based Database Market attractive for long-term investment?
2) In which value chain areas can players create value?
3) Which territories may see a steep rise in CAGR and year-over-year growth?
4) Which geographic regions will have better demand for these products and services?
5) What opportunities would emerging territories offer to established and new entrants in the Cloud-based Database market?
6) What risks are associated with service providers?
7) How influential are the factors driving demand for cloud-based databases over the next few years?
8) What is the impact of various factors on the growth of the Global Cloud-based Database market?
9) What strategies have helped big players acquire share in a mature market?
10) How are technology and customer-centric innovation changing the Cloud-based Database Market?

The report is organized into 15 chapters covering the Global Cloud-based Database Market:
Chapter 1: Overview, including the definition, specifications, and classification of the market, applications (Small and Medium Business, Large Enterprises), and market segments by type (SQL Database, NoSQL Database);
Chapter 2: Objective of the study;
Chapter 3: Research methodology, measures, assumptions, and analytical tools;
Chapters 4 and 5: Market trend analysis, drivers, challenges by consumer behavior, marketing channels, and value chain analysis;
Chapters 6 and 7: Market analysis, segmentation analysis, and characteristics;
Chapters 8 and 9: Five forces analysis (bargaining power of buyers/suppliers), threats from new entrants, and market conditions;
Chapters 10 and 11: Analysis by regional segmentation (North America, Europe, Asia-Pacific, etc.), comparison, leading countries, opportunities, and customer behaviour;
Chapter 12: Major decision framework compiled from industry experts and strategic decision-makers;
Chapters 13 and 14: Competitive landscape (classification and market ranking);
Chapter 15: Sales channel, research findings, conclusion, appendix, and data source.

Enquire for customization in Report @ https://www.htfmarketreport.com/enquiry-before-buy/4343652-global-cloud-based-database-market-growth-2

Thanks for showing interest in the Cloud-based Database industry research publication; individual chapter-wise sections or region-wise report versions are also available, such as North America, LATAM, United States, GCC, Southeast Asia, Europe, APAC, Japan, United Kingdom, India or China.

Contact Us :
Craig Francis (PR & Marketing Manager)
HTF Market Intelligence Consulting Private Limited
Phone: +1 434 322 0091
sales@htfmarketreport.com

About Author:
HTF Market Intelligence Consulting is positioned to empower businesses with growth strategies through research and consulting services, offering depth and breadth of thought leadership, research, tools, events, and experience that assist in decision-making.

This release was published on openPR.

Article originally posted on mongodb google news. Visit mongodb google news



FY2024 Earnings Forecast for MongoDB, Inc. (NASDAQ:MDB) Issued By DA Davidson

MMS Founder
MMS RSS

Posted on mongodb google news. Visit mongodb google news

MongoDB, Inc. (NASDAQ:MDB) – Stock analysts at DA Davidson issued their FY2024 earnings per share estimates for MongoDB in a report released on Friday, January 26th. DA Davidson analyst R. Kessinger forecasts that the company will earn ($1.58) per share for the year. DA Davidson currently has a “Neutral” rating and a $405.00 target price on the stock. The consensus estimate for MongoDB’s current full-year earnings is ($1.63) per share. DA Davidson also issued estimates for MongoDB’s Q4 2024 earnings at ($0.71) EPS.

MongoDB (NASDAQ:MDB) last issued its quarterly earnings data on Tuesday, December 5th. The company reported $0.96 earnings per share (EPS) for the quarter, topping the consensus estimate of $0.51 by $0.45. The business had revenue of $432.94 million during the quarter, compared to the consensus estimate of $406.33 million. MongoDB had a negative return on equity of 20.64% and a negative net margin of 11.70%. The business’s revenue was up 29.8% on a year-over-year basis. During the same period in the previous year, the firm posted ($1.23) earnings per share.

A number of other equities research analysts also recently weighed in on the company. Piper Sandler upped their target price on MongoDB from $425.00 to $500.00 and gave the stock an “overweight” rating in a research report on Wednesday, December 6th. Tigress Financial increased their price objective on MongoDB from $490.00 to $495.00 and gave the company a “buy” rating in a research report on Friday, October 6th. Truist Financial reiterated a “buy” rating and set a $430.00 price objective on shares of MongoDB in a research report on Monday, November 13th. TheStreet raised shares of MongoDB from a “d+” rating to a “c-” rating in a research note on Friday, December 1st. Finally, Needham & Company LLC reissued a “buy” rating and set a $495.00 target price on shares of MongoDB in a research note on Wednesday, January 17th. One investment analyst has rated the stock with a sell rating, four have issued a hold rating and twenty-one have given a buy rating to the company’s stock. Based on data from MarketBeat.com, the stock has an average rating of “Moderate Buy” and an average price target of $429.50.


MongoDB Stock Performance

MDB opened at $395.29 on Monday. The company has a market capitalization of $28.53 billion, a price-to-earnings ratio of -149.73 and a beta of 1.23. The company has a debt-to-equity ratio of 1.18, a current ratio of 4.74 and a quick ratio of 4.74. MongoDB has a 1-year low of $189.59 and a 1-year high of $442.84. The business has a 50 day moving average price of $402.37 and a two-hundred day moving average price of $380.86.

Institutional Trading of MongoDB

Several large investors have recently made changes to their positions in MDB. Raymond James & Associates grew its holdings in shares of MongoDB by 32.0% during the first quarter. Raymond James & Associates now owns 4,922 shares of the company’s stock worth $2,183,000 after purchasing an additional 1,192 shares during the last quarter. PNC Financial Services Group Inc. lifted its stake in shares of MongoDB by 19.1% during the first quarter. PNC Financial Services Group Inc. now owns 1,282 shares of the company’s stock valued at $569,000 after acquiring an additional 206 shares during the period. MetLife Investment Management LLC acquired a new position in shares of MongoDB during the first quarter valued at $1,823,000. Panagora Asset Management Inc. lifted its stake in shares of MongoDB by 9.8% during the first quarter. Panagora Asset Management Inc. now owns 1,977 shares of the company’s stock valued at $877,000 after acquiring an additional 176 shares during the period. Finally, Vontobel Holding Ltd. lifted its stake in shares of MongoDB by 100.3% during the first quarter. Vontobel Holding Ltd. now owns 2,873 shares of the company’s stock valued at $1,236,000 after acquiring an additional 1,439 shares during the period. Institutional investors own 88.89% of the company’s stock.

Insider Activity

In other news, CEO Dev Ittycheria sold 100,500 shares of the company’s stock in a transaction that occurred on Tuesday, November 7th. The stock was sold at an average price of $375.00, for a total value of $37,687,500.00. Following the sale, the chief executive officer now directly owns 214,177 shares in the company, valued at $80,316,375. The transaction was disclosed in a document filed with the Securities & Exchange Commission, which is accessible through the SEC website. Also, CAO Thomas Bull sold 359 shares of the company’s stock in a transaction on Tuesday, January 2nd. The shares were sold at an average price of $404.38, for a total value of $145,172.42. Following the completion of the sale, the chief accounting officer now owns 16,313 shares in the company, valued at $6,596,650.94. Over the last ninety days, insiders have sold 149,277 shares of company stock worth $57,223,711. Company insiders own 4.80% of the company’s stock.

MongoDB Company Profile


MongoDB, Inc provides a general-purpose database platform worldwide. The company offers MongoDB Atlas, a hosted multi-cloud database-as-a-service solution; MongoDB Enterprise Advanced, a commercial database server for enterprise customers to run in the cloud, on-premises, or in a hybrid environment; and Community Server, a free-to-download version of its database, which includes the functionality that developers need to get started with MongoDB.


This instant news alert was generated by narrative science technology and financial data from MarketBeat in order to provide readers with the fastest and most accurate reporting. This story was reviewed by MarketBeat’s editorial team prior to publication. Please send any questions or comments about this story to contact@marketbeat.com.


Article originally posted on mongodb google news. Visit mongodb google news



What you need to know about Couchbase: Takeaways from a NoSQL masterclass

MMS Founder
MMS RSS

Posted on nosqlgooglealerts. Visit nosqlgooglealerts

Couchbase, a leading player in the NoSQL space, recently conducted a masterclass in the ever-evolving landscape of database management. Led by seasoned expert Anish Mathew, Principal Solutions Architect – APAC at Couchbase, the masterclass showcased real-time data activation, efficient event-driven transformations, and agile query execution capabilities.

Mathew said traditional relational databases faced limitations in handling increasing volume, variety, and velocity of data. The CAP theorem, which highlights the trade-offs between consistency, availability, and partition tolerance, becomes a crucial consideration in choosing databases. Couchbase’s architecture is designed to address the challenges posed by the evolving data landscape.

Couchbase’s NoSQL solution can fill the gaps left by traditional databases. NoSQL databases are designed to handle unstructured or semi-structured data, providing more flexibility in data modelling. They can scale horizontally, accommodating the growing velocity of data generated by IoT devices, sensors, and other sources.

The Couchbase live demo

As part of the masterclass, a live demonstration offered attendees a firsthand overview of the platform’s real-time capabilities. In this live demonstration, Mathew showcased Couchbase’s ability to handle a range of data sources seamlessly.

Here are the highlights from the presentation:

1. Couchbase architecture overview: Couchbase is a NoSQL database that offers multi-dimensional scaling, allowing for the distribution of different services such as data, query, index, search, analytics, and eventing. Each service is responsible for various features like key-value access, data storage distribution, SQL query capabilities, and indexing. 

2. Database Change Protocol (DCP) stream: The backbone of Couchbase’s architecture is the Database Change Protocol (DCP), which is responsible for streaming mutations from the data service and facilitating capabilities such as Change Data Capture (CDC). To extract the CDC information from the cluster, one can utilise Kafka connectors.

3. Scaling and deployment: Couchbase offers both vertical and horizontal scaling options for each service, allowing for flexible deployment based on specific needs. Additionally, the Cross Data Centre Replication (XDCR) feature supports global deployments, whether in an active-active or active-passive configuration.

4. Security: Couchbase provides comprehensive security measures, including encryption, role-based access control, and various encryption techniques at different levels of the system.

5. Migration from RDBMS: Couchbase offers a simplified migration process from relational databases. Its structure, which includes Cluster, Bucket, Scope, Collection, and Document, is similar to that of RDBMS. Migrating indexes and SQL queries is also quite straightforward.

6. Auto-Sharding: Auto-sharding is implemented by using the CRC32 algorithm to hash the document key, which then determines the placement of the data in virtual buckets that spread across nodes.

7. Read and write operations: Reads are improved by using managed cache and fetching data from disk if it is not already in memory. Writes are initially stored in memory and then placed in a queue for replication and disk storage. Asynchronous queues can be adjusted to operate synchronously for operations that need higher durability.

8. Accessing data: Couchbase offers a range of data access options, such as key-value retrieval, N1QL (which is SQL for JSON), full-text search capabilities, and the ability to combine SQL queries with key-value operations to create all-or-nothing transactions (a minimal sketch of key-value and N1QL access follows this list).

9. High availability: Because of its master-master architecture and its special handling of replicas to achieve partition tolerance, Couchbase is one of the most highly available systems in the NoSQL world. With its inter-cluster synchronisation capability, even the loss of an entire data center can be handled.

10. Deployment options: Couchbase offers flexible deployment options, including managed service through Couchbase Cloud, as well as on public clouds like AWS, Azure, and GCP, on-premises, or on Kubernetes.

11. Speed demo: The presenter showcased how eventing scripts can be used to populate data and emphasised the impressive speed at which Couchbase can perform.
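
To make points 2 and 8 above more concrete, here is a minimal sketch of key-value access and a N1QL (SQL++) query using the Couchbase Python SDK. This is not code from the masterclass; the connection string, credentials, and the bucket, scope, and collection names are assumptions for illustration, and the import paths assume a recent 4.x SDK.

from couchbase.auth import PasswordAuthenticator
from couchbase.cluster import Cluster
from couchbase.options import ClusterOptions, QueryOptions

# Illustrative connection details; point these at a real cluster or Capella instance.
cluster = Cluster(
    "couchbase://localhost",
    ClusterOptions(PasswordAuthenticator("demo_user", "demo_password")),
)
collection = cluster.bucket("demo-bucket").scope("inventory").collection("orders")

# Key-value write and read: the document key is hashed (CRC32) to a vBucket under the hood.
collection.upsert("order::1001", {"customer": "acme", "total": 42.5, "status": "open"})
print(collection.get("order::1001").content_as[dict])

# The same data queried through the query service with N1QL/SQL++.
result = cluster.query(
    "SELECT o.customer, o.total FROM `demo-bucket`.inventory.orders AS o WHERE o.status = $status",
    QueryOptions(named_parameters={"status": "open"}),
)
for row in result:
    print(row)

Because key-value reads go through the managed cache while the N1QL statement goes through the query service, the two access paths can be scaled independently, which is the multi-dimensional scaling described in point 1.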

Couchbase’s flexible scalability empowers organisations to optimise resource utilisation according to their unique requirements. Moreover, geo-replication facilitates seamless data synchronisation across globally distributed clusters.

The powerhouse: Eventing Service, N1QL, Smart Indexing, and Capella interface

Couchbase’s Eventing Service is a formidable asset in their arsenal. Mathew said the service excels at seamlessly orchestrating the flow of data. During the live demo, the Eventing service’s ability to handle diverse data sources was showcased, emphasising its role in streamlining complex data processes through dynamic data interplay.

N1QL, Couchbase’s agile query engine took the spotlight next, highlighting its efficiency and ability to deliver exceptional query performance. Participants then explored the potential of N1QL in transforming data retrieval and analysis, delving into the intricacies of querying at lightning-fast speeds.

Next, the focus was on Smart Indexing, a crucial component in unlocking Couchbase’s complete capabilities. The Capella interface serves as a canvas to demonstrate Couchbase’s precision, speed, and scalability in managing data.

Concluding Couchbase’s transformative power

The Couchbase masterclass went beyond being a simple initiative; it was an exploration of the unexplored realms of NoSQL innovation.

Mathew challenged participants with a question that ignited their curiosity at the end of the masterclass: “In a data-driven world, what boundaries can Couchbase push?” This question served as a spark, motivating attendees and empowering them to embark on their own NoSQL journeys. 

These are the key takeaways from this Couchbase masterclass about new NoSQL databases:

1. Couchbase redefines data processing, making it dynamic and responsive.

2. The Eventing service orchestrates the seamless flow of data.

3. N1QL empowers agile and efficient querying, turning every query into a performance masterpiece.

4. Smart indexing is pivotal for precision and speed in data management.

5. Couchbase’s flexibility shines in handling custom sharding strategies.

Couchbase is shaping the future of data management. As Mathew said: “In the dynamic realm of data, Couchbase is the compass guiding us to new frontiers.”



SingleStore adds indexed vector search to Pro Max release for faster AI work

MMS Founder
MMS RSS

Posted on mongodb google news. Visit mongodb google news

Generative AI’s all-pervading influence has encouraged SingleStore to add new features to its Pro Max release to make AI app development faster.

SingleStore’s eponymous database combines real-time transactional and analytic database features and has hundreds of customers including Siemens, Uber, Palo Alto Networks and SiriusXM. SingleStore surpassed $100 million in ARR in 2022. In April last year it demonstrated how ChatGPT could be used to find data in SingleStoreDB. At the time it said SingleStore was the ideal database for private data search by ChatGPT, “given its abilities to store vector data, perform semantic searches and pull data from various sources without extensive ETL.” Now it’s added AI-relevant tooling to its latest database release.


CEO Raj Verma laid it on thick in a statement: “This isn’t just a product update, it’s a quantum leap … SingleStore is offering truly transformative capabilities in a single platform for customers to build all kinds of real-time applications, AI or otherwise.”

The new feature list includes:

  • Indexed vector search: support for vector search using Approximate Nearest Neighbor (ANN) vector indexing algorithms, leading to 800–1,000x faster vector search performance than precise methods (kNN). This hybrid (full-text and indexed vector) search takes advantage of SQL for queries, joins, filters and aggregations, and places SingleStore above vector-only databases that require niche query languages and are not designed to meet enterprise security and resiliency needs (a rough query sketch follows this list).
  • On-demand compute service for GPUs and CPUs. This works alongside SingleStore’s native Notebooks to let developers spin up GPUs and CPUs to run database-adjacent workloads including data preparation, ETL, third-party native application frameworks, etc. This capability is said to bring compute to algorithms, rather than the other way around, enabling developers to build highly performant AI applications without unnecessary data movement.
  • CDC: native capabilities for real-time Change Data Capture (CDC) from MongoDB and MySQL, and ingestion from Apache Iceberg, without requiring third-party CDC tools. SingleStore will also support CDC-out capabilities to ease migrations and enable the use of SingleStore as a source for other applications and databases such as data warehouses and lakehouses.
  • SingleStore Kai: an API that delivers over 100x faster analytics on MongoDB workloads with no query changes or data transformations required. It supports the BSON data format natively, has improved transactional performance, increased performance for arrays, and offers industry-leading compatibility with the MongoDB query language.
  • Projections: allow developers to speed up range filters and “group by” operations by 3x or more by introducing secondary sort and shard keys.
  • Free shared tier. SingleStore has announced a new cloud-based Free Shared Tier that’s designed for startups and developers to quickly bring their ideas to life – without the need to commit to a paid plan.
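
As a rough sketch of the query shape these features target (not an official SingleStore example), the snippet below runs a similarity search from Python with the singlestoredb client; under Pro Max, adding an ANN vector index on the embedding column lets the engine answer such a query approximately instead of scanning every row. The DSN, table and column names, and the tiny 4-dimensional embedding are assumptions for illustration.

import struct

import singlestoredb as s2

# Hypothetical connection string; replace with real credentials and host.
conn = s2.connect("demo_user:demo_password@localhost:3306/demo")

def pack_vector(values):
    # A query vector passed as packed 32-bit floats, matching how the
    # embedding column is assumed to be stored in this sketch.
    return struct.pack(f"{len(values)}f", *values)

with conn.cursor() as cur:
    query_vec = pack_vector([0.1, 0.2, 0.3, 0.4])
    cur.execute(
        """
        SELECT id, title, DOT_PRODUCT(embedding, %s) AS score
        FROM docs
        ORDER BY score DESC
        LIMIT 10
        """,
        (query_vec,),
    )
    for doc_id, title, score in cur.fetchall():
        print(doc_id, title, score)

Because the query is plain SQL, the same statement can be combined with joins, filters and full-text predicates, which is the hybrid-search advantage the release highlights.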

Nadeem Asghar, SVP, Product Management + Strategy at SingleStore, said “New features, including vector search, Projections, Apache Iceberg, Scheduled Notebooks, autoscaling, GPU compute services, SingleStore Kai, and the Free Shared Tier allow startups – as well as global enterprises – to quickly build and scale enterprise-grade real-time AI applications. We make data integration with third-party databases easy with both CDC in and CDC out support.”

SingleStore claims its Pro Max database is the industry’s first and only real-time data platform designed for all applications, analytics and AI. It supports high-throughput ingest performance, ACID transactions and low-latency analytics, and structured, semi-structured (JSON, BSON, text) and unstructured data (vector embeddings of audio, video, images, PDFs, etc.).

We are entering a period of NLQ (Natural Language Queries) with generative AI features added to databases, according to Noel Yuhanna, VP and Principal Analyst at Forrester Research. “Generative AI and LLM can help democratize data.”

Article originally posted on mongodb google news. Visit mongodb google news



Podcast: Sam Partee on Retrieval Augmented Generation (RAG)

MMS Founder
MMS Sam Partee

Article originally posted on InfoQ. Visit InfoQ

Introduction [00:52]

Roland Meertens: Welcome everybody to the InfoQ podcast. My name is Roland Meertens and I’m your host for today. Today, I’m interviewing Sam Partee, who is a Principal Applied AI engineer at Redis. We are talking to each other in person at the QCon San Francisco conference just after he gave the presentation called Generative Search, Practical Advice for Retrieval Augmented Generation. Keep an eye on InfoQ.com for his presentation as it contains many insights into how one can enhance large language models by adding a search component, a technique known as retrieval augmented generation. During today’s interview, we will dive deeper into how you can do this and I hope you enjoy it and I hope you can learn from the conversation.

Welcome Sam to the InfoQ podcast.

Sam Partee: Thank you.

Roland Meertens: We are recording this live at QCon in San Francisco. How do you like the conference so far?

Sam Partee: It’s awesome. I’m really glad Ian invited me to this and we’ve had a really good time. I’ve met some really interesting people. I was talking to the source graph guys earlier and really loved their demo and there’s a lot of tech right now in the scene that even someone like me who every single day I wake up… I went to a meetup for Weaviate last night. I still see new things and it’s one of the coolest things about living here and being in this space and QCon’s a great example of that.

The Redis vector offering [02:09]

Roland Meertens: Yes, it’s so cool that here everybody is working on state-of-the-art things. I think your presentation was also very much towards state-of-the-art and one of the first things people should look at if they want to set up a system with embeddings and want to set up a system with large language models, I think. Can you maybe give a summary of your talk?

Sam Partee: Yes, absolutely. So about two years ago, Redis introduced its vector offering, essentially a vector database offering. It turns Redis into a vector database. So I started at Redis around that time, and my job was not necessarily to embed the HNSW into the database itself. There was an awesome set of engineers, DeVere Duncan and those guys, who are exceptional engineers. There was a gap compared to other vector databases: you couldn’t use Redis with LangChain or LlamaIndex or what have you. So my job is to actually do those integrations and, on top of that, to work with customers. And so over the past two or so years I’ve been working with those integration frameworks, with those customers, with those users in open source, and one of the things that you kind of learn through doing all of that is a lot of the best practices. And that’s really what the talk is: just a lot of stuff that I’ve learned by building, and that’s essentially the content.

Roland Meertens: Okay, so if you say that you are working on vector databases, why can’t you just simply store a vector in any database?

Sam Partee: So it kind of matters what your use case is. So, for Redis for instance, let’s take that. It’s an incredible real-time platform, but if your vectors never change, if you have a static dataset of a billion embeddings, you’re way better off using something like Faiss and storing it in an S3 bucket, loading it into a Lambda function and calling it every once in a while. Just like programming languages, there’s not a one size fits all programming language. You might think Python is, but that’s just because it’s awesome. But it’s the truth that there’s no tool that fits every use case and there’s certainly no vendor that fits every use case. Redis is really good at what it does and so are a lot of other vendors and so you really just got to be able to evaluate and know your use case and evaluate based on that.

Roland Meertens: So what use cases have you carved out for what people would you recommend to maybe re-watch your talk?

Sam Partee: Yes, so specifically use cases, one thing that’s been really big that we’ve done a lot of is chat conversations. So long-term memory for large language models is this concept where the context window, even in the largest case, is what, 16K? No, 32K, something like that, I think, for GPT-4. Even then you could have a chat history that is over that 32K token limit, and in that case you need other data structures than just a vector index, and you need the ability to sort, which sorted sets (ZSETs) in Redis provide. And so there are other data structures that come into play, acting as memory buffers or things like that, that for those kinds of chat conversations end up really mattering, and they’re actually integrated into LangChain. Another one is semantic caching, which is what Redis has been best at for its decade-long career, ever since Salvatore wrote it.

Semantic caching is almost the next evolution of caching, where instead of just a perfect one-to-one match like you would expect from a hash, it’s more of a one-to-many, in the sense that you can have a threshold for how similar a cached item should be, and what that cached item returns is based on that similarity threshold. And so it allows these types of things like, say, the chat conversation, where if you want to ask, “Oh, what’s the last time that I said something like X?”, you can now do that and have the same thing not only returning that conversational memory but also having it cached. And with Redis you get all of that at a really high speed. And so for those use cases it ends up being really great, and there are obviously a lot of others, but I’ll talk about some that it’s not.

So some that it’s not, that we’ve seen that we are not necessarily the best for is internet memory database. And we have tiering and auto tiering which allows you to go to NVME and you can have an NVME drive, whatever, and go from memory to NVME and it can actually do that automatically now, which is quite fascinating to me. But even then, and even if you have those kinds of things enabled, there are cases like I mentioned where you have a product catalog that changes once every six months and it’s not a demanding QPS use case. You don’t need the latencies of Redis. You call this thing once every month to set user-based recommendations that are relatively static or something like that.

Those use cases, it’s kind of an impractical expense. And it’s not like I’m trying to down talk the place I work right now. It’s really just so that people understand why it is so good for those use cases and why it justifies, and even in that case of something like a recommendation system that is live or online, it even justifies itself in terms of return on investment. And so those types of use cases it’s really good for, but the types that are static that don’t change, it really isn’t one of the tools that you’re going to want to have in your stack unless you’re going to be doing something more traditional like caching or using it for one of its other data structures, which is also a nice side benefit that it has so many other things it’s used for. I mean, it’s its own streaming platform.

Use cases for vector embeddings [08:09]

Roland Meertens: So maybe let’s break down some of this use case. You mentioned extracting the vectors from documents and you also mentioned if a vector is close enough, then you use it for caching. Maybe let’s first dive into the second part because that’s what we just talked about. So you say if a vector is close enough, does Redis then internally build up a tree to do this fast nearest neighbor search?

Sam Partee: Sure. Yes. So we have two algorithms. We have KNN, K-nearest neighbors, brute force, and you can think of this like an exhaustive search. It’s obviously a little bit better than that, but imagine just going down a list and doing that comparison. That’s a simplified view of it, but that’s called our flat index. And then we have HNSW, which is our approximate nearest neighbors index. And so both of those are integrated: we’ve vendored HNSWlib, and that’s what’s included inside of Redis. It’s modified for making it work with things like CRUD operations inside of Redis. But that’s what happens when you have those vectors and they’re indexed inside of Redis: if you’re using something like RedisVL, you can pass in a dictionary configuration or a YAML file or what have you, and that chooses what index you end up using for search.

And so for the people out there that are wondering, Which index do I use?” Because that’s always a follow-up question, if you have under a million embeddings, the KNN search is often better because if you think about it in the list example, appending to a list is very fast and recreating that list is very fast. Doing so for a complex tree or graph-based structure that is more computationally complex. And so if you don’t need the latencies of something like HNSW, if you don’t have that many documents, if you’re not at that scale, then you should use the KNN index. In the other case, if you’re above that threshold and you do need those latencies, then HNSW provides those benefits, which is why we have both and we have tons of customers using either one.
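
As a minimal illustration of the index choice Partee describes (not code from the podcast), here is a sketch using redis-py; the index name, key prefix, field names, and 768-dimension setting are assumptions for illustration, and switching the algorithm string between "FLAT" and "HNSW" is the only change between exact and approximate search.

import numpy as np
import redis
from redis.commands.search.field import TagField, VectorField
from redis.commands.search.indexDefinition import IndexDefinition, IndexType

r = redis.Redis(host="localhost", port=6379)

r.ft("docs_idx").create_index(
    fields=[
        TagField("user"),
        VectorField(
            "embedding",
            "HNSW",  # approximate nearest neighbors; use "FLAT" for exact KNN under ~1M vectors
            {"TYPE": "FLOAT32", "DIM": 768, "DISTANCE_METRIC": "COSINE"},
        ),
    ],
    definition=IndexDefinition(prefix=["doc:"], index_type=IndexType.HASH),
)

# Vectors are stored as raw float32 bytes in a hash field alongside other metadata.
vec = np.random.rand(768).astype(np.float32)
r.hset("doc:1", mapping={"user": "alice", "embedding": vec.tobytes()})

Documents written under the doc: prefix are picked up by the index automatically.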

Roland Meertens: So basically then we’re just down to if you have anything stored in your database, it’s basically like a dictionary with nearest neighbor. Do you then get multiple neighbors back or do you just return what you stored in your database at this dictionary location?

Sam Partee: There are two specific data structures, Hashes and JSON documents. Hashes in Redis are like add a key, you store a value. JSON documents, you have add a key, you store a JSON document. When you are doing a vector similarity search within Redis, whatever client you use, whether it’s Go or Java or what have you, or Python, what you get back in terms of a vector search is defined by the syntax of that query. And there are two major ones to know about. First are vector searches, just plain vector searches, which are, “I want a specific number of results that are semantically similar to this query embedding.” And then you have range queries which are, “You can return as many results as you want, but they have to be this specific range in this range of vector distance away from this specific vector.” And whether I said semantically earlier, it could be visual embeddings, it can be semantic embeddings, it doesn’t matter.

And so vector searches, range searches, queries, et cetera, those are the two major methodologies. It’s important to note that Redis also supports straight-up text search and other types of search features, which you can use combinatorially. So all of those are available when you run that, and it’s really defined by how you use it. So if you are particular about, let’s say it’s a recommendation system or a product catalog, again to use that example, you might say, “I only want to recommend things to the user,” there’s probably a case for this, “if they’re this similar. If they’re this similar to what is in the user’s cart or basket.” You might want to use something like a range query, right?

Roland Meertens: Yes, makes sense.

If you’re searching for, I don’t know, your cookbooks on Amazon, you don’t want to get the nearest instruction manual for cars, whatever.

Sam Partee: Yes.

Roland Meertens: Even though it’s near-

Sam Partee: Yes, sure.

Roland Meertens: … at some point there’s a cutoff.

Sam Partee: That might be a semantic similarity or let’s say a score rather than a vector distance, one minus the distance. That might be a score of let’s say point six, right? But that’s not relevant enough to be a recommendation that’s worthwhile. And so, if there’s 700 of them that are worthwhile, you might want 700 of them, but if there’s only two, you might only want two. That’s what range queries are really good for, is because you might not know ahead of time how many results you want back, but you might want to say they can only be this far away and that’s a concept that’s been around in vector search libraries for quite some time. But it is now, you can get it back in milliseconds when you’re using Redis, which is pretty cool.

Hybrid search: vector queries combined with range queries [13:02]

Roland Meertens: Yes, nice. Sounds pretty interesting. You also mentioned that you can combine this with other queries?

Sam Partee: So we often call this hybrid search. Really hybrid search is weighted search. So I’m going to start saying filtered search for the purposes of this podcast. If you have what I’m going to call a recall set, which is what you get back after you do a vector search, you can have a pre or post filter. This is particular to Redis, but there are tons of other vector databases that support this and you can do a pre or post filter. The pre-filter in a lot of cases is more important. Think about this example. Let’s say I’m using as a conversational memory buffer, and this could be in LangChain, it’s implemented there too, and I only want the conversation with this user. Well, then I would use a tag filter where the tag, it’s basically exact text search, you can think about that or categorical search, where some piece of that metadata in my Hash or JSON document in Redis is going to be a user’s username.

And then I can use that tag to filter all of the records that I have that are specific to that user and then I can do a vector search. So it allows you to almost have, it’s like a schema in a way of, think about it’s like a SQL database. It allows you to define kind of how you’re going to use it, but the benefits here are that if you don’t do something in the beginning, you can then add it later and still alter the schema of the index, adjust and grow your platform, which is a really cool thing. So the hybrid searches are really interesting. In Redis you can do it with text, full text search like BM 25. You can do it with tags, geographic by… You can do polygon search now, which is really interesting. Literally just draw a polygon of coordinates and if they’re within that polygon of coordinates, then that is where you do your vector search.
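
Continuing the illustrative index above, a tag pre-filter combined with a KNN clause, and the range-query form for contrast, might look like this in redis-py; the field and index names are the same assumed ones from the earlier sketch.

import numpy as np
import redis
from redis.commands.search.query import Query

r = redis.Redis(host="localhost", port=6379)
query_vec = np.random.rand(768).astype(np.float32).tobytes()

# Hybrid search: pre-filter on the user tag, then run KNN over only that user's documents.
knn_q = (
    Query("(@user:{alice})=>[KNN 5 @embedding $vec AS score]")
    .sort_by("score")
    .return_fields("user", "score")
    .dialect(2)
)
knn_results = r.ft("docs_idx").search(knn_q, query_params={"vec": query_vec})

# Range query: no fixed result count, only a bound on vector distance.
range_q = (
    Query("@embedding:[VECTOR_RANGE $radius $vec]=>{$YIELD_DISTANCE_AS: score}")
    .sort_by("score")
    .dialect(2)
)
range_results = r.ft("docs_idx").search(
    range_q, query_params={"radius": 0.2, "vec": query_vec}
)
print(knn_results.total, range_results.total)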

Roland Meertens: Pretty good for any mapping application, I assume.

Sam Partee: Or, like say food delivery.

Roland Meertens: Yes.

Sam Partee: I actually think I gave that example in the talk. I gave that example because Ian was in the front. He’s obviously a DoorDash guy. They’re power users of open source and it’s always fun to see how people use it.

Roland Meertens: But so in terms of performance, your embeddings are represented in a certain way to make it fast to search through them?

Sam Partee: Yes.

Roland Meertens: But filters are a completely different game, right?

Sam Partee: Totally.

Roland Meertens: So is there any performance benefits to pre-filtering over post-filtering or the other way around?

Sam Partee: I always hate when I hear this answer, but it depends. If you have a pre-filter that filters it down to a really small set, then yes. But if you have a pre-filter, you can combine them with boolean operators. If you have a pre-filter that’s really complicated and does a lot of operations on each record to see whether it belongs to that set, then you can shoot yourself in the foot trying to achieve that performance benefit. And so it really depends on your query structure and your schema structure. And so, that’s not always obvious. I’ve seen, we’ll just say an e-commerce company, that had about a hundred and something combined filters in their pre-filter. Actually no, it was a post-filter for them because they wanted to do a vector search over all the records and then do a post-filter, but it was like 140 different filters, right?

Roland Meertens: That’s a very dedicated, they want something very specific.

Sam Partee: Well, it made sense for the platform, which I obviously can’t talk about, but we found a much better way to do it and I can talk about that. Which is that ahead of time, you can just combine a lot of those fields. And so you have extra fields in your schema. You’re storing more, your memory consumption goes up, but your runtime complexity, the latency of the system goes down because it’s almost like you’re pre-computing, which is like an age-old computer science technique. So increase space complexity, decrease runtime complexity. And that really helped.

How to represent your documents [16:53]

Roland Meertens: Yes, perfect trade-off. Going back to the other thing you mentioned about documents, I think you mentioned two different ways that you can represent your documents in this embedding space.

Sam Partee: Yes.

Roland Meertens: Can you maybe elaborate on what the two different ways are and when you would choose one over the other?

Sam Partee: Yes, so what I was talking about here was a lot of people… It’s funny, I was talking to a great guy, I ate lunch with him and he was talking about RAG and how people just take LangChain or LlamaIndex or one of these frameworks and use a recursive character text splitter or something and split their documents up, not caring about overlap, not caring about how many tokens they have, and chunk it up. And they use those chunks, raw text basically, for the embeddings, and then they run their RAG system and wonder why it’s bad. And it’s because you have filler text, you have text that isn’t relevant, you possibly have the wrong size, and your embeddings possibly aren’t even relevant. So what I’m suggesting in this talk is a couple of ways, and actually a quick shout out to Jerry Liu for the diagram there. He runs LlamaIndex, great guy.

What I’m suggesting is there’s two approaches I talk about. First, is you take that raw text and ask an LLM to summarize it. This approach allows you to have a whole document summary and then the chunks of that document associated with that summary. So first, you go and do a vector search over the summaries of the documents, which are often semantically more like rich in terms of context, which helps that vector search out. And then you can return all of the document chunks and even then sometimes on the client side do either a database, local vector search on the chunks that you return after that first vector search.

And with Redis, you can also combine those two operations. Triggers and functions are awesome. People should check that out. The 7.2 release is awesome. But then the second approach is also really interesting and it involves cases where you would like the surrounding context to be included, but your user query is often something that is found in maybe one or two sentences and includes things like, maybe names or specific numbers or phrases. To use this finance example we worked on, it’s like, “the name of this mutual bond in this paragraph” or whatever it was.

What we did there was instead we split it sentence by sentence and so that when the user entered a query, it found that particular sentence through vector search, semantic search. But the context, the text that was retrieved, was a larger window around that sentence and so it had more information when you retrieved that context. And so, the first thing that people should know about this approach is that it absolutely blows up the size of your database. It makes it-

Roland Meertens: Even if I’m embedding per sentence?

Sam Partee: Yes. And you spend way more on your vector database because think about it, you’re not only storing more text, you’re storing more vectors. And it works well for those use cases, but you have to make sure that that’s worth it and that’s why I’m advocating for people, and this is why I made it my first slide in that section is, just go try a bunch. I talk about using traditional machine learning techniques. So weird that we call it traditional now, but do like a K-fold. Try five different things and then have an eval set. Try it against an eval set. Just like we would’ve with XGBoost when it was five years ago. It feels like everything has changed. But Yes, that’s what I was talking about.

Roland Meertens: So if you are doing this sentence by sentence due to embeddings and you have the larger context around it, is there still enough uniqueness for every sentence or do these large language models then just kind of make the same vector of everything?

Sam Partee: If you have a situation where the query, or whatever’s being used as the query vector, is a lot of text, is a lot of semantic information, this is not the approach to use. But if it’s something like a one or two liner question, or one or two sentence question, it does work well. What you’re, I think getting at to, is that imagine the sentences that people write, especially in some PDFs that just don’t matter. They don’t need to be there and you’re paying for not only that embedding but the storage space. And so, this approach has drawbacks, but who’s going to go through all, I forget how many PDFs there were in that use case, but like 40,000 PDFs which ended up creating, it was like 180 million embeddings or something.

Roland Meertens: Yes, I can imagine if you use this approach on the entire archive database of scientific papers, then-

Sam Partee: Docsearch.redisventures.com, you can look at a semantic search app that does only abstracts, which is essentially the first approach, right? But it just doesn’t have the second layer, right? It doesn’t have that, mostly because we haven’t hosted that. It would be more expensive to host. But it does it on the summaries. The thing about the paper summary, and it’s actually a great example, thank you for bringing that up, is to think about how much more information is packed into that than random sections of a paper. And so that’s why sometimes using an LLM to essentially create what seems like a paper abstract is actually a really good way of handling this, and cheaper usually.

Hypothetical Document Embeddings (HyDE) [22:19]

Roland Meertens: I think the other thing you mentioned during your talk, which I thought was a really interesting trick is if you are having a question and answer retrieval system, that you let the large language model create a possible answer and then search for that answer in your database. Yes. What do you call this? How does this work again? Maybe you can explain this better than I just did.

Sam Partee: Oh no, actually it’s great. I wish I remembered the author’s name of that paper right now because he or she or whoever it is deserves an award and essentially the HyDE approach, it’s called Hypothetical Document Embedding, so HyDE, HyDE, like Jekyll and Hyde. People use the term hallucinations with LLMs when they make stuff up. So I’m going to use that term here even though I don’t really like it. I mentioned that in the talk. It’s just wrong information, but I’ll get off that high horse.

When you use a hallucinated answer to a question to look up the right answer, or at least I should say the right context, why does this work? Well, you have a question, and that question, let’s say it’s something like in the talk, what did I say? I said, what is Redis? Think about how different that question is than the actual answer, which is like, “an in-memory database, yada, yada, yada.” But a fake answer, even if it’s something like “it’s a tool for doing yada, yada, yada,” is still more similar in both sentence structure and, most often, actual semantics, so it returns a greater amount of relevant information, because the semantic representation of an answer is different from the semantic representation of a query.
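
A minimal HyDE sketch, assuming the OpenAI Python client (v1+) and illustrative model names; the returned vector would then be used as the query embedding in a vector search like the ones sketched earlier:

from openai import OpenAI

client = OpenAI()


def hyde_query_vector(question: str) -> list[float]:
    # 1. Ask the LLM for a plausible (possibly wrong) answer to the question.
    fake_answer = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{
            "role": "user",
            "content": f"Write a short passage that answers: {question}",
        }],
    ).choices[0].message.content

    # 2. Embed the hypothetical answer; it is structurally closer to stored
    #    documents than the question is, so retrieval tends to improve.
    return client.embeddings.create(
        model="text-embedding-3-small",
        input=fake_answer,
    ).data[0].embedding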

Roland Meertens: Kind of like, you dress for a job you want instead of for a job you have.

Sam Partee: That’s pretty funny.

Roland Meertens: You search for the job you want. You search for the data you want, not for the data you have.

Sam Partee: Couldn’t agree more, and that’s also what’s interesting about it. I gave that hotel example. That was me messing around. I just created that app for fun, but I realized how good of an example of a HyDE example it is because it’s showing you that searching for a review with a fake generated review is so much more likely to return reviews that you want to see than saying, this is what I want in a hotel. Because that structurally and semantically is far different from a review than… Some English professors probably crying right now with the way I’m describing the English language, I guess not just English, but you get the point. It’s so much more similar to the actual reviews that you want that the query often doesn’t really represent the context you want.

Roland Meertens: I really liked it as an example also with hotels because on any hotel website, you can’t search for reviews, but you-

Sam Partee: Oh, of course not.

Roland Meertens: Yes, but it kind of makes sense to start searching for the holiday you want or others have instead of searching for the normal things you normally search for like locations, et cetera, et cetera.

Sam Partee: It was funny. I think I said that I started doing it because I actually did get mad that day at this travel website because I just couldn’t find the things I was looking for and I was like, “Why can’t I do this?” And I realize I’m a little bit further ahead in the field, I guess, than some enterprise companies are in thinking about these things because I work on it all the time I guess. But I just imagine the next few years it’s going to completely change user experience of so many things.

I’ve seen so many demos lately and obviously just hanging around SF, you talk to so many people that are creating their own company or something, and I’ve seen so many demos where they’re using me for essentially validation of ideas or something, where my mind’s just blown at how good it is, and I really do think it’s going to completely change user experience going forward.

Applications where vector search would be beneficial [26:10]

Roland Meertens: Do you have more applications where you think it should be used for this? This should exist?

Sam Partee: Interesting. Review data is certainly good. So look, right now we’re really good at text representations, at semantics, and the reason for that is we have a lot of that data. The next frontier is definitely multimodal. OpenAI I think has already started on this in some of their models, but one thing I was thinking about and honestly it was in creating this talk, was why can’t I talk to a slide and change the way it looks? And I can basically do that with stable diffusion. It’s on my newsletter head. The top of my newsletter is this cool scene where I said the prompt is something like the evolution of tech through time because that’s what I’m curious about.

Roland Meertens: But you still can’t interact with… Also with stable diffusion, you can give a prompt, but you can’t say, “Oh, I want this, but that make it a bit brighter or replace it.”

Sam Partee: You can refine it and you can optimize it and make it look a little better, but you’re right. It’s not an interaction. The difference with RAG and a lot of these systems like the chat experience, I’ve seen a chatbot pretty recently made by an enterprise company using Redis that is absolutely fantastic and the reason is because it’s interactive. It’s an experience that is different. And I’d imagine that in a few years you’re literally never going to call an agent on a cell phone again.

You’re actually never going to pick up the phone and call a customer service line because there will be a time and place, and maybe it’s 10 years, two years, I don’t know, I’m not Nostradamus. But it will be to the point where it’s so good, it knows you personally. It knows your information, and it’s not because it’s been trained on it. It’s because it’s injected at runtime and it knows the last thing you ordered. It knows what the previous complaints you’ve had are.

It can solve them for you by looking up company documentation and it can address them internally by saying, “Hey, product team, we should think about doing this.” That is where we’re headed to the point where they’re so helpful and it’s not because they actually know all this stuff. It’s because that combined with really careful prompt engineering and injection of accurate, relevant data makes systems that are seemingly incredibly intelligent. And I say seemingly because I’m not yet completely convinced that it’s anything more than a tool. So anybody that personified, that’s why I don’t like the word hallucinations, but it is just a tool. But this tool happens to be really, really good.

Roland Meertens: The future is bright if it can finally solve the issues that you have whenever you have to call your phone the company.

Sam Partee: God, I hope I never have to call another agent again.

Deploying your solution [28:57]

Roland Meertens: In any case, for the last question, the thing you discussed with another participant here at the QCon conference was, if you want to run these large language models, is there any way to do it or do you have any recommendations for doing this on prem, rather than having to send everything to an external partner?

Sam Partee: That’s a good question. There’s a cool company, I think it’s out of Italy, called Prem, literally, that has a lot of these. So shout out to them, they’re great. But in general, the best way that I’ve seen companies do it is Nvidia Triton is a really great tool. The pipe-lining and being able to feed a Python model’s result to a C++ quantized PyTorch model and whatnot. If you’re really going to go down the route of doing it custom and whatnot, going and talking to Nvidia is never a bad idea. They’re probably going to love that.

But one of the biggest things I’ve seen is that people that are doing it custom, that are actually making their own models, aren’t talking about it a whole lot. And I think that’s because it’s a big source of IP in a lot of these platforms, and that’s why people so commonly have questions about on-prem, and I do think it’s a huge open market, but personally, if you’re training models, you can use things like Determined. Shout out Evan Sparks and HPE, but there’s a lot of ways to train models. There’s really not a lot right now of ways to use those models in the same way that you would use OpenAI’s API. There’s not a lot of ways to say, even Triton has an HPS API, but the way that you form the thing that you send to Triton versus what you do for OpenAI, the barrier to entry of those two things.

Roland Meertens: Yes, GRPC uses DP for this-

Sam Partee: Oh, they’re just so far apart. So the barrier to adoption for the API level tools is so low, and the barrier to adoption for on-prem is unbelievably high. And let alone, you can probably not even get a data center GPU right now. I actually saw a company recently that’s actually doing this on some AMD chips. I love AMD, but CUDA runs the world in AI right now. And if you want to run a model on prem, you got to have a CUDA enabled GPU, and they’re tough to get. So it’s a hard game right now on premise, I got to say.

Roland Meertens: They’re all sold out everywhere. Also on the Google Cloud platform, they’re sold out.

Sam Partee: Really?

Roland Meertens: Even on Hugging Face, it’s sometimes hard to get one.

Sam Partee: Lambda is another good place. I really liked their Cloud UI. Robert Brooks and Co. Over there at Lambda are awesome. So that’s another good one.

Roland Meertens: All right. Thanks for your tips, Sam.

Sam Partee: That was fun.

Roland Meertens: And thank you very much for joining the InfoQ Podcast.

Sam Partee: Of course.

Roland Meertens: Thank you very much for listening to this podcast. I hope you enjoyed the conversation. As I mentioned, we will upload the talk on Sam Partee on InfoQ.com sometime in the future. So keep an eye on that. Thank you again for listening, and thanks again to Sam for being a guest.

