Mobile Monitoring Solutions


How SuperDuperDB provides easy access to AI apps – Business News

MMS Founder

Posted on mongodb google news. Visit mongodb google news


Many observers have predicted that 2024 will be the year when enterprises turn generative AI like OpenAI’s GPT-4 into real corporate applications. Most likely, such applications will start with the simplest type of infrastructure, tying together large language models like GPT-4 with some basic data management.

Enterprise apps will start with simple tasks such as using natural-language queries to search through text or images for matches.

Also: Pinecone CEO seeks to give AI something like intelligence

An ideal candidate for doing this is a Python library called SuperDuperDB, created by the venture capital-backed company of the same name, founded this year.

SuperDuperDB is not a database but an interface that sits between a database such as MongoDB or Snowflake and a large language model or other GenAI program.

That interface layer makes it easy to perform many very basic operations on corporate data. By using natural language queries in a chat prompt, one can query existing corporate data sets – such as documents – more comprehensively than a simple keyword search. Someone could, say, upload images of products to an image database and then query that database by showing an image and looking for matches.

Similarly, moments in a video collection can be retrieved by typing themes or features. Recordings of voice messages can be searched as text transcripts, which amounts to a basic voicemail assistant.

The technology also has uses for data scientists and machine learning engineers who want to refine AI programs using proprietary corporate data.

Also: Microsoft’s GitHub Copilot pursues AI’s full ‘time to value’ in programming

For example, to “fine-tune” an AI program such as an image recognition model, one must connect an existing database of images to a machine learning program. The challenge is how to get the image data in and out of the machine learning program, and how to define the variables of the training process, such as minimizing the loss. SuperDuperDB provides simple function calls that handle each of those steps.
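To make “minimizing the loss” concrete, here is a minimal sketch of a training loop in plain Python. It fits a toy linear model rather than an image recognition network, and it does not use SuperDuperDB; the function names `train` and `mean_squared_error` are illustrative assumptions.

```python
def mean_squared_error(w, b, xs, ys):
    """Average squared gap between predictions w*x + b and targets y."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def train(xs, ys, learning_rate=0.05, epochs=1000):
    """Fit y ~ w*x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradients of the loss with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        # Step against the gradient to reduce the loss.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Toy data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

initial_loss = mean_squared_error(0.0, 0.0, xs, ys)
w, b = train(xs, ys)
final_loss = mean_squared_error(w, b, xs, ys)
```

Fine-tuning a real model follows the same pattern at a much larger scale: the data comes from the database, and the loss, learning rate, and number of passes are the training variables the programmer has to specify.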

A key aspect of many of those functions is converting different data types – text, image, video, audio – into vectors, strings of numbers that can be compared to each other. Doing so allows SuperDuperDB to perform a “similarity search” where, for example, a vector of text phrases is compared to a database full of voicemail transcripts to retrieve the message that most closely matches the query.
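The idea of a similarity search can be sketched in a few lines of plain Python. This is an illustration of the concept only, not SuperDuperDB's API: the `embed` function below is a toy bag-of-words stand-in for a real embedding model, and all names are assumptions made for the example.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: a bag-of-words count vector (stand-in for a real model)."""
    return Counter(text.lower().split())


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(query: str, documents: list[str]) -> str:
    """Return the document whose vector is closest to the query vector."""
    query_vec = embed(query)
    return max(documents, key=lambda doc: cosine_similarity(query_vec, embed(doc)))


transcripts = [
    "hi this is the dentist confirming your appointment on tuesday",
    "your package could not be delivered please call us back",
    "happy birthday call me when you get a chance",
]
best = most_similar("dentist appointment", transcripts)
```

A production system would replace the word counts with vectors produced by a neural net, so that messages with similar meaning but different wording still land close together.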

Keep in mind that SuperDuperDB is not a vector database like Pinecone, a commercial program. Instead, it relies on a simpler way of organizing vectors called a “vector index”.


The SuperDuperDB program, which is open-source, is installed like a typical Python package from the command line or loaded as a pre-built Docker container.

The first step in working with SuperDuperDB can be either setting up a data store from scratch, or working with an external data store. In either case, you’ll want to have a data repository such as MongoDB or a SQL-based database.

SuperDuperDB handles all data, including newly created data and data obtained from the database, using what is called an “encoder”, which lets the programmer define data types. These encoded types – text, audio, image, video, etc. – can be stored as “documents” in MongoDB or as table schemas in a SQL-based database. It is also possible to store very large data items, such as video files, in local storage when they exceed the capacity of a MongoDB or SQL database.
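As a rough illustration of what an encoder does, the sketch below pairs a type name with functions that turn a value into bytes and back, then wraps the result in a document-style dictionary. The `Encoder` class, `to_document`, and `from_document` are hypothetical names invented for this example and should not be read as SuperDuperDB's actual interface.

```python
import json
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class Encoder:
    """Hypothetical encoder: names a data type and how to (de)serialize it."""
    name: str
    encode: Callable[[Any], bytes]
    decode: Callable[[bytes], Any]


# A toy "image" type: a grid of grayscale pixel values stored as JSON bytes.
image_encoder = Encoder(
    name="toy_image",
    encode=lambda pixels: json.dumps(pixels).encode("utf-8"),
    decode=lambda raw: json.loads(raw.decode("utf-8")),
)


def to_document(encoder: Encoder, value: Any) -> dict:
    """Wrap an encoded value in a document a data store could hold."""
    return {"type": encoder.name, "payload": encoder.encode(value)}


def from_document(encoder: Encoder, document: dict) -> Any:
    """Recover the original value, checking that the type tag matches."""
    if document["type"] != encoder.name:
        raise ValueError(f"expected type {encoder.name!r}, got {document['type']!r}")
    return encoder.decode(document["payload"])


pixels = [[0, 128], [255, 64]]
doc = to_document(image_encoder, pixels)
restored = from_document(image_encoder, doc)
```

The type tag is what lets the same store hold text, audio, and image records side by side while still knowing how to decode each one.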

Also: Bill Gates predicts AI will soon lead to ‘massive technology boom’

Once the data set is selected or created, neural net models can be imported from libraries such as scikit-learn, or one can use a very basic built-in selection of neural nets such as the Transformer, the architecture behind large language models. One can also call APIs from commercial services such as OpenAI and Anthropic. The main work of making predictions with the model is done with a simple call to the “.predict” function built into SuperDuperDB.

When working with large language models or image models like Stable Diffusion or DALL-E, the neural net will try to get answers from the database by doing vector similarity searches. It’s as simple as calling the “.like” function and passing it a query string.

With SuperDuperDB it is possible to build more complex apps by assembling multiple stages of functionality, such as using similarity search to retrieve items from the database and then sending those items to a classifier neural net.
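A two-stage flow of this kind, retrieval followed by classification, might look like the following sketch. It is written in plain Python with toy stand-ins: the bag-of-words `embed` function replaces a real embedding model, the keyword rule in `classify` replaces a classifier neural net, and none of the names come from SuperDuperDB.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: word counts stand in for a learned vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query: str, items: list[str], k: int = 2) -> list[str]:
    """Stage 1: return the k items most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(items, key=lambda item: cosine(query_vec, embed(item)), reverse=True)
    return ranked[:k]


def classify(text: str) -> str:
    """Stage 2: toy classifier, a keyword rule in place of a neural net."""
    urgent_words = {"urgent", "immediately", "asap"}
    return "urgent" if urgent_words & set(text.lower().split()) else "routine"


messages = [
    "billing question about my invoice please call back",
    "urgent billing error fix my invoice immediately",
    "lunch on friday sounds great",
]

retrieved = retrieve("invoice billing", messages, k=2)
results = [(message, classify(message)) for message in retrieved]
```

Keeping the stages as separate functions means either one can be swapped out, for instance replacing the keyword rule with a trained model, without touching the other.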

The company has added functions that make such apps suitable for production. They include a service called a listener that re-runs predictions when the underlying database is updated. Different tasks in SuperDuperDB can also be run as separate daemons to improve performance.
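The listener idea can be sketched as a store that runs a model whenever a record is inserted, so predictions never lag behind the data. The `ListeningStore` class below is a hypothetical in-memory toy, assumed for illustration; a real deployment would watch an external database rather than a Python list.

```python
from typing import Callable


class ListeningStore:
    """Toy in-memory store that runs a model on every inserted record."""

    def __init__(self, model: Callable[[str], int]) -> None:
        self.model = model
        self.records: list[str] = []
        self.predictions: dict[int, int] = {}

    def insert(self, record: str) -> int:
        """Add a record and notify the listener so its prediction is computed."""
        self.records.append(record)
        record_id = len(self.records) - 1
        self._on_insert(record_id)
        return record_id

    def _on_insert(self, record_id: int) -> None:
        """Listener hook: compute and store the prediction for the new record."""
        self.predictions[record_id] = self.model(self.records[record_id])


def word_count(text: str) -> int:
    """Stand-in 'model': counts the words in a record."""
    return len(text.split())


store = ListeningStore(word_count)
store.insert("call me back")
store.insert("your order has shipped today")
```

After the two inserts, `store.predictions` holds one entry per record without any separate batch job having been run.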

Also: How LangChain turns GenAI into a really useful assistant

Programs like SuperDuperDB will see significant development this year, making them even more robust for production purposes. You can expect SuperDuperDB to evolve alongside other important emerging infrastructure such as the LangChain framework and commercial tools such as the Pinecone vector database.

Although there is a lot of ambitious talk about enterprise use of GenAI, it probably starts here, with a variety of humble tools that can be picked up by individual programmers.

If you want to get a quick look at SuperDuperDB, visit the demo on the company’s Web site.


