PlanetScale Vectors Now GA: MySQL’s Missing Feature?

MMS Renato Losio

Article originally posted on InfoQ. Visit InfoQ

PlanetScale has recently announced that vector support is now generally available. Built on the company’s fork of MySQL, the feature allows vector data to be stored alongside an application’s relational MySQL data, removing the need for a separate specialized vector database.

While PostgreSQL has been the default open-source choice for vector search, the company behind the Vitess database announced in 2023 its intention to fork MySQL and add vector search capabilities. Following a public beta in late 2024, vector search is now generally available with improved performance. Patrick Reynolds, software engineer at PlanetScale, writes:

Since the open beta began, we have doubled query performance, improved memory efficiency eight times, and focused on robustness to make sure vector support is as solid as every other data type MySQL supports.

The new vector capabilities enable direct support for recommendation systems, semantic search, and the now-popular RAG workloads on a MySQL-compatible engine. Reynolds adds:

We also built advanced vector-index features to satisfy a variety of embeddings and use cases. An index can rank vectors by Euclidian (L2), inner product, or cosine distance. It can store any vector up to 16,383 dimensions. It supports both fixed and product quantization.
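As an illustration of the three ranking options Reynolds lists, the following Python sketch computes each metric by hand and shows that they can order the same vectors differently. It is not PlanetScale code: the function names, the `rank` helper, and the sample vectors are all assumptions made for this example.

```python
import math


def l2_distance(a, b):
    """Euclidean (L2) distance: smaller means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def inner_product(a, b):
    """Inner (dot) product: larger means more similar."""
    return sum(x * y for x, y in zip(a, b))


def cosine_distance(a, b):
    """1 - cosine similarity: compares direction only, ignoring magnitude."""
    norm_a = math.sqrt(inner_product(a, a))
    norm_b = math.sqrt(inner_product(b, b))
    return 1.0 - inner_product(a, b) / (norm_a * norm_b)


def rank(query, items, metric):
    """Return the names in `items` ordered best match first (hypothetical helper)."""
    if metric == "l2":
        key = lambda name: l2_distance(query, items[name])
    elif metric == "inner_product":
        # Negate so that the largest inner product sorts first.
        key = lambda name: -inner_product(query, items[name])
    elif metric == "cosine":
        key = lambda name: cosine_distance(query, items[name])
    else:
        raise ValueError(f"unknown metric: {metric}")
    return sorted(items, key=key)


# Made-up sample data: "a" is close to the query, "b" is far away but long.
query = [1.0, 0.0]
items = {"a": [0.9, 0.1], "b": [10.0, 10.0]}

for metric in ("l2", "inner_product", "cosine"):
    print(metric, rank(query, items, metric))
```

With this data, L2 and cosine distance prefer the nearby vector, while inner product prefers the longer one, which is why the choice of metric usually follows from how the embedding model was trained.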

According to PlanetScale, a key differentiator of its vector support is the ability to use indexes larger than RAM. The implementation is based on two papers from Microsoft Research: SPANN (Space-Partitioned Approximate Nearest Neighbors) and SPFresh. SPANN is a hybrid graph/tree algorithm that enables scaling to larger-than-RAM indexes, while SPFresh defines a set of background operations that maintain the index’s performance and recall.
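The idea behind larger-than-RAM indexes can be sketched in a few lines of Python. This toy is heavily simplified and is neither the SPANN algorithm nor PlanetScale’s implementation: it only shows the general pattern of keeping a small set of centroids in memory and reading just the vector lists nearest to the query. The class name, the `nprobe` parameter, and the dictionary standing in for on-disk storage are all assumptions for illustration.

```python
import math


def l2(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class PartitionedIndex:
    """Toy partitioned index: centroids in memory, vector lists 'on disk'."""

    def __init__(self, centroids):
        self.centroids = list(centroids)  # small, always kept in memory
        # A dict stands in for per-partition lists stored on disk.
        self.posting_lists = {i: [] for i in range(len(self.centroids))}
        self.lists_read = 0  # counts simulated disk reads

    def _nearest_centroids(self, vector, count):
        order = sorted(
            range(len(self.centroids)),
            key=lambda i: l2(vector, self.centroids[i]),
        )
        return order[:count]

    def insert(self, item_id, vector):
        # Each vector is stored in the list of its closest centroid.
        (partition,) = self._nearest_centroids(vector, 1)
        self.posting_lists[partition].append((item_id, vector))

    def search(self, query, k=1, nprobe=1):
        # Read only the nprobe closest lists instead of scanning everything.
        candidates = []
        for partition in self._nearest_centroids(query, nprobe):
            self.lists_read += 1
            candidates.extend(self.posting_lists[partition])
        candidates.sort(key=lambda item: l2(query, item[1]))
        return [item_id for item_id, _ in candidates[:k]]


index = PartitionedIndex(centroids=[(0, 0), (10, 10), (20, 0)])
for item_id, vector in [("a", (1, 1)), ("b", (0, 2)), ("c", (9, 11)), ("d", (19, 1))]:
    index.insert(item_id, vector)

print(index.search((0.5, 0.5), k=2, nprobe=1))
print("lists read:", index.lists_read)
```

Reading more lists per query (a higher `nprobe` in this sketch) trades speed for recall. Keeping such lists well balanced as data changes is the kind of maintenance work that, per the article, SPFresh’s background operations address.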

While PlanetScale has designed the SPANN and SPFresh operations to be transactional and integrated them into MySQL’s default storage engine, there is little hope that Oracle will merge the change into the MySQL Community Edition. In a Hacker News thread during the beta period, Vicent Martí explained:

It’s already open source, because GPL requires it. It’s unlikely to be accepted as an upstream contribution given that Oracle has their own Vector type that is only available in their MySQL cloud service.

Writes and queries against vector data work as they do in any other RDBMS: an index is built with an ALTER or CREATE VECTOR INDEX statement, and data is read with SELECT statements that can include JOIN and WHERE clauses. Martí added:

The tight integration fundamentally means that inserting, updating and deleting vector data from MySQL is always reflected immediately in the index as part of committing your transaction. But it also means that the indexes are fully covered by the MySQL binlog; they recover from hard crashes just fine. They’re also managed by MySQL’s buffer pool, so they scale to terabytes of data, just like any other table. And also crucially, they’re fully integrated with the query planner, so they can be used in any query, including JOINs and WHERE clauses.

PlanetScale is built on top of Vitess, an open source database clustering system designed for the horizontal scaling of MySQL. A list of compatibility limitations is available online.
