

Even though database solutions have evolved over time, developers are constantly seeking solutions that are flexible, easily scalable, and provide real-time analytics.
TiDB, a distributed SQL database developed by PingCAP, aims to solve these problems for developers. Its biggest selling point is its blend of horizontal scalability, MySQL compatibility, and real-time analytics capabilities.
Competitors like CockroachDB lack a built-in real-time analytics engine. While MongoDB does support basic analytics capabilities, it may encounter difficulties when handling complex analytical workloads or extremely large datasets.
The concept behind TiDB, as described by Ed Huang, the co-founder and chief technology officer at PingCAP, originated nearly a decade ago from the challenges he personally encountered in leveraging databases.
Back then, he worked at a startup where he managed database clusters that relied heavily on MySQL.
“Our business operations were deeply tied to relational databases due to their complex logic. However, our data was growing rapidly, necessitating sharding (a technique that spreads data across numerous MySQL instances),” Huang said in an exclusive interview with AIM.
Every few months, the database size would double, forcing the team to rebalance and move data constantly.
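To see why doubling the data was so painful, consider a minimal sketch of manual sharding. It assumes a simple hash-modulo placement scheme, which the interview does not describe; the key names and the `shard_for` and `moved_fraction` helpers are hypothetical and only illustrate how many rows must move when the number of instances changes.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index using a stable hash (assumed scheme)."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


def moved_fraction(keys, old_shards: int, new_shards: int) -> float:
    """Return the share of keys whose shard changes after resharding."""
    moved = sum(
        1 for k in keys if shard_for(k, old_shards) != shard_for(k, new_shards)
    )
    return moved / len(keys)


# Hypothetical workload: 10,000 user keys spread over MySQL instances.
keys = [f"user:{i}" for i in range(10_000)]

# Doubling from 4 to 8 instances relocates roughly half of the keys.
doubling = moved_fraction(keys, 4, 8)

# Adding a single instance (4 to 5) relocates most of the keys.
adding_one = moved_fraction(keys, 4, 5)

print(f"4 -> 8 shards: {doubling:.0%} of keys move")
print(f"4 -> 5 shards: {adding_one:.0%} of keys move")
```

Under this assumed scheme, every capacity increase means copying a large share of the data between instances while the application keeps running, which is the operational burden Huang describes.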
TiDB is Inspired by Google
Huang said this was when he came across two Google papers that served as the inspiration for TiDB (where ‘Ti’ stands for Titanium).
“About ten years ago, I came across Google’s papers on Spanner and F1—new SQL databases that offer traditional SQL interfaces but are incredibly scalable under the hood. I realised this was the direction we needed to go—a solution that could handle our scaling needs without sacrificing SQL functionality,” Huang said.
By merging the strengths of distributed NoSQL databases with those of traditional relational databases, Huang aimed to create a new database that application developers would embrace.
“We saw this integration as the future after being inspired by these research papers. This led us to embark on an open-source project to develop a new database from scratch, ensuring compatibility with MySQL. Our extensive experience with MySQL also motivated us to initiate what would become TiDB,” Huang added.
TiDB Architecture
The overall architecture of TiDB is decoupled into two layers: a SQL computing layer and a distributed storage layer. “I’m really proud to say that I wrote the first line of code for TiDB. We built it completely from scratch, forming a brand new community around it,” Huang said.
TiDB’s architecture is designed to manage extensive datasets while accommodating both transactional and analytical workloads seamlessly.
It has a distributed key-value storage system similar to databases like Cassandra or MongoDB, ensuring data is stored across multiple servers for scalability and resilience against failures.
“Another notable aspect of TiDB is its capability to handle both OLTP (Online Transaction Processing) and MySQL-compatible workloads, as well as OLAP (Online Analytical Processing) or analytics workloads concurrently.
“This is made possible by its dual storage engine architecture within the storage layer. One is the key-value-packed TiKV storage engine, optimised for transactional processing. There’s another storage engine known as TiFlash, designed specifically for handling analytics queries efficiently,” Huang added.
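The idea of keeping two storage engines side by side can be sketched with a toy example. This is not how TiKV or TiFlash are implemented; it assumes the common split between a key-addressed row store for transactional lookups and a column-oriented store for aggregates, and the `RowStore` and `ColumnStore` classes and the sample orders are hypothetical.

```python
from collections import defaultdict


class RowStore:
    """Toy key-value store: whole rows addressed by primary key."""

    def __init__(self):
        self._rows = {}

    def put(self, key, row):
        self._rows[key] = dict(row)

    def get(self, key):
        # A point lookup touches exactly one row.
        return self._rows.get(key)


class ColumnStore:
    """Toy columnar store: each column's values are kept together."""

    def __init__(self):
        self._columns = defaultdict(list)

    def append(self, row):
        for name, value in row.items():
            self._columns[name].append(value)

    def sum(self, column):
        # An aggregate scans a single column, not every full row.
        return sum(self._columns[column])


# Hypothetical sample data written to both stores.
orders = [
    {"id": 1, "customer": "a", "amount": 30},
    {"id": 2, "customer": "b", "amount": 45},
    {"id": 3, "customer": "a", "amount": 25},
]

row_store = RowStore()
column_store = ColumnStore()
for order in orders:
    row_store.put(order["id"], order)
    column_store.append(order)

# Transactional-style access: fetch one order by key.
single_order = row_store.get(2)

# Analytical-style access: total revenue across all orders.
total_amount = column_store.sum("amount")

print(single_order, total_amount)
```

The sketch writes every row to both stores by hand. In a real system the second copy has to be kept in sync automatically, and that replication is what lets a single database serve both kinds of queries.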
Databricks Loves TiDB
Over 3,000 customers currently leverage TiDB, hundreds of whom are PingCAP’s paying customers. Some notable users of TiDB include Databricks, Airbnb, LinkedIn, Dailymotion, and Capcom.
Huang said the US remains the biggest market for TiDB, though companies in other regions also use the open-source database.
“Databricks is one of our biggest adopters in the US. Actually, all of Databricks’ metadata is supported by TiDB. Another big customer we have in the US is Pinterest. Currently, we manage hundreds of terabytes of data for Pinterest, assisting them in migrating from HBase to TiDB,” Huang revealed.
TiDB sees higher adoption among customers using legacy NoSQL databases. Most of the customers paying for TiDB services are from the banking, financial services and insurance (BFSI) sector.
“In the past, companies relied on Oracle, MySQL, or other legacy databases. Nowadays, with the shift towards mobile platforms, data volumes have significantly increased, posing challenges for infrastructure, especially in sectors like finance,” Huang said.
These industries often have extensive legacy code built on SQL, making it difficult to transition to NoSQL interfaces seamlessly.
“They still require SQL compatibility for their codebase but now need scalability and robust data consistency at financial-grade levels. Japan’s largest payment company relies on TiDB. We also see great adoption in the e-commerce and gaming industry,” Huang added.
TiDB in India
Flipkart, one of the largest e-commerce companies in India, said in a blog post that it has used TiDB to scale to 1 million queries per second (QPS). The e-commerce giant had been meeting its scaling challenges by vertically scaling its MySQL cluster, but it came to see TiDB as the solution.
“Flipkart has been using TiDB as a hot store in production since early 2021 for moderate throughput levels of 60k reads and 15k writes at DB level QPS. We set out to demonstrate the feasibility of using TiDB as a hot SQL data store for use cases with very high QPS and low latency requirements for the first time,” the company said in the blog post.
“Another large logistics company in India is also our customer, they are managing terabytes of data and using our real-time analytics capability. We also have a few SaaS companies using our cloud service,” Huang said.
India is home to many high-growth cloud-native companies that could benefit from TiDB. Moreover, TiDB’s real-time analytical capabilities could be an attractive prospect for many SaaS companies.
“They prefer not to establish multiple data warehouses or utilise various data sources separately for analytics. Our goal is to offer a unified platform where they can use a single system to gain real-time insights seamlessly. As far as I know, other databases like MongoDB or CockroachDB do not come with a real-time analytics engine,” he concluded.