MMS • Raul Salas
ClustrixDB (www.clustrix.com) created an interesting graphic depicting the future of relational and NoSQL database platforms. It is a great topic to expand on.
To talk about the future, we first need to talk about the past and present. In the past, there were traditional relational database vendors such as Microsoft and Oracle. Their databases were the workhorses of corporate America, supporting banking, finance, and other corporate activities. Then came the rise of the internet, mobile, cloud, and social media in the early 2000s. This trend produced rapid growth in unstructured data residing outside the corporate firewall. New database technologies such as NoSQL rose to meet the demand of high-volume, high-velocity data growth. NoSQL platforms such as MongoDB, Cassandra, and Hadoop focus on unstructured data processing, while relational database platforms focus on traditional structured transactions and single-server hosted environments.
It is important to note that in the past five years, licensing costs for both Oracle and Microsoft database products have increased dramatically, while open source NoSQL products can be used at a much lower price point, either on a subscription basis or as free, unsupported community editions. NoSQL products such as Hadoop have industry-low storage costs that show a strongly positive return on investment, especially when housing large amounts of data, for example in data lakes. Another current trend is that developers, not database administrators, are driving database adoption. This is an important shift that will shape marketing and business strategies, now and in the future, for database vendors as well as their customers.
Today, NoSQL scale-out databases are becoming popular as machine learning and artificial intelligence become mainstream. The data requirements of these technologies call for scaling out across commodity hardware that can handle real-time analytics for, say, self-driving cars or a fully automated manufacturing plant. This is where products like Google’s Spanner and Clustrix come into play as pioneers in this space. This technology is a high-cost solution, but it is an answer to the resource constraints of existing technology.
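To make "scale out across commodity hardware" a little more concrete, here is a minimal sketch of hash partitioning, one common way to spread keys across several nodes. It is an illustration only and not a description of how Spanner or Clustrix distribute data; the function names `shard_for` and `partition` and the key format are assumptions made up for this example.

```python
import hashlib


def shard_for(key: str, num_nodes: int) -> int:
    """Map a key to a node index using a stable hash (hypothetical helper)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an integer, then wrap to the node count.
    return int.from_bytes(digest[:8], "big") % num_nodes


def partition(keys, num_nodes):
    """Group keys into one list per node (hypothetical helper)."""
    shards = [[] for _ in range(num_nodes)]
    for key in keys:
        shards[shard_for(key, num_nodes)].append(key)
    return shards


if __name__ == "__main__":
    example_keys = [f"user-{i}" for i in range(10)]
    for node, assigned in enumerate(partition(example_keys, 3)):
        print(node, assigned)
```

Simple modulo hashing like this reshuffles most keys whenever a node is added or removed, so production systems typically rely on more elaborate partitioning, along with replication and rebalancing.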
In the future, Hadoop will remain a player in the batch-processing data warehouse/data lake space, while traditional relational database players such as Microsoft and Oracle create new analytics products that are distributed in nature. The database industry could shake out into three distinct areas: NoSQL, distributed SQL, and Hadoop. Single-node traditional SQL will either morph into or be replaced by scale-out SQL, and existing data warehouse analytics features will become real-time.
A good example of this trend is the Cloudera Hadoop/Spark integration in the Lambda Architecture, which combines real-time processing using Spark and machine learning libraries such as Mahout with batch processing in the Hadoop ecosystem. Businesses implementing the Lambda Architecture are already seeing a significant competitive edge from predictive analytics and personalization.
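The core idea of the Lambda Architecture described above can be sketched in a few lines: a batch layer periodically recomputes a view from the full event log, a speed layer keeps incremental results for events that arrived since the last batch run, and queries merge the two. The toy below uses plain Python in place of Hadoop and Spark; the class name `LambdaCounter` and its methods are hypothetical and exist only for illustration.

```python
from collections import Counter


class LambdaCounter:
    """Toy Lambda Architecture that counts events per key (hypothetical example)."""

    def __init__(self):
        self.master_log = []         # append-only record of every event
        self.batch_view = Counter()  # recomputed from the full log by the batch layer
        self.speed_view = Counter()  # incremental counts since the last batch run

    def ingest(self, key):
        """Record an event and update the real-time (speed) view."""
        self.master_log.append(key)
        self.speed_view[key] += 1

    def run_batch(self):
        """Recompute the batch view from the full log, then reset the speed view."""
        self.batch_view = Counter(self.master_log)
        self.speed_view.clear()

    def query(self, key):
        """Serve a result by merging the batch and speed views."""
        return self.batch_view[key] + self.speed_view[key]


if __name__ == "__main__":
    counter = LambdaCounter()
    for page in ["home", "cart", "home"]:
        counter.ingest(page)
    counter.run_batch()      # batch layer catches up with the log
    counter.ingest("home")   # a new event, seen only by the speed layer so far
    print(counter.query("home"))
```

In a real deployment the batch recomputation would run as a distributed job over data stored in the Hadoop ecosystem and the speed layer would be a streaming job, but the merge-at-query-time pattern is the same.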
Raul Salas firstname.lastname@example.org