eBay Replatforming to Kubernetes, Envoy and Kafka: Intending to Open Source Hardware and Software
MMS • RSS
Article originally posted on InfoQ.
eBay have discussed how they are conducting a replatforming initiative across their entire technology stack, which includes building and releasing as open source both the new hardware and software created. Open source is “fueling the transformation” of eBay’s infrastructure, and they intend to use cloud native technologies like Kubernetes, Envoy, MongoDB, Docker and Apache Kafka.
As part of a three-year effort to replatform and modernise their backend infrastructure, eBay have recently announced that they are building their own custom-designed servers “built by eBay, for eBay”. The plan also includes making eBay’s servers available to the public via open source in the fourth quarter of this year. Although many large-scale technical organisations and cloud vendors custom-build their own hardware, including Google, AWS and Azure, they do not typically release this as open source. eBay have stated that they “are using servers and hardware that we designed, reducing our dependence on third parties”.
The eBay engineering team are making changes across the entire technology stack, including both physical and logical layers, because they believe that all of the layers are intertwined in some way: “the stack is like connective tissue, you cannot isolate one of the layers; you must advance them together.” Each layer of the technology stack was examined for efficiency, capability and the opportunity to improve existing solutions. As recently reported by SDxCentral, this presumably ties in with eBay’s recent decision to move away from their existing OpenStack-based system, towards the more modern Docker and Kubernetes stacks. It should be noted, however, that it is possible to run Kubernetes on OpenStack, and the eBay engineering team discussed this option at last year’s OpenStack Summit.
For the physical foundation, eBay are using a Point of Presence (PoP) strategy, and are decentralising their cluster of US-based data centers towards an “edge computing approach”. This will enable them to “create a faster, more consistent user experience, saving 600-800 milliseconds of load time”. Much like the approach Chick-Fil-A discussed at the recent QCon New York, the eBay team are deploying online services and data at the edge of their networks, closer to users, which enables dynamic and static caching capabilities, decreased latency, and an improved user experience.
In the data layer, eBay have created more “customized models”. Using open source technologies, the team have built “NuData”, a fault tolerant, geo-distributed object and data store. This is not yet available as open source, and readers searching the web for more information should not confuse it with Mastercard’s “NuData Security” product. Long term, this will allow eBay to distribute data geographically to improve their customers’ experience, offer increased resiliency for services, and provide “data isolation solutions for countries that require them” (most likely in response to the recent General Data Protection Regulation and California Consumer Privacy Act initiatives).
eBay processes 300 billion data queries each day, and their data footprint is more than 500 petabytes, the equivalent of “one trillion songs, 2.5 million hours of movies and enough to backup the American Library of Congress more than 300 times”. Accordingly, they have used open source to build an in-house “AI engine” that can be shared across all of their teams, with the goal of “increasing productivity, collaboration and training”. Their AI engine has already accelerated the production of new features, such as computer vision, Image Search and sharing on social media platforms.
The data science teams within eBay have previously talked about their use of Apache Kafka and Apache Storm within their Rheos platform. This platform provides life cycle management, monitoring, well-architected standards, and an ecosystem for real-time streaming data pipelines. The Berlin engineering team have also discussed their use of Kafka Streams and Elasticsearch to implement real-time user profiling. In 2017 the eBay team also presented at MongoDB World, and discussed “Building Mission-Critical Multi-Data Center Applications with MongoDB”.
eBay are keen to share innovations and technology experiences with the broader engineering community through open source. They believe that developers and communities who leverage their tools will improve upon what they are building, and ultimately help to create better experiences overall. The eBay GitHub account, although crowded with open source projects, does contain several gems. These include the Neutrino “cloud native” software load balancer (which presumably will now be replaced with the use of the Envoy Proxy); the NMessenger lightweight messenger component built on AsyncDisplayKit and written in Swift; the bayesian-belief-networks Python library; and the Fabio Consul-based load balancing router.
The eBay news post announcing the transformation and open source efforts concludes by stating that the key to successfully replatforming their infrastructure over an ambitious three-year timeline is their people: “building the right culture and creating the best atmosphere requires deliberate and delicate work. With the right culture in place, the technology and innovation will follow.”