Google Storage Transfer Service Now Supports Serverless Real-time Replication Capability
MMS • Steef-Jan Wiggers
Article originally posted on InfoQ.
Google recently announced preview support for an event-driven transfer capability in its Storage Transfer Service (STS), which allows users to move data from AWS S3 to Cloud Storage and to copy data between Cloud Storage buckets.
STS is a Google Cloud service that allows users to quickly and securely transfer data between object and file storage across Google Cloud, Amazon, Azure, on-premises, and other storage solutions. The service now includes a preview capability that automatically transfers data added or updated in the source location. The transfer is event-driven: the service listens for event notifications from the source and starts a data transfer in response. Currently, event-driven transfers are supported from AWS S3 or Cloud Storage to Cloud Storage.
In a Google Cloud blog post, authors Ajitesh Abhishek, Product Manager, and Anup Talwalkar, Software Engineer, both working at Google Cloud, explain:
For performing the event-driven transfer, STS relies on Pub/Sub and SQS. Customers must set up the event notification and grant STS access to this queue. Using a new field – “Event Stream” – in the Transfer Job, customers can specify the event stream name and control when STS starts and stops listening for events from this stream.
STS begins consuming the object change alerts from the source as soon as the Transfer Job is created. Any upload or change to an object now results in a change notification, which the service uses to transfer the object to the destination in real-time.
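As a rough illustration of the "Event Stream" field described above, the sketch below builds a request body for the `transferJobs.create` method of the Storage Transfer Service REST API, for a bucket-to-bucket job driven by a Pub/Sub subscription. The helper function is hypothetical, and the project, bucket, and subscription names are placeholders. The field names follow the public REST reference as understood at the time of writing and should be checked against the current documentation. Authentication and sending the request are left out.

```python
import json
from typing import Optional


def build_event_driven_job(
    project_id: str,
    source_bucket: str,
    sink_bucket: str,
    event_stream_name: str,
    start_time: Optional[str] = None,
    expiration_time: Optional[str] = None,
) -> dict:
    """Build a transferJobs.create body for a Cloud Storage to Cloud Storage
    event-driven transfer (hypothetical helper, not a Google library call)."""
    # For a Cloud Storage source, the stream name is a Pub/Sub subscription;
    # for an AWS S3 source it would be the ARN of the SQS queue instead.
    event_stream = {"name": event_stream_name}
    # Optional RFC 3339 timestamps that bound when STS listens for events.
    if start_time:
        event_stream["eventStreamStartTime"] = start_time
    if expiration_time:
        event_stream["eventStreamExpirationTime"] = expiration_time

    return {
        "projectId": project_id,
        "status": "ENABLED",
        "transferSpec": {
            "gcsDataSource": {"bucketName": source_bucket},
            "gcsDataSink": {"bucketName": sink_bucket},
        },
        "eventStream": event_stream,
    }


# Placeholder values, assumed for illustration only.
job = build_event_driven_job(
    project_id="my-project",
    source_bucket="my-source-bucket",
    sink_bucket="my-destination-bucket",
    event_stream_name="projects/my-project/subscriptions/my-sts-subscription",
    expiration_time="2030-01-01T00:00:00Z",
)
print(json.dumps(job, indent=2))
```

Leaving out the start time in this sketch reflects the behavior quoted above, where STS begins consuming notifications as soon as the job is created; the expiration time is one way to control when it stops listening.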
The new STS capability provides several benefits, which Aman Puri, a consultant at Google Cloud, explains in a Medium blog post:
Because event-driven transfers listen for changes to the source bucket, updates are copied to the destination in near-real time. As a result, Storage Transfer Service doesn’t need to execute a list operation against the source, saving time and money.
Use cases include:
• Event-driven analytics: Replicate data from AWS to Cloud Storage to perform analytics and processing.
• Cloud Storage replication: Enable automatic, asynchronous object replication between Cloud Storage buckets.
• DR/HA setup: Replicate objects from the source to a backup destination on the order of minutes.
• Live migration: Event-driven transfer can power low-downtime migration, on the order of minutes of downtime, as a follow-up step to one-time batch migration.
Microsoft provides a similar capability in Azure with Event Grid, which allows event-driven data transfer from a storage container to various destinations. By creating a system topic on the storage account and subscribing an Azure Function to its BlobCreated events, data from the storage container can be copied to a destination such as another storage container, an AWS S3 bucket, or a Google Cloud Storage bucket. Alternatively, an event could trigger a Data Factory pipeline.
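To make the Azure approach more concrete, here is a minimal sketch of the filtering step such a function might perform. It assumes the event arrives in the Event Grid event schema, where the type is `Microsoft.Storage.BlobCreated` and the `subject` has the form `/blobServices/default/containers/<container>/blobs/<blob>`. The function name and the sample event values are hypothetical, and the actual copy to the destination is omitted.

```python
from typing import Optional, Tuple

BLOB_CREATED = "Microsoft.Storage.BlobCreated"


def blob_to_copy(event: dict) -> Optional[Tuple[str, str]]:
    """Return (container, blob_name) for a BlobCreated event, else None."""
    if event.get("eventType") != BLOB_CREATED:
        return None  # ignore deletions and other event types
    # Split the subject into the container and blob parts.
    _, _, rest = event.get("subject", "").partition("/containers/")
    container, sep, blob = rest.partition("/blobs/")
    if not sep or not container or not blob:
        return None  # subject did not have the expected shape
    return container, blob


# Sample event with made-up values, trimmed to the fields used here.
sample_event = {
    "eventType": "Microsoft.Storage.BlobCreated",
    "subject": "/blobServices/default/containers/uploads/blobs/reports/2023.csv",
    "data": {"url": "https://example.blob.core.windows.net/uploads/reports/2023.csv"},
}
print(blob_to_copy(sample_event))
```

A real function would pass the returned container and blob name to whatever client copies the object to the chosen destination.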
Currently, the event-driven capability is available in various Google Cloud regions, and pricing details of STS are available on the pricing page.