Amazon Redshift Cross-Database Queries and Data Sharing Are Now GA
MMS • Kovid Rathee
Amazon Redshift users can now run cross-database queries and share data across Redshift clusters, as AWS has made both features generally available.
Redshift users often create several databases to separate business concerns, development environments, levels of data maturity, and so on. Many also create separate databases for the different stages of their ETL processes. These databases usually need to be queried together for integration and testing purposes. Until Amazon introduced cross-database querying in preview in late 2020, users could not reference two or more databases in a single query.
Cross-database querying is available only on RA3 instance types. On other instance types, users need to copy one of the two databases to S3 and query it through Redshift Spectrum. Users whose clusters run on non-RA3 instances have an easy option to migrate to RA3 instances to use cross-database querying and other such features. Although the feature is now GA, it still has some limitations; for example, views cannot be created on top of cross-database queries.
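As a rough illustration of what a cross-database query looks like, the sketch below builds one in Python. It assumes Redshift's three-part `database.schema.table` notation for addressing objects in another database on the same cluster. The database, schema, table, and column names (`etl_staging`, `analytics`, `orders`, `order_id`) and both helper functions are hypothetical, and the code only produces the SQL text; it does not connect to a cluster.

```python
def qualified_name(database: str, schema: str, table: str) -> str:
    """Return a three-part name of the form database.schema.table."""
    parts = (database, schema, table)
    for part in parts:
        # Conservative check so that only plain identifiers reach the SQL text.
        if not part.isidentifier():
            raise ValueError(f"unsafe identifier: {part!r}")
    return ".".join(parts)


def cross_database_join(left: str, right: str, key: str) -> str:
    """Build a single SELECT that joins two tables from different databases."""
    if not key.isidentifier():
        raise ValueError(f"unsafe identifier: {key!r}")
    return (
        f"SELECT l.{key}, COUNT(*) AS matches\n"
        f"FROM {left} AS l\n"
        f"JOIN {right} AS r ON l.{key} = r.{key}\n"
        f"GROUP BY l.{key};"
    )


# Hypothetical names: a staging database and an analytics database
# that live on the same RA3 cluster.
staging = qualified_name("etl_staging", "public", "orders")
analytics = qualified_name("analytics", "public", "orders")
query = cross_database_join(staging, analytics, "order_id")
print(query)
```

The resulting statement would be run while connected to either database; the three-part name is what lets one query reach the other database.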
Just as they separate databases, many organizations run separate Redshift clusters based on business factors such as billing, cluster maintenance, security, and compliance. Before Redshift's data sharing feature, users had to copy data from one Redshift cluster to another. Data sharing, which had been in preview since late 2020, enables Redshift users to share data between clusters instantly, without copying or moving it. Users can securely share data with other clusters at several levels of granularity, including schemas, tables, and functions.
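The sharing flow described above can be sketched as a short sequence of SQL statements, generated here in Python. This is a minimal sketch assuming the `CREATE DATASHARE`, `ALTER DATASHARE`, `GRANT USAGE ON DATASHARE`, and `CREATE DATABASE ... FROM DATASHARE` statements behave as described in the Redshift documentation. The share name, table names, and namespace identifiers are placeholders, and both helper functions are hypothetical.

```python
from typing import List


def producer_statements(
    share: str, schema: str, tables: List[str], consumer_namespace: str
) -> List[str]:
    """SQL run on the producer cluster to publish objects through a datashare."""
    statements = [
        f"CREATE DATASHARE {share};",
        # The schema is added before any tables that live inside it.
        f"ALTER DATASHARE {share} ADD SCHEMA {schema};",
    ]
    statements += [
        f"ALTER DATASHARE {share} ADD TABLE {schema}.{table};" for table in tables
    ]
    statements.append(
        f"GRANT USAGE ON DATASHARE {share} TO NAMESPACE '{consumer_namespace}';"
    )
    return statements


def consumer_statement(local_db: str, share: str, producer_namespace: str) -> str:
    """SQL run on the consumer cluster to expose the share as a local database."""
    return (
        f"CREATE DATABASE {local_db} FROM DATASHARE {share} "
        f"OF NAMESPACE '{producer_namespace}';"
    )


# Placeholder values: real namespace identifiers are GUIDs assigned by Redshift.
for statement in producer_statements(
    "sales_share", "public", ["orders", "customers"], "consumer-namespace-guid"
):
    print(statement)
print(consumer_statement("sales_shared_db", "sales_share", "producer-namespace-guid"))
```

Once the consumer database exists, the shared tables can be read from the consumer cluster without copying any data.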
Like cross-database queries, data sharing is available only on RA3 instance types. Both features are enabled by the next generation of Redshift instances, which run on the AWS Nitro System. Redshift's RA3 instances were launched to challenge Snowflake and are built on the same philosophy of decoupling storage from compute. Amazon is keen to migrate existing customers from legacy Redshift instances to RA3 instances so that they can get the most out of the latest developments in Redshift.
Continuing its work on RA3 instance types, Amazon launched RA3.16xlarge and RA3.4xlarge in the GovCloud (US) regions and RA3.xlplus in all AWS regions earlier in 2021. Redshift also doubled the managed storage quota to 128 TB per node.