SAN FRANCISCO, April 14 (Xinhua) - Amazon Web Services, Inc. (AWS) announced on Wednesday the general availability of Advanced Query Accelerator (AQUA) for Amazon Redshift.

AQUA brings compute to the storage layer, helping customers avoid networking bandwidth limitations by eliminating unnecessary data movement between where data is stored and compute clusters, the company said. Since its launch in 2012, Amazon Redshift has become one of the most popular cloud data warehouses, it said.

AQUA for Amazon Redshift is a distributed and hardware-accelerated cache for Amazon Redshift, an innovation that improves performance for analytics at the new scale of data.

"Existing data warehouse architectures with centralized storage require that data be moved to compute clusters for processing, which creates a bottleneck and slows down performance," said Rahul Pathak, vice president of analytics at AWS. "By bringing compute to the storage layer, AQUA helps customers eliminate unnecessary data movement to avoid these networking bandwidth limitations, delivering up to an order-of-magnitude query performance improvement."

Amazon Web Services gave customers 28 more reasons to store and process data on its public cloud platform yesterday during its annual re:Invent conference in Las Vegas, Nevada, bringing the total number of services to 180. Among the new stuff unveiled by CEO Andy Jassy are support for exporting data from Redshift into S3 using the Parquet data format, a speedy new AQUA mode for Redshift due in 2020, as well as a cheaper Elasticsearch storage option for petabyte-scale data.

Redshift is a popular cloud-based data warehouse that’s based on the column-oriented analytics database originally developed by ParAccel. Customers typically load Redshift by moving data from their Simple Storage Service (S3) buckets into the data warehouse, which powers traditional business intelligence and analytics workloads that rely on SQL queries.

With the new Data Lake Export function, AWS is allowing customers to unload data from their Redshift cluster and push it back to S3. Data that’s exported from Redshift to S3 is stored in the Apache Parquet format, which is a columnar storage format that’s optimized for analytics.

Storing the data in Parquet brings several advantages, according to AWS. First, unloading the data with Parquet can be up to 2x faster compared to text formats. It also consumes up to 6x fewer storage resources in S3, the company says.

The openness of Parquet is another key advantage. “This enables you to save data transformation and enrichment you have done in Redshift into your S3 data lake in an open format,” writes AWS’s Danilo Poccia in a blog post yesterday. “You can then analyze the data in your data lake with Redshift Spectrum, a feature of Redshift that allows you to query data directly from files on S3. Or you can use different tools such as Amazon Athena, Amazon EMR, or Amazon SageMaker.”

Redshift is getting federated query capabilities (image courtesy AWS)

Once the data is stored in S3, customers can benefit from AWS’s second Redshift announcement: Federated Query. AWS is now enabling customers to push queries from their Redshift cluster down into the S3 data lake, where they are executed. In addition, the Redshift queries can be pushed down to execute in Amazon Relational Database Service (RDS) for PostgreSQL and Amazon Aurora PostgreSQL databases.

Querying the data where it sits, rather than moving it into Redshift using extract, transform, and load (ETL) functions, will save customers time, grief, and money, according to Poccia. “…ou can access data as soon as it is available,” he writes. “Straight from Redshift, you can now perform queries processing data in your data warehouse, transactional databases, and data lake, without requiring ETL jobs to transfer data to the data warehouse.”

Redshift’s federation story gets even better in mid-2020, when AWS officially rolls out AQUA, an advanced query accelerator for Redshift that Jassy unveiled onstage yesterday.

According to Jassy, AQUA flips the equation of moving the data from storage to compute, and instead moves the compute to the storage. “You can actually do the compute on the raw data without having to move it,” he says.

AQUA is based on the company’s new virtualization platform for EC2, called the Nitro System, that the company has been working on for the last couple of years.
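
For readers curious what the Data Lake Export function mentioned in this article looks like in practice, the following is a minimal sketch in Python that only assembles the text of a Redshift UNLOAD statement using the Parquet format. Nothing is sent to a cluster. The helper name `build_unload_statement`, the table, the bucket, and the IAM role ARN are all hypothetical placeholders, not details taken from the article or from AWS.

```python
def build_unload_statement(query, s3_prefix, iam_role_arn, partition_by=None):
    """Return the text of a Redshift UNLOAD statement that writes Parquet to S3.

    Hypothetical helper for illustration only: it builds a string and never
    connects to Redshift or S3.
    """
    if not s3_prefix.startswith("s3://"):
        raise ValueError("s3_prefix must start with s3://")

    # UNLOAD takes the SELECT as a quoted string, so any single quotes
    # inside the query are doubled.
    escaped_query = query.replace("'", "''")

    lines = [
        f"UNLOAD ('{escaped_query}')",
        f"TO '{s3_prefix}'",
        f"IAM_ROLE '{iam_role_arn}'",
        "FORMAT AS PARQUET",
    ]
    if partition_by:
        # Optional: split the output into S3 folders by column value.
        lines.append("PARTITION BY (" + ", ".join(partition_by) + ")")
    return "\n".join(lines) + ";"


# Example with placeholder names (assumptions, not real resources).
statement = build_unload_statement(
    query="SELECT * FROM sales WHERE region = 'EU'",
    s3_prefix="s3://example-bucket/sales/",
    iam_role_arn="arn:aws:iam::123456789012:role/ExampleRedshiftRole",
    partition_by=["year"],
)
print(statement)
```

The generated statement would then be run through whatever SQL client is connected to the cluster. The Parquet files it produces in S3 are what tools such as Redshift Spectrum or Amazon Athena can later query in place.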