Skip to main content

Amazon Redshift

What is it

A fully managed, petabyte-scale cloud data warehouse service.

What it's for

Store and analyze large volumes of structured and semi-structured data for business intelligence, reporting, and big data analytics.

Use cases

  • Data warehousing for business intelligence and reporting
  • Analysis of sales, marketing, and customer data
  • Application and infrastructure performance analysis
  • Financial forecasting and planning
  • Security and compliance data analysis

Key points

  • Fully managed: AWS manages infrastructure, backups, patches, and scaling
  • Columnar: Columnar data storage format to optimize analytical query performance
  • Massively Parallel Processing (MPP): Distributes and executes queries across multiple nodes for fast processing
  • Scalability: Allows scaling the cluster to meet storage and compute needs
  • Integration: Integrates with BI tools, AWS services like S3, Kinesis, EMR, and ETL tools

Comparison with Amazon Athena

  • Amazon Redshift: A traditional data warehouse, ideal for structured data that requires complex queries and frequent transformations. Requires cluster provisioning (though there is a Serverless option).
  • Amazon Athena: A serverless interactive query service for data in S3, ideal for ad-hoc analysis and data exploration without the need to load data into a data warehouse.