[ Databases in AWS : Redshift ]

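Based on PostgreSQL, but does not support OLTP (Online Transaction Processing)

Stores data by column rather than by row

Uses MPP (massively parallel query execution), which gives it far better performance than other databases

Integrates with BI (Business Intelligence) tools such as Amazon QuickSight and Tableau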

- Redshift is based on PostgreSQL, but it's not used for OLTP(Online Transaction Processing)

- It's OLAP (Online Analytical Processing), used for analytics and data warehousing

- AWS claims up to 10x better performance than other data warehouses; scales to PBs of data

- Columnar storage of data (instead of row based)

- Massively Parallel Query Execution (MPP), which is the reason for its high performance

- Pay as you go based on the instances provisioned

- Has a SQL interface for performing the queries

- BI (Business Intelligence) tools such as Amazon QuickSight or Tableau integrate with it

- Data is loaded from S3, DynamoDB, DMS, other DBs

- From 1 node to 128 nodes, up to 128 TB of space per node

   -- Leader node : plans the queries and aggregates the results

   -- Compute node : performs the queries and sends the results to the leader

- Redshift Spectrum : perform queries directly against S3 (no need to load)

- Backup & Restore, Security VPC / IAM / KMS, Monitoring

- Redshift Enhanced VPC Routing : COPY / UNLOAD goes through VPC
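The leader/compute split above can be pictured with a toy scatter-gather sketch. This is not how Redshift is implemented internally; it only illustrates the idea of a leader fanning a query out to compute nodes and merging their partial results. The function names, the thread pool, and the `amount` values are all assumptions made up for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def compute_node(rows):
    """Compute node: run the query fragment on its own slice of the data."""
    # Partial result for: SELECT SUM(amount), COUNT(*) FROM sales
    return sum(rows), len(rows)


def leader_node(slices):
    """Leader node: fan the work out, then aggregate the partial results."""
    with ThreadPoolExecutor(max_workers=len(slices)) as pool:
        partials = list(pool.map(compute_node, slices))
    total = sum(p[0] for p in partials)
    count = sum(p[1] for p in partials)
    return total, count, total / count


# Hypothetical 'amount' column, distributed across three compute nodes.
slices = [[10, 20, 30], [40, 50], [60]]
print(leader_node(slices))
```

Each node only touches its own slice, so adding nodes spreads the scan; the leader does the cheap final merge.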

 

[ Redshift - Snapshots & DR ]

- Redshift has no "Multi-AZ" mode

- Snapshots are point-in-time backups of a cluster, stored internally in S3

- Snapshots are incremental (only what has changed is saved)

- You can restore a snapshot into a new cluster

  -- Automated : every 8 hours, every 5 GB of changes, or on a schedule; you set the retention period

  -- Manual : snapshot is retained until you delete it

- You can configure Amazon Redshift to automatically copy snapshots (automated or manual) of a cluster to another AWS Region

DR (Disaster Recovery) plan : enable automated snapshots, then configure the Redshift cluster to automatically copy its snapshots to another AWS Region
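As a rough sketch of the cross-Region copy setup, assuming the boto3 Redshift client and its `enable_snapshot_copy` call: the helper names, the cluster identifier, and the Regions below are placeholders invented for illustration, not values from these notes.

```python
def snapshot_copy_params(cluster_id, destination_region, retention_days=7):
    """Build the arguments for enabling cross-Region snapshot copy."""
    return {
        "ClusterIdentifier": cluster_id,
        "DestinationRegion": destination_region,
        # Days to keep copied automated snapshots in the destination Region.
        "RetentionPeriod": retention_days,
    }


def enable_cross_region_copy(client, cluster_id, destination_region, retention_days=7):
    """Call enable_snapshot_copy on a Redshift client (assumed to be boto3's)."""
    params = snapshot_copy_params(cluster_id, destination_region, retention_days)
    return client.enable_snapshot_copy(**params)


# Usage (requires boto3 and AWS credentials; names are placeholders):
# import boto3
# client = boto3.client("redshift", region_name="us-east-1")
# enable_cross_region_copy(client, "my-cluster", "us-west-2")
```

The client is passed in rather than created inside the helper, so the same function can be exercised with a stub before it is pointed at a real cluster.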

 

[ Loading data into Redshift ]
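One common way to load from S3 is the COPY command mentioned earlier. Below is a minimal sketch that only builds the SQL text; the table name, bucket path, and IAM role ARN are hypothetical, and the values are assumed to be trusted (no escaping is done).

```python
def build_copy_sql(table, s3_uri, iam_role_arn, file_format="CSV"):
    """Return a Redshift COPY statement that loads a table from S3."""
    return (
        f"COPY {table}\n"
        f"FROM '{s3_uri}'\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        f"FORMAT AS {file_format};"
    )


# Placeholder names for illustration only.
sql = build_copy_sql(
    "sales",
    "s3://my-bucket/sales/",
    "arn:aws:iam::123456789012:role/MyRedshiftRole",
)
print(sql)
```

The resulting statement would be run through the cluster's SQL interface; with Enhanced VPC Routing enabled, that COPY traffic goes through the VPC.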

[ Redshift Spectrum ]

A feature that lets you run queries against data in S3 without inserting (loading) it into Redshift tables

A Redshift cluster must be up and running to use it

- Query data that is already in S3 without loading it

- Must have a Redshift cluster available to start the query

- The query is then submitted to thousands of Redshift Spectrum nodes
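A hedged sketch of what this looks like from the SQL side: an external schema is mapped to a data catalog database, and tables in it are then queried like any other table while the data stays in S3. The schema, database, table, and role names are made up for the example, and the external table itself is assumed to be defined in the catalog already.

```python
def spectrum_statements(schema, catalog_db, iam_role_arn, table):
    """Return (create-external-schema SQL, sample query SQL) for Spectrum."""
    create_schema = (
        f"CREATE EXTERNAL SCHEMA {schema}\n"
        "FROM DATA CATALOG\n"
        f"DATABASE '{catalog_db}'\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        "CREATE EXTERNAL DATABASE IF NOT EXISTS;"
    )
    # The external table lives in S3; nothing is loaded into the cluster.
    query = f"SELECT COUNT(*) FROM {schema}.{table};"
    return create_schema, query


# Placeholder names for illustration only.
ddl, query = spectrum_statements(
    "spectrum", "spectrum_db",
    "arn:aws:iam::123456789012:role/MySpectrumRole", "sales",
)
print(ddl)
print(query)
```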

 

[ Redshift for Solutions Architect ]

Operations : like RDS

Security : IAM, VPC, KMS, SSL (like RDS)

Reliability : auto healing features, cross-region snapshot copy

Performance : up to 10x the performance of other data warehouses (as claimed by AWS), compression

Cost : pay per node provisioned, about 1/10th of the cost of other warehouses (as claimed by AWS)

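vs Athena : faster queries / joins / aggregations thanks to indexes (in Redshift this role is played by sort keys and distribution keys rather than conventional indexes)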

※ Redshift = Analytics / BI / Data Warehouse

 

 
