[ Databases in AWS : ElasticSearch ]

Mainly used to complement another database

Can search on any field, and partial matches are also returned

- Example : In DynamoDB, you can only find by primary key or indexes

- With ElasticSearch, you can search any field, even with partial matches

- It's common to use ElasticSearch as a complement to another database

- ElasticSearch also has some usage for Big Data applications

- You can provision a cluster of instances

- Built-in integrations : Amazon Kinesis Data Firehose, AWS IoT, and Amazon CloudWatch Logs for data ingestion

- Security through Cognito & IAM, KMS encryption, SSL & VPC

- Comes with Kibana (visualization) & Logstash (log ingestion) - ELK stack
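
To make the "search any field, even with partial matches" point concrete, here is a minimal sketch of a search request body in the Elasticsearch Query DSL. The function name and the sample text are hypothetical. `multi_match` with `type: phrase_prefix` is only one of several ways to get prefix-style partial matching, and how it behaves on non-text fields depends on the index mapping.

```python
import json


def build_partial_match_query(text, fields=None):
    """Build a search body that matches `text` as a phrase prefix.

    `fields` defaults to ["*"], meaning every eligible field, which mirrors
    the "search any field" behaviour described in the notes above.
    """
    return {
        "query": {
            "multi_match": {
                "query": text,
                "type": "phrase_prefix",
                "fields": fields or ["*"],
            }
        }
    }


if __name__ == "__main__":
    # Hypothetical example: documents where some field starts with "wireless head".
    print(json.dumps(build_partial_match_query("wireless head"), indent=2))
```

The resulting JSON would be sent as the body of a `_search` request. A DynamoDB table, by contrast, would need a key or index on each attribute you want to look up.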

 

[ ElasticSearch for Solutions Architect ]

Operations : similar to RDS

Security : Cognito, IAM, VPC, KMS, SSL

Reliability : Multi-AZ, clustering

Performance : based on ElasticSearch project(open source), petabyte scale

Cost : pay per node provisioned (similar to RDS)

Remember : ElasticSearch = Search/Indexing


[ Databases in AWS : Neptune ]

- Fully managed graph database

- When do we use Graphs?

  -- High relationship data

  -- Social Networking : users are friends with other users, reply to comments on other users' posts, and like other comments

  -- Knowledge graphs

- Highly available across 3 AZ, with up to 15 read replicas

- Point-in-time recovery, continuous backup to Amazon S3

- Support for KMS encryption at rest + HTTPS
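
As a rough illustration of why highly connected data suits a graph model, the sketch below walks a small in-memory friendship graph to find everyone within a given number of hops. This is plain Python, not the Neptune API. In Neptune itself the same question would be asked in a graph query language such as Gremlin. The sample users and the function name are made up for the example.

```python
from collections import deque


def friends_within(graph, start, max_hops):
    """Return users reachable from `start` in at most `max_hops` friendship edges."""
    seen = {start}
    queue = deque([(start, 0)])
    found = set()
    while queue:
        user, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not expand beyond the hop limit
        for friend in graph.get(user, ()):
            if friend not in seen:
                seen.add(friend)
                found.add(friend)
                queue.append((friend, hops + 1))
    return found


# Hypothetical sample data: each key is a user, each value lists their friends.
FRIENDS = {
    "alice": ["bob", "carol"],
    "bob": ["alice", "dave"],
    "carol": ["alice"],
    "dave": ["bob", "erin"],
    "erin": ["dave"],
}

if __name__ == "__main__":
    print(sorted(friends_within(FRIENDS, "alice", 2)))
```

In a relational schema, each extra hop would typically mean another self-join on a friendships table, which is the kind of query a graph database is designed to avoid.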

 

[ Neptune for Solutions Architect ]

Operations : similar to RDS

Security : IAM, VPC, KMS, SSL (similar to RDS) + IAM Authentication

Reliability : Multi-AZ, clustering

Performance : best suited for graphs, clustering to improve performance

Cost : pay per node provisioned (similar to RDS)

※ Remember : Neptune = Graphs


[ Databases in AWS : Glue ]

- Managed extract, transform, and load (ETL) service

- Useful to prepare and transform data for analytics

- Fully serverless service

 

[ Glue Data Catalog ]

- Glue Data Catalog : catalog of datasets

 


[ Databases in AWS : Redshift ]

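Based on PostgreSQL, but does not support OLTP (transaction processing)

Stores data by column rather than by row

Uses MPP (massively parallel query execution), which gives much higher performance than other databases

Integrates with BI (Business Intelligence) tools such as AWS QuickSight and Tableau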

- Redshift is based on PostgreSQL, but it's not used for OLTP(Online Transaction Processing)

- It's OLAP (Online Analytical Processing) : analytics and data warehousing

- 10x better performance than other data warehouses, scale to PBs of data

- Columnar storage of data (instead of row based)

- Massively Parallel Query Execution (MPP) -> reason why it is such high performance

- Pay as you go based on the instances provisioned

- Has a SQL interface for performing the queries

- BI (Business Intelligence) tools such as AWS QuickSight or Tableau integrate with it

- Data is loaded from S3, DynamoDB, DMS, other DBs

- From 1 node to 128 nodes, up to 128TB of space per node

   -- Leader node : for query planning, results aggregation

   -- Compute node : for performing the queries, send results to leader

- Redshift Spectrum : perform queries directly against S3 (no need to load)

- Backup & Restore, Security VPC / IAM / KMS, Monitoring

- Redshift Enhanced VPC Routing : COPY / UNLOAD goes through VPC
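
The columnar storage point can be pictured with a toy example. The sketch below pivots a few row records into per-column lists and then computes an aggregate from a single column. It only illustrates the idea and does not reflect how Redshift lays data out on disk. The sample orders and function names are invented for the example.

```python
# Row-based layout: every field of a record is stored together.
ROWS = [
    {"order_id": 1, "region": "eu", "amount": 120.0},
    {"order_id": 2, "region": "us", "amount": 80.0},
    {"order_id": 3, "region": "eu", "amount": 50.0},
]


def to_columns(rows):
    """Pivot row records into one list per column (a columnar layout)."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def total_amount(columns):
    """An aggregate reads only the 'amount' column, not whole rows."""
    return sum(columns["amount"])


if __name__ == "__main__":
    cols = to_columns(ROWS)
    print(cols["region"])
    print(total_amount(cols))
```

Analytical queries usually touch a few columns across many rows, so reading only those columns (and compressing similar values together) is a large part of why this layout fits OLAP workloads.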

 

[ Redshift - Snapshots & DR ]

- Redshift has no "Multi-AZ" mode

- Snapshots are point-in-time backups of a cluster, stored internally in S3

- Snapshots are incremental (only what has changed is saved)

- You can restore a snapshot into a new cluster

  -- Automated : every 8 hours, every 5 GB, or on a schedule; you set the retention period

  -- Manual : snapshot is retained until you delete it

- You can configure Amazon Redshift to automatically copy snapshots (automated or manual) of a cluster to another AWS Region

DR (Disaster Recovery) plan : enable automated snapshots, then configure the Redshift cluster to automatically copy its snapshots to another AWS Region
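
A minimal sketch of the cross-region copy setting, assuming it is configured through boto3's Redshift `enable_snapshot_copy` call. The helper only builds the keyword arguments so it can run without AWS credentials. The cluster name and regions are placeholders.

```python
def snapshot_copy_params(cluster_id, destination_region, retention_days=7):
    """Keyword arguments for boto3's Redshift `enable_snapshot_copy` call."""
    if retention_days < 1:
        raise ValueError("retention_days must be at least 1")
    return {
        "ClusterIdentifier": cluster_id,
        "DestinationRegion": destination_region,
        # Days to keep copied automated snapshots in the destination region.
        "RetentionPeriod": retention_days,
    }


if __name__ == "__main__":
    # Placeholder cluster name and regions.
    params = snapshot_copy_params("analytics-cluster", "eu-west-1")
    print(params)
    # With boto3 installed and credentials configured, the call would be roughly:
    #   import boto3
    #   boto3.client("redshift", region_name="us-east-1").enable_snapshot_copy(**params)
```

For a KMS-encrypted cluster, a snapshot copy grant in the destination region is also needed (the `SnapshotCopyGrantName` parameter), which this sketch leaves out.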

 

[ Loading data into Redshift ]
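
The notes leave this heading empty. As a hedged sketch of one common path (loading from S3 with the `COPY` command, which the bullets above mention), the helper below assembles a `COPY` statement as a string. The table name, bucket, and role ARN are placeholders, and the values are interpolated without escaping, so it assumes trusted input.

```python
def build_copy_statement(table, s3_uri, iam_role_arn, data_format="PARQUET"):
    """Return a Redshift COPY statement that loads `table` from `s3_uri`.

    Only simple formats such as PARQUET or CSV are handled here; formats
    that need extra arguments (for example JSON) are out of scope.
    """
    if not s3_uri.startswith("s3://"):
        raise ValueError("s3_uri must start with s3://")
    return (
        f"COPY {table}\n"
        f"FROM '{s3_uri}'\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        f"FORMAT AS {data_format};"
    )


if __name__ == "__main__":
    # Placeholder table, bucket, and role ARN.
    print(build_copy_statement(
        "sales",
        "s3://my-bucket/sales/",
        "arn:aws:iam::123456789012:role/RedshiftCopyRole",
    ))
```

The statement would be run through the cluster's SQL interface. With Enhanced VPC Routing enabled, the resulting S3 traffic goes through the VPC.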

[ Redshift Spectrum ]

Lets you run queries against data in S3 without loading it into Redshift tables

Requires an active Redshift cluster

- Query data that is already in S3 without loading it

- Must have a Redshift cluster available to start the query

- The query is then submitted to thousands of Redshift Spectrum nodes

 

[ Redshift for Solutions Architect ]

Operations : like RDS

Security : IAM, VPC, KMS, SSL (like RDS)

Reliability : auto healing features, cross-region snapshot copy

Performance : 10x performance vs other data warehousing, compression

Cost : pay per node provisioned, 1/10th of the cost vs other warehouses

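vs Athena : faster queries / joins / aggregations, because data is loaded into the cluster and organized with sort and distribution keys (Redshift has no conventional indexes)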

※ Redshift = Analytics / BI / Data Warehouse

 

 


[ Athena Overview ]

Not a database itself, but provides a query engine on top of S3

- Fully Serverless database with SQL capabilities

- Used to query data in S3

- Pay per query

- Output results back to S3

- Secured through IAM

※ Use Case : one time SQL queries, serverless queries on S3, log analytics
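
The "query data in S3, output results back to S3" flow maps onto the parameters of boto3's Athena `start_query_execution` call. The sketch below only builds those parameters, so it runs without AWS access. The database, table, and bucket names are placeholders.

```python
def athena_query_params(sql, database, output_s3_uri):
    """Keyword arguments for boto3's Athena `start_query_execution` call."""
    if not output_s3_uri.startswith("s3://"):
        raise ValueError("output_s3_uri must start with s3://")
    return {
        "QueryString": sql,
        "QueryExecutionContext": {"Database": database},
        # Athena writes the result files to this S3 location.
        "ResultConfiguration": {"OutputLocation": output_s3_uri},
    }


if __name__ == "__main__":
    # Placeholder database, table, and bucket names.
    params = athena_query_params(
        "SELECT status, COUNT(*) FROM access_logs GROUP BY status",
        "weblogs",
        "s3://my-athena-results/queries/",
    )
    print(params)
    # With boto3 installed and credentials configured, roughly:
    #   import boto3
    #   boto3.client("athena").start_query_execution(**params)
```

Since billing is per amount of data scanned, selecting only the needed columns and using compressed or columnar files tends to reduce cost.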

 

[ Athena for Solutions Architect ]

Operations : no operations needed, serverless

Security : IAM + S3 security

Reliability : managed service, uses Presto engine, highly available

Performance : queries scale based on data size

Cost : pay per query / per TB of data scanned, serverless

 

 

 


[ Databases in AWS : S3 ]

- S3 is a key / value store for objects

- Great for big objects, not so great for small objects

- Serverless, scales infinitely, max object size is 5 TB

- Strong consistency

- Tiers : S3 Standard, S3 IA, S3 One Zone IA, Glacier for backups

- Features : Versioning, Encryption, Cross Region Replication, etc...

- Security : IAM, Bucket Policies, ACL (Access Control List)

- Encryption : SSE-S3, SSE-KMS, SSE-C, client side encryption, SSL in transit

※ Use Case : static files, key value store for big files, website hosting

 

[ S3 for Solutions Architect ]

Operations : no operations needed

Security : IAM , Bucket Policies, ACL, Encryption (Server/Client), SSL

Reliability : 99.999999999% (11 9's) durability / 99.99% availability (S3 Standard), Multi AZ, CRR (Cross Region Replication)

Performance : scales to thousands of read/writes per second, transfer acceleration/multi-part for big files

Cost : pay per storage usage, network cost, requests number
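
For the multi-part upload mentioned under Performance, a small planning sketch: given an object size, pick a part size and part count that respect the S3 limits of at most 10,000 parts and at least 5 MiB per part (except the last). The default part size of 100 MiB and the function name are arbitrary choices for the example.

```python
import math

MIB = 1024 ** 2
MIN_PART_SIZE = 5 * MIB   # S3 minimum for every part except the last
MAX_PARTS = 10_000        # S3 maximum number of parts per upload


def plan_multipart(object_size, part_size=100 * MIB):
    """Return (part_count, part_size) for uploading `object_size` bytes."""
    if object_size <= 0:
        raise ValueError("object_size must be positive")
    part_size = max(part_size, MIN_PART_SIZE)
    # Grow the part size if the requested one would need more than MAX_PARTS parts.
    part_size = max(part_size, math.ceil(object_size / MAX_PARTS))
    return math.ceil(object_size / part_size), part_size


if __name__ == "__main__":
    count, size = plan_multipart(1024 ** 3)  # a 1 GiB object
    print(count, size)
```

Parts can be uploaded in parallel and retried individually, which is why multi-part upload helps with big files.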


[ DynamoDB ]

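Proprietary AWS NoSQL database that stores data as key/value pairs

Multi AZ, reads and writes are decoupled, DAX as a read cache

Secured through IAM

Integrates with AWS Lambda through DynamoDB Streams (the stream captures data changes and invokes a Lambda function)

Backup and restore, Global Tables

Monitoring through CloudWatch

No SQL queries; lookups only by key or index

Transactions supported (since Nov 2018)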

- AWS proprietary technology, managed NoSQL database

- Serverless, provisioned capacity, auto scaling, on demand capacity (Nov 2018)

- Can replace ElastiCache as a key/value store (storing session data for example)

- Highly Available, Multi AZ by default, Read and Writes are decoupled, DAX for read cache

- Reads can be eventually consistent or strongly consistent

- Security, authentication and authorization is done through IAM

- DynamoDB Streams to integrate with AWS Lambda

- Backup / Restore feature, Global Table feature

- Monitoring through CloudWatch

- Can only query on primary key, sort key, or indexes

※ Use Case : Serverless application development (small documents, 100s of KB), distributed serverless cache. No SQL query language available; transactions supported since Nov 2018
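
To show what "can only query on primary key, sort key, or indexes" looks like in practice, here is a sketch of the parameters for boto3's low-level DynamoDB `query` call. It assumes a string partition key. The table and attribute names are placeholders. `ConsistentRead` switches between eventually consistent and strongly consistent reads.

```python
def query_by_key_params(table, pk_name, pk_value, consistent_read=False):
    """Keyword arguments for boto3's low-level DynamoDB `query` call.

    Assumes the partition key is a string attribute (type "S").
    """
    return {
        "TableName": table,
        "KeyConditionExpression": "#pk = :pk",
        "ExpressionAttributeNames": {"#pk": pk_name},
        "ExpressionAttributeValues": {":pk": {"S": pk_value}},
        # False = eventually consistent read, True = strongly consistent read.
        "ConsistentRead": consistent_read,
    }


if __name__ == "__main__":
    # Placeholder table and key names.
    params = query_by_key_params("Sessions", "session_id", "abc123", consistent_read=True)
    print(params)
    # With boto3 installed and credentials configured, roughly:
    #   import boto3
    #   boto3.client("dynamodb").query(**params)
```

A query always needs a condition on the partition key. Looking items up by another attribute means adding an index on it or falling back to a full table scan.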

 

[ DynamoDB for Solutions Architect ]

Operations : no operations needed, auto scaling capability, serverless

Security : full security through IAM policies, KMS encryption, SSL in flight

Reliability : Multi AZ, Backups

Performance : single digit millisecond performance, DAX for caching reads, performance doesn't degrade if your application scales

Cost : Pay per provisioned capacity and storage usage (no need to guess in advance any capacity - can use auto scaling)

 


[ Databases in AWS : ElastiCache ]

1. Managed Redis/Memcached (similar offering as RDS, but for caches)

2. In-memory data store, sub-millisecond latency

3. Must provision an EC2 instance type

4. Support for Clustering (Redis) and Multi AZ, Read Replicas (sharding)

5. Security through IAM, Security Groups, KMS, Redis Auth

6. Backup / Snapshot / Point in time restore feature

7. Managed and Scheduled maintenance

8. Monitoring through CloudWatch

※Use Case : Key/Value store, Frequent reads, less writes, cache results for DB queries, store session data for websites, cannot use SQL. 
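
The "cache results for DB queries" use case is often implemented as a cache-aside pattern. The sketch below uses a plain dict as a stand-in for Redis or Memcached so it runs anywhere. The class name, TTL default, and loader function are all hypothetical.

```python
import time


class CacheAside:
    """Cache-aside wrapper; an in-process dict stands in for Redis/Memcached."""

    def __init__(self, load_from_db, ttl_seconds=60.0, clock=time.monotonic):
        self._load = load_from_db
        self._ttl = ttl_seconds
        self._clock = clock
        self._store = {}  # key -> (expires_at, value)

    def get(self, key):
        entry = self._store.get(key)
        if entry is not None and entry[0] > self._clock():
            return entry[1]          # cache hit: skip the database
        value = self._load(key)      # cache miss or expired: query the database
        self._store[key] = (self._clock() + self._ttl, value)
        return value


if __name__ == "__main__":
    def slow_db_lookup(user_id):
        # Placeholder for a real database query.
        return {"id": user_id, "name": f"user-{user_id}"}

    cache = CacheAside(slow_db_lookup)
    print(cache.get(42))  # loaded from the "database"
    print(cache.get(42))  # served from the cache
```

This fits the read-heavy, write-light profile noted above: the more often the same key is read between writes, the more database queries the cache absorbs.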

 

[ ElastiCache for Solutions Architect ]

Operations : same as RDS

Security : AWS responsible for OS security, we are responsible for setting up KMS, security groups, IAM policies, users (Redis Auth), using SSL

Reliability : Clustering, Multi AZ

Performance : Sub-millisecond performance, in memory, read replicas for sharding, very popular cache option

Cost : Pay per hour based on EC2 and storage usage

 


[ Aurora ]

Supports OLTP (transaction processing)

Compatible with PostgreSQL / MySQL

Auto scaling

1. Compatible API for PostgreSQL/MySQL (OLTP)

2. Data is held in 6 replicas, across 3 AZ

3. Auto healing capability

4. Multi AZ, Auto Scaling Read Replicas

5. Read Replicas can be Global

6. Aurora database can be Global for DR or latency purposes

7. Auto scaling of storage from 10GB to 128TB

8. Define EC2 instance type for aurora instances

9. Same security / monitoring / maintenance features as RDS

10. Aurora Serverless - for unpredictable / intermittent(간헐적인) workloads

11. Aurora Multi-Master - for continuous writes failover

※ Use case : same as RDS, but with less maintenance/more flexibility/more performance

 

[ Aurora for Solutions Architect ]

Operations : less operations, auto scaling storage

Security : AWS responsible for OS security, we are responsible for setting up KMS, security groups, IAM policies, authorizing users in DB, using SSL

Reliability : Multi AZ, highly available, possibly more than RDS, Aurora Serverless option, Aurora Multi-Master option

Performance : 5x performance (according to AWS) due to architectural optimizations. Up to 15 Read Replicas (only 5 for RDS)

Cost : Pay per hour based on EC2 and storage usage. Possibly lower costs compared to Enterprise grade databases such as Oracle

 


[ Databases in AWS : RDS(Relational Database Service) ]

1. Managed PostgreSQL / MySQL / Oracle / SQL Server

2. Must provision an EC2 instance & EBS Volume type and size

3. Support for Read Replicas and Multi AZ

4. Security through IAM, Security Groups, KMS, SSL in transit

5. Backup / Snapshot / Point in time restore feature

6. Managed and Scheduled maintenance

7. Monitoring through CloudWatch

8. Use case : Store relational datasets (RDBMS/OLTP), perform SQL queries, transactional I/U/D

※ OLTP : On-line Transactional Processing

 

[ RDS for Solutions Architect ]

1. Operations : small downtime during failover and maintenance; scaling read replicas or the EC2 instance, or restoring EBS, requires manual intervention and application changes

2. Security : AWS responsible for OS security, we are responsible for setting up KMS, security groups, IAM policies, authorizing users in DB, using SSL

3. Reliability : Multi AZ feature, failover in case of failures

4. Performance : depends on EC2 instance type, EBS volume type, ability to add Read Replicas. Storage auto-scaling & manual scaling of instances

5. Cost : Pay per hour based on provisioned EC2 and EBS
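
One example of the "application changes" that read replicas imply is routing: writes go to the primary endpoint, reads are spread over the replicas. The sketch below is a deliberately naive router with made-up endpoint names. It treats any statement starting with SELECT as a read and ignores replication lag.

```python
import itertools


class EndpointRouter:
    """Send writes to the primary and spread reads across read replicas."""

    def __init__(self, primary, replicas=()):
        self._primary = primary
        # Fall back to the primary when no replicas are configured.
        self._readers = itertools.cycle(list(replicas) or [primary])

    def endpoint_for(self, sql):
        is_read = sql.lstrip().upper().startswith("SELECT")
        return next(self._readers) if is_read else self._primary


if __name__ == "__main__":
    # Hypothetical endpoint names.
    router = EndpointRouter(
        "primary.example.internal",
        ["replica-1.example.internal", "replica-2.example.internal"],
    )
    print(router.endpoint_for("SELECT * FROM orders"))
    print(router.endpoint_for("INSERT INTO orders VALUES (1)"))
```

Read replicas are updated asynchronously, so a read sent to a replica right after a write may return stale data. Reads that must see the latest write would need to go to the primary.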

 

 
