[ Docker ]

- Docker is a software development platform to deploy apps

- Apps are packaged in containers that can be run on any OS

- Apps run the same, regardless of where they're run

  1) Any machine

  2) No compatibility issues 

  3) Predictable behavior

  4) Less work

  5) Easier to maintain and deploy

  6) Works with any language, any OS, any technology

 

# Where are Docker images stored?

- Docker images are stored in Docker Repositories

- Public : Docker Hub (https://hub.docker.com) or Amazon ECR Public

- Find base images for many technologies or OS

  ex) Ubuntu, MySQL, NodeJS, Java

- Private : Amazon ECR (Elastic Container Registry)

 

[ Docker vs VM (Virtual Machines) ]

- Docker is "sort of" a virtualization technology, but not exactly

- Resources are shared with the host → many containers on one server

 

[ Docker Container Management ]

To manage containers, we need a container management platform

1) ECS : Amazon's own container platform

2) Fargate : Amazon's own Serverless container platform

3) EKS : Amazon's managed Kubernetes

 

 

1. ECS

- ECS = Elastic Container Service

- Launch Docker containers on AWS

- You must provision & maintain the infrastructure (the EC2 instances)

- AWS takes care of starting/stopping containers

- Has integrations with the Application Load Balancer

 

2. Fargate

- Launch Docker containers on AWS

- You do not provision the infrastructure (no EC2 instances to manage) - simpler

- Serverless offering

- AWS just runs containers for you based on the CPU/RAM you need

 

# IAM Roles for ECS Tasks

1) EC2 Instance Profile :

  - Used by the ECS agent

  - Makes API calls to ECS service

  - Send container logs to CloudWatch Logs

  - Pull Docker image from ECR

  - Reference sensitive data in Secrets Manager or SSM Parameter Store

2) ECS Task Role :

  - Allow each task to have a specific role

  - Use different roles for the different ECS Services you run

  - Task Role is defined in the task definition
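
As a sketch, the two roles might appear in a task definition like this (account id, role names and image are made up for illustration; `taskRoleArn` is the ECS Task Role, while `executionRoleArn` is the role the agent uses to pull images and ship logs):

```json
{
  "family": "my-ecs-service",
  "taskRoleArn": "arn:aws:iam::123456789012:role/MyEcsTaskRole",
  "executionRoleArn": "arn:aws:iam::123456789012:role/ecsTaskExecutionRole",
  "containerDefinitions": [
    {
      "name": "app",
      "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/app:latest"
    }
  ]
}
```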

 

# ECS Data Volumes - EFS File Systems

- Works for both EC2 Tasks and Fargate tasks

- Ability to mount EFS volumes onto tasks

- Tasks launched in any AZ will be able to share the same data in the EFS volume

- Fargate + EFS = serverless + data storage without managing servers

- Use case : persistent multi-AZ shared storage for your containers

 

[ Load Balancing ]

[ 1. Load Balancing for EC2 Launch Type ]

- We get a dynamic port mapping

- The ALB supports finding the right port on your EC2 Instances

- The EC2 instance's security group must allow any port from the ALB's security group

[ 2. Load Balancing for Fargate ]

- Each task has a unique IP

- The ENI's security group must allow the task port from the ALB's security group

 

[ ECS Scaling ]

1. Service CPU Usage

2. SQS Queue

 

 

[ ECS Rolling Updates ]

When updating from v1 to v2, we can control how many tasks can be started and stopped, and in which order

- can set minimum/maximum healthy percent

Example 1 : Min 50% / Max 100%

 

Example 2 : Min 100% / Max 150%
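
A sketch of the arithmetic behind the two examples, assuming a desired count of 4 tasks (exact ECS rounding behavior aside — this just illustrates the bounds):

```python
def rolling_update_bounds(desired, min_healthy_pct, max_pct):
    """Return (min_running, max_running) task counts allowed during a deployment."""
    min_running = desired * min_healthy_pct // 100
    max_running = desired * max_pct // 100
    return min_running, max_running

# Example 1: Min 50% / Max 100% -> ECS may stop up to half the v1 tasks
# first, then start v2 tasks in their place (never exceeding 4 running).
print(rolling_update_bounds(4, 50, 100))   # (2, 4)

# Example 2: Min 100% / Max 150% -> ECS starts extra v2 tasks first (up to 6
# running), then stops v1 tasks once the new ones are healthy.
print(rolling_update_bounds(4, 100, 150))  # (4, 6)
```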

 

[ Amazon ECR : Elastic Container Registry ]

- Store, manage and deploy containers on AWS, pay for what you use

- Fully integrated with ECS & IAM for security, backed by Amazon S3

- Supports image vulnerability scanning, version, tag, image lifecycle

 

[ Amazon EKS Overview ]

Amazon EKS = Amazon Elastic Kubernetes Service

- It is a way to launch managed Kubernetes clusters on AWS

- Kubernetes is an open-source system for automatic deployment, scaling and management of containerized (usually Docker) applications

- It's an alternative to ECS, similar goal but different API

- EKS supports EC2 if you want to deploy worker nodes or Fargate to deploy serverless containers

  Use case: if your company is already using Kubernetes on-premises or in another cloud, and wants to migrate to AWS using Kubernetes

- Kubernetes is cloud-agnostic (can be used in any cloud: Azure, GCP...)

 


[ 1. Amazon MQ ]

- Amazon MQ = managed Apache ActiveMQ

- SQS, SNS are cloud-native services and they're using proprietary protocols from AWS

- Traditional applications running on-premise may use open protocols such as : MQTT, AMQP, STOMP, OpenWire, WSS

- When migrating to the cloud, instead of re-engineering the application to use SQS and SNS, we can use Amazon MQ

- Amazon MQ doesn't "scale" as much as SQS/SNS

- Amazon MQ runs on a dedicated machine, can run in HA (High Availability) with failover

- Amazon MQ has both queue features (~ SQS) and topic features (~ SNS)

- active/standby structure (failover)

 

 

[ 2. SQS vs SNS vs Kinesis ]


[ # Bash ]

Bash is an acronym for Bourne Again Shell.

The Bash shell is a Linux shell, and the most widely used shell on Linux.

A shell is a program that mediates between the user and the kernel: it receives commands from the user and hands them to the kernel for processing.

 

[ # Shell Configuration Files ]

/etc/profile

/etc/bashrc

~/.bash_profile

~/.bashrc

~/.bash_logout

They can be divided into global and per-user (local) configuration files.

Global configuration files live in the /etc directory.

Per-user configuration files are usually hidden files in the user's home directory, starting with '.', such as .bashrc

 

1. /etc/profile

A global system configuration file controlling environment variables and the programs run when bash starts.

It affects all users, whereas ~/.bash_profile affects only the user running bash.

~/.bash_profile is executed after the global /etc/profile has run

 

2. ~/.bashrc

A per-user configuration file controlling aliases and the functions run when bash starts.

The aliases and functions affect only that user.

This file is executed after the global /etc/bashrc has run

 

3. ~/.bash_logout

A per-user bash configuration file for programs executed just before the user logs out.

It affects only that user.

 

 

 

References:

m.blog.naver.com/PostView.nhn?blogId=writer0713&logNo=220702559704&proxyReferer=https:%2F%2Fwww.google.com%2F

 

hippogrammer.tistory.com/57

 

Add a user account

useradd -u {uid} {user}

 

Set the account password

passwd {user}

 

Check an account's uid:gid

cat /etc/passwd

 

Change a uid

usermod -u {uid} {user}

 

Change a gid

groupmod -g {gid} {group}

 

Change the owner of every file/directory under / owned by a given uid (e.g. 500) to another user (e.g. tomcat)

find / -user {uid} -exec chown -h {user} {} \;

 

Change the group of every file/directory under / with a given gid (e.g. 500) to another group (e.g. tomcat)

find / -group {gid} -exec chgrp -h {group} {} \;

 

* Be careful: files inside mounted NAS volumes may be changed too

 

 

 

 

 

 

 

sm-code.tistory.com/10

 

https://m.blog.naver.com/koromoon/220577110840


[ Kinesis ]

A managed replacement for Apache Kafka, well suited to real-time big data work

- Kinesis is a managed alternative to Apache Kafka

- Great for application logs, metrics, IoT, clickstreams

- Great for "real-time" big data

- Great for streaming processing frameworks (Spark, NiFi, etc...)

- Data is automatically replicated to 3 AZs

Kinesis Streams : low latency streaming ingest at scale

Kinesis Analytics : perform real-time analytics on streams using SQL

Kinesis Firehose : load stream into S3, Redshift, ElasticSearch..

 

[ 1. Kinesis Streams ]

Streams are divided into shards; data is kept for one day by default; data can be reprocessed; once inserted into Kinesis, data cannot be deleted

- Streams are divided into ordered Shards/Partitions

- Data retention is 1 day by default, can go up to 365 days

- Ability to reprocess/replay data

- Multiple applications can consume the same stream

- Real-time processing with scale of throughput

- Once data is inserted in Kinesis, it can't be deleted (immutability)

 

[ Kinesis Streams Shards ]

1 MB/s write and 2 MB/s read per SHARD

Billed per shard; the number of shards can be increased (split) or decreased (merge)

- One stream is made of many different shards

- 1 MB/s or 1000 messages/s at write per SHARD

- 2 MB/s at read per SHARD

- Billing is per shard provisioned, can have as many shards as you want

- Batching available or per message calls

- The number of shards can evolve over time (reshard/merge)

- Records are ordered per shard
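
Given the per-shard limits above, a rough sizing sketch (the traffic numbers in the example are made up):

```python
import math

WRITE_MB_PER_SHARD = 1       # 1 MB/s write per shard
WRITE_MSGS_PER_SHARD = 1000  # or 1000 messages/s write per shard
READ_MB_PER_SHARD = 2        # 2 MB/s read per shard

def shards_needed(write_mb_s, write_msgs_s, read_mb_s):
    """Minimum number of shards that satisfies all three per-shard limits."""
    return max(
        math.ceil(write_mb_s / WRITE_MB_PER_SHARD),
        math.ceil(write_msgs_s / WRITE_MSGS_PER_SHARD),
        math.ceil(read_mb_s / READ_MB_PER_SHARD),
    )

# e.g. 5 MB/s of writes, 3000 msgs/s, 12 MB/s of reads -> read-bound: 6 shards
print(shards_needed(5, 3000, 12))  # 6
```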

 

[ AWS Kinesis API - Put records ]

With a partition key, the same key is always sent to the same partition; to increase throughput, use PutRecords with batching

- PutRecord API + Partition key that gets hashed

- The same key goes to the same partition (helps with ordering for a specific key)

- Messages sent get a "sequence number"

- Choose a partition key that is highly distributed (helps prevent "hot partition")

   user_id if many users

   Not country_id if 90% of the users are in one country

- Use Batching with PutRecords to reduce costs and increase throughput

- ProvisionedThroughputExceeded Exception occurs if we go over the limits
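
A toy sketch of partition-key routing (Kinesis actually hashes the key over a 128-bit hash-key range per shard; this only illustrates that the same key always lands on the same shard, and that a well-distributed key spreads load):

```python
import hashlib

def shard_for(partition_key: str, num_shards: int) -> int:
    """Map a partition key to a shard via a stable hash (illustrative only)."""
    digest = hashlib.md5(partition_key.encode()).hexdigest()
    return int(digest, 16) % num_shards

# Same key -> same shard, so ordering per key is preserved
assert shard_for("user_42", 5) == shard_for("user_42", 5)

# A highly distributed key (user_id) spreads records across shards;
# a skewed key (country_id with 90% of users in one country) would not.
counts = [0] * 5
for user in range(1000):
    counts[shard_for(f"user_{user}", 5)] += 1
print(counts)  # roughly even across the 5 shards
```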

 

[ AWS Kinesis API - Exceptions ]

ProvisionedThroughputExceeded Exceptions

- Happens when sending more data (exceeding MB/s or TPS for any shard)

- Make sure you don't have a hot shard (e.g. your partition key is poorly chosen and too much data goes to one partition)

* Solution : Retries with backoff / Increase shards (scaling)
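
The retry-with-backoff part of the solution might look like this sketch (`put_record` is a stand-in callable for the real API call, and the exception class here is a local stand-in for the Kinesis error):

```python
import random
import time

class ProvisionedThroughputExceeded(Exception):
    """Stand-in for the Kinesis throttling error."""

def put_with_backoff(put_record, record, max_retries=5, base_delay=0.1):
    """Retry a throttled call with exponential backoff plus jitter."""
    for attempt in range(max_retries + 1):
        try:
            return put_record(record)
        except ProvisionedThroughputExceeded:
            if attempt == max_retries:
                raise
            # 0.1s, 0.2s, 0.4s, ... with jitter to avoid synchronized retries
            time.sleep(base_delay * (2 ** attempt) * random.uniform(0.5, 1.5))

attempts = []
def flaky_put(record):
    attempts.append(record)
    if len(attempts) < 3:
        raise ProvisionedThroughputExceeded()
    return "ok"

print(put_with_backoff(flaky_put, {"data": b"hello"}))  # ok (after 2 retries)
```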

 

[ AWS Kinesis API - Consumers ]

- Can use a normal consumer : CLI, SDK, etc...

- Can use Kinesis Client Library (in Java, Node, Python, Ruby, .NET)

  : KCL uses DynamoDB to checkpoint offsets

  : KCL uses DynamoDB to track other workers and share the work amongst shards

 

[ Kinesis Security ]

Control access / authorization using IAM policies

Encryption in flight using HTTPS endpoints

Encryption at rest using KMS

Possibility to encrypt/decrypt data client side (harder)

VPC Endpoints available for Kinesis to access within VPC

 

 

[ 2. Kinesis Data Firehose ]

Serverless, auto scaling, no administration

Near real time (not real time)

- Fully Managed Service, no administration, automatic scaling, serverless

- Load data into Redshift/Amazon S3/ElasticSearch/Splunk

- Near Real Time

  60 seconds latency minimum for non full batches

  Or minimum 32 MB of data at a time

- Supports many data formats, conversions, transformations, compression

- Pay for the amount of data going through Firehose

 

[ Kinesis Data Streams vs Firehose ]

# Streams

- Going to write custom code (producer/consumer)

- Real time(~200ms)

- Must manage scaling(shard splitting/merging)

- Data Storage for 1 to 7 days, replay capability, multi consumers

# Firehose

- Fully managed, send to S3, Splunk, Redshift, ElasticSearch

- Serverless data transformations with Lambda

- Near real time (lowest buffer time is 1 minute)

- Automated Scaling

- No data storage

 

[ Kinesis Data Analytics ]

- Perform real-time analytics on Kinesis Streams using SQL

- Kinesis Data Analytics :

  Auto Scaling

  Managed : no servers to provision

  Continuous : real time

- Pay for actual consumption rate

- Can create streams out of the real-time queries

 

 

[ Data ordering for Kinesis vs SQS FIFO ]

To consume data in order per object, send it using a "partition key" equal to the object's id; the same key always goes to the same shard

to consume the data in order for each object, send using a "partition key" value of the "object_id"

the same key will always go to the same shard

 

[ Ordering data into SQS ]

SQS Standard does not preserve order. Use FIFO; with multiple consumers, messages can be grouped with a Group ID (similar to Kinesis's partition key)

# Standard Case

  - For SQS standard, there is no ordering

  - For SQS FIFO, if you don't use a Group ID, messages are consumed in the order they are sent, with only one consumer

# When to use Group ID

  - You want to scale the number of consumers, but you want messages to be "grouped" when they are related to each other

  - Then you use a Group ID (similar to Partition key in Kinesis)

 

[ # Kinesis vs SQS ordering ]

Let's assume 100 trucks, 5 Kinesis shards, 1 SQS FIFO queue

# Kinesis Data Streams :

  - On average you will have 20 trucks per shard

  - Trucks will have their data ordered within each shard

  - The maximum amount of consumers in parallel we can have is 5

  - Can receive up to 5 MB/s of data

# SQS FIFO :

  - you only have one SQS FIFO queue

  - you will have 100 Group IDs

  - You can have up to 100 consumers (due to the 100 Group IDs)

  - You have up to 300 messages per second (or 3000 if using batching)

 

 

 


[ Amazon SNS ]

send one message to many receivers

- The "event producer" only sends message to one SNS topic

- As many "event receivers(subscriptions)" as we want to listen to the SNS topic notifications

- Each subscriber to the topic will get all the messages (note: new feature to filter messages)

- Up to 10,000,000 subscriptions per topic

- 100,000 topics limit

- Subscribers can be

 1) SQS

 2) HTTP/HTTPS (with delivery retries - configurable)

 3) Lambda

 4) Emails

 5) SMS messages

 6) Mobile Notifications

 

1. SNS integrated with a lot of AWS services

- Many AWS services can send data directly to SNS for notifications

- CloudWatch (for alarms)

- Auto Scaling Groups notifications

- Amazon S3 (on bucket events)

- CloudFormation (upon state changes => failed to build, etc)

 

2. How to publish

- Topic publish (using the SDK)

 1) Create a topic

 2) Create a subscription

 3) Publish to the topic

- Direct Publish (for mobile apps SDK)

 1) Create a platform application

 2) Create a platform endpoint

 3) Publish to the platform endpoint

 4) Works with Google GCM, Apple APNS, Amazon ADM

 

3. Security

- Encryption :

  In-flight encryption using HTTPS API

  At-rest encryption using KMS keys

  Client-side encryption if the client wants to perform encryption/decryption itself

- Access Controls : IAM policies to regulate access to the SNS API

- SNS Access Policies (similar to S3 bucket policies)

  Useful for cross-account access to SNS topics

  Useful for allowing other services (S3..) to write to an SNS topic

 

 

[ SNS + SQS : Fan Out ]

Push a message to SNS once, and every SQS queue subscribed to the topic receives it

- Push once in SNS, receive in all SQS queues that are subscribers

- Fully decoupled, no data loss

- SQS allows for : data persistence, delayed processing and retries of work

- Ability to add more SQS subscribers over time

- Make sure your SQS queue access policy allows for SNS to write

* SNS cannot send messages to SQS FIFO queues (AWS limitation)
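
An in-memory sketch of the fan-out topology (real code would use the SNS/SQS APIs; the queue names are made up):

```python
from collections import deque

class Topic:
    """Minimal SNS-like topic: publish once, every subscriber gets a copy."""
    def __init__(self):
        self.subscribers = []

    def subscribe(self, queue):
        self.subscribers.append(queue)

    def publish(self, message):
        for queue in self.subscribers:
            queue.append(message)

topic = Topic()
thumbnails_queue, analytics_queue = deque(), deque()
topic.subscribe(thumbnails_queue)
topic.subscribe(analytics_queue)

# One S3-style event pushed to SNS lands in every subscribed queue
topic.publish({"event": "object_created", "key": "images/cat.jpg"})

print(len(thumbnails_queue), len(analytics_queue))  # 1 1
```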

 

# S3 Events to multiple queues

- For the same combination of event type (e.g. object create) and prefix (e.g. images/) you can only have one S3 Event rule

- If you want to send the same S3 event to many SQS queues, use fan-out


- When we start deploying multiple applications, they will inevitably need to communicate with one another

- There are two patterns of application communication

1. Synchronous communications (application to application)

2. Asynchronous / Event based (application to queue to application)

 

Synchronous communication between applications can be problematic if there are sudden spikes of traffic

What if you need to suddenly encode 1000 videos but usually it's 10?

In that case, it's better to decouple your applications

1) using SQS: queue model

2) using SNS: pub/sub model

3) using Kinesis: real-time streaming model

These services can scale independently from our application

 

[ 1. SQS (Simple Queue Service) ]

- Oldest offering (over 10 years old)

- Fully managed service, used to decouple applications

- Attributes :

  1) unlimited throughput, unlimited number of messages in queue

  2) Short-lived : Default retention of messages in queue is 4 days, maximum of 14 days

  3) Low latency (<10ms on publish and receive)

  4) Limitation of 256KB per message sent

- Can have duplicate messages (at-least-once delivery, occasionally)

- Can have out of order messages (best effort ordering)

 

1) SQS : Producing Messages 

- Produced to SQS using the SDK (SendMessage API)

- The message is persisted in SQS until a consumer deletes it

- Message retention: default 4 days, up to 14 days

- unlimited throughput

 

2) SQS : Consuming Messages 

- Consumers (running on EC2 instances, servers, or AWS Lambda)

- Poll SQS for messages (receive up to 10 messages at a time)

- Process the messages (ex: insert the message into an RDS database)

- Delete the messages using the DeleteMessage API

 

3) Multiple EC2 Instances Consumers 

- Consumers receive and process messages in parallel

- At least once delivery

- Best-effort message ordering (it tries its best, but there is no guarantee)

- Consumers delete messages after processing them

- We can scale consumers horizontally to improve throughput of processing

 

4) SQS with Auto Scaling Group (ASG)

If the queue length goes over a certain level, a CloudWatch Alarm fires and increases the capacity of the Auto Scaling Group

 

5) SQS to decouple between application tiers

Split a long-running job into two processes and decouple them with an SQS queue

 

6) SQS -Security

- Encryption 

  1) In-flight encryption using HTTPS API

  2) At-rest encryption using KMS keys

  3) Client-side encryption if the client wants to perform encryption/decryption itself

- Access Controls : IAM policies to regulate access to the SQS API

- SQS Access Policies (similiar to S3 bucket policies)

  Useful for cross-account access to SQS queues

  Useful for allowing other services (SNS, S3...) to write to an SQS queue

 

7) Message Visibility Timeout **

Once a server polls a message, other servers cannot access that message for the visibility timeout (30 seconds by default)

- After a message is polled by a consumer, it becomes invisible to other consumers

- By default, the message visibility timeout is 30 seconds

- That means the message has 30 seconds to be processed

- After the message visibility timeout is over, the message is "visible" in SQS

- If a message is not processed within the visibility timeout, it may be processed twice

- A consumer could call the ChangeMessageVisibility API to get more time

- If visibility timeout is high (hours), and consumer crashes, re-processing will take time

- If visibility timeout is too short, we may get duplicates
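
A toy simulation of the visibility-timeout behavior described above (an in-memory stand-in for SQS, using explicit timestamps instead of a real clock):

```python
class VisibilityQueue:
    """Toy SQS-like queue illustrating the visibility timeout."""
    def __init__(self, visibility_timeout=30):
        self.visibility_timeout = visibility_timeout
        self.messages = {}  # id -> (body, invisible_until)
        self.next_id = 0

    def send(self, body):
        self.messages[self.next_id] = (body, 0.0)
        self.next_id += 1

    def receive(self, now):
        for mid, (body, invisible_until) in self.messages.items():
            if now >= invisible_until:
                # Hide the message from other consumers for the timeout window
                self.messages[mid] = (body, now + self.visibility_timeout)
                return mid, body
        return None

    def delete(self, mid):
        self.messages.pop(mid, None)

q = VisibilityQueue(visibility_timeout=30)
q.send("encode video 1")

mid, body = q.receive(now=0)  # consumer A polls the message at t=0
print(q.receive(now=10))      # None: still invisible to consumer B
print(q.receive(now=40))      # visible again: not deleted in time -> redelivered
q.delete(mid)                 # processing done: remove it for good
```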

 

8) Dead Letter Queues

- If a consumer fails to process a message within the Visibility Timeout, the message goes back to the queue

- We can set a threshold (limit) of how many times a message can go back to the queue

- After the MaximumReceives threshold is exceeded, the message goes into a dead letter queue(DLQ)

* Useful for debugging

* make sure to process the messages in the DLQ before they expire: Good to set a retention of 14 days in the DLQ
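
A sketch of the redrive logic, with an in-memory stand-in for the queue and DLQ (not the real SQS API):

```python
def consume_with_redrive(message, process, max_receives=3):
    """Return ('processed', result), or ('dlq', message) once the
    MaximumReceives threshold is exceeded."""
    for receive_count in range(1, max_receives + 1):
        try:
            return "processed", process(message)
        except Exception:
            continue  # message goes back to the queue; its receive count grows
    return "dlq", message  # MaximumReceives exceeded -> dead letter queue

def always_fails(message):
    raise ValueError("cannot parse payload")

print(consume_with_redrive({"body": "corrupt"}, always_fails))
# ('dlq', {'body': 'corrupt'})  -> inspect it in the DLQ for debugging
print(consume_with_redrive({"body": "fine"}, lambda m: "done"))
# ('processed', 'done')
```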

 

9) Delay Queue

- Delay a message (consumers don't see it immediately) up to 15 mins

- Default is 0 seconds (message is available right away)

- Can set a default at queue level

- Can override the default on send using the DelaySeconds parameter

 

10) SQS - FIFO Queue

A first-in, first-out queue; unlike standard SQS, which has unlimited throughput, its throughput is limited

- First In First Out (ordering of messages in the queue)

- Limited throughput : 300 msg/s without batching, 3000 msg/s with

- Exactly-once send capability (by removing duplicates)

- Messages are processed in order by the consumer
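
A sketch of the "exactly-once send" deduplication (real SQS FIFO deduplicates on a deduplication ID within a 5-minute window; this toy version keeps ids forever):

```python
class FifoQueue:
    """Toy FIFO queue: duplicate sends (same dedup id) are dropped,
    and message order is preserved."""
    def __init__(self):
        self.messages = []
        self.seen_dedup_ids = set()

    def send(self, body, dedup_id):
        if dedup_id in self.seen_dedup_ids:
            return False  # duplicate removed -> exactly-once send
        self.seen_dedup_ids.add(dedup_id)
        self.messages.append(body)
        return True

q = FifoQueue()
q.send("order-1 created", dedup_id="msg-1")
q.send("order-1 created", dedup_id="msg-1")  # retry of the same send: dropped
q.send("order-2 created", dedup_id="msg-2")
print(q.messages)  # ['order-1 created', 'order-2 created'], in send order
```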

 


[ Hybrid Cloud for Storage ]

A way to use AWS S3 together with on-premise storage; AWS Storage Gateway bridges on-premise storage and AWS storage

- AWS is pushing for "hybrid cloud"

  Part of your infrastructure is on the cloud

  Part of your infrastructure is on-premise

 

- This can be due to 

  1) Long cloud migrations

  2) Security requirements

  3) Compliance requirements

  4) IT strategy

 

- S3 is a proprietary storage technology (unlike EFS/NFS), so how do you expose the S3 data on-premise?

 : AWS Storage Gateway

 

[ AWS Storage Gateway ]

Bridge between on-premise data and cloud data in S3

ex) DR, backup & restore, tiered storage

 

1. File Gateway

- Configured S3 buckets are accessible using the NFS and SMB protocol

- Supports S3 standard, S3 IA, S3 One Zone IA

- Bucket access using IAM roles for each File Gateway

- Most recently used data is cached in the file gateway

- can be mounted on many servers

- backed by S3

1-2. File Gateway - Hardware appliance

- Using a file gateway means you need virtualization capability

  Otherwise, you can use a File Gateway Hardware Appliance

- You can buy it on amazon.com

- helpful for daily NFS backups in small data centers

 

2. Volume Gateway

- Block storage using iSCSI protocol backed by S3

- Backed by EBS snapshots which can help restore on-premise volumes

  Cached volumes: low latency access to most recent data

  Stored volumes: entire dataset is on premise, scheduled backups to S3

- backed by S3 with EBS snapshots

 

3. Tape Gateway

- Some companies have backup processes using physical tapes

- with tape gateway, companies use the same processes but in the cloud

- Virtual Tape Library (VTL) backed by Amazon S3 and Glacier

- Back up data using existing tape-based processes (and iSCSI interface)

- Works with leading backup software vendors

- backed by S3 and Glacier

 

 

[ Amazon FSx for Windows ]

A Windows version of EFS, created because EFS is available only on Linux

- EFS is a shared POSIX system for Linux systems

- FSx for Windows is a fully managed Windows file system share drive

- Supports SMB protocol & Windows NTFS

- Microsoft Active Directory integration, ACLs, user quotas

- Built on SSD, scale up to 10s of GB/s, millions of IOPS, 100s PB of data

- Can be accessed from your on-premise infrastructure

- Can be configured to be Multi-AZ

- Data is backed-up daily to S3

 

[ Amazon FSx for Lustre ]

A clustered, distributed Linux file system supporting high-performance workloads such as machine learning

- The name Lustre is derived from "Linux" and "cluster"

- Lustre is a type of parallel distributed file system, for large-scale computing

- Machine Learning, High Performance Computing (HPC)

- Video Processing, Financial Modeling, Electronic Design Automation

- Scales up to 100s GB/s, millions of IOPS, sub-ms latencies

- Seamless integration with S3

   Can "read S3" as a file system (through FSx)

   Can write the output of the computations back to S3 (through FSx)

- Can be used from on-premise servers

 

 

# Storage Comparison

- S3 : Object Storage

- Glacier : Object Archival

- EFS : Network File System for many Linux instances, POSIX filesystem

- EBS Volumes : Network storage for one EC2 instance at a time

- FSx for Windows : Network File System for Windows servers

- FSx for Lustre : High Performance Computing Linux file system

- Instance Storage : Physical storage for your EC2 instance (high IOPS)

- Storage Gateway : File Gateway, Volume Gateway (cache & stored), Tape Gateway

- Snowball / Snowmobile : to move large amount of data to the cloud, physically

- Database : for specific workloads, usually with indexing and querying

 


[ Snowball ]

- Physical data transport solution that helps moving TBs or PBs of data in or out of AWS

- Alternative to moving data over the network (and paying network fees)

- Secure, tamper resistant, uses KMS 256 bit encryption

- Tracking using SNS and text messages, E-ink shipping label

- Pay per data transfer job

ex) large data cloud migrations, DC decommission, DR

     If it takes more than a week to transfer over the network, use Snowball devices
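
The "more than a week" rule of thumb can be checked with a quick calculation (a sketch assuming decimal TB and a fully saturated link; real transfers are slower):

```python
def transfer_days(data_tb, bandwidth_mbps):
    """Days needed to move data_tb terabytes over a bandwidth_mbps link."""
    bits = data_tb * 8 * 10**12               # TB -> bits (decimal units)
    seconds = bits / (bandwidth_mbps * 10**6)  # bits / (bits per second)
    return seconds / 86400

# 100 TB over a 100 Mbps line takes ~3 months -> ship a Snowball instead
print(round(transfer_days(100, 100)))  # 93
```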

 

[ Snowball : process ]

1. Request snowball devices from the AWS console for delivery

2. Install the snowball client on your servers

3. Connect the snowball to your servers and copy files using the client

4. Ship back the device when you're done (goes to the right AWS facility)

5. Data will be loaded into an S3 bucket

6. Snowball is completely wiped

7. Tracking is done using SNS, text messages and the AWS console

 

[ Snowball Edge ]

- Snowball Edges add computational capability to the device

- 100TB capacity with either :

  1) Storage optimized - 24vCPU

  2) Compute optimized - 52 vCPU & optional GPU

- Supports a custom EC2 AMI so you can perform processing on the go

- Supports custom Lambda functions

- Very useful to pre-process the data while moving

ex) data migration, image collation, IoT capture, machine learning

 

[ Snowmobile ]

- Transfer exabytes of data (1 EB = 1000 PB = 1000000 TBs)

- Each Snowmobile has 100 PB of capacity (use multiple in parallel)

- Better than Snowball if you transfer more than 10 PB

 

[ Snowball into Glacier ]

Snowball data cannot be moved into Glacier directly; upload it to S3 first and let an S3 lifecycle policy move it to Glacier

- Snowball cannot import to Glacier directly

- You have to use Amazon S3 first, and an S3 lifecycle policy

 

 


[ CloudFront Signed URL / Signed Cookies ]

- You want to distribute paid shared content to premium users over the world

- We can use CloudFront Signed URL/Cookie. We attach a policy with :

   1) includes URL expiration

   2) includes IP ranges to access the data from

   3) trusted signers (which AWS accounts can create signed URLs)

- How long should the URL be valid for?

  -- Shared content (movie, music) : make it short (a few minutes)

  -- Private content (private to the user) : you can make it last for years

 

* Signed URL = access to individual files (one signed URL per file)

* Signed Cookies = access to multiple files (one signed cookie for many files)

 

CloudFront Signed URL Diagram

1. The client authenticates with the application

2. The app generates a Signed URL using the AWS SDK and returns it to the client

3. The client uses the Signed URL to access the S3 object through CloudFront
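
A sketch of the custom policy a signed URL carries, with its URL-safe encoding (the actual signing step needs an RSA signature with the CloudFront key pair and is omitted; the distribution URL and IP range below are made up):

```python
import base64
import json

def make_custom_policy(url, expires_epoch, ip_range=None):
    """Build the CloudFront custom-policy JSON for a signed URL."""
    condition = {"DateLessThan": {"AWS:EpochTime": expires_epoch}}
    if ip_range:
        condition["IpAddress"] = {"AWS:SourceIp": ip_range}
    return json.dumps(
        {"Statement": [{"Resource": url, "Condition": condition}]},
        separators=(",", ":"),
    )

def cloudfront_safe_b64(data: str) -> str:
    """CloudFront replaces +, =, / in base64 with -, _, ~ to keep it URL-safe."""
    b64 = base64.b64encode(data.encode()).decode()
    return b64.translate(str.maketrans("+=/", "-_~"))

policy = make_custom_policy(
    "https://d111111abcdef8.cloudfront.net/movie.mp4",
    expires_epoch=1735689600,       # URL expiration
    ip_range="203.0.113.0/24",      # IP range allowed to access the data
)
print(cloudfront_safe_b64(policy)[:40] + "...")
```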

 

[ CloudFront Signed URL vs S3 Pre-Signed URL ]

A CloudFront Signed URL reaches S3 through a CloudFront edge

An S3 Pre-Signed URL accesses S3 directly (using IAM)

1. CloudFront Signed URL

- Allow access to a path, no matter the origin

- Account wide key-pair, only the root can manage it

- Can filter by IP, path, date, expiration

- Can leverage caching features

 

2. S3 Pre-Signed URL

- Issue a request as the person who pre-signed the URL

- Uses the IAM key of the signing IAM principal

- Limited lifetime

 

 

 

[ AWS Global Accelerator ]

[ Global users for our application ]

Clients reaching a global service over the public internet pass through many hops before hitting the app, which adds latency

- You have deployed an application and have global users who want to access it directly

- They go over the public internet, which can add a lot of latency due to many hops

- We wish to go as fast as possible through AWS network to minimize latency

 

# Unicast IP vs AnyCast IP

With Anycast IP, all servers share the same IP address and the client is routed to the nearest one

Unicast IP : one server holds one IP address

Anycast IP : all servers hold the same IP address and the client is routed to the nearest one

 

[ AWS Global Accelerator ]

Clients reach the app over the AWS internal network via edge locations, instead of over the public internet

- Leverage the AWS internal network to route to your application

- 2 Anycast IP are created for your application

- The Anycast IPs send traffic directly to Edge Locations

- The Edge locations send the traffic to your application

- Works with Elastic IP, EC2 instances, ALB, NLB, public or private

- Consistent Performance

  1) Intelligent routing to lowest latency and fast regional failover

  2) No issue with client cache (because the IP doesn't change)

  3) Internal AWS network

- Health Checks

  1) Global Accelerator performs a health check of your applications

  2) Helps make your application global (failover in less than 1 minute for unhealthy endpoints)

  3) Great for DR

- Security

  1) only 2 external IPs need to be whitelisted

  2) DDoS protection thanks to AWS Shield

 

[ AWS Global Accelerator vs CloudFront ]

Both :

1) use the AWS global network and its edge locations around the world

2) integrate with AWS Shield for DDoS protection

Differences : 

CloudFront

- Improves performance for both cacheable content (ex: images and videos)

- Dynamic content (ex: API acceleration and dynamic site delivery)

- Content is served at the edge

Global Accelerator

- Improves performance for a wide range of applications over TCP or UDP

- Proxying packets at the edge to applications running in one or more AWS Regions

- Good fit for non-HTTP use cases, such as gaming(UDP), IoT(MQTT), or Voice over IP

- Good for HTTP use cases that require static IP addresses

- Good for HTTP use cases that require deterministic, fast regional failover

 

# Hands-On : Global Accelerator

1. Create several instances to use as endpoints

2. Create the Global Accelerator

1) Specify endpoint groups (pick the regions)

2) Specify the instances per region (the instances created in step 1)

 

 

 

 


[ AWS CloudFront ]

When a user in Korea requests content from an S3 bucket in Australia, it is served from data cached at a nearby edge (e.g. Tokyo)

- Content Delivery Network (CDN)

- Improves read performance, content is cached at the edge

- 216 Points of Presence globally (edge locations)

- DDoS protection, integration with Shield, AWS Web Application Firewall

- can expose external HTTPS and can talk to internal HTTPS backends

 

[ CloudFront - Origins ]

Improve security by allowing only CloudFront to access the S3 bucket / custom origin (OAI)

1. S3 bucket 

- For distributing files and caching them at the edge

- Enhanced security with CloudFront Origin Access Identity (OAI)

- CloudFront can be used as an ingress (to upload files to S3)

2. Custom Origin (HTTP)

- Application Load Balancer

- EC2 instance

- S3 website (must first enable the bucket as a static S3 website)

- Any HTTP backend you want

 

# CloudFront at a high level

 

# CloudFront - S3 as an Origin

 

# CloudFront - ALB or EC2 as an origin

 

[ CloudFront Geo Restriction ]

- You can restrict who can access your distribution

- can use Whitelist/Blacklist

- The country is determined using a 3rd party Geo-IP database

  ex. Copyright Laws to control access to content

 

[ CloudFront vs S3 Cross Region Replication ]

1) CloudFront :

- Global Edge network

- Files are cached for a TTL (maybe a day)

- Great for static content that must be available everywhere

2) S3 Cross Region Replication :

- Must be setup for each region you want replication to happen

- Files are updated in near real-time

- Read only

- Great for dynamic content that needs to be available at low-latency in few regions

 

 

 

