[ Big Data Ingestion Pipeline : Todo List ] 

1. We want the ingestion pipeline to be fully serverless

2. We want to collect data in real time

3. We want to transform the data

4. We want to query the transformed data using SQL

5. The reports created using the queries should be in S3

6. We want to load that data into a warehouse and create dashboards

 

[ Big Data Ingestion Pipeline ]

- IoT Core allows you to harvest data from IoT devices

- Kinesis is great for real-time data collection

- Firehose helps with data delivery to S3 in near real-time (1 min)

- Lambda can help Firehose with data transformations

- Amazon S3 can trigger notifications to SQS

- Lambda can subscribe to SQS (we could also have connected S3 directly to Lambda)

- Athena is a serverless SQL service and results are stored in S3

- The reporting bucket contains analyzed data and can be used by reporting tools such as Amazon QuickSight, Redshift, etc.
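
The Firehose transformation step above can be sketched as a Lambda handler. This is a minimal sketch, not the pipeline's actual code: the `temperature_c` field and the Fahrenheit conversion are hypothetical, chosen only to show the record contract (each record comes in base64-encoded and must be returned with the same `recordId`, a `result`, and base64-encoded `data`).

```python
import base64
import json


def lambda_handler(event, context):
    """Transform each Firehose record and hand it back for delivery to S3."""
    output = []
    for record in event["records"]:
        try:
            payload = json.loads(base64.b64decode(record["data"]))
            # Hypothetical transformation: add a Fahrenheit reading.
            payload["temperature_f"] = payload["temperature_c"] * 9 / 5 + 32
            # Newline-delimited JSON keeps one object per line in the S3 file.
            data = json.dumps(payload) + "\n"
            output.append({
                "recordId": record["recordId"],
                "result": "Ok",
                "data": base64.b64encode(data.encode("utf-8")).decode("ascii"),
            })
        except (ValueError, KeyError, TypeError):
            # Malformed input: return the original data marked as failed.
            output.append({
                "recordId": record["recordId"],
                "result": "ProcessingFailed",
                "data": record["data"],
            })
    return {"records": output}
```

Records marked `ProcessingFailed` are not silently lost, so one bad message does not block the rest of the batch.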


- We have an application running on EC2, that distributes software updates once in a while

- When a new software update is out, we get a lot of requests and the content is distributed en masse over the network. It's very costly

- We don't want to change our application, but want to optimize our cost and CPU, how can we do it?

 

[ Our application current state ]

ELB + ASG , running on multi AZ

 

[ Easy way to fix things : Using Amazon CloudFront ]

Why CloudFront?

- No changes to architecture

- Will cache software update files at the edge

- Software update files are not dynamic, they're static (never changing)

- Our EC2 instances aren't serverless

- But CloudFront is, and will scale for us

- Our ASG will not scale as much, and we'll save tremendously in EC2

- We'll also improve availability and save on network bandwidth cost, etc

- Easy way to make an existing application more scalable and cheaper


[ Distributing paid content ]

1. We sell videos online and users have to pay to buy videos

2. Each video can be bought by many different customers

3. We only want to distribute videos to users who are premium users

4. We have a database of premium users

5. Links we send to premium users should be short lived

6. Our application is global

7. We want to be fully serverless

 

[ Start simple, premium user service ]

Authenticate users with Cognito

Check the database to confirm the user is a premium user (authorization)

 

[ Add video storage, distribute globally and securely, only to premium users ]

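1) The client requests a video URL

2) The user is authenticated through Cognito

3) Lambda checks whether the user is a premium user (authorization)

4) If the user is premium, Lambda generates a CloudFront Signed URL with a limited validity period (CloudFront is configured so that content is reachable only through Signed URLs)

5) The Signed URL is returned to the client

6) The user accesses CloudFront with the Signed URL and views the video stored in S3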
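
Steps 3 to 5 can be sketched as below. This is a sketch under stated assumptions: `is_premium` stands in for the DynamoDB lookup and `sign` for an RSA-SHA1 signature made with the CloudFront private key, and both are hypothetical callables passed in by the caller. The URL layout (`Expires`, `Signature`, `Key-Pair-Id`) and the character substitutions follow CloudFront's canned-policy format. In a real function, `botocore.signers.CloudFrontSigner` can do this work.

```python
import base64
import json
import time
from typing import Callable, Optional


def _cloudfront_b64(raw: bytes) -> str:
    """Base64-encode, then swap the characters CloudFront does not allow in URLs."""
    encoded = base64.b64encode(raw).decode("ascii")
    return encoded.replace("+", "-").replace("=", "_").replace("/", "~")


def build_signed_url(url: str, key_pair_id: str, expires_at: int,
                     sign: Callable[[bytes], bytes]) -> str:
    """Build a CloudFront signed URL using a canned policy (expiry time only)."""
    policy = json.dumps(
        {"Statement": [{"Resource": url,
                        "Condition": {"DateLessThan": {"AWS:EpochTime": expires_at}}}]},
        separators=(",", ":"),  # the policy is signed without whitespace
    ).encode("utf-8")
    signature = _cloudfront_b64(sign(policy))
    separator = "&" if "?" in url else "?"
    return (f"{url}{separator}Expires={expires_at}"
            f"&Signature={signature}&Key-Pair-Id={key_pair_id}")


def get_video_url(user_id: str, video_url: str,
                  is_premium: Callable[[str], bool],
                  sign: Callable[[bytes], bytes], key_pair_id: str,
                  ttl_seconds: int = 300, now: Optional[int] = None) -> Optional[str]:
    """Authorize the user, then return a short-lived signed URL (or None)."""
    if not is_premium(user_id):
        return None  # not a premium user, so no URL is issued
    issued_at = int(time.time()) if now is None else now
    return build_signed_url(video_url, key_pair_id, issued_at + ttl_seconds, sign)
```

A short `ttl_seconds` keeps the links short-lived, which is one of the requirements listed earlier.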

 

[ Premium User Video service ]

We have implemented a fully serverless solution :

1. Cognito for authentication

2. DynamoDB for storing users that are premium

3. 2 serverless applications

4. Content is stored in S3 (serverless and scalable)

5. Integrated with CloudFront with OAI for security (users can't bypass)

6. CloudFront content can be accessed only through Signed URLs, which keeps out unauthorized users

※ What about S3 Signed URLs? They're not efficient for global access

 


[ Micro Service Architecture ]

- We want to switch to a micro service architecture

- Many services interact with each other directly using a REST API

- Each architecture for each micro service may vary in form and shape

- We want a micro-service architecture so we can have a leaner development lifecycle for each service

 

[ Micro Services Environment ]

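A DNS query to Route 53 resolves the domain and directs the client to that service's endpoint

Each domain is a small, separately deployed service (micro service) that may call other services internally to do its work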

 

[ Discussions on Micro Services ]

1) Free to design each micro-service the way we want

2) Synchronous patterns : API Gateway, Load Balancers

3) Asynchronous patterns : SQS, Kinesis, SNS, Lambda triggers (S3)

4) Challenges with micro-services:

 - repeated overhead for creating each new microservice

 - issues with optimizing server density/utilization

 - complexity of running multiple versions of multiple microservices simultaneously

 - proliferation of client-side code requirements to integrate with many separate services

 

Some of the challenges are solved by Serverless patterns :

- API Gateway, Lambda scale automatically and you pay per usage

- You can easily clone API, reproduce environments

- Client SDKs can be generated through the Swagger integration of API Gateway

 


[ Serverless Website : Todo List ]

1. This website should scale globally

2. Blogs are rarely written, but often read

3. Some of the website is purely static files, the rest is a dynamic REST API

4. Caching must be implemented where possible

5. Any new user that subscribes should receive a welcome email

6. Any photo uploaded to the blog should have a thumbnail generated

 

[ 1. Serving static content, globally ]

※ Use CloudFront to serve a regional Amazon S3 bucket globally

 

[ 2. Serving static content, globally, securely ]

An S3 bucket policy grants access only to the OAI (Origin Access Identity)

Clients cannot access Amazon S3 directly; they can reach S3 only through CloudFront

 

[ 2-2. Serving static content, globally, securely ]

[ Sending Email ]

When data changes, DynamoDB Streams invokes a Lambda function,

and the Lambda function (with an IAM role that allows it to use SES (Simple Email Service)) sends the email through Amazon SES
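
The welcome email flow can be sketched as below. It is a sketch under assumptions: the `email` attribute name, the sender address, and the message text are hypothetical, and the stream is assumed to include new images (view type `NEW_IMAGE` or `NEW_AND_OLD_IMAGES`). The `send` parameter is there only so the parsing logic can be exercised without AWS.

```python
from typing import Callable, Dict, List


def new_subscriber_emails(event: Dict) -> List[str]:
    """Return the email attribute of every item newly inserted in this batch."""
    emails = []
    for record in event.get("Records", []):
        if record.get("eventName") != "INSERT":
            continue  # updates and deletes do not get a welcome email
        new_image = record.get("dynamodb", {}).get("NewImage", {})
        # Stream images use DynamoDB's typed format: {"email": {"S": "a@b.c"}}.
        email = new_image.get("email", {}).get("S")
        if email:
            emails.append(email)
    return emails


def send_welcome_email(address: str) -> None:
    """Send one email through SES; needs boto3 and a role allowing ses:SendEmail."""
    import boto3  # imported lazily so the parsing above runs without AWS

    boto3.client("ses").send_email(
        Source="welcome@example.com",  # hypothetical verified sender
        Destination={"ToAddresses": [address]},
        Message={
            "Subject": {"Data": "Welcome!"},
            "Body": {"Text": {"Data": "Thanks for subscribing."}},
        },
    )


def lambda_handler(event, context,
                   send: Callable[[str], None] = send_welcome_email):
    """Entry point: email every address inserted in this stream batch."""
    emails = new_subscriber_emails(event)
    for address in emails:
        send(address)
    return {"sent": len(emails)}
```

Filtering on `INSERT` means that later edits to a user record do not send a second welcome email.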

 

[ Making Thumbnail on photo added ]

When a client uploads an image to S3, directly or through CloudFront, the upload triggers a Lambda function,

and the Lambda function generates a thumbnail and uploads it to S3. (S3 can also notify SQS/SNS for additional processing)
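
The event handling for this flow can be sketched as below. The `thumbnails/` prefix is a hypothetical naming choice, and the actual image resizing (for example with an imaging library) is left out. The sketch only shows how to read the S3 notification and avoid re-triggering on the function's own output.

```python
from typing import Dict, List, Tuple
from urllib.parse import unquote_plus

THUMBNAIL_PREFIX = "thumbnails/"  # hypothetical prefix for generated files


def thumbnail_jobs(event: Dict) -> List[Tuple[str, str, str]]:
    """Return (bucket, source_key, thumbnail_key) for each uploaded photo."""
    jobs = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        if key.startswith(THUMBNAIL_PREFIX):
            continue  # skip our own output so the function does not loop
        jobs.append((bucket, key, THUMBNAIL_PREFIX + key))
    return jobs
```

Writing thumbnails to a separate bucket is another way to avoid the loop, at the cost of managing one more bucket.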

 

[ AWS Hosted Website Summary ] 

- We've seen static content being distributed using CloudFront with S3

- The REST API was serverless and didn't need Cognito because it is public

- We leveraged a DynamoDB Global Table to serve the data globally (could use Aurora Global Database)

- We enabled DynamoDB streams to trigger a Lambda function

- The Lambda function had an IAM role that allows it to use SES (Simple Email Service)

- SES(Simple Email Service) was used to send emails in a serverless way

- S3 can trigger SQS/SNS/Lambda to notify of events

 

 


[ Mobile Application : My Todo List ]

1. We want to create a mobile application with the following requirements

2. Expose as REST API with HTTPS

3. Serverless architecture

4. Users should be able to directly interact with their own folder in S3

5. Users should authenticate through a managed serverless service

6. The users can write and read to-dos, but they mostly read them

7. The database should scale, and have some high read throughput

 

[ 1. Mobile App : REST API layer ]

 

[ 2. Mobile App : giving users access to S3 ]

※ Note : do not store AWS credentials on the mobile client; use temporary credentials obtained through Cognito and STS

 

[ 3. Mobile app : high read throughput, static data ]

 

- Serverless REST API : HTTPS, API Gateway, Lambda, DynamoDB

- Using Cognito to generate temporary credentials with STS to access S3 bucket with restricted policy. App users can directly access AWS resources this way. Pattern can be applied to DynamoDB, Lambda

- Caching the reads on DynamoDB using DAX

- Caching the REST requests at the API Gateway level

- Security for authentication and authorization with Cognito, STS
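
The "restricted policy" for per-user S3 folders can be sketched as below. This is an illustrative policy attached to the role that Cognito identities assume, and the bucket name is whatever the caller passes in. The `${cognito-identity.amazonaws.com:sub}` policy variable is resolved by IAM to the caller's identity ID, so each user only reaches their own prefix.

```python
def user_folder_policy(bucket: str) -> dict:
    """IAM policy that confines each Cognito identity to its own S3 prefix."""
    # Policy variable, substituted by IAM at request time (not by Python).
    identity = "${cognito-identity.amazonaws.com:sub}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing is allowed only under the user's own prefix.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [f"arn:aws:s3:::{bucket}"],
                "Condition": {"StringLike": {"s3:prefix": [f"{identity}/*"]}},
            },
            {
                # Reads and writes are allowed only on the user's own objects.
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": [f"arn:aws:s3:::{bucket}/{identity}/*"],
            },
        ],
    }
```

Calling `json.dumps(user_folder_policy("my-bucket"), indent=2)` produces the JSON document to attach to the role.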

 

 


Serverless

- Serverless is a new paradigm in which the developers don't have to manage servers anymore

- just deploy code, just deploy functions

- Serverless == FaaS(Function as a Service)

- Serverless was pioneered by AWS Lambda but now also includes anything that's managed such as databases, messaging, storage, etc.

- Serverless doesn't mean there are no servers

 

Serverless in AWS 

: AWS Lambda, DynamoDB, AWS Cognito, AWS API Gateway, S3, Fargate...

 

[ Why use AWS Lambda (compare Amazon EC2 with AWS Lambda) ] 

Amazon EC2

- Virtual Servers in the Cloud

- Limited by RAM and CPU

- Continuously running

- Scaling means intervention to add /remove servers

AWS Lambda

- Virtual functions - no servers to manage

- Limited by time - short executions

- Run on-demand

- Scaling is automated

 

[ Benefits of AWS Lambda ]

- Easy Pricing

  Pay per request and compute time

  Free tier of 1,000,000 AWS Lambda requests and 400,000 GB-seconds of compute time

- Integrated with the whole AWS suite of services

- Integrated with many programming languages 

- Easy monitoring through AWS CloudWatch

- Easy to get more resources per function (up to 10GB of RAM)

- Increasing RAM will also improve CPU and network

 

[ AWS Lambda language support ]

- Node.js (JavaScript)

- Python

- Java

- C#

- Go

- Ruby

- Custom Runtime API 

- Lambda Container Image

  The container image must implement the Lambda Runtime API

  ECS / Fargate is preferred for running arbitrary Docker images

 

[ AWS Lambda Integrations Main ones ]

- API Gateway

- Kinesis

- DynamoDB

- S3

- CloudFront

- CloudWatch Events / EventBridge

- CloudWatch Logs

- SNS

- SQS

- Cognito

 

[ Example : Serverless Thumbnail creation ]

 

[ Example : Serverless CRON job ]

 

[ AWS Lambda Pricing ]

- https://aws.amazon.com/lambda/pricing/ 

- Pay per call

  first 1,000,000 requests are free

  $0.20 per 1 million requests 

- Pay per duration : 

  400,000 GB-seconds of compute time per month is free

  == 400,000 sec if function is 1GB RAM

  == 3,200,000 seconds if function is 128MB RAM

- After that about $1.00 for 60,000 GB-seconds (roughly $0.0000166667 per GB-second; the exact rate depends on region and architecture)

- It is usually very cheap to run AWS Lambda so it's very popular
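
The pricing rules above can be turned into a small estimate. This is a rough sketch, not an official calculator: the per-GB-second rate is an assumed x86 list price and should be checked against the pricing page, and the function ignores extras such as data transfer.

```python
FREE_REQUESTS = 1_000_000
FREE_GB_SECONDS = 400_000
PRICE_PER_MILLION_REQUESTS = 0.20
PRICE_PER_GB_SECOND = 0.0000166667  # assumed rate; check the pricing page


def monthly_cost(requests: int, avg_duration_s: float, memory_mb: int) -> float:
    """Estimate the monthly Lambda bill in USD after the free tier."""
    # Compute is billed in GB-seconds: duration times allocated memory in GB.
    gb_seconds = requests * avg_duration_s * (memory_mb / 1024)
    billable_requests = max(0, requests - FREE_REQUESTS)
    billable_gb_seconds = max(0.0, gb_seconds - FREE_GB_SECONDS)
    request_cost = billable_requests / 1_000_000 * PRICE_PER_MILLION_REQUESTS
    compute_cost = billable_gb_seconds * PRICE_PER_GB_SECOND
    return round(request_cost + compute_cost, 2)


# The free compute tier expressed in seconds for a 128 MB function.
free_seconds_at_128mb = FREE_GB_SECONDS / (128 / 1024)
```

For example, 3,000,000 requests of 1 second each at 512 MB use 1,500,000 GB-seconds, so `monthly_cost(3_000_000, 1.0, 512)` returns 18.73 with the assumed rate. `free_seconds_at_128mb` evaluates to 3,200,000, matching the figure in the list.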

 

[ AWS Lambda Limits to Know - per region ]

Execution :

1) Memory allocation : 128 MB to 10 GB (1 MB increments)

2) Maximum execution time : 900 sec (15min)

3) Environment variables (4 KB)

4) Disk capacity in the function container (in /tmp) : 512 MB by default (configurable up to 10 GB)

5) Concurrent executions : 1000 (can be increased)

Deployment :

1) Lambda function deployment size (compressed .zip) : 50MB

2) Size of uncompressed deployment (code + dependencies) : 250MB

3) Can use the /tmp directory to load other files at startup

4) Size of environment variables : 4 KB

 

 

 
