S3 One Zone-IA (Infrequent Access) and Standard-IA are for data that is rarely accessed but requires rapid access when needed. Unlike other S3 storage classes, which store data in a minimum of three AZs, One Zone-IA stores data in a single AZ, which reduces its cost.
S3 Glacier is for archival storage, and S3 is for durable data storage.
To adjust storage cost (not archival storage) automatically based on how frequently data is accessed, use S3 Intelligent-Tiering. Frequently accessed data stays in a higher-cost tier.
S3 Transfer Acceleration speeds up data transfers over long distances between on-premises or remote clients and a single S3 bucket.
If you want data to be encrypted when it is uploaded and stored, you can use server-side encryption (SSE). The options include:
SSE-KMS
SSE-S3
SSE-C
SSE-KMS (Key Management Service) provides key management features such as automatic or manual rotation and logging with an audit trail. SSE-S3 (S3 managed keys) does not give you control over key rotation. SSE-C (customer-provided keys) does not log key usage. Restricting KMS access does not prevent objects from being deleted.
Object Lock (modes: governance and compliance) prevents objects from being overwritten or deleted for a specified period. In compliance mode, no user, including the root user, can overwrite or delete a protected object version until the retention period expires. In governance mode, users who have the s3:BypassGovernanceRetention permission can still delete protected objects.
Once you version-enable a bucket, it can never return to an unversioned state.
S3 event notifications can trigger Lambda functions.
EBS (Elastic Block Store)
It is like an SD card or USB drive attached to an instance; a volume generally cannot be accessed by multiple instances at the same time.
EFS (Elastic File System)
provides an NFS file system that multiple EC2 instances can mount and access at the same time.
If data must be kept for a long time, EFS becomes expensive, so it is better to archive it in S3 (although S3 cannot be mounted on EC2 as a file system because it is object storage).
FSx for Windows File Server
supports Windows file server farms and the use of Microsoft's DFS (Distributed File System). It is accessible over the industry-standard SMB (Server Message Block) protocol. It is built on Windows Server and delivers administrative features such as Microsoft AD (Active Directory) integration.
If an AD already exists on premises, join the FSx file system to that on-premises AD when migrating to AWS.
DataSync
can copy data between the following storage systems and services:
NFS (Network File System) shares
SMB (Server Message Block) shares
self-managed object storage
AWS Snowcone
Amazon S3 buckets
Amazon EFS (Elastic File System) file systems
Amazon FSx for Windows File Server file systems
Storage Gateway
connects on-premises environments to cloud storage.
It can be used with SMB shares and stores the data in S3.
New Name | Old Name | Interface | Use Case
File Gateway | None | NFS, SMB | Allow on-prem or EC2 instances to store objects in S3 via NFS or SMB mount points
Volume Gateway Stored Mode | Gateway-Stored Volumes | iSCSI | Asynchronous replication of on-prem data to S3
Volume Gateway Cached Mode | Gateway-Cached Volumes | iSCSI | Primary data stored in S3 with frequently accessed data cached locally on-prem
Tape Gateway | Gateway-Virtual Tape Library | iSCSI | Virtual media changer and tape library for use with existing backup software
Tape Gateway
Tape Gateway lets you replace physical tapes on premises with virtual tapes in AWS without changing existing backup workflows. It stores your tape data in a virtual tape library in S3.
File Gateway
It allows on-premises or EC2 instances to store objects in S3 via the NFS or SMB protocol.
Hardware Appliance
It comes pre-loaded with Storage Gateway software and provides the CPU, memory, network, and SSD cache resources needed to create and configure a File Gateway, Volume Gateway, or Tape Gateway.
EMR (Elastic MapReduce)
is a managed cluster platform that simplifies running big data frameworks such as Apache Hadoop and Spark.
A runtime role is an AWS Identity and Access Management (IAM) role that you can specify when you submit a job or query to an EMR cluster. The job or query uses the runtime role to access AWS resources, so access does not have to be controlled at the instance level.
Each cluster in EMR must have a service role and a role for the EC2 instance profile.
Computing
AWS Compute Optimizer provides cost-optimization recommendations for AWS resources, including EC2 instances, Auto Scaling groups, and EBS volumes.
Lambda
Lambda is not optimal for large ML models and batch requests because of its constraints on memory and runtime. It has a maximum execution duration of 15 minutes and is not appropriate for stateful or long-running jobs.
Lambda function
automatically handles scaling the number of execution environments. You can set a concurrency limit.
API Gateway
creating a usage plan limits the number of requests a client can make (throttling and quotas).
EC2
An instance store maximizes I/O performance for processing data.
Detailed monitoring on EC2 offers metrics at 1-minute intervals.
On-Demand Capacity Reservations reserve compute capacity in a specific AZ for any duration.
An EC2 Auto Scaling group (ASG) lets an application run across multiple subnets and multiple AZs in the same VPC. It cannot launch instances in multiple Regions.
Reserved Instances offer significant savings for steady, predictable workloads. Spot Instances use spare EC2 capacity at discounts of up to 90% compared to On-Demand.
An Elastic IP (EIP) is a static IP address.
A target tracking policy keeps a metric at a target value, while a simple scaling policy scales in only one direction, for example adding an instance if CPU exceeds 50%.
An AMI (Amazon Machine Image) is a template for creating EC2 instances.
An instance store volume is ephemeral: its data persists across a reboot but is lost if the instance stops, terminates, or hibernates.
Snowball Edge
is for one-time migration scenarios; the device is physically shipped to an AWS Region.
has local storage (80 TB) and compute power.
Amplify is aimed at full-stack applications rather than plain static websites.
Application Migration Service
offers a highly automated lift-and-shift migration.
Install the AWS Replication Agent on each VM to replicate the VMs to AWS. After replication completes, launch test instances to test the workloads. Before the final cutover, stop all operations on the VMs. Finally, launch the migrated instances in AWS to perform the cutover.
EKS
Fargate
provides on-demand, right-sized compute for EKS clusters, allowing a company to run pods without managing the underlying infrastructure.
SNS (Simple Notification Service)
It is a fully managed pub/sub (publisher/subscriber) messaging system. It cannot be used for real-time processing of data.
SQS (Simple Queue Service)
You can scale an Auto Scaling group in response to the activity in an SQS queue.
A queue can retain messages for up to 14 days.
The ReceiveMessage API hides a received message from other instances while the first instance processes it.
SQS long polling returns a response as soon as a message arrives (for example, when an image was updated). ReceiveMessageWaitTimeSeconds sets how long a request waits (20 seconds at most) before responding even if no message arrives.
The visibility timeout prevents other consumers from receiving a message that is being processed. Once it expires, the message becomes available for processing by another instance.
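The visibility timeout behaviour described above can be illustrated with a toy in-memory model. This is only a sketch of the idea, not the SQS API: the class `ToyQueue`, its methods, and the explicit `now` argument are all hypothetical names made up for the illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ToyQueue:
    """Toy model of a queue with a visibility timeout (not the real SQS API)."""
    visibility_timeout: float = 30.0
    # message id -> [body, time until which the message is hidden]
    messages: dict = field(default_factory=dict)

    def send(self, msg_id, body):
        self.messages[msg_id] = [body, 0.0]

    def receive(self, now):
        """Return the first visible message and hide it for the timeout period."""
        for msg_id, entry in self.messages.items():
            if entry[1] <= now:
                entry[1] = now + self.visibility_timeout
                return msg_id, entry[0]
        return None

    def delete(self, msg_id):
        """A consumer deletes the message once processing has finished."""
        self.messages.pop(msg_id, None)


q = ToyQueue(visibility_timeout=30.0)
q.send("m1", "resize image")
print(q.receive(now=0.0))   # first consumer gets the message
print(q.receive(now=10.0))  # hidden from other consumers: None
print(q.receive(now=30.0))  # timeout expired, message is visible again
```

If the first consumer had called `delete("m1")` before the timeout expired, the third call would have returned `None` as well.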
Connect
Amazon Connect can send SMS messages.
EventBridge
is for complex event routing.
It can trigger Lambda to start and stop RDS.
KMS (Key Management Service)
It provides a managed service for creating and controlling encryption keys and encryption at rest in AWS services.
Elastic Beanstalk
It allows you to simply upload an application, and it automatically handles capacity provisioning, scaling, load balancing, and monitoring. It cannot be used to deploy your infrastructure to different Regions.
WAF (Web Application Firewall)
It protects against exploits such as SQL injection and XSS (cross-site scripting) and can be used with the following AWS services:
Amazon CloudFront
Amazon API Gateway
Application Load Balancer
AWS AppSync GraphQL API
Amazon Cognito
AWS App Runner
AWS Verified Access
OpsWorks
It automates operations using Chef and Puppet.
Database
RDS
supports Multi-AZ deployment. Supported database engines include:
Amazon Aurora
MySQL
MariaDB
Oracle
SQL Server
PostgreSQL
Aurora Serverless is part of RDS.
Encryption must be enabled when creating the RDS DB instance because the encryption setting cannot be changed afterward.
To encrypt connections to an RDS database, download the root or intermediate SSL/TLS certificate. When an instance is created, RDS installs a certificate signed by a CA (Certificate Authority) whose CN (Common Name) is the instance endpoint.
It can configure automated backups, setting the retention period and enabling point-in-time recovery. To keep backups longer than the 35-day maximum (for example, 120 days), use AWS Backup. AWS Backup cannot directly handle point-in-time recovery.
It doesn’t support automatic start and stop.
Aurora
Each read replica is assigned a priority tier (0-15) used for failover; the lowest-numbered tier has the highest priority. Aurora offers the lowest replication lag and the largest maximum database size, 128 TB, compared with other RDS engines.
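As a rough illustration of how priority tiers decide which replica is promoted on failover, the sketch below picks the replica with the lowest tier number and breaks ties by size (Aurora promotes the largest replica within a tier). The `Replica` class and the `size_rank` field are hypothetical names used only for this example.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    name: str
    tier: int       # promotion tier: 0 is the highest priority, 15 the lowest
    size_rank: int  # hypothetical ordinal for instance size; larger means bigger


def pick_failover_target(replicas):
    """Lowest tier number wins; within a tier, the larger instance wins."""
    return min(replicas, key=lambda r: (r.tier, -r.size_rank))


replicas = [
    Replica("replica-a", tier=1, size_rank=3),
    Replica("replica-b", tier=0, size_rank=1),
    Replica("replica-c", tier=0, size_rank=2),
]
print(pick_failover_target(replicas).name)  # replica-c
```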
DMS (Database Migration Service)
To encrypt an existing RDS DB instance, take a snapshot, create an encrypted copy of it, and restore the copy to a new RDS DB instance. The application must be updated to use the new endpoint. DMS (Database Migration Service) is a good way to synchronize large amounts of data.
DynamoDB
DynamoDB is a NoSQL database.
Fargate provides a serverless compute engine for containers.
EventBridge provides scheduling without intervention.
DAX (DynamoDB Accelerator) is an in-memory caching service for DynamoDB.
TTL (Time To Live) automatically expires data.
Kinesis Data Streams
for streaming data ingestion
Kinesis Data Streams collects data from many devices and divides it across shards. Each device can use its own partition key, which determines the shard its records go to. Consumers process the data in real time.
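The link between a partition key and a shard can be sketched as follows. Kinesis hashes the partition key with MD5 into a 128-bit integer and routes the record to the shard whose hash-key range contains that value. The sketch assumes the shards split the hash space evenly, which is not guaranteed after resharding; the function name `shard_for` and the device IDs are made up for the example.

```python
import hashlib

HASH_SPACE = 2 ** 128  # MD5 produces a 128-bit value


def shard_for(partition_key: str, shard_count: int) -> int:
    """Map a partition key to a shard index, assuming evenly split hash-key ranges."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    hash_key = int.from_bytes(digest, "big")
    return hash_key * shard_count // HASH_SPACE


for device_id in ["device-001", "device-002", "device-003"]:
    print(device_id, "-> shard", shard_for(device_id, shard_count=4))
```

Records with the same partition key always land on the same shard, which is what keeps each device's records in order.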
ElastiCache
It offers a serverless option and speeds up data processing by providing high-performance in-memory caching.
ElastiCache for Memcached is an in-memory cache for your relational or NoSQL database and cannot be used as a cache to serve static content from S3.
Glue
is for ETL (Extract, Transform, Load) jobs, which can use SSE-KMS (server-side encryption with KMS keys). Job bookmarks keep track of processed data so that the same data is not reprocessed.
Redshift
is a DWH (data warehouse) optimized for running large reports and analysis workloads, providing scalability and fast query performance.
Identity
SCP (Service Control Policy)
It applies a policy across multiple member accounts in AWS Organizations.
sets permission guardrails for IAM users and roles, such as denying the creation of resources that lack a compliance-level tag
A tag policy standardizes tags, for example by requiring the use of the compliance-level tag.
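A deny-without-tag guardrail of the kind mentioned above might look like the following sketch, built as a Python dictionary and printed as JSON. The tag key `compliance-level` is taken from the text, but its exact spelling, the choice of `ec2:RunInstances` as the action, and the function name are assumptions made for illustration.

```python
import json


def deny_untagged_instances_scp(tag_key: str) -> dict:
    """Build an SCP that denies launching EC2 instances when the tag is missing."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyRunInstancesWithoutTag",
                "Effect": "Deny",
                "Action": "ec2:RunInstances",
                "Resource": "arn:aws:ec2:*:*:instance/*",
                # "Null": "true" matches requests where the tag is absent
                "Condition": {"Null": {f"aws:RequestTag/{tag_key}": "true"}},
            }
        ],
    }


scp = deny_untagged_instances_scp("compliance-level")
print(json.dumps(scp, indent=2))
```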
IAM Roles
provide temporary AWS credentials and support least-privilege access.
Organization
If you want to send notifications to administrators rather than root users, configure account alternate contacts.
The aws:PrincipalOrgID global condition key restricts access to principals from accounts within your AWS organization. It can be used in resource-based policies such as S3 bucket policies.
The root OU allows centralized control over which AWS services can be accessed across all accounts in the organization.
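As an example of the aws:PrincipalOrgID condition key in a bucket policy, the sketch below builds a policy that allows object reads only for principals that belong to one organization. The bucket name, the organization ID, and the function name are placeholders, not values from any real account.

```python
import json


def org_only_bucket_policy(bucket: str, org_id: str) -> dict:
    """Build a bucket policy that allows reads only from principals in one organization."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowGetFromOrgOnly",
                "Effect": "Allow",
                "Principal": "*",
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket}/*",
                "Condition": {"StringEquals": {"aws:PrincipalOrgID": org_id}},
            }
        ],
    }


# Placeholder bucket name and organization ID
policy = org_only_bucket_policy("example-bucket", "o-exampleorgid")
print(json.dumps(policy, indent=2))
```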
Control Tower
governs an AWS multi-account environment, for example by blocking activity in Regions that are not permitted.
Cognito
is a user directory and can authenticate and authorize users.
Identity pools require an IAM role; they provide temporary AWS credentials and integrate with user pools to secure access.
CloudTrail
It provides visibility into user activity by recording actions taken on your account.
It can deliver logs only to S3 and CloudWatch Logs.
Secrets Manager
is for storing credentials securely with automatic rotation.
STS (Security Token Service)
A web service that provides temporary, limited-privilege credentials for users such as IAM users.
ACM (AWS Certificate Manager)
It provisions and manages SSL/TLS certificates, which are used for server authentication and encryption in transit.
Networking
VPC (Virtual Private Cloud)
AWS reserves five IP addresses in each subnet (the first four and the last). Subnet sizes can range from /16 to /28.
For example, in a subnet with CIDR block 10.0.0.0/24, the first 24 bits (three 8-bit octets) are the network ID and the last octet is the host ID: 10.0.0 (network ID) | 0 (host ID). The reserved addresses are:
10.0.0.0: Network address.
10.0.0.1: Reserved by AWS for the VPC router.
10.0.0.2: Reserved by AWS for the DNS server. (For a VPC with multiple CIDR blocks, the DNS server is located in the primary CIDR.)
10.0.0.3: Reserved by AWS for future use.
10.0.0.255: Broadcast address.
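The reserved addresses listed above can be derived for any subnet with Python's standard `ipaddress` module. This is a small sketch; the function names are made up for the example.

```python
import ipaddress


def reserved_addresses(cidr: str) -> dict:
    """Return the five addresses AWS reserves in a subnet, keyed by purpose."""
    net = ipaddress.ip_network(cidr)
    base = net.network_address
    return {
        "network": str(base),
        "vpc_router": str(base + 1),
        "dns": str(base + 2),
        "future_use": str(base + 3),
        "broadcast": str(net.broadcast_address),
    }


def usable_host_count(cidr: str) -> int:
    """Total addresses in the subnet minus the five reserved ones."""
    return ipaddress.ip_network(cidr).num_addresses - 5


print(reserved_addresses("10.0.0.0/24"))
print(usable_host_count("10.0.0.0/24"))  # 256 - 5 = 251
print(usable_host_count("10.0.0.0/28"))  # 16 - 5 = 11
```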
VPC Peering
A VPC peering connection provides private networking between two VPCs. It cannot be used to connect to on-premises networks or to public AWS services such as DynamoDB.
Site-to-Site VPN
A Site-to-Site VPN connection is a virtual (not physical) connection that links your on-premises network to a VPC.
Direct Connect
A Direct Connect connection can serve multiple VPCs when a Direct Connect gateway is attached. On-premises clients can also reach AWS PrivateLink endpoints over it. Direct Connect is a physical, dedicated connection. It involves significant monetary investment and takes at least a month to set up between your intranet and a VPC.
Only data transferred between resources in the same Region is free (transfers to on-premises are also charged).
PrivateLink
PrivateLink = VPC endpoint + VPC endpoint service
There are two types of VPC endpoint:
gateway endpoint
interface endpoint
A VPC endpoint allows only private connections from the VPC to supported AWS services.
Gateway endpoint supports connections to S3 and DynamoDB from VPCs. It needs a route table entry for the endpoint.
Interface endpoint
Interface endpoint supports connections to over 50 types of services including SaaS in AWS Marketplace.
NAT Gateway
A NAT (Network Address Translation) gateway lets resources in a private subnet connect to the internet, allowing outbound internet access while blocking inbound access.
Virtual Private Gateway
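It is the AWS-side endpoint of a Site-to-Site VPN or Direct Connect connection and attaches to a single VPC; it does not route traffic between VPCs.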
GWLB (Gateway Load Balancer)
is for deploying security appliances and centralizing traffic management
NLB (Network Load Balancer)
NLB works at Layer 4 (TCP/UDP), routing based on IP address and port.
ALB (Application Load Balancer)
A load balancer for content-based routing at Layer 7 (HTTP/HTTPS), based on content type, cookie data, custom headers, user location, and so on (but not country).
To expose backend EC2 instances that run in private subnets, create public subnets in the same AZs as those private subnets, then associate the public subnets with the load balancer.
CloudFront
It is a CDN(content delivery network) service that delivers static and dynamic web content, video streams, and APIs. It improves resiliency to handle spikes in traffic.
Create an origin group with two origins (primary and secondary) to switch automatically to the secondary when a failure occurs.
Choosing a price class that covers only certain regions (such as the U.S. and Canada) reduces costs for a system used only in those regions.
A custom origin can point to an on-premises server.
Global Accelerator
It improves the availability and performance of your applications. It uses the AWS global network to route TCP/UDP traffic to a healthy application endpoint in the closest AWS Region. It is a good fit for both HTTP and non-HTTP use cases, such as gaming (UDP), IoT (MQTT), or Voice over IP.
For example, you can add ALBs, NLBs, and EC2 instances as endpoints and manage services across multiple Regions.
A CloudWatch composite alarm triggers based on the state of multiple alarms, such as alarms on CPU and disk IOPS.
CloudWatch Synthetics monitors application availability and performance.
GuardDuty
It offers threat detection for your AWS accounts, workloads, and data stored in S3.
Disabling the service will delete all remaining data. Suspending the service will stop the service, but does not delete your data.
QuickSight
It provides visualization based on collected data from cloud services.
Athena
It can query and analyze data directly in S3. It is serverless and supports SQL and Apache Spark.
Config
AWS Config monitors AWS resources for unauthorized changes, for example detecting resources in Regions that are not permitted.
Inspector
Amazon Inspector assesses EC2 instances for vulnerabilities and deviations.
Cost Explorer
It provides reports for analyzing costs by different dimensions such as service, user, or resource, unlike the Billing Dashboard, which shows only total costs.