by Yasser Moghrabiah, 2 years ago
Meet regularly with your AWS Solutions Architect, Consultant, or Account Team, and consider which new services or features you could adopt to save money.
Balance the data transfer costs of your architecture with your high availability (HA) and reliability needs.
Analyze the situation and use AWS Direct Connect to save money and improve performance.
Architect to optimize data transfer (application design, WAN acceleration, etc.).
Use a CDN
Reconcile decommissioned resources based on either system or process.
Have a process in place to identify and decommission orphaned resources.
Design your system to gracefully handle instance termination as you identify and decommission non-critical or unrequired instances or resources with low utilization.
Finance-driven chargeback method: Use this to allocate instances and resources to cost centers (e.g., by tagging).
Use AWS Cost Explorer.
Notifications: Let key members of your team know if your spending moves outside well-defined limits.
Monitoring: Monitor usage and spend regularly using Amazon CloudWatch or a third-party provider (examples: Cloudability, CloudCheckr).
Cost-efficient architecture: Have a plan for both usage and spending (per unit, e.g., user, gigabyte of data).
Review Detailed Billing Reports: Have a standard process to load and interpret the Detailed Billing Reports.
Tag all resources: Tag all resources so that you can correlate changes in your bill with changes in your infrastructure and usage.
Track project lifecycle: Track, measure, and audit the life cycle of projects, teams, and environments to avoid using and paying for unnecessary resources.
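As a rough illustration of tag-based cost allocation, the sketch below sums spend per cost-center tag and lists resources that carry no tag. The line-item layout, the `CostCenter` tag key, and the resource IDs are assumptions made for the example; they are not the Detailed Billing Report schema.

```python
from collections import defaultdict

# Hypothetical billing line items: (resource_id, tags, cost_usd).
# This layout is an assumption for illustration only.
LINE_ITEMS = [
    ("i-0a1", {"CostCenter": "web"}, 120.0),
    ("i-0b2", {"CostCenter": "analytics"}, 340.5),
    ("vol-0c3", {}, 18.25),
    ("i-0d4", {"CostCenter": "web"}, 80.0),
]


def allocate_by_tag(line_items, tag_key="CostCenter"):
    """Sum cost per tag value and collect the IDs of untagged resources."""
    totals = defaultdict(float)
    untagged = []
    for resource_id, tags, cost in line_items:
        value = tags.get(tag_key)
        if value is None:
            # Untagged spend cannot be charged back, so track it separately.
            untagged.append(resource_id)
            totals["(untagged)"] += cost
        else:
            totals[value] += cost
    return dict(totals), untagged


totals, untagged = allocate_by_tag(LINE_ITEMS)
print(totals)
print(untagged)
```

The untagged list is the part worth reviewing regularly: any spend that lands there cannot be correlated with a team, project, or environment.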
Establish groups and roles (Example: Dev/Test/Prod); use AWS governance mechanisms such as IAM to control who can spin up instances and resources in each group. (This applies to AWS services or third-party solutions.)
Consider AWS CloudFormation, AWS Elastic Beanstalk, or AWS OpsWorks: Use AWS CloudFormation templates, AWS Elastic Beanstalk, or AWS OpsWorks to achieve the benefits of standardization and cost control.
Consider other application-level services: Use Amazon Simple Queue Service (SQS), Amazon Simple Notification Service (SNS), and Amazon Simple Email Service (SES) where appropriate.
Consider appropriate databases: Use Amazon Relational Database Service (RDS) (PostgreSQL, MySQL, SQL Server, Oracle) or Amazon DynamoDB (or other key-value stores, NoSQL alternatives) where appropriate.
Analyze Services: Analyze application-level services to see which ones you can use.
Consider Cost: Factor costs into region selection.
Automated Action: Make sure your architecture allows you to turn off unused instances (e.g., use Auto Scaling to scale down during non-business hours).
Sell Reserved Instances: As your needs change, sell Reserved Instances you no longer need on the Reserved Instance Marketplace, and purchase others.
Analyze Usage: Regularly analyze usage and purchase Reserved Instances accordingly.
Spot: Use Spot instances for select workloads.
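One way to turn usage analysis into a Reserved Instance decision is a break-even calculation. The sketch below is only an illustration: the prices are made up, and it assumes the reserved hourly fee is charged for every hour of the term whether or not the instance runs.

```python
HOURS_PER_YEAR = 8760


def break_even_utilization(on_demand_hourly, reserved_upfront,
                           reserved_hourly, hours=HOURS_PER_YEAR):
    """Fraction of the term an instance must run for the reservation to pay off.

    Assumes the reserved hourly fee applies to all hours in the term.
    """
    reserved_total = reserved_upfront + reserved_hourly * hours
    on_demand_full_term = on_demand_hourly * hours
    return reserved_total / on_demand_full_term


# Hypothetical prices, not taken from any AWS price list.
ratio = break_even_utilization(on_demand_hourly=0.10,
                               reserved_upfront=300.0,
                               reserved_hourly=0.03)
print(f"Reservation pays off above {ratio:.0%} utilization")
```

If measured utilization sits well below the break-even point, on-demand or Spot capacity is likely the cheaper choice for that workload.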
Profiled Applications: Profile your applications so you know when to use which type of Amazon EBS volume (magnetic, General Purpose (SSD), Provisioned IOPS). Use EBS-optimized instances only when necessary.
Custom Metrics: Load custom memory scripts and inspect memory usage using CloudWatch.
Amazon CloudWatch: Use CloudWatch to determine processor load.
Third-party products: Use third-party products such as CopperEgg or New Relic to determine appropriate instance types.
Match instance profile based on need: For example, match based on workload and instance description (compute, memory, or storage intensive).
Service-specific optimizations: Examples include minimizing I/O for Amazon EBS, avoiding uploading too many small files into Amazon S3, and using Spot instances extensively for Amazon EMR.
Appropriately provisioned: Appropriately provision throughput, sizing, and storage for services such as Amazon DynamoDB, Amazon EBS (provisioned IOPS), Amazon RDS, Amazon EMR, etc.
Time-based approach: Examples: follow the sun, turn off Dev/Test instances over the weekend, follow quarterly or annual schedules (e.g., Black Friday).
Queue-based approach: Run your own Amazon Simple Queue Service (SQS) queue and spin up/shut down instances based on demand.
Demand-based approach: Use Auto Scaling to respond to variable demand.
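The queue-based approach comes down to simple arithmetic: how many workers are needed to drain the current backlog within a target time. The sketch below shows that calculation only; the function name, the per-instance throughput, and the fleet bounds are assumptions, and a real system would read the queue depth from SQS and apply the result through Auto Scaling.

```python
import math


def desired_instances(queue_depth, msgs_per_instance_per_min,
                      target_drain_min, min_instances=1, max_instances=10):
    """Instances needed to drain the backlog within the target time, clamped."""
    if msgs_per_instance_per_min <= 0 or target_drain_min <= 0:
        raise ValueError("throughput and drain target must be positive")
    # Messages a single instance can process within the drain window.
    capacity_per_instance = msgs_per_instance_per_min * target_drain_min
    needed = math.ceil(queue_depth / capacity_per_instance)
    # Never go below the floor or above the ceiling of the fleet.
    return max(min_instances, min(max_instances, needed))


print(desired_instances(queue_depth=1200,
                        msgs_per_instance_per_min=100,
                        target_drain_min=5))
```

The clamp matters for cost: the upper bound caps spend during a spike, and the lower bound keeps at least one worker available when the queue is empty.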
Planned: Plan future proximity or caching solutions based on metrics and/or planned events.
Monitor: Monitor cache usage and demand over time.
Periodic Review: Review cache usage and demand over time.
Amazon CloudWatch monitoring: Use CloudWatch to monitor instances.
Third-party monitoring: Use third-party tools to monitor systems.
Alarm-based notifications: Plan for your monitoring systems to automatically alert you if metrics are out of safe bounds.
Review: Cyclically reselect a new instance type and size based on predicted resource needs.
Benchmarking: After each new instance type is released, carry out a load test of a known workload on AWS, and use that to estimate the best selection.
Load Test: After each relevant new instance type is released, deploy the latest version of the system on AWS, use monitoring to capture performance metrics, and then select based on a calculation of performance/cost.
Proactive Monitoring (Amazon CloudWatch monitoring): Use Amazon CloudWatch to monitor proximity and caching solutions.
Proactive Monitoring (third-party monitoring): Use third-party tools to monitor proximity and caching solutions.
Alarm-based notification: Plan for your monitoring systems to automatically alert you if metrics are out of safe bounds.
Trigger-based actions: Plan for alarms to cause automated actions to remediate or escalate issues.
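The alarm and trigger items above can be sketched as a small evaluation loop. This is an illustration only, not the CloudWatch API: the state names mirror CloudWatch alarm states, while the cache hit-ratio metric, the threshold, and the remediation message are assumptions.

```python
def alarm_state(datapoints, threshold, periods=3):
    """Return 'ALARM' when the last `periods` datapoints are all below threshold."""
    if len(datapoints) < periods:
        return "INSUFFICIENT_DATA"
    recent = datapoints[-periods:]
    return "ALARM" if all(value < threshold for value in recent) else "OK"


def triggered_action(state):
    """Map an alarm state to a (hypothetical) remediation or escalation step."""
    if state == "ALARM":
        return "escalate: notify on-call and review cache sizing"
    return "no action"


# Hypothetical cache hit ratios sampled once per period.
hit_ratios = [0.95, 0.70, 0.60, 0.50]
state = alarm_state(hit_ratios, threshold=0.80)
print(state, "->", triggered_action(state))
```

Requiring several consecutive breaching datapoints is a common way to avoid alerting on a single noisy sample.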
Policy/Reference Architecture: Select instance type and size based on predicted resource needs and an internal governance standard.
Cost/Budget: Select instance type and size based on predicted resource needs and internal cost controls.
Benchmarking: Load test a known workload on AWS and use that to estimate the best selection (testing a known performance benchmark vs. a known workload).
Guidance from AWS or from an APN Partner: Select a proximity and caching solution based on best practice advice.
Load Test: Deploy the latest version of your system on AWS using different instance types and sizes, use monitoring to capture performance metrics, and then make a selection based on a calculation of performance/cost.
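The "calculation of performance/cost" mentioned in the load-test item can be as simple as ranking candidates by throughput per dollar. The sketch below assumes made-up instance type names, load-test results, and prices; it is one possible calculation, not a prescribed method.

```python
# Hypothetical load-test results: sustained requests/sec and hourly price.
RESULTS = {
    "type-a": {"rps": 900.0, "usd_per_hour": 0.10},
    "type-b": {"rps": 2000.0, "usd_per_hour": 0.20},
    "type-c": {"rps": 3200.0, "usd_per_hour": 0.40},
}


def best_by_perf_per_cost(results, min_rps=0.0):
    """Pick the type with the highest rps per dollar that meets min_rps."""
    candidates = {name: r for name, r in results.items() if r["rps"] >= min_rps}
    if not candidates:
        raise ValueError("no instance type meets the minimum throughput")
    return max(candidates,
               key=lambda name: candidates[name]["rps"]
               / candidates[name]["usd_per_hour"])


print(best_by_perf_per_cost(RESULTS))                # best ratio overall
print(best_by_perf_per_cost(RESULTS, min_rps=3000))  # must sustain 3000 rps
```

The `min_rps` floor reflects that the cheapest ratio is not always acceptable: a type must first meet the workload's performance requirement before cost efficiency is compared.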
Planned: Plan for future capacity and throughput based on metrics and/or planned events.
Automated: Automate against metrics.
Amazon CloudWatch monitoring: Use CloudWatch to monitor databases.
Third-party monitoring: Use third-party tools to monitor databases.
Periodic review: Periodically review your monitoring dashboards.
Alarm-based notifications: Plan to have your monitoring systems automatically alert you if metrics are out of safe bounds.
Trigger-based actions: Plan to have alarms cause automated actions to remediate or escalate issues.
Review: Cyclically reselect a new instance type and size based on predicted resource needs.
Load Test: After each relevant new instance type is released, deploy the latest version of the system on AWS, use monitoring to capture performance metrics, and then make a selection based on a calculation of performance/cost.
Policy/Reference Architecture: Select instance type and size based on predicted resource needs and an internal governance standard.
Cost/Budget: Select instance type and size based on predicted resource needs and internal cost controls.
Benchmarking: Load test a known workload on AWS and use that to estimate the best selection (testing a known performance benchmark vs. a known workload).
Guidance from AWS or from an APN Partner: Select a solution based on best practice advice.
Load Test: Deploy the latest version of your system on AWS using different instance types and sizes, use monitoring to capture performance metrics, and then make a selection based on a calculation of performance/cost.
Automated Recovery Implemented: Use AWS and/or third-party tools to automate system recovery.
DR Tested and Validated: Regularly test failover to DR to ensure RTO and RPO are met.
Service Limits: Request an increase of service limits with the DR site to accommodate a failover.
Configuration Drift: Ensure that Amazon Machine Images (AMIs) and the system configuration state are up-to-date at the DR site/region.
Disaster Recovery: Establish a DR strategy.
Objectives Defined: Define RTO and RPO.
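To make the RTO and RPO checks concrete, the sketch below compares the timestamps from one failover test against the defined objectives. The function name, timestamps, and objective values are assumptions for the example; recovery time is measured from failure to restored service, and the data-loss window from the last usable backup to the failure.

```python
from datetime import datetime, timedelta


def evaluate_dr_test(failure_at, last_backup_at, restored_at, rto, rpo):
    """Compare one DR test's recovery time and data-loss window with RTO/RPO."""
    recovery_time = restored_at - failure_at
    data_loss_window = failure_at - last_backup_at
    return {
        "recovery_time": recovery_time,
        "data_loss_window": data_loss_window,
        "rto_met": recovery_time <= rto,
        "rpo_met": data_loss_window <= rpo,
    }


# Hypothetical test run: 90 minutes to recover, last backup 30 minutes old.
result = evaluate_dr_test(
    failure_at=datetime(2024, 1, 1, 12, 0),
    last_backup_at=datetime(2024, 1, 1, 11, 30),
    restored_at=datetime(2024, 1, 1, 13, 30),
    rto=timedelta(hours=2),
    rpo=timedelta(minutes=15),
)
print(result["rto_met"], result["rpo_met"])
```

In this made-up run the recovery time fits the RTO but the backup interval is too long for the RPO, which points at backup frequency rather than the failover procedure.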
Notification: Plan to receive notifications of any significant events.
Monitoring: Continuously monitor the health of your system.
Auto Healing: Use automated capabilities to detect failures and perform an action to remediate.
Multi-AZ/Region: Distribute applications across multiple Availability Zones/regions.
Load Balancing: Use a load balancer in front of a pool of resources.
Periodic Recovery Testing: Validate that the backup process implementation meets RTO and RPO through a recovery test.
Backups are Secured and/or Encrypted: See the AWS Security Best Practices whitepaper.
Automated Backups: Use AWS features, AWS Marketplace solutions, or third-party software to automate backups.
Data is Backed Up: Back up important data using Amazon S3, Amazon EBS snapshots, or third-party software to meet RPO.
Change Management Automated: Automate deployments/patching.
Monitoring: Monitor your applications with Amazon CloudWatch or third-party tools.
Notification: Plan to receive notifications when significant events occur.
Automated Response: Use automation to take action when failure is detected, e.g., to replace failed components.
Review: Perform frequent reviews of the system based on significant events to evaluate the architecture.
Load Test: Adopt a load testing methodology to measure whether scaling activity will meet application requirements.
Automated scaling: Use automatically scalable services (e.g., Amazon S3, Amazon CloudFront, Auto Scaling, Amazon DynamoDB, AWS Elastic Beanstalk).
Leverage AWS Support APIs: Integrate the AWS Support API with your internal monitoring and ticketing systems.
Planned: Maintain an ongoing engagement/relationship with AWS Support or an APN Partner.
IP subnet allocation: Individual Amazon VPC IP address ranges should be large enough to accommodate an application’s requirements, including future expansion and the allocation of IP addresses to subnets across Availability Zones.
Non-overlapping private IP ranges: The IP address ranges and subnets in your virtual private cloud should not overlap each other, other cloud environments, or your on-premises environments.
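Both IP planning checks can be tried out offline with Python's standard `ipaddress` module. The sketch below is only an illustration: the network names, CIDR blocks, and Availability Zone labels are made up.

```python
import ipaddress
from itertools import combinations


def find_overlaps(cidrs):
    """Return pairs of named networks whose address ranges overlap."""
    nets = {name: ipaddress.ip_network(cidr) for name, cidr in cidrs.items()}
    return [(a, b) for a, b in combinations(nets, 2)
            if nets[a].overlaps(nets[b])]


def split_across_azs(vpc_cidr, az_names, new_prefix):
    """Assign one equally sized subnet per Availability Zone."""
    subnets = ipaddress.ip_network(vpc_cidr).subnets(new_prefix=new_prefix)
    return {az: str(net) for az, net in zip(az_names, subnets)}


# Hypothetical address plan.
PLAN = {
    "vpc-prod": "10.0.0.0/16",
    "vpc-dev": "10.1.0.0/16",
    "on-prem": "10.0.128.0/17",
}
print(find_overlaps(PLAN))
print(split_across_azs("10.0.0.0/16", ["az-a", "az-b", "az-c"], new_prefix=20))
```

Splitting a /16 into /20 subnets leaves thirteen of the sixteen blocks unassigned here, which is the kind of headroom the subnet allocation item asks for.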
Highly available connectivity to the system: Use highly available load balancing and/or proxies, a DNS-based solution, AWS Marketplace appliances, etc.
Highly available connectivity to AWS: Use multiple AWS Direct Connect (DX) circuits, multiple VPN tunnels, or AWS Marketplace appliances.
Be aware of fixed service limits: Be aware of unchangeable service limits and architect around them.
Set up automated monitoring: Implement tools, e.g., SDKs, to alert you when thresholds are being approached.
Monitor and manage limits: Evaluate your potential usage on AWS, increase your regional limits appropriately, and allow planned growth in usage.
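A limit monitor of the kind described above compares current usage with each limit and flags anything close to the ceiling. In the sketch below the limit names, values, and the 80% warning ratio are assumptions; a real tool would fetch usage and limits through an AWS SDK instead of hard-coding them.

```python
def limits_near_threshold(usage, limits, warn_ratio=0.8):
    """Return {limit_name: utilization} for limits at or above warn_ratio."""
    alerts = {}
    for name, limit in limits.items():
        used = usage.get(name, 0)
        if limit > 0 and used / limit >= warn_ratio:
            alerts[name] = round(used / limit, 2)
    return alerts


# Hypothetical regional limits and current usage.
LIMITS = {"ec2_instances": 20, "elastic_ips": 5, "vpcs": 5}
USAGE = {"ec2_instances": 17, "elastic_ips": 2, "vpcs": 5}

alerts = limits_near_threshold(USAGE, LIMITS)
print(alerts)
```

Running a check like this on a schedule gives enough lead time to request a limit increase before planned growth or a DR failover needs the capacity.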
Operating system or third-party application logs.
Other AWS service-specific log sources.
Amazon CloudWatch logs.
Amazon S3 bucket logs.
Amazon Virtual Private Cloud (VPC) flow logs.
Elastic Load Balancing (ELB) logs.
AWS CloudTrail.
Use of a solution from the AWS Marketplace or an APN Partner
Use of a custom AMI or configuration management tools (e.g., Puppet or Chef) that are secured by default.
Host-based intrusion detection controls are used for EC2 instances.
File integrity controls are used for EC2 instances.
Service-specific requirements are defined and used.
Resource requirements are defined for sensitive API calls, such as requiring MFA and encryption.
Periodic auditing of permissions.
Separation of duties.
Credentials configured with the least privilege
AWS Trusted Advisor checks are regularly reviewed.
Security testing is performed regularly.
Bastion host technique is used to manage the instances.
Private connectivity to a VPC is used (e.g., VPN, AWS Direct Connect, VPC peering, etc.)
Service-specific access controls are used (e.g., bucket policies).
Subnets and network ACLs are used appropriately.
Host-based firewalls with minimal authorizations are used
Trusted VPC access is via a private mechanism (e.g., Virtual Private Network (VPN), IPsec tunnel, AWS Direct Connect, AWS Marketplace solution, etc.).
The system runs in one or more VPCs.
Security groups with minimal authorizations are used to enforce role-based access.
AWS server-side techniques are used with AWS managed keys (for example, Amazon S3 SSE).
AWS Marketplace solution is being used (e.g., SafeNet, Trend Micro).
Use AWS CloudHSM
Appropriate key and credential rotation policy is being used
OS-specific controls are used for EC2 instances
IAM user credential is used, but not hardcoded into scripts and applications
IAM roles for Amazon EC2
Users, groups, and roles are clearly defined and granted only the minimum privileges needed to accomplish business requirements
A solution from the AWS Marketplace (e.g., Okta, Ping Identity) or from an APN Partner
IAM roles for cross-account access
Employee life-cycle policies are defined and enforced
AWS Security Token Service (STS)
Web Identity Federation
SAML integration
IAM users and groups
AWS Marketplace solution is being used.
An MFA hardware device is associated with the AWS root account.
The AWS root account credentials are used only for minimal required activities.
AWS Marketplace solution is being used.
Private connectivity (e.g., AWS Direct Connect).
VPN-based solution.
SSL or equivalent is used for communication.
SSL-enabled AWS APIs are used appropriately.
A solution from the AWS Marketplace or from an APN Partner.
Data at rest is encrypted using client-side techniques.
Data at rest is encrypted using AWS service-specific controls (for example, Amazon S3 SSE, Amazon EBS encrypted volumes, Amazon Relational Database Service (RDS) Transparent Data Encryption (TDE)).