Optimize for better I/O throughput and faster migrations using Amazon EC2

06 May 2019

Since moving your workload to AWS, is it running differently? Do your backups or nightly updates take longer than they did before? Have you needed to create a workaround? Is your migration or new release taking longer to complete than you planned? If so, this blog post is for you.

Here at nClouds, we’ve been managing and executing migration projects on AWS for enterprises and startups for more than a decade. In fact, our CEO & Co-founder JT Giri started working with Amazon Elastic Compute Cloud (Amazon EC2) when it was still in beta in 2006.

When a client needs secure, resizable compute capacity in the cloud and wants the ability to obtain and boot new server instances in minutes, we include Amazon EC2 as part of their overall solution. (Here are links to relevant case studies if you’d like to see some examples: 6Connex, Avaya, Informatica, LendingHome, nDimensional, Prodea, TetraScience.)

In this blog post, I’ll talk about how to optimize Amazon EC2 for your workload. First, I’ll describe what Amazon EC2 is, and then recommend steps you can take to enhance the performance of EC2 instances, based on the work we’ve done for our clients.

What is Amazon EC2?

Amazon EC2 is a web service that provides secure, resizable compute capacity in the cloud. You can get started with Amazon EC2 by using the AWS Management Console, the AWS Command Line Interface (AWS CLI), or AWS Software Development Kits (SDKs).

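When you launch Amazon EC2 instances, you can store root device data on Amazon Elastic Block Store (Amazon EBS). Data on an EBS root volume persists while the instance is stopped, and it can also outlive the instance if you set the volume’s DeleteOnTermination attribute to false (by default, the root volume is deleted when the instance is terminated).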

Benefits:

  • Improve your scalability
    • Increase or decrease capacity within minutes.
    • Commission one, hundreds, or even thousands of server instances simultaneously, within your account’s service limits.
    • Scale up or down to maximize performance and minimize cost.
  • Speed up migrations by using AWS Systems Manager (together with AWS CloudFormation) to create resource groups of EC2 instances and automatically apply settings and configurations to the new system with little manual work.
  • Completely control instances
    • Control root access.
    • Stop any instance (retaining data on the boot partition) and then restart the same instance using web service APIs.
  • Customize AWS hosting service
    • Select instance types, operating systems, and software packages.
    • Select a configuration of memory, CPU, instance storage, and boot partition size.
  • Integrate with other AWS services such as Amazon Simple Storage Service (Amazon S3), Amazon Relational Database Service (Amazon RDS), and Amazon Virtual Private Cloud (Amazon VPC).
  • Enhance reliability and security
    • Rapidly and predictably commission replacement instances, with a Service Level Agreement commitment of 99.99% availability for each Amazon EC2 Region.
    • Get enhanced security and robust networking functionality from the integration of EC2 with Amazon VPC.

Step One: Optimize Amazon Elastic Block Store (Amazon EBS) I/O

First, let’s take a deeper look at how Amazon Elastic Block Store (Amazon EBS) is used in concert with Amazon EC2.

  1. Amazon EBS provides network-connected block storage volumes for Amazon EC2 instances. Benefits include:
    • High availability: Each Amazon EBS volume is automatically replicated within its Availability Zone.
    • Consistent low-latency performance needed to run your workloads.
    • Fast scalability.
  2. Amazon EBS volumes are persistent: they exist independently of the life of the instance they are attached to. Replication within the Availability Zone protects you from component failure, which gives you durability as well as high availability. You can scale your usage up or down within minutes while paying only for what you provision.
  3. Amazon EBS offers several volume types, including General Purpose SSD (gp2), Provisioned IOPS SSD (io1), Throughput Optimized HDD (st1), and Cold HDD (sc1). This post focuses on Provisioned IOPS.
    • IOPS (input/output operations per second) is a common performance measurement used to benchmark computer storage devices like hard disk drives (HDD), solid state drives (SSD), and storage area networks (SAN). As with any benchmark, IOPS numbers published by storage device manufacturers do not guarantee real-world application performance.
    • Provisioned IOPS SSD (io1) is the highest-performance SSD volume type. It is designed for mission-critical low-latency or high-throughput workloads and is best suited to critical business applications that require sustained IOPS performance.
  4. When attached to EBS-optimized instances, Provisioned IOPS volumes are designed to deliver within 10% of the Provisioned IOPS performance 99.9% of the time in a given year (see endnote 1). So, for example, a volume provisioned with 500 IOPS should deliver at least 450 IOPS 99.9% of the time. That said, the performance of Provisioned IOPS SSD (io1) volumes depends on several factors:
    • Workload: Match the workload demand on the volume to the IOPS that you provisioned.
    • I/O throughput: If the I/O chunks are very large, you might get fewer IOPS than you provisioned.
    • Use of snapshots: Look at your snapshot frequency and retention settings to avoid incurring unnecessary EBS snapshot charges.
    • Queue length: Check the average queue length to ensure that your application is not trying to drive more IOPS than you provisioned.
    • EBS optimization: Provisioned IOPS volumes deliver expected performance only when they are attached to an EBS-optimized instance.
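
The 10% design target and the effect of I/O size described above come down to simple arithmetic. The Python sketch below is illustrative only: the function names are hypothetical, and the 320 MiB/s throughput cap is an assumed number chosen for the example, not a published limit for any particular volume or instance.

```python
def guaranteed_iops_floor(provisioned_iops, tolerance_pct=10):
    """Lowest IOPS expected under a 'within 10% of provisioned' target."""
    return provisioned_iops * (100 - tolerance_pct) / 100


def hours_outside_target(availability=0.999, hours_per_year=365 * 24):
    """Hours per year during which the target may be missed."""
    return hours_per_year * (1 - availability)


def achievable_iops(provisioned_iops, io_size_kib, throughput_cap_mib_s):
    """IOPS you can actually drive once a throughput cap is considered.

    Large I/O requests use up throughput quickly, so the effective IOPS
    is the smaller of the provisioned value and cap / request size.
    """
    cap_from_throughput = throughput_cap_mib_s * 1024 / io_size_kib
    return min(provisioned_iops, cap_from_throughput)


if __name__ == "__main__":
    print(guaranteed_iops_floor(500))            # 450.0
    print(round(hours_outside_target(), 2))      # 8.76
    # Assumed cap of 320 MiB/s: small requests reach the provisioned IOPS,
    # large requests are limited by throughput instead.
    print(achievable_iops(5000, 16, 320))        # 5000
    print(achievable_iops(5000, 256, 320))       # 1280.0
```

With 256 KiB requests and the assumed cap, the volume in this example tops out at 1,280 IOPS even though 5,000 were provisioned, which is the situation the "I/O throughput" bullet warns about.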

Now, let’s take a look at how to optimize Amazon EBS, which is essential to optimizing Amazon EC2.

  1. The challenge: When I/O on Amazon EBS volumes increases, I/O operations queue up and the Volume Queue Length grows. As a result, your workload can experience a serious slowdown.
  2. The resolution:
    • Prime your Amazon EBS volume. While new EBS volumes don’t require initialization, if you need to restore from snapshots, you still need to initialize (pre-warm) blocks.
    • Assemble multiple Amazon EBS volumes into an array. If your application is disk-intensive, it’s better to configure a software-level RAID array (redundant array of independent disks). With a redundant (mirrored) configuration, you can replace an EBS volume that’s having latency issues or bandwidth errors without disrupting your application.
    • Use Amazon EC2 Instance Store instead of Amazon EBS for more predictable I/O performance (HDD or SSD). Instance Store provides temporary block-level storage for your instance, located on disks that are physically attached to the host computer, so you won’t have to deal with network latency or Amazon EBS errors. Because the storage is temporary, use it only for data you can recreate.
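
Initializing a volume restored from a snapshot means reading every block once, so that each block is fetched before your workload needs it. This is usually done with a tool such as dd or fio; the Python sketch below shows the same idea in a minimal form. The function name and the 1 MiB read size are assumptions for illustration, and the device path in the usage note is hypothetical.

```python
def read_all_blocks(path, chunk_size=1024 * 1024):
    """Read `path` sequentially from start to end and return the bytes read.

    The data itself is discarded; the point is only to touch every block.
    """
    total = 0
    # buffering=0 opens the file unbuffered at the Python level, so each
    # read() call maps to a read of at most chunk_size bytes.
    with open(path, "rb", buffering=0) as device:
        while True:
            chunk = device.read(chunk_size)
            if not chunk:  # empty bytes object means end of device/file
                break
            total += len(chunk)
    return total
```

On an instance, you would call something like `read_all_blocks("/dev/xvdf")` with root privileges, replacing the path with the device name of your restored volume. Expect a full pass over a large volume to take a while.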

Step Two: Check for an Amazon EC2 Elastic Compute Unit (ECU) mismatch

AWS created a metric called the EC2 Compute Unit (ECU), sometimes referred to as an Elastic Compute Unit, to provide a relative measure of the integer processing power of an Amazon EC2 instance. However, the price of an ECU varies for each family of EC2 instances. While powerful EC2 instances are more expensive overall, their cost per ECU can be 40-50% less than that of less powerful EC2 instances, although that is not guaranteed.

For example, the m5.large instance type has 8 ECUs and costs $0.096 per hour, so the cost per ECU is $0.012 per hour. Compare that to the cc2.8xlarge instance type, which has 88 ECUs and costs $2.40 per hour, so the cost per ECU is $0.027 per hour. In this pair, the larger, previous-generation instance costs more than twice as much per ECU as the smaller one. Therefore, it’s essential to check for the best match of specs (how fast your workload will run) and price.
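
The comparison above is easy to repeat for any shortlist of instance types. Here is a minimal Python sketch using the two prices and ECU counts quoted in this post; the function name is hypothetical, and you should substitute current prices for your Region before drawing conclusions.

```python
def cost_per_ecu(hourly_price_usd, ecus):
    """Hourly price divided by the instance's ECU rating."""
    return hourly_price_usd / ecus


# (hourly price in USD, ECUs) as quoted in the text above
instances = {
    "m5.large": (0.096, 8),
    "cc2.8xlarge": (2.40, 88),
}

if __name__ == "__main__":
    for name, (price, ecus) in instances.items():
        print(f"{name}: ${cost_per_ecu(price, ecus):.3f} per ECU-hour")
    # m5.large: $0.012 per ECU-hour
    # cc2.8xlarge: $0.027 per ECU-hour
```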

Step Three: Check for stolen CPU

Use Amazon CloudWatch to look at the CPU utilization of the instance.

  1. The challenge: If a virtual machine’s hypervisor has reached maximum processing capacity (100% CPU capacity) performing other tasks, then processing resources need to be reallocated. The virtual CPU remains idle while it waits for a physical CPU to provide support for its virtual processes, resulting in “stolen CPU” measured by virtual machine hypervisors as “CPU steal” or “steal time.” The leading causes of CPU steal are poor allocation and insufficient resources. If steal times and idle times rise and fall congruently, stolen CPU is probably to blame. If the steal time is near zero and idle times remain relatively high, something else is causing the CPUs to stall.
  2. The resolution:
    • Buy more powerful (and more expensive) EC2 instances. A more powerful EC2 instance is likely to have fewer neighbors sharing the same hardware.
    • Optimize your computational process. If you are executing integration testing in EC2, you should monitor the CPU utilization of the instance. If you’re at less than 100% CPU utilization (i.e., some of the CPU cores are idle), I recommend splitting the integration test among the available cores and running tests in parallel to get results more quickly.
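
On a Linux instance, you can also measure steal time directly from the operating system. The sketch below assumes the standard /proc/stat layout, in which the aggregate `cpu` line lists user, nice, system, idle, iowait, irq, softirq, and steal counters in that order; the function names are hypothetical. It compares two samples and reports the share of elapsed CPU time that was stolen.

```python
def read_cpu_line(path="/proc/stat"):
    """Return the first (aggregate) line of /proc/stat on Linux."""
    with open(path) as stat_file:
        return stat_file.readline()


def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line into (steal, total) tick counts."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    values = [int(v) for v in fields[1:]]
    # Order: user nice system idle iowait irq softirq steal [guest guest_nice]
    steal = values[7] if len(values) > 7 else 0
    # Guest time is already counted inside user/nice, so sum only the
    # first eight counters to avoid counting it twice.
    total = sum(values[:8])
    return steal, total


def steal_percent(before_line, after_line):
    """Percentage of CPU time stolen between two samples."""
    steal_before, total_before = parse_cpu_line(before_line)
    steal_after, total_after = parse_cpu_line(after_line)
    elapsed = total_after - total_before
    if elapsed <= 0:
        return 0.0
    return 100.0 * (steal_after - steal_before) / elapsed


if __name__ == "__main__":
    # Two made-up samples, for illustration only.
    before = "cpu  100 0 50 800 10 0 5 35 0 0"
    after = "cpu  180 0 90 1550 20 0 10 150 0 0"
    print(steal_percent(before, after))  # 11.5
```

To sample a live system, call `read_cpu_line()` twice with a pause (for example, `time.sleep(5)`) in between and pass both lines to `steal_percent`. What counts as "too much" steal depends on your workload, so treat any threshold as a judgment call.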

In conclusion:

To improve the operational efficiency and I/O throughput of your workload, optimize costs, and reduce the time it takes to complete migration, new releases, integration testing, backups or nightly updates, we recommend that you:

  • Maximize I/O performance and reduce network latency and bandwidth issues by using an appropriate Amazon EBS structure, or opt to use Amazon EC2 Instance Store instead of Amazon EBS.
  • Select EC2 instances that best suit your workload based on an appropriate combination of power and ECU price.
  • Monitor CPU utilization of the instance.
  • Speed up migrations by using AWS Systems Manager (together with AWS CloudFormation) to create resource groups of EC2 instances and automatically apply settings and configurations to the new system with little manual work.

Endnote

  1. “Amazon EBS–Optimized Instances.” Amazon Elastic Compute Cloud User Guide, Amazon Web Services. Web. 01 May 2019. https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EBSOptimized.html#ebs-optimization-support

Need help with Amazon EC2? The nClouds team is here to help with that and all your AWS infrastructure requirements.
