Amazon EMR | Billing Flashcards
Can I load my data from the internet or somewhere other than Amazon S3?
Billing
Amazon EMR | Analytics
Yes. Your Hadoop application can load the data from anywhere on the internet or from other AWS services. Note that if you load data from the internet, EC2 bandwidth charges will apply. Amazon EMR also provides Hive-based access to data in DynamoDB.
Can Amazon EMR estimate how long it will take to process my input data?
No. As each cluster and input data is different, we cannot estimate your job duration.
How much does Amazon EMR cost?
As with the rest of AWS, you pay only for what you use. There is no minimum fee and there are no up-front commitments or long-term contracts. Amazon EMR pricing is in addition to normal Amazon EC2 and Amazon S3 pricing.
For Amazon EMR pricing information, please visit EMR’s pricing page.
Amazon EC2, Amazon S3, and Amazon SimpleDB charges are billed separately. Amazon EMR is priced per second for each instance type (with a one-minute minimum), from the time the cluster is requested until it is terminated. For additional details on Amazon EC2 Instance Types, Amazon EC2 Spot Pricing, Amazon EC2 Reserved Instances Pricing, Amazon S3 Pricing, or Amazon SimpleDB Pricing, see the following pages:
Amazon EC2 Instance Types
Amazon EC2 Reserved Instances Pricing
Amazon EC2 Spot Instances Pricing
Amazon S3 Pricing
Amazon SimpleDB Pricing
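The per-second rule with a one-minute minimum can be sketched as follows. This is an illustrative calculation only, not AWS's billing logic: the function names are hypothetical, the $0.015 hourly rate is borrowed from the m1.small example elsewhere in this FAQ, and rounding partial seconds up is an assumption.

```python
import math


def billable_seconds(runtime_seconds: float, minimum_seconds: int = 60) -> int:
    """Seconds billed for one instance: per-second, with a one-minute minimum.

    Rounding a partial second up is an assumption made for this sketch.
    """
    if runtime_seconds < 0:
        raise ValueError("runtime_seconds must be non-negative")
    return max(minimum_seconds, math.ceil(runtime_seconds))


def emr_charge(runtime_seconds: float, hourly_rate: float, instance_count: int = 1) -> float:
    """Amazon EMR charge only; EC2 and S3 charges are billed separately."""
    return instance_count * billable_seconds(runtime_seconds) * hourly_rate / 3600


# A 45-second cluster is billed as 60 seconds; a 90-second cluster as 90.
print(billable_seconds(45), billable_seconds(90))
# 100 instances for one hour at an assumed $0.015 per instance-hour.
print(round(emr_charge(3600, 0.015, 100), 2))
```

Check the EMR pricing page for the actual rate of your instance type before relying on any figure computed this way.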
When does billing of my Amazon EMR cluster begin and end?
Billing commences when Amazon EMR starts running your cluster. You are only charged for the resources actually consumed. For example, let’s say you launched 100 Amazon EC2 Standard Small instances for an Amazon EMR cluster, where the Amazon EMR cost is an incremental $0.015 per hour. The Amazon EC2 instances will begin booting immediately, but they won’t necessarily all start at the same moment. Amazon EMR will track when each instance starts and will check it into the cluster so that it can accept processing tasks.
In the first 10 minutes after your launch request, Amazon EMR either starts your cluster (if all of your instances are available) or checks in as many instances as possible. Once the 10-minute mark has passed, Amazon EMR will start processing (and charging for) your cluster as soon as 90% of your requested instances are available. As the remaining 10% of your requested instances check in, Amazon EMR starts charging for those instances as well.
So, in the above example, if all 100 of your requested instances are available 10 minutes after you kick off a launch request, you’ll be charged $1.50 per hour (100 * $0.015) for as long as the cluster takes to complete. If only 90 of your requested instances are available at the 10-minute mark, you’ll be charged $1.35 per hour (90 * $0.015) for as long as that number of instances is running your cluster. Once the remaining 10 instances check in, you’ll be charged $1.50 per hour (100 * $0.015) for the remainder of the cluster’s run.
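The arithmetic in this example can be reproduced with a short sketch. The rate is the $0.015 per instance-hour quoted above; the helper names and the phase durations in the last call (30 minutes at 90 instances, then 2 hours at 100) are hypothetical values chosen only to show how staggered check-in changes the total.

```python
EMR_RATE_PER_INSTANCE_HOUR = 0.015  # incremental EMR rate from the example above


def hourly_charge(instances_running: int, rate: float = EMR_RATE_PER_INSTANCE_HOUR) -> float:
    """EMR charge per hour while this many instances are checked in."""
    return instances_running * rate


def total_charge(phases, rate: float = EMR_RATE_PER_INSTANCE_HOUR) -> float:
    """Sum the charge over phases given as (instances_running, hours) pairs."""
    return sum(hourly_charge(count, rate) * hours for count, hours in phases)


print(round(hourly_charge(100), 2))  # 1.5  -> $1.50 per hour
print(round(hourly_charge(90), 2))   # 1.35 -> $1.35 per hour

# Hypothetical run: 90 instances for 0.5 h, then all 100 for 2 h.
print(round(total_charge([(90, 0.5), (100, 2.0)]), 3))
```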
Each cluster will run until one of the following occurs: you terminate the cluster with the TerminateJobFlows API call (or an equivalent tool), the cluster shuts itself down, or the cluster is terminated due to software or hardware failure.
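As a minimal sketch of the first option, the TerminateJobFlows call is exposed in the AWS SDK for Python (boto3) as `terminate_job_flows` on the EMR client. The wrapper name `terminate_cluster`, the `RecordingClient` stand-in, and the cluster ID below are hypothetical; the stand-in lets the example run without AWS credentials.

```python
def terminate_cluster(emr_client, cluster_id: str) -> None:
    """Ask Amazon EMR to terminate one cluster via TerminateJobFlows."""
    emr_client.terminate_job_flows(JobFlowIds=[cluster_id])


class RecordingClient:
    """Hypothetical stand-in for a real EMR client, used only for a dry run."""

    def __init__(self):
        self.calls = []

    def terminate_job_flows(self, JobFlowIds):
        # Record the request instead of contacting AWS.
        self.calls.append(list(JobFlowIds))


# Dry run with a placeholder cluster ID.
client = RecordingClient()
terminate_cluster(client, "j-EXAMPLE12345")
print(client.calls)

# Against a real account you would pass a boto3 client instead, for example:
#   import boto3
#   terminate_cluster(boto3.client("emr"), "<your cluster ID>")
```

Billing for the cluster stops once termination completes, so a call like this is the usual way to end charges for a long-running cluster.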
Where can I track my Amazon EMR, Amazon EC2 and Amazon S3 usage?
You can track your usage in the Billing & Cost Management Console.
How do you calculate the Normalized Instance Hours displayed on the console?
On the AWS Management Console, every cluster has a Normalized Instance Hours column that displays the approximate number of compute hours the cluster has used, rounded up to the nearest hour. Normalized Instance Hours are hours of compute time based on the standard of 1 hour of m1.small usage = 1 hour normalized compute time. The following table outlines the normalization factor used to calculate normalized instance hours for the various instance sizes:
Instance Size    Normalization Factor
Small            1
Medium           2
Large            4
Xlarge           8
2xlarge          16
4xlarge          32
8xlarge          64
For example, if you run a 10-node r3.8xlarge cluster for an hour, the total number of Normalized Instance Hours displayed on the console will be 640: 10 (number of nodes) x 64 (normalization factor) x 1 (number of hours that the cluster ran) = 640.
This is an approximate number and should not be used for billing purposes. Please refer to the Billing & Cost Management Console for billable Amazon EMR usage. Note that we recently changed the normalization factor to accurately reflect the weights of the instances, and the normalization factor does not affect your monthly bill.
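The normalization above can be sketched in a few lines. The factors come from the table; the function name, the parsing of the size from the instance type string, and the choice to round the cluster's runtime up to a whole hour before multiplying are assumptions made for illustration, and the console may round differently.

```python
import math

# Normalization factors from the table above, keyed by instance size.
NORMALIZATION_FACTOR = {
    "small": 1,
    "medium": 2,
    "large": 4,
    "xlarge": 8,
    "2xlarge": 16,
    "4xlarge": 32,
    "8xlarge": 64,
}


def normalized_instance_hours(instance_type: str, node_count: int, hours: float) -> int:
    """Approximate Normalized Instance Hours for a uniform cluster.

    Assumes the size is the part after the dot (e.g. "r3.8xlarge" -> "8xlarge")
    and that runtime is rounded up to the nearest whole hour.
    """
    size = instance_type.split(".")[-1].lower()
    factor = NORMALIZATION_FACTOR[size]
    return node_count * factor * math.ceil(hours)


# The example above: 10 nodes x 64 x 1 hour.
print(normalized_instance_hours("r3.8xlarge", 10, 1))  # 640
```

As with the console figure, the result is an approximation and is not a billing amount.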
Does Amazon EMR support Amazon EC2 On-Demand, Spot, and Reserved Instances?
Yes. Amazon EMR supports On-Demand, Spot, and Reserved Instances. To learn more, see the Amazon EC2 Reserved Instances and Amazon EC2 Spot Instances pages.