Decoupled Compute & Storage
EMR decouples storage from compute, so your data can persist while compute clusters remain transient.
Elastic
You can quickly and easily provision as much capacity as you need, and automatically or manually add and remove capacity.
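As a minimal sketch of the automatic option, the snippet below builds the kind of managed scaling policy structure that the boto3 EMR client accepts. The helper name build_managed_scaling_policy, the limits of 2 and 10 instances, and the cluster ID in the comment are hypothetical values chosen for illustration; nothing here is taken from the original setup.

```python
def build_managed_scaling_policy(min_units: int, max_units: int) -> dict:
    """Return a ManagedScalingPolicy dict bounding cluster size in instances.

    Hypothetical helper: it only assembles the request structure and
    does not call AWS.
    """
    if min_units < 1 or max_units < min_units:
        raise ValueError("expected 1 <= min_units <= max_units")
    return {
        "ComputeLimits": {
            "UnitType": "Instances",            # scale by instance count
            "MinimumCapacityUnits": min_units,  # floor the cluster never drops below
            "MaximumCapacityUnits": max_units,  # ceiling the cluster never exceeds
        }
    }


# Illustrative limits only: keep between 2 and 10 instances.
policy = build_managed_scaling_policy(2, 10)

# With boto3 installed and AWS credentials configured, the policy could be
# attached to an existing cluster (the cluster ID below is a placeholder):
#   import boto3
#   boto3.client("emr").put_managed_scaling_policy(
#       ClusterId="j-EXAMPLE", ManagedScalingPolicy=policy)
```

Keeping the policy construction separate from the AWS call makes the limits easy to validate and review before anything is applied to a live cluster.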
Built-in Disaster Recovery
You can easily configure high availability for multi-master applications with a single click.
Low Cost
EMR keeps costs low through per-second pricing, Amazon EC2 Spot integration, Amazon EC2 Reserved Instance integration, elasticity, and Amazon S3 integration.
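To see why per-second pricing matters for short, transient jobs, here is a small arithmetic sketch. The hourly rate of 0.30, the 10-instance cluster, and the 15-minute runtime are made-up numbers, and the one-minute minimum charge is an assumption to check against the current pricing page; the hourly-rounded figure is shown only for comparison.

```python
import math

SECONDS_PER_HOUR = 3600
MINIMUM_BILLED_SECONDS = 60  # assumption: one-minute minimum charge


def per_second_cost(runtime_seconds: int, hourly_rate: float, instances: int) -> float:
    """Cost when each instance is billed for the seconds it actually ran."""
    billed_seconds = max(runtime_seconds, MINIMUM_BILLED_SECONDS)
    return billed_seconds * (hourly_rate / SECONDS_PER_HOUR) * instances


def hourly_rounded_cost(runtime_seconds: int, hourly_rate: float, instances: int) -> float:
    """Cost if usage were rounded up to whole hours (comparison only)."""
    billed_hours = math.ceil(runtime_seconds / SECONDS_PER_HOUR)
    return billed_hours * hourly_rate * instances


HYPOTHETICAL_RATE = 0.30  # made-up price per instance-hour
job_per_second = per_second_cost(900, HYPOTHETICAL_RATE, 10)
job_hourly = hourly_rounded_cost(900, HYPOTHETICAL_RATE, 10)
print(f"per-second: {job_per_second:.2f}, hourly-rounded: {job_hourly:.2f}")
```

Under these assumed numbers, the 15-minute job costs a quarter of what it would if usage were rounded up to a full hour, which is why transient clusters pair well with per-second billing.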
Flexible Data Stores
You can leverage multiple data stores, including Amazon S3, the Hadoop Distributed File System (HDFS), and Amazon DynamoDB.
Spark Performance Improvements
Spark is an industry-leading engine for ML, querying, streaming, and transformation, and makes up ~60% of our workloads. Given this broad adoption, we are keen to provide the best performance.