S3 Performance Flashcards
What is an S3 prefix?
- An S3 prefix is the part of the object path between the bucket name and the object name
(e.g. ‘mybucketname/folder1/subfolder1/myfile.jpg’ has the prefix ‘/folder1/subfolder1’)
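A minimal sketch of that definition, assuming the path is written as bucket name, folders, then object name, separated by slashes. The function name `s3_prefix` is made up for illustration and is not part of any AWS SDK.

```python
def s3_prefix(path: str) -> str:
    """Return the prefix of a 'bucket/folder/.../object' path.

    Follows the flashcard's convention: leading slash, no trailing slash.
    """
    # Split off the bucket name (everything before the first slash).
    _bucket, _, key = path.partition("/")
    # Split off the object name (everything after the last slash).
    folders, slash, _object_name = key.rpartition("/")
    # An object stored directly under the bucket has no prefix.
    return "/" + folders if slash else ""


print(s3_prefix("mybucketname/folder1/subfolder1/myfile.jpg"))  # /folder1/subfolder1
```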
What does KMS have to do with S3 performance?
- When objects are encrypted with SSE-KMS, uploading and downloading them counts towards the KMS request quota (an upload calls GenerateDataKey, a download calls Decrypt)
- The exact quota is region-specific: 5,500, 10,000, or 30,000 requests per second.
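A rough way to check a workload against the quota, assuming each upload and each download costs exactly one KMS request. That is a simplification: real usage can be higher or lower, and `kms_headroom` is a hypothetical helper, not an AWS API.

```python
def kms_headroom(uploads_per_second: int,
                 downloads_per_second: int,
                 quota_per_second: int = 5500) -> int:
    """Return the KMS requests per second left over (negative means throttling risk).

    Assumption: one KMS request per upload and one per download.
    """
    used = uploads_per_second + downloads_per_second
    return quota_per_second - used


print(kms_headroom(2000, 3000))  # 500 requests per second to spare
print(kms_headroom(4000, 3000))  # -1500, over the 5,500 quota
```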
How can you request a quota increase for KMS?
- Request it through the Service Quotas console or the RequestServiceQuotaIncrease API; KMS request quotas are adjustable
What is Multi-Part Upload?
Multipart upload allows you to upload a single object as a set of parts. Each part is a contiguous portion of the object’s data. You can upload these object parts independently and in any order. If transmission of any part fails, you can retransmit that part without affecting other parts. After all parts of your object are uploaded, Amazon S3 assembles these parts and creates the object. In general, when your object size reaches 100 MB, you should consider using multipart uploads instead of uploading the object in a single operation.
Using multipart upload provides the following advantages:
- Improved throughput - You can upload parts in parallel to improve throughput.
- Quick recovery from any network issues - Smaller part size minimizes the impact of restarting a failed upload due to a network error.
- Pause and resume object uploads - You can upload object parts over time. After you initiate a multipart upload, there is no expiry; you must explicitly complete or stop the multipart upload.
- Begin an upload before you know the final object size - You can upload an object as you are creating it.
We recommend that you use multipart upload in the following ways:
- If you’re uploading large objects over a stable high-bandwidth network, use multipart upload to maximize the use of your available bandwidth by uploading object parts in parallel for multi-threaded performance.
- If you’re uploading over a spotty network, use multipart upload to increase resiliency to network errors by avoiding upload restarts. When using multipart upload, you need to retry uploading only parts that are interrupted during the upload. You don’t need to restart uploading your object from the beginning.
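The part-splitting and retry behaviour described above can be sketched without touching S3 at all. Everything here is illustrative: `plan_parts`, `upload_with_retries`, and the `send_part` callback are hypothetical names, the callback stands in for a real upload call, and the 16 MiB part size is just an assumed default.

```python
MiB = 1024 * 1024


def plan_parts(object_size, part_size=16 * MiB):
    """Return (part_number, first_byte, last_byte) per part; offsets are inclusive."""
    return [
        (number, start, min(start + part_size, object_size) - 1)
        for number, start in enumerate(range(0, object_size, part_size), start=1)
    ]


def upload_with_retries(parts, send_part, max_attempts=3):
    """Send every part, retrying only the parts that raised ConnectionError."""
    results = {}
    pending = list(parts)
    for _ in range(max_attempts):
        failed = []
        for part in pending:
            try:
                results[part[0]] = send_part(part)
            except ConnectionError:
                failed.append(part)  # only this part is sent again
        pending = failed
        if not pending:
            break
    if pending:
        raise RuntimeError(f"{len(pending)} part(s) still failing")
    # Parts may finish in any order; the result is ordered by part number.
    return [results[number] for number in sorted(results)]


# Demo with a fake sender whose part 3 fails once, then succeeds.
already_failed = set()


def flaky_send(part):
    if part[0] == 3 and 3 not in already_failed:
        already_failed.add(3)
        raise ConnectionError("simulated network drop")
    return f"etag-{part[0]}"


parts = plan_parts(100 * MiB)
print(len(parts))                                 # 7
print(upload_with_retries(parts, flaky_send)[:3])  # ['etag-1', 'etag-2', 'etag-3']
```

In the demo only part 3 is sent a second time; the other six parts are uploaded exactly once.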
What are S3 Byte Range Fetches?
- Using the Range HTTP header in a GET Object request, you can fetch a byte-range from an object, transferring only the specified portion. You can use concurrent connections to Amazon S3 to fetch different byte ranges from within the same object. This helps you achieve higher aggregate throughput versus a single whole-object request. Fetching smaller ranges of a large object also allows your application to improve retry times when requests are interrupted.
- Typical sizes for byte-range requests are 8 MB or 16 MB. If objects are PUT using a multipart upload, it’s a good practice to GET them in the same part sizes (or at least aligned to part boundaries) for best performance. GET requests can directly address individual parts; for example, GET ?partNumber=N.
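As an illustration, here is how the Range header values for a chunked download might be generated. `range_headers` is a hypothetical helper; the 8 MB default follows the typical size mentioned above (taken here as 8 MiB, which is an assumption). Each string would go in the Range header of a separate GET Object request.

```python
def range_headers(object_size, chunk_size=8 * 1024 * 1024):
    """Return HTTP Range header values that cover the object in chunk_size pieces."""
    # HTTP byte ranges are inclusive on both ends, hence the "- 1".
    return [
        f"bytes={start}-{min(start + chunk_size, object_size) - 1}"
        for start in range(0, object_size, chunk_size)
    ]


# A 20 MiB object split into 8 MiB ranges; the last range is shorter.
for header in range_headers(20 * 1024 * 1024):
    print(header)
# bytes=0-8388607
# bytes=8388608-16777215
# bytes=16777216-20971519
```

The same format covers a partial download: `bytes=0-1023` asks for only the first 1,024 bytes of an object.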
What would you use if you just want to download partial amounts of a file?
S3 Byte Range Fetches, requesting only the bytes you need with the Range header
At what file size is Multi-Part Upload recommended?
100 MB
At what file size is Multi-Part Upload required?
Objects larger than 5 GB (the maximum size for a single PUT)
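The two thresholds can be folded into one small decision helper. This is only a sketch: `upload_strategy` is a hypothetical function, and treating MB and GB as binary units (1024-based) is an assumption.

```python
MB = 1024 ** 2  # assumption: binary units
GB = 1024 ** 3


def upload_strategy(size_bytes: int) -> str:
    """Map an object size to the upload approach suggested by the flashcards."""
    if size_bytes > 5 * GB:
        return "multipart required"      # too large for a single PUT
    if size_bytes >= 100 * MB:
        return "multipart recommended"
    return "single PUT"


print(upload_strategy(20 * MB))   # single PUT
print(upload_strategy(250 * MB))  # multipart recommended
print(upload_strategy(6 * GB))    # multipart required
```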
What do S3 prefixes have to do with performance?
- They are important for performance because S3 request-rate limits apply per second per prefix
- So, you can get better performance by spreading your reads across different prefixes (for example, reading from 2 prefixes doubles the achievable GET rate)
How many PUT requests can S3 handle?
At least 3,500 per second per prefix
How many COPY requests can S3 handle?
At least 3,500 per second per prefix
How many POST requests can S3 handle?
At least 3,500 per second per prefix
How many DELETE requests can S3 handle?
At least 3,500 per second per prefix
How many GET requests can S3 handle?
At least 5,500 per second per prefix
How many HEAD requests can S3 handle?
At least 5,500 per second per prefix
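Because these limits apply per prefix, the aggregate rate grows with the number of prefixes a workload is spread across. A small arithmetic sketch, using the figures from the cards above (the function names are made up for illustration):

```python
WRITE_LIMIT = 3500  # PUT/COPY/POST/DELETE requests per second per prefix
READ_LIMIT = 5500   # GET/HEAD requests per second per prefix


def aggregate_limits(prefix_count: int) -> dict:
    """Return the combined per-second limits when load is spread evenly over prefixes."""
    return {
        "writes_per_second": WRITE_LIMIT * prefix_count,
        "reads_per_second": READ_LIMIT * prefix_count,
    }


def prefixes_needed(target_reads_per_second: int) -> int:
    """Return the smallest number of prefixes whose combined read limit meets the target."""
    # Integer ceiling division avoids floating-point rounding.
    return -(-target_reads_per_second // READ_LIMIT)


print(aggregate_limits(2))     # {'writes_per_second': 7000, 'reads_per_second': 11000}
print(prefixes_needed(22000))  # 4
```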