Today, I failed the AWS Developer Associate certification exam. I wasn't sad, but I was disappointed in myself: many of the questions on my test covered basic knowledge.
I got 2 questions about the best practice for using S3 to store a large number of files. That is the best practice for key names.
When: You expect a rapid increase in the request rate for a bucket to more than 300 PUT/LIST/DELETE requests per second or more than 800 GET requests per second.
But in my opinion, you should apply this as often as possible; it keeps you thinking about the best way to do things.
Key excerpt: AWS describes it this way: “Amazon S3 maintains an index of object key names in each AWS region. Object keys are stored in UTF-8 binary ordering across multiple partitions in the index. The key name dictates which partition the key is stored in. Using a sequential prefix, such as timestamp or an alphabetical sequence, increases the likelihood that Amazon S3 will target a specific partition for a large number of your keys, overwhelming the I/O capacity of the partition. If you introduce some randomness in your key name prefixes, the key names, and therefore the I/O load, will be distributed across more than one partition.”
Principle: The only thing you need to care about is avoiding sequential key names.
I'm a developer, so I like to focus on the easiest way for developers. Here is a clear example of key names with a sequential timestamp prefix:
examplebucket/2013-26-05-15-00-00/cust8474937/photo1.jpg
examplebucket/2013-26-05-15-00-00/cust1248473/photo2.jpg
...
examplebucket/2013-26-05-15-00-01/cust1248473/photo1.jpg
examplebucket/2013-26-05-15-00-01/cust1248473/photo2.jpg
Case 1: Add a hex hash prefix to the key name. Using a hexadecimal hash as the prefix is strongly recommended.
examplebucket/7b54-2013-26-05-15-00-00/animation1.obj
examplebucket/921c-2013-26-05-15-00-00/cust125/animation2.obj
examplebucket/animations/7b54-2013-26-05-15-00-00/cust385/animation1.obj
examplebucket/animations/921c-2013-26-05-15-00-00/cust124/animation2.obj
examplebucket/videos/ba65-2013-26-05-15-00-00/video1.mpg
examplebucket/videos/8761-2013-26-05-15-00-00/video2.mpg
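Here is a minimal Python sketch of Case 1. It assumes the prefix is the first four hex characters of an MD5 hash of the original key name; the function name hashed_key and its prefix_len parameter are my own illustration, not an AWS API.

```python
import hashlib


def hashed_key(key: str, prefix_len: int = 4) -> str:
    """Return the key with a short hex hash of itself added as a prefix.

    Assumption: MD5 is used only to spread keys, not for security.
    """
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    # Keep the original key after the hash so it stays readable.
    return f"{digest[:prefix_len]}-{key}"


if __name__ == "__main__":
    # The printed prefix depends on the MD5 of the key name.
    print(hashed_key("2013-26-05-15-00-00/cust385/animation1.obj"))
```

Because the prefix is computed from the key itself, you can rebuild the full key name whenever you know the original one. If you want to keep a folder such as animations/ in front, add the hash after the folder name instead.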
Case 2: Reverse the key name string. If the GROUP_NAME is an incrementing sequence ID, you can reverse it to get a well-distributed key name while still keeping your ID.
Normal:
examplebucket/2134857/data/start.png
examplebucket/2134857/data/resource.rsrc
examplebucket/2134858/data/start.png
examplebucket/2134858/data/resource.rsrc
Optimized:
examplebucket/7584312/data/start.png
examplebucket/7584312/data/resource.rsrc
examplebucket/8584312/data/start.png
examplebucket/8584312/data/resource.rsrc
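Case 2 can be sketched in a few lines of Python. The function name reversed_id_key is hypothetical, and the sketch assumes the ID is a plain incrementing integer.

```python
def reversed_id_key(group_id: int, path: str) -> str:
    """Build a key name whose leading ID has its digits reversed."""
    # Reversing puts the fastest-changing digit first,
    # so consecutive IDs start with different characters.
    reversed_id = str(group_id)[::-1]
    return f"{reversed_id}/{path}"


if __name__ == "__main__":
    print(reversed_id_key(2134857, "data/start.png"))  # 7584312/data/start.png
    print(reversed_id_key(2134858, "data/start.png"))  # 8584312/data/start.png
```

To get the original ID back, reverse the first part of the key name again.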
I hope this brings you a little closer to AWS.