AWS S3 (Simple Storage Service) is a core AWS service used by almost every application. It offers virtually unlimited storage: you can store any number of objects, and a single object can be up to 5 TB in size. In this blog post I list some S3 best practices that you can apply when using or integrating S3 in your applications.
Best Practices of AWS S3:
- For GET requests, use the HTTP Range header to download an object in parts rather than all at once, which improves download efficiency. It also enables quick recovery from failures, as only the part that failed needs to be downloaded again.
- For PUT requests, use multipart upload. Parts are uploaded in parallel, which gives better bandwidth utilization, and recovery from failures is fast because only the failed part needs to be uploaded again. You can also pause and resume uploads, and you can start uploading an object before you know its final size.
- S3 orders object keys lexicographically, so the traditional advice was to put a random prefix in front of every key to get better read and write performance: the randomness distributes the I/O load across multiple index partitions. This is an older practice, though. AWS has since updated S3 to support up to 3,500 requests per second to add data and 5,500 requests per second to retrieve data per prefix, so random prefixes matter much less than they used to.
- Build and maintain a secondary index outside of S3, in a service such as DynamoDB or RDS, for storing, indexing and querying object metadata.
- Enable versioning on buckets to protect against accidental overwrites and deletions. You can retrieve or restore a deleted object, or roll back to a previous version.
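- Enable MFA Delete on versioned buckets to guard against unintended deletions: permanently deleting an object version or changing the bucket's versioning state then requires a second authentication factor.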
- Versioning does not prevent bucket deletion, so be careful: deleting a bucket by mistake can result in data loss. Plan for a backup of the bucket.
- Use cross-region replication to back up a bucket to a different region.
- Reduce cost by choosing the appropriate storage class for your objects.
- Use event notifications to get notified of any PUTs or DELETEs on a bucket.
- Use CloudWatch to monitor objects and storage through its various metrics.
- Use CloudTrail to log API calls on S3 buckets.
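
To make the Range header tip from the list above more concrete, here is a rough sketch in Python. It assumes a boto3-style client whose `get_object` accepts `Bucket`, `Key` and `Range` and returns a dict with a readable `"Body"`. The function names `byte_ranges` and `download_in_ranges` and the retry count are my own illustrative choices, not part of any AWS API.

```python
def byte_ranges(object_size, part_size):
    """Yield HTTP Range header values that together cover the whole object."""
    if object_size < 0 or part_size <= 0:
        raise ValueError("object_size must be >= 0 and part_size must be > 0")
    for start in range(0, object_size, part_size):
        # Range end offsets are inclusive, hence the "- 1".
        end = min(start + part_size, object_size) - 1
        yield f"bytes={start}-{end}"


def download_in_ranges(s3_client, bucket, key, object_size, part_size, attempts=3):
    """Download an object part by part, retrying only the part that failed.

    s3_client is assumed to behave like a boto3 S3 client (hypothetical setup).
    """
    chunks = []
    for range_header in byte_ranges(object_size, part_size):
        for attempt in range(1, attempts + 1):
            try:
                response = s3_client.get_object(
                    Bucket=bucket, Key=key, Range=range_header
                )
                chunks.append(response["Body"].read())
                break  # this part succeeded, move on to the next one
            except Exception:
                # Give up only after the last attempt for this part.
                if attempt == attempts:
                    raise
    return b"".join(chunks)
```

In a real application the object size would typically come from the `ContentLength` field of a `head_object` call, and the parts could be fetched by a thread pool instead of one after another.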
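
For the multipart upload tip, it helps to see how an object gets split into parts. The sketch below only plans the parts and does not talk to S3. It relies on the documented multipart limits (at most 10,000 parts, and every part except the last at least 5 MiB). The function name `plan_parts` and the 8 MiB default are assumptions I picked for illustration.

```python
MIN_PART_SIZE = 5 * 1024 * 1024  # S3 minimum for every part except the last
MAX_PARTS = 10_000               # S3 maximum number of parts per upload


def plan_parts(object_size, part_size=8 * 1024 * 1024):
    """Return a list of (part_number, offset, length) tuples for an upload."""
    if object_size <= 0:
        raise ValueError("object_size must be positive")
    # Never go below the S3 minimum part size.
    part_size = max(part_size, MIN_PART_SIZE)
    # Grow the part size if needed so the upload fits into MAX_PARTS parts
    # (integer ceiling division avoids float rounding on huge objects).
    part_size = max(part_size, -(-object_size // MAX_PARTS))

    parts = []
    offset = 0
    part_number = 1  # S3 part numbers start at 1
    while offset < object_size:
        length = min(part_size, object_size - offset)
        parts.append((part_number, offset, length))
        offset += length
        part_number += 1
    return parts
```

Each tuple can then be uploaded independently, so a failure costs you one part rather than the whole object. If you use boto3, `upload_file` with a `TransferConfig` (`multipart_threshold`, `multipart_chunksize`, `max_concurrency`) does this kind of splitting for you.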
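
The random key prefix tip can be sketched as follows. As noted in the list, this is the older practice, so treat it as an illustration rather than something every application needs. Instead of a truly random prefix, this sketch derives the prefix from a hash of the key so that the full key can be recomputed later. The name `hashed_key`, the use of SHA-256 and the 4-character length are my own assumptions.

```python
import hashlib


def hashed_key(key, prefix_length=4):
    """Prepend a short, deterministic hash prefix to an S3 object key."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    # The same key always maps to the same prefix, so no lookup table
    # is needed to find the object again.
    return f"{digest[:prefix_length]}/{key}"
```

Keys that would otherwise sort next to each other, such as date-based log names, end up spread across many different prefixes.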
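
Finally, a small sketch of the secondary index idea. To keep it runnable without any AWS account, it uses Python's built-in `sqlite3` as a stand-in for RDS. The table name, the columns and the helper functions are hypothetical, and a real setup would write to DynamoDB or RDS whenever an object is added or removed.

```python
import sqlite3


def create_index(conn):
    """Create the metadata table (a stand-in for an RDS or DynamoDB table)."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS s3_objects (
               bucket TEXT NOT NULL,
               key TEXT NOT NULL,
               size_bytes INTEGER NOT NULL,
               owner TEXT,
               PRIMARY KEY (bucket, key)
           )"""
    )
    # Index the attribute we want to query by, which S3 itself cannot do.
    conn.execute("CREATE INDEX IF NOT EXISTS idx_owner ON s3_objects (owner)")


def record_object(conn, bucket, key, size_bytes, owner):
    """Insert or update the metadata row for one object."""
    conn.execute(
        "INSERT OR REPLACE INTO s3_objects VALUES (?, ?, ?, ?)",
        (bucket, key, size_bytes, owner),
    )
    conn.commit()


def keys_for_owner(conn, bucket, owner):
    """Return all keys in a bucket that belong to the given owner."""
    rows = conn.execute(
        "SELECT key FROM s3_objects WHERE bucket = ? AND owner = ? ORDER BY key",
        (bucket, owner),
    )
    return [row[0] for row in rows]
```

With an index like this, a question such as "which objects belong to this user?" becomes a cheap query instead of a listing of the whole bucket.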