S3 – Simple Storage Service

  1. Basics
    1. S3 is object-based storage: you upload files (objects). It is not suitable for installing an OS.
    2. An object can be from 0 bytes to 5 TB in size. Total storage is unlimited.
    3. A successful upload returns an HTTP 200 status code.
    4. Files are stored in buckets. Bucket URL format: https://s3-<Region>.amazonaws.com/<BucketName>
    5. Bucket names must be globally unique.
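
The URL format and size limit above can be sketched as two small helpers. The names `bucket_url` and `is_valid_object_size` are hypothetical, and reading "5 TB" as 5 * 1024**4 bytes is an assumption.

```python
# Assumption: "5 TB" is interpreted as 5 * 1024**4 bytes.
MAX_OBJECT_BYTES = 5 * 1024**4


def bucket_url(region: str, bucket: str) -> str:
    """Build a bucket URL in the regional path-style format from the notes."""
    return f"https://s3-{region}.amazonaws.com/{bucket}"


def is_valid_object_size(size_bytes: int) -> bool:
    """An object may be anywhere from 0 bytes up to the 5 TB limit."""
    return 0 <= size_bytes <= MAX_OBJECT_BYTES


# Example values are made up for illustration.
example_url = bucket_url("eu-west-1", "my-example-bucket")
print(example_url)
print(is_valid_object_size(0), is_valid_object_size(MAX_OBJECT_BYTES + 1))
```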
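  2. Data Consistency model
    1. Create new objects: Read after Write consistency
    2. Update existing object: Eventual consistency in the original model. Updates are atomic, so a read returns either the old or the new object, never a partial one
    3. Delete existing object: Eventual consistency in the original model
    4. Since December 2020, S3 provides strong read-after-write consistency for all PUT and DELETE operations, so the eventual-consistency caveats above describe the older behaviour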
  3. S3 is a key-value store. An S3 object consists of
    1. Key: Name of the object
    2. Value: Data
    3. Version ID
    4. Metadata
    5. Subresources: ACLs and Torrent
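
As a rough illustration of the parts listed above, an object can be modelled as a small record. This is only a sketch: the class `S3Object` and its field names are hypothetical and do not come from any AWS SDK.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class S3Object:
    """Hypothetical model of the parts of an S3 object listed in the notes."""
    key: str                                    # name of the object
    value: bytes                                # the data itself
    version_id: Optional[str] = None            # set only when versioning is on
    metadata: Dict[str, str] = field(default_factory=dict)
    subresources: Dict[str, object] = field(default_factory=dict)  # e.g. ACL, torrent


# Example values are made up for illustration.
report = S3Object(
    key="reports/summary.csv",
    value=b"id,total\n1,42\n",
    metadata={"Content-Type": "text/csv"},
)
print(report.key, len(report.value), report.version_id)
```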
  4. S3 Tiers
    1. S3: For frequently accessed data. Availability 99.99% Durability 99.999999999%
    2. S3-IA: For infrequently accessed data. Lower storage fee than S3, but you are also charged for retrieving data. Availability 99.9% Durability 99.999999999%
    3. Reduced Redundancy Storage: For reproducible data. Cheaper than S3. Availability 99.99% Durability 99.99%
    4. Glacier: For archiving data. Restoring data takes 3-5 hours.
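
To get a feel for what the durability figures mean, one common reading is to treat (100% - durability) as the chance of losing a given object in a year. The sketch below works through that arithmetic under that assumption; the function name `expected_annual_loss` and the object count are illustrative.

```python
from fractions import Fraction


def expected_annual_loss(object_count: int, durability_percent: str) -> Fraction:
    """Expected number of objects lost per year.

    Assumption: durability is read as a per-object, per-year survival
    probability. Fractions keep the arithmetic exact.
    """
    loss_probability = 1 - Fraction(durability_percent) / 100
    return object_count * loss_probability


# 10 million objects at 11 nines: 1/10000 of an object per year,
# i.e. about one object every 10,000 years.
standard = expected_annual_loss(10_000_000, "99.999999999")

# The same objects in Reduced Redundancy Storage at 99.99%.
rrs = expected_annual_loss(10_000_000, "99.99")

print(standard, rrs)
```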
  5. Transfer Acceleration: Data is transferred from end users to S3 via CloudFront edge locations. When a user uploads a file with transfer acceleration, the data first goes to the nearest edge location and is then forwarded to the S3 bucket over the AWS network, which is faster for distant users.
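
A bucket with transfer acceleration enabled is addressed through a dedicated accelerate endpoint rather than the regional one. A minimal sketch of building that endpoint, assuming the `<bucket>.s3-accelerate.amazonaws.com` form; the helper name `accelerate_endpoint` is hypothetical.

```python
def accelerate_endpoint(bucket: str) -> str:
    """Return the accelerate endpoint for a bucket (assumed hostname form)."""
    return f"https://{bucket}.s3-accelerate.amazonaws.com"


# Example bucket name is made up for illustration.
endpoint = accelerate_endpoint("my-example-bucket")
print(endpoint)
```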
  6. S3 Charges:
    1. Storage: data stored in S3
    2. Requests: # of requests
    3. Storage management pricing – Tags
    4. Data transfer pricing – Data coming into S3 is free. Moving data out of S3 or between regions (replication etc.) is charged
    5. Transfer Acceleration
  7. Version Control
    1. With version control enabled, uploading a new version of an object preserves the old version, so both remain available. With version control disabled, the new upload replaces the old object and the old version cannot be retrieved.
    2. Enable/Disable/Suspend
      1. By default, version control is disabled on S3 bucket.
      2. Once you enable version control it cannot be disabled, only suspended.
    3. When you delete an object in a versioned bucket, S3 does not remove it; it creates a delete marker that becomes the current version. Deleting the delete marker restores the object.
    4. Downside of versioning: Every version of an object is stored, which for some use cases can take up a lot of space. Use lifecycle rules to clean up old versions.
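
The versioning and delete-marker behaviour above can be illustrated with a toy in-memory model. This is a sketch only, not the S3 API: the class `VersionedBucket`, its methods and the `v1`, `v2` style version IDs are all hypothetical.

```python
import itertools
from typing import Dict, List, Optional, Tuple

DELETE_MARKER = object()  # sentinel standing in for an S3 delete marker


class VersionedBucket:
    """Toy model of a bucket with version control enabled."""

    def __init__(self) -> None:
        self._versions: Dict[str, List[Tuple[str, object]]] = {}
        self._ids = itertools.count(1)

    def _push(self, key: str, body: object) -> str:
        version_id = f"v{next(self._ids)}"
        self._versions.setdefault(key, []).append((version_id, body))
        return version_id

    def put(self, key: str, body: bytes) -> str:
        """Uploading never overwrites: it stacks a new version on top."""
        return self._push(key, body)

    def delete(self, key: str) -> str:
        """A plain delete only adds a delete marker as the newest version."""
        return self._push(key, DELETE_MARKER)

    def delete_version(self, key: str, version_id: str) -> None:
        """Permanently remove one specific version (or one delete marker)."""
        self._versions[key] = [v for v in self._versions[key] if v[0] != version_id]

    def get(self, key: str, version_id: Optional[str] = None) -> Optional[bytes]:
        stack = self._versions.get(key, [])
        if version_id is not None:
            for vid, body in stack:
                if vid == version_id:
                    return None if body is DELETE_MARKER else body
            return None
        if not stack or stack[-1][1] is DELETE_MARKER:
            return None  # no versions, or the current version is a delete marker
        return stack[-1][1]


bucket = VersionedBucket()
v1 = bucket.put("notes.txt", b"first draft")
v2 = bucket.put("notes.txt", b"second draft")
marker = bucket.delete("notes.txt")
hidden = bucket.get("notes.txt")          # object looks deleted
old = bucket.get("notes.txt", v1)         # old version is still retrievable
bucket.delete_version("notes.txt", marker)
restored = bucket.get("notes.txt")        # removing the marker restores it
```

The growing list per key is also why versioning can consume a lot of space: nothing is removed until a specific version is deleted.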
  8. Cross Region Replication
    1. Requirements
      1. Enable versioning on both source and target buckets
      2. Configure cross region replication on the source bucket, naming the target bucket as the destination
      3. Assign IAM role with necessary permissions to source bucket
    2. Existing files will not be replicated. Files created/updated after replication is enabled will be replicated.
    3. All versions and delete markers are replicated, but deleting a delete marker or an individual version is not replicated
    4. Replicating to multiple buckets or daisy chaining is not supported
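
The requirements above can be written down as a simple pre-flight check. This is a sketch under stated assumptions: `BucketInfo` and `replication_problems` are hypothetical names, the check is purely local and does not call AWS, and the ARN in the example is a made-up placeholder.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class BucketInfo:
    """Hypothetical summary of the bucket settings relevant to replication."""
    name: str
    region: str
    versioning_enabled: bool


def replication_problems(
    source: BucketInfo, target: BucketInfo, role_arn: Optional[str]
) -> List[str]:
    """Return the unmet requirements; an empty list means ready to replicate."""
    problems = []
    if not source.versioning_enabled:
        problems.append(f"versioning is not enabled on source bucket {source.name}")
    if not target.versioning_enabled:
        problems.append(f"versioning is not enabled on target bucket {target.name}")
    if source.region == target.region:
        problems.append("source and target buckets are in the same region")
    if not role_arn:
        problems.append("no IAM role supplied for replication")
    return problems


# Example values are made up for illustration.
src = BucketInfo("photos-eu", "eu-west-1", versioning_enabled=True)
dst = BucketInfo("photos-us", "us-east-1", versioning_enabled=False)
issues = replication_problems(src, dst, "arn:aws:iam::111111111111:role/example-replication")
print(issues)
```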
  9. Lifecycle Management
    1. Lifecycle rules work with versioning. Lifecycle rules can be used for both current version and previous version of objects
    2. We can move objects from S3 to S3-IA and then to Glacier. Objects must be at least 128 KB.
    3. Current Version: S3 (after min 30 days) → S3-IA (after min 30 days) → Glacier → Expire.
    4. Previous Version: S3 → S3-IA → Glacier → Permanently Delete.
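
The transition chain above maps onto a lifecycle configuration document. The sketch below builds one as a plain dictionary, in the shape that boto3's `put_bucket_lifecycle_configuration` is understood to accept; nothing is sent to AWS. The rule ID and the 365-day expiry are illustrative assumptions, not values from these notes.

```python
import json

lifecycle_configuration = {
    "Rules": [
        {
            "ID": "archive-then-expire",        # hypothetical rule name
            "Status": "Enabled",
            "Filter": {"Prefix": ""},           # empty prefix: whole bucket
            # Current version: S3 -> S3-IA after 30 days -> Glacier 30 days later
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                {"Days": 60, "StorageClass": "GLACIER"},
            ],
            "Expiration": {"Days": 365},        # assumed expiry age
            # Previous versions follow the same chain, counted from the day
            # they stopped being the current version
            "NoncurrentVersionTransitions": [
                {"NoncurrentDays": 30, "StorageClass": "STANDARD_IA"},
                {"NoncurrentDays": 60, "StorageClass": "GLACIER"},
            ],
            "NoncurrentVersionExpiration": {"NoncurrentDays": 365},
        }
    ]
}

rendered = json.dumps(lifecycle_configuration, indent=2)
print(rendered)
```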
  10. Security
    1. By default, created buckets are PRIVATE
    2. Access control using Bucket policies (bucket level) and ACLs (object level)
    3. Access logs can be written to another S3 bucket
  11. Encryption
    1. In transit: SSL/TLS
    2. At rest:
      1. Server Side: SSE-S3 (S3 Managed Key)
      2. Server Side: SSE-KMS (KMS Managed Key)
      3. Server Side: SSE-C (Customer Provided Key)
      4. Client Side: Encrypt and upload data
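
The three server-side options differ mainly in the request headers sent with an upload. The sketch below builds those headers only and sends nothing; the function names are hypothetical, and the KMS key ID in the example is a made-up placeholder.

```python
import base64
import hashlib
import os
from typing import Dict


def sse_s3_headers() -> Dict[str, str]:
    """SSE-S3: S3 manages the key."""
    return {"x-amz-server-side-encryption": "AES256"}


def sse_kms_headers(kms_key_id: str) -> Dict[str, str]:
    """SSE-KMS: the key is managed in KMS."""
    return {
        "x-amz-server-side-encryption": "aws:kms",
        "x-amz-server-side-encryption-aws-kms-key-id": kms_key_id,
    }


def sse_c_headers(key: bytes) -> Dict[str, str]:
    """SSE-C: the customer supplies a 256-bit key with every request."""
    if len(key) != 32:
        raise ValueError("SSE-C expects a 32-byte (256-bit) key")
    return {
        "x-amz-server-side-encryption-customer-algorithm": "AES256",
        "x-amz-server-side-encryption-customer-key": base64.b64encode(key).decode("ascii"),
        # The MD5 digest lets S3 check the key arrived intact.
        "x-amz-server-side-encryption-customer-key-MD5": base64.b64encode(
            hashlib.md5(key).digest()
        ).decode("ascii"),
    }


customer_key = os.urandom(32)  # in practice the customer must store this safely
headers = sse_c_headers(customer_key)
print(sorted(headers))
print(sse_kms_headers("example-key-id"))
```

With SSE-C, S3 does not keep the key, so losing it means losing access to the object.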
  12. S3 Transfer acceleration: Leverage AWS edge locations to transfer data uploaded to S3

CDN – Content Delivery Network

  1. A CDN is a web service that accelerates delivery of websites, APIs, video content or other assets. Use cases: Delivering a website or web application, distributing software or other large files
  2. Terminology
    1. Edge location: Where content is cached. Separate from an AZ/Region. Edge locations handle both reads and writes. Objects are cached for the TTL; you can also clear cached objects manually.
    2. Origin: Source of the content. S3 Bucket / EC2 Instance / ELB / Route53 / Non-AWS Domain.
    3. Distribution: Collection of edge locations
    4. Web distribution: Used for websites
    5. RTMP: Used for media streaming
  3. Create Distribution
    1. Delivery method: Web/RTMP
    2. Origin Settings: AWS targets or non aws target web address, Prefix , etc
    3. Cache Settings: Protocols (HTTP/HTTPS), Allowed Methods (GET, PUT, POST, etc.), Cached Methods (GET, HEAD, etc.), TTL, etc.
    4. Distribution settings: Price class (e.g. all edge locations), SSL certificates, etc.
