How Do You Think Amazon CloudFront Works?

For Amazon CloudFront to work, you only need to tell it where your publicly accessible files are, and it starts serving them. How does that work? How can it serve files to users when you never uploaded those files to it?

The website owner references resources through a URL that points to Amazon CloudFront. Instead of writing a file URL as mysite.com/images/img.png, they write it on their website as xyz.cloudfront.net/images/img.png, or use a CNAME (canonical name) on their own domain (like origin.mysite.com). When a reader loads the web page for the first time, Amazon CloudFront fetches the file from the host server (your custom server or Amazon S3), just as a browser would without Amazon CloudFront. The next time someone asks for the same file, Amazon CloudFront serves it from its own cache.
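The URL change is easy to picture in code. Here is a minimal sketch in Python, assuming the placeholder host names used in this post (mysite.com and xyz.cloudfront.net); the function name to_cdn_url is made up for illustration and is not part of any Amazon tool.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical host names, taken from the placeholders used in this post.
ORIGIN_HOST = "mysite.com"
CDN_HOST = "xyz.cloudfront.net"


def to_cdn_url(url: str, origin_host: str = ORIGIN_HOST, cdn_host: str = CDN_HOST) -> str:
    """Return the same URL with the origin host swapped for the CDN host."""
    parts = urlsplit(url)
    if parts.netloc != origin_host:
        return url  # leave URLs on other hosts untouched
    return urlunsplit(parts._replace(netloc=cdn_host))


print(to_cdn_url("http://mysite.com/images/img.png"))
# http://xyz.cloudfront.net/images/img.png
```

Only the host changes; the path stays the same, which is why CloudFront can find the file on your server using the same path.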

Amazon has deployed many interconnected servers at locations around the world, and it replicates the file as and when required, based on where the website's users are. So, if American users access the page first, it first caches the file on the central server and then sends it to an Edge server in the USA for the next users in America. If the next few requests come from India, it caches the file at an Indian server location so that it can be served quickly to new users there. In all of this work, Amazon is not querying your server again, as it has already cached the file on its central server. Using the Cache-Control and Expires headers, you can decide how long Amazon goes without checking your server to see whether those cached files are still fresh.
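To make the freshness idea concrete, here is a rough sketch in Python of how a cache could work out, from those two headers, how long it may keep serving a file without going back to your server. The function name freshness_seconds is hypothetical, and the logic is simplified: CloudFront also applies its own minimum, maximum, and default TTL settings, which this sketch ignores.

```python
import re
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def freshness_seconds(headers: dict, now: datetime) -> int:
    """Seconds a shared cache may reuse a response before rechecking the origin."""
    cache_control = headers.get("Cache-Control", "")
    # s-maxage is meant for shared caches such as a CDN, so check it before max-age.
    for directive in ("s-maxage", "max-age"):
        match = re.search(rf"(?:^|[\s,]){directive}\s*=\s*(\d+)", cache_control, re.IGNORECASE)
        if match:
            return int(match.group(1))
    # Fall back to the older Expires header, which carries an absolute date.
    expires = headers.get("Expires")
    if expires:
        delta = parsedate_to_datetime(expires) - now
        return max(0, int(delta.total_seconds()))
    return 0  # assumption for this sketch: no header means "do not reuse"


now = datetime(2014, 12, 31, 23, 0, 0, tzinfo=timezone.utc)
print(freshness_seconds({"Cache-Control": "public, max-age=86400"}, now))     # 86400
print(freshness_seconds({"Expires": "Thu, 01 Jan 2015 00:00:00 GMT"}, now))   # 3600
```

So a file sent with max-age=86400 can be served from the cache for a day before your server is asked about it again.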

Step-by-Step Explanation of How Amazon CloudFront Works:

  1. An image file (http://www.satya-weblog.com/wp-content/uploads/2013/09/amazon-web-services_thumb.png) is uploaded by me to my server for use in a blog post.
  2. Instead of the above URL, I use this URL in my post: http://o2.satya-weblog.com/wp-content/uploads/2013/09/amazon-web-services_thumb.png
  3. As soon as I visit the web page to check my post, or anyone else visits it for the first time, a request goes to CloudFront asking for the image amazon-web-services_thumb.png (from xyz.cloudfront.net or origin.mysite.com/), which CloudFront does not have yet. CloudFront then requests the image from my server. After getting the image, Amazon serves it from the central server and caches it there.
  4. In the previous step itself or in a later one, depending on the website's visitors and Amazon's cache policy, CloudFront may replicate the file to all Edge servers, or it may distribute the file only to Edge locations near the visitors. Files stored near the visitors can be served quickly, reducing latency and load.
    [Image: Architectural_Overview]
    More: Amazon Architectural Overview
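
The steps above can be played out in a small toy model. This is only a sketch in Python of the flow as described in this post (origin server, central server, Edge servers), not a picture of CloudFront's actual internals; the class names and the "US" and "IN" region labels are made up for illustration.

```python
class Origin:
    """Stands in for the website's own server (or an S3 bucket)."""

    def __init__(self, files):
        self.files = files
        self.hits = 0  # how many times the origin was actually asked

    def fetch(self, path):
        self.hits += 1
        return self.files[path]


class Cache:
    """A cache tier that pulls from its parent on a miss, then keeps a copy."""

    def __init__(self, name, parent):
        self.name = name
        self.parent = parent
        self.store = {}

    def fetch(self, path):
        if path not in self.store:
            # Miss: ask the next tier up and remember the answer.
            self.store[path] = self.parent.fetch(path)
        return self.store[path]


origin = Origin({"/img.png": b"PNG-bytes"})
central = Cache("central", origin)
edges = {"US": Cache("edge-us", central), "IN": Cache("edge-in", central)}

# Two visitors from America, then two from India, all asking for the same image.
for region in ["US", "US", "IN", "IN"]:
    edges[region].fetch("/img.png")

print(origin.hits)  # 1
```

Four visitors got the image, but my server was asked only once. Every later request was answered from a cache closer to the visitor.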

A closely related and important post: Implementing CDN; Amazon S3 and CloudFront. What is the Best Approach?