Amazon Simple Storage Service (Amazon S3) is an online object storage service. It can be used to store and retrieve any data via the following:
REST web service interface
SOAP web service interface
BitTorrent
Amazon S3 is easy to configure, and it provides reliable, scalable, and secure storage for files (objects) at a nominal price. Neither the developers nor the system team have to worry about how the data is stored in or retrieved from Amazon S3; Amazon S3 manages web-scale computing by itself.
The following concepts will be covered in this chapter:
The need for S3 and its advantages
Basic concepts of Amazon S3
Features of Amazon S3
Security
Integration
Use cases
Amazon S3 can be used for storing application data as well as for backing up and archiving data. It places no restriction on the type of file to be stored: any file can be stored in Amazon S3, where it is treated as an object. Amazon uses S3 to run its own global network of websites (http://docs.aws.amazon.com/AmazonS3/latest/dev/Welcome.html).
We can store as much data as we want in Amazon S3; there is no limit on the total amount stored. Amazon charges only for the storage that is actually used, so it is quite inexpensive for users, because they don't need to purchase their own storage infrastructure.
Amazon S3 keeps redundant copies of the data across multiple data centers for high durability and availability. The user can select the region where the data will be stored; choosing a region close to the consumers of the data reduces the latency in storing and retrieving it. Amazon S3 also offers security on the objects: the user can make an object publicly or privately accessible, and encrypted data can be stored as well. Amazon S3 guarantees a server uptime of 99.9 percent.
Amazon S3 can be integrated with any application and with other services offered by Amazon, such as Amazon Elastic Compute Cloud (Amazon EC2), Amazon Elastic Block Store (Amazon EBS), Amazon Glacier, and so on.
Subscribing to Amazon S3 is free; you pay only for the bandwidth that you use and for the data that you actually store. Small start-ups usually don't have the infrastructure to store huge amounts of data, so they opt for Amazon S3 to store their images, videos, files, and so on, and to minimize costs.
Amazon S3 also provides website hosting services. You can upload your pages directly to Amazon S3 and map them to your domain.
Let's take a look at the basic S3 concepts:
A bucket is a container in Amazon S3 to which files are uploaded. To store a file in Amazon S3, you need to create at least one bucket, because files (objects) are always stored in buckets.
The following are a few features of buckets:
The bucket name must be globally unique, because the namespace is shared by all Amazon S3 users.
Buckets can contain logical nested folders and subfolders, but they cannot contain nested buckets.
By default, you can create a maximum of 100 buckets in a single account.
The bucket name can contain lowercase letters, numbers, periods, and hyphens.
The bucket name must start and end with a letter or number, and it must be between 3 and 63 characters long.
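The naming rules can be checked before a bucket is requested. The following is a minimal sketch, assuming the DNS-compliant rules (3 to 63 characters; lowercase letters, digits, periods, and hyphens; a letter or digit at both ends; no adjacent periods; not shaped like an IP address). The function name is_valid_bucket_name is hypothetical, and the check says nothing about whether a name is already taken by another user.

```python
import re

# Assumed DNS-compliant rules: 3-63 characters drawn from lowercase letters,
# digits, periods, and hyphens, starting and ending with a letter or digit.
_BUCKET_NAME = re.compile(r"[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]")
_IPV4_SHAPE = re.compile(r"\d{1,3}(\.\d{1,3}){3}")


def is_valid_bucket_name(name: str) -> bool:
    """Return True if `name` satisfies the assumed naming rules."""
    if not _BUCKET_NAME.fullmatch(name):
        return False
    if ".." in name:
        # Adjacent periods are rejected.
        return False
    if _IPV4_SHAPE.fullmatch(name):
        # Names formatted like an IPv4 address are rejected.
        return False
    return True
```

For example, is_valid_bucket_name("my-app-media") passes, while names such as "My_Bucket" or "ab" are rejected.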
Buckets can be managed via the following:
REST-style HTTP interface
SOAP interface
The following bucket looks similar to the Amazon S3 bucket to which we will upload files (objects):
A bucket doesn't impose any limit on the total amount of data or the number of objects it holds. An individual object can be up to 5 TB in size.
Buckets can be accessed via HTTP URLs as follows:
http://<BUCKET_NAME>.s3.amazonaws.com/<OBJECT_NAME>
http://s3.amazonaws.com/<BUCKET_NAME>/<OBJECT_NAME>
In the preceding URLs, BUCKET_NAME will be the name of the bucket that you provided while creating it, and OBJECT_NAME will be the name of the object that you provided while creating the object.
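The two URL forms can be produced with a couple of small helpers. This is a minimal sketch using only the Python standard library; the function names and the sample bucket and object names are hypothetical. Object names are percent-encoded so that characters such as spaces remain valid in a URL.

```python
from urllib.parse import quote


def virtual_hosted_url(bucket: str, key: str) -> str:
    """Form 1: the bucket name is part of the host name."""
    return f"http://{bucket}.s3.amazonaws.com/{quote(key, safe='/')}"


def path_style_url(bucket: str, key: str) -> str:
    """Form 2: the bucket name is the first segment of the path."""
    return f"http://s3.amazonaws.com/{bucket}/{quote(key, safe='/')}"


if __name__ == "__main__":
    print(virtual_hosted_url("my-app-media", "photos/summer 2015.jpg"))
    print(path_style_url("my-app-media", "photos/summer 2015.jpg"))
```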
An object is a file stored in Amazon S3. Each object consists of a unique identifier (its key), the user who uploaded it, and permissions that control which other users can perform CRUD operations on it. Every object is stored in a bucket.
Objects can be managed via the following:
REST-style HTTP interface
SOAP interface
Objects can be downloaded via the following:
A bucket can hold any type of object, be it a PDF, text, video, audio, or any other kind of file.
The following are the main features of Amazon S3:
Allows website hosting: Amazon S3 allows users to host a website and map it to their domain. This is very cost effective, because users pay only for what they use. Moreover, users don't need highly configured servers to serve the website.
Scalable: Amazon S3 doesn't impose any limit on the amount of data stored. As it is a pay-as-you-go service, it stores whatever data it is given, and the bill is generated accordingly. So the subscriber never faces a lack of space.
Reliable: Amazon S3 guarantees a server uptime of 99.9 percent, so the subscriber does not need to worry about the availability of the data.
Security: Amazon S3 provides a strong authentication mechanism, so that only authenticated and authorized users can manipulate the stored data.
Standard interfaces: Amazon S3 provides the Representational State Transfer (REST) and Simple Object Access Protocol (SOAP) web services that can be consumed by any web framework.
Reduced Redundancy Storage: Amazon S3 gives subscribers the option of storing data with the Reduced Redundancy Storage (RRS) storage class. It is meant for non-critical, reproducible data that can be stored at lower levels of redundancy. Storing data in the RRS storage class costs less than storing it in the standard storage class.
Torrent tracking and seeding: Amazon S3 can act as a torrent tracker and as a seeder for your files.
Share the data with a temporary URL: Amazon S3 gives the subscriber the ability to share a URL that expires automatically after a period of time. This helps the subscriber share data for a limited period; other users cannot access the data through that URL after it expires.
Logging: Amazon S3 can log all the activities that are performed on a bucket. This makes it easy for subscribers to audit the activities on the bucket if they so wish. Generally, subscribers who host a website on Amazon S3 enable the logging feature to track the activities.
Versioning: Amazon S3 allows storing of multiple versions of an object. It is basically used for recovering old data that is lost unintentionally.
Security: Amazon S3 provides security on buckets and objects. While creating a bucket, you can provide access control lists that specify which other users can create, update, delete, or list objects in it. You can even set the geographical location of your data.
Integration: Amazon S3 can be integrated with several other services such as Amazon EC2, Amazon EBS, and Amazon Glacier, and with many other applications. Generally, developers use Amazon S3 for storing images, videos, or documents, and for accessing them via HTTP GET.
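The temporary URL feature listed above works by attaching an expiry time and a signature to the request. The following is a minimal sketch of that idea using the legacy Signature Version 2 query-string authentication scheme, chosen here only because it is short enough to show in full; newer regions require Signature Version 4, and in practice an AWS SDK would generate the URL. The function name, the credentials, and the bucket and object names are hypothetical placeholders.

```python
import base64
import hashlib
import hmac
import time
from urllib.parse import quote


def presign_get_url_v2(bucket, key, access_key_id, secret_key,
                       expires_in, now=None):
    """Build a time-limited GET URL (legacy Signature Version 2 sketch)."""
    now = int(time.time()) if now is None else int(now)
    expires = now + expires_in  # absolute expiry, in seconds since the epoch

    resource = f"/{bucket}/{quote(key, safe='/')}"
    # Verb, empty Content-MD5, empty Content-Type, expiry, then the resource.
    string_to_sign = f"GET\n\n\n{expires}\n{resource}"

    digest = hmac.new(secret_key.encode("utf-8"),
                      string_to_sign.encode("utf-8"),
                      hashlib.sha1).digest()
    signature = base64.b64encode(digest).decode("ascii")

    return (f"http://s3.amazonaws.com{resource}"
            f"?AWSAccessKeyId={quote(access_key_id, safe='')}"
            f"&Expires={expires}"
            f"&Signature={quote(signature, safe='')}")
```

Because the expiry time is part of the signed string, a recipient cannot extend the lifetime of the link by editing the Expires parameter; the signature would no longer match.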
Amazon S3 can be utilized for different purposes:
File hosting: Companies often store their images, videos, audio files, PDFs, DOCs, and other files in Amazon S3. The files can then be served directly from Amazon S3, without any on-premise infrastructure to manage.
Storing data for mobile applications: Many users and companies choose Amazon S3 to store mobile app data, because it makes mobile user data easy to manage.
Static website hosting: Users can host their static website on Amazon S3 along with Amazon Route 53.
Video hosting: Companies upload their videos to Amazon S3, from where they can be accessed on the company website. Amazon S3 can also be configured to provide video streaming.
Backup: Users can keep a backup of their data, which will be securely and reliably stored in Amazon S3. Amazon S3 can also be configured to move old data over to Amazon Glacier for archiving, as Glacier costs less than S3.
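Moving old backup data to Amazon Glacier is expressed as a lifecycle rule on the bucket. The following sketch only builds such a rule as a plain Python dictionary, in the shape that the S3 lifecycle configuration is assumed to take when set through an SDK; it does not call AWS. The function name, the rule ID, the backups/ prefix, and the 30-day threshold are hypothetical examples.

```python
import json


def glacier_transition_rule(prefix: str, days: int,
                            rule_id: str = "archive-old-data") -> dict:
    """Build a lifecycle configuration that archives objects under `prefix`
    to Glacier once they are `days` days old."""
    return {
        "Rules": [
            {
                "ID": rule_id,
                "Filter": {"Prefix": prefix},  # apply only to these keys
                "Status": "Enabled",
                "Transitions": [
                    {"Days": days, "StorageClass": "GLACIER"},
                ],
            }
        ]
    }


if __name__ == "__main__":
    print(json.dumps(glacier_transition_rule("backups/", 30), indent=2))
```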
Let's now see how Amazon S3 can be used in a project.
In the preceding diagram we have the following:
The Amazon Elastic Compute Cloud (EC2) machines where the application is deployed.
The Amazon Load Balancer, which is responsible for distributing user requests across the application instances deployed on the EC2 machines.
The Amazon Relational Database Service (RDS) is used for storing application data. It provides scalability, durability, and an easy-to-manage database service.
Amazon S3 where the image/audio/video files are stored.
And lastly, the client devices, such as laptops, desktops, or mobile applications, that send requests to the application.
The preceding example is a sample case study. There are various ways for integrating Amazon S3 in our application.
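One common way to divide the work between the components is for the application to keep the media file itself in Amazon S3 and only the object key in the database. The following is a minimal sketch of that division using in-memory stand-ins, so that it runs without an AWS account; InMemoryObjectStore, MediaService, and the key layout users/<id>/<filename> are all hypothetical, and a real application would replace them with an S3 client and an RDS table.

```python
from dataclasses import dataclass, field


@dataclass
class InMemoryObjectStore:
    """Hypothetical stand-in for an S3 bucket: maps object keys to bytes."""
    bucket: str
    objects: dict = field(default_factory=dict)

    def put(self, key: str, data: bytes) -> None:
        self.objects[key] = data

    def url_for(self, key: str) -> str:
        return f"http://{self.bucket}.s3.amazonaws.com/{key}"


@dataclass
class MediaService:
    """Application-tier logic that would run on the EC2 machines."""
    store: InMemoryObjectStore
    records: dict = field(default_factory=dict)  # stand-in for a database table

    def upload(self, user_id: int, filename: str, data: bytes) -> str:
        key = f"users/{user_id}/{filename}"
        self.store.put(key, data)                  # file content goes to S3
        self.records[(user_id, filename)] = key    # only the key goes to the DB
        return self.store.url_for(key)
```

The client then receives a URL and loads the file directly from the bucket, so the application servers do not have to serve the media themselves.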
In this chapter, we introduced Amazon S3 and covered the basic concepts: buckets, objects, and keys. We explored the basic features of Amazon S3, which help in providing a reliable storage service at minimal cost. Amazon S3 can be used by start-ups, individual developers, or large companies for data storage, backups for recovery, and so on. Amazon S3 can also be integrated with other Amazon services and with many other applications. In the next chapter, you will learn how to utilize basic Amazon S3 resources such as buckets, folders, and objects.