By now, you probably understand the overall architecture and environment in which our app executes, but it won't be of much use without more services at our disposal. Otherwise, limited to pure Python code, we would have to bring along everything required to build the next killer web app.
To this end, Google App Engine provides many useful, scalable services that you can use to build apps. Some services address storage needs, others address the processing needs of an app, and yet another group caters to communication needs. In a nutshell, the following services are at your disposal:
Storage: Datastore, Blobstore, Cloud SQL, and Memcache
Processing: Images, Crons, Tasks, and MapReduce
Communication: Mail, XMPP, and Channels
Identity and security: Users, OAuth, and App Identity
Others: various capabilities, image processing, and full-text search
If the list seems short, keep in mind that Google keeps adding new services. Now, let's look at each of the previously listed services in detail.
Datastore is a NoSQL, distributed, and highly scalable column-based storage solution that can scale to petabytes of data so that you don't have to worry about scaling at all. App Engine provides a data modeling library that you can use to model your data, just as you would with any Object Relational Mapping (ORM) library, such as Django models or SQLAlchemy. The syntax is quite similar, but there are differences.
Each object that you save gets a unique key, which is a long string of bytes. How it is generated is another topic that we will discuss later. Since Datastore is a NoSQL solution, there are certain limitations on what you can query, which can make it seem unfit for everyday use, but we can work around those limitations, as we will explore in the coming chapters.
By default, apps get 1 GB of free space in Datastore, so you can start experimenting with it right away.
If you prefer using a relational database, you can have that too with Cloud SQL. It is a standard MySQL database; you have to boot up instances and connect to them via whatever interface is available in your runtime environment, such as JDBC in the case of Java and MySQLdb in the case of Python. Datastore comes with a free quota of about 1 GB of data, but for Cloud SQL, you have to pay from the start.
Because dealing with MySQL has been explored in much detail, from blog posts and articles to entire books, this book skips the details and focuses more on Google Datastore.
Your application might want to store larger chunks of data such as images, audio, and video files. The Blobstore does just that for you. You are given a URL, which has to be used as the target of the upload form. Uploads are handled for you, and a key for the uploaded file is passed to a callback URL that you specify, where you can store it for later reference. To let users download a file, you simply set the key that you got from the upload as a specific header on your response, which App Engine takes as an indication to send the file contents to the user.
Hitting Datastore for every request costs time and computational resources. The same goes for rendering templates with a given set of values. Time is money, and it really is money when it comes to the cloud, as you pay for the time your code spends satisfying user requests. This cost can be reduced by caching content or queries that occur over and over for the same set of data. Google App Engine provides you with Memcache so that you can supercharge your app's response times.
When you use App Engine's Python library to model and query data, the caching of the data that is fetched from Datastore is done for you automatically, which was not the case in previous versions of the library.
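For content that you cache yourself, such as rendered templates or repeated query results, the usual approach is to check the cache first and only do the expensive work on a miss. The following is a minimal sketch of that idea in plain Python. Note that TinyCache is a hypothetical in-memory stand-in written for this illustration, not App Engine's Memcache API, and load_greeting simply pretends to be a slow Datastore query or template render.

```python
import time


class TinyCache(object):
    """Hypothetical in-memory stand-in for a memcache-style service."""

    def __init__(self):
        self._items = {}  # key -> (value, expiry timestamp or None)

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at is not None and time.time() >= expires_at:
            del self._items[key]  # entry is stale, so treat it as a miss
            return None
        return value

    def set(self, key, value, ttl=None):
        expires_at = time.time() + ttl if ttl is not None else None
        self._items[key] = (value, expires_at)


def cached_fetch(cache, key, loader, ttl=60):
    """Return the cached value for key, calling loader() only on a miss."""
    value = cache.get(key)
    if value is None:
        value = loader()
        cache.set(key, value, ttl=ttl)
    return value


fetch_calls = []  # records how often the expensive work actually runs


def load_greeting():
    # Stands in for a slow query or a template render (made up for this sketch).
    fetch_calls.append("hit backend")
    return "Hello, App Engine"


cache = TinyCache()
first = cached_fetch(cache, "greeting", load_greeting)   # miss: runs load_greeting
second = cached_fetch(cache, "greeting", load_greeting)  # hit: served from the cache
```

One limitation of this simple version is that a legitimately cached value of None is indistinguishable from a miss, so the loader would run every time for such keys.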
You might want to perform certain tasks at certain intervals. That's where scheduled tasks fit in. Conceptually, they are similar to Linux/UNIX cron jobs. However, instead of specifying commands or programs, you indicate URLs, which receive HTTP GET requests from App Engine at the specified intervals. You're required to finish your processing in under 10 minutes. However, if you want to run longer tasks, you have that option too by tweaking the scaling options, which will be examined in the last chapter when we look at deployment.
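Because a scheduled task is just an HTTP GET request to one of your URLs, the handler looks like any other request handler. The following is a rough sketch using a bare WSGI callable rather than any particular framework. The /tasks/cleanup path, the SESSIONS data, and the cleanup_expired_sessions helper are all made up for this illustration. The check relies on the X-Appengine-Cron header that App Engine adds to its cron requests; treat that check as an assumption to verify against the documentation for your runtime.

```python
import time

# Made-up data for this sketch: session id -> expiry timestamp.
SESSIONS = {"stale": 0.0, "fresh": float("inf")}


def cleanup_expired_sessions(now):
    """Hypothetical periodic job: drop expired sessions, return how many."""
    expired = [sid for sid, expiry in SESSIONS.items() if expiry <= now]
    for sid in expired:
        del SESSIONS[sid]
    return len(expired)


def cron_app(environ, start_response):
    """Minimal WSGI app exposing one hypothetical cron target URL."""
    headers = [("Content-Type", "text/plain")]
    path = environ.get("PATH_INFO", "")
    method = environ.get("REQUEST_METHOD", "GET")

    if path != "/tasks/cleanup" or method != "GET":
        start_response("404 Not Found", headers)
        return [b"not found"]

    # Assumption: scheduled requests carry "X-Appengine-Cron: true",
    # which WSGI exposes as HTTP_X_APPENGINE_CRON.
    if environ.get("HTTP_X_APPENGINE_CRON") != "true":
        start_response("403 Forbidden", headers)
        return [b"cron only"]

    removed = cleanup_expired_sessions(time.time())
    start_response("200 OK", headers)
    return [("removed %d" % removed).encode("ascii")]
```

Rejecting requests without the header keeps ordinary visitors from triggering the job by visiting the URL directly.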
Besides scheduled tasks, you might be interested in processing tasks in the background. For this, Google App Engine allows you to create task queues and enqueue tasks in them, specifying a target URL with a payload, to which they are dispatched at a configurable rate. Hence, it is possible to asynchronously perform various computations and other pieces of work that otherwise cannot be accommodated in request handlers.
App Engine provides two types of queues: push queues and pull queues. In push queues, the tasks are delivered to your code via the URL dispatch mechanism, and the only limitation is that you must execute them within the App Engine environment. On the other hand, you can have pull queues, where it's your responsibility to pull tasks and delete them once you are done. To that end, pull tasks can be accessed and processed from outside Google App Engine. In push queues, each task is retried with backoffs if it fails, and you can configure the rate at which tasks get processed for each task queue or even at the level of an individual task. Automatic retries are only available for push queues; for pull queues, you will have to manage repeated attempts of failed tasks on your own.
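The pull model described above boils down to a lease-then-delete cycle: a worker leases some tasks for a limited time, processes them, and deletes only the ones that succeeded. The following is a minimal sketch of that cycle in plain Python. PullQueue is a hypothetical in-memory stand-in written for this illustration, not the App Engine task queue API, and the now argument is passed in explicitly only to keep the example deterministic.

```python
import collections


class PullQueue(object):
    """Hypothetical in-memory stand-in for a pull queue."""

    def __init__(self):
        # task id -> [payload, time until which the task is leased]
        self._tasks = collections.OrderedDict()
        self._next_id = 0

    def add(self, payload):
        self._next_id += 1
        self._tasks[self._next_id] = [payload, 0.0]
        return self._next_id

    def lease(self, max_tasks, lease_seconds, now):
        """Hand out up to max_tasks tasks that nobody currently holds."""
        leased = []
        for task_id, entry in self._tasks.items():
            if len(leased) >= max_tasks:
                break
            if entry[1] <= now:
                entry[1] = now + lease_seconds
                leased.append((task_id, entry[0]))
        return leased

    def delete(self, task_id):
        self._tasks.pop(task_id, None)

    def __len__(self):
        return len(self._tasks)


def process_batch(queue, handler, now, max_tasks=10, lease_seconds=30):
    """Lease tasks, run handler on each, and delete only the successes."""
    done = 0
    for task_id, payload in queue.lease(max_tasks, lease_seconds, now):
        try:
            handler(payload)
        except Exception:
            continue  # keep the task; it can be leased again once the lease expires
        queue.delete(task_id)
        done += 1
    return done


def handle(payload):
    # Made-up handler: one payload always fails to show the retry path.
    if payload == "bad":
        raise ValueError("cannot process %r" % payload)


queue = PullQueue()
for word in ["ok", "bad", "ok"]:
    queue.add(word)

first = process_batch(queue, handle, now=0)   # two tasks succeed and are deleted
retry = process_batch(queue, handle, now=10)  # failed task is still leased: nothing to do
later = process_batch(queue, handle, now=31)  # lease expired: task is tried again
```

Here the failed task stays in the queue and simply becomes available again after its lease runs out, which is the kind of retry bookkeeping that a pull queue leaves to you.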
Each app has a default task queue, and you can create additional queues, which are defined in the queue.yaml file. Just like scheduled tasks, each task is supposed to finish its processing within 10 minutes. However, if it takes longer than this, we'll learn how to accommodate such a situation when we examine application deployment in the last chapter.
MapReduce is a distributed computing paradigm that is widely used at Google to crunch enormous amounts of data, and many open source implementations of the model now exist, such as Hadoop. App Engine provides MapReduce functionality as well, but at the time of writing this book, Google has moved the development and support of the MapReduce libraries for Python and Java to the open source community, and they are hosted on GitHub. These features are bound to change a lot. Therefore, we'll not cover MapReduce in this book, but if you want to explore this topic, check https://github.com/GoogleCloudPlatform/appengine-mapreduce/wiki for further details.
Google is in the mail business, so your applications can send mail. You can not only send e-mails, but receive them as well. If you plan to write your app in Java, you will use JavaMail as the API to send e-mails. You can, of course, use third-party solutions to send e-mail as well, such as SendGrid, which integrates nicely with Google App Engine. If you're interested in this kind of solution, visit https://cloud.google.com/appengine/docs/python/mail/sendgrid.
XMPP is all about instant messaging. You may want to build chat features into your app or use it in other innovative ways, such as notifying users about a purchase with an instant message, or anything else for that matter. XMPP services are at your disposal. You can send a message to a user, whereas your app will receive messages from users in the form of HTTP POST requests to a specific URL. You can respond to them in whatever way you see fit.
You might want to build something that does not work with the communication model of XMPP, and for this, you have channels at your disposal. This allows you to create a persistent connection from one client to other clients via Google App Engine. You can supply a client ID to App Engine, and a channel is opened for you. Any client can listen on this channel, and when you send a message to this channel, it gets pushed to all the clients. This can be useful, for instance, if you wish to show the real-time activity of other users, similar to what you notice in Google Docs when editing a spreadsheet or document together.
Authentication is an important part of any web application. App Engine allows you to generate URLs that redirect users to enter their Google account credentials, and it manages sessions for you. You also have the option of restricting the sign-in functionality to a specific domain in case your company uses Google Apps for business and you intend to build some internal solutions. You can limit access to the users on your domain alone.
Did you ever come across a button labeled Sign in with Facebook, Twitter, Google, or LinkedIn on various websites? Your app can have similar capabilities as well, where you let users use the credentials that they registered with on your website not only to sign in to your site, but to sign in to other sites as well. In technical jargon, Google App Engine can be an OAuth provider.