Core concept

In the early days, the term serverless referred to applications that depended on third-party applications or services to manage server-side logic: cloud-hosted databases, such as Google Firebase, or authentication services, such as Auth0 and AWS Cognito. These were referred to as Backend as a Service (BaaS). But serverless also means code that is developed to be event-triggered and executed in stateless compute containers. This architecture is popularly known as Function as a Service (FaaS). Let's look at each type of service in a bit more detail.

Backend as a Service

BaaS was conceptualized by the likes of Auth0 and Google Firebase. Auth0 started out as authentication as a service, but has since moved toward FaaS. So, basically, BaaS is a third-party service through which we can implement the functionality we need; it provides the server-side logic for the application.

The common approach used to be that web and mobile application developers coded their own authentication functionality, such as login, registration, and password management, and each of these services had its own API that had to be incorporated into the application. This was complicated and time consuming for developers. BaaS providers made it easy by offering a unified API and SDK and bridging them with the frontend of the application, so that developers did not have to build their own backend for each service. In this way, both time and money are saved.

Say, for example, that we want to build a portal that requires authentication before consuming our services. We would need login, signup, and authentication systems in place, and we would also want to make it easy for consumers to sign in with a single click using their existing Google, Facebook, or Twitter account. Developing these features individually requires a lot of time and effort.

But by using BaaS, we can easily let our portal sign users up and authenticate them with a Google, Facebook, or Twitter account. Another BaaS offering is Firebase, provided by Google. Firebase is a database service used by mobile apps; it removes database-administration overhead and provides authorization for different types of users. In a nutshell, this is how BaaS works. Now let's look at the FaaS side of the serverless approach.

Function as a Service

As mentioned at the start of the chapter, FaaS is essentially a small program or function that performs a small task when triggered by an event, unlike a monolithic app, which does lots of things. So, in a FaaS architecture, we break our app into small, self-contained programs or functions, instead of a monolithic app that runs on PaaS and performs multiple functions. For instance, each endpoint in an API could be a separate function, and we can run these functions on demand rather than running the app full time.
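
To make this concrete, here is roughly what one endpoint-sized function might look like. This is a minimal, provider-neutral sketch in TypeScript; the event shape and the getUserById helper are hypothetical stand-ins.

```typescript
// A hypothetical event shape carrying just what this one endpoint needs.
interface GetUserEvent {
  userId: string;
}

// A stand-in data-layer call; a real function would query a database.
async function getUserById(userId: string): Promise<{ id: string; name: string }> {
  return { id: userId, name: "example-user" };
}

// The entire deployable unit: one API endpoint, one self-contained function.
// The platform invokes it on demand; nothing runs between requests.
export async function handler(event: GetUserEvent) {
  const user = await getUserById(event.userId);
  return { statusCode: 200, body: JSON.stringify(user) };
}
```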

The common approach would be to have an API coded in a multi-layer architecture, something like a three-tier architecture, where the code is broken down into presentation, business, and data layers. All the routes trigger the same handler functions in the business layer, and data is processed and sent to the data layer, which would be a database or a file. The following diagram shows this three-tier architecture:

That might work fine for a small number of simultaneous users, but how do we manage this when traffic grows exponentially? The application suddenly becomes a computing nightmare. To resolve this problem, we would ideally separate the data layer, which contains the database, onto its own server. But the problem is still not solved, because the API routes and business logic remain within one application, so scaling is still a problem.

A serverless approach to the same problem is painless. Instead of having one server for application API endpoints and business logic, each part of the application is broken down into independent, auto-scalable functions. The developer writes a function, and the serverless provider wraps the function into a container that can be monitored, cloned, and distributed on any number of servers, as shown in the following diagram:

The benefit of breaking down an application into functions is that we can scale and deploy each function separately. For instance, if one endpoint in our API receives 90 percent of our traffic, or our image-processing code eats up most of the computing time, that one function or bit of code can be distributed and scaled much more easily than the entire application.

In a FaaS system, the functions are expected to start within milliseconds in order to allow the handling of individual requests. In PaaS systems, by contrast, there is typically an application thread that keeps running for long periods of time, and handles multiple requests. FaaS services are charged per execution time of the function, whilst PaaS services charge per running time of the thread in which the server application is running.
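
To make the billing difference concrete, here is a back-of-the-envelope comparison. All of the rates and workload numbers below are hypothetical, chosen only to show the shape of each pricing model, not any provider's actual prices.

```typescript
// Hypothetical prices, for illustration only.
const FAAS_PRICE_PER_GB_SECOND = 0.00001667; // pay per function execution time
const PAAS_PRICE_PER_HOUR = 0.05;            // pay while the server thread runs

// A hypothetical workload: 2 million requests/month, 200 ms each, 128 MB memory.
const requests = 2_000_000;
const secondsPerRequest = 0.2;
const memoryGb = 128 / 1024;

// FaaS: billed only for the seconds the functions actually run.
const faasCost = requests * secondsPerRequest * memoryGb * FAAS_PRICE_PER_GB_SECOND;

// PaaS: billed for the whole month, whether or not any requests arrive.
const paasCost = 24 * 30 * PAAS_PRICE_PER_HOUR;

console.log(`FaaS: $${faasCost.toFixed(2)}, PaaS: $${paasCost.toFixed(2)}`);
// With these made-up numbers: FaaS comes to about $0.83, PaaS to $36.00.
```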

In a microservices architecture, applications are loosely coupled, fine-grained, and lightweight. Microservices were born out of the need to break the monolithic application down into small services so that each can be developed, managed, and scaled independently. FaaS takes that a step further by breaking things down into even smaller units called functions.

The trend is pretty clear: the unit of work is getting smaller and smaller. We're moving from monoliths to microservices, and now to functions, as shown in the following diagram:

With the rise of containers, many cloud vendors saw that a serverless function architecture would give developers the flexibility to build their applications without worrying about the ops (operations) side. AWS was the first to launch such a service, under the name Lambda; other cloud providers followed the trend, such as Microsoft Azure with Azure Functions and Google Cloud with Google Functions. This popularity also gave some vendors the opportunity to build open source versions. Popular ones include IBM's OpenWhisk, which is Apache-licensed; Kubeless, which is built on top of Kubernetes; and OpenFaaS, which is built on top of Docker containers. Even Oracle jumped into the fray with Oracle Fn. Let's briefly look at each vendor in this chapter and learn how they work. We will then travel with them over the rest of the book, looking at their approach to DevOps.

 

AWS Lambda

Amazon Web Services (AWS) was the first to launch a FaaS, or serverless, service, in 2014, called Lambda. AWS is currently the leader in this kind of serverless provision. AWS Lambda follows the event-driven approach: at the trigger of an event, Lambda executes the code and performs the required functionality, and it can automatically scale up as traffic rises, as well as automatically scale back down. Lambda functions run in response to events, such as changes to the data in an Amazon S3 bucket, a change to an Amazon DynamoDB table, or an HTTP request through the AWS API Gateway. In this way, Lambda helps to build triggers for multiple services, such as S3, DynamoDB, and streaming data stored in Kinesis.

So, Lambda helps developers worry only about the coding; the computing part, such as memory, CPU, network, and storage, is taken care of by Lambda automatically. It also automatically manages the patching, logging, and monitoring of functions. Architecturally, a Lambda function is invoked in a container, which is launched based on the configuration provided. These containers might be reused for subsequent invocations of functions. As demand dies down, containers are decommissioned, but this is all managed internally by Lambda, so users do not have to worry about it; they have no control over these containers anyway. The languages supported by AWS Lambda functions are Node.js, Java, C#, and Python.
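
To give a feel for what this looks like, here is a minimal Node.js Lambda handler, written in TypeScript. The event and response shapes follow the AWS API Gateway proxy-integration format; the greeting logic itself is just a placeholder.

```typescript
// A minimal Lambda handler for an API Gateway proxy event.
// Lambda invokes the exported handler and supplies the event object.
export async function handler(event: {
  queryStringParameters?: { name?: string } | null;
}) {
  const name = event.queryStringParameters?.name ?? "world";

  // The API Gateway proxy integration expects a response of this shape.
  return {
    statusCode: 200,
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ message: `Hello, ${name}!` }),
  };
}
```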

When building serverless applications, the core components are functions and the event source. The event source is the AWS service or custom application that publishes events, and the Lambda function processes them. The maximum execution time for each Lambda function is 300 seconds.

Let's look at an example of how AWS Lambda actually works. In a photo-sharing application, people upload their photos, and these photos need thumbnails so that they can be displayed on the user's profile page. In this scenario, we can use a Lambda function to create the thumbnails: the moment a photo is uploaded to an AWS S3 bucket, S3, which supports event sources, publishes an object-created event and invokes the Lambda function. The Lambda function code reads the new photo object from the S3 bucket, creates a thumbnail version, and saves it in another S3 bucket.
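
A minimal sketch of such a thumbnail function might look like the following. It assumes the AWS SDK for JavaScript (v2) and the sharp image library; the destination bucket name photo-thumbnails and the 200-pixel width are made-up values.

```typescript
import { S3 } from "aws-sdk"; // AWS SDK v2
import sharp from "sharp";    // image-processing library

const s3 = new S3();

// Invoked by S3's object-created event; the Records array is part of
// the documented S3 event format.
export async function handler(event: {
  Records: { s3: { bucket: { name: string }; object: { key: string } } }[];
}) {
  for (const record of event.Records) {
    const bucket = record.s3.bucket.name;
    // S3 URL-encodes object keys in event notifications.
    const key = decodeURIComponent(record.s3.object.key.replace(/\+/g, " "));

    // Read the uploaded photo, resize it, and save the thumbnail
    // to a separate (hypothetical) bucket.
    const photo = await s3.getObject({ Bucket: bucket, Key: key }).promise();
    const thumbnail = await sharp(photo.Body as Buffer).resize(200).toBuffer();

    await s3
      .putObject({ Bucket: "photo-thumbnails", Key: key, Body: thumbnail })
      .promise();
  }
}
```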

In Chapter 3, Applying DevOps on AWS Lambda Applications, we will look at how we can create, run, and deploy Lambda functions in an automated way, and we will also monitor them and perform root-cause analysis through logging.

Azure Functions

Azure Functions is Microsoft's venture into serverless architecture. It came onto the market in March 2016. Azure Functions allows functions to be coded in C#, F#, PHP, Node.js, Python, and Java, and it also supports Bash, batch, and PowerShell files. Azure Functions has seamless integration with Visual Studio Team Services (VSTS), Bitbucket, and GitHub, which makes continuous integration and continuous deployment easier. Azure Functions supports various types of event triggers: timer-based events for scheduled tasks, and OneDrive and SharePoint events, which can be configured to trigger operations in functions. Real-time processing of data and files adds the ability to operate a serverless bot that uses Cortana as the information provider. Microsoft has also introduced Logic Apps, a tool with a workflow-orchestration engine, which allows less technical users to build serverless applications. In addition, Azure Functions allows triggers to be created for other Azure cloud services and for HTTP requests.

The maximum execution time is five minutes per function. Azure Functions provides two types of app service plan: Dynamic and Classic. An App Service is a container, or environment, for a set of Azure functions to run in. The Dynamic option is similar to Lambda, where we pay for the time and memory our function uses to run. The Classic option is about allocating your own existing or provisioned App Service resources for the functions at no extra cost. Memory is allotted per App Service, whereas in AWS Lambda, memory is allocated per function. Azure Functions allows just 10 concurrent executions per function.
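
As a rough sketch, an HTTP-triggered Azure function in TypeScript looks something like the following, using the Context and HttpRequest types from the @azure/functions package. Treat it as illustrative; the exact programming model depends on the runtime version you target, and the HTTP binding itself is declared separately in the function's function.json file.

```typescript
import { AzureFunction, Context, HttpRequest } from "@azure/functions";

// An HTTP-triggered function: the runtime injects the invocation
// context and the incoming request.
const httpTrigger: AzureFunction = async function (
  context: Context,
  req: HttpRequest
): Promise<void> {
  const name = req.query.name || (req.body && req.body.name) || "world";

  // Assigning to context.res populates the HTTP response binding.
  context.res = {
    status: 200,
    body: { message: `Hello, ${name}!` },
  };
};

export default httpTrigger;
```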

Azure Functions follows a similar pricing model to AWS Lambda. The total cost is based on the number of triggers executed and the execution time. The first one million requests are free; beyond that, it costs $0.02 for every 100,000 executions, so, for example, a month with three million executions would cost $0.40 for the two million executions beyond the free tier.

We will be looking at automating the deployment of Azure Functions in Chapter 4, DevOps with Azure Functions, and also at how different DevOps processes fit in to give Azure Functions a faster time to market.

Google Functions

Google joined the party a little late compared to AWS Lambda and Azure Functions, coming onto the market with a beta version of Cloud Functions in March 2017. Currently, we can write Google Functions in Node.js; other languages will be supported soon. Google Functions supports internal event-bus triggers as well as HTTP triggers, which respond to events such as GitHub webhooks, Slack messages, or any HTTPS request, and it also supports mobile backends through events from Firebase analytics and Firebase's real-time database.

In terms of scalability, there is built-in provision for autoscaling. Google Functions supports 1,000 functions per project and allows 400 executions per function, which is claimed to be a soft limit. Google Functions allows an execution time of 540 seconds (9 minutes). Deployment is supported through ZIP upload, Cloud Storage, and Cloud Source Repositories. The event sources are Cloud Pub/Sub and Cloud Storage objects. The logging of function executions is managed through Stackdriver Logging, which is Google Cloud's logging tool.
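
An HTTP-triggered Cloud Function is simply an exported handler that receives Express-style request and response arguments. The sketch below, in TypeScript, types those arguments structurally to stay self-contained; the function name helloHttp is an arbitrary example.

```typescript
// A minimal HTTP-triggered Cloud Function. The platform routes each
// HTTPS request to the exported handler and sends back the response.
export function helloHttp(
  req: { query: { name?: string } },
  res: { status(code: number): { send(body: string): void } }
): void {
  const name = req.query.name || "world";
  res.status(200).send(`Hello, ${name}!`);
}
```

Deploying it would look something like gcloud functions deploy helloHttp --trigger-http; the exact flags vary between releases, so treat the command as indicative.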

We will sail through Google Functions' DevOps approach in a later chapter, where we will also look at best practices around DevOps using Google Functions.

OpenWhisk

OpenWhisk is an open source FaaS platform that can be deployed to the cloud or to an on-premise data center. It is driven by IBM, which has open sourced it under the Apache license. We can sign up for OpenWhisk through Bluemix (Bluemix is IBM's cloud platform), or we can set it up locally through Vagrant. It works in a similar way to any FaaS technology, such as AWS Lambda, Azure Functions, or Google Functions, but OpenWhisk also supports open event providers. So, if we have a custom event provider, we can incorporate it, because, unlike the other cloud platforms, OpenWhisk allows external events within its service. One example would be a function triggered through OpenWhisk upon the appearance of a new item in an RSS feed. OpenWhisk allows an organization to set up its own FaaS platform on its own premises if it is not happy to let data leave the organization. The languages supported by OpenWhisk are Swift and JavaScript/Node.js, and OpenWhisk has integrated Docker support for binary code execution within functions.
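
For comparison with the other providers, here is a minimal OpenWhisk action sketched in TypeScript. OpenWhisk's Node.js runtime calls the action's main function with the invocation parameters and treats the returned object as the result.

```typescript
// A minimal OpenWhisk action: the runtime invokes main() with the
// event's parameters and returns this object as the activation result.
export function main(params: { name?: string }): { message: string } {
  const name = params.name || "world";
  return { message: `Hello, ${name}!` };
}
```

Once compiled to JavaScript, such an action can be created and invoked with the wsk CLI, for example wsk action create hello hello.js followed by wsk action invoke hello --param name Shashi --result.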

Other serverless architectures

There are many other serverless options, such as OpenFaaS, Fission, and Iron.io. I won't cover them in this book, but let's skim through their features. OpenFaaS is an open source alternative for serverless architecture. It is built on Docker containers, Swarm, and Kubernetes. It has its own UI portal and also has CLI support for deploying functions. OpenFaaS supports Node.js, Python, Go, and C# on Windows and Linux. We can set it up on a cloud, on a local laptop, or on an on-premise server. We can write functions for almost everything, or so OpenFaaS claims. OpenFaaS is written in Golang, and it allows events through HTTP/HTTPS requests.

Fission is yet another open source version of serverless architecture. The underlying technology is Kubernetes and Docker containers, which can be deployed on both cloud and on-premise infrastructure. It is designed as a set of microservices, and its components are the controller, the router, and the pool manager. The router manages HTTP requests; the controller manages functions, event triggers, and environment images; and the pool manager manages the pool of containers and loads functions into those containers. The functions are written in Python.

Serverless benefits

There are a number of pros and cons to using serverless architecture. Let's look at the bright side first. Why would anyone build their application on a serverless architecture such as AWS Lambda or OpenWhisk? The main reasons are how efficiently the application performs, how fast it scales, and, most importantly, its cost. Let's look at a few important pros and then move on to the cons.

Faster time to market

We can push the application to market much faster, as ops becomes much simpler, which helps developers concentrate only on their development. The Ops team does not have to bother writing code to handle scaling or worry about the underlying infrastructure.

Also, teams can build applications much faster with the help of third-party integrations, such as APIs for OAuth, Twitter, and Maps.

Highly scalable

Every company wants their application to perform well, have zero downtime, and scale quickly and easily with rising traffic, but with monolithic application development, this can become very difficult. The Ops team has to be vigilant in scaling the underlying infrastructure as the load on the application rises, and a huge amount of time and money is lost to downtime caused by spikes in traffic. But serverless computing is highly scalable, and applications can be scaled and descaled within seconds.

Low cost

In serverless computing, developers are billed only for the time that the function is running, unlike IaaS and PaaS, which are billed 24/7 for each server. This matters for companies with a huge estate of apps, APIs, or microservices that currently run 24/7 and hold onto resources 100 percent of the time, whether they are required or not. With serverless, instead of running the application 24/7, we can execute functions on demand and share the resources, so we can reduce idle time substantially and still make the application run faster.

Latency and geolocation improvement

The scalability of an application depends on three factors: the number of users, the location of the users, and the latency of the network. In today's world, applications have a global audience, which can add to latency. But the danger of latency can be largely mitigated with a serverless platform. With serverless, a container is instantiated to run a function at every event call, and this container can be created close to the user's geographical region, which automatically improves the performance of the app.

Serverless drawbacks

Although there are upsides to using serverless, there are also downsides. Let's look at the other side of the serverless coin.

Increased complexity

The more granular we go with an application, the more complex it becomes. The code for each function might get simpler, but the application as a whole gets more complex. Say, for example, that we break the application into 10 different microservices: we would have to manage 10 different apps, whereas a monolithic application is just one app to manage.

Lack of tooling

Let's say that we break our monolithic application into 50 different functions. For the monolithic application, there is a wide variety of mature processes and tools for management, logging, monitoring, and deployment. Because serverless is pretty new to the market, the tooling for monitoring or logging an application that runs for only a few seconds is limited and challenging as of now, but over time there will be many efficient ways to do this.

Complexity with architecture

It is hard to decide how granular a function should be, and it is time consuming to assess, implement, and test each choice. Managing too many functions would be cumbersome, while at the same time, ignoring granularity would result in us setting up mini-monoliths.

Drawback in implementation

The biggest challenge with serverless is integration testing. We will write many functions for an application, but how do we integrate them to work as one application? And before that, how do we test how efficiently they work together? As serverless is new and still maturing, the testing options available are still limited. But we will cover a few aspects of deployment and testing in future chapters.

DevOps with serverless

DevOps is another buzzword that has been around for quite a long time. Like serverless, DevOps is a confusing term, and lots of people have different perspectives on it. Some say that DevOps is just tools; some feel that DevOps consists of a few processes; even IaaS and PaaS fall under the DevOps umbrella. As I understand it, DevOps is a collaboration of tools, processes, and feedback, all of which go hand in hand for a successful DevOps implementation. But why are we talking about DevOps here? In short, because we need DevOps for a smooth transition to production, to log and monitor serverless functions, and to test them before they reach users.

From a DevOps functional perspective, I will be covering version control, continuous integration, continuous deployment, monitoring, and logging for AWS Lambda functions, Azure Functions, Google Functions, and OpenWhisk. Version control is the process of versioning the code so that we can branch it, package it, deploy it, and also roll back to a previous version. Continuous integration is the practice whereby developers integrate code together frequently, with automated builds to detect and mitigate problems early on. Continuous deployment is essentially a pipeline in which code is continuously verified using automated testing and then deployed to an environment; this pipeline moves code smoothly towards production with minimal manual intervention.