Sign In Start Free Trial
Account

Add to playlist

Create a Playlist

Modal Close icon
You need to login to use this feature.
  • Book Overview & Buying Amazon Redshift Cookbook
  • Table Of Contents Toc
Amazon Redshift Cookbook

Amazon Redshift Cookbook

By : Shruti Worlikar, Thiyagarajan Arumugam, Harshida Patel
4.8 (9)
close
close
Amazon Redshift Cookbook

Amazon Redshift Cookbook

4.8 (9)
By: Shruti Worlikar, Thiyagarajan Arumugam, Harshida Patel

Overview of this book

Amazon Redshift is a fully managed, petabyte-scale AWS cloud data warehousing service. It enables you to build new data warehouse workloads on AWS and migrate on-premises traditional data warehousing platforms to Redshift. This book on Amazon Redshift starts by focusing on Redshift architecture, showing you how to perform database administration tasks on Redshift. You'll then learn how to optimize your data warehouse to quickly execute complex analytic queries against very large datasets. Because of the massive amount of data involved in data warehousing, designing your database for analytical processing lets you take full advantage of Redshift's columnar architecture and managed services. As you advance, you’ll discover how to deploy fully automated and highly scalable extract, transform, and load (ETL) processes, which help minimize the operational efforts that you have to invest in managing regular ETL pipelines and ensure the timely and accurate refreshing of your data warehouse. Finally, you'll gain a clear understanding of Redshift use cases, data ingestion, data management, security, and scaling so that you can build a scalable data warehouse platform. By the end of this Redshift book, you'll be able to implement a Redshift-based data analytics solution and have understood the best practice solutions to commonly faced problems.
Table of Contents (13 chapters)
close
close

Creating an Amazon Redshift provisioned cluster using AWS CloudFormation

With an AWS CloudFormation template, you treat your infrastructure as code. This enables you to create an Amazon Redshift cluster using a JSON or YAML file. The declarative code in the file contains the steps to create the AWS resources and enables easy automation and distribution. This template allows you to standardize the Amazon Redshift provisioned cluster creation to meet your organizational infrastructure and security standards.

Further, you can distribute them to different teams within your organization using the AWS service catalog for an easy setup. In this recipe, you will learn how to use a CloudFormation template to deploy an Amazon Redshift provisioned cluster and the different parameters associated with it.

Getting ready

To complete this recipe, you will need:

  • An IAM user with access to AWS CloudFormation, Amazon EC2, and Amazon Redshift

How to do it…

We use the CloudFormation template to author the Amazon Redshift cluster infrastructure as code using a JSON-based template. Follow these steps to create the Amazon Redshift provisioned cluster using the CloudFormation template:

  1. Download the AWS CloudFormation template from https://github.com/PacktPublishing/Amazon-Redshift-Cookbook-2E/blob/main/Chapter01/Creating_Amazon_Redshift_Cluster.json.
  2. Navigate to the AWS Console, choose CloudFormation, and choose Create stack.
  3. Click on the Template is ready and Upload a template file options, choose the downloaded Creating_Amazon_Redshift_Cluster.json file from your local computer, and click Next.
  4. Set the following input parameters:
    1. Stack name: Enter a name for the stack, for example, myredshiftcluster.
    2. ClusterType: Single-node or a multiple node cluster.
    3. DatabaseName: Enter a database name, for example, dev.
    4. InboundTraffic: Restrict the CIDR ranges of IPs that can access the cluster. 0.0.0.0/0 opens the cluster to be globally accessible, which would be a security risk.
    5. MasterUserName: Enter a database master user name, for example, awsuser.
    6. MasterUserPassword: Enter a master user password. The password must be 8-64 characters long and must contain at least one uppercase letter, one lowercase letter, and one number. It can contain any printable ASCII character except /, "", or @.
    7. NodeType: Enter the node type, for example, ra3.xlplus.
    8. NumberofNodes: Enter the number of compute nodes, for example, 2.
    1. Redshift cluster port: Choose any TCP/IP port, for example, 5439.
  5. Click Next and Create Stack.

AWS CloudFormation has deployed all the infrastructure and configuration listed in the template in completed and we’ll wait till the status changes to CREATE_COMPLETE.

  1. You can now check the outputs section in the CloudFormation stack and look for the cluster endpoint, or navigate to the Amazon Redshift | Clusters | myredshiftcluster | General information section to find the JDBC/ODBC URL to connect to the Amazon Redshift cluster.

How it works…

Let’s now see how this CloudFormation template works. The CloudFormation template is organized into three broad sections: input parameters, resources, and outputs. Let’s discuss them one by one.

The Parameters section is used to allow user input choices and also can be used to apply constraints to the values. To create an Amazon Redshift resource, we collect parameters such as database name, master username/ password, and cluster type. The parameters will later be substituted when creating the resources. Here is an illustration of the Parameters section of the template:

"Parameters": {
        "DatabaseName": {
            "Description": "The name of the first database to be created when the cluster is created",
            "Type": "String",
            "Default": "dev",
            "AllowedPattern": "([a-z]|[0-9])+"
        },
        "NodeType": {
            "Description": "The type of node to be provisioned",
            "Type": "String",
            "Default": "ra3.xlplus",
            "AllowedValues": [
                "ra3.16xlarge",
                "ra3.4xlarge",
                "ra3.xlplus",
            ]
        }

In the previous input section, DatabaseName is a string value that defaults to dev and also enforces an alphanumeric validation when specified using the condition check of AllowedPattern: ([a-z]|[0-9])+. Similarly, NodeType defaults to ra3.xlplus and allows the valid NodeType from a list of values.

The Resources section contains a list of resource objects, and the Amazon resource is invoked using AWS::Redshift::Cluster along with references to the input parameters, such as DatabaseName, ClusterType, NumberOfNodes, NodeType, MasterUsername, and MasterUserPassword:

"Resources": {
        "RedshiftCluster": {
            "Type": "AWS::Redshift::Cluster",
            "DependsOn": "AttachGateway",
            "Properties": {
                "ClusterType": {
                    "Ref": "ClusterType"
                },
                "NumberOfNodes": {},
                "NodeType": {
                    "Ref": "NodeType"
                },
                "DBName": {
                    "Ref": "DatabaseName"
                },
..

The Resources section references the input section for values such as NumberOfNodes, NodeType, DatabaseName, that will be used during the resource creation.

The Outputs section is a handy place to capture the essential information about your resources or input parameters that you want to have available after the stack has been created, so you can easily identify the resource object names that are created.

For example, you can capture output such as ClusterEndpoint that will be used to connect into the cluster as follows:

"Outputs": {
        "ClusterEndpoint": {
            "Description": "Cluster endpoint",
            "Value": {
                "Fn::Join": [
                    ":",
                    [
                        {
                            "Fn::GetAtt": [
                                "RedshiftCluster",
                                "Endpoint.Address"
                            ]
                        },
                        {
                            "Fn::GetAtt": [
                                "RedshiftCluster",
                                "Endpoint.Port"
                            ]
                        }
                    ]
                ]
            }
        }

When authoring the template from scratch, you can take advantage of the AWS Application Composer – an integrated development environment for authoring and validating code. Once the template is ready, you can launch the resources by creating a stack (collection of resources) or using the AWS CloudFormation console, API, or AWS CLI. You can also update or delete the template afterward.

Visually different images
CONTINUE READING
83
Tech Concepts
36
Programming languages
73
Tech Tools
Icon Unlimited access to the largest independent learning library in tech of over 8,000 expert-authored tech books and videos.
Icon Innovative learning tools, including AI book assistants, code context explainers, and text-to-speech.
Icon 50+ new titles added per month and exclusive early access to books as they are being written.
Amazon Redshift Cookbook
notes
bookmark Notes and Bookmarks search Search in title playlist Add to playlist download Download options font-size Font size

Change the font size

margin-width Margin width

Change margin width

day-mode Day/Sepia/Night Modes

Change background colour

Close icon Search
Country selected

Close icon Your notes and bookmarks

Confirmation

Modal Close icon
claim successful

Buy this book with your credits?

Modal Close icon
Are you sure you want to buy this book with one of your credits?
Close
YES, BUY

Submit Your Feedback

Modal Close icon
Modal Close icon
Modal Close icon