Tableau Certified Data Analyst Certification Guide

By: Mr. Harry Cooney, Mr. Daisy Jones

Overview of this book

The Tableau Certified Data Analyst certification validates the essential skills needed to explore, analyze, and present data, propelling your career in data analytics. Whether you're a seasoned Tableau user or just starting out, this comprehensive resource is your roadmap to mastering Tableau and achieving certification success. The book begins by exploring the fundamentals of data analysis, from connecting to various data sources to transforming and cleaning data for meaningful insights. With practical exercises and realistic mock exams, you'll gain hands-on experience that reinforces your understanding of Tableau concepts and prepares you for the challenges of the certification exam. As you progress, expert guidance and clear explanations make it easy to navigate complex topics as each chapter builds upon the last, providing a seamless learning experience—from creating impactful visualizations to managing content on Tableau Cloud. Written by a team of experts, this Tableau book not only helps you pass the certification exam but also equips you with the skills and confidence needed to excel in your career. It is an indispensable resource for unlocking the full potential of Tableau.
Table of Contents (11 chapters)

Relational Databases

While Excel and CSV files offer a simple way to connect to data in Tableau, they are easily altered by human error and are not dynamic. Most organizations have outgrown Excel and CSV files for the following reasons:

  • Organizations require data that can be found quickly when needed and trusted to be reliable and accurate
  • Solutions need to comfortably handle the natural growth of data and of the number of people wanting to access and manipulate it
  • Files can often be duplicated and shared freely, risking unauthorized access
  • Alternative solutions offer greater opportunities to connect from other locations, rather than from a single local machine

Relational databases are often a reliable means of achieving these benefits. They are data storage systems that organize information in the familiar tabular structure, with rows and columns; when databases are discussed in a Tableau-specific context, users are usually referring to relational databases. Databases are often hosted on a server, which provides the resources required to run and manage the database; servers can often host multiple databases simultaneously, each with a distinct function.

Tables inside these data repositories are usually set up by developers to capture conceptually distinct types of information. For instance, a marketing center may have a Telephone Enquiries table with each record representing an outgoing call (with columns such as start time, duration, and operator), but store customer-level information (such as phone numbers, addresses, and last-contact dates) in a separate table called Clients.

Common elements allow tables to be related to each other for analytical purposes. This is usually done through keys. A primary key is a single field, or a combination of fields, that identifies each record uniquely. To do this effectively, values in the primary key column(s) must be unique for each row, and the primary key columns must be fully populated; that is, every record must have a value (missing values are known as null values). A table has only one primary key, although that key may span several columns. Primary keys are useful for identifying duplicate records, which reduce the reliability of the data and cause issues such as double counting.
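The two primary key rules described above (unique and fully populated) can be sketched in a few lines. This is a minimal illustration only: it assumes Python's built-in sqlite3 module as a lightweight stand-in for a server-hosted database, and the clients table, its columns, and the try_insert helper are hypothetical names chosen for the example.

```python
import sqlite3

# In-memory SQLite database, used here only as a lightweight stand-in.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE clients (
        client_id TEXT NOT NULL PRIMARY KEY,  -- unique and never null
        name      TEXT NOT NULL
    )
    """
)


def try_insert(client_id, name):
    """Return True if the row is accepted, False if a key rule rejects it."""
    try:
        conn.execute(
            "INSERT INTO clients (client_id, name) VALUES (?, ?)",
            (client_id, name),
        )
        return True
    except sqlite3.IntegrityError:
        return False


first_ok = try_insert("C001", "Acme Ltd")
second_ok = try_insert("C002", "Birch & Co")
duplicate_ok = try_insert("C001", "Acme Limited")  # key value already used
null_ok = try_insert(None, "No Identifier")        # key value missing

row_count = conn.execute("SELECT COUNT(*) FROM clients").fetchone()[0]
print(first_ok, second_ok, duplicate_ok, null_ok, row_count)
```

Only the first two rows are stored; the database itself refuses the duplicate and the null key, which is what prevents double counting later on.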

Foreign keys are columns in a table that refer to the primary key in another table. They are used to link tables on a common identifier. To continue the preceding example, the Clients table might have a Client ID column as the primary key, which also appears as a foreign key in the Telephone Enquiries table. Analysts can match the numbers between tables and identify which client was called in each instance. For example, they could identify which clients have had the greatest volume of successful calls and are therefore worth investing in. This process maintains the original values in a single location – the confidential Clients table – to make the data easier to govern.
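As a rough sketch of how a foreign key links the two tables in this example, the following uses Python's built-in sqlite3 module (an assumption made for illustration; the same SQL ideas apply to other relational databases). The column names, the successful flag, and the sample rows are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this setting is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE clients (
        client_id INTEGER PRIMARY KEY,
        name      TEXT NOT NULL
    );
    CREATE TABLE telephone_enquiries (
        call_id    INTEGER PRIMARY KEY,
        client_id  INTEGER NOT NULL REFERENCES clients (client_id),
        duration_s INTEGER NOT NULL,
        successful INTEGER NOT NULL  -- 1 = successful call, 0 = not
    );
    INSERT INTO clients VALUES (1, 'Acme Ltd'), (2, 'Birch & Co');
    INSERT INTO telephone_enquiries VALUES
        (10, 1, 120, 1),
        (11, 1, 300, 1),
        (12, 2,  45, 0);
    """
)

# Match each call to its client through the shared client_id value.
calls_per_client = conn.execute(
    """
    SELECT c.name, COUNT(*) AS successful_calls
    FROM telephone_enquiries AS t
    JOIN clients AS c ON c.client_id = t.client_id
    WHERE t.successful = 1
    GROUP BY c.client_id, c.name
    ORDER BY successful_calls DESC
    """
).fetchall()

# A call that points at a client that does not exist is rejected.
try:
    conn.execute("INSERT INTO telephone_enquiries VALUES (13, 99, 60, 1)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(calls_per_client, orphan_rejected)
```

The client name is stored once, in the clients table, and the calls table carries only the identifier, which mirrors the governance benefit described above.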

To access, update, add, or delete records, users must communicate with the relational database. This is done using a language called Structured Query Language (SQL). SQL is discussed further later, in the Custom SQL Query section.
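For a first impression of what these four operations look like, here is a minimal sketch that sends SQL statements from Python's built-in sqlite3 module. SQLite is an assumption made to keep the example self-contained, and the clients table and its values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE clients (client_id INTEGER PRIMARY KEY, name TEXT, phone TEXT)"
)

# Add records.
conn.execute("INSERT INTO clients VALUES (?, ?, ?)", (1, "Acme Ltd", "+15550100"))
conn.execute("INSERT INTO clients VALUES (?, ?, ?)", (2, "Birch & Co", "+15550101"))

# Update a record.
conn.execute("UPDATE clients SET phone = ? WHERE client_id = ?", ("+15550199", 1))

# Delete a record.
conn.execute("DELETE FROM clients WHERE client_id = ?", (2,))

# Access the records that remain.
remaining = conn.execute(
    "SELECT client_id, name, phone FROM clients ORDER BY client_id"
).fetchall()
print(remaining)
```

Each statement names the table and describes the rows it affects; the database works out how to carry the request out.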

Relational databases are popular because they can enforce rules that maintain data consistency and accuracy; for example, a rule may allow only values within a certain range when new records are added. In the Clients table, a Telephone Number field may require a 10-digit format with a country code prefix for a new record to be accepted in the table.
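Rules of this kind are commonly written as constraints on the table. The sketch below is one possible reading, not a definitive implementation: it assumes Python's built-in sqlite3 module, interprets the telephone rule as the country code +1 followed by exactly 10 digits, and adds a hypothetical satisfaction column to show a range rule.

```python
import sqlite3

# Assumed format: country code +1, then exactly 10 digits.
PHONE_PATTERN = "+1" + "[0-9]" * 10

conn = sqlite3.connect(":memory:")
conn.execute(
    f"""
    CREATE TABLE clients (
        client_id        INTEGER PRIMARY KEY,
        telephone_number TEXT NOT NULL
            CHECK (telephone_number GLOB '{PHONE_PATTERN}'),
        satisfaction     INTEGER
            CHECK (satisfaction BETWEEN 1 AND 5)  -- hypothetical range rule
    )
    """
)


def is_accepted(phone, satisfaction):
    """Return True if the database accepts the new record."""
    try:
        conn.execute(
            "INSERT INTO clients (telephone_number, satisfaction) VALUES (?, ?)",
            (phone, satisfaction),
        )
        return True
    except sqlite3.IntegrityError:
        return False


results = {
    "valid": is_accepted("+12025550123", 4),
    "short_number": is_accepted("+1202555", 4),
    "out_of_range": is_accepted("+12025550124", 9),
}
print(results)
```

Because the checks live in the table definition, every application that writes to the table is held to the same rules.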

Popular relational database management systems include PostgreSQL, MySQL, and Oracle.