Book Image

TIBCO Spotfire: A Comprehensive Primer

By : Michael Phillips
Book Image

TIBCO Spotfire: A Comprehensive Primer

By: Michael Phillips

Overview of this book

Table of Contents (18 chapters)
TIBCO Spotfire – A Comprehensive Primer
Credits
Foreword
About the Author
About the Reviewers
www.PacktPub.com
Preface
Index

Key data concept – data queries


Our final data concept in this chapter is the idea of a data query, which is usually achieved using some flavor of SQL, a standard language to interact with databases. Although SQL varies a little from database to database, the basic syntax is fairly consistent. If you don't already have experience using SQL and you want to develop your data analysis skills, you will have to learn more about this key analytics tool.

When you work with text files and spreadsheets, the only practical way you can manipulate large amounts of data is at the column level, removing entire columns that you don't want. What if you want to manipulate the rows in a large dataset? In a spreadsheet, you could of course, use filters to remove rows.

Overall, however, manipulating data in this way is more tedious and potentially more error prone than using a single SQL statement. What if you want to involve more than one table in your "filter"? You can also take advantage of the enormous power of databases to handle large amounts of data and process complex queries.

Anatomy of a SQL statement

A basic SQL statement has just three elements, highlighted in the following code in all caps:

SELECT column_name_1, column_name_2, …, column_name_n
FROM schema_name.data_table_name
WHERE condition

You need to provide a schema_name. Remember when we connected to a database, we saw a set of logical containers into which the tables were organized. These are examples of schemas. The terminology schema may mean slightly different things in different databases, but most organize the tables into a set of logical containers, and you need to prefix table names in your query with the container name using a dot to separate the two elements.

Often, a database administrator will create what is known as a view. These are pre-written, often quite complex, queries spanning multiple tables that define a particular dataset. As an end user, you can use them just like tables.

Tip

If you have a complex data requirement that you feel is beyond your SQL skills, talk to a database administrator or developer and get them to create a view for you. You can then select from the view as though it were a single table.

The SELECT and FROM clauses are mandatory, but the WHERE condition is optional and is just a logical expression to limit the data that is returned. For example:

WHERE column_name_1 = 'some text' AND column_name_2 > 4

If you want to include more than one table in your query, you will have to use what is known as a join.

SELECT a.column_name_1, b.column_name_2, …, etc
FROM schema_name.data_table_name_1 a
JOIN schema_name.data_table_name_2 b on a.column_name_x = b.column_name_y
WHERE condition

Note how the aliases a and b have been used for the two tables. Aliases can be anything you like, but are usually short and make the statement easier to write and read.

The JOIN statement ensures that you only return rows where the join condition is true, as well as any other condition you defined. There are other types of joins. The following link of the w3schools website provides a helpful tutorial: http://www.w3schools.com/sql/sql_join.asp