# Chapter 13. Demographic Data Discovery

In this final chapter, we shall finish our exploration of real data with Qlik Sense by moving beyond the standard structures of the office and showing the full possibilities of the software for analysis of almost any kind of imaginable data. We'll therefore be looking at applying Qlik Sense to demographic data. As before, this example and many others are available for you to explore at http://sense-demo.qlik.com.

This chapter will cover the aspects necessary for demographic data discovery, including:

• General information about common KPIs
• Examples showing how to use the lasso selection in maps and scatter charts
• Examples of dimensions and measures

## Problem analysis

With Qlik Sense, you can analyze not only business data but data of any kind. One great example is demographic data: statistics of countries and regions on anything from age and gender to income and life expectancy.

Such data can be found on a number of Internet sites and downloaded for your convenience.

Demographic data is used and analyzed as-is by a number of nongovernmental organizations that need it for their activities. The common measures required are GDP per capita, population, unemployment rate, inflation, life expectancy, happiness, trade balance, labor cost, national debt, election results, and so on.

Often, interesting questions about correlations are asked; for example, how does happiness correlate with material standards and health? How are population growth and the number of children affected by factors such as life expectancy, poverty, and average salary? How has life expectancy improved over the years? If you haven't seen Hans Rosling's presentations on the Internet on this topic, we strongly recommend them. They show that data analysis is both important and fun.

Common dimensions in demographic data are country, region, gender, age group, ethnicity, and so on. An example can be seen in the following scatter chart, where you can see life expectancy and per capita GDP for different countries. Many developing countries are found in the lower-left quadrant, whereas the richer countries usually are found in the upper-right quadrant.

Life expectancy versus per capita GDP

You can clearly see that the two numbers are highly correlated—the higher the GDP, the higher the life expectancy.
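To put a number on a visual impression like this, you can compute a correlation coefficient outside Qlik Sense. The following is a minimal Python sketch of Pearson's correlation coefficient. The five data points are invented for illustration and are not taken from any real dataset, and the use of a logarithmic GDP scale is an assumption, not something the chart is stated to do.

```python
import math

def pearson(xs, ys):
    """Return Pearson's correlation coefficient for two equally long lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical values, invented for illustration only.
gdp_per_capita = [500, 2000, 10000, 30000, 60000]
life_expectancy = [50, 60, 70, 78, 82]

# GDP spans several orders of magnitude, so this sketch compares log10(GDP).
r = pearson([math.log10(g) for g in gdp_per_capita], life_expectancy)
print(round(r, 2))
```

A value close to 1 indicates a strong positive linear relationship, a value close to 0 indicates none, and a value close to -1 indicates a strong negative one.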

These measures can often also be linked to your business data to enable a deeper understanding. For instance, you can divide your country sales by the population of the country, thereby getting a relative sales number, which tells you how well you sell in that country. With this number, you can make relevant comparisons of countries of different sizes.

Alternatively, if you assume that the market space in the country is roughly proportional to the GDP, you can divide your sales by the GDP and use this number to compare market penetration between countries.

These numbers will answer questions such as, "How well are we selling in this country, given the potential?"
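The two ratios described above are simple divisions. The following Python sketch shows them side by side; the country names and all figures are hypothetical and chosen only to make the arithmetic easy to follow.

```python
# Hypothetical figures, invented for illustration only.
countries = {
    "Country A": {"sales": 5_000_000, "population": 10_000_000, "gdp": 400_000_000_000},
    "Country B": {"sales": 20_000_000, "population": 80_000_000, "gdp": 3_200_000_000_000},
}

def relative_sales(row):
    """Return sales per inhabitant and sales per billion of GDP."""
    return {
        "sales_per_capita": row["sales"] / row["population"],
        "sales_per_gdp_billion": row["sales"] / (row["gdp"] / 1e9),
    }

metrics = {name: relative_sales(row) for name, row in countries.items()}
for name, m in metrics.items():
    print(name, m)
```

In this made-up example, Country B sells four times as much as Country A in absolute terms, but only half as much per inhabitant and per unit of GDP.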

# Application features

On our demo site, we have an app with a number of demographic measures per country. You can find it at http://sense-demo.qlik.com under the name Happiness. It analyzes, among other demographic indexes, the Happy Planet Index (HPI) in a number of countries. You can learn more about this index at www.happyplanetindex.org.

This index measures the sustainable well-being of 151 countries across the globe, focusing not on their abilities to produce material goods and services, but rather on their abilities to produce long, happy, and sustainable lives for the people who live in them. A happy life doesn't have to come at the expense of our environment, and the HPI is used to promote a policy that puts the well-being of people and the planet first.

The app overview of the Happiness application

Below this overview, you will see a number of sheets. The leftmost sheet is an introduction, whereas the other sheets are prepared for analysis and detailed information.

If you click on the Stories button to the left, you will see that the app also contains one story—a story that can be used to present data in the app. It can also be used as an introduction to the app the first time you open it.

The sheets on the app overview page

The first sheet is an introduction sheet that explains what the app is all about. The second sheet, which is the first one with traditional charts, is called Happy Planet Index (HPI). On it, you will see the happiness index for all countries, first on a map, and then in a table.

The countries in the map are colored according to the happiness index—the darker the color, the higher the happiness index.

A map showing the happiness index per country

Below the map, there are three scatter charts showing the happiness index per country, plotted against the life expectancy, GDP per capita, and total population. These three charts are excellent tools to analyze any correlation between happiness and the mentioned demographic measures.

Scatter charts that show the correlation (or lack of correlation) between happiness and other demographic measures

Finally, at the bottom, you have three filter panes that let the user select a region, subregion, or country and so zoom in on the numbers for a specific area.

The other sheets contain additional and more detailed information, ordered by topics. The final sheet contains a table showing the details, should the user be interested in drilling down to the lowest level.

## Analysis

When looking at data in this app, the first question that pops up in the user's mind is usually, "Is there any correlation between happiness and x?" To get a qualitative answer to this, you only need to browse through the scatter charts.

On the Happy Planet Index (HPI) sheet, you have three scatter charts. In the leftmost chart, HPI vs Life Expectancy, you can see a correlation between the two measures, at least for lower life expectancies. In the other two charts, however, there is no clear correlation.

On the HPI Comparison sheet, you have three additional scatter charts. In the leftmost chart, HPI vs Happy Life Years, you can see a weak correlation between the two measures. The same is true for the rightmost chart, HPI vs Global Footprint, but in the chart in the middle (HPI vs Governance), there is no clear correlation.

However, as in all statistics, you have to be careful with your conclusions. Correlation does not imply causation: you have to look at many factors and use common sense to find the true cause and effect. In this case, the explanation is simply that the happiness index is an artificial index calculated from, among other things, life expectancy and ecological footprint, hence the correlation with happy life years and global footprint.

## Using the lasso selector to make selections

Now, let's explore the data. One question could be, "Where in the world do we find the countries with a low average life expectancy?" To answer this, you need to make a selection in the scatter chart showing life expectancy:

1. First, navigate to the Happy Planet Index sheet. Maximize the scatter chart that shows HPI vs Life Expectancy by clicking on the fullscreen arrow in the upper-right corner of the object.
2. Then, click on the chart so that the chart controls, including the lasso symbol, appear in the upper-right corner.
3. Next, click on the Turn on lasso selection option. Now you can draw a line around the points you want to select.
4. Finally, confirm your selection by clicking on the green tick mark in the upper-right corner.

Lasso selection in the scatter chart

If you now look at the map, you will see where these countries appear in the world. It's predominantly Africa and South Asia. If you click on the map, you can zoom in using the scroll wheel of the mouse. You can also pan the map.

Countries with low life expectancy

Of course, you can also make a selection the other way round. Use the lasso selector in the map and see how the selected countries are distributed in the scatter chart. The way to do this is as follows:

1. Maximize the map.
2. Click somewhere in the map.
3. Click on the Turn on lasso selection option and encircle the part of the world you want to explore.

Making a lasso selection of America on the map
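Conceptually, a lasso selection picks every data point that lies inside the outline you draw. How Qlik Sense implements this internally is not described in this chapter; the following Python sketch uses the generic ray-casting point-in-polygon test to illustrate the idea. The point names, coordinates, and lasso outline are hypothetical.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: is (x, y) inside the polygon given as [(x, y), ...]?"""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through y?
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def lasso_select(points, lasso):
    """Return the names of all points that fall inside the lasso outline."""
    return [name for name, (x, y) in points.items()
            if point_in_polygon(x, y, lasso)]

# Hypothetical chart coordinates (life expectancy, HPI), invented for illustration.
points = {"A": (48, 30), "B": (52, 35), "C": (75, 45), "D": (80, 40)}
lasso = [(45, 25), (55, 25), (55, 40), (45, 40)]  # outline drawn by the user
selected = lasso_select(points, lasso)
print(selected)
```

The same test works for a map if the point coordinates are replaced by, for example, one representative position per country.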

## Using the global selector to make selections

You can also use the global selector to make selections. Just click on the global selector and make selections directly in the fields.

For instance, you may have a question like this, "Where in the world do I find the richest countries?" In such a case, perform the following steps:

1. Open the global selector. (You will find it to the right in the toolbar; it shows Selections tool as a pop-up label.)
2. Find a field called GDP/capita (\$PPP). To do this, you might first need to check Show fields in the global selector.
3. Once you have found this field, you can investigate it just by scrolling. You will then see that there are some countries with less than \$400 in GDP per capita, while the richest countries have more than \$80,000 in GDP per capita.

If you want to find the countries where the GDP per capita is greater than \$10,000, perform the following steps:

1. Click on the search icon and type `>10000`.
2. Confirm the search by pressing Enter, and then confirm the selection by clicking on the green tick mark.
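The numeric search in the steps above can be thought of as a filter over the field's values. The following Python sketch mimics that behavior for simple comparison queries such as `>10000`; it is an illustration of the idea, not Qlik Sense's search engine, and the function name and sample values are hypothetical.

```python
import operator
import re

# Comparison operators recognized by this sketch
_OPS = {">": operator.gt, ">=": operator.ge, "<": operator.lt, "<=": operator.le}

def numeric_search(values, query):
    """Return the values matching a query such as '>10000'."""
    match = re.fullmatch(r"\s*(>=|<=|>|<)\s*(-?\d+(?:\.\d+)?)\s*", query)
    if match is None:
        raise ValueError(f"unsupported query: {query!r}")
    compare = _OPS[match.group(1)]
    threshold = float(match.group(2))
    return [v for v in values if compare(v, threshold)]

# Hypothetical GDP per capita values, invented for illustration.
gdp_per_capita = [380, 2500, 10000, 12000, 45000, 82000]
print(numeric_search(gdp_per_capita, ">10000"))  # prints [12000, 45000, 82000]
```

Note that the value 10000 itself is excluded, because the query uses a strict greater-than comparison.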

Selecting the countries in the world with the highest GDP

If you now close the global selector and go back to the map and the scatter charts, you will be able to see where you find the richest countries, both on the map and in the scatter charts.

# How the application was developed

The data model of the Happiness application is not very complicated:

This is an extremely simple data model that only contains one table of real data, `Happy Planet Index`, and an additional table listing all countries, `World.shp/Features`. The second table has one record per country and holds the map information (the shape of each country) used in the map object in the user interface.

In this app, the data table has exactly one record per country—a record that contains the relevant information for a given country at a given moment. However, this is not always the situation. More often, the data table contains data for countries over many points in time, for example, one record per combination of a country and a year. This will result in several lines per country.

## Dimensions

There are not many fields that can be used as dimensions. The three available fields are region, subregion, and country. The world is split into 7 regions and 19 subregions. A country can only belong to one subregion and one region. These fields have been added to Library. In addition, a drill-down dimension has been created from the three fields.

The dimensions in Library

One way of adding dimensions could be by creating buckets based on one of the measures, for example, population. Countries could then be grouped under Large, Medium, and Small classes, which will be stored in a new field, Population Class.
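As an illustration of such bucketing, the following Python sketch assigns each country a Population Class. The thresholds of 10 million and 100 million are assumptions made for this example, as are the country names and population figures; the source does not specify where the class boundaries should lie.

```python
def population_class(population, medium_from=10_000_000, large_from=100_000_000):
    """Map a population count to a class label; the thresholds are assumptions."""
    if population >= large_from:
        return "Large"
    if population >= medium_from:
        return "Medium"
    return "Small"

# Hypothetical populations, invented for illustration.
populations = {
    "Country A": 1_300_000_000,
    "Country B": 45_000_000,
    "Country C": 600_000,
}
classes = {country: population_class(p) for country, p in populations.items()}
print(classes)
```

In a Qlik Sense app, the equivalent logic would typically be placed in the load script so that the new field is available as a dimension.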

## Measures

A number of measures have also been defined, for example, GDP, happiness index, global footprint, life expectancy, and so on.

It is important that the app developer defines the expressions correctly, since this can be difficult for the business user. The business user doesn't always know the data model, and that knowledge is needed to get all the expressions right.

In the following table, you can find some of the measures defined in this app:

| Measure | Definition |
| --- | --- |
| GDP per Capita | `Avg([GDP/capita])` |
| Global Footprint | `Avg([Footprint])` |
| Governance Rank | `Only([Governance Rank])` |
| Happy Life Years | `Only([Happy Life Years])` |
| Happy Planet Index | `Only([Happy Planet Index])` |
| HPI Rank | `Only([HPI Rank])` |
| Population | `Only(Population)` |

Several of these measures can be defined differently. How you do this is very much a matter of taste. For instance, the measures where the `Only()` function is used can also be defined using `Sum()` or `Avg()`. As long as you only have a single number, all three functions will return the same answer.

But how do you want Qlik Sense to behave when there are several countries, for example, a region that should be represented by one value? For the Population measure, the obvious function to use should be `Sum()`. Then the total population of the region will be shown.

But if the source data contains several years, so that a single country has several records, you don't just want to sum the population. Then you would get numbers that are much larger than they should be. Instead, you might want to use `Sum(Population)/Count(distinct Year)` to create an average over all possible years.
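The effect of several records per country can be sketched in a few lines of Python. The records below are hypothetical, and the second function mirrors the logic of `Sum(Population)/Count(distinct Year)` rather than reproducing Qlik Sense itself.

```python
# Hypothetical records: one row per combination of country and year.
records = [
    {"country": "A", "year": 2010, "population": 100},
    {"country": "A", "year": 2011, "population": 110},
    {"country": "B", "year": 2010, "population": 50},
    {"country": "B", "year": 2011, "population": 54},
]

def naive_sum(rows):
    """Sum every record, which counts each country once per year."""
    return sum(r["population"] for r in rows)

def sum_per_distinct_year(rows):
    """Mirror of Sum(Population)/Count(distinct Year)."""
    years = {r["year"] for r in rows}
    return naive_sum(rows) / len(years)

print(naive_sum(records))              # prints 314
print(sum_per_distinct_year(records))  # prints 157.0
```

The division gives a meaningful yearly average only if every country has a record for every year; with gaps in the data, the result would be skewed toward the years that are better covered.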

In addition, for a rank, you wouldn't want to use `Sum()` because it would show an incorrect number. You could use `Avg()`, which will give the average rank between the countries. An average is clearly better, but it is still not mathematically correct. Then it might be better to use `Only()`, which doesn't return an answer at all when more than one country is involved.
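The difference between the three aggregation functions can be illustrated with a small Python sketch. The `only` helper below imitates the behavior described above, with `None` standing in for the missing answer; the rank values are hypothetical.

```python
def only(values):
    """Return the value if all inputs share one distinct value, else None."""
    distinct = set(values)
    return distinct.pop() if len(distinct) == 1 else None

def avg(values):
    """Return the arithmetic mean of the values."""
    return sum(values) / len(values)

single = [7]           # rank of one selected country
several = [7, 12, 30]  # ranks of several countries in one region

# With a single value, all three functions agree.
print(only(single), sum(single), avg(single))

# With several values, Sum() and Avg() produce a number that is not a rank,
# while only() declines to answer.
print(only(several), sum(several), avg(several))
```

Returning no answer makes it visible that a rank is undefined for a group of countries, instead of showing a number that looks plausible but means nothing.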

# Summary

The analysis of demographic data is easy when you use Qlik Sense. Obviously, this analysis can also be made with a number of other tools, since the data model is very simple. However, with Qlik Sense, it is easy to build further. Qlik's associative indexing engine powers the analysis and ensures that you can develop or change your apps quickly and easily. With Qlik Sense, data discovery and analysis are made easy.

With the end of this chapter, we have also reached the end of the book. We took you from the history of Qlik to how to develop applications, and finally gave you some examples of how applications might look.

We hope that after reading this book, you have acquired some skills that will be useful when you develop your own Qlik Sense applications. We also think you now have a better understanding of the thoughts behind Qlik Sense, and wish you good luck in your endeavors.

Welcome to the community of Qlik users!