Tweaking Plot Parameters

Looking at the last figure in our previous section, we find that the legend is not appropriately placed. We can tweak the plot parameters to adjust the placements of the legends and the axis labels, as well as change the font-size and rotation of the tick labels.

Exercise 11: Tweaking the Plot Parameters of a Grouped Bar Plot

In this exercise, we'll tweak the plot parameters, for example, hue, of a grouped bar plot. We'll see how to place legends and axis labels in the right places and also explore the rotation feature:

Import the necessary modules—in this case, only seaborn:
```
#Import seaborn
import seaborn as sns
```

Load the dataset:

diamonds_df = sns.load_dataset('diamonds')

Use the hue parameter to plot nested groups:
```
ax = sns.barplot(x="cut", y="price", hue='color', data=diamonds_df)
```
The output is as follows:
Figure 1.26: Nested bar plot with the hue parameter
Place the legend appropriately on the bar plot:
```
ax = sns.barplot(x='cut', y='price', hue='color', data=diamonds_df)
ax.legend(loc='upper right',ncol=4)
```
The output is as follows:
Figure 1.27: Grouped bar plot with legends placed appropriately
In the preceding ax.legend() call, the ncol parameter sets the number of columns into which the legend entries are organized, and the loc parameter specifies the location of the legend. loc accepts named positions such as upper left, lower center, and center right, as well as best, which lets Matplotlib pick the position that overlaps least with the plotted data.
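
As a minimal sketch of how loc and ncol behave, the following example uses plain Matplotlib with made-up values (the three series labels and the bar heights are illustrative assumptions, not taken from the diamonds dataset). Because seaborn's barplot returns a Matplotlib Axes, the same legend() call should apply to the plot above:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

fig, ax = plt.subplots()

# Three hypothetical series, drawn side by side so the legend has three entries
series = {"D": [3, 4, 5], "E": [4, 5, 6], "F": [5, 6, 7]}
width = 0.25
for offset, (label, heights) in enumerate(series.items()):
    positions = [i + offset * width for i in range(len(heights))]
    ax.bar(positions, heights, width=width, label=label)

# loc names the anchor position; ncol=3 lays the three entries out in one row
legend = ax.legend(loc="upper left", ncol=3)
legend_labels = [text.get_text() for text in legend.get_texts()]
```

Changing loc to another named position, or to 'best', moves the legend without touching the rest of the plot.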

To modify the axis labels on the x axis and y axis, input the following code:

ax = sns.barplot(x='cut', y='price', hue='color', data=diamonds_df)
ax.legend(loc='upper right', ncol=4)
ax.set_xlabel('Cut', fontdict={'fontsize' : 15})
ax.set_ylabel('Price', fontdict={'fontsize' : 15})

The output is as follows:

Figure 1.28: Grouped bar plot with modified labels

Similarly, use the following code to modify the font size and rotation of the tick labels on the x axis:

ax = sns.barplot(x='cut', y='price', hue='color', data=diamonds_df)
ax.legend(loc='upper right',ncol=4)
# set fontsize and rotation of x-axis tick labels
ax.set_xticklabels(ax.get_xticklabels(), fontsize=13, rotation=30)

The output is as follows:

Figure 1.29: Grouped bar plot with the rotation feature of the labels

The rotation feature is particularly useful when the tick labels are long and crowd up together on the x axis.
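
An alternative that avoids passing the labels back in is Matplotlib's tick_params() method. The sketch below is an assumption-laden illustration on a plain Matplotlib bar chart with made-up heights, not the exercise's own code; the same call should work on the Axes returned by seaborn:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
cuts = ["Fair", "Good", "Very Good", "Premium", "Ideal"]
ax.bar(cuts, [1, 2, 3, 4, 5])  # hypothetical bar heights

# Rotate the x-axis tick labels by 30 degrees and set their font size to 13
ax.tick_params(axis="x", labelrotation=30, labelsize=13)

fig.canvas.draw()  # make sure ticks are laid out before inspecting them
tick_labels = ax.get_xticklabels()
```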

Annotations

Another useful feature in plots is annotation. Suppose we want to add more information to a simple bar plot about ideally cut diamonds. The following exercise shows how to make the plot more informative by annotating it:

Exercise 12: Annotating a Bar Plot

In this exercise, we will annotate a bar plot, generated using the catplot function of seaborn, using a note right above the plot. Let's see how:

Import the necessary modules:

import matplotlib.pyplot as plt
import seaborn as sns

Load the diamonds dataset:

diamonds_df = sns.load_dataset('diamonds')

Generate a bar plot using the catplot function of the seaborn library:
```
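# catplot returns a FacetGrid (not an Axes); recent seaborn versions require x to be passed by keyword
g = sns.catplot(x="cut", data=diamonds_df, aspect=1.5, kind="count", color="b")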
```
The output is as follows:
Figure 1.30: Bar plot with seaborn's catplot function

Annotate the column belonging to the Ideal category:

# get records in the DataFrame corresponding to ideal cut
ideal_group = diamonds_df.loc[diamonds_df['cut']=='Ideal']

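Find the location of the x coordinate where the annotation has to be placed. On a categorical axis, the bars sit at positions 0, 1, 2, and so on. The row index of the first Ideal record is 0, which here coincides with the position of the Ideal bar:

# get the location of x coordinate where the annotation has to be placed
x = ideal_group.index.tolist()[0]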

Find the location of the y coordinate where the annotation has to be placed:

# get the location of y coordinate where the annotation has to be placed
y = len(ideal_group)

Print the location of the x and y coordinates:
```
print(x)
print(y)
```
The output is:
```
0
21551
```

Annotate the plot with a note:

# annotate the plot with any note or extra information
sns.catplot(x="cut", data=diamonds_df, aspect=1.5, kind="count", color="b")
plt.annotate('excellent polish and symmetry ratings;\nreflects almost all the light that enters it',
             xy=(x, y), xytext=(x + 0.3, y + 2000),
             arrowprops=dict(facecolor='red'))

The output is as follows:

Figure 1.31: Annotated bar plot

Now, there seem to be a lot of parameters in the annotate function, but worry not! Matplotlib's official documentation (https://matplotlib.org/3.1.0/api/_as_gen/matplotlib.pyplot.annotate.html) covers all the details. For instance, the xy parameter denotes the point (x, y) on the plot to annotate, and xytext denotes the position (x, y) at which to place the text; if xytext is None, it defaults to xy. Note that we added an offset of 0.3 for x and 2000 for y (since y is close to 20,000) to keep the text readable. The arrowprops parameter takes a dictionary of arrow properties; here, it sets the fill color of the arrow to red.
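
To see these parameters in isolation, here is a small sketch on a plain Matplotlib bar chart. The category names, counts, and note text are hypothetical, and the x position is looked up from the category list rather than from a DataFrame index, which is one possible way to find the bar to annotate:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

categories = ["Ideal", "Premium", "Good"]  # hypothetical category order
counts = [50, 30, 20]                      # hypothetical counts

fig, ax = plt.subplots()
ax.bar(categories, counts)

# Bars on a categorical axis sit at positions 0, 1, 2, ...
x = categories.index("Ideal")
y = counts[x]

# xy is the point the arrow points at; xytext is where the text is placed
note = ax.annotate("largest group",
                   xy=(x, y),
                   xytext=(x + 0.3, y + 5),
                   arrowprops=dict(facecolor="red"))
```

Leaving out arrowprops draws the text without an arrow, and leaving out xytext places the text at xy itself.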

There are several other bells and whistles associated with visualization libraries in Python, some of which we will see as we progress in the book. At this stage, we will go through a chapter activity to revise the concepts in this chapter.

So far, we have seen how to generate two simple plots using seaborn and pandas—histograms and bar plots:

Histograms: Histograms are useful for understanding the statistical distribution of a numerical feature in a given dataset. They can be generated using the hist() function in pandas and distplot() in seaborn (deprecated since seaborn 0.11 in favor of histplot() and displot()).
Bar plots: Bar plots are useful for gaining insight into the values taken by a categorical feature in a given dataset. They can be generated using the plot(kind='bar') function in pandas and the catplot(kind='count') and barplot() functions in seaborn.

With the help of various considerations arising in the process of plotting these two types of visualizations, we presented some basic concepts in data visualization:

Formatting legends to present labels for different elements in the plot with loc and other parameters in the legend function
Changing the properties of tick labels, such as font size and rotation, with parameters in the set_xticklabels() and set_yticklabels() functions
Adding annotations for additional information with the annotate() function

Activity 1: Analyzing Different Scenarios and Generating the Appropriate Visualization

We'll be working with the 120 years of Olympic History dataset acquired by Randi Griffin from https://www.sports-reference.com/ and made available on the GitHub repository of this book. Your assignment is to identify the top five sports based on the largest number of medals awarded in the year 2016, and then perform the following analysis:

Generate a plot indicating the number of medals awarded in each of the top five sports in 2016.
Plot a graph depicting the distribution of the age of medal winners in the top five sports in 2016.
Find out which national teams won the largest number of medals in the top five sports in 2016.
Observe the trend in the average weight of male and female athletes winning in the top five sports in 2016.

High-Level Steps

1. Download the dataset and format it as a pandas DataFrame.
2. Filter the DataFrame to only include the rows corresponding to medal winners from 2016.
3. Find out the medals awarded in 2016 for each sport.
4. List the top five sports based on the largest number of medals awarded. Filter the DataFrame one more time to only include the records for the top five sports in 2016.
5. Generate a bar plot of record counts corresponding to each of the top five sports.
6. Generate a histogram for the Age feature of all medal winners in the top five sports (2016).
7. Generate a bar plot indicating how many medals were won by each country's team in the top five sports in 2016.
8. Generate a bar plot indicating the average weight of players, categorized based on gender, winning in the top five sports in 2016.
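
The filtering and aggregation steps above can be sketched on a tiny stand-in DataFrame. This is only an illustration under stated assumptions: the column names (Year, Sport, Medal, Sex, Weight) and all values are hypothetical, the real dataset may be structured differently, and the sketch keeps the top two sports rather than five because the toy data is so small:

```python
import pandas as pd

# Toy stand-in for the Olympics data; column names and values are assumptions
df = pd.DataFrame({
    "Year":   [2016, 2016, 2016, 2016, 2012, 2016, 2016],
    "Sport":  ["Rowing", "Swimming", "Rowing", "Judo", "Rowing", "Swimming", "Judo"],
    "Medal":  ["Gold", "Silver", "Bronze", "Bronze", "Gold", "Gold", None],
    "Sex":    ["M", "F", "M", "F", "M", "M", "M"],
    "Weight": [90.0, 62.0, 88.0, 57.0, 91.0, 80.0, 73.0],
})

# Keep only medal winners from 2016 (rows without a medal are dropped)
winners_2016 = df[(df["Year"] == 2016) & df["Medal"].notna()]

# Count the medals awarded in each sport
medal_counts = winners_2016["Sport"].value_counts()

# Take the top sports and filter the DataFrame once more
top_sports = medal_counts.head(2).index.tolist()
top_df = winners_2016[winners_2016["Sport"].isin(top_sports)]

# Average weight per sport and gender, ready to be shown as a grouped bar plot
mean_weight = top_df.groupby(["Sport", "Sex"])["Weight"].mean()
```

From here, top_df and mean_weight can be passed to the plotting functions covered in this chapter.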

The expected output should be:

After Step 1:

Figure 1.32: Olympics dataset

After Step 2:

Figure 1.33: Filtered Olympics DataFrame

After Step 3:

Figure 1.34: The number of medals awarded

After Step 4:

Figure 1.35: Olympics DataFrame

After Step 5:

Figure 1.36: Generated bar plot

After Step 6:

Figure 1.37: Histogram plot with the Age feature

After Step 7:

Figure 1.38: Bar plot with the number of medals won

After Step 8:

Figure 1.39: Bar plot with the average weight of players

The bar plot indicates the highest athlete weight in rowing, followed by swimming, and then the other remaining sports. The trend is similar across both male and female players.

Note

The solution steps can be found on page 254.

Interactive Data Visualization with Python - Second Edition

By : Abha Belorkar , Sharath Chandra Guntuku , Shubhangi Hora , Anshu Kumar
