Interactive Data Visualization with Python - Second Edition
Looking at the last figure in the previous section, we find that the legend is not appropriately placed. We can tweak the plot parameters to adjust the placement of the legend and the axis labels, as well as to change the font size and rotation of the tick labels.
In this exercise, we'll tweak the parameters of a grouped bar plot, such as hue. We'll see how to place the legend and axis labels appropriately and also explore the rotation feature:
Import seaborn:
# Import seaborn
import seaborn as sns
Load the diamonds dataset:
diamonds_df = sns.load_dataset('diamonds')
Use the hue parameter to plot nested groups:
ax = sns.barplot(x="cut", y="price", hue='color', data=diamonds_df)
The output is as follows:

Place the legend in the upper-right corner and arrange its entries in four columns:
ax = sns.barplot(x='cut', y='price', hue='color', data=diamonds_df)
ax.legend(loc='upper right', ncol=4)
The output is as follows:

In the preceding ax.legend() call, the ncol parameter sets the number of columns into which the legend entries are organized, and the loc parameter specifies the location of the legend. loc accepts a location string such as upper left or lower center; Matplotlib also accepts best, which picks the location that overlaps the plotted data the least.
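To see how loc and ncol interact, the following minimal sketch draws a small grouped bar chart directly with Matplotlib and places the legend in two different ways. The prices are made up for illustration and are not taken from the diamonds dataset, and bbox_to_anchor is an additional Matplotlib legend parameter that the exercise itself does not use.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt

# Hypothetical mean prices per cut for two color grades (illustrative values only)
cuts = ["Fair", "Good", "Ideal"]
prices_by_color = {"D": [4200, 3400, 2600], "E": [3700, 3900, 2800]}

fig, ax = plt.subplots()
bar_width = 0.4
for offset, (color_grade, prices) in enumerate(prices_by_color.items()):
    positions = [i + offset * bar_width for i in range(len(cuts))]
    ax.bar(positions, prices, width=bar_width, label=color_grade)

# Center each tick under its pair of bars
ax.set_xticks([i + bar_width / 2 for i in range(len(cuts))])
ax.set_xticklabels(cuts)

# Legend inside the axes: a single row, because ncol equals the number of entries
legend = ax.legend(loc="upper right", ncol=2)
legend_labels = [text.get_text() for text in legend.get_texts()]

# Alternative: pin the legend's upper-left corner just outside the right edge
# of the axes so it cannot overlap the bars; this replaces the legend above
outside_legend = ax.legend(loc="upper left", bbox_to_anchor=(1.02, 1.0), ncol=1)

fig.canvas.draw()  # render the figure; in a notebook, plt.show() would be used
```

When loc is combined with bbox_to_anchor, loc names the corner of the legend box that is pinned to the anchor point, which is why upper left is used to push the legend to the right of the axes.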
Set the axis labels and their font size:
ax = sns.barplot(x='cut', y='price', hue='color', data=diamonds_df)
ax.legend(loc='upper right', ncol=4)
ax.set_xlabel('Cut', fontdict={'fontsize': 15})
ax.set_ylabel('Price', fontdict={'fontsize': 15})
The output is as follows:

ax = sns.barplot(x='cut', y='price', hue='color', data=diamonds_df)
ax.legend(loc='upper right', ncol=4)
# set fontsize and rotation of x-axis tick labels
ax.set_xticklabels(ax.get_xticklabels(), fontsize=13, rotation=30)
The output is as follows:
The rotation feature is particularly useful when the tick labels are long and crowd together on the x-axis.
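Rotated labels are often easier to read when they are also right-aligned, so that the end of each label sits under its tick. A minimal sketch of this follows, using made-up category names and Matplotlib's ha (horizontal alignment) text property, which the exercise above does not use.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt

# Hypothetical long category names and counts
categories = ["Excellent polish", "Very good symmetry", "Good proportions"]
counts = [12, 7, 9]

fig, ax = plt.subplots()
positions = list(range(len(categories)))
ax.bar(positions, counts)

# Fix the tick positions first, then set the label text and its properties
ax.set_xticks(positions)
ax.set_xticklabels(categories, fontsize=13, rotation=30, ha="right")

fig.tight_layout()  # make room for the rotated labels

tick_labels = ax.get_xticklabels()
rotations = [label.get_rotation() for label in tick_labels]
```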
Another useful feature to have in plots is annotation. Suppose we want to add more information to the plot about ideally cut diamonds. In the following exercise, we'll make a simple bar plot more informative by adding an annotation.
In this exercise, we will annotate a bar plot, generated using the catplot function of seaborn, using a note right above the plot. Let's see how:
Import the necessary modules:
import matplotlib.pyplot as plt
import seaborn as sns
Load the diamonds dataset:
diamonds_df = sns.load_dataset('diamonds')
Generate a bar plot using the catplot function of the seaborn library:
ax = sns.catplot(x="cut", data=diamonds_df, aspect=1.5, kind="count", color="b")
The output is as follows:

Find the coordinates for annotating the bar of the Ideal category:
# get records in the DataFrame corresponding to ideal cut
ideal_group = diamonds_df.loc[diamonds_df['cut']=='Ideal']
# x coordinate of the annotation: the row index of the first Ideal record,
# which is 0 here and coincides with the position of the Ideal bar
x = ideal_group.index.tolist()[0]
# y coordinate of the annotation: the number of Ideal records, that is, the bar height
y = len(ideal_group)
print(x)
print(y)
The output is:
0
21551
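The x value above comes from the row index of the first Ideal record, so it matches the bar position only because that record happens to be the first row of the dataset. A more general sketch, run here on a small made-up DataFrame, looks up the bar position from the order of the categories instead. It assumes that the column is stored as a pandas categorical, in which case seaborn draws the bars in category order; the helper name bar_coordinates is hypothetical.

```python
import pandas as pd

# Made-up records; the category order mirrors the order of the bars
toy_df = pd.DataFrame({
    "cut": pd.Categorical(
        ["Good", "Ideal", "Premium", "Ideal", "Good", "Ideal"],
        categories=["Ideal", "Premium", "Good"],
    )
})

def bar_coordinates(df, column, category):
    """Return (x, y): the bar's position on the axis and its height (record count)."""
    x = list(df[column].cat.categories).index(category)
    y = int((df[column] == category).sum())
    return x, y

x, y = bar_coordinates(toy_df, "cut", "Good")
print(x, y)  # 2 2
```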
# annotate the plot with any note or extra information
sns.catplot(x="cut", data=diamonds_df, aspect=1.5, kind="count", color="b")
plt.annotate('excellent polish and symmetry ratings;\nreflects almost all the light that enters it', xy=(x,y), xytext=(x+0.3, y+2000), arrowprops=dict(facecolor='red'))
The output is as follows:

There seem to be a lot of parameters in the annotate function, but worry not! Matplotlib's official documentation (https://matplotlib.org/3.1.0/api/_as_gen/matplotlib.pyplot.annotate.html) covers all the details. For instance, the xy parameter denotes the point (x, y) to annotate, by default in data coordinates. xytext denotes the position (x, y) at which to place the text; if it is None, it defaults to xy. Note that we added an offset of 0.3 to x and 2000 to y (since y is close to 20,000) to keep the text readable. The arrowprops parameter is a dictionary of arrow properties; here, it sets the fill color of the arrow to red.
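A minimal, self-contained sketch of these parameters on a plain Matplotlib bar chart follows. The bar heights and the note text are invented for illustration. annotate() returns an Annotation object, so the annotated point and the text can be inspected afterwards.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.bar([0, 1, 2], [20000, 14000, 12000])  # hypothetical counts
ax.set_xticks([0, 1, 2])
ax.set_xticklabels(["Ideal", "Premium", "Very Good"])
ax.set_ylim(0, 26000)  # leave headroom above the bars for the note

x, y = 0, 20000  # top of the first bar, in data coordinates
note = ax.annotate(
    "tallest bar",
    xy=(x, y),                         # the point the arrow points at
    xytext=(x + 0.3, y + 2000),        # where the text is placed
    arrowprops=dict(facecolor="red"),  # arrow properties, here a red fill
)

fig.canvas.draw()  # render the figure; in a notebook, plt.show() would be used
```

Calling annotate on the Axes object rather than through plt makes it explicit which plot receives the note, which helps once a figure holds more than one set of axes.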
There are several other bells and whistles associated with visualization libraries in Python, some of which we will see as we progress in the book. At this stage, we will go through a chapter activity to revise the concepts in this chapter.
So far, we have seen how to generate two simple plots using seaborn and pandas, namely histograms and bar plots:
Histograms: the hist() function in pandas and distplot() in seaborn.
Bar plots: the plot(kind='bar') function in pandas, and the catplot(kind='count') and barplot() functions in seaborn.
With the help of various considerations arising in the process of plotting these two types of visualizations, we presented some basic concepts in data visualization:
Formatting legends with loc and other parameters in the legend function
Changing the properties of tick labels with the set_xticklabels() and set_yticklabels() functions
Adding annotations with the annotate() function
We'll be working with the 120 years of Olympic History dataset acquired by Randi Griffin from https://www.sports-reference.com/ and made available on the GitHub repository of this book. Your assignment is to identify the top five sports based on the largest number of medals awarded in the year 2016, and then perform the following analysis:
High-Level Steps
Age feature of all medal winners in the top five sports (2016).
The expected output should be:
[Figures: expected output after each of Steps 1 to 8]
The bar plot indicates the highest athlete weight in rowing, followed by swimming, and then the other remaining sports. The trend is similar across both male and female players.
The solution steps can be found on page 254.
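As a starting point for the first part of the assignment only, ranking sports by medal count can be sketched with pandas as shown below. The sketch runs on a tiny invented table; the column names Year, Sport, and Medal are assumptions about the dataset's layout and should be checked against the actual file.

```python
import pandas as pd

# Invented rows; the column names are assumed, not confirmed by the text
events = pd.DataFrame({
    "Year":  [2016, 2016, 2016, 2016, 2012, 2016, 2016, 2016],
    "Sport": ["Rowing", "Swimming", "Rowing", "Judo",
              "Rowing", "Swimming", "Swimming", "Rowing"],
    "Medal": ["Gold", "Silver", None, "Bronze",
              "Gold", "Gold", "Bronze", "Silver"],
})

# Keep only the 2016 rows in which a medal was actually awarded
medals_2016 = events[(events["Year"] == 2016) & events["Medal"].notna()]

# Count medals per sport and keep the most frequent sports
top_n = 2  # the activity asks for five; two is enough for this toy table
top_sports = medals_2016["Sport"].value_counts().head(top_n).index.tolist()
print(top_sports)  # ['Swimming', 'Rowing']
```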