
Creating animated scatterplots with Python


Project completed in week 1 (28.09.-02.10.20) of the Data Science Bootcamp at Spiced Academy in Berlin.

Our first bootcamp project was to create an animated scatterplot using matplotlib or seaborn together with imageio. The scatterplot illustrates the relationship between life expectancy and fertility rate across the world's countries from 1960 to 2015, based on the Gapminder data set.

   country      year  population  life_expectancy  fertility_rate  continent
0  Afghanistan  1800  3280000.0   28.21            7.0             Asia

The animated scatterplot is essentially a sequence of static plots, one per year, shown in quick succession. Building the animation takes four steps:

1. Create static scatterplots for each year in the data set.

The scatterplots show life_expectancy on the x axis and fertility_rate on the y axis. To make the plots more insightful, the size of each point represents the country's population and its color represents the continent.

import matplotlib.pyplot as plt 
%matplotlib inline
import seaborn as sns

# `year` is a single year and gapminder_df the Gapminder DataFrame (both defined elsewhere)
sns.scatterplot(x='life_expectancy',
                y='fertility_rate',
                hue='continent',
                size='population',
                sizes=(10, 1000),
                legend=False,
                data=gapminder_df.loc[gapminder_df['year']==year],
                alpha=0.7,
                palette='Set2')

plt.title(f'{year}', loc='center', fontsize=20, color='black', fontweight='bold')

plt.xlabel('Life expectancy')
plt.ylabel('Fertility rate')

2. Export the scatterplot images to a designated folder.

import imageio
import os

images = []
folder='/path/to/folder/images'

if not os.path.exists(folder):
    os.mkdir(folder)
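
As a small variation on the check above, `os.makedirs` with `exist_ok=True` creates the folder (and any missing parent folders) in one call and does nothing if the folder already exists. The sketch below uses a temporary directory and a made-up subfolder name so that it can run anywhere; it is not the path from this post, and `ensure_folder` is a hypothetical helper.

```python
import os
import tempfile

def ensure_folder(path):
    """Create `path` (including missing parents) if needed and return it."""
    os.makedirs(path, exist_ok=True)
    return path

# Hypothetical location inside a temporary directory, for illustration only.
base = tempfile.mkdtemp()
folder = ensure_folder(os.path.join(base, "bootcamp", "images"))

# Calling it a second time is harmless because of exist_ok=True.
ensure_folder(folder)
print(os.path.isdir(folder))
```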

3. Join the individual images in chronological order.

filename = f'lifeexp_{year}.png'
plt.savefig(os.path.join(folder,filename))
images.append(imageio.imread(os.path.join(folder,filename)))
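
Inside the loop, frames are appended as they are saved, so they are already in chronological order. If the images were instead collected from the folder afterwards, the directory listing could come back in arbitrary order and the file names would need sorting by year first. A minimal sketch, assuming the `lifeexp_<year>.png` naming used above; the helper name `frames_in_order` is made up for this example.

```python
import re

def frames_in_order(filenames):
    """Return the lifeexp_<year>.png names sorted by year, ignoring other files."""
    pattern = re.compile(r"^lifeexp_(\d{4})\.png$")
    dated = []
    for name in filenames:
        match = pattern.match(name)
        if match:
            dated.append((int(match.group(1)), name))
    return [name for _, name in sorted(dated)]

# A shuffled listing, as os.listdir() might return it.
listing = ["lifeexp_1962.png", "scatterplot.gif", "lifeexp_1960.png", "lifeexp_1961.png"]
print(frames_in_order(listing))
```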

4. Export the sequence of scatterplots as a GIF. The fps (frames per second) parameter sets the speed of the animation.

imageio.mimsave(os.path.join(folder,'scatterplot.gif'), images, fps=20)
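
To get a feel for the fps value: one frame per year from 1960 to 2015 gives 56 frames, so at 20 frames per second one pass through the animation lasts just under three seconds. The small helper below (a hypothetical name, not part of imageio) only does that arithmetic.

```python
def gif_duration_seconds(n_frames, fps):
    """Length of one pass through the animation, in seconds."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return n_frames / fps

n_frames = len(range(1960, 2016))  # one frame per year, 1960 to 2015
print(n_frames)                            # 56
print(gif_duration_seconds(n_frames, 20))  # 2.8
print(gif_duration_seconds(n_frames, 5))   # 11.2
```

A lower fps value slows the animation down, which can make the individual years easier to read.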

Now, putting everything together, here's the full code and the animated scatterplot:

import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns
import imageio
import os

# gapminder_df is assumed to be a pandas DataFrame that is already loaded, with the columns
# country, year, population, life_expectancy, fertility_rate and continent

images = []
folder='/home/lorena/Documents/bootcamp/W1/images'

if not os.path.exists(folder):
    os.mkdir(folder)

for year in range(1960, 2016):

    # fix the axis ranges so every frame uses the same scale
    plt.axis((0, 100, 0, 10))

    sns.scatterplot(x='life_expectancy',
                    y='fertility_rate',
                    hue='continent',
                    size='population',
                    sizes=(10, 1000),
                    legend=False,
                    data=gapminder_df.loc[gapminder_df['year']==year],
                    alpha=0.7,
                    palette='Set2')

    plt.title(f'{year}', loc='center', fontsize=20, color='black', fontweight='bold')
    #plt.title(f'inspired by Hans Rosling', loc='right', fontsize=10, color='grey', style='italic', pad=-20)

    #plt.legend(bbox_to_anchor=(0.74, 0.85), loc='center')

    plt.xlabel('Life expectancy')
    plt.ylabel('Fertility rate')
    #plt.annotate({country}, )

    # save the frame, then read it back so it can be added to the GIF
    filename = f'lifeexp_{year}.png'
    plt.savefig(os.path.join(folder,filename))
    images.append(imageio.imread(os.path.join(folder,filename)))

    # start a new figure so the next year does not draw over this one
    plt.figure()

imageio.mimsave(os.path.join(folder,'scatterplot.gif'), images, fps=20)

[Figure: animated scatterplot]

Friday Lightning Talk

Each week, we get a main dataset and several tasks for applying the concepts learned during the week. On Fridays, each of us gives a 5-minute talk on something from the weekly challenge project: a particular finding, a chart, a new library, (un)solved bugs, or anything else that is worth sharing and helpful for others. This Friday, I chose to talk about five new bash commands for checking the installed Python libraries, their versions and dependencies.

function                                                     command
to list all installed libraries                              pip list
to list only outdated libraries                              pip list -o (or --outdated)
to list only the latest / up to date libraries               pip list -u (or --uptodate)
to show all information about a library                      pip show <package-name>
to list all libraries installed in a specific environment    conda list -n <environment-name>
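
Similar information is also available from inside Python through the standard library's `importlib.metadata` module (Python 3.8 and later). The following is a rough sketch that mimics `pip list` and the dependency part of `pip show`; the helper names are made up for this example, and the output depends on whatever is installed in the active environment.

```python
from importlib import metadata

def installed_packages():
    """Map each installed distribution name to its version, similar to `pip list`."""
    packages = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip entries with broken or missing metadata
            packages[name] = dist.metadata.get("Version") or "unknown"
    return packages

def declared_dependencies(package_name):
    """Return the requirement strings a package declares, or [] if unavailable."""
    try:
        return metadata.requires(package_name) or []
    except metadata.PackageNotFoundError:
        return []

# Print name==version pairs in case-insensitive alphabetical order.
for name, version in sorted(installed_packages().items(), key=lambda kv: kv[0].lower()):
    print(f"{name}=={version}")
```

Calling `declared_dependencies` with the name of an installed library returns its declared requirements; for a name that is not installed it returns an empty list.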