Data Visualization in Python

At this point in the course, it is time to begin connecting the dots and applying visualization to your knowledge of statistics. Work through these programming examples to round out your knowledge of seaborn as it is applied to univariate and bivariate plots.

Univariate Plots

We will introduce plotting code from three libraries: matplotlib, seaborn, and pandas. As we go, you may wonder which one you should learn. Chris Moffitt offers the following advice.

A pathway to learning (Chris Moffitt)

  1. Learn the basic matplotlib terminology, specifically what a Figure and an Axes are.
  2. Always use the object-oriented interface. Get in the habit of using it from the start of your analysis. (We will not go deep into this here, but in short: avoid the MATLAB-style form shown at the end unless you have to use it.)
  3. Start your visualizations with basic pandas plotting.
  4. Use seaborn for the more complex statistical visualizations.
  5. Use matplotlib to customize the pandas or seaborn visualization.
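
As a rough illustration of steps 1, 2, 3, and 5, the sketch below creates a Figure and an Axes explicitly, draws a pandas histogram onto that Axes, and then customizes it with matplotlib methods. The `cars` DataFrame and its values are made up for this example and are not the course's mtcars data.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Illustrative values only (an assumption); not the course's mtcars data.
cars = pd.DataFrame({"mpg": [21.0, 22.8, 18.7, 14.3, 24.4, 30.4, 15.8, 19.7]})

# A Figure is the whole canvas; an Axes is one plotting area inside it.
fig, ax = plt.subplots(figsize=(6, 4))

# Steps 2 and 3: draw with pandas, but onto the Axes we created ourselves.
cars["mpg"].plot.hist(ax=ax, bins=5)

# Step 5: customize through methods on the matplotlib Axes object.
ax.set_title("Fuel economy")
ax.set_xlabel("Miles per gallon")
ax.set_ylabel("Number of cars")
```

Call plt.show() to display the figure, or fig.savefig(...) to write it to a file. Holding on to `fig` and `ax` is what makes later customization straightforward, which is the main argument for the object-oriented interface.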

pandas

Histogram

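import matplotlib.pyplot as plt

# Assumes mtcars is a pandas DataFrame that has already been loaded.
mtcars.plot.hist(y='mpg')
plt.show()
# Alternative spellings of the same plot:
# mtcars.plot(y='mpg', kind='hist')
# mtcars['mpg'].plot(kind='hist')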


Bar plot

mtcars['cyl'].value_counts().plot.bar();
plt.show()
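
One detail worth knowing about the bar plot above: value_counts() orders its result by frequency, so the bars appear from most to least common rather than in category order. A minimal sketch of sorting by category first, using made-up cylinder values rather than the course's mtcars data:

```python
import matplotlib.pyplot as plt
import pandas as pd

# Illustrative cylinder values only (an assumption); not the course's mtcars data.
cyl = pd.Series([8, 4, 8, 6, 8, 4, 8, 4, 8], name="cyl")

counts = cyl.value_counts()            # ordered by frequency, most common first
counts_by_value = counts.sort_index()  # ordered by category: 4, 6, 8

fig, ax = plt.subplots()
counts_by_value.plot.bar(ax=ax, rot=0)  # rot=0 keeps the tick labels horizontal
ax.set_xlabel("Cylinders")
ax.set_ylabel("Number of cars")
```

Whether frequency order or category order reads better depends on the variable; for an ordered quantity such as a cylinder count, category order is usually easier to scan.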


Density plot

# kind='density' draws a kernel density estimate; pandas needs SciPy installed for this.
mtcars['mpg'].plot(kind='density')
plt.show()


seaborn

Histogram

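import seaborn as sns
# histplot replaces distplot(..., kde=False), which is deprecated since seaborn 0.11.
ax = sns.histplot(data=mtcars, x='mpg')
plt.show()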


Bar plot

sns.countplot(data = mtcars, x = 'cyl');
plt.show()


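import pandas as pd

diamonds = pd.read_csv('data/diamonds.csv.gz')
# Fix the left-to-right order of the bars; color grades not listed here are not drawn.
ordered_colors = ['E', 'F', 'G', 'H', 'I', 'J']
sns.catplot(data=diamonds, x='color', kind='count', order=ordered_colors, color='blue')
plt.show()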


Density plot

# kdeplot replaces distplot(..., hist=False), which is deprecated since seaborn 0.11.
sns.kdeplot(data=mtcars, x='mpg')
plt.show()