import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline

Load an example dataset from seaborn's online repository¶

tip=sns.load_dataset('tips')

tip.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 244 entries, 0 to 243
Data columns (total 7 columns):
total_bill    244 non-null float64
tip           244 non-null float64
sex           244 non-null category
smoker        244 non-null category
day           244 non-null category
time          244 non-null category
size          244 non-null int64
dtypes: category(4), float64(2), int64(1)
memory usage: 6.8 KB

tip.head(3)

visualizing regressions¶

Plot data and regression model fits across a FacetGrid.

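# Recent seaborn releases require keyword arguments here;
# `height` replaces the old `size` argument (seaborn >= 0.9).
sns.lmplot(x='total_bill', y='tip', data=tip, height=3, aspect=2)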

<seaborn.axisgrid.FacetGrid at 0x7f71804eb950>

group by categorical column¶

# One panel per category of 'sex'; `height` replaces the old `size` argument (seaborn >= 0.9)
sns.lmplot(x='total_bill', y='tip', data=tip, height=3,
           col='sex')

<seaborn.axisgrid.FacetGrid at 0x7f717db077d0>

plot group data in the same graph¶

See the seaborn color palette tutorial: http://seaborn.pydata.org/tutorial/color_palettes.html

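# Both groups share one panel, distinguished by color;
# `height` replaces the old `size` argument (seaborn >= 0.9)
sns.lmplot(x='total_bill', y='tip', data=tip, height=3, aspect=2,
           hue='sex', palette='Set1')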

<seaborn.axisgrid.FacetGrid at 0x7f71804eba90>

plot Residuals¶

residplot()

tip.head(1)

sns.residplot(x='total_bill',y='tip',data=tip,color='indianred')

<matplotlib.axes._subplots.AxesSubplot at 0x7f717cf9e7d0>
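
The quantity that residplot() draws can be sketched by hand: fit a straight line, then subtract the fitted values from the observed ones. The snippet below is a minimal illustration, not seaborn's own code. It uses synthetic data (the arrays `total_bill` and `tip_amount` are hypothetical stand-ins, not the real tips dataset) so that it runs without a download.

```python
import numpy as np

# Hypothetical data standing in for total_bill and tip (NOT the real tips dataset).
rng = np.random.default_rng(0)
total_bill = rng.uniform(5, 50, size=200)
tip_amount = 1.0 + 0.1 * total_bill + rng.normal(0, 1, size=200)

# Fit a first-order (straight-line) least-squares regression of tip on total_bill.
slope, intercept = np.polyfit(total_bill, tip_amount, deg=1)

# Residual = observed value minus fitted value.
residuals = tip_amount - (slope * total_bill + intercept)

print(f"slope={slope:.3f}, intercept={intercept:.3f}")
print(f"mean residual={residuals.mean():.2e}")
```

For a least-squares line with an intercept, the residuals average to zero, so a residual plot is read by looking for curvature or a changing spread around the zero line rather than for an overall offset.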

Higher-order regressions¶

When there are more complex relationships between two variables, a simple first order regression is often not sufficient to accurately capture the relationship between the variables. Seaborn makes it simple to compute and visualize regressions of varying orders.
sns.regplot()

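The function sns.lmplot() is a higher-level interface to sns.regplot().
- A principal difference between the two is how matplotlib options are handled: sns.regplot() is an axes-level function that draws onto the current axes (or one passed as ax=), while sns.lmplot() is a figure-level function that creates its own FacetGrid. This makes sns.regplot() more permissive when combined with other matplotlib calls.
- For both sns.lmplot() and sns.regplot(), the keyword order controls the order of the polynomial regression.
- Passing scatter=None (or scatter=False) to sns.regplot() suppresses the scatter points, so they are not drawn a second time on top of an existing scatter plot.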

tip.head(1)

# Scatter plot of 'tip' against 'total_bill' using semi-transparent red circles
plt.scatter(tip['total_bill'], tip['tip'], label='data', color='red', marker='o', alpha=.5)

# Plot in blue a linear (order 1) regression of 'tip' on 'total_bill'
sns.regplot(x='total_bill', y='tip', data=tip, scatter=None, color='blue', label='order 1')

# Plot in green a polynomial regression of order 2
sns.regplot(x='total_bill', y='tip', data=tip, scatter=None, order=2, color='green', label='order 2')

# Plot in purple a polynomial regression of order 3
sns.regplot(x='total_bill', y='tip', data=tip, scatter=None, order=3, color='purple', label='order 3')

# Add a legend and display the plot
plt.legend(loc='upper right')
plt.show()
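
To see what changing the order does numerically, the sketch below fits polynomials of order 1, 2 and 3 with numpy.polyfit and compares their sums of squared errors. It assumes that the order keyword corresponds to the degree of an ordinary least-squares polynomial fit, and it uses synthetic, deliberately curved data (the arrays `x` and `y` are hypothetical) rather than the tips dataset.

```python
import numpy as np

# Hypothetical curved relationship with noise (NOT the tips dataset).
rng = np.random.default_rng(1)
x = rng.uniform(0, 10, size=150)
y = 2.0 + 0.5 * x - 0.08 * x**2 + rng.normal(0, 0.3, size=150)

sse = {}
for order in (1, 2, 3):
    # Least-squares polynomial of the given degree.
    coeffs = np.polyfit(x, y, deg=order)
    fitted = np.polyval(coeffs, x)
    # Sum of squared errors: smaller means a closer fit to the sample.
    sse[order] = float(np.sum((y - fitted) ** 2))
    print(f"order {order}: SSE = {sse[order]:.2f}")
```

The error can only go down as the order increases, because each polynomial contains the lower-order ones as special cases. A large drop (here from order 1 to order 2) suggests real curvature, while a tiny drop suggests the extra term is mostly fitting noise.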

Visualizing univariate distributions¶

Strip plot¶

swarmplot¶

sns.stripplot(y= 'tip', data=tip)
plt.ylabel('tip ($)')

<matplotlib.text.Text at 0x7f717ce6aa50>

sns.stripplot(x='day', y='tip', data=tip)
plt.ylabel('tip ($)')

<matplotlib.text.Text at 0x7f717ce22710>

sns.stripplot(x='day', y='tip', data=tip, size=4, jitter=True)
plt.ylabel('tip ($)')

<matplotlib.text.Text at 0x7f717cc93750>
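
The effect of jitter=True can be sketched as adding a small random offset along the categorical axis so that points with similar values no longer sit on top of each other. The snippet below is only an illustration of that idea: the day labels are sampled at random and `jitter_width` is a hypothetical value, not the width seaborn picks.

```python
import numpy as np

rng = np.random.default_rng(2)
days = ["Thur", "Fri", "Sat", "Sun"]

# Hypothetical category label for each of 12 points.
day_of_point = rng.choice(days, size=12)

# Each category is drawn at an integer position on the x axis: 0, 1, 2, 3.
positions = np.array([days.index(d) for d in day_of_point], dtype=float)

# Jitter: shift every point sideways by a small uniform random amount.
jitter_width = 0.1  # hypothetical half-width, chosen for illustration
jittered = positions + rng.uniform(-jitter_width, jitter_width, size=positions.size)

for day, x_pos in zip(day_of_point, jittered):
    print(f"{day:>4}  x = {x_pos:.3f}")
```

Only the categorical coordinate is perturbed; the plotted values themselves (the tips) are left unchanged.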

sns.swarmplot(x='day', y='tip', data=tip)
plt.ylabel('tip ($)')

<matplotlib.text.Text at 0x7f717cca2ad0>

sns.swarmplot(x='day', y='tip', data=tip, hue='sex',  palette='Set1')
plt.ylabel('tip ($)')

<matplotlib.text.Text at 0x7f717cb27350>

sns.swarmplot(x='tip', y='day', data=tip, hue='sex',  orient='h')
# The plot is horizontal, so the tip amounts are on the x axis
plt.xlabel('tip ($)')

<matplotlib.text.Text at 0x7f717ca7a690>

Violin plot¶

plt.subplot(1,2,1)
sns.boxplot(x='day', y='tip', data=tip)
plt.ylabel('tip ($)')

plt.subplot(1,2,2)
sns.violinplot(x='day', y='tip', data=tip)
plt.ylabel('tip ($)')
plt.tight_layout()
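
The box plot in the left panel summarizes each group with a handful of numbers. The sketch below computes them for one synthetic sample (the array `tips` is hypothetical), assuming the common convention of whiskers that extend to the most extreme points within 1.5 times the interquartile range of the box.

```python
import numpy as np

# Hypothetical right-skewed sample standing in for one day's tips.
rng = np.random.default_rng(3)
tips = rng.gamma(shape=4.0, scale=0.75, size=100)

# The box spans the first to third quartile, with a line at the median.
q1, median, q3 = np.percentile(tips, [25, 50, 75])
iqr = q3 - q1

# Points beyond 1.5 * IQR from the box are drawn individually as outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
inside = tips[(tips >= lower_fence) & (tips <= upper_fence)]
outliers = tips[(tips < lower_fence) | (tips > upper_fence)]

# Whiskers reach the most extreme points that are still inside the fences.
whisker_low, whisker_high = inside.min(), inside.max()

print(f"box: {q1:.2f} to {q3:.2f}, median {median:.2f}")
print(f"whiskers: {whisker_low:.2f} to {whisker_high:.2f}, outliers: {outliers.size}")
```

A violin plot replaces this summary with a smoothed density estimate of the whole sample, which is why it can show features such as two peaks that a box plot hides.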

sns.violinplot(x='day', y='tip', data=tip, inner=None,
               color='lightgray')

sns.stripplot(x='day', y='tip', data=tip, size=4,
              jitter=True)

plt.ylabel('tip ($)')

<matplotlib.text.Text at 0x7f717ca25dd0>

Visualizing multivariate distributions¶

Joint plots¶

# `height` replaces the old `size` argument (seaborn >= 0.9)
sns.jointplot(x='total_bill', y='tip', data=tip, height=5)

<seaborn.axisgrid.JointGrid at 0x7f717ca34a10>

Using kind='kde'¶

Kernel density estimate of the joint distribution

# `height` replaces the old `size` argument (seaborn >= 0.9)
sns.jointplot(x='total_bill', y='tip', data=tip,
              kind='kde', height=5)

<seaborn.axisgrid.JointGrid at 0x7f717ce8b050>
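
A kernel density estimate places a small smooth bump on every observation and averages the bumps. The sketch below shows the one-dimensional Gaussian version of that idea with numpy only. The helper `gaussian_kde_1d`, the synthetic sample and the fixed bandwidth of 0.4 are all assumptions made for illustration; seaborn chooses its bandwidth automatically, and the joint plot above uses a two-dimensional estimate.

```python
import numpy as np

def gaussian_kde_1d(samples, grid, bandwidth):
    """Average of Gaussian kernels centred on each sample, evaluated on grid."""
    # Standardized distance from every grid point to every sample.
    diffs = (grid[:, None] - samples[None, :]) / bandwidth
    kernels = np.exp(-0.5 * diffs**2) / np.sqrt(2 * np.pi)
    # Mean over samples, rescaled so the estimate integrates to one.
    return kernels.mean(axis=1) / bandwidth

# Hypothetical sample (NOT the tips dataset).
rng = np.random.default_rng(4)
samples = rng.normal(3.0, 1.0, size=300)

grid = np.linspace(-3, 9, 601)
density = gaussian_kde_1d(samples, grid, bandwidth=0.4)

# A density should be non-negative and have total area close to one.
area = float(np.sum(density) * (grid[1] - grid[0]))
print(f"area under the estimate: {area:.3f}")
```

A smaller bandwidth gives a bumpier estimate that follows individual points, while a larger one gives a smoother estimate that can blur real structure.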

Pair plot¶

# `height` replaces the old `size` argument (seaborn >= 0.9)
sns.pairplot(tip, height=2)

<seaborn.axisgrid.PairGrid at 0x7f717c398c50>

sns.pairplot(tip, hue='sex', kind='reg')

<seaborn.axisgrid.PairGrid at 0x7f717b9e4b10>

heatmap¶

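covariance and correlation matrices

# Recent pandas versions raise an error when cov() or corr() meets the
# categorical columns, so restrict the frame to its numeric columns first.
tip.select_dtypes('number').cov()

tip.select_dtypes('number').corr()

sns.heatmap(tip.select_dtypes('number').corr())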

<matplotlib.axes._subplots.AxesSubplot at 0x7f717b287d10>

Output of the tip.head(3) cell:

   total_bill   tip     sex smoker  day    time  size
0       16.99  1.01  Female     No  Sun  Dinner     2
1       10.34  1.66    Male     No  Sun  Dinner     3
2       21.01  3.50    Male     No  Sun  Dinner     3

Covariance matrix of the numeric columns:

            total_bill       tip      size
total_bill   79.252939  8.323502  5.065983
tip           8.323502  1.914455  0.643906
size          5.065983  0.643906  0.904591

Correlation matrix of the numeric columns:

            total_bill       tip      size
total_bill    1.000000  0.675734  0.598315
tip           0.675734  1.000000  0.489299
size          0.598315  0.489299  1.000000
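
The two matrices are directly related: each correlation is the corresponding covariance divided by the product of the two standard deviations. The sketch below applies that formula to the covariance values printed in this notebook and prints the result for comparison with the correlation values, which it should match up to rounding. The variable names are chosen for illustration.

```python
import numpy as np

labels = ["total_bill", "tip", "size"]

# Covariance matrix copied from the notebook output.
cov = np.array([
    [79.252939, 8.323502, 5.065983],
    [8.323502, 1.914455, 0.643906],
    [5.065983, 0.643906, 0.904591],
])

# Standard deviations are the square roots of the diagonal (the variances).
std = np.sqrt(np.diag(cov))

# corr[i, j] = cov[i, j] / (std[i] * std[j])
corr = cov / np.outer(std, std)

for name, row in zip(labels, corr):
    print(f"{name:>10}", np.round(row, 6))
```

Because the correlation is scale-free, it is the better input for a heatmap: the covariance matrix is dominated by total_bill simply because that column has the largest variance.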
