Python Seaborn

Python has a rich ecosystem of powerful frameworks and libraries. Among them is Seaborn, a prominent data visualization library. This Python Seaborn tutorial walks through the essentials of data visualization with Seaborn.

Overview

Seaborn is a data visualization library built on top of Matplotlib and integrated with pandas data structures in Python. It lets you create attractive charts with less code. Its main purpose is visualization that helps in exploring and understanding data.

What Is Seaborn?

Built on top of Matplotlib, Seaborn is a popular Python library for data visualization that offers a user-friendly interface for creating attractive and informative statistical graphics. It is designed to work with pandas DataFrames, which makes it simple to explore and visualize data.

Example 1

import numpy as np
import matplotlib.pyplot as plt

# Random data: six cumulative random walks of 500 steps each
rng = np.random.RandomState(0)
x = np.linspace(0, 10, 500)
y = np.cumsum(rng.randn(500, 6), 0)

# 1. Plot the data with Matplotlib defaults
plt.plot(x, y)
plt.legend('ABCDEF', ncol=2, loc='upper left')

Example 2

# 2. Now let's see what Seaborn can do
import seaborn as sns
sns.set()  # apply Seaborn's default theme to all later Matplotlib plots

# same data defined above (x, y)
plt.plot(x, y)
plt.legend('ABCDEF', ncol=2, loc='upper left')

Different Categories Of Plot in Seaborn

Seaborn groups its plots into several categories. They visualize relationships between variables, which may be purely numerical or categorical (a class, a division, or a group). The categories are:

Relational Plots

These plots help in understanding the relationship between two variables.

Categorical Plots

These plots deal with categorical variables and show how the data can be visualized across categories.

Regression Plots

These plots add a visual guide to the data, such as a fitted line, which helps highlight patterns in a dataset during data analysis.

Distribution Plots

These plots are used to examine univariate as well as bivariate distributions.

Multi-plot Grids

These grids are helpful for drawing several instances of the same plot on different subsets of the dataset.

Matrix Plots

In Seaborn's API, matrix plots display rectangular data as a color-encoded matrix; heatmap() and clustermap() belong to this group.

Let us look at some Seaborn plots in Python to understand them better:

Line Plot

The line plot is one of the most common Seaborn plots. Its primary use is to visualize data as a series, that is, values observed along a continuous or ordered variable such as time.

A Python example that draws a line plot in Seaborn is given below.

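import seaborn as sns

# Valid styles are "darkgrid", "whitegrid", "dark", "white", and "ticks"
sns.set(style="white")

# The fmri example dataset has the timepoint, signal, region, and event columns used below
data1 = sns.load_dataset("fmri")
sns.lineplot(x="timepoint",
             y="signal",
             hue="region",
             style="event",
             data=data1)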

Line Plot Output:

Violin Plot

The violin plot resembles the box plot, except that it also uses kernel density estimation to give a richer description of the data distribution. It is drawn with the violinplot() function.

Syntax:

violinplot([x,y, hue, data, order, …])

Example:

# importing packages
import seaborn as sns
import matplotlib.pyplot as plt

# loading dataset
data = sns.load_dataset("iris")

# one violin per species, showing the distribution of sepal_width
sns.violinplot(x='species', y='sepal_width', data=data)
plt.show()

Output: 

Scatter Plot

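A scatter plot can be created with sns.scatterplot(). Data points can be grouped into categories by passing a column of categorical values to the hue parameter.

import seaborn as sns
df = sns.load_dataset("iris")  # assumed: df is the iris dataset, whose columns match those used below
sns.set_theme()  # apply Seaborn styling; recent Matplotlib no longer provides style.use('seaborn')
sns.scatterplot(data=df, x='sepal_length', y='sepal_width', hue='species')

Other plot types include the pair plot, box plot, and heatmap.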

Installation Of Seaborn Library

Installing the Seaborn library in a Python environment is straightforward. The steps for installing and getting started with Seaborn are given below:

Using Pip Installer

pip is the standard package manager for Python.

pip install seaborn

Using Anaconda

Anaconda is a Python distribution that combines an environment manager, a package manager, and an extensive range of open-source packages. After setting up Anaconda, you can use the conda command or Anaconda's graphical package manager to install any extra packages you need.

conda install seaborn

If you work in a cloud-based Jupyter environment such as Google Colab, these libraries are most likely already installed, and you can begin by importing them.

Whether you use a local Python runtime or a cloud setup, you must import the libraries into your program before using them.

Also make sure that the following dependencies are installed on your system:

Matplotlib
Statsmodels (optional; used for some advanced regression plots)
Pandas
NumPy
SciPy
Python 3.6 or later

Dependencies For Seaborn Library

The minimum versions of the Seaborn dependencies are the following. These minimums correspond to an older Seaborn release; newer releases require more recent versions, so check the installation notes for the version you install.

Python 3.6+
SciPy (>= 1.0.1)
NumPy (>= 1.13.3)
Matplotlib (>= 2.1.2)
Pandas (>= 0.22.0)

Run the following imports to confirm that all the dependencies are installed and working as expected:

import seaborn as sns
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
from scipy import stats

If these imports run without errors, the dependencies are in place and you are ready to use Seaborn.
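The import check above can be extended to report which version of each package is installed. The following is a minimal sketch, not an official Seaborn utility; the package list simply mirrors the dependency lists in this tutorial, and statsmodels is treated as optional.

```python
import importlib
import importlib.util

# Package names taken from the dependency lists above (statsmodels is optional).
packages = ["seaborn", "pandas", "matplotlib", "numpy", "scipy", "statsmodels"]

versions = {}
for name in packages:
    if importlib.util.find_spec(name) is None:
        versions[name] = None  # package is not installed
        continue
    module = importlib.import_module(name)
    versions[name] = getattr(module, "__version__", "unknown")

for name, version in versions.items():
    print(f"{name}: {version if version is not None else 'not installed'}")
```

Comparing the printed versions with the minimums listed for your Seaborn release shows quickly whether anything needs upgrading.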

Some Basic Plots Using Seaborn

The primary idea of Seaborn is that it offers high-level commands to design various plot types beneficial for statistical data exploration, and also some statistical model fitting.

Let's look at some Seaborn examples of basic plot types.

Line Plot

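A line plot is a well-known chart that draws a line to show how a value evolves over ordered or continuous data.

flights_data = sns.load_dataset("flights")  # assumed: flights_data is Seaborn's flights example dataset
sns.lineplot(data=flights_data, x="year", y="passengers")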

Histplot

You can build a histogram in Seaborn by passing a vector to the histplot function. Note that you can pass either the variable itself or the name of a variable in a dataset.

import numpy as np
import seaborn as sns

# Data simulation: 1000 draws from a standard normal distribution
rng = np.random.RandomState(0)
x = rng.normal(0, 1, size=1000)
df = {'x': x}

# Histogram from the vector itself
sns.histplot(x=x)

# Equivalent, naming a key of the dataset:
sns.histplot(x="x", data=df)

Lmplot

This plot is commonly used to draw a line fitted by linear regression. It shows the data points in 2-D as a scatter plot and overlays the fitted line on them.

A Python example that draws an lmplot is given below.

import seaborn as sns

sns.set(style="ticks")
# The tips example dataset has the size and total_bill columns used below
df = sns.load_dataset("tips")
sns.lmplot(x="size", y="total_bill", data=df)

Lmplot Output:

Distribution Plot

This plot is helpful for drawing a histogram. It can also be combined with variations such as rugplot and kdeplot.

The following example produces a histogram as output:

import matplotlib.pyplot as plt
import seaborn as sns

# distplot() is deprecated since Seaborn 0.11; histplot() with kde=True draws a comparable figure
sns.histplot(x=[10, 11, 12, 13, 14, 15], kde=True)
plt.show()

Distribution Plot Output:

The Objective Of Python Seaborn Library

Seaborn is a Python visualization library that offers an enhanced drawing toolkit for creating attractive and informative data visualizations. It is closely integrated with pandas data structures, and it is built directly on top of Matplotlib.

It also lets us produce advanced data visualizations. This helps us understand the data by placing it in a visual context and revealing relationships between trends or variables that might not be immediately apparent. Comparing Seaborn with Matplotlib, Matplotlib has a low-level interface whereas Seaborn has a high-level interface.
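The difference between the low-level and high-level interfaces can be sketched by drawing the same grouped scatter plot both ways. The small DataFrame below is hypothetical data invented for this illustration.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical toy data, invented for this comparison.
df = pd.DataFrame({
    "x": [1, 2, 3, 4, 5, 6],
    "y": [2.0, 2.9, 4.1, 1.0, 2.2, 2.8],
    "group": ["a", "a", "a", "b", "b", "b"],
})

fig, (ax_mpl, ax_sns) = plt.subplots(1, 2, figsize=(9, 3))

# Matplotlib: loop over the groups and build the legend by hand.
for name, part in df.groupby("group"):
    ax_mpl.scatter(part["x"], part["y"], label=name)
ax_mpl.set_xlabel("x")
ax_mpl.set_ylabel("y")
ax_mpl.legend(title="group")

# Seaborn: name the columns; hue handles grouping, colors, labels, and legend.
sns.scatterplot(data=df, x="x", y="y", hue="group", ax=ax_sns)
```

Both panels show the same information; the Seaborn call expresses it in one line because it works from column names.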

The advantages of using Seaborn in our application are given below:

  • We can plot our data conveniently by using the Seaborn library.

  • We only need to pass our data or dataset to a function such as relplot(); Seaborn computes and positions the values, so we do not need to worry about the library's internal workings.

  • It creates attractive and informative plots of our data, making it easy for the user to view and understand the data in the application.
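As a minimal sketch of the relplot() point in the list above, the call below passes a DataFrame and column names and leaves the layout to Seaborn. The data is a hypothetical stand-in shaped like the tips dataset used later in this tutorial.

```python
import pandas as pd
import seaborn as sns

# Hypothetical stand-in data with tips-style column names.
df = pd.DataFrame({
    "total_bill": [10.3, 21.0, 23.7, 24.6, 25.3, 8.8, 26.9, 15.0],
    "tip": [1.7, 3.5, 3.3, 3.6, 4.7, 2.0, 3.1, 2.0],
    "time": ["Lunch", "Lunch", "Lunch", "Lunch",
             "Dinner", "Dinner", "Dinner", "Dinner"],
})

# relplot() is a figure-level function: it returns a FacetGrid,
# here with one panel per value of the "time" column.
grid = sns.relplot(data=df, x="total_bill", y="tip", col="time", kind="scatter")
```

The returned grid exposes the underlying Matplotlib axes through grid.axes if further customization is needed.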

Python Seaborn Plotting Functions

The Seaborn library provides a range of plotting functions that make data easier to view and interpret.

The functions below show how variables, including categorical ones, can be depicted graphically.

Bar Plot

We will now use a bar plot to show the average tip that customers left on each day.

tips = sns.load_dataset("tips")  # Seaborn's built-in tips example dataset
sns.barplot(x="day", y="tip", data=tips)
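By default, barplot() draws the mean of the y variable for each category, so the bar heights can be checked against a pandas groupby. This is a minimal sketch; the small tips table is hypothetical stand-in data, not the real dataset.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical stand-in for the tips dataset.
tips = pd.DataFrame({
    "day": ["Thur", "Thur", "Fri", "Fri", "Sat", "Sat"],
    "tip": [2.0, 3.0, 3.0, 4.0, 2.5, 3.5],
})

# The value each bar represents: the mean tip per day.
mean_tip = tips.groupby("day", sort=False)["tip"].mean()
print(mean_tip)

fig, ax = plt.subplots()
sns.barplot(x="day", y="tip", data=tips, ax=ax)
```

The day with the tallest bar is the one with the highest average tip, which is not necessarily the day with the single largest tip.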

Distribution Plot

The distribution plot, or dist plot, shows the counts or density of a given feature in the dataset. Let us plot the distribution of tips from the dataset.

# distplot() is deprecated since Seaborn 0.11; histplot() with kde=True draws a comparable figure
sns.histplot(x=tips['tip'], kde=True)

The resulting plot shows that most of the tips left by customers fall in the range of 2 to 4.

Count Plot

A count plot lets us conveniently plot a feature against its number of occurrences or observations.

Let us show the number of smokers and non-smokers in the dataset.

sns.countplot(x='smoker', data=tips)
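The bars of a count plot correspond to the counts that pandas value_counts() returns. The sketch below illustrates this with a hypothetical six-row stand-in for the smoker column.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical stand-in for the "smoker" column of the tips dataset.
tips = pd.DataFrame({"smoker": ["No", "Yes", "No", "No", "Yes", "No"]})

# The numbers behind the bars: how often each category occurs.
counts = tips["smoker"].value_counts()
print(counts)

fig, ax = plt.subplots()
sns.countplot(x="smoker", data=tips, ax=ax)
```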

Scatter Plots

Seaborn also offers a simple way to generate scatter plots. The scatterplot() function draws a scatter plot that shows the relationship between two variables. For instance, a scatter plot of "sepal_length" against "sepal_width" in the iris dataset shows how the two measurements relate.
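A minimal sketch of such a scatter plot follows. To stay runnable offline it uses a small stand-in table with iris-style column names; with an internet connection, sns.load_dataset("iris") would supply the full dataset instead.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Small stand-in table with the same column names as the iris dataset.
iris = pd.DataFrame({
    "sepal_length": [5.1, 4.9, 7.0, 6.4, 6.3, 5.8],
    "sepal_width": [3.5, 3.0, 3.2, 3.2, 3.3, 2.7],
    "species": ["setosa", "setosa", "versicolor",
                "versicolor", "virginica", "virginica"],
})

fig, ax = plt.subplots()
# hue colors the points by species and adds a legend.
sns.scatterplot(data=iris, x="sepal_length", y="sepal_width",
                hue="species", ax=ax)
```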

Box Plots

Seaborn also offers a simple way to build box plots with the boxplot() function. For instance, a box plot of "sepal_length" by "species" in the iris dataset shows how the distribution of sepal length differs between species.
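A minimal sketch of such a box plot follows. The table is hypothetical stand-in data with iris-style column names; sns.load_dataset("iris") would provide the real measurements.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical stand-in data: four sepal lengths per species.
iris = pd.DataFrame({
    "species": ["setosa"] * 4 + ["versicolor"] * 4 + ["virginica"] * 4,
    "sepal_length": [5.1, 4.9, 4.7, 5.0,
                     7.0, 6.4, 6.9, 5.5,
                     6.3, 5.8, 7.1, 6.5],
})

fig, ax = plt.subplots()
# One box per species, summarizing the spread of sepal_length.
sns.boxplot(data=iris, x="species", y="sepal_length", ax=ax)
```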

Heatmap

Seaborn also offers a simple way to build heatmaps with the heatmap() function. For example, a heatmap of the flights dataset can show the number of passengers for each combination of month and year.
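Because heatmap() expects a two-dimensional matrix, a long table is usually pivoted first. The sketch below assumes a small flights-style stand-in table with year, month, and passengers columns; sns.load_dataset("flights") would provide the full dataset.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Small stand-in table shaped like the flights dataset.
flights = pd.DataFrame({
    "year": [1949, 1949, 1949, 1950, 1950, 1950],
    "month": ["Jan", "Feb", "Mar", "Jan", "Feb", "Mar"],
    "passengers": [112, 118, 132, 115, 126, 141],
})

# Pivot to a month-by-year matrix and restore calendar order of the rows.
matrix = (
    flights.pivot(index="month", columns="year", values="passengers")
    .reindex(["Jan", "Feb", "Mar"])
)

fig, ax = plt.subplots()
# annot=True writes each value into its cell; fmt="d" formats it as an integer.
sns.heatmap(matrix, annot=True, fmt="d", ax=ax)
```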

Pair Plot

The pair plot is one of the easiest ways to visualize the relationships between all features: it plots every pairwise relationship in the dataset at once.

sns.pairplot(tips)

The function takes all the numeric features in the dataset and plots each of them against the others.

Linear Regression Plot

Each plot in Seaborn has its own set of parameters. For sns.jointplot, there are three parameters you will normally supply: the x-axis data, the y-axis data, and the dataset.

To draw a linear regression, we add the optional parameter kind="reg" to those three parameters.

tips = sns.load_dataset("tips")
sns.jointplot(x="total_bill", y="tip", data=tips, kind="reg")

Note that you can also draw a linear regression using regplot() or lmplot().
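The two alternatives can be sketched as follows: regplot() draws onto a single Matplotlib axes, while lmplot() is figure-level and returns a FacetGrid. The small table is hypothetical stand-in data with tips-style column names.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical stand-in for the tips dataset.
tips = pd.DataFrame({
    "total_bill": [10.3, 15.0, 21.0, 23.7, 24.6, 25.3, 26.9, 30.1],
    "tip": [1.7, 2.0, 3.5, 3.3, 3.6, 4.7, 3.1, 4.5],
})

# Axes-level: scatter plus fitted regression line on an existing axes.
fig, ax = plt.subplots()
sns.regplot(data=tips, x="total_bill", y="tip", ax=ax)

# Figure-level: creates its own figure and returns a FacetGrid.
grid = sns.lmplot(data=tips, x="total_bill", y="tip")
```

regplot() fits well into a custom subplot layout, whereas lmplot() is convenient when the fit should be split across panels with col or row.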

Pairplot

A pairplot shows the pairwise relationships across an entire DataFrame and supports an extra argument named hue for categorical separation. It essentially draws a joint plot for every possible pair of numerical columns, so it can take some time when the DataFrame is large. It is drawn with the pairplot() function.

Syntax:

pairplot(data[, hue, hue_order, palette, …])

Example:

# importing packages
import seaborn as sns
import matplotlib.pyplot as plt

# loading dataset
data = sns.load_dataset("iris")

# pairwise plots of all numeric columns, colored by species
sns.pairplot(data=data, hue='species')
plt.show()

Output:

Conclusion

The Seaborn library is widely used in Python for data visualization, and it is a powerful tool for that purpose. A key benefit is that it is based on Matplotlib, so users can customize their plots and graphs to suit their requirements. This tutorial introduced the Seaborn library and the types of plots it provides.

FAQs

1. Is Seaborn utilized for data manipulation?

No. Seaborn is mainly focused on data visualization. For more complicated data manipulation tasks, we need to rely on pandas or other data manipulation libraries in Python.

2. What data structures are adopted by Seaborn?

Seaborn accepts datasets that contain more than one vector arranged in tabular form. There is a basic difference between "wide-form" and "long-form" data tables, and Seaborn handles each differently.

3. Is Seaborn open source?

Yes. Seaborn is an open-source Python library used for drawing statistical data plots. It is a fundamental library for producing graphics that offer insight into a dataset.
