Python has a large ecosystem of powerful frameworks and libraries. Among them is Seaborn, a prominent data visualization library. This Python Seaborn tutorial walks through the main techniques for visualizing data with Seaborn.
Seaborn is a data visualization library built on top of Matplotlib and closely integrated with pandas data structures in Python. It lets you make attractive charts with less code, and its main purpose is visualization that helps you explore and understand data.
Seaborn presents a user-friendly, high-level interface for creating attractive and informative statistical graphics. It is designed to work with pandas DataFrames, which makes it simple to explore and visualize data effectively.
import matplotlib.pyplot as plt
import numpy as np
# Random data
rng = np.random.RandomState(0)
x = np.linspace(0, 10, 500)
y = np.cumsum(rng.randn(500, 6), 0)
# 1. Plot the data with Matplotlib defaults
plt.plot(x, y)
plt.legend('ABCDEF', ncol=2, loc='upper left');
# 2. Now let's see what Seaborn can do
import seaborn as sns
sns.set()
# same data defined above (x, y)
plt.plot(x, y)
plt.legend('ABCDEF', ncol=2, loc='upper left');
The Seaborn library groups its plots into several categories. They visualize the relationship between variables that are either entirely numerical or categorical (a class, a division, or a group). The categories of plots are:
Relational Plots
These plots show the relationship between two variables.
Categorical Plots
These plots deal with categorical variables and show how a numeric variable is distributed or summarized across categories.
Regression Plots
These plots add a fitted regression line to the data, which helps highlight patterns in a dataset during exploratory analysis.
Distribution Plots
These plots are used to examine univariate as well as bivariate distributions.
Multi-plot Grids
These grids draw several instances of the same plot on different subsets of the dataset.
Matrix Plots
These plots display matrix-like data as a color-encoded grid, as in a heatmap.
Let us look at the Seaborn plots and understand them better through Python code:
The line plot is one of the most common Seaborn plots. Its primary use is to plot data as a series, that is, values observed in order along a continuous variable such as time.
An example in python to display the line plot in seaborn is given below.
import seaborn as sns
sns.set(style="darkgrid")
# "fmri" is the Seaborn sample dataset that has the columns used below
data1 = sns.load_dataset("fmri")
sns.lineplot(x="timepoint",
             y="signal",
             hue="region",
             style="event",
             data=data1)
Line Plot Output:
A violin plot resembles a box plot, except that it uses kernel density estimation to give a richer description of the data distribution. It is drawn with the violinplot() method.
Syntax:
violinplot([x,y, hue, data, order, …])
Example:
# importing packages
import seaborn as sns
import matplotlib.pyplot as plt
# loading dataset
data = sns.load_dataset("iris")
sns.violinplot(x='species', y='sepal_width', data=data)
plt.show()
Output:
A scatter plot can be created using sns.scatterplot(). We can also group data points into categories by passing the hue parameter the column that carries the categorical values.
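import seaborn as sns
df = sns.load_dataset('iris')  # the columns below come from the iris sample dataset
sns.set()  # apply the Seaborn theme
sns.scatterplot(x='sepal_length', y='sepal_width', hue='species', data=df)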
Other categories of plots include Pair plot, Box plot, and Heatmap.
Installing the Seaborn library in a Python environment is easy. The steps for installing and getting started with Seaborn are given below:
Using Pip Installer
pip is the standard package installer for Python.
pip install seaborn
Using Anaconda
Anaconda is a Python distribution that combines an environment manager, a package manager, and an extensive range of open-source modules. After setting up Anaconda, you can use the conda command or Anaconda's graphical package manager to install any extra packages you need.
conda install seaborn
If you are working in a cloud-based Jupyter environment, like Google Colab, these packages are most likely already installed, and you can begin by importing them.
Whether you use a local Python runtime or a cloud setup, you must import the libraries into your program before using them.
Further, ensure that the following dependencies are installed on your system. The minimum versions listed apply to older Seaborn releases; newer releases require newer versions.
Python 3.6+
NumPy (>= 1.13.3)
SciPy (>= 1.0.1)
Matplotlib (>= 2.1.2)
Pandas (>= 0.22.0)
Statsmodels (optional, needed only for some statistical features)
Run the following imports to confirm that all the dependencies are installed and working as expected:
import seaborn as sns
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
from scipy import stats
Once the necessary dependencies are installed and the imports succeed, you are ready to use Seaborn.
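If an import fails, or you want to compare installed versions with the minimums listed above, a quick check along the following lines can help. This is a minimal sketch that assumes Python 3.8 or later (for importlib.metadata); the package list simply mirrors the dependency names in this tutorial.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names taken from the dependency list in this tutorial.
packages = ["seaborn", "matplotlib", "pandas", "numpy", "scipy", "statsmodels"]

installed = {}
for name in packages:
    try:
        installed[name] = version(name)
    except PackageNotFoundError:
        installed[name] = None  # not installed in this environment

# Print one line per package with its version, or a note if it is missing.
for name, ver in installed.items():
    print(f"{name:12s} {ver if ver else 'not installed'}")
```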
The primary idea of Seaborn is that it offers high-level commands to design various plot types beneficial for statistical data exploration, and also some statistical model fitting.
Let’s view some of the seaborn python examples of basic plot types.
A line plot is a well-known chart that draws a line to show how a value evolves over continuous or ordered data, such as time.
flights_data = sns.load_dataset("flights")
sns.lineplot(data=flights_data, x="year", y="passengers")
You can build a histogram in Seaborn by passing a vector to the histplot function. Note that you can pass either the vector itself or the name of a variable in a dataset.
import numpy as np
import seaborn as sns
# Data simulation
rng = np.random.RandomState(0)
x = rng.normal(0, 1, size=1000)
df = {'x': x}
# Histogram
sns.histplot(x=x)
# Equivalent to:
sns.histplot(x="x", data=df)
lmplot() is commonly used in Seaborn to draw a scatter plot together with a fitted linear regression line. It shows the data points in two dimensions, with one variable on the horizontal axis and the other on the vertical axis.
An example in python to display the lmplot is given below.
import seaborn as sns
sns.set(style="ticks")
# "tips" is the Seaborn sample dataset that has the "size" and "total_bill" columns
df = sns.load_dataset("tips")
sns.lmplot(x="size", y="total_bill", data=df)
Lmplot Output:
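The distribution plot, distplot(), draws a histogram and can combine it with variations such as a rug plot (rugplot) and a kernel density estimate (kdeplot). Note that distplot() has been deprecated since Seaborn 0.11 in favor of histplot() and displot().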
The example specified gives an understanding of the histogram as an output:
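import matplotlib.pyplot as plt1
import seaborn as sns
# distplot() is deprecated since Seaborn 0.11; histplot() with kde=True is its replacement
sns.histplot(x=[10, 11, 12, 13, 14, 15], kde=True, stat="density")
plt1.show()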
Distribution Plot Output:
Seaborn is a Python visualization library that provides a high-level drawing interface for creating attractive and informative data visualizations. It is closely tied to pandas data structures and is built entirely on top of Matplotlib.
Further, it lets us generate advanced data visualizations. These help us understand the data by placing it in a visual context and revealing relationships between trends or variables that might not be immediately apparent. In the comparison of Seaborn vs Matplotlib, Matplotlib has a low-level interface whereas Seaborn has a high-level interface.
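The high-level and low-level interfaces work together: axes-level Seaborn functions return a Matplotlib Axes object, which you can then adjust with ordinary Matplotlib calls. The sketch below illustrates this with a small made-up table (the tips_like values are hypothetical and used only so the example is self-contained).

```python
import pandas as pd
import seaborn as sns

# Hypothetical values, invented for illustration only.
tips_like = pd.DataFrame({
    "day": ["Thur", "Thur", "Fri", "Fri", "Sat", "Sat"],
    "tip": [2.0, 3.0, 2.5, 3.5, 3.0, 4.0],
})

# High-level Seaborn call: aggregates tip by day and draws the bars.
ax = sns.barplot(data=tips_like, x="day", y="tip")

# Low-level Matplotlib calls on the returned Axes for fine-tuning.
ax.set_title("Average tip by day")
ax.set_ylabel("Average tip")
ax.set_ylim(0, 5)
```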
The advantages of using Seaborn within our application are given below:
The Seaborn library provides a range of plotting tools that make data easier to view and interpret.
The examples that follow use Seaborn's plotting functions to show how categorical variables can be depicted graphically.
We will now use a bar plot to show which days brought in the highest average tip from customers.
tips = sns.load_dataset("tips")
sns.barplot(x="day", y="tip", data=tips)
The distribution plot, or dist plot, shows the counts or density of a given feature in the dataset. Let us plot the distribution of tips from the dataset.
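# distplot() is deprecated since Seaborn 0.11; histplot() with kde=True is its replacement
sns.histplot(data=tips, x="tip", kde=True, stat="density")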
The resulting plot shows that most of the tips offered by customers fall in the range of 2 to 4.
A count plot shows the number of observations in each category of a feature.
Let us illustrate the number of smokers and non-smokers within the dataset.
sns.countplot(x='smoker', data=tips)
Seaborn also offers a simple way to generate scatter plots. The scatterplot() function draws a scatter plot that shows the relationship between two variables (use regplot() if you also want a fitted line). For instance, a scatter plot of "sepal_length" against "sepal_width" in the iris dataset displays the association between those two measurements.
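A minimal sketch of such a scatter plot follows. To keep it runnable offline, it uses a few illustrative rows with iris-style column names (iris_like is a stand-in, not the full dataset); with network access you could call sns.load_dataset("iris") instead.

```python
import pandas as pd
import seaborn as sns

# Illustrative stand-in with the iris column names; swap in
# sns.load_dataset("iris") to plot the full sample dataset.
iris_like = pd.DataFrame({
    "sepal_length": [5.1, 4.9, 7.0, 6.4, 6.3, 5.8],
    "sepal_width": [3.5, 3.0, 3.2, 3.2, 3.3, 2.7],
    "species": ["setosa", "setosa", "versicolor",
                "versicolor", "virginica", "virginica"],
})

# hue colors the points by species and adds a legend.
ax = sns.scatterplot(data=iris_like, x="sepal_length", y="sepal_width",
                     hue="species")
ax.set_title("Sepal length vs. sepal width")
```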
Seaborn also offers a simple way to build box plots with the boxplot() function. For instance, a box plot of "sepal_length" by "species" in the iris dataset displays how sepal length is distributed within each species.
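The box plot described here could be sketched as follows. Again, iris_like is a small illustrative stand-in that only borrows the iris column names so the example runs without a download.

```python
import pandas as pd
import seaborn as sns

# Illustrative stand-in with iris-style columns (assumption for this sketch).
iris_like = pd.DataFrame({
    "species": ["setosa"] * 3 + ["versicolor"] * 3 + ["virginica"] * 3,
    "sepal_length": [5.1, 4.9, 4.7, 7.0, 6.4, 6.9, 6.3, 5.8, 7.1],
})

# One box per species, summarizing the spread of sepal_length.
ax = sns.boxplot(data=iris_like, x="species", y="sepal_length")
```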
Seaborn also presents a simple way to build heat maps with the heatmap() function. For example, a heatmap of the flights dataset can show the number of passengers for each combination of "month" and "year".
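heatmap() expects a two-dimensional grid, so long-form data is usually pivoted first. The sketch below assumes the flights layout of year, month, and passengers columns and uses a six-row illustrative excerpt (flights_like) rather than the full sample dataset.

```python
import pandas as pd
import seaborn as sns

# Illustrative excerpt in the flights layout; use sns.load_dataset("flights")
# for the complete table.
flights_like = pd.DataFrame({
    "year": [1949, 1949, 1949, 1950, 1950, 1950],
    "month": ["Jan", "Feb", "Mar", "Jan", "Feb", "Mar"],
    "passengers": [112, 118, 132, 115, 126, 141],
})

# Pivot to a grid: one row per month, one column per year.
grid = flights_like.pivot(index="month", columns="year", values="passengers")
grid = grid.reindex(["Jan", "Feb", "Mar"])  # restore calendar order

# annot=True writes each value into its cell; fmt="d" formats it as an integer.
ax = sns.heatmap(grid, annot=True, fmt="d")
```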
A pair plot is one of the easiest ways to visualize the association between all features: it plots every pairwise relationship in the dataset at once.
sns.pairplot(tips)
The method takes all the numeric features in the dataset and plots them against one another.
Each Seaborn plotting function takes its own set of parameters. For sns.jointplot, you typically pass three: the x-axis variable, the y-axis variable, and the dataset.
To draw a linear regression, pass the optional parameter kind="reg" (for linear regression) in addition to those three.
tips = sns.load_dataset("tips")
sns.jointplot(x="total_bill", y="tip", data=tips, kind="reg")
Note that you could also draw a linear regression using regplot() or lmplot().
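As a rough illustration of those two alternatives: regplot() draws onto a single Matplotlib Axes, while lmplot() returns a FacetGrid. The tips_like table below is a six-row illustrative stand-in with the same column names as the tips dataset.

```python
import pandas as pd
import seaborn as sns

# Illustrative stand-in; use sns.load_dataset("tips") for the full data.
tips_like = pd.DataFrame({
    "total_bill": [16.99, 10.34, 21.01, 23.68, 24.59, 25.29],
    "tip": [1.01, 1.66, 3.50, 3.31, 3.61, 4.71],
})

# Axes-level: scatter plus fitted line on one Axes.
ax = sns.regplot(data=tips_like, x="total_bill", y="tip")

# Figure-level: same fit, but wrapped in a FacetGrid (its own figure).
grid = sns.lmplot(data=tips_like, x="total_bill", y="tip")
```

lmplot() is the more convenient choice when you want to split the fit across subsets with hue, col, or row.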
A pair plot shows the pairwise relationships across the complete DataFrame and supports an extra argument named hue for categorical separation. It draws a plot for every pair of numerical columns, so it can take some time if the DataFrame is very large. It is drawn with the pairplot() method.
Syntax:
pairplot(data[, hue, hue_order, palette, …])
Example:
# importing packages
import seaborn as sns
import matplotlib.pyplot as plt
# loading dataset
data = sns.load_dataset("iris")
sns.pairplot(data=data, hue='species')
plt.show()
Output:
The Seaborn library is widely used for data visualization in Python, where it serves as a powerful tool. Because it is built on Matplotlib, users can customize their plots and graphs to their requirements. This Seaborn tutorial introduced the Seaborn library and the plot types it offers.
1. Is Seaborn utilized for data manipulation?
No. Seaborn is mainly focused on data visualization; for more complicated data manipulation tasks, we need to depend on the functionality offered by pandas or other data manipulation libraries in Python.
2. What data structures are adopted by Seaborn?
Seaborn accepts datasets that have more than one vector organized in some tabular fashion, such as pandas DataFrames. There is a basic difference between "wide-form" and "long-form" data tables, and Seaborn handles each differently.
3. Is Seaborn open source?
Yes. Seaborn is an open-source Python library used for drawing statistical plots. It is a fundamental library for visualizing data in ways that offer insight into a dataset.
PAVAN VADAPALLI