LOC vs ILOC in Pandas: Difference Between LOC and ILOC in Pandas

Loc and iloc in Pandas

A common source of confusion for people new to Pandas is loc versus iloc. Both select rows and columns from a DataFrame and their syntax looks almost identical, so it is easy to mix them up.

This article explains the key difference between the two indexers and then shows them in action so you can see how each one behaves.

Let’s get started.

Difference Between loc and iloc

1. iloc in Python

You can use iloc in Pandas for selection. It is integer-location based and selects by position, counting from 0. So iloc[5] returns the sixth row of the DataFrame, irrespective of its name or label.

Before going deeper into loc and iloc, here’s an example of iloc:

>>> import pandas as pd
>>> mydict = [{'a': 1, 'b': 2, 'c': 3, 'd': 4},
...           {'a': 100, 'b': 200, 'c': 300, 'd': 400},
...           {'a': 1000, 'b': 2000, 'c': 3000, 'd': 4000}]
>>> df = pd.DataFrame(mydict)
>>> df
      a     b     c     d
0     1     2     3     4
1   100   200   300   400
2  1000  2000  3000  4000

We’ll index the rows of the above DataFrame with a scalar integer using iloc. The result is a Series:

>>> type(df.iloc[0])
<class 'pandas.core.series.Series'>
>>> df.iloc[0]
a    1
b    2
c    3
d    4
Name: 0, dtype: int64

2. loc in Pandas

You can use loc in Pandas to access rows and columns by their labels; it also accepts a boolean array.

If you use loc[5], you won’t necessarily get the sixth row. You get the row whose label is 5, wherever it sits in the DataFrame, and a KeyError if no row has that label.

Here is an example of loc in Pandas:

>>> df = pd.DataFrame([[1, 2], [4, 5], [7, 8]],
...                   index=['cobra', 'viper', 'sidewinder'],
...                   columns=['max_speed', 'shield'])
>>> df
            max_speed  shield
cobra               1       2
viper               4       5
sidewinder          7       8

The above is the table from which we’ll extract a row:

>>> df.loc['viper']
max_speed    4
shield       5
Name: viper, dtype: int64
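The difference between a label and a position is easiest to see when the index holds integers that are not in positional order. The sketch below uses a small made-up DataFrame (the data and the column name `value` are assumptions for illustration only):

```python
import pandas as pd

# Hypothetical data: integer labels deliberately out of positional order.
df = pd.DataFrame({"value": ["a", "b", "c", "d"]}, index=[5, 2, 0, 3])

by_label = df.loc[0, "value"]   # the row whose label is 0 (third row here)
by_position = df.iloc[0, 0]     # the first row and first column, whatever the labels

print(by_label)     # c
print(by_position)  # a
```

Here loc[0] and iloc[0] point at different rows even though both receive the integer 0.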

Detailed Example for loc vs iloc

Even though we use both indexers for selection, a detailed example makes their distinctions clearer.

In our example, we’ll use the Telco Customer Churn dataset, which is available on Kaggle. We’ll load it into a DataFrame:

df = pd.read_csv("Projects/churn_prediction/Telco-Customer-Churn.csv")

df.head()

	customerID	gender	SeniorCitizen	Partner	Dependents	tenure	PhoneService	MultipleLines	InternetService	OnlineSecurity
0	7590-VHVEG	Female	0	Yes	No	1	No	No Phone	DSL	No
1	5575-GNVDE	Male	0	No	No	34	Yes	No	DSL	Yes
2	3668-QPYBK	Male	0	No	No	2	Yes	No	DSL	Yes

This dataset has 21 columns; we’ve only shown a few for demonstration purposes. As discussed, we use loc to select data by label. Here, the column names, such as gender, tenure, and OnlineSecurity, are the column labels.

As we haven’t assigned any specific index, Pandas creates an integer index for the rows by default. The row labels are integers that start at 0 and go up, so a row’s label and its position are the same number until we filter or reorder the rows. In this example, we’ll see how loc and iloc behave differently.

Select row “1” and column “Partner”

df.loc[1, 'Partner']

Output: 'No'

It shows the value present in the ‘Partner’ column of row ‘1’.

Select row labels up to ‘4’ (the end label is included) and columns ‘customerID’ and ‘gender’

df.loc[:4, ['customerID', 'gender']]

	customerID	gender
0	7590-VHVEG	Female
1	5575-GNVDE	Male
2	3668-QPYBK	Male
3	7795-CFOCW	Male
4	9237-HQITU	Female

Select row labels “1”, “2”, “3” and “Dependents” column

df.loc[[1, 2, 3], 'Dependents']

1    No
2    No
3    No
Name: Dependents, dtype: object

This time, we’ll filter the DataFrame first and then apply loc or iloc:

Select rows with labels up to “10” and the “PhoneService” and “InternetService” columns for customers that have a partner (Partner is ‘Yes’)

df[df.Partner == 'Yes'].loc[:10, ['PhoneService', 'InternetService']]

In the case above, we applied a filter to the DataFrame but didn’t reset the index, so the filtered rows keep their original labels and the labels of the rows that were filtered out are simply missing. loc[:10] therefore returns only the remaining rows whose labels are 10 or lower, which is fewer than 11 rows.

If, on the other hand, we use iloc after the filter, we get the first 10 rows of the filtered result, because iloc selects by position irrespective of the labels. Here’s the result we get with iloc[:10]:

df[df.Partner == 'Yes'].iloc[:10, [6, 8]]

	PhoneService	InternetService
0	No	DSL
8	Yes	Fiber optic
10	Yes	DSL
12	Yes	Fiber optic
15	Yes	Fiber optic
18	Yes	DSL
21	Yes	No
23	Yes	DSL
24	Yes	DSL
26	Yes	Fiber optic

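Notice that we also had to change how we select columns: iloc does not accept column names, so we pass the integer positions 6 and 8 instead of ‘PhoneService’ and ‘InternetService’.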
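The same effect can be reproduced on a small made-up frame. This is only a sketch: the 12 rows below are hypothetical stand-ins, not the Telco data, and the slice bounds are chosen so that the two results differ.

```python
import pandas as pd

# Hypothetical stand-in for the Telco data: 12 rows, default integer index.
df = pd.DataFrame({
    "Partner": ["Yes", "No", "No", "Yes", "No", "Yes",
                "No", "No", "Yes", "No", "Yes", "Yes"],
    "PhoneService": ["No", "Yes"] * 6,
})

# Filtering keeps the original row labels: 0, 3, 5, 8, 10, 11.
partnered = df[df.Partner == "Yes"]

by_label = partnered.loc[:4, ["PhoneService"]]   # labels up to and including 4
by_position = partnered.iloc[:4, [1]]            # first four rows, column position 1

print(list(by_label.index))     # [0, 3]
print(list(by_position.index))  # [0, 3, 5, 8]
```

loc returns two rows because only labels 0 and 3 survive the filter at or below 4, while iloc returns the first four filtered rows regardless of their labels.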

Select the first 4 columns and first 4 rows with iloc

df.iloc[:4, :4]

	customerID	gender	SeniorCitizen	Partner
0	7590-VHVEG	Female	0	Yes
1	5575-GNVDE	Male	0	No
2	3668-QPYBK	Male	0	No
3	7795-CFOCW	Male	0	No

We can use iloc to select positions counted from the end. For that, we simply use negative integers (-1, -2, etc.) in the slice.

Select the last 5 columns and last 5 rows

df.iloc[-5:, -5:]

	PaperlessBilling	PaymentMethod	MonthlyCharges	TotalCharges	Churn
7038	Yes	Mailed Check	84.80	1990.5	No
7039	Yes	Credit Card	103.20	7362.9	No
7040	Yes	Electronic check	29.60	346.45	No
7041	Yes	Mailed check	74.40	306.6	Yes
7042	Yes	Bank Transfer	105.65	6844.5	No

You can pass a lambda function to iloc too. (A lambda function is a small anonymous function in Python that can take any number of arguments but has only a single expression.)

Select every third row up to label 15 and only show the “Partner” and “InternetService” columns

df.iloc[lambda x: (x.index % 3 == 0) & (x.index <= 15)][['Partner', 'InternetService']]

	Partner	InternetService
0	Yes	DSL
3	No	DSL
6	No	Fiber optic
9	No	DSL
12	Yes	Fiber optic
15	Yes	Fiber optic
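The callable form can be tried on a small made-up frame as well. In this sketch the column `n` and the cut-off of 6 are assumptions for illustration; the lambda receives the DataFrame and returns a boolean array, which iloc then uses as a row mask.

```python
import pandas as pd

# Hypothetical frame with the default integer index 0..9.
df = pd.DataFrame({"n": range(10)})

# The lambda is called with the DataFrame itself and returns a boolean array.
picked = df.iloc[lambda x: (x.index % 3 == 0) & (x.index <= 6)]

print(list(picked.index))  # [0, 3, 6]
```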

We can also select a range of positions in the middle of the DataFrame.

Select the column positions 4 and 5 and the row positions 20 to 24 (the end of an iloc slice is excluded)

df.iloc[20:25, 4:6]

	Dependents	tenure
20	No	1
21	No	12
22	No	1
23	No	58
24	No	49

Now, if you try to pass labels to iloc, Pandas raises an error with a message like the following:

ValueError: Location-based indexing can only have [integer, integer slice (START point is INCLUDED, END point is EXCLUDED), listlike of integers, boolean array] types

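Passing a position to loc fails in a similar way when no row or column has that integer as its label: loc raises a KeyError. With the default integer index used here, though, loc accepts integers because they are the row labels.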

The key difference between Pandas loc[] and iloc[] is that loc[] selects DataFrame rows and columns by name/label, whereas iloc[] selects them by integer position.

When using loc[], an absent label raises a KeyError. When using iloc[], an out-of-range position raises an IndexError.
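A minimal sketch of the two failure modes, assuming the small snake-themed frame from earlier (the helper `describe_failure` is a hypothetical name introduced only for this illustration):

```python
import pandas as pd

df = pd.DataFrame({"max_speed": [1, 4, 7]},
                  index=["cobra", "viper", "sidewinder"])


def describe_failure(lookup):
    """Run a lookup and return the name of the exception it raises, if any."""
    try:
        lookup()
    except (KeyError, IndexError) as exc:
        return type(exc).__name__
    return "no error"


missing_label = describe_failure(lambda: df.loc["python"])   # no such label
missing_position = describe_failure(lambda: df.iloc[10])     # only 3 rows exist

print(missing_label)     # KeyError
print(missing_position)  # IndexError
```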

A Pandas DataFrame is a 2D tabular data structure with labeled axes, i.e., rows and columns. When you select columns from a DataFrame, the result is a new DataFrame consisting of only the chosen columns from the old one.

The usage notes below cover loc and iloc in more detail.

pandas.DataFrame.loc[] usage:

DataFrame.loc[] selects rows and/or columns in Pandas by label. It accepts a single label, a list of labels, a range between two index labels, and more. Slices take the form DataFrame.loc[START:STOP:STEP], where:

START is the label of the first row/column.
STOP is the label of the last row/column; unlike ordinary Python slicing, it is included in the result.
STEP is the number of rows/columns to advance after every extraction.
If START is not provided, loc[] selects from the beginning.
If STOP is not provided, loc[] selects all rows/columns from the START label to the end.
If both START and STOP are provided, loc[] selects all rows/columns between them, including both ends.
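The slicing rules above can be sketched on a small frame with string labels (the frame and its column `x` are made up for illustration):

```python
import pandas as pd

df = pd.DataFrame({"x": [10, 20, 30, 40, 50]},
                  index=["a", "b", "c", "d", "e"])

middle = df.loc["b":"d"]          # START and STOP given, both included
from_c = df.loc["c":]             # no STOP: runs to the last label
up_to_b = df.loc[:"b"]            # no START: begins at the first label
every_other = df.loc["a":"e":2]   # STEP of 2

print(list(middle.index))       # ['b', 'c', 'd']
print(list(every_other.index))  # ['a', 'c', 'e']
```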

pandas.DataFrame.iloc[] usage:

DataFrame.iloc[] selects rows and/or columns in Pandas by integer position. It accepts a single position, a list of positions, a range of positions, and more. Slices take the form DataFrame.iloc[START:STOP:STEP], where:

START is the integer position of the first row/column.
STOP is the integer position at which the selection stops; the row/column at STOP itself is not included.
STEP is the number of positions to advance after every extraction.

Certain points to note about iloc[]:

If START is not provided, iloc[] selects from the first row/column.
If STOP is not provided, iloc[] selects all rows/columns from START to the end.
If both START and STOP are provided, iloc[] selects all rows/columns from START up to, but not including, STOP.
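A short sketch of the positional slicing rules, again on a made-up frame. The labels are strings here only to make it obvious that iloc ignores them:

```python
import pandas as pd

df = pd.DataFrame({"x": [10, 20, 30, 40, 50]},
                  index=["a", "b", "c", "d", "e"])

middle = df.iloc[1:3]      # positions 1 and 2; position 3 is excluded
first_two = df.iloc[:2]    # no START: begins at position 0
from_three = df.iloc[3:]   # no STOP: runs to the end
every_other = df.iloc[::2] # STEP of 2

print(list(middle.index))       # ['b', 'c']
print(list(every_other.index))  # ['a', 'c', 'e']
```

Compare `df.iloc[1:3]` with `df.loc["b":"d"]`: the loc slice keeps its end label, the iloc slice drops its end position.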

Selecting Single Value through loc[] vs iloc[]:

With loc[] and iloc[], you can select a single row or a single column, by label and by integer position respectively. Selecting a single row or column returns a Series; selecting one row and one column together returns a scalar value.
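As an illustration, here is a sketch that selects a single row, a single column, and a single cell both ways. It reuses the small snake-themed frame from the loc example; the variable names are arbitrary.

```python
import pandas as pd

df = pd.DataFrame(
    [[1, 2], [4, 5], [7, 8]],
    index=["cobra", "viper", "sidewinder"],
    columns=["max_speed", "shield"],
)

row_by_label = df.loc["viper"]      # Series for the row labelled 'viper'
row_by_position = df.iloc[1]        # Series for the second row (the same row here)

col_by_label = df.loc[:, "shield"]  # Series for the 'shield' column
col_by_position = df.iloc[:, 1]     # Series for the second column

cell_by_label = df.loc["viper", "shield"]  # scalar
cell_by_position = df.iloc[1, 1]           # scalar

print(cell_by_label, cell_by_position)  # 5 5
```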

Selecting Multiple Rows/Columns through loc[] vs iloc[]:

To select multiple rows or columns, pass a list of labels to loc[] or a list of integer positions to iloc[]. The result is a DataFrame containing the listed rows and columns in the order given.
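A minimal sketch of list-based selection, assuming the same small snake-themed frame as before:

```python
import pandas as pd

df = pd.DataFrame(
    [[1, 2], [4, 5], [7, 8]],
    index=["cobra", "viper", "sidewinder"],
    columns=["max_speed", "shield"],
)

# Lists of labels for loc, lists of integer positions for iloc.
by_label = df.loc[["cobra", "sidewinder"], ["shield"]]
by_position = df.iloc[[0, 2], [1]]

print(by_label["shield"].tolist())  # [2, 8]
```

Because a list is passed for the columns as well, the result stays a DataFrame even though it has only one column.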

Selecting a range of values to present between two columns or rows:

With loc[] and iloc[], you can also select a range of rows or columns, i.e. all items between two rows or two columns, by using a slice.
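The sketch below selects the same block of a made-up frame with a label range and with a position range. The row labels `w` to `z` and column names `a` to `c` are assumptions for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]],
    index=["w", "x", "y", "z"],
    columns=["a", "b", "c"],
)

by_label = df.loc["x":"y", "a":"b"]   # both end labels are included
by_position = df.iloc[1:3, 0:2]       # both stop positions are excluded

print(by_label.shape)  # (2, 2)
```

Both calls return rows x and y with columns a and b; only the way the range is written differs.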

Selecting alternate rows or columns:

Using a slice with a STEP of 2, you can select every alternate row or column of a DataFrame.
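A few ways to do this, sketched on a made-up frame with the default integer index:

```python
import pandas as pd

df = pd.DataFrame({"a": range(6), "b": range(10, 16), "c": range(20, 26)})

even_rows_iloc = df.iloc[::2]   # positions 0, 2, 4
even_rows_loc = df.loc[::2]     # same rows here, because labels equal positions
odd_rows = df.iloc[1::2]        # positions 1, 3, 5
alt_columns = df.iloc[:, ::2]   # every other column: 'a' and 'c'

print(list(even_rows_iloc.index))  # [0, 2, 4]
print(list(alt_columns.columns))   # ['a', 'c']
```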

Using the conditions with loc[] vs iloc[]

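Both loc[] and iloc[] can select rows by condition from a Pandas DataFrame. loc[] accepts a boolean Series such as df['tenure'] > 12 directly; iloc[] needs a plain boolean array instead.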
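A sketch of conditional selection with both indexers. The four rows are made up, and converting the mask with `to_numpy()` is one way (an assumption about style, not the only option) to hand iloc a plain boolean array:

```python
import pandas as pd

# Hypothetical data loosely modelled on two Telco columns.
df = pd.DataFrame({
    "gender": ["Female", "Male", "Male", "Female"],
    "tenure": [1, 34, 2, 45],
})

mask = df["tenure"] > 12                    # boolean Series aligned on the index

with_loc = df.loc[mask, ["gender"]]         # loc takes the Series as-is
with_iloc = df.iloc[mask.to_numpy(), [0]]   # iloc takes a plain boolean array

print(list(with_loc.index))  # [1, 3]
```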

Comparison of loc vs iloc in Pandas:

The table below summarizes the differences between loc and iloc.

Pandas loc	Pandas iloc
loc selects a group of rows and columns by labels or by a boolean array. It accepts index labels and, when they exist in the calling DataFrame, returns the matching row, column, or DataFrame.	iloc is useful when the index labels of the DataFrame are not the default 0, 1, 2, …, n, or when you don’t know the index labels.
loc is label-based: slices are made with the labels of the index.	iloc works on positions in the index. It behaves like ordinary Python slicing, where you give integer positions to get the slice.
A loc slice includes its last element.	An iloc slice excludes its last element.
The arguments of .loc[] can be a row label or a list of row labels.	iloc is position-based. The arguments of .iloc[] can be: a single row and column, a list of rows and columns, or a range of rows and columns.
The loc indexer can do boolean selection when it is passed a boolean Series.	You can’t pass a boolean Series to iloc; it accepts only a plain boolean array.

Learn More About Python

We hope this article has answered your questions on loc and iloc in Pandas. The best way to understand how they work is to try them out yourself on different datasets.

Frequently Asked Questions (FAQs)

1. How can we add rows to a Pandas DataFrame?

The loc and iloc indexers address rows in different ways, and only loc can add a new one. The older ix indexer should no longer be used.

1. loc works with the index labels. loc[4] refers to the row whose index label is 4. Assigning to a label that does not exist yet, for example df.loc[new_label] = values, adds a row with that label.
2. iloc works with positions in the index. iloc[4] refers to the row at position 4, i.e. the fifth row. It can overwrite an existing row but cannot add a new one; assigning to a position past the end raises an IndexError.
3. ix mixed both behaviors: it treated an integer as a label when the index was integer-based and as a position otherwise, which made it ambiguous. It was deprecated in Pandas 0.20 and removed in Pandas 1.0.
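A minimal sketch of adding rows, assuming a small made-up frame. Assigning through loc to a new label is one option; `pd.concat` is shown as a common alternative for appending another DataFrame, and is an addition to what the answer above describes.

```python
import pandas as pd

df = pd.DataFrame({"max_speed": [1, 4], "shield": [2, 5]},
                  index=["cobra", "viper"])

# Assigning to a label that does not exist yet adds a row with that label.
df.loc["sidewinder"] = [7, 8]

# Appending the rows of another DataFrame with pd.concat.
extra = pd.DataFrame({"max_speed": [9], "shield": [10]}, index=["mamba"])
combined = pd.concat([df, extra])

print(list(combined.index))  # ['cobra', 'viper', 'sidewinder', 'mamba']
```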

2. What is reindexing in the context of Pandas in Python?

Reindexing conforms a DataFrame to a new set of row or column labels. The existing data is reordered to match the new labels, and any label that was not present before gets NaN as a placeholder for the missing data. In Pandas, reindex can be applied to the rows, the columns, or both.
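A short sketch of reindexing rows on a made-up frame; the label "z" is deliberately absent from the original index:

```python
import pandas as pd

df = pd.DataFrame({"x": [1, 2, 3]}, index=["a", "b", "c"])

# Reorder to the new labels; 'b' is dropped and 'z' is filled with NaN.
reindexed = df.reindex(["c", "a", "z"])

print(list(reindexed.index))            # ['c', 'a', 'z']
print(pd.isna(reindexed.loc["z", "x"])) # True
```

The original DataFrame is left unchanged; reindex returns a new object.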

3. What are some data operations in Pandas?

There are several important data operations for a DataFrame in Pandas, which are as follows:

1. Selection of rows and columns - By passing the names of rows and columns, we can select any row or column in the DataFrame. A single row or column selected from the DataFrame is one-dimensional and is returned as a Series.
2. Data filtering - By using boolean expressions on the DataFrame, we can filter the data.
3. Null values - When no data is given for an item, it receives a null value, which Pandas generally represents as NaN.
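The three operations can be sketched together on a small made-up frame (the column names `name` and `score` are assumptions for illustration):

```python
import pandas as pd

df = pd.DataFrame({"name": ["a", "b", "c"], "score": [1.0, None, 3.0]})

# 1. Selection: a single column comes back as a one-dimensional Series.
scores = df["score"]

# 2. Filtering: a boolean expression keeps only the matching rows.
high = df[df["score"] > 2]

# 3. Null values: the missing score is stored as NaN and can be counted.
missing_count = int(df["score"].isna().sum())

print(type(scores).__name__)  # Series
print(high["name"].tolist())  # ['c']
print(missing_count)          # 1
```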
