
Types of Data Structures in Python: List, Tuple, Sets & Dictionary

Python is a favourite language among data science practitioners. Its versatility and readable syntax let developers focus on understanding trends in the data and deriving meaningful insights, rather than on tracking down a missing semicolon or an unclosed bracket. Because Python is also one of the most popular languages among beginners and is adopted quickly, it is worth having a firm grasp of it.

Data structures are an essential concept in any programming language. They define how data is stored in and retrieved from memory efficiently, depending on the data type. They also define the relationships between data items, which determine the operations and functions that can be performed on them. Let's look at how Python manages data.

Types of Data Structures in Python

1. List

The list is the simplest and most commonly used data structure in Python. As the name suggests, it is a collection of items. The items can be of any type (numeric, string, boolean, objects, etc.), which makes a list heterogeneous: a single list can hold values of different types, and we can iterate over it with a for or while loop.
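As a minimal sketch of the point above, the snippet below builds a list with mixed types and walks over it with both kinds of loop. The variable names and values are illustrative assumptions, not part of the original article.

```python
# A list may mix element types; these values are purely illustrative.
mixed_items = ["upGrad", 1, 2.5, True, None]

# A for loop visits the elements in order.
type_names = []
for item in mixed_items:
    type_names.append(type(item).__name__)

# A while loop with an explicit index visits the same elements.
index = 0
visited = []
while index < len(mixed_items):
    visited.append(mixed_items[index])
    index += 1

print(type_names)
print(visited == mixed_items)
```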

Each element is associated with an index that defines its position in the list, and index numbering starts from zero. The list is mutable, meaning elements can be added, removed, or changed after the list is defined. It resembles arrays in other languages, except that those arrays are usually homogeneous, meaning they store only one type of data. Some basic operations on lists are as below:

  • To declare a list in Python, put the items in square brackets:

sample_list = ['upGrad', '1', 2]

  • To initialize an empty list:

sample_list = list()

  • Add elements to the list:

sample_list.append('new_element')

  • Remove elements from the list:

sample_list.remove(<element value>) removes the first element equal to that value and raises a ValueError if no such element exists

del sample_list[<element index num>] removes the element at that index

sample_list.pop(<element index num>) removes the element at that index and returns it; with no argument it removes and returns the last element

  • To change the element at any index:

sample_list[<any index>] = new_item

  • Slicing: This feature extracts part of a list without writing a loop. If you need only a specific range of values from the list, you can get it with:

sample_list[start:stop:step] where start is the first index included, stop is the first index excluded, and step is the gap between the selected elements (1 by default).
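The list operations above can be put together in one short, runnable sketch. The `numbers` list and the result variable names are assumptions added for illustration; the comments show the list contents after each step.

```python
sample_list = ["upGrad", "1", 2]      # declare a list with mixed types
sample_list.append("new_element")     # ["upGrad", "1", 2, "new_element"]

sample_list.remove("1")               # remove by value: ["upGrad", 2, "new_element"]
del sample_list[0]                    # remove by index: [2, "new_element"]
popped = sample_list.pop(0)           # remove by index and keep the value
sample_list[0] = "changed"            # replace the element at index 0

# Slicing returns a new list; stop is excluded and step defaults to 1.
numbers = list(range(10))             # [0, 1, 2, ..., 9]
first_three = numbers[0:3]
every_second = numbers[::2]
middle_stepped = numbers[1:8:3]
reversed_copy = numbers[::-1]         # a negative step walks backwards

print(popped, sample_list)
print(first_three, every_second, middle_stepped)
```

Omitting start or stop makes the slice run from the beginning or to the end of the list, which is why `numbers[::2]` covers the whole list.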


2. Tuple

The tuple is another data structure that stores data sequentially, meaning that the elements keep the order in which they were added, as in a list. Like a list, a tuple can store heterogeneous data, and indexing works the same way.

The major difference between the two is that a tuple is immutable: it cannot be changed after definition. This means that you cannot add new elements, replace existing items, or delete elements from the tuple. Elements can only be read from it, via indexing or unpacking.
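A small sketch of what immutability means in practice: assigning to an index raises a `TypeError`, while reading through unpacking works as usual. The names on the left-hand side of the unpacking are illustrative assumptions.

```python
sample_tuple = ("upGrad", "Python", "ML", 23432)

# Item assignment is rejected because tuples are immutable.
try:
    sample_tuple[0] = "changed"
    assignment_failed = False
except TypeError:
    assignment_failed = True

# Unpacking reads every element into its own name.
name, language, topic, number = sample_tuple

# A starred name collects the remaining elements into a list.
first, *rest = sample_tuple

print(assignment_failed, name, number, rest)
```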

This generally makes a tuple faster to create than a list. In CPython, a tuple is stored in a single block of memory, whereas a list requires two: a fixed-size block for the list object itself and a variable-size block for the element references. Prefer a tuple over a list when you are sure that the stored elements will not need further modification. Some things to consider while using a tuple:

  • To initialize an empty tuple:

sample_tuple = tuple()

  • To declare a tuple, enclose the items in parentheses:

sample_tuple = ('upGrad', 'Python', 'ML', 23432)

  • To access the elements of the tuple:

sample_tuple[<index_num>] 
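The tuple basics above can be sketched as follows. One detail worth showing is the single-element tuple, which needs a trailing comma; the variable names are assumptions made for this example.

```python
empty_tuple = tuple()                 # same result as ()
single_item = ("upGrad",)             # the trailing comma makes this a tuple
not_a_tuple = ("upGrad")              # without the comma this is just a string

sample_tuple = ("upGrad", "Python", "ML", 23432)
first_item = sample_tuple[0]          # indexing starts at zero
last_item = sample_tuple[-1]          # negative indexes count from the end
first_two = sample_tuple[:2]          # slicing returns a new tuple
position = sample_tuple.index("ML")   # index of the first matching element

print(type(single_item).__name__, type(not_a_tuple).__name__)
print(first_item, last_item, first_two, position)
```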

3. Sets

In mathematics, a set is a well-defined collection of unique elements that may or may not be related to each other. A tuple or a list can store duplicate elements without complaint, but the set data structure keeps only unique items.

The elements of a set are unordered: there is no defined position for an item, so neither indexing nor slicing is supported. The set itself is mutable, but its elements must be hashable, because a set locates its elements by their hash values; in practice this means immutable built-in values such as numbers, strings, and tuples of hashable items.
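The following sketch illustrates the three properties just described: duplicates collapse, unhashable elements such as lists are rejected, and indexing is not supported. The sample values and variable names are illustrative assumptions.

```python
# Building a set from a list drops the duplicate values.
with_duplicates = ["ML", "Python", "ML", "upGrad", "Python"]
unique_items = set(with_duplicates)

# A tuple of numbers is hashable, so it can be a set element.
coordinates = set()
coordinates.add((1, 2))

# A list is mutable and unhashable, so adding it raises TypeError.
try:
    coordinates.add([3, 4])
    list_rejected = False
except TypeError:
    list_rejected = True

# Sets have no positions, so indexing raises TypeError as well.
try:
    unique_items[0]
    indexing_supported = True
except TypeError:
    indexing_supported = False

print(len(unique_items), list_rejected, indexing_supported)
```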

Elements can be added to or removed from the set, but because there is no indexing, an element cannot be replaced in place. As in mathematics, the usual set operations are available, such as union, intersection, and difference, along with a test for whether two sets are disjoint. Let's look at how to implement it:

  • To initialize an empty set:

sample_set = set()

  • Add elements to the set:

sample_set.add(item) This adds a single item to the set

sample_set.update(items) This can add multiple items via a list, tuple, or another set

  • Remove elements from the set:

sample_set.discard(item) removes the element if it is present and does nothing if it is not

sample_set.remove(item) removes the element and raises a KeyError if it is not present

  • Set operations (Assume two sets initialized: A and B):

A | B or A.union(B):  Union operation 

A & B or A.intersection(B): Intersection operation 

A - B or A.difference(B): Difference of two sets

A ^ B or A.symmetric_difference(B) : Symmetric difference of sets
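A runnable sketch of the set methods and operators listed above. The contents of `A` and `B` are assumed values chosen for illustration. Note that an empty set has to be written as `set()`, because `{}` creates an empty dictionary.

```python
sample_set = set()                        # {} would create a dict, not a set
sample_set.add("ML")                      # add a single item
sample_set.update(["Python", "upGrad"])   # add several items from an iterable

sample_set.discard("missing")             # absent item: nothing happens
try:
    sample_set.remove("missing")          # absent item: KeyError
    remove_raised = False
except KeyError:
    remove_raised = True

A = {1, 2, 3, 4}
B = {3, 4, 5}
union = A | B                 # elements in A or B
intersection = A & B          # elements in both
difference = A - B            # elements in A but not in B
symmetric = A ^ B             # elements in exactly one of the two
disjoint = A.isdisjoint({7, 8})   # True when the sets share no element

print(union, intersection, difference, symmetric, disjoint)
```

The operators require both operands to be sets, while the method forms such as `A.union(...)` accept any iterable.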


4. Dictionary

The dictionary stores data elements as key-value pairs, which makes it one of the most widely used data structures in Python. The key must be hashable (in practice, an immutable value such as a string, number, or tuple), while the value can be any object, including a mutable one. The concept is like an actual dictionary, where the words are the keys and their meanings are the values. Since Python 3.7 a dictionary preserves the order in which pairs were inserted, but elements are still looked up by key, so there is no positional index in this data structure. Some important things related to this:

  • To initialize an empty dictionary:

sample_dict = dict()

  • To add elements to the dictionary:

sample_dict[key] = value

To create a dictionary that already contains elements, use sample_dict = {key: value}

If you print this dictionary, the output would be: {'key1': value, 'key2': value … }

  • To get the keys and values of the dictionary:

sample_dict.keys(): returns a view object of the keys

sample_dict.values(): returns a view object of the values

sample_dict.items(): returns a view object of the key-value pairs as (key, value) tuples
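The dictionary operations above can be sketched as follows. The keys and values are illustrative assumptions, and the ordering shown relies on Python 3.7 or later. The views returned by `keys()`, `values()`, and `items()` track later changes to the dictionary; wrap them in `list()` when a plain list is needed.

```python
sample_dict = dict()
sample_dict["name"] = "upGrad"        # add a new key-value pair
sample_dict["topic"] = "ML"
sample_dict["topic"] = "Python"       # assigning to an existing key overwrites

keys_view = sample_dict.keys()        # a live view, not a list
sample_dict["year"] = 2020            # the view now includes "year" as well

key_list = list(keys_view)            # insertion order is preserved
value_list = list(sample_dict.values())
pair_list = list(sample_dict.items()) # (key, value) tuples

# get() returns a default instead of raising KeyError for a missing key.
missing = sample_dict.get("mentor", "not set")

# A list cannot be used as a key because it is not hashable.
try:
    sample_dict[["a", "b"]] = 1
    list_key_rejected = False
except TypeError:
    list_key_rejected = True

for key, value in sample_dict.items():
    print(key, value)
```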

Conclusion

It's important to grasp the basics of data structures in Python. In the data industry, knowing the different data structures helps you understand the underlying algorithms better and write code that gets results efficiently. Which data structure to use depends heavily on the situation, and choosing well takes practice.

