Mastering NumPy: Initializing Ways, Ndarray Operations & Functions

Last updated:
5th Jan, 2021

NumPy is a Python package for efficient mathematical and numerical operations on multidimensional data. When building a Machine Learning solution for a business use case, the data must be transformed so that preprocessing is easy and the results are interpretable. NumPy is the core library that provides the array operations needed for this.

Why NumPy?

Other Data Science libraries such as Pandas, Matplotlib, and Scikit-learn are built on top of NumPy because of its performance. The library offers the ndarray, which is used in place of built-in Python lists. A Python list is convenient for storing values, but it holds pointers to separately allocated objects rather than the values themselves, which adds overhead in both memory and execution time. An ndarray instead stores elements of a single data type, typically in one contiguous block of memory.
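As a rough illustration of the difference, the sketch below doubles the same values once with a plain list and once with an array. The variable names are illustrative only.

```python
import numpy as np

values = list(range(5))

# Plain Python list: element-wise work needs an explicit loop or comprehension.
doubled_list = [v * 2 for v in values]

# ndarray: the operation is applied to every element in one expression,
# and all elements share a single fixed-size data type.
arr = np.array(values)
doubled_arr = arr * 2

print(doubled_list)           # [0, 2, 4, 6, 8]
print(doubled_arr.tolist())   # [0, 2, 4, 6, 8]
print(arr.nbytes == arr.itemsize * arr.size)  # True
```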

NumPy is used directly in image processing, where images are handled as multidimensional arrays for tasks such as blurring, color changes, and object detection.

Let’s look at some important NumPy functions that every Data Science aspirant should know. Before that, two notes:

  1. All code below assumes that the library has been imported with the alias np, i.e., import numpy as np.
  2. The term “array” refers to an ndarray from this point on.

Different Ways to Initialize

1. Linearly Spaced

These functions create arrays of regularly spaced values within a range. np.linspace(start, stop, num) returns num evenly spaced samples between start and stop (stop is included by default), whereas np.arange(start, stop, step) returns values from start up to but not including stop, each differing from the previous one by step.
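A minimal sketch of the two calls side by side; the ranges are arbitrary example values.

```python
import numpy as np

# linspace: 5 evenly spaced samples from 0 to 1, stop included by default.
a = np.linspace(0, 1, 5)
print(a)  # [0.   0.25 0.5  0.75 1.  ]

# arange: values from 0 up to (but not including) 10 in steps of 2.
b = np.arange(0, 10, 2)
print(b)  # [0 2 4 6 8]
```

Use linspace when the number of samples matters and arange when the step size matters.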

2. Particular Values

In many instances, you may want to initialize a large matrix with values such as ones, zeros, the identity, or a constant. For most of these functions the shape of the array is passed as a tuple. NumPy provides several functions for this:

    • np.zeros(size): Elements are zero
    • np.ones(size): Elements are ones
    • np.full(size, constant value): Elements are the constant value passed.
    • np.eye(n): Diagonal elements are ones and the rest are zero. This is the identity matrix. Here n is an integer (the number of rows), not a tuple.
    • np.empty(size): Uninitialized array. Its contents are arbitrary until you assign values.
    • np.random.random(size): Array of the specified size filled with random floats in the range [0, 1).
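The initializers listed above can be sketched as follows, assuming an arbitrary example shape of 2 rows by 3 columns.

```python
import numpy as np

shape = (2, 3)  # example shape: 2 rows, 3 columns

zeros = np.zeros(shape)          # 2x3 array of 0.0
ones = np.ones(shape)            # 2x3 array of 1.0
sevens = np.full(shape, 7)       # 2x3 array filled with the constant 7
identity = np.eye(3)             # 3x3 identity matrix (takes an int, not a tuple)
scratch = np.empty(shape)        # 2x3 array with uninitialized contents
rand = np.random.random(shape)   # 2x3 array of floats in [0, 1)

print(sevens)
print(identity)
```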

Ndarray Operations

The main purpose of this library is efficient computation on arrays. NumPy supports a wide range of mathematical operations and manipulations, most of which are applied element-wise. Some of them are listed below (assume two arrays A and B of the same shape, initialized with random values):

1. Mathematical

  • np.add(A, B): Addition 
  • np.subtract(A, B): Subtraction
  • np.divide(A, B): Division
  • np.multiply(A, B): Multiplication
  • np.exp(A): Exponential values
  • np.sqrt(A): Square Root values
  • np.sin(A), np.cos(A), np.tan(A): Trigonometric values
  • np.log(A): Logarithmic values
  • np.percentile(A, q, axis): Returns the q-th percentile along the given axis. For example, q=50 returns the 50th percentile (the median) of the array.
  • A.dot(B): Returns dot product of the arrays
  • A == B: Element wise comparison
  • np.array_equal(A, B): Array wise comparison 
  • A.sum(): Sum of all elements 
  • A.min(), A.max(): Minimum and Maximum values
  • A.cumsum(): Cumulative Sum of elements of the array
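  • np.corrcoef(A): Correlation coefficient matrix. This is a NumPy function rather than an array method, and each row of A is treated as one variable.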
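A short sketch of a few of these operations. It uses two small fixed arrays instead of random values so that the results are easy to check by hand.

```python
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0]])
B = np.array([[5.0, 6.0], [7.0, 8.0]])

print(np.add(A, B))               # element-wise sum, same as A + B
print(np.multiply(A, B))          # element-wise product, same as A * B
print(A.dot(B))                   # matrix product for 2-D arrays, same as A @ B
print(A == B)                     # boolean array, one entry per element
print(np.array_equal(A, B))       # False
print(A.sum(), A.min(), A.max())  # 10.0 1.0 4.0
print(A.cumsum())                 # [ 1.  3.  6. 10.]
print(np.percentile(A, 50))       # 2.5
print(np.corrcoef(A))             # 2x2 matrix; each row of A is one variable
```

Note the difference between np.multiply(A, B), which multiplies matching elements, and A.dot(B), which performs a matrix product on 2-D inputs.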

2. Manipulations

  • np.transpose(A) or A.T: Transpose of the matrix
  • A.ravel(): Flattens the array
  • A.reshape(new_shape): Reshapes the array. The shape can be passed as a tuple or as separate integers, and the new shape must contain the same total number of elements as the original array.
  • A.resize(size): Changes the shape and size of the array in place. Elements that do not fit in the new shape are discarded, and a larger shape is padded with zeros.
  • np.concatenate((A,B), axis=0 or 1): Joins the arrays along an existing axis (0 for rows, 1 for columns)
  • np.vstack((A,B)): Stack the arrays vertically (row-wise)
  • np.hstack((A,B)): Stack the arrays horizontally (column-wise)


Miscellaneous Functions

  • np.where(): A vectorised alternative to element-wise if-else statements. It takes three arguments: the condition, the value to use where the condition is satisfied, and the value to use where it is not. A simple example is binarising a column against a given threshold.
  • np.intersect1d(): Returns the intersection of two arrays, that is, the sorted, unique values common to both. If the arrays passed are not 1-d, they are flattened first.
  • np.allclose(): There are situations where you can afford some tolerance when comparing arrays, or you want to find arrays that are nearly equal. This function takes the two arrays plus a relative tolerance (rtol) and an absolute tolerance (atol). The third positional argument is rtol, so an absolute tolerance should be passed by name. For instance:
    • arr1 = np.array([1,2,3,4])
    • arr2 = np.array([2,3,4,5])
    • np.allclose(arr1, arr2, rtol=0, atol=0.5): will return False
    • np.allclose(arr1, arr2, rtol=0, atol=1): will return True
  • argmin(), argmax(), and argsort(): As the names suggest, these functions return indices rather than values. A.argmin() returns the index of the minimum element, A.argmax() the index of the maximum element, and A.argsort() the indices that would sort the array. These functions come in handy where the result depends on the index.
  • np.clip(): This is used to limit values to a specific range. For instance, if an array has values from 1 to 30 and you want them kept between 14 and 27 without dropping any elements, this function raises the values less than 14 to 14 and lowers the values greater than 27 to 27.
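The functions in this list can be sketched on a small array. The scores array and the threshold of 20 are made-up example values.

```python
import numpy as np

scores = np.array([12, 45, 7, 30, 22])

# where: binarise against a threshold (1 if score >= 20, else 0).
flags = np.where(scores >= 20, 1, 0)
print(flags)                              # [0 1 0 1 1]

# intersect1d: sorted, unique values present in both arrays.
common = np.intersect1d([1, 2, 3, 4], [3, 4, 5])
print(common)                             # [3 4]

# allclose: the third positional argument is rtol, so atol is passed by name.
arr1 = np.array([1, 2, 3, 4])
arr2 = np.array([2, 3, 4, 5])
print(np.allclose(arr1, arr2, rtol=0, atol=0.5))  # False
print(np.allclose(arr1, arr2, rtol=0, atol=1))    # True

# argmin / argmax / argsort return positions, not values.
print(scores.argmin(), scores.argmax())   # 2 1
order = scores.argsort()
print(order)                              # [2 0 4 3 1]
print(scores[order])                      # [ 7 12 22 30 45]

# clip: values below 14 become 14, values above 27 become 27.
clipped = np.clip(scores, 14, 27)
print(clipped)                            # [14 27 14 27 22]
```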

Conclusion

There are many more functions for operating on NumPy arrays, but these are among the most used. Operations applied to Pandas columns or Series are, in most cases, carried out on an ndarray, since a Series typically wraps a one-dimensional ndarray.

NumPy is a great tool for generating synthetic data to test specific algorithms or simulate a scenario. It is used extensively in computer vision, where images are represented as multidimensional arrays on which the required operations are performed, and in deep learning, where neural network weights are held in such arrays.
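As one possible way to generate such test data, the sketch below uses NumPy's random Generator. The line parameters, sample count, and image size are made-up values chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # seeded so runs are repeatable

# 100 samples of a noisy straight line, y = 3x + 2 + noise (made-up parameters).
x = rng.uniform(0, 10, size=100)
y = 3 * x + 2 + rng.normal(0, 1, size=100)

# A synthetic 32x32 RGB "image" with 8-bit pixel values.
image = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)

print(x.shape, y.shape, image.shape)  # (100,) (100,) (32, 32, 3)
```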

Rohit Sharma

Blog Author
Rohit Sharma is the Program Director for the UpGrad-IIIT Bangalore, PG Diploma Data Analytics Program.
