The Six Most Commonly Used Data Structures in R

Last updated:
3rd Feb, 2020

As a programmer, you already know that variables are needed to store data. Each variable reserves a location in memory to hold its value, so creating a variable means reserving some space in memory. Data structures arrange this data so that it can be used efficiently.

Unlike languages such as C and Java, R does not require you to declare a variable with a data type. Instead, a variable is assigned an R object (a data structure), and the type of that object becomes the type of the variable. There are various types of data structures in R. But first, let’s understand what data structures are!

What are Data Structures?

In R, a data structure is a container that holds multiple values. Standalone single values are rarely used in R programming; in fact, even a single number is stored as a vector of length one. It is far more common to group multiple numbers, words, or values of different types together. This is where data structures come into the picture. They group these multiple values together to make it easier to work with large amounts of data at once.

Data structures are composed of data types that define the kind of data stored in a value. For instance, the number 13 has the numeric data type, while “thirteen” has the character data type, also called a string.

Now that you’ve got a hold of this, let’s see the different data structure types.

Types of Data Structures

To make data analysis and operations easy and efficient, R programming offers six major types of data structures.

Let’s take a look at each of them in detail.

  • Vector

Vectors group multiple values together and are the most basic type of data structure in R. They come in two flavors: Atomic Vectors and Lists. Both share three common properties:

  • Type, returned by typeof() (what it is)
  • Length, returned by length() (number of elements)
  • Attributes, returned by attributes() (additional arbitrary metadata)

Atomic Vectors group values of the same data type, while Lists can group values of different data types. There are four common types of Atomic Vectors (two further types, complex and raw, are rarely used):

  • Numeric Data Type
  • Integer Data Type
  • Character Data Type
  • Logical Data Type
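As a minimal sketch of these four types, the snippet below builds one small vector of each with c() and inspects the three common properties. The variable names and values are made up for illustration.

```r
# One example of each of the four common atomic vector types
num_vec <- c(1.5, 2, 3.25)        # numeric (stored as double)
int_vec <- c(1L, 2L, 3L)          # integer (note the L suffix)
chr_vec <- c("a", "b", "c")       # character
lgl_vec <- c(TRUE, FALSE, TRUE)   # logical

typeof(num_vec)      # "double"
typeof(int_vec)      # "integer"
typeof(chr_vec)      # "character"
typeof(lgl_vec)      # "logical"

length(num_vec)      # 3
attributes(num_vec)  # NULL, because no attributes have been set
```

Note that typeof() reports "double" for numeric vectors, since that is how R stores them internally.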

You can create Vectors using the function c().

For example, passing the sequence 1:30 to c() and assigning the result to the name ‘thisVector’ creates a vector containing all whole numbers from 1 to 30.
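The original listing is not available here, so the following is only a sketch of what such code might look like, assuming the sequence operator 1:30 is used to generate the numbers.

```r
# Build a vector holding the whole numbers 1 through 30
thisVector <- c(1:30)

length(thisVector)   # 30
thisVector[1:5]      # 1 2 3 4 5
typeof(thisVector)   # "integer", because 1:30 produces integers
```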

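To store character values in a Vector, enclose each value in quotes (double quotes are the convention).

While c() accepts values of different types, an atomic vector can hold only one type, so R silently converts the values to the most flexible type present. If any value is a character string, every value becomes a character string. It is therefore best not to mix types in a vector.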
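A short sketch of both points, using made-up values: a character vector written with double quotes, and the type conversion (coercion) that happens when types are mixed.

```r
# A character vector: each value is wrapped in double quotes
fruits <- c("apple", "banana", "cherry")
typeof(fruits)    # "character"

# Mixing types: everything is coerced to character
mixed <- c(1, "two", TRUE)
mixed             # "1" "two" "TRUE"
typeof(mixed)     # "character"

# Without strings, logical and integer values are promoted to double
promoted <- c(1L, 2.5, TRUE)
promoted          # 1.0 2.5 1.0
typeof(promoted)  # "double"
```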

  • Lists

As mentioned above, Lists can contain any type of data elements – strings, numbers, vectors, and even another list. For example, you can create a list of 80 numbers, 30 words, and 42 vectors. The function to use is list().

Since Lists can have other lists as well, they are sometimes called recursive Vectors. This is why they’re very different from Atomic Vectors.
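As an illustration (the names and values are hypothetical), the sketch below builds a list that mixes several types, including a nested list, and then pulls elements back out.

```r
# A list can mix types and even contain another list
myList <- list(
  numbers = c(1, 2, 3),
  word    = "hello",
  flag    = TRUE,
  nested  = list(a = 1L, b = "inner")
)

str(myList)            # compact overview of the structure
length(myList)         # 4 top-level elements

myList$word            # "hello"
myList[["numbers"]]    # 1 2 3
myList$nested$b        # "inner"
is.list(myList$nested) # TRUE, a list inside a list
```

Double square brackets or $ return the element itself, while single square brackets return a shorter list.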

  • Factors

Simply put, a factor is a type of vector where only predefined values can be stored. Factors are primarily used to store categorical data, such as column values like “Male”, “Female”, “TRUE”, “FALSE”, etc.

Factors are built on top of integer vectors: each value is stored as an integer code that points to one of a fixed set of character labels, called levels. To create factors, use the factor() function. They are most useful when a variable can take only a known set of possible values, even if not all of them appear in the data.

In versions of R before 4.0.0, character vectors are automatically converted into factors when a data frame is created or imported. You can use stringsAsFactors = FALSE to suppress this and then manually convert only the columns you want into factors. Since R 4.0.0, stringsAsFactors defaults to FALSE.
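The following sketch, with made-up size categories, shows how a factor stores predefined levels as integer codes and what happens to a value outside those levels.

```r
# Create a factor with an explicit set of allowed levels
sizes <- factor(c("small", "large", "medium", "small"),
                levels = c("small", "medium", "large"))

levels(sizes)        # "small" "medium" "large"
as.integer(sizes)    # 1 3 2 1, the underlying integer codes
typeof(sizes)        # "integer"
table(sizes)         # counts per level

# A value that is not among the predefined levels becomes NA
other <- factor(c("small", "huge"),
                levels = c("small", "medium", "large"))
is.na(other)         # FALSE TRUE

# Convert back to plain strings when needed
as.character(sizes)  # "small" "large" "medium" "small"
```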

  • Data Frames

This data structure in R is used to represent data in a tabular form to make data analysis easier. It is made up of equal-length vectors, thus forming a two-dimensional structure. Each column holds the values of one variable, and each row holds one value from every column.

Because each column is a separate vector, different columns of a data frame can hold different data types. However, each column must have the same number of elements. For example, if column 1 has 5 elements, column 2 should also have 5 values.

Data frames have some special characteristics:

  • No column names should be left empty.
  • Each row’s name must be unique.
  • You can store numeric, factor, or character type data in a data frame.
  • All columns must contain the same number of data elements.

Datasets imported into R with functions such as read.csv() or read.table() are automatically stored as data frames.
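To make these characteristics concrete, here is a small sketch with a hypothetical students table. It creates a data frame from three equal-length vectors of different types and shows that columns of mismatched length are rejected.

```r
# Three equal-length vectors of different types become three columns
students <- data.frame(
  name   = c("Asha", "Ben", "Chen"),
  age    = c(23, 31, 27),
  passed = c(TRUE, FALSE, TRUE),
  stringsAsFactors = FALSE   # keep name as character on any R version
)

str(students)     # one line per column, with its type
dim(students)     # 3 3 (rows, columns)
students$age      # 23 31 27
students[2, ]     # the second row, as a one-row data frame

# Columns with 3 and 2 elements cannot be combined
result <- try(data.frame(a = 1:3, b = 1:2), silent = TRUE)
inherits(result, "try-error")   # TRUE
```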

  • Matrices

The Matrix data structure in R stands somewhere between Vectors and Data Frames. Matrices are two-dimensional data sets whose elements must all be of the same data type. You can create a matrix using the function matrix().

Syntax: matrix(data, nrow, ncol, byrow, dimnames)

Here,

data = input elements as a vector

nrow = number of rows

ncol = number of columns

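byrow = logical value; if TRUE, the matrix is filled row by row (the default, FALSE, fills it column by column)

dimnames = a list holding the row names and the column names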
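As a sketch of how these arguments fit together (the data and the row and column names are invented for illustration):

```r
# Fill a 2 x 3 matrix row by row and name its rows and columns
m <- matrix(1:6, nrow = 2, ncol = 3, byrow = TRUE,
            dimnames = list(c("r1", "r2"), c("a", "b", "c")))
m
#    a b c
# r1 1 2 3
# r2 4 5 6

dim(m)          # 2 3
m["r2", "b"]    # 5

# With the default byrow = FALSE, values fill column by column
m2 <- matrix(1:6, nrow = 2)
m2[1, 2]        # 3
```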

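A note on factors: even though they look and often behave like character vectors, they are, in fact, integers, so be careful when treating them as strings. Some string functions, such as gsub() and grepl(), coerce factors to strings, while others, such as nchar(), throw an error. To be safe, convert a factor explicitly with as.character() before applying string functions.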

  • Arrays

Arrays are multi-dimensional generalizations of matrices. A matrix is a special case of an array in that it has exactly two dimensions. While matrices are commonly used, arrays with more dimensions are much less common.

The function to create an array is array().

Testing whether an object is a matrix or array is pretty simple. Just use the is.matrix() or is.array() function.
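A brief sketch with arbitrary dimensions shows array() together with the two test functions:

```r
# A three-dimensional array: 2 rows, 3 columns, 4 layers
arr <- array(1:24, dim = c(2, 3, 4))

dim(arr)         # 2 3 4
arr[1, 2, 3]     # 15 (row 1, column 2, layer 3)

is.array(arr)    # TRUE
is.matrix(arr)   # FALSE, because it has three dimensions

# A matrix is an array with exactly two dimensions
mat <- matrix(1:6, nrow = 2)
is.matrix(mat)   # TRUE
is.array(mat)    # TRUE
```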

Exercises

Here are some questions that you can try answering now that you’ve acquired sufficient knowledge about the data structures in R.

  1. What are the attributes of data frames?
  2. Can data frames contain 0 rows or columns?
  3. What are the different types of Atomic Vectors in R?
  4. What is the difference between Atomic Vectors and Lists?
  5. Create a 4X3 matrix in R.

Send your answers to us via email or write them in the comments below!

Conclusion

To use the R language well, a sound understanding of data types, data structures, and how they work is essential. These objects are the basis of everything you do in R. For instance, a common problem for many programmers is converting objects from one type to another, which becomes much easier to handle with a good knowledge of R objects. It is important to note that in R everything is an object and every operation is a function call.

Data structures in R can be organized in two ways. The first is by dimensionality: one-dimensional, two-dimensional, or n-dimensional. The second is by the nature of their elements: homogeneous or heterogeneous. All elements in a homogeneous structure (atomic vectors, matrices, and arrays) must be of the same type, while a heterogeneous structure (lists and data frames) allows elements of different types.

Once you have learned the basics of data structures in R, you will find programming in R much easier. Data structures are the foundation of R, and the six most commonly used ones are covered above. It is important to remember the characteristics of each type and to choose the right one when analyzing data.

Rohit Sharma

Blog Author
Rohit Sharma is the Program Director for the UpGrad-IIIT Bangalore, PG Diploma Data Analytics Program.
Frequently Asked Questions (FAQs)

1. Which is better, R or Python?

R is a language for statistical computing and graphics. It is part of the GNU Project and is free, open-source software released under the GNU General Public License. In contrast, Python is a general-purpose programming language used in software development, web development, system scripting, and data analysis. R can be difficult for beginners because its code conventions are less standardized, and Python is usually easier for most learners. Python also tends to require less coding time, since its code is easy to maintain and its syntax reads much like English. Python is known for libraries such as SciPy, scikit-learn, and NumPy, though R has a very large collection of statistics-focused packages. Python offers visualization libraries such as Pygal, Seaborn, and Bokeh, but many believe R is more flexible.

2. How to view vectors in R programming?

Vector elements are accessed using an index vector, which can be character, numeric, or logical. You can view an individual element of a vector by giving its position in square brackets. In R, the first element has an index of 1. For example, to obtain the 7th element of a vector named colors, write colors[7]. You can also modify the elements of a vector using the same indexing you use to access them, and you can access several elements at once by placing a vector of indices inside the square brackets.
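The sketch below illustrates these indexing styles. The contents of the colors vector are an assumption made for the example.

```r
colors <- c("red", "orange", "yellow", "green", "blue", "indigo", "violet")

colors[7]                   # "violet", the 7th element
colors[c(1, 3)]             # "red" "yellow", several positions at once
colors[-1]                  # everything except the first element
colors[colors == "green"]   # "green", selected with a logical vector

# Character indexing works on named vectors
counts <- c(a = 10, b = 20)
counts["b"]                 # 20, with its name b

# The same indexing is used to modify an element
colors[2] <- "amber"
colors[2]                   # "amber"
```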

3. How do I index a factor in R?

Factors are data structures in R that store categorical data. In many datasets, some fields take only a few preset values, for example availability, gender, marital status, or country. Such data is called categorical data. We can use the same indexing techniques as for a vector to access the elements of a factor. Firstly, we can index factors by using positive integers or vectors of positive integers. Secondly, we can use negative integers or vectors of negative integers to exclude certain elements from the factor. Finally, we can index factors by using logical vectors.
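Here is a minimal sketch of the three techniques, using a hypothetical marital status factor:

```r
status <- factor(c("single", "married", "single", "divorced"))

# 1. Positive integers select elements by position
status[2]         # married
status[c(1, 3)]   # single single

# 2. Negative integers exclude elements
status[-2]        # single single divorced

# 3. Logical vectors keep the elements where the condition is TRUE
status[status == "single"]   # single single

# Indexing keeps the full set of levels unless drop = TRUE is used
levels(status[2])                # "divorced" "married" "single"
levels(status[2, drop = TRUE])   # "married"
```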
