The Six Most Commonly Used Data Structures in R

Last updated:
3rd Feb, 2020

If you write software, you already know that programs need variables to store data. Each variable reserves a location in memory to hold its value, so creating a variable means setting aside some memory. Data structures organize that stored data so that it can be used efficiently.

Unlike languages such as C and Java, R does not require you to declare a variable with a data type. Instead, a variable is assigned an R object (a data structure), and the type of that object becomes the type of the variable. There are various types of data structures in R. But first, let’s understand what data structures are!

What are Data Structures?

In R, a data structure is a container that holds multiple values. Single, isolated values are rarely used in R programming; even a lone number is stored as a vector of length one. It is far more common to group multiple numbers, words, or values of different types together. This is where data structures come into the picture: they group these values so that it is easier to work with large amounts of data at once.

Data structures are composed of data types, which define the kind of data stored in a value. For instance, the number 13 has a numeric data type, while “thirteen” has a character data type, also called a string.
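As a quick illustration of data types, the sketch below asks R for the type of a few values with class(). The values 13 and "thirteen" come from the text; the integer and logical values are extra examples added here.

```r
# class() reports the data type of a value
class(13)          # "numeric"
class("thirteen")  # "character"
class(13L)         # "integer" (the L suffix marks an integer literal)
class(TRUE)        # "logical"
```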

Now that you’ve got a hold of this, let’s see the different data structure types.

Types of Data Structures

To make data analysis and operations easy and efficient, R programming offers six major types of data structures.

Let’s take a look at each of them in detail.

  • Vector

Vectors are the most basic type of data structure in R. They come in two flavours: Atomic Vectors and Lists. Both share three common properties:

  • Type, typeof() (what it is)
  • Length, length() (number of elements)
  • Attributes, attributes() (additional arbitrary metadata)

Atomic Vectors group values of the same data type, whereas Lists can group different data types. The four most common types of Atomic Vectors are listed below (R also has complex and raw vectors, which are rarely used):

  • Numeric (double) Data Type
  • Integer Data Type
  • Character Data Type
  • Logical Data Type
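The three properties and the four common atomic types can be inspected directly. This is a minimal sketch; the vector name v and its values are made up for illustration.

```r
# A named numeric (double) vector
v <- c(a = 1.5, b = 2.5, c = 3.5)

typeof(v)      # "double"  -> what it is
length(v)      # 3         -> number of elements
attributes(v)  # the names attribute: "a" "b" "c"

# The other common atomic types
typeof(1:3)             # "integer"
typeof(c("x", "y"))     # "character"
typeof(c(TRUE, FALSE))  # "logical"
```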

You can create Vectors using the function c().

For example, thisVector <- c(1:30) creates a vector named ‘thisVector’ containing all the integers from 1 to 30.
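The original code listing did not survive in this copy of the article, so the following is a reconstruction that assumes the vector was built with c() and the colon operator, as the description suggests.

```r
# Create a vector holding the integers 1 to 30
# (1:30 alone already yields a vector; c() is kept to match the text)
thisVector <- c(1:30)

length(thisVector)  # 30
thisVector[1]       # 1
thisVector[30]      # 30
```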

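To store character values in a vector, enclose each value in quotes (single or double), for example c("red", "green", "blue").

Although c() accepts values of different types, an atomic vector can hold only one type, so R silently coerces mixed values to the most flexible type present. If any value is a character string, every value is converted to character, so mixing types is best avoided.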
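A short sketch of character vectors and of what happens when types are mixed. The names fruits and mixed, and their contents, are illustrative only.

```r
# A character vector: each value is quoted
fruits <- c("apple", "banana", "cherry")
typeof(fruits)  # "character"

# Mixing types: every value is coerced to character
mixed <- c(1, TRUE, "a")
mixed           # "1" "TRUE" "a"
typeof(mixed)   # "character"

# Without a string, logical values are coerced to numbers instead
typeof(c(1, TRUE))  # "double"
```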

  • Lists

As mentioned above, Lists can contain any type of data element: strings, numbers, vectors, and even other lists. For example, a single list could hold 80 numbers, 30 words, and 42 vectors. Lists are created with the list() function.

Because Lists can contain other lists, they are sometimes called recursive vectors. This sets them apart from Atomic Vectors, which cannot be nested.
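The article's own list example is missing from this copy, so here is a small stand-in. The name thisList and its contents are hypothetical; the point is that one list holds a number, a string, a logical vector, and another list.

```r
# A list mixing several types, including a nested list
thisList <- list(1.5, "apple", c(TRUE, FALSE), list("nested", 42))

length(thisList)    # 4
str(thisList)       # prints a compact summary of the structure

thisList[[2]]       # "apple"  ([[ ]] extracts a single element)
thisList[[4]][[2]]  # 42       (reaching into the nested list)

is.list(thisList)   # TRUE
```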

  • Factors

Simply put, a factor is a type of vector that can store only predefined values, known as levels. Factors are primarily used to store categorical data, such as “Male” and “Female” or “TRUE” and “FALSE”.

Internally, a factor is an integer vector with a character label attached to each level, which is why it can look like a character vector while being stored as integers. To create factors, use the factor() function. They are most useful when you know every possible value a variable can take, even if some of those values do not appear in the data.

In versions of R before 4.0.0, character vectors were automatically converted to factors when a data frame was created or imported. You can pass stringsAsFactors = FALSE to suppress this and then manually convert only the columns you want to factors. Since R 4.0.0, stringsAsFactors = FALSE is the default.
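The sketch below creates a factor with explicit levels and then shows the stringsAsFactors argument in a data frame. The names sizes and df and the sample values are assumptions made for illustration.

```r
# A factor with a fixed, predefined set of levels
sizes <- factor(c("small", "large", "small", "medium"),
                levels = c("small", "medium", "large"))

levels(sizes)      # "small" "medium" "large"
as.integer(sizes)  # 1 3 1 2  (the integer codes behind the labels)
typeof(sizes)      # "integer"

# Keep strings as characters, then convert one column by hand
df <- data.frame(size = c("small", "large"), stringsAsFactors = FALSE)
class(df$size)     # "character"
df$size <- factor(df$size)
class(df$size)     # "factor"
```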

  • Data Frames

This data structure represents data in tabular form to make data analysis easier. A data frame is a collection of equal-length vectors, which gives it a two-dimensional structure: each column holds the values of one variable, and each row holds one value from each column.

Different columns can store different data types, but every value within a column has the same type. Each column must also have the same number of elements. For example, if column 1 has 5 elements, column 2 must also have 5 elements.

Data frames have some special characteristics:

  • Column names should not be empty.
  • Each row’s name must be unique.
  • You can store numeric, factor, or character type data in a data frame.
  • All columns must contain the same number of data elements.

Datasets imported with functions such as read.csv() or read.table() are automatically stored as data frames.
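A minimal data frame sketch, using made-up sample data (the name employees and its columns are not from the article). Each column is a vector of length 3, and the columns have different types.

```r
# Three equal-length columns of different types
employees <- data.frame(
  name   = c("Asha", "Ben", "Chen"),
  age    = c(31, 45, 28),
  remote = c(TRUE, FALSE, TRUE),
  stringsAsFactors = FALSE
)

dim(employees)            # 3 3  (rows, columns)
names(employees)          # "name" "age" "remote"
employees$age             # 31 45 28
sapply(employees, class)  # the type of each column
str(employees)            # compact overview of the whole data frame
```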

  • Matrices

The matrix data structure in R sits somewhere between Vectors and Data Frames. A matrix is a two-dimensional data set whose elements must all be of the same data type. You can create a matrix using the matrix() function.

Syntax: matrix(data, nrow, ncol, byrow, dimnames)

Here,

data = input elements as a vector

nrow = number of rows

ncol = number of columns

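byrow = logical value; if TRUE the matrix is filled row by row, otherwise column by column (the default)

dimnames = a list holding the row names and the column names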
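To see all five arguments at work, here is a small sketch. The matrix name m and the row and column labels are chosen for illustration only.

```r
# A 2 x 3 matrix filled row by row, with named rows and columns
m <- matrix(1:6, nrow = 2, ncol = 3, byrow = TRUE,
            dimnames = list(c("r1", "r2"), c("c1", "c2", "c3")))

m
#    c1 c2 c3
# r1  1  2  3
# r2  4  5  6

dim(m)         # 2 3
m["r2", "c3"]  # 6
m[1, ]         # the first row: 1 2 3 (with column names)
```

With byrow = FALSE (the default), the same data would fill the first column with 1 and 2, the second with 3 and 4, and so on.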

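A note on factors: even though they look, and often behave, like character vectors, they are in fact stored as integers, so be careful when treating them as strings. To convert a factor to strings explicitly, use as.character(). Some string functions, such as gsub() and grepl(), coerce factors to strings automatically, while others, such as nchar(), throw an error.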
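The difference between these string functions can be checked with a small factor. The name f and its values are hypothetical, and tryCatch() is used here only so the error from nchar() does not stop the script.

```r
f <- factor(c("low", "high", "low"))

as.integer(f)    # 2 1 2  (levels are sorted: "high", "low")
as.character(f)  # "low" "high" "low"

# gsub() coerces the factor to character before working on it
gsub("low", "LOW", f)  # "LOW" "high" "LOW"

# nchar() refuses a factor; capture the error instead of stopping
result <- tryCatch(nchar(f), error = function(e) "error")
result                 # "error"

# Converting first makes nchar() work
nchar(as.character(f))  # 3 4 3
```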

  • Arrays

Arrays extend matrices to any number of dimensions. A matrix is the special case of an array that has exactly two dimensions. While matrices are commonly used, arrays with three or more dimensions are much rarer.

Arrays are created with the array() function.

To test whether an object is a matrix or an array, use the is.matrix() or is.array() function. Note that is.array() also returns TRUE for a matrix, since every matrix is an array.
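A brief sketch of creating a three-dimensional array and testing both kinds of object. The names a and m2 and the dimensions are illustrative choices.

```r
# A 2 x 3 x 4 array holding the numbers 1 to 24
a <- array(1:24, dim = c(2, 3, 4))

dim(a)        # 2 3 4
a[1, 2, 3]    # 15  (row 1, column 2, third 2 x 3 slice)

is.array(a)   # TRUE
is.matrix(a)  # FALSE (it has three dimensions, not two)

# A matrix passes both tests
m2 <- matrix(1:6, nrow = 2)
is.matrix(m2) # TRUE
is.array(m2)  # TRUE
```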

Exercises

Here are some questions that you can try answering now that you’ve acquired sufficient knowledge about the data structures in R.

  1. What are the attributes of data frames?
  2. Can data frames contain 0 rows or columns?
  3. What are the different types of Atomic Vectors in R?
  4. What is the difference between Atomic Vectors and Lists?
  5. Create a 4X3 matrix in R.

Send your answers to us via email or write them in the comments below!

Conclusion

To use R effectively, you need a solid understanding of data types, data structures, and how they work, because they underpin everything you do in the language. For instance, a common source of trouble is converting one kind of object into another, and a good knowledge of R objects helps you avoid it. It is worth remembering that in R everything is an object and every operation is a function call.

Data structures in R can be classified in two ways. The first is by dimensionality: one-dimensional, two-dimensional, or n-dimensional. The second is by the nature of their elements: homogeneous or heterogeneous. All the elements of a homogeneous structure must be of the same type, while a heterogeneous structure allows elements of different types.

Once you have learned the basics of data structures, you will find programming in R much easier. The six most commonly used data structures are covered above. Remember the characteristics of each one and choose the structure that suits the data you want to analyze and the operations you want to carry out.

Profile

Rohit Sharma

Blog Author
Rohit Sharma is the Program Director for the UpGrad-IIIT Bangalore, PG Diploma Data Analytics Program.
1. Which is better, R or Python?
R is a language for statistical computing and graphics. It is part of the GNU Project and is free, open-source software. In contrast, Python is a general-purpose programming language used in software development, web development, system scripting, and data analysis. R can be difficult for beginners because its syntax and conventions are less standardized, and Python is usually easier for most learners. Python code also tends to take less time to write and maintain, since its syntax reads much like English. Python is known for libraries such as SciPy, scikit-learn, and NumPy, while R offers a very large collection of packages as well. Python has visualization libraries such as Pygal, Seaborn, and Bokeh, but many believe R is more flexible for graphics.
2. How to view vectors in R programming?
Vector elements are accessed using an index vector, which can be character, numeric, or logical. You can view an individual element by giving its position in square brackets. In R, the first element has an index of 1. For example, colors[7] returns the 7th element of a vector named colors. You can access several elements at once by placing a vector of indices inside the square brackets, and you can modify elements using the same syntax you use to access them.
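These indexing styles can be sketched as follows. The contents of the colors vector are assumed, since the article names the vector but does not list its values; the named vector scores is also hypothetical.

```r
colors <- c("red", "orange", "yellow", "green", "blue", "indigo", "violet")

colors[7]                  # "violet"        (single position)
colors[c(1, 3)]            # "red" "yellow"  (several positions)
colors[colors == "green"]  # "green"         (logical index)

# Character index on a named vector
scores <- c(first = 10, second = 20)
scores["second"]           # 20, with the name "second" attached

# The same syntax modifies an element
colors[7] <- "purple"
colors[7]                  # "purple"
```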
3. How do I index a factor in R?
Factors are data structures in R that store categorical data. In many datasets, some fields take only a few preset values, for example availability, gender, marital status, or country. Such data is called categorical data. You can access the elements of a factor with the same indexing techniques you use for a vector. First, you can index a factor with a positive integer or a vector of positive integers. Second, you can use a negative integer or a vector of negative integers to exclude certain elements. Finally, you can index a factor with a logical vector.
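The three indexing techniques look like this on a small factor. The name status and its values are made up for illustration.

```r
status <- factor(c("single", "married", "married", "divorced", "single"))

status[2]                    # married           (positive integer)
status[c(1, 4)]              # single divorced   (vector of positive integers)
status[-1]                   # everything except the first element
status[status == "married"]  # married married   (logical vector)

# Subsetting keeps all the original levels
levels(status[2])            # "divorced" "married" "single"
```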