R Cheat Sheet: The One You Should Keep Handy

Introduction

R has grown from a language made purely for statistical analysis into a more capable all-round tool. Its user base has also grown over the past few years, and it is now used by a host of programmers, scholars, and practitioners. To make the most of any programming language, knowing how to get help is essential, because errors are bound to happen.

So, alongside a knowledge of syntax, knowing how to access the R help files and find help from other sources is critical for success as an R programmer. This is where an R cheat sheet comes in handy: it collects the vital functions and their calls in one place for easy reference.

Getting help with the programming language R

Even the best introductory books on R are not enough on their own. Sooner or later you need to consult the R help files. These help files give detailed information on how to use each built-in function, and each help page also includes code examples showing the function in use.

To access the R help files and get help on a particular feature, use any one of the functions listed below:

1. ?: A single question mark displays the help file for the function you name. For example, ?data.frame opens the help page that documents the function data.frame().

2. ??: If you want to search the R help files for a particular term, ?? will do the job. So, if you want to find the help pages whose names or titles contain the word "list", all you have to do is run ??list.

3. RSiteSearch(): As its name suggests, this function runs an online search for the query passed as its argument. So, RSiteSearch("linear models") searches the R site search for the string "linear models" and shows the results in your web browser.
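The help functions above can be tried from a script as well as from the console. The following is a minimal sketch, assuming a standard R installation with its help files; the variable names topic_help and found_topic are only illustrative. It uses help() and help.search(), the functions behind ? and ??, and runs the parts that open a viewer or a browser only in an interactive session.

```r
# help() is the function behind the ? operator; assigning the result
# avoids opening the help viewer right away.
topic_help <- help("data.frame")

# The result records which installed help file(s) document the topic.
found_topic <- length(topic_help) > 0

# help.search() is the function behind ??; RSiteSearch() needs a browser and
# an internet connection, so both run only in an interactive session.
if (interactive()) {
  print(help.search("list"))
  RSiteSearch("linear models")
}
```

Printing topic_help in the console has the same effect as typing ?data.frame.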

If the built-in documentation is not enough, there are add-on packages you can install to get more help with R. One of them is "sos", which is available for download from CRAN. This package contains clear and concise functions for running queries against the help files indexed by the "RSiteSearch" website.

Installing the package is straightforward. Run install.packages("sos") in the R console, and then load the package with library("sos").

Once the "sos" package is loaded, you have access to the function findFn(). It takes the search term as its argument and returns a list of matching help pages, often hundreds of them. For example, if you run findFn("regression") in your R console, you are presented with a web page containing a lot of information.

This includes links to the many functions that have the word regression in their name or in their help text.

How to import Data into R

The following table lists some functions that come in handy when you want to import data into R:

| Function | What It Does | Example |
| --- | --- | --- |
| read.table() | Reads tabular data whose columns are separated by a delimiter, usually a comma or a tab. You can specify the separator yourself, alongside other arguments that describe the data you want R to read. | read.table(file="myfile", sep="\t", header=FALSE) |
| read.csv() | A simplified version of read.table() with defaults preset for reading CSV files. CSV files are plain-text files that are commonly exported from spreadsheet programs such as MS Excel. | read.csv(file="myfile") |
| read.csv2() | Essentially read.csv() with minor tweaks: the preset separator is a semicolon, and the comma serves as the decimal point. | read.csv2(file="myfile", header=FALSE) |
| read.delim() | Reads delimited files. The default separator is a tab. | read.delim(file="myfile", header=TRUE) |
| scan() | Gives you finer and more precise control over how the data is read, which is useful when the data is not tabular. | scan("myfile", skip=1, nmax=10) |
| readLines() | Reads a text file line by line and returns each line as one element of a character vector. | readLines("myfile") |
| read.fwf() | Reads data in fixed-width format. In simpler words, if the data has a fixed number of characters in each column, this is the function to use. | read.fwf("myfile", widths=c(1,2,3)) |
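To see how these import functions behave on real input, here is a minimal sketch. It assumes nothing beyond base R; the file contents and the variable names (csv_path, from_csv, and so on) are made up for illustration, and the files are written to temporary locations first so the example is self-contained.

```r
# Write a small comma-separated file to a temporary location (made-up data).
csv_path <- tempfile(fileext = ".csv")
writeLines(c("id,score", "1,2.5", "2,3.5"), csv_path)

# read.csv() assumes a header row and a comma separator.
from_csv <- read.csv(file = csv_path)

# read.table() reads the same file once the separator and header are given.
from_table <- read.table(file = csv_path, sep = ",", header = TRUE)

# readLines() returns the raw text, one element per line.
raw_lines <- readLines(csv_path)

# scan() skips the header line and reads the remaining values as numbers.
values <- scan(csv_path, sep = ",", skip = 1, quiet = TRUE)

# A semicolon-separated file with decimal commas suits read.csv2().
csv2_path <- tempfile(fileext = ".csv")
writeLines(c("id;score", "1;2,5", "2;3,5"), csv2_path)
from_csv2 <- read.csv2(file = csv2_path)

# Fixed-width data: columns of 1, 2 and 3 characters.
fwf_path <- tempfile(fileext = ".txt")
writeLines(c("1ab123", "2cd456"), fwf_path)
from_fwf <- read.fwf(fwf_path, widths = c(1, 2, 3))
```

Calling str() on each result is a quick way to check that the column types came out as expected.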

To read data saved by other statistical software, you can use the "foreign" package. After loading it with library(foreign), you gain access to the functions listed below:

| Function | What It Does | Example |
| --- | --- | --- |
| read.spss() | Takes the name of an SPSS file as its argument and reads the file into R. | read.spss("myfile") |
| read.dta() | Takes the name of a file in Stata binary format and reads the file into R. | read.dta("myfile") |
| read.xport() | Takes the name of a SAS export file and reads the file into R. | read.xport("myfile") |
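Trying these readers normally requires a file produced by the other software. As a minimal sketch, assuming the foreign package is installed (it usually ships with R), the example below creates a Stata file with foreign::write.dta() and reads it back with read.dta(). The data and the names original, restored, and dta_path are made up for illustration.

```r
# Run only if the foreign package is available on this machine.
if (requireNamespace("foreign", quietly = TRUE)) {
  dta_path <- tempfile(fileext = ".dta")

  # A small made-up dataset to write out in Stata binary format.
  original <- data.frame(id = 1:3, score = c(2.5, 3.5, 4.5))
  foreign::write.dta(original, dta_path)

  # read.dta() reads the Stata file back into an R data frame.
  restored <- foreign::read.dta(dta_path)
}
```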


Different data types and the basic manipulation of the tables

1. There are three data types of major importance when you are programming in R: numeric, character, and factor. You can check whether a value has a given type, or convert it to that type, with functions such as is.factor() and as.factor() respectively.

2. If you import a table in which a variable contains one or more character entries, R versions before 4.0.0 automatically cast that variable as a factor (newer versions keep it as character unless you set stringsAsFactors=TRUE). You can still force R to convert such a variable to numeric with the command as.numeric(as.character(dat1$VAR1)).

3. The command names(dat1) = c("ID", "X", "Y", "Z") renames the variables in your dataset. Keep in mind that the length of the vector should match the number of variables that you have; otherwise, you will run into an error or end up with unnamed variables.

4. The command fix(dat2) opens the entire dataset in a spreadsheet-like editor, where you can edit a cell by double-clicking it.

5. If your table contains only numeric values, you can transpose it. Use dat2 = t(dat1), and dat2 will contain the transpose of dat1 (all rows turned into columns).
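The type checks, renaming, and transposition described above can be sketched as follows. This is only an illustration on a made-up data frame: the values and the helper names wrong_way, right_way, and numeric_part do not come from the article. fix() is left out because it needs an interactive session.

```r
# A made-up table; stringsAsFactors = TRUE mimics the old import behaviour.
dat1 <- data.frame(a = 1:3, b = c(2.5, 3.5, 4.5), c = 7:9,
                   d = c("30", "10", "20"), stringsAsFactors = TRUE)

# Check the type of a column.
d_is_factor <- is.factor(dat1$d)

# Converting a factor directly returns the internal level codes...
wrong_way <- as.numeric(dat1$d)
# ...so go through character first to recover the real numbers.
right_way <- as.numeric(as.character(dat1$d))

# Rename all four variables; the vector has one name per column.
names(dat1) <- c("ID", "X", "Y", "Z")

# Transpose only the numeric columns; t() returns a matrix.
numeric_part <- dat1[, c("ID", "X", "Y")]
dat2 <- t(numeric_part)
```

Here wrong_way holds the level codes 3, 1, 2, while right_way holds the intended values 30, 10, 20.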

Tips on how to create random data and how to do random sampling

1. The call rnorm(10) draws ten random samples from a normal distribution with a mean of zero and a standard deviation of 1.

2. The call runif(10) draws ten random samples from a uniform distribution between zero and one.

3. The call round(rnorm(10)*3+15) draws ten random samples from a normal distribution with a mean of 15 and a standard deviation of 3, and the rounding function then removes the decimal places.

4. The call round(runif(10)*5+15) returns random integers between 15 and 20. The underlying values are drawn uniformly, although rounding makes the endpoints 15 and 20 about half as likely as the integers in between.

5. The call sample(c("A", "B", "C"), 10, replace=TRUE) draws a random sample of ten elements, with replacement, from the vector passed as the first argument.
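The sampling calls above can be run together as in the sketch below. The call to set.seed() is an addition that is not part of the tips; it only makes the random draws repeatable. The variable names are illustrative.

```r
# Fix the random seed so the draws can be reproduced (an added assumption).
set.seed(42)

# Ten draws from a standard normal distribution (mean 0, sd 1).
normal_draws <- rnorm(10)

# Ten draws from a uniform distribution between 0 and 1.
uniform_draws <- runif(10)

# Ten rounded draws from a normal distribution with mean 15 and sd 3.
rounded_normal <- round(rnorm(10) * 3 + 15)

# Ten rounded draws from a uniform distribution between 15 and 20.
rounded_uniform <- round(runif(10) * 5 + 15)

# Ten letters sampled with replacement from a three-element vector.
letter_sample <- sample(c("A", "B", "C"), 10, replace = TRUE)
```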

Tips on how to transform data that is inside the data table

1. The call dat2 = transform(dat1, VAR1 = VAR1*0.4) multiplies the values stored in VAR1 by 0.4 and assigns the result back to VAR1.

2. transform() can also create new variables that depend on existing ones. The call dat2 = transform(dat1, VAR2 = VAR1*2) creates a new variable named VAR2, which contains the value of VAR1 multiplied by two.

3. You can also use transform() to modify only the values that meet a condition. The call dat2 = transform(dat1, VAR1 = ifelse(VAR3 == "Site 1", VAR1*0.4, VAR1)) multiplies VAR1 by 0.4 for the entries whose VAR3 is "Site 1". The value of VAR1 stays the same everywhere else.
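The three transform() calls above can be tried on a small table, as in this sketch. The data frame contents and the result names scaled, with_new, and site_only are made up for illustration; the column names VAR1 and VAR3 follow the tips.

```r
# A made-up table with a numeric variable and a site label.
dat1 <- data.frame(VAR1 = c(10, 20, 30),
                   VAR3 = c("Site 1", "Site 2", "Site 1"))

# Overwrite VAR1 with 0.4 times its old value.
scaled <- transform(dat1, VAR1 = VAR1 * 0.4)

# Add a new variable VAR2 derived from VAR1.
with_new <- transform(dat1, VAR2 = VAR1 * 2)

# Rescale VAR1 only in the rows that belong to "Site 1".
site_only <- transform(dat1, VAR1 = ifelse(VAR3 == "Site 1", VAR1 * 0.4, VAR1))
```

In each case dat1 itself is left unchanged, because transform() returns a modified copy.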

Conclusion

The world of programming has seen a boom of languages over the past few years. Many of these languages focus their attention on one aspect of computing. R takes a robust statistical and data-science-centric approach, mainly because of the features built into the language.

While working in any programming language, having every command at your fingertips is not an easy task. This is where the R cheat sheet comes to the rescue. One thing to always remember is that the best R cheat sheet is the one that you create yourself.
