Basics Of Predictive Analysis Using R

Introduction

 
Today, the need for statistics, data analysis, and scientific research skills is on the rise. Organizations generate huge volumes of data every day. This data contains hidden trends, patterns, and relationships that can help an organization make sound operational decisions, and statistics gives organizations a way to compute measures that reveal them. R is a programming language designed for exactly these tasks.
 

What is R?

 
R is an interpreted programming language: code is executed by the R interpreter rather than compiled ahead of time. It was developed by Ross Ihaka and Robert Gentleman at the University of Auckland, New Zealand. R is available for free under the GNU General Public License, and pre-compiled binaries are available for various operating systems, including Windows, Linux, and macOS.
 
R is used by:
  • Researchers
  • Data analysts
  • Statisticians
  • Marketers
R is used to:
  • Retrieve data
  • Clean data
  • Analyze data
  • Visualize data
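
The four tasks above can be sketched in a few lines of R. This is an illustrative example only: it assumes the built-in mtcars dataset as the data source, and the variable names and the choice of a simple linear model are assumptions, not part of the original article.

```r
# Retrieve: mtcars ships with R, so no download is needed
cars_raw <- mtcars

# Clean: keep only the rows that have no missing values
cars_clean <- cars_raw[complete.cases(cars_raw), ]

# Analyze: fit a simple linear model of fuel economy on weight
fit <- lm(mpg ~ wt, data = cars_clean)
print(coef(fit))

# Visualize: write a scatter plot with the fitted line to a temporary PDF
plot_path <- file.path(tempdir(), "mpg_vs_wt.pdf")
pdf(plot_path)
plot(cars_clean$wt, cars_clean$mpg,
     xlab = "Weight (1000 lbs)", ylab = "Miles per gallon")
abline(fit)
invisible(dev.off())
```

Writing the plot to a file keeps the sketch runnable outside an interactive session; at the R console, the plot() call alone opens a graphics window.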

Installing R

 
Before you can write and run R programs, you need to install R. I will guide you through installing R on Windows.
 
Installation on Windows
 
First, download the R installer and save it in a local directory. R for Windows can be downloaded from the following URL: https://cran.r-project.org/bin/windows/base/.
 
The download is an executable (.exe) file, so you only have to double-click it to launch the installation. Accepting the default settings during the installation keeps the process simple.
 
On R versions before 4.2.0, the installer sets up only the 32-bit build on 32-bit Windows, and both the 32-bit and 64-bit builds on 64-bit Windows. From R 4.2.0 onward, the Windows installer provides the 64-bit build only.
 
Once the installation is complete, launch R from the Start button under All Programs. You will be presented with the R console, where you can type and run R commands directly.
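
As a minimal sketch of the kind of commands that can be typed at the console (the variable names here are illustrative and not from the original article):

```r
# Arithmetic typed at the prompt is evaluated immediately
total <- 2 + 3 * 4
print(total)

# Built-in functions are called the same way
average <- mean(c(10, 20, 30))
print(average)
```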
 

Installing R packages

 
Once R is installed on your system, some tasks require additional packages. This is often the case in data analysis. For example, to visualize your data, you may need to install a package that specializes in data visualization.
 
The installation can be done by running the following command on the R console:
  install.packages("package_name")
For example, to install a package named plotrix, we can run the following command:
  install.packages("plotrix")
The package will then be installed and integrated within your R environment.
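
An installed package still has to be loaded with library() before its functions can be used in a session. The sketch below combines both steps. Note that ensure_package is a hypothetical helper written for this illustration, not a function provided by R. The call uses the stats package, which ships with R, so that the sketch runs without network access; substitute "plotrix" to match the example above.

```r
# Hypothetical helper: install a package only if it is missing, then load it
ensure_package <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
  library(pkg, character.only = TRUE)
}

# "stats" is bundled with R, so nothing is downloaded here
ensure_package("stats")
```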
 
There are two ways to write and run your R programs:
  • Using the R command prompt
  • Using the R script file.
Let us discuss them in detail.
 
R Command Prompt
 
After R has been installed on your system, you can start its command prompt from All Programs on Windows. Linux users can run the R command in the terminal. Once the R command prompt > appears, you can begin to type your program.
 
For example, you can invoke the print() function to print the Hello World text on the prompt:
  print("Hello World")
The console responds with [1] "Hello World".
 
 
 
We can also create a variable and assign the string Hello World to it.
 
This is shown below:
  helloString <- "Hello World"
  print(helloString)
This prints [1] "Hello World" on the console.
 
 
We created a variable named helloString and assigned it the value Hello World using the <- operator. We then passed the variable to the print() function to print its value.
 
R Script File
 
You can also write your R code in a file and run it with Rscript, the command-line front end to the R interpreter. By convention, the name of the script file ends with a .R (or .r) extension to signify that it is an R file.
 
To create a script file:
  • Click File at the top of the R console window, then select New script.
  • A blank text editor will open. Press Ctrl+S and give the file the name hello.R.
  • Add the script shown below to the file.
  • Press Ctrl+S again to save the changes you have made to the file.
The script to add is:
  helloString <- "Hello World"
  print(helloString)
 
Now, open the terminal of your operating system and navigate to the directory where you have saved your hello.R file.
 
Run the Rscript command followed by the name of the file, that is, Rscript hello.R. The command executes the file and prints the result, [1] "Hello World".
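
A script file can also be run from inside an R session with the source() function, which is an alternative not covered above. The sketch below writes the two-line script to a temporary location (the path is an assumption made for illustration) and then runs it.

```r
# Write the two-line script to a temporary file (illustrative path)
script_path <- file.path(tempdir(), "hello.R")
writeLines(c('helloString <- "Hello World"',
             'print(helloString)'),
           script_path)

# source() reads the file and evaluates its code in the current session
source(script_path)
```

After source() returns, the variable helloString defined by the script is available in the session.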
 

R Comments

 
Comments are parts of the code that the interpreter skips. They explain what the code does, so that a reader can follow it even if someone else wrote it. A comment may span a single line or several lines.
 
To write a single-line comment, precede it with the pound (#) symbol as shown below:
  helloString <- "Hello World"
  # print text on the screen
  print(helloString)
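The line preceded with the # symbol will be skipped by the R interpreter. R has no dedicated multi-line comment syntax, so the usual approach for a longer comment is to start every line with #. A trick that is sometimes used instead is to enclose the text within double or single quotes. Keep in mind that this creates a string, not a true comment: R still evaluates it, and at the top level of the console or a script its value is printed.

For example:
  helloString <- "Hello World"
  "print text on the screen. This comment spans more than one line"
  print(helloString)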
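
A minimal sketch of the difference between a true multi-line comment and quoted text follows. The variable name pseudoComment is illustrative and not from the original article.

```r
helloString <- "Hello World"

# print text on the screen.
# This comment spans more than one line,
# so every line starts with the # symbol.
print(helloString)

# Quoted text is an ordinary character string with a value, not a comment
pseudoComment <- "this text is evaluated by R"
print(class(pseudoComment))
```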
R is one of the most widely used programming languages for statistics and data analysis. To start writing and running R code, we need to install R on our computer. R code can be executed directly on the R console or by creating a script file and running it with Rscript.

To create a single-line comment, precede the line with the pound symbol (#). For a comment that spans several lines, begin each line with #; text enclosed in single or double quotes is a string that R still evaluates.
 
Data Types in R
 
The data type of a variable determines the amount of memory allocated to it and the kind of value it can store. In statically typed languages such as C and Java, a variable is declared with a particular data type.

This is not the case in R. Variables are assigned R-objects, and the data type of the R-object becomes the data type of the variable.
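
As a small illustration of this behavior (the variable names are assumptions chosen for the example), the same name can refer to objects of different types over time, and class() reports the type of whatever object is currently assigned:

```r
value <- 42L                  # an integer object
first_type <- class(value)    # "integer"

value <- "forty-two"          # the same name now refers to a character object
second_type <- class(value)   # "character"

value <- TRUE                 # and now to a logical object
third_type <- class(value)    # "logical"

print(c(first_type, second_type, third_type))
```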
 
R supports various data types. There are many kinds of R-objects, but the most common ones are the following:
  • Vectors
  • Matrices
  • Lists
  • Arrays
  • Data Frames
  • Factors
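
A minimal sketch that creates one object of each kind listed above is given here. All names and values are made up for illustration.

```r
colours <- c("red", "green", "blue")                  # vector
grid <- matrix(1:6, nrow = 2, ncol = 3)               # matrix: 2 rows, 3 columns
record <- list(name = "Ana", age = 30L,
               scores = c(90, 85))                    # list: mixed element types
cube <- array(1:24, dim = c(2, 3, 4))                 # array: three dimensions
people <- data.frame(name = c("Ana", "Ben"),
                     age = c(30L, 25L))               # data frame: tabular data
sizes <- factor(c("small", "large", "small"))         # factor: categorical data

# Inspect the structure of each object
str(colours)
str(grid)
str(record)
str(cube)
str(people)
str(sizes)
```

The str() function prints a compact summary of an object's type and contents, which is a convenient way to check what a variable currently holds.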

Conclusion

 
In this article, I introduced R and its origins, and covered how to install R on Windows, how to install R packages, how to use the R command prompt, how to create and run an R script file, and how to write comments, along with a brief look at the data types in R.