Adventures With R

R Programming Language

R is a language for statistical computing and graphics, used mainly to manipulate, analyze, and visualize data. It supports object-oriented programming, and analyses are carried out by writing scripts and functions. R was created by Robert Gentleman and Ross Ihaka in New Zealand and is available free of cost. It is very useful for data scientists. For anyone who works with the computation and analysis of Big Data, knowledge of R makes the task much easier.

Why R?

Because R is open source, it is free to use, so there are no subscription fees, license costs, or user limits to worry about. It includes effective data handling and storage facilities. It is a well-developed, simple, and effective programming language that includes loops, conditionals, and user-defined recursive functions.
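To make the language features mentioned above concrete, here is a minimal sketch of a user-defined recursive function, a loop, and a conditional. The names factorial_rec and labels are made up for this illustration.

```r
# A user-defined recursive function: factorial of a non-negative integer.
# (factorial_rec is a hypothetical name chosen for this example.)
factorial_rec <- function(n) {
  if (n <= 1) {
    return(1)
  }
  n * factorial_rec(n - 1)
}

# A loop with a conditional: label each number from 1 to 4 as odd or even.
labels <- character(0)
for (i in 1:4) {
  if (i %% 2 == 0) {
    labels <- c(labels, "even")
  } else {
    labels <- c(labels, "odd")
  }
}

print(factorial_rec(5))  # 120
print(labels)            # "odd" "even" "odd" "even"
```

In everyday R code, loops like this are often replaced by vectorized functions such as ifelse(), but the explicit form shows the building blocks.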

Because R is driven by a command-line scripting language, a series of complex data-analysis steps can be stored in a script. This makes the steps reusable and also makes it easier for others to validate research results.

Learning R may be somewhat difficult for software professionals coming from other programming languages, but there is no need to worry. This article lays out a path through the basics of R. Once you have learned them, you will be able to start using R for basic data work.

R is a very natural and expressive language for data analysis.

Benefits of Using R

  • Once you have learned the language, there are many benefits. A script can be re-run at any time, which makes it much easier to update the results when the data changes. A scripting language also makes it easy to automate a sequence of tasks that can be integrated into other processes. As a result, data analysis can be done in a fraction of the time.

  • R provides graphical facilities for data analysis, so we can visualize results as graphs and charts and either display them directly on the screen or print them on paper. R is deployed in critical business applications and is one of the most widely used statistical programming languages.

  • R offers a wide range of possibilities. We are not restricted to a predefined set of routines: we can use code contributed by others in the open source community, and we can extend R with our own functions. R also works well in combination with other applications; for example, we can combine R with a MySQL database.
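
The benefits listed above can be sketched in a few lines: an analysis step written once as our own function, re-run when the data changes, and then drawn as a chart. The data values, the function name summarise_scores, and the choice of a temporary PDF file for the chart are all assumptions made for this illustration.

```r
# A reusable analysis step: mean score per group.
# (summarise_scores and the data below are invented for this example.)
summarise_scores <- function(df) {
  aggregate(score ~ group, data = df, FUN = mean)
}

scores <- data.frame(
  group = c("a", "a", "b", "b"),
  score = c(1, 3, 10, 20)
)
result <- summarise_scores(scores)
print(result)

# The data changes: add one observation and simply re-run the same step.
scores_updated <- rbind(scores, data.frame(group = "b", score = 30))
result_updated <- summarise_scores(scores_updated)
print(result_updated)

# Draw the updated result as a bar chart, written to a temporary PDF file
# so that the sketch also runs outside an interactive session.
plot_file <- tempfile(fileext = ".pdf")
pdf(plot_file)
barplot(result_updated$score,
        names.arg = result_updated$group,
        main = "Mean score by group")
invisible(dev.off())
```

In an interactive session, the pdf() and dev.off() lines can be dropped and the chart appears directly on screen.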

Getting Started with R

  • First of all, open www.r-project.org. From there, download and install R on your desktop or laptop. R runs on Windows, macOS, and a range of Unix platforms. Once it is installed, you are ready to work with R.

  • After installing R, I suggest downloading RStudio, an IDE (Integrated Development Environment) for R. RStudio has many features, such as syntax highlighting and auto-completion of code.

  • RStudio has a four-pane workspace that manages multiple R windows for writing commands, storing scripts, and viewing visualizations and command histories.


    Fig: RStudio

    In the figure above, the top left pane is the R code editor, where we will do most of our work. It lets us create a file and write multiple lines of code. The top right pane lists the objects currently in memory. It also has a History tab with a list of recently used commands; from there, you can select some or all of the lines and send them to the console or to a file open in the editor.

    The bottom left pane is the console, where you can type one R statement at a time. Every line of code run from the editor also appears in the console.

    The bottom right pane shows a plot if we have created a data visualization with R code. It offers an option to export the plot to an image or a PDF, and the history of previous plots is also available there.

Three Keyboard Shortcuts Recommended by Wickham (RStudio Chief Scientist)

  1. Tab is the auto-complete key. When we type in the console and press Tab, RStudio suggests matching functions and file names. Press Enter to select one and the code is completed automatically, so there is no need to type the full name.

  2. Ctrl + Enter (Command + Enter on macOS) -- Pressing this takes the current line of code in the editor, sends it to the console, and executes it. If we select multiple lines of code and press Ctrl/Command + Enter, all of them will run.

  3. Ctrl/Command + Up arrow -- Type the first few characters of a command, then press this to see a list of every previous command that starts with those characters. Select the command you want and press Enter. This works only in the console, not in the editor.

Setting Our Working Directory

We can change the working directory with the setwd() function, such as setwd("~/mydirectory"). Always use forward slashes, even on Windows, where the command would look like setwd("C:/Sharon/Documents/RProjects"). In RStudio, we can also change the working directory from the menu under “Session > Set Working Directory”.
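
As a small sketch of how this looks in practice, the snippet below checks the current directory with getwd(), switches to another one, and switches back. The temporary directory returned by tempdir() is only a stand-in for a real project folder.

```r
# Remember where we started so we can return afterwards.
old_dir <- getwd()

# tempdir() stands in for a real project folder such as "~/mydirectory".
project_dir <- tempdir()
setwd(project_dir)

# getwd() now reports the new working directory.
current_dir <- getwd()
print(current_dir)

# Go back to the original directory.
setwd(old_dir)
```

Relative file paths in functions such as read.csv() are resolved against the working directory, which is why it is worth setting at the start of a session.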

Installing and Using Packages

The command for installing a package is install.packages("thepackagename"). Another way to install a package in RStudio is to go to the lower right pane, where you will see a tab named “Packages”. After clicking that tab, you will see an “Install Packages” option. Its location may vary depending on the operating system.

Some useful commands are as follows:
  • installed.packages(): Shows which packages are already installed on your system.
  • update.packages(): Updates the installed packages to their latest versions.
  • remove.packages("thepackagename"): Removes a package that we no longer want on our system.
  • library("thepackagename"): Loads an installed package so that we can use it in our work.
  • ?functionName: To learn more about how a function works, type a question mark (?) followed by the function name.
  • example(functionName): Runs examples for a function; useful when we know what the function does but want to see how to call it properly.
  • args(functionName): Displays a list of the function's arguments.
  • help.search("your search term"): Searches R's help documentation for a specific term. This command also has a shortcut: ??("my search term").

    Note: The parentheses are not needed if the search term is a single word without spaces.
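
A short sketch of a few of these commands follows. It uses the stats package because it ships with R, so nothing has to be downloaded; the lines that install a package or open a help page are left commented out, and "thepackagename" remains a placeholder.

```r
# install.packages("thepackagename")  # needs an internet connection

# Load an installed package; stats ships with every R installation.
library("stats")

# installed.packages() returns a matrix with one row per installed package;
# the row names are the package names.
pkgs <- rownames(installed.packages())
print("stats" %in% pkgs)

# Display the arguments of a function.
print(args(median))

# The argument names can also be collected as a character vector.
arg_names <- names(formals(sd))
print(arg_names)  # "x" "na.rm"

# ?median                  # opens the help page for median()
# help.search("median")    # searches the help documentation
```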

Conclusion

R is a leading choice among data scientists and is supported by an energetic and talented community. It is one of the world’s most widely used statistical programming languages. R is particularly useful because it contains multiple built-in mechanisms for organizing data, running calculations on it, and creating graphical representations of data sets.