Introduction
Data can come from many sources. A standard R installation ships with several built-in packages, some of which contain predefined datasets that can be loaded for analysis. R can also read data from a broad range of external sources and in many formats.
In this article, I will discuss how to access a single predefined dataset as well as how to list the datasets belonging to different packages in R. I will cover the syntax for listing the datasets of a single package and for listing all the datasets across the installed packages.
Accessing predefined datasets in R
An R installation comes with many predefined datasets, most of which are bundled in a package called datasets. This variety means that a suitable dataset is usually at hand for trying out different kinds of analysis techniques in different projects.
Datasets are also available in many other R packages. The function data() can be used to list the datasets available in the currently loaded packages or in a particular package.
To list the datasets of the package datasets, we can use the syntax given below,
- > data(package = "datasets")
The above syntax will give the following output,
- Data sets in package ‘datasets’:
-
- AirPassengers Monthly Airline Passenger Numbers 1949-1960
- BJsales Sales Data with Leading Indicator
- BJsales.lead (BJsales)
- Sales Data with Leading Indicator
- BOD Biochemical Oxygen Demand
- CO2 Carbon Dioxide Uptake in Grass Plants
- ChickWeight Weight versus age of chicks on different diets
- DNase Elisa assay of DNase
- EuStockMarkets Daily Closing Prices of Major European Stock
- Indices, 1991-1998
- Formaldehyde Determination of Formaldehyde
- HairEyeColor Hair and Eye Color of Statistics Students
- Harman23.cor Harman Example 2.3
- Harman74.cor Harman Example 7.4
- Indometh Pharmacokinetics of Indomethacin
- InsectSprays Effectiveness of Insect Sprays
- JohnsonJohnson Quarterly Earnings per Johnson & Johnson Share
- LakeHuron Level of Lake Huron 1875-1972
- LifeCycleSavings Intercountry Life-Cycle Savings Data
- Loblolly Growth of Loblolly pine trees
- Nile Flow of the River Nile
- Orange Growth of Orange Trees
- OrchardSprays Potency of Orchard Sprays
- PlantGrowth Results from an Experiment on Plant Growth
- Puromycin Reaction Velocity of an Enzymatic Reaction
- Seatbelts Road Casualties in Great Britain 1969-84
- Theoph Pharmacokinetics of Theophylline
- Titanic Survival of passengers on the Titanic
- ToothGrowth The Effect of Vitamin C on Tooth Growth in
- Guinea Pigs
- UCBAdmissions Student Admissions at UC Berkeley
- UKDriverDeaths Road Casualties in Great Britain 1969-84
- UKgas UK Quarterly Gas Consumption
- USAccDeaths Accidental Deaths in the US 1973-1978
- USArrests Violent Crime Rates by US State
- USJudgeRatings Lawyers' Ratings of State Judges in the US
- Superior Court
- USPersonalExpenditure Personal Expenditure Data
- UScitiesD Distances Between European Cities and Between
- US Cities
- VADeaths Death Rates in Virginia (1940)
- WWWusage Internet Usage per Minute
- WorldPhones The World's Telephones
- ability.cov Ability and Intelligence Tests
- airmiles Passenger Miles on Commercial US Airlines,
- 1937-1960
- airquality New York Air Quality Measurements
- anscombe Anscombe's Quartet of 'Identical' Simple Linear
- Regressions
- attenu The Joyner-Boore Attenuation Data
- attitude The Chatterjee-Price Attitude Data
- austres Quarterly Time Series of the Number of
- Australian Residents
- beaver1 (beavers) Body Temperature Series of Two Beavers
- beaver2 (beavers) Body Temperature Series of Two Beavers
- cars Speed and Stopping Distances of Cars
- chickwts Chicken Weights by Feed Type
- co2 Mauna Loa Atmospheric CO2 Concentration
- crimtab Student's 3000 Criminals Data
- discoveries Yearly Numbers of Important Discoveries
- esoph Smoking, Alcohol and (O)esophageal Cancer
- euro Conversion Rates of Euro Currencies
- euro.cross (euro) Conversion Rates of Euro Currencies
- eurodist Distances Between European Cities and Between
- US Cities
- faithful Old Faithful Geyser Data
- fdeaths (UKLungDeaths)
- Monthly Deaths from Lung Diseases in the UK
- freeny Freeny's Revenue Data
- freeny.x (freeny) Freeny's Revenue Data
- freeny.y (freeny) Freeny's Revenue Data
- infert Infertility after Spontaneous and Induced
- Abortion
- iris Edgar Anderson's Iris Data
- iris3 Edgar Anderson's Iris Data
- islands Areas of the World's Major Landmasses
- ldeaths (UKLungDeaths)
- Monthly Deaths from Lung Diseases in the UK
- lh Luteinizing Hormone in Blood Samples
- longley Longley's Economic Regression Data
- lynx Annual Canadian Lynx trappings 1821-1934
- mdeaths (UKLungDeaths)
- Monthly Deaths from Lung Diseases in the UK
- morley Michelson Speed of Light Data
- mtcars Motor Trend Car Road Tests
- nhtemp Average Yearly Temperatures in New Haven
- nottem Average Monthly Temperatures at Nottingham,
- 1920-1939
- npk Classical N, P, K Factorial Experiment
- occupationalStatus Occupational Status of Fathers and their Sons
- precip Annual Precipitation in US Cities
- presidents Quarterly Approval Ratings of US Presidents
- pressure Vapor Pressure of Mercury as a Function of
- Temperature
- quakes Locations of Earthquakes off Fiji
- randu Random Numbers from Congruential Generator
- RANDU
- rivers Lengths of Major North American Rivers
- rock Measurements on Petroleum Rock Samples
- sleep Student's Sleep Data
- stack.loss (stackloss)
- Brownlee's Stack Loss Plant Data
- stack.x (stackloss) Brownlee's Stack Loss Plant Data
- stackloss Brownlee's Stack Loss Plant Data
- state.abb (state) US State Facts and Figures
- state.area (state) US State Facts and Figures
- state.center (state) US State Facts and Figures
- state.division (state)
- US State Facts and Figures
- state.name (state) US State Facts and Figures
- state.region (state) US State Facts and Figures
- state.x77 (state) US State Facts and Figures
- sunspot.month Monthly Sunspot Data, from 1749 to "Present"
- sunspot.year Yearly Sunspot Data, 1700-1988
- sunspots Monthly Sunspot Numbers, 1749-1983
- swiss Swiss Fertility and Socioeconomic Indicators
- (1888) Data
- treering Yearly Treering Data, -6000-1979
- trees Diameter, Height and Volume for Black Cherry
- Trees
- uspop Populations Recorded by the US Census
- volcano Topographic Information on Auckland's Maunga
- Whau Volcano
- warpbreaks The Number of Breaks in Yarn during Weaving
- women Average Heights and Weights for American Women
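The same listing can also be captured as an object instead of only being displayed. The snippet below is a minimal sketch: data() returns a list whose results component is a character matrix with Item and Title columns, and the variable names ds and listing are chosen here purely for illustration.

```r
# Capture the listing for the 'datasets' package instead of only printing it.
ds <- data(package = "datasets")

# ds$results is a character matrix; keep the dataset name and its description.
listing <- as.data.frame(ds$results[, c("Item", "Title")],
                         stringsAsFactors = FALSE)

head(listing)   # first few datasets with their titles
nrow(listing)   # number of entries in the listing
```

Having the listing as a data frame makes it easy to search, for example with grepl("Sunspot", listing$Title).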
The datasets in every installed package can be listed using the following syntax,
- > data(package = .packages(all.available = TRUE))
This displays a complete list of the datasets in all the packages that are installed in the R library, whether or not those packages are currently loaded.
The above syntax will give output like the following (the exact list depends on which packages are installed),
- Data sets in package ‘boot’:
-
- acme Monthly Excess Returns
- aids Delay in AIDS Reporting in England and Wales
- aircondit Failures of Air-conditioning Equipment
- aircondit7 Failures of Air-conditioning Equipment
- amis Car Speeding and Warning Signs
- aml Remission Times for Acute Myelogenous Leukaemia
- beaver Beaver Body Temperature Data
- bigcity Population of U.S. Cities
- brambles Spatial Location of Bramble Canes
- breslow Smoking Deaths Among Doctors
- calcium Calcium Uptake Data
- cane Sugar-cane Disease Data
- capability Simulated Manufacturing Process Data
- catsM Weight Data for Domestic Cats
- cav Position of Muscle Caveolae
- cd4 CD4 Counts for HIV-Positive Patients
- cd4.nested Nested Bootstrap of cd4 data
- channing Channing House Data
- city Population of U.S. Cities
- claridge Genetic Links to Left-handedness
- cloth Number of Flaws in Cloth
- co.transfer Carbon Monoxide Transfer
- coal Dates of Coal Mining Disasters
- darwin Darwin's Plant Height Differences
- dogs Cardiac Data for Domestic Dogs
- downs.bc Incidence of Down's Syndrome in British
- Columbia
- ducks Behavioral and Plumage Characteristics of
- Hybrid Ducks
- fir Counts of Balsam-fir Seedlings
- frets Head Dimensions in Brothers
- grav Acceleration Due to Gravity
- gravity Acceleration Due to Gravity
- hirose Failure Time of PET Film
- islay Jura Quartzite Azimuths on Islay
- manaus Average Heights of the Rio Negro river at
- Manaus
- melanoma Survival from Malignant Melanoma
- motor Data from a Simulated Motorcycle Accident
- neuro Neurophysiological Point Process Data
- nitrofen Toxicity of Nitrofen in Aquatic Systems
- nodal Nodal Involvement in Prostate Cancer
- nuclear Nuclear Power Station Construction Data
- paulsen Neurotransmission in Guinea Pig Brains
- poisons Animal Survival Times
- polar Pole Positions of New Caledonian Laterites
- remission Cancer Remission and Cell Activity
- salinity Water Salinity and River Discharge
- survival Survival of Rats after Radiation Doses
- tau Tau Particle Decay Modes
- tuna Tuna Sighting Data
- urine Urine Analysis Data
- wool Australian Relative Wool Prices
-
- Data sets in package ‘cluster’:
-
- agriculture European Union Agricultural Workforces
- animals Attributes of Animals
- chorSub Subset of C-horizon of Kola Data
- flower Flower Characteristics
- plantTraits Plant Species Traits Data
- pluton Isotopic Composition Plutonium Batches
- ruspini Ruspini Data
- votes.repub Votes for Republican Candidate in Presidential
- Elections
- xclara Bivariate Data Set with 3 Clusters
-
- Data sets in package ‘datasets’:
-
- AirPassengers Monthly Airline Passenger Numbers 1949-1960
- BJsales Sales Data with Leading Indicator
- BJsales.lead (BJsales)
- Sales Data with Leading Indicator
- BOD Biochemical Oxygen Demand
- CO2 Carbon Dioxide Uptake in Grass Plants
- ChickWeight Weight versus age of chicks on different diets
- DNase Elisa assay of DNase
- EuStockMarkets Daily Closing Prices of Major European Stock
- Indices, 1991-1998
- Formaldehyde Determination of Formaldehyde
- HairEyeColor Hair and Eye Color of Statistics Students
- Harman23.cor Harman Example 2.3
- Harman74.cor Harman Example 7.4
- Indometh Pharmacokinetics of Indomethacin
- InsectSprays Effectiveness of Insect Sprays
- JohnsonJohnson Quarterly Earnings per Johnson & Johnson Share
- LakeHuron Level of Lake Huron 1875-1972
- LifeCycleSavings Intercountry Life-Cycle Savings Data
- Loblolly Growth of Loblolly pine trees
- Nile Flow of the River Nile
- Orange Growth of Orange Trees
- OrchardSprays Potency of Orchard Sprays
- PlantGrowth Results from an Experiment on Plant Growth
- Puromycin Reaction Velocity of an Enzymatic Reaction
- Seatbelts Road Casualties in Great Britain 1969-84
- Theoph Pharmacokinetics of Theophylline
- Titanic Survival of passengers on the Titanic
- ToothGrowth The Effect of Vitamin C on Tooth Growth in
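Because the full listing can be very long, it may help to summarise it per package. The following is a rough sketch that counts the datasets reported for each installed package; the names all_ds and counts are illustrative, and the numbers will differ from one installation to another.

```r
# List datasets across all installed packages (this can take a moment).
all_ds <- data(package = .packages(all.available = TRUE))

# The "Package" column of the results matrix names the package
# that each dataset belongs to, so tabulating it gives a count per package.
counts <- sort(table(all_ds$results[, "Package"]), decreasing = TRUE)

head(counts)   # packages with the most datasets
```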
A specific dataset can be loaded using the function data(). We pass the name of the dataset and, optionally, the package in which it is found, as follows,
- > data("cars", package = "datasets")
For example, loading the cars dataset from package ‘datasets’ and viewing its first rows,
- > data("cars", package = "datasets")
- > head(cars)
- speed dist
- 1 4 2
- 2 4 10
- 3 7 4
- 4 7 22
- 5 8 16
- 6 9 10
As we can see, the cars data frame contains two variables: speed, and dist, the stopping distance of the cars.
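Beyond head(), a few base R functions give a quick overview of a loaded dataset. The sketch below uses only the cars dataset shown above and standard functions (dim, names, str, summary).

```r
# Load the dataset explicitly; 'datasets' is attached by default,
# so cars is usually available even without this call.
data("cars", package = "datasets")

dim(cars)            # 50 rows and 2 columns
names(cars)          # "speed" "dist"
str(cars)            # compact view of the structure and column types
summary(cars$speed)  # five-number summary and mean of the speed column
```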
Summary
In this article, I demonstrated how to access a single dataset as well as how to list the datasets belonging to different packages in R. I discussed the syntax for listing the datasets of a single package and for listing all the datasets across the installed packages, with code snippets and their output.