Sunday, September 15, 2013

Topic 1: Introduction to R from a Newbie's Perspective

Install R
Download R from http://cran.r-project.org/bin/windows/
I downloaded and installed it on my Windows 7 PC, and it works just fine.


Start R and type the following at the prompt:

> R.version

You get the following output:

               _                           
platform       i386-w64-mingw32            
arch           i386                        
os             mingw32                     
system         i386, mingw32               
status                                     
major          3                           
minor          0.1                         
year           2013                        
month          05                          
day            16                          
svn rev        62743                       
language       R                           
version.string R version 3.0.1 (2013-05-16)
nickname       Good Sport  
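
The fields printed above can also be read one at a time, which is handy when a script needs to check which version of R it is running on. A minimal sketch using only base R (the variable names major, minor and full are mine, just for illustration):

```r
# R.version is a list, so individual fields are available with $
major <- R.version$major
minor <- R.version$minor
full  <- paste(major, minor, sep = ".")
print(full)

# getRversion() returns the version in a form that can be compared
if (getRversion() >= "3.0.0") {
  print("Running R 3.0.0 or later")
}
```

On the installation shown above, full would be "3.0.1"; on a newer installation it will differ.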


Let's generate some data.
> x <- 1:100
> x
  [1]   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18  19  20  21  22  23  24  25  26  27  28  29  30  31  32  33  34  35  36  37  38  39  40  41  42  43  44  45  46  47  48  49  50  51  52  53  54  55  56  57
 [58]  58  59  60  61  62  63  64  65  66  67  68  69  70  71  72  73  74  75  76  77  78  79  80  81  82  83  84  85  86  87  88  89  90  91  92  93  94  95  96  97  98  99 100
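
Once the vector exists, a few base R functions give a quick feel for it. This is only a small sketch; the name evens is made up for the example:

```r
x <- 1:100

length(x)      # number of elements in x
sum(x)         # 5050
head(x, 5)     # first five values

# Logical indexing: keep only the even numbers
evens <- x[x %% 2 == 0]
length(evens)  # 50
```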

We may wish to edit the data. Fire up the data editor:

> data.entry(x)

Close the editor after you are done.

If you want to quit R, type:

> q()

To save the command history (this saves the commands you typed, not the data objects):
> setwd("e:/R")
> savehistory(file="history-9-26-2013.Rhistory")

You can load the history with the following command:
> loadhistory(file="history-9-26-2013.Rhistory")

Saving workspace image

Saving the workspace saves only the objects you have created, not any output you have produced using them. You can save the workspace at any time using the save.image() command (also available as "Save Workspace" under the File menu), or at the end of a session, when R asks whether you want to save the workspace.
Type:
> save.image()
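
If you only want to keep selected objects rather than the whole workspace, base R also has save() and load(). The sketch below writes to a temporary folder so it can be run anywhere; the file name example.RData is just a placeholder I picked:

```r
x <- 1:100

# Save one object to a file (hypothetical file name in a temp folder)
rdata_file <- file.path(tempdir(), "example.RData")
save(x, file = rdata_file)

# Remove the object, then bring it back from the file
rm(x)
load(rdata_file)
exists("x")   # TRUE once the file has been loaded
```

save.image() also accepts a file argument, so save.image(file = rdata_file) would store every object in the workspace in the same way.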

To list all installed packages, type:

> installed.packages()

You can get a less detailed list by typing:

> library()

To check whether there are newer versions of your installed packages at CRAN:

> old.packages()

You can use the following command to update all your installed packages:

> update.packages()
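
The output of installed.packages() is long, so it can help to store it and look at only a couple of columns. A rough sketch (the names pkgs and has_foreign are mine; "foreign" is used as the example because it comes up later in this post):

```r
# installed.packages() returns a matrix with one row per package
pkgs <- installed.packages()

# Show just the package name and version for the first few rows
head(pkgs[, c("Package", "Version")])

# Check whether a particular package is installed
has_foreign <- "foreign" %in% rownames(pkgs)
print(has_foreign)
```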


Sample Data in R

Start R
R comes with sample data. To see what datasets are available, type the following at the R command prompt:
> data()

R shows the datasets available. Look at the following screenshot.
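
The same list can be captured as an object instead of being shown in a separate window, which makes it searchable. A small sketch, assuming the standard datasets package (the name ds is just for illustration):

```r
# data() returns an object whose $results element is a character matrix
ds <- data(package = "datasets")$results

# Name and short description of the first few datasets
head(ds[, c("Item", "Title")])

# How many datasets the package lists
nrow(ds)
```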





Take a look at the data

Type the following at R prompt:
> iris

R prints out the iris data


To see a summary of the data type the following at the R prompt:

> summary(iris)

R prints out the summary. Look at the following screenshot.
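
Besides summary(), a few other base R functions give a quick overview of a data frame without printing all of it. A short sketch (iris_dim is just a name I chose):

```r
iris_dim <- dim(iris)   # number of rows and columns
print(iris_dim)

names(iris)             # the variable (column) names
str(iris)               # type of each variable plus the first few values
head(iris, 3)           # the first three rows only
```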






There are several variables in iris. You can look at the first one by typing at the R prompt:
> iris[1]

R prints out the Sepal.Length variable.


Similarly, to see Sepal.Width, type:

> iris[2]

Another way of referring to a variable within a data frame (what we have been calling a dataset so far):

> iris$Sepal.Width

R prints out the data, this time horizontally, as a plain vector.



Intuitively, you would refer to Sepal.Length as iris$Sepal.Length. Type this at the R prompt and you get the data back.


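To get a five-number summary of iris$Sepal.Length, you can type:

fivenum(iris$Sepal.Length)

Do not try the following; it fails because iris[1] is a one-column data frame, not a numeric vector:
fivenum(iris[1])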
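
The difference between the ways of picking out a column is easier to see by checking what each one returns. The sketch below is my own illustration; tryCatch() is used only so that the failing call does not stop the script:

```r
class(iris[1])             # single brackets keep it a data frame
class(iris[[1]])           # double brackets give the numeric vector
class(iris$Sepal.Length)   # same as iris[[1]]

# fivenum() works on the vector
five <- fivenum(iris[[1]])
print(five)

# On the data frame it raises an error; capture the message instead of stopping
result <- tryCatch(fivenum(iris[1]),
                   error = function(e) conditionMessage(e))
print(result)
```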

To summarize the Sepal.Width variable:
> summary(iris$Sepal.Width)


To plot Sepal.Width

plot(iris$Sepal.Width)


Histogram of iris$Sepal.Width

hist(iris$Sepal.Width)

Check the shape of iris$Sepal.Width

plot(density(iris$Sepal.Width))

See the image: the shape of the variable
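
The histogram and the density curve can also be drawn on the same plot, which makes it easier to compare them. A sketch using base graphics; the titles and the names w and d are my own choices:

```r
w <- iris$Sepal.Width

# freq = FALSE puts the histogram on the density scale
hist(w, freq = FALSE,
     main = "Sepal.Width: histogram with density curve",
     xlab = "Sepal width")

# Estimate the density and draw it over the histogram
d <- density(w)
lines(d)
```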






Other commands to explore:

qqnorm(iris$Sepal.Width)

qqline(iris$Sepal.Width)

Shapiro-Wilk normality test

shapiro.test(iris$Sepal.Width)
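
The result of shapiro.test() can be stored so that the test statistic and p-value can be used later. In this sketch the 0.05 cut-off is only the usual convention, not something the test itself prescribes, and res is just a name I picked:

```r
res <- shapiro.test(iris$Sepal.Width)

res$statistic   # the W statistic
res$p.value     # the p-value

# A small p-value is evidence against normality
if (res$p.value < 0.05) {
  print("Normality is rejected at the 5% level")
} else {
  print("No evidence against normality at the 5% level")
}
```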

For Quantiles

> quantile(iris$Sepal.Width)


For Standard Deviation

> sd(iris$Sepal.Width)

For Variance

> var(iris$Sepal.Width)
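
The two numbers are related: the variance is the square of the standard deviation, and both use n - 1 in the denominator. A quick check, as a sketch (the variable names are mine):

```r
w <- iris$Sepal.Width

s <- sd(w)
v <- var(w)

# The variance equals the standard deviation squared
all.equal(s^2, v)

# The same variance computed by hand with the n - 1 denominator
manual_var <- sum((w - mean(w))^2) / (length(w) - 1)
all.equal(manual_var, v)
```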

To do a box plot of Sepal.Width

> boxplot(iris$Sepal.Width, ylab="Sepal Width (iris data)",
         names="Sepal Width",
         main="Sepal Width Boxplot")

To see two boxplots of Sepal.Length and Sepal.Width side by side:

> boxplot(iris$Sepal.Length, iris$Sepal.Width,
  ylab="Sepal Length/Width (iris data)",
  names=c("Sepal Length", "Sepal Width"),
  main="Sepal Length & Width Boxplot")
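
Two other ways of getting side-by-side boxplots may be worth knowing: passing a data frame of columns, and using a formula to split one variable by a grouping factor. This is a sketch of my own; the titles and the names sepal and stats are just for illustration:

```r
# A data frame gives one box per column, labelled with the column names
sepal <- iris[, c("Sepal.Length", "Sepal.Width")]
boxplot(sepal, main = "Sepal Length & Width Boxplot")

# A formula gives one box per level of the grouping variable
stats <- boxplot(Sepal.Width ~ Species, data = iris,
                 ylab = "Sepal Width (iris data)",
                 main = "Sepal Width by Species")

# boxplot() invisibly returns the numbers behind the plot
stats$names
```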


To see the list of datasets in the datasets package:

library(help = "datasets")


Package foreign

Also of note is an R package called foreign. This package contains functions for importing data into R from files written by other statistical software packages, including SAS, SPSS, Stata and others. Package foreign is available for download and installation from the CRAN site.
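
As a rough illustration of how foreign is used, the sketch below writes the iris data to a Stata file and reads it back. It assumes the foreign package is installed; the file name iris.dta in a temporary folder is only a placeholder:

```r
if (requireNamespace("foreign", quietly = TRUE)) {
  # Hypothetical file name in a temporary folder
  dta_file <- file.path(tempdir(), "iris.dta")

  # Write the data frame in Stata format, then read it back into R
  foreign::write.dta(iris, dta_file)
  back <- foreign::read.dta(dta_file)

  str(back)
}
```

Stata does not allow dots in variable names, so the column names may come back slightly different from the originals. For SPSS files the corresponding reader in the same package is read.spss().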
