Nice tutorials to discover R http://t.co/ckBJskmpvK via @rbloggers
— Dilir Akhtar Khan (@dilirkhan) October 27, 2013
Sunday, October 27, 2013
Nice tutorials to discover R
Normalize Data in R (Calculate Z scores)
The scale() function is used to create Z scores (normalize data) in R.
To calculate the Z score of a variable, we subtract the mean of all data points from each individual data point and divide the result by the standard deviation of the variable. scale() does this in one simple call.
In R console, type
> x = c(2,4,6,8)
This creates a variable x.
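To subtract the mean of the variable from each data point (this is called centering) and then divide by the standard deviation (this is called scaling):
> scale(x, center = TRUE, scale = TRUE) # scale = FALSE will center only and will not divide each data point by the standard deviation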
[,1]
[1,] -1.1618950
[2,] -0.3872983
[3,] 0.3872983
[4,] 1.1618950
attr(,"scaled:center")
[1] 5
attr(,"scaled:scale")
[1] 2.581989
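As a quick check, the same Z scores can be computed by hand and compared with what scale() returns. This is only a minimal sketch; the names z_manual and z_scale are made up for illustration.

```r
x <- c(2, 4, 6, 8)

# Z score by hand: subtract the mean, then divide by the standard deviation
z_manual <- (x - mean(x)) / sd(x)

# scale() returns a one-column matrix with attributes; as.vector() drops both
z_scale <- as.vector(scale(x, center = TRUE, scale = TRUE))

# Centering only: the values are shifted by the mean but not divided by sd
centered <- as.vector(scale(x, center = TRUE, scale = FALSE))

print(all.equal(z_manual, z_scale))
print(centered)
```

Note that sd() uses the sample standard deviation (n - 1 in the denominator), which is why the scaling factor shown above is 2.581989.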
Tuesday, October 22, 2013
T-Test in R
The data file, 98.6 t-test.xlsx, needs to be converted to .csv first.
normtmp=read.csv("e:/r/98.6 t-test.csv",header=TRUE)
qqnorm(normtmp$tmp)
qqline(normtmp$tmp)
plot(density(normtmp$tmp))
shapiro.test(normtmp$tmp)
t.test(normtmp$tmp, mu=98.6, conf.level=.99, alternative="two.sided")
# output not shown
#Note: setting the alternative to "two.sided" was unnecessary, since that is the default.
The p-value in the output is small enough that we can reject the null hypothesis (mean = 98.6) at any reasonable alpha level we might have chosen!
#From the sample, we might estimate the mean human body temperature to be 98.25 degrees (sample mean on the last line of output).
#The 99% confidence interval for the population mean runs from 98.08111 to 98.41735 degrees, which does not include 98.6.
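Since the output is not shown here, the following sketch shows how to pull the pieces discussed above out of the t.test() result. It does not use the original file: tmp is simulated data, and the sample size, mean, and standard deviation passed to rnorm() are arbitrary assumptions, so the numbers will differ from the ones quoted above.

```r
# Simulated stand-in for normtmp$tmp (hypothetical data, not the original file)
set.seed(42)
tmp <- rnorm(100, mean = 98.25, sd = 0.7)

# Normality check, as in the post
sw <- shapiro.test(tmp)

# One-sample t-test against mu = 98.6 with a 99% confidence interval
res <- t.test(tmp, mu = 98.6, conf.level = 0.99)

# The same t statistic by hand: (sample mean - mu) / standard error
t_manual <- (mean(tmp) - 98.6) / (sd(tmp) / sqrt(length(tmp)))

print(sw$p.value)     # Shapiro-Wilk p-value
print(res$estimate)   # sample mean
print(res$conf.int)   # 99% confidence interval
print(res$p.value)    # p-value of the t-test
print(all.equal(unname(res$statistic), t_manual))
```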
Friday, October 11, 2013
Different Types of Plots in R
To get the data set, click this link: Friends Data from Carnegie Mellon University. The data will be downloaded to your computer. Double-click the downloaded file. A new session of R will start and the data will be loaded into a variable named friends.
To take a look at the data, type:
> friends
Create a table:
> t <- table(friends)
See the table:
> t
friends
No difference  Opposite sex      Same sex
          602           434           164
Draw a bar plot:
> barplot(t)
For horizontal bars:
> barplot(t, horiz=TRUE)
Try adding a title, an axis label and a color:
> barplot(t, horiz=TRUE, main="Friends Distribution", ylab="Make Friends With", col="darkblue")
For more examples, check: http://www.statmethods.net/graphs/bar.html
Pie Chart
------------
> pie(t)
To create a 3D pie chart:
> install.packages("plotrix")
> library(plotrix)
> pie3D(t, explode=.1)
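If the data file is not at hand, the plots can still be tried by rebuilding the table from the counts printed above. This is just a sketch: the names counts, friends_tab and pct are made up for illustration, and the percentage labels are an addition that is not part of the original example.

```r
# Rebuild the frequency table from the counts shown above
counts <- c("No difference" = 602, "Opposite sex" = 434, "Same sex" = 164)
friends_tab <- as.table(counts)

# Share of each category in percent, rounded to one decimal
pct <- round(100 * prop.table(friends_tab), 1)

# Horizontal bar plot with a title and color
barplot(friends_tab, horiz = TRUE, main = "Friends Distribution", col = "darkblue")

# Pie chart with the percentage added to each label
pie(friends_tab, labels = paste0(names(counts), " (", pct, "%)"))
```

The table is stored as friends_tab rather than t here because t() is also the name of R's transpose function.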
Saturday, October 5, 2013
Chi Square Test
Copy the following data into a text editor, add a blank line at the end, and save it as chisq.csv.
Heart Rate Increased, No Heart Rate Increase
Treated, 36,14
Not Treated, 30, 25
For details on the data, visit http://math.hws.edu/javamath/ryan/ChiSquare.html
What we are trying to do here is to test the effect of a drug.
Ho: The proportion of animals whose heart rate increased is independent of drug treatment.
Ha: The proportion of animals whose heart rate increased is associated with drug treatment.
Read the data into R:
> x <- read.csv("e:/r/chisq.csv")
If you didn't add a blank line at the end of the file, you are likely to get the following warning:
Warning message:
In read.table(file = file, header = header, sep = sep, quote = quote, :
incomplete final line found by readTableHeader on 'Chi_Square.csv'
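To try the reading step without saving a file, the same text can be passed to read.csv() through its text argument. A minimal sketch, where csv_text is a made-up name. Because the header line has one field fewer than the data lines, R uses the first column (Treated, Not Treated) as row names, which leaves a 2 x 2 table of counts.

```r
# The same three lines as in chisq.csv, ending with a newline
csv_text <- "Heart Rate Increased, No Heart Rate Increase
Treated, 36,14
Not Treated, 30, 25
"

# text = reads from a string instead of a file
x <- read.csv(text = csv_text)

print(dim(x))        # rows and columns
print(rownames(x))   # taken from the first column of the data
print(x)
```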
Either way, let's run the test:
> chisq.test(x, correct=F)
Output:
Pearson's Chi-squared test
data: x
X-squared = 3.4177, df = 1, p-value = 0.0645
Look at the p-value.
The p-value of 0.0645 is greater than the conventional significance level of 0.05, so we fail to reject the null hypothesis. In other words, there is no statistically significant association between drug treatment and the proportion of animals whose heart rate increased.
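To see where the X-squared value comes from, the statistic can be recomputed by hand from the observed and expected counts. This is a sketch that enters the counts directly as a matrix instead of reading the file; the names obs, expected, stat and p are made up for illustration.

```r
# Observed counts, entered directly instead of read from chisq.csv
obs <- matrix(c(36, 14,
                30, 25),
              nrow = 2, byrow = TRUE,
              dimnames = list(c("Treated", "Not Treated"),
                              c("Heart Rate Increased", "No Heart Rate Increase")))

# Pearson's chi-squared test without continuity correction, as above
res <- chisq.test(obs, correct = FALSE)

# Expected counts under independence: row total * column total / grand total
expected <- outer(rowSums(obs), colSums(obs)) / sum(obs)

# Statistic: sum of (observed - expected)^2 / expected over all four cells
stat <- sum((obs - expected)^2 / expected)

# p-value from the chi-squared distribution with (2 - 1) * (2 - 1) = 1 df
p <- pchisq(stat, df = 1, lower.tail = FALSE)

print(round(stat, 4))   # should agree with the X-squared value reported above
print(round(p, 4))      # should agree with the p-value reported above
```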
Friday, October 4, 2013
Notes
Discrete data arise from a counting process, while continuous data arise from a measuring process.
Chi square tests can only be used on actual counts and not on percentages, proportions, means, etc.
Wednesday, October 2, 2013


