Wednesday, January 14, 2015

R Programming 2 : Loading Data


Loading a text file into R on Linux:
Place the file in your home directory, for example /home/username/
            d = read.table("/home/username/diva.txt",sep="\t")
            print(d)

            OR

            d = read.table("foobar.txt", sep="\t", col.names=c("id", "name"), fill=FALSE,
               strip.white=TRUE)
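
To try read.table without depending on a particular file, here is a minimal sketch that writes a small tab-separated file to a temporary location and reads it back. The file contents and the names Ann and Bob are made up for illustration.

```r
# Write a small tab-separated file to a temporary location (hypothetical data).
path <- tempfile(fileext = ".txt")
writeLines(c("1\tAnn", "2\tBob "), path)

# The file has no header row, so the column names are supplied explicitly.
# strip.white=TRUE trims leading/trailing spaces from unquoted character fields.
d <- read.table(path, sep = "\t", col.names = c("id", "name"),
                strip.white = TRUE, stringsAsFactors = FALSE)
print(d)
```

stringsAsFactors=FALSE keeps the name column as plain character strings in older R versions, where the default was to convert strings to factors.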

Loading CSV file:
data <- read.csv(file.choose(),header=TRUE)

file.choose() opens a file dialog so that the user can pick the file interactively instead of typing its path.
data
    User First.Name     Sal
1    53          R   50000
2    73         Ra   76575
3    72         An  786776
4    71         Aa    5456
5    68         Ni 7867986
The data frame has 5 observations of 3 variables.

The sep argument sets the field separator, for example "," or "|":
data2 <- read.csv(file.choose(),header=TRUE,sep=",")
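
Since file.choose() needs an interactive session, the sketch below uses a temporary file instead to show a pipe-separated file being read. The variable names path and data3 are assumptions for this example; the rows reuse values from the table above.

```r
# Create a small pipe-separated file in a temporary location.
path <- tempfile(fileext = ".csv")
writeLines(c("User|First.Name|Sal",
             "53|R|50000",
             "73|Ra|76575"), path)

# header=TRUE takes the column names from the first line; sep="|" splits on pipes.
data3 <- read.csv(path, header = TRUE, sep = "|")
str(data3)
```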

----------------------------------------------------
dim : Returns the dimensions of an object in R, that is, the number of rows followed by the number of columns.

dim(cars)
[1] 50  2

Here there are 50 rows and 2 columns.
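
As a quick check of which number is which, this short sketch compares dim with nrow and ncol on the built-in cars dataset. The variable names are chosen only for illustration.

```r
# dim() returns c(rows, columns); nrow() and ncol() return each one separately.
d <- dim(cars)
n_rows <- nrow(cars)
n_cols <- ncol(cars)
cat("rows:", n_rows, "columns:", n_cols, "\n")
```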
---------------------
head and tail commands:
head(cars) : the head command returns the first 6 rows of the object by default.
  speed dist
1     4    2
2     4   10
3     7    4
4     7   22
5     8   16
6     9   10

The tail command returns the last 6 rows by default.
tail(cars)
   speed dist
45    23   54
46    24   70
47    24   92
48    24   93
49    24  120
50    25   85
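
Both commands accept an n argument when a number of rows other than 6 is wanted. A small sketch on the same cars dataset, with first3 and last2 as illustrative names:

```r
# n controls how many rows are returned (the default is 6).
first3 <- head(cars, n = 3)
last2 <- tail(cars, n = 2)
print(first3)
print(last2)
```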
-----------------------------------------
Basic Commands to Explore data:
data2[c(1,2,3),]
data2[5:9,]
names(cars)
mean(cars$dist)
attach(cars)
detach(cars)
summary(cars)
class(gender)   # reports the class of an object, e.g. "factor" for a categorical variable such as gender
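
The commands above can be tried on the built-in cars dataset. This is only a sketch: the gender vector is a made-up example, and with() is used here as one possible alternative to attach/detach for referring to columns by name.

```r
# Select rows by position; leaving the column index empty keeps all columns.
rows_1_to_3 <- cars[c(1, 2, 3), ]

# Column names and a column mean.
col_names <- names(cars)
mean_dist <- with(cars, mean(dist))   # same value as mean(cars$dist)

# R is case-sensitive: the function is summary(), not Summary().
print(summary(cars))

# A hypothetical categorical variable; class() reports its type.
gender <- factor(c("M", "F", "F"))
print(class(gender))
```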

Merge Data:
By default, merge keeps only the cases that are common to both datasets:
mydata <- merge(mydata1, mydata3, by=c("country","year"))

Adding the option all=TRUE includes all cases from both datasets:
mydata <- merge(mydata1, mydata3, by=c("country","year"), all=TRUE)

Many to One
mydata <- merge(mydata1, mydata4, by=c("country"))
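
To see how the three merges differ, here is a minimal sketch with small made-up data frames. The contents of mydata1, mydata3 and mydata4, and the column names x, y and region, are assumptions for illustration.

```r
# Hypothetical country-year data.
mydata1 <- data.frame(country = c("A", "A", "B"),
                      year    = c(2000, 2001, 2000),
                      x       = c(1, 2, 3))
mydata3 <- data.frame(country = c("A", "B", "C"),
                      year    = c(2000, 2000, 2000),
                      y       = c(10, 30, 50))

# Default merge: only country-year pairs present in both datasets.
inner <- merge(mydata1, mydata3, by = c("country", "year"))

# all=TRUE: every pair from either dataset, with NA where a value is missing.
outer <- merge(mydata1, mydata3, by = c("country", "year"), all = TRUE)

# Many to one: one row per country is matched to every year of that country.
mydata4 <- data.frame(country = c("A", "B"),
                      region  = c("North", "South"))
many_to_one <- merge(mydata1, mydata4, by = "country")

print(inner)
print(outer)
print(many_to_one)
```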

mydata_sorted <- mydata[order(mydata$country, mydata$year),]

attach(mydata_sorted)
detach(mydata_sorted)
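
Sorting with order can be sketched on a small made-up data frame. The columns are referenced through the data frame itself, so the example does not rely on attach. The values below are hypothetical.

```r
# Hypothetical unsorted data.
mydata <- data.frame(country = c("B", "A", "A"),
                     year    = c(2000, 2001, 2000),
                     x       = c(3, 2, 1))

# order() returns the row positions that sort by country, then by year.
mydata_sorted <- mydata[order(mydata$country, mydata$year), ]
print(mydata_sorted)
```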
