GitHub - aksmit94/getdata-101_Course_Project: This repository contains files specific to the submission for Course Project of Getting and Cleaning Data course from Data Science Specialization of Coursera.

#READ ME

##Additional packages required: dplyr, stringr

###The following steps, applied in order, produce the tidy, long-format dataset required by the project

- Fetching the required files in R by read.table():
  - x_test as xtest
  - x_train as xtrain
  - y_test as ytest
  - y_train as ytrain
  - subject_test as subtst
  - subject_train as subtrn
  - features as features
  - activity_labels as actlbl
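
A minimal sketch of the reading step. The folder layout, the file names with a `.txt` extension, and the toy file contents are assumptions made so the example runs on its own; only the object names (xtest, xtrain, ...) come from the list above.

```r
# Toy stand-in for the unzipped dataset folder, so the sketch runs anywhere.
# The folder layout and file names are assumptions; adjust them to the real data.
data_dir <- file.path(tempdir(), "har_demo")
for (d in c("test", "train")) {
  dir.create(file.path(data_dir, d), recursive = TRUE, showWarnings = FALSE)
}
writeLines(c("0.1 0.2 0.3", "0.4 0.5 0.6"), file.path(data_dir, "test", "X_test.txt"))
writeLines("0.7 0.8 0.9", file.path(data_dir, "train", "X_train.txt"))
writeLines(c("1", "2"), file.path(data_dir, "test", "y_test.txt"))
writeLines("1", file.path(data_dir, "train", "y_train.txt"))
writeLines(c("2", "2"), file.path(data_dir, "test", "subject_test.txt"))
writeLines("1", file.path(data_dir, "train", "subject_train.txt"))
writeLines(c("1 tBodyAcc-mean()-X", "2 tBodyAcc-std()-X", "3 tBodyAcc-energy()-X"),
           file.path(data_dir, "features.txt"))
writeLines(c("1 WALKING", "2 WALKING_UPSTAIRS"),
           file.path(data_dir, "activity_labels.txt"))

# One read.table() call per file, using the object names listed above.
xtest    <- read.table(file.path(data_dir, "test", "X_test.txt"))
xtrain   <- read.table(file.path(data_dir, "train", "X_train.txt"))
ytest    <- read.table(file.path(data_dir, "test", "y_test.txt"))
ytrain   <- read.table(file.path(data_dir, "train", "y_train.txt"))
subtst   <- read.table(file.path(data_dir, "test", "subject_test.txt"))
subtrn   <- read.table(file.path(data_dir, "train", "subject_train.txt"))
features <- read.table(file.path(data_dir, "features.txt"), stringsAsFactors = FALSE)
actlbl   <- read.table(file.path(data_dir, "activity_labels.txt"), stringsAsFactors = FALSE)
```

Without a header, read.table() names the columns V1, V2, and so on, which is why features and actlbl each come back with an index column (V1) next to the name column (V2).
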
- Joining relevant data sets by rbind():
  - xtest & xtrain, stored in x.
  - ytest & ytrain, stored in y.
  - subtst & subtrn, stored in sub.
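
The row-binding step could look roughly like this. The small data frames are invented for illustration; the point is that test and train are stacked in the same order for all three objects, so that row i of x, y and sub still describes the same observation.

```r
# Invented toy inputs standing in for the objects read earlier.
xtest  <- data.frame(V1 = c(0.1, 0.4), V2 = c(0.2, 0.5))
xtrain <- data.frame(V1 = 0.7, V2 = 0.8)
ytest  <- data.frame(V1 = c(1, 2))
ytrain <- data.frame(V1 = 1)
subtst <- data.frame(V1 = c(2, 2))
subtrn <- data.frame(V1 = 1)

# Always test first, then train, so the rows stay aligned across x, y and sub.
x   <- rbind(xtest, xtrain)
y   <- rbind(ytest, ytrain)
sub <- rbind(subtst, subtrn)
```
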
- Removing the redundant first column by select() from the datasets whose first column is only a row index (a numeric vector from 1 to the number of rows)
- Removing all "()" present in the feature names in features by gsub()
- Storing in feat the indices of the features whose names contain the strings "mean" or "std" by grep(..., fixed = F); this regular-expression match also includes the much-debated "meanFreq" features
- Subsetting according to feat by the [ ] operator:
  - columns of x, into x
  - rows of features, into features
- Setting the variable names (column names) of x from the features vector obtained above by names()
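
A sketch of the feature-selection steps on a toy features table. Two things here are assumptions rather than the script's actual code: the pattern `"mean|std"` (the README elides the grep() arguments), and the base-R `[ , -1]` indexing, which stands in for the select() call so the example needs no extra packages.

```r
# Toy stand-in for features as read by read.table(): index column V1, name column V2.
features <- data.frame(
  V1 = 1:5,
  V2 = c("tBodyAcc-mean()-X", "tBodyAcc-std()-X", "tBodyAcc-energy()-X",
         "fBodyAcc-meanFreq()-X", "angle(X,gravityMean)"),
  stringsAsFactors = FALSE
)
x <- as.data.frame(matrix(seq_len(10), nrow = 2))  # 2 observations, 5 feature columns

# Drop the index column (base-R stand-in for a dplyr select() call).
features <- features[, -1, drop = FALSE]

# Strip every literal "()" from the feature names.
features$V2 <- gsub("()", "", features$V2, fixed = TRUE)

# Regular-expression match (fixed = FALSE); the pattern itself is an assumption.
# "mean" also matches "meanFreq"; the capitalised "Mean" in angle(...) is not matched.
feat <- grep("mean|std", features$V2, fixed = FALSE)

# Keep the matching columns of x and the matching rows of features.
x <- x[, feat]
features <- features[feat, ]  # one-column data frame collapses to a character vector

# Use the cleaned feature names as the column names of x.
names(x) <- features
```
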
- Joining sub, y and x (in this order) into a resulting data frame named xysub by cbind()
- Assigning column names to xysub by names():
  - column corresponding to sub, i.e. 1st column: "Subject"
  - column corresponding to y, i.e. 2nd column: "Activity_Label"
  - columns corresponding to x, i.e. 3rd to 81st columns: features
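
The column-binding and naming steps, sketched on invented toy data with two feature columns instead of the 79 in the real dataset:

```r
# Invented toy inputs: three observations, two already-selected feature columns.
sub <- data.frame(V1 = c(2, 2, 1))
y   <- data.frame(V1 = c(1, 2, 1))
x   <- data.frame(a = c(0.1, 0.4, 0.7), b = c(0.2, 0.5, 0.8))
features <- c("tBodyAcc-mean-X", "tBodyAcc-std-X")

# Subject first, then activity code, then the measurements.
xysub <- cbind(sub, y, x)

# Name the two id columns explicitly and reuse the feature names for the rest.
names(xysub) <- c("Subject", "Activity_Label", features)
```
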
- Making name vector lblnames from actlbl by as.character()
- Replacing the numeric Activity_Label entries (2nd column of xysub) with the corresponding activity names from actlbl by `for(i in 1:6) {xysub$Activity_Label[xysub$Activity_Label == i] <- lblnames[i]}`
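
A small sketch of the label replacement on toy data. lblnames is written out by hand here, which is an assumption about what the as.character() step yields; the vectorised lookup at the end is an alternative shown for comparison, not something the README describes.

```r
# Toy xysub with numeric activity codes.
xysub <- data.frame(Subject = c(2, 2, 1), Activity_Label = c(1, 2, 1))
codes <- xysub$Activity_Label  # keep the numeric codes for the comparison below

# Activity names in code order (assumed content of lblnames).
lblnames <- c("WALKING", "WALKING_UPSTAIRS", "WALKING_DOWNSTAIRS",
              "SITTING", "STANDING", "LAYING")

# Loop from the README: after the first pass the column is character,
# and later comparisons such as "2" == 2 still match through coercion.
for (i in 1:6) {
  xysub$Activity_Label[xysub$Activity_Label == i] <- lblnames[i]
}

# Alternative: index the name vector directly with the numeric codes.
vectorised <- lblnames[codes]
```
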
- Grouping xysub by Subject and Activity_Label by group_by()
- Creating new data frame tld (tidy long data) containing the "average of each variable for each activity and each subject" from xysub by summarise_each()
- Prepending "Avg-" to all column names except the first two in tld by `ifelse(names(tld)%in%c("Subject", "Activity_Label"), str_c('', names(tld)), str_c('Avg-', names(tld)))`
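
A sketch of the grouping, averaging and renaming steps on toy data. summarise_each() is deprecated in current dplyr, so this example swaps in `summarise(across(everything(), mean))`, which assumes dplyr 1.0.0 or later; the ifelse() renaming follows the expression given above.

```r
library(dplyr)
library(stringr)

# Toy xysub: subject 1 has two WALKING rows, so their values get averaged.
xysub <- data.frame(
  Subject = c(1, 1, 1, 2),
  Activity_Label = c("WALKING", "WALKING", "SITTING", "WALKING"),
  `tBodyAcc-mean-X` = c(2, 4, 1, 5),
  `tBodyAcc-std-X` = c(1, 3, 2, 4),
  check.names = FALSE  # keep the hyphens in the column names
)

# Mean of every measurement column per (Subject, Activity_Label) pair.
# across(everything(), ...) skips the grouping columns automatically.
tld <- xysub %>%
  group_by(Subject, Activity_Label) %>%
  summarise(across(everything(), mean), .groups = "drop")

# Prefix "Avg-" to every column except the two id columns.
names(tld) <- ifelse(names(tld) %in% c("Subject", "Activity_Label"),
                     str_c("", names(tld)),
                     str_c("Avg-", names(tld)))
```

The result has one row per subject and activity combination present in the data, three rows for this toy input.
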
- Writing tld to "Tidy_Long_Data.txt" by write.table(tld, file = "Tidy_Long_Data.txt", row.name = F)
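
A sketch of the final write, plus a read-back check. Writing into tempdir() and the toy contents of tld are assumptions made to keep the example self-contained; the script itself presumably writes to the working directory.

```r
# Toy tld with the hyphenated column names produced by the renaming step.
tld <- data.frame(
  Subject = c(1, 2),
  Activity_Label = c("WALKING", "SITTING"),
  `Avg-tBodyAcc-mean-X` = c(0.3, 0.5),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Write without row names; a temporary folder is used here to avoid clutter.
out_file <- file.path(tempdir(), "Tidy_Long_Data.txt")
write.table(tld, file = out_file, row.names = FALSE)

# Read the file back. header = TRUE restores the column names, and
# check.names = FALSE stops R from turning "Avg-..." into "Avg....".
check <- read.table(out_file, header = TRUE, check.names = FALSE,
                    stringsAsFactors = FALSE)
```

Anyone loading the submitted file can use the same read.table() arguments to get the original column names back.
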
