Vedrana B.
The Data Science Specialization: Developing Data Products
17.05.2015.
Problem description:
Problem solution:
The data come from Francis Galton's famous 1885 study of the relationship between the heights of children and their parents.
The variables are child (the child's height) and parent (the midparent height).
The units are inches.
The number of cases is 928, representing 928 children from 205 pairs of parents.
library(UsingR); require(base64enc); require(rCharts)
data(galton)
options(RCHART_WIDTH = 600, RCHART_HEIGHT = 300)
knitr::opts_chunk$set(comment = NA, results = 'asis', tidy = FALSE, message = TRUE)
g1 <- nPlot(child ~ parent, data = galton, type = 'scatterChart')
g1$show('inline', include_assets = TRUE)
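The figures quoted for the dataset can be checked directly. The short sketch below assumes only that the UsingR package is installed; it prints the dimensions, structure and summary of the `galton` data frame.

```r
library(UsingR)  # provides the galton data frame
data(galton)

# Number of cases (rows) and variables (columns)
dim(galton)

# Variable names and types: child and parent, both heights in inches
str(galton)

# Range and quartiles of each variable
summary(galton)
```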
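A linear model assumes a linear relationship between the input variable (parent height) and the output variable (child height):
\[h_{\text{child}} = \alpha \cdot h_{\text{parent}} + \beta\]
where \(\alpha\) is the slope and \(\beta\) is the intercept. To build the model and predict a child's height: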
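# Fit child height as a linear function of midparent height
model <- lm(formula = child ~ parent, data = galton2)
# Midparent height from the app inputs: father (hF) and mother (hM, scaled by 1.08)
p <- (as.numeric(input$hF) + 1.08*as.numeric(input$hM))/2
# Predicted child height for that midparent height
c <- predict(model, data.frame(parent = p))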
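The snippet above runs inside the Shiny app, where `input$hF` and `input$hM` come from the user interface. A minimal standalone sketch of the same calculation might look as follows. The helper names `midparent` and `predict_child` are hypothetical, and the model is fitted on the `galton` data frame from UsingR, which is an assumption, since the definition of `galton2` is not shown in these slides.

```r
library(UsingR)  # provides the galton data frame
data(galton)

# Hypothetical helper: midparent height, with the mother's height scaled by 1.08
midparent <- function(father, mother) {
  (as.numeric(father) + 1.08 * as.numeric(mother)) / 2
}

# Assumption: fit on galton directly instead of the app's galton2
model <- lm(child ~ parent, data = galton)

# Hypothetical helper: predicted child height (inches) for given parent heights
predict_child <- function(father, mother) {
  new_data <- data.frame(parent = midparent(father, mother))
  unname(predict(model, newdata = new_data))
}

# Estimated intercept (beta) and slope (alpha) of the fitted line
coef(model)

# Example: father 70 inches tall, mother 65 inches tall
predict_child(70, 65)
```

Wrapping the calculation in functions keeps the 1.08 scaling in one place, so the same code can serve both the app and a quick check at the console.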
library(ggplot2)
# Common axis limits covering both variables, padded by one inch
limits <- c(min(galton) - 1, max(galton) + 1)
ggplot(data = galton, aes(x = parent, y = child)) +
  geom_point(color = "red", alpha = 0.2, size = 3) +
  labs(x = "Parent's height", y = "Child's height") +
  labs(title = "LM prediction using Galton's dataset") +
  coord_cartesian(xlim = limits, ylim = limits) +
  geom_smooth(method = 'lm') +
  guides(color = "none", fill = "none")
Available at: http://vedra.shinyapps.io/PAshiny/
App source code: https://github.com/vedra/ShinyApp
Materials on LM prediction, Shiny, Slidify etc: www.coursera.com
Description of dataset: http://www.math.uah.edu/stat/data/Galton.html
Image source: www.pixshark.com
Thank you!