Start Final Project
parent de414a2e37
commit 4894d728bb
169
EDA_Project/EDA_Project.rmd
Normal file
@@ -0,0 +1,169 @@
---
title: "EDA_Project"
author: "Dusty P"
date: "May 31, 2018"
output: html_document
---

```{r echo=FALSE, message=FALSE, warning=FALSE, setup}
knitr::opts_knit$set(root.dir = normalizePath("C:/Users/Dusty/Documents/coding/projects/Udacity/Data Analysis/eda/EDA_Project"))

# load the ggplot graphics package and the others
library(ggplot2)
library(GGally)
library(scales)
library(memisc)
library(gridExtra)
library(RColorBrewer)
library(bitops)
library(RCurl)

cuberoot_trans = function() trans_new('cuberoot', transform = function(x) x^(1/3),
                                      inverse = function(x) x^3)
```

# Exploration of White Wines by Dustin Pianalto

This report explores a dataset containing chemical information and ratings on almost 4900 white wine tastings.

```{r echo=FALSE, message=FALSE, warning=FALSE, Load_the_Data}
# Load the Data
wqw <- read.csv('wineQualityWhites.csv')
```

# Univariate Plots Section

> **Tip**: In this section, you should perform some preliminary exploration of
your dataset. Run some summaries of the data and create univariate plots to
understand the structure of the individual variables in your dataset. Don't
forget to add a comment after each plot or closely-related group of plots!
There should be multiple code chunks and text sections; the first one below is
just to help you get started.

```{r echo=FALSE, Univariate_Plots}

```

> **Tip**: Make sure that you leave a blank line between the start / end of
each code block and the end / start of your Markdown text so that it is
formatted nicely in the knitted text. Note as well that text on consecutive
lines is treated as a single space. Make sure you have a blank line between
your paragraphs so that they too are formatted for easy readability.

# Univariate Analysis

> **Tip**: Now that you've completed your univariate explorations, it's time to
reflect on and summarize what you've found. Use the questions below to help you
gather your observations and add your own if you have other thoughts!

### What is the structure of your dataset?

### What is/are the main feature(s) of interest in your dataset?

### What other features in the dataset do you think will help support your \
investigation into your feature(s) of interest?

### Did you create any new variables from existing variables in the dataset?

### Of the features you investigated, were there any unusual distributions? \
Did you perform any operations on the data to tidy, adjust, or change the form \
of the data? If so, why did you do this?


# Bivariate Plots Section

> **Tip**: Based on what you saw in the univariate plots, what relationships
between variables might be interesting to look at in this section? Don't limit
yourself to relationships between a main output feature and one of the
supporting variables. Try to look at relationships between supporting variables
as well.

```{r echo=FALSE, Bivariate_Plots}

```

# Bivariate Analysis

> **Tip**: As before, summarize what you found in your bivariate explorations
here. Use the questions below to guide your discussion.

### Talk about some of the relationships you observed in this part of the \
investigation. How did the feature(s) of interest vary with other features in \
the dataset?

### Did you observe any interesting relationships between the other features \
(not the main feature(s) of interest)?

### What was the strongest relationship you found?


# Multivariate Plots Section

> **Tip**: Now it's time to put everything together. Based on what you found in
the bivariate plots section, create a few multivariate plots to investigate
more complex interactions between variables. Make sure that the plots that you
create here are justified by the plots you explored in the previous section. If
you plan on creating any mathematical models, this is the section where you
will do that.

```{r echo=FALSE, Multivariate_Plots}

```

# Multivariate Analysis

### Talk about some of the relationships you observed in this part of the \
investigation. Were there features that strengthened each other in terms of \
looking at your feature(s) of interest?

### Were there any interesting or surprising interactions between features?

### OPTIONAL: Did you create any models with your dataset? Discuss the \
strengths and limitations of your model.

------

# Final Plots and Summary

> **Tip**: You've done a lot of exploration and have built up an understanding
of the structure of and relationships between the variables in your dataset.
Here, you will select three plots from all of your previous exploration to
present here as a summary of some of your most interesting findings. Make sure
that you have refined your selected plots for good titling, axis labels (with
units), and good aesthetic choices (e.g. color, transparency). After each plot,
make sure you justify why you chose each plot by describing what it shows.

### Plot One
```{r echo=FALSE, Plot_One}

```

### Description One


### Plot Two
```{r echo=FALSE, Plot_Two}

```

### Description Two


### Plot Three
```{r echo=FALSE, Plot_Three}

```

### Description Three

------

# Reflection

> **Tip**: Here's the final step! Reflect on the exploration you performed and
the insights you found. What were some of the struggles that you went through?
What went well? What was surprising? Make sure you include an insight into
future work that could be done with the dataset.

> **Tip**: Don't forget to remove this, and the other **Tip** sections before
saving your final work and knitting the final report!
173
EDA_Project/projecttemplate.rmd
Normal file
@@ -0,0 +1,173 @@
TITLE by YOUR_NAME_HERE
========================================================

> **Tip**: You will see quoted sections like this throughout the template to
help you construct your report. Make sure that you remove these notes before
you finish and submit your project!

> **Tip**: One of the requirements of this project is that your code follows
good formatting techniques, including limiting your lines to 80 characters or
less. If you're using RStudio, go into Preferences \> Code \> Display to set up
a margin line to help you keep track of this guideline!

```{r echo=FALSE, message=FALSE, warning=FALSE, packages}
# Load all of the packages that you end up using in your analysis in this code
# chunk.

# Notice that the parameter "echo" was set to FALSE for this code chunk. This
# prevents the code from displaying in the knitted HTML output. You should set
# echo=FALSE for all code chunks in your file, unless it makes sense for your
# report to show the code that generated a particular plot.

# The other parameters for "message" and "warning" should also be set to FALSE
# for other code chunks once you have verified that each plot comes out as you
# want it to. This will clean up the flow of your report.

library(ggplot2)
```

```{r echo=FALSE, Load_the_Data}
# Load the Data

```

> **Tip**: Before you create any plots, it is a good idea to provide a short
introduction into the dataset that you are planning to explore. Replace this
quoted text with that general information!

# Univariate Plots Section

> **Tip**: In this section, you should perform some preliminary exploration of
your dataset. Run some summaries of the data and create univariate plots to
understand the structure of the individual variables in your dataset. Don't
forget to add a comment after each plot or closely-related group of plots!
There should be multiple code chunks and text sections; the first one below is
just to help you get started.

```{r echo=FALSE, Univariate_Plots}

```

> **Tip**: Make sure that you leave a blank line between the start / end of
each code block and the end / start of your Markdown text so that it is
formatted nicely in the knitted text. Note as well that text on consecutive
lines is treated as a single space. Make sure you have a blank line between
your paragraphs so that they too are formatted for easy readability.

# Univariate Analysis

> **Tip**: Now that you've completed your univariate explorations, it's time to
reflect on and summarize what you've found. Use the questions below to help you
gather your observations and add your own if you have other thoughts!

### What is the structure of your dataset?

### What is/are the main feature(s) of interest in your dataset?

### What other features in the dataset do you think will help support your \
investigation into your feature(s) of interest?

### Did you create any new variables from existing variables in the dataset?

### Of the features you investigated, were there any unusual distributions? \
Did you perform any operations on the data to tidy, adjust, or change the form \
of the data? If so, why did you do this?


# Bivariate Plots Section

> **Tip**: Based on what you saw in the univariate plots, what relationships
between variables might be interesting to look at in this section? Don't limit
yourself to relationships between a main output feature and one of the
supporting variables. Try to look at relationships between supporting variables
as well.

```{r echo=FALSE, Bivariate_Plots}

```

# Bivariate Analysis

> **Tip**: As before, summarize what you found in your bivariate explorations
here. Use the questions below to guide your discussion.

### Talk about some of the relationships you observed in this part of the \
investigation. How did the feature(s) of interest vary with other features in \
the dataset?

### Did you observe any interesting relationships between the other features \
(not the main feature(s) of interest)?

### What was the strongest relationship you found?


# Multivariate Plots Section

> **Tip**: Now it's time to put everything together. Based on what you found in
the bivariate plots section, create a few multivariate plots to investigate
more complex interactions between variables. Make sure that the plots that you
create here are justified by the plots you explored in the previous section. If
you plan on creating any mathematical models, this is the section where you
will do that.

```{r echo=FALSE, Multivariate_Plots}

```

# Multivariate Analysis

### Talk about some of the relationships you observed in this part of the \
investigation. Were there features that strengthened each other in terms of \
looking at your feature(s) of interest?

### Were there any interesting or surprising interactions between features?

### OPTIONAL: Did you create any models with your dataset? Discuss the \
strengths and limitations of your model.

------

# Final Plots and Summary

> **Tip**: You've done a lot of exploration and have built up an understanding
of the structure of and relationships between the variables in your dataset.
Here, you will select three plots from all of your previous exploration to
present here as a summary of some of your most interesting findings. Make sure
that you have refined your selected plots for good titling, axis labels (with
units), and good aesthetic choices (e.g. color, transparency). After each plot,
make sure you justify why you chose each plot by describing what it shows.

### Plot One
```{r echo=FALSE, Plot_One}

```

### Description One


### Plot Two
```{r echo=FALSE, Plot_Two}

```

### Description Two


### Plot Three
```{r echo=FALSE, Plot_Three}

```

### Description Three

------

# Reflection

> **Tip**: Here's the final step! Reflect on the exploration you performed and
the insights you found. What were some of the struggles that you went through?
What went well? What was surprising? Make sure you include an insight into
future work that could be done with the dataset.

> **Tip**: Don't forget to remove this, and the other **Tip** sections before
saving your final work and knitting the final report!
4899
EDA_Project/wineQualityWhites.csv
Normal file
File diff suppressed because it is too large