Start Final Project
parent de414a2e37
commit 4894d728bb
169
EDA_Project/EDA_Project.rmd
Normal file
@@ -0,0 +1,169 @@
---
title: "EDA_Project"
author: "Dusty P"
date: "May 31, 2018"
output: html_document
---

```{r echo=FALSE, message=FALSE, warning=FALSE, setup}
knitr::opts_knit$set(root.dir = normalizePath("C:/Users/Dusty/Documents/coding/projects/Udacity/Data Analysis/eda/EDA_Project"))

# load the ggplot graphics package and the others
library(ggplot2)
library(GGally)
library(scales)
library(memisc)
library(gridExtra)
library(RColorBrewer)
library(bitops)
library(RCurl)

cuberoot_trans = function() trans_new('cuberoot', transform = function(x) x^(1/3),
                                      inverse = function(x) x^3)
```
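
A note on the `cuberoot_trans` helper defined in the setup chunk: in R, raising a negative number to the power `1/3` yields `NaN`, so the transform as written is only safe for non-negative values (which is probably fine for wine measurements). The sketch below uses base R only to illustrate this and to show one possible sign-preserving variant. The names `cuberoot` and `signed_cuberoot` are hypothetical and not part of the commit.

```r
# Hypothetical helpers for illustration; not part of the original commit.
cuberoot <- function(x) x^(1/3)                        # same formula as cuberoot_trans
signed_cuberoot <- function(x) sign(x) * abs(x)^(1/3)  # assumed sign-preserving variant

x <- c(-8, 0, 27)
plain  <- cuberoot(x)         # NaN for the negative entry
signed <- signed_cuberoot(x)  # defined for every entry

# The inverse used in the commit (x^3) recovers the inputs of the signed variant.
recovered <- signed^3
```

If every plotted variable is non-negative, the original definition should behave the same as the signed variant, so this matters only if negative values ever reach the scale.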

# Exploration of White Wines by Dustin Pianalto

This report explores a dataset containing chemical properties and quality ratings for almost 4,900 white wines.

```{r echo=FALSE, message=FALSE, warning=FALSE, Load_the_Data}
# Load the Data
wqw <- read.csv('wineQualityWhites.csv')
```
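
After loading, a first look at the data frame's shape and column types is usually worthwhile. The sketch below is self-contained: it parses a small inline CSV instead of `wineQualityWhites.csv`, and the column names and values in it are assumptions made for illustration, not rows confirmed from the real file. `wqw_demo` is a hypothetical name.

```r
# Toy stand-in for wineQualityWhites.csv; column names and values are
# placeholders chosen for illustration only.
csv_text <- "X,residual.sugar,alcohol,quality
1,20.7,8.8,6
2,1.6,9.5,6
3,6.9,10.1,5
4,8.5,9.9,7"
wqw_demo <- read.csv(text = csv_text)

dims  <- dim(wqw_demo)            # number of rows and columns
types <- sapply(wqw_demo, class)  # class of each column
str(wqw_demo)                     # compact structure overview
summary(wqw_demo$quality)         # quartiles, mean, min and max
```

The same calls applied to `wqw` would answer the "structure of your dataset" question in the Univariate Analysis section.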

# Univariate Plots Section

> **Tip**: In this section, you should perform some preliminary exploration of
your dataset. Run some summaries of the data and create univariate plots to
understand the structure of the individual variables in your dataset. Don't
forget to add a comment after each plot or closely-related group of plots!
There should be multiple code chunks and text sections; the first one below is
just to help you get started.

```{r echo=FALSE, Univariate_Plots}

```
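
As a minimal sketch of the kind of univariate summary the tip describes, the block below tabulates a single variable and computes histogram bin counts without drawing anything. It uses base R rather than ggplot2 so that it runs without extra packages, and the `quality` vector holds placeholder scores, not real data.

```r
# Placeholder quality scores for illustration only (not real data).
quality <- c(5, 6, 6, 7, 5, 6, 8, 6, 7, 5)

counts <- table(quality)  # frequency of each distinct score

# One bin per integer score; plot = FALSE returns the bin counts only.
h <- hist(quality, breaks = seq(4.5, 8.5, by = 1), plot = FALSE)

center <- c(mean = mean(quality), median = median(quality))
```

Centering the breaks on half-integers keeps each integer score in its own bin, which avoids the merged bars that default breaks can produce for discrete ratings.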

> **Tip**: Make sure that you leave a blank line between the start / end of
each code block and the end / start of your Markdown text so that it is
formatted nicely in the knitted text. Note as well that a line break between
consecutive lines of text is treated as a single space. Make sure you have a
blank line between your paragraphs so that they too are formatted for easy
readability.

# Univariate Analysis

> **Tip**: Now that you've completed your univariate explorations, it's time to
reflect on and summarize what you've found. Use the questions below to help you
gather your observations and add your own if you have other thoughts!

### What is the structure of your dataset?

### What is/are the main feature(s) of interest in your dataset?

### What other features in the dataset do you think will help support your \
investigation into your feature(s) of interest?

### Did you create any new variables from existing variables in the dataset?

### Of the features you investigated, were there any unusual distributions? \
Did you perform any operations on the data to tidy, adjust, or change the form \
of the data? If so, why did you do this?


# Bivariate Plots Section

> **Tip**: Based on what you saw in the univariate plots, what relationships
between variables might be interesting to look at in this section? Don't limit
yourself to relationships between a main output feature and one of the
supporting variables. Try to look at relationships between supporting variables
as well.

```{r echo=FALSE, Bivariate_Plots}

```
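
One simple way to quantify a relationship between two variables before plotting it is a correlation coefficient. The sketch below is an illustration under assumptions: the `alcohol` and `quality` vectors are placeholder values, not real data, and nothing in the commit says which pairs the author intends to examine.

```r
# Placeholder values for illustration only (not real data).
alcohol <- c(8.8, 9.5, 10.1, 9.9, 11.0, 12.2)
quality <- c(5, 5, 6, 6, 7, 7)

r   <- cor(alcohol, quality)                       # Pearson (the default)
rho <- cor(alcohol, quality, method = "spearman")  # rank-based alternative
```

Because a quality score is an ordinal rating, the rank-based Spearman coefficient may be a reasonable companion to Pearson's.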

# Bivariate Analysis

> **Tip**: As before, summarize what you found in your bivariate explorations
here. Use the questions below to guide your discussion.

### Talk about some of the relationships you observed in this part of the \
investigation. How did the feature(s) of interest vary with other features in \
the dataset?

### Did you observe any interesting relationships between the other features \
(not the main feature(s) of interest)?

### What was the strongest relationship you found?


# Multivariate Plots Section

> **Tip**: Now it's time to put everything together. Based on what you found in
the bivariate plots section, create a few multivariate plots to investigate
more complex interactions between variables. Make sure that the plots that you
create here are justified by the plots you explored in the previous section. If
you plan on creating any mathematical models, this is the section where you
will do that.

```{r echo=FALSE, Multivariate_Plots}

```
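
If a model is built in this section, a linear model is one common starting point. The following is a minimal sketch only: the data frame `demo`, its column names, and its values are all placeholders for illustration, and the choice of predictors is an assumption rather than anything the commit prescribes.

```r
# Placeholder data for illustration only (not real data).
demo <- data.frame(
  alcohol        = c(8.8, 9.5, 10.1, 9.9, 11.0, 12.2),
  residual.sugar = c(20.7, 1.6, 6.9, 8.5, 1.2, 2.1),
  quality        = c(5, 5, 6, 6, 7, 7)
)

# Ordinary least squares with two predictors.
fit   <- lm(quality ~ alcohol + residual.sugar, data = demo)
coefs <- coef(fit)               # intercept and slopes
r2    <- summary(fit)$r.squared  # share of variance explained
```

With only six rows the fit itself is meaningless; the point is the shape of the calls, which would carry over to the full data frame.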

# Multivariate Analysis

### Talk about some of the relationships you observed in this part of the \
investigation. Were there features that strengthened each other in terms of \
looking at your feature(s) of interest?

### Were there any interesting or surprising interactions between features?

### OPTIONAL: Did you create any models with your dataset? Discuss the \
strengths and limitations of your model.

------

# Final Plots and Summary

> **Tip**: You've done a lot of exploration and have built up an understanding
of the structure of and relationships between the variables in your dataset.
Here, you will select three plots from all of your previous exploration to
present here as a summary of some of your most interesting findings. Make sure
that you have refined your selected plots for good titling, axis labels (with
units), and good aesthetic choices (e.g. color, transparency). After each plot,
make sure you justify why you chose each plot by describing what it shows.

### Plot One
```{r echo=FALSE, Plot_One}

```

### Description One


### Plot Two
```{r echo=FALSE, Plot_Two}

```

### Description Two


### Plot Three
```{r echo=FALSE, Plot_Three}

```

### Description Three

------

# Reflection

> **Tip**: Here's the final step! Reflect on the exploration you performed and
the insights you found. What were some of the struggles that you went through?
What went well? What was surprising? Make sure you include an insight into
future work that could be done with the dataset.

> **Tip**: Don't forget to remove this, and the other **Tip** sections before
saving your final work and knitting the final report!
173
EDA_Project/projecttemplate.rmd
Normal file
@@ -0,0 +1,173 @@
TITLE by YOUR_NAME_HERE
========================================================

> **Tip**: You will see quoted sections like this throughout the template to
help you construct your report. Make sure that you remove these notes before
you finish and submit your project!

> **Tip**: One of the requirements of this project is that your code follows
good formatting techniques, including limiting your lines to 80 characters or
less. If you're using RStudio, go into Preferences \> Code \> Display to set up
a margin line to help you keep track of this guideline!

```{r echo=FALSE, message=FALSE, warning=FALSE, packages}
# Load all of the packages that you end up using in your analysis in this code
# chunk.

# Notice that the parameter "echo" was set to FALSE for this code chunk. This
# prevents the code from displaying in the knitted HTML output. You should set
# echo=FALSE for all code chunks in your file, unless it makes sense for your
# report to show the code that generated a particular plot.

# The other parameters for "message" and "warning" should also be set to FALSE
# for other code chunks once you have verified that each plot comes out as you
# want it to. This will clean up the flow of your report.

library(ggplot2)
```

```{r echo=FALSE, Load_the_Data}
# Load the Data

```

> **Tip**: Before you create any plots, it is a good idea to provide a short
introduction into the dataset that you are planning to explore. Replace this
quoted text with that general information!

# Univariate Plots Section

> **Tip**: In this section, you should perform some preliminary exploration of
your dataset. Run some summaries of the data and create univariate plots to
understand the structure of the individual variables in your dataset. Don't
forget to add a comment after each plot or closely-related group of plots!
There should be multiple code chunks and text sections; the first one below is
just to help you get started.

```{r echo=FALSE, Univariate_Plots}

```

> **Tip**: Make sure that you leave a blank line between the start / end of
each code block and the end / start of your Markdown text so that it is
formatted nicely in the knitted text. Note as well that a line break between
consecutive lines of text is treated as a single space. Make sure you have a
blank line between your paragraphs so that they too are formatted for easy
readability.

# Univariate Analysis

> **Tip**: Now that you've completed your univariate explorations, it's time to
reflect on and summarize what you've found. Use the questions below to help you
gather your observations and add your own if you have other thoughts!

### What is the structure of your dataset?

### What is/are the main feature(s) of interest in your dataset?

### What other features in the dataset do you think will help support your \
investigation into your feature(s) of interest?

### Did you create any new variables from existing variables in the dataset?

### Of the features you investigated, were there any unusual distributions? \
Did you perform any operations on the data to tidy, adjust, or change the form \
of the data? If so, why did you do this?


# Bivariate Plots Section

> **Tip**: Based on what you saw in the univariate plots, what relationships
between variables might be interesting to look at in this section? Don't limit
yourself to relationships between a main output feature and one of the
supporting variables. Try to look at relationships between supporting variables
as well.

```{r echo=FALSE, Bivariate_Plots}

```

# Bivariate Analysis

> **Tip**: As before, summarize what you found in your bivariate explorations
here. Use the questions below to guide your discussion.

### Talk about some of the relationships you observed in this part of the \
investigation. How did the feature(s) of interest vary with other features in \
the dataset?

### Did you observe any interesting relationships between the other features \
(not the main feature(s) of interest)?

### What was the strongest relationship you found?


# Multivariate Plots Section

> **Tip**: Now it's time to put everything together. Based on what you found in
the bivariate plots section, create a few multivariate plots to investigate
more complex interactions between variables. Make sure that the plots that you
create here are justified by the plots you explored in the previous section. If
you plan on creating any mathematical models, this is the section where you
will do that.

```{r echo=FALSE, Multivariate_Plots}

```

# Multivariate Analysis

### Talk about some of the relationships you observed in this part of the \
investigation. Were there features that strengthened each other in terms of \
looking at your feature(s) of interest?

### Were there any interesting or surprising interactions between features?

### OPTIONAL: Did you create any models with your dataset? Discuss the \
strengths and limitations of your model.

------

# Final Plots and Summary

> **Tip**: You've done a lot of exploration and have built up an understanding
of the structure of and relationships between the variables in your dataset.
Here, you will select three plots from all of your previous exploration to
present here as a summary of some of your most interesting findings. Make sure
that you have refined your selected plots for good titling, axis labels (with
units), and good aesthetic choices (e.g. color, transparency). After each plot,
make sure you justify why you chose each plot by describing what it shows.

### Plot One
```{r echo=FALSE, Plot_One}

```

### Description One


### Plot Two
```{r echo=FALSE, Plot_Two}

```

### Description Two


### Plot Three
```{r echo=FALSE, Plot_Three}

```

### Description Three

------

# Reflection

> **Tip**: Here's the final step! Reflect on the exploration you performed and
the insights you found. What were some of the struggles that you went through?
What went well? What was surprising? Make sure you include an insight into
future work that could be done with the dataset.

> **Tip**: Don't forget to remove this, and the other **Tip** sections before
saving your final work and knitting the final report!
4899
EDA_Project/wineQualityWhites.csv
Normal file
File diff suppressed because it is too large