https://gexijin.github.io/learnR/index.html
* Preface
* 1 Step into R programming: the iris flower dataset
+ 1.1 Getting started
+ 1.2 Data frames have rows and columns: the Iris flower
dataset
+ 1.3 Analyzing one set of numbers
+ 1.4 Analyzing a column of categorical values
+ 1.5 Analyzing the relationship between two sets of numbers
+ 1.6 Testing the differences between two groups
+ 1.7 Testing the difference among multiple groups (ANOVA)
* 2 Visualizing the iris flower data set
+ 2.1 Basic concepts of R graphics
+ 2.2 The ggplot2 package makes plotting intuitive
+ 2.3 Scatter plots matrix
+ 2.4 Star and segment diagrams
+ 2.5 Parallel coordinate plot
+ 2.6 Bar plot with error bar
+ 2.7 Box plot and death to the dynamite plots
+ 2.8 Combining plots
+ 2.9 Hierarchical clustering and heat map
+ 2.10 Projecting high-dimensional data with principal
component analysis (PCA)
+ 2.11 Classification: Predicting the odds of binary outcomes
* 3 Data structures
+ 3.1 Basic concepts
o 3.1.1 Expressions
o 3.1.2 Logical Values
o 3.1.3 Variables
o 3.1.4 Functions
o 3.1.5 Looking for help and example Code
+ 3.2 Data structures
o 3.2.1 Vectors
o 3.2.2 Matrices
o 3.2.3 Data Frames
o 3.2.4 Strings and string vectors
o 3.2.5 Lists
* 4 Importing data and managing files
+ 4.1 Enter data manually
+ 4.2 Project-oriented workflow
o 4.2.1 Create a project in a new folder
o 4.2.2 Create a script file and comment (!)
o 4.2.3 Copy data files to the new directory
o 4.2.4 Import data files
o 4.2.5 Check and convert data types
o 4.2.6 Close a project when you are done
+ 4.3 Reading files directly using read.table
+ 4.4 General procedure to read data into R:
+ 4.5 Data manipulation in a data frame
+ 4.6 Data transformation using dplyr
* 5 The heart attack data set (I)
+ 5.1 Begin your analysis by examining each column separately
+ 5.2 Possible correlation between two numeric columns?
+ 5.3 Associations between categorical variables?
+ 5.4 Associations between a categorical and a numeric
variable?
+ 5.5 Associations between multiple columns?
* 6 The heart attack dataset (II)
+ 6.1 Scatter plot in ggplot2
+ 6.2 Histograms and density plots
+ 6.3 Box plots and Violin plots
+ 6.4 Bar plot with error bars
+ 6.5 Statistical models are easy; interpretations and
verifications are not!
* 7 Advanced topics
+ 7.1 Introduction to R Markdown
+ 7.2 Tidyverse
+ 7.3 Interactive plots made easy with Plotly
+ 7.4 Shiny Apps
o 7.4.1 Install the Shiny package by typing this in the
console.
o 7.4.2 Creating a Shiny web app is a piece of cake
o 7.4.3 Let's play!
o 7.4.4 Publish your app
+ 7.5 Define your own function
* 8 The state dataset
+ 8.1 Reading in and manipulating data
+ 8.2 Visualizing data
+ 8.3 Analyzing the relationship among variables
+ 8.4 The whole picture of the data set
+ 8.5 Linear model analysis
+ 8.6 Conclusion
* 9 The game sales dataset
+ 9.1 Visualization of categorical variables
+ 9.2 Correlation among numeric variables
+ 9.3 Analysis of score and count
+ 9.4 Analysis of sales
o 9.4.1 By Year.Release
o 9.4.2 By Region
o 9.4.3 By Rating
o 9.4.4 By Genre
o 9.4.5 By Score
o 9.4.6 By Rating & Genre & Critic score
o 9.4.7 By Platform
+ 9.5 Effect of platform type on principal components
+ 9.6 Models for global sales
+ 9.7 Conclusion
* 10 The messy salary data
+ 10.1 Read in and clean data
+ 10.2 Information about the variables
o 10.2.1 Income and age
o 10.2.2 Ethnic group
o 10.2.3 Sex
o 10.2.4 Union description
+ 10.3 Analysis of gross income
o 10.3.1 Age
o 10.3.2 Sex
o 10.3.3 Ethnic Group
+ 10.4 Analysis of gross income type
o 10.4.1 Gross Type
o 10.4.2 Ethnic and Sex and age
+ 10.5 LM and ANOVA analysis
+ 10.6 Conclusion
Learn R through examples
Xijin Ge, Jianli Qi and Rong Fan
2020-05-26
Preface
Aimed at total beginners, this book is written based on the
philosophy that people learn faster when they are shown examples and
case studies. Instead of explaining the rules, the book largely
centers on the analysis of several datasets from the very beginning.
So this is an alternative to traditional, more rigorous textbooks on
R programming. We start with small and clean datasets and gradually
transition into big, messy ones. With each dataset, we hope to tell a
story through the analysis. We invite you, our courageous reader, to
take on this journey with us. Motivated readers, such as biologists,
could easily work their way through this book and learn by
themselves. I would encourage you to type in the example code, see
the outputs, and then work on the challenges and exercises.
This book originally started as materials for 2-hour hands-on workshops
intended to give a quick introduction/demonstration for students and
researchers who are totally new to R. The workshop has been given
many times to different audiences ranging from high-school students
to mathematicians. For a 2-hour session, I have to keep it gentle,
interactive, and fun, sometimes at the expense of rigor. Instead of
explaining all the rules, grammar, and syntax, I found it easier
to focus on one dataset and walk the audience through some of the
analyses possible with R. This material later evolved into a
one-credit online class and then a three-credit class. We stick with the
unconventional approach of focusing on datasets and examples.
Many students have contributed to this material. Notably, Quazi Irfan,
who worked as a teaching assistant, fixed many errors and gave
constructive feedback. In the fall of 2018, a group of highly
motivated students in the STAT 442 Exploratory Data Analysis class worked
on some of the datasets presented here. They are Samuel Ivanecky,
Kory Heier, Audrey Bunge, Jacie McDonald, Shae Olson, Nathan
Thirsten, and Alex Wieseler. Some of the plots in this book are
inspired by them.
Any comments and suggestions to make this book better would be
welcome. This includes typos, errors, and organizational issues. The
best place to reach out is through the GitHub issues page. If you
would rather not create yet another account, you can email us at
Xijin.Ge@sdstate.edu.
Chapter 1 Step into R programming
Chapter 2 Visualizing data set
Chapter 3 Data structures
Chapter 4 Data importing
Chapter 5 Heart attack data set I
Chapter 6 Heart attack data set II
Chapter 7 Advanced topics
Chapter 8 State data set
Chapter 9 Game sale data set
Chapter 10 Employee salary data set