You will learn how to explore data patterns with visualisations.
Model
This is the only step where “math” enters the game. It ranges from simple descriptive statistics to more elaborate modelling strategies, and is often combined with visualisations.
Communicate
Write reports;
Choose appropriate visualisations;
Highlight the results in terms of insights.
Why R?
Replicability!
Excel vs R
Consider creating a new variable in your dataset:
Now you share your spreadsheet with your colleague.
She will not know the formula within each cell unless she inspects all of them individually!
Excel vs R
Consider the same data in R, stored in the object my_data:
And now your colleague has no doubt about where those numbers come from!
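A minimal sketch of what this might look like. The contents of my_data are not shown here, so the columns price and quantity and the new variable revenue are hypothetical names chosen only for illustration.

```r
# Hypothetical data: the real my_data is not shown, so these columns are assumptions
my_data <- data.frame(
  price    = c(10, 20, 15),
  quantity = c(3, 1, 4)
)

# The formula for the new variable is written once, in plain sight
my_data$revenue <- my_data$price * my_data$quantity

# Display the data with the new column
my_data
```

Anyone who receives the script can read the line that creates revenue and rerun it, instead of clicking through individual cells.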
Excel vs R
You are interested in computing the total number of products sold by each representative:
That’s a lot of point-and-clicks!
Will your colleague be able to reproduce your results?
Excel vs R
Consider the same dataset in R stored in the new object my_data2:
my_data2
# A tibble: 10 × 4
   SalesRepresentative Product QuantitySold Region
   <chr>               <chr>          <dbl> <chr>
 1 Eleonor             A                 50 IT
 2 Elizabeth           B                 90 UK
 3 John                C                 10 FR
 4 Matt                A                 34 DE
 5 Eleonor             A                  5 IT
 6 Elizabeth           C                  8 IT
 7 John                B                 20 UK
 8 Matt                B                  5 UK
 9 Eleonor             A                 15 UK
10 Elizabeth           C                 45 DE
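The total per representative can then be computed in a couple of lines. The sketch below is one possible way, not necessarily the one used in the course: it rebuilds the table as a plain data frame (an assumption, so that it runs without extra packages) and uses base R's aggregate(); the name totals is introduced here for illustration.

```r
# Rebuild my_data2 from the printed table (plain data frame instead of a tibble)
my_data2 <- data.frame(
  SalesRepresentative = c("Eleonor", "Elizabeth", "John", "Matt", "Eleonor",
                          "Elizabeth", "John", "Matt", "Eleonor", "Elizabeth"),
  Product             = c("A", "B", "C", "A", "A", "C", "B", "B", "A", "C"),
  QuantitySold        = c(50, 90, 10, 34, 5, 8, 20, 5, 15, 45),
  Region              = c("IT", "UK", "FR", "DE", "IT", "IT", "UK", "UK", "UK", "DE")
)

# Sum QuantitySold within each SalesRepresentative
totals <- aggregate(QuantitySold ~ SalesRepresentative, data = my_data2, FUN = sum)

# One row per representative with the total quantity sold
totals
```

With the data above, the totals are 70 for Eleonor, 143 for Elizabeth, 30 for John and 39 for Matt. Your colleague can reproduce them by rerunning the same lines.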
Click on the “Material” (💻) icon corresponding to today’s lab.
Access the interactive web console.
Basic R grammar
Run a command:
Place the cursor on the line and press CTRL+ENTER on Windows, or CMD+ENTER on macOS.
Assign a value to an object
a = 1        # numeric value assigned with =
b <- 2       # numeric value assigned with <-
c <- "ciao"  # character value
c            # display the content stored in c
[1] "ciao"
Basic math
a+b # sum the values stored in a and b
[1] 3
Create a vector
v1 <- c(1, 7, 2)          # create the vector manually
v2 <- 1:10                # create a sequence 1 to 10 automatically
v3 <- seq(1, 10, by = 2)  # create a sequence automatically with the seq() function
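Once created, vectors can be inspected and used in calculations. The following is a small sketch using only base R functions; it repeats the three definitions so that it runs on its own.

```r
v1 <- c(1, 7, 2)
v2 <- 1:10
v3 <- seq(1, 10, by = 2)

v3          # 1 3 5 7 9: the sequence stops at the last value not exceeding 10
length(v2)  # number of elements in v2
v1[2]       # second element of v1 (R counts positions from 1)
v1 * 2      # arithmetic is applied to each element in turn
```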
Seek help
?seq() # R will show you the documentation of the seq function
Create and inspect matrices
m <- matrix(1:6, nrow = 2, ncol = 3)  # create a matrix
nrow(m)  # inspect the number of rows
[1] 2
ncol(m) # inspect the number of columns
[1] 3
m[1, 2]  # display the element in the first row, second column
[1] 3
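The result above is 3 rather than 2 because matrix() fills the matrix column by column by default. The sketch below contrasts this with the byrow = TRUE argument and shows how to extract a whole row or column; the name m_byrow is introduced here only for illustration.

```r
m       <- matrix(1:6, nrow = 2, ncol = 3)                # filled column by column (default)
m_byrow <- matrix(1:6, nrow = 2, ncol = 3, byrow = TRUE)  # filled row by row

m[1, 2]        # 3: the second column starts with the third value
m_byrow[1, 2]  # 2: the first row is 1 2 3

dim(m)   # number of rows and columns together
m[1, ]   # the whole first row
m[, 2]   # the whole second column
```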
Quick visualization
Some example datasets come preloaded in R. One of them is iris, a famous dataset of iris flower measurements.
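A quick look at iris could go as follows. This is only one possible plot, chosen here as an assumption about what a first visualisation might be; it uses the base R functions head() and plot() and two of the dataset's columns.

```r
# Peek at the first six rows to see the available columns
head(iris)

# Scatter plot of petal length against sepal length
plot(iris$Sepal.Length, iris$Petal.Length,
     xlab = "Sepal length (cm)",
     ylab = "Petal length (cm)",
     main = "Iris: petal vs sepal length")
```

Other pairs of columns can be explored by swapping the column names after the $ sign.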