Resources for Lab 2

Practice:

  • Download the employees.xlsx file.
  • Create an R Project in which to analyse the employees.xlsx dataset, and move the xlsx file into the project directory.
  • Get the path to the data using the here() function.
  • Import the dataset in R with the read_xlsx() function.
  • How many observations are collected in the dataset? Use the dim() function.

  • What’s the average salary in the company? Use either the mean() or the summary() function.

  • Use filter() and arrange() to show the three top earners in the I.T. department.

  • Use group_by(), summarise() and median() to get the median salary in the I.T. department.
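
The steps in this list can be sketched on a small made-up table. The column names (name, department, salary, seniority) and all values below are assumptions for illustration, since the real layout of employees.xlsx is not shown here. In the lab the path would come from here("employees.xlsx"); the sketch uses a temporary file only so that it runs on its own.

```r
library(readxl)   # read_xlsx()
library(writexl)  # write_xlsx(), used here only to create a toy file
library(dplyr)    # filter(), arrange(), group_by(), summarise()

# Made-up data standing in for employees.xlsx (column names are assumptions)
toy <- data.frame(
  name       = c("Ann", "Bob", "Cleo", "Dan", "Eve", "Finn"),
  department = c("I.T.", "I.T.", "I.T.", "I.T.", "Finance", "Finance"),
  salary     = c(3000, 4500, 5200, 3900, 4100, 6000),
  seniority  = c(0, 4, 9, 2, 8, 12)
)

# Write the toy table to a temporary xlsx file, then import it back
path <- tempfile(fileext = ".xlsx")
write_xlsx(toy, path)
dt <- read_xlsx(path)

dim(dt)          # number of rows (observations) and columns
mean(dt$salary)  # average salary over all rows

# Three top earners in one department: filter, sort descending, keep 3 rows
top3 <- dt %>%
  filter(department == "I.T.") %>%
  arrange(desc(salary)) %>%
  head(3)

# Median salary per department, then keep the row of interest
med_it <- dt %>%
  group_by(department) %>%
  summarise(median_salary = median(salary)) %>%
  filter(department == "I.T.")
```

The same pipeline should carry over to the real dataset once the actual column names are substituted.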

Construct a scatterplot for salary and seniority, colored by department. Store it in my_plot. Try adding a theme to the plot, for example:

my_plot + theme_minimal()

What happens?
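
As a starting point, here is a minimal sketch of such a plot on made-up data. The column names and the choice of seniority on the x axis are assumptions (the latter matches the x label used in the labs() example).

```r
library(ggplot2)

# Made-up data; column names are assumptions about employees.xlsx
toy <- data.frame(
  department = c("I.T.", "I.T.", "Finance", "Finance", "HR"),
  salary     = c(3000, 5200, 4100, 6000, 3500),
  seniority  = c(0, 9, 8, 12, 3)
)

# One point per employee, coloured by department
my_plot <- ggplot(toy, aes(x = seniority, y = salary, colour = department)) +
  geom_point()

# Adding a theme with + builds a new plot object; my_plot is not modified
themed <- my_plot + theme_minimal()
```

Printing my_plot and themed one after the other makes the comparison easy.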

  • Use labs() to modify the labels of the plot. For example:
my_plot + theme_minimal() + labs(x="Seniority (years)")
  • How many employees have just been hired in the company?

  • How many employees have just been hired in the I.T. department?

  • Compute the average salary of employees from the Finance department with at least 8 years seniority.
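
One possible approach to the counting questions, sketched on made-up data. It assumes that "just hired" is encoded as seniority == 0 and that the columns are named department, salary and seniority; check both against the real dataset.

```r
library(dplyr)

# Made-up data; column names and values are assumptions
toy <- data.frame(
  department = c("I.T.", "I.T.", "I.T.", "Finance", "Finance", "Finance"),
  salary     = c(3000, 4500, 5200, 4100, 6000, 3800),
  seniority  = c(0, 4, 0, 8, 12, 0)
)

# Assumption: a new hire has seniority equal to 0
n_new    <- toy %>% filter(seniority == 0) %>% nrow()
n_new_it <- toy %>% filter(seniority == 0, department == "I.T.") %>% nrow()

# Average salary in Finance among employees with at least 8 years of seniority
avg_fin <- toy %>%
  filter(department == "Finance", seniority >= 8) %>%
  summarise(avg_salary = mean(salary)) %>%
  pull(avg_salary)
```

Passing several conditions to filter(), separated by commas, keeps only the rows that satisfy all of them.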

Compute a summary table where you report:

  • the median salary by department (using the function median()),

  • the number of employees by department (using the function n()),

  • the number of new hires by department.

Save it as an Excel file via the write_xlsx() function.
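
A minimal sketch of such a summary table on made-up data. The column names, the seniority == 0 definition of a new hire, and the temporary output path are all assumptions for illustration.

```r
library(dplyr)
library(writexl)

# Made-up data; column names and values are assumptions
toy <- data.frame(
  department = c("I.T.", "I.T.", "I.T.", "Finance", "Finance", "Finance"),
  salary     = c(3000, 4500, 5200, 4100, 6000, 3800),
  seniority  = c(0, 4, 0, 8, 12, 0)
)

# One row per department, one column per requested statistic
summary_tbl <- toy %>%
  group_by(department) %>%
  summarise(
    median_salary = median(salary),
    n_employees   = n(),
    n_new_hires   = sum(seniority == 0)  # assumption: new hire means seniority == 0
  )

# Hypothetical output location; in the lab, a path inside the project is more useful
out_path <- tempfile(fileext = ".xlsx")
write_xlsx(summary_tbl, out_path)
```

Summing a logical condition counts the rows where it is TRUE, which avoids a separate filter() step for the new hires.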

Given the summary table in the previous exercise, reproduce the following bar plot of the median salary by department (hint: use geom_col()).
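
A rough sketch of the geom_col() approach. The summary values and column names below are made up, and the axis labels are a guess rather than a copy of the target figure's styling.

```r
library(ggplot2)

# Made-up summary table; in the lab this would be the table from the previous exercise
summary_tbl <- data.frame(
  department    = c("Finance", "HR", "I.T."),
  median_salary = c(4100, 3500, 4500)
)

# geom_col() draws bars whose heights are the values already present in the data
bar_plot <- ggplot(summary_tbl, aes(x = department, y = median_salary)) +
  geom_col() +
  labs(x = "Department", y = "Median salary")
```

geom_col() fits here because the medians are already computed; geom_bar() would instead count rows by default.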

You have to create an Excel file where employees from different departments are collected in different sheets.

In R, you can create multi-sheet Excel files by passing a list of datasets to write_xlsx(). Assume you stored the original dataset in the object dt. Now, try running:

my_departments <- split(dt, dt$department)
  • What does the split() function do? Run ?split() to read the documentation.

  • What kind of object is stored in my_departments? Run typeof(my_departments) to investigate.

  • Pass my_departments to write_xlsx() to create a multi-sheet Excel file.
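
The whole workflow can be tried on a small made-up dt before using the real data. The column names and the temporary output path are assumptions; excel_sheets() from readxl is used only to check which sheets were written.

```r
library(writexl)
library(readxl)

# Made-up stand-in for dt; column names are assumptions
dt <- data.frame(
  department = c("I.T.", "Finance", "I.T.", "HR"),
  salary     = c(3000, 4100, 5200, 3500)
)

# One data frame per distinct value of dt$department
my_departments <- split(dt, dt$department)

typeof(my_departments)  # base R type of the object
names(my_departments)   # one element per department

# A named list of data frames is written as one sheet per element
out_path <- tempfile(fileext = ".xlsx")  # hypothetical output location
write_xlsx(my_departments, out_path)

excel_sheets(out_path)  # sheet names found in the written file
```

The sheet names are taken from the names of the list, so they should match the department labels.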