6 Basic Plots

6.1 Loading ggplot2

As with any other R package, you must load ggplot2 with the library() function before you can use any of its functionality. We will also load several extension packages that provide additional features:

library(ggplot2)
library(ggrepel)
library(ggthemes)
library(scales)
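
If any of these packages is not yet installed, library() stops with an error. The following is a minimal sketch, not part of the original tutorial, that reports which of the four packages are missing. The variable names pkgs, installed and not_installed are only illustrative.

```r
# Packages used in this chapter
pkgs <- c("ggplot2", "ggrepel", "ggthemes", "scales")

# requireNamespace() returns TRUE if a package can be loaded, FALSE otherwise
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
not_installed <- pkgs[!installed]

if (length(not_installed) > 0) {
  # Print the install command instead of installing automatically
  message("Install first with: install.packages(c(",
          paste0('"', not_installed, '"', collapse = ", "), "))")
} else {
  message("All packages are available.")
}
```

The sketch only prints a suggestion and leaves the installation itself to you.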

6.2 Dataset

Let’s start by loading the housing dataset:

housing <- read.csv("https://raw.githubusercontent.com/altaf-ali/ggplot_tutorial/master/data/housing.csv")

Now, let’s see what the dataset looks like:

head(housing)
##   State Region       Date Home.Value Structure.Cost Land.Value
## 1    AK   West 2010-03-01     224952         160599      64352
## 2    AK   West 2010-06-01     225511         160252      65259
## 3    AK   West 2009-09-01     225820         163791      62029
## 4    AK   West 2009-12-01     224994         161787      63207
## 5    AK   West 2007-12-01     234590         155400      79190
## 6    AK   West 2008-03-01     233714         157458      76256
##   Land.Share..Pct. Home.Price.Index Land.Price.Index Year Quarter
## 1             28.6            1.481            1.552 2010       1
## 2             28.9            1.484            1.576 2010       2
## 3             27.5            1.486            1.494 2009       3
## 4             28.1            1.481            1.524 2009       4
## 5             33.8            1.544            1.885 2007       4
## 6             32.6            1.538            1.817 2008       1

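When dealing with date and time values, it’s generally a good idea to convert them to the appropriate data type. Here, read.csv() reads the Date column as text (character strings or, in R versions before 4.0, a factor), so we convert it to the Date class. ggplot2 can then treat it as a date axis rather than as a set of unrelated labels.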

housing$Date <- as.Date(housing$Date)
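
To illustrate what the conversion changes, here is a small sketch using three date strings copied from the output above (the names dates_chr and dates are only for illustration). Once the values are of class Date, R can sort them chronologically and compute the differences between them.

```r
dates_chr <- c("2010-03-01", "2010-06-01", "2009-09-01")
class(dates_chr)   # "character"

dates <- as.Date(dates_chr)   # strings are in year-month-day form
class(dates)       # "Date"

sort(dates)        # chronological order
diff(sort(dates))  # gaps between consecutive dates, in days
```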

Next, we create two subsets of the data: one with housing prices for New York only, and another with housing prices for the nine states in the Northeast region (coded as "N. East" in the data).

newyork <- subset(housing, State == "NY")
northeast <- subset(housing, Region == "N. East")
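
It is worth checking that a subset contains what you expect. The sketch below uses a tiny made-up data frame (toy, with invented values) so that it runs on its own. The same checks can be applied to newyork and northeast, for example length(unique(northeast$State)) to count the states in the subset.

```r
# Toy data with invented values, for illustration only
toy <- data.frame(
  State      = c("NY", "NY", "MA", "AK"),
  Region     = c("N. East", "N. East", "N. East", "West"),
  Home.Value = c(100, 110, 120, 130)
)

# Same pattern as the newyork and northeast subsets
toy_ny <- subset(toy, State == "NY")
toy_ne <- subset(toy, Region == "N. East")

nrow(toy_ny)                   # number of rows kept
unique(toy_ne$State)           # states present in the subset
length(unique(toy_ne$State))   # how many distinct states
```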

6.3 Scatter Plot

Now we’re ready to plot. Every plot starts with the ggplot() function, which creates a plot object. The two arguments passed to ggplot() are:

Argument   Description
data       Dataset for the plot. It should be a data frame or an object that can be converted to one
mapping    Aesthetic mappings for the plot, created with aes()
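
As a rough sketch of how the two arguments fit together, the call below names both explicitly. It uses a small made-up data frame (toy, with invented values) so that it is self-contained. On its own, ggplot() only sets up the plot object: printing it draws the axes for the mapped variables but no data, because no geom has been added yet.

```r
library(ggplot2)

# Toy data with invented values, for illustration only
toy <- data.frame(
  Date       = as.Date(c("2010-03-01", "2010-06-01", "2010-09-01")),
  Home.Value = c(100, 110, 105)
)

# Both arguments named explicitly
p <- ggplot(data = toy, mapping = aes(x = Date, y = Home.Value))

inherits(p, "ggplot")   # TRUE: ggplot() returned a plot object
print(p)                # axes only, no points yet
```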

Using the newyork dataset, let’s create a scatter plot with Date on the x-axis and Home.Value on the y-axis.

ggplot(newyork, aes(x = Date, y = Home.Value)) +
  geom_point()

Now let’s see which ggplot building blocks are active in the above example:

Data      newyork
Mapping   aes(x = Date, y = Home.Value)
Geom      geom_point()
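
These building blocks can also be read back from a plot object. The sketch below is an illustration rather than part of the original tutorial: it rebuilds the scatter plot on a small made-up data frame (toy, with invented values) and inspects the data, mapping and layers components. Accessing components with $ relies on the object's internal structure, which may differ between ggplot2 versions.

```r
library(ggplot2)

# Toy data with invented values, for illustration only
toy <- data.frame(
  Date       = as.Date(c("2010-03-01", "2010-06-01", "2010-09-01")),
  Home.Value = c(100, 110, 105)
)

p <- ggplot(toy, aes(x = Date, y = Home.Value)) +
  geom_point()

head(p$data)                  # Data: the data frame passed to ggplot()
p$mapping                     # Mapping: x -> Date, y -> Home.Value
class(p$layers[[1]]$geom)[1]  # Geom: "GeomPoint"
```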

6.4 Exercise

Use the Data Visualization with ggplot2 Cheat Sheet or any other resource to find out how to complete the exercises.

  1. Create a histogram of Home.Value using the housing data.

  2. Create a box plot of Home.Value using the northeast dataset, with State on the x-axis.

  3. Create a line plot using the newyork dataset, with Date on the x-axis and Home.Value on the y-axis.

  4. Create a line plot using the northeast dataset, with Date on the x-axis and Home.Value on the y-axis, and use a different color for each state.