4 Plotting

Every research paper or presentation should include some nice visualisations. R is a powerful tool for plotting, especially with the ggplot2 package. ggplot2 is based on the book The Grammar of Graphics by Leland Wilkinson.

I will first teach you the ggplot2 notation and then the base plots.

4.1 ggplot2

First, you must understand the layer structure of a ggplot. Each ggplot can have a maximum of 7 layers, and the first three are required to produce a plot. To get a plot, you must specify the following layers:

  • Data: Where is the data you want to plot?
  • Aesthetics: What should be on the x and y axis?
  • Geometries: Do you want to have lines or points?

Figure 4.1: The 7 layers of a ggplot

4.1.1 Layer 1-3: Data, aesthetics & geometries

Let's give it a try by specifying only the first three layers.
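A minimal sketch of such a three-layer plot, assuming the built-in iris data set that the rest of this section refers to. The choice of the two sepal columns and the object name p are illustrative assumptions, not taken from the original example.

```r
library(ggplot2)

# Layer 1 (data): the built-in iris data set
# Layer 2 (aesthetics): sepal length on the x axis, sepal width on the y axis
# Layer 3 (geometries): one point per flower
p <- ggplot(data = iris, mapping = aes(x = Sepal.Length, y = Sepal.Width)) +
  geom_point()

# Printing the object draws the plot
p
```

Swapping geom_point() for another geometry, such as geom_line(), changes how the same data and aesthetics are drawn.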

Now we want to add a colour for each species to the plot. The iris data contains three different species.

It is important that the colour (col) argument is inside the aes() call! Otherwise, R does not choose the colour based on Species.
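The difference can be sketched as follows, again assuming the iris data and the sepal columns as illustrative choices. The object names p_mapped and p_fixed are hypothetical.

```r
library(ggplot2)

# Colour mapped inside aes(): one colour per species, plus a legend
p_mapped <- ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width, col = Species)) +
  geom_point()

# Colour set outside aes(): a single fixed colour for all points
p_fixed <- ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width)) +
  geom_point(col = "blue")

p_mapped
p_fixed
```

Writing geom_point(col = Species) without aes() fails, because R then looks for an object called Species in your workspace instead of a column in the data.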

4.2 Base plot

4.3 Spatial data

Spatial data - also known as geospatial data - is data that contains geographic information (e.g. coordinates). It is well known that location strongly affects real estate prices. Houses close to Lake Zurich are on average more expensive than comparable houses that are not located at the lake. Hence, if you want to analyse house prices, you will often encounter spatial information. Other examples of spatial information are crime rates, amenities, weather information, trading flows and financial networks.

In this section I will teach you how to visualise spatial information. First, you need a shapefile of the region you want to plot. You can get the shapefiles for most countries in the world from this webpage. In R you can also access the information with the getData function in the raster package. More information on shapefiles and visualisations can be found here.

4.3.1 Simple base plot

First, we get the shapefile. In the getData function, the level argument specifies the level of administrative subdivision (level = 0: country only, level = 1: cantons, level = 3: municipalities).

You can plot the shapefile with the plot function.

Assuming that you have spatial data, you might want to visualise it on the map defined by the shapefile. To do so, you must know how a shapefile is structured. The shapefile contains a lot of information (e.g. name of the municipality, country name, coordinates, etc.).
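The steps of this subsection might look roughly like the sketch below. It assumes the GADM boundaries for Switzerland (country code "CHE") and that the municipality names are stored in a column called NAME_3. The object name switzerland is hypothetical. The download needs an internet connection, and newer raster releases flag getData as deprecated in favour of the geodata package, so treat this as an illustration rather than a guaranteed recipe.

```r
library(raster)  # also loads the sp package

# Download the boundaries at municipality level (level = 3);
# the file is saved in the current working directory
switzerland <- getData("GADM", country = "CHE", level = 3)

# Draw the municipality borders with the base plot function
plot(switzerland)

# List the attribute columns stored alongside the polygons
names(switzerland)

# First municipality names (NAME_3 is an assumed column name)
head(switzerland$NAME_3)
```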

## [1] "Aarau"      "Biberstein" "Buchs"      "Densbüren"  "Erlinsbach"
## [6] "Gränichen"

4.3.2 Interactive spatial plots

## OGR data source with driver: ESRI Shapefile 
## Source: "C:\Users\lliebi\Dropbox\Daten Luca\PhD\GSERM\Spatial Data\Regression Analysis for Spatial Data", layer: "MS_Gebiete"
## with 106 features
## It has 4 fields
## Integer64 fields read as strings:  ge11_ms_id