Now that you’re equipped to select the perfect plot for your data, let’s learn how to further modify standard plots for our needs.

We will use a small subset of the sparrows dataset, called ‘BirdData’. Look at the data now by typing BirdData.

BirdData
##   Tarsus Head Weight Wingcrd Species
## 1   22.3 31.2    9.5    59.0       A
## 2   19.7 30.4   13.8    55.0       A
## 3   20.8 30.6   14.8    53.5       A
## 4   20.3 30.3   15.2    55.0       A
## 5   20.8 30.3   15.5    52.5       A
## 6   21.5 30.8   15.6    57.5       B
## 7   20.6 32.5   15.6    53.0       B
## 8   21.5 31.6   15.7    55.0       B
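If you are working outside the lesson environment and BirdData is not already loaded, the subset can be typed in by hand from the printout above. This is only a sketch of one way to do it; Species is stored as a factor here because a later step in the lesson relies on that.

```r
# Hand-typed copy of the eight rows shown in the printout above
BirdData <- data.frame(
  Tarsus  = c(22.3, 19.7, 20.8, 20.3, 20.8, 21.5, 20.6, 21.5),
  Head    = c(31.2, 30.4, 30.6, 30.3, 30.3, 30.8, 32.5, 31.6),
  Weight  = c(9.5, 13.8, 14.8, 15.2, 15.5, 15.6, 15.6, 15.7),
  Wingcrd = c(59.0, 55.0, 53.5, 55.0, 52.5, 57.5, 53.0, 55.0),
  Species = factor(c('A', 'A', 'A', 'A', 'A', 'B', 'B', 'B'))
)

# Check the structure: four numeric columns and one factor
str(BirdData)
```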

Point v line plot

Let’s learn how to set the type = argument of our plots, allowing us to adjust the exact representation of our data.

Plot Head as a function of Tarsus, setting the data = argument.

plot(Head ~ Tarsus, data = BirdData)

Now let’s change the type = argument to represent the points as a line. Go ahead and re-plot the data, setting the type argument to ‘l’, meaning that you wish to join the points with lines.

plot(Head ~ Tarsus, data = BirdData, type = 'l')

Not actually a great way to represent our data! But it’s good to know how to do. We often want to convert a scatterplot to a line when we’re plotting longitudinal data, like a time series, where we are interested in the trajectory of a variable. Take a look at the type = argument in the plot() help file to see other options.
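Part of the reason the line plot looks odd is that plot() joins the points in the order they appear in the data. The sketch below sorts the rows by Tarsus first and then tries a few other values of type =. The data frame is a hand-typed copy of two columns from the printout above, and the object name sorted is just an illustration.

```r
# Hand-typed copy of the two columns used here (values from the printout above)
BirdData <- data.frame(
  Tarsus = c(22.3, 19.7, 20.8, 20.3, 20.8, 21.5, 20.6, 21.5),
  Head   = c(31.2, 30.4, 30.6, 30.3, 30.3, 30.8, 32.5, 31.6)
)

# Sort by the x variable so the line runs left to right instead of zig-zagging
sorted <- BirdData[order(BirdData$Tarsus), ]

plot(Head ~ Tarsus, data = sorted, type = 'l')  # lines only
plot(Head ~ Tarsus, data = sorted, type = 'b')  # both points and connecting lines
plot(Head ~ Tarsus, data = sorted, type = 'o')  # points overplotted on the line
plot(Head ~ Tarsus, data = sorted, type = 'h')  # histogram-like vertical lines
```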

Changing the size of plot elements

Now let’s look at the ‘cex’ (character expansion) argument. We can use ‘cex’ to scale all or some elements in our plot.

By default, cex = scales the size of the points in a plot. Go ahead and re-plot Head as a function of Tarsus, setting the data argument and cex = 2. Go back to the default scatterplot for this (i.e., leave out the type argument).

plot(Head ~ Tarsus, data = BirdData, cex = 2)

cex = can also be targeted at specific plotting elements. Plot the same data, but instead of setting cex =, set cex.lab = to 2.

plot(Head ~ Tarsus, data = BirdData, cex.lab = 2)

Check the help file for par() to see other options for cex =.
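As a rough illustration of those other options, the sketch below sets several of the cex variants in one call. The title text is made up, added only so that cex.main has something to scale, and the particular values are arbitrary.

```r
# Hand-typed copy of the two columns used here (values from the printout above)
BirdData <- data.frame(
  Tarsus = c(22.3, 19.7, 20.8, 20.3, 20.8, 21.5, 20.6, 21.5),
  Head   = c(31.2, 30.4, 30.6, 30.3, 30.3, 30.8, 32.5, 31.6)
)

plot(Head ~ Tarsus, data = BirdData,
     main = 'Head vs. tarsus',  # hypothetical title
     cex = 1.5,                 # plotting symbols
     cex.axis = 0.8,            # tick labels (the numbers along the axes)
     cex.lab = 1.2,             # axis titles
     cex.main = 2)              # main title
```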

A brief digression on colour

While we’ve used colour before, let’s discuss it a little. Colour is very useful for various things in graphics, mostly in terms of illustrating different groups of data. In the past, most journals were only printed in black and white, so different shades of grey were important.

R has 102 unique shades of grey. See this blog post, if you want: https://www.r-bloggers.com/50-shades-of-grey-according-to-r/.

R also has access to a very wide range of colours (including grey). Colours can be specified in a variety of ways: (i) by number, e.g., col = 1 for black, or col = 2 for red; (ii) by name, e.g., col = 'black'; (iii) by hexadecimal code, e.g., col = '#FF0000'; or (iv) by RGB value, e.g., col = rgb(0,0,0). We can even give colours transparency with rgb() using the ‘alpha’ argument.
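The snippet below is a small sketch of how these specifications relate to one another. The object name see_through_red and the alpha value of 0.4 are arbitrary choices for illustration.

```r
# Hand-typed copy of the two columns used here (values from the printout above)
BirdData <- data.frame(
  Tarsus = c(22.3, 19.7, 20.8, 20.3, 20.8, 21.5, 20.6, 21.5),
  Head   = c(31.2, 30.4, 30.6, 30.3, 30.3, 30.8, 32.5, 31.6)
)

rgb(1, 0, 0)                         # channels on a 0-1 scale; returns a hex string
rgb(255, 0, 0, maxColorValue = 255)  # the same colour on a 0-255 scale
col2rgb('red')                       # the RGB channels behind a colour name
palette()                            # the colours that col = 1, 2, 3, ... refer to

# A semi-transparent red: overlapping points show through one another
see_through_red <- rgb(1, 0, 0, alpha = 0.4)
plot(Head ~ Tarsus, data = BirdData, pch = 20, cex = 3, col = see_through_red)
```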

Try changing the colour of the data points in the figure to red, using the name. The colours will be more obvious if you also use a different plotting character, so use pch = 20.

plot(Head ~ Tarsus, data = BirdData, pch = 20, col = 'red')

When used in a scatterplot, the argument ‘col =’ changes the colours of the plotting symbols. What does this argument do when applied to a boxplot? Add the colour red to a plot of Head on Species.

boxplot(Head ~ Species, data = BirdData, col = 'red')

Adding the argument col = 'red' to a boxplot fills in the boxes.

In similar fashion to ‘cex’, you can use col.lab =, col.main =, and col.axis = to change the colour of these external parts of the plot, if needed.

We will consider colour in greater detail in a later lesson.

Using vectors to modify plots

All of the parameters we have dealt with so far (colour, symbols, size, etc.) can be changed using vectors, rather than addressing each individual element.

For example, instead of specifying that all the points should be red, we can use a vector of colours to make each point a different colour. How might we do this?

Well, we have eight data points. We know that we can specify colours with a number. How about setting col = 1:8 in our plot of Head on Tarsus? Try that. (Remember to keep pch = 20, and also set cex = 3 to make each point larger.)

plot(Head ~ Tarsus, data = BirdData, col = 1:8, pch = 20, cex = 3)

Ok super. We can use a vector to set colours… how might this actually be useful?

We could use colour to indicate different groups. We know the species of each sparrow in our data. This Species column is a factor variable, not numeric, but try setting col = Species, modifying the previous code.

plot(Head ~ Tarsus, data = BirdData, col = Species, pch = 20, cex = 3)

Woo hoo! Colours correspond to species. R converted the factors to integers according to their level (first level = ‘A’ = black; second level = ‘B’ = red) and coloured their points. Cool beans!
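To see the conversion for yourself, you can look at the integer codes behind the factor. The sketch below also adds a legend with legend(), which is not part of the lesson so far; it is included only to show that the same level order can be reused to label the colours.

```r
# Hand-typed copy of the columns used here (values from the printout above)
BirdData <- data.frame(
  Tarsus  = c(22.3, 19.7, 20.8, 20.3, 20.8, 21.5, 20.6, 21.5),
  Head    = c(31.2, 30.4, 30.6, 30.3, 30.3, 30.8, 32.5, 31.6),
  Species = factor(c('A', 'A', 'A', 'A', 'A', 'B', 'B', 'B'))
)

levels(BirdData$Species)      # the order of the levels
as.integer(BirdData$Species)  # the integer codes that get used as colour numbers

plot(Head ~ Tarsus, data = BirdData, col = Species, pch = 20, cex = 3)

# Build the legend from the same levels so colours and labels stay in step
legend('topleft',
       legend = levels(BirdData$Species),
       col = seq_along(levels(BirdData$Species)),
       pch = 20, pt.cex = 2)
```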

The ... argument, par(), and plot()

In the previous class, to modify elements of the graphic we added various arguments to the function plot().

However, if you look at the help file (?plot), there are only three arguments specified, x, y, and ...: plot(x, y, ...).

The ellipsis argument ... is special and allows arguments for one function to be passed through another function. Arguments such as cex, pch, etc. are not arguments of the function plot(). These arguments are actually arguments for the graphical parameters function par() that is used to modify the appearance and layout of graphics.
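A minimal sketch of the same mechanism in a function of your own: head_tarsus_plot() below is a hypothetical helper (it is not part of R or of this lesson) that names only one argument and hands everything else on to plot() through the ellipsis.

```r
# Hand-typed copy of the two columns used here (values from the printout above)
BirdData <- data.frame(
  Tarsus = c(22.3, 19.7, 20.8, 20.3, 20.8, 21.5, 20.6, 21.5),
  Head   = c(31.2, 30.4, 30.6, 30.3, 30.3, 30.8, 32.5, 31.6)
)

# Hypothetical wrapper: only 'data' is named, the rest travels through ...
head_tarsus_plot <- function(data, ...) {
  plot(data$Tarsus, data$Head, xlab = 'Tarsus', ylab = 'Head', ...)
}

# pch, col and cex are not arguments of head_tarsus_plot(), yet they reach plot()
head_tarsus_plot(BirdData, pch = 20, col = 'red', cex = 2)
```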

There are a lot of parameters that you can control. Some you will use frequently, others will not be touched.

All the arguments are listed on the par() help page (‘?par’). You should spend some time looking over this page to see what options are available. Open it up now, so you can refer to it as we progress through the lesson.

?par

As we said above, all the arguments discussed below can be placed within a call to par(), e.g., par(pch = 13, cex = 2).

One important thing to note is that any changes to the defaults of par() will result in these changes being applied to all the subsequent plots that you create, whereas any changes you make within a call to plot() will only apply to that specific plot.
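The difference can be seen in a short sketch. par() returns the previous values of whatever it changes, so one common way to keep such changes contained is to store that return value and hand it back to par() afterwards. The object name old_par is arbitrary.

```r
# Hand-typed copy of the columns used here (values from the printout above)
BirdData <- data.frame(
  Tarsus = c(22.3, 19.7, 20.8, 20.3, 20.8, 21.5, 20.6, 21.5),
  Head   = c(31.2, 30.4, 30.6, 30.3, 30.3, 30.8, 32.5, 31.6),
  Weight = c(9.5, 13.8, 14.8, 15.2, 15.5, 15.6, 15.6, 15.7)
)

old_par <- par(pch = 20, cex = 2)       # change the defaults, keeping the old values

plot(Head ~ Tarsus, data = BirdData)    # drawn with pch = 20 and cex = 2
plot(Weight ~ Tarsus, data = BirdData)  # so is this one

par(old_par)                            # put the earlier settings back

plot(Head ~ Tarsus, data = BirdData)    # drawn with the original settings again
```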

In this lesson we will stick with putting these arguments within plot().

As an aside, if you have used par() and want to clear these parameters, you will need to reset the graphics window/s to the default parameters. Within RStudio, the easiest way is to go to the Plots menu above and select ‘Clear All’. This will close all the open plots windows and reset the graphics parameters to their defaults.

In terms of working with and creating graphics, a useful way of working is to run code iteratively within the RStudio graphics windows to get the code to generate your figure correctly. Then, use a separate graphics device to create the final plot (more on this specific operation later!).
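As a rough preview of that workflow, the sketch below sends a finished plot to a PDF device instead of the RStudio window. The file name, the use of a temporary directory, and the width and height are all assumptions made for illustration.

```r
# Hand-typed copy of the two columns used here (values from the printout above)
BirdData <- data.frame(
  Tarsus = c(22.3, 19.7, 20.8, 20.3, 20.8, 21.5, 20.6, 21.5),
  Head   = c(31.2, 30.4, 30.6, 30.3, 30.3, 30.8, 32.5, 31.6)
)

# Hypothetical output file in R's temporary directory
out_file <- file.path(tempdir(), 'head_tarsus.pdf')

pdf(out_file, width = 5, height = 4)  # open the device; size is in inches
plot(Head ~ Tarsus, data = BirdData, las = 1)
dev.off()                             # close the device so the file is written
```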

Axes

OK, let’s get back to work … First, we will look at some common modifications to the axes of plots. In many cases the default range of the axes, tick mark position and orientation, and numbering scheme are not exactly what you (or your preferred publication) want.

We will use our example plot of Head on Tarsus again to illustrate these changes. First, remind yourself of the basic code to make that plot using the formula approach and data argument …

plot(Head ~ Tarsus, data = BirdData)

Great. Look carefully at the axes. The tick marks point outwards from the plot. The argument is tcl =, the sign (+/-) indicates the direction, and the length is expressed as a fraction of the height of a line of text. The default is tcl = -0.5, i.e., the tick marks point outwards and are half the height of a line of text.

You can flip the direction (i.e., have the tick marks point into the plot) by making that value positive and also change the length of the tick marks. Try setting the tick marks to the full height of a line of text and pointing inwards. Remember, there is no need to call par() first, just put the argument inside the main plot() call.

plot(Head ~ Tarsus, data = BirdData, tcl = 1)

Another common modification is to change the orientation of the numbers along the axes. The default is that the numbers are parallel to their respective axis. You can see this in the current plot that is displayed… along the x-axis the numbers are horizontal and along the y-axis the numbers are rotated 90 degrees anti-clockwise.

You can change the style of these axis labels with the argument las =. The default is parallel to the axis (las = 0), but you may often want to set them to be always horizontal (las = 1) or always perpendicular (las = 2), or even always vertical (las = 3). Try setting the axes to always horizontal, in a new call to our basic plot command.

plot(Head ~ Tarsus, data = BirdData, las = 1)

Ok, now let’s combine these two arguments and see what happens. Put the previous new values of las and tcl in our plot command.

plot(Head ~ Tarsus, data = BirdData, las = 1, tcl = 1)

What do you see? We now have tick marks pointing inwards and horizontal tick labels. Doing these operations has led to some extra white space appearing around the edges of our plot, between the numbers and the box where the tick marks used to be. For someone as persnickety about graphics as me, this does not look quite right … ‘How might we decrease this space?’ I hear you ask … Well …

As you might expect, there is an argument for that. Within par, the argument mgp = is a vector of three elements that controls the number of lines out from the plot edge where the axis title (mgp[1]), axis labels (mgp[2]) and axis line itself (mgp[3]) are placed. The default is mgp = c(3, 1, 0). In our scenario, a more appealing distance between the axis labels and the axis itself is mgp = c(2.5, 0.5, 0). Try that, adding the mgp argument to the previous code.

plot(Head ~ Tarsus, data = BirdData, las = 1, tcl = 1, mgp = c(2.5, 0.5, 0))

Great, that moved the axis labels closer in, which (I think!) looks better. See what fine-scale control you can have over the appearance of your figures? It’s wonderful!

R generally sets the limits of the axes just wider than the data. There are several cases when you may want to change this, for example to set the lower limit to 0, or make both sides wider if you are plotting confidence limits and the default range is not wide enough. Setting the arguments xlim = and ylim = (x- and y-axis limits) will resolve this issue. Try setting the x axis limits to 19 and 25. As with mgp =, you need to pass a vector here, in this case of only two elements setting the min and max of the axis. Add this argument to the previous code.

plot(Head ~ Tarsus, data = BirdData, las = 1, tcl = 1, mgp = c(2.5, 0.5, 0), xlim = c(19, 25))

As an aside, xlim and ylim are not arguments to par; it would not make sense to have them apply to every subsequent plot. These two arguments are sent by the plot() function to the plot.window() function, which sets up the coordinate system for that graphics window.

R tries very hard to make the axes limits as pretty as possible. These are recomputed internally, so if you resize your graphics window, they may even change in number and position. In particular, if you make the window too narrow, R will drop some of the labels. Try playing with the size of the graphics window now.

What if you wanted to ignore R, and decide on exactly everything about the axis labels, including the position of the tick marks and what labels went there? You can do it! However, first we would need to remove the default axis and/or not have an axis plotted at all. We can suppress plotting the x axis with the argument xaxt = 'n', and the y axis with yaxt = 'n'. Try suppressing the x axis of our default basic plot (i.e., ignoring all the recent additions).

plot(Head ~ Tarsus, data = BirdData, xaxt = 'n')

Great, this suppresses the axis tick marks and labels, but not the axis title. To suppress the axis title, we have to modify the xlab = argument in plot(). You can do this with xlab = '' (i.e., a text string with nothing in it). Try that, too, adding to the previous code.

plot(Head ~ Tarsus, data = BirdData, xaxt = 'n', xlab = '')

Now we can add in the axis that we want. To do this, we actually need to call a separate function, axis(), which we would run as a new line in our plotting commands. It cannot be nested within plot() or par(). Anyway, a call to axis() here will add an axis to the current plot. First, pull up the help file for axis.

?axis

To add an axis, you need to specify a minimum of two things. First, which side of the plot you want to put the axis (1=below, 2=left, 3=above and 4=right). Then, you need to provide where you want to put the ticks and labels, indicated by the argument at =. Let’s add an axis to the x (bottom) side of our plot, with numbers at 20, 21, and 22.

plot(Head ~ Tarsus, data = BirdData, xaxt = 'n', xlab = ''); axis(side = 1, at = c(20, 21, 22))

axis() has another useful argument, labels =, if you wanted to add labels different to the values where the tick marks are located, for example some text, the log of the values, or some other characters.
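A small sketch of labels = in use: the tick marks stay at 20, 21, and 22, but the text printed at each one is replaced. The label strings here are made up purely for illustration.

```r
# Hand-typed copy of the two columns used here (values from the printout above)
BirdData <- data.frame(
  Tarsus = c(22.3, 19.7, 20.8, 20.3, 20.8, 21.5, 20.6, 21.5),
  Head   = c(31.2, 30.4, 30.6, 30.3, 30.3, 30.8, 32.5, 31.6)
)

tick_at     <- c(20, 21, 22)
tick_labels <- c('short', 'medium', 'long')  # hypothetical labels, one per tick

plot(Head ~ Tarsus, data = BirdData, xaxt = 'n', xlab = '')
axis(side = 1, at = tick_at, labels = tick_labels)
```

The at = and labels = vectors need to be the same length, one label for each tick position.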

So now we have an axis, but no label. We can add text to any margin of a plot using the function mtext(). Much like axis(), this function requires a side, but also needs a margin line number (line =; recall our use of mgp = above), and the actual text of the label (text =). Use mtext() to add a label to the x axis, with text = 'Tarsus (mm)' and line = 2.5.

plot(Head ~ Tarsus, data = BirdData, xaxt = 'n', xlab = ''); axis(side = 1, at = c(20, 21, 22)); mtext(side = 1, text = 'Tarsus (mm)', line = 2.5)

Fonts

You can explore the other options for margin labels at your leisure. Let’s wrap up with a quick discussion of fonts.

Many journals require specific fonts in their graphics. Fonts in R (and other software) are often tricky, because each graphics device (e.g., pdf, png, …) has its own default fonts and system of dealing with fonts. Further, font can be specified in several different ways, and of course there is no universal system of naming or calling fonts in computer operating systems. Yay …

However, there are some straightforward things we can do easily. First, within par() we can decide whether to use a plain (1, default), bold (2), italic (3), or bold italic font (4) for the title, axis labels, and axis tick labels. We can combine this argument with a call to mtext() (because mtext() also has an ellipsis argument) to add a bold label to our x axis. Scroll up in the console and add font = 2 to your call to mtext().

plot(Head ~ Tarsus, data = BirdData, xaxt = 'n', xlab = ''); axis(side = 1, at = c(20, 21, 22)); mtext(side = 1, text = 'Tarsus (mm)', line = 2.5); mtext(side = 1, text = 'Tarsus (mm)', line = 2.5, font = 2)

R overlays this second call to mtext() on top of the first, which does not look great, but you get the idea. This overlaying issue is also why you often need to re-run the entire code that generates a figure if you decide to change some options.

You can change the font style (plain/bold/italic) for each part of a graphic using font.axis =, font.lab =, font.main =, and font.sub =.

In many graphics devices, you can also specify the font family: serif (often Times New Roman), sans-serif (often Helvetica), or monospace (often Courier), via the argument family = 'serif', family = 'sans', or family = 'mono'.

Recently, a new R package was released that allows you to access the majority of other fonts on your computer: the extrafont package. Explore this if you want to add non-standard fonts to your graphics: http://blog.revolutionanalytics.com/2012/09/how-to-use-your-favorite-fonts-in-r-charts.html
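Putting the font arguments together, a sketch might look like the following. The title text is hypothetical, and how the 'serif' family is rendered depends on the graphics device you are using.

```r
# Hand-typed copy of the two columns used here (values from the printout above)
BirdData <- data.frame(
  Tarsus = c(22.3, 19.7, 20.8, 20.3, 20.8, 21.5, 20.6, 21.5),
  Head   = c(31.2, 30.4, 30.6, 30.3, 30.3, 30.8, 32.5, 31.6)
)

plot(Head ~ Tarsus, data = BirdData,
     main = 'Head vs. tarsus',  # hypothetical title
     family = 'serif',          # font family for the text in this plot
     font.main = 4,             # bold italic title
     font.lab = 2,              # bold axis titles
     font.axis = 3)             # italic tick labels
```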

That’s all for now! You now know how to make some more advanced modifications to your basic plots.

Please submit the log of this lesson to Google Forms so that Simon may evaluate your progress.

  1. Delighted, I’m sure.