Reference - General
December 20 2022
reference.Rmd
Various reference materials
British Ecological Society
The British Ecological Society (BES) is the oldest ecological society in the world, and one of the largest. It publishes several excellent journals, including Methods in Ecology and Evolution, which publishes many papers describing software packages.
The society also provides several guides on better science. As part of the course, we require you to read the following guides:
Hadley Wickham
Hadley Wickham is the Chief Scientist at RStudio / Posit PBC. He created the original ggplot library for his PhD. These days his data analysis packages are collectively known as the tidyverse, the most famous of which are probably ggplot2 and dplyr. He has written three very useful books on data analysis and software development in R:
- R for Data Science (R4DS)
- Advanced R
  - 1st edition (I find this easier to navigate)
  - 2nd edition (this is obviously updated)
- R Packages (with Jenny Bryan, see more of her work below)
We recommend all of these books as reference material, and will point to specific chapters at different times, but the books themselves are probably too long for you to read in their entirety. Nonetheless, keep them for future reference, as they contain an enormous amount of knowledge about R in a (relatively!) compact form.
BOHVM also interviewed Hadley as part of our Naturally Speaking podcast when he came to Glasgow to receive an award from the university a few years ago. You can hear the interview here. As well as discussing the tidyverse at length, the interview later covers many of the reasons why this course exists and the kinds of techniques we teach on it.
Jenny Bryan
Jenny Bryan is a statistician who also works for RStudio / Posit PBC. She has also produced an enormous amount of fantastic material, which actually aligns even more closely with the course than Hadley’s work, since she is interested in reproducibility and better coding practice.
Understanding, avoiding and fixing errors in your code
As part of the course, we require you to watch this keynote from rstudio::conf 2020 by Jenny. It is an excellent keynote on errors in R and how to identify and fix them.
Better coding practices
Jenny (and Jim Hester) have written a guide called “What they forgot to tell you about R”, which is available online here. She also gave a talk at a Reproducible Science Workshop in 2015. As part of the course, we require you to flick through the slides to get an idea of how to name your files better in the future.
Git and GitHub in R and RStudio
Git is a very complex tool – it is used to manage the development of the whole Linux kernel! – so we cover only the most basic aspects of it in this course. As well as all of the above, Jenny (with others) has written an excellent resource for using git and GitHub in R – Happy Git and GitHub for the useR, which has much more detail than our abbreviated materials, and is a good reference if you are confused or want to do something complicated. She also gave a talk on this topic at rstudio::conf 2017.
Other materials
Reproducible (and generally good) coding practices
There are many materials on reproducible research, including the BES guide mentioned above. For background, we recommend watching this short clip by Matt Anticole for TED-Ed. We require you to read this article on sharing code and this guide on good coding practices. None of this was written specifically for R programmers, but there are useful tips for everyone (and some specific to R).
R cheatsheets
RStudio and other contributors provide cheatsheets that summarise a wide range of topics in R, including markdown in reports, RStudio itself, and everything from base R to package development to parallel computing. The main cheatsheets can be found in RStudio under Help > Cheatsheets.
R Markdown
R Markdown allows you to generate reports from your R code or, as RStudio puts it, to “turn your analyses into high quality documents, reports, presentations and dashboards.” As well as the cheatsheets and reference guide above, there are a variety of resources for R Markdown:
- RStudio’s R Markdown website, including:
- Pandoc’s markdown website, which describes the non-R parts of the format
- R Markdown: The Definitive Guide - an e-book by Yihui Xie, J. J. Allaire and Garrett Grolemund
These are great resources, and can help you to produce everything from the simple reports you will be generating on this course, to the lecture slides and website we use to run this course, to complex interactive Shiny apps. Like Hadley Wickham’s books, they are probably references you will turn to after the course more than during it, but they are comprehensive if you want to look anything up.
Particularly useful for you here are the instructions in Happy Git and GitHub for the useR for changing R Markdown into R scripts of the kind you will be generating in these exercises.
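One possible route from an R Markdown file to a plain R script is `knitr::purl()`, which extracts the code chunks and drops the surrounding text. This may not be the exact approach described in Happy Git and GitHub for the useR, and the file names and contents below are invented purely for illustration. A minimal sketch, assuming the knitr package is already installed:

```r
# Build a tiny, made-up R Markdown file in a temporary location.
# The chunk fence is assembled with strrep() only to keep this example tidy.
fence <- strrep("`", 3)
rmd_file <- tempfile(fileext = ".Rmd")
writeLines(c(
  "---",
  "title: \"Example report\"",
  "---",
  "",
  "Some explanatory text.",
  "",
  paste0(fence, "{r}"),
  "total <- sum(1:10)",
  fence
), rmd_file)

# Extract the code chunks into an R script, if knitr is available.
# documentation = 0 keeps only the code, without chunk headers or text.
script_lines <- character(0)
if (requireNamespace("knitr", quietly = TRUE)) {
  script_file <- knitr::purl(
    rmd_file,
    output = tempfile(fileext = ".R"),
    documentation = 0,
    quiet = TRUE
  )
  script_lines <- readLines(script_file)
  print(script_lines)
}
```

If knitr is not installed, the sketch simply skips the conversion rather than trying to install anything.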
R Coder
The Programming part of the Learn R section of this website explains many of the common techniques that you use during this course. If you are confused by our explanations, you may find this useful.
R coding style
The 1st edition of Advanced R provides a style guide. In the 2nd edition, this has been replaced by the tidyverse style guide. Google have an adaptation of this. We do not ask (or even recommend) that you follow any of them, but they are well thought through, and you may decide you like one of them.
We only ask that your style is consistent, that you use meaningful names for functions and variables, and that you never use dots (.) in function names or use rm(...) or install any packages in the code you submit for assessment.
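The requests above can be illustrated with a short sketch. All names here (`calculate_mean_height`, `plant_heights_cm`, `check_package`) are hypothetical, snake_case is just one consistent choice among several, and the package check is one possible alternative to installing packages from submitted code, not a course requirement:

```r
# A meaningful function name without dots. One commonly given reason to avoid
# dots is that R's S3 system uses them to separate a generic from a class
# (for example print.data.frame), so calc.mean() could be misread as a method.
calculate_mean_height <- function(heights_cm) {
  mean(heights_cm, na.rm = TRUE)
}

# Variable names that say what they hold, including the unit.
plant_heights_cm <- c(12.5, 15.0, NA, 9.5)
mean_height_cm <- calculate_mean_height(plant_heights_cm)
print(mean_height_cm)

# Rather than installing a package from inside the script, stop with a clear
# message if it is missing and let the user decide what to install.
check_package <- function(package_name) {
  if (!requireNamespace(package_name, quietly = TRUE)) {
    stop("Package '", package_name, "' is required; please install it first.")
  }
  invisible(TRUE)
}

check_package("stats")

# Note that the script never clears the workspace with rm(); it only creates
# the objects it needs.
```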