Everyone develops their own coding habits and style. Some people put a
lot of effort into making their source code readable, while others
don’t bother at all. Working together with different people is easier
when everyone uses the same standard. The checklist package defines a
set of standards and provides tools to validate whether your project or
R package adheres to these standards. You can run these tools
interactively on your machine. You can also add these checks as GitHub
Actions, which run them automatically after every push to the
repository on GitHub.
The most visible part of a coding style is the naming convention.
People use many different naming conventions within the R ecosystem
(Bååth 2012). Popular ones are `alllowercase`, `period.separated`,
`underscore_separated`, `lowerCamelCase` and `UpperCamelCase`. We
picked `underscore_separated` because that is the naming convention of
the tidyverse packages. It is also the default setting in the lintr
package, which we use for the static code analysis.
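
To make the difference concrete, the sketch below writes one name in each of the five styles and checks which ones fit a rough `underscore_separated` pattern. The example name, the helper `is_underscore_separated()` and the regular expression are assumptions made for illustration; they are not the rule that lintr or checklist actually applies.

```r
# The same hypothetical name written in each of the five conventions.
conventions <- c(
  alllowercase = "meantemperature",
  period.separated = "mean.temperature",
  underscore_separated = "mean_temperature",
  lowerCamelCase = "meanTemperature",
  UpperCamelCase = "MeanTemperature"
)

# Rough pattern (an assumption): groups of lower case letters and digits,
# joined by single underscores, starting with a letter.
is_underscore_separated <- function(x) {
  grepl("^[a-z][a-z0-9]*(_[a-z0-9]+)*$", x)
}

# grepl() drops names, so put them back for a readable result.
fits <- is_underscore_separated(conventions)
names(fits) <- names(conventions)
fits
```

The all-lowercase name passes this rough check as well, because it looks like an `underscore_separated` name that consists of a single word.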
At first this seems a lot to memorise. RStudio makes things easier
when you activate all diagnostic options (Tools > Global options >
Code > Diagnostics). This highlights several problems by showing a
squiggly line and/or a warning icon at the line number. Instead of
learning all the rules by heart, run `check_lintr()` regularly and fix
any issues that come up. Do this in every project you work on. Doing so
forces you to use the same coding style consistently, which makes it
easy to learn and apply.
- Use `underscore_separated` names for functions, parameters and variables.
- … (`%>%`)
- … `{`
- … `}`
- … `<-`, `->`, `=`
- … `+`, `-`, `*`, `/`, …, `(` or `[`
- … (`if ()`, `for ()`, `while ()`) and `{`, e.g. `function () {`
- Use double quotes (`"`) to define character strings.
- Use `<-` or `->` to assign something. Only use `=` to pass arguments to a function (e.g. `check_package(fail = TRUE)`).
- Use `is.na(x)` instead of `x == NA`.
- Use `seq_len()` or `seq_along()` instead of `1:length(x)`, `1:nrow(x)`, … Advantage: when `length(x) == 0`, `1:length(x)` yields `c(1, 0)`, whereas `seq_along(x)` yields an empty vector.
- … `git`. If it is code that you need to run only under special circumstances, then either put the code in a separate script and run it manually, or write an if-else where you run the code automatically when needed.
- Use `assertthat::assert_that()` to validate objects or conditions instead of `if() stop()`.
- … `ifelse()` instead of `if()`.
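
Two of the rules above are easiest to appreciate by running them. The following base R sketch uses made-up vectors `x` and `y` to show what goes wrong with `1:length(x)` on an empty vector and with `x == NA`. The counters `n_colon` and `n_seq` exist only for this illustration.

```r
# An empty vector makes the difference visible.
x <- character(0)

1:length(x)   # counts down from 1 to 0
seq_along(x)  # an empty integer vector

# Count how often each loop body runs.
n_colon <- 0
for (i in 1:length(x)) {
  n_colon <- n_colon + 1
}
n_seq <- 0
for (i in seq_along(x)) {
  n_seq <- n_seq + 1
}
c(colon = n_colon, seq_along = n_seq)

# Comparing with NA always yields NA; is.na() gives a usable answer.
y <- c(1, NA, 3)
y == NA
is.na(y)
```

The loop over `1:length(x)` runs twice even though there is nothing to iterate over, while the loop over `seq_along(x)` does not run at all.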
To make this easier to remember we chose the same naming conventions
for file names as for objects. We acknowledge that these rules sometimes
clash with requirements from other sources (e.g. `DESCRIPTION` in an R
package, `README.md` on GitHub, `.gitignore` for git, …). In such cases
we allow the file names as required by R, git or GitHub. When
`check_filename()` unfairly rejects a certain file or folder name,
please open an issue on GitHub and explain why it should be allowed.
- … (`_`).
- … (`.`).
- … (`_`).
- … (`.R`, `.Rmd`, `.Rd`, `.Rnw`, `.Rproj`).
- … `csl`, `eps`, `jpg`, `jpeg`, `pdf`, `png` and `ps`.
- … (`-`) as separator instead of an underscore (`_`). We need this exception because underscores cause problems in certain situations.

Most users think of an R package as a collection of generic functions that they can use to run their analysis. However, an R package is a useful way to bundle and document a stand-alone analysis too! Suppose you want to pass your code to a collaborator or your future self who is working on a different computer. If you have a project folder with a bunch of files, people will need to get to know your project structure, find out what scripts to run and which dependencies they need. Unless you documented everything well, they (including your future self!) will have a hard time figuring out how things work.
Having the analysis as a package and running `check_package()` to
ensure a minimal quality standard makes things a lot easier for the
user. Agreed, it will take a bit more time to create the analysis,
especially with the first few projects. In the long run you save time
thanks to the better quality of your code. If you want to learn how to
write a package, start by packaging a recurrent analysis or a
standardised report. Once you have some experience, it is little
overhead to do it for smaller analyses. Keep in mind that you seldom
run an analysis exactly once.
- … `remotes::install_github("inbo/packagename")`.
- … `inst` folder is an ideal place to bundle such scripts within the package. You can also use it to store small (!) datasets or R Markdown reports.
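
Files placed in `inst` end up in the top level of the installed package, so base R's `system.file()` can locate them. The sketch below shows how a user might find and run such a script. The package name `packagename` is the placeholder used above, and the path `scripts/run_analysis.R` is a hypothetical example; neither refers to an existing package or file.

```r
# "packagename" and "scripts/run_analysis.R" are hypothetical placeholders.
# A file stored as inst/scripts/run_analysis.R in the package source is
# installed as scripts/run_analysis.R.
script <- system.file("scripts", "run_analysis.R", package = "packagename")

# system.file() returns an empty string when the file or package is missing.
if (nzchar(script)) {
  source(script)
} else {
  message("Script not found. Is the package installed?")
}
```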