2 R basics & functions

In this session you should learn:

R and Python differences in syntax
Base R basics and good practice
Functional programming in R

2.1 I come from Python, how will I adapt to R?

Some results from the survey last week:

What is your experience with the following programming languages?

Source and interesting read: Agarwal (2023)

The good news is that if you come from Python or any other interpreted language, adapting to R should not be that hard. The code syntax is of course different; for instance, Python indexes from 0 and R from 1. Unfold the note below for a cheatsheet of the main peculiarities, based on Watson (n.d.).

R vs. Python syntax

```{r}
#| eval: false
# Packages
library(dplyr)

# Strings
paste0('Hello', 'World') # paste() would insert a space between the strings
paste(c('Hello', 'World'), collapse = ',')

# Booleans
TRUE && FALSE == FALSE
FALSE || TRUE == TRUE
!TRUE == FALSE

# Loops
for (i in 1:10) {
  print(i)
}
while (x > 0) {
  x = x - 1
}

# Conditionals
if (x > 0) {
  print('x is positive')
} else if (x == 0) {
  print('x is zero')
} else {
  print('x is negative')
}

ifelse(x>0, 1, -1)

# Functions
f = function(x,y) {
  x2 = x * x
  x2 + sqrt(y*x2+1)
}
\(x) x^2 + sqrt(y*x^2+1) # anonymous function, R >= 4.1

# Lists
myList = list(1, 2, "a", c(10,8,9))
myList[[3]] == "a" # [[ ]] extracts the element, [ ] returns a sub-list
myList[[4]][2] == 8
myList[length(myList)] # returns list(c(10,8,9))
2 %in% myList

# Ranges

seq(0, 2*pi, by = 0.1)
seq(0, 2*pi, length.out = 100)
0:5 == c(0, 1, 2, 3, 4, 5)

# Vectors and Matrices
A = matrix(c(1,3,2,4),nrow=2) # column-wise!
b = c(1,2)
t(A)
dim(A)
solve(A,b)
b > 0 # elementwise comparison
A^2 # elementwise product
A %*% A # matrix product
which(b > 0)
matrix(rep(2,100), nrow=10)
diag(4)
cbind(A,b)
rbind(A,b)

# Random numbers
set.seed(1234)
matrix(runif(100),nrow=10)
rnorm(10)
sample(10:99,1)

# Plot

plot(runif(100))
```

```{python}
#| eval: false
# Packages
import pandas as pd

# Strings
'Hello' + 'World'
','.join(['Hello', 'World'])

# Booleans
True and False == False
False or True == True
not True == False

# Loops
for i in range(1,11):
  print(i)
  
while x > 0:
  x -= 1


# Conditionals
if x > 0:
  print('x is positive')
elif x == 0:
  print('x is zero')
else:
  print('x is negative')


1 if x > 0 else -1

# Functions
def f(x,y):
  x2 = x * x
  return x2 + (y*x2+1)**(1/2)

lambda x: x**2 + (y*x**2+1)**(1/2)

# Lists
myList = [1, 2, "a", [10,8,9]]
myList[2] == "a"
myList[3][1] == 8
myList[-1] == [10, 8, 9]
2 in myList

# Ranges
import numpy as np
np.arange(0, 2*np.pi, step=0.1)
np.linspace(0, 2*np.pi, num=100)
list(range(5)) == [0,1,2,3,4]

# Vectors and Matrices
A = np.array([[1, 2], [3, 4]])
b = np.array([1, 2])
np.transpose(A) # or A.T
A.shape
np.linalg.solve(A, b)
b > 0 # elementwise comparison
b**2 # elementwise function application
A @ A # matrix product
np.where(b > 0)
np.full((10,10), 2)
np.eye(4) # 4 x 4 identity matrix
np.hstack((A,b[:,np.newaxis]))
np.vstack((A,b))

# Random numbers
np.random.seed(1234)
np.random.rand(10,10)
np.random.randn(10)
np.random.randint(10,100)

# Plot
import matplotlib.pyplot as plt
plt.plot(np.random.uniform(0, 1, 100))
```
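To make the indexing difference mentioned earlier concrete, here is a minimal sketch in R. The vector `letters_vec` is a made-up example and is not part of the cheatsheet.

```r
# Hypothetical example vector, for illustration only
letters_vec = c("a", "b", "c", "d", "e")

# R starts counting at 1 (Python: letters_vec[0])
first = letters_vec[1]

# Last element (Python: letters_vec[-1])
last = letters_vec[length(letters_vec)]

# A negative index DROPS that element in R, it does not count from the end
without_first = letters_vec[-1]

# Ranges include both ends (Python: letters_vec[0:3])
first_three = letters_vec[1:3]

print(first)
print(last)
print(without_first)
print(first_three)
```

The negative index is probably the biggest surprise when coming from Python: `letters_vec[-1]` returns everything except the first element.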

2.2 Base or “vanilla” R

In the note above you will find several examples of code syntax for base or “vanilla” R. R has several coding dialects. Knowing the basics of base R allows you to write R code without depending on other packages. However, most data science workflows are made easier by other dialects such as the tidyverse and data.table. Later in this lesson, you will solve a practical using base R (Practical 1). In the coming lessons, we will bring the tidyverse into our workflows.

Find some info on coding basics right here.

2.3 Basics and good practice

2.3.1 R objects

The fundamental building blocks of R programming are objects. They can be vectors, matrices, data frames, lists and even functions.

(my_vector = c(1, 2, 3, 4))

[1] 1 2 3 4

(my_matrix = matrix(1:9, nrow = 3, ncol = 3))

     [,1] [,2] [,3]
[1,]    1    4    7
[2,]    2    5    8
[3,]    3    6    9

(my_data_frame = data.frame(
  name = c("Alice", "Bob"),
  age = c(25, 30)
))

   name age
1 Alice  25
2   Bob  30

(my_list = list(name = "Alice", age = 25, scores = c(90, 85, 88)))

$name
[1] "Alice"

$age
[1] 25

$scores
[1] 90 85 88

(my_function = function(x) {
  return(x * 2)
})

function (x) 
{
    return(x * 2)
}
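If you are unsure what kind of object you are dealing with, you can ask R. The sketch below recreates the objects from above and inspects them with `class()` and `str()`, both of which are base R functions.

```r
# Same kinds of objects as above
my_vector = c(1, 2, 3, 4)
my_matrix = matrix(1:9, nrow = 3, ncol = 3)
my_data_frame = data.frame(name = c("Alice", "Bob"), age = c(25, 30))
my_list = list(name = "Alice", age = 25, scores = c(90, 85, 88))
my_function = function(x) {
  x * 2
}

# class() tells you what kind of object you have
class(my_vector)
class(my_matrix)
class(my_data_frame)
class(my_list)
class(my_function)

# str() gives a compact overview of the structure of any object
str(my_list)
```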

2.3.2 Object naming

You can create objects in R by assigning a value to a name.

a = 5
a

[1] 5

You will probably not use a as an object name, as it is not very descriptive.

There are certain rules to follow when naming objects in R:

It must start with a letter (or with a dot that is not followed by a number)

1 = 2
1_my_object = "foo"

Error in parse(text = input): <text>:2:2: unexpected input
1: 1 = 2
2: 1_
    ^

It can only contain letters, numbers, _, and .

pa$$word = 1234

Error in parse(text = input): <text>:1:4: unexpected '$'
1: pa$$
       ^
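One way to check these rules without triggering a parse error is the base R function `make.names()`, which turns any string into a syntactically valid name. The sketch below assumes that a name is valid exactly when `make.names()` leaves it unchanged; the candidate names are made up for illustration.

```r
# Hypothetical candidate names
candidate_names = c("roads_salzburg", "1_my_object", "pa$$word", "my object")

# A name is treated as valid here if make.names() leaves it unchanged
is_valid = make.names(candidate_names) == candidate_names
data.frame(name = candidate_names, valid = is_valid)

# make.names() also suggests a valid alternative for each candidate
make.names(candidate_names)
```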

Once you have followed these rules, you probably want your object names to be descriptive. There are different styles for naming.

If you are unfamiliar, I recommend:

using long names that describe your object
using snake_case (i.e. separating words with _)
starting with a prefix for related objects (will help you when you use autocompletion)

roads_salzburg
roads_linz
roads_salzburg_clean_topology
roads_salzburg_routing

2.3.3 Comments

Everything that comes after a # on a line will be ignored by R, both inside code chunks and in R scripts. You can use this feature to document what you are doing, especially for complex code that you need to revisit or send to someone else.

To comment out text in a Quarto file you can use the HTML comment syntax `<!-- comment -->`.

Keyboard shortcut for commenting code or text: Cmd/Ctrl + Shift + C

2.3.4 RStudio diagnostics

A useful feature of RStudio is that it tells you when something is wrong in a line of code. This shows up as a red x or a warning sign in the left margin of your code chunk, next to the line number, as well as a squiggly line under the potential problem. You can hover over them to see what the issue is.

my object = "foo"

45 == NA
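The second example is valid R, but it most likely does not do what was intended, which is presumably why it gets flagged. A comparison with `NA` returns `NA` rather than `TRUE` or `FALSE`. A minimal sketch of the problem and the usual alternative, `is.na()`, with a made-up `temperatures` vector:

```r
# Comparing with NA never returns TRUE or FALSE, it returns NA
result = 45 == NA
result

# Use is.na() to test for missing values instead
is.na(NA)
is.na(45)

# Hypothetical vector with one missing value
temperatures = c(21.5, NA, 19.0)
is.na(temperatures)
```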

2.3.5 Saving R scripts or Quarto files

Since last class we set RStudio to never save our Environment or History, every time you open RStudio you will start from a fresh session. That means you need to save your work, either in a script (.R) or in a Quarto file (.qmd). Here are some tips for naming your saved files:

Make your files machine readable
- Avoid spaces, symbols and special characters
- Don’t rely on case sensitivity to distinguish files (e.g. MyScript.R and myscript.R might be the same depending on your OS)
Make your files human readable
- Be descriptive about what is in the file
- You can play with the names to influence ordering in your directory, e.g. with numbers or a prefix+numbers to classify certain parts of your workflow (e.g. 01_load_data.R, 02_esda.R, 03_model.R, report1_methods.qmd, report2_validation.qmd)

How do you make sure that your R script holds all the objects in your Environment?

Save your R script (with hopefully all your work)
Restart R (keyboard shortcut Cmd/Ctrl + Shift + 0/F10)
Re-run the script (keyboard shortcut Cmd/Ctrl + Shift + S)

For Quarto files, a simple solution is to render the file. Rendering runs in a fresh session, so if the file does not contain everything it needs, it will not render and you will get an error.

2.4 R functions

2.4.1 Calling functions

Many built-in functions are available in base R, and they are also the basis of any other package that you load into R.

function_name(argument1 = value1, argument2 = value2, ...)

For instance, to generate random numbers from a normal distribution we can use the function rnorm. To learn about the arguments of rnorm, you can type ?rnorm in your R console.

rnorm(n = 20, mean = 3, sd = 2)

 [1] 3.7522881 1.9175253 5.0753426 5.5485323 0.5897940 1.6246797 6.1697586
 [8] 1.5835052 0.8340036 4.5483872 3.1691230 1.5419973 4.6020305 4.4242452
[15] 2.5477296 2.8507865 4.1058413 1.4882255 3.8729388 3.8733444

It is not necessary to name the arguments when you pass them in the order the function expects. This is handy when you call a function often and already know what to expect. But be careful, as this can be a source of errors.

rnorm(20, 3, 2)

 [1] 5.2568914 6.0994907 3.4218424 3.0532382 3.5609369 3.9902230 0.7136070
 [8] 3.6097589 1.5302034 4.0800409 5.0086449 4.3036964 1.8804318 3.9689288
[15] 3.5773657 3.8885775 2.7655724 3.4772708 0.6311512 1.4783737
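To see why unnamed arguments can be a source of errors, the sketch below compares named and positional calls. Fixing the seed with `set.seed()` makes the random draws repeatable, so the results can be compared directly; the seed value 42 is an arbitrary choice.

```r
set.seed(42)
by_name = rnorm(n = 5, mean = 3, sd = 2)

set.seed(42)
by_position = rnorm(5, 3, 2)

# Named arguments can be given in any order
set.seed(42)
named_any_order = rnorm(sd = 2, n = 5, mean = 3)

# All three calls are equivalent
identical(by_name, by_position)
identical(by_name, named_any_order)

# Swapping two unnamed arguments runs without any error,
# but now mean = 2 and sd = 3, so the result is different
set.seed(42)
swapped = rnorm(5, 2, 3)
identical(by_name, swapped)
```

R does not complain about the swapped call because both values are valid for either argument, so naming arguments is the safer habit when in doubt.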

Some functions have default values for certain arguments. That means you don’t need to explicitly define each argument, but the function is coded with a default in mind. For rnorm, mean and sd have the defaults 0 and 1 respectively. So you can do:

rnorm(20)

 [1] -1.66014444 -2.11715200 -0.81905700 -0.91044606 -1.68031028  0.94580627
 [7] -0.25567588 -0.50171466  0.47033467  1.45066041  1.35485226  0.04867679
[13]  1.14797705  0.50376185 -1.59551285 -1.50410030 -1.71209020 -0.79645217
[19]  1.28732472 -0.21076189
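Besides the help page, you can look up defaults from code. The following sketch uses the base R functions `args()` and `formals()`, and then overrides only one of the two defaults; the seed value is arbitrary.

```r
# args() prints the function signature, including default values
args(rnorm)

# formals() returns the arguments as a list
formals(rnorm)$mean
formals(rnorm)$sd

# Overriding only one default: mean stays 0, sd becomes 10
set.seed(1)
wide = rnorm(20, sd = 10)

set.seed(1)
standard = rnorm(20)

# With the same seed, the wide draws are the standard draws times 10
all.equal(wide, standard * 10)
```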

You can use autocompletion tools to help you find function names or the arguments. For this you can use TAB when typing a function. Read more on RStudio helpful functionalities here or discover them yourself on the go!

2.4.2 Iteration

When you repeatedly perform an action on different objects, iteration becomes a useful tool.

R already does this in a way, with its recycling¹ behavior.

x = 1:10
x

 [1]  1  2  3  4  5  6  7  8  9 10

x * 2

 [1]  2  4  6  8 10 12 14 16 18 20

x * c(2,3)

 [1]  2  6  6 12 10 18 14 24 18 30

x * c(2,3,4)

Warning in x * c(2, 3, 4): longer object length is not a multiple of shorter
object length

 [1]  2  6 12  8 15 24 14 24 36 20
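What happens in the examples above can be spelled out with `rep()`. This is only a sketch of the idea, not of how R implements recycling internally: the shorter vector is repeated until it matches the length of the longer one.

```r
x = 1:10

# c(2, 3) is repeated until it matches the length of x
recycled = rep(c(2, 3), length.out = length(x))
recycled

# Multiplying by the short vector gives the same result
identical(x * c(2, 3), x * recycled)

# With c(2, 3, 4), 10 is not a multiple of 3:
# the last repetition is cut short, hence the warning above
rep(c(2, 3, 4), length.out = length(x))
```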

In a more general sense, we can make use of tools within functional programming that in essence take other functions as inputs.

2.4.2.1 `apply` family

The apply family of functions applies a function to each element of an object.

is.numeric(x)

[1] TRUE

# returns a list, hence *l*apply
lapply(x, is.numeric)

[[1]]
[1] TRUE

[[2]]
[1] TRUE

[[3]]
[1] TRUE

[[4]]
[1] TRUE

[[5]]
[1] TRUE

[[6]]
[1] TRUE

[[7]]
[1] TRUE

[[8]]
[1] TRUE

[[9]]
[1] TRUE

[[10]]
[1] TRUE

Other members of the *apply() family are:

# returns a simplified output, usually a vector or matrix, hence *s*apply
sapply(x, is.numeric)

 [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE

# short for *v*ector apply, takes one more argument to specify the expected type
vapply(x, is.numeric, logical(1))

 [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE

# applies a function to subsets of a vector, based on a factor, 
# useful to compute a single grouped summary
tapply(mtcars$mpg, mtcars$cyl, mean)

       4        6        8 
26.66364 19.74286 15.10000

(mat = matrix(1:9, nrow = 3, ncol = 3))

     [,1] [,2] [,3]
[1,]    1    4    7
[2,]    2    5    8
[3,]    3    6    9

# works with matrices and arrays, applies a function to the margins of an array or matrix
apply(mat, 1, sum)

[1] 12 15 18

If you are wondering where mtcars came from, this is a base R dataset, useful for examples, that is already pre-loaded every time you start R!

Most of the functions above have equivalents in other coding styles like the tidyverse, which will be introduced next session. The most useful function from this set, however, is lapply(), since it helps you run your own functions over a list, so keep it in mind.
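As a minimal sketch of running your own function over a list with `lapply()`: the list `measurements` and the function `range_width()` below are invented for illustration.

```r
# Hypothetical list of numeric vectors, e.g. measurements per site
measurements = list(
  site_a = c(2.1, 3.4, 2.9),
  site_b = c(5.0, 4.2),
  site_c = c(1.8, 2.2, 2.0, 2.4)
)

# Our own function: difference between largest and smallest value
range_width = function(values) {
  max(values) - min(values)
}

# lapply() returns a list with one result per element, keeping the names
widths = lapply(measurements, range_width)
str(widths)

# An anonymous function works as well
counts = lapply(measurements, function(values) length(values))
unlist(counts)
```

Because the result is always a list, `lapply()` also works when your function returns something more complex than a single number, such as a data frame or a model.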

2.4.2.2 `for` loops

for loops are what is being used under the hood in the apply family.

for (element in vector) {
  # do something with element
}

Although powerful, for loops are usually discouraged in favor of apply functions in R because:

Performance: apply functions are often faster than naively written for loops, e.g. loops that grow their result at every iteration, because they allocate the output once and handle the iteration internally. This can lead to noticeable performance improvements, especially with large datasets.
Readability: apply functions can make code more concise and easier to read. They abstract away the loop mechanics, allowing you to focus on the operation being performed.
Less error-prone: for loops can be more prone to errors, such as off-by-one errors or incorrect indexing. You may end up in an infinite loop if there is a mistake in your code. apply functions reduce the risk of such mistakes by handling the iteration internally.

This does not mean that for loops are never used, but they are usually reserved for cases where the alternatives are not good enough.
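To compare the styles side by side, here is a small sketch that computes the same result three ways. The `values` vector is made up, and `sqrt()` stands in for any operation you might want to repeat.

```r
values = c(4, 9, 16, 25)

# for loop: preallocate the result, then fill it by index
roots_loop = numeric(length(values))
for (i in seq_along(values)) {
  roots_loop[i] = sqrt(values[i])
}

# vapply: the iteration and the result vector are handled for you
roots_vapply = vapply(values, sqrt, numeric(1))

# sqrt() is vectorised, so here no explicit iteration is needed at all
roots_vectorised = sqrt(values)

identical(roots_loop, roots_vapply)
identical(roots_loop, roots_vectorised)
```

Note the use of `seq_along(values)` instead of `1:length(values)` in the loop: it also behaves correctly when `values` is empty.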

2.4.3 Creating functions

Functions let you automate tasks and make your code more organized. If you find yourself repeating a piece of code over and over and just changing one parameter, then you can probably replace that workflow with a function.

Reasons to create a function, as explained in Wickham, Çetinkaya-Rundel, et al. (2023):

When requirements change, you only update code once.
Eliminate errors from copy-pasting. e.g. you won’t forget to update a variable name in all the places you use it.
Organized code: you can name your function something intuitive to remind you of the task you are undertaking.
Reuse workflows between projects, making you more efficient.

An R function has three elements:

name = function(arguments) {
  # body
}

name
arguments: elements that vary across calls
body: code that is repeated across calls
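As a sketch of how the three elements come together, the example below rescales a numeric vector to the range 0 to 1. The function name `rescale01` and the choice of `mtcars` columns are illustrative assumptions.

```r
# Without a function, the same code is copied for each column:
# (mtcars$mpg - min(mtcars$mpg)) / (max(mtcars$mpg) - min(mtcars$mpg))
# (mtcars$hp - min(mtcars$hp)) / (max(mtcars$hp) - min(mtcars$hp))

# name: rescale01
# arguments: values, the part that varies across calls
# body: the computation that is repeated across calls
rescale01 = function(values) {
  value_range = range(values, na.rm = TRUE)
  (values - value_range[1]) / (value_range[2] - value_range[1])
}

mpg_scaled = rescale01(mtcars$mpg)
hp_scaled = rescale01(mtcars$hp)

range(mpg_scaled)
range(hp_scaled)
```

If the rescaling ever needs to change, for example to handle infinite values, only the body of `rescale01()` has to be updated.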

2.5 Further reading:

Workflow basics chapter (Wickham, Çetinkaya-Rundel, et al., 2023)
Workflow scripts and projects chapter (Wickham, Çetinkaya-Rundel, et al., 2023)
Base R chapter (Wickham, Çetinkaya-Rundel, et al., 2023)
Functions chapter (Wickham, Çetinkaya-Rundel, et al., 2023)
Iteration chapter (Wickham, Çetinkaya-Rundel, et al., 2023)
R vs Python: Which Programming Language is Better For Data Science in 2023 by Agarwal (2023)

Agarwal, A. (2023). R vs python: Which is better for data science in 2023. https://externlabs.com/blogs/r-vs-python/.

Allaire, J., & Dervieux, C. (2024). Quarto: R interface to quarto markdown publishing system. https://github.com/quarto-dev/quarto-r

Andreo, V. (2024). Get started with GRASS & r: The rgrass package. https://grass-tutorials.osgeo.org/content/tutorials/get_started/fast_track_grass_and_R.html.

Appel, M. (2024). Gdalcubes: Earth observation data cubes from satellite image collections. https://github.com/appelmar/gdalcubes

Appel, M., & Pebesma, E. (2019). On-demand processing of data cubes from satellite image collections with the gdalcubes library. Data, 4(3). https://www.mdpi.com/2306-5729/4/3/92

Appel, M., Pebesma, E., & Mohr, M. (2021). Cloud-based processing of satellite image collections in r using STAC, COGs, and on-demand data cubes. https://r-spatial.org/r/2021/04/23/cloud-based-cubes.html

Appelhans, T., Detsch, F., Reudenbach, C., & Woellauer, S. (2023). Mapview: Interactive viewing of spatial data in r. https://github.com/r-spatial/mapview

Aybar, C. (2023). Rgee: R bindings for calling the earth engine API. https://github.com/r-spatial/rgee/

Baddeley, A., Rubak, E., & Turner, R. (2015). Spatial point patterns: Methodology and applications with R. Chapman; Hall/CRC Press. https://www.routledge.com/Spatial-Point-Patterns-Methodology-and-Applications-with-R/Baddeley-Rubak-Turner/p/book/9781482210200/

Baddeley, A., & Turner, R. (2005). spatstat: An R package for analyzing spatial point patterns. Journal of Statistical Software, 12(6), 1–42. https://doi.org/10.18637/jss.v012.i06

Baddeley, A., Turner, R., Mateu, J., & Bevan, A. (2013). Hybrids of gibbs point process models and their implementation. Journal of Statistical Software, 55(11), 1–43. https://doi.org/10.18637/jss.v055.i11

Bivand, R. (2022). Modernizing the r-GRASS interface: Confronting barn-raised OSGeo libraries and the evolving r.*spatial package ecosystem. https://rsbivand.github.io/foss4g_2022/modernizing_220822.html.

Bivand, R. (2024a). Rgrass: Interface between GRASS geographical information system and r. https://rsbivand.github.io/rgrass/

Bivand, R. (2024b). Spdep: Spatial dependence: Weighting schemes, statistics. https://github.com/r-spatial/spdep/

Bivand, R. S., Pebesma, E., & Gómez-Rubio, V. (2013). Applied spatial data analysis with R, second edition. Springer, NY. https://asdar-book.org/

Bivand, R., Nowosad, J., & Lovelace, R. (2024). spData: Datasets for spatial analysis. https://jakubnowosad.com/spData/

Bivand, R., & Wong, D. W. S. (2018). Comparing implementations of global and local indicators of spatial association. TEST, 27(3), 716–748. https://doi.org/10.1007/s11749-018-0599-x

Câmara, G., Simoes, R., Souza, F., Pelletier, C., Sanchez, A., Andrade, P., Ferreira, K., & Queiroz, G. (2023). sits: Satellite image time series analysis on Earth observation data cubes. https://e-sensing.github.io/sitsbook/index.html

Çetinkaya-Rundel, M. (2024). Quarto dashboards video series. https://quarto.org/docs/blog/posts/2024-11-22-dashboards-workshop/.

Dunnington, D., Vanderhaeghe, F., Caha, J., & Muenchow, J. (2024a). Qgisprocess: Use QGIS processing algorithms. https://r-spatial.github.io/qgisprocess/

Dunnington, D., Vanderhaeghe, F., Caha, J., & Muenchow, J. (2024b). R package qgisprocess: Use QGIS processing algorithms. Version 0.4.1. https://r-spatial.github.io/qgisprocess/

Eddelbuettel, D. (2024). Digest: Create compact hash digests of r objects. https://github.com/eddelbuettel/digest

Gräler, B., Pebesma, E., & Heuvelink, G. (2016). Spatio-temporal interpolation using gstat. The R Journal, 8, 204–218. https://journal.r-project.org/archive/2016/RJ-2016-014/index.html

Grolemund, G. (2014). Hands-on programming with r. O’Reilly Media, Inc.

Grolemund, G., & Wickham, H. (2011). Dates and times made easy with lubridate. Journal of Statistical Software, 40(3), 1–25. https://www.jstatsoft.org/v40/i03/

Wickham, H., & Bryan, J. (2023). R packages (2nd ed.). O’Reilly Media.

Hijmans, R. J. (2020). Terra and luna: New r packages scalable geospatial data analysis. Big Data in Agriculture - 2020 Convention. https://www.youtube.com/watch?v=5b2xhqlH49I&t=690s

Hijmans, R. J. (2024a). Spatial data science with R and terra. https://rspatial.org/index.html.

Hijmans, R. J. (2024b). Terra: Spatial data analysis. https://rspatial.org/

Bryan, J., Hester, J., & the STAT 545 TAs. (2025). Let’s git started | happy git and GitHub for the useR. https://happygitwithr.com/.

Li, X., & Anselin, L. (2024). Rgeoda: R library for spatial data analysis. https://github.com/geodacenter/rgeoda/

Loiseau, N., Mouquet, N., Casajus, N., Grenié, M., Guéguen, M., Maitner, B., Mouillot, D., Ostling, A., Renaud, J., Tucker, C., Velez, L., Thuiller, W., & Violle, C. (2020). Global distribution and conservation status of ecologically rare mammal and bird species. Nature Communications, 11(1). https://doi.org/10.1038/s41467-020-18779-w

Lovelace, R., Nowosad, J., & Muenchow, J. (2019). Geocomputation with R. CRC Press.

Mahoney, M. (n.d.). Rsi: Efficiently retrieve and process satellite imagery (Version 0.2.0.9000). https://doi.org/10.5281/zenodo.10926857

Mahoney, M. (2024). Rsi: Efficiently retrieve and process satellite imagery. https://github.com/Permian-Global-Research/rsi

Padgham, M. (2019). Dodgr: An r package for network flow aggregation. Transport Findings. https://doi.org/10.32866/6945

Massicotte, P., & South, A. (2023). Rnaturalearth: World map data from natural earth. https://docs.ropensci.org/rnaturalearth/

Meyer, C. (2022). Understanding the basics of package writing in r. https://cosimameyer.com/post/understanding-the-basics-of-package-writing-in-r/.

Müller, K., & Wickham, H. (2023). Tibble: Simple data frames. https://tibble.tidyverse.org/

Padgham, M., Petutschnig, A., & Cooley, D. (2024). Dodgr: Distances on directed graphs. https://github.com/UrbanAnalyst/dodgr

Parry, J., & Locke, D. (2024). Sfdep: Spatial dependence for simple features. https://sfdep.josiahparry.com

Pawley, S. (2024). Rsagacmd: Linking r with the open-source SAGA-GIS software. https://stevenpawley.github.io/Rsagacmd/

Pebesma, E. (2018). Simple Features for R: Standardized Support for Spatial Vector Data. The R Journal, 10(1), 439–446. https://doi.org/10.32614/RJ-2018-009

Pebesma, E. (2024). Stars: Spatiotemporal arrays, raster and vector data cubes. https://r-spatial.github.io/stars/

Pebesma, E. (2025). Sf: Simple features for r. https://r-spatial.github.io/sf/

Pebesma, E. J. (2004). Multivariable geostatistics in S: The gstat package. Computers & Geosciences, 30, 683–691.

Pebesma, E., & Bivand, R. (2023). Spatial Data Science: With applications in R. Chapman and Hall/CRC. https://doi.org/10.1201/9780429459016

Pebesma, E., & Graeler, B. (2024). Gstat: Spatial and spatio-temporal geostatistical modelling, prediction and simulation. https://github.com/r-spatial/gstat/

Pedersen, T. L. (2024). Tidygraph: A tidy API for graph manipulation. https://tidygraph.data-imaginist.com

Pinheiro, J. C., & Bates, D. M. (2000). Mixed-effects models in s and s-PLUS. Springer. https://doi.org/10.1007/b98882

Pinheiro, J., Bates, D., & R Core Team. (2024). Nlme: Linear and nonlinear mixed effects models. https://svn.r-project.org/R-packages/trunk/nlme/

Plate, T., & Heiberger, R. (2024). Abind: Combine multidimensional arrays.

R Core Team. (2024). R: A language and environment for statistical computing. R Foundation for Statistical Computing. https://www.R-project.org/

Roger Bivand. (2022). R packages for analyzing spatial data: A comparative case study with areal data. Geographical Analysis, 54(3), 488–518. https://doi.org/10.1111/gean.12319

Rydzik, M. (2024, January 10). An Overview of the RSI R Package for Retrieving Satellite Imagery and Calculating Spectral Indices. https://geocompx.org/post/2024/rsi-bp1/

Simoes, R., Camara, G., Queiroz, G., Souza, F., Andrade, P., Santos, L., Carvalho, A., & Ferreira, K. (2021). Satellite image time series analysis for big earth observation data. Remote Sensing, 13(13), 2428. https://doi.org/10.3390/rs13132428

Simoes, R., Camara, G., Souza, F., & Carlos, F. (2024). Sits: Satellite image time series analysis for earth observation data cubes. https://github.com/e-sensing/sits/

Simoes, R., Carvalho, F., & Brazil Data Cube Team. (2024). Rstac: Client library for SpatioTemporal asset catalog. https://brazil-data-cube.github.io/rstac/

Simoes, R., Souza, F., Zaglia, M., Queiroz, G. R., Santos, R., & Ferreira, K. (2021). Rstac: An r package to access spatiotemporal asset catalog satellite imagery. 2021 IEEE International Geoscience and Remote Sensing Symposium IGARSS, 7674–7677. https://doi.org/10.1109/IGARSS47720.2021.9553518

Spinu, V., Grolemund, G., & Wickham, H. (2023). Lubridate: Make dealing with dates a little easier. https://lubridate.tidyverse.org

Therneau, T., & Atkinson, B. (2023). Rpart: Recursive partitioning and regression trees. https://github.com/bethatkinson/rpart

van der Meer, L., Abad, L., Gilardi, A., & Lovelace, R. (2025). Sfnetworks: Tidy geospatial networks. https://luukvdmeer.github.io/sfnetworks/

Vreede, B. (2023). Why your research deserves to be an r package. https://blog.esciencecenter.nl/why-your-research-deserves-to-be-an-r-package-3737a73501c.

Vuorre, M., & Crump, M. J. C. (2020). Sharing and organizing research products as r packages. Behavior Research Methods, 53(2), 792–802. https://doi.org/10.3758/s13428-020-01436-x

Watson, S. S. (n.d.). A Julia-Python-R reference sheet. Retrieved November 21, 2024, from https://docslib.org/doc/2547802/julia-python-r-cheatsheet

Wickham, H. (2011). Testthat: Get started with testing. The R Journal, 3, 5–10. https://journal.r-project.org/archive/2011-1/RJournal_2011-1_Wickham.pdf

Wickham, H. (2014). Tidy data. Journal of Statistical Software, 59(10). https://doi.org/10.18637/jss.v059.i10

Wickham, H. (2016). ggplot2: Elegant graphics for data analysis. Springer-Verlag New York. https://ggplot2.tidyverse.org

Wickham, H. (2019). Advanced R (p. 588). CRC Press.

Wickham, H. (2021). Mastering shiny. O’Reilly Media, Inc.

Wickham, H. (2023a). Forcats: Tools for working with categorical variables (factors). https://forcats.tidyverse.org/

Wickham, H. (2023b). Stringr: Simple, consistent wrappers for common string operations. https://stringr.tidyverse.org

Wickham, H. (2023c). Tidyverse: Easily install and load the tidyverse. https://tidyverse.tidyverse.org

Wickham, H. (2024). Testthat: Unit testing for r. https://testthat.r-lib.org

Wickham, H., Averick, M., Bryan, J., Chang, W., McGowan, L. D., François, R., Grolemund, G., Hayes, A., Henry, L., Hester, J., Kuhn, M., Pedersen, T. L., Miller, E., Bache, S. M., Müller, K., Ooms, J., Robinson, D., Seidel, D. P., Spinu, V., … Yutani, H. (2019). Welcome to the tidyverse. Journal of Open Source Software, 4(43), 1686. https://doi.org/10.21105/joss.01686

Wickham, H., Bryan, J., Barrett, M., & Teucher, A. (2024). Usethis: Automate package and project setup. https://usethis.r-lib.org

Wickham, H., Çetinkaya-Rundel, M., & Grolemund, G. (2023). R for data science (2nd ed.). O’Reilly Media.

Wickham, H., Chang, W., Henry, L., Pedersen, T. L., Takahashi, K., Wilke, C., Woo, K., Yutani, H., Dunnington, D., & van den Brand, T. (2024). ggplot2: Create elegant data visualisations using the grammar of graphics. https://ggplot2.tidyverse.org

Wickham, H., Danenberg, P., Csárdi, G., & Eugster, M. (2024). roxygen2: In-line documentation for r. https://roxygen2.r-lib.org/

Wickham, H., François, R., Henry, L., Müller, K., & Vaughan, D. (2023). Dplyr: A grammar of data manipulation. https://dplyr.tidyverse.org

Wickham, H., & Henry, L. (2023). Purrr: Functional programming tools. https://purrr.tidyverse.org/

Wickham, H., Hesselberth, J., Salmon, M., Roy, O., & Brüggemann, S. (2024). Pkgdown: Make static HTML documentation for a package. https://pkgdown.r-lib.org/

Wickham, H., Hester, J., & Bryan, J. (2024). Readr: Read rectangular text data. https://readr.tidyverse.org

Wickham, H., Hester, J., Chang, W., & Bryan, J. (2022). Devtools: Tools to make developing r packages easier. https://devtools.r-lib.org/

Wickham, H., Vaughan, D., & Girlich, M. (2024). Tidyr: Tidy messy data. https://tidyr.tidyverse.org

Xie, Y. (2014). Knitr: A comprehensive tool for reproducible research in R. In V. Stodden, F. Leisch, & R. D. Peng (Eds.), Implementing reproducible computational research. Chapman; Hall/CRC.

Xie, Y. (2015). Dynamic documents with R and knitr (2nd ed.). Chapman; Hall/CRC. https://yihui.org/knitr/

Xie, Y. (2025). Knitr: A general-purpose package for dynamic report generation in r. https://yihui.org/knitr/

¹ Read more about recycling in R here.