Short Course on R Tools

Develop your own professional R package

Mehdi Maadooliat and Hossein Haghbin

Marquette University
SCoRT - Summer 2025

Outline

  • Why Build an R Package?
  • System Setup
  • Package Structure
  • Package States in R
  • Create a Package
  • Functions & Documentation
  • Writing Functions and Documenting
  • Build a Simple Package from Scratch

📦 R Packages: The Fundamental Unit

In R, the fundamental unit of shareable code is the package.

A package bundles together:

  • 🧑‍💻 Code
  • 📊 Data
  • 📚 Documentation
  • 🧪 Tests

➡️ All in one place, easy to share with others.

📦 You Already Use Packages!

If you’re here, you already know how to work with packages:

  • 📥 Install a package from CRAN:

    install.packages("x")
  • ▶️ Use a package in R:

    library("x")
  • ❓ Get help on a package:

    package?x
    help(package = "x")

🎯 Goal of This Talk

This workshop is about moving from using packages ➑️ to developing your own.

Why?

  • πŸš€ To share your code with others
  • πŸ“¦ To make your code easy to install, use, and learn
  • ⏳ To save yourself time with conventions and structure

System Setup

Make sure you have:

  • 🛠 The latest version of R.
  • 🛠 The latest version of RStudio.

📦 Required Packages:

Install the essential development tools:

install.packages(c("devtools", "roxygen2", "testthat", "knitr"))

📦 Key Packages

  • devtools: Simplifies package development by wrapping complex workflows into easy commands

  • roxygen2: Generates documentation from special comments in your function code

  • testthat: Provides a framework for unit testing and ensures your functions work correctly and safely over time

  • knitr: Powers dynamic report generation and vignette building that integrates code, results, and text using R Markdown

💡 These tools help automate, test, document, and share your package like a pro.
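Before going further, it can help to confirm that the four packages above are actually installed. The snippet below is a minimal sketch using only base R; the variable names (dev_pkgs, is_installed, missing_pkgs) are illustrative and not part of any package.

```r
# Development packages named in this section
dev_pkgs <- c("devtools", "roxygen2", "testthat", "knitr")

# requireNamespace() returns TRUE/FALSE without attaching the package
is_installed <- vapply(dev_pkgs, requireNamespace, logical(1), quietly = TRUE)
print(is_installed)

# Names of any packages that still need to be installed
missing_pkgs <- dev_pkgs[!is_installed]
if (length(missing_pkgs) > 0) {
  message("Still missing: ", paste(missing_pkgs, collapse = ", "))
  # install.packages(missing_pkgs)  # uncomment to install from CRAN
}
```

The install call is left commented out so the check itself never touches the network.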

Package Structure

🧩 Directory layout

  mypkg/
  β”œβ”€β”€ DESCRIPTION         # Package metadata
  β”œβ”€β”€ NAMESPACE           # Exported functions & imports
  β”œβ”€β”€ R/                  # Your R functions
  └── man/                # Auto-generated documentation

πŸ’‘ Tip: Many folders in an R package are optional and included as needed.

🧩 Directory layout

  mypkg/
  β”œβ”€β”€ DESCRIPTION         # Package metadata
  β”œβ”€β”€ NAMESPACE           # Exported functions & imports
  β”œβ”€β”€ R/                  # Your R functions
  β”œβ”€β”€ man/                # Auto-generated documentation
  β”œβ”€β”€ tests/              # Unit tests with testthat
  β”œβ”€β”€ vignettes/          # Long-form documentation (Rmd)
  β”œβ”€β”€ data/               # Data sets (.rda files)
  └── inst/               # Installed files (e.g., app/, extdata/)
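To make the layout concrete, the following sketch builds the minimal skeleton by hand in a temporary directory, using base R only. In practice a helper such as usethis::create_package() does this for you; the field values and the hello() function here are placeholders invented for illustration.

```r
# Build a throwaway source package skeleton in a temporary directory
pkg_dir <- file.path(tempdir(), "mypkg")
dir.create(file.path(pkg_dir, "R"), recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(pkg_dir, "man"), showWarnings = FALSE)

# DESCRIPTION uses the "Field: value" (DCF) format; values are placeholders
desc <- data.frame(
  Package = "mypkg",
  Title = "A Placeholder Title",
  Version = "0.0.0.9000",
  Description = "A placeholder description.",
  stringsAsFactors = FALSE
)
write.dcf(desc, file = file.path(pkg_dir, "DESCRIPTION"))

# Placeholder NAMESPACE; roxygen2 would normally generate this file
writeLines("# placeholder", file.path(pkg_dir, "NAMESPACE"))

# One function file in R/
writeLines('hello <- function() "Hello from mypkg"',
           file.path(pkg_dir, "R", "hello.R"))

# Show the resulting layout
print(list.files(pkg_dir, recursive = TRUE, include.dirs = TRUE))
```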

🛠 Package States in R

R packages transition through five development states:

  • 🗂️ Source: your raw package folder
  • 📦 Bundled: compressed .tar.gz for sharing
  • 🧱 Binary: platform-specific precompiled version
  • 📚 Installed: available in your R library
  • 🧠 In-Memory: actively loaded via library()

Understanding these states helps you manage installation, sharing, and usage workflows.

🗂️ Source Package

A source package is just a folder with a specific structure:

  • DESCRIPTION file
  • R/ folder with .R files
  • Optional: man/, tests/, vignettes/

It’s editable and human-readable: your starting point for development.

📦 Bundled Package

A bundled package is a compressed .tar.gz file created from a source package.

  • Commonly called a source tarball
  • Created using devtools::build()
  • Platform-independent format for distribution

It acts as a transportable unit: not directly usable until installed.

🧱 Binary Package

A binary package is a platform-specific compiled package:

  • Windows: .zip
  • macOS: .tgz
  • Created using devtools::build(binary = TRUE)

Ideal for users without development tools β€” typically distributed by CRAN.

📚 Installed Package

An installed package is one that’s been unpacked and placed into a library folder.

  • No longer a single file
  • Ready to be loaded into memory

install.packages() or devtools::install_*() bring packages into this state.

🧠 In-Memory Package

An in-memory package is an installed package that’s been loaded into the R session.

  • Use library(pkgname) to load
  • Makes all exported functions and objects available

Use library() to see loaded packages.
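The installed and in-memory states can be checked from the console. The helper below is a small sketch using base R only; pkg_state() is a hypothetical name introduced for illustration. It also separates "loaded" (namespace in memory) from "attached" (on the search path, as after library()).

```r
# Report which state a package is in, by name
pkg_state <- function(pkg) {
  c(
    installed = nzchar(system.file(package = pkg)),    # found in some library
    loaded    = pkg %in% loadedNamespaces(),           # namespace is in memory
    attached  = paste0("package:", pkg) %in% search()  # on the search path
  )
}

pkg_state("stats")               # a base package that ships with R
pkg_state("notARealPackage123")  # hypothetical name, expected to be all FALSE
```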

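🔁 Transitioning Between States

  • Source → Bundled: devtools::build()
  • Source → Binary: devtools::build(binary = TRUE)
  • Source, Bundled, or Binary → Installed: install.packages() or devtools::install_*()
  • Installed → In-Memory: library(pkgname)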

📦 Package vs. 📚 Library

  • Package → bundle of code, data, docs, tests
  • Library → directory on your computer that contains installed packages

👉 Think of a library as a bookshelf full of packages.

⚠️ Common Confusion

It’s common to hear people say:

β€œI loaded the dplyr library.”

But actually:

  • dplyr is a package
  • It lives inside a library
  • You load it with library(dplyr)

πŸ‘‰ A library is just where packages live (a directory)

📚 Multiple Libraries in R

  • R allows you to have multiple library paths on your system
  • Each library contains a set of installed packages
  • Check active library paths with:

    .libPaths()
    # [1] "C:/Program Files/R/R-4.5.1/library"

🔎 Exploring Installed Packages

lapply(.libPaths(), list.dirs, recursive = FALSE, full.names = FALSE)
# [[1]]
#  [1] "abind"   "ash"     "askpass"   "backports"
#  [5] "base"    "base64enc" "bitops"   "boot"
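As a complementary view, the sketch below counts how many packages each library holds and shows which library a given package was found in. It relies only on base R and utils; the variable names are illustrative.

```r
# One entry per library path: how many packages are installed there
libs <- .libPaths()
pkg_counts <- vapply(
  libs,
  function(lib) nrow(installed.packages(lib.loc = lib)),
  integer(1)
)
print(pkg_counts)

# The library (directory) in which the base package was found
print(dirname(find.package("base")))
```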

🚫 Avoid library() Inside Packages

Do not use library() or require() inside your package code

  • Packages declare dependencies via DESCRIPTION and NAMESPACE
  • library() is for scripts and interactive sessions, not for packages

Instead, use:

  • Imports: in DESCRIPTION
  • @import or @importFrom in Roxygen2 comments

βœ… This is one of the biggest mental shifts when moving from scripts to package development
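A short sketch of what this looks like in package code, assuming the package lists stats under Imports: in its DESCRIPTION. The function names center_median() and center_median2() are hypothetical.

```r
# Option 1: qualify the call with pkg::fun, no library() needed
center_median <- function(x) {
  x - stats::median(x)
}

# Option 2: import the function once via a roxygen2 tag, then call it unqualified
#' @importFrom stats median
center_median2 <- function(x) {
  x - median(x)
}

center_median(c(1, 2, 6))
#> [1] -1  0  4
```

Either option keeps the dependency explicit, whereas library(stats) inside a function would silently change the user's search path.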

Create a Package

Before creating your R package, you need to choose a name.

🧠 This can be the hardest part of the process!

πŸ“› Formal Naming Rules

A valid R package name must:

  1. Contain only letters, numbers, and periods (.)
  2. Start with a letter
  3. Not end with a period

❌ You cannot use:

  • Hyphens -
  • Underscores _
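These rules are simple enough to check with a regular expression. The helper below is an illustrative sketch; is_valid_pkg_name() is a hypothetical name and not a function from any package. As a side effect of the pattern, single-character names are rejected, which matches R's own requirement of at least two characters.

```r
# Hypothetical helper: TRUE when `name` satisfies the formal naming rules
is_valid_pkg_name <- function(name) {
  # starts with a letter, continues with letters/digits/periods,
  # and ends with a letter or digit (so never with a period)
  grepl("^[A-Za-z][A-Za-z0-9.]*[A-Za-z0-9]$", name)
}

candidates <- c("mypkg", "data.table", "r2d3", "my_pkg", "my-pkg", "2fast", "pkg.")
setNames(is_valid_pkg_name(candidates), candidates)
```

The first three candidates pass; the remaining four fail because of the underscore, the hyphen, the leading digit, and the trailing period respectively.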

🔍 Naming Tips for Shared Packages

If you plan to share your package:

  • ✅ Choose a unique name that’s easy to Google
  • 🔎 Avoid names already on CRAN or Bioconductor
  • ⌨️ Stick to all lowercase; mixed case is hard to remember (is it RGTK2 or RGtk2?)
  • 🗣️ Prefer pronounceable names, which are easier to talk and think about

🧠 Naming Patterns and Examples

Evocative Names:

  • lubridate β†’ makes dates easier
  • r2d3 β†’ tools for D3 visualizations
  • forcats β†’ tools for working with categorical variables

Abbreviations

  • Rcpp β†’ R + C++
  • brms β†’ Bayesian Regression Models using Stan

🛠️ Creating Your Package

Once you have a name, create the package using either:

  • usethis::create_package("mypkg")
  • RStudio UI:
    File → New Project → New Directory → R Package

👉 Both options run the same function under the hood.

📦 What Gets Created?

  • R/ folder → for your function code
  • DESCRIPTION file → metadata
  • NAMESPACE file → exports/imports
  • mypkg.Rproj → RStudio project file
  • .Rbuildignore, .gitignore → build and Git helpers


📄 DESCRIPTION File

The DESCRIPTION file provides overall metadata about your package:

  • Package name and version
  • Author and maintainer info
  • Dependencies (Imports, Suggests)
  • Description and license
  • Build-related metadata

🧾 Sample DESCRIPTION File

Package: Rfssa
Title: Functional Singular Spectrum Analysis
Version: 3.1.0
Authors@R: c(
    person("Hossein", "Haghbin", email = "haghbin@pgu.ac.ir", role = c("aut", "cre"), comment = c(ORCID = "0000-0001-8416-2354")),
    person("Mehdi", "Maadooliat", email = "mehdi.maadooliat@mu.edu", role = "aut", comment = c(ORCID = "0000-0002-5408-2676"))
    )
Maintainer: Hossein Haghbin <haghbin@pgu.ac.ir>
Description: Methods and tools for implementing functional singular spectrum analysis and related techniques.
License: GPL-3
Encoding: UTF-8
Roxygen: list(markdown = TRUE)
RoxygenNote: 7.2.3
Depends: R (>= 4.0.0)

A More Detailed DESCRIPTION File

Package: Rfssa
Title: Functional Singular Spectrum Analysis
Version: 3.1.0
Authors@R: c(
    person("Hossein", "Haghbin", email = "haghbin@pgu.ac.ir", role = c("aut", "cre"), comment = c(ORCID = "0000-0001-8416-2354")),
    person("Mehdi", "Maadooliat", email = "mehdi.maadooliat@mu.edu", role = "aut", comment = c(ORCID = "0000-0002-5408-2676"))
    )
Maintainer: Hossein Haghbin <haghbin@pgu.ac.ir>
Description: Methods and tools for implementing functional singular spectrum analysis and related techniques.
License: GPL-3
Encoding: UTF-8
Roxygen: list(markdown = TRUE)
RoxygenNote: 7.2.3
URL: https://github.com/haghbinh/Rfssa
LazyLoad: true
Imports: Rcpp, fda, lattice, plotly
LinkingTo: Rcpp, RcppArmadillo, RcppEigen
Suggests: 
    knitr
Depends: R (>= 4.0.0)
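Because DESCRIPTION is a plain "Field: value" file, its fields can be read programmatically. The sketch below writes a shortened copy of the example to a temporary file and reads it back with base R's read.dcf(); the temporary file and variable names are illustrative.

```r
# Write a shortened DESCRIPTION to a temporary file so the example is self-contained
desc_file <- tempfile("DESCRIPTION")
writeLines(c(
  "Package: Rfssa",
  "Title: Functional Singular Spectrum Analysis",
  "Version: 3.1.0",
  "Imports: Rcpp, fda, lattice, plotly",
  "Depends: R (>= 4.0.0)"
), desc_file)

# read.dcf() returns a character matrix with one column per field
desc <- read.dcf(desc_file)
print(colnames(desc))

# Split the comma-separated Imports field into a character vector
imports <- trimws(strsplit(desc[[1, "Imports"]], ",")[[1]])
print(imports)

# Version strings compare correctly once converted with package_version()
pkg_version <- package_version(desc[[1, "Version"]])
print(pkg_version >= "3.0.0")
```

For an installed package, utils::packageDescription("pkgname") returns the same fields without reading the file by hand.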

📝 Title vs. Description

Title

  • One line
  • Plain text: no markup and no trailing period
  • Capitalize like a title
  • Keep under ~65 characters

Description

  • One paragraph, plain text
  • Multiple sentences OK
  • Should describe what the package does

🧪 Example from ggplot2:

Title: Create Elegant Data Visualisations Using the Grammar of Graphics
Description: A system for 'declaratively' creating graphics,
  based on "The Grammar of Graphics". You provide the data,
  tell 'ggplot2' how to map variables to aesthetics, what
  graphical primitives to use, and it takes care of the details.

📦 Imports vs. Suggests

Imports

  • Packages required at runtime
  • Automatically installed with your package
  • Needed for core functionality

Suggests

  • Optional or development-time dependencies
  • Used in examples, tests, or vignettes
  • Not required to run the core package

Use usethis::use_package("pkg", type = "Imports") to manage these easily.
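Code that relies on a Suggests package should check for it at run time, since it may not be installed. Below is a minimal sketch, assuming a package that lists knitr under Suggests; pretty_table() is a hypothetical function name.

```r
# Hypothetical function in a package that lists knitr under Suggests
pretty_table <- function(df) {
  if (requireNamespace("knitr", quietly = TRUE)) {
    # Optional dependency is available: use the nicer formatter
    knitr::kable(df)
  } else {
    # Fallback that needs nothing beyond base R
    message("Install 'knitr' for nicer tables; falling back to print().")
    print(df)
  }
}

pretty_table(head(mtcars, 3))
```

Packages listed under Imports need no such guard, because they are always installed alongside your package.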

📂 NAMESPACE File

The NAMESPACE file defines the interface of your package.

It controls:

  • What functions your package exports
  • What functions it imports from other packages
  • S3/S4 method registrations (if needed)

This file is auto-generated by roxygen2, so you typically don’t edit it by hand.

🧾 Example: NAMESPACE File

# Generated by roxygen2: do not edit by hand
S3method("*", funts)
export(as.funts)
import(shiny)
importFrom(ggplot2, ggplot)
  • export() β€” makes a function available to users

  • import() β€” brings in all exported objects from a package

  • importFrom() β€” imports specific functions

  • S3method() β€” registers an S3 method
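The effect of export() can be observed on any installed package. The sketch below inspects the stats package, which ships with R, using base functions only; the variable names are illustrative.

```r
# Exported objects: what the package's export() directives make visible
stats_exports <- getNamespaceExports("stats")
print("median" %in% stats_exports)
#> [1] TRUE

# Everything defined inside the namespace, exported or not
all_objects <- ls(asNamespace("stats"), all.names = TRUE)

# Internal helpers that users cannot reach with stats::name
internal_only <- setdiff(all_objects, stats_exports)
print(length(internal_only) > 0)
```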

✍️ How to Generate NAMESPACE

In .R files:

#' @export
r_func <- function(x) { ... }

In .cpp files:

// [[Rcpp::export]]
double myRcpp-func(int x, double y) { ... }

Functions & Documentation

To add functionality to your package, you’ll write:

  • ✅ Functions: saved as .R scripts in the R/ folder
  • 📝 Documentation: using roxygen2 comments above each function

Let’s walk through the process.

📂 Code Placement

All your functions go in the R/ folder.

mypkg/
├── R/
│   ├── my_function.R
│   └── helpers.R

Each .R file can contain one or more functions.

📝 Documenting Functions with roxygen2

Use special comments starting with #' to write function documentation:

#' Sum of Squares
#'
#' Computes the sum of squares of a numeric vector.
#'
#' @param x Numeric vector
#' @return The sum of the squared elements of x (a single number)
#'
#' @examples
#' x <- 1:5
#' y <- my_function(x)
#' print(y)
#'
#' @export
my_function <- function(x) sum(x^2)

This will generate help files in the man/ folder.

⚙️ Generate Documentation

After writing your function and roxygen2 comments, run:

devtools::document()

This will:

  • Update the NAMESPACE file

  • Create .Rd help files in the man/ folder

💡 You can also press Ctrl+Shift+D in RStudio (Cmd+Shift+D on macOS).

Build a Simple Package from Scratch

🧱 Step 1: Create Your Package

Use usethis to create a new package directory:

usethis::create_package("minipkg")

This creates:

  • R/, DESCRIPTION, NAMESPACE

  • .Rproj, .gitignore, .Rbuildignore

  • Opens new RStudio project for your package

2️⃣ Add a Dataset

Let’s include a built-in dataset (mtcars) for simplicity:

usethis::use_data(mtcars, overwrite = TRUE)

This creates a data/mtcars.rda file, accessible via

minipkg::mtcars

3️⃣ Create a Function

Create a new file: R/summary_stats.R

#' Compute Summary Statistics
#'
#' Returns mean and standard deviation for each numeric column.
#'
#' @param data A data frame with numeric columns
#' @return A data frame with `mean` and `sd` per column
#'
#' @examples
#' summary_stats(mtcars)
#'
#' @export
summary_stats <- function(data) {
  numeric_data <- data[sapply(data, is.numeric)]
  data.frame(
    mean = sapply(numeric_data, mean),
    sd = sapply(numeric_data, sd)
  )
}

4️⃣ Document the Function

Run the following command to generate docs:

devtools::document()

This will:

  • Add export(summary_stats) to your NAMESPACE

  • Create man/summary_stats.Rd

5️⃣ Load and Test the Package

In the console:

devtools::load_all()

Then test the function:

summary_stats(mtcars)

You should see (first three rows shown):

                mean         sd
mpg        20.090625  6.0269481
cyl         6.187500  1.7859216
disp      230.721875 123.9386938
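The manual check above can be turned into an automated test with testthat. The sketch below repeats the function definition so it runs on its own, and skips the test if testthat is not installed. Inside the package, the test_that() block would typically live in tests/testthat/test-summary_stats.R (for example created with usethis::use_test("summary_stats")), and the choice of expectations here is only illustrative.

```r
# Copy of summary_stats() from step 3, so this sketch is self-contained
summary_stats <- function(data) {
  numeric_data <- data[sapply(data, is.numeric)]
  data.frame(
    mean = sapply(numeric_data, mean),
    sd = sapply(numeric_data, sd)
  )
}

result <- summary_stats(mtcars)

if (requireNamespace("testthat", quietly = TRUE)) {
  testthat::test_that("summary_stats returns mean and sd per numeric column", {
    testthat::expect_s3_class(result, "data.frame")
    testthat::expect_named(result, c("mean", "sd"))
    # mtcars has only numeric columns, so one row per column
    testthat::expect_equal(nrow(result), ncol(mtcars))
    testthat::expect_equal(result["mpg", "mean"], mean(mtcars$mpg))
  })
}
```

Once such a file exists in the package, devtools::test() runs every test in tests/testthat/.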

6️⃣ Build and Install the Package

Build the package locally:

devtools::build()

Install it:

devtools::install()

Now you can use it like any other package:

library(minipkg)
summary_stats(mtcars)

🚀 Open the help page: help("summary_stats")


Hands-on Exercises (30 min)

  1. Create a new package skeleton and add a simple function
  2. Write roxygen docs and generate help files

Resources & Further Reading

🙏 Thank you!

Questions & Discussion