Printing herbarium labels and exporting a digital copy of data • BarnebyLives

This vignette will show you how to create your herbarium labels. It also details how to export the data from the BarnebyLives format to some other popular herbarium database formats.

# devtools::install_github('sagesteppe/BarnebyLives')
library(BarnebyLives)
data('collection_examples')

You will need to create a directory (folder) where your labels will be placed. Here we create a directory named “HerbariumLabels” in our local Documents directory; this is where the final labels will end up. We also need to create a subdirectory named ‘raw’, which is where the initial output labels, generated by R interacting with your TeX distribution, will be written. Each label is made individually, and then later mosaicked (like tiles) onto pages for printing.

setwd('~/Documents')
dir.create('HerbariumLabels')
dir.create('HerbariumLabels/raw')

p <- '~/Documents/HerbariumLabels/raw'
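If you would rather not change the working directory, the same folders can probably be created from a full path in one call. This is only a minimal sketch: `base_dir` and `p_demo` are illustrative names, and a temporary directory stands in for '~/Documents'.

```r
# Stand-in for '~/Documents'; swap in your own location
base_dir <- tempdir()

# Path to the 'raw' subdirectory inside 'HerbariumLabels'
p_demo <- file.path(base_dir, 'HerbariumLabels', 'raw')

# recursive = TRUE creates 'HerbariumLabels' and 'raw' in one call;
# showWarnings = FALSE keeps re-runs quiet if the folders already exist
dir.create(p_demo, recursive = TRUE, showWarnings = FALSE)

dir.exists(p_demo)
```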

You can use the purrr function walk() to write many labels one after another in the following fashion. Be sure to specify reasonable paths to write to! The ‘raw’ directory at the location you specify stores the intermediary labels.

Before you let purrr::walk() work through the labels, you can copy the label template to a more easily accessible location on your computer. The code chunk below finds where the BarnebyLives package is installed for your current version of R, so that we can bring the template over from the package directory. As mentioned, BarnebyLives is a weirdo of an R package, so we also specify the location within the package that holds the actual template (‘skeleton’) into which our label values will be populated.

p2libs <- system.file(package = 'BarnebyLives')

folds <- file.path(
   'rmarkdown', 'templates', 'labels', 'skeleton', 'skeleton.Rmd')
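Before rendering, it may be worth confirming that the template path actually resolves. The sketch below wraps system.file() in a small check; `template_path()` is a hypothetical helper, not part of BarnebyLives, and it is demonstrated on a file that ships with every R installation so that it runs without the package.

```r
# Hypothetical helper: build the full path to a file inside an installed
# package and stop early with a readable message if it is missing
template_path <- function(pkg, rel_path) {
  full <- system.file(rel_path, package = pkg)
  # system.file() returns '' when the file (or the package) is not found
  if (!nzchar(full)) {
    stop('Could not find ', rel_path, ' in package ', pkg)
  }
  full
}

# Demonstrated on a file every R installation has; for the labels you would
# call template_path('BarnebyLives', folds)
template_path('stats', 'DESCRIPTION')
```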

In this example we are going to use the default BarnebyLives template to create some labels. You can create your own label template as well; that is discussed in the vignette “Custom Label Templates”. As mentioned, we will use purrr::walk(), which performs an operation on each record that is fed into it via the .x argument. While we specify the individual rows of our collections in this document, purrr::walk() calls another function, rmarkdown::render(), which loads the data for these records from a csv into an R session (one per label!) and writes the labels from there. Note that Collection_numbers are essential to making this happen. If you are really trying to MacGyver something, you can fake this value using another ID that is UNIQUE for each record and matches what is in the BL output csv data; the row numbers that write.csv() adds by default are one option.
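A stand-in ID of the kind described above could be built from the row position. This is a sketch on toy data: the `records` data frame and its `Collector` and `Taxon` columns are made up for illustration, and only the `Collection_number` column name comes from BarnebyLives.

```r
# Toy records with no usable collection numbers (hypothetical columns)
records <- data.frame(
  Collector = c('A. Botanist', 'A. Botanist', 'B. Collector'),
  Taxon = c('Astragalus sp.', 'Eriogonum sp.', 'Penstemon sp.')
)

# Use the row position as a stand-in ID
records$Collection_number <- seq_len(nrow(records))

# The ID must be unique, or more than one record would match a single label
stopifnot(!anyDuplicated(records$Collection_number))
```

Whatever ID you choose, it needs to be written to the same csv that the label template reads, so that the value passed to the template matches a row there.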

Note that for this to work on your own data, you will need to modify a chunk in the label template (‘skeleton’) so that R is reading in your data. Something like the below will be what you need to modify:

data <- read.csv('SoS-ExampleCollection.csv') |>
  dplyr::mutate(
    Collection_number = as.numeric(Collection_number), 
    Coordinate_Uncertainty = '+/- 5m') |>
  dplyr::filter(Collection_number == params$Collection_number)

I understand it is slightly obnoxious to need to edit two files to get this process to work, but again, BarnebyLives is not quite the typical R package. Editing both files allows for some downstream flexibility.

purrr::walk(
  .x = collection_examples$Collection_number,
  ~ rmarkdown::render(
    input = file.path(p2libs, folds), 
    # the template which will be populated
    output_file = file.path(p, glue::glue("{.x}.pdf")), 
    # the location and name for the output file. 
    params = list(Collection_number = {.x}) 
    # this is what purrr will walk through - the collection numbers. 
  )
)
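After a long run it can be handy to see whether any label failed to render. One possible check, assuming each output file is named after its collection number as in the call above, compares the IDs against the PDFs on disk. `missing_labels()` is a hypothetical helper, and the demonstration uses empty placeholder files in a temporary folder.

```r
# Hypothetical helper: which collection numbers have no PDF in `dir` yet?
missing_labels <- function(collection_numbers, dir) {
  written <- list.files(dir, pattern = '\\.pdf$')
  # strip the extension so file names compare directly with the IDs
  written_ids <- tools::file_path_sans_ext(written)
  setdiff(as.character(collection_numbers), written_ids)
}

# Demonstration with a fresh temporary folder and empty placeholder files
demo_dir <- tempfile('raw_demo')
dir.create(demo_dir)
file.create(file.path(demo_dir, c('101.pdf', '102.pdf')))

missing_labels(c(101, 102, 103), demo_dir)
```

With your own data the call would look like missing_labels(collection_examples$Collection_number, p).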

As I suspect that many users will want to create a custom label template, I show how easy it is to modify the above function calls below.

It is best practice to make a copy of the template and save it in a part of the computer that people are more likely to interact with, e.g. within the Documents directory. If you modify the copy of the template inside the R package, your changes will be erased whenever the package gets updated and you reinstall it! It is very unlikely you will be able to get a template back after it’s removed, so don’t do this!

Using the code below we can simply copy (file.copy) the template to the current working directory (supplied as argument: ‘.’).

file.copy(from = file.path(p2libs, folds), to =  '.')
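file.copy() reports whether the copy happened, and by default it will not replace a file that is already there, which protects a template you have already edited. The sketch below shows that behaviour with a placeholder file; `src_dir`, `dest_dir`, and the placeholder contents are illustrative stand-ins for the package folder and your working directory.

```r
# Stand-ins for the packaged template folder and your working folder
src_dir <- tempfile('pkg_demo')
dest_dir <- tempfile('work_demo')
dir.create(src_dir)
dir.create(dest_dir)

# A placeholder file playing the role of the packaged template
src <- file.path(src_dir, 'skeleton.Rmd')
writeLines('placeholder template', src)

# file.copy() returns TRUE on success; with overwrite = FALSE (the default)
# it returns FALSE rather than replacing an existing file
first_copy <- file.copy(from = src, to = dest_dir)
second_copy <- file.copy(from = src, to = dest_dir)

c(first_copy, second_copy)
```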

As mentioned, we will showcase some template changes in a different vignette, “Custom Label Templates”, but we will just use this default template again for this example. Now from that location you can run the script again, but this time with a modification to the ‘input’ argument of rmarkdown::render(), which purrr::walk() calls.

purrr::walk(
  .x = collection_examples$Collection_number,
  ~ rmarkdown::render(
    input = 'skeleton.Rmd', # specifying the location of our new template! 
    # R is assuming it is in our current working directory, and will fail if it is
    # not found!
    output_file = file.path(p, glue::glue("{.x}.pdf")),
    params = list(Collection_number = {.x})
  )
)

As you can see, you only need to change a single argument in the above code chunk to use a custom label template: just let rmarkdown::render() know where to find the template, and you should be good.

Exporting a digital copy of the data

This is normally the time when I also make a digital copy of the data for direct accessioning at the herbarium side.

Currently BL supports writing out data for mass upload in a few formats for herbaria (e.g. Symbiota, Jepson, Chicago Botanic Garden), but we are always eager to add more, so PLEASE do not hesitate to ask if you want something to be supported! There is a lot of code, but realistically all it does is replace NA values with '' (an empty string). This is the best format for people to get NULL values into their databases.

unique(database_templates$Database) # currently supported options. 

# KEEP THIS AROUND! Instead of writing explicit NA's we just return blank
# cells; this helps with import on most users' ends.
dat_import <- format_database_import(collection_examples, 'Symbiota') |>
  dplyr::mutate(
    dplyr::across(
      dplyr::everything(), ~ as.character(.)),
    dplyr::across(
      dplyr::everything(), ~ tidyr::replace_na(., '')))

write.csv(dat_import, 'Symbiota_format_collections_2024.csv', row.names = FALSE)
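To see what the NA replacement buys you, the same idea can be tried on toy data in base R and the written file inspected. This is a sketch only: the `toy` data frame and its column names are hypothetical stand-ins for the output of format_database_import().

```r
# Toy data standing in for the output of format_database_import()
toy <- data.frame(
  recordNumber = c(101, 102),
  habitat = c('sagebrush steppe', NA),
  stringsAsFactors = FALSE
)

# Convert every column to character, then swap NA for an empty string
toy_blank <- toy
toy_blank[] <- lapply(toy_blank, function(col) {
  col <- as.character(col)
  col[is.na(col)] <- ''
  col
})

out <- tempfile(fileext = '.csv')
write.csv(toy_blank, out, row.names = FALSE)

# Read the raw lines back to confirm no literal NA was written
written <- readLines(out)
written
```

The missing habitat is written as an empty quoted field rather than the text NA, which is what most import tools treat as a blank cell.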