GBIF North America DNA Publishing Workshop

Online

Friday May 9, 2025

9:00am - 4:00pm CT

Instructors: Stephen Formel (Formel Data Services), Chandra Earl (NEON)

General Information

The GBIF North America DNA Publishing Workshop is designed for researchers and collection managers looking to publish DNA-derived data using established biodiversity data standards. This workshop will introduce key data standards, tools, and workflows for sharing environmental and specimen-derived DNA data through the biodiversity data aggregator GBIF.

Participants will gain hands-on experience with data mapping in IPT, learn how to work with the DNA-Derived Data extension, and explore the GBIF Metabarcoding Data Toolkit (MDT). This is not a beginner’s workshop—attendees should have their own data and a basic understanding of biodiversity informatics.

For background on best practices in publishing DNA data, we recommend reading "Publishing DNA-derived data through biodiversity data platforms".
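As a rough illustration of the kind of mapping covered in the workshop, the sketch below splits a small table of sequence detections into Darwin Core occurrence records and matching DNA-Derived Data extension records. The input column names, the sample values, the placeholder primer strings, and the `map_rows` helper are all hypothetical, and using `occurrenceID` as the linking column is an assumption; your own dataset and the instructors' workflow may differ.

```python
# Hypothetical input: one row per sequence variant (ASV) detected in one sample.
# These column names and values are illustrative only, not a required format.
asv_rows = [
    {"sample": "S1", "asv": "ASV_001", "reads": 120, "taxon": "Salmo trutta",
     "date": "2024-06-01", "sequence": "ACGTACGTAC"},
    {"sample": "S1", "asv": "ASV_002", "reads": 8, "taxon": "Perca fluviatilis",
     "date": "2024-06-01", "sequence": "TTGACCGTAA"},
]


def map_rows(rows, target_gene, primer_forward, primer_reverse):
    """Split rows into occurrence records and DNA-derived extension records."""
    occurrences, dna_records = [], []
    for row in rows:
        # Assumption: sample ID plus ASV ID gives a unique occurrence identifier.
        occurrence_id = f"{row['sample']}:{row['asv']}"
        occurrences.append({
            "occurrenceID": occurrence_id,
            "basisOfRecord": "MaterialSample",
            "eventDate": row["date"],
            "scientificName": row["taxon"],
            "organismQuantity": row["reads"],
            "organismQuantityType": "DNA sequence reads",
        })
        # The extension record repeats the identifier so it can be linked
        # back to its occurrence record.
        dna_records.append({
            "occurrenceID": occurrence_id,
            "DNA_sequence": row["sequence"],
            "target_gene": target_gene,
            "pcr_primer_forward": primer_forward,
            "pcr_primer_reverse": primer_reverse,
        })
    return occurrences, dna_records


# Placeholder marker and primer values; replace with those used in your study.
occurrences, dna_records = map_rows(
    asv_rows,
    target_gene="12S rRNA",
    primer_forward="NNNNNNNN",
    primer_reverse="NNNNNNNN",
)
print(len(occurrences), "occurrence records,", len(dna_records), "extension records")
```

Keeping the two record sets in separate tables mirrors the core-plus-extension layout: each extension row points back to exactly one occurrence row.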

Who: This workshop is designed for collection managers and researchers who are already familiar with DNA science but need guidance on publishing DNA-derived data using community standards. Participants should be aware of the Integrated Publishing Toolkit (IPT) and have their own dataset to work with, though a standard dataset will also be provided for examples. This is not a beginner’s workshop: attendees should have a working knowledge of DNA barcoding, environmental DNA (eDNA), or related molecular techniques, but may be unfamiliar with data standards and tools such as MIxS, DwC, and the MDT.

Where: This training will take place online. The instructors will provide you with the information you will need to connect to this meeting.

When: Friday May 9, 2025; 9:00am - 4:00pm CT

Cost: This workshop is free to attend.

Requirements: Participants must have access to a computer with a Mac, Linux, or Windows operating system (not a tablet, Chromebook, etc.) that they have administrative privileges on. They should have a few specific software packages installed (listed below).

Contact: Please email steve@formeldataservices or chandra.earl@asu.edu for more information.


Code of Conduct

Everyone who participates in GBIF activities is required to conform to the Code of Conduct. If you need to report an incident, please contact either the Regional Representative for North America (David Bloom) or GBIF's Community and Capacity Manager (Mélianie Raymond).


Surveys

Please be sure to complete this survey before the workshop.

Pre-workshop Survey


Schedule

Workshop Schedule

9:00 - 9:30 AM CT Startup & Pre-workshop survey
9:30 - 10:15 AM CT Introduction to DNA Data
    Overview of DNA data types and publishing challenges
    Review of key literature and standards (MIxS, DwC, MDT, NCBI, iBOL)
10:15 - 10:45 AM CT Introduction to Example Datasets
10:45 - 11:00 AM CT Break
11:00 - 11:30 AM CT Introduction to the Darwin Core DNA extension
    Background & History
    General Approach to Using the Extension
11:30 AM - 12:30 PM CT Lunch
12:30 - 2:30 PM CT Dataset Mapping
    Explore real-world datasets and metadata structures
    Support for participant datasets or use of example datasets
2:30 - 2:45 PM CT Break
2:45 - 3:45 PM CT Publishing DNA Data
    Demo: Using IPT to publish data with the DNA extension
    Demo: Using the Metabarcoding Data Toolkit (MDT) for DNA data
3:45 - 4:00 PM CT Wrap-up
4:00 PM CT End of Workshop

Setup

To participate in this workshop, you will need access to software as described below. In addition, you will need an up-to-date web browser.

The Carpentries maintains a list of common installation issues on the Configuration Problems and Solutions wiki page. It is written as a reference for instructors, but may also be useful if you run into problems during setup.

Install the videoconferencing client

If you haven't used Zoom before, go to the official website to download and install the Zoom client for your computer.

Set up your workspace

Like other Carpentries workshops, you will be learning by "coding along" with the Instructors. To do this, you will need to have both the window for the tool you will be learning about (a terminal, RStudio, your web browser, etc.) and the window for the Zoom video conference client open. In order to see both at once, we recommend using one of the following setup options:

This blog post includes detailed information on how to set up your screen to follow along during the workshop.

GBIF Account

Making a GBIF account is pretty easy. Here are the instructions.

R

R is a programming language that is especially powerful for data exploration, visualization, and statistical analysis. To interact with R, we use RStudio. Note: R is not strictly used in this workshop, though it is mentioned and example scripts are shared.

Windows: Install R by downloading and running this .exe file from CRAN. Also, please install the RStudio IDE. Note that if you have separate user and admin accounts, you should run the installers as administrator (right-click on .exe file and select "Run as administrator" instead of double-clicking). Otherwise problems may occur later, for example when installing R packages.

Video Tutorial

Linux: Instructions for R installation on various Linux platforms (Debian, Fedora, Red Hat, and Ubuntu) can be found at <https://cran.r-project.org/bin/linux/>. These will instruct you to use your package manager (e.g. for Fedora run sudo dnf install R and for Debian/Ubuntu, add a PPA repository and then run sudo apt-get install r-base). Also, please install the RStudio IDE.

Python

Python is a popular language for research computing, and great for general-purpose programming as well. Installing all of its research packages individually can be a bit difficult, so we recommend Miniforge, the installer maintained by the conda-forge community. Note: Python is not strictly used in this workshop, though it is mentioned.

Regardless of how you choose to install it, please make sure you install a Python version >= 3.9 (e.g. 3.11 is fine, 3.6 is not).
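If you are unsure which version you have, a short check like the one below can confirm it. This is a minimal sketch: the `python_is_supported` helper is a hypothetical name introduced here, and the 3.9 minimum is taken from the requirement stated above.

```python
import sys

# Minimum version stated in the requirement above.
MIN_VERSION = (3, 9)


def python_is_supported(version_info=sys.version_info, minimum=MIN_VERSION):
    """Return True if the (major, minor) version meets the minimum."""
    return tuple(version_info[:2]) >= minimum


current = ".".join(str(part) for part in sys.version_info[:3])
if python_is_supported():
    print(f"Python {current} meets the workshop requirement.")
else:
    print(f"Python {current} is too old; please install 3.9 or newer.")
```

Comparing `(major, minor)` tuples avoids the pitfalls of comparing version strings, where "3.10" would sort before "3.9".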

Where Python is demonstrated, we will use the Jupyter Notebook, a programming environment that runs in a web browser (Jupyter Notebook will be installed by Miniforge). For this to work you will need a reasonably up-to-date browser. The current versions of the Chrome, Safari and Firefox browsers are all supported (some older browsers, including Internet Explorer version 9 and below, are not).

Windows

  1. Open https://conda-forge.org/download/ with your web browser.
  2. Download the Miniforge for Windows installer
  3. Double-click on the downloaded file (something like Miniforge3-Windows-x86_64.exe)
  4. If you get a "Windows protected your PC" pop-up from Microsoft Defender SmartScreen, click on "More info" and select "Run anyway"
  5. Follow through the installer using all of the defaults for installation, except make sure to check "Add Miniforge3 to my PATH environment variable".
  6. Download the environment file. Save the file to your Downloads folder.
    (The following steps require using the shell. If you aren't comfortable doing the installation yourself, stop here and request help at the workshop.)
  7. Search for the application "Miniforge Prompt", open it and run: conda env create -f .\Downloads\carpentries_environment.yml
  8. Close the terminal window.

macOS

  1. Open https://conda-forge.org/download/ with your web browser.
  2. Download the appropriate Miniforge installer for macOS
    (The following steps require using the shell. If you aren't comfortable doing the installation yourself, stop here and request help at the workshop.)
  3. Open a terminal window and navigate to the directory where the executable is downloaded (e.g., cd ~/Downloads).
  4. Type
    bash Miniforge3-
    and then press Tab to autocomplete the full file name. The name of the file you just downloaded should appear.
  5. Press Enter (or Return depending on your keyboard). You will follow the text-only prompts. To move through the text, press Spacebar. Type yes and press Enter (or Return) to approve the license. Press Enter (or Return) to approve the default location for the files. Type yes and press Enter (or Return) to prepend Miniforge to your PATH (this makes the Miniforge distribution the default Python).
  6. Download the environment file. Save the file to your Downloads folder.
  7. In the terminal, run: conda env create -f ~/Downloads/carpentries_environment.yml
  8. Close the terminal window.

Linux

  1. Open https://conda-forge.org/download/ with your web browser.
  2. Download the appropriate Miniforge installer for Linux
    (The following steps require using the shell. If you aren't comfortable doing the installation yourself, stop here and request help at the workshop.)
  3. Open a terminal window and navigate to the directory where the executable is downloaded (e.g., `cd ~/Downloads`).
  4. Type
    bash Miniforge3-
    and then press Tab to autocomplete the full file name. The name of the file you just downloaded should appear.
  5. Press Enter (or Return depending on your keyboard). You will follow the text-only prompts. To move through the text, press Spacebar. Type yes and press Enter (or Return) to approve the license. Press Enter (or Return) to approve the default location for the files. Type yes and press Enter (or Return) to prepend Miniforge to your PATH (this makes the Miniforge distribution the default Python).
  6. Download the environment file. Save the file to your Downloads folder.
  7. In the terminal, run: conda env create -f ~/Downloads/carpentries_environment.yml
  8. Close the terminal window.
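
After finishing the steps for your operating system, you may want to confirm that conda is reachable and that a new environment was created. The sketch below is one way to do that from Python, assuming conda was added to your PATH during installation. The helper names are hypothetical, and because the environment's name is defined inside the environment file, the script only lists what it finds rather than checking for a specific name.

```python
import json
import os
import shutil
import subprocess


def env_names_from_json(json_text):
    """Return environment folder names from `conda env list --json` output."""
    paths = json.loads(json_text).get("envs", [])
    return [os.path.basename(path.rstrip("/\\")) for path in paths]


def list_conda_envs():
    """Return conda environment names, or None if conda cannot be run."""
    conda = shutil.which("conda")
    if conda is None:
        return None
    try:
        result = subprocess.run(
            [conda, "env", "list", "--json"],
            capture_output=True, text=True, check=True,
        )
    except (subprocess.CalledProcessError, OSError):
        return None
    return env_names_from_json(result.stdout)


envs = list_conda_envs()
if envs is None:
    print("conda was not found or could not be run; ask for help at the workshop.")
else:
    print("conda environments found:", envs)
```

The first entry is normally the base installation folder; any environment created from the downloaded file should appear after it.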

Attribution

This work is derived from materials created by Software Carpentry, Data Carpentry, Library Carpentry, or The Carpentries. The content has been adapted for this workshop and is used under the terms of the Creative Commons Attribution 4.0 International License.

For more information about the original lessons, visit the following websites:

Changes have been made to the original content to suit the goals of this workshop.