NBIS/Elixir Workshop - Tools for Reproducible Research
Description
This workshop teaches how to work reproducibly by controlling and structuring project code and by managing environments and workflows. It focuses on giving participants hands-on experience with useful tools such as Git, Conda, Docker/Apptainer, Jupyter/Quarto and Snakemake/Nextflow through practical tutorials supplemented by short lectures.
This is a 'hybrid' course with on-site participation at either SciLifeLab Solna or Lund University. Please choose your preferred site in the registration form.
Topics covered
- Good practices for data analysis
- Version control and collaborative code development
- Package and environment management
- Workflow management
- Documentation and reporting
- Containerized computational environments
Course fee
A course fee of 3000 SEK will be invoiced to accepted participants. The fee includes lunches, coffee/tea and snacks as well as a course dinner. (Please note that NBIS cannot invoice individuals.)
Background
One of the key principles of proper scientific procedure is the act of repeating an experiment or analysis and being able to reach similar conclusions. Published research based on computational analysis (e.g. bioinformatics or computational biology) has often suffered from incomplete method descriptions (e.g. no list of the software versions used); unavailable raw data; and incomplete, undocumented and/or unavailable code. This essentially prevents any possibility of reproducing the results of such studies. The term “reproducible research” has been used to describe the idea that a scientific publication should be distributed along with all the raw data and metadata used in the study, all the code and/or computational notebooks needed to produce results from the raw data, and the computational environment or a complete description thereof.
Reproducible research not only leads to proper scientific conduct, but also enables other researchers to build upon previous work. Most importantly, the person who organises their work with reproducibility in mind will quickly realise the immediate personal benefits: an organised and structured way of working. The person who most often has to reproduce your analysis is your future self!
Learning Outcomes
- Organize and structure computational projects
- Track changes and collaborate on code using Git
- Install packages and manage software environments using Conda
- Structure computational steps into workflows with Snakemake and Nextflow
- Create automated reports and document analyses with Quarto and Jupyter
- Package and distribute computational environments using Docker and Apptainer (formerly Singularity)
Prerequisites & Technical Requirements
Prerequisites
- Familiarity with using the terminal (e.g. commands such as ls, cd, touch, mkdir, pwd, wget and man).
- Some knowledge of R and/or Python is beneficial but not strictly required.
Technical requirements
A laptop with access to a Linux/Unix terminal, such as Ubuntu (or similar), macOS or Windows Subsystem for Linux.