Description

The course provides an introduction to machine learning methods and workflows for life science research. It introduces the full end-to-end machine learning (ML) workflow, from data preprocessing and feature engineering to model training, evaluation, interpretation, and reproducible reporting, with a focus on the analysis of complex, high-dimensional biological data. Participants explore biological datasets using unsupervised methods such as dimensionality reduction and clustering, and build predictive models using supervised approaches including linear and tree-based models. The course also covers methods for multi-omics integration, including partial least squares (PLS), and regularization techniques for high-dimensional data, such as ridge, lasso, and elastic net.

Content

  • Overview of the machine learning workflow
  • Dimensionality reduction methods such as PCA and UMAP
  • Unsupervised learning and clustering methods
  • Supervised learning models, including tree-based models
  • Partial least squares (PLS) for multi-omics integration
  • Regularization methods
  • Model training, evaluation and validation strategies
  • Model interpretation and explainable machine learning methods

Details

Dates
12 - 16 October 2026
Application deadline
28 August 2026, 11:01
Contact

edu.ml-biostats@nbis.se

Venue
SciLifeLab Uppsala, Entrance C11, BMC, Husargatan 3, Uppsala
City
Uppsala
Country
Sweden
Language
English
Cost
3000 SEK: Academic
15000 SEK: Non-academic
Timezone
Stockholm

Learning Outcomes

  • Explain the main components of the machine learning workflow and their role in life science research.
  • Perform data preprocessing and exploratory analysis of high-dimensional biological datasets.
  • Apply unsupervised learning methods to discover structure and generate biological hypotheses.
  • Train, evaluate, and compare supervised learning models commonly used in life sciences.
  • Assess model performance using appropriate evaluation metrics and validation strategies.
  • Apply regularization techniques to improve model generalization in high-dimensional settings.
  • Interpret and communicate model results using explainable machine learning techniques.
  • Apply basic principles of reproducible and FAIR machine learning workflows.
  • Deploy and share machine learning models using accessible tools.
  • Collaborate in interdisciplinary teams to design, implement, and present an ML-based data analysis.

Prerequisites & Technical Requirements

Prerequisites

  • Basic programming skills in R or Python, including working with data frames and running scripts
  • Prior exposure to basic statistical concepts (e.g. descriptive statistics, linear regression)
  • Familiarity with data analysis environments such as RStudio or Jupyter Notebooks

Technical requirements

Participants are expected to bring their own laptop: a reasonably modern machine running Linux/Unix, macOS, or Windows, with an internet connection.

Topics & Tags

Keywords
Machine learning, Dimension reduction, Unsupervised learning, Supervised learning, Model validation, Regularization, Model interpretation, ML framework

Affiliations & Networks

Associated nodes
SciLifeLab
Target audience
PhD students, Postdocs, Staff scientists
