View course website

Description

This course expands on common life science data analysis methods, including dimensionality reduction techniques beyond PCA, mixed-effects models for analysis of repeated measures, and survival analysis. We will also dive deeper into machine learning, covering more classification algorithms, ensemble techniques, optimization strategies and PLS methods for single and multi-omics data analysis.

Course Content:

● Dimensionality reduction techniques beyond PCA
● Classification algorithms and ensemble techniques
● Machine learning optimization strategies
● PLS-based methods for single and multi-omics data analysis
● Mixed-effects models for repeated measures, longitudinal studies and nested designs
● Survival analysis
● Introduction to neural networks

Important dates

Application open: now
Application closes: 2025-05-02
Confirmation to accepted students: 2025-05-09
Responsible teachers: Payam Emami, Olga Dethlefsen, Eva Freyhult
If you have not received information by the above dates, please contact edu.ml-biostats@nbis.se

Apply here!

Education

This course takes an active learning approach. Teaching blocks alternate between lectures, group discussions, live coding sessions, and exercises.

Course Fee

A course fee of 3 000 SEK for academic participants and 15 000 SEK for non-academic participants will be invoiced to accepted participants. The fee includes lunches, coffee and snacks.
*Please note that NBIS cannot invoice individuals.

The course can accommodate a maximum of 24 participants. If we receive more applications than places, participants will be selected based on several criteria: fulfilment of the entry requirements, motivation to attend the course, and gender and geographical balance.

Learning Outcomes

By the end of this course, participants will be able to:
● Machine Learning Workflow: understand and implement core ML stages in R and Python, covering data preprocessing, model selection, training, and evaluation.
● Dimension Reduction: understand and apply advanced techniques like UMAP and t-SNE for high-dimensional data analysis and understand their relationship to PCA.
● Classification Models: implement and tune random forest (RF), support vector machine (SVM), and logistic regression models using grid search for classification tasks.
● Ensemble Methods: understand concepts of bagging, boosting, and stacking, and apply AdaBoost and XGBoost for classification and regression tasks.
● PLS Analysis: implement PLS, PLS-DA, and sPLS for single- and multi-omics data, including variable selection.
● Mixed Effects Models: apply mixed models to complex biological data, focusing on repeated measures and longitudinal designs.
● Survival Analysis: understand censored data, calculate Kaplan-Meier estimators to estimate survival functions, compare survival curves, and perform regression analysis with Cox proportional hazards models, handling time-dependent covariates and competing risks.
● Neural Networks: gain foundational knowledge of convolutional and recurrent neural networks (CNNs and RNNs); understand large language models (LLMs) in life sciences and apply pre-trained models for cell-type classification and gene expression prediction.
● Final Challenge: synthesize course methods in a final challenge, implementing ML workflows and statistical models on real-world data.

Prerequisites & Technical Requirements

Prerequisites

● Basic knowledge of descriptive statistics, hypothesis testing and linear regression, or prior attendance of the Introduction to Biostatistics and Machine Learning course
● Basic R and Python data science skills (for more details see course website)
● BYOL (bring your own laptop)

Topics & Tags

Keywords

Biostatistics, Machine Learning

Affiliations & Networks

Associated nodes

SciLifeLab

Target audience

PhD students, postdocs, staff scientists, industry professionals, everyone
