Nextflow Summit 2024

📅 October 28 - November 1, 2024 🇪🇸 Barcelona, Spain

👩 Eirini Liampa
🤵 Xuyang Yuan

2024-11-20

Conference Agenda

  • Day 1 (Mon): Hackathon/Training day 1 + Social Event
  • Day 2 (Tue): Hackathon/Training day 2
  • Day 3 (Wed): Hackathon/Training day 3 + Summit Day 1
  • Day 4 (Thu): Summit Day 2 + Halloween Party
  • Day 5 (Fri): Summit Day 3

HACKATHON

Nextflow Hackathon

Project: GA4GH Quality Control of Whole Genome Sequencing metrics and reference implementations

Motivation & Mandate
Standardized QC Metrics, Definition And Implementation

  • Many available tools (e.g. for coverage)
    • samtools, picard, sambamba, indexcov, mosdepth
  • Different tools give different results depending on default settings and options (e.g. for coverage)
    • All chromosomes or autosomes only❓
    • N bases masked❓
    • Duplicates excluded❓
    • Soft-clipped bases included❓
    • Base quality filtering❓
      • If yes, what’s the cutoff to throw away bases❓
    • Mapping quality filtering❓
      • If yes, what’s the cutoff to throw away reads❓
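To make the questions above concrete, here is a minimal Python sketch of how filter choices alone can change a reported mean coverage. It does not reproduce the behaviour of samtools, picard, mosdepth or any other listed tool; the `Read` class, the `mean_coverage` function, its parameters and the toy data are all hypothetical and only illustrate the effect.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Read:
    """Hypothetical, simplified aligned read (not a real BAM record)."""
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # 0-based, exclusive
    mapq: int
    is_duplicate: bool = False


def mean_coverage(reads, region_lengths, min_mapq=0,
                  exclude_duplicates=False, autosomes_only=False):
    """Mean depth = aligned bases kept / total length of the regions considered."""
    sex_chroms = {"chrX", "chrY"}
    chroms = {c for c in region_lengths
              if not (autosomes_only and c in sex_chroms)}
    total_length = sum(region_lengths[c] for c in chroms)
    kept_bases = 0
    for r in reads:
        if r.chrom not in chroms:
            continue  # region excluded from the denominator and numerator
        if r.mapq < min_mapq:
            continue  # mapping quality filter
        if exclude_duplicates and r.is_duplicate:
            continue  # duplicate filter
        kept_bases += r.end - r.start
    return kept_bases / total_length


# Toy data (assumption): two 100 bp "chromosomes" and four 50 bp reads.
region_lengths = {"chr1": 100, "chrX": 100}
reads = [
    Read("chr1", 0, 50, mapq=60),
    Read("chr1", 0, 50, mapq=60, is_duplicate=True),
    Read("chr1", 25, 75, mapq=5),
    Read("chrX", 10, 60, mapq=60),
]

permissive = mean_coverage(reads, region_lengths)
strict = mean_coverage(reads, region_lengths, min_mapq=20,
                       exclude_duplicates=True, autosomes_only=True)
print(permissive, strict)  # prints: 1.0 0.5
```

The same four reads give a mean coverage of 1.0 or 0.5 depending only on the settings, which is why a standardized definition of each metric matters.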

 

Implementation:

  • National Precision Medicine programme in Singapore (NPM)
    • c-BIG/NPM-sample-qc
    • Tools
      • CollectMultipleMetrics
      • CollectVariantCallingMetrics
      • CollectWgsMetrics
      • bcftools
      • samtools
      • verifyBamID2 (DNA contamination estimation from sequence reads using ancestry-agnostic method)
    • GRCh38
    • Test run ✅
  • OUSAMG

SUMMIT

GENERAL TALKS ABOUT NEXTFLOW, BIOINFORMATICS, INDUSTRY ETC.

 
 

Nextflow is widely used

Community is growing quickly

REGULATION

EXPERIENCE SHARING IN A VARIETY OF FIELDS

Australian BioCommons

 
 

Metagenomics analyzes DNA from all organisms in a sample.

nf-core pipelines for profiling, assembly, and functional screening in metagenomics.

meta-omics group and plans for better pipeline integration.

Genie: Genomics England’s pipeline orchestration platform

  • Successor to Bertha (used for the 100K genomes project), which was not adaptable, did not use an industry standard such as Nextflow, was not scalable, and was not cloud compatible
  • Genie addresses all of the above
  • Used for the 100K newborn genomes project
  • Pipelines run automatically and continuously

AWS Lambda, Amazon EventBridge, AWS Step Functions

 

The SARS-CoV-2 pandemic drove the development of CLIMB-COVID, a platform enabling collaboration on over 3.5 million genome sequences for pandemic research and response in the UK. Building on this, CLIMB-TRE now offers a broader infrastructure for pathogen genome surveillance, integrating data from multiple sources with quality control and analysis capabilities. This session explores its technical design, focusing on Nextflow workflows and cloud-based user analysis.

 

Porting the somatic variant calling pipelines (SNVs, small indels, SVs, and allele-specific somatic copy number aberrations (sCNAs)) of the National Center for Tumor Diseases (NCT) and the German Cancer Research Center (DKFZ) to Nextflow to achieve FAIR compliance.

 

nf-core/crisprseq, a Nextflow pipeline for the assessment of CRISPR gene editing and screening assays.

TOOLS AND NEW TECHS

What’s next for Nextflow?

 
 

Ensuring the correctness and reliability of large and complex pipelines is challenging

nf-test: a modular testing framework that can test individual process blocks, workflow patterns, and entire pipelines

DSL2-like syntax, snapshot testing, testing only changed modules, CI support, plugin system (new)
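
The idea behind snapshot testing can be sketched in a few lines of Python. This is not nf-test's implementation or its API; the `check_snapshot` helper, the JSON snapshot file and the sample output strings are hypothetical. The first run records a digest of the output, and later runs compare against it.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def md5_of(text: str) -> str:
    """Digest used to compare outputs without storing them in full."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def check_snapshot(name: str, output: str, snapshot_file: Path) -> str:
    """Return 'created' on first run, then 'match' or 'mismatch'."""
    snapshots = {}
    if snapshot_file.exists():
        snapshots = json.loads(snapshot_file.read_text())
    digest = md5_of(output)
    if name not in snapshots:
        # No reference yet: record the current output as the snapshot.
        snapshots[name] = digest
        snapshot_file.write_text(json.dumps(snapshots, indent=2))
        return "created"
    return "match" if snapshots[name] == digest else "mismatch"


with tempfile.TemporaryDirectory() as tmp:
    snap = Path(tmp) / "snapshots.json"
    first = check_snapshot("count_reads", "sample1\t42\n", snap)
    second = check_snapshot("count_reads", "sample1\t42\n", snap)
    third = check_snapshot("count_reads", "sample1\t43\n", snap)

print(first, second, third)  # prints: created match mismatch
```

A mismatch signals that a process output changed, either through a regression or through an intended change that requires updating the snapshot.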

 
 

Scientific computing uses a lot of energy and therefore affects the environment. GPT-3: 552 t; data centers: 134 Mt; total: 37 Bt

\(E = t \times (P_c + P_m) \times PUE\)
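
A small worked example of the formula in Python. The slide does not define the symbols, so the reading used here is an assumption: t is the runtime, P_c and P_m are the power draw of the compute cores and of the memory, and PUE is the data center's power usage effectiveness. The carbon intensity step and all numbers are illustrative and not taken from the talk.

```python
def energy_kwh(runtime_h: float, power_cores_w: float,
               power_memory_w: float, pue: float) -> float:
    """E = t * (P_c + P_m) * PUE, converted from Wh to kWh."""
    return runtime_h * (power_cores_w + power_memory_w) * pue / 1000.0


def carbon_g(energy: float, carbon_intensity_g_per_kwh: float) -> float:
    """Assumed extra step: emissions = energy * grid carbon intensity."""
    return energy * carbon_intensity_g_per_kwh


# Illustrative values (assumptions): a 10 h job, 120 W for cores,
# 30 W for memory, PUE of 1.5, and a grid at 200 gCO2e/kWh.
e = energy_kwh(10, 120, 30, 1.5)
c = carbon_g(e, 200)
print(e, c)  # prints: 2.25 450.0
```

The PUE factor scales the job's own draw up to account for data center overhead such as cooling, so the same job costs more energy in a less efficient facility.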

Nextflow plugin: nf-co2footprint

The Green DiSC certification framework can support scientists and institutions in making their research more sustainable.

Nextflow language-server

Editor support: VS Code, Emacs, Neovim

Supported language features:

  • code navigation (outline, go to definition, find references)
  • completion
  • diagnostics (errors, warnings)
  • formatting
  • hover hints
  • rename
  • DAG preview for workflows

MultiQC - What’s new?

TRAINING / EDUCATION

POSTERS