Welcome to the MpGAP pipeline documentation

About

MpGAP is a pipeline developed with Nextflow and Docker. It was designed to provide an easy-to-use framework for de novo genome assembly of Illumina, PacBio and Oxford Nanopore sequencing data, in one of three modes: Illumina only, long reads only, or hybrid.

Workflow

The pipeline wraps the following tools and analyses:

Software	Analysis
Hifiasm, Canu, Flye, Unicycler, Raven, Shasta and wtdbg2	Long reads assembly
Haslr, Unicycler and SPAdes	Hybrid assembly
Shovill, Unicycler, Megahit and SPAdes	Short reads assembly
Nanopolish, Medaka, gcpp, Polypolish and Pilon	Assembly polishing
Quast, BUSCO and MultiQC	Assembly QC

Quickstart

A quickstart is available so you can quickly get the gist of the pipeline's capabilities.

Usage

A typical invocation of the pipeline looks like this:

# usual command-line
nextflow run fmalmeida/mpgap \
    -profile docker \
    --output ./results \
    --tracedir ./results/pipeline_info \
    --input input.yml \
    --max_cpus 20 \
    --max_memory '40.GB' \
    ...

Note

Some parameters are required, some are not. Please read the pipeline's manual reference to understand each parameter.
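
Since the pipeline is built on Nextflow and is run here with the docker profile, it can help to confirm that both executables are available before launching the command above. The following is only a minimal preflight sketch, not part of MpGAP itself: the helper name `missing_tools` is hypothetical, and the script assumes nothing beyond a POSIX-compatible shell.

```shell
#!/usr/bin/env bash
# Preflight sketch: list which of the given executables are NOT on PATH.
# missing_tools is a hypothetical helper for illustration, not an MpGAP command.
missing_tools() {
    for tool in "$@"; do
        # command -v succeeds only if the shell can resolve the name
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
        fi
    done
}

# Check the two runtime dependencies named in the About section.
missing="$(missing_tools nextflow docker)"
if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose so the names print on one line
    echo "Missing required tools:" $missing >&2
else
    echo "nextflow and docker found; ready to run the pipeline."
fi
```

The script only reports what it finds and does not install anything. If a different container profile is used instead of docker, the tool names passed to `missing_tools` would need to change accordingly.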

Citation

In order to cite this pipeline, please refer to:

Almeida FMd, Campos TAd and Pappas Jr GJ. Scalable and versatile container-based pipelines for de novo genome assembly and bacterial annotation. F1000Research 2023, 12:1205 (https://doi.org/10.12688/f1000research.139488.1)

Support contact

If you have any questions, feel free to contact me via GitHub issues.