Metagenomics

"An ultra-fast single-node solution for large and complex metagenomics assembly via succinct de Bruijn graph"

Microbial community assembly (metagenomics)

http://www.ncbi.nlm.nih.gov/pubmed/25609793

Install

https://github.com/voutcn/megahit

Example

Input: metagenomics sample as paired-end fastq files _R1 and _R2

megahit -1 SAMPLE_R1.fastq.gz -2 SAMPLE_R2.fastq.gz -t 12 -o megahit_result

-t 12 use 12 threads (number of parallel processors)

-m 0.5 (optional) use at most 50% of the machine's memory (default: 90%, -m 0.9)

Result: the assembled contigs are written to a FASTA file:

megahit_result/final.contigs.fa
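A quick sanity check on the result is to count the contigs and the total assembled bases. The sketch below is a minimal illustration, not part of MEGAHIT; the helper name `contig_summary` is made up for this example, and it assumes the file is an uncompressed FASTA file such as the one named above.

```shell
#!/bin/sh
# contig_summary FILE
# Prints "<number of contigs> <total bases>" for an uncompressed FASTA file.
# Hypothetical helper for illustration; not a MEGAHIT command.
contig_summary() {
  awk '
    /^>/ { n++; next }            # header line: one more contig
         { bases += length($0) }  # sequence line: add its length
    END  { printf "%d %d\n", n, bases }
  ' "$1"
}

# Example call once the assembly has finished:
# contig_summary megahit_result/final.contigs.fa
```

An output of `0 0` would suggest that the assembly produced no contigs or that the path is wrong.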

Intro & Tutorial

https://github.com/voutcn/megahit

https://github.com/voutcn/megahit/wiki/An-example-of-real-assembly

Memory settings

https://github.com/voutcn/megahit/wiki/MEGAHIT-Memory-setting
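To get a rough feel for what a fractional -m value means on a given machine, the sketch below converts a fraction into bytes. It assumes a Linux host that exposes total RAM as `MemTotal` in /proc/meminfo; the helper name `mem_fraction_bytes` is made up for this example, and the figure MEGAHIT computes internally may differ slightly.

```shell
#!/bin/sh
# mem_fraction_bytes FRACTION [MEMINFO_FILE]
# Prints FRACTION of the total RAM in bytes, read from the MemTotal line
# (reported in kB). Defaults to /proc/meminfo, which is a Linux-only assumption.
# Hypothetical helper for illustration; not a MEGAHIT command.
mem_fraction_bytes() {
  awk -v f="$1" '/^MemTotal:/ { printf "%.0f\n", $2 * 1024 * f }' "${2:-/proc/meminfo}"
}

# Example: how many bytes would -m 0.5 allow on this machine?
# mem_fraction_bytes 0.5
```

If the printed value is smaller than the memory the assembly is expected to need, a larger fraction or a machine with more RAM is probably required; the wiki page linked above covers the details.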

Help

megahit -h

MEGAHIT v1.0.2

contact: Dinghua Li

Usage:

megahit [options] {-1 <pe1> -2 <pe2> | --12 <pe12> | -r <se>} [-o <out_dir>]

Input options that can be specified for multiple times (supporting plain text and gz/bz2 extensions)

-1 <pe1> comma-separated list of fasta/q paired-end #1 files, paired with files in <pe2>

-2 <pe2> comma-separated list of fasta/q paired-end #2 files, paired with files in <pe1>

--12 <pe12> comma-separated list of interleaved fasta/q paired-end files

-r/--read <se> comma-separated list of fasta/q single-end files

Input options that can be specified for at most ONE time (not recommended):

--input-cmd <cmd> command that outputs fasta/q reads to stdout; taken by MEGAHIT as SE reads
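Because the input options above take comma-separated lists, several read files can be assembled together in one run. The sketch below shows one way to build such a list in a POSIX shell; the helper name `join_commas` and the sample file names are made up for this example, and file names containing commas would not work with this approach.

```shell
#!/bin/sh
# join_commas ARG...
# Prints its arguments joined by commas, e.g. "a,b,c".
# "$*" joins the arguments with the first character of IFS; the subshell
# keeps the IFS change local. Hypothetical helper for illustration.
join_commas() {
  (IFS=,; printf '%s\n' "$*")
}

# Example: co-assemble two samples (file names are placeholders):
# megahit -1 "$(join_commas A_R1.fastq.gz B_R1.fastq.gz)" \
#         -2 "$(join_commas A_R2.fastq.gz B_R2.fastq.gz)" \
#         -t 12 -o megahit_result
```

The files must appear in the same order in both lists so that each #1 file lines up with its #2 partner.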

Optional Arguments:

Basic assembly options:

--min-count <int> minimum multiplicity for filtering (k_min+1)-mers, default 2

--k-min minimum kmer size (