groot

command module
v0.0.0-...-1be76fa
Published: May 11, 2020 License: MIT Imports: 1 Imported by: 0

README

Graphing Resistance Out Of meTagenomes



Overview

GROOT is a tool to type Antibiotic Resistance Genes (ARGs) in metagenomic samples (a.k.a. Resistome Profiling). It combines variation graph representation of gene sets with an LSH indexing scheme to allow for fast classification of metagenomic reads. Subsequent hierarchical local alignment of classified reads against graph traversals facilitates accurate reconstruction of full-length gene sequences using a simple scoring scheme.

GROOT will output an ARG alignment file (in BAM format) that contains the graph traversals possible for each query read; the alignment file is then used by GROOT to generate a resistome profile.

Since version 0.4, GROOT also outputs the variation graphs to which reads aligned. These graphs are in GFA format, so you can visualise graph alignments using Bandage and determine which variants of a given ARG type are dominant in your metagenomes. Read the documentation for more info.

Since version 1.0.0, GROOT has had a partial rewrite (merging features and changes from my baby groot project). It now uses the excellent LSH Ensemble library as the LSH index, enabling containment search for read seeding. I've also improved my dev know-how and GROOT is now more efficient. However, these changes required changes to parts of the CLI, so please read the docs.
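
The LSH index described above builds on MinHash sketches of k-mer sets. The following is a minimal, illustrative sketch of a bottom-k MinHash in Go, not GROOT's own implementation. The function names (kmers, hashKmer, bottomK, jaccard), the FNV-1a hash and the example sequences are all assumptions made for illustration. Note that it estimates Jaccard similarity, whereas LSH Ensemble targets containment.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
)

// kmers returns all overlapping substrings of length k in seq.
func kmers(seq string, k int) []string {
	if k <= 0 || len(seq) < k {
		return nil
	}
	out := make([]string, 0, len(seq)-k+1)
	for i := 0; i+k <= len(seq); i++ {
		out = append(out, seq[i:i+k])
	}
	return out
}

// hashKmer maps a k-mer to a 64-bit value using FNV-1a (an arbitrary choice here).
func hashKmer(kmer string) uint64 {
	h := fnv.New64a()
	h.Write([]byte(kmer))
	return h.Sum64()
}

// bottomK returns the s smallest distinct k-mer hashes of seq, sorted ascending.
func bottomK(seq string, k, s int) []uint64 {
	seen := make(map[uint64]struct{})
	for _, km := range kmers(seq, k) {
		seen[hashKmer(km)] = struct{}{}
	}
	hashes := make([]uint64, 0, len(seen))
	for h := range seen {
		hashes = append(hashes, h)
	}
	sort.Slice(hashes, func(i, j int) bool { return hashes[i] < hashes[j] })
	if len(hashes) > s {
		hashes = hashes[:s]
	}
	return hashes
}

// jaccard estimates Jaccard similarity from two sorted bottom-k sketches:
// walk the s smallest hashes of their union and count those present in both.
func jaccard(a, b []uint64, s int) float64 {
	i, j, shared, seen := 0, 0, 0, 0
	for seen < s && i < len(a) && j < len(b) {
		switch {
		case a[i] == b[j]:
			shared++
			i++
			j++
		case a[i] < b[j]:
			i++
		default:
			j++
		}
		seen++
	}
	// One sketch ran out early: the other's leftovers still belong to the union.
	if rest := (len(a) - i) + (len(b) - j); seen < s && rest > 0 {
		if rest > s-seen {
			rest = s - seen
		}
		seen += rest
	}
	if seen == 0 {
		return 0
	}
	return float64(shared) / float64(seen)
}

func main() {
	// Hypothetical read and graph-window sequences.
	read := "ACGTACGTTGCAACGT"
	window := "ACGTACGTTGCATTTT"
	const k, s = 5, 8
	est := jaccard(bottomK(read, k, s), bottomK(window, k, s), s)
	fmt.Printf("estimated Jaccard similarity: %.2f\n", est)
}
```

Identical sequences give an estimate of 1, and sequences sharing no k-mers give 0. A real index would store many such sketches and use LSH to avoid comparing a read against every one of them.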

Installation

Check out the releases to download a binary. Alternatively, install using Bioconda or compile the software from source.

Bioconda

conda install -c bioconda groot

Brew

brew install brewsci/bio/groot

Source

GROOT is written in Go (v1.14). To compile from source, you will first need the Go toolchain. Once you have it, try something like this:

# Clone this repository
git clone https://github.com/will-rowe/groot.git

# Go into the repository and get the package dependencies
cd groot
go get -d -t -v ./...

# Run the unit tests
go test -v ./...

# Compile the program
go build ./

# Call the program
./groot --help

Quick Start

GROOT is called by typing groot, followed by the subcommand you wish to run. There are three main subcommands: index, align and report (the quick start below also uses get to download a database). This quick start shows you how to get things running, but it is recommended to follow the documentation.

# Get a pre-clustered ARG database
groot get -d arg-annot

# Create graphs and index
groot index -m arg-annot.90 -i grootIndex -w 100

# Align reads and report
groot align -i grootIndex -f reads.fq | groot report

Note: it is recommended to index the graphs using a window size roughly equal to your maximum expected read length; for 100 bp reads, use -w 100.

Further Information & Citing

Please see the documentation on Read the Docs for more detail and a tutorial.

GROOT has now been published in Bioinformatics:

Rowe WPM, Winn MD. Indexed variation graphs for efficient and accurate resistome profiling. Bioinformatics. 2018. doi: 10.1093/bioinformatics/bty387


Directories

src/em: Package em is the groot implementation of the expectation-maximization algorithm for finding the most likely paths through the graphs. Currently, there is no weighting for each path, as they are all so similar in length and the nodes have already been weighted.
src/graph: Package graph is used to process graphs.
src/lshe: Package lshe is used to index the graphs.
src/minhash: Package minhash contains implementations of bottom-k and kmv MinHash algorithms.
src/misc: contains some misc helper functions etc.
src/pipeline: Package pipeline contains a streaming pipeline implementation based on the Gopher Academy article by S. Lampa - Patterns for composable concurrent pipelines in Go (https://blog.gopheracademy.com/advent-2015/composable-pipelines-improvements/)
src/seqio: the seqio package contains custom types and methods for holding and processing sequence data
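
The em package synopsis mentions expectation-maximization over graph paths without per-path weighting. As a rough illustration of that general idea (not the em package's actual code), the sketch below estimates relative path abundances from reads that may be compatible with several paths. The function name estimateAbundances, the compatibility lists and the iteration count are all hypothetical.

```go
package main

import "fmt"

// estimateAbundances runs a simple EM. compat[r] lists the path indices that
// read r is compatible with. Paths are not weighted by length.
func estimateAbundances(compat [][]int, nPaths, iters int) []float64 {
	theta := make([]float64, nPaths)
	for p := range theta {
		theta[p] = 1.0 / float64(nPaths) // start from a uniform guess
	}
	for it := 0; it < iters; it++ {
		counts := make([]float64, nPaths)
		total := 0.0
		// E-step: split each read across its paths in proportion to theta.
		for _, paths := range compat {
			denom := 0.0
			for _, p := range paths {
				denom += theta[p]
			}
			if denom == 0 {
				continue // read has no path with non-zero abundance
			}
			for _, p := range paths {
				counts[p] += theta[p] / denom
			}
			total++
		}
		if total == 0 {
			break
		}
		// M-step: abundances become the normalised fractional counts.
		for p := range theta {
			theta[p] = counts[p] / total
		}
	}
	return theta
}

func main() {
	// Made-up data: two reads unique to path 0, one shared, one unique to path 1.
	compat := [][]int{{0}, {0}, {0, 1}, {1}}
	theta := estimateAbundances(compat, 2, 50)
	fmt.Printf("path 0: %.3f, path 1: %.3f\n", theta[0], theta[1])
}
```

For this toy input the estimates settle at roughly two thirds for path 0 and one third for path 1, because the shared read is assigned mostly to the path with more unique support.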
