genoassist

command module
v0.0.0-...-0778f37 Latest
Published: Sep 24, 2022 License: GPL-3.0 Imports: 5 Imported by: 0

README

GenoAssist


An all-encompassing bioinformatics tool for genome assembly and annotation projects

Table of contents

  1. About
  2. Installation
  3. GenoAssist Usage
  4. Architecture
  5. Maintainers
  6. Feedback and bug reports

1. About

One of the challenges computational biologists face in genome assembly projects is choosing among the many available assemblers. Each assembler exposes its own parameters, and learning them is time-consuming. Even once the parameters are understood, several assemblers still have to be run and their result statistics compared to identify the best assembly. GenoAssist helps by centralizing the assemblers, their parameters, their running environments, and results reporting in a single place.

2. Installation

  1. Either fetch GenoAssist with go (it will be added under $GOPATH/):

    $ go get -u github.com/genoassist/genoassist
    

    Or clone the repository:

    $ git clone https://github.com/genoassist/genoassist
    
  2. Build the main.go file

    $ go build main.go
    

If you are missing packages, run go mod vendor to collect them.

3. GenoAssist Usage

GenoAssist only requires a YAML file that contains the configuration it should use to run its processes. A template can be found in this repository. For convenience, here's an example specification:

    assemblers:
      megahit:
        kmers: "27"
      abyss:
        kmers: "27"
    genoassist:
      assemblers: ['abyss','megahit','flye']
      inputFilePath: "/test/raw_sequences.fastq"
      outputPath: "/test/output"
      threads: 2
      prep: true
      qualityControl: true
      fileType: "fasta"

Notes:

  • All paths used with GenoAssist have to be absolute paths (a Docker requirement)
  • The accepted assembler values are:
      1. 'abyss'
      2. 'megahit'
      3. 'flye'
  • The accepted file types are:
      1. FASTA
      2. FASTQ

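
The rules in the notes above can be checked before a run. The following is a minimal sketch in Go, not GenoAssist's own validation code: the Config struct, its field names, and the Validate function are hypothetical, and matching the file type case-insensitively is an assumption (the notes list FASTA and FASTQ, while the example uses "fasta").

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// Config mirrors part of the `genoassist` section of the example YAML.
// The struct and its field names are illustrative, not GenoAssist's own types.
type Config struct {
	Assemblers    []string
	InputFilePath string
	OutputPath    string
	FileType      string
}

// Accepted values, taken from the notes in the README.
var (
	acceptedAssemblers = map[string]bool{"abyss": true, "megahit": true, "flye": true}
	acceptedFileTypes  = map[string]bool{"fasta": true, "fastq": true}
)

// Validate returns one message per rule that the configuration breaks.
// An empty result means every checked rule is satisfied.
func Validate(c Config) []string {
	var problems []string
	for _, a := range c.Assemblers {
		if !acceptedAssemblers[a] {
			problems = append(problems, fmt.Sprintf("unknown assembler %q", a))
		}
	}
	// filepath.IsAbs follows the rules of the host operating system.
	if !filepath.IsAbs(c.InputFilePath) {
		problems = append(problems, fmt.Sprintf("inputFilePath %q is not absolute", c.InputFilePath))
	}
	if !filepath.IsAbs(c.OutputPath) {
		problems = append(problems, fmt.Sprintf("outputPath %q is not absolute", c.OutputPath))
	}
	// Assumption: file types are compared case-insensitively.
	if !acceptedFileTypes[strings.ToLower(c.FileType)] {
		problems = append(problems, fmt.Sprintf("unsupported fileType %q", c.FileType))
	}
	return problems
}

func main() {
	cfg := Config{
		Assemblers:    []string{"abyss", "megahit", "flye"},
		InputFilePath: "/test/raw_sequences.fastq",
		OutputPath:    "/test/output",
		FileType:      "fasta",
	}
	problems := Validate(cfg)
	if len(problems) == 0 {
		fmt.Println("configuration looks valid")
	}
	for _, p := range problems {
		fmt.Println(p)
	}
}
```

Returning every problem at once, rather than stopping at the first, lets a user fix the whole configuration in one pass.
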
Installing Docker images through GenoAssist

If you are encountering problems with Docker, make sure that:

  1. The Docker daemon is running in the background
  2. You have the necessary Docker images. GenoAssist can install them for you: specify prep: true under genoassist in the YAML configuration, and it will pull the Docker images for the containers that it runs.
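
Both checks above can also be done by hand. The sketch below is one possible way to do that from Go by shelling out to the docker CLI (docker info and docker image inspect); it is not how GenoAssist itself talks to Docker, the function names are hypothetical, and the image names are placeholders rather than the images GenoAssist actually pulls.

```go
package main

import (
	"fmt"
	"os/exec"
)

// daemonRunning reports whether `docker info` exits successfully, which
// requires both the docker CLI on PATH and a reachable daemon.
func daemonRunning() bool {
	return exec.Command("docker", "info").Run() == nil
}

// imagePresent reports whether the image exists locally, based on the
// exit status of `docker image inspect <image>`.
func imagePresent(image string) bool {
	return exec.Command("docker", "image", "inspect", image).Run() == nil
}

// missing returns the images for which present reports false.
// The check is passed in as a function so it can be swapped out.
func missing(images []string, present func(string) bool) []string {
	var out []string
	for _, img := range images {
		if !present(img) {
			out = append(out, img)
		}
	}
	return out
}

func main() {
	if !daemonRunning() {
		fmt.Println("Docker daemon not reachable; start it before running GenoAssist")
		return
	}
	// Placeholder image names, not the images GenoAssist actually pulls.
	images := []string{"example/abyss", "example/megahit", "example/flye"}
	for _, img := range missing(images, imagePresent) {
		fmt.Printf("missing image %s; set prep: true or run: docker pull %s\n", img, img)
	}
}
```
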

[Figure: sample assembly result visualization]

4. Architecture

The overall model follows a primary/replica architecture. The primary is the component users interact with: they specify the files containing the contigs and the type of reads they have, e.g., Illumina. The primary takes this input and schedules assembly, parsing of results, and reporting, in that order. The replica launches and coordinates these processes on the primary's behalf.
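
The ordering described above can be illustrated with a small sketch. It is a simplified stand-in under stated assumptions, not GenoAssist's actual primary: the Stage type, RunInOrder, and errStageFailed are hypothetical names, and stopping at the first failing stage is an assumed policy.

```go
package main

import (
	"errors"
	"fmt"
)

// errStageFailed is a placeholder error used to demonstrate a failing stage.
var errStageFailed = errors.New("stage failed")

// Stage is one step that the primary schedules.
type Stage struct {
	Name string
	Run  func() error
}

// RunInOrder executes the stages sequentially and stops at the first
// failure. It returns the names of the stages that completed.
func RunInOrder(stages []Stage) ([]string, error) {
	var done []string
	for _, s := range stages {
		if err := s.Run(); err != nil {
			return done, fmt.Errorf("%s: %w", s.Name, err)
		}
		done = append(done, s.Name)
	}
	return done, nil
}

func main() {
	ok := func() error { return nil }
	stages := []Stage{
		{Name: "assembly", Run: ok},
		{Name: "parsing", Run: func() error { return errStageFailed }},
		{Name: "reporting", Run: ok},
	}
	done, err := RunInOrder(stages)
	fmt.Println("completed:", done)
	fmt.Println("error:", err)
}
```

Because parsing depends on assembly output and reporting depends on parsed results, a later stage is never started once an earlier one has failed.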

5. Maintainers

Tayab Soomro
Flaviu Vadan

Feel free to contact any of the maintainers if you would like to become an active maintainer and contributor to GenoAssist! If you would only like to contribute, you are encouraged to pick up an issue and submit a pull request with your proposed changes for review!

6. Feedback and bug reports

Submit feedback and bug reports by using the Issues section of the repository.

Documentation

Overview

The main package (and file) is responsible for taking in the user's arguments, parsing them, and calling on the primary to perform the work that genoassist does.

Directories

Path Synopsis
constants holds all the constants shared between packages
responsible for preparing GenoAssist to perform assemblies by pulling all the necessary Docker containers from DockerHub
a collection of interface specifications for objects that are part of the primary package
replica is responsible for launching and coordinating processes such as assembly and parsing for the primary
components
defines interfaces that have to be satisfied by components
components/assembler
contains the definition and work associated with assemblers
the reporter package computes multiple statistics that quantify the quality of a collection of contigs
contains the definition of a Result
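
As an illustration of the kind of statistic the reporter package is described as computing, the sketch below calculates N50, a widely used contig-quality measure. Whether GenoAssist's reporter computes N50, and how, is an assumption; the function is hypothetical and not taken from the repository.

```go
package main

import (
	"fmt"
	"sort"
)

// N50 returns the largest length L such that contigs of length >= L
// together cover at least half of the total assembly length.
// It returns 0 for an empty input.
func N50(lengths []int) int {
	// Work on a copy so the caller's slice is left untouched.
	sorted := append([]int(nil), lengths...)
	sort.Sort(sort.Reverse(sort.IntSlice(sorted)))

	total := 0
	for _, l := range sorted {
		total += l
	}

	running := 0
	for _, l := range sorted {
		running += l
		if 2*running >= total {
			return l
		}
	}
	return 0
}

func main() {
	// Made-up contig lengths, for illustration only.
	contigs := []int{80, 70, 50, 40, 30, 20}
	fmt.Println("N50:", N50(contigs))
}
```
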
