doppelmark

Version: v0.1.1
Published: Apr 6, 2021 License: Apache-2.0 Imports: 9 Imported by: 0

README


doppelmark duplicate marking tool

doppelmark is a high-performance tool for marking PCR and optical (pad-hopping) duplicate sequencing reads. It is functionally equivalent to the picard and sambamba duplicate marking tools, but runs much more efficiently and takes advantage of multi-core hardware. For some workloads and hardware, doppelmark is 100x faster than picard and 7x faster than sambamba.

doppelmark achieves its speedup by dividing the input into shards and processing the shards in parallel. Processing a shard includes decompressing the input, marking duplicates, and compressing the resulting output data. It detects duplicates without sorting all records. For a detailed description of the algorithm and design, see doc.go.
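The shard-and-parallelize idea can be illustrated with a minimal Go sketch. This is not doppelmark's actual code: the `Read` type, the duplicate key (start position plus strand), and the helper names `splitByPosition`, `markShard`, and `markDuplicates` are all hypothetical and chosen only for illustration. The sketch assumes the reads are already ordered by position, and it leaves out decompression, compression, and read pairs whose mates fall in different shards, all of which a real tool has to handle.

```go
package main

import (
	"fmt"
	"sync"
)

// Read is a simplified, hypothetical stand-in for an aligned sequencing read.
type Read struct {
	Name      string
	Pos       int  // alignment start position
	Reverse   bool // true if aligned to the reverse strand
	Duplicate bool // set by markShard
}

// key groups reads that this sketch treats as duplicates of each other.
type key struct {
	pos     int
	reverse bool
}

// splitByPosition cuts position-ordered reads into contiguous shards of about
// shardSize reads, extending a shard so that reads sharing a position are
// never separated.
func splitByPosition(reads []Read, shardSize int) [][]Read {
	if shardSize < 1 {
		shardSize = 1
	}
	var shards [][]Read
	for start := 0; start < len(reads); {
		end := start + shardSize
		if end > len(reads) {
			end = len(reads)
		}
		for end < len(reads) && reads[end].Pos == reads[end-1].Pos {
			end++
		}
		shards = append(shards, reads[start:end])
		start = end
	}
	return shards
}

// markShard flags every read after the first one seen with the same key.
func markShard(shard []Read) {
	seen := make(map[key]bool)
	for i := range shard {
		k := key{shard[i].Pos, shard[i].Reverse}
		if seen[k] {
			shard[i].Duplicate = true
		} else {
			seen[k] = true
		}
	}
}

// markDuplicates processes each shard in its own goroutine. Shards are
// disjoint sub-slices, so the goroutines never write to the same element.
func markDuplicates(reads []Read, shardSize int) {
	var wg sync.WaitGroup
	for _, shard := range splitByPosition(reads, shardSize) {
		wg.Add(1)
		go func(s []Read) {
			defer wg.Done()
			markShard(s)
		}(shard)
	}
	wg.Wait()
}

func main() {
	reads := []Read{
		{Name: "a", Pos: 100},
		{Name: "b", Pos: 100},
		{Name: "c", Pos: 100, Reverse: true},
		{Name: "d", Pos: 250},
		{Name: "e", Pos: 250},
	}
	markDuplicates(reads, 2)
	for _, r := range reads {
		fmt.Println(r.Name, r.Duplicate)
	}
}
```

Because each shard owns a disjoint region of the input, the workers need no locking; the only synchronization is the final wait for all shards to finish.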


Documentation

Overview

Copyright 2019 Grail Inc.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

Directories

Path Synopsis
Command bio-mark-duplicates marks or removes duplicates from .bam files.
