https://github.com/PacificBiosciences/GenomicConsensus

GenomicConsensus is developed by PacBio. Personally, I really dislike the tools PacBio develops; they are hard to use.

Just installing GenomicConsensus nearly cost me half my life.

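The purpose of this tool: Compute genomic consensus and call variants relative to the reference.

In other words, it uses a set of reads to error-correct the final reference sequence. The model is broadly applicable and can be used in many settings. In particular, when we develop our own tools, we can embed it directly and reduce the amount of development work.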
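As a rough illustration of the polishing use case described above, here is a minimal sketch that assembles an arrow command line. The file names (`aligned_subreads.bam`, `draft_assembly.fasta`, `polished.fasta`, `variants.gff`) and the helper function `arrow_cmd` are hypothetical. Only the flags `-r`, `-o` and `-j` are taken from the tool's own help text. The script always prints the command, and runs it only if `arrow` and both input files are actually present.

```shell
#!/bin/sh
# Minimal sketch: build (and optionally run) an arrow polishing command.
# All file names below are placeholders, not files shipped with GenomicConsensus.

# arrow_cmd ALIGNMENTS REFERENCE OUTPUTS WORKERS
# Prints the command line; -r, -o and -j come from the arrow help text.
arrow_cmd() {
    printf 'arrow %s -r %s -o %s -j %s\n' "$1" "$2" "$3" "$4"
}

ALIGNED_BAM="aligned_subreads.bam"      # reads aligned to the draft (hypothetical)
REFERENCE="draft_assembly.fasta"        # sequence to be corrected (hypothetical)
OUTPUTS="polished.fasta,variants.gff"   # comma-separated list, as -o expects
WORKERS=4

# Always show what would be run.
arrow_cmd "$ALIGNED_BAM" "$REFERENCE" "$OUTPUTS" "$WORKERS"

# Run only when the tool and both inputs are really available.
if command -v arrow >/dev/null 2>&1 && [ -f "$ALIGNED_BAM" ] && [ -f "$REFERENCE" ]; then
    arrow "$ALIGNED_BAM" -r "$REFERENCE" -o "$OUTPUTS" -j "$WORKERS"
fi
```

Printing the command first makes it easy to check the arguments before committing to a long run on real data.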

./bin/arrow -h
usage: variantCaller [-h] [--version] [--emit-tool-contract]
[--resolved-tool-contract RESOLVED_TOOL_CONTRACT]
[--log-file LOG_FILE]
[--log-level {DEBUG,INFO,WARNING,ERROR,CRITICAL} | --debug | --quiet | -v]
--referenceFilename REFERENCEFILENAME -o OUTPUTFILENAMES
[-j NUMWORKERS] [--minConfidence MINCONFIDENCE]
[--minCoverage MINCOVERAGE]
[--noEvidenceConsensusCall {nocall,reference,lowercasereference}]
[--coverage COVERAGE] [--minMapQV MINMAPQV]
[--referenceWindow REFERENCEWINDOWSASSTRING]
[--alignmentSetRefWindows]
[--referenceWindowsFile REFERENCEWINDOWSASSTRING]
[--barcode _BARCODE] [--readStratum READSTRATUM]
[--minReadScore MINREADSCORE] [--minSnr MINHQREGIONSNR]
[--minZScore MINZSCORE] [--minAccuracy MINACCURACY]
[--algorithm {quiver,arrow,plurality,poa,best}]
[--parametersFile PARAMETERSFILE]
[--parametersSpec PARAMETERSSPEC]
[--maskRadius MASKRADIUS] [--maskErrorRate MASKERRORRATE]
[--pdb] [--notrace] [--pdbAtStartup] [--profile]
[--dumpEvidence [{variants,all,outliers}]]
[--evidenceDirectory EVIDENCEDIRECTORY] [--annotateGFF]
[--reportEffectiveCoverage] [--diploid]
[--queueSize QUEUESIZE] [--threaded]
[--referenceChunkSize REFERENCECHUNKSIZE]
[--fancyChunking] [--simpleChunking]
[--referenceChunkOverlap REFERENCECHUNKOVERLAP]
[--autoDisableHdf5ChunkCache AUTODISABLEHDF5CHUNKCACHE]
[--aligner {affine,simple}] [--refineDinucleotideRepeats]
[--noRefineDinucleotideRepeats] [--fast]
[--skipUnrecognizedContigs]
inputFilename

Compute genomic consensus and call variants relative to the reference.

optional arguments:
-h, --help show this help message and exit
--version show program's version number and exit
--emit-tool-contract Emit Tool Contract to stdout (default: False)
--resolved-tool-contract RESOLVED_TOOL_CONTRACT
Run Tool directly from a PacBio Resolved tool contract
(default: None)
--log-file LOG_FILE Write the log to file. Default(None) will write to
stdout. (default: None)
--log-level {DEBUG,INFO,WARNING,ERROR,CRITICAL}
Set log level (default: WARN)
--debug Alias for setting log level to DEBUG (default: False)
--quiet Alias for setting log level to CRITICAL to suppress
output. (default: False)
-v, --verbose Set the verbosity level. (default: None)

Basic required options:
inputFilename The input cmp.h5 or BAM alignment file
--referenceFilename REFERENCEFILENAME, --reference REFERENCEFILENAME, -r REFERENCEFILENAME
The filename of the reference FASTA file (default:
None)
-o OUTPUTFILENAMES, --outputFilename OUTPUTFILENAMES
The output filename(s), as a comma-separated
list.Valid output formats are .fa/.fasta, .fq/.fastq,
.gff, .vcf (default: [])

Parallelism:
-j NUMWORKERS, --numWorkers NUMWORKERS
The number of worker processes to be used (default: 1)

Output filtering:
--minConfidence MINCONFIDENCE, -q MINCONFIDENCE
The minimum confidence for a variant call to be output
to variants.{gff,vcf} (default: 40)
--minCoverage MINCOVERAGE, -x MINCOVERAGE
The minimum site coverage that must be achieved for
variant calls and consensus to be calculated for a
site. (default: 5)
--noEvidenceConsensusCall {nocall,reference,lowercasereference}
The consensus base that will be output for sites with
no effective coverage. (default: lowercasereference)

Read selection/filtering:
--coverage COVERAGE, -X COVERAGE
A designation of the maximum coverage level to be used
for analysis. Exact interpretation is algorithm-
specific. (default: 100)
--minMapQV MINMAPQV, -m MINMAPQV
The minimum MapQV for reads that will be used for
analysis. (default: 10)
--referenceWindow REFERENCEWINDOWSASSTRING, --referenceWindows REFERENCEWINDOWSASSTRING, -w REFERENCEWINDOWSASSTRING
The window (or multiple comma-delimited windows) of
the reference to be processed, in the format refGroup
:refStart-refEnd (default: entire reference).
(default: None)
--alignmentSetRefWindows
The window (or multiple comma-delimited windows) of
the reference to be processed, in the format refGroup
:refStart-refEnd will be pulled from the alignment
file. (default: False)
--referenceWindowsFile REFERENCEWINDOWSASSTRING, -W REFERENCEWINDOWSASSTRING
A file containing reference window designations, one
per line (default: None)
--barcode _BARCODE Only process reads with the given barcode name.
(default: None)
--readStratum READSTRATUM
A string of the form 'n/N', where n, and N are
integers, 0 <= n < N, designating that the reads are
to be deterministically split into N strata of roughly
even size, and stratum n is to be used for variant and
consensus calling. This is mostly useful for Quiver
development. (default: None)
--minReadScore MINREADSCORE
The minimum ReadScore for reads that will be used for
analysis (arrow-only). (default: 0.65)
--minSnr MINHQREGIONSNR
The minimum acceptable signal-to-noise over all
channels for reads that will be used for analysis
(arrow-only). (default: 3.75)
--minZScore MINZSCORE
The minimum acceptable z-score for reads that will be
used for analysis (arrow-only). (default: -3.5)
--minAccuracy MINACCURACY
The minimum acceptable window-global alignment
accuracy for reads that will be used for the analysis
(arrow-only). (default: 0.82)

Algorithm and parameter settings:
--algorithm {quiver,arrow,plurality,poa,best}
--parametersFile PARAMETERSFILE, -P PARAMETERSFILE
Parameter set filename (such as ArrowParameters.json
or QuiverParameters.ini), or directory D such that
either D/*/GenomicConsensus/QuiverParameters.ini, or
D/GenomicConsensus/QuiverParameters.ini, is found. In
the former case, the lexically largest path is chosen.
(default: None)
--parametersSpec PARAMETERSSPEC, -p PARAMETERSSPEC
Name of parameter set (chemistry.model) to select from
the parameters file, or just the name of the
chemistry, in which case the best available model is
chosen. Default is 'auto', which selects the best
parameter set from the alignment data (default: auto)
--maskRadius MASKRADIUS
Radius of window to use when excluding local regions
for exceeding maskMinErrorRate, where 0 disables any
filtering (arrow-only). (default: 3)
--maskErrorRate MASKERRORRATE
Maximum local error rate before the local region
defined by maskRadius is excluded from polishing
(arrow-only). (default: 0.7)

Verbosity and debugging/profiling:
--pdb Enable Python debugger (default: False)
--notrace Suppress stacktrace for exceptions (to simplify
testing) (default: False)
--pdbAtStartup Drop into Python debugger at startup (requires ipdb)
(default: False)
--profile Enable Python-level profiling (using cProfile).
(default: False)
--dumpEvidence [{variants,all,outliers}], -d [{variants,all,outliers}]
--evidenceDirectory EVIDENCEDIRECTORY
--annotateGFF Augment GFF variant records with additional
information (default: False)
--reportEffectiveCoverage
Additionally record the *post-filtering* coverage at
variant sites (default: False)

Advanced configuration options:
--diploid Enable detection of heterozygous variants
(experimental) (default: False)
--queueSize QUEUESIZE, -Q QUEUESIZE
--threaded, -T Run threads instead of processes (for debugging
purposes only) (default: False)
--referenceChunkSize REFERENCECHUNKSIZE, -C REFERENCECHUNKSIZE
--fancyChunking Adaptive reference chunking designed to handle
coverage cutouts better (default: True)
--simpleChunking Disable adaptive reference chunking (default: True)
--referenceChunkOverlap REFERENCECHUNKOVERLAP
--autoDisableHdf5ChunkCache AUTODISABLEHDF5CHUNKCACHE
Disable the HDF5 chunk cache when the number of
datasets in the cmp.h5 exceeds the given threshold
(default: 500)
--aligner {affine,simple}, -a {affine,simple}
The pairwise alignment algorithm that will be used to
produce variant calls from the consensus (Quiver
only). (default: affine)
--refineDinucleotideRepeats
Require quiver maximum likelihood search to try one
less/more repeat copy in dinucleotide repeats, which
seem to be the most frequent cause of suboptimal
convergence (getting trapped in local optimum) (Quiver
only) (default: True)
--noRefineDinucleotideRepeats
Disable dinucleotide refinement (default: True)
--fast Cut some corners to run faster. Unsupported! (default:
False)
--skipUnrecognizedContigs
Do not abort when told to process a reference window
(via -w/--referenceWindow[s]) that has no aligned
coverage. Outputs emptyish files if there are no
remaining non-degenerate windows. Only intended for
use by smrtpipe scatter/gather. (default: False)
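
According to the help text, `-w/--referenceWindow` takes one or more comma-delimited windows in the form refGroup:refStart-refEnd, so a small helper can make it easier to process only selected regions. The sketch below is a dry run under stated assumptions: the contig names, coordinates, file names and the helper `join_windows` are made up for illustration. The flags (`--algorithm`, `-r`, `-w`, `-x`, `-q`, `-o`) are the ones listed in the help output.

```shell
#!/bin/sh
# Sketch: restrict consensus/variant calling to a few reference windows.

# join_windows WINDOW...
# Joins refGroup:refStart-refEnd windows with commas, the format -w expects.
join_windows() {
    joined=""
    for w in "$@"; do
        joined="${joined:+$joined,}$w"
    done
    printf '%s\n' "$joined"
}

# Hypothetical contig names and coordinates.
WINDOWS=$(join_windows "contig_1:0-50000" "contig_7:120000-180000")

# Dry run: print the command instead of executing it.
# -x is --minCoverage and -q is --minConfidence; 5 and 40 are the documented defaults.
echo "variantCaller --algorithm arrow aligned_subreads.bam" \
     "-r draft_assembly.fasta -w $WINDOWS -x 5 -q 40 -o variants.vcf"
```

Keeping the windows in a helper like this avoids hand-editing a long comma-separated string when the list of regions changes.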

  

To be continued~~
