Variation calling and annotation
本文摘自《Resequencing 302 wild and cultivated accessions identifies genes related to domestication and improvement in soybean》
Variation calling and annotation.
Mapping.
SAMtools (Version: 0.1.18) software was used to convert mapping results into the BAM format and to filter the unmapped and non-unique reads.
Duplicated reads were filtered with the Picard package (picard.sourceforge.net, Version:1.87).
The BEDtools (Version: 2.17.0) coverageBed program was used to compute the coverage of sequence alignments. (A sequence was defined as absent if coverage was lower than 90% and present if coverage was greater than 90%.)
SNP calling.
SNP detection was performed using the Genome Analysis Toolkit (GATK, version 2.4-7-g5e89f01) and SAMtools. Only the SNPs detected by both methods were analyzed further.
The detailed processes were as follows:
(1) After BWA alignment, the reads around indels were realigned.
Realignment was performed with GATK in two steps.
The first step used the RealignerTargetCreator package to identify regions where realignment was needed;
The second step used IndelRealigner to realign the regions found in the first step, which produced a realigned BAM file for each accession.
(2) SNPs were called at a population level with GATK and SAMtools. For GATK, the SNP confidence score was set as greater than 30, and the parameter -stand_call_conf was set as 30. The same realigned BAM files were used in SNP calling through the SAMtools mpileup package.
(3) In the filter step, we chose the common sites identified by GATK and SAMtools with the SelectVariants package; SNPs with allele frequencies lower than 1% in the population were discarded.
Indel calling.
Indel calling was similar to SNP calling but with the UnifiedGenotyper parameter -glm INDEL for the indel report only. Only insertions and deletions shorter than or equal to 6 bp were taken into account.
Annotation.
SNP annotation was performed according to the genome using the package ANNOVAR (Version: 2013-08-23).
Based on the genome annotation, SNPs were categorized in exonic regions (overlapping with a coding exon), splicing sites (within 2 bp of a splicing junction), 5′UTRs and 3′UTRs, intronic regions (overlapping with an intron), upstream and downstream regions (within a 1 kb region upstream or downstream from the transcription start site), and intergenic regions.SNPs in coding exons were further grouped into synonymous SNPs (did not cause amino acid changes) or nonsynonymous SNPs (caused amino acid changes; mutations causing stop gain and stop loss were also classified into this group).
Indels in the exonic regions were classified by whether they had frame-shift (3 bp insertion or deletion) mutations.
Variation calling and annotation的更多相关文章
- 敏感性、特异性、假阳性、假阴性(sensitivity and specificity)
医学.机器学习等等,在统计结果时时长会用到这两个指标来说明数据的特性. 定义 敏感性:在金标准判断有病(阳性)人群中,检测出阳性的几率.真阳性.(检测出确实有病的能力) 特异性:在金标准判断无病(阴性 ...
- 30、 bowtie和bowtie2使用条件区别及用法
转载:http://blog.csdn.net/soyabean555999/article/details/62235577 一.转录组还是基因组? map常用的工具有bowtie/bowtie2, ...
- 表观 | Enhancer | ChIP-seq | 转录因子 | 数据库专题
需要长期更新! 参考:生信修炼手册 enhancer的基本概念: 长度几十到几千bp,作用是提高靶基因活性,属于顺式作用原件,DNA作用到DNA,转录因子就是反式,是结合到DNA的蛋白. 1981年, ...
- ANNOTATION PROCESSING 101 by Hannes Dorfmann — 10 Jan 2015
原文地址:http://hannesdorfmann.com/annotation-processing/annotationprocessing101 In this blog entry I wo ...
- Spring Annotation Processing: How It Works--转
找的好辛苦呀 原文地址:https://dzone.com/articles/spring-annotation-processing-how-it-works If you see an annot ...
- Microsoft source-code annotation language (SAL) 相关
More info see: https://msdn.microsoft.com/en-us/library/hh916383.aspx Simply stated, SAL is an inexp ...
- Spring 4 Ehcache Configuration Example with @Cacheable Annotation
http://www.concretepage.com/spring-4/spring-4-ehcache-configuration-example-with-cacheable-annotatio ...
- Annotation Type @bean,@Import,@configuration使用--官方文档
@Target(value={METHOD,ANNOTATION_TYPE}) @Retention(value=RUNTIME) @Documented public @interface Bean ...
- Calling convention-调用约定
In computer science, a calling convention is an implementation-level (low-level) scheme for how subr ...
随机推荐
- MySQL的order by子句
1.语法:select 字段列表 from 表名 [where 子句][group by 子句][having 子句][order by 子句]; 注解: 1.默认是从第一条记录开始升序, 2.des ...
- Yii2 Queue队列
github地址:https://github.com/yiisoft/yii2-queue 问题:https://github.com/yiisoft/yii2-queue/issues Jobs ...
- windows cmd 命令
dir 查看文件,参数:/Q显示文件及目录属系统哪个用户,/T:C显示文件创建时间,/T:A显示文件上次被访问时间,/T:W上次被修改时间 set 显示当前所有的环境变量 find 文件名 查找某文件 ...
- Hints of sd0061(快排思想)
Hints of sd0061 Time Limit: 5000/2500 MS (Java/Others) Memory Limit: 131072/131072 K (Java/Others ...
- SharePoint服务器端对象模型 之 对象模型概述(Part 2)
(三)Url 作为一个B/S体系,在SharePoint的属性.方法参数和返回值中,大量的涉及到了Url,总的来说,涉及到的Url可以分为如下四类: 绝对路径:完整的Url,包含了协议头(http或h ...
- 路径 php中'.'和'..'还有'./'和'../'
./当前目录(就是当前执行文件所在目录) ../上级目录 / 这个才是根目文件名/ 同级目录 例子如图 1.cart下的index.php 1)要引用Public->css->index. ...
- AGS Server10.1中地图文档更新如何使服务更新
一.需求背景 发布服务的mxd文档发生了更改,如何对该mxd文档映射的地图服务进行更新. 二.分析 由于在10.1中地图服务的发布采用的是msd的形式,也就是虽然在ArcMap中准备的地图文档是mxd ...
- python函数的学习笔记
这篇文章是我关于学习python函数的一些总结 一.随着函数的引入,这里首先要说的就是全局变量和局部变量了. 什么是全局变量.什么是局部变量: 全局变量就是全局都能调用的变量,一般都在文件的开头,顶头 ...
- 005-MYSQL数据库设计原则
1.核心原则 不在数据库做运算; cpu计算务必移至业务层; 控制列数量(字段少而精,字段数建议在20以内); 平衡范式与冗余(效率优先:往往牺牲范式) 拒绝3B(拒绝大sql语句:big sql.拒 ...
- 实现对第三方应用任意SO注入
实现对第三方应用任意SO注入 0x01 应用在Android中运行,从外部对该进程可以进行任意SO文件动态注入,就是应用动态运行我们的SO文件 0x02 基本的逻辑是: 1. 获取目标进程的pi ...