There are three ways to assess nucleotide diversity (heterozygosity). The first is the mean pairwise difference π, the average number of differences between pairs of sequences. The second is the number of segregating sites S. A segregating site is a site that is pol…
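As a minimal sketch of the first two measures, the following Python computes the mean pairwise difference π and the number of segregating sites S for a set of aligned sequences of equal length. The function names and the toy alignment are illustrative assumptions, not taken from the original post, and a segregating site is taken here to mean an alignment column with more than one distinct base.

```python
from itertools import combinations


def segregating_sites(seqs):
    """Count alignment columns that show more than one distinct base."""
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)


def mean_pairwise_difference(seqs):
    """Average number of differing positions over all pairs of sequences."""
    pairs = list(combinations(seqs, 2))
    total = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs)
    return total / len(pairs)


# Hypothetical toy alignment: 4 sequences, 8 sites.
sample = ["ACGTACGT", "ACGTACGA", "ACCTACGA", "ACGTACGT"]

print(segregating_sites(sample))         # 2 (columns 3 and 8 vary)
print(mean_pairwise_difference(sample))  # 7 differences over 6 pairs
```

This sketch reports π per alignment; dividing by the alignment length would give a per-site value.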
#!/usr/bin/perl use strict; use warnings; =pod--------------------------------------- This Perl script is used to compute Tajima's D. Format of tajimaD.tmp.txt is: chr position #sampled #derived 1 256 12 181 12124 14 16 Format of vcf is normal vcf. Fo…
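For orientation, here is a minimal Python sketch of the standard Tajima's D statistic (Tajima 1989), computed from the sample size n, the number of segregating sites S, and the mean pairwise difference π. It is not a port of the Perl script above, whose body is not shown here. The function name `tajimas_d` and the example numbers are assumptions for illustration.

```python
import math


def tajimas_d(n, s, pi):
    """Tajima's D from sample size n, segregating sites s, and mean pairwise difference pi.

    Returns NaN when the statistic is undefined (no segregating sites,
    or a sample too small for the variance term to be positive).
    """
    if n < 2 or s <= 0:
        return float("nan")
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / (i * i) for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / (a1 * a1)
    e1 = c1 / a1
    e2 = c2 / (a1 * a1 + a2)
    variance = e1 * s + e2 * s * (s - 1)
    if variance <= 0:
        return float("nan")
    # Difference between the two estimators of theta, scaled by its standard deviation.
    return (pi - s / a1) / math.sqrt(variance)


# Hypothetical input: 10 sequences, 16 segregating sites, pi = 3.88.
print(tajimas_d(10, 16, 3.88))
```

A negative value means π is smaller than expected given S (an excess of rare variants); a positive value means the opposite.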
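http://blog.sciencenet.cn/blog-1469385-819498.html Contents: 1. Preparation 2. Pipeline overview 3. Pipeline. First, what can GATK do? It is mainly used for variant calling from sequencing data, including SNPs and INDELs. For example, in the now-popular use of exome sequencing to find variants, the data are generally analyzed with a BWA+GATK pipeline. To run GATK, you first need to get to know its website (http://www.broadinstitute.org/…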
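VCFtools is used to process VCF files: filtering for specific variants, comparing files, summarizing variants, converting file formats, validating and merging files, and taking intersections and differences of variant sets. Get basic file statistics: the input can be in VCF or BCF format (--vcf, --gzvcf or --bcf). vcftools --vcf test.vcf less test.vcf | vcftools --vcf - Applying a filter: the filtered variants can be written to a new file. --recode means the filtered records are written out, --recode-INF…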
The C++ executable module examples. This page provides usage examples for the executable module. Extended documentation for all of the options can be found on the manual page. Contents: Running the program; Getting basic file statistics; Applying a filter; Writing…
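Contents: Preface; The four SNP sets (hapmapSNPs, tagSNPs, fixedSNPs, barcodeSNPs); Summary statistics for hapmapSNPs; Population-structure validation of tagSNPs; Genetic diversity of tagSNPs; tagSNPs for GS; Validation of fixedSNPs; barcodeSNPs fingerprint map; barcodeIndel; The SR4R database. Preface: an article by Prof. Wang Xiangfeng (王向峰) published in 2020 in Genomics Proteomics Bioinformatics (IF=6.597). For people who do data analysis, how to mine public data, …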