TopHat
What is TopHat?
TopHat is a program that aligns RNA-Seq reads to a genome in order to identify exon-exon splice junctions. It is built on the ultrafast short read mapping program Bowtie. TopHat runs on Linux and OS X.
How does TopHat find junctions?
TopHat can find splice junctions without a reference annotation. By first mapping RNA-Seq reads to the genome, TopHat identifies potential exons, since many RNA-Seq reads will contiguously align to the genome. Using this initial mapping information, TopHat builds a database of possible splice junctions and then maps the reads against these junctions to confirm them.
Short read sequencing machines can currently produce reads 100bp or longer but many exons are shorter than this so they would be missed in the initial mapping. TopHat solves this problem mainly by splitting all input reads into smaller segments which are then mapped independently. The segment alignments are put back together in a final step of the program to produce the end-to-end read alignments.
TopHat generates its database of possible splice junctions from two sources of evidence. The first and strongest source of evidence for a splice junction is when two segments from the same read (for reads of at least 45bp) are mapped at a certain distance on the same genomic sequence or when an internal segment fails to map - again suggesting that such reads are spanning multiple exons. With this approach, "GT-AG", "GC-AG" and "AT-AC" introns will be found ab initio. The second source is pairings of "coverage islands", which are distinct regions of piled up reads in the initial mapping. Neighboring islands are often spliced together in the transcriptome, so TopHat looks for ways to join these with an intron. We only suggest users use this second option (--coverage-search) for short reads (< 45bp) and with a small number of reads (<= 10 million). This latter option will only report alignments across "GT-AG" introns
TopHat的更多相关文章
- tophat输出结果junction.bed
tophat输出结果junction.bed BED format BED format provides a flexible way to define the data lines ...
- tophat安装
1 依赖软件:bowtie,bowtie2,samtools,boost c++ library 2 建立索引文件: bowtie包括bowtie,bowtie-build, ...
- RNA-seq差异表达基因分析之TopHat篇
RNA-seq差异表达基因分析之TopHat篇 发表于2012 年 10 月 23 日 TopHat是基于Bowtie的将RNA-Seq数据mapping到参考基因组上,从而鉴定可变剪切(exon-e ...
- 机器学习进阶-图像形态学变化-礼帽与黑帽 1.cv2.TOPHAT(礼帽-原始图片-开运算后图片) 2.cv2.BLACKHAT(黑帽 闭运算-原始图片)
1.op = cv2.TOPHAT 礼帽:原始图片-开运算后的图片 2. op=cv2.BLACKHAT 黑帽: 闭运算后的图片-原始图片 礼帽:表示的是原始图像-开运算(先腐蚀再膨胀)以后的图像 ...
- 使用Tophat+cufflinks分析差异表达
使用Tophat+cufflinks分析差异表达 2017-06-15 19:09:43 522 0 0 使用TopHat+Cufflinks的流程图 序列的比对是RNA分析 ...
- RNA-seq 安装 fastaqc,tophat,cuffilnks,hisat2
------------------------------------------ 安装fastqc------------------------------------------------- ...
- tophat的用法
概述:tophat是以bowtie2为核心的一款比对软件. tophat工作分两步: 1.将reads用bowtie比对到参考基因组上. 2.将unmapped-reads打断成更小的fragment ...
- SAGE|DNA微阵列|RNA-seq|lncRNA|scripture|tophat|cufflinks|NONCODE|MA|LOWESS|qualitile归一化|permutation test|SAM|FDR|The Bonferroni|Tukey's|BH|FWER|Holm's step-down|q-value|
生物信息学-基因表达分析 为了丰富中心法则,研究人员使用不断更新的技术研究lncRNA的方方面面,其中技术主要是生物学上的微阵列芯片技术和表达数据分析方法,方方面面是指lncRNA的位置特征. Bac ...
- 链终止法|边合成边测序|Bowtie|TopHat|Cufflinks|RPKM|FASTX-Toolkit|fastaQC|基因芯片|桥式扩增|
生物信息学 Sanger采用链终止法进行测序 带有荧光基团的ddXTP+其他四种普通的脱氧核苷酸放入同一个培养皿中,例如带有荧光基团的ddATP+普通的脱氧核苷酸A.T.C.G放入同一个培养皿,以此类 ...
随机推荐
- WPF 3D 知识点大全以及实例
引言 现在物联网概念这么火,如果监控的信息能够实时在手机的客服端中以3D形式展示给我们,那种体验大家可以发挥自己的想象. 那生活中我们还有很多地方用到这些,如上图所示的Kinect 在医疗上的应用,当 ...
- 怎样用ZBrush对模型进行渲染(二)
继上节课Fisker老师对ZBrush中对渲染和灯光起到重要作用的Light和LightCap进行了具体讲解之后,本节课继续研究Render(渲染)和Light及LightCap相结合会产生什么样的效 ...
- 用U盘安装Ubuntu系统
用U盘安装Ubuntu,需制作一个Ubuntu的U盘安装盘,最为方便和可靠的制作方法是在Linux系统下使用dd命令,具体如下, sudo dd if=ubuntu-14.04.4-server-am ...
- Eclipse InstaSearch搜索词法 (很多并不支持)
1. 中文翻译 http://www.cnblogs.com/xing901022/p/4974977.html 2. 英文原文 http://lucene.apache.org/core/3_0_3 ...
- angular的跨域(angular百度下拉提示模拟)和angular选项卡
1.angular中$http的服务: $http.get(url,{params:{参数}}).success().error(); $http.post(url,{params:{参数}}).su ...
- webpack中字体配置,可以引入bootstrap
{test:/\.(eot|ttf|woff|woff2|svg)$/,loader:'file?name=fonts/[name].[ext]'} 将css中用到的字体全部提取存放到fonts目录下 ...
- NPM 无法下载任何包的原因,解决方法
前几天发现NPM 无法现在任何的包 通过npm i testPackage -ddd 发现 是卡在了 npm verb addRemoteTarball 这行,google后发现 是有多个tmp地址, ...
- ELK日志系统:Elasticsearch + Logstash + Kibana 搭建教程
环境:OS X 10.10.5 + JDK 1.8 步骤: 一.下载ELK的三大组件 Elasticsearch下载地址: https://www.elastic.co/downloads/elast ...
- spring的AOP
最近公司项目中需要添加一个日志记录功能,就是可以清楚的看到谁在什么时间做了什么事情,因为项目已经运行很长时间,这个最初没有开来进来,所以就用spring的面向切面编程来实现这个功能.在做的时候对spr ...
- php 5.4中php-fpm 的重启、终止操作命令
php 5.4中php-fpm 的重启.终止操作命令: 查看php运行目录命令:which php/usr/bin/php 查看php-fpm进程数:ps aux | grep -c php-fpm ...