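RepeatMasker (RM) is library-based: it identifies repeats by similarity comparison against a repeat library, and it can mask transposon-derived repeats and low-complexity sequences (by default replacing them with N). It uses the Dfam and Repbase databases. The Dfam database is a collection of repetitive DNA element sequence alignments, hidden Markov models (HMMs), and match lists for complete eukaryote genomes. Repbase is…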
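To make the "replace with N" idea concrete, here is a minimal Python sketch of hard masking. It is not RepeatMasker's implementation: the function name `hard_mask` and the hand-written half-open `(start, end)` intervals are assumptions made for illustration, standing in for the repeat hits a real run would report.

```python
def hard_mask(sequence, intervals, mask_char="N"):
    """Return `sequence` with every half-open [start, end) interval masked.

    `intervals` is a hypothetical list of 0-based (start, end) pairs that
    stands in for repeat annotations; it is not RepeatMasker output.
    """
    masked = list(sequence)
    for start, end in intervals:
        # Clamp the interval to the sequence bounds before masking.
        start = max(start, 0)
        end = min(end, len(masked))
        for i in range(start, end):
            masked[i] = mask_char
    return "".join(masked)


if __name__ == "__main__":
    # Toy sequence: a poly-T run flanked by unique sequence.
    seq = "ACGTACGT" + "T" * 8 + "ACGT"
    print(hard_mask(seq, [(8, 16)]))
```

Hard masking keeps the sequence length and coordinates unchanged, so annotations produced on the masked sequence still map directly onto the original.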
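Some of my earlier blog posts were written in scraps of spare time, and I need to set aside time to tidy them up. Today I was reinstalling RepeatMasker. Here is how it went; two related things came up. 1. Installing RepeatMasker requires various prerequisites, one of which may be BLAST, and I had been looking for this version of BLAST for a while but simply could not find it on NCBI: For RMBlast ( NCBI Blast modified for use with RepeatMasker/RepeatModeler ) please go to our download pa…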
Reference Genome Components 1. GRCh38 is special because it has alternate contigs that represent population haplotypes. Don't know alternate contig from alternate dimension? Spend five minutes now to review terminology in our Dictionary entry Referenc…
http://gmod.org/wiki/MAKER_Tutorial Simple and easy to use. identify repeats, to align ESTs and proteins to the genome, and to automatically synthesize these data into feature-rich gene annotations, including alternative splicing and UTRs, as well as attributes such as…