这算是第二讲了,前面一讲是:Edit Distance编辑距离(NM tag)- sam/bam格式解读进阶 MD是mismatch位置的字符串的表示形式,貌似在call SNP和indel的时候会用到. 当然我这里要说的只是利用它来计算mismatch的个数 MD = line.get_tag('MD') pat = "[0-9]+[ATGC]+" MD_list = re.findall(pat,MD) for i in MD_list: for j in i: if j == '…
sam格式很精炼,几乎包含了比对的所有信息,我们平常用到的信息很少,但特殊情况下,我们会用到一些较为生僻的信息,关于这些信息sam官方文档的介绍比较精简,直接看估计很难看懂. 今天要介绍的是如何通过bam文件统计比对的indel和mismatch信息 首先要介绍一个非常重要的概念--编辑距离 定义:从字符串a变到字符串b,所需要的最少的操作步骤(插入,删除,更改)为两个字符串之间的编辑距离. (2016年11月17日:增加,有点误导,如果一个插入有两个字符,那编辑距离变了几呢?1还是2?我又验证…
1)Sam (Sequence Alignment/Map) ------------------------------------------------- 1) SAM 文件产生背景 随着Illumina/Solexa, AB/SOLiD and Roche/454测序技术不断的进步,各种比对工具产生,被用来高效的将reads比对到参考基因组.因为这些比对工具产生不同格式的文件,导致下游分析比较困难,因此一个通用的格式可以提供一个很好的接口用于链接比对与下游分析(组装,变异等,基因分型等)…
Given two words word1 and word2, find the minimum number of steps required to convert word1 to word2. (each operation is counted as 1 step.) You have the following 3 operations permitted on a word: a) Insert a characterb) Delete a characterc) Replace…
Edit Distance Given two words word1 and word2, find the minimum number of steps required to convert word1 to word2. (each operation is counted as 1 step.) You have the following 3 operations permitted on a word: a) Insert a character b) Delete a char…
Given two words word1 and word2, find the minimum number of operations required to convert word1 to word2. You have the following 3 operations permitted on a word: Insert a character Delete a character Replace a character Example 1: Input: word1 = "h…
以下为个人翻译方便理解 编辑距离问题是一个经典的动态规划问题.首先定义dp[i][j表示word1[0..i-1]到word2[0..j-1]的最小操作数(即编辑距离). 状态转换方程有两种情况:边界情况和一般情况,以上表示中 i和j均从1开始(注释:即至少一个字符的字符串向一个字符的字符串转换,0字符到0字符转换编辑距离自然为0) 1.边界情况:将一个字符串转化为空串,很容易看出把word[0...i-1]转化成空串""至少需要i次操作(注释:i次删除),则其编辑距离为i,即:dp[…
Given two words word1 and word2, find the minimum number of operations required to convert word1 to word2.You have the following 3 operations permitted on a word: Insert a character Delete a character Replace a character Example 1: Input: word1 = "ho…
作者: 负雪明烛 id: fuxuemingzhu 个人博客: http://fuxuemingzhu.cn/ 目录 题目描述 题目大意 解题方法 递归 记忆化搜索 动态规划 日期 题目地址:https://leetcode.com/problems/edit-distance/description/ 题目描述 Given two words word1 and word2, find the minimum number of operations required to convert w…
Given two words word1 and word2, find the minimum number of steps required to convert word1 to word2. (each operation is counted as 1 step.) You have the following 3 operations permitted on a word: a) Insert a characterb) Delete a characterc) Replace…