09 Finding a Motif in DNA
Problem
Given two strings ss and tt, tt is a substring of ss if tt is contained as a contiguous collection of symbols in ss (as a result, tt must be no longer than ss).
The position of a symbol in a string is the total number of symbols found to its left, including itself (e.g., the positions of all occurrences of 'U' in "AUGCUUCAGAAAGGUCUUACG" are 2, 5, 6, 15, 17, and 18). The symbol at position ii of ss is denoted by s[i]s[i].
A substring of ss can be represented as s[j:k]s[j:k], where jj and kk represent the starting and ending positions of the substring in ss; for example, if ss = "AUGCUUCAGAAAGGUCUUACG", then s[2:5]s[2:5] = "UGCU".
The location of a substring s[j:k]s[j:k] is its beginning position jj; note that tt will have multiple locations in ss if it occurs more than once as a substring of ss (see the Sample below).
Given: Two DNA strings ss and tt (each of length at most 1 kbp).
Return: All locations of tt as a substring of ss.
Sample Dataset
GATATATGCATATACTT
ATAT
Sample Output
2 4 10
#-*-coding:UTF-8-*-
### 9. Finding a Motif in DNA ### # Method 1: Use Module regex.finditer
import regex
# 比re更强大的模块 matches = regex.finditer('ATAT', 'GATATATGCATATACTT', overlapped=True)
# 返回所有匹配项,
for match in matches:
print (match.start() + 1) # Method 2: Brute Force Search
seq = 'GATATATGCATATACTT'
pattern = 'ATAT' def find_motif(seq, pattern):
position = []
for i in range(len(seq) - len(pattern)):
if seq[i:i + len(pattern)] == pattern:
position.append(str(i + 1)) print ('\t'.join(position)) find_motif(seq, pattern) # methond 3
import re
seq='GATATATGCATATACTT'
print [i.start()+1 for i in re.finditer('(?=ATAT)',seq)]
# ?= 之后字符串内容需要匹配表达式才能成功匹配。
09 Finding a Motif in DNA的更多相关文章
- Hash function
Hash function From Wikipedia, the free encyclopedia A hash function that maps names to integers fr ...
- Windows7WithSP1/TeamFoundationServer2012update4/SQLServer2012
[Info @09:03:33.737] ====================================================================[Info @ ...
- DNA motif 搜索算法总结
DNA motif 搜索算法总结 2011-09-15 ~ ADMIN 翻译自:A survey of DNA motif finding algorithms, Modan K Das et. al ...
- 14 Finding a Shared Motif
Problem A common substring of a collection of strings is a substring of every member of the collecti ...
- DNA binding motif比对算法
DNA binding motif比对算法 2012-08-31 ~ ADMIN 之前介绍了序列比对的一些算法.本节主要讲述motif(有人翻译成结构模式,但本文一律使用基模)的比对算法. 那么什么是 ...
- 16 Finding a Protein Motif
Problem To allow for the presence of its varying forms, a protein motif is represented by a shorthan ...
- hdu 1560 DNA sequence(搜索)
http://acm.hdu.edu.cn/showproblem.php?pid=1560 DNA sequence Time Limit: 15000/5000 MS (Java/Others) ...
- 查找EBS中各种文件版本(Finding File Versions in the Oracle Applications EBusiness Suite - Checking the $HEADER)
Finding File Versions in the Oracle Applications EBusiness Suite - Checking the $HEADER (文档 ID 85895 ...
- hdu 1560 DNA sequence(迭代加深搜索)
DNA sequence Time Limit : 15000/5000ms (Java/Other) Memory Limit : 32768/32768K (Java/Other) Total ...
随机推荐
- 洛谷九月月赛II
题解:模拟 一旦不匹配就要break #include<iostream> #include<cstdio> #include<cstring> #include& ...
- streamsets record header 属性
record 的header 属性可以在pipeline 逻辑中使用. 有写stages 会为了特殊目录创建reord header 属性,比如(cdc)需要进行crud 操作类型的区分 你可以使用一 ...
- UDP的connect
UDP的connect没有三次握手过程,内核只是检测是否存在立即可知的错误(如一个显然不可达的目的地), 记录对端的的IP地址和端口号,然后立即返回调用进程. 未连接UDP套接字(unconnecte ...
- mysql索引优化续
(1)索引类型: Btree索引:抽象的可以理解为“排好序的”快速查找结构myisam,innodb中默认使用Btree索引 hash索引:hash索引计算速度非常的快,但数据是随机放置的,无法对范围 ...
- [持续更新]一些zyys的题的集合
Luogu P1119 灾后重建 Sol:对于每个中转点K,需且仅需以此松弛一次 Key words:Floyd,本质活用 考题 路径数 题目描述: Euphemia到一个N*N的药草田里采药,她从左 ...
- selenium笔记2017
1,from time import sleep(先引入关键词) sleep(5) (就可以使用这个命令了) 可以停止页面5秒 1-1. 等待页面元素出现的时间(即没出现时,等待元素出现) ...
- 揭秘 Python 中的 enumerate() 函数
原文:https://mp.weixin.qq.com/s/Jm7YiCA20RDSTrF4dHeykQ 如何以去写以及为什么你应该使用Python中的内置枚举函数来编写更干净更加Pythonic的循 ...
- 5月25日-js操作DOM遍历子节点
一.遍历节点 遍历子节点 children();//获取节点的所有直接子类 遍历同辈节点 next(); prev(); siblings();//所有同辈元素 *find(); 从后代元素中查找匹配 ...
- ffmpeg重要的参考学习网址
http://lib.csdn.net/liveplay/knowledge/1586 FFmpeg滤镜使用指南 http://blog.csdn.net/fireroll/article/detai ...
- springcloud(五) Hystrix 降级,超时
分布式系统中一定会遇到的一个问题:服务雪崩效应或者叫级联效应什么是服务雪崩效应呢? 在一个高度服务化的系统中,我们实现的一个业务逻辑通常会依赖多个服务,比如:商品详情展示服务会依赖商品服务, 价格服务 ...