[LeetCode] 187. Repeated DNA Sequences: Solution Approach
All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACGAATTCCG". When studying DNA, it is sometimes useful to identify repeated sequences within the DNA.
Write a function to find all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule.
For example,
Given s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT", Return:
["AAAAACCCCC", "CCCCCAAAAA"].
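Problem: given a string that represents a DNA sequence, find every substring of length 10 that occurs more than once.
The repeated substrings in the problem's example do not overlap, but overlapping occurrences must be counted as well. For example, the 11-character string "AAAAAAAAAAA" contains two occurrences of the 10-letter substring "AAAAAAAAAA" (starting at index 0 and at index 1). The problem statement does not make this clear.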
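To make the overlap case concrete, here is a minimal sketch that lists every 10-letter window of a string. The helper name `allWindows` is hypothetical and is not part of the original solution; it only illustrates how adjacent windows share nine characters.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not from the original post): returns every
// length-`len` window of `s`, left to right, including overlapping ones.
std::vector<std::string> allWindows(const std::string& s, std::size_t len = 10) {
    std::vector<std::string> windows;
    if (len == 0) return windows;  // a zero-length window is not meaningful here
    // The last valid start index i satisfies i + len == s.size().
    for (std::size_t i = 0; i + len <= s.size(); ++i) {
        windows.push_back(s.substr(i, len));
    }
    return windows;
}
```

Called on a string of eleven 'A' characters, this returns two windows (starting at index 0 and index 1), and both are "AAAAAAAAAA", so that sequence counts as repeated.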
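Once the problem is clear, the approach is fairly simple:
- Put every contiguous substring of length 10 in s into a map<string, int> ss_cnt, counting how many times each one occurs.
- Treat [0, 9] as a window. Decrement the count of the window's string in ss_cnt by 1, then check whether ss_cnt still holds at least one occurrence of that string. If it does, the window's string is repeated. Restore the count before moving on.
- Move the window one position to the right and repeat the second step until the window reaches the right end of s.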
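#include <string>
#include <unordered_map>
#include <unordered_set>
#include <vector>
using namespace std;

/**
 * Repeated substrings may overlap.
 */
vector<string> findRepeatedDnaSequences(string s) {
    unordered_set<string> res;          // repeated sequences, deduplicated
    unordered_map<string, int> ss_cnt;  // occurrences of each 10-letter window
    int len = 10;

    // Step 1: count every window of length 10.
    for (int i = 0; i + len - 1 < s.size(); i++) {
        string str = s.substr(i, len);
        ss_cnt[str]++;
    }

    // Step 2: slide the window again. Take the current occurrence out of the
    // count; if another occurrence remains, the window's string is repeated.
    int i = 0;
    while (i + len - 1 < s.size()) {
        string cur = s.substr(i, len);
        ss_cnt[cur]--;
        if (ss_cnt[cur] > 0) {
            res.insert(cur);
        }
        ss_cnt[cur]++;  // restore the count for later windows
        i++;
    }

    // Copy the set into the result vector (order is unspecified).
    return vector<string>(res.begin(), res.end());
}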
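The two passes above can also be folded into one. The following is an alternative sketch, not the code from this post: it records a window the moment its count reaches exactly 2, so no second scan or set is needed. The function name `findRepeatedDnaSequencesOnePass` is made up for illustration.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Single-pass variant (an alternative sketch, not the original post's code).
// Each 10-letter window is counted as it is seen; a window is appended to the
// result when its count becomes 2, so every repeated sequence is reported once.
std::vector<std::string> findRepeatedDnaSequencesOnePass(const std::string& s) {
    const std::size_t len = 10;
    std::unordered_map<std::string, int> seen;  // window -> occurrences so far
    std::vector<std::string> result;
    for (std::size_t i = 0; i + len <= s.size(); ++i) {
        std::string cur = s.substr(i, len);
        if (++seen[cur] == 2) {
            result.push_back(cur);
        }
    }
    return result;
}
```

With this variant the results come out in the order in which each sequence is seen for the second time, whereas the set-based version above returns them in unspecified order. Overlapping occurrences are handled the same way in both.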