Repeated DNA Sequences

All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACGAATTCCG". When studying DNA, it is sometimes useful to identify repeated sequences within the DNA.

Write a function to find all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule.

For example,

Given s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT",

Return:
["AAAAACCCCC", "CCCCCAAAAA"].
 
A bitmap can be used to reduce memory usage. The code is as follows:
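#include <stdlib.h>
#include <string.h>

/* One bit per possible 20-bit key (10 letters x 2 bits each), so 2^20 bits per bitmap. */
int map_exist[1024 * 1024 / 32];    /* keys already copied into the result */
int map_pattern[1024 * 1024 / 32];  /* keys seen at least once */

#define set(map, x) \
    (map[(x) >> 5] |= (1u << ((x) & 0x1F)))
#define test(map, x) \
    (map[(x) >> 5] & (1u << ((x) & 0x1F)))

int dnamap[26];

char** findRepeatedDnaSequences(char* s, int* returnSize) {
    *returnSize = 0;
    if (s == NULL) return NULL;
    int len = strlen(s);
    if (len <= 10) return NULL;

    memset(map_exist, 0, sizeof(int) * (1024 * 1024 / 32));
    memset(map_pattern, 0, sizeof(int) * (1024 * 1024 / 32));

    /* Encode each nucleotide in 2 bits. */
    dnamap['A' - 'A'] = 0; dnamap['C' - 'A'] = 1;
    dnamap['G' - 'A'] = 2; dnamap['T' - 'A'] = 3;

    char** ret = malloc(sizeof(char*));
    int curr = 0;  /* number of results stored */
    int size = 1;  /* capacity of ret */
    int key = 0;   /* must start at 0: it is shifted before being assigned */
    int i = 0;

    /* Preload the first 9 letters; the loop below completes each 10-letter window. */
    while (i < 9)
        key = (key << 2) | dnamap[s[i++] - 'A'];
    while (i < len) {
        /* Drop the oldest letter (keep 20 bits) and append the newest one. */
        key = ((key << 2) & 0xFFFFF) | dnamap[s[i++] - 'A'];
        if (test(map_pattern, key)) {
            if (!test(map_exist, key)) {
                set(map_exist, key);
                if (curr == size) {
                    size *= 2;
                    ret = realloc(ret, sizeof(char*) * size);
                }
                ret[curr] = malloc(sizeof(char) * 11);
                memcpy(ret[curr], &s[i - 10], 10);
                ret[curr][10] = '\0';
                ++curr;
            }
        }
        else {
            set(map_pattern, key);
        }
    }

    /* Avoid realloc with size 0, whose result is implementation-defined. */
    if (curr == 0) { free(ret); return NULL; }
    ret = realloc(ret, sizeof(char*) * curr);
    *returnSize = curr;
    return ret;
}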
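
The two ideas the solution relies on can be tried in isolation: packing a 10-letter window into a 20-bit key, and keeping one bit per key in a bitmap. The following is only a minimal sketch, assuming the input contains nothing but A, C, G and T. The helper names (`nucleotide_code`, `window_key`, `roll_key`, `bitmap_set`, `bitmap_test`) are made up for illustration and are not part of the solution above.

```c
#include <assert.h>
#include <stdint.h>

#define KEY_BITS 20u                          /* 10 letters x 2 bits each */
#define KEY_MASK ((1u << KEY_BITS) - 1u)      /* 0xFFFFF */
#define BITMAP_WORDS ((1u << KEY_BITS) / 32u) /* 32768 words of 32 bits */

/* Map a nucleotide to a 2-bit code: A=0, C=1, G=2, T=3. */
static uint32_t nucleotide_code(char c) {
    switch (c) {
    case 'A': return 0u;
    case 'C': return 1u;
    case 'G': return 2u;
    default:
        assert(c == 'T');  /* assumption: input holds only A, C, G, T */
        return 3u;
    }
}

/* Pack the 10 letters starting at s into a single 20-bit key. */
static uint32_t window_key(const char *s) {
    uint32_t key = 0;
    for (int i = 0; i < 10; ++i)
        key = (key << 2) | nucleotide_code(s[i]);
    return key;
}

/* Slide the window one letter to the right: the shift pushes the oldest
   letter past bit 19, the mask discards it, and the new letter is appended. */
static uint32_t roll_key(uint32_t key, char next) {
    return ((key << 2) & KEY_MASK) | nucleotide_code(next);
}

/* Set the bit for key: word index is key / 32, bit index is key % 32. */
static void bitmap_set(uint32_t *map, uint32_t key) {
    map[key >> 5] |= UINT32_C(1) << (key & 31u);
}

/* Return 1 if the bit for key is set, 0 otherwise. */
static int bitmap_test(const uint32_t *map, uint32_t key) {
    return (int)((map[key >> 5] >> (key & 31u)) & 1u);
}
```

With this encoding, each bitmap needs 2^20 bits, that is 128 KiB, no matter how long the input string is. Rolling the key also means each new window costs a shift, a mask and an OR instead of re-reading 10 characters.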

This solution runs in about 6 ms, which is very fast.

 
