Repeated DNA Sequences

All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACGAATTCCG". When studying DNA, it is sometimes useful to identify repeated sequences within the DNA.

Write a function to find all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule.

For example,

Given s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT",

Return:
["AAAAACCCCC", "CCCCCAAAAA"].
 
用位图算法可以减少内存,代码如下:
int map_exist[ *  / ];
int map_pattern[ * / ]; #define set(map,x) \
(map[x >> ] |= ( << (x & 0x1F))) #define test(map,x) \
(map[x >> ] & ( << (x & 0x1F))) int dnamap[]; char** findRepeatedDnaSequences(char* s, int* returnSize) {
*returnSize = ;
if (s == NULL) return NULL;
int len = strlen(s);
if (len <= ) return NULL; memset(map_exist, , sizeof(int)* ( * / ));
memset(map_pattern, , sizeof(int)* ( * / )); dnamap['A' - 'A'] = ; dnamap['C' - 'A'] = ;
dnamap['G' - 'A'] = ; dnamap['T' - 'A'] = ; char ** ret = malloc(sizeof(char*));
int curr = ;
int size = ;
int key;
int i = ; while (i < )
key = (key << ) | dnamap[s[i++] - 'A'];
while (i < len){
key = ((key << ) & 0xFFFFF) | dnamap[s[i++] - 'A'];
if (test(map_pattern, key)){
if (!test(map_exist, key)){
set(map_exist, key);
if (curr == size){
size *= ;
ret = realloc(ret, sizeof(char*)* size);
}
ret[curr] = malloc(sizeof(char)* );
memcpy(ret[curr], &s[i-], );
ret[curr][] = '\0';
++curr;
} }
else{
set(map_pattern, key);
}
} ret = realloc(ret, sizeof(char*)* curr);
*returnSize = curr;
return ret;
}

该算法用时 6ms 左右, 非常快

 

LeetCode-Repeated DNA Sequences (位图算法减少内存)的更多相关文章

  1. [LeetCode] Repeated DNA Sequences 求重复的DNA序列

    All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACG ...

  2. [LeetCode] Repeated DNA Sequences hash map

    All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACG ...

  3. [Leetcode] Repeated DNA Sequences

    All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACG ...

  4. LeetCode() Repeated DNA Sequences 看的非常的过瘾!

    All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACG ...

  5. LeetCode 187. 重复的DNA序列(Repeated DNA Sequences)

    187. 重复的DNA序列 187. Repeated DNA Sequences 题目描述 All DNA is composed of a series of nucleotides abbrev ...

  6. lc面试准备:Repeated DNA Sequences

    1 题目 All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: &quo ...

  7. [LeetCode] 187. Repeated DNA Sequences 求重复的DNA序列

    All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACG ...

  8. Leetcode:Repeated DNA Sequences详细题解

    题目 All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: " ...

  9. 【leetcode】Repeated DNA Sequences(middle)★

    All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACG ...

随机推荐

  1. 【leetcode】String to Integer (atoi)

    String to Integer (atoi) Implement atoi to convert a string to an integer. Hint: Carefully consider ...

  2. ali2015校园招聘笔试大题

    [本文链接] http://www.cnblogs.com/hellogiser/p/ali-2015-questions.html 1. 写一个函数,输入一个二叉树,树中每个节点存放了一个整数值,函 ...

  3. Java异常与异常处理简单使用

    异常就是程序运行过程中阻止当前方法或作用域继续执行的问题: 任何程序都不能保证完全正常运行,当发生异常时,需要我们去处理异常,特别是一些比较重要的场景,异常处理的逻辑也会比较复杂,比如:给用户提示.保 ...

  4. 【python】choice函数

    来源:http://www.runoob.com/python/func-number-choice.html 描述 choice() 方法返回一个列表,元组或字符串的随机项. 语法 以下是 choi ...

  5. Android Handler leak 分析及解决办法

    In Android, Handler classes should be static or leaks might occur, Messages enqueued on the applicat ...

  6. google登录不了解决喽

    大家好,google 每到这个时候就登录不聊了.... 解法: 修改host 文件 下载地址点我

  7. Linux使用tcpdump命令抓包保存pcap文件wireshark分析

    [root@ok Desktop]# yum search tcpdump Loaded plugins: fastestmirror, refresh-packagekit, security Lo ...

  8. mac 下修改Hosts文件

    最近Google网站老是打不开,具体原因大家都明白,不过修改Hosts文件后,就能访问了,也算不上原创,网上一搜就能找到,自己操作记录下,希望有刚接触Mac 系统的童鞋有帮助. 第一步:打开Finde ...

  9. CRT注册工具使用说明

    激活步骤如下:   1)准备工作:安装好SecureCRT软件,下载并得到该注册机.   2)保持SecureCRT软件关闭(运行的话会提示你正在运行的,关闭就好).   3)将注册机拷贝到你的CRT ...

  10. TypeC一个微软开发的超简单.NET依赖注入/IoC容器

    控制反转(IoC,Inversion of Control)是由Martin Fowler总结出来的一种设计模式,用来减少代码间的耦合.一般而言,控制反转分为依赖注入(Dependency Injec ...