CodeForces 528D. Fuzzy Search
time limit per test: 3 seconds, memory limit per test: 256 megabytes
Leonid works for a small and promising start-up that works on decoding the human genome. His duties include solving complex problems of finding certain patterns in long strings consisting of letters 'A', 'T', 'G' and 'C'.
Let's consider the following scenario. There is a fragment of a human DNA chain, recorded as a string S. To analyze the fragment, you need to find all occurrences of string T in string S. However, the original chain fragment could contain minor mutations, which complicate the task of finding a fragment. Leonid proposed the following approach to solve this problem.
Let's write down an integer k ≥ 0, the error threshold. We will say that string T occurs in string S at position i (1 ≤ i ≤ |S| - |T| + 1) if, after aligning string T with this position, each character of T corresponds to some character of the same value in S at a distance of at most k. More formally, for any j (1 ≤ j ≤ |T|) there must exist a p (1 ≤ p ≤ |S|) such that |(i + j - 1) - p| ≤ k and S[p] = T[j].
For example, under this definition with k = 1, string "ACAT" occurs in string "AGCAATTCAT" at positions 2, 3 and 6.
Note that at k = 0 the given definition reduces to the ordinary definition of an occurrence of a string in a string.
Help Leonid by calculating in how many positions the given string T occurs in the given string S with the given error threshold.
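The definition can be checked directly on small inputs. The following is a minimal sketch, not part of the original post: `countFuzzyBrute` is a hypothetical helper name, it uses 0-based indices internally, and it is far too slow for the full limits, so it only serves as a reference to compare a faster solution against.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Hypothetical helper (not from the original post): counts the positions at
// which t occurs in s with error threshold k, straight from the definition.
long long countFuzzyBrute(const std::string& s, const std::string& t, int k) {
    assert(k >= 0);
    const int n = static_cast<int>(s.size());
    const int m = static_cast<int>(t.size());
    long long count = 0;
    for (int i = 0; i + m <= n; ++i) {          // candidate start (0-based)
        bool ok = true;
        for (int j = 0; j < m && ok; ++j) {     // every character of t
            const int center = i + j;           // position t[j] is aligned with
            const int lo = std::max(0, center - k);
            const int hi = std::min(n - 1, center + k);
            bool found = false;
            for (int p = lo; p <= hi && !found; ++p) {
                found = (s[p] == t[j]);         // same character within distance k
            }
            ok = found;
        }
        if (ok) ++count;
    }
    return count;
}
```

On the sample from the statement, `countFuzzyBrute("AGCAATTCAT", "ACAT", 1)` returns 3.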
Input
The first line contains three integers |S|, |T|, k (1 ≤ |T| ≤ |S| ≤ 200 000, 0 ≤ k ≤ 200 000) — the lengths of strings S and T and the error threshold.
The second line contains string S.
The third line contains string T.
Both strings consist only of uppercase letters 'A', 'T', 'G' and 'C'.
Output
Print a single number — the number of occurrences of T in S with the error threshold k by the given definition.
Examples
input
10 4 1
AGCAATTCAT
ACAT
output
3
Note
If you happen to know about the structure of the human genome a little more than the author of the problem, and you are not impressed with Leonid's original approach, do not take everything described above seriously.
Solution
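If character X appears anywhere in the range [i-k, i+k] of the text S, position i is considered able to match X.
The task is to count the positions of S at which the pattern T can be matched.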
Tags: generating functions, FFT
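Since there are only four distinct characters, each of them can simply be handled separately.
For each character, scan S and mark with 1 every position that can match it; then reverse T and mark with 1 the positions of T that hold that character. Convolving the two 0/1 arrays gives, for every alignment, how many occurrences of that character in T are matched.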
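The index arithmetic behind the reversal can be written out as follows. This is a sketch in 0-based indices; the symbols \(a_c\), \(b_c\) and \(\tilde b_c\) are notation introduced here for illustration and do not appear in the original post.

```latex
a_c[i] = \begin{cases} 1 & \text{if } \exists\, p:\ |p - i| \le k,\ S[p] = c \\ 0 & \text{otherwise} \end{cases}
\qquad
b_c[j] = [\,T[j] = c\,], \quad \tilde b_c[j] = b_c[\,|T| - 1 - j\,]

(a_c * \tilde b_c)[\,i + |T| - 1\,] = \sum_{j=0}^{|T|-1} a_c[i + j]\, b_c[j]

\sum_{c \in \{A, C, G, T\}} (a_c * \tilde b_c)[\,i + |T| - 1\,] = |T|
\iff T \text{ occurs in } S \text{ at position } i
```

Each index j of T contributes to exactly one of the four sums, so the total can never exceed |T|, and it reaches |T| only when every character of T finds a match.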
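Running the convolution four times, once per character, and adding up the results gives the answer: an alignment counts when its total equals |T|.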
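The same counting scheme can be sketched without FFT, which makes the two steps (marking, then summing per alignment) easier to see. This is an illustration under stated assumptions, not the author's code: `markReachable` and `countFuzzyDirect` are hypothetical names, the marking uses a difference array rather than a sliding window, and the sums are evaluated directly in O(|S||T|), so it is only suitable for small inputs.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: reach[i] = 1 if character ch occurs in s within
// distance k of position i (0-based), computed with a difference array.
std::vector<int> markReachable(const std::string& s, char ch, int k) {
    assert(k >= 0);
    const int n = static_cast<int>(s.size());
    std::vector<int> diff(n + 1, 0);
    for (int p = 0; p < n; ++p) {
        if (s[p] != ch) continue;
        const int lo = (p - k < 0) ? 0 : p - k;          // clamp to the string
        const int hi = (p + k > n - 1) ? n - 1 : p + k;
        ++diff[lo];                                      // cover [lo, hi]
        --diff[hi + 1];
    }
    std::vector<int> reach(n, 0);
    int cover = 0;
    for (int i = 0; i < n; ++i) {
        cover += diff[i];
        reach[i] = (cover > 0) ? 1 : 0;
    }
    return reach;
}

// Hypothetical helper: per-character sums evaluated directly instead of by
// FFT; an alignment counts when all |t| characters are matched.
long long countFuzzyDirect(const std::string& s, const std::string& t, int k) {
    const int n = static_cast<int>(s.size());
    const int m = static_cast<int>(t.size());
    if (m > n) return 0;
    std::vector<long long> matched(n - m + 1, 0);
    for (char ch : std::string("ACGT")) {
        const std::vector<int> reach = markReachable(s, ch, k);
        for (int i = 0; i + m <= n; ++i) {
            for (int j = 0; j < m; ++j) {
                if (t[j] == ch) matched[i] += reach[i + j];
            }
        }
    }
    long long count = 0;
    for (long long v : matched) {
        if (v == m) ++count;
    }
    return count;
}
```

Replacing the inner double loop with one convolution per character is what brings the running time down to O(N log N) for each of the four characters.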
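/*by SilverN*/
#include<iostream>
#include<algorithm>
#include<cstring>
#include<cstdio>
#include<cmath>
#include<vector>
#define LL long long
using namespace std;
const double pi=acos(-1.0);
// Assumed bound: |S| <= 200000, so the strings fit in mxn characters and
// mxn<<2 covers the padded FFT length (2^19 for |S|+|T| <= 400000).
const int mxn=200010;
// Fast reader for a (possibly negative) integer from stdin.
int read(){
    int x=0,f=1;char ch=getchar();
    while(ch<'0' || ch>'9'){if(ch=='-')f=-1;ch=getchar();}
    while(ch>='0' && ch<='9'){x=x*10+ch-'0';ch=getchar();}
    return x*f;
}
// Minimal complex number type for the FFT.
struct com{
    double x,y;
    com operator + (const com &b) const {return com{x+b.x,y+b.y};}
    com operator - (const com &b) const {return com{x-b.x,y-b.y};}
    com operator * (const com &b) const {return com{x*b.x-y*b.y,x*b.y+y*b.x};}
    com operator / (double v) const {return com{x/v,y/v};}
}a[mxn<<2],b[mxn<<2];
int N,len;          // N = FFT length (power of two), len = log2(N)
int rev[mxn<<2];    // bit-reversal permutation
// Iterative FFT; flag=1 is the forward transform, flag=-1 the inverse.
void FFT(com *a,int flag){
    for(int i=0;i<N;i++)
        if(i<rev[i])swap(a[i],a[rev[i]]);
    for(int i=1;i<N;i<<=1){
        com wn=com{cos(pi/i),flag*sin(pi/i)};
        int p=i<<1;
        for(int j=0;j<N;j+=p){
            com w=com{1,0};
            for(int k=0;k<i;k++,w=w*wn){
                com x=a[j+k],y=w*a[j+k+i];
                a[j+k]=x+y;
                a[j+k+i]=x-y;
            }
        }
    }
    // Only the real parts are read afterwards, so only they are normalised.
    if(flag==-1)for(int i=0;i<N;i++) a[i].x/=N;
    return;
}
char s[mxn],c[mxn]; // s = text S, c = pattern T (reversed in main)
int S,T,K;
LL ans[mxn<<2];     // ans[i] = number of matched pattern characters, summed over the four letters
int hd,tl,ct;       // sliding window [hd,tl] over s and the count of tp inside it
// Adds the contribution of character tp to ans[].
void solve(char tp){
    memset(a,0,sizeof a);
    memset(b,0,sizeof b);
    hd=0;tl=-1;ct=0;
    int i;
    for(i=0;i<S;i++){
        // Keep the window equal to [i-K,i+K], clipped to the string.
        while(i-hd>K){if(s[hd]==tp)ct--;hd++;}
        while(tl-i+1<=K && tl+1<S){tl++;if(s[tl]==tp)ct++;}
        if(ct>0) a[i].x=1;  // position i can match tp
    }
    for(i=0;i<T;i++)
        if(c[i]==tp) b[i].x=1;
    FFT(a,1);FFT(b,1);
    for(i=0;i<N;i++) a[i]=a[i]*b[i];
    FFT(a,-1);
    // Round to the nearest integer to absorb floating-point error.
    for(i=0;i<N;i++)ans[i]+=(LL)(a[i].x+0.5);
    return;
}
int main(){
    int i;
    S=read();T=read();K=read();
    scanf("%s",s);scanf("%s",c);
    int m=S+T;
    for(N=1,len=0;N<=m;N<<=1)len++;
    for(i=0;i<N;i++)
        rev[i]=(rev[i>>1]>>1)|((i&1)<<(len-1));
    reverse(c,c+T);     // reversing T turns the correlation into a convolution
    solve('A');
    solve('G');
    solve('C');
    solve('T');
    int res=0;
    for(i=0;i<N;i++){
        if(ans[i]==T)res++;     // every character of T was matched at this alignment
    }
    printf("%d\n",res);
    return 0;
}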