CodeForces527D. Fuzzy Search
time limit per test:3 seconds memory limit per test:256 megabytes
Leonid works for a small and promising start-up that works on decoding the human genome. His duties include solving complex problems of finding certain patterns in long strings consisting of letters 'A', 'T', 'G' and 'C'.
Let's consider the following scenario. There is a fragment of a human DNA chain, recorded as a string S. To analyze the fragment, you need to find all occurrences of string T in a string S. However, the matter is complicated by the fact that the original chain fragment could contain minor mutations, which, however, complicate the task of finding a fragment. Leonid proposed the following approach to solve this problem.
Let's write down integer k ≥ 0 — the error threshold. We will say that string T occurs in string S on position i (1 ≤ i ≤ |S| - |T| + 1), if after putting string T along with this position, each character of string T corresponds to the some character of the same value in string S at the distance of at most k. More formally, for any j (1 ≤ j ≤ |T|) there must exist such p (1 ≤ p ≤ |S|), that |(i + j - 1) - p| ≤ k and S[p] = T[j].
For example, corresponding to the given definition, string "ACAT" occurs in string "AGCAATTCAT" in positions 2, 3 and 6.
Note that at k = 0 the given definition transforms to a simple definition of the occurrence of a string in a string.
Help Leonid by calculating in how many positions the given string T occurs in the given string S with the given error threshold.
Input
The first line contains three integers |S|, |T|, k (1 ≤ |T| ≤ |S| ≤ 200 000, 0 ≤ k ≤ 200 000) — the lengths of strings S and T and the error threshold.
The second line contains string S.
The third line contains string T.
Both strings consist only of uppercase letters 'A', 'T', 'G' and 'C'.
Output
Print a single number — the number of occurrences of T in S with the error threshold k by the given definition.
Examples
10 4 1
AGCAATTCAT
ACAT
output
3
Note
If you happen to know about the structure of the human genome a little more than the author of the problem, and you are not impressed with Leonid's original approach, do not take everything described above seriously.
Solution
只要A串中[i-k,i+k]范围内有字符X,就认为i位置可以匹配字符X。
问有多少位置可以匹配目标串B
生成函数 FFT
注意到只有四种字符,那么可以暴力分别处理这四种字符。
对于每种字符,在A串中扫描出可以匹配它的所有位置,标记为1,再将B串反转,将B串上对应字符的位置也标记为1,卷积即可得到该种字符的匹配情况。
做四遍卷积就可以愉快出解了。
/*by SilverN*/
#include<iostream>
#include<algorithm>
#include<cstring>
#include<cstdio>
#include<cmath>
#include<vector>
#define LL long long
using namespace std;
const double pi=acos(-1.0);
const int mxn=;
int read(){
int x=,f=;char ch=getchar();
while(ch<'' || ch>''){if(ch=='-')f=-;ch=getchar();}
while(ch>='' && ch<=''){x=x*+ch-'';ch=getchar();}
return x*f;
}
struct com{
double x,y;
com operator + (const com &b){return (com){x+b.x,y+b.y};}
com operator - (const com &b){return (com){x-b.x,y-b.y};}
com operator * (const com &b){return (com){x*b.x-y*b.y,x*b.y+y*b.x};}
com operator / (double v){return (com){x/v,y/v};}
}a[mxn<<],b[mxn<<];
int N,len;
int rev[mxn<<];
void FFT(com *a,int flag){
for(int i=;i<N;i++)
if(i<rev[i])swap(a[i],a[rev[i]]);
for(int i=;i<N;i<<=){
com wn=(com){cos(pi/i),flag*sin(pi/i)};
int p=i<<;
for(int j=;j<N;j+=p){
com w=(com){,};
for(int k=;k<i;k++,w=w*wn){
com x=a[j+k],y=w*a[j+k+i];
a[j+k]=x+y;
a[j+k+i]=x-y;
}
}
}
if(flag==-)for(int i=;i<N;i++) a[i].x/=N;
return;
}
char s[mxn],c[mxn];
int S,T,K;
LL ans[mxn<<];
int hd,tl,ct;
void solve(char tp){
memset(a,,sizeof a);
memset(b,,sizeof b);
hd=;tl=-;ct=;
int i,j;
for(i=;i<S;i++){
while(i-hd>K){if(s[hd]==tp)ct--;hd++;}
while(tl-i+<=K && tl<S){tl++;if(s[tl]==tp)ct++;}
if(ct>) a[i].x=;
}
for(i=;i<T;i++)
if(c[i]==tp) b[i].x=;
FFT(a,);FFT(b,);
for(i=;i<N;i++) a[i]=a[i]*b[i];
FFT(a,-);
for(i=;i<N;i++)ans[i]+=(LL)(a[i].x+0.5); return;
}
int main(){
int i,j;
S=read();T=read();K=read();
scanf("%s",s);scanf("%s",c);
int m=S+T;
for(N=,len=;N<=m;N<<=)len++;
for(i=;i<N;i++)
rev[i]=(rev[i>>]>>)|((i&)<<(len-));
reverse(c,c+T);
solve('A');
solve('G');
solve('C');
solve('T');
int res=;
for(i=;i<N;i++){
if(ans[i]==T)res++;
// printf("%lld\n",ans[i]);
}
printf("%d\n",res);
return ;
}
CodeForces527D. Fuzzy Search的更多相关文章
- CF528D. Fuzzy Search [FFT]
CF528D. Fuzzy Search 题意:DNA序列,在母串s中匹配模式串t,对于s中每个位置i,只要s[i-k]到s[i+k]中有c就认为匹配了c.求有多少个位置匹配了t 预处理\(f[i][ ...
- CF 528D. Fuzzy Search NTT
CF 528D. Fuzzy Search NTT 题目大意 给出文本串S和模式串T和k,S,T为DNA序列(只含ATGC).对于S中的每个位置\(i\),只要中[i-k,i+k]有一个位置匹配了字符 ...
- 【Codeforces528D】Fuzzy Search FFT
D. Fuzzy Search time limit per test:3 seconds memory limit per test:256 megabytes input:standard inp ...
- 【CF528D】Fuzzy Search(FFT)
[CF528D]Fuzzy Search(FFT) 题面 给定两个只含有\(A,T,G,C\)的\(DNA\)序列 定义一个字符\(c\)可以被匹配为:它对齐的字符,在距离\(K\)以内,存在一个字符 ...
- Umbraco Examine 实现Fuzzy search
在Umbraco examine search项目开发中,有一个需求, 就是intercom 和 intercoms需要返回同样的结果 也就是说 搜索intercom 时, 能返回包含intercom ...
- CF528D Fuzzy Search 和 BZOJ4259 残缺的字符串
Fuzzy Search 给你文本串 S 和模式串 T,求 S 的每个位置是否能模糊匹配上 T. 这里的模糊匹配指的是把 T 放到 S 相应位置上之后,T 中每个字符所在位置附近 k 个之内的位置上的 ...
- CF-528D Fuzzy Search(FFT字符串匹配)
Fuzzy Search 题意: 给定一个模式串和目标串按下图方式匹配,错开位置不多于k 解题思路: 总共只有\(A C G T\)四个字符,那么我们可以按照各个字符进行匹配,比如按照\(A\)进行匹 ...
- codeforces 528D Fuzzy Search
链接:http://codeforces.com/problemset/problem/528/D 正解:$FFT$. 很多字符串匹配的问题都可以用$FFT$来实现. 这道题是要求在左边和右边$k$个 ...
- Codeforces 528D Fuzzy Search(FFT)
题目 Source http://codeforces.com/problemset/problem/528/D Description Leonid works for a small and pr ...
随机推荐
- 《剑指offer》---跳台阶问题
本文算法使用python3实现 1. 问题1 1.1 题目描述: 一只青蛙一次可以跳上1级台阶,也可以跳上2级.求该青蛙跳上一个n级的台阶总共有多少种跳法. 时间限制:1s:空间限制:3276 ...
- LintCode-41.最大子数组
最大子数组 给定一个整数数组,找到一个具有最大和的子数组,返回其最大和. 注意事项 子数组最少包含一个数 样例 给出数组[−2,2,−3,4,−1,2,1,−5,3],符合要求的子数组为[4,−1,2 ...
- <Effective C++>读书摘要--Resource Management<二>
<Item 15> Provide access to raw resources in resource-managing classes 1.You need a way to con ...
- 我们在删除SQL Sever某个数据库表中数据的时候,希望ID重新从1开始,而不是紧跟着最后一个ID开始需要的命令
一.如果数据重要,请先备份数据 二.删除表中数据 SQL: Delete From ('表名') 如:Delete From abcd 三.执行新语句 SQL: dbcc checkident('表 ...
- 【转】log4j.properties文件的配置
一.前言 log4j使用的还是比较多的,但是对于其配置又很难描述清楚要怎么配置,说明我自己对于log4j的配置并不是非常熟悉,所以在网上找了一篇详尽的 博文转载,在此非常感谢原文作者的辛苦付出,如有需 ...
- Oracle基础 表分区
Oracle基础 表分区 一.表分区 (一)表分区的分类 1.范围分区(range) 2.散列分区(hash) 3.列表分区(list) 4.复合分区:范围-哈希(range-hash).范围-列表( ...
- 【Python】Python中的引用和赋值
本文转自:http://my.oschina.net/leejun2005/blog/145911 在 python 中赋值语句总是建立对象的引用值,而不是复制对象.因此,python 变量更像是指针 ...
- 转:概率主题模型简介 --- ---David M. Blei所写的《Introduction to Probabilistic Topic Models》的译文
概率主题模型简介 Introduction to Probabilistic Topic Models 转:http://www.cnblogs.com/siegfang/archive/2 ...
- [BZOJ5303] [HAOI2018] 反色游戏
题目链接 LOJ:https://loj.ac/problem/2524 BZOJ:https://lydsy.com/JudgeOnline/problem.php?id=5303 洛谷:https ...
- [BZOJ4822] [CQOI2017] 老C的任务
题目链接 BZOJ:https://lydsy.com/JudgeOnline/problem.php?id=4822. 洛谷:https://www.luogu.org/problemnew/sho ...