Description

String analysis often arises in applications from biology and chemistry, such as the study of DNA and protein molecules. One interesting problem is to find how many substrings are repeated (at least twice) in a long string. In this problem, you will write a program to find the total number of repeated substrings in a string of at most 100 000 alphabetic characters. Any unique substring that occurs more than once is counted. As an example, if the string is “aabaab”, there are 5 repeated substrings: “a”, “aa”, “aab”, “ab”, “b”. If the string is “aaaaa”, the repeated substrings are “a”, “aa”, “aaa”, “aaaa”. Note that repeated occurrences of a substring may overlap (e.g. “aaaa” in the second case).

Input

The input consists of at most 10 cases. The first line contains a positive integer, specifying the number of
cases to follow. Each of the following line contains a nonempty string of up to 100 000 alphabetic characters.

Output

For each line of input, output one line containing the number of unique substrings that are repeated. You
may assume that the correct answer fits in a signed 32-bit integer.

Sample Input

3
aabaab
aaaaa
AaAaA

Sample Output

5
4
5 题目大意:统计字符串中重复出现的子串数目。
题目分析:sum(max(height(i)-height(i-1),0))即为答案。 代码如下:
//# define AC

# ifndef AC

# include<iostream>
# include<cstdio>
# include<cstring>
# include<vector>
# include<queue>
# include<list>
# include<cmath>
# include<set>
# include<map>
# include<string>
# include<cstdlib>
# include<algorithm>
using namespace std;
# define mid (l+(r-l)/2) typedef long long LL;
typedef unsigned long long ULL; const int N=100000;
const int mod=1e9+7;
const int INF=0x7fffffff;
const LL oo=0x7fffffffffffffff; int SA[N+5];
int tSA[N+5];
int cnt[N+5];
int rk[N+5];
int *x,*y;
int height[N+5]; int idx(char c)
{
if('a'<=c&&c<='z') return c-'a';
return c-'A'+26;
} bool same(int i,int j,int k,int n)
{
if(y[i]-y[j]) return false;
if(i+k<n&&j+k>=n) return false;
if(i+k>=n&&j+k<n) return false;
return y[i+k]==y[j+k];
} void buildSA(char *s)
{
int n=strlen(s);
int m=52;
x=rk,y=tSA;
for(int i=0;i<m;++i) cnt[i]=0;
for(int i=0;i<n;++i) ++cnt[x[i]=idx(s[i])];
for(int i=1;i<m;++i) cnt[i]+=cnt[i-1];
for(int i=n-1;i>=0;--i) SA[--cnt[x[i]]]=i; for(int k=1;k<=n;k<<=1){
int p=0;
for(int i=n-k;i<n;++i) y[p++]=i;
for(int i=0;i<n;++i) if(SA[i]>=k) y[p++]=SA[i]-k; for(int i=0;i<m;++i) cnt[i]=0;
for(int i=0;i<n;++i) ++cnt[x[y[i]]];
for(int i=1;i<m;++i) cnt[i]+=cnt[i-1];
for(int i=n-1;i>=0;--i) SA[--cnt[x[y[i]]]]=y[i]; p=1;
swap(x,y);
x[SA[0]]=0;
for(int i=1;i<n;++i)
x[SA[i]]=same(SA[i],SA[i-1],k,n)?p-1:p++;
if(p>=n) break;
m=p;
}
} void getHeight(char *s)
{
int n=strlen(s);
for(int i=0;i<n;++i) rk[SA[i]]=i;
int k=0;
for(int i=0;i<n;++i){
if(rk[i]==0){
height[rk[i]]=k=0;
}else{
if(k) --k;
int j=SA[rk[i]-1];
while(i+k<n&&j+k<n&&s[i+k]==s[j+k])
++k;
height[rk[i]]=k;
}
}
} char str[N+5]; void solve()
{
int n=strlen(str);
int ans=0;
for(int i=0;i<n;++i){
if(height[i]>height[i-1])
ans+=height[i]-height[i-1];
}
printf("%d\n",ans);
} int main()
{
int T;
scanf("%d",&T);
while(T--)
{
scanf("%s",str);
buildSA(str);
getHeight(str);
solve();
}
return 0;
} # endif

  

CSU-1632 Repeated Substrings (后缀数组)的更多相关文章

  1. UVALive - 6869 Repeated Substrings 后缀数组

    题目链接: http://acm.hust.edu.cn/vjudge/problem/113725 Repeated Substrings Time Limit: 3000MS 样例 sample ...

  2. CSU-1632 Repeated Substrings[后缀数组求重复出现的子串数目]

    评测地址:https://cn.vjudge.net/problem/CSU-1632 Description 求字符串中所有出现至少2次的子串个数 Input 第一行为一整数T(T<=10)表 ...

  3. csu 1305 Substring (后缀数组)

    http://acm.csu.edu.cn/OnlineJudge/problem.php?id=1305 1305: Substring Time Limit: 2 Sec  Memory Limi ...

  4. POJ3415 Common Substrings —— 后缀数组 + 单调栈 公共子串个数

    题目链接:https://vjudge.net/problem/POJ-3415 Common Substrings Time Limit: 5000MS   Memory Limit: 65536K ...

  5. POJ1226 Substrings ——后缀数组 or 暴力+strstr()函数 最长公共子串

    题目链接:https://vjudge.net/problem/POJ-1226 Substrings Time Limit: 1000MS   Memory Limit: 10000K Total ...

  6. SPOJ - SUBST1 New Distinct Substrings —— 后缀数组 单个字符串的子串个数

    题目链接:https://vjudge.net/problem/SPOJ-SUBST1 SUBST1 - New Distinct Substrings #suffix-array-8 Given a ...

  7. SPOJ- Distinct Substrings(后缀数组&后缀自动机)

    Given a string, we need to find the total number of its distinct substrings. Input T- number of test ...

  8. SPOJ - DISUBSTR Distinct Substrings (后缀数组)

    Given a string, we need to find the total number of its distinct substrings. Input T- number of test ...

  9. POJ 3415 Common Substrings 后缀数组+并查集

    后缀数组,看到网上很多题解都是单调栈,这里提供一个不是单调栈的做法, 首先将两个串 连接起来求height   求完之后按height值从大往小合并.  height值代表的是  sa[i]和sa[i ...

  10. POJ1226:Substrings(后缀数组)

    Description You are given a number of case-sensitive strings of alphabetic characters, find the larg ...

随机推荐

  1. 【转】Linux查看机器负载

    转自 http://blog.csdn.net/szchtx/article/details/38455385 感谢 负载(load)是Linux机器的一个重要指标,直观了反应了机器当前的状态.如果机 ...

  2. spark-sql性能测试

    一,测试环境       1) 硬件环境完全相同:              包括:cpu/内存/网络/磁盘Io/机器数量等       2)软件环境:              相同数据       ...

  3. Sql Server 查看表修改记录

    可以尝试如下建议:1.可以使用默认的Log工具或者第三方的(比如:LiteSpeed)的工具.2.做Trace机制,下次出现问题可以溯源.3.一个简单的办法: --Step #1: USE DBNam ...

  4. Oracle开窗函数 over()(转)

    copy文链接:http://blog.csdn.net/yjjm1990/article/details/7524167#,http://www.2cto.com/database/201402/2 ...

  5. SQL列最大重复项

    SELECT 1 AS co1, 'a' AS co2 INTO #a UNION SELECT 2, 'a' UNION SELECT 11,'a' UNION SELECT 12, 'a' UNI ...

  6. 2016-08-05(1) ng-options的用法详解

    http://www.ncloud.hk/%E6%8A%80%E6%9C%AF%E5%88%86%E4%BA%AB/ng-options-usage/

  7. JAVA学习遇到的问题:接口实现

    引用知乎看到对接口的总结: 接口就是个招牌比如说你饿了,看到前面有个挂着KFC的店,然后你想到可以进去买汉堡了.KFC就是接口,我们看到了这个接口,就知道这个店会卖汉堡(实现接口).那么为什么我们要去 ...

  8. Maven基础知识(转)

    文章摘自http://www.cnblogs.com/xing901022/p/4170248.html 谢谢楼主的总结,界面设计的很好看! 一.什么是Maven Maven是一个用于项目构建的工具, ...

  9. HDU5402 暴力模拟

    因为题目中没有说是否是正整数,导致我们以为是DP,没敢做...太可惜了,不过现场赛绝对不会出现这种情况,毕竟所有的提问是都可以看见的. 题意:告诉一个矩阵,然后求从(1,1)到(n,m)能走过的最大和 ...

  10. Flash动画

    Flash (交互式矢量图和Web动画标准) Flash是由macromedia公司推出的交互式矢量图和 Web 动画的标准,由Adobe公 司收购.做Flash动画的人被称之为闪客.网页设计者使用 ...