Description

The Genographic Project is a research partnership between IBM and The National Geographic Society that is analyzing DNA from hundreds of thousands of contributors to map how the Earth was populated.

As an IBM researcher, you have been tasked with writing a program that will find commonalities amongst given snippets of DNA that can be correlated with individual survey information to identify new genetic markers.

A DNA base sequence is noted by listing the nitrogen bases in the order in which they are found in the molecule. There are four bases: adenine (A), thymine (T), guanine (G), and cytosine (C). A 6-base DNA sequence could be represented as TAGACC.

Given a set of DNA base sequences, determine the longest series of bases that occurs in all of the sequences.

Input

Input to this problem will begin with a line containing a single integer n indicating the number of datasets. Each dataset consists of the following components:

  • A single positive integer m (2 <= m <= 10) indicating the number of base sequences in this dataset.
  • m lines each containing a single base sequence consisting of 60 bases.

Output

For each dataset in the input, output the longest base subsequence common to all of the given base sequences. If the longest common subsequence is less than three bases in length, display the string "no significant commonalities" instead. If multiple subsequences of the same longest length exist, output only the subsequence that comes first in alphabetical order.

Sample Input

3
2
GATACCAGATACCAGATACCAGATACCAGATACCAGATACCAGATACCAGATACCAGATA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
3
GATACCAGATACCAGATACCAGATACCAGATACCAGATACCAGATACCAGATACCAGATA
GATACTAGATACTAGATACTAGATACTAAAGGAAAGGGAAAAGGGGAAAAAGGGGGAAAA
GATACCAGATACCAGATACCAGATACCAAAGGAAAGGGAAAAGGGGAAAAAGGGGGAAAA
3
CATCATCATCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
ACATCATCATAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AACATCATCATTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTT

Sample Output

no significant commonalities
AGATAC
CATCATCAT

Source

#include<iostream>
#include<cstdio>
#include<algorithm>
#include<cstring>
#include<vector>
#include<string>
using namespace std;
typedef long long LL;
#define MAXN 63
/*
枚举所有子串,用KMP算法检查是否出现过,选择其中的最优解
*/
char s[MAXN][MAXN];
int next[MAXN];
void kmp_pre(char x[])
{
int i,j,m=strlen(x);
j = next[] = -;
i = ;
while(i<m)
{
if(j!=-&&x[i]!=x[j])
j = next[j];
next[++i] = ++j;
}
}
bool kmp(char x[],char y[])
{
int i,j,ans = ,m=strlen(x),n=strlen(y);
kmp_pre(x);
i=j=;
while(i<n)
{
while(j!=-&&y[i]!=x[j])
j = next[j];
i++;
j++;
if(j>=m)
return true;
}
return false;
}
int main()
{
int t;
scanf("%d",&t);
while(t--)
{
int m;
char ans[MAXN] = "Z";
scanf("%d",&m);
for(int i=;i<m;i++)
scanf("%s",s[i]);
for(int len=;len>=;len--)
for(int i=;i<=-len;i++)
{
char b[MAXN] = {};
strncpy(b,s[]+i,len);
//cout<<len<<":"<<b<<endl;
int j;
for(j=;j<m;j++)
{
if(!kmp(b,s[j]))
break;
}
if(j==m&&strcmp(ans,b)>)
{
strcpy(ans,b);
}
if(ans[]!='Z'&&i==-len)
{
i = ;
len = ;
}
}
if(ans[]=='Z')
printf("no significant commonalities\n");
else
printf("%s\n",ans);
}
return ;
}

Blue Jeans POJ 3080 寻找多个串的最长相同子串的更多相关文章

  1. (字符串 KMP)Blue Jeans -- POJ -- 3080:

    链接: http://poj.org/problem?id=3080 http://acm.hust.edu.cn/vjudge/contest/view.action?cid=88230#probl ...

  2. Blue Jeans - POJ 3080(多串的共同子串)

    题目大意:有M个串,每个串的长度都是60,查找这M个串的最长公共子串(连续的),长度不能小于3,如果同等长度的有多个输出字典序最小的那个.   分析:因为串不多,而且比较短,所致直接暴力枚举的第一个串 ...

  3. Match:Blue Jeans(POJ 3080)

    DNA序列 题目大意:给你m串字符串,要你找最长的相同的连续字串 这题暴力kmp即可,注意要按字典序排序,同时,是len<3才输出no significant commonalities #in ...

  4. Blue Jeans - poj 3080(后缀数组)

    大致题意: 给出n个长度为60的DNA基因(A腺嘌呤 G鸟嘌呤 T胸腺嘧啶 C胞嘧啶)序列,求出他们的最长公共子序列 使用后缀数组解决 #include<stdio.h> #include ...

  5. POJ 3294 出现在至少K个字符串中的子串

    在掌握POJ 2774(两个串求最长公共子串)以及对Height数组分组后,本题还是容易想出思路的. 首先用字符集外的不同字符连接所有串,这是为了防止两个后缀在比较时超过某个字符串的分界.二分子串的长 ...

  6. POJ 3080 Blue Jeans (求最长公共字符串)

    POJ 3080 Blue Jeans (求最长公共字符串) Description The Genographic Project is a research partnership between ...

  7. POJ 3080 Blue Jeans(后缀数组+二分答案)

    [题目链接] http://poj.org/problem?id=3080 [题目大意] 求k个串的最长公共子串,如果存在多个则输出字典序最小,如果长度小于3则判断查找失败. [题解] 将所有字符串通 ...

  8. POJ 3080 Blue Jeans(Java暴力)

    Blue Jeans [题目链接]Blue Jeans [题目类型]Java暴力 &题意: 就是求k个长度为60的字符串的最长连续公共子串,2<=k<=10 规定: 1. 最长公共 ...

  9. POJ 3080 Blue Jeans 找最长公共子串(暴力模拟+KMP匹配)

    Blue Jeans Time Limit: 1000MS   Memory Limit: 65536K Total Submissions: 20966   Accepted: 9279 Descr ...

随机推荐

  1. Flink编程练习

    目录 1.wordcount 2.双流警报EventTime 3.持续计数stateful + timer + SideOutputs 4.一定时间范围内的极值windowfunction + che ...

  2. 常见的Java Script内存泄露原因及解决方案

    前言 内存泄漏指由于疏忽或错误造成程序未能释放已经不再使用的内存.内存泄漏并非指内存在物理上的消失,而是应用程序分配某段内存后,由于设计错误,导致在释放该段内存之前就失去了对该段内存的控制,从而造成了 ...

  3. JS中的this是什么,this的四种用法

    在Javascript中,this这个关键字可以说是用的非常多,说他简单呢也很简单,说他难呢也很难,有的人开发两三年了,自己好像也说不清this到底是什么.下面我们来看看: 1.在一般函数方法中使用 ...

  4. Android内存管理(10)MAT: 基本教程

    原文: http://help.eclipse.org/mars/index.jsp?topic=%2Forg.eclipse.mat.ui.help%2Fgettingstarted%2Fbasic ...

  5. 6CSS之文本

    CSS文本:文本缩进(text-indent).文本对齐(text-align).文本修饰(text-decoration).文本大小写(text-transform).字符距离(letter-spa ...

  6. Java编程思想读书笔记_第二章

    java对于将一个较大作用域的变量“隐藏”的场景会有保护:编译告警.比如: int x = 5; { int x = 6; } 但是对于类中方法的局部变量和类成员变量确是可以重名的,比如 class ...

  7. drupal 8——打补丁(patch)

    druapl 的核心可能会有漏洞,这时就需要我们去打补丁.很多补丁都已经有人写好了,我这里讲的就是如何去打这些已经写好的补丁. 对于这个问题:drupal8 核心有bug导致了两个相同的错误提示的出现 ...

  8. C#通过SqlConnection连接查询更新等操作Sqlserver数据库

    Sqlserver数据库连接方式有多种,这里只介绍最常用的通过SqlConnection和Sqlserver数据库用户名和密码验证来进行操作数据库. 数据库连接字符串: string connStri ...

  9. [Android]异常9-自定义PopupWindow出现闪屏

    背景: 自定义PopupWindow使用时,Android4.0或者一些手机正常使用,Android6.0或者部分手机使用自定义PopupWindow触发事件时,出现闪屏 异常原因: 可能一>A ...

  10. JS高级——面向对象方式解决歌曲管理问题

    需要注意的问题: 1.其他模块若是使用构造函数MP3创建对象,唯一不同的就是他们传入的音乐库是不一样的,所以构造函数中存在一个songList属性,其他一样的就被添加到了构造函数的原型对象之中 2.原 ...