The Genographic Project is a research partnership between IBM and The National Geographic Society that is analyzing DNA from hundreds of thousands of contributors to map how the Earth was populated.

As an IBM researcher, you have been tasked with writing a
program that will find commonalities amongst given snippets of DNA that
can be correlated with individual survey information to identify new
genetic markers.

A DNA base sequence is noted by listing the nitrogen bases in
the order in which they are found in the molecule. There are four
bases: adenine (A), thymine (T), guanine (G), and cytosine (C). A 6-base
DNA sequence could be represented as TAGACC.

Given a set of DNA base sequences, determine the longest series of bases that occurs in all of the sequences.

Input

Input to this problem will begin with a line containing a single
integer n indicating the number of datasets. Each dataset consists of
the following components:

  • A single positive integer m (2 <= m <= 10) indicating the number of base sequences in this dataset.
  • m lines each containing a single base sequence consisting of 60 bases.

Output

For each dataset in the input, output the longest base
subsequence common to all of the given base sequences. If the longest
common subsequence is less than three bases in length, display the
string "no significant commonalities" instead. If multiple subsequences
of the same longest length exist, output only the subsequence that comes
first in alphabetical order.

Sample Input

3
2
GATACCAGATACCAGATACCAGATACCAGATACCAGATACCAGATACCAGATACCAGATA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
3
GATACCAGATACCAGATACCAGATACCAGATACCAGATACCAGATACCAGATACCAGATA
GATACTAGATACTAGATACTAGATACTAAAGGAAAGGGAAAAGGGGAAAAAGGGGGAAAA
GATACCAGATACCAGATACCAGATACCAAAGGAAAGGGAAAAGGGGAAAAAGGGGGAAAA
3
CATCATCATCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
ACATCATCATAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AACATCATCATTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTT

Sample Output

no significant commonalities
AGATAC
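CATCATCAT

Brute force looked feasible, but I did not write it at first. I wanted to use KMP but did not know where to start, so I studied a few existing approaches. The brute-force idea: enumerate every substring of the first string, then check whether each one occurs in all the other strings. One trick is to enumerate from longest to shortest, so the first length that yields a match is the answer. Brute force: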
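#include <cstdio>
#include <iostream>
#include <set>
#include <string>
#include <vector>
using namespace std;

vector<string> t;   // sequences of the current dataset
set<string> ss;     // common substrings of the current length, kept sorted
string s;
int _, n;

string fun() {
    ss.clear();
    string str = t[0];
    bool flag;
    // Each sequence has 60 bases; answers shorter than 3 bases do not count.
    for (int len = 60; len >= 3; len--) {
        for (int ix = 0; ix <= 60 - len; ix++) {
            string temp = str.substr(ix, len);
            flag = true;
            // The candidate must occur in every other sequence.
            for (size_t k = 1; k < t.size(); k++) {
                if (t[k].find(temp) == string::npos) {
                    flag = false;
                    break;
                }
            }
            if (flag) ss.insert(temp);
        }
        // The set is ordered, so its first element is alphabetically smallest.
        if (ss.size()) return *ss.begin();
    }
    return "no significant commonalities";
}

int main() {
    // freopen("in","r",stdin);
    for (scanf("%d", &_); _; _--) {
        scanf("%d", &n);
        for (int i = 0; i < n; i++) {
            cin >> s;
            t.push_back(s);
        }
        cout << fun() << endl;
        t.clear();
    }
    return 0;
}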
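
As a rough sketch, the same longest-to-shortest enumeration can be written as a standalone function that is easy to check against the sample data. The name `longestCommonSubstring` and the `minLen` parameter are hypothetical and not part of the solution above; the function returns an empty string where the program would print "no significant commonalities".

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not from the original solution): returns the longest
// substring shared by every string in seqs, choosing the alphabetically
// smallest on ties, or "" if nothing of at least minLen characters is shared.
std::string longestCommonSubstring(const std::vector<std::string>& seqs,
                                   std::size_t minLen = 3) {
    if (seqs.empty()) return std::string();
    const std::string& first = seqs[0];
    // Longest lengths first, so the first length with a hit is the answer.
    for (std::size_t len = first.size(); len >= minLen && len > 0; --len) {
        std::string best;
        bool found = false;
        for (std::size_t start = 0; start + len <= first.size(); ++start) {
            std::string cand = first.substr(start, len);
            bool inAll = true;
            for (std::size_t k = 1; k < seqs.size(); ++k) {
                if (seqs[k].find(cand) == std::string::npos) {
                    inAll = false;
                    break;
                }
            }
            // Keep the alphabetically smallest candidate of this length.
            if (inAll && (!found || cand < best)) {
                best = cand;
                found = true;
            }
        }
        if (found) return best;
    }
    return std::string();
}
```

Called on the three sequences of the third sample dataset, this returns CATCATCAT. With 60-base sequences and at most 10 of them, the roughly 1,800 candidate substrings make this approach cheap enough.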

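KMP idea: there is no need to enumerate every substring of the first string. It is enough to enumerate each of its suffixes and match it against the other strings. While matching, KMP tracks how long a prefix of the pattern has matched at each position, so matching one suffix effectively covers every substring that starts at that position.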

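#include <cstdio>
#include <iostream>
#include <string>
#include <algorithm>
#include <vector>
using namespace std;

int _, n, Next[65];   // a pattern has at most 60 characters, so indices 0..60 are used
string s, strans;
vector<string> t;

// Build the (optimized) failure table for pattern s.
void prekmp(string s) {
    int len = s.size();
    int i, j;
    j = Next[0] = -1;
    i = 0;
    while (i < len) {
        while (j != -1 && s[i] != s[j]) j = Next[j];
        if (s[++i] == s[++j]) Next[i] = Next[j];
        else Next[i] = j;
    }
}

// Return the length of the longest prefix of p that occurs anywhere in t.
int kmp(string p, string t) {
    int len = t.size();
    int i = 0, j = 0, res = -1;
    while (i < len) {
        while (j != -1 && t[i] != p[j]) j = Next[j];
        ++i; ++j;
        res = max(res, j);
    }
    return res;
}

int main() {
    // freopen("in","r",stdin);
    for (scanf("%d", &_); _; _--) {
        scanf("%d", &n);
        for (int i = 0; i < n; i++) {
            cin >> s;
            t.push_back(s);
        }
        int ans = -1;
        string str = t[0];
        // Enumerate every suffix of the first sequence (60 bases each).
        for (int i = 0; i < 60; i++) {
            string temp = str.substr(i, 60 - i);
            prekmp(temp);
            int maxx = 60;
            // The usable prefix length is the minimum over all other sequences.
            for (size_t j = 1; j < t.size(); j++) {
                maxx = min(maxx, kmp(temp, t[j]));
            }
            if (maxx > ans) {
                strans = temp.substr(0, maxx);
                ans = maxx;
            } else if (maxx == ans) {
                // Same length: keep the alphabetically smaller one.
                string anstemp = temp.substr(0, maxx);
                if (anstemp < strans) strans = anstemp;
            }
        }
        if (strans.size() < 3) cout << "no significant commonalities" << '\n';
        else cout << strans << '\n';
        t.clear();
    }
    return 0;
}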
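
The suffix-enumeration idea can also be sketched as standalone functions. This sketch swaps in the textbook prefix function instead of the optimized `Next` table used above, and the names `prefixFunction`, `longestPrefixIn`, and `longestCommonByKmp` are hypothetical. It returns an empty string where the program would print "no significant commonalities".

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// pi[i] = length of the longest proper prefix of p[0..i] that is also its suffix.
std::vector<int> prefixFunction(const std::string& p) {
    std::vector<int> pi(p.size(), 0);
    for (std::size_t i = 1; i < p.size(); ++i) {
        int j = pi[i - 1];
        while (j > 0 && p[i] != p[j]) j = pi[j - 1];
        if (p[i] == p[j]) ++j;
        pi[i] = j;
    }
    return pi;
}

// Length of the longest prefix of p that occurs somewhere in text.
int longestPrefixIn(const std::string& p, const std::string& text) {
    if (p.empty()) return 0;
    std::vector<int> pi = prefixFunction(p);
    int j = 0, best = 0;
    for (char c : text) {
        if (j == static_cast<int>(p.size())) j = pi[j - 1];  // full match: fall back first
        while (j > 0 && c != p[j]) j = pi[j - 1];
        if (c == p[j]) ++j;
        best = std::max(best, j);
    }
    return best;
}

// For each suffix of the first string, the shared part is the shortest
// matched prefix over all other strings; keep the longest, then smallest.
std::string longestCommonByKmp(const std::vector<std::string>& seqs,
                               std::size_t minLen = 3) {
    if (seqs.empty()) return std::string();
    const std::string& first = seqs[0];
    std::string best;
    for (std::size_t start = 0; start < first.size(); ++start) {
        std::string suffix = first.substr(start);
        int common = static_cast<int>(suffix.size());
        for (std::size_t k = 1; k < seqs.size(); ++k)
            common = std::min(common, longestPrefixIn(suffix, seqs[k]));
        std::string cand = suffix.substr(0, common);
        if (cand.size() > best.size() ||
            (cand.size() == best.size() && cand < best))
            best = cand;
    }
    return best.size() >= minLen ? best : std::string();
}
```

Taking the minimum across the other strings works because if a prefix of length L occurs in a string, every shorter prefix occurs there too. Each suffix costs one linear pass per string, instead of one search per candidate substring.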
