Human Gene Functions

Time Limit: 1000MS Memory Limit: 10000K

Total Submissions: 18053 Accepted: 10046

Description

It is well known that a human gene can be considered as a sequence, consisting of four nucleotides, which are simply denoted by four letters, A, C, G, and T. Biologists have been interested in identifying human genes and determining their functions, because these can be used to diagnose human diseases and to design new drugs for them.

A human gene can be identified through a series of time-consuming biological experiments, often with the help of computer programs. Once a sequence of a gene is obtained, the next job is to determine its function.

One of the methods for biologists to use in determining the function of a new gene sequence that they have just identified is to search a database with the new gene as a query. The database to be searched stores many gene sequences and their functions – many researchers have been submitting their genes and functions to the database and the database is freely accessible through the Internet.

A database search will return a list of gene sequences from the database that are similar to the query gene.

Biologists assume that sequence similarity often implies functional similarity. So, the function of the new gene might be one of the functions that the genes from the list have. To exactly determine which one is the right one another series of biological experiments will be needed.

Your job is to make a program that compares two genes and determines their similarity as explained below. Your program may be used as a part of the database search if you can provide an efficient one.

Given two genes AGTGATG and GTTAG, how similar are they? One of the methods to measure the similarity

of two genes is called alignment. In an alignment, spaces are inserted, if necessary, in appropriate positions of

the genes to make them equally long and score the resulting genes according to a scoring matrix.

For example, one space is inserted into AGTGATG to result in AGTGAT-G, and three spaces are inserted into GTTAG to result in –GT–TAG. A space is denoted by a minus sign (-). The two genes are now of equal

length. These two strings are aligned:

AGTGAT-G

-GT–TAG

In this alignment, there are four matches, namely, G in the second position, T in the third, T in the sixth, and G in the eighth. Each pair of aligned characters is assigned a score according to the following scoring matrix.

denotes that a space-space match is not allowed. The score of the alignment above is (-3)+5+5+(-2)+(-3)+5+(-3)+5=9.

Of course, many other alignments are possible. One is shown below (a different number of spaces are inserted into different positions):

AGTGATG

-GTTA-G

This alignment gives a score of (-3)+5+5+(-2)+5+(-1) +5=14. So, this one is better than the previous one. As a matter of fact, this one is optimal since no other alignment can have a higher score. So, it is said that the

similarity of the two genes is 14.

Input

The input consists of T test cases. The number of test cases ) (T is given in the first line of the input file. Each test case consists of two lines: each line contains an integer, the length of a gene, followed by a gene sequence. The length of each gene sequence is at least one and does not exceed 100.

Output

The output should print the similarity of each test case, one per line.

Sample Input

2

7 AGTGATG

5 GTTAG

7 AGCTATT

9 AGCTTTAAA

Sample Output

14

21

Source

Taejon 2001

求DNA匹配度,类似最长公共子序列

  1. #include <iostream>
  2. #include <cstdio>
  3. #include <cstring>
  4. #include <cmath>
  5. #include <queue>
  6. #include <map>
  7. #include <algorithm>
  8. using namespace std;
  9. typedef long long LL;
  10. typedef pair<int,int>p;
  11. const int INF = 0x3f3f3f3f;
  12. int value[][5]={{5,-1,-2,-1,-3},
  13. {-1,5,-3,-2,-4},
  14. {-2,-3,5,-2,-2},
  15. {-1,-2,-2,5,-1},
  16. {-3,-4,-2,-1,0}};
  17. map<char ,int >Dir;
  18. int lens,lenc;
  19. int Dp[110][110];
  20. int main()
  21. {
  22. int n;
  23. char s[110];
  24. char c[110];
  25. Dir['A']=0;
  26. Dir['C']=1;
  27. Dir['G']=2;
  28. Dir['T']=3;
  29. Dir['-']=4;
  30. while(~scanf("%d",&n))
  31. {
  32. while(n--)
  33. {
  34. scanf("%d %s",&lens,s+1);
  35. scanf("%d %s",&lenc,c+1);
  36. Dp[0][0]=0;
  37. for(int i=1;i<=lens;i++)
  38. {
  39. Dp[i][0]=Dp[i-1][0]+value[Dir[s[i]]][Dir['-']];//如果都不匹配的情况
  40. }
  41. for(int i=1;i<=lenc;i++)
  42. {
  43. Dp[0][i]=Dp[0][i-1]+value[Dir[c[i]]][Dir['-']];
  44. }
  45. for(int i=1;i<=lens;i++)
  46. {
  47. for(int j=1;j<=lenc;j++)
  48. {
  49. Dp[i][j]=Dp[i-1][j-1]+value[Dir[s[i]]][Dir[c[j]]];//如果两个字符要匹配,则Dp[i][j]由Dp[i-1][j-1]推出.
  50. Dp[i][j]=max(Dp[i][j],Dp[i-1][j]+value[Dir[s[i]]][Dir['-']]);//前面的字符与这个已经匹配(不管用什么方式匹配的),他只能与'-'匹配
  51. Dp[i][j]=max(Dp[i][j],Dp[i][j-1]+value[Dir['-']][Dir[c[j]]]);//如果这个字符已经与前面匹配,则c[j]与'-'匹配
  52. }
  53. }
  54. printf("%d\n",Dp[lens][lenc]);
  55. }
  56. }
  57. return 0;
  58. }

Human Gene Functions的更多相关文章

  1. hdu1080 Human Gene Functions() 2016-05-24 14:43 65人阅读 评论(0) 收藏

    Human Gene Functions Time Limit: 2000/1000 MS (Java/Others)    Memory Limit: 65536/32768 K (Java/Oth ...

  2. poj 1080 ——Human Gene Functions——————【最长公共子序列变型题】

    Human Gene Functions Time Limit: 1000MS   Memory Limit: 10000K Total Submissions: 17805   Accepted:  ...

  3. 【POJ 1080】 Human Gene Functions

    [POJ 1080] Human Gene Functions 相似于最长公共子序列的做法 dp[i][j]表示 str1[i]相应str2[j]时的最大得分 转移方程为 dp[i][j]=max(d ...

  4. poj 1080 Human Gene Functions(lcs,较难)

    Human Gene Functions Time Limit: 1000MS   Memory Limit: 10000K Total Submissions: 19573   Accepted:  ...

  5. POJ 1080:Human Gene Functions LCS经典DP

    Human Gene Functions Time Limit: 1000MS   Memory Limit: 10000K Total Submissions: 18007   Accepted:  ...

  6. POJ 1080 Human Gene Functions -- 动态规划(最长公共子序列)

    题目地址:http://poj.org/problem?id=1080 Description It is well known that a human gene can be considered ...

  7. 杭电20题 Human Gene Functions

    Problem Description It is well known that a human gene can be considered as a sequence, consisting o ...

  8. 刷题总结——Human Gene Functions(hdu1080)

    题目: Problem Description It is well known that a human gene can be considered as a sequence, consisti ...

  9. Human Gene Functions POJ 1080 最长公共子序列变形

    Description It is well known that a human gene can be considered as a sequence, consisting of four n ...

随机推荐

  1. Redis:安装

    此篇文章并不介绍linux下安装方法.围绕windows安装而介绍 两种安装方式: 1)下载压缩包,解压后(运行起来有个窗口,关闭掉就不在运行),没有windows服务被注册:可以只用命令在cmd将其 ...

  2. windows下根据端口号杀死进程

    Windows不像Linux,Unix那样,ps -ef 查出端口和进程号,然后根据进程号直接kill进程. Windows根据端口号杀死进程要分三步: 第一步 根据端口号寻找进程号 C:\>n ...

  3. log4j使用

    Spring中在src/main/resources下创建log4j.xml 或log4j.properties,在maven下打包时resources文件夹下面的文件会自动copy到WEB-INF/ ...

  4. 封装自己的smartyBC类

    <?php/** * Project:     Smarty: the PHP compiling template engine * File:        SmartyBC.class.p ...

  5. jQuery习题的一些总结

    1.在div元素中,包含了一个<span>元素,通过has选择器获取<div>元素中的<span>元素的语法是? 提示使用has $("div:has(s ...

  6. codeforces 446C DZY Loves Fibonacci Numbers(数学 or 数论+线段树)(两种方法)

    In mathematical terms, the sequence Fn of Fibonacci numbers is defined by the recurrence relation F1 ...

  7. ajax中网页传输(三)XML——下拉列表显示练习

    XML:页面之间传递数据,跨平台传递 HTML:超文本标记语言,核心标签 XML的形势为 <xml version='1.0'> <Nation> <one> &l ...

  8. zw·准专利·高保真二值图细部切分算法

    zw·准专利·高保真二值图细部切分算法     高保真二值图细部切分算法,是中国字体协会项目的衍生作品.     说准专利算法,是因为对于图像算法的标准不了解,虽然报过专利,但不是这方面的,需要咨询专 ...

  9. sql多表查询(out join,inner join, left join, right join)

    left join以左表为基准显示所有左表的信息,在on中有符合条件的其他表也显示出来 right join则相反 inner join的只显示on中符合条件的 1 使用多个表格 在「world」资料 ...

  10. python any()和all()用法

    #any(x)判断x对象是否为空对象,如果都为空.0.false,则返回false,如果不都为空.0.false,则返回true #all(x)如果all(x)参数x对象的所有元素不为0.''.Fal ...