First, the problem statement:

A DNA sequence can be represented as a string consisting of the letters A, C, G and T, which correspond to the types of successive nucleotides in the sequence. Each nucleotide has an impact factor, which is an integer. Nucleotides of types A, C, G and T have impact factors of 1, 2, 3 and 4, respectively. You are going to answer several queries of the form: What is the minimal impact factor of nucleotides contained in a particular part of the given DNA sequence?

The DNA sequence is given as a non-empty string S = S[0]S[1]...S[N-1] consisting of N characters. There are M queries, which are given in non-empty arrays P and Q, each consisting of M integers. The K-th query (0 ≤ K < M) requires you to find the minimal impact factor of nucleotides contained in the DNA sequence between positions P[K] and Q[K] (inclusive).

For example, consider string S = CAGCCTA and arrays P, Q such that:

    P[0] = 2    Q[0] = 4
    P[1] = 5    Q[1] = 5
    P[2] = 0    Q[2] = 6

The answers to these M = 3 queries are as follows:

  • The part of the DNA between positions 2 and 4 contains nucleotides G and C (twice), whose impact factors are 3 and 2 respectively, so the answer is 2.
  • The part between positions 5 and 5 contains a single nucleotide T, whose impact factor is 4, so the answer is 4.
  • The part between positions 0 and 6 (the whole string) contains all nucleotides, in particular nucleotide A whose impact factor is 1, so the answer is 1.

Assume that the following declarations are given:

struct Results {
  int * A;
  int M;
};

Write a function:

struct Results solution(char *S, int P[], int Q[], int M);

that, given a non-empty zero-indexed string S consisting of N characters and two non-empty zero-indexed arrays P and Q consisting of M integers, returns an array consisting of M integers specifying the consecutive answers to all queries.

The sequence should be returned as:

  • a Results structure (in C), or
  • a vector of integers (in C++), or
  • a Results record (in Pascal), or
  • an array of integers (in any other programming language).

For example, given the string S = CAGCCTA and arrays P, Q such that:

    P[0] = 2    Q[0] = 4
    P[1] = 5    Q[1] = 5
    P[2] = 0    Q[2] = 6

the function should return the values [2, 4, 1], as explained above.

Assume that:

  • N is an integer within the range [1..100,000];
  • M is an integer within the range [1..50,000];
  • each element of arrays P, Q is an integer within the range [0..N − 1];
  • P[K] ≤ Q[K], where 0 ≤ K < M;
  • string S consists only of upper-case English letters A, C, G, T.

Complexity:

  • expected worst-case time complexity is O(N+M);
  • expected worst-case space complexity is O(N), beyond input storage (not counting the storage required for input arguments).

Elements of input arrays can be modified.

Copyright 2009–2015 by Codility Limited. All Rights Reserved. Unauthorized copying, publication or disclosure prohibited.

2. Problem analysis

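The problem looks simple enough: for any slice of a sequence, find the smallest value that occurs in it.

However, the expected time complexity is O(N+M), and meeting that took some work.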

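1. First, convert the string into an int array. This speeds up the later steps, because the letters do not have to be mapped to numbers again on every pass.

Use strlen to get the length of the string, then walk through it with the temp pointer, converting A to 1, C to 2, G to 3 and T to 4.

Once this array is available, the rest is easier to work with.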
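
As a minimal sketch of this conversion step, the mapping can be isolated in two small functions. The names impact_factor and to_factors are made up for illustration and do not appear in the solution in section 3; returning 0 for an unexpected character is also an assumption, since the problem guarantees only A, C, G and T.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: map one nucleotide letter to its impact factor.
   Returns 0 for any character outside A, C, G, T (an assumption; the
   problem statement guarantees this never happens). */
int impact_factor(char c)
{
    switch (c) {
    case 'A': return 1;
    case 'C': return 2;
    case 'G': return 3;
    case 'T': return 4;
    default:  return 0;
    }
}

/* Hypothetical helper: fill out[0..n-1] with the impact factors of s,
   where n = strlen(s). The caller must provide room for n ints.
   Returns n. */
size_t to_factors(const char *s, int *out)
{
    size_t n = strlen(s);
    for (size_t i = 0; i < n; i++) {
        out[i] = impact_factor(s[i]);
    }
    return n;
}
```

For S = CAGCCTA, to_factors fills the output array with 2 1 3 2 2 4 1.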

2. Getting the minimum of any slice looks easy: just scan the slice. But then the time complexity becomes O(N*M).
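
For reference, the straightforward scan might look like the sketch below. The function name min_factor_naive is hypothetical. Each call walks the whole slice, so with N up to 100,000 and M up to 50,000 the worst case is on the order of 5 × 10^9 character reads.

```c
/* Hypothetical baseline: scan s[p..q] (inclusive) and return the smallest
   impact factor seen. Each call costs O(q - p + 1), so M queries on a
   string of length N cost O(N * M) in the worst case. */
int min_factor_naive(const char *s, int p, int q)
{
    int best = 4;                       /* T has the largest factor */
    for (int i = p; i <= q; i++) {
        int f;
        switch (s[i]) {
        case 'A': f = 1; break;
        case 'C': f = 2; break;
        case 'G': f = 3; break;
        default:  f = 4; break;         /* 'T' */
        }
        if (f < best) {
            best = f;
        }
    }
    return best;
}
```

On the example string CAGCCTA, the queries (2, 4), (5, 5) and (0, 6) give 2, 4 and 1, matching the problem statement.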

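How can the time complexity be reduced? The answer is to trade space for time.

Using the prefix-sum idea, I count, for every position, how many times 1, 2, 3 and 4 each occur up to and including that position.

The counts are stored in the arrays A, B, C and D respectively.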
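
A minimal sketch of building one such count array is shown here. The function name prefix_count and its parameters are assumptions for illustration; it counts a letter directly in the string rather than a number in the converted array, which is equivalent.

```c
/* Hypothetical helper: count[i] = number of times `letter` occurs in
   s[0..i] (inclusive), i.e. the "up to and including this position"
   convention described above. The caller provides room for n ints. */
void prefix_count(const char *s, int n, char letter, int *count)
{
    int running = 0;
    for (int i = 0; i < n; i++) {
        if (s[i] == letter) {
            running++;
        }
        count[i] = running;
    }
}
```

For S = CAGCCTA, the counts for the letter C are 1 1 1 2 3 3 3 and the counts for the letter A are 0 1 1 1 1 1 2. Calling the function once each for A, C, G and T yields the four arrays in a total of O(N) time.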

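For each query, checking whether A[end of slice] - A[begin of slice] is greater than zero tells us whether a 1 occurs in the slice. The cost of this step per query drops from O(N) to O(1).

Note that the element at the beginning of the slice must also be checked for 1 separately: the difference A[end] - A[begin] only counts positions after begin, so a 1 sitting exactly at the beginning of the slice does not show up in it.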
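
A common variant, sketched below, avoids the separate check of the first position by shifting the count arrays by one. This is not the layout used in the solution in section 3; the function name answer_queries, its signature and its error return are all assumptions made for the sketch.

```c
#include <stdlib.h>
#include <string.h>

/* Variant sketch (not the author's layout): cnt[k][i] holds how many
   letters with impact factor k+1 occur in s[0..i-1], so cnt[k][0] == 0
   and the count inside s[p..q] is cnt[k][q + 1] - cnt[k][p].
   Writes m answers to out. Returns 0 on success, -1 if allocation fails. */
int answer_queries(const char *s, const int *P, const int *Q, int m, int *out)
{
    static const char letters[3] = { 'A', 'C', 'G' };  /* T is the fallback */
    size_t n = strlen(s);
    int *cnt[3];

    /* One zero-initialised block holding three arrays of n + 1 ints. */
    int *block = calloc(3 * (n + 1), sizeof *block);
    if (block == NULL) {
        return -1;
    }
    for (int k = 0; k < 3; k++) {
        cnt[k] = block + (size_t)k * (n + 1);
    }

    /* Build the shifted prefix counts in O(N). */
    for (size_t i = 0; i < n; i++) {
        for (int k = 0; k < 3; k++) {
            cnt[k][i + 1] = cnt[k][i] + (s[i] == letters[k]);
        }
    }

    /* Answer each query in O(1): try A, then C, then G, else T. */
    for (int j = 0; j < m; j++) {
        out[j] = 4;
        for (int k = 0; k < 3; k++) {
            if (cnt[k][Q[j] + 1] - cnt[k][P[j]] > 0) {
                out[j] = k + 1;          /* smallest factor present */
                break;
            }
        }
    }

    free(block);
    return 0;
}
```

With the extra leading zero, the difference covers both ends of the slice, so no special case is needed. No array is kept for T because it is the answer whenever none of A, C or G occurs in the slice.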

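The other values are checked in the same way. The extra space is what buys the lower time complexity.

Finally, report the smallest value that occurs.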

3. The code:

#include <stdlib.h>   // malloc
#include <string.h>   // strlen

// you can write to stdout for debugging purposes, e.g.
// printf("this is a debug message\n");

struct Results solution(char *S, int P[], int Q[], int M) {
    struct Results result;
    // write your code in C99
    int len = strlen(S);
    // printf("len is %d",len);

    // arr[i] holds the impact factor of S[i]
    int arr[len];
    int i;
    char* temp = S;
    for(i=0;i<len;i++)
    {
        if(*temp=='C')
        {
            arr[i]=2;
        }
        else if(*temp=='A')
        {
            arr[i]=1;
        }
        else if(*temp=='G')
        {
            arr[i]=3;
        }
        else if(*temp=='T')
        {
            arr[i]=4;
        }
        temp++;
    }

    // A[i], B[i], C[i], D[i]: how many 1s, 2s, 3s, 4s occur in arr[0..i]
    int A[len];
    int B[len];
    int C[len];
    int D[len];

    if(arr[0]==1)
    {
        A[0]=1;
        B[0]=0;
        C[0]=0;
        D[0]=0;
    }
    if(arr[0]==2)
    {
        A[0]=0;
        B[0]=1;
        C[0]=0;
        D[0]=0;
    }
    if(arr[0]==3)
    {
        A[0]=0;
        B[0]=0;
        C[0]=1;
        D[0]=0;
    }
    if(arr[0]==4)
    {
        A[0]=0;
        B[0]=0;
        C[0]=0;
        D[0]=1;
    }
    // printf("%d %d %d %d \n",A[0],B[0],C[0],D[0]);

    for(i=1;i<len;i++)
    {
        // printf("%d\n",arr[i]);
        if(arr[i]==1)
        {
            A[i]=A[i-1]+1;
            B[i]=B[i-1];
            C[i]=C[i-1];
            D[i]=D[i-1];
        }
        if(arr[i]==2)
        {
            A[i]=A[i-1];
            B[i]=B[i-1]+1;
            C[i]=C[i-1];
            D[i]=D[i-1];
        }
        if(arr[i]==3)
        {
            A[i]=A[i-1];
            B[i]=B[i-1];
            C[i]=C[i-1]+1;
            D[i]=D[i-1];
        }
        if(arr[i]==4)
        {
            A[i]=A[i-1];
            B[i]=B[i-1];
            C[i]=C[i-1];
            D[i]=D[i-1]+1;
        }
        // printf("%d %d %d %d \n",A[i],B[i],C[i],D[i]);
    }

    result.A = malloc(sizeof(int)*M);
    result.M = M;

    for(i=0;i<M;i++)
    {
        int tempP = P[i];
        int tempQ = Q[i];
        // X[tempQ]-X[tempP] counts positions tempP+1..tempQ only,
        // so arr[tempP] is checked separately
        if(arr[tempP]==1 || (A[tempQ]-A[tempP]) > 0)
        {
            result.A[i]=1;
        }
        else if(arr[tempP]==2 || (B[tempQ]-B[tempP]) > 0)
        {
            result.A[i]=2;
        }
        else if(arr[tempP]==3 || (C[tempQ]-C[tempP]) > 0)
        {
            result.A[i]=3;
        }
        else
        {
            result.A[i]=4;
        }
    }

    return result;
}
