Time Limit: 20000/10000MS (Java/Others) Memory Limit: 128000/64000KB (Java/Others)     Special Judge

Problem Description

Most modern archivers, such as WinRAR or WinZIP, use modifications of the Lempel-Ziv method as their primary compression algorithm. Although decompression of LZ-compressed archives is usually easy and fast, the compression process itself is often rather complicated and slow. Therefore professional archivers use approximation methods that sometimes fail to achieve the best possible compression.

This situation doesn't satisfy your chief George. He would like to create the best archiver, WinGOR. The archiver will use the following modification of the LZ77 algorithm.

The text is partitioned into chunks of length not exceeding 4096. Each chunk is compressed independently. We will describe the decompression of one chunk t. Based on this description, you will have to create a compression algorithm that produces the shortest possible compressed chunk x from the given chunk t.

The compressed chunk is written down as a sequence of plain characters and repetition blocks. A plain character is 8 bits long. When decompressing, a plain character c is simply copied to the output. A repetition block (r; l) consists of two parts: a reference r and a length l, each 12 bits long. The reference r is an integer between 1 and 4095. When a repetition block (r; l) is obtained after decompressing i − 1 characters of text, the characters t[i − r ... i − r + l − 1] are copied to the output. Note that r can be less than l; in this case recently copied characters are copied to the output as well.

To help the decompressor distinguish between plain characters and repetition blocks, a leading bit is prepended to each element of the compressed text: 0 means a plain character follows, 1 means a repetition block follows.

For example, “aaabbaaabababababab” can be compressed as “aaabb(5,4)(2,10)”. The compressed variant has 8 + 8 + 8 + 8 + 8 + 24 + 24 + 7 = 95 bits instead of 152 in the original text (the additional 7 bits are the leading bits that distinguish plain characters from repetition blocks).

Given a text chunk, find the compressed representation that needs the fewest bits to encode.
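The decompression rule and the bit count can be checked with a small decoder. The following is a minimal sketch, assuming the input is a well-formed string in the “(r,l)” notation of the Output section; `Decoded` and `decodeChunk` are illustrative names that do not appear in the problem statement.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Restored text plus the size of the compressed form in bits.
struct Decoded {
    std::string text;
    int bits;
};

// Decode the textual notation, e.g. "aaabb(5,4)(2,10)".
// Hypothetical helper: assumes well-formed input.
Decoded decodeChunk(const std::string& x) {
    Decoded out{"", 0};
    std::size_t p = 0;
    while (p < x.size()) {
        if (x[p] != '(') {               // plain character: 1 flag bit + 8 data bits
            out.text += x[p++];
            out.bits += 9;
            continue;
        }
        ++p;                             // skip '('
        int r = 0, l = 0;
        while (std::isdigit(static_cast<unsigned char>(x[p]))) r = r * 10 + (x[p++] - '0');
        ++p;                             // skip ','
        while (std::isdigit(static_cast<unsigned char>(x[p]))) l = l * 10 + (x[p++] - '0');
        ++p;                             // skip ')'
        assert(r >= 1 && static_cast<std::size_t>(r) <= out.text.size());
        // Copy one character at a time so that overlapping blocks (r < l) work.
        for (int k = 0; k < l; ++k) {
            char c = out.text[out.text.size() - r];
            out.text += c;
        }
        out.bits += 25;                  // repetition block: 1 flag bit + 12 + 12 bits
    }
    return out;
}
```

Calling `decodeChunk("aaabb(5,4)(2,10)")` restores the 19-character sample text and counts 5 × 9 + 2 × 25 = 95 bits. Copying character by character is what makes the block (2,10) legal even though its reference is shorter than its length.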

Input

The input file contains a text chunk t. Its length doesn't exceed 4096. The chunk contains only lowercase letters of the English alphabet.

Output

Print the length of the compressed text in bits on the first line of the output file. Print the compressed chunk itself on the second line of the output file. Use the characters themselves to denote plain characters and the “(r,l)” notation (without spaces) to denote repetition blocks.

Sample Input

  aaabbaaabababababab

Sample Output

  95
  aaabb(5,4)(2,10)

Problem Summary

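We are given a compression scheme: each character is either copied directly, or a repeated run of characters is represented by a pair (r,l) that stands for the substring t[i − r ... i − r + l − 1]. A directly copied character takes 9 bits and each pair takes 25 bits. For the given string, output the minimum number of bits after compression together with the compressed encoding.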

Analysis

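It is easy to see that the way the first i characters are encoded has no effect on how the rest of the string is encoded, because a repetition block refers to the decompressed text and not to the earlier tokens. This suggests a dynamic program that saves the decision taken at each state.
The one small trick needed is how to find the longest repeated substring that starts as early as possible. Borrowing the idea of the KMP algorithm, we can precompute a next array for every suffix of the original string, so that the longest repeated substring for each prefix of the original string can be obtained in O(n) time. This gives a dynamic program with O(n^2) complexity:
Let minbit[i] be the minimum number of bits needed to compress the first i characters, str[i] the decision taken at that state, and r[i], l[i] the pair produced by that transition (if no block is used, str[i] = i and r[i] stores the starting index of the uncompressed run). The code indexes positions from 0, so its minbit[i] covers characters 0..i.
Then, for each i, we enumerate the decision j from 0 to i-1 and choose the j that minimizes either minbit[j] + (i-j)*9 or minbit[i - next[j][i-j]] + 25, where next[j][i-j] is the length of the longest border of the substring from j to i, and save the decision. Finally, a recursive function reconstructs the answer.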
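How a border turns into a repetition block can be seen in isolation. The sketch below is an illustration under stated assumptions, not the submitted code: `suffixBorders`, `Block` and `blockEndingAt` are hypothetical names, positions are 0-based, and the border table is recomputed on every call for clarity rather than precomputed as in the solution.

```cpp
#include <cassert>
#include <string>
#include <vector>

// KMP failure function of the suffix of s that starts at `start`.
// border[k] is the length of the longest proper border of s[start .. start+k].
std::vector<int> suffixBorders(const std::string& s, int start) {
    int n = static_cast<int>(s.size()) - start;
    assert(start >= 0 && n >= 1);
    std::vector<int> border(n, 0);
    for (int k = 1; k < n; ++k) {
        int b = border[k - 1];
        while (b > 0 && s[start + k] != s[start + b]) b = border[b - 1];
        if (s[start + k] == s[start + b]) ++b;
        border[k] = b;
    }
    return border;
}

struct Block { int r; int l; };

// Repetition block that ends at index i and copies from index j (j < i),
// or {0, 0} when s[j..i] has no border.
Block blockEndingAt(const std::string& s, int j, int i) {
    int t = suffixBorders(s, j)[i - j];
    if (t == 0) return {0, 0};
    // The copied text occupies s[i-t+1 .. i] and its source starts at j,
    // so the reference is the distance between those two start positions.
    return {i + 1 - t - j, t};
}
```

On the sample text, `blockEndingAt(s, 0, 8)` gives (5,4) and `blockEndingAt(s, 7, 18)` gives (2,10), the two blocks of the sample output. A border may overlap itself, which corresponds exactly to the r < l case allowed by the problem.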
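(One thing to watch out for here: the next array is two-dimensional, the data range maxn is 2^12, and the memory limit is 64000KB. Declaring the array as next[maxn][maxn] would obviously exceed the memory limit, so I plucked up my courage and used dynamic memory allocation.)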
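The memory concern can be made concrete with a little arithmetic. This sketch assumes 4-byte ints and n = 4096, and assumes that the row allocated for the suffix starting at l holds n − l + 1 entries; `fullTableBytes` and `triangularBytes` are illustrative names.

```cpp
#include <cassert>

// Bytes used by a full n x n table of 4-byte ints (assumed int size).
long long fullTableBytes(long long n) {
    assert(n >= 0);
    return n * n * 4;
}

// Bytes used when only the rows l = 0 .. n-2 are allocated,
// each with n - l + 1 entries (assumed row length).
long long triangularBytes(long long n) {
    assert(n >= 0);
    long long ints = 0;
    for (long long l = 0; l + 1 < n; ++l) ints += (n - l) + 1;
    return ints * 4;
}
```

For n = 4096 the full table needs 67,108,864 bytes, which is above the 64000KB limit (65,536,000 bytes), while the per-suffix rows need 33,579,000 bytes, roughly half.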

Accepted Code

    // Verdict: Accepted
    // Submission Date: -- ::
    // Time: 8256MS
    // Memory: 34624KB

    /*======================= Code by Asm.Def =======================*/
    #include <cassert>
    #include <cctype>
    #include <cstdio>
    // No "using namespace std": the global array `next` would clash with std::next.

    #define maxn ((int)4.1e3)

    /*----------------------- GLOBAL VARIABLES ----------------------*/
    char ch[maxn];                          // the input chunk, 0-based
    int len = 0, minbit[maxn], *next[maxn]; // next[j]: KMP table of the suffix starting at j
    int l[maxn], r[maxn], str[maxn];        // saved decisions for reconstruction

    /*-------------------------- FUNCTIONS --------------------------*/
    // Build the KMP next array for the suffix that begins at index `start`.
    // Next[i] is the length of the longest proper border of ch[start .. start+i].
    inline void getnext(int start){
        int i, j, L = len - start;
        next[start] = new int[L+1];         // one row per suffix instead of next[maxn][maxn]
        int *Next = next[start];
        Next[0] = 0;
        for(i = 1; i < L; ++i){
            j = Next[i-1] - 1;              // index of the last character of the current border
            while(j >= 0 && ch[start+i] != ch[start+j+1])
                j = Next[j] - 1;
            if(ch[start+i] == ch[start+j+1])
                Next[i] = j + 2;
            else Next[i] = 0;
        }
    }

    // Print the optimal encoding of ch[0 .. i] by following the saved decisions.
    void printpro(int i){
        if(str[i] == i){                    // ch[r[i] .. i] is written as plain characters
            if(r[i]) printpro(r[i]-1);
            int j;
            for(j = r[i]; j <= i; ++j) std::putchar(ch[j]);
            return;
        }
        printpro(str[i]);                   // everything before the block
        std::printf("(%d,%d)", r[i], l[i]);
    }

    int main(){
        #ifdef DEBUG
        assert(std::freopen("test","r",stdin));
        #endif
        int c;                              // int, so that EOF is handled correctly
        while(std::isalpha(c = std::getchar())) str[len] = len, ch[len++] = (char)c;
        int i, j, Min, t;
        for(i = 0; i < len - 1; ++i)
            getnext(i);
        minbit[0] = 9;                      // the first character is always plain
        for(i = 1; i < len; ++i){
            Min = 0x7fffffff;
            // Option 1: ch[j+1 .. i] are plain characters, 9 bits each.
            for(j = 0; j < i; ++j)
                if(minbit[j] + (i-j)*9 < Min){
                    Min = minbit[j] + (i-j)*9;
                    str[i] = i;
                    r[i] = j+1;
                }
            // Option 2: a 25-bit block ends at i and copies from position j.
            for(j = 0; j < i; ++j){
                t = next[j][i-j];           // longest border of ch[j .. i]
                if(!t) continue;
                if(minbit[i-t] + 25 < Min){
                    Min = minbit[i-t] + 25;
                    str[i] = i-t;
                    r[i] = i+1-t-j;
                    l[i] = t;
                }
            }
            minbit[i] = Min;
        }
        std::printf("%d\n", minbit[len-1]);
        printpro(len-1);
        std::putchar('\n');
        return 0;
    }
