CareerCup: 17.14 minimize unrecognized characters
Oh, no! You have just completed a lengthy document when you have an unfortu-
nate Find/Replace mishap. You have accidentally removed all spaces, punctuation,
and capitalization in the document. A sentence like "I reset the computer. It still
didn't boot!" would become "iresetthecomputeritstilldidntboot". You figure that you
can add back in the punctation and capitalization later, once you get the individual
words properly separated. Most of the words will be in a dictionary, but some strings,
like proper names, will not.
Given a dictionary (a list of words), design an algorithm to find the optimal way of
"unconcatenating" a sequence of words. In this case, "optimal" is defined to be the
parsing which minimizes the number of unrecognized sequences of characters.
For example, the string "jesslookedjustliketimherbrother" would be optimally parsed
as "JESS looked just like TIM her brother". This parsing has seven unrecognized char-
acters, which we have capitalized for clarity.
这是CareerCup Chapter 17的第14题,我没怎么看CareerCup上的解法,但感觉这道题跟Word Break, Palindrome Partition II很像,都是有一个dictionary, 可以用一维DP来做,用一个int[] res = new int[len+1]; res[i] refers to minimized # of unrecognized chars in first i chars, res[0]=0, res[len]即为所求。
有了维护量,现在需要考虑转移方程,如下:
int unrecogNum = dict.contains(s.substring(j, i))? 0 : i-j; //看index从j到i-1的substring在不在dictionary里,如果不在,unrecogNum=j到i-1的char数
res[i] = Math.min(res[i], res[j]+unrecogNum);
亲测,我使用的case都过了,只是不知道有没有不过的Corner Case:
package fib; import java.util.Arrays;
import java.util.HashSet;
import java.util.Set; public class unconcatenating {
public int optway(String s, Set<String> dict) {
if (s==null || s.length()==0) return 0;
int len = s.length();
if (dict.isEmpty()) return len;
int[] res = new int[len+1]; // res[i] refers to minimized # of unrecognized chars in first i chars
Arrays.fill(res, Integer.MAX_VALUE);
res[0] = 0;
for (int i=1; i<=len; i++) {
for (int j=0; j<i; j++) {
String str = s.substring(j, i);
int unrecogNum = dict.contains(str)? 0 : i-j;
res[i] = Math.min(res[i], res[j]+unrecogNum);
}
}
return res[len];
} public static void main(String[] args) {
unconcatenating example = new unconcatenating();
Set<String> dict = new HashSet<String>();
dict.add("reset");
dict.add("the");
dict.add("computer");
dict.add("it");
dict.add("still");
dict.add("didnt");
dict.add("boot");
int result = example.optway("johnresetthecomputeritdamnstilldidntboot", dict);
System.out.print("opt # of unrecognized chars is ");
System.out.println(result);
} }
output是:opt # of unrecognized chars is 8
CareerCup: 17.14 minimize unrecognized characters的更多相关文章
- [CareerCup] 17.14 Unconcatenate Words 断词
17.14 Oh, no! You have just completed a lengthy document when you have an unfortunate Find/Replace m ...
- [CareerCup] 17.6 Sort Array 排列数组
17.6 Given an array of integers, write a method to find indices m and n such that if you sorted elem ...
- [CareerCup] 17.2 Tic Tac Toe 井字棋游戏
17.2 Design an algorithm to figure out if someone has won a game oftic-tac-toe. 这道题让我们判断玩家是否能赢井字棋游戏, ...
- [CareerCup] 17.13 BiNode 双向节点
17.13 Consider a simple node-like data structure called BiNode, which has pointers to two other node ...
- [CareerCup] 17.12 Sum to Specific Value 和为特定数
17.12 Design an algorithm to find all pairs of integers within an array which sum to a specified val ...
- [CareerCup] 17.11 Rand7 and Rand5 随机生成数字
17.11 Implement a method rand7() given rand5(). That is, given a method that generates a random numb ...
- [CareerCup] 17.10 Encode XML 编码XML
17.10 Since XML is very verbose, you are given a way of encoding it where each tag gets mapped to a ...
- [CareerCup] 17.9 Word Frequency in a Book 书中单词频率
17.9 Design a method to find the frequency of occurrences of any given word in a book. 这道题让我们找书中单词出现 ...
- [CareerCup] 17.8 Contiguous Sequence with Largest Sum 连续子序列之和最大
17.8 You are given an array of integers (both positive and negative). Find the contiguous sequence w ...
随机推荐
- 微博app中常用正则表达式
/* weibo.app 里面的正则,有兴趣的可以参考下: HTTP链接 (例如 http://www.weibo.com ): ([hH]ttp[s]{0,1})://[a-zA-Z0-9\.\-] ...
- 5 个 Composer 小技巧
Composer是新一代的PHP依赖管理工具.其介绍和基本用法可以看这篇<Composer PHP依赖管理的新时代>.本文介绍使用Composer的五个小技巧,希望能给你的PHP开发带来方 ...
- prototype linkage can reduce object initialization time and memory consumption
//对象是可变的键控集合, //"numbers, strings, booleans (true and false), null, and undefined" 不是对象的解释 ...
- Apache Kafka源码分析 - PartitionStateMachine
startup 在onControllerFailover中被调用, initializePartitionState private def initializePartitionState() { ...
- Machine Learning in Action -- 树回归
前面介绍线性回归,但实际中,用线性回归去拟合整个数据集是不太现实的,现实中的数据往往不是全局线性的 当然前面也介绍了局部加权线性回归,这种方法有些局限 这里介绍另外一种思路,树回归 基本思路,用决策树 ...
- epoll 简单介绍及例子
第一部分:Epoll简介 . 当select()返回时,timeout参数的状态在不同的系统中是未定义的,因此每次调用select()之前必须重新初始化timeout和文件描述符set.实际上,秒,然 ...
- 【No.1】监控Linux性能25个命令行工具
如果你的Linux服务器突然负载暴增,告警短信快发爆你的手机,如何在最短时间内找出Linux性能问题所在?通过以下命令或者工具可以快速定位 top vmstat lsof tcpdump netsta ...
- nRF51822之WDT浅析
看门狗定时器 NRF51822 的看门狗定时器是倒计数器, 当计数值减少到 0 时产生 TIMEOUT 事件. 通过 START task 来启动看门狗定时器. 看门狗定时器启动时,如没有其他 32. ...
- Qt编写自定义控件大全(liudianwu)
http://www.cnblogs.com/feiyangqingyun/p/6128288.html http://www.qtcn.org/bbs/read-htm-tid-62279.html
- ArcGIS Engine开发之旅07---文件地理数据库、个人地理数据库和 ArcSDE 地理数据库中的栅格存储加以比较 、打开栅格数据
原文:ArcGIS Engine开发之旅07---文件地理数据库.个人地理数据库和 ArcSDE 地理数据库中的栅格存储加以比较 .打开栅格数据 对文件地理数据库.个人地理数据库和 ArcSDE 地理 ...