LintCode-Word Segmentation

Given a string s and a dictionary of words dict, determine if s can be segmented into a space-separated sequence of one or more dictionary words.

Example

Given

s = "lintcode",

dict = ["lint", "code"].

Return true because "lintcode" can be segmented as "lint code".

Analysis:

This is a DP problem. To speed it up, we build each candidate word character by character with charAt() rather than calling substring(). We can also first check whether every character of s appears in at least one dictionary word; if any character does not, return false immediately. (This pre-check is what gets the last LintCode test case to pass.)
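Written out, the recurrence behind the DP looks roughly like this, where d[i] says whether the first i characters of s can be segmented. The notation s[j..i) for the substring from index j up to but not including i is ours, not LintCode's.

```latex
d[0] = \text{true}, \qquad
d[i] = \bigvee_{0 \le j < i} \bigl( d[j] \,\land\, s[j..i) \in \mathit{dict} \bigr),
\quad 1 \le i \le |s|
```

For s = "lintcode" and dict = ["lint", "code"], d[4] is true because d[0] is true and "lint" is in dict, and d[8] is true because d[4] is true and "code" is in dict, so the answer d[8] is true.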

Solution:

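 import java.util.Set;

 public class Solution {
     /**
      * @param s: A string s
      * @param dict: A dictionary of words dict
      * @return true if s can be segmented into words from dict
      */
     public boolean wordSegmentation(String s, Set<String> dict) {
         if (s.length() == 0) return true;

         // Pre-check: every character of s must occur in some dictionary word.
         // A boolean table covering the full char range avoids counter
         // overflow and out-of-bounds access on non-ASCII input.
         boolean[] seen = new boolean[Character.MAX_VALUE + 1];
         for (String word : dict)
             for (int i = 0; i < word.length(); i++)
                 seen[word.charAt(i)] = true;
         for (int i = 0; i < s.length(); i++)
             if (!seen[s.charAt(i)]) return false;

         // d[i] is true when the prefix s[0..i) can be segmented.
         boolean[] d = new boolean[s.length() + 1];
         d[0] = true;
         for (int i = 1; i <= s.length(); i++) {
             // Grow the candidate word s[j..i) one character at a time.
             StringBuilder builder = new StringBuilder();
             for (int j = i - 1; j >= 0; j--) {
                 builder.insert(0, s.charAt(j));
                 String cur = builder.toString();
                 if (d[j] && dict.contains(cur)) {
                     d[i] = true;
                     break;
                 }
             }
         }
         return d[s.length()];
     }
 }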
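One further pruning idea, not used in the solution above, is to stop the inner loop once the candidate word would be longer than the longest dictionary word. The sketch below is a hypothetical variant for illustration only: the class name BoundedWordSegmentation and the method canSegment are our own, it uses substring() for brevity instead of the charAt() approach, and it has not been run against the LintCode judge.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical variant for illustration; not part of the original solution.
public class BoundedWordSegmentation {

    // Returns true if s can be split into words from dict.
    public static boolean canSegment(String s, Set<String> dict) {
        // Longest dictionary word: no candidate needs to be longer than this.
        int maxLen = 0;
        for (String word : dict) {
            maxLen = Math.max(maxLen, word.length());
        }

        // d[i] is true when the prefix s[0..i) can be segmented.
        boolean[] d = new boolean[s.length() + 1];
        d[0] = true;
        for (int i = 1; i <= s.length(); i++) {
            // Only look back as far as the longest word allows.
            for (int j = i - 1; j >= Math.max(0, i - maxLen); j--) {
                if (d[j] && dict.contains(s.substring(j, i))) {
                    d[i] = true;
                    break;
                }
            }
        }
        return d[s.length()];
    }

    public static void main(String[] args) {
        Set<String> dict = new HashSet<>(Arrays.asList("lint", "code"));
        System.out.println(canSegment("lintcode", dict)); // true
        System.out.println(canSegment("lintcod", dict));  // false
    }
}
```

With a dictionary of short words and a long s, this keeps the inner loop to at most maxLen iterations per position, so the number of dictionary lookups is bounded by |s| times the longest word length.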
