LeetCode: Substring with Concatenation of All Words, Analysis and Implementation

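The problem takes a string s and an array of strings words, where all strings in words have the same length. We must find every index in s such that the substring of s starting at s[index], whose length equals the total length of the strings in words, is a concatenation of all the strings in words, with each entry used exactly once and in any order.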
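To make the statement concrete, here is a minimal brute-force sketch that checks every start index independently. It is not the approach developed in this post, only a reference for what counts as a valid index. The class and method names are hypothetical, and the sketch assumes words is non-empty and its strings have positive length.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class BruteForceExample {
    // Reference check only: tests each start index from scratch.
    // Assumes words is non-empty and every word has the same positive length.
    static List<Integer> findSubstringBruteForce(String s, String[] words) {
        List<Integer> result = new ArrayList<>();
        int n = words[0].length();
        int total = n * words.length;

        // How many times each word must appear in a valid window.
        Map<String, Integer> need = new HashMap<>();
        for (String w : words) {
            need.merge(w, 1, Integer::sum);
        }

        for (int index = 0; index + total <= s.length(); index++) {
            Map<String, Integer> left = new HashMap<>(need);
            boolean ok = true;
            for (int p = index; p < index + total; p += n) {
                String piece = s.substring(p, p + n);
                Integer count = left.get(piece);
                if (count == null || count == 0) {
                    ok = false; // unknown word, or one copy too many
                    break;
                }
                left.put(piece, count - 1);
            }
            if (ok) {
                result.add(index);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // "barfoo" starts at 0 and "foobar" starts at 9, so this prints [0, 9]
        System.out.println(findSubstringBruteForce(
                "barfoothefoobarman", new String[] {"foo", "bar"}));
    }
}
```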

This problem is a bit perverse, and it is hard to find a particularly effective optimization for it. Here is my approach:

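First, let m be the length of s, k the number of strings in words, and n the length of each string in words. Clearly, when n*k > m we only need to return an empty result, so from here on we can assume n*k <= m.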

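We begin by inserting all the elements of words into a red-black tree, mapping words[i] to the number of times it has been inserted (that is, the number of occurrences of words[i] in words). Each insertion costs (number of comparisons) * (time per comparison) = O(log2(k)) * n = O(nlog2(k)), so inserting all k words costs O(knlog2(k)). With that, the data structure we need is ready.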

map = empty-red-black-tree
for(i = 0; i < words.length; i = i + 1)
    times = map.get(words[i])
    if times == NIL
        times = 0
    times = times + 1
    map.put(words[i], times)
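In Java, the counting step can be sketched with java.util.TreeMap, which the JDK documents as a red-black tree based map. The class and method names below are hypothetical and chosen only for illustration.

```java
import java.util.Map;
import java.util.TreeMap;

class WordCountSketch {
    // Maps each distinct word to the number of times it occurs in words.
    static Map<String, Integer> countWords(String[] words) {
        Map<String, Integer> map = new TreeMap<>(); // red-black tree keyed by word
        for (String word : words) {
            Integer times = map.get(word);
            if (times == null) {
                times = 0; // first time we see this word
            }
            map.put(word, times + 1);
        }
        return map;
    }

    public static void main(String[] args) {
        // TreeMap iterates in key order, so this prints {bar=1, foo=2}
        System.out.println(countWords(new String[] {"foo", "bar", "foo"}));
    }
}
```

Each get and put compares strings of length n along a path of O(log2(k)) nodes, which is where the O(nlog2(k)) cost per insertion comes from.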

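Next comes the interesting idea. Take the window 0~nk as the first candidate substring and move it right by n positions (one word) each time, so the next window to check is n~(k+1)n. Meanwhile, a variable remain records how many of the strings in words are still unmatched. Once remain reaches 0, the window currently being scanned contains every string in words. From this point on, the value stored in map for each string denotes how many more occurrences of that string the current window still lacks (a negative value means the window holds surplus copies).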

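Each time the window moves right by n positions, we only need to remove the string formed by its first n characters and add the string formed by the n characters that follow it. Removing the leading length-n string requires updating a value in map, which costs O(nlog2(k)); adding the trailing length-n string likewise requires updating a value in map and also costs O(nlog2(k)). One shift therefore costs O(nlog2(k)) in total, plus a few O(1) operations such as updating remain and checking whether the current window satisfies the requirement.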

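When the tail of the window reaches the end of s, the current pass stops and the next pass begins, starting from the window 1~nk+1. The outer loop ends once every starting window t~nk+t with 0<=t<n has been considered. This way we scan every substring of s that could satisfy the condition, namely those starting at 0,1,...,m-nk. From the description so far one might expect about m-nk operations on map, but the actual count is on the order of m: do not forget that the first window of each pass also has to be built up by updating map and remain. In addition, map and remain have to be restored for each pass, which touches at most k entries per pass and therefore at most nk<=m entries over all n passes. The loops above therefore cost O(m)*O(nlog2(k))=O(mnlog2(k)) in total.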

remove(subs) // remove the length-n string subs from the current window
    times = map.get(subs)
    if times == NIL
        return
    times = times+1
    map.put(subs, times)
    if(times > 0)
        remain = remain+1

append(subs) // add the length-n string subs to the current window
    times = map.get(subs)
    if times == NIL
        return
    times = times-1
    map.put(subs, times)
    if(times >= 0)
        remain = remain-1

result = empty-list
remain = k
for(i = 0; i < n; i = i+1)
    for(j = i; j < m; j = j+n)
        start = j - n*k
        end = j
        if(start >= 0)
            startStr = s.substring(start, start + n)
            remove(startStr)
        if(end + n <= m)
            endStr = s.substring(end, end + n)
            append(endStr)
        if(remain == 0)
            result.add(start + n)
    reset map values and remain

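That is the pseudocode for the idea above. Including initialization, the total time complexity is O(knlog2(k))+O(mnlog2(k))=O(mnlog2(k)), which stays within O(m^2) because nlog2(k)<=nk<=m.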
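As a rough, unoptimized illustration, the pseudocode can be transcribed into Java with a TreeMap standing in for the red-black tree. The class name is hypothetical, the reset step is done by copying a base map at the start of each pass, and returning an empty list for an empty words array is an assumption of this sketch rather than something the problem statement dictates.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class SlidingWindowSketch {
    static List<Integer> findSubstring(String s, String[] words) {
        List<Integer> result = new ArrayList<>();
        if (words.length == 0 || words[0].isEmpty()) {
            return result; // assumption of this sketch: degenerate input matches nothing
        }
        int m = s.length(), n = words[0].length(), k = words.length;

        // base maps each word to its required number of occurrences.
        Map<String, Integer> base = new TreeMap<>();
        for (String w : words) {
            base.merge(w, 1, Integer::sum);
        }

        for (int i = 0; i < n; i++) {
            Map<String, Integer> map = new TreeMap<>(base); // reset map values
            int remain = k;                                  // reset remain
            for (int j = i; j < m; j += n) {
                int start = j - n * k;
                if (start >= 0) { // remove the word leaving the window
                    String head = s.substring(start, start + n);
                    Integer times = map.get(head);
                    if (times != null) {
                        map.put(head, times + 1);
                        if (times + 1 > 0) remain++;
                    }
                }
                if (j + n <= m) { // append the word entering the window
                    String tail = s.substring(j, j + n);
                    Integer times = map.get(tail);
                    if (times != null) {
                        map.put(tail, times - 1);
                        if (times - 1 >= 0) remain--;
                    }
                }
                if (remain == 0) {
                    result.add(start + n); // window now covers [start + n, j + n)
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // prints [0, 9]
        System.out.println(findSubstring("barfoothefoobarman", new String[] {"foo", "bar"}));
    }
}
```

Because the outer loop goes through the offsets 0..n-1 one pass at a time, the indices are reported pass by pass, not necessarily in ascending order.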

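A few words on optimizing the steps above. First, for the sequence of fetching a value from map, modifying it, and putting it back, we can wrap the integer in a holder object; after fetching the holder we only need to modify the integer inside it, with no need to put it back. Second, for substring, we can likewise use a wrapper that wraps s directly and restricts its valid range, which brings the cost of creating a substring down to O(1). Neither optimization is essential, because neither changes the overall time complexity. The overall time complexity is determined by the lookups that use a string as the key into map, and this is where hashing combined with cached hash codes can bring a real improvement. If the hash is good enough, that is, all strings hash to different slots, then computing the hash codes of all strings costs O(n*k)+O(m*n)=O(mn), each fetch-and-update costs O(n), and the double loop that computes the result runs m times in total, so it costs O(n)*O(m)=O(nm). The resulting time complexity is O(mn)+O(nm)=O(mn). Of course, this is only the ideal case.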

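There is a variant of the hash table called perfect hashing, which keeps O(1) lookup time even in the worst case. Interested readers can look up the details themselves. With perfect hashing, different strings are guaranteed to hash to different slots, which makes the optimization described above achievable.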

Finally, here is the accepted (AC) code:

 package cn.dalt.leetcode;

 import java.util.*;

 /**
  * Created by dalt on 2017/6/22.
  */
 public class SubstringwithConcatenationofAllWords {
     private static final class Substring {
         private char[] data;
         private int from;
         private int length;

         public Substring(char[] data, int from, int length) {
             this.data = data;
             this.from = from;
             this.length = length;
         }

         public Substring substring(int from, int length) {
             return new Substring(data, this.from + from, length);
         }

         Integer cachedHashCode;

         @Override
         public int hashCode() {
             if (cachedHashCode == null) {
                 int value = 0;
                 for (int i = from, bound = from + length; i < bound; i++) {
                     value = (value << 5) - value + data[i];
                 }
                 cachedHashCode = Integer.valueOf(value);
             }
             return cachedHashCode.intValue();
         }

         public char charAt(int i) {
             return data[i + from];
         }

         public int size() {
             return length;
         }

         @Override
         public boolean equals(Object obj) {
             if (obj == null)
                 return false;
             if (obj.getClass() != Substring.class)
                 return false;
             Substring other = (Substring) obj;
             if (hashCode() != other.hashCode() || length != other.length)
                 return false;
             for (int i = 0; i < length; i++) {
                 if (charAt(i) != other.charAt(i))
                     return false;
             }
             return true;
         }

         @Override
         public String toString() {
             return String.valueOf(data, from, length);
         }
     }

     private static final class IntHolder {
         private int value;
         private int storedValue;

         public IntHolder(int initValue) {
             value = initValue;
         }

         public void inc() {
             value++;
         }

         public void dec() {
             value--;
         }

         public void store() {
             storedValue = value;
         }

         public void restore() {
             value = storedValue;
         }

         public int getValue() {
             return value;
         }

         @Override
         public int hashCode() {
             return value;
         }

         @Override
         public String toString() {
             return value + "(" + storedValue + ")";
         }

         @Override
         public boolean equals(Object obj) {
             if (obj == null)
                 return false;
             if (obj.getClass() == IntHolder.class) {
                 return ((IntHolder) obj).value == value;
             }
             return false;
         }
     }

     public List<Integer> findSubstring(String s, String[] words) {
         if (words.length == 0) {
             List<Integer> result = new ArrayList<>(s.length());
             for (int i = 0, bound = s.length(); i < bound; i++) {
                 result.add(Integer.valueOf(i));
             }
             return result;
         }
         int m = s.length();
         int n = words[0].length();
         int k = words.length;

         Map<Substring, IntHolder> map = new HashMap<>(k);
         for (String word : words) {
             Substring pack = new Substring(word.toCharArray(), 0, word.length());
             IntHolder holder = map.get(pack);
             if (holder == null) {
                 holder = new IntHolder(0);
                 map.put(pack, holder);
             }
             holder.inc();
         }

         List<IntHolder> holders = new ArrayList<IntHolder>(map.values());
         for (IntHolder holder : holders) {
             holder.store();
         }
         List<Integer> result = new LinkedList<>();
         char[] sarray = s.toCharArray();
         for (int i = 0; i < n; i++) {
             for (IntHolder holder : holders) {
                 holder.restore();
             }
             int remain = words.length;
             for (int j = i; j < m; j = j + n) {
                 int start = j - n * k;
                 int end = j;
                 if (start >= 0) {
                     Substring sub = new Substring(sarray, start, n);
                     IntHolder times = map.get(sub);
                     if (times != null) {
                         times.inc();
                         if (times.getValue() > 0) {
                             remain++;
                         }
                     }
                 }
                 if (end + n <= m) {
                     Substring sub = new Substring(sarray, end, n);
                     IntHolder times = map.get(sub);
                     if (times != null) {
                         times.dec();
                         if (times.getValue() >= 0) {
                             remain--;
                         }
                     }
                 }
                 if (remain == 0) {
                     result.add(start + n);
                 }
             }
         }
         return result;
     }
 }
