Thoroughly Understanding the KMP Algorithm in Data Structures
Knuth–Morris–Pratt string search algorithm
Start at the left-hand end of the string, string[0], and try to match the pattern, working rightwards.
Here i indexes the string and j indexes the pattern; at each step we test whether string[i] == pattern[j].
How to build the table
Everything else below is just how to build the table.
Construct a table showing where to reset j to
- If mismatch string[i] != pattern[0], just move string to i+1, j = 0
- If mismatch string[i] != pattern[1], we leave i the same, j = 0
  pattern = 10
  string = ... 1100000
- If mismatch string[i] != pattern[2], we leave i the same, and change j, but we need to consider repeats in pattern[0] .. pattern[1]
  pattern = 110
  string = ... 11100000
  i stays same, j goes from 2 back to 1
  pattern = 100
  string = ... 10100000
  i stays same, j goes from 2 back to 0
- If mismatch string[i] != pattern[j], we leave i the same, and change j, but we need to consider repeats in pattern[0] .. pattern[j-1]
Given a certain pattern, construct a table showing where to reset j to.
Construct a table of next[j]
For each j, figure out:
next[j] = length of the longest prefix of "pattern[0] .. pattern[j-1]" that matches a suffix of "pattern[1] .. pattern[j]"
That is:
- prefix must include pattern[0]
- suffix must include pattern[j]
- prefix and suffix are different
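When pattern position j+1 is compared with s[k] and they do not match:
set j' = next[j] and compare pattern position j' with s[k]; in effect the pattern slides right so that position j' lands where position j+1 used to be.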
j | 0 | 1 | 2 | 3 | 4 | 5 |
substring 0 to j | A | AB | ABA | ABAB | ABABA | ABABAC |
longest prefix-suffix match | none | none | A | AB | ABA | none |
next[j] | 0 | 0 | 1 | 2 | 3 | 0 |
Note: for j = 0 there is no prefix and suffix that are different, so next[0] = 0 for all patterns.
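The table can be reproduced straight from the definition. The following is a minimal brute-force sketch, not an efficient method; the function name `next_by_definition` is an illustrative choice and does not come from the original text.

```python
def next_by_definition(pattern):
    """Brute-force next[] computed directly from the definition."""
    table = []
    for j in range(len(pattern)):
        best = 0
        # Try prefix lengths 1..j. Length j+1 is excluded because the
        # prefix and the suffix must be different substrings.
        for length in range(1, j + 1):
            prefix = pattern[:length]                  # starts at pattern[0]
            suffix = pattern[j - length + 1 : j + 1]   # ends at pattern[j]
            if prefix == suffix:
                best = length  # lengths grow, so the last hit is the longest
        table.append(best)
    return table


print(next_by_definition("ABABAC"))  # [0, 0, 1, 2, 3, 0]
```

This costs roughly cubic time in the pattern length, so it is only useful for checking a table by hand.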
Given j, let n = next[j]
"pattern[0] .. pattern[n-1]" = "pattern[j-(n-1)] .. pattern[j]"
"pattern[0] .. pattern[next[j]-1]" = "pattern[j-(next[j]-1)] .. pattern[j]"
e.g. j = 4, n = 3,
"pattern[0] .. pattern[2]" = "pattern[2] .. pattern[4]"
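As a quick sanity check, the identity can be tested for every j of the example pattern. This is only a sketch; the helper name `check_prefix_suffix` is hypothetical, and the table values are copied from the ABABAC table above.

```python
def check_prefix_suffix(pattern, table):
    """True if pattern[0..n-1] == pattern[j-(n-1)..j] for every j, n = table[j]."""
    for j, n in enumerate(table):
        # For n = 0 both slices are empty, so the identity holds trivially.
        if pattern[:n] != pattern[j - n + 1 : j + 1]:
            return False
    return True


print(check_prefix_suffix("ABABAC", [0, 0, 1, 2, 3, 0]))  # True
```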
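If the match fails at pattern position j+1 (with pattern[0] .. pattern[j] already matched), keep i the same and reset the pattern position to n = next[j].
The text just before i ends with pattern[j-(n-1)] .. pattern[j], which equals pattern[0] .. pattern[n-1], so those n characters count as already matched.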
e.g. We have matched ABABA so far.
If next one fails, say we have matched ABA so far and then see if next one matches.
That is, keep i same, just reset j to 3 (= precisely length of longest prefix-suffix match)
Then, if match after ABA fails too, by the same rule we say we have matched A so far, reset to j = 1, and try again from there.
In other words, it starts by trying to match the longest prefix-suffix, but if that fails it works down to the shorter ones until exhausted (no prefix-suffix matches left).
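The fallback rule above can be sketched as a search loop. This is one possible arrangement, not necessarily the author's: here `j` counts the characters matched so far, so a failure at pattern position `j` falls back to `table[j - 1]`, which is the same as "fail at j+1, reset to next[j]" in the text. The name `kmp_search` is illustrative, and the table is passed in as a parameter.

```python
def kmp_search(text, pattern, table):
    """Return the start index of the first match of pattern in text, or -1."""
    if not pattern:
        return 0  # assumption: an empty pattern matches at position 0
    i = 0  # index into text; it never moves backwards
    j = 0  # number of pattern characters matched so far
    while i < len(text):
        if text[i] == pattern[j]:
            i += 1
            j += 1
            if j == len(pattern):
                return i - j  # full match; report where it started
        elif j > 0:
            # Keep i the same and retry with the next shorter prefix-suffix match.
            j = table[j - 1]
        else:
            # Nothing matched at all: move on to the next text character.
            i += 1
    return -1


table = [0, 0, 1, 2, 3, 0]  # next[] for "ABABAC", taken from the table above
print(kmp_search("ABABABAC", "ABABAC", table))  # 2
```

In the usage line, ABABA is matched first, the sixth character fails, j is reset to 3 with i unchanged, and the match then completes from there.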
Algorithm to construct table of next[j]
pattern[0] ... pattern[m-1]
Here, i and j both index pattern.

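next[0] = 0
i = 1    // first step: i = 1, j = 0
j = 0
while (i < m) {
    if (pattern[i] == pattern[j]) {
        // the current prefix-suffix match grows by one character
        j = j + 1
        next[i] = j
        i = i + 1
    } else if (j > 0) {
        // e.g. [0],[1],[2] match [4],[5],[6], but now [3] != [7];
        // maybe there is another, shorter prefix-suffix match we can shift right to
        j = next[j-1]
        // next[j] is what position j+1 falls back to (treat this as a rule).
        // We use j-1 because only pattern[0] .. pattern[j-1] have matched a prefix;
        // pattern[j] itself has not (drawing a picture makes this clear).
    } else {
        // j is 0 and pattern[0] does not match the character at position i:
        // move i right by one; the pattern index stays at 0
        next[i] = 0
        i = i + 1
    }
}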
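For reference, the construction can be written as runnable Python. This is a minimal sketch that assumes the standard linear-time prefix-table construction; the name `build_next` is an illustrative choice.

```python
def build_next(pattern):
    """Build next[] in linear time; i and j both index the pattern."""
    m = len(pattern)
    table = [0] * m  # next[0] = 0 for every pattern
    i, j = 1, 0
    while i < m:
        if pattern[i] == pattern[j]:
            # The current prefix-suffix match grows by one character.
            j += 1
            table[i] = j
            i += 1
        elif j > 0:
            # Fall back to the next shorter prefix-suffix match; i stays put.
            j = table[j - 1]
        else:
            # No prefix-suffix match ends at position i.
            table[i] = 0
            i += 1
    return table


print(build_next("ABABAC"))  # [0, 0, 1, 2, 3, 0]
```

The output agrees with the next[j] row of the ABABAC table.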