11 Hash tables
Many applications require a dynamic set that supports only the dictionary operations INSERT, SEARCH, and DELETE.
For example, a compiler that translates a programming language maintains a symbol table, in which the keys of elements are arbitrary character strings corresponding to identifiers in the language.
A hash table generalizes the simpler notion of an ordinary array. Directly addressing into an ordinary array makes effective use of our ability to examine an arbitrary position in an array in O(1) time.
In other words, a hash table grows out of the plain array: direct addressing is what lets us examine any position of an ordinary array in O(1) time.
11.1 Direct-address tables
Direct addressing is a simple technique that works well when the universe U of keys is reasonably small.
In short: direct addressing is a simple technique, and it works well as long as the range of keys is suitably small.
To represent the dynamic set, we use an array, or direct-address table, denoted by T[0..m-1], in which each position, or slot, corresponds to a key in the universe U.
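A minimal sketch of a direct-address table may help here. It assumes the keys are integers in the range 0..m-1; the class name `DirectAddressTable` and the use of `None` for an empty slot are illustrative choices, not part of the text above.

```python
class DirectAddressTable:
    """Dynamic set whose keys come from the universe {0, 1, ..., m-1}."""

    def __init__(self, m):
        # One slot per possible key; None marks an empty slot.
        self.slots = [None] * m

    def insert(self, key, value):
        # The key itself is the index, so this is O(1).
        self.slots[key] = value

    def search(self, key):
        # Returns the stored value, or None if the slot is empty.
        return self.slots[key]

    def delete(self, key):
        self.slots[key] = None


table = DirectAddressTable(10)
table.insert(3, "three")
table.insert(7, "seven")
table.delete(7)
```

Every operation is a single array access, which is why the approach only pays off when the universe of keys is small enough to allocate one slot per key.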
11.2 Hash tables
With direct addressing, an element with key k is stored in slot k. With hashing, this element is stored in slot h(k); that is, we use a hash function h to compute the slot from the key k.
Here, h maps the universe U of keys into the slots of a hash table T[0..m-1]:
h : U -> {0, 1, ..., m-1}
We say that an element with key k hashes to slot h(k); we also say that h(k) is the hash value of key k.
There is one hitch: two keys may hash to the same slot. We call this situation a collision.
Collision resolution by chaining
In chaining, we place all the elements that hash to the same slot into the same linked list.
The dictionary operations on a hash table T are easy to implement when collisions are resolved by chaining.
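The three dictionary operations under chaining can be sketched roughly as follows. This is an illustration under stated assumptions: Python lists stand in for the linked lists, the hash function is Python's built-in `hash` reduced modulo m, and the names `ChainedHashTable` and `_h` are hypothetical.

```python
class ChainedHashTable:
    """Hash table that resolves collisions by chaining."""

    def __init__(self, m=8):
        self.m = m
        # One chain per slot; a Python list stands in for a linked list.
        self.slots = [[] for _ in range(m)]

    def _h(self, key):
        # Assumed hash function: built-in hash reduced modulo m.
        return hash(key) % self.m

    def insert(self, key, value):
        chain = self.slots[self._h(key)]
        for i, (k, _) in enumerate(chain):
            if k == key:
                # Key already present: overwrite its value.
                chain[i] = (key, value)
                return
        chain.append((key, value))

    def search(self, key):
        # Walk only the chain the key hashes to.
        for k, v in self.slots[self._h(key)]:
            if k == key:
                return v
        return None

    def delete(self, key):
        idx = self._h(key)
        self.slots[idx] = [(k, v) for k, v in self.slots[idx] if k != key]


t = ChainedHashTable(m=8)
t.insert(1, "one")
t.insert(9, "nine")   # 9 mod 8 == 1, so this collides with key 1
```

Both keys end up in the chain of slot 1, and a search only has to scan that one chain.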
11.3 Hash functions
In this section, we discuss some issues regarding the design of good hash functions and then present three schemes for their creation. Two of the schemes, hashing by division and hashing by multiplication, are heuristic in nature, whereas the third scheme, universal hashing, uses randomization to provide provably good performance.
11.3.1 The division method
h(k)=k mod m
11.3.2 The multiplication method
The multiplication method for creating hash functions operates in two steps. First, we multiply the key k by a constant A in the range 0 < A < 1 and extract the fractional part of kA. Then, we multiply this value by m and take the floor of the result. In short, the hash function is
h(k) = ⌊m (kA mod 1)⌋
where "kA mod 1" means the fractional part of kA, that is, kA - ⌊kA⌋.
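These two functions are the simplest ones.
The most important thing about a hash table is finding a hash function that avoids collisions, or at least keeps them as few as possible.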
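The division and multiplication methods described above can be sketched as two small functions. Integer keys are assumed; the function names are illustrative, and the default constant A = (√5 - 1)/2 is an assumption taken from a commonly cited suggestion, not something the notes above prescribe.

```python
import math


def hash_division(k, m):
    """Division method: h(k) = k mod m."""
    return k % m


def hash_multiplication(k, m, A=(math.sqrt(5) - 1) / 2):
    """Multiplication method: h(k) = floor(m * (k*A mod 1)).

    A must satisfy 0 < A < 1; the default value is an assumed choice.
    """
    fractional = (k * A) % 1.0       # fractional part of k*A
    return math.floor(m * fractional)


slot_div = hash_division(100, 12)
slot_mul = hash_multiplication(123456, 2 ** 14)
```

With the division method the choice of m matters (a prime not too close to a power of 2 is the usual advice), while the multiplication method is less sensitive to m, so a power of 2 is a convenient choice there.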
11.4 Open addressing
In open addressing, all elements occupy the hash table itself. That is, each table entry contains either an element of the dynamic set or NIL. When searching for an element, we systematically examine table slots until either we find the desired element or we have ascertained that the element is not in the table. No lists and no elements are stored outside the table, unlike in chaining.
The advantage of open addressing is that it avoids pointers altogether.
Instead of following pointers, we compute the sequence of slots to be examined.
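The hash function now takes the probe number (starting from 0) as a second input, so it becomes:
h : U × {0, 1, ..., m-1} -> {0, 1, ..., m-1}
With open addressing, we require that for every key k, the probe sequence
<h(k,0), h(k,1), ..., h(k,m-1)>
be a permutation of <0, 1, ..., m-1>.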
The algorithm for searching for key k probes the same sequence of slots that the insertion algorithm examined when key k was inserted.
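A rough sketch of insertion and search in an open-address table follows. It assumes integer keys, uses `None` for an empty slot, and picks a linear probe sequence h(k, i) = (k mod m + i) mod m purely for illustration; the class and method names are hypothetical, and deletion is left out because it needs extra care in open addressing.

```python
class OpenAddressTable:
    """Open-address hash table storing integer keys directly in the slots."""

    def __init__(self, m):
        self.m = m
        self.slots = [None] * m   # None plays the role of NIL

    def _probe(self, key, i):
        # Assumed probe function: linear probing with h'(k) = k mod m.
        return (key % self.m + i) % self.m

    def insert(self, key):
        # Examine slots in probe order until an empty one is found.
        for i in range(self.m):
            j = self._probe(key, i)
            if self.slots[j] is None:
                self.slots[j] = key
                return j
        raise OverflowError("hash table overflow")

    def search(self, key):
        # Follow the same probe order that insert used for this key.
        for i in range(self.m):
            j = self._probe(key, i)
            if self.slots[j] is None:
                return None       # an empty slot means the key is absent
            if self.slots[j] == key:
                return j
        return None


t = OpenAddressTable(7)
first = t.insert(10)    # 10 mod 7 == 3
second = t.insert(17)   # also hashes to 3, so it moves to the next slot
third = t.insert(24)    # hashes to 3 again
```

Because search retraces the probe order used by insert, it can stop at the first empty slot: had the key been inserted, it would have been placed there or earlier.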
In our analysis, we assume uniform hashing: the probe sequence of each key is equally likely to be any of the m! permutations of <0, 1, ..., m-1>.
We will examine three commonly used techniques to compute the probe sequences required for open addressing: linear probing, quadratic probing, and double hashing.
Linear probing
Given an ordinary hash function h' : U -> {0, 1, ..., m-1}, which we refer to as an auxiliary hash function, the method of linear probing uses the hash function
h(k, i) = (h'(k) + i) mod m
for i = 0, 1, ..., m-1.
Double hashing
Double hashing offers one of the best methods available for open addressing because the permutations produced have many of the characteristics of randomly chosen permutations. Double hashing uses a hash function of the form
h(k, i) = (h1(k) + i h2(k)) mod m
where both h1 and h2 are auxiliary hash functions.
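As a minimal sketch of a double-hashing probe sequence: the first auxiliary function picks the starting slot and the second picks the step size. The concrete choices below, h1(k) = k mod m and h2(k) = 1 + (k mod (m - 1)) with m prime, are assumptions made for illustration, as is the function name.

```python
def double_hash_probe_sequence(k, m):
    """Return the full probe sequence for key k in a table of size m.

    Assumes m is prime, so that any step size in 1..m-1 is relatively
    prime to m and the sequence visits every slot exactly once.
    """
    h1 = k % m                # assumed first auxiliary hash: starting slot
    h2 = 1 + (k % (m - 1))    # assumed second auxiliary hash: step, never 0
    return [(h1 + i * h2) % m for i in range(m)]


seq = double_hash_probe_sequence(14, 13)
```

Two keys that start at the same slot will usually have different step sizes, so their probe sequences diverge after the first probe instead of following each other as they would under linear probing.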