11 Hash tables 

 

Many applications require a dynamic set that supports only the dictionary operations INSERT, SEARCH, and DELETE.

For example, a compiler that translates a programming language maintains a symbol table, in which the keys of elements are arbitrary character strings corresponding to identifiers in the language.

 

A hash table generalizes the simpler notion of an ordinary array. Directly ad- dressing into an ordinary array makes effective use of our ability to examine an arbitrary position in an array in O(1) time. 

 

一个hash表是由一个简单地普通列扩展而来的。
直接寻址
使得一个普通的列
能够
检查任意位置在o(1)时间。

 

11.1 Direct- address tables 

Direct addressing is a simple technique that works well when the universe U of keys is reasonably small.

直接寻址是一个简单地技术。

关键字
的区间
恰当小时,工作正常。

 

 

To represent the dynamic set, we use an array, or direct-address table, denoted by T [0.. m-1], in which each position, or slot, corresponds to a key in the uni- verse U .

 

11.2 Hash tables 

 

With direct addressing, an element with key k is stored in slot k. With hashing, this element is stored in slot h.k/; that is, we use a hash function h to compute the slot from the key k.

Here ,h maps the universe U of keys into the slots of a hash table T[0.. m-1]:

h:U->{0,1,…,m-1}

 

We say that an element with key k hashes to slot h.k/; we also say that h.k/ is the hash value of key k.

 

 

 

There is one hitch: two keys may hash to the same slot. We call this situation a collision.

 

Collision resolution by chaining

In chaining, we place all the elements that hash to the same slot into the same linked list

 

The dictionary operations on a hash table T are easy to implement when collisions are resolved by chaining 

 

11.3 Hash functions 

In this section ,we discuss some issues regrading the design of good hash functions and then  present three schemes for their creation . Two of the schemes ,hashing by division and hashing by multiplication ,are heuristic in nature ,whereas  the third scheme ,universal hashing ,uses randommization to provide provably good performance .

11.3.1 The division method

h(k)=k mod m 

11.3.2 The multiplication method 

The multiplication method for creating hash functions operates in two steps. First, we multiply the key k by a constant A in the range 0 < A < 1 and extract the

fractional part of kA. Then, we multiply this value by m and take the floor of the result. In short, the hash function is

这两个函数是最简单的。

hash表最重要的就是找到一个哈希函数,让其尽量的避免或少点collision。

11.4 Open addressing 

 

In open addressing ,all elements occupy the hash table itself . That is ,each table entry contains either an element of the dynamic set or nil.When searching for an element, we systematically examine table slots until either we find the desired element or we have ascertained that the element is not in the table. No lists and

no elements are stored outside the table, unlike in chaining.

 

The advantage of open addressing  is that  it avoids pointers altogether . 

开放地址的好处是完全避免了指针。

 

Instead of following pointers, we compute the sequence of slots to be examined.

 

代替指针的是我们计算slots 的序列来检查。

 

The hash function becomes :

With open addressing ,we require that for every key k, the probe sequence 

be a permutation of <0,1,,..m-1>

 

The algorithm for searching for key k probes the same sequence of slots that the insertion algorithm examined when key k was inserted.

 

 

 

In our analysis ,we assume uniform hashing : the probe sequence of each key is equally likely to be any of the m!permutations of <0,1,2,m-1>.

We will examine three commonly used techniques to compute the probe sequences required for open addressing : linear probing ,quadratic probing ,and double hashing .

线性探针,平方探针,双重探针。

 

Linear probing 

Given an ordinary hash function h':u->{0,1,m-1} ,which we refer to as an auxiliary hash function ,the method of linear probing uses the hash function 

 

 

Double hashing 

Double hashing offers one of the best methods available for open addressing be- cause the permutations produced have many of the characteristics of randomly chosen permutations. Double hashing uses a hash function of the form

 

 

 

 

 

 

 

11 Hash tables的更多相关文章

  1. Hash Tables

    哈希表 红黑树实现的符号表可以保证对数级别的性能,但我们可以做得更好.哈希表实现的符号表提供了新的数据访问方式,插入和搜索操作可以在常数时间内完成(不支持和顺序有关的操作).所以,在很多情况下的简单符 ...

  2. 数据库(11)-- Hash索引和BTree索引 的区别

    索引是帮助mysql获取数据的数据结构.最常见的索引是Btree索引和Hash索引. 不同的引擎对于索引有不同的支持:Innodb和MyISAM默认的索引是Btree索引:而Mermory默认的索引是 ...

  3. Hash Tables and Hash Functions

    Reference: Compuer science Introduction: This computer science video describes the fundamental princ ...

  4. Javascript: hash tables in javascript

    /** * Copyright 2010 Tim Down. * * Licensed under the Apache License, Version 2.0 (the "License ...

  5. Hash Table Performance in R: Part I(转)

    What Is It? A hash table, or associative array, is a well known key-value data structure. In R there ...

  6. Effective Java 第三版——11. 重写equals方法时同时也要重写hashcode方法

    Tips <Effective Java, Third Edition>一书英文版已经出版,这本书的第二版想必很多人都读过,号称Java四大名著之一,不过第二版2009年出版,到现在已经将 ...

  7. 06: 字典、顺序表、列表、hash树 实现原理

    算法其他篇 目录: 1.1 python中字典对象实现原理 1.2 顺序表 1.3 python 列表(list) 1.1 python中字典对象实现原理返回顶部   注:字典类型是Python中最常 ...

  8. NoSQL生态系统——hash分片和范围分片两种分片

    13.4 横向扩展带来性能提升 很多NoSQL系统都是基于键值模型的,因此其查询条件也基本上是基于键值的查询,基本不会有对整个数据进行查询的时候.由于基本上所有的查询操作都是基本键值形式的,因此分片通 ...

  9. hash算法总结收集

    hash算法的意义在于提供了一种快速存取数据的方法,它用一种算法建立键值与真实值之间的对应关系,(每一个真实值只能有一个键值,但是一个键值可以对应多个真实值),这样可以快速在数组等条件中里面存取数据. ...

随机推荐

  1. 局域网内PC通过笔记本共享上网

    现实:PC.笔记本都通过网线接在局域网内,局域网无法上网:笔记本有无线网卡,可连WIFI上网. 现在想让PC通过笔记本来共享上网. 步骤: 1.笔记本开启DHCP.方法是开启"服务" ...

  2. Cg入门23: Fragment shader – UV动画(序列帧)

    让动画从1-9循环播放此纹理 watermark/2/text/aHR0cDovL2Jsb2cuY3Nkbi5uZXQv/font/5a6L5L2T/fontsize/400/fill/I0JBQkF ...

  3. Module in powershell

    https://docs.microsoft.com/en-us/powershell/module/powershellget/install-module?view=powershell-6 ht ...

  4. git不同分支局部代码合并 git cherry-pick

    cherry-pick 可以局部代码合并. cherry-pick不仅可以用在不同分支之间,还可以用在同一个分支上. 比如说你在某一个向某个分支中添加了一个功能,后来处于某种原因把它给删除了,然而后来 ...

  5. 路由器的LAN、WAN、WLAN的区别

    路由器的LAN.WAN.WLAN的区别 好多朋友在群内问我路由器如何配置,本来还耐心解答,但是他竟然连LAN.WAN都分不清,瞬间就没了帮助的热情.借这篇经验,小编简单讲一下路由常见的LAN.WAN. ...

  6. 小程序-demo:小程序示例-page/common

    ylbtech-小程序-demo:小程序示例-page/common 1.返回顶部 0.     1. 2. pages/common返回顶部 1. -lib --weui.wxss /*! * we ...

  7. 洛谷P3295 [SCOI2016]萌萌哒(倍增+并查集)

    传送门 思路太妙了啊…… 容易才怪想到暴力,把区间内的每一个数字用并查集维护相等,然后设最后总共有$k$个并查集,那么答案就是$9*10^{k-1}$(因为第一位不能为0) 考虑倍增.我们设$f[i] ...

  8. jquery的validate的用法

    //引入js文件 //jquery 文件 <script src="__PUBLIC__/static/wap/js/jquery.min.js?v=2.1.4">&l ...

  9. Rabbitmq笔记二

    消息何去何从 mandatory 和 immediate 是 channel . basicPublish 方法中的两个参数,它们都有 当消息传递过程中不可达目的地时将消息返回给生产者的功能. 当 m ...

  10. HTML中a标签自动识别电话、邮箱

    HTML中a标签自动识别电话.邮箱 联系电话:<a href="tel:010-88888888">010-88888888</a><br> 联 ...