Erlang: A Look Inside ETS

Erlang can represent a collection of data with a list, but accessing elements slows down once the list grows large, mainly because most list operations work by traversal.

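Erlang is designed for soft real-time systems (see http://en.wikipedia.org/wiki/Real-time_computing), so a lookup in a large data set must be not only fast but also constant-time. To provide fast lookups, Erlang offers ETS (Erlang Term Storage) and DETS (Disk Erlang Term Storage). This article covers ETS only.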

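ETS Basics

Lookup time in ETS is constant, except for ordered_set tables, where it is proportional to log N (N being the number of stored objects).

ETS stores data as tuples; the test code below shows the details.

An ETS table is created by a process and is destroyed when that process terminates. Keep this in mind when experimenting in the shell, because an exception restarts the shell process and takes its tables with it. Ownership of a table can be handed to another process with give_away.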
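The ownership rule can be checked with a small experiment. The sketch below is illustrative only: the module name, table names and the handover_note term are made up for this example. One process lets its table die with it, and a second one hands its table to the caller with ets:give_away/3 before exiting.

```erlang
-module(ets_owner_demo).
-export([run/0]).

%% Hypothetical demo; module, table and message names are illustrative.
run() ->
    Parent = self(),
    %% A short-lived process creates a table, reports its id and exits.
    {Pid1, Ref1} = spawn_monitor(fun() ->
        T = ets:new(short_lived, []),
        Parent ! {table, T}
    end),
    T1 = receive {table, Tab1} -> Tab1 end,
    receive {'DOWN', Ref1, process, Pid1, _} -> ok end,
    %% The owner is gone, so the table is gone too.
    Gone = (ets:info(T1) =:= undefined),

    %% A second process gives its table away before exiting.
    {Pid2, Ref2} = spawn_monitor(fun() ->
        T = ets:new(handed_over, []),
        true = ets:insert(T, {key, value}),
        true = ets:give_away(T, Parent, handover_note)
    end),
    %% The new owner is notified with an 'ETS-TRANSFER' message.
    T2 = receive {'ETS-TRANSFER', Tab2, Pid2, handover_note} -> Tab2 end,
    receive {'DOWN', Ref2, process, Pid2, _} -> ok end,
    Survived = (ets:lookup(T2, key) =:= [{key, value}]),
    IsOwner = (ets:info(T2, owner) =:= self()),
    true = ets:delete(T2),
    {Gone, Survived, IsOwner}.
```

Calling ets_owner_demo:run() from the shell should show that the first table disappears with its creator, while the second one survives and is now owned by the caller.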

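The number of ETS tables on one Erlang node is limited. The default is 1400 tables, and the limit can be raised by setting the ERL_MAX_ETS_TABLES environment variable before starting the node. This applies to the OTP release used here (Eshell V5.9); later releases lifted the fixed limit. The performance tuning notes on the ejabberd community site mention this setting:

http://www.ejabberd.im/tuning

ETS tables are not garbage collected. A table lives until its owner process dies; tables and objects can also be removed explicitly with delete.

In the current implementation, every insert and lookup creates a copy of the object. Insert and lookup times for set, bag and duplicate_bag tables are constant and independent of table size.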

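Concurrency: every update to a single object is guaranteed to be atomic and isolated. An update either succeeds completely or fails without effect, and no other process can observe an intermediate result. Some functions extend this atomicity and isolation guarantee to operations on several objects.

In database terms this isolation level is called serializable: the isolated operations behave as if they were carried out one after another in a strict order.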
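One way to see the single-object guarantee at work is a shared counter. The sketch below (module and table names are made up for this example) lets several processes increment the same object with ets:update_counter/3; because each increment is atomic, no update should be lost.

```erlang
-module(ets_counter_demo).
-export([run/0]).

%% Hypothetical demo; names and sizes are illustrative.
run() ->
    T = ets:new(counters, [set, public]),
    true = ets:insert(T, {hits, 0}),
    Parent = self(),
    Workers = 10,
    PerWorker = 1000,
    %% Every worker increments the same object PerWorker times.
    Pids = [spawn_link(fun() ->
                [ets:update_counter(T, hits, 1) || _ <- lists:seq(1, PerWorker)],
                Parent ! {done, self()}
            end) || _ <- lists:seq(1, Workers)],
    %% Wait for all workers before reading the final value.
    [receive {done, Pid} -> ok end || Pid <- Pids],
    [{hits, Total}] = ets:lookup(T, hits),
    true = ets:delete(T),
    Total.
```

With 10 workers and 1000 increments each, the returned total is 10000. A read-modify-write done with separate lookup and insert calls would not give that guarantee.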

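During a traversal, safe_fixtable guarantees that the traversal does not fail and that every object is visited exactly once. Element-by-element traversal is rare, and cases that need safe_fixtable are rarer still, but the mechanism is very useful. I still remember how painful it was, in the .NET version, to iterate over the list of online players: because players kept logging in and out, exceptions there were almost unavoidable. select and match use safe_fixtable internally.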
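A minimal sketch of such a traversal, assuming a made-up players table: the table is fixed with ets:safe_fixtable/2, walked with ets:first/1 and ets:next/2, and objects are deleted along the way, which is exactly the case where fixing matters.

```erlang
-module(ets_fix_demo).
-export([run/0]).

%% Hypothetical demo; table contents are illustrative.
run() ->
    T = ets:new(players, [set, public]),
    [ets:insert(T, {Id, online}) || Id <- lists:seq(1, 100)],
    true = ets:safe_fixtable(T, true),
    Count = try
                walk(T, ets:first(T), 0)
            after
                %% Always release the fix, even if the walk fails.
                ets:safe_fixtable(T, false)
            end,
    true = ets:delete(T),
    Count.

walk(_T, '$end_of_table', Count) ->
    Count;
walk(T, Key, Count) ->
    %% Deleting the current key is safe while the table is fixed:
    %% ets:next/2 can still continue from it.
    true = ets:delete(T, Key),
    walk(T, ets:next(T, Key), Count + 1).
```

The function returns the number of keys visited, which is 100 here: every key is seen once even though the table shrinks during the walk.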

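Inspecting ETS Tables

Erlang ships a graphical ETS viewer, the Table Visualizer, started with tv:start() (tv was later removed from OTP; in newer releases observer:start() offers a table viewer). The interface is simple. Notably, the tool can inspect ETS tables on other nodes: the File menu has a nodes entry that lists the nodes connected to the current one, and clicking a node shows the ETS tables on it.

How do we inspect ETS without a graphical tool? This is the more common situation: when working on a server in text mode, the Table Visualizer cannot be used at all. The following commands give the same information: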

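ets:all() % list all ETS tables

ets:i() % print an overview of all ETS tables: type, number of objects, memory used, owner

ets:i(zen_ets) % print the contents of zen_ets; I find this handier and quicker than tv, and it pages the output when the table is large

ets:info(zen_ets) % detailed information about a single table. If you suspect the table is fixed, check ets:info(zen_ets,fixed); ets:info(zen_ets,safe_fixed) gives more detail, which makes it easier to find the module responsible.

ets:member(Tab, Key) -> true | false % whether the table contains an object with key Key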

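Creating and Deleting ETS Tables, Inserting Data

As mentioned above, ETS stores data as tuples. Let's write some test code to walk through the common ETS operations:

% Quickly create an ETS table and fill it with data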

T = ets:new(x,[ordered_set]).

[ ets:insert(T,{N}) || N <- lists:seq(1,10) ].

TableID = ets:new(temp_table , []), %Create New ETS Table

ets:insert(TableID,{1,2} ), % insert one Item to Table

Result= ets:lookup(TableID ,1),

io:format("ets:lookup(TableID ,1) Result: ~p ~n " ,[ Result ]),

ets:insert(TableID,{1,3} ),

Result2 = ets:lookup(TableID, 1 ),

io:format("ets:lookup(TableID ,1) Result2: ~p ~n ", [ Result2 ]),

ets:delete(TableID),

BagTableID = ets:new(temp_table, [bag]),

ets:insert(BagTableID,{1,2} ),

ets:insert(BagTableID,{1,3} ),

ets:insert(BagTableID,{1,4} ),

%Note that the time order of object insertions is preserved;

%The first object inserted with the given key will be first in the resulting list, and so on.

Result3 = ets:lookup(BagTableID, 1 ),

io:format("ets:lookup(BagTableID ,1) Result3: ~p ~n ", [ Result3 ]),

% Create an ETS table; note the named_table option, which lets us refer to the table by the atom countries

ets:new(countries, [bag,named_table]),

% Insert a few rows

ets:insert(countries,{yves,france,cook}),

ets:insert(countries,{sean,ireland,bartender}),

ets:insert(countries,{marco,italy,cook}),

ets:insert(countries,{chris,ireland,tester}).

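I don't understand why people would rather take someone else's word for it than write a bit of test code themselves. If the other person is right, fine, but what if they are wrong? Quickly building a test model to verify an idea is a skill that grows stronger with every experiment. For example, when someone asks "does an ETS insert add a new object every time, or update the existing one?", the Erlang shell can answer it (the table created below defaults to type set):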

Eshell V5.9  (abort with ^G)
1> ets:new(test,[named_table]).
test
2> [ets:insert(test,{Item}) || Item <-[1,2,3,4,5,6]].
[true,true,true,true,true,true]
3> [ets:insert(test,{Item}) || Item <-[1,2,3,4,5,6]].
[true,true,true,true,true,true]
4> ets:i(test).
<1   > {5}
<2   > {3}
<3   > {2}
<4   > {1}
<5   > {4}
<6   > {6}
EOT  (q)uit (p)Digits (k)ill /Regexp -->q

ok
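The same kind of experiment can be repeated for every table type. The sketch below (module and table names are made up for this example) inserts {k,1} twice and {k,2} once into a fresh table of each type and reports how many objects remain.

```erlang
-module(ets_types_demo).
-export([run/0]).

%% Hypothetical demo; names are illustrative.
count_after_inserts(Type) ->
    T = ets:new(types_demo, [Type]),
    ets:insert(T, {k, 1}),
    ets:insert(T, {k, 1}),   % exact duplicate of the first object
    ets:insert(T, {k, 2}),   % same key, different value
    Size = ets:info(T, size),
    ets:delete(T),
    Size.

run() ->
    [{Type, count_after_inserts(Type)}
     || Type <- [set, ordered_set, bag, duplicate_bag]].
```

set and ordered_set keep one object per key, so the last insert wins; bag keeps distinct objects with the same key; duplicate_bag keeps everything. The result is [{set,1},{ordered_set,1},{bag,2},{duplicate_bag,3}].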

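Paging Through Data in ETS

Sometimes a query matches a large amount of data, and fetching all of it at once makes processing very slow. One approach is to process it in batches, which requires fetching data from the ETS table in several steps, much like paging on a web page. The ets module provides a family of functions for this; we use match as the example: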

match(Tab, Pattern, Limit) -> {[Match],Continuation} | '$end_of_table'

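The Limit argument caps the number of results per query. If more objects match than Limit allows, the result is {[Match],Continuation}, where Match is the result set and Continuation, presumably, carries the paging state. To fetch the next page, call: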

match(Continuation) -> {[Match],Continuation} | '$end_of_table'

Let's look at paged queries in a demo, in particular at the structure of Continuation. First we fill in some test data:

% Record layout inferred from the match spec output at the end of this article
-record(t, {id, item, name, iabn, age}).

ets:new(zen_ets, [{keypos, #t.id}, named_table, public, set]),

% 30 rows; the numeric field values are placeholders, only the ids need to be distinct
[ets:insert(zen_ets,#t{id=N,item=N,name="hello",iabn=N,age=N}) || N <- lists:seq(1,30)],

% Print every row; the accumulator is passed through unchanged
ets:foldl(fun(A,AC)-> io:format("Data:~p~n",[A]), AC end ,0,zen_ets).

With 10 rows per page we run the query four times:

{M,C}=ets:match(zen_ets,'$1',10). % page 1

{M2,C2} = ets:match(C). % page 2

{M3,C3} = ets:match(C2). % page 3

{M4,C4} = ets:match(C3). % no data left; what error do we get?

The results of these calls:

(zen_latest@192.168.1.188)> {M,C}=ets:match(zen_ets,'$1',10).

{[[{t,...,"hello",...}],
  ...],                        % 10 rows in total
 {zen_ets,...,<<>>,[],...}}

(zen_latest@192.168.1.188)> {M2,C2} = ets:match(C).

{[[{t,...,"hello",...}],
  ...],                        % 10 rows in total
 {zen_ets,...,<<>>,[],...}}

(zen_latest@192.168.1.188)> {M3,C3} = ets:match(C2).

{[[{t,...,"hello",...}],
  ...],                        % 10 rows in total
 '$end_of_table'}

(zen_latest@192.168.1.188)> {M4,C4} = ets:match(C3).

** exception error: no match of right hand side value '$end_of_table'

(zen_latest@192.168.1.188)>

Similar functions:

match_object(Tab, Pattern, Limit) -> {[Match],Continuation} | '$end_of_table'

match_object(Continuation) -> {[Match],Continuation} | '$end_of_table'

select(Tab, MatchSpec, Limit) -> {[Match],Continuation} | '$end_of_table'

select(Continuation) -> {[Match],Continuation} | '$end_of_table'

To get only the number of matching objects: select_count(Tab, MatchSpec) -> NumMatched
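Instead of binding {M,C} and running into a badmatch on the last call, a paging loop can pattern match on both possible results. The sketch below uses ets:select/3 and ets:select/1; the module name, the pages/2 helper and the test table are made up for this example, and it assumes nobody modifies the table while it is being paged.

```erlang
-module(ets_page_demo).
-export([run/0, pages/2]).

%% Hypothetical helper: returns a list of pages, each holding at most
%% PageSize whole objects from Tab.
pages(Tab, PageSize) ->
    %% Match every object and return it unchanged ('$_').
    MatchSpec = [{'_', [], ['$_']}],
    collect(ets:select(Tab, MatchSpec, PageSize), []).

%% '$end_of_table' ends the loop; it is also what select/1 returns
%% when it is handed '$end_of_table' as the continuation.
collect('$end_of_table', Acc) ->
    lists:reverse(Acc);
collect({Page, Continuation}, Acc) ->
    collect(ets:select(Continuation), [Page | Acc]).

run() ->
    T = ets:new(page_demo, [set]),
    [ets:insert(T, {N, "hello"}) || N <- lists:seq(1, 30)],
    Pages = pages(T, 10),
    ets:delete(T),
    %% Report only the page sizes.
    [length(P) || P <- Pages].
```

ets_page_demo:run() returns the size of each page; the sizes add up to 30 and no page exceeds the limit of 10. If other processes may insert or delete during paging, wrapping the loop in safe_fixtable is the usual precaution.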

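Querying ETS with Match Specifications

Matching with match is the simplest form: '$N' (N being a number) is a placeholder and '_' is a wildcard. What does the number in '$N' mean?

As the code below shows, the numbers control the order of the output: the relative size of the numbers determines the relative position in each result.

%'_' is the wildcard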
A= ets:match(countries, {'$1','_','_' } ) ,
io:format(" ets:match(countries, {'$1','_','_' } ) Result : ~p ~n " ,[ A ]),
B= ets:match(countries , {'$1', '$0' ,'_' } ),
io:format(" ets:match(countries , {'$1', '$0' ,'_' } ), Result : ~p ~n " ,[ B ]),
C= ets:match(countries , {'$11', '$9' ,'_' } ),
io:format(" C= ets:match(countries , {'$11', '$9' ,'_' } ), Result : ~p ~n " ,[ C ]),
D= ets:match(countries , {'$11', '$99' ,'_' } ),
io:format(" ets:match(countries , {'$11', '$99' ,'_' } ), Result : ~p ~n " ,[ D ]),
E= ets:match(countries , {'$101', '$9' ,'_' } ),
io:format("ets:match(countries , {'$101', '$9' ,'_' } ), Result : ~p ~n " ,[ E ]),
F= ets:match(countries,{'$2',ireland,'_'}),
G= ets:match(countries,{'_',ireland,'_'}), % [[],[]] without a numbered placeholder nothing is extracted; each match yields an empty list
H= ets:match(countries,{'$2',cook,'_'}),
I= ets:match(countries,{'$0','$1',cook}),
J= ets:match(countries,{'$0','$0',cook}),

If you need all fields, that is, the whole object, use match_object directly:

K= ets:match_object(countries,{'_',ireland,'_'}),

io:format(" ets:match_object(countries,{'_',ireland,'_'}), Result : ~p ~n " ,[ K ]),

L= ets:match(countries ,'$1' ),

io:format(" ets:match(countries ,'$1' ), Result: ~p ~n " ,[ L ]),

Result=ets:match_delete(countries,{'_','_',cook}),

io:format("ets:match_delete(countries,{'_','_',cook}), Result : ~p ~n " ,[ Result ]),

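The countries structure above is simple, but what about one with a few more fields? It is easy to end up with code like ets:match(zen_ets, {'$1','_','_','_','_','_' } ), which is hard to read and breaks as soon as the field order changes. The fix was already covered in the post "[Erlang 0006] Erlang中的record与宏" (records and macros in Erlang): using a record avoids problems with added, removed or reordered tuple fields.

For example: ets:match_delete(zen_ets, #t{age=24,iabn=1,_='_'}),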
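A self-contained sketch of the record-based pattern follows. The record fields follow the #t record used in this article, while the module name, table name and field values are made up for illustration.

```erlang
-module(ets_record_demo).
-export([run/0]).

%% Field names follow the article's #t record; values are illustrative.
-record(t, {id, item, name, iabn, age}).

run() ->
    T = ets:new(record_demo, [set, {keypos, #t.id}]),
    ets:insert(T, [#t{id = 1, item = a, name = "hello", iabn = 1, age = 24},
                   #t{id = 2, item = b, name = "world", iabn = 2, age = 30}]),
    %% Every field not listed explicitly becomes the '_' wildcard,
    %% so the pattern keeps working if fields are added or reordered.
    Pattern = #t{age = 24, _ = '_'},
    Matched = ets:match_object(T, Pattern),
    true = ets:match_delete(T, Pattern),
    Remaining = ets:info(T, size),
    ets:delete(T),
    {[M#t.id || M <- Matched], Remaining}.
```

The call returns {[1], 1}: the record with age 24 is matched and deleted, and one record remains.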

Sometimes we need to express more complex conditions, and that is where match specifications come in. ets:fun2ms relies on the ms_transform parse transform, so first add -include_lib("stdlib/include/ms_transform.hrl"). to the module header. Match specifications are described in detail here: http://www.erlang.org/doc/apps/erts/match_spec.html

MS = ets:fun2ms(fun({ Name,Country , Position } ) when Position /=cook -> [Country,Name ] end ),

MSResult = ets:select(countries, MS ),

io:format("ets:fun2ms(fun({ Name,Country , Position } ) when Position /=cook -> [Country,Name ] end ), MSResult:~p~n " , [MSResult ]),

MS2 =ets:fun2ms(fun(Data ={Name, Country ,Position } ) when Position /=cook -> Data end ),

MSResult2 = ets:select(countries , MS2),

io:format("ets:fun2ms(fun(Data ={Name, Country ,Position } ) when Position /=cook -> Data end ), Result : ~p ~n " ,[ MSResult2 ]),

% With a plain tuple, the fun head must match the whole tuple

MS3 = ets:fun2ms(fun(Data ={Name, Country ,Position } ) when Position /=cook -> Data end ),

MSResult2 = ets:select(countries , MS3),

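In practice we ran into this question: are MS and MS2 below equivalent? The difference is the _='_' part in ets:fun2ms(fun(#t{id =ID , name =Name, _='_' } ) when ID >30 -> Name end ). Run the code below and you will see that both generate the same match specification.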

MS = ets:fun2ms(fun(#t{id =ID , name =Name } ) when ID >30 -> Name end ),

io:format(" ets:fun2ms(fun(#t{id =ID , name =Name } ) when ID >30 -> Name end ), MS: ~p ~n " , [ MS ]),

MS2 = ets:fun2ms(fun(#t{id =ID , name =Name, _='_' } ) when ID >30 -> Name end ),

io:format(" ets:fun2ms(fun(#t{id =ID , name =Name, _='_' } ) when ID >30 -> Name end ), MS2: ~p ~n " ,[ MS2 ]),

io:format("MS==MS2 ? Result : ~p ~n " , [ MS==MS2 ]),

MSResult = ets:select(zen_ets , MS ),

One more special case when using match specifications: how do we return the whole record? A careful read of the ETS documentation turns up this sentence: The return value is constructed using the "match variables" bound in the MatchHead or using the special match variables '$_' (the whole matching object) and '$$' (all match variables in a list), so that the following ets:match/2 expression:

http://www.erlang.org/doc/apps/erts/match_spec.html in turn has this rule:

ExprMatchVariable ::= MatchVariable (bound in the MatchHead) | '$_' | '$$'

In other words, '$_' is all we need. A quick experiment: MS3 = ets:fun2ms(fun(T=#t{id =ID , name =Name, _='_' } ) when ID >30 -> T end ) generates this match specification:

MS3: [{{t, '$1', '_','$2', '_', '_'}, [{'>', '$1', 30}],['$_']}]
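A match specification of this shape can also be written by hand, which is convenient where the ms_transform parse transform is not available. The sketch below is illustrative: the module name, table name and field values are made up, and an ordered_set is used only to make the result order predictable.

```erlang
-module(ets_ms_demo).
-export([run/0]).

%% Field names follow the article's #t record; values are illustrative.
-record(t, {id, item, name, iabn, age}).

run() ->
    T = ets:new(ms_demo, [ordered_set, {keypos, #t.id}]),
    [ets:insert(T, #t{id = N, item = N, name = "hello", iabn = N, age = N})
     || N <- lists:seq(28, 33)],
    %% {MatchHead, Guards, Result}: bind id to '$1', require '$1' > 30,
    %% and return the whole object with '$_'.
    Head = #t{id = '$1', _ = '_'},
    Guard = [{'>', '$1', 30}],
    Records = ets:select(T, [{Head, Guard, ['$_']}]),
    %% select_count/2 counts the objects for which the body returns true.
    Count = ets:select_count(T, [{Head, Guard, [true]}]),
    ets:delete(T),
    {[R#t.id || R <- Records], Count}.
```

The call returns {[31,32,33], 3}: the three records with an id above 30 come back whole, and select_count reports the same number.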

Further reading:

The 2003 paper on the implementation and performance of ETS tables:

A Study of Erlang ETS Table Implementation and Performance.

Scott Lystig Fritchie.
Second ACM SIGPLAN Erlang Workshop.
Uppsala, Sweden, August 29, 2003.
