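For an earlier project I wrote a small protobuf interpreter so that the formats the project needed could be generated automatically.

Nowadays, of course, you can follow the guide at the link below, skip the manual parsing, and generate code directly.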

https://developers.google.com/protocol-buffers/docs/reference/cpp-generated

This post describes how to parse the protobuf format and how to collect the symbols that are needed.

The text processing is done with Python 2.7 scripts.

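The program is split into four modules:

expression: parses the format.

symbol: the messages and other objects defined in the protobuf file, together with their hierarchy. At this level nothing protobuf-specific is visible any more.

typecollection: defines the basic types and collects the messages and other objects.

builder: walks the symbols and creates whatever output files are needed, with typecollection serving as the index. It is not demonstrated in this post.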
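Since the builder itself is not shown, the following is only a rough sketch of the idea: walk the collected members of a message and use a type index to pick the target type. The function name `emit_struct`, the struct-style output format, and the small `TYPE_INDEX` table are all assumptions made for illustration and are not the author's builder.

```python
# Hypothetical type index: protobuf type name -> target type name.
TYPE_INDEX = {'string': 'const char*', 'int32': '__int32'}


def emit_struct(name, members, type_index):
    """Render one message as a C-like struct declaration (illustrative format)."""
    lines = ['struct %s {' % name]
    for option, mem_type, mem_text in members:
        # Fall back to the protobuf name when the type is not indexed.
        target = type_index.get(mem_type, mem_type)
        lines.append('    %s %s;  // %s' % (target, mem_text, option))
    lines.append('};')
    return '\n'.join(lines)


# Members are (option, type, name) tuples, as the parser would collect them.
person = [
    ('required', 'string', 'name'),
    ('required', 'int32', 'id'),
    ('optional', 'string', 'email'),
]
print(emit_struct('Person', person, TYPE_INDEX))
```

The point of the sketch is only the division of labour: the symbol side supplies names and members, and the type index decides how each member type is spelled in the output.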

1 Test protobuf file (taken from the Google example)

package tutorial;

message Person {
  required string name = 1;
  required int32 id = 2;
  optional string email = 3;

  enum PhoneType {
    MOBILE = 0;
    HOME = 1;
    WORK = 2;
  }

  message PhoneNumber {
    required string number = 1;
    optional PhoneType type = 2 [default = HOME];
  }

  repeated PhoneNumber phone = 4;
}

message AddressBook {
  repeated Person person = 1;
}

2 The expression implementation: the simplest possible scanning approach, examining the input one word at a time.

# -*- coding: UTF-8 -*-
# pb_expression.py
import pb_symbol


class StringBuffer(object):
    """Holds the path of a .proto file and, after OpenFile(), its text."""

    def __init__(self, src):
        self.src = src
        self.Data = ''

    def OpenFile(self):
        # The with-block closes the file handle once the text is read.
        with open(self.src) as f:
            self.Data = f.read()


class Expression(object):
    """Word-by-word scanner that fills a pb_symbol.Symbol."""

    desc_set = set(['required', 'optional', 'repeated'])
    b_char_set = set('ABCDEFGHIJKLMNOPQRSTUVWXYZ')
    l_char_set = set('abcdefghijklmnopqrstuvwxyz')
    # Digits are stored as characters so they compare equal to input text.
    digit_set = set('0123456789')
    equals_char = '='
    space_char = ' '
    openbrace_char = '{'
    closebrace_char = '}'
    semicolon_char = ';'
    tab_char = chr(9)
    newline_char = chr(10)
    return_char = chr(13)
    slash_char = chr(47)
    # Characters that end a word.
    ctl_char_set = set([openbrace_char, closebrace_char, semicolon_char,
                        equals_char, newline_char, return_char, tab_char,
                        space_char])
    empty_char_set = set([space_char, tab_char, newline_char, return_char])
    symbol_char_set = b_char_set | l_char_set | digit_set
    all_char_set = symbol_char_set | ctl_char_set

    def __init__(self, sBuf, symbol):
        self.Buf = sBuf
        self.index = 0
        self.count = len(self.Buf.Data)
        self.symbol = symbol

    def backup(self):
        return self.index

    def restore(self, prevIndex):
        self.index = prevIndex

    def forwardChar(self):
        if self.index < self.count:
            self.index = self.index + 1

    def backChar(self):
        if self.index > 0:
            self.index = self.index - 1

    def getchar(self):
        # Return the current character and advance, or None at end of input.
        if self.index < self.count:
            char = self.Buf.Data[self.index]
            self.forwardChar()
            return char
        return None

    def skipComment(self):
        # Skip one '//' comment if the cursor is at its start; otherwise
        # leave the cursor where it was.
        bkIndex = self.backup()
        char = self.getchar()
        next_char = self.getchar()
        if char != self.slash_char or next_char != self.slash_char:
            self.restore(bkIndex)
            return
        while 1:
            char = self.getchar()
            # Stop at the end of the line, or at end of input when the
            # comment sits on the last line of the file.
            if char is None or char == self.newline_char:
                return

    def getSpecialChar(self, currentchar):
        # Advance just past the next occurrence of currentchar.
        while 1:
            self.skipComment()
            char = self.getchar()
            if char is None or char == currentchar:
                break
        return char

    def getVisibleChar(self):
        # Return the next character that is not whitespace.
        while 1:
            self.skipComment()
            char = self.getchar()
            if char is None or char not in self.empty_char_set:
                break
        return char

    def getNextword(self):
        # Skip leading control characters, then read up to the next one.
        word = None
        got1st = 0
        while 1:
            self.skipComment()
            char = self.getchar()
            if char is None:
                break
            if got1st == 0:
                if char not in self.ctl_char_set:
                    word = char
                    got1st = 1
            else:
                if char in self.ctl_char_set:
                    # Leave the terminating character for the caller.
                    self.backChar()
                    break
                else:
                    word = word + char
        return word

    def do_enum_item(self, pbEnum):
        # Parses one "NAME = VALUE;" entry.
        memText = self.getNextword()
        self.getSpecialChar(self.equals_char)
        memValue = self.getNextword()
        self.getSpecialChar(self.semicolon_char)
        pbEnum.append_Member(memText, memValue)

    def do_enum_proc(self):
        symbol = self.getNextword()
        pbEnum = pb_symbol.PBEnum(symbol)
        while 1:
            currentIndex = self.backup()
            word = self.getNextword()
            if word is None:
                break
            self.restore(currentIndex)
            self.do_enum_item(pbEnum)
            # A closing brace after the item ends the enum.
            end_char_Index = self.backup()
            char = self.getVisibleChar()
            if char == self.closebrace_char:
                break
            else:
                self.restore(end_char_Index)
        self.symbol.append_enum(pbEnum)

    def do_message_proc(self):
        symbol = self.getNextword()
        pbMsg = pb_symbol.PBMessage(symbol)
        while 1:
            currentIndex = self.backup()
            word = self.getNextword()
            if word is None:
                break
            if word in self.token_set:
                # Nested message/enum: parse it with a child Expression
                # that shares the buffer and continues from our index.
                subSymbol = pb_symbol.Symbol(self.symbol.tpDict,
                                             self.symbol.entity_full_path,
                                             False)
                subSymbol.update_namespace(symbol)
                self.restore(currentIndex)
                subExp = Expression(self.Buf, subSymbol)
                subExp.index = self.index
                subExp.do_expression()
                self.index = subExp.index
                self.symbol.append_symbol(subSymbol)
                pbMsg.enableSymbol = 1
            else:
                if word in self.desc_set:
                    # Field: "<option> <type> <name> = <tag> ...;"
                    memType = self.getNextword()
                    memText = self.getNextword()
                    pbMsg.append_Member(word, memType, memText)
                    self.getSpecialChar(self.semicolon_char)
            # A closing brace at this point ends the message.
            end_char_Index = self.backup()
            char = self.getVisibleChar()
            if char == self.closebrace_char:
                break
            else:
                self.restore(end_char_Index)
        self.symbol.append_message(pbMsg)

    def do_import_proc(self):
        # Imports are ignored: skip to the end of the statement.
        self.getSpecialChar(self.semicolon_char)

    def do_package_proc(self):
        word = self.getNextword()
        self.symbol.update_namespace(word)
        self.getSpecialChar(self.semicolon_char)

    # Keyword dispatch table; it must follow the functions it refers to.
    token_set = {'message': do_message_proc,
                 'enum': do_enum_proc,
                 'import': do_import_proc,
                 'package': do_package_proc}

    def do_expression(self):
        while 1:
            current_index = self.backup()
            token = self.getNextword()
            if token is None:
                break
            if token in self.token_set:
                proc = self.token_set[token]
                proc(self)
            else:
                # Not a keyword: hand the word back to the caller.
                self.restore(current_index)
                break
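The scanning idea can be tried in isolation. The sketch below is a simplified, standalone variant and not the `Expression` class itself: the names `CONTROL_CHARS` and `iter_words` are made up for this example. It treats the same control characters as word separators and drops `//` comments.

```python
# Characters that end a word, mirroring ctl_char_set in the scanner.
CONTROL_CHARS = set('{};= \t\r\n')


def iter_words(text):
    """Yield the words in text, skipping separators and // comments."""
    word = []
    i = 0
    n = len(text)
    while i < n:
        if text.startswith('//', i):
            # A comment ends the current word and runs to the end of the line.
            if word:
                yield ''.join(word)
                word = []
            newline = text.find('\n', i)
            i = n if newline == -1 else newline + 1
            continue
        ch = text[i]
        if ch in CONTROL_CHARS:
            if word:
                yield ''.join(word)
                word = []
        else:
            word.append(ch)
        i += 1
    if word:
        yield ''.join(word)


sample = "message Person { // a person\n required string name = 1;\n}"
print(list(iter_words(sample)))
# ['message', 'Person', 'required', 'string', 'name', '1']
```

Unlike this sketch, the real scanner keeps a cursor that can be saved and restored, which is what lets a message parser peek at a word and then hand it to a nested parser.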

3 symbol: defines the object types and their hierarchy

# -*- coding: UTF-8 -*-
# pb_symbol.py
# Note: under Python 2.7, print(a, b, c) prints a tuple, which is the form
# shown in the output section.


class PBEntity(object):
    """Base class for everything stored in the type collection."""

    def __init__(self, entName, rtname):
        self.entName = entName
        self.orgName = entName
        self.rtname = rtname

    def outputDebug(self, ns=''):
        pass

    def create_impl(self, entity_indent, top_ns):
        batch_list = list()
        return batch_list

    def mem_include(self, entName):
        return False


class PBMessageMember(object):
    def __init__(self, option, memType, memText):
        self.option = option
        self.memType = memType
        self.memText = memText

    def outputDebug(self):
        print(self.option, self.memType, self.memText)

    @property
    def mem_option(self):
        return self.option

    @property
    def mem_type(self):
        return self.memType

    @property
    def mem_text(self):
        return self.memText


class PBMessage(PBEntity):
    def __init__(self, entName):
        PBEntity.__init__(self, entName, entName)
        self.members = []
        self.enableSymbol = 0
        self.rt_ns = ''
        self.tpDict = None

    @property
    def Members(self):
        return self.members

    def attach_tp_dict(self, tpDict):
        self.tpDict = tpDict

    def append_Member(self, option, memType, memText):
        msgMem = PBMessageMember(option, memType, memText)
        self.members.append(msgMem)

    def enable_Symbol(self, enable):
        self.enableSymbol = enable

    def outputDebug(self, ns):
        print(ns, 'message', self.entName)
        for entMsg in self.members:
            entMsg.outputDebug()
        print('')

    def set_rt_ns(self, rt_entity_full_path):
        self.rt_ns = rt_entity_full_path

    def mem_include(self, entName):
        # True if any member is of type entName.
        for entMsg in self.members:
            if entName == entMsg.memType:
                return True
        return False

    def detect_request(self):
        # list.count is a method, so the length has to come from len().
        return len(self.members) > 0


class PBEnumMember(object):
    def __init__(self, memText, memValue):
        self.memText = memText
        self.memValue = memValue

    def outputDebug(self):
        print(self.memText, self.memValue)


class PBEnum(PBEntity):
    def __init__(self, entName):
        PBEntity.__init__(self, entName, entName)
        self.members = []

    def append_Member(self, memText, memValue):
        msgMem = PBEnumMember(memText, memValue)
        self.members.append(msgMem)

    def outputDebug(self, ns):
        print(ns, 'enum', self.entName)
        for entEnum in self.members:
            entEnum.outputDebug()
        print('')


class Symbol(object):
    """One scope: the file itself, or the body of a message."""

    def __init__(self, tpDict, fullpath, rooted):
        self.namespace = ''
        self.tpDict = tpDict
        self.rooted = rooted
        self.entity_full_path = fullpath
        self.rt_entity_full_path = fullpath
        self.entitylist = []
        self.containerlist = []

    def update_namespace(self, namespace):
        self.namespace = namespace
        if not self.rooted:
            if self.entity_full_path == '':
                self.entity_full_path = namespace
                self.rt_entity_full_path = namespace
            else:
                # Extend each path from its own previous value so the
                # namespace is appended only once (the original built the
                # rt path from the already updated entity_full_path).
                self.rt_entity_full_path = '%s_%s' % (
                    self.rt_entity_full_path, namespace)
                self.entity_full_path = '%s_%s' % (
                    self.entity_full_path, namespace)

    def append_type_dict(self, entity, isMsg):
        if self.entity_full_path == '':
            rtType = entity.rtname
            orgType = entity.rtname
        else:
            rtType = '%s::%s' % (self.rt_entity_full_path, entity.rtname)
            orgType = '%s::%s' % (self.entity_full_path, entity.rtname)
        # Messages are registered with an empty original type; enums keep it.
        if isMsg:
            orgType = ''
        self.tpDict.insert_type(entity.entName, rtType, entity, orgType)

    def append_message(self, msg):
        self.entitylist.append(msg)
        self.containerlist.append(msg)
        msg.attach_tp_dict(self.tpDict)
        if self.rt_entity_full_path == '':
            msg.set_rt_ns(self.rt_entity_full_path)
        else:
            msg.set_rt_ns(self.rt_entity_full_path + '_')
        self.append_type_dict(msg, True)

    def append_enum(self, enum):
        self.entitylist.append(enum)
        self.append_type_dict(enum, False)

    def append_symbol(self, symbol):
        self.entitylist.append(symbol)
        self.containerlist.append(symbol)

    def outputDebug(self, ns):
        for entity in self.entitylist:
            entity.outputDebug(ns + '::' + self.namespace)

    def query_entitylist(self):
        return self.entitylist

    def query_containerlist(self):
        return self.containerlist

    def query_pb_ns(self):
        return self.namespace

    def mem_include(self, entName):
        for entity in self.entitylist:
            if entity.mem_include(entName):
                return True
        return False


class PBProxy(object):
    """Read-only wrapper that forwards attribute access to an entity."""

    def __init__(self, entity):
        self.entity = entity

    def mem_include(self, entName):
        return self.entity.mem_include(entName)

    def create_impl(self, entity_indent, top_ns):
        return self.entity.create_impl(entity_indent, top_ns)

    def detect_request(self):
        return self.entity.detect_request()

    @property
    def enableSymbol(self):
        return self.entity.enableSymbol

    @property
    def entName(self):
        return self.entity.entName

    @property
    def rtname(self):
        return self.entity.rtname

    @property
    def orgName(self):
        return self.entity.orgName

    @property
    def members(self):
        return self.entity.members

    @property
    def Members(self):
        return self.entity.members

    @property
    def rt_ns(self):
        return self.entity.rt_ns

    @property
    def namespace(self):
        return self.entity.namespace

    @property
    def rooted(self):
        return self.entity.rooted

    @property
    def entity_full_path(self):
        return self.entity.entity_full_path

    @property
    def rt_entity_full_path(self):
        return self.entity.rt_entity_full_path

    @property
    def entitylist(self):
        return self.entity.entitylist

    @property
    def containerlist(self):
        return self.entity.containerlist

    @property
    def tpDict(self):
        return self.entity.tpDict

    @property
    def mem_option(self):
        return self.entity.mem_option

    @property
    def mem_type(self):
        return self.entity.mem_type

    @property
    def mem_text(self):
        return self.entity.mem_text
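To make the hierarchy handling easier to follow, here is a small standalone sketch of how qualified names can be derived from nested scopes. It is a simplification rather than the `Symbol` class: the nested-dict input and the name `collect_names` are assumptions for this example, and it joins every level with `::`, whereas the code above joins deeper scope levels with `_`. As in the article's output, the package name is left out of the path.

```python
def collect_names(node, parent_path=''):
    """Return {short name: qualified name} for everything nested in node."""
    index = {}
    for child in node.get('nested', []):
        if parent_path:
            path = '%s::%s' % (parent_path, child['name'])
        else:
            path = child['name']
        # Keyed by short name, so a later definition with the same short
        # name would overwrite an earlier one.
        index[child['name']] = path
        index.update(collect_names(child, path))
    return index


# Hand-written tree mirroring the tutorial .proto file.
tree = {
    'name': 'tutorial',
    'nested': [
        {'name': 'Person', 'nested': [
            {'name': 'PhoneType'},
            {'name': 'PhoneNumber'},
        ]},
        {'name': 'AddressBook'},
    ],
}
names = collect_names(tree)
print(names['PhoneNumber'])  # Person::PhoneNumber
```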

4 typecollection

# -*- coding: UTF-8 -*-
# pb_typecollection.py
import pb_symbol


class typeDict(object):
    """Maps a protobuf type name to (target type, entity, original type)."""

    op_req_desc = 'required'
    op_opt_desc = 'optional'
    op_rep_desc = 'repeated'

    def __init__(self):
        self.collection = dict()
        # Built-in scalar types and the target types they map to.
        self.insert_type('int32', '__int32', pb_symbol.PBEntity('int32', 'int32'), '')
        self.insert_type('int64', '__int64', pb_symbol.PBEntity('int64', 'int64'), '')
        self.insert_type('uint32', 'unsigned int', pb_symbol.PBEntity('uint32', 'uint32'), '')
        self.insert_type('bool', 'bool', pb_symbol.PBEntity('bool', 'bool'), '')
        self.insert_type('float', 'float', pb_symbol.PBEntity('float', 'float'), '')
        self.insert_type('double', 'double', pb_symbol.PBEntity('double', 'double'), '')
        self.insert_type('string', 'const char*', pb_symbol.PBEntity('string', 'string'), '')
        self.insert_type('bytes', 'const char*', pb_symbol.PBEntity('bytes', 'bytes'), '')

    def insert_type(self, entName, rtType, entity, orgType):
        self.collection[entName] = (rtType, entity, orgType)

    def output_debug(self):
        print('type collection')
        for item in self.collection.items():
            print(item)
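One use of such a collection as an index is checking that every member type a message refers to has actually been defined. The sketch below is a standalone illustration under assumptions: it uses a plain dict in place of `typeDict`, and the function name `find_unknown_types` and the `Address` type in the sample are hypothetical.

```python
# Built-in mappings, copied from the table registered in typeDict.__init__.
BUILTIN_TYPES = {
    'int32': '__int32',
    'int64': '__int64',
    'uint32': 'unsigned int',
    'bool': 'bool',
    'float': 'float',
    'double': 'double',
    'string': 'const char*',
    'bytes': 'const char*',
}


def find_unknown_types(members, type_index):
    """Return, in declaration order, member types missing from type_index."""
    unknown = []
    for option, mem_type, mem_text in members:
        if mem_type not in type_index and mem_type not in unknown:
            unknown.append(mem_type)
    return unknown


type_index = dict(BUILTIN_TYPES)
type_index['PhoneType'] = 'Person::PhoneType'

members = [
    ('required', 'string', 'number'),
    ('optional', 'PhoneType', 'type'),
    ('repeated', 'Address', 'addr'),  # hypothetical, never defined
]
print(find_unknown_types(members, type_index))  # ['Address']
```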

5 Test script

# -*- coding: UTF-8 -*-

import pb_symbol
import pb_expression
import pb_typecollection

if __name__ == '__main__':
    pb_file = 'google_tutorial.proto'
    sBuf = pb_expression.StringBuffer(pb_file)
    tpDict = pb_typecollection.typeDict()
    symbol = pb_symbol.Symbol(tpDict, '', True)
    try:
        sBuf.OpenFile()
        exp = pb_expression.Expression(sBuf, symbol)
        exp.do_expression()
        symbol.outputDebug('')
        tpDict.output_debug()
    except Exception as exc:
        # Format the message; print("%s", exc) would print a tuple instead.
        print("%s" % exc)
    print("done")

6 Output

Reading the first line of the output below: the namespace is ::tutorial::Person and the type name is PhoneType.

('::tutorial::Person', 'enum', 'PhoneType')
('MOBILE', '0')
('HOME', '1')
('WORK', '2')

('::tutorial::Person', 'message', 'PhoneNumber')
('required', 'string', 'number')
('optional', 'PhoneType', 'type')

('::tutorial', 'message', 'Person')
('required', 'string', 'name')
('required', 'int32', 'id')
('optional', 'string', 'email')
('repeated', 'PhoneNumber', 'phone')

('::tutorial', 'message', 'AddressBook')
('repeated', 'Person', 'person')

type collection
('PhoneNumber', ('Person::PhoneNumber', <pb_symbol.PBMessage object at 0x02B9DED0>, ''))
('int32', ('__int32', <pb_symbol.PBEntity object at 0x02BE3F70>, ''))
('string', ('const char*', <pb_symbol.PBEntity object at 0x02BEE0F0>, ''))
('double', ('double', <pb_symbol.PBEntity object at 0x02BEE0B0>, ''))
('float', ('float', <pb_symbol.PBEntity object at 0x02BEE070>, ''))
('bytes', ('const char*', <pb_symbol.PBEntity object at 0x02BEE130>, ''))
('Person', ('Person', <pb_symbol.PBMessage object at 0x02BEE210>, ''))
('bool', ('bool', <pb_symbol.PBEntity object at 0x02BEE050>, ''))
('PhoneType', ('Person::PhoneType', <pb_symbol.PBEnum object at 0x02BEE450>, 'Person::PhoneType'))
('int64', ('__int64', <pb_symbol.PBEntity object at 0x02BE3FB0>, ''))
('uint32', ('unsigned int', <pb_symbol.PBEntity object at 0x02BE3FF0>, ''))
('AddressBook', ('AddressBook', <pb_symbol.PBMessage object at 0x02BEE7B0>, ''))

References

protobuf git repository: https://github.com/google/protobuf
