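For an earlier project I wrote a protobuf interpreter to automate generating the formats the project needed.

These days, of course, the guide linked below lets you skip the hand-written analysis and generate code directly.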

https://developers.google.com/protocol-buffers/docs/reference/cpp-generated

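This document mainly describes how to parse the protobuf format and how to collect the symbols that are needed.

The text processing is done with Python 2.7 scripts.

The program is split into four modules:

expression: parses the format.

symbol: the objects defined in the protobuf file (message and so on) and their hierarchy. At this level nothing of the original protobuf syntax is visible any more.

typecollection: defines the basic types and collects message and other objects.

builder: walks the symbol tree and creates the appropriate output files as needed, with typecollection serving as the index. It is not demonstrated here.

1 The protobuf test file (taken from the Google example)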

package tutorial;

message Person {
  required string name = 1;
  required int32 id = 2;
  optional string email = 3;

  enum PhoneType {
    MOBILE = 0;
    HOME = 1;
    WORK = 2;
  }

  message PhoneNumber {
    required string number = 1;
    optional PhoneType type = 2 [default = HOME];
  }

  repeated PhoneNumber phone = 4;
}

message AddressBook {
  repeated Person person = 1;
}

2 The expression implementation: the simplest scanning approach, analyzing one word at a time.

# -*- coding: UTF-8 -*-
# pb_expression.py
import string

import pb_symbol


class StringBuffer(object):
    """Holds the whole text of a .proto file in memory."""

    def __init__(self, src):
        self.src = src

    def OpenFile(self):
        # Read the complete file; the with block closes the handle.
        with open(self.src) as f:
            self.Data = f.read()


class Expression(object):
    # Field labels that introduce a message member.
    desc_set = set(['required', 'optional', 'repeated'])

    b_char_set = set(string.ascii_uppercase)
    l_char_set = set(string.ascii_lowercase)
    # Digits are kept as characters so they can match the input text.
    digit_set = set(string.digits)

    equals_char = '='
    space_char = ' '
    openbrace_char = '{'
    closebrace_char = '}'
    semicolon_char = ';'
    tab_char = chr(9)
    newline_char = chr(10)
    return_char = chr(13)
    slash_char = chr(47)

    # Characters that terminate a word.
    ctl_char_set = set([openbrace_char, closebrace_char, semicolon_char,
                        equals_char, newline_char, return_char, tab_char,
                        space_char])
    empty_char_set = set([space_char, tab_char, newline_char, return_char])
    symbol_char_set = b_char_set | l_char_set | digit_set
    all_char_set = symbol_char_set | ctl_char_set

    def __init__(self, sBuf, symbol):
        self.Buf = sBuf
        self.index = 0
        self.count = len(self.Buf.Data)
        self.symbol = symbol

    def backup(self):
        return self.index

    def restore(self, prevIndex):
        self.index = prevIndex

    def forwardChar(self):
        if self.index < self.count:
            self.index = self.index + 1

    def backChar(self):
        if self.index > 0:
            self.index = self.index - 1

    def getchar(self):
        # Return the current character and advance, or None at end of input.
        if self.index < self.count:
            char = self.Buf.Data[self.index]
            self.forwardChar()
            return char
        return None

    def skipComment(self):
        # Skip one '//' comment if the scanner is positioned on it.
        bkIndex = self.backup()
        char = self.getchar()
        next_char = self.getchar()
        if char != self.slash_char or next_char != self.slash_char:
            self.restore(bkIndex)
            return
        while True:
            char = self.getchar()
            if char is None:
                # A comment that reaches end of input without a newline
                # is left unconsumed.
                self.restore(bkIndex)
                return
            if char == self.newline_char:
                return

    def getSpecialChar(self, currentchar):
        # Consume input up to and including the next currentchar.
        while True:
            self.skipComment()
            char = self.getchar()
            if char is None or char == currentchar:
                break
        return char

    def getVisibleChar(self):
        # Return the next character that is not whitespace.
        while True:
            self.skipComment()
            char = self.getchar()
            if char is None or char not in self.empty_char_set:
                break
        return char

    def getNextword(self):
        # Skip leading control characters, then collect characters until
        # the next control character, which is pushed back.
        word = None
        got1st = False
        while True:
            self.skipComment()
            char = self.getchar()
            if char is None:
                break
            if not got1st:
                if char not in self.ctl_char_set:
                    word = char
                    got1st = True
            else:
                if char in self.ctl_char_set:
                    self.backChar()
                    break
                word = word + char
        return word

    def do_enum_item(self, pbEnum):
        # Parse one "NAME = value;" entry.
        memText = self.getNextword()
        self.getSpecialChar(self.equals_char)
        memValue = self.getNextword()
        self.getSpecialChar(self.semicolon_char)
        pbEnum.append_Member(memText, memValue)

    def do_enum_proc(self):
        symbol = self.getNextword()
        pbEnum = pb_symbol.PBEnum(symbol)
        while True:
            currentIndex = self.backup()
            word = self.getNextword()
            if word is None:
                break
            self.restore(currentIndex)
            self.do_enum_item(pbEnum)
            # Stop at the closing brace; otherwise put the character back.
            end_char_Index = self.backup()
            char = self.getVisibleChar()
            if char == self.closebrace_char:
                break
            self.restore(end_char_Index)
        self.symbol.append_enum(pbEnum)

    def do_message_proc(self):
        symbol = self.getNextword()
        pbMsg = pb_symbol.PBMessage(symbol)
        while True:
            currentIndex = self.backup()
            word = self.getNextword()
            if word is None:
                break
            if word in self.token_set:
                # A nested message or enum: parse it with a child Expression
                # whose Symbol is scoped to the enclosing message name.
                subSymbol = pb_symbol.Symbol(self.symbol.tpDict,
                                             self.symbol.entity_full_path,
                                             False)
                subSymbol.update_namespace(symbol)
                self.restore(currentIndex)
                subExp = Expression(self.Buf, subSymbol)
                subExp.index = self.index
                subExp.do_expression()
                self.index = subExp.index
                self.symbol.append_symbol(subSymbol)
                pbMsg.enableSymbol = 1
            elif word in self.desc_set:
                # A member: "<label> <type> <name> = <tag>;"
                memType = self.getNextword()
                memText = self.getNextword()
                pbMsg.append_Member(word, memType, memText)
                self.getSpecialChar(self.semicolon_char)
            # Stop at the closing brace; otherwise put the character back.
            end_char_Index = self.backup()
            char = self.getVisibleChar()
            if char == self.closebrace_char:
                break
            self.restore(end_char_Index)
        self.symbol.append_message(pbMsg)

    def do_import_proc(self):
        # Imports are ignored: skip to the end of the statement.
        self.getSpecialChar(self.semicolon_char)

    def do_package_proc(self):
        word = self.getNextword()
        self.symbol.update_namespace(word)
        self.getSpecialChar(self.semicolon_char)

    # Keyword dispatch table; it must follow the handlers it refers to.
    token_set = {
        'message': do_message_proc,
        'enum': do_enum_proc,
        'import': do_import_proc,
        'package': do_package_proc,
    }

    def do_expression(self):
        # Handle top-level keywords until an unknown word or end of input.
        while True:
            current_index = self.backup()
            token = self.getNextword()
            if token is None:
                break
            if token in self.token_set:
                proc = self.token_set[token]
                proc(self)
            else:
                self.restore(current_index)
                break
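
The word-by-word scanning idea can also be shown in isolation. The following is only a minimal sketch, not the module above: `CTL_CHARS` and `scan_words` are hypothetical names introduced for illustration, and the sketch assumes that the only delimiters are braces, semicolons, equals signs and whitespace, and that comments use the `//` form.

```python
# Hypothetical delimiter set, mirroring the idea of ctl_char_set.
CTL_CHARS = set('{};= \t\r\n')


def scan_words(text):
    """Yield each word of text, dropping delimiters and // comments."""
    word = ''
    i = 0
    while i < len(text):
        # A '//' starts a comment that runs to the end of the line.
        if text.startswith('//', i):
            end = text.find('\n', i)
            i = len(text) if end == -1 else end
            continue
        ch = text[i]
        if ch in CTL_CHARS:
            # A delimiter closes the word collected so far, if any.
            if word:
                yield word
                word = ''
        else:
            word += ch
        i += 1
    if word:
        yield word


words = list(scan_words('message A { // demo\n required int32 id = 1;\n}'))
print(words)  # ['message', 'A', 'required', 'int32', 'id', '1']
```

Unlike the Expression class, this sketch throws the delimiters away, so it cannot tell where a message body ends; the class keeps an index into the buffer precisely so that it can back up and inspect those characters.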

3 symbol: defines the object types and their hierarchy

# -*- coding: UTF-8 -*-
# pb_symbol.py


class PBEntity(object):
    """Base class for everything stored in the type collection."""

    def __init__(self, entName, rtname):
        self.entName = entName
        self.orgName = entName
        self.rtname = rtname

    def outputDebug(self):
        pass

    def create_impl(self, entity_indent, top_ns):
        batch_list = list()
        return batch_list

    def mem_include(self, entName):
        return False


class PBMessageMember(object):
    def __init__(self, option, memType, memText):
        self.option = option
        self.memType = memType
        self.memText = memText

    def outputDebug(self):
        # Under Python 2.7 the parenthesized arguments print as a tuple.
        print(self.option, self.memType, self.memText)

    @property
    def mem_option(self):
        return self.option

    @property
    def mem_type(self):
        return self.memType

    @property
    def mem_text(self):
        return self.memText


class PBMessage(PBEntity):
    def __init__(self, entName):
        PBEntity.__init__(self, entName, entName)
        self.members = []
        self.enableSymbol = 0
        self.rt_ns = ''
        self.tpDict = None

    @property
    def Members(self):
        return self.members

    def attach_tp_dict(self, tpDict):
        self.tpDict = tpDict

    def append_Member(self, option, memType, memText):
        msgMem = PBMessageMember(option, memType, memText)
        self.members.append(msgMem)

    def enable_Symbol(self, enable):
        self.enableSymbol = enable

    def outputDebug(self, ns):
        print(ns, 'message', self.entName)
        for entMsg in self.members:
            entMsg.outputDebug()
        print('')

    def set_rt_ns(self, rt_entity_full_path):
        self.rt_ns = rt_entity_full_path

    def mem_include(self, entName):
        # True if any member is declared with the given type name.
        for entMsg in self.members:
            if entName == entMsg.memType:
                return True
        return False

    def detect_request(self):
        # True if the message has at least one member.
        return len(self.members) > 0


class PBEnumMember(object):
    def __init__(self, memText, memValue):
        self.memText = memText
        self.memValue = memValue

    def outputDebug(self):
        print(self.memText, self.memValue)


class PBEnum(PBEntity):
    def __init__(self, entName):
        PBEntity.__init__(self, entName, entName)
        self.members = []

    def append_Member(self, memText, memValue):
        msgMem = PBEnumMember(memText, memValue)
        self.members.append(msgMem)

    def outputDebug(self, ns):
        print(ns, 'enum', self.entName)
        for entEnum in self.members:
            entEnum.outputDebug()
        print('')


class Symbol(object):
    """One namespace level: its entities plus any nested Symbol objects."""

    def __init__(self, tpDict, fullpath, rooted):
        self.namespace = ''
        self.tpDict = tpDict
        self.rooted = rooted
        self.entity_full_path = fullpath
        self.rt_entity_full_path = fullpath
        self.entitylist = []
        self.containerlist = []

    def update_namespace(self, namespace):
        self.namespace = namespace
        # Only nested (non-root) symbols extend the entity path.
        if self.rooted == False:
            if self.entity_full_path == '':
                self.entity_full_path = namespace
                self.rt_entity_full_path = namespace
            else:
                self.entity_full_path = '%s_%s' % (self.entity_full_path, namespace)
                self.rt_entity_full_path = '%s_%s' % (self.entity_full_path, namespace)

    def append_type_dict(self, entity, isMsg):
        # Top-level entities keep their bare name; nested ones are qualified.
        if self.entity_full_path == '':
            rtType = entity.rtname
            orgType = entity.rtname
        else:
            rtType = '%s::%s' % (self.rt_entity_full_path, entity.rtname)
            orgType = '%s::%s' % (self.entity_full_path, entity.rtname)
        # Messages are registered with an empty original type.
        if isMsg:
            orgType = ''
        self.tpDict.insert_type(entity.entName, rtType, entity, orgType)

    def append_message(self, msg):
        self.entitylist.append(msg)
        self.containerlist.append(msg)
        msg.attach_tp_dict(self.tpDict)
        if self.rt_entity_full_path == '':
            msg.set_rt_ns(self.rt_entity_full_path)
        else:
            msg.set_rt_ns(self.rt_entity_full_path + '_')
        self.append_type_dict(msg, True)

    def append_enum(self, enum):
        self.entitylist.append(enum)
        self.append_type_dict(enum, False)

    def append_symbol(self, symbol):
        self.entitylist.append(symbol)
        self.containerlist.append(symbol)

    def outputDebug(self, ns):
        for entity in self.entitylist:
            entity.outputDebug(ns + '::' + self.namespace)

    def query_entitylist(self):
        return self.entitylist

    def query_containerlist(self):
        return self.containerlist

    def query_pb_ns(self):
        return self.namespace

    def mem_include(self, entName):
        for entity in self.entitylist:
            if entity.mem_include(entName) == True:
                return True
        return False


class PBProxy(object):
    """Read-only wrapper that forwards attribute access to an entity."""

    def __init__(self, entity):
        self.entity = entity

    @property
    def enableSymbol(self):
        return self.entity.enableSymbol

    def mem_include(self, entName):
        return self.entity.mem_include(entName)

    def create_impl(self, entity_indent, top_ns):
        return self.entity.create_impl(entity_indent, top_ns)

    @property
    def entName(self):
        return self.entity.entName

    @property
    def rtname(self):
        return self.entity.rtname

    @property
    def orgName(self):
        return self.entity.orgName

    @property
    def members(self):
        return self.entity.members

    @property
    def rt_ns(self):
        return self.entity.rt_ns

    @property
    def namespace(self):
        return self.entity.namespace

    @property
    def rooted(self):
        return self.entity.rooted

    @property
    def entity_full_path(self):
        return self.entity.entity_full_path

    @property
    def rt_entity_full_path(self):
        return self.entity.rt_entity_full_path

    @property
    def entitylist(self):
        return self.entity.entitylist

    @property
    def containerlist(self):
        return self.entity.containerlist

    @property
    def tpDict(self):
        return self.entity.tpDict

    def detect_request(self):
        return self.entity.detect_request()

    @property
    def Members(self):
        return self.entity.members

    @property
    def mem_option(self):
        return self.entity.mem_option

    @property
    def mem_type(self):
        return self.entity.mem_type

    @property
    def mem_text(self):
        return self.entity.mem_text
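
The hierarchy that Symbol builds is easier to see in a stripped-down form. The sketch below does not use the classes above: `Entity`, `Scope` and `walk` are hypothetical stand-ins, assumed here only to show how a depth-first walk over nested scopes yields the namespace-qualified names that outputDebug prints.

```python
class Entity(object):
    """Stand-in for PBMessage / PBEnum: just a kind and a name."""

    def __init__(self, kind, name):
        self.kind = kind
        self.name = name


class Scope(object):
    """Stand-in for Symbol: a namespace holding entities and nested scopes."""

    def __init__(self, namespace):
        self.namespace = namespace
        self.entitylist = []


def walk(scope, prefix=''):
    """Return (namespace, kind, name) for every entity, depth first."""
    ns = prefix + '::' + scope.namespace
    found = []
    for item in scope.entitylist:
        if isinstance(item, Scope):
            # Nested scope: recurse with the extended namespace.
            found.extend(walk(item, ns))
        else:
            found.append((ns, item.kind, item.name))
    return found


# Rebuild the shape of the tutorial file by hand.
root = Scope('tutorial')
person = Scope('Person')
person.entitylist.append(Entity('enum', 'PhoneType'))
person.entitylist.append(Entity('message', 'PhoneNumber'))
root.entitylist.extend([person,
                        Entity('message', 'Person'),
                        Entity('message', 'AddressBook')])

for row in walk(root):
    print(row)
```

The nested scope is appended before the message that contains it, which is why the entities declared inside Person are listed ahead of Person itself.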

4 typecollection

# -*- coding: UTF-8 -*-
# pb_typecollection.py
import pb_symbol


class typeDict(object):
    op_req_desc = 'required'
    op_opt_desc = 'optional'
    op_rep_desc = 'repeated'

    def __init__(self):
        self.collection = dict()
        # Built-in scalar types: protobuf name -> (runtime type, entity, original type).
        self.insert_type('int32', '__int32', pb_symbol.PBEntity('int32', 'int32'), '')
        self.insert_type('int64', '__int64', pb_symbol.PBEntity('int64', 'int64'), '')
        self.insert_type('uint32', 'unsigned int', pb_symbol.PBEntity('uint32', 'uint32'), '')
        self.insert_type('bool', 'bool', pb_symbol.PBEntity('bool', 'bool'), '')
        self.insert_type('float', 'float', pb_symbol.PBEntity('float', 'float'), '')
        self.insert_type('double', 'double', pb_symbol.PBEntity('double', 'double'), '')
        self.insert_type('string', 'const char*', pb_symbol.PBEntity('string', 'string'), '')
        self.insert_type('bytes', 'const char*', pb_symbol.PBEntity('bytes', 'bytes'), '')

    def insert_type(self, entName, rtType, entity, orgType):
        self.collection[entName] = (rtType, entity, orgType)

    def output_debug(self):
        print('type collection')
        for item in self.collection.items():
            print(item)
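
As a rough illustration of how such a collection can act as an index, the sketch below resolves a member's protobuf type to its runtime type. It is an assumption about usage, not code from the builder module: `lookup_rt_type` and `describe_member` are hypothetical helpers, and the entity slot is filled with None instead of real PBEntity objects to keep the example self-contained.

```python
# Hypothetical index in the same shape as typeDict.collection:
# protobuf type name -> (runtime type, entity, original type).
collection = {
    'int32': ('__int32', None, ''),
    'string': ('const char*', None, ''),
    'PhoneType': ('Person::PhoneType', None, 'Person::PhoneType'),
}


def lookup_rt_type(collection, pb_type):
    """Return the runtime type for pb_type, or None if it was never collected."""
    entry = collection.get(pb_type)
    if entry is None:
        return None
    rt_type, _entity, _org_type = entry
    return rt_type


def describe_member(collection, option, pb_type, name):
    """Render one message member with its type replaced by the runtime type."""
    rt_type = lookup_rt_type(collection, pb_type)
    if rt_type is None:
        rt_type = '<unknown %s>' % pb_type
    return '%s %s %s' % (option, rt_type, name)


print(describe_member(collection, 'required', 'string', 'name'))
print(describe_member(collection, 'optional', 'PhoneType', 'type'))
print(describe_member(collection, 'repeated', 'Missing', 'x'))
```

Keying the dictionary on the bare protobuf name keeps lookups simple, at the cost that two nested types with the same short name in different messages would overwrite each other.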

5 Test script

# -*- coding: UTF-8 -*-

import pb_symbol
import pb_expression
import pb_typecollection

if __name__ == '__main__':
    pb_file = 'google_tutorial.proto'
    sBuf = pb_expression.StringBuffer(pb_file)
    tpDict = pb_typecollection.typeDict()
    symbol = pb_symbol.Symbol(tpDict, '', True)
    try:
        sBuf.OpenFile()
        exp = pb_expression.Expression(sBuf, symbol)
        exp.do_expression()
        symbol.outputDebug('')
        tpDict.output_debug()
    except Exception as exc:
        print("%s" % exc)
    print("done")

6 Output

Namespace: ::tutorial::Person

Type name: PhoneType

('::tutorial::Person', 'enum', 'PhoneType')
('MOBILE', '0')

('HOME', '1')

('WORK', '2')

('::tutorial::Person', 'message', 'PhoneNumber')

('required', 'string', 'number')

('optional', 'PhoneType', 'type')

('::tutorial', 'message', 'Person')

('required', 'string', 'name')

('required', 'int32', 'id')

('optional', 'string', 'email')

('repeated', 'PhoneNumber', 'phone')

('::tutorial', 'message', 'AddressBook')

('repeated', 'Person', 'person')

type collection

('PhoneNumber', ('Person::PhoneNumber', <pb_symbol.PBMessage object at 0x02B9DED0>, ''))

('int32', ('__int32', <pb_symbol.PBEntity object at 0x02BE3F70>, ''))

('string', ('const char*', <pb_symbol.PBEntity object at 0x02BEE0F0>, ''))

('double', ('double', <pb_symbol.PBEntity object at 0x02BEE0B0>, ''))

('float', ('float', <pb_symbol.PBEntity object at 0x02BEE070>, ''))

('bytes', ('const char*', <pb_symbol.PBEntity object at 0x02BEE130>, ''))

('Person', ('Person', <pb_symbol.PBMessage object at 0x02BEE210>, ''))

('bool', ('bool', <pb_symbol.PBEntity object at 0x02BEE050>, ''))

('PhoneType', ('Person::PhoneType', <pb_symbol.PBEnum object at 0x02BEE450>, 'Person::PhoneType'))

('int64', ('__int64', <pb_symbol.PBEntity object at 0x02BE3FB0>, ''))

('uint32', ('unsigned int', <pb_symbol.PBEntity object at 0x02BE3FF0>, ''))

('AddressBook', ('AddressBook', <pb_symbol.PBMessage object at 0x02BEE7B0>, ''))

References

protobuf git repository: https://github.com/google/protobuf
