Semantic Analysis: Generating Syntax Trees for C Expressions (Python Implementation)
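Linghu Chong walked slowly toward him. The man trembled all over, his knees buckled, and he fell kneeling in the snow. Linghu Chong said angrily: "You insulted my junior sister; I cannot spare you." He held his sword at the man's throat, then, struck by a thought, stepped closer and asked in a low voice: "What were the words written on the snowman?"
The man stammered: "It... it was... 'Though the seas... the seas run dry... and the rocks crumble, our... love... love will... will never change.'" Since the words "Though the seas run dry and the rocks crumble, our love will never change" first came into the world, this was probably the first time anyone had spoken them in such abject terror.
Linghu Chong stood dazed for a moment and said: "Yes. Though the seas run dry and the rocks crumble, our love will never change." With an ache in his heart, he thrust the sword forward into the man's throat.
From 《笑傲江湖》 (The Smiling, Proud Wanderer)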
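Semantic analysis is hard fundamentally because grammars are recursive: deep recursion makes decomposing the problem look quite complicated. If the recursive problem can be turned into an iterative one, however, the model becomes much simpler. The key to that conversion is to identify every feature of the innermost recursive structure and process it iteratively; the rest of the problem then falls into place.
In general, when people face a complex recursive problem, they also follow the grammar rules: find the innermost recursive structure, resolve it, and iterate step by step until the problem is solved. Human thinking is quite good at turning recursive problems into iterative ones, and learning can be seen as coming to understand and master all kinds of grammar rules.
Recursion over unary and binary operators converts to iteration easily; operators with more operands are somewhat more involved.
All the operators and their precedence levels are shown in the figure below:
Operators such as typeof, address-of, and pointer dereference are not implemented here. What is implemented covers arithmetic expressions, logical expressions, function calls, and parentheses, which is enough for understanding the analysis process.
For simple expressions that contain neither parentheses nor function calls, our analysis proceeds as follows: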
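The precedence-ordered reduction can be illustrated with a deliberately simplified sketch. It is not the article's code: tokens are plain strings, only two binary precedence levels are handled, the input is assumed to be well formed, and the names LEVELS and reduce_by_precedence are hypothetical.

```python
# Simplified illustration (assumptions: string tokens, two binary levels,
# well-formed input). Highest precedence comes first.
LEVELS = [("*", "/", "%"), ("+", "-")]

def reduce_by_precedence(tokens):
    """Collapse operand-operator-operand triples, one precedence level at a time."""
    items = list(tokens)
    for ops in LEVELS:
        i = 0
        while i < len(items):
            is_op = isinstance(items[i], str) and items[i] in ops
            if is_op and 0 < i < len(items) - 1:
                # Replace [left, op, right] with one nested node.
                items[i - 1:i + 2] = [(items[i], items[i - 1], items[i + 1])]
                i = 0  # rescan from the start after every reduction
            else:
                i += 1
    return items

print(reduce_by_precedence(["31", "+", "8", "*", "9"]))
# [('+', '31', ('*', '8', '9'))]
```

Because the scan restarts from the left after each reduction, operators on the same level group from left to right, which matches the associativity of C's binary operators.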
Our data structures:
'''
____________________________ Syntax Tree
Parenthesis:
    ["(",None]
    [")",None]
Operators (grouped by precedence):
    Unary :
        1   + - ! ~                ["+",None] ["-",None] ["!",None] ["~",None]
    Binary :
        2   * / %                  ["*",None] ["/",None] ["%",None]
        3   + -                    ["+",None] ["-",None]
        4   << >>                  ["<<",None] [">>",None]
        5   > >= < <=              [">",None] [">=",None] ["<",None] ["<=",None]
        6   == !=                  ["==",None] ["!=",None]
        7   &                      ["&",None]
        8   ^                      ["^",None]
        9   |                      ["|",None]
        10  &&                     ["&&",None]
        11  ||                     ["||",None]
    Ternary :
        12  expr ? expr : expr     ["?",None] [":",None]   ["@expr","?:",listPtr0,listPtr1,listPtr2]
    Comma :
        13  expr , expr , expr...  [",",None]
Var, Num, Expr, Function:
    ["@var","varName"]
    ["@num","num_string"]
    ["@expr","Operator",listPtr,...]
    ["@func","funcName",listPtr1,...]
    ["@expr_list",["@var"|"@num"|"@expr"|"@func",...],...]
'''
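The article starts from token lists that are already in this format and does not show a lexer. As a rough sketch of how such a list might be produced from source text, the following uses a regular expression; TOKEN_RE and tokenize are hypothetical names, and only decimal integers, identifiers, and the operators listed above are assumed.

```python
import re

# Hypothetical lexer for the token format above. Multi-character operators
# are listed before single-character ones so that ">=" is not split.
TOKEN_RE = re.compile(
    r"\s*(?:(\d+)|([A-Za-z_]\w*)|"
    r"(<<|>>|>=|<=|==|!=|&&|\|\||[-+!~*/%<>&^|?:,()]))"
)

def tokenize(text):
    """Turn a C-like expression string into [kind, value] token lists."""
    text = text.rstrip()
    tokens = []
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError("unexpected character near position %d" % pos)
        num, name, op = m.groups()
        if num is not None:
            tokens.append(["@num", num])
        elif name is not None:
            tokens.append(["@var", name])
        else:
            tokens.append([op, None])  # operators and parentheses carry None
        pos = m.end()
    return tokens

print(tokenize("31 + 8 * 9"))
# [['@num', '31'], ['+', None], ['@num', '8'], ['*', None], ['@num', '9']]
```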
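This is the module diagram of our final code:
A function named module_x_y handles an operator at precedence level x; y is its horizontal index within that level, starting from zero. The code comments are already quite detailed, so please read the source: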
######################################## global list
OperatorList=['+','-','!','~',
              '*','/','%',
              '+','-',
              '<<','>>',
              '>','>=','<','<=',
              '==','!=',
              '&',
              '^',
              '|',
              '&&',
              '||',
              '?',':',   # this comma matters: ':' ',' without it would merge into ':,'
              ',']

''' 31 + 8 * 9 '''
listToParse=[ ['@num',''] , ['+',None] , ['@num',''] , ['*',None] , ['@num',''] ]

########### return value :
############# 0  parsed some expressions
############# 1  done nothing but no errors happened
################# + =: ^+A... | ...Op+A...
def module_1_0(lis,i):
    # left, i and right are all indexes :)
    left=i-1
    right=i+1
    # process: ^+A...
    if i==0 and len(lis)>=2:
        if lis[right][0][0]=='@':
            rightPtr=lis[right]
            del lis[0:2]
            lis.insert(0,["@expr","+",rightPtr])
            return 0
    # process: ...Op+A...
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[left][0] in OperatorList:
            if lis[right][0][0]=='@':
                rightPtr=lis[right]
                del lis[i:i+2]
                lis.insert(i,["@expr","+",rightPtr])
                return 0
    return 1

########### return value :
############# 0  parsed some expressions
############# 1  done nothing but no errors happened
################# - =: ^-A... | ...Op-A...
def module_1_1(lis,i):
    # left, i and right are all indexes :)
    left=i-1
    right=i+1
    # process: ^-A...
    if i==0 and len(lis)>=2:
        if lis[right][0][0]=='@':
            rightPtr=lis[right]
            del lis[0:2]
            lis.insert(0,["@expr","-",rightPtr])
            return 0
    # process: ...Op-A...
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[left][0] in OperatorList:
            if lis[right][0][0]=='@':
                rightPtr=lis[right]
                del lis[i:i+2]
                lis.insert(i,["@expr","-",rightPtr])
                return 0
    return 1

########### return value :
############# 0  parsed some expressions
############# 1  done nothing but no errors happened
################# ! =: ...!A...
def module_1_2(lis,i):
    # left, i and right are all indexes :)
    left=i-1
    right=i+1
    # process: ...!A...
    if len(lis)>=2 and right<len(lis):
        if lis[right][0][0]=='@':
            rightPtr=lis[right]
            del lis[i:i+2]
            lis.insert(i,["@expr","!",rightPtr])
            return 0
    return 1

########### return value :
############# 0  parsed some expressions
############# 1  done nothing but no errors happened
################# ~ =: ...~A...
def module_1_3(lis,i):
    # left, i and right are all indexes :)
    left=i-1
    right=i+1
    # process: ...~A...
    if len(lis)>=2 and right<len(lis):
        if lis[right][0][0]=='@':
            rightPtr=lis[right]
            del lis[i:i+2]
            lis.insert(i,["@expr","~",rightPtr])
            return 0
    return 1

########### return value :
############# 0  parsed some expressions
############# 1  done nothing but no errors happened
################# * =: ...A*A...
def module_2_0(lis,i):
    # left, i and right are all indexes :)
    left=i-1
    right=i+1
    # process: ...A*A...
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[left:left+3]
            lis.insert(left,["@expr","*",leftPtr,rightPtr])
            return 0
    return 1

########### return value :
############# 0  parsed some expressions
############# 1  done nothing but no errors happened
################# / =: ...A/A...
def module_2_1(lis,i):
    # left, i and right are all indexes :)
    left=i-1
    right=i+1
    # process: ...A/A...
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[left:left+3]
            lis.insert(left,["@expr","/",leftPtr,rightPtr])
            return 0
    return 1

########### return value :
############# 0  parsed some expressions
############# 1  done nothing but no errors happened
################# % =: ...A%A...
def module_2_2(lis,i):
    # left, i and right are all indexes :)
    left=i-1
    right=i+1
    # process: ...A%A...
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[left:left+3]
            lis.insert(left,["@expr","%",leftPtr,rightPtr])
            return 0
    return 1

########### return value :
############# 0  parsed some expressions
############# 1  done nothing but no errors happened
################# + =: ...A+A...
def module_3_0(lis,i):
    # left, i and right are all indexes :)
    left=i-1
    right=i+1
    # process: ...A+A...
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[left:left+3]
            lis.insert(left,["@expr","+",leftPtr,rightPtr])
            return 0
    return 1

########### return value :
############# 0  parsed some expressions
############# 1  done nothing but no errors happened
################# - =: ...A-A...
def module_3_1(lis,i):
    # left, i and right are all indexes :)
    left=i-1
    right=i+1
    # process: ...A-A...
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[left:left+3]
            lis.insert(left,["@expr","-",leftPtr,rightPtr])
            return 0
    return 1

########### return value :
############# 0  parsed some expressions
############# 1  done nothing but no errors happened
################# << =: ...A<<A...
def module_4_0(lis,i):
    # left, i and right are all indexes :)
    left=i-1
    right=i+1
    # process: ...A<<A...
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[left:left+3]
            lis.insert(left,["@expr","<<",leftPtr,rightPtr])
            return 0
    return 1

########### return value :
############# 0  parsed some expressions
############# 1  done nothing but no errors happened
################# >> =: ...A>>A...
def module_4_1(lis,i):
    # left, i and right are all indexes :)
    left=i-1
    right=i+1
    # process: ...A>>A...
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[left:left+3]
            lis.insert(left,["@expr",">>",leftPtr,rightPtr])
            return 0
    return 1

########### return value :
############# 0  parsed some expressions
############# 1  done nothing but no errors happened
################# > =: ...A>A...
def module_5_0(lis,i):
    # left, i and right are all indexes :)
    left=i-1
    right=i+1
    # process: ...A>A...
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[left:left+3]
            lis.insert(left,["@expr",">",leftPtr,rightPtr])
            return 0
    return 1

########### return value :
############# 0  parsed some expressions
############# 1  done nothing but no errors happened
################# >= =: ...A>=A...
def module_5_1(lis,i):
    # left, i and right are all indexes :)
    left=i-1
    right=i+1
    # process: ...A>=A...
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[left:left+3]
            lis.insert(left,["@expr",">=",leftPtr,rightPtr])
            return 0
    return 1

########### return value :
############# 0  parsed some expressions
############# 1  done nothing but no errors happened
################# < =: ...A<A...
def module_5_2(lis,i):
    # left, i and right are all indexes :)
    left=i-1
    right=i+1
    # process: ...A<A...
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[left:left+3]
            lis.insert(left,["@expr","<",leftPtr,rightPtr])
            return 0
    return 1

########### return value :
############# 0  parsed some expressions
############# 1  done nothing but no errors happened
################# <= =: ...A<=A...
def module_5_3(lis,i):
    # left, i and right are all indexes :)
    left=i-1
    right=i+1
    # process: ...A<=A...
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[left:left+3]
            lis.insert(left,["@expr","<=",leftPtr,rightPtr])
            return 0
    return 1

########### return value :
############# 0  parsed some expressions
############# 1  done nothing but no errors happened
################# == =: ...A==A...
def module_6_0(lis,i):
    # left, i and right are all indexes :)
    left=i-1
    right=i+1
    # process: ...A==A...
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[left:left+3]
            lis.insert(left,["@expr","==",leftPtr,rightPtr])
            return 0
    return 1

########### return value :
############# 0  parsed some expressions
############# 1  done nothing but no errors happened
################# != =: ...A!=A...
def module_6_1(lis,i):
    # left, i and right are all indexes :)
    left=i-1
    right=i+1
    # process: ...A!=A...
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[left:left+3]
            lis.insert(left,["@expr","!=",leftPtr,rightPtr])
            return 0
    return 1

########### return value :
############# 0  parsed some expressions
############# 1  done nothing but no errors happened
################# & =: ...A&A...
def module_7_0(lis,i):
    # left, i and right are all indexes :)
    left=i-1
    right=i+1
    # process: ...A&A...
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[left:left+3]
            lis.insert(left,["@expr","&",leftPtr,rightPtr])
            return 0
    return 1

########### return value :
############# 0  parsed some expressions
############# 1  done nothing but no errors happened
################# ^ =: ...A^A...
def module_8_0(lis,i):
    # left, i and right are all indexes :)
    left=i-1
    right=i+1
    # process: ...A^A...
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[left:left+3]
            lis.insert(left,["@expr","^",leftPtr,rightPtr])
            return 0
    return 1

########### return value :
############# 0  parsed some expressions
############# 1  done nothing but no errors happened
################# | =: ...A|A...
def module_9_0(lis,i):
    # left, i and right are all indexes :)
    left=i-1
    right=i+1
    # process: ...A|A...
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[left:left+3]
            lis.insert(left,["@expr","|",leftPtr,rightPtr])
            return 0
    return 1

########### return value :
############# 0  parsed some expressions
############# 1  done nothing but no errors happened
################# && =: ...A&&A...
def module_10_0(lis,i):
    # left, i and right are all indexes :)
    left=i-1
    right=i+1
    # process: ...A&&A...
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[left:left+3]
            lis.insert(left,["@expr","&&",leftPtr,rightPtr])
            return 0
    return 1

########### return value :
############# 0  parsed some expressions
############# 1  done nothing but no errors happened
################# || =: ...A||A...
def module_11_0(lis,i):
    # left, i and right are all indexes :)
    left=i-1
    right=i+1
    # process: ...A||A...
    if i>=1 and len(lis)>=3 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' :
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[left:left+3]
            lis.insert(left,["@expr","||",leftPtr,rightPtr])
            return 0
    return 1

########### return value :
############# 0  parsed some expressions
############# 1  done nothing but no errors happened
################# ?: =: ...A?A:A...
#################             ^
def module_12_0(lis,i):
    # first, leftOp, left, i and right are all indexes; i points at ':'
    first=i-3
    leftOp=i-2
    left=i-1
    right=i+1
    # process: ...A?A:A...
    #                ^
    if i>=3 and len(lis)>=5 and right<len(lis):
        if lis[right][0][0]=='@' and lis[left][0][0]=='@' and\
           lis[leftOp][0]=='?' and lis[first][0][0]=='@':
            firstPtr=lis[first]
            leftPtr=lis[left]
            rightPtr=lis[right]
            del lis[first:first+5]
            lis.insert(first,["@expr","?:",firstPtr,leftPtr,rightPtr])
            return 0
    return 1

########### return value :
############# 0  parsed some expressions
############# 1  done nothing but no errors happened
################# , =: A,A,...A,A
def module_13_0(lis,i):
    # process: A,A,...A,A
    if len(lis)==1 and lis[0][0][0]!='@':
        return 1
    if len(lis)==1 and lis[0][0][0]=='@':
        return 0
    if (len(lis)%2)==1 :
        # operands sit at even indexes, commas at odd indexes
        i=1
        if lis[0][0][0]!='@':
            return 1
        while i<len(lis):
            if lis[i+1][0][0]=='@' and lis[i][0]==',':
                i=i+2
            else:
                return 1
        # collect the operands into a single @expr_list node
        ls=[['@expr_list']]
        i=0
        while i<len(lis):
            ls[0].append(lis[i])
            i=i+2
        del lis[:]
        lis[:]=ls[:]
        return 0
    return 1
The code above is long, but it is the simplest part. Its size could be reduced considerably with a few techniques, but time was limited.
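One possible way to compress the binary-operator modules, offered only as a sketch and not as the author's method, is a factory that closes over the operator string. The names make_binary_module, BINARY_LEVELS, and binary_modules are hypothetical; the reduction rule mirrors the hand-written ...A op A... functions.

```python
def make_binary_module(op):
    """Build a reducer with the same contract as the module_x_y functions:
    return 0 if something was parsed, 1 if nothing was done."""
    def module(lis, i):
        left, right = i - 1, i + 1
        if i >= 1 and right < len(lis):
            if lis[left][0][0] == '@' and lis[right][0][0] == '@':
                # Replace [A, op, A] with a single @expr node, in place.
                lis[left:right + 1] = [["@expr", op, lis[left], lis[right]]]
                return 0
        return 1
    return module

# Binary precedence levels 2..11, highest precedence first.
BINARY_LEVELS = (('*', '/', '%'), ('+', '-'), ('<<', '>>'),
                 ('>', '>=', '<', '<='), ('==', '!='),
                 ('&',), ('^',), ('|',), ('&&',), ('||',))

# One dictionary per level, mapping each operator to its generated reducer.
binary_modules = [{op: make_binary_module(op) for op in ops}
                  for ops in BINARY_LEVELS]

tokens = [['@num', '8'], ['*', None], ['@num', '9']]
print(binary_modules[0]['*'](tokens, 1), tokens)
# 0 [['@expr', '*', ['@num', '8'], ['@num', '9']]]
```

The unary and ternary modules have different neighbor checks, so they would need their own factories or could stay hand-written.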
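Below is the analysis procedure for unary, binary, and ternary operators and the comma separator; it is one of the core pieces of code in this article: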
######################################## global list
# construct a module dictionary
# module_dic_tuple[priority]['Operator'](lis,i)
module_dic_tuple=({},
    { '+':module_1_0,'-':module_1_1,'!':module_1_2,'~':module_1_3 },
    { '*':module_2_0,'/':module_2_1,'%':module_2_2 },
    { '+':module_3_0,'-':module_3_1 },
    { '<<':module_4_0,'>>':module_4_1 },
    { '>':module_5_0,'>=':module_5_1,'<':module_5_2,'<=':module_5_3 },
    { '==':module_6_0,'!=':module_6_1 },
    { '&':module_7_0 },
    { '^':module_8_0 },
    { '|':module_9_0 },
    { '&&':module_10_0 },
    { '||':module_11_0 },
    { '?:':module_12_0 },
    { ',':module_13_0 } )

# one-element groups need a trailing comma: ('&') is just the string '&',
# and "'&' in '&&'" would then be a substring test that wrongly succeeds
operator_priority_tuple=( () , ('+', '-', '!', '~') , ('*','/','%'),
    ('+','-'),('<<','>>'),
    ('>','>=','<','<='),('==','!='),
    ('&',),('^',),('|',),('&&',),('||',),('?',':'),(',',) )

############################# parse: unary, binary, ternary, comma expr
########### return value :
############# 0  parsed successfully
############# 1  syntax error
def parse_simple_expr(lis):
    if len(lis)==0:
        return 1
    #if lis[len(lis)-1][0][0]!='@':
    #    return 1
    #if lis[0][0][0]!='@' and lis[0][0] not in ('+','-','!','~'):
    #    return 1
    for pri in range(1,12): # pri 1,2,3,4,5,6,7,8,9,10,11
        i=0
        while 1:
            if len(lis)==1 and lis[0][0][0]=='@':
                return 0
            if i>=len(lis):
                break
            if lis[i][0] in operator_priority_tuple[pri]:
                if module_dic_tuple[pri][lis[i][0]](lis,i)==0:
                    # something was reduced: rescan from the start
                    i=0
                    continue
                else:
                    i=i+1
                    continue
            else:
                i=i+1
    for pri in range(12,13): # pri 12 # parse ...A?A:A...
        i=0
        while 1:
            if len(lis)==1 and lis[0][0][0]=='@':
                return 0
            if i>=len(lis):
                break
            if lis[i][0]==':':
                if module_dic_tuple[pri]['?:'](lis,i)==0:
                    i=0
                    continue
                else:
                    i=i+1
                    continue
            else:
                i=i+1
    # pri 13: whatever remains must be a comma-separated list
    return module_dic_tuple[13][','](lis,0)
The code above uses a tuple of dictionaries of function references to reduce the amount of code in this part.
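One Python detail is worth keeping in mind when writing precedence tables as tuples of operator groups: parentheses alone do not make a tuple. The short check below is independent of the parser and only demonstrates the language behavior; the variable names are illustrative.

```python
single = ('&')    # parentheses only group: this is the string '&'
proper = ('&',)   # the trailing comma makes a one-element tuple

print(type(single).__name__, type(proper).__name__)  # str tuple

# With a string on the right, "in" is a substring test,
# so a lone '&' token would appear to belong to the '&&' level.
print('&' in ('&&'))    # True
print('&' in ('&&',))   # False
```

For a membership test such as `token in group`, each single-operator level therefore needs the trailing comma.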
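This part is not demonstrated here; the process is similar to the one described in the earlier article "A Simple Semantic Analysis Algorithm: The Single-Step Algorithm (Python Implementation)".
With parse_simple_expr in place, the remaining analysis of function calls and parentheses becomes simpler. The process is as follows:
Implementation: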
########### return value : [intStatusCode, indexOf'(', indexOf')']
############# intStatusCode
############# 0  successfully
############# 1  no parenthesis matched
############# 2  list is null :(
def module_parenthesis_place(lis):
    length=len(lis)
    err=0
    x=0
    y=0
    if length==0:
        return [2,None,None]
    # x: index of the first ')'
    try:
        x=lis.index([")",None])
    except ValueError:
        err=1
    # search the reversed list to find the nearest '(' to the left of x
    lis.reverse()
    try:
        y=lis.index(["(",None],length-x-1)
    except ValueError:
        err=1
    lis.reverse()
    y=length-y-1
    if err==1:
        return [1,None,None]
    else:
        return [0,y,x]

############################# parse: unary, binary, ternary, parenthesis, function expr
########### return value :
############# 0  parsed successfully
############# 1  syntax error
############################# find first ')'
def parse_comp_expr(lis):
    while 1:
        if len(lis)==0:
            return 1
        if len(lis)==1:
            if lis[0][0][0]=='@':
                return 0
            else:
                return 1
        place=module_parenthesis_place(lis)
        if place[0]==0:
            # parse a copy of the tokens strictly inside the innermost pair
            mirror=lis[(place[1]+1):place[2]]
            if parse_simple_expr(mirror)==0:
                if place[1]>=1 and lis[place[1]-1][0]=='@var':
                    # func: a variable directly before '(' is a function name
                    funcName=lis[place[1]-1][1]
                    del lis[place[1]-1:(place[2]+1)]
                    lis.insert(place[1]-1,["@func",funcName,mirror[0]])
                else:
                    del lis[place[1]:(place[2]+1)]
                    lis.insert(place[1],mirror[0])
            else:
                return 1
        else:
            return parse_simple_expr(lis)
    return 1
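The innermost-pair search can also be written without reversing the list. The sketch below is an alternative formulation, not the article's function: find_innermost_parens is a hypothetical name, and it returns None instead of a status code. It locates the first ")" and then walks left to the nearest "(", which is the same pair the reversed search selects.

```python
def find_innermost_parens(lis):
    """Return (open_index, close_index) of an innermost pair, or None."""
    try:
        close = lis.index([")", None])   # the first ')' closes an innermost pair
    except ValueError:
        return None
    # Walk left from the ')' to the nearest '('.
    for open_index in range(close - 1, -1, -1):
        if lis[open_index] == ["(", None]:
            return (open_index, close)
    return None

# Tokens for: ( f ( 1 , 2 ) - 3 )
tokens = [['(', None], ['@var', 'f'], ['(', None], ['@num', '1'], [',', None],
          ['@num', '2'], [')', None], ['-', None], ['@num', '3'], [')', None]]
print(find_innermost_parens(tokens))  # (2, 6)
```

Because the first ")" can never have an unclosed "(" between itself and its partner, the pair found this way contains no nested parentheses, so its contents can be handed to a parser for simple expressions.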
That is the end of the code.
The experimental results follow:
>>> ls=[['(',None],['@var','f'],['(',None],['@num',''],[',',None],['@num',''],[',',None],['@num',''],[',',None],['!',None],['-',None],['@var','x'],['?',None],['@var','y'],[':',None],['~',None],['@var','z'],[')',None],['-',None],['@num',''],[')',None],['/',None],['@num','']]
>>> ls
[['(', None], ['@var', 'f'], ['(', None], ['@num', ''], [',', None], ['@num', ''], [',', None], ['@num', ''], [',', None], ['!', None], ['-', None], ['@var', 'x'], ['?', None], ['@var', 'y'], [':', None], ['~', None], ['@var', 'z'], [')', None], ['-', None], ['@num', ''], [')', None], ['/', None], ['@num', '']]
>>> len(ls)
23
>>> parse_comp_expr(ls);ls
0
[['@expr', '/', ['@expr', '-', ['@func', 'f', ['@expr_list', ['@num', ''], ['@num', ''], ['@num', ''], ['@expr', '?:', ['@expr', '!', ['@expr', '-', ['@var', 'x']]], ['@var', 'y'], ['@expr', '~', ['@var', 'z']]]]], ['@num', '']], ['@num', '']]]
>>> len(ls)
1
>>>
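Because analyzing the structure of even a moderately complex tree takes a lot of time, next we will develop a rudimentary tool that displays the tree structures in the code graphically.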
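Until such a tool exists, a plain-text stand-in can already make nested results easier to read. The sketch below is only an assumption about what a minimal viewer might look like: tree_lines is a hypothetical helper that walks the node formats defined earlier (@var, @num, @expr, @func, @expr_list) and indents each level by two spaces.

```python
def tree_lines(node, depth=0):
    """Return the tree as a list of indented text lines, one node per line."""
    pad = "  " * depth
    kind = node[0]
    if kind in ("@var", "@num"):
        return ["%s%s %s" % (pad, kind, node[1])]
    if kind == "@expr_list":
        label, children = kind, node[1:]
    else:
        # "@expr" or "@func": node[1] is the operator or function name
        label, children = "%s %s" % (kind, node[1]), node[2:]
    lines = [pad + label]
    for child in children:
        lines.extend(tree_lines(child, depth + 1))
    return lines

tree = ['@expr', '+', ['@num', '31'],
        ['@expr', '*', ['@num', '8'], ['@num', '9']]]
print("\n".join(tree_lines(tree)))
# @expr +
#   @num 31
#   @expr *
#     @num 8
#     @num 9
```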
If you have questions or suggestions, feel free to leave a comment :)