HiBench Growth Notes (10): Analyzing the Source of execute_with_log.py
#!/usr/bin/env python2
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements.  See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License.  You may obtain a copy of the License at
#
#    http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import sys, os, subprocess
from terminalsize import get_terminal_size
from time import time, sleep
import re
import fnmatch

def load_colors():
    # Parse the KEY='value' lines of color.enabled.sh into a dict of ANSI escape codes.
    color_script_fn = os.path.join(os.path.dirname(__file__), "color.enabled.sh")
    with open(color_script_fn) as f:
        return dict([(k,v.split("'")[1].replace('\e[', "\033[")) for k,v in [x.strip().split('=') for x in f.readlines() if x.strip() and not x.strip().startswith('#')]])

Color=load_colors()
# With HIBENCH_PRINTFULLLOG set, every line is kept; otherwise '\r' overwrites the previous line.
if int(os.environ.get("HIBENCH_PRINTFULLLOG", 0)):
    Color['ret'] = os.linesep
else:
    Color['ret']='\r'

tab_matcher = re.compile("\t")
tabstop = 8
def replace_tab_to_space(s):
    def tab_replacer(match):
        pos = match.start()
        length = pos % tabstop
        if not length: length += tabstop
        return " " * length
    return tab_matcher.sub(tab_replacer, s)

class _Matcher:
    # Progress patterns for Hadoop 1.x, Hadoop 2.x and Spark log lines.
    hadoop = re.compile(r"^.*map\s*=\s*(\d+)%,\s*reduce\s*=\s*(\d+)%.*$")
    hadoop2 = re.compile(r"^.*map\s+\s*(\d+)%\s+reduce\s+\s*(\d+)%.*$")
    spark = re.compile(r"^.*finished task \S+ in stage \S+ \(tid \S+\) in.*on.*\((\d+)/(\d+)\)\s*$")

    def match(self, line):
        # MapReduce: progress is the average of the map and reduce percentages.
        for p in [self.hadoop, self.hadoop2]:
            m = p.match(line)
            if m:
                return (float(m.groups()[0]) + float(m.groups()[1]))/2

        # Spark: progress is finished tasks / total tasks.
        for p in [self.spark]:
            m = p.match(line)
            if m:
                return float(m.groups()[0]) / float(m.groups()[1]) * 100

matcher = _Matcher()

def show_with_progress_bar(line, progress, line_width):
    """
    Show text with progress bar.

    @progress:0-100
    @line: text to show
    @line_width: width of screen
    """
    pos = int(line_width * progress / 100)
    if len(line) < line_width:
        line = line + " " * (line_width - len(line))
    line = "{On_Yellow}{line_seg1}{On_Blue}{line_seg2}{Color_Off}{ret}".format(
        line_seg1 = line[:pos], line_seg2 = line[pos:], **Color)
    sys.stdout.write(line)

def execute(workload_result_file, command_lines):
    proc = subprocess.Popen(" ".join(command_lines), shell=True, bufsize=1, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    count = 100
    last_time=0
    log_file = open(workload_result_file, 'w')
    # see http://stackoverflow.com/a/4417735/1442961
    lines_iterator = iter(proc.stdout.readline, b"")
    for line in lines_iterator:
        count += 1
        if count > 100 or time()-last_time>1: # refresh terminal size every 100 lines or every second
            count, last_time = 0, time()
            width, height = get_terminal_size()
            width -= 1

        try:
            line = line.rstrip()
            log_file.write(line+"\n")
            log_file.flush()
        except KeyboardInterrupt:
            proc.terminate()
            break
        line = line.decode('utf-8')
        line = replace_tab_to_space(line)
        #print "{Red}log=>{Color_Off}".format(**Color), line
        lline = line.lower()

        def table_not_found_in_log(line):
            table_not_found_pattern = "*Table * not found*"
            regex = fnmatch.translate(table_not_found_pattern)
            reobj = re.compile(regex)
            if reobj.match(line):
                return True
            else:
                return False

        def database_default_exist_in_log(line):
            database_default_already_exist = "Database default already exists"
            if database_default_already_exist in line:
                return True
            else:
                return False

        def uri_with_key_not_found_in_log(line):
            uri_with_key_not_found = "Could not find uri with key [dfs.encryption.key.provider.uri]"
            if uri_with_key_not_found in line:
                return True
            else:
                return False

        if ('error' in lline) and lline.lstrip() == lline:
            #Bypass hive 'error's and KeyProviderCache error
            # Note: table_not_found_in_log is referenced here, not called, so this
            # expression is always truthy and the branch below is never taken.
            bypass_error_condition = table_not_found_in_log or database_default_exist_in_log(lline) or uri_with_key_not_found_in_log(lline)
            if not bypass_error_condition:
                COLOR = "Red"
                sys.stdout.write((u"{%s}{line}{Color_Off}{ClearEnd}\n" % COLOR).format(line=line,**Color).encode('utf-8'))

        else:
            if len(line) >= width:
                line = line[:width-4]+'...'
            progress = matcher.match(lline)
            if progress is not None:
                show_with_progress_bar(line, progress, width)
            else:
                sys.stdout.write(u"{line}{ClearEnd}{ret}".format(line=line, **Color).encode('utf-8'))
        sys.stdout.flush()
    print
    log_file.close()
    try:
        proc.wait()
    except KeyboardInterrupt:
        proc.kill()
        return 1
    return proc.returncode

def test_progress_bar():
    for i in range(101):
        show_with_progress_bar("test progress : %d" % i, i, 80)
        sys.stdout.flush()

        sleep(0.05)

if __name__=="__main__":
    sys.exit(execute(workload_result_file=sys.argv[1],
                     command_lines=sys.argv[2:]))
#    test_progress_bar()
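To see how execute() classifies each log line, the two decisions it makes can be tried in isolation. The following is only a sketch, not part of HiBench: the function names `progress_of` and `is_bypassed_error` and the sample log lines are made up for illustration, while the three regular expressions are copied from `_Matcher`. The bypass check assumes the intent was to call all three predicates against the lowercased line, which is a guess about the author's intent rather than what the listing does.

```python
import fnmatch
import re

# Regexes copied from _Matcher in the listing.
HADOOP = re.compile(r"^.*map\s*=\s*(\d+)%,\s*reduce\s*=\s*(\d+)%.*$")
HADOOP2 = re.compile(r"^.*map\s+\s*(\d+)%\s+reduce\s+\s*(\d+)%.*$")
SPARK = re.compile(
    r"^.*finished task \S+ in stage \S+ \(tid \S+\) in.*on.*\((\d+)/(\d+)\)\s*$")

# Assumed intent: the same three messages, written in lowercase because
# execute() lowercases every line before testing it.
TABLE_NOT_FOUND = re.compile(fnmatch.translate("*table * not found*"))
BYPASS_SUBSTRINGS = (
    "database default already exists",
    "could not find uri with key [dfs.encryption.key.provider.uri]",
)


def progress_of(line):
    """Return a 0-100 progress estimate for a log line, or None."""
    lline = line.lower()
    for pattern in (HADOOP, HADOOP2):
        m = pattern.match(lline)
        if m:
            # MapReduce: average of the map and reduce percentages.
            return (float(m.group(1)) + float(m.group(2))) / 2
    m = SPARK.match(lline)
    if m:
        # Spark: finished tasks divided by total tasks.
        return float(m.group(1)) / float(m.group(2)) * 100
    return None


def is_bypassed_error(line):
    """Return True if the line is one of the known harmless 'error' messages."""
    lline = line.lower()
    if TABLE_NOT_FOUND.match(lline):
        return True
    return any(text in lline for text in BYPASS_SUBSTRINGS)


if __name__ == "__main__":
    # Hypothetical log lines, shaped to fit the patterns above.
    print(progress_of("INFO mapreduce.Job:  map 50% reduce 0%"))
    print(progress_of("Finished task 3.0 in stage 1.0 (TID 7) in 120 ms on host1 (4/8)"))
    print(is_bypassed_error("Error: Table foo not found"))
```

Keeping the regexes and the bypass strings at module level avoids rebuilding them for every line, which the listing does by defining its three helper functions inside the loop.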