Basic Information

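  • Title: Codeflaws: A Programming Competition Benchmark for Evaluating Automated Program Repair Tools
  • Publication: ICSE'17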
  • Authors: Shin Hwei Tan, Jooyong Yi, Yulis, Sergey Mechtaev, Abhik Roychoudhury
  • Language: C
  • Source: Codeforces programming contest submissions (rejected/accepted pairs)
  • Description: a set of 3902 defects from 7436 programs automatically classified across 39 defect classes
  • Dataset Homepage

Summary

Existing benchmarks for automated program repair (such as ManyBugs and IntroClass) do not allow a thorough investigation of the relationship between fault types and the effectiveness of repair tools.
The paper lists five criteria for a benchmark that allows extensive evaluation of repair tools:

  • C1: Diverse types of real defects.
  • C2: Large number of defects.
  • C3: Large number of programs.
  • C4: Programs that are algorithmically complex.
  • C5: Large held-out test suite for patch correctness verification.

Overall, the authors crawled over 10000 webpages from Codeforces programming contests. For each rejected submission r, they look in the crawled data for an accepted submission a by the same user for the same programming problem. Each defect is represented by the submission pair (r, a). This yields 5544 defects in total. They then exclude 924 defects with inadequate held-out tests, 677 defects whose bugs could not be reproduced, and 41 defects affected by a known CIL bug in handling variable-sized multidimensional arrays, which leaves the 3902 defects in the benchmark.
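The pairing step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' crawler: the `Submission` record, its field names, the user handles, and the rule of taking the first accepted submission found are all hypothetical, since the summary only says that "another accepted submission" by the same user for the same problem is used. The last line re-derives the final benchmark size from the exclusion counts given above.

```python
from collections import defaultdict
from typing import NamedTuple


class Submission(NamedTuple):
    # Hypothetical record layout; the real crawled data may look different.
    submission_id: int
    user: str       # Codeforces handle (made-up values below)
    problem: str    # contest id and problem index, e.g. "1-A"
    accepted: bool  # True for an accepted verdict, False for a rejected one


def pair_defects(submissions):
    """Return (rejected_id, accepted_id) pairs for the same user and problem."""
    accepted = defaultdict(list)
    for s in submissions:
        if s.accepted:
            accepted[(s.user, s.problem)].append(s)

    defects = []
    for r in submissions:
        if r.accepted:
            continue
        candidates = accepted.get((r.user, r.problem), [])
        if candidates:
            # Assumption: take the first accepted submission encountered.
            defects.append((r.submission_id, candidates[0].submission_id))
    return defects


# Made-up sample data; only the two ids of the first pair appear in the text.
crawled = [
    Submission(18353198, "alice", "1-A", False),
    Submission(18353306, "alice", "1-A", True),
    Submission(20000001, "bob", "1-A", False),   # never accepted: no pair
    Submission(20000002, "carol", "2-B", True),  # accepted only: not a defect
]
defects = pair_defects(crawled)

# Defects left after the three exclusion steps reported in the summary.
remaining = 5544 - 924 - 677 - 41
```

Rejected submissions without a matching accepted one are dropped, which is why only pairs can become defects in the benchmark.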

All defects are divided into 39 classes by running GumTree on each buggy program and its patched version and classifying the AST-level syntactic differences between the two.

Structure

codeflaws
|> 1-A-bug-18353198-18353306 (<contestid>-<problem>-bug-<buggy-submissionid>-<accepted-submissionid>)
|===> 1-A-18353198.c (<contestid>-<problem>-<buggy-submissionid>.c)
|===> 1-A-18353306.c (<contestid>-<problem>-<accepted-submissionid>.c)
|===> input-neg1 (Test input files: input[0-9]+ file used by Test suite (i))
|===> output-neg1 (Test output files: output[0-9]+ file used by Test suite (i))
|===> heldout-input-pos1 (heldout-input[0-9]+ file used by Test suite (ii))
|===> heldout-output-pos1 (heldout-output[0-9]+ file used by Test suite (ii))
|===> 1-A-18353198.c.revlog (Test configuration for SPR that specifies the names of the passing/failing tests: <contestid>-<problem>-<buggy-submissionid>.c.revlog)
|===> test-genprog.sh (Repair test script, i.e. the test suite given to repair tools for generating a repair; used by search-based repair tools (GenProg, SPR, Prophet))
|===> test-angelix.sh (Repair test script, i.e. the test suite given to repair tools for generating a repair; used by Angelix, which requires inserting special instrumentation)
|===> test-valid.sh (Test script for patch validation: runs the held-out test suite to check the correctness of patches)
|===> Makefile (Makefile for compiling the buggy submission. This contains the CFLAGS options recommended by Codeforces. To compile the accepted submission, use the command make FILENAME=10-A-13543524)
|===> Makefile.genprog (Makefile.genprog for compiling the buggy submission using cilly. This is for GenProg experiments as GenProg works on CIL representation.)
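
The naming convention in the tree above is regular enough to be parsed mechanically. The following Python sketch splits a defect directory name into its parts and derives the two source file names from it. The names `DEFECT_DIR`, `Defect`, and `parse_defect_dir` are hypothetical helpers introduced here for illustration and are not part of the Codeflaws distribution; the pattern also assumes that the problem index consists only of letters and digits.

```python
import re
from typing import NamedTuple

# <contestid>-<problem>-bug-<buggy-submissionid>-<accepted-submissionid>
DEFECT_DIR = re.compile(
    r"^(?P<contest>\d+)-(?P<problem>[A-Za-z0-9]+)"
    r"-bug-(?P<buggy>\d+)-(?P<accepted>\d+)$"
)


class Defect(NamedTuple):
    contest: str
    problem: str
    buggy: str
    accepted: str

    @property
    def buggy_source(self):
        # <contestid>-<problem>-<buggy-submissionid>.c
        return f"{self.contest}-{self.problem}-{self.buggy}.c"

    @property
    def accepted_source(self):
        # <contestid>-<problem>-<accepted-submissionid>.c
        return f"{self.contest}-{self.problem}-{self.accepted}.c"


def parse_defect_dir(name):
    """Split a defect directory name into its four components."""
    match = DEFECT_DIR.match(name)
    if match is None:
        raise ValueError(f"not a defect directory name: {name!r}")
    return Defect(**match.groupdict())


defect = parse_defect_dir("1-A-bug-18353198-18353306")
print(defect.buggy_source)     # 1-A-18353198.c
print(defect.accepted_source)  # 1-A-18353306.c
```

Keeping the ids as strings avoids losing any leading zeros and makes rebuilding file names a plain string operation.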
