The Great DOM Fuzz-off of 2017

Posted by Ivan Fratric, Project Zero
https://googleprojectzero.blogspot.nl/2017/09/the-great-dom-fuzz-off-of-2017.html

Introduction

Historically, DOM engines have been one of the largest sources of web browser bugs(以前DOM是浏览器最大漏洞来源). And while in the recent years the popularity of those kinds of bugs in targeted attacks has somewhat fallen in favor of Flash (which allows for cross-browser exploits) and JavaScript engine bugs (which often result in very powerful exploitation primitives,利用更强), they are far from gone(远非如此). For example, CVE-2016-9079 (a bug that was used in November 2016 against Tor Browser users) was a bug in Firefox’s DOM implementation, specifically the part that handles SVG elements in a web page. It is also a rare case that a vendor will publish a security update that doesn’t contain fixes for at least several DOM engine bugs.//DOM漏洞还有空间?
An interesting property of many of those bugs is that they are more or less easy to find by fuzzing. This is why a lot of security researchers as well as browser vendors who care about security invest into building DOM fuzzers and associated infrastructure.//通过fuzz可以较容易找到。
As a result, after joining Project Zero, one of my first projects was to test the current state of resilience of major web browsers against DOM fuzzing.

The fuzzer

For this project I wanted to write a new fuzzer which takes some of the ideas from my previous DOM fuzzing projects, but also improves on them and implements new features. Starting from scratch also allowed me to end up with cleaner code that I’m open-sourcing(开源!) together with this blog post. The goal was not to create anything groundbreaking - as already noted by security researchers, many DOM fuzzers have begun to look like each other over time. Instead the goal was to create a fuzzer that has decent initial coverage, is easily understandable and extendible and can be reused by myself as well as other researchers for fuzzing other targets besides just DOM fuzzing.//更像一个框架
We named this new fuzzer Domato (credits to Tavis for suggesting the name). Like most DOM fuzzers, Domato is generative, meaning that the fuzzer generates a sample from scratch given a set of grammars that describes HTML/CSS structure as well as various JavaScript objects, properties and functions.
The fuzzer consists of several parts:

  • The base engine that can generate a sample given an input grammar. This part is intentionally fairly generic and can be applied to other problems besides just DOM fuzzing.//通用并且适合DOMfuzz之外的
  • The main script that parses the arguments and uses the base engine to create samples. Most logic that is DOM specific is captured in this part.//解析参数并创建例子
  • A set of grammars for generating HTML, CSS and JavaScript code.//语法
One of the most difficult aspects in the generation-based fuzzing is creating a grammar or another structure that describes the samples that are going to be created. In the past I experimented with manually created grammars as well as grammars extracted automatically from web browser code. Each of these approaches has advantages and drawbacks, so for this fuzzer I decided to use a hybrid approach:

  1. I initially extracted DOM API declarations from .idl files in Google Chrome Source. Similarly, I parsed Chrome’s layout tests to extract common (and not so common) names and values of various HTML and CSS properties.
  2. Afterwards, this automatically extracted data was heavily manually edited to make the generated samples more likely to trigger interesting behavior. One example of this are functions and properties that take strings as input: Just because a DOM property takes a string as an input does not mean that any string would have a meaning in the context of that property.
Otherwise, Domato supports features that you’d expect from a DOM fuzzer such as:

  • Generating multiple JavaScript functions that can be used as targets for various DOM callbacks and event handlers
  • Implicit (through grammar definitions) support for “interesting” APIs (e.g. the Range API) that have historically been prone to bugs.
Instead of going into much technical details here, the reader is referred to the fuzzer code and documentation at https://github.com/google/domato. It is my hope that by open-sourcing the fuzzer I would invite community contributions that would cover the areas I might have missed in the fuzzer or grammar creation.//语法部分

Setup

We tested 5 browsers with the highest market share: Google Chrome, Mozilla Firefox, Internet Explorer, Microsoft Edge and Apple Safari. We gave each browser approximately 100.000.000 iterations with the fuzzer and recorded the crashes. (If we fuzzed some browsers for longer than 100.000.000 iterations, only the bugs found within this number of iterations were counted in the results.) Running this number of iterations would take too long on a single machine and thus requires fuzzing at scale, but it is still well within the pay range of a determined attacker. For reference, it can be done for about $1k on Google Compute Engine given the smallest possible VM size, preemptable VMs (which I think work well for fuzzing jobs as they don’t need to be up all the time) and 10 seconds per run.
Here are additional details of the fuzzing setup for each browser:
  • Google Chrome was fuzzed on an internal Chrome Security fuzzing cluster called ClusterFuzz. To fuzz Google Chrome on ClusterFuzz we simply needed to upload the fuzzer and it was run automatically against various Chrome builds.
  • Mozilla Firefox was fuzzed on internal Google infrastructure (linux based). Since Mozilla already offers Firefox ASAN builds for download, we used that as a fuzzing target. Each crash was additionally verified against a release build.
  • Internet Explorer 11 was fuzzed on Google Compute Engine running Windows Server 2012 R2 64-bit. Given the lack of ASAN build, page heap was applied to iexplore.exe process to make it easier to catch some types of issues.
  • Microsoft Edge was the only browser we couldn’t easily fuzz on Google infrastructure since Google Compute Engine doesn’t support Windows 10 at this time and Windows Server 2016 does not include Microsoft Edge. That’s why for fuzzing it we created a virtual cluster of Windows 10 VMs on Microsoft Azure. Same as with Internet Explorer, page heap was applied to MicrosoftEdgeCP.exe process before fuzzing.
  • Instead of fuzzing Safari directly, which would require Apple hardware, we instead used WebKitGTK+ which we could run on internal (Linux-based) infrastructure. We created an ASAN build of the release version of WebKitGTK+. Additionally, each crash was verified against a nightly ASAN WebKit build running on a Mac.//利用WebKitGTK+而不需Apple硬件

Results

Without further ado, the number of security bugs found in each browsers are captured in the table below.
Only security bugs were counted in the results (doing anything else is tricky as some browser vendors fix non-security crashes while some don’t) and only bugs affecting the currently released version of the browser at the time of fuzzing were counted (as we don’t know if bugs in development version would be caught by internal review and fuzzing process before release).
Vendor
Browser
Engine
Number of Bugs
Project Zero Bug IDs
Google
Chrome
Blink
2
994, 1024
Mozilla
Firefox
Gecko
4**
1130, 1155, 1160, 1185
Microsoft
Internet Explorer
Trident
4
1011, 1076, 1118, 1233
Microsoft
Edge
EdgeHtml
6
1011, 1254, 1255, 1264, 1301, 1309
Apple
Safari
WebKit
17
999, 1038, 1044, 1080, 1082, 1087, 1090, 1097, 1105, 1114, 1241, 1242, 1243, 1244, 1246, 1249, 1250
Total
31*
*While adding the number of bugs results in 33, 2 of the bugs affected multiple browsers
**The root cause of one of the bugs found in Mozilla Firefox was in the Skia graphics library and not in Mozilla source. However, since the relevant code was contributed by Mozilla engineers, I consider it fair to count here.
All of the bugs listed here have been fixed in the current shipping versions of the browsers. As can be seen in the table most browsers did relatively well in the experiment with only a couple of security relevant crashes found. Since using the same methodology used to result in significantly higher number of issues just several years ago, this shows clear progress for most of the web browsers. For most of the browsers the differences are not sufficiently statistically significant to justify saying that one browser’s DOM engine is better or worse than another.//在实验阶段已经足够好。
However, Apple Safari is a clear outlier in the experiment with significantly higher number of bugs found. This is especially worrying given attackers’ interest in the platform as evidenced by the exploit prices and recent targeted attacks. It is also interesting to compare Safari’s results to Chrome’s, as until a couple of years ago, they were using the same DOM engine (WebKit). It appears that after the Blink/Webkit split either the number of bugs in Blink got significantly reduced or a significant number of bugs got introduced in the new WebKit code (or both). To attempt to address this discrepancy, I reached out to Apple Security proposing to share the tools and methodology. When one of the Project Zero members decided to transfer to Apple, he contacted me and asked if the offer was still valid. So Apple received a copy of the fuzzer and will hopefully use it to improve WebKit.
It is also interesting to observe the effect of MemGC, a use-after-free mitigation in Internet Explorer and Microsoft Edge. When this mitigation is disabled using the registry flag OverrideMemoryProtectionSetting, a lot more bugs appear. However, Microsoft considers these bugs strongly mitigated by MemGC and I agree with that assessment. Given that IE used to be plagued with use-after-free issues, MemGC is an example of a useful mitigation that results in a clear positive real-world impact. Kudos to Microsoft’s team behind it!//MemGC基本是的IE/Edge上的DOM UAF失效,但并不代表其它浏览器也是这样。
When interpreting the results, it is very important to note that they don’t necessarily reflect the security of the whole browser and instead focus on just a single component (DOM engine), but one that has historically been a source of many security issues. This experiment does not take into account other aspects such as presence and security of a sandbox, bugs in other components such as scripting engines etc. I can also not disregard忽视 the possibility that, within DOM, my fuzzer is more capable at finding certain types of issues than other, which might have an effect on the overall stats.

Experimenting with coverage-guided DOM fuzzing

Since coverage-guided fuzzing seems to produce very good results in other areas we wanted to combine it with the DOM fuzzing. We built an experimental coverage-guided DOM fuzzer and ran it against Internet Explorer. IE was selected as a target both because of the author's familiarity with it and because it is very easy to limit coverage collection to just the DOM component (mshtml.dll,仅计算DOM组件覆盖率). The experimental fuzzer used a modified Domato engine to generate mutations and used a modified WinAFL's DynamoRIO client to measure coverage. The fuzzing flow worked roughly as follows:
  1. The fuzzer generates a new set of samples by mutating existing samples in the corpus.//变异
  2. The fuzzer spawns IE process which opens a harness HTML page.
  3. The harness HTML page instructs the fuzzer to start measuring coverage and loads one of the samples in an iframe//执行
  4. After the sample executes, it notifies the harness which notifies the fuzzer to stop collecting coverage.//计算代码覆盖率
  5. Coverage map is examined and if it contains unseen coverage, the corresponding sample is added to the corpus.
  6. Go to step 3 until all samples are executed or the IE process crashes
  7. Periodically minimize the corpus using the AFL’s cmin algorithm.//自动精简
  8. Go to step 1.
The following set of mutations was used to produce new samples from the existing ones://与上面的相比
  • Adding new CSS rules
  • Adding new properties to the existing CSS rules
  • Adding new HTML elements
  • Adding new properties to the existing HTML elements
  • Adding new JavaScript lines. The new lines would be aware of the existing JavaScript variables and could thus reuse them.
Unfortunately, while we did see a steady increase in the collected coverage over time while running the fuzzer, it did not result in any new crashes (i.e. crashes that would not be discovered using dumb fuzzing.盲目fuzzing). It would appear more investigation is required in order to combine coverage information with DOM fuzzing in a meaningful way.//并没有看到代码覆盖率的稳步增长,也没有新增crash(这些crash不能通过盲目fuzzing发现)。需要进一步思考,如何将DOMfuzzing与代码覆盖率结合!

Conclusion

As stated before, DOM engines have been one of the largest sources of web browser bugs. While this type of bug are far from gone, most browsers show clear progress in this area. The results also highlight the importance of doing continuous security testing as bugs get introduced with new code and a relatively short period of development can significantly deteriorate a product’s security posture.
The big question at the end is: Are we now at a stage where it is more worthwhile to look for security bugs manually than via fuzzing? Or do more targeted fuzzers need to be created instead of using generic DOM fuzzers to achieve better results? And if we are not there yet - will we be there soon (hopefully)? The answer certainly depends on the browser and the person in question. Instead of attempting to answer these questions myself, I would like to invite the security community to let us know their thoughts.

转:The Great DOM Fuzz-off of 2017的更多相关文章

  1. JQuery基本知识、选择器、事件、DOM操作、动画--2017年2月10日

    $(对象)可以将JS对象转换为JQuery对象  .get(0)可以将JQuery对象转换为JS对象 并无太大区别,灵活点出即可

  2. 像VUE一样写微信小程序-深入研究wepy框架

    像VUE一样写微信小程序-深入研究wepy框架 微信小程序自发布到如今已经有半年多的时间了,凭借微信平台的强大影响力,越来越多企业加入小程序开发. 小程序于M页比相比,有以下优势: 1.小程序拥有更多 ...

  3. JS DOM(2017.12.28)

    一.获得元素节点的方法 document.getElementById()    根据Id获取元素节点 document.getElementsByName()    根据name获取元素节点   遍 ...

  4. 2017 年值得一瞥的 JavaScript 相关技术趋势

    跨年前两天,Dan Abramov在Twitter上提了一个问题: JS社区毫不犹豫的抛出了它们对于新技术的预期与期待,本文内容也是总结自Twitter的回复,按照流行度降序排列.有一个尚未确定的小点 ...

  5. DOM的相关优化

    为什么要进行DOM优化? DOM对象本身也是一个js对象,所以严格来说,并不是操作这个对象慢,而是说操作了这个对象后,会触发一些浏览器行为,比如布局(layout)和绘制(paint). 首先先说一些 ...

  6. python运维开发(十六)----Dom&&jQuery

    内容目录: Dom 查找 操作 事件 jQuery 查找 筛选 操作 事件 扩展 Dom 文档对象模型(Document Object Model,DOM)是一种用于HTML和XML文档的编程接口.它 ...

  7. 【2017年新篇章】 .NET 面试题汇总(二)

    本次给大家介绍的是我收集以及自己个人保存一些.NET面试题第二篇 第一篇文章请到这里:[2017年新篇章] .NET 面试题汇总(一) 简介 此次包含的不止是.NET知识,也包含少许前端知识以及.ne ...

  8. X-NUCA 2017 web专题赛训练题 阳光总在风雨后和default wp

     0X0.前言 X-NUCA 2017来了,想起2016 web专题赛,题目都打不开,希望这次主办方能够搞好点吧!还没开赛,依照惯例会有赛前指导,放一些训练题让CTFer们好感受一下题目. 题目有一大 ...

  9. HTML DOM (文档对象模型)

    当网页被加载时,浏览器会创建页面的文档对象模型(Document Object Model). HTML DOM 模型被构造为对象的树. HTML DOM 树 通过可编程的对象模型,JavaScrip ...

随机推荐

  1. 设置texture

    //获取内部资源贴图 public void setInsideTexture() { Texture2D texture = Resources.Load(texture_url) as Textu ...

  2. 谈一谈我所了解的https

    一. http协议 首先我并不会很深入的去探讨这个东西,即使我曾经花了很长的时间去研究这个东西.主要是我考虑到1. 自己没有系统的去学习这一块的知识,讲解的会比较的肤浅.2. 就算是懂这个东西也不一定 ...

  3. Kali设置代理

    原文:Kali-linux设置ProxyChains ProxyChains是Linux和其他Unices下的代理工具.它可以使任何程序通过代理上网,允许TCP和DNS通过代理隧道,支持HTTP.SO ...

  4. stanfordCorenlp在python3中的安装使用+词性学习

    1 安装 前言 Stanford CoreNLP的源代码是使用Java写的,提供了Server方式进行交互.stanfordcorenlp是一个对Stanford CoreNLP进行了封装的Pytho ...

  5. Android中的通信Volley

    1. Volley简介 我们平时在开发Android应用的时候不可避免地都需要用到网络技术,而多数情况下应用程序都会使用HTTP协议来发送和接收网络数据.Android系统中主要提供了两种方式来进行H ...

  6. Django 自定义分页类

    分页类代码: class Page(object): ''' 自定义分页类 可以实现Django ORM数据的的分页展示 输出HTML代码: 使用说明: from utils import mypag ...

  7. 15个你不得不知道的Chrome dev tools的小技巧

    转载自:https://www.imooc.com/article/2559 谷歌浏览器如今是Web开发者们所使用的最流行的网页浏览器.伴随每六个星期一次的发布周期和不断扩大的强大的开发功能,Chro ...

  8. Loadrunner脚本学习总结

    1.1      web脚本录制选择Web(HTTP/HTML)协议: 注意录制脚本前选择如下协议: 1.2      脚本如果需要使用如下函数: web_reg_save_param.web_fin ...

  9. C#子线程中更新主线程UI-----注意项

    using System;using System.Collections.Generic;using System.ComponentModel;using System.Data;using Sy ...

  10. Cookie/Session的认识

    Cookie 1.cookie是什么? cookie是一小段文本,伴随着用户请求和页面在web服务器和浏览器之间传递,cookie包含每次用户访问站点时,web应用程序都可以读取的信息 2.为什么需要 ...