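My master's thesis is on Android malware classification, so I spent some time collecting Android malware samples. Some come from public datasets released with earlier papers, others from community sites and platforms. This post summarizes them, giving for each sample set or dataset its collection period, sample count, associated papers, and how to obtain it.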

This is a list of Android malware datasets used in academic research. Some of them are still being updated.

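  1. I have the Drebin dataset and a set of Android malware samples from VirusTotal (2018.3), about 15 GB. The VirusTotal samples are on Google Drive; because of limited storage space I uploaded only 2,560 of the 5,560 Drebin samples to OneDrive. If you need them, contact me directly and tell me your institution and role (sharing via Google Drive requires your Gmail address).
  2. Historical datasets such as Drebin and Genome can be obtained by asking your advisor and then emailing the dataset authors. For datasets that are no longer shared, you can ask universities and institutions that already hold a copy; most well-known universities in China have them.
  3. You can apply for VirusTotal samples yourself. There are two options: API access and a malware folder. The former gives you detailed detection reports for samples; the latter is mainly a large collection of malware samples. The application requires a lot of information, such as your identity, research topic, and details of your university and advisor.
  4. For the password to the Contagio samples, contact the blog owner directly.
  5. All samples are for academic research only; please cite the source of the samples.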

VirusTotal Mobile Apps Samples

VirusTotal: Analyze suspicious files and URLs to detect types of malware including viruses, worms, and trojans.

Description: VirusTotal can also be used through a smartphone app. VirusTotal is about empowering the Community in order to build tools that will make the Internet a safer place, as such, we like to credit and feature Community-developed goodies that help the antivirus industry in receiving more files in order to have more visibility into threats. Below you can find links to apps that will allow you to interact with VirusTotal making use of your smartphone, note that these are not developed by VirusTotal itself and so we are not responsible for them.

Sample Volume: N/A

Collected Time: up to date

HomePage: https://www.virustotal.com

Way to get:

  1. If you need a small number of samples, log in to VirusTotal and download them manually.
  2. If you need a large number of samples, email VirusTotal with an academic request. You can choose "access to the Academic API" or "access to a folder of malware".
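
Once API access is granted, looking up the detection report for a sample might look roughly like the sketch below. It assumes the public VirusTotal API v3 file endpoint and its `x-apikey` header; the function names `build_report_request` and `fetch_report` are my own and not part of any VirusTotal client, and quotas and permissions depend on the kind of key you are given.

```python
import json
import re
import urllib.request

# VirusTotal API v3 endpoint for a file report, addressed by MD5, SHA-1, or SHA-256.
VT_FILE_ENDPOINT = "https://www.virustotal.com/api/v3/files/{}"
_HASH_RE = re.compile(r"^(?:[0-9a-fA-F]{32}|[0-9a-fA-F]{40}|[0-9a-fA-F]{64})$")


def build_report_request(file_hash: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for the report of one file."""
    if not _HASH_RE.match(file_hash):
        raise ValueError("expected an MD5, SHA-1, or SHA-256 hex digest")
    return urllib.request.Request(
        VT_FILE_ENDPOINT.format(file_hash.lower()),
        headers={"x-apikey": api_key, "Accept": "application/json"},
    )


def fetch_report(file_hash: str, api_key: str) -> dict:
    """Send the request and decode the JSON body (needs network and a valid key)."""
    request = build_report_request(file_hash, api_key)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

Keeping request construction separate from sending makes it easy to check the URL and headers offline before spending any API quota.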

Contagio Mobile Malware Mini Dump

Description: aka "take a sample, leave a sample". Contagio mobile mini-dump is part of contagiodump.blogspot.com. It offers an upload dropbox for you to share your mobile malware samples. You can also download any samples individually or in one zip.

Sample Volume: N/A

Collected Time: up to date

HomePage: http://contagiominidump.blogspot.hk/

Way to get: free to download from the Contagio blog. You can also download the samples from http://contagiomobile.deependresearch.org/index.html. However, the packages need a password to decompress; email the blogger to get it.
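
After you have the password, unpacking a downloaded package can be scripted. The following is a minimal sketch using Python's standard `zipfile` module; the function name `extract_samples` and any archive path or password you pass in are placeholders of mine. Note that `zipfile` only decrypts the legacy ZIP encryption scheme, so if an archive uses AES encryption you will need an external tool such as 7-Zip instead. Since the contents are live malware, extract only inside an isolated analysis environment.

```python
import zipfile
from typing import List, Optional


def extract_samples(archive: str, dest: str, password: Optional[str] = None) -> List[str]:
    """Extract a (possibly password-protected) zip and return the file names inside it.

    A wrong password on an encrypted entry makes zipfile raise RuntimeError.
    """
    pwd = password.encode("utf-8") if password else None
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(dest, pwd=pwd)
        # Skip directory entries, which end with a slash.
        return [name for name in zf.namelist() if not name.endswith("/")]
```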

Koodous

Description: Koodous is a collaborative platform that combines the power of online analysis tools with social interactions between the analysts over a vast APKs repository.

Sample Volume: N/A

Collected Time: up to date

HomePage: https://koodous.com/

Way to get: register and download manually, or use the API.

The Drebin Dataset

Description: The dataset contains 5,560 applications from 179 different malware families. The samples have been collected in the period of August 2010 to October 2012 and were made available to us by the MobileSandbox project.

Sample Volume: 5,560 applications from 179 different malware families

Collected Time: 2010.8 - 2012.10

Papers:

  1. Daniel Arp, Michael Spreitzenbarth, Malte Huebner, Hugo Gascon, and Konrad Rieck, "Drebin: Efficient and Explainable Detection of Android Malware in Your Pocket", 21st Annual Network and Distributed System Security Symposium (NDSS), February 2014
  2. Michael Spreitzenbarth, Florian Echtler, Thomas Schreck, Felix C. Freiling, Johannes Hoffmann, "MobileSandbox: Looking Deeper into Android Applications", 28th International ACM Symposium on Applied Computing (SAC), March 2013

HomePage: https://www.sec.cs.tu-bs.de/~danarp/drebin/index.html

Way to get: send an email to the dataset authors.
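
When a copy of the dataset arrives, it is worth checking the sample and family counts against the figures above (5,560 applications, 179 families). The sketch below assumes the family labels come as a CSV file with `sha256` and `family` columns; those column names and the function name `summarize_labels` are assumptions for illustration, so adjust them to whatever label file your copy actually contains.

```python
import csv
from collections import Counter
from typing import Iterable, Tuple


def summarize_labels(lines: Iterable[str]) -> Tuple[int, Counter]:
    """Count labeled samples and samples per family.

    `lines` is any iterable of CSV text lines (for example an open file)
    whose header row contains a `family` column (assumed name).
    """
    reader = csv.DictReader(lines)
    families = Counter(row["family"] for row in reader)
    return sum(families.values()), families
```

Usage would be along the lines of `with open(label_path, newline="") as fh: total, families = summarize_labels(fh)`, after which `total` and `len(families)` can be compared with the published numbers.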

Android Malware Genome Project

(2015/12/21) Due to limited resources and the fact that the students involved in this project have graduated, we have decided to stop the efforts of malware dataset sharing.

Description: In this project, we focus on the Android platform and aim to systematize or characterize existing Android malware. Particularly, with more than one year effort, we have managed to collect more than 1,200 malware samples that cover the majority of existing Android malware families, ranging from their debut in August 2010 to recent ones in October 2011.

Sample Volume: more than 1,200

Collected Time: 2010.8 - 2011.10

Papers:

Yajin Zhou, Xuxian Jiang, Dissecting Android Malware: Characterization and Evolution. Proceedings of the 33rd IEEE Symposium on Security and Privacy (Oakland 2012). San Francisco, CA, May 2012

HomePage: http://www.malgenomeproject.org/

Way to get: ask someone who has already obtained this dataset, for example a university, research lab, or company that received it before sharing stopped.

Kharon Malware Dataset

Description: The Kharon dataset is a collection of malware totally reversed and documented. This dataset has been constructed to help us to evaluate our research experiments. Its construction has required a huge amount of work to understand the malicious code, trigger it and then construct the documentation. This dataset is now available for research purposes; we hope it will help you to lead your own experiments.

Papers: CIDRE, EPI. Kharon dataset: Android malware under a microscope. Learning from Authoritative Security Experiment Results (2016): 1.

Homepage: http://kharon.gforge.inria.fr/dataset/

AMD Project

Description: AMD contains 24,553 samples, categorized in 135 varieties among 71 malware families ranging from 2010 to 2016. The dataset provides an up-to-date picture of the current landscape of Android malware, and is publicly shared with the community.

Sample Volume: 24,553 samples

Collected Time: 2010 to 2016

Papers:

  1. Li Y, Jang J, Hu X, et al. Android malware clustering through malicious payload mining[C]//International Symposium on Research in Attacks, Intrusions, and Defenses. Springer, Cham, 2017: 192-214.
  2. Wei F, Li Y, Roy S, et al. Deep Ground Truth Analysis of Current Android Malware[C]//International Conference on Detection of Intrusions and Malware, and Vulnerability Assessment. Springer, Cham, 2017: 252-276.

Homepage: http://amd.arguslab.org

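For more material on Android malware classification, see my GitHub project DroidCC, which collects Android malware detection tools, recent references, third-party app markets, and other resources.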

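If you only want the malware samples, please contact me by email where possible and state your institution and personal identity. Requests without identity information will not be answered.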
