9. Investigation of the Android Malware (CIC-InvesAndMal2019)

CICInvesAndMal2019 is the publicly available second part of the CICAndMal2017 dataset. It includes permissions and intents as static features, and API calls and all generated log files as dynamic features, captured in three steps (during installation, before restarting, and after restarting the phone). In this part, we improve our malware category and family classification performance by around 30% by combining the previous dynamic features (80 network-flow features extracted with CICFlowMeter-V3) with 2-gram sequential relations of API calls. In addition, we examine these features in the presented two-layer malware analysis framework. We also provide other captured features such as battery states, log states, packages, and process logs.
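As an illustration of what "2-gram sequential relations of API calls" can mean in practice, the sketch below counts consecutive pairs of calls in an ordered trace. It is a minimal example, not the authors' feature extractor: the function name `api_call_bigrams` and the API names in the sample trace are hypothetical and are not taken from the dataset.

```python
from collections import Counter
from typing import List, Tuple


def api_call_bigrams(calls: List[str]) -> Counter:
    """Count consecutive pairs (2-grams) in an ordered API-call trace."""
    # zip pairs each call with its successor; traces shorter than 2 yield nothing.
    pairs: List[Tuple[str, str]] = list(zip(calls, calls[1:]))
    return Counter(pairs)


# Hypothetical trace for illustration only.
trace = [
    "getDeviceId",
    "openConnection",
    "sendTextMessage",
    "openConnection",
    "sendTextMessage",
]
bigrams = api_call_bigrams(trace)

for pair, count in sorted(bigrams.items()):
    print(pair, count)
```

The resulting counts can be placed in a fixed-order vector (one position per known 2-gram) and concatenated with the network-flow features of the same sample, which is one plausible way to combine the two feature sets.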

In this second part of the dataset, we followed the same installation and capturing process as in the first part:

We installed more than 5,000 of the collected samples (426 malware and 5,065 benign) on real devices. The malware samples in this dataset are classified into four categories:

Adware
Ransomware
Scareware
SMS Malware

Our samples come from 42 unique malware families. The families in each category and the number of captured samples per family are as follows:

Adware
Dowgin family, 10 captured samples
Ewind family, 10 captured samples
Feiwo family, 15 captured samples
Gooligan family, 14 captured samples
Kemoge family, 11 captured samples
Koodous family, 10 captured samples
Mobidash family, 10 captured samples
Selfmite family, 4 captured samples
Shuanet family, 10 captured samples
Youmi family, 10 captured samples

Ransomware
Charger family, 10 captured samples
Jisut family, 10 captured samples
Koler family, 10 captured samples
LockerPin family, 10 captured samples
Simplocker family, 10 captured samples
Pletor family, 10 captured samples
PornDroid family, 10 captured samples
RansomBO family, 10 captured samples
Svpeng family, 11 captured samples
WannaLocker family, 10 captured samples

Scareware
AndroidDefender family, 17 captured samples
AndroidSpy.277 family, 6 captured samples
AV for Android family, 10 captured samples
AVpass family, 10 captured samples
FakeApp family, 10 captured samples
FakeApp.AL family, 11 captured samples
FakeAV family, 10 captured samples
FakeJobOffer family, 9 captured samples
FakeTaoBao family, 9 captured samples
Penetho family, 10 captured samples
VirusShield family, 10 captured samples

SMS Malware
BeanBot family, 9 captured samples
Biige family, 11 captured samples
FakeInst family, 10 captured samples
FakeMart family, 10 captured samples
FakeNotify family, 10 captured samples
Jifake family, 10 captured samples
Mazarbot family, 9 captured samples
Nandrobox family, 11 captured samples
Plankton family, 10 captured samples
SMSsniffer family, 9 captured samples
Zsone family, 10 captured samples
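The per-family counts listed above can be cross-checked against the stated totals (42 families, 426 malware samples). The sketch below transcribes the counts from the lists and sums them per category; the dictionary name `FAMILY_COUNTS` is only an illustrative choice.

```python
# Captured samples per family, transcribed from the lists above.
FAMILY_COUNTS = {
    "Adware": {
        "Dowgin": 10, "Ewind": 10, "Feiwo": 15, "Gooligan": 14, "Kemoge": 11,
        "Koodous": 10, "Mobidash": 10, "Selfmite": 4, "Shuanet": 10, "Youmi": 10,
    },
    "Ransomware": {
        "Charger": 10, "Jisut": 10, "Koler": 10, "LockerPin": 10, "Simplocker": 10,
        "Pletor": 10, "PornDroid": 10, "RansomBO": 10, "Svpeng": 11, "WannaLocker": 10,
    },
    "Scareware": {
        "AndroidDefender": 17, "AndroidSpy.277": 6, "AV for Android": 10,
        "AVpass": 10, "FakeApp": 10, "FakeApp.AL": 11, "FakeAV": 10,
        "FakeJobOffer": 9, "FakeTaoBao": 9, "Penetho": 10, "VirusShield": 10,
    },
    "SMS Malware": {
        "BeanBot": 9, "Biige": 11, "FakeInst": 10, "FakeMart": 10,
        "FakeNotify": 10, "Jifake": 10, "Mazarbot": 9, "Nandrobox": 11,
        "Plankton": 10, "SMSsniffer": 9, "Zsone": 10,
    },
}

# Sum the samples in each category, then overall.
category_totals = {cat: sum(fams.values()) for cat, fams in FAMILY_COUNTS.items()}
family_total = sum(len(fams) for fams in FAMILY_COUNTS.values())
sample_total = sum(category_totals.values())

print(category_totals)
# {'Adware': 104, 'Ransomware': 101, 'Scareware': 112, 'SMS Malware': 109}
print(family_total, sample_total)  # 42 426
```

The categories are roughly balanced (about 100 to 112 samples each), while individual families range from 4 to 17 samples, which is worth keeping in mind when splitting data for family-level classification.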

To obtain a comprehensive view of our malware samples, we created a specific scenario for each malware category. We also defined three data-capturing states to overcome the stealthiness of advanced malware:

Installation: The first state of data capturing, which occurs immediately after installing the malware (1-3 min). (In the dataset, the folder name is "AfterInstall".)
Before restart: The second state of data capturing, which occurs 15 min before rebooting the phone. (In the dataset, the folder name is "Before".)
After restart: The last state of data capturing, which occurs 15 min after rebooting the phone. (In the dataset, the folder name is "After".)
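When processing the captured files, it can help to label each file with its capture state based on the folder names given above. The sketch below is one way to do that. Only the three folder names come from the dataset description; the function name `capture_state`, the state labels, and the example paths (category/family/state/file) are assumptions for illustration, and the actual directory layout of the download may differ.

```python
from pathlib import PurePosixPath
from typing import Optional

# Folder names as given in the dataset description, mapped to
# illustrative state labels (the labels themselves are an assumption).
STATE_BY_FOLDER = {
    "AfterInstall": "installation",
    "Before": "before_restart",
    "After": "after_restart",
}


def capture_state(path: str) -> Optional[str]:
    """Return the capture state encoded in a file path, or None if absent."""
    # Compare whole path components so that "After" never matches "AfterInstall".
    for part in PurePosixPath(path).parts:
        if part in STATE_BY_FOLDER:
            return STATE_BY_FOLDER[part]
    return None


# Hypothetical paths; the real layout of the archive is not specified here.
print(capture_state("Ransomware/Koler/AfterInstall/sample.log"))  # installation
print(capture_state("Ransomware/Koler/Before/sample.log"))        # before_restart
print(capture_state("Ransomware/Koler/After/sample.log"))         # after_restart
print(capture_state("Ransomware/Koler/sample.log"))               # None
```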

License
The CICInvesAndMal2019 dataset is publicly available for researchers. If you use our dataset, you should cite our related research paper, which outlines the details of the dataset and its underlying principles:

- Laya Taheri, Andi Fitriah Abdulkadir, Arash Habibi Lashkari; Extensible Android Malware Detection and Family Classification Using Network-Flows and API-Calls, The IEEE (53rd) International Carnahan Conference on Security Technology, India, 2019

You can download this dataset from here.