Obfuscated malware is malware that hides to avoid detection and extermination. The obfuscated malware dataset is designed to test obfuscated malware detection methods through memory analysis. It was created to represent a real-world situation as closely as possible, using malware that is prevalent in the real world. Made up of Spyware, Ransomware and Trojan Horse malware, it provides a balanced dataset that can be used to test obfuscated malware detection systems.
This dataset uses debug mode for the memory dump process to prevent the dumping process from showing up in the memory dumps. As a result, the dumps more accurately represent what an average user would have running at the time of a malware attack.
The obfuscated malware dataset focuses on simulating real-world scenarios. Figure 1 shows the breakdown of benign and malicious memory dumps. Figure 2 shows which malware families are used in each malware category: Spyware (a), Ransomware (b), and Trojan Horse (c). Figure 3 shows the malware families used across the whole dataset.
Figure 1: Memory Dump Categories
Figure 2A: Spyware Families
Figure 2B: Ransomware Families
Figure 2C: Trojan Horse Families
Figure 3: Complete Dataset Breakdown
The dataset is balanced: 50% of the memory dumps are malicious and 50% are benign. It contains a total of 58,596 records, with 29,298 benign and 29,298 malicious. Figure 4, shown below, is a table giving the total count of each malware family in each malware category.
Figure 4: Malware Table Breakdown
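The balance stated above (29,298 benign and 29,298 malicious records out of 58,596) can be sanity-checked with a few lines of Python. This is only a minimal sketch: the helper name `class_balance` and the label strings `"Benign"` and `"Malware"` are assumptions made for illustration, since this description does not give the file layout or the name of the label column.

```python
from collections import Counter


def class_balance(labels):
    """Return {label: (count, fraction_of_total)} for an iterable of class labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: (n, n / total) for label, n in counts.items()}


# Stand-in labels that reproduce the totals stated in the text.
# With the real data, pass the dataset's label column instead
# (the label strings used here are hypothetical).
labels = ["Benign"] * 29298 + ["Malware"] * 29298

balance = class_balance(labels)
for label, (n, frac) in balance.items():
    print(f"{label}: {n} records ({frac:.0%})")
```

Running the same helper on the actual label column is a quick way to confirm that a downloaded copy is complete before training or evaluating a detector on it.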
You may redistribute, republish, and mirror the CIC-Darknet2020 dataset in any form. However, any use or redistribution of the data must include a citation to the CIC-Darknet2020 dataset and the following paper:
- Tristan Carrier, Princy Victor, Ali Tekeoglu, Arash Habibi Lashkari, “Detecting Obfuscated Malware using Memory Feature Engineering”, The 8th International Conference on Information Systems Security and Privacy (ICISSP), 2022
You can download this dataset from here.