Arash Habibi Lashkari
Associate Professor

CYBERSECURITY OPEN-SOURCE ANALYZERS (INFORMATION EXTRACTOR (IE))

12. Smart Contracts Vulnerability Analyzer (SCsVulLyzer V2.0)

2024

As part of the Understanding Cybersecurity Series (UCS), SCsVulLyzer V2.0 is an open-source Python project that extracts more than 240 features to profile Smart Contracts (SCs) for vulnerability detection on the Ethereum blockchain platform. This version classifies features into compiler-based and non-compiler-based categories, enabling a broader scope of feature extraction than SCsVulLyzer V1.0. Compiler-based features, such as the Abstract Syntax Tree (AST) and the Application Binary Interface (ABI), are derived post-compilation, while non-compiler-based features use natural language processing techniques tailored to identify critical keywords in the source code. Additionally, the tool introduces three new feature categories (Contract Information, Source Code Information, and Solidity Information) that quantify aspects such as function counts, statements, loops, and lines of code. These additions allow a more granular, in-depth analysis of smart contracts. Another notable enhancement in this version is 'bytecode entropy', a measure of the randomness within the bytecode that serves as an indicator of unpredictability and complexity. Metrics of this kind are particularly valuable in fields like cryptography and anomaly detection.

Related published papers:
- Sepideh HajiHosseinKhani, Arash Habibi Lashkari, Ali Mizani Oskui, "Unveiling Smart Contracts Vulnerabilities: Toward Profiling Smart Contracts Vulnerabilities using Enhanced Genetic Algorithm and Generating Benchmark Dataset", Blockchain: Research and Applications, December 2024, 100253

For more information and to download the source code, visit this page.
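
The 'bytecode entropy' feature described for SCsVulLyzer V2.0 can be illustrated with Shannon entropy computed over byte values. The following is a minimal Python sketch under that assumption, not the tool's actual implementation; the function name `bytecode_entropy` and the hex-string input format are hypothetical.

```python
import math
from collections import Counter


def bytecode_entropy(bytecode_hex: str) -> float:
    """Shannon entropy, in bits per byte, of hex-encoded bytecode.

    Hypothetical helper: assumes the bytecode arrives as a hex string,
    optionally prefixed with "0x".
    """
    hex_str = bytecode_hex[2:] if bytecode_hex.startswith("0x") else bytecode_hex
    data = bytes.fromhex(hex_str)
    if not data:
        return 0.0
    counts = Counter(data)          # occurrences of each byte value
    total = len(data)
    # H = -sum(p * log2(p)) over the observed byte values
    return -sum((n / total) * math.log2(n / total) for n in counts.values())
```

The result ranges from 0 (every byte identical) to 8 bits per byte (all 256 byte values equally frequent), so higher values indicate less predictable bytecode.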

11. Application Layer Flow Analyzer (ALFlowLyzer)

2024

ALFlowLyzer generates bidirectional flows from the Application Layer of network traffic and extracts 120 features. The first packet of a flow determines the forward (source to destination) and backward (destination to source) directions, so the statistical time-related features can be calculated separately in each direction. Additional functionalities include selecting features from the list of existing features, adding new features, and controlling the duration of the flow timeout.

Related published papers:
- MohammadMoein Shafi, Arash Habibi Lashkari, Hardhik Mohanty, "Unveiling Malicious DNS Behavior Profiling and Generating Benchmark Dataset through Application Layer Traffic Analysis", Computers and Electrical Engineering, Volume 118, Part B, September 2024, 109436

For more information and to download the source code, visit this page.

10. Network and Transportation Layers Flow Analyzer (NTLFlowLyzer)

2024

The NTLFlowLyzer generates bidirectional flows from the Network and Transportation Layers of network traffic, where the first packet determines the forward (source to destination) and backward (destination to source) directions. Hence, the statistical time-related features can be calculated separately in the forward and backward directions. Additional functionalities include selecting features from the list of existing features, adding new features, and controlling the duration of flow timeout.
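
To illustrate how the first packet fixes the forward direction and how a time-related statistic can then be computed per direction, here is a small Python sketch. It is an assumption-laden illustration rather than NTLFlowLyzer's code: the packet representation as `(timestamp, src, dst)` tuples, the function name, and the choice of mean inter-arrival time (IAT) as the example feature are all hypothetical.

```python
from statistics import mean


def directional_iat_means(packets):
    """Mean inter-arrival time per direction for ONE bidirectional flow.

    packets: list of (timestamp, src, dst) tuples sorted by timestamp
    (hypothetical format). The source of the first packet defines the
    forward direction; everything else is treated as backward.
    """
    if not packets:
        return {"fwd_iat_mean": 0.0, "bwd_iat_mean": 0.0}

    forward_src = packets[0][1]
    fwd_times = [t for t, src, _ in packets if src == forward_src]
    bwd_times = [t for t, src, _ in packets if src != forward_src]

    def iat_mean(times):
        # Gaps between consecutive packets travelling in the same direction
        gaps = [later - earlier for earlier, later in zip(times, times[1:])]
        return mean(gaps) if gaps else 0.0

    return {"fwd_iat_mean": iat_mean(fwd_times), "bwd_iat_mean": iat_mean(bwd_times)}
```

Other per-direction statistics (minimum, maximum, standard deviation, packet counts) would follow the same pattern once the packets are split by direction.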

Related published papers:
- MohammadMoein Shafi, Arash Habibi Lashkari, Arousha Haghighian Roudsari, "NTLFlowLyzer: Toward Generating an Intrusion Detection Dataset and Intruders Behavior Profiling through Network Layer Traffic Analysis and Pattern Extraction", Computers & Security, 104160, ISSN 0167-4048 (2024)

For more information and to download the source code, visit this page.

9. Benign User Profiler (BUP)

2024

The BUP is responsible for profiling the abstract behavior of human interactions and generating naturalistic, benign background traffic. Profiles can be applied to a diverse range of network protocols with different topologies because they represent the abstract properties of human and attack behavior. Once a benign profile is derived from users, an agent or human operator can generate realistic benign events on the network. Organizations and researchers can use this approach to generate realistic benign data easily; therefore, there is no need to anonymize data sets.

Related published papers:
- MohammadMoein Shafi, Arash Habibi Lashkari, Vicente Rodriguez, and Ron Nevo, "Toward Generating a New Realistic Cloud-based Distributed Denial of Service (DDoS) Dataset and Intrusion Traffic Characterization", Information, Vol. 15, 2024

For more information and to download the source code, visit this page.

8. Smart Contracts Vulnerability Analyzer (SCsVulLyzer)

2023

SCsVulLyzer is a Python-based tool designed to analyze and extract key metrics from Ethereum smart contracts written in Solidity. It employs a suite of functions to dissect the contract's source code, compiling it to obtain its abstract syntax tree (AST), bytecode, and opcodes. The analyzer calculates the entropy of the bytecode to assess its randomness, determines the frequency of certain opcodes to understand the contract's complexity, and evaluates the usage of key Solidity keywords to gauge coding patterns. This modular and extensible tool provides a comprehensive snapshot of a smart contract's structure and behavior, helping developers and auditors optimize and secure Ethereum blockchain applications.
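
As an illustration of the keyword-usage step, the sketch below counts selected Solidity keywords in a contract's source text. It is a simplified example, not the tool's implementation: the keyword list, the function name `keyword_counts`, and the regex-based comment stripping (which does not handle comment markers inside string literals) are assumptions made for this example.

```python
import re
from collections import Counter

# Illustrative keyword list (hypothetical; the tool's actual list may differ)
KEYWORDS = ("function", "require", "payable", "call", "delegatecall", "selfdestruct")


def keyword_counts(source: str, keywords=KEYWORDS) -> dict:
    """Count whole-word occurrences of each keyword in Solidity source."""
    # Naively drop block comments and line comments before tokenizing
    source = re.sub(r"/\*.*?\*/", " ", source, flags=re.DOTALL)
    source = re.sub(r"//[^\n]*", " ", source)
    # Split into identifier-like tokens so that "call" does not match "callback"
    tokens = re.findall(r"[A-Za-z_][A-Za-z0-9_]*", source)
    counts = Counter(tokens)
    return {kw: counts[kw] for kw in keywords}
```

Counting whole tokens rather than substrings avoids inflating the count of `call` with occurrences of `delegatecall`.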

Related published papers:
- Sepideh Hajihosseinkhani, Arash Habibi Lashkari, Ali Mizani Oskui, “Unveiling Vulnerable Smart Contracts: Toward Profiling Vulnerable Smart Contracts using Genetic Algorithm and Generating Benchmark Dataset”, Blockchain: Research and Applications, Vol. 4, December 2023

For more information and to download the source code, visit this page.

7. Authorship Attribution Analyzer (AuthAttLyzer)

2022

The source code of a program often contains attributes and peculiarities that can be used to identify its author, because they reflect individual coding styles, much as a writer has identifiable handwriting. These stylistic patterns range from basic artifacts in the code layout and comments to subtle habits in the program's control flow or syntax. The challenging task of identifying the author of source code from these attributes is called Source Code Authorship Attribution (SCAA). AuthAttLyzer is a source code analyzer that can extract several features, including N-grams, word-based embeddings, and Abstract Syntax Tree (AST) features.
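
As a small illustration of one of the feature families mentioned above, the following Python sketch counts character N-grams in a piece of source code. It only assumes that N-grams are taken over raw characters; AuthAttLyzer may tokenize differently, and the function name `char_ngrams` is hypothetical.

```python
from collections import Counter


def char_ngrams(code: str, n: int = 3) -> Counter:
    """Count overlapping character n-grams in a source-code string."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    # Slide a window of width n across the text; short inputs yield no n-grams
    return Counter(code[i:i + n] for i in range(len(code) - n + 1))
```

Because whitespace and punctuation are kept, such counts capture layout habits (indentation, brace placement, spacing around operators) in addition to naming habits.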

Related published papers:
- Abhishek Chopra, Nikhill Vombatkere, Arash Habibi Lashkari, "AuthAttLyzer: A Robust defensive distillation-based Authorship Attribution framework", The 12th International Conference on Communication and Network Security (ICCNS), China, 2022

For more information and to download the source code, visit this page.

6. PDF Malware Analyzer (PDFMalLyzer)

2022

Over the years, PDF has been the most widely used document format due to its portability and reliability. Unfortunately, the popularity of PDF and its advanced features have allowed attackers to exploit it in numerous ways. There are various critical PDF features that an attacker can misuse to deliver a malicious payload. This program extracts 31 different features from a set of PDF files specified by the user and writes them to a CSV file. The resulting CSV file can be studied further for a variety of purposes, most importantly for detecting malicious PDF files.
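
The extract-then-write-CSV workflow described above can be sketched as follows. This is an illustrative example only: it counts a handful of PDF name tokens in the raw bytes, which is a far smaller and cruder feature set than the 31 features the tool extracts, and the marker list and function names are assumptions made for this sketch. A raw byte scan also misses names hidden inside compressed object streams.

```python
import csv
from pathlib import Path

# Illustrative markers only (hypothetical subset, not the tool's 31 features)
MARKERS = (b"/JavaScript", b"/JS", b"/OpenAction", b"/Launch", b"/EmbeddedFile")


def extract_features(pdf_bytes: bytes) -> dict:
    """Build a flat feature dictionary from the raw bytes of one PDF."""
    features = {
        "size_bytes": len(pdf_bytes),
        "has_pdf_header": pdf_bytes.startswith(b"%PDF-"),
    }
    for marker in MARKERS:
        # Column name is the marker without its leading slash
        features[marker.decode()[1:]] = pdf_bytes.count(marker)
    return features


def write_feature_csv(pdf_paths, csv_path) -> None:
    """Extract features from each PDF and write one CSV row per file."""
    fieldnames = ["file", "size_bytes", "has_pdf_header"]
    fieldnames += [m.decode()[1:] for m in MARKERS]
    with open(csv_path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        for path in pdf_paths:
            row = {"file": str(path)}
            row.update(extract_features(Path(path).read_bytes()))
            writer.writerow(row)
```

A call such as `write_feature_csv(["a.pdf", "b.pdf"], "features.csv")` (hypothetical file names) would produce one row per input file, ready for a classifier or manual inspection.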

Related published papers:
- Maryam Issakhani, Princy Victor, Ali Tekeoglu, and Arash Habibi Lashkari, "PDF Malware Detection Based on Stacking Learning", The International Conference on Information Systems Security and Privacy, February 2022

For more information and to download the source code, visit this page.

5. IMAP Bot AnaLyzer (IMAPBotLyzer)

2022

Credential stuffing is an attack that uses stolen account credentials, usually sourced from data breaches, to log in to other services. It exploits the fact that many people use the same username and password for multiple accounts. Credential stuffing has become a matter of great concern for the Internet Message Access Protocol (IMAP), a popular method for accessing electronic mail and news messages maintained on a remote server. A significant vulnerability in IMAP and other legacy email protocols is that they cannot support multi-factor authentication (MFA) and depend only on a username and password for authentication, leaving them susceptible to credential stuffing. As bots generally carry out credential stuffing attacks, a promising countermeasure is to identify and block them before they can log in. Our objective is to use two types of behavioral biometrics, mouse dynamics and keystroke dynamics, to profile humans and bots and distinguish between them. In this project, we introduced a supervised learning bot detection system using mouse and keystroke dynamics and compared the classification performance of the Random Forest (RF), Decision Tree (DT), Support Vector Machine (SVM), and K-Nearest Neighbors (KNN) machine learning algorithms to identify which model achieves the best overall result.
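
To make the idea of keystroke dynamics concrete, the sketch below derives two commonly used timing features, dwell time (how long a key is held) and flight time (the gap between releasing one key and pressing the next). Whether IMAPBotLyzer uses exactly these features is not stated here, so treat the feature choice, the event format, and the function name as assumptions.

```python
from statistics import mean, pstdev


def keystroke_features(events):
    """Summarize typing rhythm from key events.

    events: list of (key, press_time, release_time) tuples in press order
    (hypothetical format; times in any consistent unit, e.g. milliseconds).
    """
    dwell = [release - press for _, press, release in events]
    # Flight time: release of the current key to press of the next key
    flight = [nxt[1] - cur[2] for cur, nxt in zip(events, events[1:])]

    def summarize(values):
        if not values:
            return 0.0, 0.0
        return mean(values), pstdev(values)

    dwell_mean, dwell_std = summarize(dwell)
    flight_mean, flight_std = summarize(flight)
    return {
        "dwell_mean": dwell_mean,
        "dwell_std": dwell_std,
        "flight_mean": flight_mean,
        "flight_std": flight_std,
    }
```

Feature vectors of this kind, possibly combined with mouse-movement features, could then be fed to classifiers such as RF, DT, SVM, or KNN for comparison.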

Related published papers:
- "Detecting IMAP Credential Stuffing Bots Using Behavioural Biometrics", Ashley Barkworth, Rehnuma Tabassum and Arash Habibi Lashkari, The 12th International Conference on Communication and Network Security (ICCNS2022), Beijing, China.

For more information and to download the source code, visit this page.

4. Volatility Memory Analyzer (VolMemLyzer)

2021

Memory forensics is a fundamental step that inspects malicious activities during live malware infection. Memory analysis not only captures malware footprints but also collects several essential features that may be used to extract hidden original code from obfuscated malware. There are significant efforts in analyzing volatile memory using several tools and approaches. These approaches fetch relevant information from the kernel and user space of the operating system to investigate running malware. However, the fetching process is faster if the most dominant features required for malware classification are readily available. Volatility Memory Analyzer (VolMemLyzer) is a Python tool that extracts more than 36 features for analyzing malicious activities in a memory snapshot using the Volatility tool.

Related published papers:
- Arash Habibi Lashkari, Beiqi Li, Tristan Lucas Carrier, Gurdip Kaur, "VolMemLyzer: Volatile Memory Analyzer for Malware Classification using Feature Engineering", Reconciling Data Analytics, Automation, Privacy, and Security: A Big Data Challenge (RDAAPS), IEEE 978-1-7281-6937-8/20, Canada, ON, McMaster University, 2021

For more information and to download the source code, visit this page.

3. DNS over HTTPS (DoH) Analyzer (DoHLyzer)

2020

DoHLyzer is a set of tools to capture HTTPS traffic, extract statistical and time-series features from it, and analyze them with a focus on detecting and characterizing DoH (DNS-over-HTTPS) traffic.

Related published papers:
- Mohammadreza MontazeriShatoori, Logan Davidson, Gurdip Kaur and Arash Habibi Lashkari, "Detection of DoH Tunnels using Time-series Classification of Encrypted Traffic", The 5th Cyber Science and Technology Congress (2020) (CyberSciTech 2020), Vancouver, Canada, August 2020

For more information and to download the source code, visit this page.

2. Static and Dynamic Android App Analyzer (AndroidApplyzer)

2019

This research focuses on classifying Android samples using static and dynamic analysis. The first version of this package covers the data collection and static feature extraction. The second version focuses on developing a classification model using AI for static features. The third version adds the dynamic analysis module and related features to improve the classifier.

Related published papers:
- Abir Rahali, Arash Habibi Lashkari, Gurdip Kaur, Laya Taheri, Francois Gagnon, and Frédéric Massicotte, "DIDroid: Android Malware Classification and Characterization Using Deep Image Learning", 10th International Conference on Communication and Network Security, Tokyo, Japan, November 2020, https://doi.org/10.1145/3442520.3442521

- David Sean Keyes, Beiqi Li, Gurdip Kaur, Arash Habibi Lashkari, Francois Gagnon, Frédéric Massicotte, "EntropLyzer: Android Malware Classification and Characterization Using Entropy Analysis of Dynamic Characteristics", Reconciling Data Analytics, Automation, Privacy, and Security: A Big Data Challenge (RDAAPS), IEEE 978-1-7281-6937-8/20, Canada, ON, McMaster University, 2021

For more information and to download the source code, visit this page.

1. Network Traffic Analyzer (CICFlowMeter formerly known as ISCXFlowMeter)

2015

CICFlowMeter is an open-source network traffic flow generator that builds bidirectional flows (biflows) from pcap files and extracts features from them. The first packet of a flow determines the forward (source to destination) and backward (destination to source) directions, so the statistical time-related features can be calculated separately in each direction. Additional functionalities include selecting features from the list of existing features, adding new features, and controlling the duration of the flow timeout.
NOTE: TCP flows are usually terminated upon connection teardown (by a FIN packet), while UDP flows are terminated by a flow timeout. The flow timeout value can be assigned arbitrarily by the individual scheme, e.g., 600 seconds for both TCP and UDP.
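
The termination rules in the note above can be sketched in Python as follows. This is a simplified illustration, not CICFlowMeter's implementation: packets are assumed to be dictionaries with `ts`, `src`, `dst`, `proto`, and an optional `fin` flag, the timeout is measured from the last packet seen in the flow (the tool may measure it differently), and a single FIN closes a TCP flow, whereas a real teardown involves both endpoints.

```python
def split_into_flows(packets, timeout=600.0):
    """Group time-sorted packets into bidirectional flows.

    packets: iterable of dicts with keys "ts", "src", "dst", "proto" and
    optionally "fin" (hypothetical format; src/dst are (ip, port) tuples).
    """
    active = {}     # direction-independent key -> flow currently being built
    finished = []

    for pkt in packets:
        # Same key for A->B and B->A so both directions land in one flow
        key = (pkt["proto"], frozenset((pkt["src"], pkt["dst"])))
        flow = active.get(key)

        # Assumption: a gap longer than the timeout ends the previous flow
        if flow is not None and pkt["ts"] - flow["last_ts"] > timeout:
            finished.append(active.pop(key))
            flow = None

        if flow is None:
            # The first packet of a flow fixes the forward direction
            flow = {"forward_src": pkt["src"], "start_ts": pkt["ts"],
                    "last_ts": pkt["ts"], "fwd_packets": 0, "bwd_packets": 0}
            active[key] = flow

        if pkt["src"] == flow["forward_src"]:
            flow["fwd_packets"] += 1
        else:
            flow["bwd_packets"] += 1
        flow["last_ts"] = pkt["ts"]

        # Simplification: the first FIN seen terminates a TCP flow
        if pkt["proto"] == "TCP" and pkt.get("fin", False):
            finished.append(active.pop(key))

    finished.extend(active.values())   # flush flows still open at end of capture
    return finished
```

With the default of 600 seconds, two UDP packets between the same endpoints that arrive more than ten minutes apart end up in separate flows, and the later packet defines the forward direction of the new flow.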

Related published papers:
- Arash Habibi Lashkari, Gerard Draper-Gil, Mohammad Saiful Islam Mamun and Ali A. Ghorbani, "Characterization of Tor Traffic Using Time Based Features", In Proceedings of the 3rd International Conference on Information Systems Security and Privacy, SCITEPRESS, Porto, Portugal, 2017

- Gerard Draper-Gil, Arash Habibi Lashkari, Mohammad Mamun, Ali A. Ghorbani, "Characterization of Encrypted and VPN Traffic Using Time-Related Features", In Proceedings of the 2nd International Conference on Information Systems Security and Privacy (ICISSP 2016), pages 407-414, Rome, Italy

For more information and to download the source code, visit this page.

Researchers named among top researchers for Canada 150

The cybersecurity Research and Academic Leadership award, Canada 2019

The cybersecurity academic award, Canada 2017
