Computational criminology: at-scale quantitative analysis of the evolution of cybercrime forums
Abstract
Cybercrime forums and marketplaces are used by members to share hacking techniques, hold general community-building discussions, and trade hacking tools. While there is a large corpus of literature studying these platforms, ranging from cross-forum ecosystem comparisons to smaller qualitative analyses of specific crime types within a single forum, there has been little research studying them over time. Using the CrimeBB dataset from the Cambridge Cybercrime Centre, the first contribution of the thesis explores the evolution of a large cybercrime forum, from growth to gradual decline from peak activity, with research questions grounded in the digital drift framework from criminological theory. The analysis finds a trend towards financially driven cybercrime over time, both among individual users and across the forum as a whole. The second contribution presents a method for detecting trending terms, using a lightweight natural language processing approach to handle queries efficiently given the size of the dataset. Evaluation against manual annotations showed that it detected more relevant salient terms than TF-IDF. Finally, the third contribution applies signalling theory to analyse the use of argot (jargon and slang) on the forum, finding a negative correlation between argot and reputation use, and using clustering to show a decrease in argot over time. This contribution includes a lightweight argot detection pipeline using word embeddings, aligned with manual annotations. Overall, the combination of approaches, including criminological theory driving research directions, natural language processing to analyse forum text data, machine learning for classification, and data science techniques, provides a unique interdisciplinary perspective within the field of cybercrime community research, both drawing insights into these communities and contributing novel tools for measuring large, noisy text data.