MODELLING CRYPTOCURRENCY ACTIVITIES’ REPORTAGE USING THE CORRELATED TOPIC MODEL
Abstract
This study aimed to uncover the connections between users' cryptocurrency activities and the way traders read online reports of these activities to inform their trading. It therefore examined the daily actions of Bitcoin traders who may have learned about the market from news items. The study employed the Latent Dirichlet Allocation (LDA) model and its variant, the Correlated Topic Model (CTM), to examine news stories scraped from the markets section of the CNBC website. This machine learning process identified the latent topics in the articles, together with the terms and words that characterise them. Coherence and perplexity graphs computed from the document-term matrix were used to select the number of topics (K), so that the CTM was neither overfitted nor underfitted. Topics were assigned by estimating the topic proportions for each document, and the CTM was used to test for correlations between the identified topics. Document-word proportions and topic-word proportions were also measured. The posterior covariance matrix produced by the CTM was used to build a dense topic graph, which was fed directly into a network analysis model to show positive, negative, and absent relationships between the topics. Additionally, documents were compared using the Hellinger distance to determine the relationship between two or more articles. The results show that there were more positive correlations among the identified topics, and between the documents and topics, which may reflect the collective knowledge underlying cryptocurrency traders' actions.
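As a minimal illustration of the Hellinger distance comparison mentioned above, the sketch below computes the distance between the topic-proportion vectors of two documents. The function name `hellinger` and the three example vectors are hypothetical and are not taken from the study's data or code; the sketch assumes each document is represented as a discrete probability distribution over the K topics.

```python
import math


def hellinger(p, q):
    """Hellinger distance between two discrete probability distributions.

    Returns a value in [0, 1]: 0 for identical distributions,
    1 for distributions with no overlapping support.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    # Sum of squared differences between the square roots of the probabilities
    squared_sum = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(squared_sum) / math.sqrt(2)


# Hypothetical topic proportions for three documents with K = 3 topics
doc_a = [0.7, 0.2, 0.1]
doc_b = [0.6, 0.3, 0.1]
doc_c = [0.05, 0.05, 0.9]

# doc_a is closer to doc_b than to doc_c in topic space
dist_ab = hellinger(doc_a, doc_b)
dist_ac = hellinger(doc_a, doc_c)
```

A smaller distance indicates that two documents share a more similar topic mixture, which is what makes the measure usable for retrieving related articles.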