Abstract
Purpose – Intrigued by the methodological challenges emerging from text complexity, the purpose of this study is to evaluate the effectiveness of different topic modelling algorithms based on Instagram textual data.
Design/methodology/approach – By taking Instagram posts captioned with #darktourism as the study context, this research applies latent Dirichlet allocation (LDA), correlation explanation (CorEx), and nonnegative matrix factorisation (NMF) to uncover tourist experiences.
Findings – CorEx outperforms LDA and NMF by classifying emerging dark sites and activities into 17 distinct topics. The results of LDA appear homogeneous and overlapping, whereas the extracted topics of NMF are not specific enough to gain deep insights.
Originality/value – This study assesses different topic modelling algorithms for knowledge extraction in the highly heterogeneous tourism industry. The findings unfold the complexity of analysing short-text social media data and strengthen the use of CorEx in analysing Instagram content.
Design/methodology/approach – By taking Instagram posts captioned with #darktourism as the study context, this research applies latent Dirichlet allocation (LDA), correlation explanation (CorEx), and nonnegative matrix factorisation (NMF) to uncover tourist experiences.
Findings – CorEx outperforms LDA and NMF by classifying emerging dark sites and activities into 17 distinct topics. The results of LDA appear homogeneous and overlapping, whereas the extracted topics of NMF are not specific enough to gain deep insights.
Originality/value – This study assesses different topic modelling algorithms for knowledge extraction in the highly heterogeneous tourism industry. The findings unfold the complexity of analysing short-text social media data and strengthen the use of CorEx in analysing Instagram content.
Original language | English |
---|---|
Pages (from-to) | 1-13 |
Number of pages | 13 |
Journal | Tourism Review |
DOIs | |
Publication status | Published - Oct 2021 |
Keywords
- Machine learning
- LDA
- Topic modelling
- CorEx
- NMF