Laboratory publications
Concept-based Topic Model Improvement
Author(s): Musat Claudiu, VELCIN J., RIZOIU M.-A., Trausan-Matu Stefan
Conference proceedings: International Symposium on Methodologies for Intelligent Systems (ISMIS) (Warsaw, PL, 2011-06-20). Published in: International Symposium on Methodologies for Intelligent Systems (ISMIS), vol. (2011) p.100
Ref HAL: hal-00616247_v2
Abstract: We propose a system that uses conceptual knowledge to improve topic models by removing unrelated words from the simplified topic description. We use WordNet to detect which topical words are not conceptually similar to the others, and then test our assumptions against human judgment. Results obtained on two different corpora under different test conditions show that the words detected as unrelated were much more likely than the others to be chosen by human evaluators as not belonging to the topic at all. We show that this probability correlates strongly with an automatically calculated topical fitness, and we discuss how the correlation varies with the method and data used.
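Illustration: the idea of flagging the topic word that is least conceptually similar to the others can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' method: the hand-built taxonomy `TOY_HYPERNYMS` is a hypothetical stand-in for WordNet's hypernym links, the path-based similarity is one common choice among several, and `mean_similarity` is an illustrative score rather than the topical fitness measure defined in the paper.

```python
# Hypothetical toy taxonomy (child -> parent). It stands in for WordNet
# hypernym links so that the sketch runs without external data.
TOY_HYPERNYMS = {
    "dog": "canine",
    "wolf": "canine",
    "canine": "mammal",
    "cat": "feline",
    "feline": "mammal",
    "mammal": "animal",
    "animal": "entity",
    "bank": "institution",
    "institution": "organization",
    "organization": "entity",
}


def ancestors(word):
    """Return the chain from `word` up to the root, `word` first."""
    chain = [word]
    while chain[-1] in TOY_HYPERNYMS:
        chain.append(TOY_HYPERNYMS[chain[-1]])
    return chain


def path_similarity(a, b):
    """1 / (1 + number of edges between a and b through their lowest
    common ancestor); 0.0 if the two words share no ancestor."""
    chain_a, chain_b = ancestors(a), ancestors(b)
    depth_in_b = {node: i for i, node in enumerate(chain_b)}
    for i, node in enumerate(chain_a):
        if node in depth_in_b:
            return 1.0 / (1 + i + depth_in_b[node])
    return 0.0


def mean_similarity(word, topic_words):
    """Average similarity of `word` to the other words of the topic
    (an assumed, illustrative score)."""
    others = [w for w in topic_words if w != word]
    if not others:
        return 0.0
    return sum(path_similarity(word, w) for w in others) / len(others)


def least_related(topic_words):
    """Return the topic word with the lowest mean similarity to the rest."""
    return min(topic_words, key=lambda w: mean_similarity(w, topic_words))


topic = ["dog", "wolf", "cat", "bank"]
outlier = least_related(topic)
```

With this toy topic, `outlier` is the word whose only link to the others passes through the root of the taxonomy. A real implementation would have to handle word-sense ambiguity and words missing from the lexical resource, which this sketch ignores.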