Astronomical Data Features Extraction and Citation Prediction
Date
2023
Authors
Kutsuruk, Vladyslav
Abstract
Natural Language Processing (NLP) methods offer promising opportunities for analyzing astronomical data, enabling the extraction of essential information from vast numbers of observations. Yet applying these techniques to astronomical data presents notable challenges, including the complexity of astronomical terminology and the diversity of data sources. In this research, we apply multiple NLP techniques to extract information from astronomical observations, with a specific focus on predicting the future citation rate of astronomical telegrams. To achieve this, we create a comprehensive dataset of astronomical messages gathered from various sources and apply techniques such as Named Entity Recognition, doc2vec, word2vec, and topic extraction. In addition, we enrich the extracted information with manually created features that capture characteristics of astronomical telegrams beyond their direct context, with the aim of providing a comprehensive representation of the messages. We then use all the extracted information to predict the future impact of the telegrams, as indicated by their citation counts, using multiple Machine Learning techniques.
Citation
Kutsuruk Vladyslav. Astronomical Data Features Extraction and Citation Prediction, Faculty of Applied Sciences, Department of Computer Sciences. Lviv 2023, 51 p.