Astronomical Data Features Extraction and Citation Prediction

Date

2023

Authors

Kutsuruk, Vladyslav

Abstract

Natural Language Processing (NLP) methods offer promising opportunities for analyzing astronomical data, enabling the extraction of essential information from large volumes of observations. Applying these techniques to astronomical data is challenging, however, because of the specialized terminology and the diversity of data sources. In this research, we apply multiple NLP techniques to extract information from astronomical observations, with a specific focus on predicting the future citation rate of astronomical telegrams. To this end, we build a dataset of astronomical messages gathered from various sources and apply Named Entity Recognition, doc2vec, word2vec, and topic extraction. We also enrich the extracted information with manually created features that capture characteristics of the telegrams beyond their direct context, giving a fuller representation of each message. Finally, we use all the extracted information to predict the future impact of the telegrams, as indicated by their citation counts, using multiple Machine Learning techniques.

Citation

Kutsuruk Vladyslav. Astronomical Data Features Extraction and Citation Prediction, Faculty of Applied Sciences, Department of Computer Sciences. Lviv 2023, 51 p.
