Title

Predicting the Academic Influence and Trending Research Topics

Abstract

Prediction problems in academic research have been studied extensively in the literature. In this thesis, we focus on two such problems. First, we study topic adoption prediction for an author within an academic social network. We approach the problem from an influence detection point of view and propose that the influence exerted on the author is an important factor. Hence, we define a novel feature based on influencee prediction and develop an algorithm to calculate the influence propagated towards the author. The effect of this feature is explored both in combination with and in comparison to other features used in the literature for this problem. Experiments conducted on the Arnet Miner data set show that the accumulated influence on an author is effective for predicting topic adoption.

As a second problem, we broaden our scope and focus on predicting trending research topics from a collection of academic papers. Previous efforts model the problem in different ways and mostly apply classical approaches such as correlation analysis and clustering. There are also several recent solutions based on neural models; however, they rely on feature vectors and additional information for trend prediction. In this work, given a collection of publications within an observation time window, we predict whether the use of a keyword will increase, decrease, or remain steady in a future time window (the prediction window). As the solution, we propose a family of deep neural architectures that generate summary representations of paper collections for the query keyword. Due to the sequential nature of the data, the Long Short-Term Memory (LSTM) module plays a core role, but it is combined with different layers in a novel way. The first group of proposed architectures considers each paper as a sequence of keywords and uses word embeddings to construct paper collection representations. Within this group, the architectures differ in the way the year-based and overall summary representations are constructed. In the second group, each paper is directly represented as a vector, and the use of different paper embedding techniques is explored. The models are analyzed on a variety of paper collections belonging to different academic venues, obtained from the Microsoft Academic Graph data set. Experiments against baseline methods show that the proposed deep neural models achieve higher overall trend prediction performance than the baselines. Among the proposed models, the paper embedding based models provide better results in most cases.
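The three-class prediction target described above can be illustrated with a minimal Python sketch. It is not the thesis's labeling procedure: the paper record layout (`year`, `keywords`), the function names, the comparison of mean yearly counts, and the `tolerance` threshold are all assumptions made for illustration only.

```python
from collections import Counter


def keyword_counts(papers, keyword, years):
    """Count, for each year in `years`, the papers that list `keyword`.

    `papers` is assumed to be a list of dicts with "year" and "keywords"
    fields (a hypothetical layout, not taken from the thesis).
    """
    counts = Counter(p["year"] for p in papers if keyword in p["keywords"])
    return [counts.get(year, 0) for year in years]


def trend_label(papers, keyword, obs_years, pred_years, tolerance=0.1):
    """Label the keyword trend as "increase", "decrease" or "steady".

    Assumption: the label compares the mean yearly usage in the prediction
    window with the mean yearly usage in the observation window, and
    `tolerance` is a hypothetical relative band for the "steady" class.
    """
    obs = keyword_counts(papers, keyword, obs_years)
    pred = keyword_counts(papers, keyword, pred_years)
    obs_mean = sum(obs) / len(obs)
    pred_mean = sum(pred) / len(pred)
    if pred_mean > obs_mean * (1 + tolerance):
        return "increase"
    if pred_mean < obs_mean * (1 - tolerance):
        return "decrease"
    return "steady"
```

In such a setup, a model would see only the papers from the observation window as input, and a label computed this way would serve as the training target.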

Supervisor(s)

MURAT YUKSELEN

Date and Location

2022-09-05 14:00:00

Category

PhD_Thesis