Predicting Institution Hierarchies with Set-based Models
Derek Tam, Nicholas Monath, Ari Kobren, Andrew McCallum
Keywords: Hierarchies, Sets, Transformers, Institutions
TLDR: Predicting hierarchies of institutions by modeling set operations over tokens
Abstract:
The hierarchical structure of research organizations plays a pivotal role in science-of-science research as well as in tools that track research achievements and output. However, this structure is not consistently documented for all institutions in the world, motivating the need for automated construction methods. In this paper, we present a new task and model for predicting sub-institution/super-institution relationships from the institutions' string names. The crux of our model is that it leverages learned, permutation-invariant representations of various token subsets of institution name strings. Our model outperforms or matches non-set-based models and baselines. We also create a dataset for training and evaluating models for this task based on the publicly available relationships in the Global Research Identifier Database.
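To make the idea of permutation-invariant representations over token subsets concrete, the following is a minimal sketch and not the authors' model. It assumes whitespace tokenization, three illustrative subsets (tokens shared by both names, tokens only in the candidate sub-institution, tokens only in the candidate super-institution), and sum pooling. The hash-based `embed` function is a hypothetical stand-in for learned token embeddings, and all function names are invented for illustration.

```python
import hashlib

DIM = 8  # hypothetical embedding size, chosen only for illustration


def embed(token: str) -> list[float]:
    """Deterministic stand-in for a learned token embedding (assumption)."""
    digest = hashlib.sha256(token.encode("utf-8")).digest()
    return [b / 255.0 for b in digest[:DIM]]


def pool(tokens: set[str]) -> list[float]:
    """Sum-pool token embeddings; the result ignores token order."""
    total = [0.0] * DIM
    # Sorting fixes the summation order so floating-point results are stable.
    for tok in sorted(tokens):
        for i, value in enumerate(embed(tok)):
            total[i] += value
    return total


def token_subsets(child: str, parent: str) -> dict[str, set[str]]:
    """Split the two names into shared and name-specific token sets."""
    a = set(child.lower().split())
    b = set(parent.lower().split())
    return {"shared": a & b, "child_only": a - b, "parent_only": b - a}


def featurize(child: str, parent: str) -> list[float]:
    """Concatenate one pooled vector per token subset."""
    subsets = token_subsets(child, parent)
    features: list[float] = []
    for key in ("shared", "child_only", "parent_only"):
        features.extend(pool(subsets[key]))
    return features


if __name__ == "__main__":
    pair = ("Department of Physics University of Oxford", "University of Oxford")
    print(token_subsets(*pair))
    print(len(featurize(*pair)))
```

In a trained system, a feature vector like the one returned by `featurize` would be passed to a classifier that scores whether the first name is a sub-institution of the second. Because each subset is pooled as a set, reordering the words in either name leaves the features unchanged.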