PLoS By Category | Recent PLoS Articles

Computer Science - Mathematics - Science Policy

Identifying Overlapping and Hierarchical Thematic Structures in Networks of Scholarly Papers: A Comparison of Three Approaches
Published: Tuesday, March 27, 2012
Author: Frank Havemann et al.

by Frank Havemann, Jochen Gläser, Michael Heinz, Alexander Struck

The aim of this paper is to introduce and assess three algorithms for the identification of overlapping thematic structures in networks of papers. We implemented three recently proposed approaches to the identification of overlapping and hierarchical substructures in graphs and applied the corresponding algorithms to a network of 492 information-science papers coupled via their cited sources. The thematic substructures obtained and overlaps produced by the three hierarchical cluster algorithms were compared to a content-based categorisation, which we based on the interpretation of titles, abstracts, and keywords. We defined sets of papers dealing with three topics located on different levels of aggregation: h-index, webometrics, and bibliometrics. We identified these topics with branches in the dendrograms produced by the three cluster algorithms and compared the overlapping topics they detected with one another and with the three predefined paper sets. We discuss the advantages and drawbacks of applying the three approaches to paper networks in research fields.