Who Re-Uses Data? A Bibliometric Analysis of Dataset Citations

Abstract

Open data is receiving increased attention and support in academic environments, with one justification being that shared data may be re-used in further research. But what evidence exists for such re-use, and what is the relationship between the producers of shared datasets and the researchers who use them? Using a sample of data citations from OpenAlex, this study investigates the relationship between creators and citers of datasets at the individual, institutional, and national levels. We find that the vast majority of datasets have no recorded citations, and that most cited datasets have only a single citation. Rates of self-citation by individuals and institutions tend towards the low end of previous findings and vary widely across disciplines. At the country level, the United States is by far the most prominent exporter of re-used datasets, while importation is more evenly distributed. Understanding where and how the sharing of data between researchers, institutions, and countries takes place is essential to developing open research practices.

Publication
Quantitative Science Studies
Geoff Krause
ID PhD Program, Dalhousie University
Madelaine Hare
Digital Transformation & Innovation PhD student, University of Ottawa
Philippe Mongeon
Associate Professor, Department of Information Science, Dalhousie University