Stop misusing data when hiring academics

Search for me in InCites – a tool from the analytics company Clarivate – and you’ll find a biochemist who hasn’t published much recently. Google Scholar, Google’s tool for searching academic publications, shows more of my work from the past few years on research assessment and open research. Like everyone else, I prefer larger numbers – and Google Scholar provides them – but they come with more errors, because of the way Google’s algorithms crawl and index author lists. My problems are relatively minor; a colleague has dozens of citations according to InCites, and tens of thousands according to scraping-based tools such as Google Scholar.

Researchers, especially those at the start of their careers, need to know how to gather sources of evidence, such as these publication counts, to make their case to hiring and promotion boards. These tools are treated as reliable sources, even though they can give very different values. Other data, such as retweets or “likes” on online videos, are sometimes used as indicators of societal impact, but their relevance is even more questionable when used in this way.

This is a serious and deeply ironic problem in the scientific enterprise. As researchers, we are accustomed to using partial, imperfect and incomplete data to make decisions and draw conclusions. Those imperfections are mitigated by statistical methods, error estimates and good research practice. But such practices are rarely applied when researchers’ publication records are evaluated: where are the error bars on a tenure decision, a university’s ranking or a grant application?
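
As an illustration of what such an error bar might look like, the sketch below bootstraps a confidence interval around a total citation count built from per-paper counts. The paper-level numbers are invented and the choice of a simple bootstrap is my own assumption; it is only one of many ways an uncertainty estimate could be attached to a metric.

```python
# A minimal sketch of an "error bar" on a citation metric: resample the
# per-paper counts with replacement and report a bootstrap interval for the
# total, instead of quoting a single bare number. All counts are invented.
import random

paper_citations = [3, 0, 12, 7, 1, 25, 4, 0, 9, 2]  # hypothetical record

def bootstrap_total(counts, n_resamples=10_000, seed=42):
    """Return the observed total and a 95% bootstrap interval for it."""
    rng = random.Random(seed)
    totals = []
    for _ in range(n_resamples):
        resample = [rng.choice(counts) for _ in counts]
        totals.append(sum(resample))
    totals.sort()
    low = totals[int(0.025 * n_resamples)]
    high = totals[int(0.975 * n_resamples)]
    return sum(counts), (low, high)

total, (low, high) = bootstrap_total(paper_citations)
print(f"Total citations: {total} (95% bootstrap interval: {low} to {high})")
```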

Policy, hiring, funding, and promotion decisions are built on this fragile evidence. If this evidence were research data, its collection, description, analysis and interpretation would never pass peer review.

This problem concerns institutions as well as individuals and disciplines. My colleague Karl Huang, an open-knowledge researcher at Curtin University in Perth, Australia, and I investigated the data behind university rankings (C.-K. K. Huang et al. Quant. Sci. Stud. 1, 445–478; 2020). We created a simple ranking based on the citations of 155 universities and fed it data from each of three publication-record search tools: Web of Science, Scopus and Microsoft Academic. When the data source changed, three universities moved more than 110 places and 45 moved more than 20.
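
For illustration only, here is a minimal sketch of the kind of comparison involved: rank the same institutions by citation counts drawn from different sources and measure how far each one moves. The institutions and counts are invented, and this is not the code used in the study.

```python
# Rank the same institutions by citation counts from two different sources
# and report how many places each institution moves. All data are invented.

def rank_by_citations(citations: dict[str, int]) -> dict[str, int]:
    """Return a 1-based rank for each institution, highest citations first."""
    ordered = sorted(citations, key=citations.get, reverse=True)
    return {name: position for position, name in enumerate(ordered, start=1)}

# Hypothetical citation totals for the same institutions from two sources.
source_a = {"University X": 120_000, "University Y": 95_000, "University Z": 40_000}
source_b = {"University X": 90_000, "University Y": 98_000, "University Z": 70_000}

ranks_a = rank_by_citations(source_a)
ranks_b = rank_by_citations(source_b)

for uni in source_a:
    shift = abs(ranks_a[uni] - ranks_b[uni])
    print(f"{uni}: rank {ranks_a[uni]} vs {ranks_b[uni]} (moved {shift} places)")
```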

It should come as no surprise that different ranking sources and approaches yield different results. But we continue to ignore the differences – and to make policy, funding and career decisions as if any single measure could provide a definitive answer. As a result, we get many crucial decisions wrong, at both the individual and the institutional level.

What must change? The policy landscape has shifted over the past decade. The Agreement on Reforming Research Assessment, published last week (see go.nature.com/3pmwd), responds to the Paris Call on Research Assessment to assess research on the basis of “the intrinsic merits and impact, rather than the number of publications and where they are published, by promoting qualitative judgement provided by peers, supported by a responsible use of quantitative indicators”. In other words, its authors, too, are fed up with figures taken out of context. It follows the 2013 San Francisco Declaration on Research Assessment and the 2015 Leiden Manifesto, both of which called for similar policy changes.

To make these calls effective, academia needs a cultural shift in the evidence used to assess research. This will happen only when the whole community demands higher standards. We should tell the stories behind our work and our successes in more qualitative ways – with more meaningful words and fewer meaningless numbers. That would better respect the variety of disciplines and the many ways in which researchers have an impact.

Principal investigators should critically assess the quality of the evidence presented when judging job and grant applicants or conducting departmental reviews. And we should support early-career researchers by creating guidelines and training to help them prepare the strongest possible cases for advancement.

It is unfair but unavoidable that much of the work falls to early- and mid-career researchers, for whom evaluations matter most. They can choose to provide more rigorous and comprehensive evidence in their applications, or just the usual numbers. But this is also an opportunity to reshape the narratives around their research, while making the evaluation of their work fairer.

These changes are already under way. The policy landscape is shifting as a result of the Agreement on Reforming Research Assessment and similar initiatives in many countries. Increasingly, fellowship, promotion and hiring evaluations require qualitative cases supported by quantitative evidence. More and more senior academics are demanding higher standards of evidence in research assessment. And I see more and more early-career researchers presenting sophisticated, rigorous and diverse arguments for the value of their research to promotion committees and grant panels.

Real change will come only when those being assessed are willing to show the real value and impact of their research, beyond counts of citations, retweets or h-indices. I might like bigger numbers, but I’d rather work in a world powered by informative ones.

Competing interests

C.N. is a co-lead of the Curtin Open Knowledge Initiative (COKI), which develops and advocates for the provision of community-controlled data and analysis to support research evaluation, policy development and strategy. COKI has received funding from Curtin University, the Mellon Foundation in New York and Arcadia, a charitable fund of Lisbet Rausing and Peter Baldwin, in London. No funder played a role in the preparation of this manuscript.
