I’m not a Research Parasite. You’re a Data Vulture
There’s a minor philosophical conflict in the academic community today, and it has to do with data sharing. You know, the thing that the internet was literally built to do? A key component of science is the ability to have others verify and build upon your data. Most of the time (in my experience) data is freely shared with people who ask for it, and usually a small consideration is given in the resulting article in the form of “Data for this analysis was provided by X.” A citation is given and everyone is happy.
Except some people. Some people won’t share their data unless you play ball. Some researchers are Data Vultures.
“But”, you say, “we’re citing your original research in our paper”.
“No”, they reply, “I want to be an author.”
“But”, you retort, “you’re not doing any work on this study, we’re doing the analysis and you are just providing the data.”
“Yes”, they respond, “and without my data you have no study.”
Let me step a bit back and explain why I’m writing about this:
Yesterday an editorial in the New England Journal of Medicine was published. In it the authors explain that they are all for data sharing, if done properly. Why the concern? Because sometimes, according to the authors, the data shared isn’t used properly, and proper credit isn’t given. If that sounds logical, that’s because it is, but the authors go a bit further.
From the editorial (emphasis added):
A second concern held by some is that a new class of research person will emerge — people who had nothing to do with the design and execution of the study but use another group’s data for their own ends, possibly stealing from the research productivity planned by the data gatherers, or even use the data to try to disprove what the original investigators had posited. There is concern among some front-line researchers that the system will be taken over by what some researchers have characterized as “research parasites.”
Research parasites? What exactly do you mean by the phrase?
How would data sharing work best? We think it should happen symbiotically, not parasitically. Start with a novel idea, one that is not an obvious extension of the reported work. Second, identify potential collaborators whose collected data may be useful in assessing the hypothesis and propose a collaboration. Third, work together to test the new hypothesis. Fourth, report the new findings with relevant coauthorship to acknowledge both the group that proposed the new idea and the investigative group that accrued the data that allowed it to be tested. What is learned may be beautiful even when seen from close up.
And the appropriate response to this suggestion is this:
I believe the word you're looking for here is "citation", @NEJM. #IAmAResearchParasite https://t.co/fhLCNlVsfr pic.twitter.com/kzrYESj7bQ
— Paul Macklin (@MathCancer) January 22, 2016
By the way, you should definitely follow the hashtag #IAmAResearchParasite, which gives plenty of examples of what I like to call Data Vultures: researchers living off of the remains of their original research.
I should note that the authors have some valid concerns. They are concerned with individuals who ask for a data set and yet may not be equipped to analyse the data properly. Or they may be unaware of the decisions made when the original data was collected which might affect the appropriate analysis. Lastly, if your data is to be combined with a novel data set, it may be that the novel data set is not fully compatible with your original data.
All of these are valid criticisms and should be addressed by both research teams. But unless you are actively doing work on the project (and not just sharing an SPSS file) you have no right to demand to be made an author just because you’re providing the data. That’s called data sharing and it is your ethical responsibility as a scientist.
But what the NEJM is concerned about, apparently, is fluffing scientists’ CVs rather than promoting research. In fact, one of the criticisms of “Research Parasites” is that they might “even use the data to try to disprove what the original investigators had posited.”
Excuse me for getting a little bit emotional but that’s exactly what scientists are supposed to do! The NEJM apparently is afraid of Type II errors (saying that something isn’t there when it actually is) as opposed to Type I errors (saying that something is there when it actually isn’t). These are basically false negatives and false positives, respectively. Some people argue over which is worse but they’re both bad. And the only way we can determine if either error is happening is if we perform multiple studies. Reproducibility: that’s the cornerstone of science. And it can’t happen if Data Vultures prevent other scientists from expanding upon their work.
I had my own experience with a Data Vulture when I was an undergrad. I was doing a small meta-analysis (combining the results of multiple studies to look at the overall effect) on the effect that training programs had on the use of physical restraint on patients in the prehospital setting. It’s a rather esoteric line of study and there were only a handful of papers that met my criteria. One such paper didn’t report the effect size I needed. So I contacted the author and asked for the effect size, or for the data set so that I could calculate it myself. It was the kind of thing that would have taken about 20 minutes’ worth of effort. They responded, said they would be happy to help, and were very interested in my study, but they wanted to be made a secondary author on the paper. Well, this was a problem because approximately 0.00000001% of Undergraduate Honors Theses get published (outside of the University’s internal publications), and the meta-analysis I was doing was only a very small portion of my Honors project. I said no.
Luckily my advisor and I were able to work around this and, using some fun statistical tricks, find the study’s effect size. Which was small, I might add (probably the reason why it wasn’t included in the original journal article). As an aside, chi-square is your friend.
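For the curious: I won’t pretend the following is exactly what we did, but here is a minimal sketch of one standard way to recover an effect size when a paper reports only a chi-square statistic and a sample size. It assumes a 2×2 table (one degree of freedom), where the phi coefficient is the square root of chi-square divided by N, and it uses the usual conversion from a correlation-type effect size r to Cohen’s d. The function names and the numbers are hypothetical and for illustration only.

```python
import math


def phi_from_chi_square(chi2: float, n: int) -> float:
    """Phi effect size for a 2x2 table (1 degree of freedom): sqrt(chi2 / n)."""
    if n <= 0:
        raise ValueError("n must be a positive sample size")
    if chi2 < 0:
        raise ValueError("a chi-square statistic cannot be negative")
    return math.sqrt(chi2 / n)


def cohens_d_from_r(r: float) -> float:
    """Convert a correlation-type effect size r to Cohen's d: 2r / sqrt(1 - r^2)."""
    if not -1 < r < 1:
        raise ValueError("r must lie strictly between -1 and 1")
    return 2 * r / math.sqrt(1 - r ** 2)


# Hypothetical values, not taken from any real study.
phi = phi_from_chi_square(chi2=4.0, n=100)
d = cohens_d_from_r(phi)
print(f"phi = {phi:.3f}, d = {d:.3f}")
```

Note that this only works cleanly for a single-degree-of-freedom test; larger tables need a different measure (Cramér’s V, for example), so check what the original paper actually ran before borrowing the formula.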
This happens all the time. It’s even seen as common practice in academia. The study I worked on when I was a Lab Manager suffered a similar experience. We wanted to piggyback our survey onto a larger survey another researcher was conducting. We would do all of the analysis of our survey and weren’t going to be touching the other researcher’s survey. We were actually running the survey at a different time: we just needed access to the sample. Of course the person running the larger survey became the second author. Let me be clear: he didn’t write the survey, he didn’t perform any analysis on the data, and we didn’t use any of his data. We just used his study participants. Of course it could be said that he had done a fair bit of work in gaining access to the participants, who were in a hard-to-reach population. My advisor didn’t see anything untoward in his demanding to be an author. I did, but I didn’t push the issue.[1]
This is a problem in academia that has no easy solution. Recent PhDs need to get jobs. Once they get a job they need to publish in order to get tenure. After tenure they still need to publish and have others cite their work in order to get consideration for grants and fellowships. Now the obvious answer is: if you need to be published, do more research. But when someone asks to use their dataset, some researchers give in to temptation and take the easy way out. They become Data Vultures, and live off of other researchers’ hard work. They get another publication on their CV.
Edit 1-2-16 @1021: A more apt term might be Data Vampire:
We prefer the term "data Vampire," thanks. #researchparasites
— Christie Bahlai (@cbahlai) January 22, 2016
I’m not saying the NEJM is promoting Data Vulturism (okay I am a little). They have valid concerns, but they also have some ridiculous ones. Sometimes people will use your data to try to disprove your study. And that’s fine, because either your study is strong enough to stand up on its own, or you are wrong. And if you’re wrong, wouldn’t you rather know it?
I know this is a naive view. Not all researchers are as intellectually honest as they should be. But to call someone who wants access to your dataset a Research Parasite is simply dishonest. You don’t deserve to be made an author unless you did some actual work. Wanting a small consideration in the article is one thing, and it is almost always given when the source of the data is explained. Citation is your reward, not authorship.
Media Credit
(See what I’m doing here? I’m giving someone credit, not making them an author.)
The featured image of this article is Vulture Food by Valerie Everett via Flickr. It is available under a Creative Commons Attribution-ShareAlike 2.0 Generic license.
References
1. Maybe I’m just being sensitive.