Today Academia.edu is announcing that users can embed data-sets and code on their Academia.edu profile pages. Data-sets and code can be attached to papers, or uploaded on their own.
Historically researchers have only shared their ideas in the form of academic papers. The DNA of academic journals came from the era of print, and it never made sense to share data and code in print form.
Currently 75% of the world’s scientific data is not shared. It hasn’t been shared because the distribution platforms haven’t been there, and there haven’t been the right reputation metrics to incentivize researchers to share their data.
Academia.edu’s announcement today provides an outlet for researchers to share their data and code in a way that enhances their reputations. Data-sets and code are attached to Academia.edu’s analytics engine. You can see how many views you get for your data-sets and code, and share these analytics with your tenure and grant committees.
Below is a screenshot of an embedded data-set:
Below is a screenshot of an embedded GitHub repo:
The importance of sharing data was highlighted in the media a couple of weeks ago. Two Harvard professors wrote an influential economics paper on national debt and growth ratios. The paper was circulated in 2009 and it had a significant impact on the policy decisions of governments around the world.
Earlier this year a graduate student asked the authors of the paper for the data-set that backed up the paper. After looking at the data-set he found an error that undermined the conclusions of the paper. Had the data been shared with this paper’s publication, the error would have been caught immediately, before it had a chance to impact the various countries’ economic policies.
History of the Science Ecosystem
400 years ago, journals had not been invented yet, and research was largely a private pursuit. Wealthy people would have private labs in their country houses, and they would keep the results of their experiments private. There was not a strong cultural norm around sharing your scientific ideas or results.
Journals were invented for the sharing of ideas towards the end of the 1600s. This sharing infrastructure helped spur the Scientific Revolution, a rapid acceleration of scientific progress.
As much as 50% of the world’s research output may not be shared right now, because the existing incentives don’t reward sharing output in the forms it comes in. These forms can include data, code, comments on papers, images and videos.
Part of Academia.edu’s mission is to build the incentive engine for researchers to get credit for sharing the full range of their research output: closing the feedback loop, so if a contribution they make to research has an impact, there are metrics that reflect that impact. The researcher can take those metrics and use them to improve their chances with grant and tenure committees.
This announcement today is part of building the new infrastructure in research, where researchers can collect credit for sharing more and more of their output.
Some users were in the beta for this feature on Academia.edu, and they added their thoughts.
Shivendra Tewari, a Biology post-doc at the Medical College of Wisconsin, writes “I see sharing data as an advancement of science. If I’ve already done something, why should someone re-do all of the work again? They should just use whatever I’ve done and then move forward from that point.
“I think it’s really good that you can provide things like code and datasets on Academia.edu because sometimes publishers don’t even ask for code. So if there is one single place where you can put papers and code, then people can get a lot of information from a single site.”
Murray Rudd, a Lecturer in the Environment department at the University of York, writes “Looking at it from the environment and economics realm, there’s generally not enough sharing of datasets. I have datasets that go back 10 or 12 years, and I always have these good intentions to get students working on them at some point. But unfortunately due to time constraints, a lot of these datasets just die out; they are never really plumbed to the extent that they could be.
“Most of the people that I work with are in the same situation. So, in the case of one of my datasets, I thought why not just put it up online and if someone can use it sometime then that’s great. Also, even if I know that it’s going to take me a while to analyze the data, it still doesn’t hurt to post it online. If someone else picks up my data and publishes a paper, I’ll still get cited.”
Daniel Curtis, a History post-doc at Utrecht University, writes “I think the more material you have on Academia.edu, the more ‘visible’ you are. You are more likely to be found through search engines that way. Perhaps by arriving at my page by accident through a bibliographic reference, someone might see one of my papers and become interested.
“I do also believe in the sharing of data though. I think the future of the historical discipline is not through individual research but research in teams with international collaboration. This is just a small way of contributing to that. Plus it doesn’t really take much effort to just put a file on Academia.edu, so there’s no reason not to!”