
Open Science: not only a matter of outcomes, but also of processes

Post written by Toni Hermoso, a bioinformatician at the CRG.

 

Image by Mari Carmen Cebrián

It’s been almost a decade since the term “Open Science” first appeared on Wikipedia. The page was created by Aaron Swartz and initially redirected to the “Open Access” entry. Some years later, this young activist took his own life under the pressure of the judicial charges brought against him after he had uploaded many articles under proprietary licenses to the Internet.

In parallel with these events, Creative Commons licenses, a set of standardised licensing terms intended to foster sharing in the digital world, became increasingly popular, and many novel publishing initiatives took advantage of them to promote open access to scientific literature.

At the same time, more and more government agencies started to demand that the beneficiaries of their funding make the resulting publications openly available within a certain period of time. So, if research was not originally published in an open-access journal (the golden road), it should eventually be uploaded to an institutional repository (the green road). Furthermore, preprints, already common practice in the physical sciences, started to become widespread in the biosciences after the creation of portals such as bioRxiv.

However, despite the bloom of open-access (OA) journals and the introduction of more favourable legislation, there are still strong concerns about the future of open access in science. This is mostly because the publishing sector is effectively controlled by very few parties, which often keep content behind paywalls. One reaction to this situation is evidenced by initiatives such as Sci-Hub, which defiantly provides free access to those restricted articles.

In any case, there is more to Open Science than Open Access. We can highlight at least two other major facets: Open Data and Open Methodology. These are the two indispensable pillars that make reproducibility in modern science actually possible. Open Data can range from the initial raw data (straight from machines or sensors) to final outcomes such as chart images or spreadsheets. The recent data flood has made established public open repositories necessary (e.g. the Sequence Read Archive or the European Variation Archive), so that researchers can freely reuse and review existing material.

It is also a common requirement of these repositories that data be available in an open format, so that other researchers may process them with tools or versions different from those originally used. This latter aspect is intimately associated with Open Source, which is also essential for ensuring a reproducible methodology. As a consequence, an increasing number of journals require submitters to provide both data and program code, so that reviewers can verify for themselves that the results are as claimed.
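To make this concrete, here is a minimal sketch in Python of the kind of analysis script a submitter might deposit alongside the data (the file name “measurements.csv” and the “expression” column are hypothetical, not taken from any particular study). Because the input is plain CSV (an open format) and the script relies only on open-source, standard-library tools, any reviewer can re-run it and check the published numbers.

import csv
import statistics

def summarise(path):
    """Read an open-format CSV with a numeric 'expression' column and summarise it."""
    with open(path, newline="") as handle:
        values = [float(row["expression"]) for row in csv.DictReader(handle)]
    return {"n": len(values),
            "mean": statistics.mean(values),
            "stdev": statistics.stdev(values)}

if __name__ == "__main__":
    # 'measurements.csv' stands in for raw data deposited in a public repository.
    print(summarise("measurements.csv"))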

The present challenge is how to transfer these good practices, which originated in the software engineering world and later permeated the computational sciences, to the wider scientific community, where the systems under study may be far less controllable (e.g. organisms or population samples). To help with this, there is an increasing effort to train scientists in technologies such as version control systems (e.g. Git and GitHub), wikis and digital lab notebooks. All of these systems enable several different parties to collaborate in an open and traceable way.

Even though some everyday scientific practices, such as peer review, are still being experimented with under the open umbrella, we may hope that in the future more and more of the key points discussed above will simply be taken for granted. At that stage we might not even need to distinguish Open Science from simply SCIENCE anymore.

On data sharing and open science – interviewing Rebecca Lawrence (F1000Research)

Rebecca Lawrence has worked in scientific publishing for over 15 years and is currently involved in several international associations and working groups on data publishing and peer review. Rebecca was responsible for the launch of F1000Research in January 2013, a novel open science publishing platform that “uses immediate publication, transparent peer review, and publishes all source data”. She came to the PRBB to talk about the future of scientific publishing.

What are the current challenges in scientific publishing?

One is the delay between the moment you’re ready to share the science and when it actually gets out there and others benefit from it – it can take from six months to a year, or even as long as five years! In the digital era this makes no sense. Another is bias in peer review, which is inherent to the process because those who review your work must be experts in your field and are therefore likely to be competitors.

Lack of data sharing is a huge issue. In a paper we usually don’t see the raw data that backs up the conclusions, we just take it on trust that the analysis has been done in the best way. Really, the core of the paper should be the data. And finally, a vast amount of results, particularly negative ones, are not being published – we are building the next generation of science on facts that are wrong or at least incomplete!

What can be done to resolve these issues?

We need transparency, and that’s what Open Science advocates, trying to make everything – the article, the data, the software and the review process – as transparent as possible. We now have the tools and it is cheap enough to share all the findings and data, although obviously we need to ensure that this is done in a useful way!

What are Open Science’s potential challenges?

Many researchers are nervous about sharing their data, because they don’t want to give a potential advantage to competitors, but actually you can get priority on the data if you openly share it. And it can be time consuming to sort your data out in a way that is understandable and usable by others. But if you don’t sort it out properly, in five years’ time even you won’t be able to do anything with it!

What can be done to entice authors to share data?

We need to give credit for the data. When people are better recognised for creating the data they will be happier to share it. And this is happening: many journals have started citing datasets properly in the references and we have launched a project with several international standards organisations to help develop dataset-level metrics.

How does F1000Research fit in with all this?

We are using a completely new publishing process that is fully transparent. We offer immediate publication following a set of basic checks; the article then goes out to invited expert referees. The names and reports of the reviewers are published alongside the article, and we also make our reviewers’ reports citable, to provide our referees with additional credit for their work. Finally, we strongly encourage publication of negative and null results, replication studies – all kinds of studies.

This interview was published by Maruxa Martínez-Campos in the July-August 2015 edition of El·lipse, the monthly newspaper of the PRBB.
