1000-AID: If data could speak …

Posted by me on behalf of co-authors

This project would be the first attempt at crowd-sourced paper writing by Indonesian authors. We began working on this paper in the open, using editable Google Docs and Google Sheets files. We publicized it using the power of social media (Facebook, Twitter, WhatsApp groups, and Telegram groups).

The minimum contribution required to be an author was to enter one’s Google Scholar (GS) and Scopus profile data in the designated Google Sheet file.

Here’s what we’ve come up with so far. Interested?

How to cite our dataset:

Team 1000 Authors. 2017. “Crowdsourced Dataset: GS and Scopus Profile”. INA-Rxiv. October 29. osf.io/preprints/inarxiv/m6zd7. DOI 10.17605/OSF.IO/M6ZD7

Join us by contributing your data to the following editable Google Sheet file.


H-index: GS (blue) vs Scopus (red)

Scopus values are significantly lower than GS values. Do they reflect the real productivity of Indonesian authors? We don’t think so.

One hypothesis we had in mind is the language barrier. Although Indonesians take English lessons from the early years of their education, they do not use English as a first or second language. As a result, their English skills are not highly developed, especially in scientific writing. In the following plot, you may see this chain reflected in the GS vs Scopus H-index values: less developed English skills lead to fewer papers in English, and hence to fewer citations.
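To make the comparison concrete, here is a minimal Python sketch of how an H-index is computed, and why an index that sees only part of an author’s papers reports a lower value. The citation counts and the choice of which papers are covered are made up for illustration; they are not taken from our dataset. For simplicity the sketch only drops papers, although a narrower index would usually count fewer citations per paper as well.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


# Hypothetical author: citations per paper as a broad index (GS-like) might see them.
all_papers = [12, 9, 7, 5, 4, 3, 1, 0]

# Hypothetical subset: only the papers covered by a narrower index (Scopus-like).
indexed_papers = [9, 4, 1]

gs_like_h = h_index(all_papers)
scopus_like_h = h_index(indexed_papers)
print("GS-like H-index:", gs_like_h)
print("Scopus-like H-index:", scopus_like_h)
```

The same author ends up with two different numbers, and the gap says more about coverage than about productivity.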

H-index vs working duration

GS data

Here’s another hint: H-index over working period (in Indonesian, “durasi waktu berkarya”) according to GS data. As we understand it, the H-index grows over time if an author writes a constant number of papers per year. The majority of the data contributors, who are also the authors of this 1000-AID paper, are at most 15 years into their careers, and you may see only a weak correlation between H-index and time.
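The idea that the H-index grows with career length can be sketched with a toy model in Python. Everything here is an assumption made for illustration: the author publishes a fixed number of papers each year (`papers_per_year`), and every paper gains the same number of citations each year (`citations_per_paper_per_year`). Real careers are far noisier, which is one reason the correlation in real data can be weak.

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    # On a descending list, "count >= rank" holds for exactly the first h ranks.
    return sum(1 for rank, count in enumerate(ranked, start=1) if count >= rank)


def simulated_h_index(years, papers_per_year=2, citations_per_paper_per_year=1):
    """H-index after `years` of work under the toy assumptions above."""
    citations = []
    for publication_year in range(years):
        # Number of years this batch of papers has been collecting citations.
        age = years - publication_year
        citations.extend([age * citations_per_paper_per_year] * papers_per_year)
    return h_index(citations)


# H-index for career lengths of 0 to 15 years (the range of most contributors).
trajectory = [simulated_h_index(t) for t in range(16)]
for year, h in enumerate(trajectory):
    print(year, h)
```

Under these idealized assumptions the H-index never decreases and climbs steadily with time, so a cloud of points with little trend suggests that output and citation rates differ a lot between authors.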

Scopus data

And here are the same authors based on Scopus data. The points move down because Scopus only sees English-language papers and citations published in Scopus-indexed journals and conferences.

Observing the chart, do you think it’s fair to use Scopus data to measure our performance and the quality of our work? It’s your call. 🙂

The weak link between Journal Impact Factor (JIF) and citations

People have been wondering whether a high JIF goes with high citation counts, as everyone would expect. So here’s what we found: based on GS data, the two have little or no correlation.

We collected the sum of JIF from each GS profile and plotted it against the sum of citations. You may see in the plot below that they’re unlikely to be correlated. This brings us to the following questions:

  1. is it because there are fewer papers in English than in Indonesian? JIF mostly covers English-language journals, while more Indonesians write for Indonesian journals in the Indonesian language, or
  2. is it because papers written by Indonesians and published in journals with an IF are less likely to be cited, for various reasons: the subject is less original, or the content is very local and goes unnoticed by international readers, or
  3. is it because JIF simply has not been the main consideration when people decide which papers to cite?
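For readers who want to check a relationship like this on their own numbers, here is a minimal Python sketch that computes the Pearson correlation coefficient between summed JIF and summed citations. The two lists are hypothetical placeholders, not values from our Google Sheet, and `pearson_r` is a helper written for this example. A value near 0 means little linear relationship; values near 1 or -1 mean a strong one.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of the same length, at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return covariance / math.sqrt(spread_x * spread_y)


# Hypothetical per-profile totals, for illustration only.
jif_sums = [0.0, 1.2, 3.5, 0.8, 5.1, 2.0]
citation_sums = [40, 15, 22, 60, 18, 35]

r = pearson_r(jif_sums, citation_sums)
print("Pearson r:", round(r, 2))
```

Pearson’s r only captures linear relationships and is sensitive to outliers, so it is worth looking at the scatter plot as well as the number.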

Points to convey

However, there are a lot of pros and cons to this project. Here we introduce a new way (in the Indonesian context), with a bit of sarcasm about that Scopus ID everybody has been looking for. The new way conveys the following ideas:

  • (massive) collaboration
  • (unique) creativity
  • (reproducible) research

While we are amazed by this movement, there are earlier instances of crowd-based authoring that show how contributions can be acknowledged down to the smallest bits.

Nature News — LHC paper

Science — LHC paper

Independent — LHC paper

Still not convinced that crowdsourcing science is possible?

Watch this video.


Yes, we run INA-Rxiv. Feel the freedom.