The Modern Scientific Paper

Resolution of the grand challenge in reproducibility

Johan Jansson (jjan@kth.se), KTH

Ridgway Scott, University of Chicago

Rebecca Durst, University of Pittsburgh

Problem: Publications are not reproducible

KI President Ottersen describes the problem concisely:

We are in the midst of what some have called a research reproducibility crisis. While scientific discovery and complexity are developing at an unprecedented speed, less than 50% of scientific research studies can be reliably replicated. Left unchecked, this troubling fact may threaten our ability to generate sound, evidence-based knowledge that meets society’s needs. It is time to look beyond the traditional measures of quality and re-examine the very concept of quality itself.

NASA Transform to Open Science (TOPS):
https://science.nasa.gov/open-science/transform-to-open-science
"Lowering barriers to entry for historically excluded communities"
"Increasing opportunities for collaboration while promoting scientific innovation, transparency, and reproducibility."

Illustration: In school, it is not enough to give only the answer to a problem: "show your work!"

In research it is often not possible to see or reproduce how the answer was derived or constructed. Why is this so?

Report from European Commission

"Reproducibility of scientific results in the EU", Directorate-General for Research and Innovation (European Commission), 2020

Excerpts from the EU report:

Second, there is a perceived deliberateness, or at least carelessness, in scientific production due to competitive pressures. A growing proportion of scientists are perceived as – willingly or unwittingly – bending some of the basic premises of the scientific method to produce ‘fast science’ or even ‘make believe science’ – facts and theories that are declared true but are dubious or even false. This rests more on the structure of incentives of science-making, embedded in culture and practice, than on deliberate attempts to ‘cheat’. The need for results to be reproducible, and the tangible steps needed to make them so, may help results be trustworthy and keep scientists honest.
Possible remedies:
[...]
Sharing of data, protocols, materials, software, codes, and other tools underlying publications; Transparency of analysis and modelling;
Possible actions:
[...]
17. Fund the testing and R&I development of automatic systems of compliance for reproducibility before publication;
[...]
24. Ensure that Horizon Europe provisions encourage and support
reproducibility (see list of possible actions, above);
25. Employ and police guidelines early in the grant application phase to anchor journal practices;

etc.

Reproducibility in the digital age

Lorena Barba, a professor at George Washington University in Washington, D. C., says in Physics World:

What we are calling for is changing those norms to give importance to the full set of digital objects that are part of a scientific study and acknowledging that the scientific paper is insufficient today in its methods section to include all of the information needed for another researcher to confirm the results or build from those results.

The technology exists to achieve this; there have been technical solutions since the 80s and 90s.

In the US, there are now guidelines from the National Academies of Sciences, Engineering, and Medicine for requiring the publication of the "digital objects" (Open Source). Professor Barba has been a leader in these developments.

Zenodo (https://en.wikipedia.org/wiki/Zenodo) has become a standard resource in science for publishing “data sets”. For each submission, a persistent digital object identifier (DOI) is minted, which makes the stored items easily citeable. Zenodo is based on the Open Source project Invenio.
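
As a minimal sketch of how such an identifier can be used, the Python snippet below builds the resolver URL and a one-line citation for a Zenodo DOI. The `10.5281/zenodo.` prefix is the one Zenodo uses for the DOIs it mints; the record number, author, title, and year are hypothetical placeholders, and the helper names `doi_url` and `cite` are illustrative, not part of any Zenodo or Invenio API.

```python
import re

# Zenodo-minted DOIs have the form 10.5281/zenodo.<record number>.
ZENODO_DOI = re.compile(r"10\.5281/zenodo\.\d+")


def doi_url(doi: str) -> str:
    """Return the doi.org resolver URL for a Zenodo DOI, or raise ValueError."""
    if not ZENODO_DOI.fullmatch(doi):
        raise ValueError(f"not a Zenodo DOI: {doi!r}")
    return f"https://doi.org/{doi}"


def cite(author: str, year: int, title: str, doi: str) -> str:
    """Format a one-line citation (illustrative style, not a standard)."""
    return f"{author} ({year}). {title}. Zenodo. {doi_url(doi)}"


if __name__ == "__main__":
    # Hypothetical record, used only for illustration.
    print(cite("A. Author", 2024, "Example data set", "10.5281/zenodo.1234567"))
```

Because the DOI resolves through doi.org, the citation stays valid even if the hosting page for the record moves.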

KTH Library is active in developing an Invenio/Zenodo-framework for supporting reproducibility.

Reproducibility in scientific modeling

With Invenio/Zenodo, DOIs can be acquired for both the source code and generated data for a scientific model, allowing this material to be easily cited. The material may then be shared while avoiding questions about ownership of the intellectual property.

However, just publishing a “data set”, or even an archive of the source code, does not by itself make scientific results reproducible. It may still take an enormous effort to actually re-run the computations (e.g. due to lack of familiarity with the required software or lack of access to computing resources), and before investing that effort you cannot know how reproducible the results are (e.g. because methodology sections are limited or missing).
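
One small step beyond archiving files is to publish a machine-readable record of the environment and of the results it produced, so that a reader can check a rerun quickly. The Python sketch below is an assumption about how such a record could look; it is not part of Zenodo, Invenio, or the framework presented here, and the function names, manifest fields, and sample bytes are hypothetical.

```python
import hashlib
import json
import platform
import sys


def sha256_of(data: bytes) -> str:
    """Hex SHA-256 digest of a result, used as its fingerprint."""
    return hashlib.sha256(data).hexdigest()


def make_manifest(results: dict) -> dict:
    """Record the environment and a checksum for each named result (bytes)."""
    return {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "checksums": {name: sha256_of(blob)
                      for name, blob in sorted(results.items())},
    }


def mismatches(manifest: dict, rerun: dict) -> list:
    """Names of results that are missing from the rerun or whose checksum differs."""
    return sorted(
        name for name, digest in manifest["checksums"].items()
        if name not in rerun or sha256_of(rerun[name]) != digest
    )


if __name__ == "__main__":
    # Placeholder bytes stand in for a real output file.
    published = make_manifest({"output.csv": b"x,y\n0,1\n"})
    print(json.dumps(published, indent=2))
    print(mismatches(published, {"output.csv": b"x,y\n0,1\n"}))
```

Byte-level checksums are a strict test: floating-point results can differ slightly between machines, so a comparison within a stated tolerance may be more appropriate for numerical output.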

Reproducibility requires transparency. A lack of transparency in experiments creates a barrier to inclusivity and accessibility in science.

Solution

We present the Digital Math framework as the foundation for modern science based on constructive digital mathematical computation.

Ubiquitous Computing: "Copy a lab"

  • "One click" - Jupyter - Google Colab - MyBinder
  • Virtual Machines - Universal computing environment - HPC with 100s-1000s cores.

Easily accessible: “copy our lab at zero cost and rerun the experiment in seconds with one click in a web browser”
Advantage of digital and simulation over experiments
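
The "one click" access listed above usually comes down to a link that opens a notebook directly in a hosted environment. The sketch below builds such links for Google Colab and MyBinder from a public GitHub repository. The URL patterns are the ones these services use for GitHub-hosted notebooks as far as we know; the owner, repository, branch, and notebook path are hypothetical placeholders, and the helper names are illustrative.

```python
from urllib.parse import quote


def colab_url(owner: str, repo: str, branch: str, notebook: str) -> str:
    """Link that opens a notebook from a public GitHub repository in Colab."""
    return ("https://colab.research.google.com/github/"
            f"{owner}/{repo}/blob/{branch}/{quote(notebook)}")


def binder_url(owner: str, repo: str, ref: str, notebook: str) -> str:
    """Link that builds the repository on mybinder.org and opens the notebook."""
    return (f"https://mybinder.org/v2/gh/{owner}/{repo}/{ref}"
            f"?labpath={quote(notebook, safe='')}")


if __name__ == "__main__":
    # Hypothetical repository and notebook path.
    print(colab_url("example-org", "example-lab", "main", "notebooks/demo.ipynb"))
    print(binder_url("example-org", "example-lab", "main", "notebooks/demo.ipynb"))
```

Pinning `branch` or `ref` to a tag or commit hash, rather than a moving branch, keeps the link pointing at the exact version of the notebook that a paper describes.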

Modern Scientific Paper

Zenodo Digital Object Identifier (DOI)
KTH-Zenodo/Invenio object
Direct link to Colab notebook

with interactive editable computation with Jupyter/FEniCS:

Modern Scientific Paper - paradigm-shift in aerodynamics

Our paper at KTH-Zenodo/Invenio with interactive editable computation with Jupyter/FEniCS:

Theory in Practice: Automated Digital Math

FEniCS is an open source FEM framework for the automated solution of general mathematical models (PDE). We started FEniCS in 2003; today it is a de facto world standard for mathematical FEM, with hundreds of co-authors at the highest level in academia.

Impact

  • Largest online course in Sweden with 30000+ participants - DigiMat Pro (MOOC-HPFEM)
  • My team has been elected to the IVA Royal Swedish Academy of Engineering Sciences 100-list.
  • Pilot project with one of the top Formula 1 teams.
  • Invited seminar with top engineers at Boeing.
  • Predictive aerodynamics in ELISE project for electric aircraft - highlighted by Swedish Prime Minister.
  • NASA and a Fields Medalist highlighted our work.
  • We have been selected to exhibit at World Expo together with Vinnova.
  • Attracted high-impact projects: H2020 MSO4SC, Center of Excellence, etc.
  • Great feedback from course participants: "Cool!", "Very exciting!", "It is good that the Digital Math computational environment and the theory are combined together.", “Hands-on experience of manipulating codes that are applied in the real world engineering problem.”, 90% recommend courses to others.
  • Large DigiMat initiative with leading municipality in Sweden with great feedback.

Success Stories

  1. Elected to IVA Royal Swedish Academy of Engineering Sciences 100-list
  2. Highlighted by NASA and Fields Medalist
  3. Pilot project with leading Formula 1 team, leading aircraft companies
  4. Invited to World Expo with Vinnova and RISE - highlighted by Ambassador
  5. Highlighted in Teacher Magazine reaching all teachers in Sweden - close collaboration with Lidingö Municipality
  6. Panel Debate on reproducibility with EU-Commission, Swedish Parliament, Young Academy of Sweden
  7. DigiMat Pro: Sweden's largest online course, with 30000+ participants
  8. ELISE project for electric aviation - predictive design with Heart and Katla, leading electric regional aircraft and drone startups in Sweden
  9. Excellent feedback from course participants:
    Basic: "Super-fun way to learn math!", "I did not know so much about algorithms before, but now I've started to understand".
    Pro: "Cool!", "Very exciting!", "It is good that the Digital Math computational environment and the theory are combined together.", “Hands-on experience of manipulating codes that are applied in the real world engineering problem.”, 90% recommend the course to others.
  10. Support from Swedish School authority, KTH Vice President, US, India, etc.
  11. Lidingö: DigiMat is the only solution that delivers in-depth integrated math+programming, which is missing today and is required by the teaching plan.