Detailed programme: Wednesday 8 November

Software reproducibility

--------------------------------------------------------------------------------------------------------------

13h30 - 14h15 Konrad Hinsen (Centre de Biophysique Moléculaire, CNRS; Synchrotron SOLEIL; France)

Caring for your environment(s)

  Programmers mentally divide code into three layers: their own code, the
  libraries and tools they interact with, and the
  environment. Unfortunately, the environment is the layer they care about
  least. My mission is to convince you that the environment is interesting
  and worth caring about. Did you know that the environment metaphor is
  very inaccurate? It's really the foundation supporting your code. And
  then, there are additional environments you should be aware of: the
  social environments of developers and users, society at large, the
  physical environment of our computing systems. Are you ready to become
  an environmentalist?

--------------------------------------------------------------------------------------------------------------

14h15 - 15h00 Miguel Colom-Barco (Centre Borelli, ENS Paris-Saclay, France)

Good practices for reproducibility

The lack of reproducibility of much scientific research, pointed out in 2009 by D. Donoho and many other researchers, not only undermines the credibility of results but also has a direct, negative impact on areas such as public health, safety, and security. There is a consensus on the benefits of reproducibility, but only a minority of researchers are fully committed to it. We'll review some of the reasons why, and think about how reproducible research could be encouraged and rewarded.

From a practical point of view, we'll discuss good practices for both writing and reviewing reproducible scientific articles. To this end, we'll review good practices related to the execution environments of the software, the versioning of the code, the FAIR principles for data, formats, standards, and the quality of the source code itself.

--------------------------------------------------------------------------------------------------------------

15h00 - 15h45 Sarah Gibson (2i2c, England)

Reproducible Computational Environments with Binder

Reproducible research is necessary to ensure that scientific work can be trusted. Funders and publishers are beginning to require that publications include access to the underlying data and the analysis code. The goal is to ensure that all results can be independently verified and built upon in future work. This is sometimes easier said than done! Sharing these research outputs means understanding data management, library science, software development, and continuous integration techniques: skills that are not widely taught or expected of academic researchers. A particularly steep barrier to working with codebases is setting up computational environments. The exact combination of package versions can affect the reproducibility of code, with effects ranging from outright failures to subtle changes in generated outputs. There are many tools available to manage your computational environment, but in this talk we’ll explore Project Binder and its subproject repo2docker, which aims to automate reproducibility best practices across a number of ecosystems. Binder can build portable computational environments on request, with all the information encoded in a single, clickable URL, which greases the wheels of collaborative research while reducing the toil involved. We will discuss how these concepts can apply to the HPC community.

--------------------------------------------------------------------------------------------------------------

16h15 - 17h00 Josselin Poiret (Équipe Gallinette; Nantes Université, Inria, France)

What is Guix?

Navigating the jungle of reproducible environments can be pretty tough, as there are myriad problems to consider: does this language-specific package manager build packages reproducibly? How does it integrate its external dependencies? Is the compiler bootstrapped? Will all the metadata it's using disappear in X years? Will my distributed artifacts work without external assumptions? Do I have tools and guarantees to examine all of this?

With these questions in mind, I will give an overview of Guix, the Swiss Army knife of environments, and describe how it can help you achieve reproducibility. I will also compare it with some other practices I've seen and try to dispel some misconceptions about them, highlighting Guix's strengths.

--------------------------------------------------------------------------------------------------------------

17h00 - 17h45 Nicolas Vallet (University Hospital of Tours, Hematology and Cell Therapy department, Inserm U1069, LNOx group, France)

Everyone can learn how to Guix

In the biomedical field, reproducibility is mainly
taught in the setting of bench experiments. Descriptions of data
analyses usually focus on basic measures such as mentioning the
software version used, sharing the raw data, and occasionally
providing a partial script. This presentation aims to showcase how we
went from version labels to Guix and how it shaped our workflows to
analyze various types of data, encompassing omics and targeted
measurements. We will provide insights into how we reported its use
in our published manuscripts.

--------------------------------------------------------------------------------------------------------------
