Reproducibility Tools

The Moore-Sloan Data Science Environment Open Science & Reproducibility Working Group at NYU has been developing a suite of tools that simplify the process of making experiments reproducible, including:


ReproZip is a tool aimed at simplifying the process of creating reproducible experiments from command-line executions. It tracks operating system calls and creates a package that contains all the binaries, files, and dependencies required to run a given command in the author’s computational environment. A reviewer can then extract the experiment in their own environment to reproduce the results, even if that environment runs a different operating system from the original one.
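
The author and reviewer steps described above can be sketched as a short Python helper that only assembles the command lines and never runs them. This is a minimal sketch, not a definitive recipe: the `reprozip trace`, `reprozip pack`, `reprounzip <unpacker> setup` and `reprounzip <unpacker> run` subcommands come from the ReproZip command-line interface, while the helper name `reprozip_commands`, the script `analysis.py`, the input `data.csv`, the package name `experiment.rpz` and the target directory `unpacked` are hypothetical placeholders. Check the ReproZip documentation for the options that fit your setup.

```python
import shlex


def reprozip_commands(experiment_argv, package="experiment.rpz",
                      unpacker="directory", target="unpacked"):
    """Assemble (but do not execute) the command lines for one pack/unpack cycle.

    All file and directory names are illustrative placeholders.
    """
    # Author side: trace the experiment's system calls, then bundle the result.
    trace = ["reprozip", "trace"] + list(experiment_argv)
    pack = ["reprozip", "pack", package]
    # Reviewer side: unpack with the chosen unpacker, then re-run the experiment.
    setup = ["reprounzip", unpacker, "setup", package, target]
    run = ["reprounzip", unpacker, "run", target]
    return {"author": [trace, pack], "reviewer": [setup, run]}


# Hypothetical experiment: a Python script reading one CSV file.
plan = reprozip_commands(["python", "analysis.py", "data.csv"])
for role, commands in plan.items():
    for argv in commands:
        print(role, "$", shlex.join(argv))
```

Swapping the `unpacker` argument for another unpacker that is installed on the reviewer's machine (for example `docker` or `vagrant`) is how the same package would be replayed on a different operating system.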

See how researchers use ReproZip

REANA is a free and open-source platform for reproducible research data analysis, built at CERN. It lets users structure their analysis inputs, code, environments, and workflows, and run analyses on remote containerised compute clouds.

See some REANA examples

ReproMatch stands for Reproducibility Match. It was designed to help you find the tool (or tools) that best match your reproducibility needs. The tools in the ReproMatch catalog are classified according to different reproducibility tasks, which we organized in a taxonomy. Please see Reproducibility Tasks for a detailed description of this taxonomy.

Browse all the tools


VisTrails is an open-source scientific workflow and provenance management system that supports simulations, data exploration, and visualization. Workflows have traditionally been used to automate repetitive tasks, but in exploratory applications such as simulations, data analysis, and visualization, very little is repeated: change is the norm.

Read more about VisTrails

RECAST is a framework for extending the impact of existing analyses performed by high-energy physics experiments. Anyone can add analyses to the Analysis Catalog, upload alternative signals in the LHE format, request that any given analysis be "recast" for their alternative model, and subscribe to an analysis to be informed of activity associated with it. The impact of RECAST depends entirely on the incorporation and integration of existing analyses into the framework.

See the source of RECAST

noWorkflow is a non-intrusive tool: it does not require researchers to change the way they work, yet it lets them capture a variety of provenance information and use the analyses it supports, including graph-based visualization, differencing over provenance trails, and inference queries. noWorkflow is written in Python and can currently capture the provenance of Python scripts using software engineering techniques such as abstract syntax tree (AST) analysis, reflection, and profiling, without requiring a version control system or any other environment.
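
To give a flavour of what AST analysis can recover from an unmodified script, here is a minimal sketch built on Python's standard `ast` module. It is not noWorkflow's implementation or API: the function `collect_definition_provenance`, the sample script in `SOURCE`, and the shape of the returned dictionary are assumptions made for illustration only. It records, for each function definition, where it is declared, which parameters it takes, and which names it calls.

```python
import ast

# Hypothetical script under analysis; it is parsed, never executed.
SOURCE = '''
import math

def area(radius):
    return math.pi * radius ** 2

def report(radius):
    print(area(radius))

report(2.0)
'''


def collect_definition_provenance(source):
    """Return, per function definition, its line, parameters and called names."""
    tree = ast.parse(source)
    provenance = {}
    for node in ast.walk(tree):
        if not isinstance(node, ast.FunctionDef):
            continue
        calls = set()
        # Walk the function body and record the name of every call target.
        for child in ast.walk(node):
            if isinstance(child, ast.Call):
                func = child.func
                if isinstance(func, ast.Name):         # e.g. area(...)
                    calls.add(func.id)
                elif isinstance(func, ast.Attribute):  # e.g. obj.method(...)
                    calls.add(func.attr)
        provenance[node.name] = {
            "line": node.lineno,
            "args": [arg.arg for arg in node.args.args],
            "calls": sorted(calls),
        }
    return provenance


summary = collect_definition_provenance(SOURCE)
for name, info in summary.items():
    print(name, info)
```

Because the script is only parsed, this kind of static pass needs no changes to the researcher's code. Runtime information such as actual argument values or execution times would have to come from other techniques, such as the reflection and profiling mentioned above.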

Watch a tutorial video on the wiki