Reproducible Publications with Julia and Quarto

07/29/2022, 12:30 PM — 1:00 PM UTC
Purple

Abstract:

Quarto is an open-source scientific and technical publishing system that builds on standard markdown with features essential for scientific communication. The system has support for reproducible embedded computations, equations, citations, crossrefs, figure panels, callouts, advanced layout, and more. In this talk we'll explore the use of Quarto with Julia, describing both integration with IJulia and the Julia VS Code extension, as well as areas for future improvement and exploration.

Description:

Quarto is an open-source scientific and technical publishing system that builds on standard markdown with features essential for scientific communication. One of the most important enhancements is embedded computations, which enable documents to be fully reproducible. There are also a wide variety of technical authoring features including equations, citations, crossrefs, figure panels, callouts, advanced layout, and more. In this talk we'll explore the use of Quarto with Julia, describing both integration with IJulia and the Julia VS Code extension, as well as areas for future improvement and exploration.

Quarto is built on Pandoc and as a result can target dozens of output formats including HTML, PDF, MS Word, OpenOffice, and ePub. Quarto also includes a project system that enables publishing collections of documents as a blog, full website, or book. Output formats are extensible, making it possible to create journal-ready LaTeX and HTML output from the same source code. Several examples of creating these output types with Julia will be presented, and we will take advantage of integration between the Quarto and Jupyter VS Code extensions to demonstrate productive workflows.

After reviewing the basics of the system and presenting examples, we'll dive more into the technical details of how Quarto works. One of the things that makes Pandoc so capable is that it is not merely a markdown system but rather a generalized system for computing on documents. We'll describe the Pandoc AST for documents and how users of Quarto can write filters to transform the AST during rendering. Examples of filters authored with both Lua (the Pandoc embedded language for filters) and Julia (via the PandocFilters.jl package) will be presented.
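To make the idea of "computing on documents" concrete, here is a minimal sketch of a filter-style transformation over Pandoc's JSON representation of the AST. It is written in Python purely for illustration (the talk itself presents Lua and Julia filters), and the names `walk`, `shout`, `apply_filter`, and `SAMPLE` are hypothetical helpers, not part of Pandoc, Quarto, or PandocFilters.jl. The sample document is a hand-written fragment that follows the `{"t": ..., "c": ...}` element shape used by the Pandoc JSON format.

```python
import json


def walk(node, action):
    """Recursively visit a Pandoc JSON AST, letting `action` replace elements.

    `action` receives each element (a dict with a "t" type tag) and returns
    either a replacement element or None to keep walking into the original.
    """
    if isinstance(node, list):
        return [walk(item, action) for item in node]
    if isinstance(node, dict):
        if "t" in node:
            replaced = action(node)
            if replaced is not None:
                return replaced
        return {key: walk(value, action) for key, value in node.items()}
    return node  # strings, numbers, and other leaves are returned unchanged


def shout(element):
    """Example action (hypothetical): upper-case every Str element."""
    if element["t"] == "Str":
        return {"t": "Str", "c": element["c"].upper()}
    return None


def apply_filter(doc_json, action):
    """Parse a JSON document, transform its blocks, and serialize it again."""
    doc = json.loads(doc_json)
    doc["blocks"] = walk(doc["blocks"], action)
    return json.dumps(doc)


# Hand-written sample document: one paragraph containing "Hello world".
SAMPLE = json.dumps({
    "pandoc-api-version": [1, 22],
    "meta": {},
    "blocks": [
        {"t": "Para", "c": [
            {"t": "Str", "c": "Hello"},
            {"t": "Space"},
            {"t": "Str", "c": "world"},
        ]},
    ],
})

if __name__ == "__main__":
    print(apply_filter(SAMPLE, shout))
```

A real JSON filter follows the same pattern but reads the document from standard input and writes the transformed document to standard output, so that Pandoc can run it through its `--filter` option.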

Embedded computations present the opportunity for fully reproducible workflows, but also create new performance challenges. The system needs to support expensive, long-running computations while remaining responsive enough for interactive, iterative use (especially for content authoring). Quarto includes a variety of facilities for managing these tradeoffs, including daemonized Jupyter kernels for interactive use, caching computations, and the ability to freeze computational documents. We'll demonstrate using all of these techniques with Julia, and discuss their benefits, drawbacks, and potential for future improvement.
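The tradeoff behind caching computations can be sketched in a few lines. The example below is an illustration of the general idea only (re-run a cell when its source text changes, otherwise reuse the stored result) and does not reflect how Quarto implements its cache. `CellCache` and `slow_square` are hypothetical names introduced for this sketch.

```python
import hashlib


class CellCache:
    """Toy cache keyed by a hash of a cell's source text (illustrative only)."""

    def __init__(self):
        self._results = {}
        self.executions = 0  # counts how often a cell was actually executed

    def run(self, source, execute):
        """Return the cached result for `source`, executing it only if unseen."""
        key = hashlib.sha256(source.encode("utf-8")).hexdigest()
        if key not in self._results:
            self._results[key] = execute(source)
            self.executions += 1
        return self._results[key]


def slow_square(source):
    """Stand-in for an expensive computation: square the integer in `source`."""
    return int(source) ** 2


if __name__ == "__main__":
    cache = CellCache()
    print(cache.run("12", slow_square))  # executed
    print(cache.run("12", slow_square))  # served from the cache
    print(cache.executions)
```

Keying on the source text alone is the main drawback of a scheme like this: a cell whose inputs change (data files, upstream cells, package versions) while its text stays the same would return a stale result, which is one reason explicit controls over when to re-execute are useful.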

Quarto interfaces with embedded Julia code using its Jupyter computational engine and the IJulia kernel. Documents can be authored in either a plain text markdown format or as Jupyter notebooks. There are several other literate programming systems available in the Julia ecosystem (Pluto, Neptune, Weave.jl, etc.) which have their own benefits and tradeoffs. We'll discuss why we chose IJulia along with an exploration of how we could integrate with other systems.
