Effortless Bayesian Deep Learning through Laplace Redux

07/28/2022, 4:50 PM — 5:00 PM UTC
Blue

Abstract:

Treating deep neural networks probabilistically comes with numerous advantages, including improved robustness and greater interpretability. These factors are key to building trustworthy artificial intelligence (AI). A drawback commonly associated with existing Bayesian methods is that they increase computational costs. Recent work has shown that Bayesian deep learning can be effortless through Laplace approximation. This talk presents an implementation in Julia: BayesLaplace.jl.

Description:

Problem: Bayes can be costly 😥

Deep learning models are typically heavily underspecified by the data, which makes them vulnerable to adversarial attacks and impedes interpretability. Bayesian deep learning promises an intuitive remedy: instead of relying on a single explanation of the data, we are interested in computing averages over many compelling explanations. Multiple approaches to Bayesian deep learning have been put forward in recent years, including variational inference, deep ensembles, and Monte Carlo dropout. Despite their usefulness, these approaches involve additional computational costs compared to training just a single network. Recently, another promising approach has entered the limelight: the Laplace approximation (LA).
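
In standard notation (an illustration, not tied to any particular package), this Bayesian model average is the posterior predictive distribution, which replaces a single trained parameter vector with an average over the parameter posterior:

$$
p(y \mid x, \mathcal{D}) \;=\; \int p(y \mid x, \theta)\, p(\theta \mid \mathcal{D})\, \mathrm{d}\theta \;\approx\; \frac{1}{N} \sum_{n=1}^{N} p(y \mid x, \theta_n), \qquad \theta_n \sim p(\theta \mid \mathcal{D}),
$$

where $\theta$ denotes the network parameters and $\mathcal{D}$ the training data. The approaches listed above differ mainly in how they approximate the posterior $p(\theta \mid \mathcal{D})$ and hence this integral.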

Solution: Laplace Redux 🤩

While LA was first proposed in the 18th century, it has so far not attracted serious attention from the deep learning community, largely because it involves a potentially large Hessian computation. The authors of this recent NeurIPS paper are on a mission to change the perception that LA has no use in DL: they demonstrate empirically that LA can be used to produce Bayesian model averages that are at least on par with existing approaches in terms of uncertainty quantification and out-of-distribution detection, while being significantly cheaper to compute. Our package BayesLaplace.jl provides a lightweight implementation of this approach in Julia that allows users to recover Bayesian representations of deep neural networks in an efficient post-hoc manner.
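
The core idea is simple: take the trained (MAP) weights $\hat{\theta}$ and fit a Gaussian $p(\theta \mid \mathcal{D}) \approx \mathcal{N}(\hat{\theta}, H^{-1})$, where $H$ is the Hessian of the negative log posterior at $\hat{\theta}$; this is the Hessian computation mentioned above. The toy sketch below illustrates this post-hoc recipe on a two-parameter logistic-regression model. It is only meant to convey the idea and does not use the BayesLaplace.jl API; the learning rate, the fixed prior precision `λ`, and the probit-based `predict` step are assumptions made for the example.

```julia
# Toy sketch of a post-hoc Laplace approximation for binary classification.
# A two-parameter logistic regression stands in for a Flux network;
# this is NOT the BayesLaplace.jl API.
using Zygote, LinearAlgebra, Random

Random.seed!(42)
X = randn(2, 100)                              # toy inputs (2 features, 100 samples)
y = Float64.(X[1, :] .+ 0.5 .* X[2, :] .> 0)   # toy binary labels

sigmoid(z) = 1 / (1 + exp(-z))
λ = 1.0                                        # prior precision (assumed fixed here)

# Negative log joint: binary cross-entropy plus a Gaussian (weight-decay) prior.
function negloglik(w)
    p = sigmoid.(vec(w' * X))
    return -sum(@. y * log(p) + (1 - y) * log(1 - p)) + 0.5 * λ * sum(abs2, w)
end

# 1. Plain training: find the MAP estimate with a few gradient-descent steps.
w = zeros(2)
for _ in 1:2_000
    global w -= 0.01 .* Zygote.gradient(negloglik, w)[1]
end

# 2. Post-hoc Laplace step: Gaussian posterior N(w, Σ) with Σ = inv(H),
#    where H is the Hessian of the negative log joint at the MAP estimate.
H = Zygote.hessian(negloglik, w)
Σ = inv(H)

# 3. Predictive uncertainty via the standard probit approximation.
predict(x) = sigmoid(dot(w, x) / sqrt(1 + π * dot(x, Σ * x) / 8))

@show predict([2.0, 0.0])   # fairly confident far from the decision boundary
@show predict([0.0, 0.0])   # uncertain (≈ 0.5) on the decision boundary
```

Note that steps 2 and 3 run after training has finished, which is what makes the approach "post-hoc": the MAP estimate from step 1 is exactly what standard training would have produced anyway.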

Limitations and Goals 🚩

The package functionality is still limited to binary classification models trained in Flux. It also lacks a framework for optimizing with respect to the Bayesian prior. In future work we aim to extend this functionality. We would like to develop a library that is at least on par with the existing Python library Laplace. Unlike that library, we would like to leverage Julia's support for language interoperability to also facilitate applications to deep neural networks trained in other programming languages such as Python and R.

Further reading 📚

For more information on this topic, please feel free to check out my introductory blog post: [TDS], [blog]. Presentation slides can be found here.
