I present a new package that aims to automate the process of using reinforcement learning to solve discrete-time heterogeneous-agent macroeconomic models. Models with discrete choice, matching, aggregate uncertainty, and multiple locations are supported. The pure-Julia package, tentatively named Bucephalus.jl, also defines a data structure for describing this class of models, allowing new solvers to be easily implemented and models to be defined once and solved many ways.
Heterogeneous-agent macroeconomic models, though relatively recent in their development, have been applied across macroeconomics, and have contributed to our understanding of inequality, trade, business cycles, migration, epidemics, and the transmission of monetary policy.
Conventional methods of solving these models, which generally require computing policy or value functions on a grid that covers the model's entire state space, are subject to a curse of dimensionality: the number of grid points grows exponentially with the number of state variables, so a model with a high-dimensional state space quickly becomes infeasible to solve. Using neural networks instead of grids to approximate policy and value functions addresses this problem, and has become an important and active area of research. Because these networks must be trained by simulating agents and updating based on simulated outcomes, these solution methods are a form of reinforcement learning.
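The scale of the grid problem can be illustrated with a short calculation. The following is a minimal sketch, not part of Bucephalus.jl: the function names `grid_points` and `grid_memory_gb` are hypothetical, and it assumes a tensor-product grid with the same number of points along every state dimension and one `Float64` value stored per grid point.

```julia
# Illustrative only: these helpers are hypothetical and are not part of Bucephalus.jl.

# Number of points in a tensor-product grid with `points_per_dim` points along
# each of `n_dims` state dimensions. BigInt avoids integer overflow.
grid_points(points_per_dim::Integer, n_dims::Integer) = big(points_per_dim)^n_dims

# Memory, in gigabytes, needed to store one Float64 (8 bytes) per grid point.
grid_memory_gb(points_per_dim::Integer, n_dims::Integer) =
    Float64(grid_points(points_per_dim, n_dims)) * 8 / 1e9

# Print how the grid grows as state variables are added (50 points per dimension).
for d in (1, 2, 4, 8)
    println("dimensions = ", d,
            ", grid points = ", grid_points(50, d),
            ", memory (GB) = ", grid_memory_gb(50, d))
end
```

Under these assumptions, each additional state variable multiplies the storage requirement by the number of points per dimension, which is why grid-based methods become impractical well before the state space is large by machine-learning standards.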
At present, the ability to use reinforcement learning to solve economic models is limited to economists who are also trained in these techniques. Bucephalus.jl aims to make these techniques accessible by automating the process while remaining applicable to a broad class of models. The user describes a model using a simple model description syntax built on Julia macros. The model is then automatically compiled to a standard data structure, to which, in principle, many solvers could be applied. I present a solver that uses deep reinforcement learning to solve for the steady state, impulse responses, and transition paths.
The package furthermore implements reinforcement learning techniques never before applied to this domain, including discrete-choice policy networks and nested generalized moments.