Julia's compiler spends almost all of its time generating, optimizing, and compiling LLVM IR. Currently, much of this work is done under one giant lock, which is also held during type inference, reducing compiler throughput in a multithreaded environment. By using finer-grained locking and handling LLVM IR in a threadsafe manner, we can reduce contention for compilation resources. This work also lays the groundwork for future JIT optimizations such as lazy, parallel, and speculative compilation of Julia code.
Julia's JIT compiler converts Julia IR to LLVM IR, optimizes it, and lowers it to machine code for efficient subsequent execution. However, much of this process relies on shared global resources, such as a global LLVM context, the pass manager that runs the optimization pipeline, and various data caches. This has necessitated a global lock to prevent multiple threads from modifying this data simultaneously. Furthermore, because generation of LLVM IR and type inference may co-recurse indefinitely, type inference also acquires and holds the same lock for the duration of its execution. This serialized compilation process increases startup latency in multithreaded environments (often referred to as time-to-first-plot, or TTFP) and prevents our execution environment from performing more complex transformations, such as speculative and parallel compilation.
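To make the problem concrete, below is a minimal sketch of a serialized design of this shape. It is not Julia's actual implementation: the names (`codegen_lock`, `compile`, `infer`, `code_cache`) are hypothetical placeholders, and the shared LLVM state is collapsed into a single cache. The key point it illustrates is that one recursive lock is held across both inference and codegen, so only one thread can make compilation progress at a time.

```cpp
#include <mutex>
#include <string>
#include <unordered_map>

struct MethodInstance { std::string name; };
struct MachineCode    { void *entry = nullptr; };

// Shared global state guarded by the one lock; in the real compiler this
// would include the LLVM context, the pass manager, and various caches.
static std::recursive_mutex codegen_lock;
static std::unordered_map<std::string, MachineCode> code_cache;

static MachineCode run_codegen(const MethodInstance &) {
    return MachineCode{};  // stand-in for Julia IR -> LLVM IR -> native code
}

MachineCode compile(const MethodInstance &mi);

// Inference may trigger compilation of callees (and codegen may trigger
// inference), so the lock is recursive and stays held across the whole
// co-recursion.
void infer(const MethodInstance &mi) {
    std::lock_guard<std::recursive_mutex> guard(codegen_lock);
    // ... run type inference, possibly calling compile() on callees ...
}

MachineCode compile(const MethodInstance &mi) {
    // The "giant" lock: every thread that wants anything compiled queues here.
    std::lock_guard<std::recursive_mutex> guard(codegen_lock);
    auto it = code_cache.find(mi.name);
    if (it != code_cache.end())
        return it->second;
    infer(mi);                          // re-enters the same lock
    MachineCode mc = run_codegen(mi);   // IR generation + LLVM optimization
    code_cache.emplace(mi.name, mc);
    return mc;                          // other threads waited the entire time
}
```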
Thus far, refactorings of our IR generation pipeline have reduced the number of global variables used in the compiler and added finer-grained locks to our JIT stack in preparation for removing the global locks. At this stage, much of the remaining challenge in removing the global lock lies in proving thread safety and progressively shrinking the scope of the lock until only the minimal critical sections remain protected. Once that work is complete, work on speculative optimization and IR generation can begin, which should bring additional improvements to TTFP even in situations without multiple contending compilation threads.
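As a rough illustration of where finer-grained locking leads, here is a sketch using the same hypothetical names as above, not the actual refactored Julia code: the expensive IR generation and optimization run in a context owned by the compiling thread, and the only remaining critical sections are short cache lookups and insertions.

```cpp
#include <mutex>
#include <string>
#include <unordered_map>

struct MethodInstance { std::string name; };
struct MachineCode    { void *entry = nullptr; };

// A per-compilation context: IR built here is visible only to the thread
// that owns it, so no lock is needed while generating or optimizing it.
struct ThreadSafeContext { /* would own an LLVM-style context */ };

// One small lock per shared resource instead of a single giant lock.
static std::mutex cache_lock;
static std::unordered_map<std::string, MachineCode> code_cache;

static MachineCode run_codegen(ThreadSafeContext &, const MethodInstance &) {
    return MachineCode{};  // stand-in for IR generation + optimization
}

MachineCode compile(const MethodInstance &mi) {
    {   // Critical section 1: a quick cache probe.
        std::lock_guard<std::mutex> guard(cache_lock);
        auto it = code_cache.find(mi.name);
        if (it != code_cache.end())
            return it->second;
    }

    // The expensive work runs concurrently with other threads' compilations:
    // the IR lives in a context owned by this compilation alone.
    ThreadSafeContext ctx;
    MachineCode mc = run_codegen(ctx, mi);

    {   // Critical section 2: publish the result (first writer wins).
        std::lock_guard<std::mutex> guard(cache_lock);
        return code_cache.emplace(mi.name, mc).first->second;
    }
}
```

One consequence of a design like this is that two threads racing to compile the same method may duplicate work; correctness is preserved because the cache insertion is the only shared-state update, it happens under the lock, and the first published result wins.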