BanyanDataFrames.jl is an open-source library for processing massive Parquet/CSV/Arrow datasets in your Virtual Private Cloud. One of the key goals of the project is to match the API of DataFrames.jl as closely as possible. In this talk, we will provide an overview of BanyanDataFrames.jl and discuss the challenges and successes so far in achieving massively scalable data analytics with the Julia language.
More information about BanyanDataFrames.jl can be found on GitHub: https://github.com/banyan-team/banyan-julia and https://github.com/banyan-team/banyan-julia-examples