Production Data Engineering in Julia

07/29/2022, 4:30 PM — 6:00 PM UTC
BoF

Abstract:

We believe that Julia is uniquely well-positioned to pioneer new approaches to dataflow orchestration, a field currently dominated by monolithic frameworks. In this BoF, Julia's nascent Data Engineering community will swap experiences and identify opportunities to collaborate on open-source next-generation data engineering tools.

Join the discussion in the bof-voice channel on Discord.

Description:

Julia has already succeeded by empowering many scientists and engineers to author their own high-performance compute kernels without the ergonomics/composability sacrifices that high-performance code often entails. However, actually leveraging these kernels in production often requires packaging them into an automated service, usually as part of a wider automated pipeline. It is therefore not surprising that, in the past few years, many new capabilities and packages have emerged that facilitate this by enabling Julia to run atop Kubernetes, interoperate with tabular data sources/sinks via Apache Arrow, and integrate with other popular cloud-native technologies. This blossoming ecosystem within the wider Julia community demonstrates both the desire and the opportunity for Julia's usage in production data engineering contexts.

Topics of discussion for this BoF include:

  • current data engineering efforts/challenges faced by industry Julia users maintaining production systems
  • containerization of Julia processes and Julia-functions-as-a-service
  • executing Julia-based jobs/services via Kubernetes
  • Julia-centric workflow/dataflow orchestration
  • the intersection of Julia's tabular data ecosystem and enterprise data architectures

Our goal is two-fold:

  • uncover the shared data engineering problems, tools, and opportunities that characterize Julia's nascent Data Engineering community
  • identify concrete opportunities for open-source and cross-organization collaboration (hackathons, blogs, package development, etc.)
