Statistics symposium

07/22/2022, 8:00 AM — 11:00 AM UTC
Green

Abstract:

Statistics is a domain where some early stage development of packages, and some early applications, have come about in Julia. We think of this mini-symposium as a combination of (a) Report on many interesting recent developments in this field and (b) Offer a birds eye view to the people interested in this field, and help them assess the state of maturity so as to make decisions about whether Julia is appropriate for their statistics work.

Description:

  1. "Doing applied statistics research in Julia", Ajay Shah (20 minutes) We show the journey of two applied statistics research papers, done fully in Julia, by researchers who were previously working in R. What was convenient, what were the chokepoints, what were the gains in expressivity and in performance. Based on this, we evaluate the state of maturity of Julia for doing applied statistics. We propose practical pathways for statisticians, and speak to the Julia community about what is required next. We report on recent developments in the field of Julia and statistics.

  2. "CRRao: A unified framework for statistical models", Sourish Das (20 minutes) Many statistical models are available in Julia, and many more will come. CRRao is a consistent framework through which callers interact with a large suite of models. For the end-user, it reduces the cost and complexity of estimating statistical models. It offers convenient guidelines through which development of additional statistical models can take place in the future.

  3. "TSx: A time series class for Julia", Chirag Anand (20 minutes) DataFrames.jl is a powerful system, but expressing the standard tasks of manipulating time series -- e.g. as seen in finance or macroeconomics -- is often cumbersome. We draw on the work of the R community, which has built zoo and xts, to build a time series class, TSx, which delivers a simple set of operators and functions for the people working with time series. It constitutes syntactic sugar on top of the capabilities of DataFrame.jl and thus harnesses the capabilities and efficiency of that package. We conduct comparisons of capabilities and performance against zoo and xts in R.

  4. "Comparing glm in Julia, R and SAS", Mousum Datta (10 minutes) glm is an unusually important class of statistical models. We compare the capabilities, correctness and performance of the present glm systems in Julia, R and SAS. We report on recent improvements that have been injected into GLM.jl.

  5. "Working with survey data", Ayush Patnaik (10 minutes) The Julia package survey.jl builds some of the functionality required for statistical estimators with stratified random sampling. For a limited subset of the capabilities of Thomas Lumley's R package `survey', we show the correctness and the performance gains of the Julia package.

Platinum sponsors

Julia ComputingRelational AIJulius Technology

Gold sponsors

IntelAWS

Silver sponsors

Invenia LabsBeacon BiosignalsMetalenzASMLG-ResearchConningPumas AIQuEra Computing Inc.Jeffrey Sarnoff

Media partners

Packt PublicationGather TownVercel

Community partners

Data UmbrellaWiMLDS

Fiscal Sponsor

NumFOCUS