James is an engineer at Saturn Cloud, where he works on a managed Dask + Kubernetes offering and contributes to open source projects in the Dask ecosystem, such as Prefect. Before that, he spent 4 years as a data scientist working on industrial internet of things (IIoT) problems at AWS and Chicago-based Uptake. He is a maintainer on LightGBM, a popular machine learning project from Microsoft Research, and on 2 R packages on CRAN and 1 Python package on PyPI. James holds master's degrees in Applied Economics and in Data Science.

"Scaling Machine Learning with Python and Dask"
Tuesday, 9/22 @ 10:30am

In this talk, attendees will get an introduction to Dask, a distributed computing framework in the PyData ecosystem. The first half of the talk will describe the current state of the project and its ecosystem, including distributed data collections, cloud deployment options, distributed machine learning projects, and workflow orchestration. The second half of the talk will be a live demo showing the programming model for machine learning on Dask. The talk will also present benchmarks comparing Dask with single-node training (scikit-learn) and Spark (Spark ML) for tree-based supervised learning models.