Mariano Gonzalez

Mariano is a Chicago-based computer geek originally from Mexico. Over the last few years, he has been working with big data technologies such as Hadoop, Cassandra, and Spark, to name a few. He has been using the JVM for more than 12 years, implementing many kinds of applications for different business fields (insurance, banking, trade shows, and recently IoT). He enjoys sharing his knowledge of the JVM and its huge ecosystem with the Chicago tech community.

Native Spark Executors on K8S: Diving Into a Data Lake

By Grace Chang and Mariano Gonzalez

 

Everybody wants to do big data on a data lake! However, implementing one and maintaining the infrastructure needed to explore it, such as Spark, has historically been a challenging endeavor. Kubernetes is the tool of choice for cloud orchestration, and Spark continues to be the de facto framework for most data wrangling tasks. We’ve previously tried different data lake architectures and suffered the pain that Hadoop carries with it. Finally, we decided to bring the best of the cloud and big data worlds together. In this session, we will walk you through how to set up an endless data lake powered by native Spark executors on Kubernetes. Come to this session to get a grasp on how data visualization can be achieved with ease!