Bogdan Ghit

I'm a computer scientist and tech lead at Databricks, working broadly on big data, distributed systems, and cloud computing. As tech lead, I work on building cloud computing infrastructure to speed up SQL workloads in the Databricks SQL warehouse.

Before joining Databricks, I earned a PhD in Computer Science from Delft University of Technology. My research focused on scheduling and resource management for big data processing frameworks.

Impact
At Databricks, I designed Cloud Fetch, a new architecture for high-throughput connectivity with Business Intelligence tools that enables parallel extracts of query results from data warehouses. I also incorporated the Dynamic Partition Pruning optimization in Apache Spark, which detects and avoids scanning data that is irrelevant to the query.

One of the highlights of my research was Fawkes, a resource manager for allocating and balancing resources among big data frameworks. My work also advocates multi-queue scheduling biased toward short jobs to reduce job slowdown in data centers.

At IBM Research, I co-invented Capri, a new cloud spot market abstraction built on bribe scheduling that provides good fairness guarantees based on bids.

Publications


2022



2021



2020



2019



2018



2017


2016


2015


2014


2013


2012


Talks


Research Supervision


Awards


Service