Very excited to share that our work on SwitchML is now available. SwitchML shows that distributed ML training workloads are often network-bound, and that in-network aggregation using a programmable switch can resolve this bottleneck.
I recently gave a Distinguished Lecture at the National University of Singapore about Accelerating Distributed Systems with In-Network Computation.
The recording is now available to watch online.
On 11/18, Papers We Love is holding a mini-conference to help raise money for USENIX and support open access – a great cause!
I’ll be speaking on a panel with a number of very distinguished guests!
I recently gave a keynote about in-network computation at the SPMA workshop at EuroSys.
Having finally remembered how to update my website, I can now share three cool new systems:
Harmonia (VLDB '20) lets replicated storage systems scale their performance linearly with the number of replicas, without sacrificing linearizability, by using a programmable switch as a contention detector.
LeapIO (ASPLOS '20) is an architecture for efficiently offloading complex cloud storage stacks to ARM-based coprocessors, avoiding the 10-20% “storage tax” CPU overhead that cloud providers pay today.
Meerkat is a new multicore-scalable replicated transaction protocol that avoids both cross-core and cross-replica coordination for non-conflicting transactions.
I’m happy to share this position paper with our take on in-network computing – not just what we can do, but what we should do; what applications it’s good for; and what we need to solve before we can deploy it.
This will appear in HotOS '19.
I’m currently chairing USENIX HotCloud 2019. Looking forward to exciting submissions on early-stage work and new research directions!