Jonathan Perry

Founder and CEO of Flowmill:
software-based (eBPF) network monitoring,
now part of Splunk.

PhD in Computer Science, MIT.

Flowmill helps SREs accelerate production incident resolution.

By monitoring every service dependency pair, Flowmill answers questions such as “which of these 30 services is likely the cause of this incident?” in seconds, making it possible to direct escalations to fewer, more relevant engineers. This expedites triage, focuses mitigation efforts, and dramatically reduces war room staffing and engineer burnout. Flowmill monitoring has negligible overhead, uses no sampling, requires no per-service configuration or code changes, and can be deployed through configuration management in less than 20 minutes.

We are hiring! See our careers page.

I received my Ph.D. from MIT CSAIL’s Networks and Mobile Systems group, advised by Hari Balakrishnan and Devavrat Shah, with the thesis “Centralized performance control for datacenter networks”. The work involved intensive collaboration with Microsoft Research (2011) and Facebook (2013-2017). Before that, I spent 7 years in communication systems R&D and HPC algorithm development as an officer in an army technological unit.

The PhD research revolved around enabling fast detection of, and reaction to, undesirable incidents in datacenter and cloud networks, by designing extremely fine-granularity, low-overhead, low-latency monitoring, processing, and control of service interactions. The resulting systems mostly control network transfers, since this use case poses extreme challenges for the technology. Fastpass aims for high utilization with zero queueing: a logically centralized arbiter controls and orchestrates all network transfers. Flowtune assigns shares of network throughput to pairs of applications according to organizational policy, maximizing the organization’s utility.

Other research deals with rateless error-correcting codes for wireless networks: Spinal Codes (with source code) are efficient, high-performance error-correcting codes, especially suited for analog channels.

Selected Publications

More publications

Teaching

Spring 2014: 6.824 Distributed Systems
Spring 2013: 6.829 Computer Networks