
SIGMOD Keynote Talk 2

Symbiosis in Scale Out Networking and Data Management

Amin Vahdat (University of California San Diego and Google)

Abstract

This talk highlights the symbiotic relationship between data management and networking through a study of two seemingly independent trends in the traditionally separate communities: large-scale data processing and software defined networking. First, data processing at scale increasingly runs across hundreds or thousands of servers. We show that balancing network performance with computation and storage is a prerequisite to both efficient and scalable data processing. We illustrate the need for scale out networking in support of data management through a case study of TritonSort, currently the record holder for several sorting benchmarks, including GraySort and JouleSort. Our TritonSort experience shows that disk-bound workloads require 10 Gb/s provisioned bandwidth to keep up with modern processors while emerging flash workloads require 40 Gb/s fabrics at scale.

We next argue for the need to apply data management techniques to enable Software Defined Networking (SDN) and Scale Out Networking. SDN promises the abstraction of a single logical network fabric rather than a collection of thousands of individual boxes. In turn, scale out networking allows network capacity (ports, bandwidth) to be expanded incrementally, rather than by wholesale fabric replacement. However, SDN requires an extensible model of both static and dynamic network properties and the ability to deliver dynamic updates to a range of network applications in a fault-tolerant, low-latency manner. Doing so presents interesting challenges in networking environments where updates are typically performed by timer-based broadcasts and models are specified as comma-separated text files processed by one-off scripts. For example, consider an environment where applications from routing to traffic engineering to monitoring to intrusion/anomaly detection all essentially boil down to inserting, triggering, and retrieving updates to/from a shared, extensible data store.

Bio

Amin Vahdat is a Principal Engineer at Google working on data center and wide-area network architecture. He is also a Professor and holds the Science Applications International Corporation Chair in the Department of Computer Science and Engineering at the University of California San Diego. Vahdat's research focuses broadly on computer systems, including distributed systems, networks, and operating systems. He received a PhD in Computer Science from UC Berkeley under the supervision of Thomas Anderson after spending the last year and a half as a Research Associate at the University of Washington. Vahdat is an ACM Fellow and a past recipient of the NSF CAREER award, the Alfred P. Sloan Fellowship, and the Duke University David and Janet Vaughn Teaching Award.
