Optimization of Client-Site User-Defined Functions

Tobias Mayr       Praveen Seshadri*
Cornell University       Cornell University
mayr@cs.cornell.edu       praveen@cs.cornell.edu

Abstract

We explore the optimization of queries that involve client-site user-defined functions (UDFs). Many UDFs can be executed only at the client site, for reasons of security, confidentiality, or availability of resources. How should a query be optimized to take client-site UDFs into account? We demonstrate that in this context the known execution techniques for expensive server-site UDFs perform badly: the network latencies involved cannot be ignored. We blend well-known distributed database algorithms with established techniques for handling expensive server-site UDFs. The resulting query execution techniques are implemented in the Cornell Predator database system, and we present performance results that demonstrate their effectiveness. We also reconsider the question of expensive UDF placement in the context of client-site UDFs. The known techniques, namely rank ordering, turn out to be inadequate. We demonstrate query plan optimizations for client-site UDFs and show their effectiveness in performance tests. Finally, we propose a System-R style optimizer for query plans involving client-site operations.