|
|
When:
Monday, May 05, 3:00 p.m.
Where: 4615AWean Hall
Debabrata Dash
Speaking Skills Talk
Abstract: Automated database design is a major challenge in building self-tuning database management systems. A major difficulty in the development of practical physical design algorithms is dealing with the huge number of design features, such as indexes that must be considered. A second difficulty is determining, given a (pruned) search space of candidate features, a combination that provides optimal performance for the workload that also satisfies resource constraints such as available storage. Existing index selection tools rely on heuristics to efficiently search within the large space of alternative solutions and to minimize the overhead of using the query optimizer for cost estimation. Index selection heuristics, despite being practical, are hard to analyze and formally compute how close they get to the optimal solution. In this talk we propose a tool for index selection based on Integer Linear Programming, in the context of commercial database systems. Our tool offers higher solution quality, efficiency and scalability without sacrificing any of the precision offered by existing index selection tools. In order to make the tool practical, we developed a new query cost model (INUM) to estimate query costs 4 orders of magnitude faster than existing models. Using our selection algorithm with INUM on a large TPC-H like workload, provides 55% better solution and 40 times speedup, when compared with commercial solutions. More importantly, our model opens the way for the application of a huge body of work in combinatorial optimization and operations research, that is successfully deployed for real-world, large-scale optimization problems to be applied to databases as well. This is a joint work with Stratos Papadomanolakis, and Anastasia Ailamaki.
Presented in Partial Fulfillment of the CSD Speaking Skills Requirement.
<< Back
|