Open Access System for Information Sharing


Article
Cited 7 times in Web of Science; cited 11 times in Scopus

Prediction and predictability for search query acceleration (SCIE, SCOPUS)

Title
Prediction and predictability for search query acceleration
Authors
Hwang, SW; Kim, S; He, YX; Elnikety, S; Choi, S
Date Issued
2016-08
Publisher
ACM Transactions on the Web
Abstract
A commercial web search engine shards its index among many servers, so the response time of a search query is dominated by the slowest server that processes it. Prior approaches improve responsiveness by reducing the tail latency, or high-percentile response time, of an individual search server. They predict query execution time, and if a query is predicted to be long-running, it runs in parallel; otherwise, it runs sequentially. These approaches are, however, not accurate enough to reduce a high tail latency when responses are aggregated from many servers, because that requires each server to reduce a substantially higher tail latency (e.g., the 99.99th percentile), which we call extreme tail latency. To address the tighter requirements of extreme tail latency, we propose a new design space for the problem that subsumes existing work and opens a new solution space. Existing work makes a prediction using features available at indexing time and focuses on optimizing prediction features for accelerating tail queries. In contrast, we identify "when to predict?" as another key optimization question. This opens up a new solution: delaying the prediction by a short duration, which allows many short-running queries to complete without parallelization and, at the same time, lets the predictor collect a set of dynamic features from runtime information. This question expands the solution space in two meaningful ways. First, we see a significant reduction of tail latency by leveraging "dynamic" features collected at runtime, which estimate query execution time with higher accuracy. Second, we can ask whether to override the prediction when its "predictability" is low; we show that considering predictability accelerates queries by achieving a higher recall. With this prediction, we propose to accelerate the queries that are predicted to be long-running. Our preliminary work focused on parallelization as the acceleration scenario; here we extend it to heterogeneous multicore hardware, which combines processor cores with different microarchitectures, such as energy-efficient little cores and high-performance big cores. Accelerating web search using this hardware has remained an open problem. We evaluate the proposed prediction framework in two scenarios: (1) query parallelization on a multicore processor and (2) query scheduling on a heterogeneous processor. Our extensive evaluation results show that, for both scenarios of query acceleration using parallelization and heterogeneous cores, the proposed framework is effective in reducing the extreme tail latency compared to a state-of-the-art predictor because of its higher recall, and it improves server throughput by more than 70% because of its improved precision.
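
The delayed-prediction idea described in the abstract can be summarized in a short sketch. The following is a minimal illustration under stated assumptions, not the authors' implementation: the delay, the thresholds, and the run_sequential/accelerate/predict_remaining callables are hypothetical stand-ins for the engine's execution paths and the trained predictor.

```python
from concurrent.futures import ThreadPoolExecutor, TimeoutError

# Illustrative assumptions only; the paper's actual delay, thresholds,
# features, and predictor are not reproduced here.
DELAY_S = 0.005            # short wait before consulting the predictor
LONG_MS = 50.0             # threshold for calling a query "long-running"
MIN_PREDICTABILITY = 0.8   # below this, override the point prediction

pool = ThreadPoolExecutor(max_workers=4)

def process_query(run_sequential, accelerate, predict_remaining):
    """Delayed prediction: start the query sequentially and wait a short
    delay, so most short queries finish with no prediction or acceleration
    cost; predict only for queries that outlive the delay."""
    fut = pool.submit(run_sequential)
    try:
        return fut.result(timeout=DELAY_S)   # most queries complete here
    except TimeoutError:
        pass
    # The query outlived the delay: predict its remaining execution time
    # from dynamic (runtime) features gathered so far.
    pred_ms, predictability = predict_remaining()
    if pred_ms > LONG_MS or predictability < MIN_PREDICTABILITY:
        # Accelerate long-running or hard-to-predict queries.
        fut.cancel()        # no-op if already running; a real engine
                            # would hand off partial state, not restart
        return accelerate()
    return fut.result()     # predicted short enough: finish sequentially
```

In the paper's first evaluation scenario, accelerate would parallelize the query across the cores of a multicore processor; in the second, it would schedule the query onto a high-performance big core of a heterogeneous processor.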
URI
https://oasis.postech.ac.kr/handle/2014.oak/37454
DOI
10.1145/2943784
ISSN
1559-1131
Article Type
Article
Citation
ACM Transactions on the Web, vol. 10, no. 3, 2016-08
Files in This Item:
There are no files associated with this item.

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.

Related Researcher

Choi, Seungjin (최승진)
Department of Computer Science & Engineering