Provably Good Randomized Strategies for Data Placement in Distributed Key-Value Stores
Distributed storage systems are used widely in clouds, data-bases, and file systems. These systems store a large amount of data across multiple servers. When a request to access data comes in, it is routed to the appropriate server, queued, and eventually processed. If the server’s queue is full, then requests may be rejected. Thus, one important challenge when designing the algorithm for allocating data to servers is the fact that the request pattern may be unbalanced, unpredictable, and may change over time. If some servers get a large fraction of the requests, they are overloaded, leading to many rejects. In this paper, we analyze this problem theoretically under adversarial assumptions. In particular, we assume that the request sequence is generated by an adversarial process to maximize the number of rejects and analyze the performance of various algorithmic strategies in terms of the fraction of the requests rejected. We show that no deterministic strategy can perform well. On the other hand, a simple randomized strategy guarantees that at most a constant fraction of requests are rejected in expectation. We also show that moving data to load balance is essential if we want to reject a very small fraction ($1/m$ where $m$ is the number of servers) of requests. We design a strategy with randomization and data transfer to achieve this performance with speed augmentation. Finally, we conduct experiments and show that our algorithms perform well in practice.
Mon 27 FebDisplayed time zone: Eastern Time (US & Canada) change
10:00 - 12:00 | |||
10:00 20mTalk | Boosting Performance and QoS for Concurrent GPU B+trees by Combining-based Synchronization Main Conference Weihua Zhang Fudan University, Chuanlei Zhao Fudan University, Lu Peng Tulane University, Yuzhe Lin Fudan University, Fengzhe Zhang Fudan University, Yunping Lu Fudan University | ||
10:20 20mTalk | The State-of-the-Art LCRQ Concurrent Queue Algorithm Does NOT Require CAS2 Main Conference | ||
10:40 20mTalk | Provably Good Randomized Strategies for Data Placement in Distributed Key-Value Stores Main Conference Zhe Wang Washington University in St. Louis, Jinhao Zhao Washington University in St. Louis, Kunal Agrawal Washington University in St. Louis, USA, Jing Li New Jersey Institute of Technology, He Liu Apple Inc., Meng Xu Apple Inc. | ||
11:00 20mTalk | 2PLSF: Two-Phase Locking with Starvation-Freedom Main Conference Pedro Ramalhete Cisco Systems, Andreia Correia University of Neuchâtel, Pascal Felber University of Neuchâtel | ||
11:20 20mTalk | Provably Fast and Space-Efficient Parallel Biconnectivity Main Conference Xiaojun Dong University of California, Riverside, Letong Wang University of California, Riverside, Yan Gu UC Riverside, Yihan Sun University of California, Riverside | ||
11:40 20mTalk | Practically and Theoretically Efficient Garbage Collection for Multiversioning Main Conference Yuanhao Wei Carnegie Mellon University, USA, Guy E. Blelloch Carnegie Mellon University, USA, Panagiota Fatourou FORTH ICS and University of Crete, Greece, Eric Ruppert York University |