Cuttle: Enabling cross-column compression in distributed column stores
Liu, Hao (1); Xiao, Jiang (2); Guo, Xianjun (3); Tan, Haoyu (1); Luo, Qiong (1); Ni, Lionel M. (4)
Conference Name: 1st Asia-Pacific Web and Web-Age Information Management Joint Conference on Web and Big Data, APWeb-WAIM 2017
Source Publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 10367 LNCS
Conference Date: July 7, 2017 - July 9, 2017
Conference Place: Beijing, China
Author of Source: Springer Verlag
Abstract: We observe that, in real-world distributed data warehouse systems, data columns from different sources often exhibit redundancy. Even though these systems can employ both general and column-oriented compression schemes to reduce the data storage pressure, such cross-column redundancy (CCR) is not recognized or exploited effectively. Therefore, we propose Cuttle, a column storage system that enables cross-column compression to reduce CCR. Specifically, we identify three kinds of CCR and develop a referential transformation encoding (RTE) scheme to compress multiple columns of data with CCR. Furthermore, we address the CCR selection problem and propose a greedy algorithm to generate cross-column compression schemes. Our experiments on real-world datasets show that Cuttle can further reduce data size by half after applying both the column-oriented and general compression schemes, and that the query processing performance with Cuttle is improved by 20% without any change to the application programs. © Springer International Publishing AG 2017.
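Illustrative sketch (not taken from the paper): the abstract does not specify how RTE works, so the Python below only shows the general idea of cross-column redundancy under stated assumptions. It assumes one column can mostly be derived from a reference column through a known transformation, and stores only the rows that break that pattern. The function names, the example columns, and the cents-to-dollars transformation are all hypothetical.

```python
from typing import Callable, Dict, List


def encode_referential(
    ref: List[int], target: List[int], transform: Callable[[int], int]
) -> Dict[int, int]:
    """Keep only the rows where target differs from transform(ref).

    Hypothetical helper: a toy stand-in for cross-column encoding,
    not the RTE scheme described in the paper.
    """
    if len(ref) != len(target):
        raise ValueError("columns must have the same number of rows")
    return {
        i: t for i, (r, t) in enumerate(zip(ref, target)) if transform(r) != t
    }


def decode_referential(
    ref: List[int], exceptions: Dict[int, int], transform: Callable[[int], int]
) -> List[int]:
    """Rebuild the target column from the reference column and the exceptions."""
    return [exceptions.get(i, transform(r)) for i, r in enumerate(ref)]


def to_dollars(cents: int) -> int:
    """Assumed relationship between the two example columns."""
    return cents // 100


# Made-up example data: the last row does not follow the assumed relationship.
price_cents = [1000, 2500, 400, 9900]
price_dollars = [10, 25, 4, 100]

exceptions = encode_referential(price_cents, price_dollars, to_dollars)
restored = decode_referential(price_cents, exceptions, to_dollars)
```

In this toy setting the second column shrinks to a single exception entry, and decoding is lossless. How Cuttle detects such relationships and chooses among candidate column pairs is described in the paper itself, not here.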
Cited Times [WOS]: 2
Document Type: Conference paper
Collection: University of Macau
Affiliation: 1. Department of Computer Science and Engineering, HKUST, Kowloon, Hong Kong;
2. Huazhong University of Science and Technology, Wuhan, China;
3. Deepera Inc., Ocean Coast City, Shenzhen, China;
4. University of Macau, Zhuhai, China
Recommended Citation
GB/T 7714
Liu, Hao,Xiao, Jiang,Guo, Xianjun,et al. Cuttle: Enabling cross-column compression in distributed column stores[C]//Springer Verlag,2017:219-226.