Title
BLOCKSET (Block-Aligned Serialized Trees): Reducing Inference Latency for Tree Ensemble Deployment
Abstract
We present methods to serialize and deserialize gradient-boosted trees and random forests that optimize inference latency when models are not loaded into memory. This situation arises when models are larger than memory, and also systematically when models are deployed on low-resource devices in the Internet of Things or run as cloud microservices where resources are allocated on demand. Block-Aligned Serialized Trees (BLOCKSET) introduce the concept of selective access for random forests and gradient-boosted trees, in which only the parts of the model needed for inference are deserialized and loaded into memory. Using principles from external memory algorithms, we block-align the serialization format in order to minimize the number of I/Os. For gradient-boosted trees, this results in a more than fivefold reduction in inference latency over layouts that do not perform selective access, and a twofold latency reduction over techniques that are selective but do not encode I/O block boundaries in the layout.
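The effect of block alignment on I/O count can be illustrated with a toy model. The sketch below is not the BLOCKSET format itself. It assumes a single complete binary tree stored in heap order, fixed-size node records, and a hypothetical block that holds 7 nodes. It compares how many distinct blocks one root-to-leaf traversal touches under a plain breadth-first layout and under a layout that packs each 3-level subtree into its own block. All function names are illustrative.

```python
import math


def root_to_leaf_path(directions):
    """Heap indices (root = 0) visited when following left (0) / right (1) choices."""
    node, path = 0, [0]
    for go_right in directions:
        node = 2 * node + 1 + go_right
        path.append(node)
    return path


def blocks_touched_bfs(path, nodes_per_block):
    """Breadth-first layout: node i is stored in block i // nodes_per_block."""
    return len({node // nodes_per_block for node in path})


def blocks_touched_aligned(path, nodes_per_block):
    """Block-aligned layout (assumed): each block stores one complete subtree."""
    height = int(math.log2(nodes_per_block + 1))  # tree levels that fit in a block
    blocks = set()
    for depth, node in enumerate(path):
        # Walk up to the root of the subtree that owns this node;
        # that root's index identifies the block.
        for _ in range(depth % height):
            node = (node - 1) // 2
        blocks.add(node)
    return len(blocks)


# One traversal through a 12-level tree, with 7 nodes per block.
path = root_to_leaf_path([0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 1])
print(blocks_touched_bfs(path, 7), blocks_touched_aligned(path, 7))
```

In this toy setting the aligned layout reads one block per three levels of the tree, whereas the breadth-first layout reads a new block at almost every level once the levels grow wider than a block.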
Year: 2021
DOI: 10.1145/3447548.3467368
Venue: Knowledge Discovery and Data Mining
Keywords: random forest, gradient boosted tree, tree ensemble, block alignment, serialization, efficient inference, IoT, microservices, locality
DocType: Conference
Citations: 0
PageRank: 0.34
References: 0
Authors: 5
Name | Order | Citations | PageRank
Meghana Madhyastha | 1 | 0 | 0.68
Kunal Lillaney | 2 | 9 | 2.18
James Browne | 3 | 0 | 1.01
Joshua T. Vogelstein | 4 | 273 | 31.99
Randal Burns | 5 | 1955 | 115.15