Title
Azure Data Lake Store: A Hyperscale Distributed File Service for Big Data Analytics.
Abstract
Azure Data Lake Store (ADLS) is a fully managed, elastic, scalable, and secure file system that supports Hadoop Distributed File System (HDFS) and Cosmos semantics. It is specifically designed and optimized for a broad spectrum of Big Data analytics that depend on a very high degree of parallel reads and writes, as well as collocation of compute and data for high-bandwidth, low-latency access. It brings together key components and features of Microsoft's Cosmos file system (long used by internal customers at Microsoft) and HDFS, and is a unified file storage solution for analytics on Azure. Internal and external workloads run on this unified platform. Distinguishing aspects of ADLS include its design for handling multiple storage tiers, exabyte scale, and comprehensive security and data sharing features. We present an overview of ADLS architecture, design points, and performance.
Year: 2017
DOI: 10.1145/3035918.3056100
Venue: SIGMOD Conference
Keywords: Storage, HDFS, Hadoop, map-reduce, distributed file system, tiered storage, cloud service, Azure, AWS, GCE, Big Data
Field: Distributed File System, Data mining, File system, Computer science, Data sharing, Exabyte, Hyperscale, Analytics, Big data, Database, Operating system, Cloud computing
DocType: Conference
Citations: 11
PageRank: 0.70
References: 13
Authors: 20