Title
Parallel programming framework for large batch transaction processing on scale-out systems
Abstract
A scale-out system is a cluster of commodity machines and offers a good platform for steadily increasing workloads that process growing data sets. Sharding [4] is a method of partitioning data and distributing a computation across a scale-out system. In a database system, a large table can be partitioned into smaller tables so that each node processes its own part of the computation. In large batch transaction processing, which is important in the financial sector, the sharding approach presents two hard problems to programmers: they have to write complex code (1) to transfer the input data so that the computations align with the data partitions, and (2) to manage the distributed transactions. This paper presents a new parallel programming framework that makes parallel transactional programming easier: programmers specify transaction scopes and partitioners, which simplifies the code. A transaction scope includes a series of subtransactions, each of which performs local operations. The system manages the distributed transactions automatically. A partitioner represents how the computation should be decomposed and aligned with the data partitions to avoid remote database accesses. Between pairs of subtransactions, the system handles the data shuffling across the network. We implemented our parallel programming framework as a new Java class library that hides all of the complex details of data transfer and distributed transaction management. Our programming framework can eliminate almost 66% of the lines of code compared to a current programming approach without programming framework support. We also confirmed good scalability, with a scaling factor of 20.6 on 24 nodes using our modified batch program for the TPC-C benchmark.
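The partitioner idea in the abstract can be illustrated with a minimal Java sketch. It is not the paper's library: the class `ShardPartitionSketch`, the `Order` type, the `warehouseId` sharding key, and the modulo placement rule are all hypothetical, chosen only to show how input records might be grouped by owning node so that each subtransaction touches local data. Distributed transaction management and network transfer are not modeled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: none of these names come from the paper's library.
final class ShardPartitionSketch {

    // One input record of a batch job; warehouseId is the assumed sharding key.
    static final class Order {
        final int warehouseId;
        final int amount;

        Order(int warehouseId, int amount) {
            this.warehouseId = warehouseId;
            this.amount = amount;
        }
    }

    // Assumed placement rule: the node that owns a key is key mod nodeCount.
    static int nodeFor(int warehouseId, int nodeCount) {
        return Math.floorMod(warehouseId, nodeCount);
    }

    // Shuffle step (in memory only): group input records by owning node so
    // that each node's subtransaction reads and writes only its own partition.
    static Map<Integer, List<Order>> partition(List<Order> input, int nodeCount) {
        Map<Integer, List<Order>> byNode = new TreeMap<>();
        for (Order o : input) {
            int node = nodeFor(o.warehouseId, nodeCount);
            byNode.computeIfAbsent(node, k -> new ArrayList<>()).add(o);
        }
        return byNode;
    }

    // Stand-in for a local subtransaction: sums the amounts of one node's records.
    static int processLocally(List<Order> localOrders) {
        int total = 0;
        for (Order o : localOrders) {
            total += o.amount;
        }
        return total;
    }

    public static void main(String[] args) {
        List<Order> batch = List.of(
            new Order(1, 10), new Order(2, 20), new Order(4, 30), new Order(5, 40));
        Map<Integer, List<Order>> byNode = partition(batch, 3);
        byNode.forEach((node, orders) ->
            System.out.println("node " + node + " total " + processLocally(orders)));
    }
}
```

In this sketch the caller still invokes `partition` explicitly. The framework described in the abstract is said to hide that step, together with the distributed commit, behind the transaction scope.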
Year
2010
DOI
10.1145/1815695.1815714
Venue
SYSTOR
Keywords
data shuffling, input data, partitioning data, large batch transaction processing, current programming approach, data partition, data transfer, transaction scope, new parallel programming framework, scale-out system, system management, distributed transactions, transaction processing, scale out, programming framework, database system, lines of code
Field
Transaction processing, Computer science, Parallel computing, Online transaction processing, Database transaction, Distributed transaction, Transaction processing system, Software framework, Source lines of code, Scalability
DocType
Conference
Citations
1
PageRank
0.35
References
17
Authors
8
Name              Order  Citations  PageRank
Kazuaki Ishizaki  1      191        17.66
Ken Mizuno        2      7          1.93
Toshio Suganuma   3      404        27.10
Daniel Silva      4      6          0.88
Akira Koseki      5      46         4.73
Hideaki Komatsu   6      410        34.00
Yohei Ueda        7      29         3.73
Toshio Nakatani   8      741        56.80