Title
Summarizing developer work history using time series segmentation: challenge report
Abstract
Temporal segmentation partitions time series data with the intent of producing more homogeneous segments. It is a preprocessing technique: subsequent time series analysis on the individual segments can detect trends that may not be evident when the entire dataset is analyzed at once. The technique allows data miners to partition a large dataset without assuming periodicity or relying on any other a priori knowledge of the dataset's features. We investigate the insights that can be gained from applying time series segmentation to software version repositories. Version repositories of large projects contain on the order of hundreds of thousands of timestamped entries or more, and it is a continuing challenge to aggregate such data so that noise is reduced and important characteristics are brought out. In this paper, we present a way to summarize developer work history in terms of the files developers have modified over time by segmenting the CVS change data of individual Eclipse developers. We show that the files they modify tend to change significantly over time, though most developers tend to work within the same directories.
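To make the idea of partitioning a series into more homogeneous segments concrete, the sketch below shows one generic way it can be done. It is not the algorithm used in the paper, which the abstract does not specify. It assumes a numeric series (here, a hypothetical count of files changed per week), a fixed number of segments k, and squared deviation from the segment mean as the homogeneity measure, and it finds the optimal contiguous split by dynamic programming. The function name `segment` and the sample data are illustrative assumptions.

```python
def segment(series, k):
    """Split `series` into k contiguous segments that minimize the total
    within-segment squared error. Returns a list of (start, end) index
    pairs, with `end` exclusive. Illustrative sketch only."""
    n = len(series)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and len(series)")

    # Prefix sums of values and squared values give O(1) segment costs.
    s1, s2 = [0.0], [0.0]
    for x in series:
        s1.append(s1[-1] + x)
        s2.append(s2[-1] + x * x)

    def cost(i, j):
        # Squared error of series[i:j] around its own mean.
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    inf = float("inf")
    # best[m][j]: minimal error when series[:j] is split into m segments.
    best = [[inf] * (n + 1) for _ in range(k + 1)]
    back = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0
    for m in range(1, k + 1):
        for j in range(m, n + 1):
            for i in range(m - 1, j):
                c = best[m - 1][i] + cost(i, j)
                if c < best[m][j]:
                    best[m][j] = c
                    back[m][j] = i

    # Walk the back-pointers from the end to recover the boundaries.
    bounds, j = [], n
    for m in range(k, 0, -1):
        i = back[m][j]
        bounds.append((i, j))
        j = i
    return bounds[::-1]


# Hypothetical weekly change counts for one developer.
weekly_changes = [2, 3, 2, 3, 20, 22, 21, 5, 4, 5]
print(segment(weekly_changes, 3))
```

With this sample data, the three segments line up with the quiet, busy, and moderate stretches of the series. Applying the same idea to sets of modified files, as the paper does, would require a different homogeneity measure than squared error.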
Year
2008
DOI
10.1145/1370750.1370784
Venue
MSR
Keywords
time series data, a priori knowledge, time series, time series analysis
DocType
Conference
Citations
4
PageRank
0.57
References
3
Authors
3
Name
Order
Citations
PageRank
Harvey Siy158144.51
Parvathi Chundi214248.59
Mahadevan Subramaniam320425.78