Title
Genomic distance entrained clustering and regression modelling highlights interacting genomic regions contributing to proliferation in breast cancer
Abstract
BACKGROUND: Genomic copy number changes and regional alterations in epigenetic states have been linked to grade in breast cancer. However, the relative contribution of specific alterations to the pathology of different breast cancer subtypes remains unclear. The heterogeneity and interplay of genomic and epigenetic variations mean that large datasets and statistical data mining methods are required to uncover recurrent patterns that are likely to be important in cancer progression.
RESULTS: We employed ridge regression to model the relationship between regional changes in gene expression and proliferation. Regional features were extracted from tumour gene expression data using a novel clustering method, called genomic distance entrained agglomerative (GDEC) clustering. Using gene expression data in this way provides a simple means of integrating the phenotypic effects of both copy number aberrations and alterations in chromatin state. We show that regional metagenes derived from GDEC clustering are representative of recurrent regions of epigenetic regulation or copy number aberrations in breast cancer. Furthermore, detected patterns of genomic alterations are conserved across independent oestrogen receptor positive breast cancer datasets. Sequential competitive metagene selection was used to reveal the relative importance of genomic regions in predicting proliferation rate. The predictive model suggested additive interactions between the most informative regions such as 8p22-12 and 8q13-22.
CONCLUSIONS: Data-mining of large-scale microarray gene expression datasets can reveal regional clusters of co-ordinate gene expression, independent of cause. By correlating these clusters with tumour proliferation we have identified a number of genomic regions that act together to promote proliferation in ER+ breast cancer. Identification of such regions should enable prioritisation of genomic regions for combinatorial functional studies to pinpoint the key genes and interactions contributing to tumourigenicity.
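The abstract describes modelling proliferation as a ridge regression on regional metagenes. As a rough illustration of that idea only, the sketch below fits a closed-form ridge model to synthetic data. It is not the authors' implementation: the function names `ridge_fit` and `make_toy_data`, the penalty `alpha=1.0`, the five toy "metagenes" and the simulated proliferation score are all assumptions made for this example, and GDEC clustering itself is not reproduced.

```python
import numpy as np


def ridge_fit(X, y, alpha=1.0):
    """Closed-form ridge regression; returns (coef, intercept).

    Features and response are centred so the intercept is not penalised.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean = X.mean(axis=0)
    y_mean = y.mean()
    Xc = X - x_mean
    yc = y - y_mean
    n_features = X.shape[1]
    # Solve (Xc^T Xc + alpha * I) coef = Xc^T yc
    coef = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(n_features), Xc.T @ yc)
    intercept = y_mean - x_mean @ coef
    return coef, intercept


def make_toy_data(n_tumours=200, seed=0):
    """Hypothetical data: 5 regional metagenes, 2 of which drive proliferation."""
    rng = np.random.default_rng(seed)
    metagenes = rng.normal(size=(n_tumours, 5))
    true_coef = np.array([1.5, -1.0, 0.0, 0.0, 0.0])
    # Purely additive effects plus a small amount of noise
    proliferation = metagenes @ true_coef + rng.normal(scale=0.1, size=n_tumours)
    return metagenes, proliferation, true_coef


metagenes, proliferation, true_coef = make_toy_data()
coef, intercept = ridge_fit(metagenes, proliferation, alpha=1.0)

# Rank metagenes by absolute coefficient as a crude importance measure
ranking = np.argsort(-np.abs(coef))
print("coefficients:", np.round(coef, 3))
print("metagenes ranked by |coefficient|:", ranking.tolist())
```

In this toy setting the two simulated driver metagenes should receive the largest coefficients. Ranking by coefficient magnitude is only a stand-in for the sequential competitive metagene selection mentioned in the abstract, whose details are not given here.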
Year: 2010
DOI: 10.1186/1752-0509-4-127
Venue: BMC Systems Biology
Keywords: Breast Cancer, Ridge Regression, Epigenetic Silence, Proliferation Signature, Copy Number Aberration
Field: Regression, Breast cancer, Biology, Systems biology, Genomics, Bioinformatics, Gene regulatory network, Gene expression profiling, Cancer, Epigenetics
DocType: Journal
Volume: 4
Issue: 1
ISSN: 1752-0509
Citations: 12
PageRank: 0.54
References: 3
Authors: 7
Name                 Order  Citations  PageRank
Tim J Dexter         1      12         0.54
David Sims           2      25         3.07
Costas Mitsopoulos   3      25         2.69
Alan Mackay          4      12         0.54
Anita Grigoriadis    5      16         0.93
Amar S Ahmad         6      12         0.54
Marketa Zvelebil     7      45         4.26