Computational pan-genomics: status, promises and challenges.

Paper Info

Abstract
Many disciplines, from human genetics and oncology to plant breeding, microbiology and virology, commonly face the challenge of analyzing rapidly increasing numbers of genomes. In the case of Homo sapiens, the number of sequenced genomes will approach hundreds of thousands in the next few years. Simply scaling up established bioinformatics pipelines will not be sufficient for leveraging the full potential of such rich genomic data sets. Instead, novel, qualitatively different computational methods and paradigms are needed. We will witness the rapid extension of computational pan-genomics, a new sub-area of research in computational biology. In this article, we generalize existing definitions and understand a pan-genome as any collection of genomic sequences to be analyzed jointly or to be used as a reference. We examine already available approaches to construct and use pan-genomes, discuss the potential benefits of future technologies and methodologies, and review open challenges from the vantage point of the above-mentioned biological disciplines. As a prominent example of a computational paradigm shift, we particularly highlight the transition from the representation of reference genomes as strings to representations as graphs. We outline how this and other challenges from different application domains translate into common computational problems, point out relevant bioinformatics techniques and identify open problems in computer science. With this review, we aim to increase awareness that a joint approach to computational pan-genomics can help address many of the problems currently faced in various domains.

Year	2018
DOI	10.1093/bib/bbw089
Venue	Briefings in Bioinformatics
Field	Graph, Computational problem, Computational and Statistical Genetics, Paradigm shift, Computer science, Genomics, Bioinformatics, Computational genomics, Genetics
DocType	Journal
Volume	19
Issue	1
Citations	8
PageRank	0.58
References	41
Authors	60

Authors (60 rows)

Name	Order	Citations	PageRank
The Computational Pan-Genomics Consortium	1	8	0.58
Tobias Marschall	2	8	0.58
Manja Marz	3	55	8.41
Thomas Abeel	4	8	0.58
Louis Dijkstra	5	8	0.58
Bas E. Dutilh	6	40	6.17
Ali Ghaffaari	7	8	0.92
Paul Kersey	8	130	8.82
Wigard P. Kloosterman	9	8	0.58
Veli Mäkinen	10	1583	85.29
Adam M. Novak	11	8	0.58
Benedict Paten	12	17	1.19
David Porubsky	13	8	0.58
Eric Rivals	14	8	0.58
Can Alkan	15	312	26.92
Jasmijn Baaijens	16	8	0.58
Paul I. W. De Bakker	17	8	0.58
Valentina Boeva	18	8	0.58
Raoul J. P. Bonnal	19	8	0.58
Francesca Chiaromonte	20	8	0.58
Rayan Chikhi	21	8	0.58
Francesca D. Ciccarelli	22	17	1.07
Robin Cijvat	23	8	1.26
Erwin Datema	24	8	0.58
Cornelia M. Van Duijn	25	8	0.58
Evan E. Eichler	26	8	0.58
Corinna Ernst	27	8	0.58
Eleazar Eskin	28	8	0.92
Erik Garrison	29	8	0.58
Mohammed El-Kebir	30	8	0.58
Gunnar W. Klau	31	8	0.58
Jan O. Korbel	32	8	0.58
Eric-Wubbo Lameijer	33	8	0.58
Benjamin Langmead	34	8	0.58
Marcel Martin	35	8	0.58
Paul Medvedev	36	8	0.58
John C. Mu	37	8	0.58
Pieter Neerincx	38	8	0.58
Klaasjan Ouwens	39	8	0.58
Pierre Peterlongo	40	19	1.32
Nadia Pisanti	41	8	0.58
Sven Rahmann	42	11	1.81
Ben Raphael	43	8	0.58
Knut Reinert	44	8	0.58
Dick de Ridder	45	13	2.02
Jeroen de Ridder	46	102	10.47
Matthias Schlesner	47	49	5.10
Ole Schulz-Trieglaff	48	13	1.45
Ashley D. Sanders	49	8	0.58
Siavash Sheikhizadeh	50	14	1.38
