Big data challenges in bone research: genome-wide association studies and next-generation sequencing

Nerea Alonso, Gavin Lucas and Pirro Hysi

Genome-wide association studies (GWAS) have been developed as a practical method to identify genetic loci associated with disease by scanning multiple markers across the genome. Significant advances in the genetics of complex diseases have been made owing to advances in genotyping technologies, the progress of projects such as HapMapand 1000Gand the emergence of genetics as a collaborative discipline. Because of its great potential to be used in parallel by multiple collaborators, it is important to adhere to strict protocols assuring data quality and analyses. Quality control analyses must be applied to each sample and each single-nucleotide polymorphism (SNP). The software package PLINK is capable of performing the whole range of necessary quality control tests. Genotype imputation has also been developed to substantially increase the power ofGWAS methodology. Imputation permits the investigation of associations at genetic markers that are not directly genotyped. Results of individual GWAS reports can be combined through meta-analysis. Finally, next-generation sequencing (NGS) has gained popularity in recent years through its capacity to analyse a much greater number of markers across the genome. Although NGS platforms are capable of examining a higher number of SNPs compared with GWA studies, the results obtained by NGS require careful interpretation, as their biological correlation is incompletely understood. In this article, we will discuss the basic features of such protocols.

Nerea Alonso
Gavin Lucas