Human Disease



Many types of genetic variants have been associated with human diseases. Among them, copy number variations (CNVs) are of great importance (Sharp, Locke et al. 2005): genome-wide CNVs have been implicated in diseases including cancer (Yu, Liu et al. 2014), autism (Sebat, Lakshmi et al. 2007, Glessner, Wang et al. 2009), schizophrenia (St Clair 2009) and intellectual disability (Madrigal, Rodriguez-Revenga et al. 2007). Cancer studies show that segmental deletions or duplications of chromosomes occur frequently during tumorigenesis and progression (Albertson, Collins et al. 2003, Stratton, Campbell et al. 2009). These aberrations are often associated with abnormal expression of tumor suppressors and oncogenes (Bentires-Alj, Gil et al. 2006). Accurate detection of CNVs is therefore an important step toward identifying disease-causing genes and functionally disrupted pathways.

Advances in experimental technologies, from array-based methods such as array comparative genomic hybridization (array CGH) (Park 2008) and single nucleotide polymorphism (SNP) genotyping (Li, Liu et al. 2014) to recent high-throughput DNA sequencing (Schuster 2008), have greatly promoted studies of the human genome. As whole-exome sequencing (WES) becomes cheaper and more reliable, it has proven to be an effective alternative to whole-genome sequencing (WGS) for identifying genetic variants underlying human diseases.

Several state-of-the-art tools (Sathirapongsasuti, Lee et al. 2011, Fromer, Moran et al. 2012, Krumm, Sudmant et al. 2012, Magi, Tattini et al. 2013) have been developed to discover CNVs from WES data. These methods fall into two categories according to the approach used: 1) those that detect deviations in read counts among a pool of examined samples without the need for control samples, such as CoNIFER (Krumm, Sudmant et al. 2012) and XHMM (Fromer, Moran et al. 2012); and 2) those that detect deviations in the read count ratio by comparing the examined samples with controls, such as ExomeCNV (Sathirapongsasuti, Lee et al. 2011) and EXCAVATOR (Magi, Tattini et al. 2013). Most of these tools are stand-alone programs that require users to set up a local computational environment with the necessary hardware and software, which can be difficult, or even impossible if the technical requirements cannot be met.

On the other hand, few tools are available for systematic functional annotation of CNVs by integrating currently available resources (Wang, Li et al. 2007, Chang and Wang 2012, Zhao and Zhao 2013, Erikson, Deshpande et al. 2014). These tools take as input a file containing the genomic coordinates of CNVs, and perform annotation by finding genomic overlaps between the input and annotation features. However, their annotation results do not include sample information, which makes it inconvenient to assign an annotation to the specific sample carrying the CNV, especially when annotating CNVs found in cohort studies. To our knowledge, integrated pipelines for detection and annotation of CNVs from WES data have not been reported yet. Therefore, online bioinformatics tools that can precisely detect and systematically annotate CNVs from WES data are highly needed.
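To make the overlap-based annotation step concrete, the following is a minimal sketch of how CNV coordinates can be matched against annotation features while keeping the sample identifier on every output row. It is not the implementation of any tool cited above; the record layouts (`CNV`, `Feature`), the function names and the toy coordinates and gene names are all assumptions made for illustration, and intervals are treated as closed.

```python
from collections import namedtuple

# Hypothetical record layouts, chosen only for this illustration.
CNV = namedtuple("CNV", "sample chrom start end kind")
Feature = namedtuple("Feature", "chrom start end name")


def overlaps(a_start, a_end, b_start, b_end):
    """Return True if two closed intervals share at least one position."""
    return a_start <= b_end and b_start <= a_end


def annotate(cnvs, features):
    """Pair each CNV with the names of all features it overlaps.

    The sample ID is carried through to every output row, so each
    annotation can be traced back to the sample carrying the CNV.
    """
    rows = []
    for cnv in cnvs:
        hits = [
            f.name
            for f in features
            if f.chrom == cnv.chrom
            and overlaps(cnv.start, cnv.end, f.start, f.end)
        ]
        rows.append((cnv.sample, cnv.chrom, cnv.start, cnv.end, cnv.kind, hits))
    return rows


# Toy input: coordinates and gene names are invented.
cnvs = [
    CNV("S1", "chr1", 100, 500, "deletion"),
    CNV("S2", "chr1", 450, 900, "deletion"),
    CNV("S2", "chr2", 10, 50, "duplication"),
]
features = [
    Feature("chr1", 400, 480, "GENE_A"),
    Feature("chr1", 800, 1000, "GENE_B"),
]

for row in annotate(cnvs, features):
    print(row)
```

The linear scan over all features is adequate for a toy example; for genome-scale inputs an interval tree or a sorted sweep would be the usual choice.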

Here we introduce DeAnnCNV, an efficient web server that integrates Detection and Annotation of Copy Number Variations from WES data. DeAnnCNV identifies CNVs in each sample accurately, based on our previously published algorithm GPHMM (Li, Liu et al. 2011), and provides detailed visualization of the detected CNVs. It can also extract CNVs shared by multiple samples and annotate them extensively with several supporting features: 1) whether a CNV has been reported previously (documented in dbVar (Lappalainen, Lopez et al. 2013)); 2) detailed information on genes associated with the CNVs; 3) whether genetic variants of these genes have been reported in human diseases (collected from ClinVar (Landrum, Lee et al. 2014)); 4) phenotypes of mice deficient for these genes (collected from MGI (Blake, Bult et al. 2014)); 5) mRNA expression of these genes in human tissues and cell lines; 6) functional enrichment analysis for these genes (including enriched GO terms, pathways and protein domains); and 7) a protein-protein interaction network for the genes involved in the CNVs, in which genes associated with a human disorder are indicated.
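One simple way to think about extracting what is shared across samples is at the gene level: collect the genes hit by CNVs in each sample, then keep those hit in at least a given number of samples. The sketch below illustrates that idea only. How DeAnnCNV itself defines a shared CNV is not specified here, so the function `shared_genes`, the `min_samples` threshold and the toy sample and gene names are all assumptions.

```python
from collections import defaultdict


def shared_genes(sample_to_genes, min_samples=2):
    """Return genes affected by CNVs in at least `min_samples` samples.

    `sample_to_genes` maps a sample ID to the set of genes overlapped
    by that sample's CNVs. The result maps each shared gene to the
    sorted list of samples carrying a CNV over it.
    """
    carriers = defaultdict(set)
    for sample, genes in sample_to_genes.items():
        for gene in genes:
            carriers[gene].add(sample)
    return {
        gene: sorted(samples)
        for gene, samples in carriers.items()
        if len(samples) >= min_samples
    }


# Toy input: sample IDs and gene names are invented.
calls = {
    "P1": {"GENE_A", "GENE_B"},
    "P2": {"GENE_B"},
    "P3": {"GENE_C"},
}

print(shared_genes(calls))  # {'GENE_B': ['P1', 'P2']}
```

Keeping the list of carrier samples alongside each gene preserves the link between an annotation and the individuals it applies to, which matters most in cohort studies.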

To verify the practicability of our tool, we applied DeAnnCNV to a study of infertile men and found two patients who each carry a CNV in which only one of the two copies is retained. The CNVs in both patients cover the gene PABPN1L, hemizygous deletion of which causes male infertility in mice. This result indicates that DeAnnCNV is a powerful and reliable tool for the detection and annotation of CNVs from WES data.
