Note: VarCards and its resources are freely available for academic use. Some resources require a license for non-academic use; please contact the original content provider for that purpose.

A general workflow of VarCards


[Figure: pipeline]


Motivation of VarCards


Whole-exome sequencing provides unprecedented opportunities to identify causative variations underlying human genetic diseases. A number of genomic tools and databases have been developed to facilitate the interpretation of genetic variants, especially in coding regions. However, they are scattered across separate websites and databases, which makes it inconvenient and challenging for clinicians, geneticists, and biologists to obtain first-hand information about variants and genes of interest. There is therefore a strong demand for a convenient, integrated online database through which users can retrieve general genetic and clinical knowledge for given variants.



Genetic and clinical information in VarCards


To fill these unmet needs, starting with coding regions and splice sites, we artificially generated all possible single-nucleotide variants (n = 110,154,363) and cataloged all reported small insertions and deletions (n = 1,223,370) in coding regions or splicing sites. We then annotated them with functional consequences drawn from more than 60 genomic data sources to develop a database named VarCards. With this database, users can conveniently search, browse, and annotate the variant- and gene-level implications of any given coding variant, including the following information:

  1. Functional effects of variants, such as stop-gain, splicing, nonsynonymous, and frameshift.
  2. Functional consequences (deleterious or tolerated) of variants as predicted by 23 in silico algorithms, such as SIFT, PolyPhen-2, and MutationTaster.
  3. Allele frequencies in different populations from public databases, such as dbSNP, ExAC, and gnomAD.
  4. Disease- and phenotype-related databases for variant- and gene-level implications, such as OMIM, MGI, COSMIC, and ClinVar.
  5. General gene-level information, such as protein sequences and interactions, gene functions, pathways, domains, expression levels, gene-based mutation rates, and genic intolerance.
  6. Drug–gene interactions and gene druggability for precision medicine.
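
The exhaustive SNV generation described above can be illustrated with a minimal Python sketch. It is not the VarCards pipeline itself: the function name `enumerate_snvs`, the tuple layout, and the toy sequence are assumptions made for illustration. Each A/C/G/T reference base yields three substitutions, so a region of N unambiguous bases produces 3N candidate SNVs.

```python
from typing import Iterator, Tuple

BASES = "ACGT"

# Hypothetical helper (not part of VarCards): enumerate every possible
# single-nucleotide substitution across a stretch of reference sequence.
def enumerate_snvs(chrom: str, start: int, ref_seq: str) -> Iterator[Tuple[str, int, str, str]]:
    """Yield (chrom, pos, ref, alt) tuples.

    `start` is the 1-based coordinate of the first base in `ref_seq`.
    """
    for offset, ref in enumerate(ref_seq.upper()):
        if ref not in BASES:
            # Skip N and other ambiguity codes; no defined substitutions.
            continue
        for alt in BASES:
            if alt != ref:
                yield chrom, start + offset, ref, alt


# Toy example: three reference bases starting at an arbitrary position.
snvs = list(enumerate_snvs("chr1", 100, "ATG"))
for record in snvs:
    print(*record, sep="\t")
```

On the three-base toy sequence this yields nine candidates. A real run would iterate over coding and splice-site intervals taken from gene annotations rather than a hard-coded string.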

Publication detail


VarCards: an integrated genetic and clinical database for coding variants in the human genome. 2018. Nucleic Acids Research.



Recent Updates

[27/9/2018] The maximum upload file size was increased to 300 MB, and gz-compressed files are now supported.

[16/12/2017] The numbers of SNVs and indels increased to 112,365,765 and 1,268,233, respectively, following the new filter criteria.

[1/10/2017] The variants are shown in HGVS format and can be searched with this format.

[23/9/2017] The number of SNVs and indels increased to 110,154,363 and 1,223,370 respectively. Coding regions and splicing sites are based on annotations of RefSeq Gene, CCDS gene, UCSC known gene and Ensembl gene.

[3/9/2017] Fixed a bug that caused PHP errors when using the sample queries on the 'Search' page.

[25/6/2017] The frontend and backend of VarCards were finished.

[5/6/2017] Annotation of all SNVs and indels was finished.

[5/11/2016] Population allele frequencies were downloaded from the ExAC browser.

[1/10/2016] In silico missense predictions from 23 algorithms were downloaded from dbNSFP.

[5/8/2016] SNVs (n = 104,331,216) were artificially generated based on the reference sequence and the annotation in hg19_refGene.txt.

[8/7/2016] Gene-based annotation datasets (hg19_refGene.txt, hg19_refGeneMrna.fa) were retrieved through ANNOVAR, and the human reference sequence was downloaded from the UCSC Genome Browser FTP site.