- 1000 GENOMES: it provides population-level information for variants found in 1000 Genomes, including FST index, allele frequency spectrum and associated genomic elements;
- ANNOTATION: it annotates variants with known risk variants, genomic annotations, disease-associated genes and pathways recorded in curated databases;
- VSEA: it provides enrichment analysis for a set of variants;
- PAFA: users can search and download PAFA scores;
- SEARCH: it provides an integrated web browser for genes and their annotations.
By typing or pasting variants, or by dragging in a file containing variants, users can annotate variants with known risk variants, genomic annotations, disease-associated genes and pathways recorded in curated databases (e.g. ClinVar, COSMIC and ENCODE), and evaluate variants based on how often their associated genes occur in disease-related databases. PAFA accepts input without a header in the format “TYPE CHROM START END CHROM_BP BREAKPOINT” (e.g. Translocation chr5 43320488 43320498 chr11 108153462) or “TYPE CHROM START END” (e.g. SNP chr10 51602168 51602168). PAFA provides a downloadable Excel file containing all analysis results and presents them in a visual, interactive way:
- PAFA arranges and colors variants according to their TYPE (SNP, Insertion or Deletion). If a variant’s TYPE is Deletion, PAFA also shows its length.
- PAFA labels variants and marks them with red dots if they exist in curated databases, such as 1000 Genomes, ESP and dbSNP.
- If a variant overlaps disease-associated variants in curated databases, such as COSMIC and ClinVar, PAFA shows the number of overlapping risk variants.
- PAFA lists all genes associated with these variants. If a variant overlaps a gene’s protein-coding region, the gene is shown in blue; if a variant overlaps a gene’s noncoding region (e.g. TSS, UTR and regulatory regions), the gene is shown in pink.
- PAFA shows the number of times that a gene occurs in gene-disease databases, such as OMIM and GAD. A deeper-colored grid cell means the gene occurs more often. PAFA also provides a score for the gene by accumulating its occurrence counts across all gene-disease databases.
- PAFA provides a quantitative value for a variant according to the occurrence frequency of its associated gene in current gene-disease databases.
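The gene score and variant value described above can be illustrated with a minimal sketch. This is not PAFA's implementation: the function names, the input layout (one dictionary of gene counts per database) and the choice to sum gene scores into a variant value are all assumptions made for illustration, and the gene names are placeholders.

```python
from collections import Counter


def gene_scores(occurrences):
    """Accumulate per-gene occurrence counts across gene-disease databases.

    `occurrences` maps a database name (e.g. "OMIM") to a dict of
    {gene name: number of occurrences}. This layout is hypothetical.
    """
    scores = Counter()
    for db_counts in occurrences.values():
        scores.update(db_counts)  # adds counts gene by gene
    return dict(scores)


def variant_score(associated_genes, scores):
    """Assumed aggregation: sum the scores of the variant's associated genes."""
    return sum(scores.get(gene, 0) for gene in associated_genes)


# Placeholder counts, not real database content.
occurrences = {
    "OMIM": {"GENE_A": 2, "GENE_B": 1},
    "GAD": {"GENE_A": 3},
}
scores = gene_scores(occurrences)
print(scores)
print(variant_score(["GENE_A", "GENE_C"], scores))
```

Genes absent from every database contribute zero in this sketch, so a variant whose genes are never reported receives a value of 0.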
By typing or pasting variants, or by dragging in a file containing variants, users can carry out enrichment analysis on a set of variants. PAFA accepts input without a header in the format “TYPE CHROM START END CHROM_BP BREAKPOINT” (e.g. Translocation chr5 43320488 43320498 chr11 108153462) or “TYPE CHROM START END” (e.g. SNP chr10 51602168 51602168).
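To check a file against the two input layouts before uploading, a small parser along the following lines may help. It is a sketch only: it assumes fields are separated by whitespace (as in the examples above), and the `Variant` type and `parse_variant_line` function are hypothetical names, not part of PAFA.

```python
from typing import NamedTuple, Optional


class Variant(NamedTuple):
    vtype: str                        # TYPE, e.g. "SNP" or "Translocation"
    chrom: str                        # CHROM
    start: int                        # START
    end: int                          # END
    chrom_bp: Optional[str] = None    # CHROM_BP (six-field layout only)
    breakpoint: Optional[int] = None  # BREAKPOINT (six-field layout only)


def parse_variant_line(line: str) -> Variant:
    """Parse one header-less input line in the 4- or 6-field layout."""
    fields = line.split()  # assumption: whitespace-separated fields
    if len(fields) == 4:
        vtype, chrom, start, end = fields
        return Variant(vtype, chrom, int(start), int(end))
    if len(fields) == 6:
        vtype, chrom, start, end, chrom_bp, bp = fields
        return Variant(vtype, chrom, int(start), int(end), chrom_bp, int(bp))
    raise ValueError(f"expected 4 or 6 fields, got {len(fields)}: {line!r}")


print(parse_variant_line("SNP chr10 51602168 51602168"))
print(parse_variant_line("Translocation chr5 43320488 43320498 chr11 108153462"))
```

Lines with any other number of fields, or with non-numeric coordinates, raise `ValueError`, which makes malformed rows easy to spot before submission.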
To provide enrichment analysis for the target variant set, PAFA includes background variant sets, such as variants from 1000 Genomes, genomic annotations from GENCODE and ENCODE, and canonical pathways in the Molecular Signatures Database (MSigDB). First, it maps the test variants and the background variants (user-uploaded or selected) to a range of annotated elements. Then, it obtains the genes related to the test and background variants. Next, PAFA extracts the pathways related to these genes from MSigDB. Finally, based on the relationships among variants, genes and pathways, PAFA calculates a p-value for the degree of enrichment in each relevant pathway using Fisher’s exact test.
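The final step can be sketched for a single pathway as follows. This is an illustration under stated assumptions, not PAFA's code: it assumes a one-sided (over-representation) test on a 2x2 table of gene counts in which the test and background sets are treated as separate groups, and the function name `fisher_enrichment_p` is hypothetical. The source does not state which tail or which counting unit PAFA uses.

```python
from math import comb


def fisher_enrichment_p(test_in, test_out, bg_in, bg_out):
    """One-sided Fisher's exact test for over-representation in a pathway.

    test_in / test_out: test-set genes inside / outside the pathway
    bg_in / bg_out:     background genes inside / outside the pathway
    Returns P(X >= test_in) under the hypergeometric distribution.
    """
    n_test = test_in + test_out
    in_pathway = test_in + bg_in
    total = n_test + bg_in + bg_out
    denom = comb(total, n_test)
    upper = min(n_test, in_pathway)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    tail = sum(
        comb(in_pathway, x) * comb(total - in_pathway, n_test - x)
        for x in range(test_in, upper + 1)
    )
    return tail / denom


# Illustrative counts only.
p = fisher_enrichment_p(test_in=3, test_out=1, bg_in=1, bg_out=3)
print(f"p = {p:.4f}")
```

The sketch uses exact integer arithmetic from the standard library; for large gene sets or two-sided tests, a statistics library would be the more practical choice, and p-values across many pathways would normally need multiple-testing correction.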
PAFA provides a downloadable Excel file containing all analysis results and presents them visually in five sections: 1) variants and overlapping genes, listed in a table; 2) genes and their corresponding pathways, listed in a table; 3) relationships among variants, genes and pathways, presented in a network graph; 4) enriched pathways, listed in a table; 5) relationships among pathways and genes, presented in a network graph.
By typing a gene name, users can obtain annotations from curated databases, including:
- information of the gene from GENCODE v19, including chr, start, end, strand, gene id, gene type, gene name and position;
- neutral variants of ClinVar and 1000 Genomes and risk variants of ClinVar and COSMIC in the current genomic location;
- genomic elements in the current genomic location, including TSSs, enhancers, genes, exons, DNaseseq Peaks, FAIRE Peaks, Histone Peaks, TFBS (PeakSeq) and TFBS (SPP);
- information of the gene in DEG Database, including related functions and organism;
- information of the gene in OMIM, including related disorders;
- information of the gene in GAD, including related disease, disease classification, chr.band, reference, PubMed ID, pathway search, PubMed, locus link, gene card, gene&disease(PubMed), etc.;
- information on the gene in COSMIC, including related disease, sample, primary site, etc.
If you have technical problems using PAFA, please check the information provided here. If it does not resolve your issues, please contact us at email@example.com.
PAFA is currently developed by Fangqing Zhao and Zhou Lin.
If you are planning on using PAFA in a commercial application, please contact Fangqing Zhao.