By default, the orthologues aligned and displayed in Alamut® Visual are taken from the Ensembl Compara database. Until March 2010, the only non-Ensembl-based alignments available were the manually curated alignments of ATM (NM_000051 and U82828), provided by Tavtigian et al. (2009), and of BRCA1 (NM_007294), provided by IARC with Align GVGD.
Although Ensembl Compara is a very valuable information source, manual selection of orthologues and curation of alignments are necessary for optimal missense interpretation and for scoring systems such as SIFT, PolyPhen, and Align GVGD.
This is why we have designed a semi-automatic procedure for orthologue alignment construction, briefly described here (Deforche and Blavier 2010).
Orthologous sequences are searched for with the BlastP program, first against the UniProt/Swiss-Prot database. If sequences from distant species are not found, BlastP is then run against the RefSeq database and finally, if needed, against the NCBI non-redundant protein sequence database.
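The cascade described above can be sketched as follows. This is a minimal illustration, not the actual procedure used to build the alignments: the `search` callable, the database labels, and the `fake_search` stub with its hit identifiers are all hypothetical, and a real run would call BlastP instead of the stub.

```python
from typing import Callable, Dict, Iterable, List, Set

# Search order from the text: Swiss-Prot, then RefSeq, then the NCBI
# non-redundant database. The labels themselves are illustrative.
DATABASES: List[str] = ["swissprot", "refseq", "nr"]


def cascade_search(
    query: str,
    wanted_species: Iterable[str],
    search: Callable[[str, str], Dict[str, str]],
    databases: List[str] = DATABASES,
) -> Dict[str, str]:
    """Return {species: hit_id}, trying later databases only for species still missing."""
    found: Dict[str, str] = {}
    missing: Set[str] = set(wanted_species)
    for database in databases:
        if not missing:
            break  # every wanted species already has a hit
        for species, hit_id in search(query, database).items():
            if species in missing:
                found[species] = hit_id
                missing.discard(species)
    return found


def fake_search(query: str, database: str) -> Dict[str, str]:
    """Hypothetical stand-in for a BlastP run; returns canned hits per database."""
    canned = {
        "swissprot": {"mouse": "SP_1", "chicken": "SP_2"},
        "refseq": {"mouse": "RS_1", "zebrafish": "RS_2"},
        "nr": {"sea urchin": "NR_1"},
    }
    return canned[database]


result = cascade_search(
    "HUMAN_QUERY", ["mouse", "chicken", "zebrafish", "sea urchin"], fake_search
)
```

In this sketch a hit from an earlier database is never replaced by one from a later database, which mirrors the preference for Swiss-Prot entries stated in the text.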
The set of orthologues is then filtered manually, based on sequence length, identity with the human sequence, and available annotations.
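Filtering is done manually, but a simple pre-screen on length and identity could help shortlist candidates for review. The sketch below assumes hypothetical thresholds (`min_length_ratio`, `max_length_ratio`, `min_identity`) and a hypothetical `Candidate` record; none of these values come from the procedure itself, and annotation review is left to the curator.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    species: str
    length: int       # sequence length in residues
    identity: float   # fraction of residues identical to the human sequence (0 to 1)


def prefilter(
    candidates: List[Candidate],
    human_length: int,
    min_length_ratio: float = 0.8,   # assumed threshold, for illustration only
    max_length_ratio: float = 1.2,   # assumed threshold, for illustration only
    min_identity: float = 0.3,       # assumed threshold, for illustration only
) -> List[Candidate]:
    """Keep candidates whose length and identity fall within the given bounds."""
    kept = []
    for candidate in candidates:
        ratio = candidate.length / human_length
        if min_length_ratio <= ratio <= max_length_ratio and candidate.identity >= min_identity:
            kept.append(candidate)
    return kept


candidates = [
    Candidate("mouse", 990, 0.85),
    Candidate("fragment", 300, 0.90),       # too short relative to the human sequence
    Candidate("distant hit", 1010, 0.20),   # identity below the assumed minimum
]
shortlist = prefilter(candidates, human_length=1000)
```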
Selected sequences are then aligned with M-Coffee, a meta-multiple sequence alignment program.
To adjust the alignment depth, two quality criteria are calculated. These criteria are based on work published by Tavtigian et al. (2008, 2009) and on recommendations published on the web site of SIFT (Sorting Intolerant From Tolerant, a variant classification system). A correct alignment should contain on average three substitutions per position, and its median information content should be less than or equal to 3.25. If an alignment does not satisfy these criteria, sequences creating large gaps are removed, and new sequences are added if needed to meet the information content criterion.
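To make the two criteria concrete, the sketch below computes rough stand-ins for them from a gapped alignment. Both definitions are assumptions made for illustration: information content is taken as log2(20) minus the Shannon entropy of each column, and the number of substitutions per position is approximated by the number of distinct residues in a column minus one, which is only a lower bound and not the phylogeny-based estimate a real analysis would use. Treating "three substitutions" as a minimum is also an assumption.

```python
import math
from collections import Counter
from statistics import mean, median
from typing import List, Tuple

GAP = "-"


def column_information(column: str) -> float:
    """Assumed definition: log2(20) minus the Shannon entropy of the column."""
    residues = [r for r in column if r != GAP]
    if not residues:
        return 0.0
    total = len(residues)
    entropy = -sum((n / total) * math.log2(n / total) for n in Counter(residues).values())
    return math.log2(20) - entropy


def column_min_substitutions(column: str) -> int:
    """Crude proxy: distinct residues minus one (a lower bound on substitutions)."""
    distinct = {r for r in column if r != GAP}
    return max(len(distinct) - 1, 0)


def alignment_quality(rows: List[str]) -> Tuple[float, float]:
    """Return (average substitutions per position, median information content)."""
    columns = ["".join(col) for col in zip(*rows)]
    avg_subs = mean([column_min_substitutions(c) for c in columns])
    med_info = median([column_information(c) for c in columns])
    return avg_subs, med_info


def meets_criteria(rows: List[str]) -> bool:
    """Thresholds from the text: about 3 substitutions per position, median information <= 3.25."""
    avg_subs, med_info = alignment_quality(rows)
    return avg_subs >= 3 and med_info <= 3.25


# Toy alignment of three sequences and three columns.
toy_rows = ["MKV", "MRV", "MKL"]
toy_avg_subs, toy_med_info = alignment_quality(toy_rows)
```

A shallow alignment such as the toy example fails both checks, which corresponds to the case where more divergent sequences would be added.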
As a final step, alignments are optimized manually.
Gene | Transcript | Origin |
---|---|---|
ABCA4 | NM_000350.2 | Interactive Biosoftware |
ACVRL1 | NM_000020.2 | Interactive Biosoftware |
ADCK3 | NM_020247.4 | Interactive Biosoftware |
APC | NM_000038.4 | Interactive Biosoftware |
APC | NM_000038.5 | Interactive Biosoftware |
APC | NM_001127510.2 | Interactive Biosoftware |
ATM | NM_000051.3 | Interactive Biosoftware |
ATM | U82828.1 | Interactive Biosoftware |
BBS1 | NM_024649.4 | Interactive Biosoftware |
BRCA1 | NM_007294.2 | IARC |
BRCA1 | NM_007294.3 | IARC |
BRCA2 | NM_000059.3 | IARC |
BSCL2 | NM_001122955.2 | Interactive Biosoftware |
BSCL2 | NM_001122955.3 | Interactive Biosoftware |
C3 | NM_000064.2 | Interactive Biosoftware |
C3 | NM_000064.3 | Interactive Biosoftware |
CDKN2A | NM_058195.2 | Interactive Biosoftware |
CDKN2A | NM_058195.3 | Interactive Biosoftware |
CHD7 | NM_017780.2 | Interactive Biosoftware |
CHD7 | NM_017780.3 | Interactive Biosoftware |
COL3A1 | NM_000090.3 | Interactive Biosoftware |
CYP21A2 | NM_000500.5 | Interactive Biosoftware |
CYP21A2 | NM_000500.7 | Interactive Biosoftware |
DEPDC5 | NM_001242896.1 | Interactive Biosoftware |
DMD | NM_004006.2 | Interactive Biosoftware |
DYNC2H1 | NM_001080463.1 | Interactive Biosoftware |
ENG | NM_001114753.1 | Interactive Biosoftware |
EYS | NM_001142800.1 | Interactive Biosoftware |
F8 | NM_000132.3 | Interactive Biosoftware |
FAT4 | NM_024582.4 | Interactive Biosoftware |
GATA2 | NM_001145661.1 | Interactive Biosoftware (GRCh38 LRG_295) |
GATA2 | NM_032638.4 | Interactive Biosoftware (GRCh38 LRG_295) |
GATA2 | NM_001145662.1 | Interactive Biosoftware (GRCh38) |
GCK | NM_000162.3 | Interactive Biosoftware |
GCK | NM_033507.1 | Interactive Biosoftware |
GJB2 | NM_004004.5 | Interactive Biosoftware |
GLA | NM_000169.2 | Interactive Biosoftware |
HBB | NM_000518.4 | Interactive Biosoftware |
KCNQ1 | NM_000218.2 | Interactive Biosoftware |
KCTD7 | NM_153033.1 | Interactive Biosoftware |
KCTD7 | NM_153033.4 | Interactive Biosoftware |
KIT | NM_000222.2 | Interactive Biosoftware |
KRAS | NM_033360.2 | Interactive Biosoftware |
L1CAM | NM_000425.3 | Interactive Biosoftware |
L1CAM | NM_000425.4 | Interactive Biosoftware |
LDLR | NM_000527.3 | Interactive Biosoftware |
LDLR | NM_000527.4 | Interactive Biosoftware |
LMNA | NM_170707.2 | Interactive Biosoftware |
LMNA | NM_170707.3 | Interactive Biosoftware |
MAP3K14 | NM_003954.3 | Interactive Biosoftware |
MAP3K14 | NM_003954.4 | Interactive Biosoftware |
MECP2 | NM_001110792.1 | Interactive Biosoftware |
MEN1 | NM_000244.3 | Interactive Biosoftware |
MLH1 | NM_000249.2 | IARC |
MLH1 | NM_000249.3 | IARC |
MSH2 | NM_000251.1 | IARC |
MSH2 | NM_000251.2 | IARC |
MSH6 | NM_000179.1 | IARC |
MSH6 | NM_000179.2 | IARC |
MUTYH | NM_001128425.1 | Interactive Biosoftware |
MYBPC3 | NM_000256.3 | Interactive Biosoftware |
MYH7 | NM_000257.2 | Interactive Biosoftware |
MYL2 | NM_000432.3 | Interactive Biosoftware |
NEFL | NM_006158.2 | Interactive Biosoftware |
NEFL | NM_006158.3 | Interactive Biosoftware |
NEFL | NM_006158.4 | Interactive Biosoftware |
NF1 | NM_001042492.2 | Interactive Biosoftware |
NOTCH3 | NM_000435.2 | Interactive Biosoftware |
NRXN1 | NM_001135659.1 | Interactive Biosoftware |
ORC1 | NM_004153.3 | Interactive Biosoftware |
PKD1 | L33243.1 | Interactive Biosoftware |
PKD1 | NM_001009944.2 | Interactive Biosoftware |
PKP2 | NM_004572.3 | Interactive Biosoftware |
PMS2 | NM_000535.5 | IARC |
RB1 | NM_000321.2 | Interactive Biosoftware |
RECQL4 | NM_004260.3 | Interactive Biosoftware |
SCN1A | AB093548.1 | Interactive Biosoftware |
SCN1A | NM_001165963.1 | Interactive Biosoftware |
SCN5A | NM_198056.2 | Interactive Biosoftware |
SDHB | NM_003000.2 | Interactive Biosoftware |
SH3TC2 | NM_024577.3 | Interactive Biosoftware |
SMCHD1 | NM_015295.2 | Interactive Biosoftware |
SPRED1 | NM_152594.2 | Interactive Biosoftware |
SRD5A2 | NM_000348.3 | Interactive Biosoftware |
TARDBP | NM_007375.3 | Interactive Biosoftware |
TFR2 | NM_003227.3 | Interactive Biosoftware |
TP53 | NM_000546.4 | Interactive Biosoftware |
TP53 | NM_000546.5 | Interactive Biosoftware |
TTN | NM_133378.4 | Interactive Biosoftware |
TTN | NM_001256850.1 | Interactive Biosoftware |
VHL | NM_000551.3 | Interactive Biosoftware |
WNK1 | NM_018979.2 | Interactive Biosoftware |
We intend to regularly add new alignments for the most frequently studied genes. Should you wish a new alignment for a specific gene not in the above list, please send us a request.
We would like to express our thanks to the Genetic Cancer Susceptibility Group at IARC for their kind help in defining our alignment protocol.
Tavtigian, SV., Greenblatt, MS., Lesueur, F., Byrnes, GB. (2008). In silico analysis of missense substitutions using sequence-alignment based methods. Hum Mutat. 29(11): 1327-36.
Tavtigian, SV., Oefner, PJ., Babikyan, D. et al. (2009). Rare, evolutionarily unlikely missense substitutions in ATM confer increased risk of breast cancer. Am J Hum Genet. 85: 427-46.
Deforche, A., Blavier, A. (2010). Systematic Building of Multiple Protein Alignments for Variant Interpretation. Human Genome Meeting poster.
© 2020 Interactive Biosoftware - Last modified: 30 December 2017