Sains Malaysiana 48(10)(2019): 2151–2159
http://dx.doi.org/10.17576/jsm-2019-4810-10
Benchmarking in silico Tools for the Functional Assessment of DNA Variants using a Set of Strictly Pharmacogenetic Variants
(Ujian Tanda Aras Alat in silico untuk Penilaian Kesan Fungsian Varian DNA menggunakan Varian Farmakogenetik Terpilih)
ENG WEE CHUA* & CHIAN SIANG GOH
Faculty of Pharmacy, Universiti Kebangsaan Malaysia, Jalan Raja Muda Abdul Aziz, 50300 Kuala Lumpur, Federal Territory, Malaysia
Received: 19 August 2018/Accepted: 9 September 2019
ABSTRACT
Predictive algorithms are important tools for translating genomic data into meaningful functional annotations. In this work, we benchmarked the performance of eight prediction methods using a set of strictly pharmacogenetic variants. We first compiled a set of damaging or neutral variants that affected pharmacogenes from two online databases. We then cross-checked their functional impacts against the predictions given by the chosen tools. Of the eight methods, SIFT (Sorting Intolerant From Tolerant), Mutation Assessor, and CADD (Combined Annotation Dependent Depletion) were the top performers in predicting the functional relevance of a variant. The performance of SIFT surpassed that of CADD despite its much simpler algorithm, correctly identifying 66.91% of the damaging variants and 84.38% of the neutral variants. SIFT assumes that important DNA bases within a gene are conserved and not amenable to substitution. Overall, none of the prediction methods struck a balance between sensitivity and specificity. For instance, we noted that CADD was very sensitive in detecting the damaging variants (89.21%); however, it also mispredicted a large fraction of the neutral variants (43.75%). We then trialled a consensus approach whereby the functional significance of a variant is defined by agreement between at least three prediction methods. The approach performed better than all the tools deployed alone, detecting 84.17% of the deleterious variants and 70.97% of the neutral variants. A prediction method that integrates an assortment of algorithms, each assigned an empirically optimised weighting, may be established in the future for the functional assessment of pharmacogenetic variants.
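The consensus rule and the sensitivity and specificity figures quoted above can be illustrated with a minimal Python sketch. It assumes that each tool's output has already been reduced to a binary damaging/neutral call, and that a variant is labelled damaging when at least three tools call it damaging and neutral otherwise. This handling of split votes, the function names, and the toy variants are illustrative assumptions, not the authors' implementation or data.

```python
from typing import Dict, List, Tuple

# Hypothetical toy data (not from the study):
# variant id -> (known to be damaging?, per-tool calls where True = "damaging").
TOY_VARIANTS: Dict[str, Tuple[bool, List[bool]]] = {
    "var_A": (True, [True, True, True, False, False]),
    "var_B": (True, [True, False, False, False, True]),
    "var_C": (False, [False, False, True, False, False]),
    "var_D": (False, [True, True, True, True, False]),
}


def consensus_call(tool_calls: List[bool], min_votes: int = 3) -> bool:
    """Label a variant damaging if at least `min_votes` tools call it damaging.

    Assumption: variants with fewer damaging votes are treated as neutral.
    """
    return sum(tool_calls) >= min_votes


def sensitivity_specificity(
    truth: List[bool], predicted: List[bool]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary damaging/neutral labels."""
    tp = sum(1 for t, p in zip(truth, predicted) if t and p)
    fn = sum(1 for t, p in zip(truth, predicted) if t and not p)
    tn = sum(1 for t, p in zip(truth, predicted) if not t and not p)
    fp = sum(1 for t, p in zip(truth, predicted) if not t and p)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one damaging and one neutral variant")
    return tp / (tp + fn), tn / (tn + fp)


truth = [label for label, _ in TOY_VARIANTS.values()]
predicted = [consensus_call(calls) for _, calls in TOY_VARIANTS.values()]
sensitivity, specificity = sensitivity_specificity(truth, predicted)
print(f"sensitivity={sensitivity:.2%}, specificity={specificity:.2%}")
```

In this sketch, sensitivity is the fraction of damaging variants detected and specificity is the fraction of neutral variants correctly identified, which is how the percentages in the abstract are expressed.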
Keywords: Deleteriousness; functional impact; pharmacogenetic variant; predictive algorithm; receiver operating characteristic analysis
ABSTRAK
Algoritma ramalan merupakan alat yang penting dalam menterjemahkan data genom kepada anotasi fungsian yang lebih bermakna. Dalam kajian ini, kami menilai prestasi lapan kaedah ramalan dalam mengenal pasti kesan fungsian varian farmakogenetik. Kami mengumpulkan varian farmakogenetik yang neutral atau merosakkan fungsi protein daripada dua pangkalan data atas talian. Kemudian, kami membandingkan kesan fungsian setiap varian tersebut dengan ramalan yang dijanakan oleh kaedah yang terpilih. Daripada lapan kaedah tersebut, SIFT (Sorting Intolerant From Tolerant, atau Mengasingkan Varian Mudarat Daripada Varian Neutral), Mutation Assessor dan CADD (Combined Annotation Dependent Depletion, atau Kesusutan Bersandarkan Anotasi Gabungan) adalah terbaik dalam menentukan kesan fungsian sesuatu varian farmakogenetik. SIFT mencapai prestasi yang lebih baik daripada CADD walaupun algoritmanya lebih ringkas. Kaedah tersebut dapat mengenal pasti 66.91% varian yang mencacatkan fungsi protein dan 84.38% varian neutral. Ramalan SIFT adalah berasaskan anggapan bahawa bes DNA yang penting dalam sesuatu gen adalah terabadi dan tidak boleh ditukar ganti. Secara keseluruhan, tiada kaedah ramalan mencapai keseimbangan antara kepekaan dan kekhususan diagnostik. Contohnya, kami mendapati bahawa CADD sangat sensitif dalam mengesankan varian mudarat (89.21%); tetapi, CADD juga membuat ramalan yang salah terhadap kesan fungsian sekelompok besar varian neutral (43.75%). Seterusnya, kami menaksirkan kesan fungsian varian dengan berdasarkan persetujuan antara sekurang-kurangnya tiga kaedah ramalan. Pendekatan konsensus ini didapati lebih baik daripada menggunakan mana-mana kaedah secara berasingan. Pendekatan tersebut mampu mengesankan 84.17% varian mudarat dan 70.97% varian neutral. Kaedah ramalan yang menggabungkan pelbagai algoritma, dengan setiap satunya diberi pemberatan yang ditentukan secara empirik, mungkin dibangunkan pada masa hadapan bagi menilai impak fungsian varian farmakogenetik dengan lebih berkesan.
Kata kunci: Algoritma ramalan; analisis penerimaan pengoperasian lengkung; impak fungsian; mudarat; varian farmakogenetik
*Corresponding author; email: cew85911@ukm.edu.my