Sains Malaysiana 48(10)(2019): 2151–2159

http://dx.doi.org/10.17576/jsm-2019-4810-10

 

Benchmarking in silico Tools for the Functional Assessment of DNA Variants using a Set of Strictly Pharmacogenetic Variants

(Malay title, translated: Benchmark Testing of in silico Tools for Assessing the Functional Effect of DNA Variants using Selected Pharmacogenetic Variants)

 

ENG WEE CHUA* & CHIAN SIANG GOH

 

Faculty of Pharmacy, Universiti Kebangsaan Malaysia, Jalan Raja Muda Abdul Aziz, 50300 Kuala Lumpur, Federal Territory, Malaysia

 

Received: 19 August 2018/Accepted: 9 September 2019

 

ABSTRACT

Predictive algorithms are important tools for translating genomic data into meaningful functional annotations. In this work, we benchmarked the performance of eight prediction methods using a set of strictly pharmacogenetic variants. We first compiled, from two online databases, a set of damaging or neutral variants affecting pharmacogenes. We then cross-checked their known functional impacts against the predictions given by the chosen tools. Of the eight methods, SIFT (Sorting Intolerant From Tolerant), Mutation Assessor, and CADD (Combined Annotation Dependent Depletion) were the top performers in predicting the functional relevance of a variant. SIFT outperformed CADD despite its much simpler algorithm, correctly identifying 66.91% of the damaging variants and 84.38% of the neutral variants. SIFT assumes that important DNA bases within a gene are conserved and not amenable to substitution. Overall, none of the prediction methods struck a balance between sensitivity and specificity. For instance, CADD was very sensitive in detecting the damaging variants (89.21%) but mispredicted a large fraction of the neutral variants (43.75%). We then trialled a consensus approach whereby the functional significance of a variant is defined by agreement between at least three prediction methods. This approach performed better than any of the tools deployed alone, detecting 84.17% of the deleterious variants and 70.97% of the neutral variants. In the future, a prediction method that integrates an assortment of algorithms, each assigned an empirically optimised weighting, may be established for the functional assessment of pharmacogenetic variants.

Keywords: Deleteriousness; functional impact; pharmacogenetic variant; predictive algorithm; receiver operating characteristic analysis
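
The consensus rule and the two performance measures reported in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that a variant is called damaging when at least three tools flag it as damaging and neutral otherwise, which is one reading of "agreement between at least three prediction methods". The function names, the tool label `ToolD`, and the toy calls are hypothetical and do not come from the study. Sensitivity is the fraction of known damaging variants called damaging, and specificity is the fraction of known neutral variants called neutral; CADD's misprediction of 43.75% of the neutral variants, for example, corresponds to a specificity of 56.25%.

```python
from typing import Dict, List, Tuple


def consensus_call(predictions: Dict[str, bool], min_agree: int = 3) -> bool:
    """Return True (damaging) when at least `min_agree` tools call the variant damaging.

    Assumption: variants with fewer damaging calls are treated as neutral.
    """
    return sum(predictions.values()) >= min_agree


def sensitivity_specificity(truth: List[bool], called: List[bool]) -> Tuple[float, float]:
    """Compute (sensitivity, specificity) with True meaning damaging.

    Requires at least one damaging and one neutral variant in `truth`.
    """
    tp = sum(t and c for t, c in zip(truth, called))
    fn = sum(t and not c for t, c in zip(truth, called))
    tn = sum(not t and not c for t, c in zip(truth, called))
    fp = sum(not t and c for t, c in zip(truth, called))
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one damaging and one neutral variant")
    return tp / (tp + fn), tn / (tn + fp)


# Toy data (made up for illustration): (known damaging?, per-tool damaging calls).
toy_variants = [
    (True, {"SIFT": True, "CADD": True, "MutationAssessor": True, "ToolD": False}),
    (True, {"SIFT": False, "CADD": True, "MutationAssessor": False, "ToolD": False}),
    (False, {"SIFT": False, "CADD": True, "MutationAssessor": False, "ToolD": False}),
    (False, {"SIFT": True, "CADD": True, "MutationAssessor": True, "ToolD": False}),
]

truth = [known for known, _ in toy_variants]
calls = [consensus_call(tool_calls) for _, tool_calls in toy_variants]
sens, spec = sensitivity_specificity(truth, calls)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Raising `min_agree` makes the consensus call stricter, which would tend to trade sensitivity for specificity; the threshold of three is the one stated in the abstract.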

 

ABSTRACT (translated from the Malay)

Predictive algorithms are important tools for translating genomic data into more meaningful functional annotations. In this study, we evaluated the performance of eight prediction methods in identifying the functional effects of pharmacogenetic variants. We compiled pharmacogenetic variants that are neutral or damaging to protein function from two online databases. We then compared the functional effect of each variant with the predictions generated by the selected methods. Of the eight methods, SIFT (Sorting Intolerant From Tolerant), Mutation Assessor and CADD (Combined Annotation Dependent Depletion) were the best at determining the functional effect of a pharmacogenetic variant. SIFT performed better than CADD even though its algorithm is simpler. The method correctly identified 66.91% of the variants that impair protein function and 84.38% of the neutral variants. SIFT predictions are based on the assumption that important DNA bases within a gene are conserved and cannot be substituted. Overall, no prediction method achieved a balance between diagnostic sensitivity and specificity. For example, we found that CADD was highly sensitive in detecting damaging variants (89.21%); however, CADD also mispredicted the functional effects of a large group of neutral variants (43.75%). Next, we assessed the functional effect of variants based on agreement between at least three prediction methods. This consensus approach was found to be better than using any of the methods on its own. The approach detected 84.17% of the damaging variants and 70.97% of the neutral variants. A prediction method that combines various algorithms, each given an empirically determined weighting, may be developed in the future to assess the functional impact of pharmacogenetic variants more effectively.

Keywords: Predictive algorithm; receiver operating characteristic analysis; functional impact; deleteriousness; pharmacogenetic variant


 

*Corresponding author; email: cew85911@ukm.edu.my

 

 

 

 
