Sains Malaysiana 48(10)(2019): 2151–2159
http://dx.doi.org/10.17576/jsm-2019-4810-10
Benchmarking in silico Tools for the Functional Assessment of DNA Variants
using a Set of Strictly Pharmacogenetic Variants
ENG WEE CHUA* & CHIAN SIANG GOH
Faculty of Pharmacy, Universiti Kebangsaan Malaysia, Jalan Raja Muda Abdul
Aziz, 50300 Kuala Lumpur, Federal Territory, Malaysia
Submitted: 19 August 2018/Accepted: 9 September 2019
ABSTRACT
Predictive algorithms are important tools for translating genomic data into
meaningful functional annotations. In this work, we benchmarked
the performance of eight prediction methods using a set of strictly
pharmacogenetic variants.
We first compiled a set of damaging or neutral variants that affected
pharmacogenes from two online databases. We then cross-checked their
functional impacts against the predictions given by the chosen tools.
Of the eight methods, SIFT (Sorting Intolerant From Tolerant), Mutation
Assessor, and CADD (Combined Annotation Dependent Depletion)
were the top performers in predicting the functional relevance of
a variant. The performance of SIFT surpassed that of CADD despite
its much simpler algorithm, correctly identifying 66.91% of the
damaging variants and 84.38% of the neutral variants. SIFT assumes
that functionally important amino acid positions in a protein are conserved
and not amenable to substitution. Overall, none of the prediction
methods struck a balance between sensitivity and specificity. For
instance, we noted that CADD was very sensitive in detecting the damaging variants
(89.21%); however, it also mispredicted a large fraction of the
neutral variants (43.75%). We then trialled a consensus approach
whereby the functional significance of a variant is defined by agreement
between at least three prediction methods. The approach performed
better than all the tools deployed alone, detecting 84.17% of the
deleterious variants and 70.97% of the neutral variants. A prediction
method that integrates an assortment of algorithms, each assigned
an empirically optimised weighting, may be established in the future
for the functional assessment of pharmacogenetic variants.
Keywords: Deleteriousness; functional impact; pharmacogenetic variant;
predictive algorithm; receiver operating characteristic analysis
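
The consensus rule described in the abstract can be illustrated with a minimal Python sketch. It assumes one reading of the rule, namely that a variant is called damaging when at least three tools predict it to be damaging. The function names, the toy variants and their labels are hypothetical and are not data from the study.

```python
from typing import Dict, List, Tuple


def consensus_call(predictions: List[bool], min_agree: int = 3) -> bool:
    """Call a variant damaging when at least `min_agree` tools say so.

    `predictions` holds one boolean per tool (True = predicted damaging).
    The threshold of three follows the abstract; the voting direction
    (counting damaging calls) is an assumption of this sketch.
    """
    return sum(predictions) >= min_agree


def sensitivity_specificity(truth: List[bool], called: List[bool]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for boolean labels (True = damaging)."""
    positives = sum(truth)
    negatives = len(truth) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one damaging and one neutral variant")
    true_pos = sum(t and c for t, c in zip(truth, called))
    true_neg = sum((not t) and (not c) for t, c in zip(truth, called))
    return true_pos / positives, true_neg / negatives


# Hypothetical toy data: eight tool predictions per variant.
toy_predictions: Dict[str, List[bool]] = {
    "var1": [True, True, True, False, False, False, False, False],
    "var2": [True, True, False, False, False, False, False, False],
    "var3": [True, True, True, True, True, False, True, True],
    "var4": [False] * 8,
}
# Hypothetical experimental labels (True = damaging).
toy_truth: Dict[str, bool] = {"var1": True, "var2": True, "var3": False, "var4": False}

variants = sorted(toy_predictions)
toy_calls = [consensus_call(toy_predictions[v]) for v in variants]
toy_labels = [toy_truth[v] for v in variants]
toy_sens, toy_spec = sensitivity_specificity(toy_labels, toy_calls)
print(f"sensitivity={toy_sens:.2f}, specificity={toy_spec:.2f}")
```

Raising `min_agree` would be expected to trade sensitivity for specificity, which is the balance the abstract discusses. A weighted scheme of the kind proposed for future work could replace the simple count with a weighted sum of tool votes.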
*Corresponding author; email: cew85911@ukm.edu.my