NBioinfo: Establishing a Bioinformatics Core in a University-based General Hospital in South Brazil
DOI:
https://doi.org/10.5753/jidm.2024.2684
Keywords:
Bioinformatics, Computational Biology, Genomics, Machine Learning, Research Group, Systems Biology
Abstract
Bioinformatics is an indispensable discipline for current research in life and medical sciences. The increasing volume and complexity of biological data, together with the growing tendency toward open data and data reuse projects, have made computer-based analytical tools central to these research fields. However, bioinformatics is an intrinsically interdisciplinary field, and both using its tools and developing new methods require a wide range of skills. There is still a lack of skilled human resources to meet the numerous and growing application possibilities, which represents a bottleneck in many research projects. This paper reports our efforts to create the Núcleo de Bioinformática (NBioinfo, or Bioinformatics Core) at the Hospital de Clínicas de Porto Alegre (HCPA), a major public university hospital in Brazil. NBioinfo aims to serve as a hub for research and interaction in Bioinformatics and Computational Biology at HCPA, institutionally developing these areas of knowledge and promoting scientific advances triggered by bioinformatics. We briefly present our research group's history and goals, and describe our activities toward providing HCPA with competencies in these fields. We also describe the scientific and methodological challenges recently faced by our group and the advances promoted by scientific collaborations and research projects developed at NBioinfo.