PATopics: A framework to automate the extraction of information in pharmaceutical patent documents
DOI:
https://doi.org/10.5753/reic.2023.3417
Keywords:
Pharmaceutical patents, Topic modeling, Natural language processing
Abstract
Pharmaceutical patents are composed of documents with many details regarding the invention’s claims and explanations of its methodology and results. Managing them requires an exhaustive manual search. To mitigate this problem, we propose PATopics, a framework that extracts relevant information from the patents’ text, builds relevant topics, correlates them with useful patent characteristics, and presents the information in a friendly web interface. We evaluated the framework using 4,832 pharmaceutical patents concerning 809 molecules patented by 478 companies. We analyze the results considering the demands of three user profiles (researchers, chemists, and companies), showing how practical and helpful PATopics is in the pharmaceutical scenario.
License
Copyright (c) 2023 The authors
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.