Journal of the Brazilian Computer Society https://journals-sol.sbc.org.br/index.php/jbcs <div class="cms-item cms-collection cms-collection--split cms-collection--untitled" data-fragment="784856"> <div class="cms-collection__row"> <div class="cms-collection__column"> <div class="cms-collection__column-inner"> <div class="cms-item cms-collection" data-fragment="784854"> <div id="aimsAndScope" class="cms-item placeholder placeholder-aimsAndScope"> <div class="placeholder-aimsAndScope_content"> <p>The <em>Journal of the Brazilian Computer Society</em> (JBCS) is an international journal that serves as a forum for disseminating innovative research in all fields of computer science and related subjects. Contents include theoretical, practical and experimental papers reporting original research contributions, as well as high-quality survey papers. Coverage extends to all computer science topics, computer systems development and formal and theoretical aspects of computing, including computer architecture; high-performance computing; database management and information retrieval; computational biology; computer graphics; data visualization; image and video processing; VLSI design and software-hardware codesign; embedded systems; geoinformatics; artificial intelligence; games, entertainment and virtual reality; natural language processing and much more.</p> <p>The JBCS team wants all quality articles to be published in the journal regardless of the authors' funding capacity. Thus, if the authors are unable to pay the APC, we recommend that they contact the editors (editorial@journal-bcs.com). The JBCS team will provide support in finding alternative funding.
In particular, a grant from the Brazilian Internet Steering Committee (http://nic.br/) helps sponsor the publication of many JBCS articles.</p> </div> </div> </div> </div> </div> </div> </div> en-US soraia.musse@pucrs.br (Soraia Musse) publicacoes@sbc.org.br (Annie Casali) Tue, 20 Jan 2026 16:18:16 +0000 OJS 3.2.1.2 http://blogs.law.harvard.edu/tech/rss 60 OneTrack-M: A Multitask Approach for Transformer-Based MOT Models https://journals-sol.sbc.org.br/index.php/jbcs/article/view/4636 <p>Multi-Object Tracking (MOT) is a critical problem in computer vision, essential for understanding how objects move and interact in videos. This field faces significant challenges such as occlusions and complex environmental dynamics, impacting model accuracy and efficiency. While traditional approaches have relied on Convolutional Neural Networks (CNNs), the introduction of transformers has brought substantial advancements. This work introduces OneTrack-M, a transformer-based MOT model that improves both computational efficiency and tracking accuracy. Our approach uses a transformer encoder as the model backbone, significantly reducing processing time and increasing inference speed. Additionally, we employ innovative data preprocessing and multitask training techniques to handle occlusions and diverse objectives within a single set of weights. Experimental results demonstrate that OneTrack-M achieves at least 25% faster inference times compared to state-of-the-art models in the literature while maintaining or improving tracking accuracy metrics.
These improvements highlight the potential of the proposed solution for real-time applications such as autonomous vehicles, surveillance systems, and robotics, where rapid responses are crucial for system effectiveness.</p> Luiz Carlos Silva de Araujo, Carlos Mauricio Seródio Figueiredo Copyright (c) 2026 Luiz Carlos Silva de Araujo, Carlos Mauricio Seródio Figueiredo https://creativecommons.org/licenses/by/4.0 https://journals-sol.sbc.org.br/index.php/jbcs/article/view/4636 Fri, 27 Mar 2026 00:00:00 +0000 Multiclass Classification for Detection of GPS Spoofing and Jamming Attacks on UAVs https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5309 <p>Unmanned Aerial Vehicles (UAVs) are increasingly being employed across various domains, making them more vulnerable to a range of attacks, particularly cyber threats. These vehicles usually rely on a global navigation satellite system (GNSS), such as the Global Positioning System (GPS) satellites, for location and navigation data, which can be exploited by adversaries launching attacks using fake GPS signals. To safeguard UAVs from GPS Jamming and GPS Spoofing attacks, this paper proposes an Intrusion Detection System (IDS) that utilizes machine learning techniques for detecting and identifying such attacks. The IDS analyzes GPS signal samples representing normal operation, GPS Jamming, and three types of GPS Spoofing attacks. It relies on machine learning, with models trained and tested for binary class and multiclass classification. The binary class version aims to identify an occurrence of any attack, irrespective of type, as suggested by previous literature. However, the novelty of this work lies in the multiclass version, which enables the identification of attack types — an essential factor in determining the most effective protective measures and providing data for forensic investigations. Stacking, an ensemble machine learning method, yielded the best results, achieving an accuracy rate of 96.91%. 
Furthermore, the proposed multiclass IDS reduced false negatives to 0.71%, making it less likely than the binary class version to overlook attacks, which is crucial in real UAV deployments.</p> Gustavo Gualberto Rocha de Lemos, Rodrigo Augusto Cardoso da Silva Copyright (c) 2026 Gustavo Gualberto Rocha de Lemos, Rodrigo Augusto Cardoso da Silva https://creativecommons.org/licenses/by/4.0 https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5309 Tue, 17 Mar 2026 00:00:00 +0000 Enhancing Red Team Agent Learning with the Kill Chain Catalyst Algorithm in Capture the Flag Scenarios https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5365 <p>With the advancement of technology, tasks once performed by humans have increasingly transitioned to machines or agents equipped with artificial intelligence, including in various cyber security domains. From the perspective of real-world cyber attacks, executing actions with minimal failures and steps is critical to reducing the likelihood of exposure. Although research on autonomous cyber attacks predominantly employs Reinforcement Learning (RL), this approach has gaps in scenarios with limited training data, showing low resilience in dynamic environments and limited interpretability of decision-making policies. This work therefore introduces Kill Chain Catalyst (KCC), an <em>RL</em> algorithm based on a Gini Impurity-Based Weighted Random Forest that prioritizes interpretability, efficiency in scenarios with limited experience, and resilience in the dynamic environments explored by <em>RL</em> agents. <em>KCC</em> leverages decision tree logic for enhanced interpretability and employs a catalyst module inspired by genetic alignment to optimize the search for efficient attack sequences. More than 150 attack experiments were conducted to evaluate learning in terms of offset, speed, and generalization.
The analysis focused on the steps, rewards, and failures of agents using the RL algorithms <em>KCC</em>, <em>PPO</em>, <em>DQN</em>, <em>TRPO</em>, and <em>A2C</em>, within a <em>Capture the Flag</em> tournament setting. Both static and dynamic scenarios with limited learning experiences were considered. These experiments demonstrate the superior performance of <em>KCC</em>, revealing differences of up to 198.69% for steps, 129.43% for rewards, and 1096.39% for failures when performing attacks using <em>KCC</em> compared with the other algorithms.</p> Antonio Horta, Anderson dos Santos, Ronaldo Goldschmidt Copyright (c) 2026 Antonio Horta, Anderson dos Santos, Ronaldo Goldschmidt https://creativecommons.org/licenses/by/4.0 https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5365 Mon, 16 Mar 2026 00:00:00 +0000 Evaluation of explainable artificial intelligence techniques in the context of credit card fraud detection https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5376 <p>Artificial intelligence has been employed in several applications in the financial sector. This paper deals with one of these applications: fraud detection in credit card transactions. In this context, a number of machine learning algorithms can be used to obtain models which automate the classification of a transaction as fraudulent or genuine. However, some of these machine learning algorithms are not directly interpretable. The current paper presents an evaluation of explainable artificial intelligence techniques SHAP and LIME applied to models for fraud detection in credit card transactions. Along with the results of the evaluation, the paper discusses the effectiveness and need for explainable artificial intelligence techniques. This paper extends a previous paper by including hyperparameter tuning, new results and an evaluation of the processing time to obtain explanations. 
The reported results suggest that SHAP obtained better results than LIME, although LIME required less processing time once the LIME explainer had been obtained.</p> Gabriel Mendes de Lima, Paulo Henrique Pisani Copyright (c) 2026 Gabriel Mendes de Lima, Paulo Henrique Pisani https://creativecommons.org/licenses/by/4.0 https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5376 Wed, 25 Mar 2026 00:00:00 +0000 Improved Biclique Cryptanalysis of the Lightweight Cipher FUTURE https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5390 <p>In the past decade, lightweight cryptography has been of much interest in academia, especially regarding the cryptanalysis of such ciphers. The National Institute of Standards and Technology (NIST) is one of the entities responsible for this interest, given that it promoted, in 2019, a public process to choose the American standard for lightweight cryptography. In 2022, the FUTURE cipher was published and has since been the target of much cryptanalysis in a very short period of time, including integral, meet-in-the-middle and differential attacks. The objective of this paper is to present four biclique attacks, obtained through semi-automatic search, that improve on the previously published one in terms of time, memory and data complexities. Our fastest attack requires 2<sup>124.38</sup> full computations of the cipher, only 2<sup>24</sup> data pairs, and negligible memory. We also present the fastest unbalanced biclique attack and star attack to our knowledge. Only one published integral attack on FUTURE is faster than our attacks without using the full codebook of data (i.e., fewer than 2<sup>64</sup> plaintext/ciphertext pairs): it runs in 2<sup>123.70</sup> and requires 2<sup>63</sup> pairs.
Still, when compared to it, our attacks use much less data while being only slightly slower, which presents a good trade-off.</p> Gabriel de Carvalho, Luis Kowada Copyright (c) 2026 Gabriel de Carvalho, Luis Kowada https://creativecommons.org/licenses/by/4.0 https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5390 Wed, 25 Mar 2026 00:00:00 +0000 Building flexible databases by using web services for computer-aided diagnosis of cardiomyopathies: from conceptual definition to usability evaluation https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5424 <p>Computer-aided diagnosis (CAD) systems based on medical images and records apply computational techniques to process data and extract features from them to provide a second opinion to the health professional. A diverse and organized set of images and records is necessary to develop and validate such systems. However, medical data are generally obtained in a non-standardized way. With each new research and development project in this area, specific data models need to be built to organize and standardize these data and enable their use in the construction of models and computational systems. This article presents a flexible and generic database modeled and implemented to persist Cardiac Magnetic Resonance exams aiming to support the development of CAD schemes of cardiomyopathies. Furthermore, a web application was developed to enable data search and retrieval from the database. An experiment was carried out to evaluate the interface usability of the web application. Results showed that it is possible to develop a generic and flexible DB model, which can be used in several CAD applications. Additionally, the implemented interface received positive evaluations on its functionalities and usability, and users were capable of performing the intended tasks with correct outcomes.</p> Larissa Terto Alvim, Vagner Mendonça Gonçalves, Fátima L. S. 
Nunes Copyright (c) 2026 Larissa Terto Alvim, Vagner Mendonça Gonçalves, Fátima L. S. Nunes https://creativecommons.org/licenses/by/4.0 https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5424 Fri, 20 Mar 2026 00:00:00 +0000 Survey of Brazilian Open Budget Data Portals: Query Interfaces and Dashboards https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5449 <p>To promote transparency, the Brazilian government provides access to public data through web portals featuring query interfaces and dashboards. While query interfaces are used by more experienced users to gather data for further analyses, dashboards that include visualizations help a broader audience consult and explore data. A domain of particular complexity that benefits from the use of these interfaces is government spending and budgets. This study analyzes dashboards and query interfaces of government budget data through qualitative research based on a survey. Focusing on Brazil's budget transparency initiative, we examined 83 interfaces in total: 30 dashboards and 53 query interfaces from federal, state, and major city governments. This survey assesses these interfaces using design patterns for general-purpose dashboards and design principles for open government data dashboards. Our findings reveal a critical weakness: while most portals provide access to budget data, they largely neglect user-centered design, failing to provide the necessary context or consider the data literacy of their audience. This creates a significant "transparency gap" that undermines genuine accountability and demonstrates the need for a fundamental shift in the design of these essential public tools.</p> Kaline B. F. Mesquita, Dennis G. Balreira, Andre S. Spritzer, Carla M. D. S. Freitas Copyright (c) 2026 Kaline B. F. Mesquita, Dennis G. Balreira, Andre S. Spritzer, Carla M. D. S.
Freitas https://creativecommons.org/licenses/by/4.0 https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5449 Wed, 25 Mar 2026 00:00:00 +0000 Subspace representations in deep neural networks: A survey https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5482 <p>Computer vision applications often involve processing large-scale multidimensional data, requiring methods that are both efficient and accurate. Traditional pattern recognition methods based on subspace representations offer low computational complexity but typically underperform compared to deep learning models in terms of recognition accuracy. This study aims to explore and analyze the integration of subspace representations within deep learning frameworks to leverage the advantages of both approaches. We conducted a comprehensive survey of existing methods that combine subspace representation techniques with deep neural networks. We propose a taxonomy to categorize these methods into three distinct groups based on their integration strategies. The reviewed methods demonstrate that incorporating subspace representations can enhance the performance and efficiency of deep learning models. The taxonomy helps to clarify the landscape of these hybrid approaches and identifies trends in methodological development. 
The surveyed approaches demonstrate a clear methodological evolution, contributing to enhanced outcomes in various real-world applications.</p> Stéfane Rêgo Gandra, Bernardo Bentes Gatto, Eulanda Miranda dos Santos Copyright (c) 2026 Stéfane Rêgo Gandra, Bernardo Bentes Gatto, Eulanda Miranda dos Santos https://creativecommons.org/licenses/by/4.0 https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5482 Fri, 27 Mar 2026 00:00:00 +0000 EasyGuard: A Gamified App for Generating Strong and Memorable Passwords https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5545 <p>Although the use of online services has increased substantially over the past decade, the strength of user-created passwords has remained at concerning levels. This study aimed to develop and evaluate the efficiency of a gamified application in promoting the behavior of designing strong passwords. Two rounds of experiments were conducted, each lasting nine days. In the first experiment (<em>n</em> = 10), we evaluated the passwords generated based on user inputs compared to random passwords. Our findings showed that our app generated passwords with an improvement of 68.43 percentage points in the memorization test, 4.87 p.p. in the typing test, and 60.38 p.p. in the combined memorization and typing test. In the second experiment (<em>n</em> = 15), we incorporated a dictionary-based password generation policy into the evaluation and applied an automated tool for data collection. User input-based passwords outperformed random ones by 87.26 p.p. in the memorization test, 2.75 p.p. in the typing test, and 85.92 p.p. in the combined test. Meanwhile, dictionary-based passwords showed improvements of 54.32 p.p., 1.69 p.p., and 69.70 p.p., respectively. Our approach proved promising in promoting strong and memorable passwords. Nonetheless, EasyGuard requires further development and should be further investigated in future studies.</p> Hugo L. Romão, Marcelo H. O. Henklain, Felipe L. Lobo, Eduardo L. 
Feitosa Copyright (c) 2026 Hugo L. Romão, Marcelo H. O. Henklain, Felipe L. Lobo, Eduardo L. Feitosa https://creativecommons.org/licenses/by/4.0 https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5545 Tue, 17 Mar 2026 00:00:00 +0000 High-Performance Elliptic Curve Cryptography: A SIMD Approach to Modern Curves (Thesis Distillation) https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5548 <p>Cryptography based on elliptic curves is endowed with efficient methods for public-key cryptography. Recent research has shown the superiority of the Montgomery and Edwards curves over the Weierstrass curves as they require fewer arithmetic operations. Using these modern curves has, however, introduced several challenges to the cryptographic algorithm's design, opening up new opportunities for optimization. Our main objective is to propose algorithmic optimizations and implementation techniques for cryptographic algorithms based on elliptic curves. In order to speed up the execution of these algorithms, our approach relies on the use of extensions to the instruction set architecture. In addition to those specific for cryptography, we use extensions that follow the Single Instruction, Multiple Data (SIMD) parallel computing paradigm. In this model, the processor executes the same operation over a set of data in parallel. We investigated how to apply SIMD to the implementation of elliptic curve algorithms. As part of our contributions, we design parallel algorithms for prime field and elliptic curve arithmetic. We also design a new three-point ladder algorithm for the scalar multiplication <em>P+kQ</em>, and a faster formula for calculating <em>3P</em> on Montgomery curves. These algorithms have found applicability in isogeny-based cryptography. Using SIMD extensions such as SSE, AVX, and AVX2, we develop optimized implementations of the following cryptographic algorithms: X25519, X448, SIDH, ECDH, ECDSA, EdDSA, and qDSA. 
Performance benchmarks show that these implementations are faster than existing implementations in the state of the art. Our study confirms that using extensions to the instruction set architecture is an effective tool for optimizing implementations of cryptographic algorithms based on elliptic curves. May this be an incentive not only for those seeking to speed up programs in general but also for computer manufacturers to include more advanced extensions that support the increasing demand for cryptography.</p> Armando Faz-Hernandez, Julio López Copyright (c) 2026 Armando Faz-Hernandez, Julio López https://creativecommons.org/licenses/by/4.0 https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5548 Wed, 25 Mar 2026 00:00:00 +0000 A Triad of Defenses to Mitigate Poisoning Attacks in Federated Learning https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5558 <p>Federated learning (FL) enables the training of machine learning models on decentralized data, potentially improving data privacy. However, the FL distributed architecture is vulnerable to poisoning attacks. In this paper, we propose an FL method capable of mitigating these attacks through a triad of defense strategies: organizing clients into groups, evaluating the local performance of global models during training, and using a voting scheme during the inference phase. The proposed approach first divides the clients into randomly sampled groups, each generating a distinct global model. Each client trains a local model on their private data and submits it to the central server. The central server aggregates the local models within each group to generate the global models. Then, each client receives all global models, selects the best performing one as their new local model, and the process repeats until training is complete. During the inference phase, each client classifies its inputs according to a majority-based voting scheme among the global models. 
Our experiments using the HAR and MNIST datasets demonstrate that our method can effectively mitigate poisoning attacks without compromising the global model's performance.</p> Blenda Oliveira Mazetto, Bruno Bogaz Zarpelão Copyright (c) 2026 Blenda Oliveira Mazetto, Bruno Bogaz Zarpelão https://creativecommons.org/licenses/by/4.0 https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5558 Mon, 16 Mar 2026 00:00:00 +0000 Partial integrity, authenticity and belongingness using modification-tolerant signature schemes https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5565 <p>Digital signatures allow us to ensure that the signed digital data is authentic and has not been modified. However, even a single bit modification in the data invalidates the entire signature. In INDOCRYPT '19, Idalino et al. presented an efficient modification-tolerant signature scheme (MTSS) framework using combinatorial group testing techniques, allowing the location and correction of modified parts of the signed data. In this work, we implement their framework and discuss the practical performance of the solution. We also propose various necessary auxiliary algorithms not explored in the initial work, such as the division of data into blocks and the generation of the underlying combinatorial structure needed for the signature generation. Moreover, we propose a novel use case of the framework, which we call the <em>belongingness framework</em>. This scheme allows the verification of the integrity and authenticity of a subset of the signed data without having access to the whole data. 
This is particularly interesting in big data applications, where access to the whole signed data is prohibitive due to storage limitations.</p> Anthony Bernardo Kamers, Gustavo Zambonin, Thaís Bardini Idalino, Paola de Oliveira Abel, Jean Everson Martina Copyright (c) 2026 Anthony Bernardo Kamers, Gustavo Zambonin, Thaís Bardini Idalino, Paola de Oliveira Abel, Jean Everson Martina https://creativecommons.org/licenses/by/4.0 https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5565 Mon, 16 Mar 2026 00:00:00 +0000 A survey of social media stance detection using non-textual features https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5687 <p>Stance detection is known as the computational task of estimating an individual's attitude towards a given target topic, which is often of a political or moral nature. In traditional NLP fashion, models of this kind have relied mainly on learning features extracted from social media text. However, social media may provide many other types of non-content information in conjunction with text, such as friends networks, interactions with other users, etc. These knowledge sources, despite being potentially useful for stance prediction, remain relatively little discussed in existing surveys of the field. 
To fill this gap in the literature, this article presents a survey of stance detection research focusing on the use of network-related features and on how these are combined with more standard text models.</p> Laís Carraro Leme Cavalheiro, Ivandré Paraboni Copyright (c) 2026 Laís Carraro Leme Cavalheiro, Ivandré Paraboni https://creativecommons.org/licenses/by/4.0 https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5687 Wed, 25 Mar 2026 00:00:00 +0000 Turbocharging Brazilian Mergers and Acquisitions: Questions & Answers Evaluation https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5703 <p>Economic power abuse is a concern in Brazil, where CADE (Administrative Council for Economic Defense) institution combats anti-competitive behaviors to ensure fair competition. Artificial intelligence (AI) can aid CADE by identifying and extracting relevant information from technical reports published in Brazilian Portuguese language, improving the detection and prevention of economic abuse. This paper presents a case study using AI to improve regulatory reviews of CADE documents via a Retrieval-Augmented Generation (RAG) pipeline architecture. Our key contribution is the creation of a specialized Questions &amp; Answers benchmark dataset and a pipeline evaluation methodology, providing a standardized framework for Portuguese-language regulatory document analysis. A chain of thought (CoT) approach was used for problem solving. It leverages the RAG retrieval mechanism to access relevant information and incorporates the sequential reasoning of the CoT framework to generate responses that follow a logical flow of ideas, thus enhancing response accuracy. A vector database employing cosine similarity was used to retrieve the main arguments combined with metadata filters, reducing hallucinations and improving the Large Language Model (LLM) performance. RAG metrics were then combined with a robust human fact-check assessment to validate the pipeline. 
Our findings establish a new benchmark for Questions &amp; Answers evaluation in Brazilian Mergers and Acquisitions, demonstrating that the proposed strategy effectively enhances the analysis of organizational merger and acquisition reports, unlocking substantial benefits for society.</p> Francis Spiegel Rubin, Pedro Nuno de Souza Moura, Adriana Cesario de Faria Alvim Copyright (c) 2026 Francis Spiegel Rubin, Pedro Nuno de Souza Moura, Adriana Cesario de Faria Alvim https://creativecommons.org/licenses/by/4.0 https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5703 Tue, 17 Mar 2026 00:00:00 +0000 Visually Comparing Graph Vertex Ordering Algorithms through Geometrical and Topological Approaches https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5851 <p>Graph vertex ordering is a resource widely employed in spatial data analysis, particularly in the urban analytics context, where street graphs are frequently used as spatial discretization for modeling and simulation. Vertex ordering is also important for visualization purposes, as many methods require the vertices to be arranged and displayed in a well-defined order to enable the visual identification of non-trivial patterns. The primary goal of vertex ordering methods is to find an ordering that preserves neighborhood relations. However, the structural complexity of graphs employed in real-world applications leads to unavoidable distortions in the ordering process. Therefore, comparing different vertex ordering methods is fundamental to enable effective analysis and selection of the most appropriate method in each application. Although several metrics have been proposed to assess spatial vertex ordering, they typically focus on measuring the quality of the ordering globally. Global ordering assessment does not enable the analysis and identification of locations where distortions are more pronounced, hampering the analytical process. 
Visual evaluation of the vertex ordering mechanisms is particularly valuable in this context, as it allows analysts to distinguish between methods based on their performance within a single visualization, assess distortions, identify regions with anomalous behavior, and, in urban contexts, explain spatial inconsistencies in the ordering. This work introduces a visualization-assisted tool to assess vertex ordering techniques, with urban analytics as the application focus. Specifically, we evaluate geometric and topological vertex ordering approaches using urban street graphs as the basis for comparisons. The visual tool builds upon existing and newly proposed metrics, which are validated through experiments on urban data from multiple cities, demonstrating that the proposed methodology is effective in assisting users in selecting a suitable vertex ordering technique, fine-tuning hyperparameters, and identifying regions with high ordering distortions.</p> Karelia Vilca Salinas, Victor Barella, Thales Vieira, Luis Gustavo Nonato Copyright (c) 2026 Karelia Vilca Salinas, Victor Barella, Thales Vieira, Luis Gustavo Nonato https://creativecommons.org/licenses/by/4.0 https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5851 Mon, 16 Mar 2026 00:00:00 +0000 LiwTERM-r: A Revised Lightweight Transformer-based Model for Multimodal Skin Lesion Detection Robust to Incomplete Input https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5871 <p>As the most common type of cancer in the world, skin cancer accounts for approximately 30% of all diagnosed tumor-based lesions. Early diagnosis can reduce mortality and prevent disfigurement in different skin regions.
The application of machine learning techniques in recent years, especially deep learning, has achieved promising results in this task, with studies demonstrating that combining patients' clinical anamneses with images of the lesion is essential for improving the correct classification of skin lesions. Even so, the meaningful use of anamneses together with multiple images of the same skin lesion remains necessary and requires further investigation. Thus, this project aims to contribute to developing multimodal machine learning-based models to solve the skin lesion classification problem by employing a lightweight transformer model that is robust to missing clinical information. Our main hypothesis is that models can be fed multiple images from different sources, along with clinical anamneses from the patient's historical evaluations, leading to a more factual and trustworthy diagnosis. Our model tackles the non-trivial task of combining images and clinical information about skin lesions in a lightweight transformer architecture that demands neither high computational resources nor complete anamnesis information, yet still presents competitive classification results.</p> Luis Antonio de Souza Júnior, André Georghton Cardoso Pacheco, Thiago Oliveira dos Santos, Wyctor Fogos da Rocha, Pedro Henrique Bouzon, Christoph Palm, João Paulo Papa Copyright (c) 2026 Luis Antonio de Souza Júnior, André Georghton Cardoso Pacheco, Thiago Oliveira dos Santos, Wyctor Fogos da Rocha, Pedro Henrique Bouzon, Christoph Palm, João Paulo Papa https://creativecommons.org/licenses/by/4.0 https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5871 Mon, 16 Mar 2026 00:00:00 +0000 Crowd-Powered Sampling for Machine Learning: Leveraging Citizen Scientist Response Patterns in AutoML Workflows https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5888 <p>Defining effective models for data classification is
challenging, especially in complex contexts. Automated Machine Learning (AutoML) tools can assist in this process by generating rankings tailored to the nature of the data and the problem. In this work, we investigate the performance of five classifiers applied to the task of deforestation segment classification, using data labeled through a citizen science campaign from the ForestEyes project. We selected SVM, Ridge, AdaBoost, KNN, and MLP models based on a ranking generated with the PyCaret AutoML library, prioritizing diverse modeling approaches. Initially, the performance of the models is assessed using an incremental training strategy based on the entropy of the volunteers' classifications. Then, a new training strategy is proposed based on the median response time of volunteers when evaluating each segment, exploring three ordering strategies: ascending, descending, and edge-based. Experimental results aligned with the PyCaret ranking, with SVM achieving the best performance, followed by Ridge and AdaBoost, especially when trained on smaller and more reliable data subsets. Both the entropy-based approach and the new strategy using median response time demonstrated strong potential to efficiently train machine learning models in scenarios with scarce data, typical of citizen science campaigns.</p> Hugo Resende, Eduardo B. Neto, Fabio A. M. Cappabianco, Álvaro L. Fazenda, Fabio A. Faria Copyright (c) 2026 Hugo Resende, Eduardo B. Neto, Fabio A. M. Cappabianco, Álvaro L. Fazenda, Fabio A. Faria https://creativecommons.org/licenses/by/4.0 https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5888 Mon, 16 Mar 2026 00:00:00 +0000 An approach to Data Literacy through a Personalized Interactive LGPD Guide using LLM for Educators https://journals-sol.sbc.org.br/index.php/jbcs/article/view/6448 <p>The Brazilian General Data Protection Law (LGPD) was created to protect the fundamental rights of freedom and privacy of Brazilian citizens.
Since its implementation, it has brought new challenges to all institutions established in Brazil, whether public or private, requiring an adaptation to personal data processing practices. In the context of higher education, many professors face difficulties in understanding and properly applying the guidelines of this legislation in their daily activities. This work proposes the development of an approach to data literacy through an interactive guide, based on practical scenarios, to support educators in the process of complying with the LGPD. The proposed system uses the OpenAI API to offer personalized support in real-time. Ten representative academic scenarios were implemented, in which users can interact through multiple-choice questions followed by a chat with the guide. The results showed that, despite initial usability limitations, the system represents a promising tool to promote the comprehension of LGPD among teachers. We observed that our approach can facilitate compliance with the legislation, but requires accessibility and usability improvements to ensure greater and easier adoption.</p> César Murilo da Silva Junior, Silvio E. Quincozes, Juliana Saraiva, Rafael D. Araújo Copyright (c) 2026 César Murilo da Silva Junior, Silvio E. Quincozes, Juliana Saraiva, Rafael D. Araújo https://creativecommons.org/licenses/by/4.0 https://journals-sol.sbc.org.br/index.php/jbcs/article/view/6448 Wed, 25 Mar 2026 00:00:00 +0000 AI-Driven Hierarchical Taxonomy Generation from Emergency Call Transcripts https://journals-sol.sbc.org.br/index.php/jbcs/article/view/6635 <p class="p1">This article presents a case study on hierarchical topic modeling for emergency call transcripts from Ecuador's ECU 911 service. We introduce a hybrid methodology that first generates a taxonomy from unlabeled data using <em>BERTopic</em> and agglomerative clustering, and then employs embedding-based similarity for multi-label classification. 
By leveraging multilingual embeddings (<em>LaBSE</em>) and clustering algorithms (<em>UMAP &amp; HDBSCAN</em>), we identified 23 coherent topics, demonstrating a practical balance between accuracy and operational applicability. The key result is a significant reduction in Hamming Loss and an F1-score of 0.4951, achieved without the need for pre-labeled data. This underscores the method's primary practical significance: offering a scalable, automated solution for emergency management centers to rapidly categorize complex incidents, thereby enhancing situational awareness and resource allocation. The integration of <em>LLaMA 3</em> for automated label generation further optimized semantic interpretation, highlighting the potential of language models in critical, resource-constrained domains.</p> Juan Gabriel Flores Sanchez, Marcos Orellana, Patricio Santiago García-Montero, Jorge Luis Zambrano-Martinez Copyright (c) 2026 Juan Gabriel Flores Sanchez, Marcos Orellana, Patricio Santiago García-Montero, Jorge Luis Zambrano-Martinez https://creativecommons.org/licenses/by/4.0 https://journals-sol.sbc.org.br/index.php/jbcs/article/view/6635 Wed, 25 Mar 2026 00:00:00 +0000 Comparing Explainable AI Techniques In Language Models: A Case Study For Fake News Detection in Portuguese https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5787 <p>Language models are widely used in natural language processing, but their complexity makes interpretation difficult, limiting their adoption in critical decision-making. This work explores Explainable Artificial Intelligence (XAI) techniques, such as LIME and Integrated Gradients (IG), to understand these models. The study evaluates the effectiveness of BERTimbau in classifying Portuguese news as true or fake, using the FakeRecogna and Fake.Br Corpus datasets. 
In the experiments, LIME proved to be easier to interpret than IG, and both methods showed limitations when applied to texts, as they focus only on the morphological and lexical levels, ignoring other important levels.</p> Jéssica Vicentini, Rafael Bezerra de Menezes Rodrigues, Arnaldo Candido Junior, Ivan Rizzo Guilherme Copyright (c) 2026 Jéssica Vicentini, Rafael Bezerra de Menezes Rodrigues, Arnaldo Candido Junior, Ivan Rizzo Guilherme https://creativecommons.org/licenses/by/4.0 https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5787 Wed, 21 Jan 2026 00:00:00 +0000 A Coding-Efficiency Analysis of HEVC Encoder Embedded in High-End Mobile Chipsets https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5354 <p>High-end mobile devices require dedicated hardware for real-time video encoding and decoding processes. However, the inherent complexity of the video encoding process, combined with the physical limitations imposed by hardware design such as energy consumption, encoding time, memory usage, and heat dissipation, demands the implementation of various constraints and limitations in commercial hardware to simplify and make them feasible for general use. The High Efficiency Video Coding (HEVC) standard is the main targeted video encoder for processing high-resolution videos in high-end chipsets. This paper aims to analyze the HEVC encoder implemented into three commercial chipsets found in high-end smartphones (Apple iPhone 14 Pro, Samsung Galaxy S23 Plus, and Redmi Note 10S) from three major mobile chip manufacturers (Apple, Qualcomm, and MediaTek), considering the impacts of video encoder limitations on encoding efficiency (BD-Rate) and encoding time. 
The results in this paper may be used as a comparative foundation for hardware designers and future works in the field, as they expose the encoding efficiency drawbacks and the encoding time gains that commercial chipsets exhibit in their HEVC encoders.</p> Vítor Costa, Murilo Perleberg, Luciano Agostini, Marcelo Porto Copyright (c) 2026 Vítor Costa, Murilo Perleberg, Luciano Agostini, Marcelo Porto https://creativecommons.org/licenses/by/4.0 https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5354 Thu, 22 Jan 2026 00:00:00 +0000 Learning on hierarchical trees with Random Forest https://journals-sol.sbc.org.br/index.php/jbcs/article/view/4242 <p style="font-weight: 400;">Hierarchies, as described in mathematical morphology, represent nested regions of interest and provide mechanisms to create coherent data organization. They facilitate high-level analysis and management of large amounts of data. Represented as hierarchical trees, they have formalisms intersecting with graph theory and generalizable applications. Due to the deterministic algorithms, the multiform representations, and the absence of a direct quality evaluation, it is hard to insert hierarchical information into a learning framework and benefit from the recent advances. Researchers usually tackle this problem by refining the hierarchies for a specific medium and assessing their quality for a particular task. The downside of this approach is that it depends on the application, and the formulations limit the generalization to similar data. This work aims to create a learning framework that can operate with hierarchical data and is agnostic to the input and application. The idea is to transform the data into a regular representation required by most learning models while preserving the rich information in the hierarchical structure. The proposed methods use edge-weighted image graphs and hierarchical trees as input, and they evaluate different proposals on the edge detection and segmentation tasks.
The learning model is the Random Forest, a fast and scalable method for working with high-dimensional data. Results demonstrate that it is possible to create a learning framework that depends only on the hierarchical data and presents state-of-the-art performance in multiple tasks.</p> Raquel Almeida, Laurent Amsaleg, Zenilton Kleber G. do Patrocínio Júnior, Ewa Kijak, Simon Malinowski, Silvio Jamil Ferzoli Guimarães Copyright (c) 2026 Raquel Almeida, Laurent Amsaleg, Zenilton Kleber G. do Patrocínio Júnior, Ewa Kijak, Simon Malinowski, Silvio Jamil Ferzoli Guimarães https://creativecommons.org/licenses/by/4.0 https://journals-sol.sbc.org.br/index.php/jbcs/article/view/4242 Mon, 26 Jan 2026 00:00:00 +0000 Statistical Invariance vs. AI Safety: Why Prompt Filtering Fails Against Contextual Attacks https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5961 <p>Large Language Models (LLMs) are increasingly deployed in high-stakes applications, yet their alignment with ethical standards remains fragile and poorly understood. To investigate the probabilistic and dynamic nature of this alignment, we conducted a black-box evaluation of nine widely used LLM platforms, anonymized to emphasize the underlying mechanisms of ethical alignment rather than model benchmarking. We introduce the Semantic Hijacking Method (SHM) as an experimental framework, formally defined and grounded in probabilistic modeling, designed to reveal how ethical alignment can erode gradually, even when all user inputs remain policy-compliant. Across three experimental rounds (324 total executions), SHM achieved a 97.8% success rate in eliciting harmful content, with failure rates progressing from 93.5% (multi-turn conversations) to 100% (both refined sequences and single-turn interactions), demonstrating that vulnerabilities are inherent to semantic processing rather than conversational memory.
A qualitative cross-linguistic analysis revealed cultural variations in harmful narratives, with Brazilian Portuguese responses frequently echoing historical and socio-cultural biases, making them more persuasive to local users. Overall, our findings demonstrate that ethical alignment is not a static barrier but a dynamic and fragile property that challenges binary safety metrics. Due to potential risks of misuse, all prompts and outputs are made available exclusively to authorized reviewers under ethical approval, and this publication focuses solely on reporting the research findings.</p> Aline Ioste, Sarajane Marques Peres, Marcelo Finger Copyright (c) 2026 Aline Ioste, SaraJane Peres, Marcelo Finger https://creativecommons.org/licenses/by/4.0 https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5961 Tue, 27 Jan 2026 00:00:00 +0000 An Autonomous Hybrid Data Partitioning Approach for NewSQL Databases https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5684 <p>Like online games and the financial market, several applications require specific data management features such as large data volume support, data streaming, and the processing of thousands of OLTP transactions per second. In general, traditional relational databases are not suitable for these requirements. NewSQL is a new generation of databases that combines high scalability and availability with ACID support, being a promising solution for these kinds of applications. Although data partitioning is an essential feature for tuning relational databases, it is still an open issue for NewSQL databases. This paper proposes an automated approach for hybrid data partitioning that minimizes the number of distributed transactions and keeps the system well-balanced. In order to demonstrate its efficacy, we compare our solution with an optimal partitioning solution generated by a solver and a state-of-art baseline. 
The experiments show that the quality of the partitioning scheme is similar to that of the optimal solution and that it outperforms the state-of-the-art approach in the number of distributed transactions.</p> Geomar A. Schreiner, Rafael de Santiago, Denio Duarte, Ronaldo dos Santos Mello Copyright (c) 2026 Geomar A. Schreiner, Rafael de Santiago, Denio Duarte, Ronaldo dos Santos Mello https://creativecommons.org/licenses/by/4.0 https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5684 Mon, 02 Feb 2026 00:00:00 +0000 Limitless Feature Selection: Revolutionizing Evaluation with MH-FSF https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5646 <p>Feature selection plays a crucial role in developing effective predictive models by reducing dimensionality and emphasizing the most relevant attributes. However, current research in this area often lacks comprehensive benchmarking and frequently depends on proprietary datasets. These limitations hinder reproducibility and may lead to inconsistent or suboptimal model performance. To address these limitations, we introduce the MH-FSF framework, a comprehensive, modular, and extensible platform designed to facilitate the reproduction and implementation of feature selection methods. Developed through collaborative research, MH-FSF provides implementations of 17 methods (11 classical, 6 domain-specific) and enables systematic evaluation on 10 publicly available Android malware datasets. Our results reveal performance variations across both balanced and imbalanced datasets, highlighting the critical need for data preprocessing and selection criteria that account for these asymmetries. We demonstrate the importance of a unified platform for comparing diverse feature selection techniques, fostering methodological consistency and rigor.
By providing this framework, we aim to significantly broaden the existing literature and pave the way for new research directions in feature selection, particularly within the context of Android malware detection.</p> Vanderson Rocha, Diego Kreutz, Hendrio Bragança, Eduardo Feitosa Copyright (c) 2026 Vanderson Rocha, Diego Kreutz, Hendrio Bragança, Eduardo Feitosa https://creativecommons.org/licenses/by/4.0 https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5646 Fri, 06 Feb 2026 00:00:00 +0000 BENCH4T3: A Framework to Create Benchmarks for Text-to-Triples Alignment Generation https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5809 <p>Integrating Large Language Models (LLMs) with Knowledge Graphs (KGs) can significantly enhance their capabilities, leveraging LLMs' text generation skills with KGs' explanatory power. However, establishing this connection is challenging and demands proper alignment between unstructured texts and triples. Building benchmarks demands massive human effort in data curation and translation for non-English languages. The demand for adequate benchmarks for validation purposes negatively impacts research advancements. This study proposes an end-to-end framework to guide the automatic construction of text-to-triple alignment benchmarks for any language, using KGs as input. Our solution extracts relations from input triples and processes them to create accurately mapped texts. The proposed pipeline utilizes data curation through prompt engineering and data augmentation to enhance diversity in the generated examples. We experimentally evaluate our framework for creating a bimodal representation of RDF triples and natural language texts, assessing its ability to generate natural language from these triples. A key focus is on developing a benchmark for the underrepresented Portuguese language, facilitating the construction of models that connect structured data (triples) with text. 
Our solution is well suited to creating a benchmark that improves the alignment between KG triples and text data. The results indicate that the generated benchmark outperforms existing solutions. The generative approach benefits from our Portuguese benchmark, achieving competitive results compared to established literature benchmarks. Our solution enables the automatic generation of benchmarks for aligning triples and text.</p> Victor Jesus Sotelo Chico, André Gomes Regino, Julio Cesar dos Reis Copyright (c) 2026 Victor Jesus Sotelo Chico, André Gomes Regino, Julio Cesar Dos Reis https://creativecommons.org/licenses/by/4.0 https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5809 Fri, 06 Feb 2026 00:00:00 +0000 Semiotic Engineering Theory for Human-Computer Integration: An Applicability and Usefulness Evaluation https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5620 <p>The relationship between users and autonomous technologies is evolving towards integration (in the sense of partnership), transcending the stimulus-response interaction between these two agents. To follow this evolution, Human-Computer Interaction (HCI) researchers have defined and characterized a new interaction paradigm, Human-Computer Integration (HInt), which extends the focus of the HCI area to cover this new relationship of partnership between humans and autonomous technologies. As HInt is an emerging paradigm, the concepts and ontology of Semiotic Engineering Theory have been extended to address HInt as an extension of the traditional HCI interaction. Thus, this paper aims to evaluate and discuss the applicability and usefulness of the extension of Semiotic Engineering to define, explore, and explain the phenomena involved in HInt.
Our findings provide useful insights and reflections on the benefits and limits of Semiotic Engineering for HInt to support the study, design, and evaluation of the partnership between humans and autonomous technologies.</p> Glívia Angélica Rodrigues Barbosa, Raquel Oliveira Prates Copyright (c) 2026 Glívia Angélica Rodrigues Barbosa, Raquel Oliveira Prates https://creativecommons.org/licenses/by/4.0 https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5620 Sat, 21 Feb 2026 00:00:00 +0000 STELLAR: A Structured, Trustworthy, and Explainable LLM-Led Architecture for Reliable Customer Support https://journals-sol.sbc.org.br/index.php/jbcs/article/view/6044 <p>While Large Language Models (LLMs) offer transformative potential for automating customer support, significant hurdles remain concerning their reliability, explainability, and consistent performance in complex, sensitive interactions. This paper introduces <strong>STELLAR (Structured, Trustworthy, and Explainable LLM-Led Architecture for Reliable Customer Support)</strong>, a novel architectural blueprint designed to address these issues. STELLAR utilizes a <strong>Directed Acyclic Graph (DAG) structure</strong> composed of nine specialized modules and eleven predefined workflows to orchestrate support interactions in a structured and predictable manner. This design promotes enhanced traceability, reliability, and control compared to less constrained systems. The architecture integrates components for few-shot classification, Retrieval-Augmented Generation (RAG), urgency-aware human escalation, compliance verification, user interaction validation, and knowledge base refinement through a semi-automated loop. This modular design deliberately balances LLM-driven innovation with operational requirements such as human-in-the-loop integration and ethical safeguards through embedded checks. 
We evaluated the core modules of STELLAR in key tasks - classification, retrieval, and compliance - demonstrating strong performance and reliability. Together, these features position STELLAR as a robust and transparent foundation for the next generation of intelligent, reliable customer support systems.</p> Matheus Ferracciú Scatolin, Helio Pedrini Copyright (c) 2026 Matheus Ferracciú Scatolin, Helio Pedrini https://creativecommons.org/licenses/by/4.0 https://journals-sol.sbc.org.br/index.php/jbcs/article/view/6044 Sat, 21 Feb 2026 00:00:00 +0000 HelBERT: A BERT-Based Pretraining Model for Public Procurement Tasks in Portuguese https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5511 <p>Deep learning models excel in various tasks but require extensive annotated data for supervised learning. In NLP, limited annotated data hinders deep learning. Self-supervised pretraining addresses this by training models on unlabeled text to learn useful representations. Domain-specific pretraining is crucial for good performance in downstream tasks. Although pretrained BERT models exist for legal documents in some languages, none target public procurement documents in Portuguese. Public procurement documents have terminology that is not found in existing models. In this paper, we propose HelBERT, a BERT-based model pretrained on a large corpus of public procurement documents in the Brazilian Portuguese language, including laws, tender notices, and contracts. The experimental results demonstrate that HelBERT outperforms other models in all analyses. HelBERT surpasses models such as BERTimbau and JurisBERT in classification tasks by achieving improvements of 5% and 4% in the F1 Score, respectively. Furthermore, the model achieves gains that exceed 3% in semantic similarity tasks compared to the baseline models. 
Moreover, despite using a GPU with reduced memory and processing resources, the proposed approach achieves superior results with fewer and more efficient training epochs than the baseline models. These findings underscore the effectiveness of the proposed model in addressing NLP tasks within the public procurement domain.</p> Weslley Emmanuel Martins Lima, Victor Ribeiro da Silva, Jasson Carvalho da Silva, Ricardo de Andrade Lira Rabêlo, Anselmo Cardoso de Paiva Copyright (c) 2026 Weslley Emmanuel Martins Lima, Victor Ribeiro da Silva, Jasson Carvalho da Silva, Ricardo de Andrade Lira Rabêlo, Anselmo Cardoso de Paiva https://creativecommons.org/licenses/by/4.0 https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5511 Sat, 21 Feb 2026 00:00:00 +0000 RecSys-Fairness: A Framework for Reducing Group Unfairness in Recommendations https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5457 <p>In this study, we address the importance of promoting fairness in recommendation systems, which are highly susceptible to biases that can lead to unfair outcomes for different user groups. We developed a fairness algorithm aimed at mitigating these injustices, which was applied to the MovieLens dataset and analyzed based on the recommendations produced by the ALS (Alternating Least Squares) and NCF (Neural Collaborative Filtering) methods. Users were grouped by activity level, gender, and age, and the results demonstrated the effectiveness of the fairness algorithm in substantially reducing group unfairness (R_{grp}) across all tested configurations, without causing significant losses in recommendation accuracy, measured by the Root Mean Squared Error (RMSE). In particular, a reduction in group unfairness of up to 65.57% was observed in the ALS method. 
Additionally, we identified an optimal convergence of the fairness algorithm for an estimated number of matrices (h) between 10 and 15, suggesting an effective balance point between promoting fairness and maintaining precision in recommendations. In comparison with the available benchmarks, under identical experimental conditions, we managed to improve group unfairness reductions by approximately 6% (from 59.77% to 65.57%).</p> Rafael Vargas Mesquita dos Santos, Giovanni Ventorim Comarela Copyright (c) 2026 Rafael Vargas Mesquita dos Santos, Giovanni Ventorim Comarela https://creativecommons.org/licenses/by/4.0 https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5457 Sat, 21 Feb 2026 00:00:00 +0000 A Reliable Stream Learning Model for Network Intrusion Detection Systems https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5608 <p>Developing a reliable Network Intrusion Detection System (NIDS) remains a complex task due to the non-stationary nature of network traffic and the need for frequent updates to maintain high classification performance. Many existing approaches assume a stationary network environment, which overlooks the challenges associated with periodic model updates, such as the need for large amounts of properly labeled data and significant computational resources. This issue is particularly challenging for real-time applications, where minimizing delays and ensuring accuracy is crucial. This paper proposes an analysis of how changes in the network behavior negatively affects the long-term of ML-Based NIDS. For such a problem, it is proposed a new NIDS approach integrating stream learning with a reject option technique to simplify the model update process while ensuring consistent classification accuracy over time. The proposal uses stream learning classifiers to incrementally incorporate new data, while the reject option allows the system to evaluate the reliability of classifications before they are used for updates. 
The scheme operates with minimal intervention, with rejected instances stored for future updates and used to fine-tune the model over time, ensuring adaptation to evolving network conditions. Experimental results demonstrate that the proposed approach maintains high classification accuracy over a year, even without recurrent updates, and achieves significant improvements in true positive rates compared to traditional methods. The system can operate for up to three months without updates, with no significant degradation in performance.</p> Pedro Horchulhack, Eduardo Kugler Viegas, Altair Olivo Santin Copyright (c) 2026 Pedro Horchulhack, Eduardo Kugler Viegas, Altair Olivo Santin https://creativecommons.org/licenses/by/4.0 https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5608 Mon, 02 Mar 2026 00:00:00 +0000 CNNs for JPEGs: Designing Cost-Efficient Stems https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5873 <p>Convolutional neural networks (CNNs) have achieved astonishing advances over the past decade, pushing the state-of-the-art in several computer vision tasks. CNNs are capable of learning robust representations of the data directly from RGB pixels. However, most image data is usually available in compressed format, of which the JPEG is the most widely used due to transmission and storage purposes. For this motive, a preliminary decoding process that has a high computational load and memory usage is demanded. Image decoding can be a performance bottleneck for devices with limited computational resources, such as embedded devices, even when hardware accelerators are used. For this reason, deep learning methods capable of learning directly from the compressed domain have been gaining attention in recent years. These methods usually extract a frequency domain representation of the image, like DCT, by a partial decoding, and then make adaptation to typical CNN architectures to work with it. 
In this paper, we perform an in-depth study of the computational cost of deep models designed for the frequency domain, evaluating the cost of decoding and passing images through the network. We notice that previous work increased the model's computational complexity to accommodate the compressed images, nullifying the speed-up gained by not decoding them. We propose to remove the changes to the model that increase the computational cost, replacing them with our lightweight stems. This way, we can take full advantage of the speed-up obtained by avoiding the decoding. Our strategies were successful in generating models that balance efficiency and effectiveness, allowing deep models to be deployed in a wider array of devices. We achieve up to a 25.91% reduction in computational complexity (FLOPs), while decreasing accuracy by at most 2.97%. We also propose the efficiency-effectiveness score S<sub>E</sub> to highlight models with favorable trade-offs between accuracy, computational cost, and number of parameters.</p> Samuel Felipe dos Santos, Nicu Sebe, Jurandy Almeida Copyright (c) 2026 Samuel Felipe dos Santos, Nicu Sebe, Jurandy Almeida https://creativecommons.org/licenses/by/4.0 https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5873 Mon, 02 Mar 2026 00:00:00 +0000 Generalizing Feature Selection in Android Malware Detection: The SigAPI AutoCraft Approach https://journals-sol.sbc.org.br/index.php/jbcs/article/view/6043 <p>Feature selection methods are widely employed in Android malware detection to improve accuracy and efficiency by identifying the most relevant features. However, their generalizability often remains limited, as approaches like SigAPI are typically developed and evaluated on a small number of datasets, reducing their effectiveness across diverse scenarios.
The practical use of SigAPI is further hindered by the need to predefine a minimum number of features, the instability of its evaluation metrics, and its inability to adapt efficiently to the heterogeneity commonly present in Android datasets. To address these limitations, we developed SigAPI AutoCraft, an enhanced and fully automated version of the original method. SigAPI AutoCraft achieves consistent and robust performance across ten Android malware datasets, substantially improving generalization. The results demonstrate a 5–15% increase in Matthews Correlation Coefficient (MCC) and up to a 7.6-fold improvement in feature reduction, underscoring its effectiveness and adaptability to complex and heterogeneous data environments.</p> Vanderson Rocha, Laura Tschiedel, Diego Kreutz, Hendrio Bragança, Joner Assolin, Rodrigo Brandão Mansilha, Silvio E. Quincozes, Angelo Gaspar Diniz Nogueira Copyright (c) 2026 Vanderson Rocha, Laura Tschiedel, Diego Kreutz, Hendrio Bragança, Joner Assolin, Rodrigo Brandão Mansilha, Silvio E. Quincozes, Angelo Gaspar Diniz Nogueira https://creativecommons.org/licenses/by/4.0 https://journals-sol.sbc.org.br/index.php/jbcs/article/view/6043 Mon, 09 Mar 2026 00:00:00 +0000 Towards a Lightweight Multi-View Android Malware Detection Model with Multi-Objective Feature Selection https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5378 <p>In recent years, a wide range of new Machine Learning (ML) techniques with high accuracy have been developed for Android malware detection. Despite their high accuracy, these techniques are seldom implemented in production environments due to their limited generalization capabilities, leading to reduced performance when applied to real-world scenarios. In light of this, this paper introduces a novel multi-view Android malware detection model implemented in two stages. 
The first stage involves extracting multiple feature sets from the analyzed Android application package, offering complementary behavioral representations that improve the system's generalization in the classification process. In the second stage, a multi-objective optimization is conducted to identify the optimal feature subset from each view and fine-tune the hyperparameters of individual classifiers, enabling an ensemble-based classification approach. The core innovation of our approach lies in the proactive selection of feature subsets and the optimization of hyperparameters that together enhance classification accuracy while minimizing processing overhead within a multi-view framework. Experiments conducted on a newly developed dataset, consisting of over 40 thousand Android application samples, validate the effectiveness of our proposal. The results indicate that our model can increase true-positive rates by up to 18% while reducing inference processing costs by as much as 72%.</p> Philipe Fransozi, Jhonatan Geremias, Eduardo K. Viegas, Altair O. Santin Copyright (c) 2026 Philipe Fransozi, Jhonatan Geremias, Eduardo K. Viegas, Altair O. Santin https://creativecommons.org/licenses/by/4.0 https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5378 Mon, 09 Mar 2026 00:00:00 +0000 Intelligent Emotion Tracking System VIRE: Evaluation of Neural Network Architectures in Facial Emotion Recognition https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5370 <p>This work proposes an emotional monitoring system called Visual Identification of Recognition of Emotions (VIRE), based on convolutional neural networks (CNNs) to analyze facial expressions. Using the six basic emotions proposed by Paul Ekman as a reference, which can be identified from the composition of various facial muscle states, VIRE aims to assist in the diagnosis of mental health conditions. 
While emotional expressions are communicated in various ways, this research focuses primarily on facial expressions due to their expressiveness resulting from the mobility of facial muscles. The methodology involved collecting data from the FER2013 dataset, preprocessing the images, tuning hyperparameters, and training three different architectures: AlexNet, DenseNet, and a custom CNN. The system classifies expressions into the basic emotions, and the models' performance is evaluated in terms of accuracy and other metrics. VIRE has demonstrated potential, achieving an accuracy of about 60%, although improvements are needed for practical application. The ultimate goal is to create a tool that integrates technology and health, facilitating the identification of emotional states that may indicate mental health issues, thereby contributing to more accurate and effective diagnoses.</p> Nathan Ferraz da Silva, Geraldo Pereira Rocha Filho, Roger Immich, Vinícius Pereira Gonçalves, Rodolfo Ipolito Meneguette Copyright (c) 2026 Nathan Ferraz da Silva, Geraldo Pereira Rocha Filho, Roger Immich, Rodolfo Ipolito Meneguette, Vinícius Pereira Gonçalves https://creativecommons.org/licenses/by/4.0 https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5370 Mon, 09 Mar 2026 00:00:00 +0000 Memorizing Features Efficiently for Self-supervised Video Object Segmentation https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5904 <p>Video object segmentation (VOS) involves consistently identifying and classifying object pixels in video sequences, a task that traditionally depends on extensive, manually annotated datasets. In this work, we present SHLS (Superfeatures in a Highly Compressed Latent Space), a self-supervised VOS method that reduces reliance on both annotations and large training datasets. SHLS employs a metric learning framework combining superpixels and deep learning features, enabling effective training with just 10,000 unlabeled still images.
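The abstract does not detail SHLS's metric-learning objective; as a generic illustration of what such frameworks optimize, here is a triplet-style loss over toy feature vectors (the vectors and margin are arbitrary assumptions, not the paper's loss or features):

```python
import math

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss: pull features of the same object
    together, push features of different objects apart by at least
    `margin` in Euclidean distance."""
    def dist(u, v):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))
    return max(0.0, dist(anchor, positive) - dist(anchor, negative) + margin)

# Toy 2-D "features": two views of one object, one view of another.
same_a, same_b, other = (0.0, 0.0), (0.1, 0.0), (3.0, 4.0)
loss = triplet_loss(same_a, same_b, other)  # easy triplet, loss is zero
```

With self-supervision, such triplets can be mined from still images (e.g., via augmentations) rather than from annotated video.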
Utilizing an efficient memory clustering mechanism, SHLS generates ultra-compact representations called superfeatures, which efficiently store and classify object information across video sequences. Experiments on the DAVIS dataset demonstrate SHLS's strong performance in multi-object scenarios, underscoring its potential as a robust and efficient alternative in self-supervised VOS.</p> Marcelo Mendonça, Luciano Oliveira Copyright (c) 2026 Marcelo Mendonça, Luciano Oliveira https://creativecommons.org/licenses/by/4.0 https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5904 Sun, 15 Mar 2026 00:00:00 +0000 Sapo-boi: Bypassing Linux Kernel Network Stack in the Implementation of an XDP-based NIDS https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5551 <p>Network intrusion detection systems (NIDS) must inspect multiple parts of a packet to detect patterns of known attacks. With the advent of XDP, it has become feasible to implement such a system within the kernel's own network stack for the evaluation of ingress traffic. In this work, we propose Sapo-boi, an NIDS solution consisting of two modules: (i) the Suspicion Module, an XDP program capable of processing packets in parallel, discarding packets considered safe, and redirecting suspicious packets for verdict in user space through XDP sockets (AF_XDP); and (ii) the Evaluation Module, a user-level process capable of finding, in constant time, the rule against which the suspicious packet should be analyzed and triggering notifications if the suspicion is confirmed.
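The Evaluation Module's constant-time lookup can be pictured as a hash-map index over flow identifiers; the Python sketch below uses invented keys and rule names (the paper's actual data structures are not shown) and returns a matching rule in O(1) average time:

```python
# Hypothetical rule index: a flow 5-tuple maps directly to the rule
# that applies to it, so the lookup cost does not grow with the
# number of rules. Addresses and rule names are illustrative only.
rules = {
    # (proto, src_ip, src_port, dst_ip, dst_port) -> rule description
    ("tcp", "10.0.0.5", 4444, "10.0.0.1", 80): "reverse-shell signature",
    ("udp", "10.0.0.9", 53, "10.0.0.1", 53): "DNS tunnelling signature",
}

def evaluate(packet):
    """Return the rule matching this suspicious packet, or None."""
    key = (packet["proto"], packet["src_ip"], packet["src_port"],
           packet["dst_ip"], packet["dst_port"])
    return rules.get(key)  # dict lookup: O(1) on average

suspicious = {"proto": "tcp", "src_ip": "10.0.0.5", "src_port": 4444,
              "dst_ip": "10.0.0.1", "dst_port": 80}
verdict = evaluate(suspicious)
```

In the real system the keys would be whatever packet fields the ruleset matches on; the point is only that hashing avoids a linear scan over rules for each suspicious packet.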
The system demonstrated superior results in terms of packet analysis rates and CPU usage compared to traditional NIDS alternatives (Snort and Suricata).</p> Raphael Kaviak Machnicki, João Ribeiro Andreotti, Ulisses Penteado, Jorge Pires Correia, Vinicius Fulber-Garcia, André Grégio Copyright (c) 2026 Raphael Kaviak Machnicki, João Ribeiro Andreotti, Ulisses Penteado, Jorge Pires Correia, Vinicius Fulber-Garcia, André Grégio https://creativecommons.org/licenses/by/4.0 https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5551 Mon, 02 Mar 2026 00:00:00 +0000 Portfolio-based Active Learning with Gaussian Processes for Vulnerabilities Risk Classification https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5567 <p>Effective vulnerability management is essential for cybersecurity, particularly as the demand for skilled professionals often exceeds supply. This paper investigates the application of Gaussian Processes (GPs) integrated with Active Learning (AL) techniques to classify security vulnerabilities based on their risk of exploitation. The main objective is to optimize the labeling process, thereby reducing the amount of labeled data necessary for training an effective classifier. The proposed methodology combines the uncertainty predictions provided by GP models with five established data selection strategies, utilizing a portfolio-based approach. The portfolio avoids the need to choose a single strategy and leverages the strengths of each technique. This approach enhances adaptability and balances exploration versus exploitation in complex optimization scenarios, ultimately improving the diversity of labeled samples and contributing to the development of better classifiers trained with fewer examples. Experiments were conducted using the CVEjoin dataset, which encompasses over 200,000 vulnerabilities, across three distinct evaluation scenarios. The different setups consider equivalent volumes of labeled data but varying numbers of Active Learning iterations.
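A strategy portfolio of this kind can be pictured as a pool of interchangeable acquisition functions consulted across labeling iterations. The sketch below (Python) uses two invented scoring rules and toy samples, not the paper's five strategies or its GP model:

```python
def most_uncertain(pool):
    # Prefer the sample with the highest predictive standard deviation
    # (e.g., from a GP posterior).
    return max(pool, key=lambda s: s["std"])

def best_second_best(pool):
    # Margin-style rule: smallest gap between the two highest class
    # probabilities, i.e., where the classifier hesitates most.
    return min(pool, key=lambda s: s["p_best"] - s["p_second"])

portfolio = [most_uncertain, best_second_best]

# Toy unlabeled pool; the scores are illustrative, not real model output.
pool = [
    {"id": 1, "std": 0.9, "p_best": 0.95, "p_second": 0.03},
    {"id": 2, "std": 0.2, "p_best": 0.51, "p_second": 0.49},
]

# One pick per strategy; a portfolio rotates or mixes these choices
# instead of committing to a single acquisition rule.
picks = [strategy(pool)["id"] for strategy in portfolio]
```

Note how the two rules disagree on the toy pool, which is exactly the diversity of labeled samples the portfolio is meant to exploit.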
When considering a single strategy, the results indicate that the BSB (best and second best) method consistently outperformed the others in terms of accuracy and F1 score, particularly with an increased number of labeling iterations. In the scenario where multiple strategies are used in a portfolio, the results indicate gains in all evaluation metrics. This study underscores the usefulness of a portfolio-based Active Learning approach in optimizing the labeling procedure and, ultimately, prioritizing vulnerabilities for remediation. This research lays the groundwork for extending the framework to other areas of cybersecurity, such as vulnerabilities in web applications and cloud environments, thereby improving overall security measures in the digital landscape.</p> Davyson S. Ribeiro, Rafael S. Lemos, Francisco R. P. da Ponte, César Lincoln C. Mattos, Emanuel B. Rodrigues Copyright (c) 2026 Davyson S. Ribeiro, Rafael S. Lemos, Francisco R. P. da Ponte, César Lincoln C. Mattos, Emanuel B. Rodrigues https://creativecommons.org/licenses/by/4.0 https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5567 Mon, 02 Mar 2026 00:00:00 +0000 Implementation and evaluation of the Forro stream cipher in Tofino programmable hardware for remote attestation in datacenters https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5625 <p>The software-defined networking (SDN) paradigm has enabled several innovations in computer networking, especially in programmable packet processing. This paper demonstrates the feasibility of implementing the Forro stream cipher algorithm on the Tofino programmable hardware switch and evaluates its impact on computing resources. For comparison purposes, the ChaCha algorithm was also analyzed in terms of its performance and impact on the same device. It was observed that the Forro algorithm performs better and uses fewer resources than ChaCha in sequential implementations.
However, when parallelization techniques are adopted, ChaCha performs better for higher data rates, but uses more ternary matching resources than Forro. For the use case of remote attestation in programmable data planes, the Forro cipher seems more promising, as it consumes fewer of the switch's scarce resources and can achieve sufficient throughput rates for this scenario. We then propose P4DRA, a distributed remote attestation solution based on the programmable data plane that can offload the verification process of remote devices to the data plane, freeing resources from a central verifier based on an x86 server and improving the attestation proof verification speed by around 150 times.</p> Rodrigo Alexander de Andrade Pierini, Caio Teixeira, Christian Rodolfo Esteve Rothenberg, Marco Aurélio Amaral Henriques Copyright (c) 2026 Rodrigo Alexander de Andrade Pierini, Caio Teixeira, Christian Rodolfo Esteve Rothenberg, Marco Aurélio Amaral Henriques https://creativecommons.org/licenses/by/4.0 https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5625 Tue, 24 Feb 2026 00:00:00 +0000
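Both ciphers compared above are ARX (add-rotate-xor) designs; the ChaCha quarter-round below, shown in Python with the RFC 8439 test vector, illustrates the per-round operations that such a pipeline must map onto Tofino stages. Forro's round function differs (structure and rotation constants) and is not reproduced here:

```python
MASK = 0xFFFFFFFF  # all arithmetic is over 32-bit words

def rotl32(x, n):
    """Rotate a 32-bit word left by n bits."""
    return ((x << n) | (x >> (32 - n))) & MASK

def quarter_round(a, b, c, d):
    """ChaCha quarter-round (RFC 8439, section 2.1): four
    add-rotate-xor steps over four 32-bit state words."""
    a = (a + b) & MASK; d = rotl32(d ^ a, 16)
    c = (c + d) & MASK; b = rotl32(b ^ c, 12)
    a = (a + b) & MASK; d = rotl32(d ^ a, 8)
    c = (c + d) & MASK; b = rotl32(b ^ c, 7)
    return a, b, c, d

# Test vector from RFC 8439, section 2.1.1
out = quarter_round(0x11111111, 0x01020304, 0x9B8D6F43, 0x01234567)
# out == (0xEA2A92F4, 0xCB1CF8CE, 0x4581472E, 0x5881C4BB)
```

The data dependency across the four steps is what limits a sequential mapping; the parallelization the paper discusses trades more match-action resources for running independent quarter-rounds side by side.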