Journal of the Brazilian Computer Society https://journals-sol.sbc.org.br/index.php/jbcs <div class="cms-item cms-collection cms-collection--split cms-collection--untitled" data-fragment="784856"> <div class="cms-collection__row"> <div class="cms-collection__column"> <div class="cms-collection__column-inner"> <div class="cms-item cms-collection" data-fragment="784854"> <div id="aimsAndScope" class="cms-item placeholder placeholder-aimsAndScope"> <div class="placeholder-aimsAndScope_content"> <p>The <em>Journal of the Brazilian Computer Society</em> (JBCS) is an international journal which serves as a forum for disseminating innovative research in all fields of computer science and related subjects. Contents include theoretical, practical and experimental papers reporting original research contributions, as well as high quality survey papers. Coverage extends to all computer science topics, computer systems development and formal and theoretical aspects of computing, including computer architecture; high-performance computing; database management and information retrieval; computational biology; computer graphics; data visualization; image and video processing; VLSI design and software-hardware codesign; embedded systems; geoinformatics; artificial intelligence; games, entertainment and virtual reality; natural language processing and much more.</p> <p>The JBCS team wishes that all quality articles be published in the journal independently of the authors' funding capacity. Thus, if the authors are unable to pay the APC charge, we recommend that they contact the editors (editorial@journal-bcs.com). The JBCS team will provide support in finding alternative funding. 
In particular, a grant from the Brazilian Internet Steering Committee (http://nic.br/) helps sponsor the publication of many JBCS articles.</p> </div> </div> </div> </div> </div> </div> </div> Brazilian Computer Society en-US Journal of the Brazilian Computer Society 1678-4804 OneTrack-M: A Multitask Approach for Transformer-Based MOT Models https://journals-sol.sbc.org.br/index.php/jbcs/article/view/4636 <p>Multi-Object Tracking (MOT) is a critical problem in computer vision, essential for understanding how objects move and interact in videos. This field faces significant challenges such as occlusions and complex environmental dynamics, impacting model accuracy and efficiency. While traditional approaches have relied on Convolutional Neural Networks (CNNs), the introduction of transformers has brought substantial advancements. This work introduces OneTrack-M, a transformer-based MOT model that improves both the computational efficiency and the accuracy of tracking. Our approach introduces the transformer encoder as the model backbone, significantly reducing processing time and increasing inference speed. Additionally, we employ innovative data preprocessing and multitask training techniques to address occlusion and diverse objective challenges within a single set of weights. Experimental results demonstrate that OneTrack-M achieves at least 25% faster inference times compared to state-of-the-art models in the literature while maintaining or improving tracking accuracy metrics. 
These improvements highlight the potential of the proposed solution for real-time applications such as autonomous vehicles, surveillance systems, and robotics, where rapid responses are crucial for system effectiveness.</p> Luiz Carlos Silva de Araujo Carlos Mauricio Seródio Figueiredo Copyright (c) 2026 Luiz Carlos Silva de Araujo, Carlos Mauricio Seródio Figueiredo https://creativecommons.org/licenses/by/4.0 2026-03-27 2026-03-27 32 1 555 567 10.5753/jbcs.2026.4636 A Simple U-Diffusion Inpainting Structure https://journals-sol.sbc.org.br/index.php/jbcs/article/view/4907 <p>The inpainting problem is addressed in this work through a very simplified version of the approach based on low-dimensional manifold model (LDMM), in which the actual working principle of the LDMM is put into evidence, namely, the simulated diffusion of image pixels that takes place in a manifold from which patches are drawn to form a given image. The simplicity of this principle is translated into a straightforward algorithm that borrows ideas from the Locally Linear Embedding (LLE) method, commonly used for dimensionality reduction and data visualization. The equivalence between this much simpler algorithm and the original LDMM is illustrated through visual inspection and experimental measurements of peak signal-to-noise ratios. By maintaining the key components of LDMM while reducing conceptual and computational complexity, the proposed method offers a streamlined solution for image inpainting tasks. Additionally, a (U-shaped) multi-scale use of the proposed algorithm is presented as a significantly better initializer for missing pixels, thus reducing the number of algorithmic iterations for convergence.</p> Jugurta Montalvão Gabriel F. A. Bastos Israel J. Santos Filho Copyright (c) 2026 Jugurta Montalvão, Gabriel F.A. Bastos, Israel J. 
Santos Filho https://creativecommons.org/licenses/by/4.0 2026-05-07 2026-05-07 32 1 1331 1342 10.5753/jbcs.2026.4907 An embedded vision-based system for cyclist detection and counting https://journals-sol.sbc.org.br/index.php/jbcs/article/view/4937 <p>Automatically detecting and counting cyclists in urban scenarios is a task in intelligent transportation systems and smart cities that enables the generation of important structured data. This data contributes to understanding the dynamics of cyclists' use of the urban space and guides the development of public policies for cycling mobility and traffic safety. In this study, we propose an embedded system for cyclist detection and counting, aiming to be a lightweight solution using computer vision and deep learning methods. It is characterized by low energy consumption and easy handling, based on the Raspberry Pi 4 platform and the Edge TPU Coral accelerator. The developed system achieved an F1-score of 0.9137 for processing prerecorded video. In experiments conducted in a real urban setting, we achieved counting accuracy between 78.3% and 82.2%, a performance comparable to solutions with higher computational requirements and/or costs. Code is available at https://github.com/leandroAS86/det-cicle</p> Leandro Alves dos Santos Roberto Cesar Betini Bogdan Tomoyuki Nassu Copyright (c) 2026 Leandro Alves dos Santos, Roberto Cesar Betini, Bogdan Tomoyuki Nassu https://creativecommons.org/licenses/by/4.0 2026-04-15 2026-04-15 32 1 690 699 10.5753/jbcs.2026.4937 Adaptive Systems for Well-being Promotion: A Systematic Mapping https://journals-sol.sbc.org.br/index.php/jbcs/article/view/4994 <p><span style="font-weight: 400;">Research on Human-Computer Interaction (HCI) interfaces has gained increasing relevance in both corporate and academic environments, particularly in adaptive systems that offer personalized interventions. 
Adaptive systems are crucial for enhancing user experience and promoting well-being by dynamically adjusting to individual needs and contexts. Well-being, which encompasses physical, mental, and social dimensions, can significantly influence user behavior and task performance. However, measuring well-being remains a complex challenge due to its subjective and multidimensional nature. This study aims to map and analyze the state of the art in computational interfaces that adapt to the user’s context to promote well-being. Specifically, the study addresses the gap in adaptive systems, which are still underdeveloped in the field. Despite significant progress in measuring well-being, most systems focus on monitoring well-being states or training predictive models, rather than offering fully adaptive interventions. To explore this, a systematic mapping study was conducted, investigating three key questions: What is the purpose of the study regarding the well-being dimension explored, as well as the approaches and techniques used to promote it? What methods were employed to measure users’ well-being? What interventions were implemented to promote well-being? The analysis of 36 selected studies reveals that research primarily concentrates on the mental and physical dimensions of well-being, with artificial intelligence techniques and physiological sensors, particularly electrocardiograms (ECG), being the most frequently used. However, there is a notable lack of adaptive systems in the literature. These findings underscore the need for further development of adaptive interventions that actively improve well-being, providing valuable insights to guide the design of adaptive interfaces. 
By leveraging these insights, future systems can be developed to enhance user experience and promote well-being across diverse domains.</span></p> Alex Sandro Rodrigues Ancioto Fabiana Luci de Oliveira Vânia Paula de Almeida Neris Copyright (c) 2026 Alex Sandro Rodrigues Ancioto, Fabiana Luci de Oliveira, Vânia Paula de Almeida Neris https://creativecommons.org/licenses/by/4.0 2026-05-04 2026-05-04 32 1 1108 1127 10.5753/jbcs.2026.4994 Understanding the Influence of Collaborative Tasks on User Experience in Moodle https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5101 <p>Learning Management Systems (LMSs) are widely adopted to support teaching and learning across diverse contexts, including higher education, corporate training, professional certification, and hybrid or fully online courses. These platforms enable flexible access to resources, asynchronous interactions, and integration of individual and collaborative learning activities. While numerous studies have assessed LMS usability and features, the specific influence of collaborative activities on user experience (UX) remains underexplored. This study investigates how collaboration affects UX in Moodle by comparing individual and peer review–based collaborative activities. We conducted a case study with 35 undergraduate Software Engineering students, following Wohlin’s five-step methodology: scope definition, planning, operation, analysis, and presentation. Participants completed an individual task and a collaborative Peer Review activity, after which UX was evaluated through the AttrakDiff questionnaire and Focus Group discussions. Quantitative results indicated notable differences between individual and collaborative contexts, with collaborative activities perceived as more inventive, stimulating, and pleasant, though issues such as lack of interaction, unclear learnability, and system constraints impacted satisfaction. 
Qualitative analysis revealed that course structure, the configuration of collaborative activities, and specific Moodle features (such as notifications, feedback mechanisms, and progress tracking) strongly shape UX. Contributions include: (i) empirical evidence of how collaboration-related aspects influence LMS UX; (ii) a replicable methodology combining quantitative and qualitative UX evaluations; (iii) actionable recommendations for improving Moodle’s support of collaborative learning, enhancing engagement and learning outcomes; and (iv) a set of recommendations for professors configuring Peer Review in Moodle.</p> Romualdo Azevedo Ketlen Lucena Alberto Castro Bruno Gadelha Copyright (c) 2026 Romualdo Azevedo, Ketlen Lucena, Alberto Castro, Bruno Gadelha https://creativecommons.org/licenses/by/4.0 2026-04-22 2026-04-22 32 1 817 838 10.5753/jbcs.2026.5101 Optimal resource allocation in networks of general single-server finite queues https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5178 <p>This study simultaneously considers minimizing the total buffer allocation and the overall service rates in the network while maximizing its throughput. Some algorithms have already been proposed in the literature, but the discussion of efficient alternatives is relevant. We develop a novel approach to multi-objective particle swarm optimization (MO-PSO). We apply this approach to an acyclic, general single-server finite queueing network to optimize throughput. This algorithm was specifically tailored to address the problem, which involves mixed-integer variables and constraints that depend on the current solution, since service rates cannot fall below arrival rates. The proposed approach simultaneously decreases the total buffer allocation and the overall service rate. Consequently, our method yields a suboptimal Pareto set for these conflicting objectives. 
We conducted a computational and experimental study to verify the effectiveness of the proposed approach and to compare it with previously proposed solutions. The insights gained can enhance the design of queue networks.</p> Gabriel L. Souza Anderson R. Duarte Frederico R. B. Cruz Gladston J. P. Moreira Copyright (c) 2026 Gabriel L. Souza, Anderson R. Duarte, Frederico R. B. Cruz, Gladston J. P. Moreira https://creativecommons.org/licenses/by/4.0 2026-04-24 2026-04-24 32 1 919 930 10.5753/jbcs.2026.5178 Multiclass Classification for Detection of GPS Spoofing and Jamming Attacks on UAVs https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5309 <p>Unmanned Aerial Vehicles (UAVs) are increasingly being employed across various domains, making them more vulnerable to a range of attacks, particularly cyber threats. These vehicles usually rely on a global navigation satellite system (GNSS), such as the Global Positioning System (GPS) satellites, for location and navigation data, which can be exploited by adversaries launching attacks using fake GPS signals. To safeguard UAVs from GPS Jamming and GPS Spoofing attacks, this paper proposes an Intrusion Detection System (IDS) that utilizes machine learning techniques for detecting and identifying such attacks. The IDS analyzes GPS signal samples representing normal operation, GPS Jamming, and three types of GPS Spoofing attacks. It relies on machine learning, with models trained and tested for binary class and multiclass classification. The binary class version aims to identify an occurrence of any attack, irrespective of type, as suggested by previous literature. However, the novelty of this work lies in the multiclass version, which enables the identification of attack types — an essential factor in determining the most effective protective measures and providing data for forensic investigations. Stacking, an ensemble machine learning method, yielded the best results, achieving an accuracy rate of 96.91%. 
Furthermore, the proposed multiclass IDS reduced false negatives to 0.71%, leading to an improved IDS that reduces the likelihood of overlooking attacks compared to the binary class version, which is crucial in real UAV deployments.</p> Gustavo Gualberto Rocha de Lemos Rodrigo Augusto Cardoso da Silva Copyright (c) 2026 Gustavo Gualberto Rocha de Lemos, Rodrigo Augusto Cardoso da Silva https://creativecommons.org/licenses/by/4.0 2026-03-17 2026-03-17 32 1 374 386 10.5753/jbcs.2026.5309 Enhancing Red Team Agent Learning with the Kill Chain Catalyst Algorithm in Capture the Flag Scenarios https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5365 <p>With the advancement of technology, tasks once performed by humans have increasingly transitioned to machines or agents equipped with artificial intelligence, including various cyber security domains. From the perspective of real-world cyber attacks, executing actions with minimal failures and steps is critical to reducing the likelihood of exposure. Although research on autonomous cyber attacks predominantly employs Reinforcement Learning (RL), this approach has gaps in scenarios such as limited training data, low resilience in dynamic environments, and limited interpretability of decision-making policies. Therefore, Kill Chain Catalyst (KCC), an <em>RL</em> algorithm based on Gini Impurity-Based Weighted Random Forest that prioritizes interpretability, efficiency in scenarios with limited experience, and resilience in dynamic environments explored by <em>RL</em> agents, has been introduced. <em>KCC</em> leverages decision tree logic for enhanced interpretability and employs a catalyst module inspired by genetic alignment to optimize the search for efficient attack sequences. More than 150 attack experiments were conducted to evaluate learning in terms of offset, speed, and generalization. 
The analysis focused on the steps, rewards, and failures of agents using the RL algorithms <em>KCC</em>, <em>PPO</em>, <em>DQN</em>, <em>TRPO</em>, and <em>A2C</em>, within a <em>Capture the Flag</em> tournament setting. Both static and dynamic scenarios with limited learning experiences were considered. These experiments demonstrate the superior performance of <em>KCC</em>, revealing differences of up to 198.69% for steps, 129.43% for rewards, and 1096.39% for failures when performing attacks using <em>KCC</em> compared with the other algorithms.</p> Antonio Horta Anderson dos Santos Ronaldo Goldschmidt Copyright (c) 2026 Antonio Horta, Anderson dos Santos, Ronaldo Goldschmidt https://creativecommons.org/licenses/by/4.0 2026-03-16 2026-03-16 32 1 289 304 10.5753/jbcs.2026.5365 Identification of Services and Devices for Enhancing Vulnerability Analysis https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5372 <p>The identification of services and devices is an important step in vulnerability analysis, responsible for listing all assets that should be scanned for vulnerabilities. In recent years, this task has become increasingly complex due to the proliferation of new types of services and Internet-connected devices, such as IoT devices. In this context, search engines like Censys and Shodan have become popular tools, often used in one or more stages of the process for scanning network-accessible vulnerabilities. However, while these tools are capable of probing numerous devices worldwide, the information they process is often incomplete, primarily due to the challenge of keeping pace with the rapid creation of new applications. This paper introduces our solution for efficient service enumeration based on fingerprint matching, which can complement existing information about scanned devices. 
Our solution is highly efficient as it leverages the responses from connections established by search engines during their probing as input data, eliminating the need for additional scans. Furthermore, our processing engine is optimized and capable of processing data in parallel. To validate our solution, we compared the information obtained by our framework with that provided by Shodan. Overall, we were able to identify a larger number of services and devices. For instance, our approach increased the identification of services such as operating systems by 1.6 times and hardware information by up to 14 times. Additionally, we present two use cases demonstrating how our framework can assist in vulnerability analysis by providing more accurate and detailed information.</p> Lucas M. Ponce Indra Ribeiro Etelvina Oliveira Ítalo Cunha Cristine Hoepers Klaus Steding-Jessen Marcelo H. P. C. Chaves Dorgival Guedes Wagner Meira Jr. Copyright (c) 2026 Lucas M. Ponce, Indra Ribeiro, Etelvina Oliveira, Ítalo Cunha, Cristine Hoepers, Klaus Steding-Jessen, Marcelo H. P. C. Chaves, Dorgival Guedes, Wagner Meira Jr. https://creativecommons.org/licenses/by/4.0 2026-04-02 2026-04-02 32 1 586 599 10.5753/jbcs.2026.5372 Evaluation of explainable artificial intelligence techniques in the context of credit card fraud detection https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5376 <p>Artificial intelligence has been employed in several applications in the financial sector. This paper deals with one of these applications: fraud detection in credit card transactions. In this context, a number of machine learning algorithms can be used to obtain models which automate the classification of a transaction as fraudulent or genuine. However, some of these machine learning algorithms are not directly interpretable. The current paper presents an evaluation of explainable artificial intelligence techniques SHAP and LIME applied to models for fraud detection in credit card transactions. 
Along with the results of the evaluation, the paper discusses the effectiveness and need for explainable artificial intelligence techniques. This paper extends a previous paper by including hyperparameter tuning, new results and an evaluation of the processing time to obtain explanations. The reported results suggest that SHAP obtains better results than LIME, although LIME required less processing time after obtaining the LIME explainer.</p> Gabriel Mendes de Lima Paulo Henrique Pisani Copyright (c) 2026 Gabriel Mendes de Lima, Paulo Henrique Pisani https://creativecommons.org/licenses/by/4.0 2026-03-25 2026-03-25 32 1 484 497 10.5753/jbcs.2026.5376 Cybersecurity knowledge and behaviors: An exploratory study in Brazil with data mainly from northeast and southeast https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5384 <p>Research about the importance of the human factor in cybersecurity is scarce. Aiming to contribute, we characterized cybersecurity knowledge and behaviors of internet users, assessing their relationship with the Big Five personality traits (openness, conscientiousness, extraversion, agreeableness, and neuroticism); 329 Brazilians, mostly from the North and the Southeast, participated. We observed higher scores in agreeableness and openness, and lower neuroticism. The knowledge level ranged from "moderate to good" and the frequency of cybersecurity behaviors was moderate. We found some weak evidence of an association between personality traits and cybersecurity knowledge and behavior. 
Future studies are needed to include a more diverse sample and improved instruments.</p> Marcelo Henrique Oliveira Henklain Felipe Leite Lobo Eduardo Luzeiro Feitosa Copyright (c) 2026 Marcelo Henrique Oliveira Henklain, Felipe Leite Lobo, Eduardo Luzeiro Feitosa https://creativecommons.org/licenses/by/4.0 2026-04-28 2026-04-28 32 1 1061 1073 10.5753/jbcs.2026.5384 Improved Biclique Cryptanalysis of the Lightweight Cipher FUTURE https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5390 <p>In the past decade, lightweight cryptography has attracted much interest in academia, especially regarding the cryptanalysis of such ciphers. The National Institute of Standards and Technology (NIST) is one of the entities responsible for this interest, given that they promoted in 2019 a public process to choose the American standard for lightweight cryptography. In 2022, the FUTURE cipher was published and has since been the target of much cryptanalysis, including integral, meet-in-the-middle and differential cryptanalysis in a very short period of time. The objective of this paper is to present four biclique attacks that are better than the one previously published, in terms of time, memory and data complexities, obtained through semi-automatic search. Our fastest attack requires 2<sup>124.38</sup> full computations of the cipher to run, while requiring only 2<sup>24</sup> data pairs and negligible memory. We also present the fastest unbalanced biclique attack and star attack to our knowledge. Only one published integral attack on FUTURE is faster than our attacks (2<sup>123.70</sup>) without using the full codebook of data, i.e., fewer than 2<sup>64</sup> pairs of plaintexts/ciphertexts; it requires 2<sup>63</sup> pairs. 
Still, when compared to it, our attacks use much less data while being only slightly slower, which presents a good trade-off.</p> Gabriel de Carvalho Luis Kowada Copyright (c) 2026 Gabriel de Carvalho, Luis Kowada https://creativecommons.org/licenses/by/4.0 2026-03-25 2026-03-25 32 1 539 554 10.5753/jbcs.2026.5390 Building flexible databases by using web services for computer-aided diagnosis of cardiomyopathies: from conceptual definition to usability evaluation https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5424 <p>Computer-aided diagnosis (CAD) systems based on medical images and records apply computational techniques to process data and extract features from them to provide a second opinion to the health professional. A diverse and organized set of images and records is necessary to develop and validate such systems. However, medical data are generally obtained in a non-standardized way. With each new research and development project in this area, specific data models need to be built to organize and standardize these data and enable their use in the construction of models and computational systems. This article presents a flexible and generic database modeled and implemented to persist Cardiac Magnetic Resonance exams aiming to support the development of CAD schemes of cardiomyopathies. Furthermore, a web application was developed to enable data search and retrieval from the database. An experiment was carried out to evaluate the interface usability of the web application. Results showed that it is possible to develop a generic and flexible DB model, which can be used in several CAD applications. Additionally, the implemented interface received positive evaluations on its functionalities and usability, and users were capable of performing the intended tasks with correct outcomes.</p> Larissa Terto Alvim Vagner Mendonça Gonçalves Fátima L. S. Nunes Copyright (c) 2026 Larissa Terto Alvim, Vagner Mendonça Gonçalves, Fátima L. S. 
Nunes https://creativecommons.org/licenses/by/4.0 2026-03-20 2026-03-20 32 1 424 442 10.5753/jbcs.2026.5424 Survey of Brazilian Open Budget Data Portals: Query Interfaces and Dashboards https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5449 <p>To promote transparency, the Brazilian government provides access to public data through web portals featuring query interfaces and dashboards. While query interfaces are used by more experienced users to gather data for further analyses, dashboards that include visualizations help a broader audience consult and explore data. A domain of particular complexity that benefits from the use of these interfaces is government spending and budgets. This study analyzes dashboards and query interfaces of government budget data through qualitative research based on a survey. Focusing on Brazil's budget transparency initiative, we examined 83 interfaces in total: 30 dashboards and 53 query interfaces from federal, state, and major city governments. This survey assesses these interfaces using design patterns for general-purpose dashboards and design principles for open government data dashboards. Our findings reveal a critical weakness: while most portals provide access to budget data, they largely neglect user-centered design, failing to provide the necessary context or consider the data literacy of their audience. This creates a significant "transparency gap" that undermines genuine accountability and demonstrates the need for a fundamental shift in the design of these essential public tools.</p> Kaline B. F. Mesquita Dennis G. Balreira Andre S. Spritzer Carla M. D. S. Freitas Copyright (c) 2026 Kaline B. F. Mesquita, Dennis G. Balreira, Andre S. Spritzer, Carla M. D. S. 
Freitas https://creativecommons.org/licenses/by/4.0 2026-03-25 2026-03-25 32 1 498 515 10.5753/jbcs.2026.5449 Leveraging Fog and Cloud Computing for Continuous Health Monitoring and Data Processing: An Architecture for Outdoor Environments and Variable Connectivity https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5471 <p>A multilayer architecture was developed for real-time health data collection and processing, designed for outdoor environments with high population density and significant network interference. By integrating fog and cloud computing, the system addresses the growing demand for continuous health monitoring driven by the proliferation of Internet of Things (IoT) devices and Wireless Body Area Networks (WBANs) using smartbands. Traditional cloud-centric solutions often face challenges such as high latency and data integrity issues in unstable network conditions. The proposed architecture overcomes these limitations by employing fog computing for edge data preprocessing, reducing reliance on cloud connectivity and enhancing system responsiveness. The architecture was originally evaluated under diverse network conditions (3G, 4G, 5G) and in real-world scenarios such as football stadiums, metro systems, and urban beaches, demonstrating over 96% packet delivery success and significant latency reductions compared to cloud-only approaches. In this extended version, additional real-world scenarios are analyzed, including domestic flights, large-scale events in stadiums with over 60,000 attendees, and new evaluations along urban beachfronts. Furthermore, this version provides a more detailed explanation of key mechanisms, such as the use of the Transactional Outbox pattern to ensure data consistency in unstable networks and the integration of distributed processing techniques for real-time alert generation. 
These contributions offer deeper insights into the architecture’s scalability and reliability, confirming its effectiveness in maintaining data integrity and achieving low latency in connectivity-challenged environments, providing a solution for health monitoring.</p> Juan Felipe Souza Oliveira Paulo Cesar Salgado Vidal Ronaldo Moreira Salles Marcelo Quesado Filgueiras Copyright (c) 2026 Juan Felipe Souza Oliveira, Paulo Cesar Salgado Vidal, Ronaldo Moreira Salles, Marcelo Quesado Filgueiras https://creativecommons.org/licenses/by/4.0 2026-04-28 2026-04-28 32 1 800 816 10.5753/jbcs.2026.5471 Subspace representations in deep neural networks: A survey https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5482 <p>Computer vision applications often involve processing large-scale multidimensional data, requiring methods that are both efficient and accurate. Traditional pattern recognition methods based on subspace representations offer low computational complexity but typically underperform compared to deep learning models in terms of recognition accuracy. This study aims to explore and analyze the integration of subspace representations within deep learning frameworks to leverage the advantages of both approaches. We conducted a comprehensive survey of existing methods that combine subspace representation techniques with deep neural networks. We propose a taxonomy to categorize these methods into three distinct groups based on their integration strategies. The reviewed methods demonstrate that incorporating subspace representations can enhance the performance and efficiency of deep learning models. The taxonomy helps to clarify the landscape of these hybrid approaches and identifies trends in methodological development. 
The surveyed approaches demonstrate a clear methodological evolution, contributing to enhanced outcomes in various real-world applications.</p> Stéfane Rêgo Gandra Bernardo Bentes Gatto Eulanda Miranda dos Santos Copyright (c) 2026 Stéfane Rêgo Gandra, Bernardo Bentes Gatto, Eulanda Miranda dos Santos https://creativecommons.org/licenses/by/4.0 2026-03-27 2026-03-27 32 1 568 585 10.5753/jbcs.2026.5482 Lattice Basis Reduction Attack on Matrix NTRU https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5486 <p>NTRU is one of the most important post-quantum cryptosystems nowadays and since its introduction several variants have been proposed in the literature. In particular, the Matrix NTRU is a variant which replaces the NTRU polynomials by integer matrices. In this work, we develop a lattice-based reduction attack on the Matrix NTRU cryptosystem that allows us to recover the plaintext. We also show that this system is completely vulnerable to the proposed attack for parameters that could be used in practice. We show that this practical attack can also be extended by reducing the lattice dimension. In addition, we give sufficient conditions to avoid decryption failure for the Matrix NTRU.</p> Thiago do Rego Sousa Tertuliano Carneiro de Souza Neto Copyright (c) 2026 Thiago do Rego Sousa, Tertuliano Carneiro de Souza Neto https://creativecommons.org/licenses/by/4.0 2026-04-24 2026-04-24 32 1 949 951 10.5753/jbcs.2026.5486 A Novel Forgetting Technique with Random Walk Sampling for Scalable and Adaptive Stream-Based Recommender Systems https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5496 <p>The explosion of user-generated data at fast rates in online services leads to the need for designing scalable recommender systems that are able to learn from data streams. Stream-based recommender systems are specifically devised for these scenarios, and have seen a recent increase in interest. 
These systems rely on incremental approaches that incorporate newly generated data in a single pass, resulting in a model that is always up-to-date. A known limitation of only incorporating data into a model is the presence and effect of old data, which negatively affects predictive performance and eventually raises scalability issues. Therefore, an explicit mechanism to forget such data and remove it from the model is required. In this work, we present a graph-based recommender system that recommends items based on random walk sampling, and simultaneously includes new information while also forgetting obsolete information. Information obtained from random walk sampling is not only used to recommend relevant items, but also to capture structural information from the graph. We devise a forgetting function that prunes obsolete edges based on this information, and also on the recency, popularity and acceptance ratio of items. Our experiments highlight the importance of forgetting obsolete information and suggest the effectiveness of our method, which leads to scalability, accuracy and diversity improvements.</p> Murilo F. L. Schmitt Eduardo J. Spinosa Copyright (c) 2026 Murilo F. L. Schmitt, Eduardo J. Spinosa https://creativecommons.org/licenses/by/4.0 2026-05-11 2026-05-11 32 1 1343 1365 10.5753/jbcs.2026.5496 EXSS: an Educational Emulator for Cross-Site Scripting Attacks https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5521 <p>This article proposes a Cross-Site Scripting (XSS) attack emulator for learning in cybersecurity. The emulator allows users to identify a website vulnerable to XSS attacks in a controlled environment. The identification of vulnerabilities is achieved through activities that consist of a theoretical introduction to the topic, followed by practical procedures for conducting XSS vulnerability tests on a Web server running on a virtual machine. Activities are developed for different levels of knowledge. 
The particularity of the proposed emulator is its educational approach, and its goal is to raise awareness among undergraduate students and professionals to develop less vulnerable websites.</p> Bianca Guarizi Isabela Alves Júlia Fernandez e Souza Guilherme Pimentel João André Watanabe Dalbert Mascarenhas Ian Bastos Marcelo Rubinstein Igor Moraes Copyright (c) 2026 Bianca Guarizi, Isabela Alves, Júlia Fernandez e Souza, Guilherme Pimentel, João André Watanabe, Dalbert Mascarenhas, Ian Bastos, Marcelo Rubinstein, Igor Moraes https://creativecommons.org/licenses/by/4.0 2026-05-04 2026-05-04 32 1 1144 1154 10.5753/jbcs.2026.5521 Empathic Games and Older Adults: A Systematic Literature Review on Empathic Gaming and Aging Populations https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5543 <p>Empathic games simulate real-life challenges to foster players’ understanding of complex situations, such as daily medical practice, financial stress, or coping with grief. While these games have been used to support physical and cognitive health across age groups, little is known about older adults’ engagement with them. To understand the breadth and length of this knowledge gap, a systematic literature review (SLR) following a structured PRISMA protocol was performed. Papers indexed in the Scopus, Web of Science, PubMed, IEEE, DBLP and ACM databases were examined, with no restriction on publication date, using keywords related to “empathic games” and “older adults.” Studies were included if they investigated digital games with empathetic elements targeting adults aged 60+. Two HCI specialists independently screened titles, abstracts, and full texts in a two-round process, applying predefined inclusion/exclusion criteria. Of 205 identified records, 15 met the final criteria. Findings suggest empathic games can positively influence issues such as loneliness, depression, and family dynamics among senior citizens. 
However, significant gaps remain regarding usability needs, player preferences, and profiles. These results highlight the need for further research to guide the effective design of empathetic games for older adults and to explore sensitive emotional topics.</p> Vinicius Ferreira Galvão Aline Elias Cardoso Verhalen Kamila Rios da Hora Rodrigues Copyright (c) 2026 Vinicius Ferreira Galvão, Aline Elias Cardoso Verhalen, Kamila Rios da Hora Rodrigues https://creativecommons.org/licenses/by/4.0 2026-05-04 2026-05-04 32 1 1187 1206 10.5753/jbcs.2026.5543 EasyGuard: A Gamified App for Generating Strong and Memorable Passwords https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5545 <p>Although the use of online services has increased substantially over the past decade, the strength of user-created passwords has remained at concerning levels. This study aimed to develop and evaluate the efficiency of a gamified application in promoting the behavior of designing strong passwords. Two rounds of experiments were conducted, each lasting nine days. In the first experiment (<em>n</em> = 10), we evaluated the passwords generated based on user inputs compared to random passwords. Our findings showed that our app generated passwords with an improvement of 68.43 percentage points in the memorization test, 4.87 p.p. in the typing test, and 60.38 p.p. in the combined memorization and typing test. In the second experiment (<em>n</em> = 15), we incorporated a dictionary-based password generation policy into the evaluation and applied an automated tool for data collection. User input-based passwords outperformed random ones by 87.26 p.p. in the memorization test, 2.75 p.p. in the typing test, and 85.92 p.p. in the combined test. Meanwhile, dictionary-based passwords showed improvements of 54.32 p.p., 1.69 p.p., and 69.70 p.p., respectively. Our approach proved promising in promoting strong and memorable passwords. 
Nonetheless, EasyGuard requires further development and should be further investigated in future studies.</p> Hugo L. Romão Marcelo H. O. Henklain Felipe L. Lobo Eduardo L. Feitosa Copyright (c) 2026 Hugo L. Romão, Marcelo H. O. Henklain, Felipe L. Lobo, Eduardo L. Feitosa https://creativecommons.org/licenses/by/4.0 2026-03-17 2026-03-17 32 1 408 423 10.5753/jbcs.2026.5545 High-Performance Elliptic Curve Cryptography: A SIMD Approach to Modern Curves (Thesis Distillation) https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5548 <p>Cryptography based on elliptic curves is endowed with efficient methods for public-key cryptography. Recent research has shown the superiority of the Montgomery and Edwards curves over the Weierstrass curves as they require fewer arithmetic operations. Using these modern curves has, however, introduced several challenges to the cryptographic algorithm's design, opening up new opportunities for optimization. Our main objective is to propose algorithmic optimizations and implementation techniques for cryptographic algorithms based on elliptic curves. In order to speed up the execution of these algorithms, our approach relies on the use of extensions to the instruction set architecture. In addition to those specific for cryptography, we use extensions that follow the Single Instruction, Multiple Data (SIMD) parallel computing paradigm. In this model, the processor executes the same operation over a set of data in parallel. We investigated how to apply SIMD to the implementation of elliptic curve algorithms. As part of our contributions, we design parallel algorithms for prime field and elliptic curve arithmetic. We also design a new three-point ladder algorithm for the scalar multiplication <em>P+kQ</em>, and a faster formula for calculating <em>3P</em> on Montgomery curves. These algorithms have found applicability in isogeny-based cryptography. 
Using SIMD extensions such as SSE, AVX, and AVX2, we develop optimized implementations of the following cryptographic algorithms: X25519, X448, SIDH, ECDH, ECDSA, EdDSA, and qDSA. Performance benchmarks show that these implementations are faster than existing implementations in the state of the art. Our study confirms that using extensions to the instruction set architecture is an effective tool for optimizing implementations of cryptographic algorithms based on elliptic curves. May this be an incentive not only for those seeking to speed up programs in general but also for computer manufacturers to include more advanced extensions that support the increasing demand for cryptography.</p> Armando Faz-Hernandez Julio López Copyright (c) 2026 Armando Faz-Hernandez, Julio López https://creativecommons.org/licenses/by/4.0 2026-03-25 2026-03-25 32 1 516 526 10.5753/jbcs.2026.5548 Web xKaliBurr: An Online Platform for Information Gathering in Pentest for Internet Applications https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5550 <p>The Information Gathering stage in web Pentests is crucial as it lays the foundation for all subsequent activities. However, comprehensive information gathering requires the manual use of various tools that demand advanced technical knowledge. In this context, we propose Web xKaliBurr, an open-source web tool that automates the information gathering stage of web Pentest. With a simple and user-friendly interface, the proposed tool performs extensive scans from the site's URL, providing a wide range of information and recommendations, allowing users without advanced knowledge to assess their site's security and detect potential flaws or vulnerabilities. To evaluate Web xKaliBurr, we applied the System Usability Scale (SUS) questionnaire to measure aspects of usability in accordance with the user's subjective assessment and the Net Promoter Score (NPS) method to measure user satisfaction and willingness to recommend it to others. 
This study involved 10 respondents. The SUS score was 80, which indicates a good to excellent product, and the NPS reached 70%, reflecting a very good level of user satisfaction. In addition, we performed an evaluation with 3 experts in web Pentests.</p> Daniel R. Barros Lucas Cabral João V. A. Oliveira Felipe M. Castro Lucas L. Soares José M. Monteiro Joaquim Bento Lincoln S. Rocha Copyright (c) 2026 Daniel R. Barros, Lucas Cabral, João V. A. Oliveira, Felipe M. Castro, Lucas L. Soares, José M. Monteiro, Joaquim Bento, Lincoln S. Rocha https://creativecommons.org/licenses/by/4.0 2026-04-15 2026-04-15 32 1 700 714 10.5753/jbcs.2026.5550 Redefining Digital Web Signature Secrecy: A Client-Side Model for Enhanced Security and Compliance https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5556 <p>Web-based digital signature platforms prioritize convenience but often compromise secrecy, exposing sensitive documents and private keys to third-party systems. This paper introduces a client-side cryptographic model that eliminates such vulnerabilities by performing all cryptographic operations within the user's browser. Leveraging One-Time Certificates and adhering to Claude Shannon's secrecy principles, the model ensures that documents and keys remain secure by never leaving the client environment. The proposed approach addresses critical risks, including document exposure, metadata leakage, and key compromise, while maintaining compatibility with public key infrastructure standards and legacy systems. Performance evaluations show efficient signing and verification processes, with documents up to 5 MB signed in approximately 1 second and verified in 0.15 seconds. By removing reliance on external servers for sensitive operations, the model mitigates platform vulnerabilities, reduces liability, and ensures compliance with regulations like GDPR and LGPD. 
Key contributions include enhanced secrecy, simplified key management, and scalable real-world use-case performance. This work redefines digital signature security, offering a robust, privacy-preserving alternative for confidential document signature and verification workflows.</p> Wellington Fernandes Silvano Lucas Mayr Enzo Brum Gabriel Cabral Frederico Schardong Ricardo Custódio Copyright (c) 2026 Wellington Fernandes Silvano, Lucas Mayr, Enzo Brum, Gabriel Cabral, Frederico Schardong, Ricardo Custódio https://creativecommons.org/licenses/by/4.0 2026-04-22 2026-04-22 32 1 888 905 10.5753/jbcs.2026.5556 A Triad of Defenses to Mitigate Poisoning Attacks in Federated Learning https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5558 <p>Federated learning (FL) enables the training of machine learning models on decentralized data, potentially improving data privacy. However, the FL distributed architecture is vulnerable to poisoning attacks. In this paper, we propose an FL method capable of mitigating these attacks through a triad of defense strategies: organizing clients into groups, evaluating the local performance of global models during training, and using a voting scheme during the inference phase. The proposed approach first divides the clients into randomly sampled groups, each generating a distinct global model. Each client trains a local model on their private data and submits it to the central server. The central server aggregates the local models within each group to generate the global models. Then, each client receives all global models, selects the best performing one as their new local model, and the process repeats until training is complete. During the inference phase, each client classifies its inputs according to a majority-based voting scheme among the global models. 
Our experiments using the HAR and MNIST datasets demonstrate that our method can effectively mitigate poisoning attacks without compromising the global model's performance.</p> Blenda Oliveira Mazetto Bruno Bogaz Zarpelão Copyright (c) 2026 Blenda Oliveira Mazetto, Bruno Bogaz Zarpelão https://creativecommons.org/licenses/by/4.0 2026-03-16 2026-03-16 32 1 316 331 10.5753/jbcs.2026.5558 Partial integrity, authenticity and belongingness using modification-tolerant signature schemes https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5565 <p>Digital signatures allow us to ensure that the signed digital data is authentic and has not been modified. However, even a single bit modification in the data invalidates the entire signature. In INDOCRYPT '19, Idalino et al. presented an efficient modification-tolerant signature scheme (MTSS) framework using combinatorial group testing techniques, allowing the location and correction of modified parts of the signed data. In this work, we implement their framework and discuss the practical performance of the solution. We also propose various necessary auxiliary algorithms not explored in the initial work, such as the division of data into blocks and the generation of the underlying combinatorial structure needed for the signature generation. Moreover, we propose a novel use case of the framework, which we call the <em>belongingness framework</em>. This scheme allows the verification of the integrity and authenticity of a subset of the signed data without having access to the whole data. 
This is particularly interesting in big data applications, where access to the whole signed data is prohibitive due to storage limitations.</p> Anthony Bernardo Kamers Gustavo Zambonin Thaís Bardini Idalino Paola de Oliveira Abel Jean Everson Martina Copyright (c) 2026 Anthony Bernardo Kamers, Gustavo Zambonin, Thaís Bardini Idalino, Paola de Oliveira Abel, Jean Everson Martina https://creativecommons.org/licenses/by/4.0 2026-03-16 2026-03-16 32 1 343 362 10.5753/jbcs.2026.5565 A scheme based on Proof-of-Download blockchain for Digital Rights Management and Traitor Detection https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5568 <p>The licensing of digital assets is an important aspect of modern markets like music and movies. In this context, Digital Rights Management (DRM) and Traitor Detection (TD) comprise a set of tools, techniques, and processes that try to ensure control of the use of licensed media. This work contributes to this area by making it possible for DRM and TD to happen in a public and highly decentralized blockchain under a proposed threat model by developing new cryptographic techniques, which, as shown by the results obtained, are very efficient without requiring expensive hardware.</p> João Tito do Nascimento Silva Felipe Z. da N. Costa João Gondim Copyright (c) 2026 João Tito do Nascimento Silva, Felipe Z. da N. Costa, João Gondim https://creativecommons.org/licenses/by/4.0 2026-04-15 2026-04-15 32 1 715 732 10.5753/jbcs.2026.5568 LearnVis: Analyzing Higher Education Student Performance through Information Visualization Techniques https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5613 <p>Understanding educational challenges in higher education requires a detailed analysis of the variables related to academic performance, such as grades, attendance, and student engagement. This analysis is crucial for identifying critical factors that affect learning and student retention. 
In this context, this study introduces <em>LearnVis</em>, a visual analytics system developed to analyze student performance in higher education. The system was designed to utilize data from university academic records, including grades, attendance, and engagement with specific topics. It offers a set of coordinated layouts that enable the analysis of both student groups and individuals. These layouts allow users to explore the structure of course modules, considering the topics they comprise and their sequence throughout the course. Additionally, the system facilitates the analysis of student behavior in each module, including their attendance in the topics covered, their respective grades, and the tracking of multiple attempts in specific modules, as well as their completion sequences. To evaluate its effectiveness, the <em>LearnVis</em> system was applied to a dataset of 1,490 students from the Computer Information Systems program at the Federal University of Uberlândia, covering the period from 2009 to 2019. The results demonstrate that the system provides valuable insights that contribute to improving academic performance and decreasing student retention.</p> Angélica Gomes Oliveira Paulo Henrique Ribeiro Gabriel José Gustavo de Souza Paiva Copyright (c) 2026 Angélica Gomes Oliveira, Paulo Henrique Ribeiro Gabriel, José Gustavo de Souza Paiva https://creativecommons.org/licenses/by/4.0 2026-04-02 2026-04-02 32 1 600 616 10.5753/jbcs.2026.5613 Synthetic Data: AI's New Weapon Against Android Malware https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5655 <p>The ever-increasing number of Android devices and the accelerated evolution of malware, reaching over 35 million samples by 2024, highlight the critical importance of effective detection methods. Attackers are now using Artificial Intelligence to create sophisticated malware variations that can easily evade traditional detection techniques. 
Although machine learning has shown promise in malware classification, its success relies heavily on the availability of up-to-date, high-quality datasets. The scarcity and high cost of obtaining and labeling real malware samples present significant challenges in developing robust detection models. In this paper, we propose MalSynGen, a Malware Synthetic Data Generation methodology that uses a conditional Generative Adversarial Network (cGAN) to generate synthetic tabular data. This data preserves the statistical properties of real-world data and improves the performance of Android malware classifiers. We evaluated the effectiveness of this approach using various datasets and metrics that assess the fidelity of the generated data, its utility in classification, and the computational efficiency of the process. Our experiments demonstrate that MalSynGen can generalize across different datasets, providing a viable solution to address the issues of obsolescence and low-quality data in malware detection.</p> Angelo Gaspar Diniz Nogueira Kayua Oleques Paim Hendrio Bragança Rodrigo Brandão Mansilha Diego Kreutz Copyright (c) 2026 Angelo Gaspar Diniz Nogueira, Kayua Oleques Paim, Hendrio Bragança, Rodrigo Brandão Mansilha, Diego Kreutz https://creativecommons.org/licenses/by/4.0 2026-04-28 2026-04-28 32 1 1047 1060 10.5753/jbcs.2026.5655 Comprehensive Evaluation of Hybrid and XAI-Based Feature Selection for Intrusion Detection: A Smart City Perspective https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5664 <p>The expanding connectivity within smart cities has dramatically increased the attack surface, posing significant challenges for Intrusion Detection Systems (IDSs). A critical aspect of effective IDSs is the selection of relevant features to accurately identify potential attackers. 
Traditional feature selection methods, including filter-based (fast but potentially less accurate), wrapper-based (accurate but computationally intensive), and embedded (classifier-dependent) approaches, each present inherent limitations. Recent advancements propose alternative strategies, such as using Explainable Artificial Intelligence (XAI) algorithms to enhance filtering techniques, hybridizing filter and wrapper methods, and combining these strategies to optimize performance. However, a systematic evaluation of these novel feature selection methods within the context of diverse smart city environments remains largely unexplored. This work presents a comprehensive assessment of feature selection techniques for IDSs in multiple Smart City domains, including healthcare and transportation. Our analysis focuses on evaluating the trade-offs between classification performance and feature reduction achieved by hybrid (IWSHAP), metaheuristics (GRASPQ-FS) and XAI-based approaches (SHAP Ranking). The experimental results indicate that XAI-based methods achieve a favorable trade-off between dimensionality reduction and predictive performance, consistently preserving high F1-Scores (often exceeding 90%) while simultaneously reducing the feature set by substantial margins (e.g., over 90%). Although metaheuristic approaches can achieve superior feature reduction, they often require meticulous tuning to prevent performance degradation. This study underscores the potential of XAI-driven feature selection to enhance IDSs' effectiveness within complex Smart City ecosystems.</p> Felipe N. Dresch Felipe H. Scherer Matheus M. Ciocca Vagner E. Quincozes Silvio E. Quincozes Diego Kreutz Copyright (c) 2026 Felipe N. Dresch, Felipe H. Scherer, Matheus M. Ciocca, Vagner E. Quincozes, Silvio E. 
Quincozes, Diego Kreutz https://creativecommons.org/licenses/by/4.0 2026-04-16 2026-04-16 32 1 733 749 10.5753/jbcs.2026.5664 Support for Families of Children and Adolescents with Attention-Deficit/Hyperactivity Disorder: A Technological Solution Based on Persuasive Strategies https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5670 <p>Attention-Deficit/Hyperactivity Disorder (ADHD) is a neurodevelopmental disorder characterized by inattention, hyperactivity, and impulsivity. The disorder can directly impact the organization, focus, and self-regulation of children and adolescents, posing challenges for both diagnosed individuals and their families. Although several technological interventions exist for this population, many are abandoned before their benefits are realized. Given this scenario, this study investigates the difficulties families experience in organizing their routines in the context of ADHD and explores how Human-Computer Interaction (HCI) principles, combined with persuasive strategies, can make these solutions more effective. The research gathered data from the literature and from interactions with parents/caregivers and children/adolescents with ADHD, identifying daily challenges, reward strategies used, and desired features in a routine support app. The findings guided the design of a functional prototype with elements such as reminders and positive reinforcement, which was used to assess its usefulness and relevance from the perspective of the participating families. The contribution to the field of HCI lies in the explicit application of theoretical models of persuasive technologies in a context of neurodivergence, offering conceptual and practical guidelines for the design of digital solutions that are more engaging, ethical, and aligned with the family care ecosystem in ADHD.</p> Caroline R. S. Jandre Fernando C. S. Dal' Maria Débora M. de Miranda Cristiane N. Nobre Copyright (c) 2026 Caroline R. S. Jandre, Fernando C. S. 
Dal' Maria, Débora M. de Miranda, Cristiane N. Nobre https://creativecommons.org/licenses/by/4.0 2026-04-28 2026-04-28 32 1 1021 1046 10.5753/jbcs.2026.5670 A survey of social media stance detection using non-textual features https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5687 <p>Stance detection is known as the computational task of estimating an individual's attitude towards a given target topic, which is often of a political or moral nature. In traditional NLP fashion, models of this kind have relied mainly on learning features extracted from social media text. However, social media may provide many other types of non-content information in conjunction with text, such as friends networks, interactions with other users, etc. These knowledge sources, despite being potentially useful for stance prediction, remain relatively little discussed in existing surveys of the field. To fill this gap in the literature, this article presents a survey of stance detection research focusing on the use of network-related features and on how these are combined with more standard text models.</p> Laís Carraro Leme Cavalheiro Ivandré Paraboni Copyright (c) 2026 Laís Carraro Leme Cavalheiro, Ivandré Paraboni https://creativecommons.org/licenses/by/4.0 2026-03-25 2026-03-25 32 1 527 538 10.5753/jbcs.2026.5687 Turbocharging Brazilian Mergers and Acquisitions: Questions & Answers Evaluation https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5703 <p>Economic power abuse is a concern in Brazil, where the Administrative Council for Economic Defense (CADE) combats anti-competitive behaviors to ensure fair competition. Artificial intelligence (AI) can aid CADE by identifying and extracting relevant information from technical reports published in Brazilian Portuguese, improving the detection and prevention of economic abuse. 
This paper presents a case study using AI to improve regulatory reviews of CADE documents via a Retrieval-Augmented Generation (RAG) pipeline architecture. Our key contribution is the creation of a specialized Questions &amp; Answers benchmark dataset and a pipeline evaluation methodology, providing a standardized framework for Portuguese-language regulatory document analysis. A chain of thought (CoT) approach was used for problem solving. It leverages the RAG retrieval mechanism to access relevant information and incorporates the sequential reasoning of the CoT framework to generate responses that follow a logical flow of ideas, thus enhancing response accuracy. A vector database employing cosine similarity was used to retrieve the main arguments combined with metadata filters, reducing hallucinations and improving the Large Language Model (LLM) performance. RAG metrics were then combined with a robust human fact-check assessment to validate the pipeline. Our findings establish a new benchmark for Questions &amp; Answers evaluation in Brazilian Mergers and Acquisitions, demonstrating that the proposed strategy effectively enhances the analysis of organizational merger and acquisition reports, unlocking substantial benefits for society.</p> Francis Spiegel Rubin Pedro Nuno de Souza Moura Adriana Cesario de Faria Alvim Copyright (c) 2026 Francis Spiegel Rubin, Pedro Nuno de Souza Moura, Adriana Cesario de Faria Alvim https://creativecommons.org/licenses/by/4.0 2026-03-17 2026-03-17 32 1 387 407 10.5753/jbcs.2026.5703 Mind the Gap Between UX Data and Visualization Proposals: Analyzing User Dissatisfaction and Driving Interactive System Improvements https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5741 <p>Software professionals typically analyze user experience data (i.e., UX data) to identify positive and negative aspects of user interactions with software. 
While the scientific literature has proposed various UX data visualization approaches, these methods are rarely evaluated by software professionals (i.e., designers and developers) to determine their practical effectiveness in understanding user satisfaction and dissatisfaction. Moreover, professionals often adapt visualization techniques based on their own practical experiences, contributing to a wealth of informal knowledge published in UX blogs and websites (known as grey literature). To consolidate this dispersed knowledge, we conducted an analysis of 144 grey literature articles that discussed UX data definitions and visualization techniques. Our findings revealed three key components that support the investigation of user dissatisfaction: the visualization approach, the purpose of using the visualization, and UX data definitions. To validate the relevance of this three-leg approach, we conducted a study with 31 software professionals, using five UX data visualizations focused on a mobile airline ticketing application. We selected this domain to ensure familiarity and minimize the need for contextual interpretation. Through an online questionnaire, participants provided insights on how well the visualizations helped them identify aspects of the application contributing to user dissatisfaction. The results confirmed the practical value of the three-leg approach, with 78% of positive feedback regarding its effectiveness in UX data analysis. Additionally, 84% of participants acknowledged that the proposed visualizations could be integrated into their daily workflows. In this extended study, we introduced a new research question (i.e., RQ3) to investigate how UX data visualizations can support system improvements beyond identifying dissatisfaction. To address this, we conducted a new analysis of the grey literature, focusing on the potential benefits that UX data visualizations can bring to interactive systems. 
We also reanalyzed the collected study data, incorporating participant comments on how UX data visualizations could enhance software development processes. The comparison between the new findings and participant feedback highlighted four key themes: understanding users, improving system interaction, fostering self-knowledge about the system, and handling development resources efficiently. These insights reinforced the role of UX data visualizations in bridging the gap between theoretical models and practical applications for software professionals. Our findings update prior conclusions and discussions, demonstrating the broader impact of UX data visualizations beyond dissatisfaction analysis, extending to strategic decision-making, usability improvements, and software evolution.</p> Maylon Macedo Luciana Zaina Copyright (c) 2026 Maylon Macedo, Luciana Zaina https://creativecommons.org/licenses/by/4.0 2026-04-29 2026-04-29 32 1 1074 1089 10.5753/jbcs.2026.5741 Why Brazilian Organizations Invest in Accessibility? A Strategic Analysis of Motivations, Policies, Benefits and Barriers https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5744 <p>Accessibility is a key requirement in any software system that requires user interaction since it is concerned with the development of products and services that can be used by a wide range of users, regardless of their physical or cognitive abilities. Many studies were conducted to understand the lack of accessibility observed in many digital products and concluded that the lack of knowledge and awareness, and the lack of management commitment and executive support were the main factors that hinder accessibility adoption in the software development process, suggesting that many organizations do not recognize the business importance of accessibility in digital markets. On the other hand, some organizations consistently invest in accessibility. 
In this manuscript, we present a study conducted with professionals involved in the decision-making processes of some of those organizations to uncover their strategic views on the development of accessible products, mainly regarding their motivations, policies, barriers, and benefits obtained from investing in accessibility. To accomplish our goals, we adopted an exploratory sequential mixed-methods design, starting with a qualitative study to explore the views of participants, followed by a quantitative study designed based on factors identified in the first study. In total, 15 professionals participated in our qualitative study and 31 professionals participated in our quantitative study. Our findings suggest that in these organizations: i) accessibility demands are mostly generated by internal policies and cultures in tactical and strategic layers; ii) organizations are mostly driven by a mix of ethical, business, and regulatory factors, such as promoting digital inclusion, increasing brand reputation, and complying with current legislation; iii) policies to ensure accessibility adoption mostly include hiring people with disabilities, making accessibility validation mandatory with well-defined processes, and including accessibility in the definition of done (DoD) for all features; iv) perceived benefits are primarily associated with stronger brand reputation, better software quality and increased digital inclusion; v) the most critical barriers to accessibility adoption are linked to the lack of awareness and knowledge among stakeholders and the technical team, in addition to the lack of commitment across management levels and struggles with legacy systems.</p> Marcelo Medeiros Eler Luis André da Silveira Isabel Francine Mendes Giovanne Bertotti Wajdi Aljedaani Copyright (c) 2026 Marcelo Medeiros Eler, Luis André da Silveira, Isabel Francine Mendes, Giovanne Bertotti, Wajdi Aljedaani https://creativecommons.org/licenses/by/4.0 2026-04-24 2026-04-24 32 1 839 858 
10.5753/jbcs.2026.5744 Roll Like Thunder: Expert Validation and Refinement of a Design Process to Convey Emotions in Music Visualizations https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5747 <p>Music evokes complex emotions and enhances engagement across interactive media. Music visualizations (graphical representations of sound) have gained relevance in games, virtual environments, and digital art. Yet, many existing approaches fail to convey emotional nuance, focusing instead on structural or reactive aspects of sound. To support the creation of emotionally engaging visualizations, we previously introduced Thunder, a design process grounded in a Research Through Design (RtD) approach. While its initial application showed creative potential, it also highlighted opportunities to improve structural clarity, enhance guidance for designers, and increase adaptability across varied contexts. In this paper, we present a refined version of Thunder, developed through an expert review with specialists in Human-Computer Interaction, Music, and Computing. The updated process is organized into three core phases (Conceptualization, Prototyping, and Evaluation), each offering clearer support for emotional, aesthetic, and musical decision-making. Implementation is now framed as an external and adaptable step, allowing flexibility across technical scenarios. 
The refined Thunder offers an improved foundation and design guidance for creating emotionally resonant music visualizations in diverse contexts.</p> Caio Nunes Ticianne Darin Copyright (c) 2026 Caio Nunes, Ticianne Darin https://creativecommons.org/licenses/by/4.0 2026-04-24 2026-04-24 32 1 931 948 10.5753/jbcs.2026.5747 BioNestedNER: A Hybrid Language Model Approach for Recognizing Nested, Discontinuous, and Multi-Type Named Entities https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5790 <p>Named Entity Recognition (NER) is essential in Natural Language Processing (NLP) for extracting pertinent information from unstructured data. Traditional NER approaches assume continuous and non-overlapping entities, which can be limiting in real-world scenarios. This research introduces <strong>BioNestedNER</strong>, a hybrid method for nested, discontinuous, and multi-type entity recognition, with a focus on clinical and biomedical domains. Our approach employs a language model (encoder-only Transformer-based model) using a machine reading comprehension strategy, treating NER as a question-answering-like task. A Conditional Random Field also addresses multi-label sequence labeling for handling nested entities as multi-type entities. Evaluation in Portuguese demonstrated state-of-the-art performance in micro F1-Scores across two clinical corpora. In <em>NestedClinBr</em>, featuring nested and discontinuous entities, our method achieved an F1-Score of 0.863, surpassing the second-place result by 2.1%. In <em>SemClinBr</em>, with multi-type entities, an F1-Score of 0.782 was achieved, surpassing the second-place result by 11.5%. This paper also presents a new clinical corpus in Brazilian Portuguese annotated with nested and discontinuous entities, offering a valuable resource for developing and evaluating models handling these complex entities. 
In conclusion, BioNestedNER presents an adaptable and effective NER solution for nested, discontinuous, and multi-type entities, with the potential to benefit various clinical applications.</p> Elisa Terumi Rubel Schneider Yohan Bonescki Gumiel Paloma Martínez Claudia Moro Emerson Cabrera Paraiso Copyright (c) 2026 Elisa Terumi Rubel Schneider, Yohan Bonescki Gumiel, Paloma Martínez, Claudia Moro, Emerson Cabrera Paraiso https://creativecommons.org/licenses/by/4.0 2026-04-05 2026-04-05 32 1 635 648 10.5753/jbcs.2026.5790 "Mamma mia! Autocorrection ruined my message": Interaction design challenges in a world of multilinguals https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5821 <div class="page" title="Page 1"> <div class="layoutArea"> <div class="column"> <p>The widespread adoption of digital technologies and internet connectivity, particularly in developing regions, has drawn attention to the growing diversity of software users, who come from different cultural and linguistic backgrounds. Traditionally, human-computer interaction (HCI) research has treated language communities as static and isolated units, often equating them with national boundaries. However, linguistic research demonstrates that cultures are dynamic and interconnected, languages transcend borders, and people frequently speak two or more languages. To optimize HCI design, developers must account for these complex sociolinguistic realities, incorporating both social and individual dimensions of multilingualism. In this work, our aim is to contrast the demands of multilingual users with current design solutions. First, the behavior and difficulties of technology users in multilingual contexts were investigated through an online survey conducted with the academic communities of two universities: the Università della Svizzera italiana (USI), in Switzerland, and the Pontifical Catholic University of Rio Grande do Sul (PUCRS), in Brazil. 
Next, we present a pattern language that describes solutions aimed at multilingualism issues found in current websites and applications. In a final discussion, we compare the findings of the survey with the patterns, summarizing the challenges and opportunities for future research that aims to propose HCI design approaches with a focus on multilingualism.</p> </div> </div> </div> Diego Moreira da Rosa Leandro Soares Guedes Monica Landoni Milene Silveira Copyright (c) 2026 Diego Moreira da Rosa, Leandro Soares Guedes, Monica Landoni, Milene Silveira https://creativecommons.org/licenses/by/4.0 2026-04-22 2026-04-22 32 1 859 887 10.5753/jbcs.2026.5821 Optimization of Formula 1 Racing Strategies: An Approach Based on Exploratory Analysis and Genetic Algorithms https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5833 <p>Formula 1 (F1) is a motorsport that demands high technical and strategic expertise, where tactical decisions can significantly influence drivers' performance and race results. This work addresses the challenge of optimizing pit stop strategies and proposes solutions for strategic decisions aimed at minimizing total race time, such as tire compound selection and optimal stint planning, through Exploratory Data Analysis (EDA) and Genetic Algorithms (GAs). The study relies on historical F1 race data obtained through the FastF1 dataset to examine variables such as tire degradation, lap times, and the impact of pit stops on drivers' final positions. Based on the insights from EDA, a GA model was developed to simulate different race strategies and identify the most effective ones, serving as a complementary tool to enhance strategic decision-making. The model offers data-driven insights that can support race strategists in refining and adapting their strategies based on real-time race conditions and expert judgment. 
The results indicate that the proposed methodology can support teams in designing more efficient strategies, leading to better performance across various circuits.</p> Eduardo de Lira Brennand Eugênio Silva Gabriel Resende Machado Copyright (c) 2026 Eduardo de Lira Brennand, Eugênio Silva, Gabriel Resende Machado https://creativecommons.org/licenses/by/4.0 2026-05-07 2026-05-07 32 1 1270 1282 10.5753/jbcs.2026.5833 Visually Comparing Graph Vertex Ordering Algorithms through Geometrical and Topological Approaches https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5851 <p>Graph vertex ordering is a resource widely employed in spatial data analysis, particularly in the urban analytics context, where street graphs are frequently used as spatial discretization for modeling and simulation. Vertex ordering is also important for visualization purposes, as many methods require the vertices to be arranged and displayed in a well-defined order to enable the visual identification of non-trivial patterns. The primary goal of vertex ordering methods is to find an ordering that preserves neighborhood relations. However, the structural complexity of graphs employed in real-world applications leads to unavoidable distortions in the ordering process. Therefore, comparing different vertex ordering methods is fundamental to enable effective analysis and selection of the most appropriate method in each application. Although several metrics have been proposed to assess spatial vertex ordering, they typically focus on measuring the quality of the ordering globally. Global ordering assessment does not enable the analysis and identification of locations where distortions are more pronounced, hampering the analytical process. 
Visual evaluation of the vertex ordering mechanisms is particularly valuable in this context, as it allows analysts to distinguish between methods based on their performance within a single visualization, assess distortions, identify regions with anomalous behavior, and, in urban contexts, explain spatial inconsistencies in the ordering. This work introduces a visualization-assisted tool to assess vertex ordering techniques, having urban analytics as the application focus. Specifically, we evaluate geometric and topological vertex ordering approaches using urban street graphs as the basis for comparisons. The visual tool builds upon existing and newly proposed metrics, which are validated through experiments on urban data from multiple cities, demonstrating that the proposed methodology is effective in assisting users in selecting a suitable vertex ordering technique, fine-tuning hyperparameters, and identifying regions with high ordering distortions.</p> Karelia Vilca Salinas Victor Barella Thales Vieira Luis Gustavo Nonato Copyright (c) 2026 Karelia Vilca Salinas, Victor Barella, Thales Vieira, Luis Gustavo Nonato https://creativecommons.org/licenses/by/4.0 2026-03-16 2026-03-16 32 1 363 373 10.5753/jbcs.2026.5851 Brazilian Portuguese Image Captioning with Transformers: A Study on Cross-Native-Translated Dataset https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5857 <p>Image captioning (IC) refers to the automatic generation of natural language descriptions for images, with applications ranging from social media content generation to assisting individuals with visual impairments. While most research has been focused on English-based models, low-resource languages such as Brazilian Portuguese face significant challenges due to the lack of specialized datasets and models. Several studies create datasets by automatically translating existing ones to mitigate resource scarcity. 
This work addresses this gap by proposing a cross-native-translated evaluation of Transformer-based vision and language models for Brazilian Portuguese IC. We use a version of Flickr30K comprised of captions manually created by native Brazilian Portuguese speakers and compare it to a version with captions automatically translated from English to Portuguese. The experiments include a cross-context approach, where models trained on one dataset are tested on the other to assess the translation impact. Additionally, we incorporate attention maps for model inference interpretation and use the CLIP-Score metric to evaluate the image-description alignment. Our findings show that Swin-DistilBERTimbau consistently outperforms other models, demonstrating strong generalization across datasets. ViTucano, a Brazilian Portuguese pre-trained VLM, surpasses larger multilingual models (GPT-4o, LLaMa 3.2 Vision) in traditional text-based evaluation metrics, while GPT-4 models achieve the highest CLIP-Score, highlighting improved image-text alignment. Attention analysis reveals systematic biases, including gender misclassification, object enumeration errors, and spatial inconsistencies.</p> Gabriel Bromonschenkel Alessandro L. Koerich Thiago M. Paixão Hilário Tomaz Alves de Oliveira Copyright (c) 2026 Gabriel Bromonschenkel, Alessandro L. Koerich, Thiago M. Paixão, Hilário Tomaz Alves de Oliveira https://creativecommons.org/licenses/by/4.0 2026-04-15 2026-04-15 32 1 663 676 10.5753/jbcs.2026.5857 User Story Estimation using Natural Language Processing and Deep Learning: A Comparative Study https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5860 <p>Effort estimation is a fundamental activity in software development, guiding task prioritization, resource allocation, and cost planning. Traditional techniques, such as Planning Poker, rely heavily on team subjectivity and experience, which can compromise their effectiveness in certain contexts. 
This study explores how machine learning and natural language processing techniques can increase the accuracy of effort estimation for user stories. To understand the current state of research in this area, a systematic mapping was conducted to identify the main databases, techniques, and methods used to estimate effort based on textual requirements. Guided by the mapping, a comparative experimental study was performed using the FastText and XLNet language models, combined with a deep neural network, on a dataset of 23,313 user stories. The results indicate that XLNet outperformed FastText in most evaluation metrics, achieving a Mean Absolute Error (MAE) of 3.77, a Mean Squared Error (MSE) of 79.94, and a Median Absolute Error (MdAE) of 1.93. Furthermore, the proposed approach performed competitively compared to related works. These findings demonstrate the potential of deep learning models to assist developers by providing more accurate, consistent, and objective estimates for user story effort.</p> Daniel de Oliveira Silva Alinne Cristinne Corrêa Souza Francisco Carlos Monteiro Souza Silvio Ricardo Rodrigues Sanches Copyright (c) 2026 Daniel de Oliveira Silva, Alinne Cristinne Corrêa Souza, Francisco Carlos Monteiro Souza, Silvio Ricardo Rodrigues Sanches https://creativecommons.org/licenses/by/4.0 2026-04-24 2026-04-24 32 1 952 986 10.5753/jbcs.2026.5860 LiwTERM-r: A Revised Lightweight Transformer-based Model for Multimodal Skin Lesion Detection Robust to Incomplete Input https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5871 <p>As the most common type of cancer in the world, skin cancer accounts for approximately 30% of all diagnosed tumor-based lesions. Early diagnosis can reduce mortality and prevent disfiguring in different skin regions. 
With the application of machine learning techniques in recent years, especially deep learning, promising results have been achieved in this task, with studies demonstrating that combining patients' clinical anamneses with images of the lesion is essential for improving the classification of skin lesions. Despite that, the meaningful use of anamneses together with multiple collected images of the same skin lesion still requires further investigation. Thus, this project aims to contribute to developing multimodal machine learning-based models to solve the skin lesion classification problem by employing a lightweight transformer model that is robust to missing clinical information input. As a main hypothesis, models can be fed multiple images from different sources as input along with clinical anamneses from the patient's historical evaluations, leading to a more factual and trustworthy diagnosis. Our model deals with the non-trivial task of combining images and clinical information concerning the skin lesions in a lightweight transformer architecture that does not demand high computational resources or even all the information from the anamneses but still presents competitive classification results.</p> Luis Antonio de Souza Júnior André Georghton Cardoso Pacheco Thiago Oliveira dos Santos Wyctor Fogos da Rocha Pedro Henrique Bouzon Christoph Palm João Paulo Papa Copyright (c) 2026 Luis Antonio de Souza Júnior, André Georghton Cardoso Pacheco, Thiago Oliveira dos Santos, Wyctor Fogos da Rocha, Pedro Henrique Bouzon, Christoph Palm, João Paulo Papa https://creativecommons.org/licenses/by/4.0 2026-03-16 2026-03-16 32 1 305 315 10.5753/jbcs.2026.5871 Crowd-Powered Sampling for Machine Learning: Leveraging Citizen Scientist Response Patterns in AutoML Workflows https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5888 <p>Defining effective models for data classification is challenging, especially in complex contexts. 
Automated Machine Learning (AutoML) tools can assist in this process by generating rankings tailored to the nature of the data and the problem. In this work, we investigate the performance of five classifiers applied to the task of deforestation segment classification, using data labeled through a citizen science campaign from the ForestEyes project. We selected SVM, Ridge, AdaBoost, KNN, and MLP models based on a ranking generated with the PyCaret AutoML library, prioritizing diverse modeling approaches. Initially, the performance of the models is assessed using the incremental training strategy based on the entropy of the volunteers' classifications. Then, a new training strategy is proposed based on the median response time of volunteers when evaluating each segment, exploring three ordering strategies: ascending, descending, and edge-based. Experimental results aligned with the PyCaret ranking, with SVM achieving the best performance, followed by Ridge and AdaBoost, especially when trained on smaller and more reliable data subsets. Both the entropy-based approach and the new strategy using median response time demonstrated strong potential to efficiently train machine learning models in scenarios with scarce data, typical in citizen science campaigns.</p> Hugo Resende Eduardo B. Neto Fabio A. M. Cappabianco Álvaro L. Fazenda Fabio A. Faria Copyright (c) 2026 Hugo Resende, Eduardo B. Neto, Fabio A. M. Cappabianco, Álvaro L. Fazenda, Fabio A. Faria https://creativecommons.org/licenses/by/4.0 2026-03-16 2026-03-16 32 1 332 342 10.5753/jbcs.2026.5888 FLIM-based Salient Object Detection Networks with Adaptive Decoders https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5893 <p>Salient Object Detection (SOD) methods can locate objects that stand out in an image, assign higher values to their pixels in a saliency map, and binarize the map outputting a predicted segmentation mask. 
A recent tendency is to investigate pre-trained lightweight models rather than deep neural networks in SOD tasks, coping with applications under limited computational resources. In this context, we have investigated lightweight networks using a methodology named <em>Feature Learning from Image Markers</em> (FLIM), which assumes that the encoder's kernels can be estimated from marker pixels on discriminative regions of a few representative images. This work proposes flyweight networks, hundreds of times lighter than lightweight models, for SOD by combining a FLIM encoder with an <em>adaptive decoder</em>, whose weights are estimated for each input image by a given heuristic function. Such FLIM networks are trained from three to four representative images only and without backpropagation, making the models suitable for applications under labeled data constraints as well. We study five adaptive decoders; two of them are introduced here. Differently from the previous ones that rely on one neuron per pixel with shared weights, the heuristic functions of the new adaptive decoders estimate the weights of each neuron per pixel. We compare FLIM models with adaptive decoders for two challenging SOD tasks with three lightweight networks from the state-of-the-art, two FLIM networks with decoders trained by backpropagation, and one FLIM network whose labeled markers define the decoder's weights. For one of the applications, we evaluate the generalization ability of the networks to six different datasets. The experiments demonstrate the advantages of the proposed networks over the baselines, revealing the importance of further investigating such methods in new applications.</p> Gilson Junior Soares Matheus Abrantes Cerqueira Jancarlo F. Gomes Laurent Najman Silvio Jamil F. Guimarães Alexandre X. Falcão Copyright (c) 2026 Gilson Junior Soares, Matheus Abrantes Cerqueira, Jancarlo F. Gomes, Laurent Najman, Silvio Jamil F. Guimarães, Alexandre X. 
Falcão https://creativecommons.org/licenses/by/4.0 2026-04-16 2026-04-16 32 1 750 765 10.5753/jbcs.2026.5893 Advancing Biodiversity Monitoring by Integrating Multimodal AI Models into Camera Trap Workflow https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5894 <p>Camera trapping is an important non-invasive technique for wildlife monitoring. A typical camera-trap workflow involves various relevant tasks, such as filtering empty images, classifying animal species and identifying animal behavior. In this study, we explore the application of large-scale multimodal language models (MLLMs) for processing camera trap images to perform these three tasks. We evaluate the performance of four state-of-the-art models across these tasks, namely BLIP, CLIP, Gemini, and GPT, with zero-shot and few-shot learning methodologies. Our experiments showed several interesting results. First, few-shot learning significantly enhanced model performance in filtering empty images, with BLIP achieving a much higher accuracy (91.0%) compared to only 7.61% for its zero-shot counterpart. In the task of animal species classification, Gemini showed strong baseline performance, reaching 75.89% accuracy with zero-shot learning. In terms of identifying animal behavior, two scenarios were investigated: using a single image or sequences of images. The results indicate that sequence-based processing improves behavioral analysis, with BLIP attaining the highest accuracy (75.57%) in this task. In general, our study emphasizes the limitations of the zero-shot approach in complex tasks while highlighting the potential of few-shot and sequence-based learning to address challenging problems such as empty images and species misclassifications. These findings demonstrate the efficacy of advanced MLLMs in automating biodiversity monitoring, offering a scalable and accurate solution for processing large-scale datasets, and advancing conservation science.</p> Luiz Alencar Fagner Cunha Eulanda M. 
dos Santos Copyright (c) 2026 Luiz Alencar, Fagner Cunha, Eulanda M. dos Santos https://creativecommons.org/licenses/by/4.0 2026-04-15 2026-04-15 32 1 677 689 10.5753/jbcs.2026.5894 Salience prediction methods for video cropping in sidewalk footage https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5895 <p>The condition of urban infrastructure is an important aspect in ensuring the safety and well-being of pedestrians. This is especially important around public health facilities, such as sidewalks surrounding hospitals. Computational tools have already demonstrated their potential in this context, including surface material classification and obstacle detection; however, most solutions require labeled data, which is costly and time-consuming to produce. To address this gap, we propose two strategies for salience prediction in videos that reduce the dependence on manual labeling. The first leverages human visual attention, converting user clicks into attention maps. The second employs the SAM2 model to generate labeled video data more efficiently. The outputs of this process are used to train specialized saliency detectors to identify general cracks, surface defects, and key sections of tactile paving, such as directional changes. We also apply these saliency models to video cropping in order to highlight the most relevant areas within each frame. This approach enables content-aware video retargeting, supports object-focused attention, and facilitates sidewalk condition analysis by emphasizing defects and potential hazards. This work presents the following contributions: (1) development of a click-based video annotation tool, (2) development of two saliency detection strategies for sidewalk video cropping, (3) training and evaluation of saliency models for sidewalk structure analysis, and (4) successful application of these introduced methods for video cropping. 
Our experimental results showed that saliency models were able to highlight relevant information in urban environments, achieving an AUC of 0.582 in the best case for human-based attention and 0.914 for tactile-based attention, thereby enhancing assistive technologies for visually impaired individuals.</p> Suayder M. Costa Rafael J. P. Damaceno Henrique Morimitsu Roberto M. Cesar-Jr Copyright (c) 2026 Suayder M. Costa, Rafael J. P. Damaceno, Henrique Morimitsu, Roberto M. Cesar-Jr https://creativecommons.org/licenses/by/4.0 2026-04-15 2026-04-15 32 1 649 662 10.5753/jbcs.2026.5895 Toward Unified Fine-Grained Vehicle Classification and Automatic License Plate Recognition https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5899 <p><span style="font-weight: 400;">Extracting vehicle information from surveillance images is essential for intelligent transportation systems, enabling applications such as traffic monitoring and criminal investigations. While Automatic License Plate Recognition (ALPR) is widely used, Fine-Grained Vehicle Classification (FGVC) offers a complementary approach by identifying vehicles based on attributes such as color, make, model, and type. Although there have been advances in this field, existing studies often assume well-controlled conditions, explore limited attributes, and overlook FGVC integration with ALPR. To address these gaps, we introduce UFPR-VeSV, a dataset comprising 24,945 images of 16,297 unique vehicles with annotations for 13 colors, 26 makes, 136 models, and 14 types. Collected from the Military Police of Paraná (Brazil) surveillance system, the dataset captures diverse real-world conditions, including partial occlusions, nighttime infrared imaging, and varying lighting. All FGVC annotations were validated using license plate information, with text and corner annotations also being provided. A qualitative and quantitative comparison with established datasets confirmed the challenging nature of our dataset. 
A benchmark using five deep learning models further validated this, revealing specific challenges such as handling multicolored vehicles, infrared images, and distinguishing between vehicle models that share a common platform. Additionally, we apply two optical character recognition models to license plate recognition and explore the joint use of FGVC and ALPR. The results highlight the potential of integrating these complementary tasks for real-world applications. </span></p> Gabriel Eduardo Lima Valfride Nascimento Eduardo Santos Eduil Nascimento Jr. Rayson Laroca David Menotti Copyright (c) 2026 Gabriel Eduardo Lima, Valfride Nascimento, Eduardo Santos, Eduil Nascimento Jr., Rayson Laroca, David Menotti https://creativecommons.org/licenses/by/4.0 2026-04-17 2026-04-17 32 1 783 799 10.5753/jbcs.2026.5899 A Machine Learning Classification Model for Identifying College Students with Depression Based on Digital Phenotyping https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5939 <p>Depression is a serious global mental health illness that causes significant suffering to the individual and social impairment in their lives. Compared to the general population, depression shows a higher prevalence among college students. With recent advancements in digital phenotyping data analysis to infer depressive symptoms, machine learning (ML) techniques have been increasingly employed to indicate behaviors related to potential depressive profiles (PDP). However, despite the growing body of work on ML usage to detect depression, few studies have focused on data preprocessing approaches to handle missing values in datasets that go beyond common data imputation. In this study, we conducted a series of experiments to evaluate the combination of data preprocessing methods and ML algorithms for effectively classifying PDP and non-PDP students using data from the Amive project. 
The primary challenges were implementing a data processing workflow to address missing values and class imbalance, common issues in digital phenotyping datasets, and selecting algorithms capable of handling such data. The experimental results showed promising outcomes, with individual classification models, including Random Forest, XGBoost, and SVM(rbf), achieving accuracies of 77%, 75%, and 76%, respectively. The best performance was obtained by training on datasets that went through outlier filtering, specifically removing rows with four or more missing values. This combination of data preprocessing approaches and ML algorithms resulted in a Random Forest classification model with the best performance, reaching 77% accuracy with mean AUC and MCC values above 0.5.</p> Evandro Y. A. Ribeiro Franco E. Garcia Conrado dos S. Alves Saud Helena de M. Caseli Vivian G. Motti Taís Bleicher Jair B. Neto Heloisa C. Figueiredo Frizzo Larissa C. Martini Luciano de O. Neris Anderson Ara Alan D. Baria Valejo Vânia P. Almeida Copyright (c) 2026 Evandro Y. A. Ribeiro, Franco E. Garcia, Conrado dos S. Alves Saud, Helena de M. Caseli, Vivian G. Motti, Taís Bleicher, Jair B. Neto, Heloisa C. Figueiredo Frizzo, Larissa C. Martini, Luciano de O. Neris, Anderson Ara, Alan D. Baria Valejo, Vânia P. Almeida https://creativecommons.org/licenses/by/4.0 2026-04-16 2026-04-16 32 1 766 782 10.5753/jbcs.2026.5939 Teaching Leadership in Agile Software Engineering Education: A Case Study using Scrum and Challenge Based Learning https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5981 <p>The increasing adoption of agile methodologies in software engineering has emphasized the relevance of soft skills, particularly leadership. However, little is known about effective strategies to teach and develop these skills in educational settings. 
In this context, this paper presents an exploratory case study investigating the impact of assigning leadership roles, specifically the Product Owner role, within agile student teams engaged in a Challenge Based Learning environment. Our main goal in the study was to improve the understanding of role assignments in software engineering education. To achieve this, we conducted a case study over 10 weeks with 45 undergraduate students across 9 Scrum teams, examining six soft skills: communication, coordination, cohesion, teamwork, leadership, and motivation. Data were collected through semi-structured interviews and analyzed qualitatively. Our results provide indications that assigning leadership roles has the potential to enhance students’ awareness and development of leadership competencies, promote team engagement, and support the alignment of individual and collective goals. The study offers evidence that integrating role-based leadership training in agile learning environments is a valuable pedagogical approach for software engineering education.</p> Nicolas Nascimento Afonso Sales Rafael Chanin Copyright (c) 2026 Nicolas Nascimento, Afonso Sales, Rafael Chanin https://creativecommons.org/licenses/by/4.0 2026-05-05 2026-05-05 32 1 1221 1232 10.5753/jbcs.2026.5981 IoT System for Residential Energy Monitoring and Management: Design and Implementation https://journals-sol.sbc.org.br/index.php/jbcs/article/view/6008 <p>This study addresses the lack of integrated, secure, and scalable Home Energy Management Systems (HEMS) that combine appliance-level and circuit-level control while providing pilot-scale statistical analysis of energy savings under real-world conditions. We design and deploy a cloud-enabled IoT-based HEMS using ESP32 microcontrollers and PZEM-004T sensors, integrated through MQTT and Home Assistant on AWS. The system was evaluated in two residential households over several months. 
Compared with related work, the proposed solution integrates circuit-level actuation at the electrical distribution board, vendor-agnostic interoperability through Home Assistant, and a secure end-to-end data pipeline based on TLS encryption and broker-level authorization. In this pilot-scale evaluation (N=2), we observed a 9–10% reduction in monthly residential electricity consumption (p &lt; 0.05), with a 95% confidence level, within the monitored households. An illustrative difference-in-differences (DiD) estimator suggests an additional reduction of 3.17 kWh/month under a single treated–control pairing. Measurement accuracy remained below 3% relative error when compared with official utility bills. These findings should be interpreted as preliminary evidence and technical validation rather than generalizable proof of population-wide energy savings. Overall, the proposed HEMS constitutes a reproducible reference implementation for secure monitoring and control, demonstrating its feasibility and potential for residential energy optimization in pilot-scale deployments.</p> Juan Diego Martínez-Morocho Fabián Cuzme-Rodríguez Hernán Mauricio Domínguez-Limaico Copyright (c) 2026 Juan Diego Martínez-Morocho, Fabián Cuzme-Rodríguez, Hernán Mauricio Domínguez-Limaico https://creativecommons.org/licenses/by/4.0 2026-05-07 2026-05-07 32 1 1258 1269 10.5753/jbcs.2026.6008 Assessing the Psychological Impact of AI on Computer and Data Science Education: An Exploratory Study https://journals-sol.sbc.org.br/index.php/jbcs/article/view/6085 <p><span style="font-weight: 400;">This study assesses the impact of Generative AI on the educational experiences of computer and data science students at the Center for Informatics, Federal University of Paraíba (CI/UFPB), Brazil. 
Through Exploratory Factor Analysis (EFA) of five psychometric scales, the research examines students’ acceptance of LLMs, their levels of academic burnout, technology-related anxiety, and the prevalence of both metacognitive and dysfunctional learning strategies associated with LLM use. Results revealed high adoption of LLMs, low levels of AI-related technology anxiety, and frequent use of metacognitive strategies. However, dysfunctional learning patterns were still present, particularly among students experiencing higher levels of academic burnout. This study contributes to the ongoing discourse on AI in education, emphasizing the need for pedagogical frameworks that support the effective and ethical adoption of AI while addressing the psychological demands placed on students. The validated instruments are made available for future research in educational and psychological contexts, along with their versions back-translated into English.</span></p> Pedro Henrique Ramos Pinto Cleydson de Souza Ferreira Junior Lutero Lima Goulart João Vitor Cardoso Beltrão Filipe de Lima Vaz Monteiro Paloma Duarte de Lira Samuel José Fernandes Mendes Vitor Meneghetti Ugulino de Araujo Copyright (c) 2026 Pedro Henrique Ramos Pinto, Vitor Meneghetti Ugulino de Araujo, Samuel José Fernandes Mendes, Paloma Duarte de Lira, Filipe de Lima Vaz Monteiro, João Vitor Cardoso Beltrão, Lutero Lima Goulart, Cleydson de Souza Ferreira Junior https://creativecommons.org/licenses/by/4.0 2026-04-24 2026-04-24 32 1 906 918 10.5753/jbcs.2026.6085 Multiphase Measurement, Soft Sensors, Digital Twins: A Systematic Literature Review https://journals-sol.sbc.org.br/index.php/jbcs/article/view/6087 <p><span style="font-weight: 400;">Accurate multiphase flow measurement (MPFM) is essential in the oil and gas industry to optimize production, manage reservoirs, and ensure operational safety. 
Conventional MPFMs, such as Venturi, Coriolis, and positive displacement meters, remain costly and often unreliable under complex flow conditions, limiting their widespread application. In recent years, artificial intelligence (AI), soft sensors, and digital twins have emerged as promising alternatives to improve accuracy, reduce costs, and enable real-time monitoring. This paper presents a systematic review of multiphase measurement technologies, soft sensors, and digital twin applications in hydrocarbon production. Following a structured protocol, we analyze 150 publications from the past decade, addressing three research questions: (i) the current state, challenges, and limitations of MPFM technologies; (ii) the role of soft sensors and data-driven modeling, including statistical methods, machine learning algorithms, and hybrid physics-guided approaches; and (iii) methodological and industrial applications of digital twins in oil and gas operations. The review shows that while traditional MPFMs have reached technological maturity, their costs and operational constraints remain significant barriers. Soft sensors and AI-based methods offer high predictive capacity, although the challenges of interpretability and data quality persist. Digital twins demonstrate potential for integration of real-time monitoring and predictive analytics, but require clearer frameworks distinguishing theoretical models from industrial practice. 
In general, the findings highlight opportunities to advance multiphase measurement through the integration of AI, soft computing, and digital twin paradigms, and outline directions for future research in this field.</span></p> Luz Yamile Caicedo Chacón Sebastian Roa Prada Carlos Eduardo García Sánchez Copyright (c) 2026 Luz Yamile Caicedo Chacón, Sebastian Roa Prada, Carlos Eduardo García Sánchez https://creativecommons.org/licenses/by/4.0 2026-04-27 2026-04-27 32 1 987 1008 10.5753/jbcs.2026.6087 "Call My Big Sibling (CMBS)" – A Confidence-Based Strategy Leveraging Instance Selection to Combine Small and Large Language Models for Cost-Effective Text Classification https://journals-sol.sbc.org.br/index.php/jbcs/article/view/6153 <p>Transformers have achieved state-of-the-art results, with Large Language Models (LLMs) leading many NLP tasks. However, it remains unclear whether LLMs always outperform first-generation Transformers (aka Small Language Models, SLMs) across different text classification tasks and scenarios (e.g., movie reviews, topic classification). This study compares four SLMs (BERT, RoBERTa, Qwen, BART) with four open LLMs (LLaMA 3.1, Mistral, Falcon, DeepSeek) across nine sentiment and four topic classification datasets, totaling over 1000 results. Results show that open LLMs only moderately outperform or tie with SLMs when fine-tuned, and at a very high computational cost. To address this trade-off, we propose “Call My Big Sibling” (CMBS), a novel confidence-based framework that integrates calibrated SLMs and open LLMs using advanced instance selection techniques. CMBS assigns high-confidence instances to the cheaper SLM, while low-confidence instances are routed to an LLM in zero-shot, in-context, or partially tuned modes, optimizing cost-effectiveness. 
Experiments show CMBS significantly outperforms SLMs and delivers LLM-level performance at a fraction of the cost, offering a cost-sensitive solution for NLP applications.</p> Claudio Moisés Valiense de Andrade Washington Cunha Davi Reis Celso França Wasterman Ávila Apolinário Luana de Castro Santos Adriana Silvina Pagano Leonardo Chaves Dutra da Rocha Marcos André Gonçalves Copyright (c) 2026 Claudio Moisés Valiense de Andrade, Washington Cunha, Davi Reis, Celso França, Wasterman Ávila Apolinário, Luana de Castro Santos, Adriana Silvina Pagano, Leonardo Chaves Dutra da Rocha, Marcos André Gonçalves https://creativecommons.org/licenses/by/4.0 2026-05-05 2026-05-05 32 1 1233 1249 10.5753/jbcs.2026.6153 IWSHAP-X: Enhancing Feature Selection for Intrusion Detection Systems via XAI-Guided Metaheuristics https://journals-sol.sbc.org.br/index.php/jbcs/article/view/6186 <p>Feature selection plays a key role in developing effective machine learning-based Intrusion Detection Systems (IDS), as it influences model performance, computational efficiency, and explainability. While traditional methods like filter, wrapper, and embedding approaches have shown value, they frequently encounter challenges with premature convergence that can result in less optimal feature subsets. We present IWSHAP-X (IWSHAP with eXploration), an enhanced hybrid approach that combines SHapley Additive Explanations (SHAP) feature importance rankings with metaheuristic search strategies. This method extends the original IWSHAP process by introducing an additional exploration phase designed to reduce the likelihood of converging to local optima during feature selection. Our experiments with IWSHAP-X on the X-CANIDS dataset across multiple attack scenarios reveal several advantages over the original IWSHAP method. The approach demonstrates improved feature reduction capabilities while maintaining classification accuracy, along with better computational efficiency. 
Specifically, IWSHAP-X achieves up to 53.13% fewer selected features compared to IWSHAP, without compromising classification performance. These results suggest that IWSHAP-X offers a viable solution for IDS applications where both feature reduction and model effectiveness are important considerations.</p> Felipe H. Scherer Felipe N. Dresch Matheus M. Ciocca Silvio E. Quincozes Diego Kreutz Vagner E. Quincozes Copyright (c) 2026 Felipe Scherer, Felipe N. Dresch, Matheus M. Ciocca, Silvio E. Quincozes, Diego Kreutz, Vagner E. Quincozes https://creativecommons.org/licenses/by/4.0 2026-05-07 2026-05-07 32 1 1299 1316 10.5753/jbcs.2026.6186 Biotext with SWeePtex: Bioinformatics Tricks to Perform Fast, Accurate, and Content-specific String Embedding https://journals-sol.sbc.org.br/index.php/jbcs/article/view/6198 <p>The escalating demand for adaptable Artificial Intelligence (AI) systems presents a critical hurdle: generating efficient text embeddings tailored to specific problems. While Large Language Models (LLMs) excel in general contexts, they struggle in specialized domains due to their massive data requirements, opaque embedding strategies, and high computational costs. We introduce Biotext, featuring SWeePtex, a novel framework that adapts successful Bioinformatics techniques for text embedding. By converting text to the Biological Sequence-Like (BSL) format, our Python package enables the application of SWeeP, a tool originally developed for biological sequences, to create content-addressable vectors in natural language, employing the random projection paradigm. Using unsupervised machine learning, we validated this finding by analyzing data from 14,984 MEDLINE abstracts on the thioredoxin theme. Biotext, through SWeePtex, constructs a unified vector space for words and documents from scratch, capturing rich contextual relationships and offering scalable processing. 
Our usage example demonstrates that this Bioinformatics-inspired method effectively addresses key challenges in Natural Language Processing (NLP), providing interpretable, computationally efficient, and content-addressable linguistic representations for document exploration. Ultimately, Biotext demonstrates that bridging Bioinformatics and NLP yields powerful, efficient, and accessible text analysis tools that balance analytical power with interpretability, particularly valuable in specialized domains and resource-constrained environments. Biotext Python package is freely available at the PyPI repository.</p> Diogo de J. S. Machado Camilla R. De Pierri Antonio C. da Silva Filho Flávia de F. Costa Nelson A. de M. Lemos Camila P. Perico Letícia G. C. Santos Maricel G. Kann Fábio de O. Pedrosa Roberto T. Raittz Copyright (c) 2026 Diogo de J. S. Machado, Camilla R. De Pierri, Antonio C. da Silva Filho, Flávia de F. Costa, Nelson A. de M. Lemos, Camila P. Perico, Letícia G. C. Santos, Maricel G. Kann, Fábio de O. Pedrosa, Roberto Tadeu Raittz https://creativecommons.org/licenses/by/4.0 2026-05-04 2026-05-04 32 1 1155 1164 10.5753/jbcs.2026.6198 Alzheimer’s Detection in 3D Magnetic Resonance Imaging Using Deep Convolutional Neural Networks https://journals-sol.sbc.org.br/index.php/jbcs/article/view/6230 <p><span style="font-weight: 400;">Alzheimer’s disease (AD) is a progressive neurodegenerative disorder that affects millions of people worldwide and poses significant challenges for early and accurate diagnosis. In this paper, we present a comparative study between two-dimensional (2D) and three-dimensional (3D) convolutional neural networks (CNNs) for Alzheimer’s disease detection using T1-weighted magnetic resonance imaging (MRI). A robust preprocessing </span><span style="font-weight: 400;">pipeline based on the MNI-152 atlas is employed, including spatial normalization, skull stripping, and intensity normalization, ensuring anatomical consistency across subjects. 
A key contribution of this work is the adaptation of transfer learning from 2D to 3D CNNs, achieved by volumetrically extending pretrained 2D convolutional filters to initialize 3D kernels, improving convergence and feature representation in volumetric models. The proposed approach is evaluated on 2,603 MRI volumes from the ADNI dataset using a controlled experimental setup with identical preprocessing, training, and validation procedures for both models. Experimental results show that the 2D CNN achieved higher classification accuracy (90.17%) compared to the 3D CNN (87.88%), while both approaches </span><span style="font-weight: 400;">demonstrated strong potential for MRI-based AD detection. These findings provide practical insights into the trade-offs between slice-based and volumetric CNN architectures in neuroimaging applications.</span></p> Augusto Berwaldt de Oliveira Claudio Resin Geyer Copyright (c) 2026 Augusto Marlon Berwaldt de Oliveira, Claudio Resin Geyer https://creativecommons.org/licenses/by/4.0 2026-04-27 2026-04-27 32 1 1009 1020 10.5753/jbcs.2026.6230 Design, Implementation and Community Impact of QLattes: An Open Source Browser Extension for Qualis Annotation and Visualization in the Lattes Platform https://journals-sol.sbc.org.br/index.php/jbcs/article/view/6243 <p><strong>QLattes</strong> is an open-source browser extension for Chrome and Firefox that assists researchers and administrative academic staff by automatically classifying and annotating journal publications listed on Brazil's Lattes CV platform, using CAPES' former Qualis ranking system. The tool also provides features for filtering, analyzing, and visualizing this publication data. This article presents<strong> QLattes</strong>'s architecture and implementation, alongside a large-scale analysis of usage data from Google Analytics and an online survey of 1,495 anonymous participants. 
Usage data show<strong> QLattes</strong> reached over 27,000 weekly active users as of October 2025 across all five regions of Brazil as well as many other countries. The survey analysis moves beyond aggregate metrics (which confirm high user satisfaction) to present a detailed cross-sectional analysis. We reveal several insights, including that (1) high-frequency "power users" have a distinct behavioral profile focused on administrative and evaluative tasks, not just self-assessment, and (2) feature request priorities, such as support for conference papers, vary significantly by user occupation and geographic region. These numbers offer concrete evidence of the tool's extensive reach and impact within the Brazilian academic community during the Qualis era.<strong> QLattes</strong>'s modular architecture also allows easy extension to support additional publication types and metrics beyond Qualis, ensuring its continued relevance in Brazil's evolving research evaluation landscape. The tool's source code is freely available at <a href="https://github.com/nabormendonca/qlattes">https://github.com/nabormendonca/qlattes</a>.</p> Nabor C. Mendonça Maria Andréia Formico Rodrigues Lucas R. Mendonça Copyright (c) 2026 Nabor C. Mendonça, Maria Andréia Formico Rodrigues, Lucas R. Mendonça https://creativecommons.org/licenses/by/4.0 2026-04-05 2026-04-05 32 1 617 634 10.5753/jbcs.2026.6243 Establishing the Awareness Assessment Process https://journals-sol.sbc.org.br/index.php/jbcs/article/view/6273 <p>Different strategies in the literature seek to help support awareness mechanisms in collaborative applications. However, the most common approaches are often applied in a specific context and do not focus on evaluating these mechanisms or the support provided from the user's perspective. 
Few studies present methods or processes that provide awareness aspects in collaborative systems; thus, finding a good starting point in the literature can be challenging for beginners in awareness design. We build upon the findings presented by Mantau and Benitti [2023a], providing contributions towards developing an awareness assessment process that enables assessing awareness and, consequently, collaboration support by measuring awareness mechanisms from the participant's viewpoint. Then, we expose the model’s artifacts to HCI and collaborative system examiners to gauge their appreciation and verify the suitability of the process based on reliability and usefulness criteria. The case study scenario demonstrated suitable indicators from the perspective of demographic data and IRT parameterization. As a result, the applicability of the awareness scale from the participant's perspective was achieved.</p> Márcio J. Mantau Fabiane B. V. Benitti Copyright (c) 2026 Márcio J. Mantau, Fabiane B.V. Benitti https://creativecommons.org/licenses/by/4.0 2026-05-07 2026-05-07 32 1 1283 1298 10.5753/jbcs.2026.6273 Trait and Consistency Evaluation: Measuring Behavioral Stability and the Adversarial Compensation Effect https://journals-sol.sbc.org.br/index.php/jbcs/article/view/6343 <p>The stochastic nature of Large Language Models (LLMs) challenges traditional evaluation paradigms, which rely on single-response metrics and often mask complex behavioral patterns. This paper introduces Trait and Consistency Evaluation for LLMs (TraCE-LLM), an evaluation protocol that quantifies latent behavioral traits and model consistency within a black-box paradigm. Through a factorial design combining five LLMs, three benchmarks and a systematic stratification by prompt style (Naive, Chain-of-Thought and Adversarial), the framework employs a multidimensional rubric to measure Depth of Reasoning (DoR) and Originality (ORI) of model responses. 
The primary empirical contribution of this study is the identification and formalization of the Adversarial Compensation Effect (ACE), a phenomenon wherein smaller-capacity models under adversarial stress exhibit a paradoxical gain in accuracy metrics while suffering a severe degradation in behavioral stability. Our results also demonstrate an asymmetric stability with DoR being a significantly more stable trait than ORI and the prevalence of compressed reasoning, where 17.8% of correct answers lack adequate justification. By decoupling response correctness from process quality, TraCE-LLM provides a blueprint for more granular and reliable evaluation, arguing that LLM auditing must be multidimensional, context-sensitive and psychometrically informed to ensure the development of safer and more interpretable AI.</p> Pedro Carvalho Brom Vinícius Di Oliveira Li Weigang Copyright (c) 2026 Pedro Carvalho Brom, Vinícius Di Oliveira, Li Weigang https://creativecommons.org/licenses/by/4.0 2026-05-04 2026-05-04 32 1 1165 1186 10.5753/jbcs.2026.6343 An approach to Data Literacy through a Personalized Interactive LGPD Guide using LLM for Educators https://journals-sol.sbc.org.br/index.php/jbcs/article/view/6448 <p>The Brazilian General Data Protection Law (LGPD) was created to protect the fundamental rights of freedom and privacy of Brazilian citizens. Since its implementation, it has brought new challenges to all institutions established in Brazil, whether public or private, requiring an adaptation to personal data processing practices. In the context of higher education, many professors face difficulties in understanding and properly applying the guidelines of this legislation in their daily activities. This work proposes the development of an approach to data literacy through an interactive guide, based on practical scenarios, to support educators in the process of complying with the LGPD. The proposed system uses the OpenAI API to offer personalized support in real-time. 
Ten representative academic scenarios were implemented, in which users can interact through multiple-choice questions followed by a chat with the guide. The results showed that, despite initial usability limitations, the system represents a promising tool to promote the comprehension of LGPD among teachers. We observed that our approach can facilitate compliance with the legislation, but requires accessibility and usability improvements to ensure greater and easier adoption.</p> César Murilo da Silva Junior Silvio E. Quincozes Juliana Saraiva Rafael D. Araújo Copyright (c) 2026 César Murilo da Silva Junior, Silvio E. Quincozes, Juliana Saraiva, Rafael D. Araújo https://creativecommons.org/licenses/by/4.0 2026-03-25 2026-03-25 32 1 453 471 10.5753/jbcs.2026.6448 A Driver Assistance System Based on YOLO Object Detection: Development and Experimental Validation in the CARLA Simulator https://journals-sol.sbc.org.br/index.php/jbcs/article/view/6503 <p>This work presents the development and validation of an integrated Advanced Driver-Assistance System (ADAS) combining YOLOv8-based computer vision with vehicle control in the CARLA simulator. The primary objective was to implement and validate the system’s capability to detect stop signs and vehicles in real time, providing visual feedback to the driver and enabling the vehicle to respond to stop signs through appropriate deceleration and stopping behavior. The system employs a modular three-layer architecture: perception (YOLOv8), planning (finite state machine), and control (Proportional-Integral-Derivative (PID) longitudinal, Pure Pursuit lateral). Evaluation across six simulations under three weather conditions (clear noon, heavy rain sunset, heavy rain noon) demonstrated real-time processing at 17.13 FPS average, 59.4 ms detection time per frame, and 100% detection accuracy for stop signs and vehicles on the test route. 
Stop sign detection confidence remained above 0.73 across all conditions (coefficient of variation: 0.58%), and ANOVA revealed a significant effect of weather on detection time (p = 0.025) but no impact on detection confidence (p = 0.651), confirming perceptual reliability under adverse conditions. All routes were completed without collisions. Despite limitations inherent to simulated validation, the results empirically confirm that YOLO-based ADAS can provide reliable real-time visual feedback and proper behavioral responses for stop signs and vehicles under three distinct weather conditions, establishing methodological foundations for modular, distributed-processing architectures in autonomous vehicle applications.</p> Daniel Terra Gomes Annabell Del Real Tamariz Copyright (c) 2026 Daniel Terra Gomes, Annabell Del Real Tamariz https://creativecommons.org/licenses/by/4.0 2026-05-06 2026-05-06 32 1 1250 1257 10.5753/jbcs.2026.6503 An Extended Process Mining Framework for the Multi-factor Analysis of Student Trajectories in Higher Education: The Dropout Problem https://journals-sol.sbc.org.br/index.php/jbcs/article/view/6559 <p>While higher education is the backbone for human capital development and economic growth, its high dropout rates remain a global concern that leads to wasted resources and unfulfilled student potential. Understanding dropout requires integrating social, economic, academic, and technical factors across students’ trajectories, often interrelated in intricate, non-obvious ways. In this context, Process Mining (PM) offers a promising approach by uncovering patterns in students’ interactions with academic programs and courses. However, traditional PM methods are typically established over mono perspectives of processes, which limits their ability to capture the multi-factor and correlated nature of educational trajectories. 
To address this gap, this paper proposes an extended PM-based approach that incorporates enriched labeling strategies that allow the simultaneous analysis of multiple dimensions of students' academic trajectories. Furthermore, the article presents a detailed application of the labeled method over real data of a Brazilian public university with 437,690 events from eight different programs, including students from the Unified Selection System (SISU). By comparing students' outcomes and paths, while considering their enrollment method, course option, and demographic information, we discovered that admission score, program, high school type, gender, and place of origin are the variables with a higher correlation to successful and less successful students. A deeper analysis of a specific program is also outlined to show how the approach can be customized for particular cases, under minor effort, while keeping standard input data.</p> Luiz F. P. Southier Marcelo Teixeira Lovania R. Teixeira Sheila C. Freitas Denise M. V. Sato Jair J. Ferronatto Edson E. Scalabrin Copyright (c) 2026 Luiz F. P. Southier, Marcelo Teixeira, Lovania R. Teixeira, Sheila C. Freitas, Denise M. V. Sato, Jair J. Ferronatto, Edson E. Scalabrin https://creativecommons.org/licenses/by/4.0 2026-04-29 2026-04-29 32 1 1090 1107 10.5753/jbcs.2026.6559 AI-Driven Hierarchical Taxonomy Generation from Emergency Call Transcripts https://journals-sol.sbc.org.br/index.php/jbcs/article/view/6635 <p class="p1">This article presents a case study on hierarchical topic modeling for emergency call transcripts from Ecuador's ECU 911 service. We introduce a hybrid methodology that first generates a taxonomy from unlabeled data using <em>BERTopic</em> and agglomerative clustering, and then employs embedding-based similarity for multi-label classification. 
By leveraging multilingual embeddings (<em>LaBSE</em>) and clustering algorithms (<em>UMAP &amp; HDBSCAN</em>), we identified 23 coherent topics, demonstrating a practical balance between accuracy and operational applicability. The key result is a significant reduction in Hamming Loss and an F1-score of 0.4951, achieved without the need for pre-labeled data. This underscores the method's primary practical significance: offering a scalable, automated solution for emergency management centers to rapidly categorize complex incidents, thereby enhancing situational awareness and resource allocation. The integration of <em>LLaMA 3</em> for automated label generation further optimized semantic interpretation, highlighting the potential of language models in critical, resource-constrained domains.</p> Juan Gabriel Flores Sanchez Marcos Orellana Patricio Santiago García-Montero Jorge Luis Zambrano-Martinez Copyright (c) 2026 Juan Gabriel Flores Sanchez, Marcos Orellana, Patricio Santiago García-Montero, Jorge Luis Zambrano-Martinez https://creativecommons.org/licenses/by/4.0 2026-03-25 2026-03-25 32 1 472 483 10.5753/jbcs.2026.6635 Foundation Models for Time Series Forecasting: Evidence from the Fuel Sector https://journals-sol.sbc.org.br/index.php/jbcs/article/view/6840 <p><span style="font-weight: 400;">Foundation Models (FMs), typically based on large pre-trained architectures such as Transformers, have significantly advanced the fields of Natural Language Processing and Computer Vision and are increasingly being adapted to time series analysis, particularly for forecasting. However, systematic empirical evidence on their performance compared to traditional statistical, machine learning, and deep learning models on truly unseen time </span><span style="font-weight: 400;">series data is limited, as many benchmark datasets may have been partially exposed during pre-training. 
This study provides empirical evidence from the fuel sector by benchmarking six state-of-the-art FMs against ten traditional forecasting methods, using 34 years of monthly fuel demand data with diverse and complex patterns. Accurate short-term forecasting of fuel demand is critical for decision-making across transportation, energy, and industry, making this domain particularly suitable for evaluating FMs’ capabilities. To this end, we assess both zero-shot inference and multiple fine-tuning strategies. Our results show that certain FMs, including Chronos and TimesFM, rank among the top-performing models in terms of RRMSE and POCID across zero-shot and fine-tuning settings, while classical statistical models such as ETS remain competitive. These findings have the potential to guide model selection in the fuel domain and similar real-world applications.</span></p> Jonas Krause Alex C. D. Lopes Lucas G. M. Castro André G. R. Ribeiro Marcos A. Mochinski Emerson Cabrera Paraiso Fabrício Enembreck Jean Paul Barddal Alceu de Souza Britto Jr Vinicius M. A. Souza Copyright (c) 2026 Jonas Krause, Alex C. D. Lopes, Lucas G. M. Castro, André G. R. Ribeiro, Marcos A. Mochinski, Emerson Cabrera Paraiso, Fabrício Enembreck, Jean Paul Barddal, Alceu de Souza Britto Jr, Vinicius M. A. Souza https://creativecommons.org/licenses/by/4.0 2026-05-04 2026-05-04 32 1 1128 1143 10.5753/jbcs.2026.6840 Quantifying Color and Distortion Biases in the NCT-CRC-HE-100K Histopathology Dataset https://journals-sol.sbc.org.br/index.php/jbcs/article/view/7045 <p>Colorectal cancer (CRC) represents a persistent challenge for healthcare systems, and the development of reliable deep learning systems for histopathology depends on unbiased datasets. The widely used NCT-CRC-HE-100K dataset has been shown to contain color inconsistencies, distortion artifacts, and corrupted patches, yet prior analyses offered only limited quantitative evidence. 
In this work, we extend these observations by evaluating color signatures, stain-normalization behavior, and class-dependent image quality variations. We compare classical and deep learning based stain normalization methods to identify their impact on image quality metrics and potential reduction of class-specific biases in computational pathology. Our results show that while normalization reduces color-based class distinguishability, none of the evaluated methods completely eliminate tissue-specific color signatures. Additionally, this work demonstrates that distortion artifacts disproportionately affect one class in the dataset, introducing technical biases unrelated to morphology. Also, a CNN classifier trained on each normalized dataset indicates that model performance is not significantly changed across the normalization methods, including the unnormalized dataset, despite reductions in color-based separability. Overall, our study provides quantitative evidence that color, saturation, and distortion persist across normalization techniques, emphasizing the need for caution when using NCT-CRC-HE-100K to assess histopathology models.</p> Gabriel Arquelau Pimenta Rodrigues André Luiz Marques Serrano Geraldo Pereira Rocha Filho Rodrigo Bonacin Vinícius Pereira Gonçalves Muttukrishnan Rajarajan Rodolfo Ipolito Meneguette Copyright (c) 2026 Gabriel Arquelau Pimenta Rodrigues, André Luiz Marques Serrano, Geraldo Pereira Rocha Filho, Rodrigo Bonacin, Vinícius Pereira Gonçalves, Muttukrishnan Rajarajan, Rodolfo Ipolito Meneguette https://creativecommons.org/licenses/by/4.0 2026-05-07 2026-05-07 32 1 1317 1330 10.5753/jbcs.2026.7045 Comparing Explainable AI Techniques In Language Models: A Case Study For Fake News Detection in Portuguese https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5787 <p>Language models are widely used in natural language processing, but their complexity makes interpretation difficult, limiting their adoption in critical decision-making. 
This work explores Explainable Artificial Intelligence (XAI) techniques, such as LIME and Integrated Gradients (IG), to understand these models. The study evaluates the effectiveness of BERTimbau in classifying Portuguese news as true or fake, using the FakeRecogna and Fake.Br Corpus datasets. In the experiments, LIME proved to be easier to interpret than IG, and both methods showed limitations when applied to texts, as they focus only on the morphological and lexical levels, ignoring other important levels.</p> Jéssica Vicentini Rafael Bezerra de Menezes Rodrigues Arnaldo Candido Junior Ivan Rizzo Guilherme Copyright (c) 2026 Jéssica Vicentini, Rafael Bezerra de Menezes Rodrigues, Arnaldo Candido Junior, Ivan Rizzo Guilherme https://creativecommons.org/licenses/by/4.0 2026-01-21 2026-01-21 32 1 01 12 10.5753/jbcs.2026.5787 A Coding-Efficiency Analysis of HEVC Encoder Embedded in High-End Mobile Chipsets https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5354 <p>High-end mobile devices require dedicated hardware for real-time video encoding and decoding processes. However, the inherent complexity of the video encoding process, combined with the physical limitations imposed by hardware design such as energy consumption, encoding time, memory usage, and heat dissipation, demands the implementation of various constraints and limitations in commercial hardware to simplify and make them feasible for general use. The High Efficiency Video Coding (HEVC) standard is the main targeted video encoder for processing high-resolution videos in high-end chipsets. This paper aims to analyze the HEVC encoder implemented into three commercial chipsets found in high-end smartphones (Apple iPhone 14 Pro, Samsung Galaxy S23 Plus, and Redmi Note 10S) from three major mobile chip manufacturers (Apple, Qualcomm, and MediaTek), considering the impacts of video encoder limitations on encoding efficiency (BD-Rate) and encoding time. 
The results in this paper may be used as a comparative foundation for hardware designers and future work in the field, as they expose the encoding efficiency drawbacks and the encoding time gains that commercial chipsets exhibit in their HEVC encoders.</p> Vítor Costa Murilo Perleberg Luciano Agostini Marcelo Porto Copyright (c) 2026 Vítor Costa, Murilo Perleberg, Luciano Agostini, Marcelo Porto https://creativecommons.org/licenses/by/4.0 2026-01-22 2026-01-22 32 1 13 25 10.5753/jbcs.2026.5354 Learning on hierarchical trees with Random Forest https://journals-sol.sbc.org.br/index.php/jbcs/article/view/4242 <p style="font-weight: 400;">Hierarchies, as described in mathematical morphology, represent nested regions of interest and provide mechanisms to create coherent data organization. They facilitate high-level analysis and management of large amounts of data. Represented as hierarchical trees, they have formalisms intersecting with graph theory and generalizable applications. Due to the deterministic algorithms, the multiform representations, and the absence of a direct quality evaluation, it is hard to insert hierarchical information into a learning framework and benefit from the recent advances. Researchers usually tackle this problem by refining the hierarchies for a specific medium and assessing their quality for a particular task. The downside of this approach is that it depends on the application, and the formulations limit the generalization to similar data. This work aims to create a learning framework that can operate with hierarchical data and is agnostic to the input and application. The idea is to transform the data into a regular representation required by most learning models while preserving the rich information in the hierarchical structure. The proposed methods use edge-weighted image graphs and hierarchical trees as input, and they evaluate different proposals on the edge detection and segmentation tasks. 
The learning model is the Random Forest, a fast and scalable method for working with high-dimensional data. Results demonstrate that it is possible to create a learning framework dependent only on the hierarchical data that presents a state-of-the-art performance in multiple tasks.</p> Raquel Almeida Laurent Amsaleg Zenilton Kleber G. do Patrocínio Júnior Ewa Kijak Simon Malinowski Silvio Jamil Ferzoli Guimarães Copyright (c) 2026 Raquel Almeida, Laurent Amsaleg, Zenilton Kleber G. do Patrocínio Júnior, Ewa Kijak, Simon Malinowski, Silvio Jamil Ferzoli Guimarães https://creativecommons.org/licenses/by/4.0 2026-01-26 2026-01-26 32 1 26 42 10.5753/jbcs.2026.4242 Statistical Invariance vs. AI Safety: Why Prompt Filtering Fails Against Contextual Attacks https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5961 <p>Large Language Models (LLMs) are increasingly deployed in high-stakes applications, yet their alignment with ethical standards remains fragile and poorly understood. To investigate the probabilistic and dynamic nature of this alignment, we conducted a black-box evaluation of nine widely used LLM platforms, anonymized to emphasize the underlying mechanisms of ethical alignment rather than model benchmarking. We introduce the Semantic Hijacking Method (SHM) as an experimental framework, formally defined and grounded in probabilistic modeling, designed to reveal how ethical alignment can erode gradually, even when all user inputs remain policy-compliant. Across three experimental rounds (324 total executions), SHM achieved a 97.8% success rate in eliciting harmful content, with failure rates progressing from 93.5% (multi-turn conversations) to 100% (both refined sequences and single-turn interactions), demonstrating that vulnerabilities are inherent to semantic processing rather than conversational memory. 
A qualitative cross-linguistic analysis revealed cultural variations in harmful narratives, with Brazilian Portuguese responses frequently echoing historical and socio-cultural biases, making them more persuasive to local users. Overall, our findings demonstrate that ethical alignment is not a static barrier but a dynamic and fragile property that challenges binary safety metrics. Due to potential risks of misuse, all prompts and outputs are made available exclusively to authorized reviewers under ethical approval, and this publication focuses solely on reporting the research findings.</p> Aline Ioste Sarajane Marques Peres Marcelo Finger Copyright (c) 2026 Aline Ioste, SaraJane Peres, Marcelo Finger https://creativecommons.org/licenses/by/4.0 2026-01-27 2026-01-27 32 1 43 54 10.5753/jbcs.2026.5961 An Autonomous Hybrid Data Partitioning Approach for NewSQL Databases https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5684 <p>Several applications, such as online games and financial markets, require specific data management features such as large data volume support, data streaming, and the processing of thousands of OLTP transactions per second. In general, traditional relational databases are not suitable for these requirements. NewSQL is a new generation of databases that combines high scalability and availability with ACID support, being a promising solution for these kinds of applications. Although data partitioning is an essential feature for tuning relational databases, it is still an open issue for NewSQL databases. This paper proposes an automated approach for hybrid data partitioning that minimizes the number of distributed transactions and keeps the system well-balanced. In order to demonstrate its efficacy, we compare our solution with an optimal partitioning solution generated by a solver and a state-of-the-art baseline. 
The experiments show that the quality of the partitioning scheme is similar to the optimal solution and outperforms the state-of-the-art approach in the number of distributed transactions.</p> Geomar A. Schreiner Rafael de Santiago Denio Duarte Ronaldo dos Santos Mello Copyright (c) 2026 Geomar A. Schreiner, Rafael de Santiago, Denio Duarte, Ronaldo dos Santos Mello https://creativecommons.org/licenses/by/4.0 2026-02-02 2026-02-02 32 1 55 72 10.5753/jbcs.2026.5684 Limitless Feature Selection: Revolutionizing Evaluation with MH-FSF https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5646 <p>Feature selection plays a crucial role in developing effective predictive models by reducing dimensionality and emphasizing the most relevant attributes. However, current research in this area often lacks comprehensive benchmarking and frequently depends on proprietary datasets. These limitations hinder reproducibility and may lead to inconsistent or suboptimal model performance. To address these limitations, we introduce the MH-FSF framework, a comprehensive, modular, and extensible platform designed to facilitate the reproduction and implementation of feature selection methods. Developed through collaborative research, MH-FSF provides implementations of 17 methods (11 classical, 6 domain-specific) and enables systematic evaluation on 10 publicly available Android malware datasets. Our results reveal performance variations across both balanced and imbalanced datasets, highlighting the critical need for data preprocessing and selection criteria that account for these asymmetries. We demonstrate the importance of a unified platform for comparing diverse feature selection techniques, fostering methodological consistency and rigor. 
By providing this framework, we aim to significantly broaden the existing literature and pave the way for new research directions in feature selection, particularly within the context of Android malware detection.</p> Vanderson Rocha Diego Kreutz Hendrio Bragança Eduardo Feitosa Copyright (c) 2026 Vanderson Rocha, Diego Kreutz, Hendrio Bragança, Eduardo Feitosa https://creativecommons.org/licenses/by/4.0 2026-02-06 2026-02-06 32 1 73 84 10.5753/jbcs.2026.5646 BENCH4T3: A Framework to Create Benchmarks for Text-to-Triples Alignment Generation https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5809 <p>Integrating Large Language Models (LLMs) with Knowledge Graphs (KGs) can significantly enhance their capabilities, leveraging LLMs' text generation skills with KGs' explanatory power. However, establishing this connection is challenging and demands proper alignment between unstructured texts and triples. Building benchmarks demands massive human effort in data curation and translation for non-English languages. The demand for adequate benchmarks for validation purposes negatively impacts research advancements. This study proposes an end-to-end framework to guide the automatic construction of text-to-triple alignment benchmarks for any language, using KGs as input. Our solution extracts relations from input triples and processes them to create accurately mapped texts. The proposed pipeline utilizes data curation through prompt engineering and data augmentation to enhance diversity in the generated examples. We experimentally evaluate our framework for creating a bimodal representation of RDF triples and natural language texts, assessing its ability to generate natural language from these triples. A key focus is on developing a benchmark for the underrepresented Portuguese language, facilitating the construction of models that connect structured data (triples) with text. 
Our solution is suited to creating a benchmark to improve alignment between KG triples and text data. The results indicate that the generated benchmark outperforms the results of existing solutions. The generative approach benefits from our Portuguese benchmark, achieving competitive results compared to established literature benchmarks. Our solution enables automatic generation of benchmarks for aligning triples and text.</p> Victor Jesus Sotelo Chico André Gomes Regino Julio Cesar dos Reis Copyright (c) 2026 Victor Jesus Sotelo Chico, André Gomes Regino, Julio Cesar Dos Reis https://creativecommons.org/licenses/by/4.0 2026-02-06 2026-02-06 32 1 85 101 10.5753/jbcs.2026.5809 Semiotic Engineering Theory for Human-Computer Integration: An Applicability and Usefulness Evaluation https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5620 <p>The relationship between users and autonomous technologies is evolving towards integration (in the sense of partnership), transcending the stimulus-response interaction between these two agents. To follow this evolution, Human-Computer Interaction (HCI) researchers have defined and characterized a new interaction paradigm, Human-Computer Integration (HInt), which extends the focus of the HCI area to cover this new relationship of partnership between humans and autonomous technologies. As HInt is an emerging paradigm, the concepts and ontology of Semiotic Engineering Theory have been extended to address HInt as an extension of the traditional HCI interaction. Thus, this paper aims to evaluate and discuss the applicability and usefulness of the extension of Semiotic Engineering to define, explore, and explain the phenomena involved in HInt. 
Our findings provide useful insights and reflections on the benefits and limits of Semiotic Engineering for HInt to support the study, design, and evaluation of the partnership between humans and autonomous technologies.</p> Glívia Angélica Rodrigues Barbosa Raquel Oliveira Prates Copyright (c) 2026 Glívia Angélica Rodrigues Barbosa, Raquel Oliveira Prates https://creativecommons.org/licenses/by/4.0 2026-02-21 2026-02-21 32 1 102 127 10.5753/jbcs.2026.5620 STELLAR: A Structured, Trustworthy, and Explainable LLM-Led Architecture for Reliable Customer Support https://journals-sol.sbc.org.br/index.php/jbcs/article/view/6044 <p>While Large Language Models (LLMs) offer transformative potential for automating customer support, significant hurdles remain concerning their reliability, explainability, and consistent performance in complex, sensitive interactions. This paper introduces <strong>STELLAR (Structured, Trustworthy, and Explainable LLM-Led Architecture for Reliable Customer Support)</strong>, a novel architectural blueprint designed to address these issues. STELLAR utilizes a <strong>Directed Acyclic Graph (DAG) structure</strong> composed of nine specialized modules and eleven predefined workflows to orchestrate support interactions in a structured and predictable manner. This design promotes enhanced traceability, reliability, and control compared to less constrained systems. The architecture integrates components for few-shot classification, Retrieval-Augmented Generation (RAG), urgency-aware human escalation, compliance verification, user interaction validation, and knowledge base refinement through a semi-automated loop. This modular design deliberately balances LLM-driven innovation with operational requirements such as human-in-the-loop integration and ethical safeguards through embedded checks. We evaluated the core modules of STELLAR in key tasks - classification, retrieval, and compliance - demonstrating strong performance and reliability. 
Together, these features position STELLAR as a robust and transparent foundation for the next generation of intelligent, reliable customer support systems.</p> Matheus Ferracciú Scatolin Helio Pedrini Copyright (c) 2026 Matheus Ferracciú Scatolin, Helio Pedrini https://creativecommons.org/licenses/by/4.0 2026-02-21 2026-02-21 32 1 128 144 10.5753/jbcs.2026.6044 HelBERT: A BERT-Based Pretraining Model for Public Procurement Tasks in Portuguese https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5511 <p>Deep learning models excel in various tasks but require extensive annotated data for supervised learning. In NLP, limited annotated data hinders deep learning. Self-supervised pretraining addresses this by training models on unlabeled text to learn useful representations. Domain-specific pretraining is crucial for good performance in downstream tasks. Although pretrained BERT models exist for legal documents in some languages, none target public procurement documents in Portuguese. Public procurement documents have terminology that is not found in existing models. In this paper, we propose HelBERT, a BERT-based model pretrained on a large corpus of public procurement documents in the Brazilian Portuguese language, including laws, tender notices, and contracts. The experimental results demonstrate that HelBERT outperforms other models in all analyses. HelBERT surpasses models such as BERTimbau and JurisBERT in classification tasks by achieving improvements of 5% and 4% in the F1 Score, respectively. Furthermore, the model achieves gains that exceed 3% in semantic similarity tasks compared to the baseline models. Moreover, despite using a GPU with reduced memory and processing resources, the proposed approach achieves superior results with fewer and more efficient training epochs than the baseline models. 
These findings underscore the effectiveness of the proposed model in addressing NLP tasks within the public procurement domain.</p> Weslley Emmanuel Martins Lima Victor Ribeiro da Silva Jasson Carvalho da Silva Ricardo de Andrade Lira Rabêlo Anselmo Cardoso de Paiva Copyright (c) 2026 Weslley Emmanuel Martins Lima, Victor Ribeiro da Silva, Jasson Carvalho da Silva, Ricardo de Andrade Lira Rabêlo, Anselmo Cardoso de Paiva https://creativecommons.org/licenses/by/4.0 2026-02-21 2026-02-21 32 1 145 158 10.5753/jbcs.2026.5511 RecSys-Fairness: A Framework for Reducing Group Unfairness in Recommendations https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5457 <p>In this study, we address the importance of promoting fairness in recommendation systems, which are highly susceptible to biases that can lead to unfair outcomes for different user groups. We developed a fairness algorithm aimed at mitigating these injustices, which was applied to the MovieLens dataset and analyzed based on the recommendations produced by the ALS (Alternating Least Squares) and NCF (Neural Collaborative Filtering) methods. Users were grouped by activity level, gender, and age, and the results demonstrated the effectiveness of the fairness algorithm in substantially reducing group unfairness (R_{grp}) across all tested configurations, without causing significant losses in recommendation accuracy, measured by the Root Mean Squared Error (RMSE). In particular, a reduction in group unfairness of up to 65.57% was observed in the ALS method. Additionally, we identified an optimal convergence of the fairness algorithm for an estimated number of matrices (h) between 10 and 15, suggesting an effective balance point between promoting fairness and maintaining precision in recommendations. 
In comparison with the available benchmarks, under identical experimental conditions, we managed to improve group unfairness reductions by approximately 6 percentage points (from 59.77% to 65.57%).</p> Rafael Vargas Mesquita dos Santos Giovanni Ventorim Comarela Copyright (c) 2026 Rafael Vargas Mesquita dos Santos, Giovanni Ventorim Comarela https://creativecommons.org/licenses/by/4.0 2026-02-21 2026-02-21 32 1 159 170 10.5753/jbcs.2026.5457 A Reliable Stream Learning Model for Network Intrusion Detection Systems https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5608 <p>Developing a reliable Network Intrusion Detection System (NIDS) remains a complex task due to the non-stationary nature of network traffic and the need for frequent updates to maintain high classification performance. Many existing approaches assume a stationary network environment, which overlooks the challenges associated with periodic model updates, such as the need for large amounts of properly labeled data and significant computational resources. This issue is particularly challenging for real-time applications, where minimizing delays and ensuring accuracy are crucial. This paper analyzes how changes in network behavior negatively affect the long-term performance of ML-based NIDS. To address this problem, it proposes a new NIDS approach that integrates stream learning with a reject option technique to simplify the model update process while ensuring consistent classification accuracy over time. The proposal uses stream learning classifiers to incrementally incorporate new data, while the reject option allows the system to evaluate the reliability of classifications before they are used for updates. The scheme operates with minimal intervention, with rejected instances stored for future updates and used to fine-tune the model over time, ensuring adaptation to evolving network conditions. 
Experimental results demonstrate that the proposed approach maintains high classification accuracy over a year, even without recurrent updates, and achieves significant improvements in true positive rates compared to traditional methods. The system can operate for up to three months without updates, with no significant degradation in performance.</p> Pedro Horchulhack Eduardo Kugler Viegas Altair Olivo Santin Copyright (c) 2026 Pedro Horchulhack, Eduardo Kugler Viegas, Altair Olivo Santin https://creativecommons.org/licenses/by/4.0 2026-03-02 2026-03-02 32 1 186 200 10.5753/jbcs.2026.5608 CNNs for JPEGs: Designing Cost-Efficient Stems https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5873 <p>Convolutional neural networks (CNNs) have achieved astonishing advances over the past decade, pushing the state-of-the-art in several computer vision tasks. CNNs are capable of learning robust representations of the data directly from RGB pixels. However, most image data is usually available in compressed format, of which JPEG is the most widely used for transmission and storage purposes. Consequently, a preliminary decoding process with a high computational load and memory usage is required. Image decoding can be a performance bottleneck for devices with limited computational resources, such as embedded devices, even when hardware accelerators are used. For this reason, deep learning methods capable of learning directly from the compressed domain have been gaining attention in recent years. These methods usually extract a frequency domain representation of the image, such as DCT, through partial decoding, and then adapt typical CNN architectures to work with it. In this paper, we perform an in-depth study of the computational cost of deep models designed for the frequency domain, evaluating the cost of decoding and passing images through the network. 
We notice that previous work increased the model's computational complexity to accommodate the compressed images, nullifying the speed-up gained by not decoding images. We propose to remove the changes to the model that increase the computational cost, replacing them with our lightweight stems. This way, we can take full advantage of the speed-up obtained by avoiding the decoding. Our strategies were successful in generating models that balance efficiency and effectiveness, allowing deep models to be deployed in a wider array of devices. We achieve up to 25.91% reduction in computational complexity (FLOPs), while decreasing accuracy by only up to 2.97%. We also propose the efficiency-effectiveness score S<sub>E</sub> to highlight models with favorable trade-offs between accuracy, computational cost and number of parameters.</p> Samuel Felipe dos Santos Nicu Sebe Jurandy Almeida Copyright (c) 2026 Samuel Felipe dos Santos, Nicu Sebe, Jurandy Almeida https://creativecommons.org/licenses/by/4.0 2026-03-02 2026-03-02 32 1 201 215 10.5753/jbcs.2026.5873 Generalizing Feature Selection in Android Malware Detection: The SigAPI AutoCraft Approach https://journals-sol.sbc.org.br/index.php/jbcs/article/view/6043 <p>Feature selection methods are widely employed in Android malware detection to improve accuracy and efficiency by identifying the most relevant features. However, their generalizability often remains limited, as approaches like SigAPI are typically developed and evaluated on a small number of datasets, reducing their effectiveness across diverse scenarios. The practical use of SigAPI is further hindered by the need to predefine a minimum number of features, the instability of its evaluation metrics, and its inability to adapt efficiently to the heterogeneity commonly present in Android datasets. To address these limitations, we developed SigAPI AutoCraft, an enhanced and fully automated version of the original method. 
SigAPI AutoCraft achieves consistent and robust performance across ten Android malware datasets, substantially improving generalization. The results demonstrate a 5–15% increase in Matthews Correlation Coefficient (MCC) and up to a 7.6-fold improvement in feature reduction, underscoring its effectiveness and adaptability to complex and heterogeneous data environments.</p> Vanderson Rocha Laura Tschiedel Diego Kreutz Hendrio Bragança Joner Assolin Rodrigo Brandão Mansilha Silvio E. Quincozes Angelo Gaspar Diniz Nogueira Copyright (c) 2026 Vanderson Rocha, Laura Tschiedel, Diego Kreutz, Hendrio Bragança, Joner Assolin, Rodrigo Brandão Mansilha, Silvio E. Quincozes, Angelo Gaspar Diniz Nogueira https://creativecommons.org/licenses/by/4.0 2026-03-09 2026-03-09 32 1 250 263 10.5753/jbcs.2026.6043 Towards a Lightweight Multi-View Android Malware Detection Model with Multi-Objective Feature Selection https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5378 <p>In recent years, a wide range of new Machine Learning (ML) techniques with high accuracy have been developed for Android malware detection. Despite their high accuracy, these techniques are seldom implemented in production environments due to their limited generalization capabilities, leading to reduced performance when applied to real-world scenarios. In light of this, this paper introduces a novel multi-view Android malware detection model implemented in two stages. The first stage involves extracting multiple feature sets from the analyzed Android application package, offering complementary behavioral representations that improve the system's generalization in the classification process. In the second stage, a multi-objective optimization is conducted to identify the optimal feature subset from each view and fine-tune the hyperparameters of individual classifiers, enabling an ensemble-based classification approach. 
The core innovation of our approach lies in the proactive selection of feature subsets and the optimization of hyperparameters that together enhance classification accuracy while minimizing processing overhead within a multi-view framework. Experiments conducted on a newly developed dataset, consisting of over 40 thousand Android application samples, validate the effectiveness of our proposal. The results indicate that our model can increase true-positive rates by up to 18% while reducing inference processing costs by as much as 72%.</p> Philipe Fransozi Jhonatan Geremias Eduardo K. Viegas Altair O. Santin Copyright (c) 2026 Philipe Fransozi, Jhonatan Geremias, Eduardo K. Viegas, Altair O. Santin https://creativecommons.org/licenses/by/4.0 2026-03-09 2026-03-09 32 1 264 277 10.5753/jbcs.2026.5378 Intelligent Emotion Tracking System VIRE: Evaluation of Neural Network Architectures in Facial Emotion Recognition https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5370 <p>This work proposes an emotional monitoring system called Visual Identification of Recognition of Emotions (VIRE), based on convolutional neural networks (CNNs) to analyze facial expressions. Using the six basic emotions proposed by Paul Ekman as a reference, which can be identified from the composition of various facial muscle states, VIRE aims to assist in the diagnosis of mental health conditions. While emotional expressions are communicated in various ways, this research focuses primarily on facial expressions due to their expressiveness resulting from the mobility of facial muscles. The methodology involved collecting data from the FER2013 dataset, preprocessing the images, hyperparameter tuning, and training three different architectures: AlexNet, DenseNet, and a custom CNN. The research will classify expressions into basic emotions and evaluate the models' performance in terms of accuracy and other metrics. 
VIRE has demonstrated potential, achieving an accuracy of about 60%, although improvements are needed for practical application. The ultimate goal is to create a tool that integrates technology and health, facilitating the identification of emotional states that may indicate mental health issues, thereby contributing to more accurate and effective diagnoses.</p> Nathan Ferraz da Silva Geraldo Pereira Rocha Filho Roger Immich Vinícius Pereira Gonçalves Rodolfo Ipolito Meneguette Copyright (c) 2026 Nathan Ferraz da Silva, Geraldo Pereira Rocha Filho, Roger Immich, Rodolfo Ipolito Meneguette, Vinícius Pereira Gonçalves https://creativecommons.org/licenses/by/4.0 2026-03-09 2026-03-09 32 1 278 288 10.5753/jbcs.2026.5370 Memorizing Features Efficiently for Self-supervised Video Object Segmentation https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5904 <p>Video object segmentation (VOS) involves consistently identifying and classifying object pixels in video sequences, a task that traditionally depends on extensive, manually annotated datasets. In this work, we present SHLS (Superfeatures in a Highly Compressed Latent Space), a self-supervised VOS method that reduces reliance on both annotations and large training datasets. SHLS employs a metric learning framework combining superpixels and deep learning features, enabling effective training with just 10,000 unlabeled still images. Utilizing an efficient memory clustering mechanism, SHLS generates ultra-compact representations called superfeatures, which efficiently store and classify object information across video sequences. 
Experiments on the DAVIS dataset demonstrate SHLS's strong performance in multi-object scenarios, underscoring its potential as a robust and efficient alternative in self-supervised VOS.</p> Marcelo Mendonça Luciano Oliveira Copyright (c) 2026 Marcelo Mendonça, Luciano Oliveira https://creativecommons.org/licenses/by/4.0 2026-03-15 2026-03-15 32 1 443 452 10.5753/jbcs.2026.5904 Evaluating Reranking Strategies for Portuguese Information Retrieval: Fine-Tuning, LLMs, and Sociocultural Aspects https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5659 <p>Reranking plays a crucial role in improving Information Retrieval (IR) performance, particularly in low-resource languages, such as Portuguese. In this study, we evaluate different reranking strategies for Portuguese IR, comparing multilingual and Portuguese-specific models, as well as not-so-large language models and large language models (LLMs). We assess the performance of BM25 combined with ptT5 fine-tuned on multilingual and Brazilian Portuguese datasets, alongside a multilingual state-of-the-art reranker (BGE m3) and two LLM-based rerankers: RankGPT (GPT-4) and Sabiá 3, a Portuguese-specific LLM. Additionally, we introduce a novel dynamic In-Context Learning (DICL) prompting strategy to enhance LLM performance. Experiments conducted on the Quati and Pirá 2.0 datasets show that fine-tuning on native Brazilian Portuguese data significantly improves retrieval effectiveness by up to 5 p.p. in nDCG compared to using translated multilingual datasets. Two fine-tuning approaches were tested: a binary classification strategy with ‘true’ and ‘false’ tokens and a relevance score-based training, both outperforming models fine-tuned on translated multilingual data. RankGPT achieved the best overall results, yet Sabiá 3 demonstrated competitive performance, particularly on queries related to sociocultural aspects. The DICL strategy further improved the results of both LLMs, significantly boosting their MRR@10. 
These findings highlight the importance of language-specific training and suggest that not-so-large language models can be viable alternatives for reranking tasks in Portuguese IR.</p> Renato Okabayashi Miyaji Pedro Luiz Pizzigatti Corrêa Copyright (c) 2026 Renato Okabayashi Miyaji, Pedro Luiz Pizzigatti Corrêa https://creativecommons.org/licenses/by/4.0 2026-05-05 2026-05-05 32 1 1207 1220 10.5753/jbcs.2026.5659 Sapo-boi: Bypassing Linux Kernel Network Stack in the Implementation of an XDP-based NIDS https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5551 <p>Network intrusion detection systems (NIDS) must inspect multiple parts of a packet to detect patterns of known attacks. With the advent of XDP, it has become feasible to implement such a system within the kernel's own network stack for the evaluation of ingress traffic. In this work, we propose Sapo-boi, an NIDS solution consisting of two modules: (i) the Suspicion Module, an XDP program capable of processing packets in parallel, discarding packets considered safe, and redirecting suspicious packets for verdict in user space through XDP sockets (AF_XDP); and (ii) the Evaluation Module, a user-level process capable of finding, in constant time, the rule against which the suspicious packet should be analyzed and triggering notifications if the suspicion is confirmed. 
The system demonstrated superior results in terms of packet analysis rates and CPU usage compared to traditional NIDS alternatives (Snort and Suricata).</p> Raphael Kaviak Machnicki João Ribeiro Andreotti Ulisses Penteado Jorge Pires Correia Vinicius Fulber-Garcia André Grégio Copyright (c) 2026 Raphael Kaviak Machnicki, João Ribeiro Andreotti, Ulisses Penteado, Jorge Pires Correia, Vinicius Fulber-Garcia, André Grégio https://creativecommons.org/licenses/by/4.0 2026-03-02 2026-03-02 32 1 10.5753/jbcs.2026.5551 Portfolio-based Active Learning with Gaussian Processes for Vulnerabilities Risk Classification https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5567 <p>Effective vulnerability management is essential for cybersecurity, particularly as the demand for skilled professionals often exceeds supply. This paper investigates the application of Gaussian Processes (GPs) integrated with Active Learning (AL) techniques to classify security vulnerabilities based on their risk of exploitation. The main objective is to optimize the labeling process, thereby reducing the amount of labeled data necessary for training an effective classifier. The proposed methodology combines the uncertainty predictions provided by GP models with five established data selection strategies, utilizing a portfolio-based approach. The portfolio avoids the need to choose a single strategy and leverages the strengths of each technique. This approach enhances adaptability and balances exploration versus exploitation in complex optimization scenarios, ultimately improving the diversity of labeled samples and contributing to the development of better classifiers trained with fewer examples. Experiments were conducted using the CVEjoin dataset, which encompasses over 200,000 vulnerabilities, across three distinct evaluation scenarios. The different setups consider equivalent volumes of labeled data, but varying numbers of Active Learning iterations. 
When considering a single strategy, the results indicate that the BSB (best and second best) method consistently outperformed the others in terms of accuracy and F1 score, particularly with an increased number of labeling iterations. In the scenario where multiple strategies are used in a portfolio, the results indicate gains in all evaluation metrics. This study underscores the usefulness of a portfolio-based Active Learning approach in optimizing the labeling procedure and, ultimately, prioritizing vulnerabilities for remediation. This research lays the groundwork for extending the framework to other areas of cybersecurity, such as vulnerabilities in web applications and cloud environments, thereby improving overall security measures in the digital landscape.</p> Davyson S. Ribeiro Rafael S. Lemos Francisco R. P. da Ponte César Lincoln C. Mattos Emanuel B. Rodrigues Copyright (c) 2026 Davyson S. Ribeiro, Rafael S. Lemos, Francisco R. P. da Ponte, César Lincoln C. Mattos, Emanuel B. Rodrigues https://creativecommons.org/licenses/by/4.0 2026-03-02 2026-03-02 32 1 10.5753/jbcs.2026.5567 Implementation and evaluation of the Forro stream cipher in Tofino programmable hardware for remote attestation in datacenters https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5625 <p>The software-defined networking (SDN) paradigm has enabled several innovations in computer networking, especially in programmable packet processing. This paper shows the feasibility of implementing the Forro stream cipher algorithm in the Tofino programmable hardware switch and its impact on computing resources. For comparison purposes, the ChaCha algorithm was also analyzed in terms of its performance and impact on the same device. It was observed that the Forro algorithm performs better and uses fewer resources than ChaCha in sequential implementations. However, when parallelization techniques are adopted, ChaCha performs better for higher data rates, but uses more ternary matching resources than Forro.
For the use case of remote attestation in programmable data planes, the Forro cipher seems more promising, as it uses fewer limited resources and can achieve sufficient throughput rates for this scenario. We then propose P4DRA, a distributed remote attestation solution based on the programmable data plane that can offload the verification process of remote devices to the data plane, freeing resources from a central verifier based on an x86 server and improving the attestation proof verification speed by around 150 times.</p> Rodrigo Alexander de Andrade Pierini Caio Teixeira Christian Rodolfo Esteve Rothenberg Marco Aurélio Amaral Henriques Copyright (c) 2026 Rodrigo Alexander de Andrade Pierini, Caio Teixeira, Christian Rodolfo Esteve Rothenberg, Marco Aurélio Amaral Henriques https://creativecommons.org/licenses/by/4.0 2026-02-24 2026-02-24 32 1 171 185 10.5753/jbcs.2026.5625