Journal of the Brazilian Computer Society https://journals-sol.sbc.org.br/index.php/jbcs <div class="cms-item cms-collection cms-collection--split cms-collection--untitled" data-fragment="784856"> <div class="cms-collection__row"> <div class="cms-collection__column"> <div class="cms-collection__column-inner"> <div class="cms-item cms-collection" data-fragment="784854"> <div id="aimsAndScope" class="cms-item placeholder placeholder-aimsAndScope"> <div class="placeholder-aimsAndScope_content"> <p>The <em>Journal of the Brazilian Computer Society</em> (JBCS) is an international journal that serves as a forum for disseminating innovative research in all fields of computer science and related subjects. Contents include theoretical, practical and experimental papers reporting original research contributions, as well as high-quality survey papers. Coverage extends to all computer science topics, computer systems development and formal and theoretical aspects of computing, including computer architecture; high-performance computing; database management and information retrieval; computational biology; computer graphics; data visualization; image and video processing; VLSI design and software-hardware codesign; embedded systems; geoinformatics; artificial intelligence; games, entertainment and virtual reality; natural language processing and much more.</p> <p>The JBCS team is committed to publishing all quality articles in the journal, regardless of the authors' funding capacity. Thus, if the authors are unable to pay the APC, we recommend that they contact the editors (editorial@journal-bcs.com). The JBCS team will provide support in finding alternative funding. 
In particular, a grant from the Brazilian Internet Steering Committee (http://nic.br/) helps sponsor the publication of many JBCS articles.</p> </div> </div> </div> </div> </div> </div> </div> Brazilian Computer Society en-US Journal of the Brazilian Computer Society 1678-4804 Learning on hierarchical trees with Random Forest https://journals-sol.sbc.org.br/index.php/jbcs/article/view/4242 <p style="font-weight: 400;">Hierarchies, as described in mathematical morphology, represent nested regions of interest and provide mechanisms to create coherent data organization. They facilitate high-level analysis and management of large amounts of data. Represented as hierarchical trees, they have formalisms intersecting with graph theory and generalizable applications. Due to the deterministic algorithms, the multiform representations, and the absence of a direct quality evaluation, it is hard to insert hierarchical information into a learning framework and benefit from the recent advances. Researchers usually tackle this problem by refining the hierarchies for a specific medium and assessing their quality for a particular task. The downside of this approach is that it depends on the application, and the formulations limit the generalization to similar data. This work aims to create a learning framework that can operate with hierarchical data and is agnostic to the input and application. The idea is to transform the data into a regular representation required by most learning models while preserving the rich information in the hierarchical structure. The proposed methods use edge-weighted image graphs and hierarchical trees as input, and we evaluate different proposals on the edge detection and segmentation tasks. The learning model is the Random Forest, a fast and scalable method for working with high-dimensional data. 
Results demonstrate that it is possible to create a learning framework dependent only on the hierarchical data that achieves state-of-the-art performance in multiple tasks.</p> Raquel Almeida Laurent Amsaleg Zenilton Kleber G. do Patrocínio Júnior Ewa Kijak Simon Malinowski Silvio Jamil Ferzoli Guimarães Copyright (c) 2026 Raquel Almeida, Laurent Amsaleg, Zenilton Kleber G. do Patrocínio Júnior, Ewa Kijak, Simon Malinowski, Silvio Jamil Ferzoli Guimarães https://creativecommons.org/licenses/by/4.0 2026-01-26 2026-01-26 32 1 26 42 10.5753/jbcs.2026.4242 A Coding-Efficiency Analysis of HEVC Encoder Embedded in High-End Mobile Chipsets https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5354 <p>High-end mobile devices require dedicated hardware for real-time video encoding and decoding processes. However, the inherent complexity of the video encoding process, combined with the physical limitations imposed by hardware design such as energy consumption, encoding time, memory usage, and heat dissipation, demands the implementation of various constraints and limitations in commercial hardware to simplify these designs and make them feasible for general use. The High Efficiency Video Coding (HEVC) standard is the main targeted video encoder for processing high-resolution videos in high-end chipsets. This paper aims to analyze the HEVC encoder implemented in three commercial chipsets found in high-end smartphones (Apple iPhone 14 Pro, Samsung Galaxy S23 Plus, and Redmi Note 10S) from three major mobile chip manufacturers (Apple, Qualcomm, and MediaTek), considering the impacts of video encoder limitations on encoding efficiency (BD-Rate) and encoding time. 
The results in this paper may be used as a comparative foundation for hardware designers and future works in the field, as they expose the encoding efficiency drawbacks and the encoding time gains that commercial chipsets exhibit in their HEVC encoder.</p> Vítor Costa Murilo Perleberg Luciano Agostini Marcelo Porto Copyright (c) 2026 Vítor Costa, Murilo Perleberg, Luciano Agostini, Marcelo Porto https://creativecommons.org/licenses/by/4.0 2026-01-22 2026-01-22 32 1 13 25 10.5753/jbcs.2026.5354 Intelligent Emotion Tracking System VIRE: Evaluation of Neural Network Architectures in Facial Emotion Recognition https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5370 <p>This work proposes an emotional monitoring system called Visual Identification of Recognition of Emotions (VIRE), based on convolutional neural networks (CNNs) to analyze facial expressions. Using the six basic emotions proposed by Paul Ekman as a reference, which can be identified from the composition of various facial muscle states, VIRE aims to assist in the diagnosis of mental health conditions. While emotional expressions are communicated in various ways, this research focuses primarily on facial expressions due to their expressiveness resulting from the mobility of facial muscles. The methodology involved collecting data from the FER2013 dataset, preprocessing the images, hyperparameter tuning, and training three different architectures: AlexNet, DenseNet, and a custom CNN. The research classifies expressions into basic emotions and evaluates the models' performance in terms of accuracy and other metrics. VIRE has demonstrated potential, achieving an accuracy of about 60%, although improvements are needed for practical application. 
The ultimate goal is to create a tool that integrates technology and health, facilitating the identification of emotional states that may indicate mental health issues, thereby contributing to more accurate and effective diagnoses.</p> Nathan Ferraz da Silva Geraldo Pereira Rocha Filho Roger Immich Vinícius Pereira Gonçalves Rodolfo Ipolito Meneguette Copyright (c) 2026 Nathan Ferraz da Silva, Geraldo Pereira Rocha Filho, Roger Immich, Rodolfo Ipolito Meneguette, Vinícius Pereira Gonçalves https://creativecommons.org/licenses/by/4.0 2026-03-09 2026-03-09 32 1 278 288 10.5753/jbcs.2026.5370 Towards a Lightweight Multi-View Android Malware Detection Model with Multi-Objective Feature Selection https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5378 <p>In recent years, a wide range of new Machine Learning (ML) techniques with high accuracy have been developed for Android malware detection. Despite their high accuracy, these techniques are seldom implemented in production environments due to their limited generalization capabilities, leading to reduced performance when applied to real-world scenarios. In light of this, this paper introduces a novel multi-view Android malware detection model implemented in two stages. The first stage involves extracting multiple feature sets from the analyzed Android application package, offering complementary behavioral representations that improve the system's generalization in the classification process. In the second stage, a multi-objective optimization is conducted to identify the optimal feature subset from each view and fine-tune the hyperparameters of individual classifiers, enabling an ensemble-based classification approach. The core innovation of our approach lies in the proactive selection of feature subsets and the optimization of hyperparameters that together enhance classification accuracy while minimizing processing overhead within a multi-view framework. 
Experiments conducted on a newly developed dataset, consisting of over 40 thousand Android application samples, validate the effectiveness of our proposal. The results indicate that our model can increase true-positive rates by up to 18% while reducing inference processing costs by as much as 72%.</p> Philipe Fransozi Jhonatan Geremias Eduardo K. Viegas Altair O. Santin Copyright (c) 2026 Philipe Fransozi, Jhonatan Geremias, Eduardo K. Viegas, Altair O. Santin https://creativecommons.org/licenses/by/4.0 2026-03-09 2026-03-09 32 1 264 277 10.5753/jbcs.2026.5378 RecSys-Fairness: A Framework for Reducing Group Unfairness in Recommendations https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5457 <p>In this study, we address the importance of promoting fairness in recommendation systems, which are highly susceptible to biases that can lead to unfair outcomes for different user groups. We developed a fairness algorithm aimed at mitigating these injustices, which was applied to the MovieLens dataset and analyzed based on the recommendations produced by the ALS (Alternating Least Squares) and NCF (Neural Collaborative Filtering) methods. Users were grouped by activity level, gender, and age, and the results demonstrated the effectiveness of the fairness algorithm in substantially reducing group unfairness (R_{grp}) across all tested configurations, without causing significant losses in recommendation accuracy, measured by the Root Mean Squared Error (RMSE). In particular, a reduction in group unfairness of up to 65.57% was observed in the ALS method. Additionally, we identified an optimal convergence of the fairness algorithm for an estimated number of matrices (h) between 10 and 15, suggesting an effective balance point between promoting fairness and maintaining precision in recommendations. 
In comparison with the available benchmarks, under identical experimental conditions, we managed to improve group unfairness reductions by approximately 6% (from 59.77% to 65.57%).</p> Rafael Vargas Mesquita dos Santos Giovanni Ventorim Comarela Copyright (c) 2026 Rafael Vargas Mesquita dos Santos, Giovanni Ventorim Comarela https://creativecommons.org/licenses/by/4.0 2026-02-21 2026-02-21 32 1 159 170 10.5753/jbcs.2026.5457 HelBERT: A BERT-Based Pretraining Model for Public Procurement Tasks in Portuguese https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5511 <p>Deep learning models excel in various tasks but require extensive annotated data for supervised learning. In NLP, limited annotated data hinders deep learning. Self-supervised pretraining addresses this by training models on unlabeled text to learn useful representations. Domain-specific pretraining is crucial for good performance in downstream tasks. Although pretrained BERT models exist for legal documents in some languages, none target public procurement documents in Portuguese. Public procurement documents have terminology that is not found in existing models. In this paper, we propose HelBERT, a BERT-based model pretrained on a large corpus of public procurement documents in the Brazilian Portuguese language, including laws, tender notices, and contracts. The experimental results demonstrate that HelBERT outperforms other models in all analyses. HelBERT surpasses models such as BERTimbau and JurisBERT in classification tasks by achieving improvements of 5% and 4% in the F1 Score, respectively. Furthermore, the model achieves gains that exceed 3% in semantic similarity tasks compared to the baseline models. Moreover, despite using a GPU with reduced memory and processing resources, the proposed approach achieves superior results with fewer and more efficient training epochs than the baseline models. 
These findings underscore the effectiveness of the proposed model in addressing NLP tasks within the public procurement domain.</p> Weslley Emmanuel Martins Lima Victor Ribeiro da Silva Jasson Carvalho da Silva Ricardo de Andrade Lira Rabêlo Anselmo Cardoso de Paiva Copyright (c) 2026 Weslley Emmanuel Martins Lima, Victor Ribeiro da Silva, Jasson Carvalho da Silva, Ricardo de Andrade Lira Rabêlo, Anselmo Cardoso de Paiva https://creativecommons.org/licenses/by/4.0 2026-02-21 2026-02-21 32 1 145 158 10.5753/jbcs.2026.5511 A Reliable Stream Learning Model for Network Intrusion Detection Systems https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5608 <p>Developing a reliable Network Intrusion Detection System (NIDS) remains a complex task due to the non-stationary nature of network traffic and the need for frequent updates to maintain high classification performance. Many existing approaches assume a stationary network environment, which overlooks the challenges associated with periodic model updates, such as the need for large amounts of properly labeled data and significant computational resources. This issue is particularly challenging for real-time applications, where minimizing delays and ensuring accuracy is crucial. This paper analyzes how changes in network behavior negatively affect the long-term performance of ML-based NIDS. To address this problem, we propose a new NIDS approach that integrates stream learning with a reject option technique to simplify the model update process while ensuring consistent classification accuracy over time. The proposal uses stream learning classifiers to incrementally incorporate new data, while the reject option allows the system to evaluate the reliability of classifications before they are used for updates. The scheme operates with minimal intervention, with rejected instances stored for future updates and used to fine-tune the model over time, ensuring adaptation to evolving network conditions. 
Experimental results demonstrate that the proposed approach maintains high classification accuracy over a year, even without recurrent updates, and achieves significant improvements in true positive rates compared to traditional methods. The system can operate for up to three months without updates, with no significant degradation in performance.</p> Pedro Horchulhack Eduardo Kugler Viegas Altair Olivo Santin Copyright (c) 2026 Pedro Horchulhack, Eduardo Kugler Viegas, Altair Olivo Santin https://creativecommons.org/licenses/by/4.0 2026-03-02 2026-03-02 32 1 186 200 10.5753/jbcs.2026.5608 Semiotic Engineering Theory for Human-Computer Integration: An Applicability and Usefulness Evaluation https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5620 <p>The relationship between users and autonomous technologies is evolving towards integration (in the sense of partnership), transcending the stimulus-response interaction between these two agents. To follow this evolution, Human-Computer Interaction (HCI) researchers have defined and characterized a new interaction paradigm, Human-Computer Integration (HInt), which extends the focus of the HCI area to cover this new relationship of partnership between humans and autonomous technologies. As HInt is an emerging paradigm, the concepts and ontology of Semiotic Engineering Theory have been extended to address HInt as an extension of the traditional HCI interaction. Thus, this paper aims to evaluate and discuss the applicability and usefulness of the extension of Semiotic Engineering to define, explore, and explain the phenomena involved in HInt. 
Our findings provide useful insights and reflections on the benefits and limits of Semiotic Engineering for HInt to support the study, design, and evaluation of the partnership between humans and autonomous technologies.</p> Glívia Angélica Rodrigues Barbosa Raquel Oliveira Prates Copyright (c) 2026 Glívia Angélica Rodrigues Barbosa, Raquel Oliveira Prates https://creativecommons.org/licenses/by/4.0 2026-02-21 2026-02-21 32 1 102 127 10.5753/jbcs.2026.5620 Limitless Feature Selection: Revolutionizing Evaluation with MH-FSF https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5646 <p>Feature selection plays a crucial role in developing effective predictive models by reducing dimensionality and emphasizing the most relevant attributes. However, current research in this area often lacks comprehensive benchmarking and frequently depends on proprietary datasets. These limitations hinder reproducibility and may lead to inconsistent or suboptimal model performance. To address these limitations, we introduce the MH-FSF framework, a comprehensive, modular, and extensible platform designed to facilitate the reproduction and implementation of feature selection methods. Developed through collaborative research, MH-FSF provides implementations of 17 methods (11 classical, 6 domain-specific) and enables systematic evaluation on 10 publicly available Android malware datasets. Our results reveal performance variations across both balanced and imbalanced datasets, highlighting the critical need for data preprocessing and selection criteria that account for these asymmetries. We demonstrate the importance of a unified platform for comparing diverse feature selection techniques, fostering methodological consistency and rigor. 
By providing this framework, we aim to significantly broaden the existing literature and pave the way for new research directions in feature selection, particularly within the context of Android malware detection.</p> Vanderson Rocha Diego Kreutz Hendrio Bragança Eduardo Feitosa Copyright (c) 2026 Vanderson Rocha, Diego Kreutz, Hendrio Bragança, Eduardo Feitosa https://creativecommons.org/licenses/by/4.0 2026-02-06 2026-02-06 32 1 73 84 10.5753/jbcs.2026.5646 An Autonomous Hybrid Data Partitioning Approach for NewSQL Databases https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5684 <p>Several applications, like online games and the financial market, require specific data management features such as large data volume support, data streaming, and the processing of thousands of OLTP transactions per second. In general, traditional relational databases are not suitable for these requirements. NewSQL is a new generation of databases that combines high scalability and availability with ACID support, being a promising solution for these kinds of applications. Although data partitioning is an essential feature for tuning relational databases, it is still an open issue for NewSQL databases. This paper proposes an automated approach for hybrid data partitioning that minimizes the number of distributed transactions and keeps the system well-balanced. In order to demonstrate its efficacy, we compare our solution with an optimal partitioning solution generated by a solver and a state-of-the-art baseline. The experiments show that the quality of the partitioning scheme is similar to the optimal solution and outperforms the state-of-the-art approach in the number of distributed transactions.</p> Geomar A. 
Schreiner, Rafael de Santiago, Denio Duarte, Ronaldo dos Santos Mello https://creativecommons.org/licenses/by/4.0 2026-02-02 2026-02-02 32 1 55 72 10.5753/jbcs.2026.5684 Comparing Explainable AI Techniques In Language Models: A Case Study For Fake News Detection in Portuguese https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5787 <p>Language models are widely used in natural language processing, but their complexity makes interpretation difficult, limiting their adoption in critical decision-making. This work explores Explainable Artificial Intelligence (XAI) techniques, such as LIME and Integrated Gradients (IG), to understand these models. The study evaluates the effectiveness of BERTimbau in classifying Portuguese news as true or fake, using the FakeRecogna and Fake.Br Corpus datasets. In the experiments, LIME proved to be easier to interpret than IG, and both methods showed limitations when applied to texts, as they focus only on the morphological and lexical levels, ignoring other important levels.</p> Jéssica Vicentini Rafael Bezerra de Menezes Rodrigues Arnaldo Candido Junior Ivan Rizzo Guilherme Copyright (c) 2026 Jéssica Vicentini, Rafael Bezerra de Menezes Rodrigues, Arnaldo Candido Junior, Ivan Rizzo Guilherme https://creativecommons.org/licenses/by/4.0 2026-01-21 2026-01-21 32 1 01 12 10.5753/jbcs.2026.5787 BENCH4T3: A Framework to Create Benchmarks for Text-to-Triples Alignment Generation https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5809 <p>Integrating Large Language Models (LLMs) with Knowledge Graphs (KGs) can significantly enhance their capabilities, leveraging LLMs' text generation skills with KGs' explanatory power. However, establishing this connection is challenging and demands proper alignment between unstructured texts and triples. Building benchmarks demands massive human effort in data curation and translation for non-English languages. 
The scarcity of adequate benchmarks for validation purposes negatively impacts research advancements. This study proposes an end-to-end framework to guide the automatic construction of text-to-triple alignment benchmarks for any language, using KGs as input. Our solution extracts relations from input triples and processes them to create accurately mapped texts. The proposed pipeline utilizes data curation through prompt engineering and data augmentation to enhance diversity in the generated examples. We experimentally evaluate our framework for creating a bimodal representation of RDF triples and natural language texts, assessing its ability to generate natural language from these triples. A key focus is on developing a benchmark for the underrepresented Portuguese language, facilitating the construction of models that connect structured data (triples) with text. Our solution is suited to creating a benchmark to improve alignment between KG triples and text data. The results indicate that the generated benchmark outperforms existing solutions. The generative approach benefits from our Portuguese benchmark, achieving competitive results compared to established literature benchmarks. Our solution enables automatic generation of benchmarks for aligning triples and text.</p> Victor Jesus Sotelo Chico André Gomes Regino Julio Cesar dos Reis Copyright (c) 2026 Victor Jesus Sotelo Chico, André Gomes Regino, Julio Cesar Dos Reis https://creativecommons.org/licenses/by/4.0 2026-02-06 2026-02-06 32 1 85 101 10.5753/jbcs.2026.5809 CNNs for JPEGs: Designing Cost-Efficient Stems https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5873 <p>Convolutional neural networks (CNNs) have achieved astonishing advances over the past decade, pushing the state-of-the-art in several computer vision tasks. CNNs are capable of learning robust representations of the data directly from RGB pixels. 
However, most image data is usually available in compressed format, of which JPEG is the most widely used for transmission and storage purposes. As a result, a preliminary decoding process with a high computational load and memory usage is required. Image decoding can be a performance bottleneck for devices with limited computational resources, such as embedded devices, even when hardware accelerators are used. For this reason, deep learning methods capable of learning directly from the compressed domain have been gaining attention in recent years. These methods usually extract a frequency-domain representation of the image, like the DCT, by partial decoding, and then adapt typical CNN architectures to work with it. In this paper, we perform an in-depth study of the computational cost of deep models designed for the frequency domain, evaluating the cost of decoding and passing images through the network. We notice that previous work increased the model's computational complexity to accommodate the compressed images, nullifying the speed-up gained by not decoding images. We propose to remove the changes to the model that increase the computational cost, replacing them with our designed lightweight stems. This way, we can take full advantage of the speed-up obtained by avoiding the decoding. Our strategies were successful in generating models that balance efficiency and effectiveness, allowing deep models to be deployed in a wider array of devices. We achieve up to 25.91% reduction in computational complexity (FLOPs), while only decreasing accuracy by up to 2.97%. 
We also propose the efficiency-effectiveness score S<sub>E</sub> to highlight models with favorable trade-offs between accuracy, computational cost and number of parameters.</p> Samuel Felipe dos Santos Nicu Sebe Jurandy Almeida Copyright (c) 2026 Samuel Felipe dos Santos, Nicu Sebe, Jurandy Almeida https://creativecommons.org/licenses/by/4.0 2026-03-02 2026-03-02 32 1 201 215 10.5753/jbcs.2026.5873 Statistical Invariance vs. AI Safety: Why Prompt Filtering Fails Against Contextual Attacks https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5961 <p>Large Language Models (LLMs) are increasingly deployed in high-stakes applications, yet their alignment with ethical standards remains fragile and poorly understood. To investigate the probabilistic and dynamic nature of this alignment, we conducted a black-box evaluation of nine widely used LLM platforms, anonymized to emphasize the underlying mechanisms of ethical alignment rather than model benchmarking. We introduce the Semantic Hijacking Method (SHM) as an experimental framework, formally defined and grounded in probabilistic modeling, designed to reveal how ethical alignment can erode gradually, even when all user inputs remain policy-compliant. Across three experimental rounds (324 total executions), SHM achieved a 97.8% success rate in eliciting harmful content, with failure rates progressing from 93.5% (multi-turn conversations) to 100% (both refined sequences and single-turn interactions), demonstrating that vulnerabilities are inherent to semantic processing rather than conversational memory. A qualitative cross-linguistic analysis revealed cultural variations in harmful narratives, with Brazilian Portuguese responses frequently echoing historical and socio-cultural biases, making them more persuasive to local users. Overall, our findings demonstrate that ethical alignment is not a static barrier but a dynamic and fragile property that challenges binary safety metrics. 
Due to potential risks of misuse, all prompts and outputs are made available exclusively to authorized reviewers under ethical approval, and this publication focuses solely on reporting the research findings.</p> Aline Ioste Sarajane Marques Peres Marcelo Finger Copyright (c) 2026 Aline Ioste, SaraJane Peres, Marcelo Finger https://creativecommons.org/licenses/by/4.0 2026-01-27 2026-01-27 32 1 43 54 10.5753/jbcs.2026.5961 Generalizing Feature Selection in Android Malware Detection: The SigAPI AutoCraft Approach https://journals-sol.sbc.org.br/index.php/jbcs/article/view/6043 <p>Feature selection methods are widely employed in Android malware detection to improve accuracy and efficiency by identifying the most relevant features. However, their generalizability often remains limited, as approaches like SigAPI are typically developed and evaluated on a small number of datasets, reducing their effectiveness across diverse scenarios. The practical use of SigAPI is further hindered by the need to predefine a minimum number of features, the instability of its evaluation metrics, and its inability to adapt efficiently to the heterogeneity commonly present in Android datasets. To address these limitations, we developed SigAPI AutoCraft, an enhanced and fully automated version of the original method. SigAPI AutoCraft achieves consistent and robust performance across ten Android malware datasets, substantially improving generalization. The results demonstrate a 5–15% increase in Matthews Correlation Coefficient (MCC) and up to a 7.6-fold improvement in feature reduction, underscoring its effectiveness and adaptability to complex and heterogeneous data environments.</p> Vanderson Rocha Laura Tschiedel Diego Kreutz Hendrio Bragança Joner Assolin Rodrigo Brandão Mansilha Silvio E. Quincozes Angelo Gaspar Diniz Nogueira Copyright (c) 2026 Vanderson Rocha, Laura Tschiedel, Diego Kreutz, Hendrio Bragança, Joner Assolin, Rodrigo Brandão Mansilha, Silvio E. 
Quincozes, Angelo Gaspar Diniz Nogueira https://creativecommons.org/licenses/by/4.0 2026-03-09 2026-03-09 32 1 250 263 10.5753/jbcs.2026.6043 STELLAR: A Structured, Trustworthy, and Explainable LLM-Led Architecture for Reliable Customer Support https://journals-sol.sbc.org.br/index.php/jbcs/article/view/6044 <p>While Large Language Models (LLMs) offer transformative potential for automating customer support, significant hurdles remain concerning their reliability, explainability, and consistent performance in complex, sensitive interactions. This paper introduces <strong>STELLAR (Structured, Trustworthy, and Explainable LLM-Led Architecture for Reliable Customer Support)</strong>, a novel architectural blueprint designed to address these issues. STELLAR utilizes a <strong>Directed Acyclic Graph (DAG) structure</strong> composed of nine specialized modules and eleven predefined workflows to orchestrate support interactions in a structured and predictable manner. This design promotes enhanced traceability, reliability, and control compared to less constrained systems. The architecture integrates components for few-shot classification, Retrieval-Augmented Generation (RAG), urgency-aware human escalation, compliance verification, user interaction validation, and knowledge base refinement through a semi-automated loop. This modular design deliberately balances LLM-driven innovation with operational requirements such as human-in-the-loop integration and ethical safeguards through embedded checks. We evaluated the core modules of STELLAR in key tasks - classification, retrieval, and compliance - demonstrating strong performance and reliability. 
Together, these features position STELLAR as a robust and transparent foundation for the next generation of intelligent, reliable customer support systems.</p> Matheus Ferracciú Scatolin Helio Pedrini Copyright (c) 2026 Matheus Ferracciú Scatolin, Helio Pedrini https://creativecommons.org/licenses/by/4.0 2026-02-21 2026-02-21 32 1 128 144 10.5753/jbcs.2026.6044 Sapo-boi: Bypassing Linux Kernel Network Stack in the Implementation of an XDP-based NIDS https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5551 <p>Network intrusion detection systems (NIDS) must inspect multiple parts of a packet to detect patterns of known attacks. With the advent of XDP, it has become feasible to implement such a system within the kernel's own network stack for the evaluation of ingress traffic. In this work, we propose Sapo-boi, an NIDS solution consisting of two modules: (i) the Suspicion Module, an XDP program capable of processing packets in parallel, discarding packets considered safe, and redirecting suspicious packets for verdict in user space through XDP sockets (AF_XDP); and (ii) the Evaluation Module, a user-level process capable of finding, in constant time, the rule against which the suspicious packet should be analyzed and triggering notifications if the suspicion is confirmed. 
The system demonstrated superior results in terms of packet analysis rates and CPU usage compared to traditional NIDS alternatives (Snort and Suricata).</p> Raphael Kaviak Machnicki João Ribeiro Andreotti Ulisses Penteado Jorge Pires Correia Vinicius Fulber-Garcia André Grégio Copyright (c) 2026 Raphael Kaviak Machnicki, João Ribeiro Andreotti, Ulisses Penteado, Jorge Pires Correia, Vinicius Fulber-Garcia, André Grégio https://creativecommons.org/licenses/by/4.0 2026-03-02 2026-03-02 32 1 10.5753/jbcs.2026.5551 Portfolio-based Active Learning with Gaussian Processes for Vulnerabilities Risk Classification https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5567 <p>Effective vulnerability management is essential for cybersecurity, particularly as the demand for skilled professionals often exceeds supply. This paper investigates the application of Gaussian Processes (GPs) integrated with Active Learning (AL) techniques to classify security vulnerabilities based on their risk of exploitation. The main objective is to optimize the labeling process, thereby reducing the amount of labeled data necessary for training an effective classifier. The proposed methodology combines the uncertainty predictions provided by GP models with five established data selection strategies, utilizing a portfolio-based approach. The portfolio avoids the need to choose a single strategy and leverages the strengths of each technique. This approach enhances adaptability and balances exploration versus exploitation in complex optimization scenarios, ultimately improving the diversity of labeled samples and contributing to the development of better classifiers trained with fewer examples. Experiments were conducted using the CVEjoin dataset, which encompasses over 200,000 vulnerabilities, across three distinct evaluation scenarios. The different setups consider equivalent volumes of labeled data, but varying numbers of Active Learning iterations. 
When considering a single strategy, the results indicate that the BSB (best and second best) method consistently outperformed the others in terms of accuracy and F1 score, particularly with an increased number of labeling iterations. In the scenario where multiple strategies are used in a portfolio, the results indicate gains in all evaluation metrics. This study underscores the usefulness of a portfolio-based Active Learning approach in optimizing the labeling procedure and, ultimately, prioritizing vulnerabilities for remediation. This research lays the groundwork for extending the framework to other areas of cybersecurity, such as vulnerabilities in web applications and cloud environments, thereby improving overall security measures in the digital landscape.</p> Davyson S. Ribeiro Rafael S. Lemos Francisco R. P. da Ponte César Lincoln C. Mattos Emanuel B. Rodrigues Copyright (c) 2026 Davyson S. Ribeiro, Rafael S. Lemos, Francisco R. P. da Ponte, César Lincoln C. Mattos, Emanuel B. Rodrigues https://creativecommons.org/licenses/by/4.0 2026-03-02 2026-03-02 32 1 10.5753/jbcs.2026.5567 Implementation and evaluation of the Forro stream cipher in Tofino programmable hardware for remote attestation in datacenters https://journals-sol.sbc.org.br/index.php/jbcs/article/view/5625 <p>The software-defined networking (SDN) paradigm has enabled several innovations in computer networking, especially in programmable packet processing. This paper shows the feasibility and impact on computing resources of the Forro stream cipher algorithm in the Tofino programmable hardware switch. For comparison purposes, the ChaCha algorithm was also analyzed in terms of its performance and impact on the same device. It was observed that the Forro algorithm performs better and uses fewer resources than ChaCha in sequential implementations. However, when parallelization techniques are adopted, ChaCha performs better for higher data rates, but uses more ternary matching resources than Forro. 
For the use case of remote attestation in programmable data planes, the Forro cipher seems more promising, as it consumes fewer of the switch's limited resources and can achieve sufficient throughput rates for this scenario. We then propose P4DRA, a distributed remote attestation solution based on the programmable data plane that can offload the verification process of remote devices to the data plane, freeing resources from a central verifier based on an x86 server and improving the attestation proof verification speed by around 150 times.</p> Rodrigo Alexander de Andrade Pierini Caio Teixeira Christian Rodolfo Esteve Rothenberg Marco Aurélio Amaral Henriques Copyright (c) 2026 Rodrigo Alexander de Andrade Pierini, Caio Teixeira, Christian Rodolfo Esteve Rothenberg, Marco Aurélio Amaral Henriques https://creativecommons.org/licenses/by/4.0 2026-02-24 2026-02-24 32 1 171 185 10.5753/jbcs.2026.5625