Publications | AIRi @ UTCN

Prompting fairness: Learning prompts for debiasing large language models

Manual | Center for AI Measurement

Authors: Camelia Lemnaru, Cristian Andrei Rad

Large language models are prone to internalize social biases due to the characteristics of the data used for their self-supervised training scheme. Considering their recent emergence and wide availability to the general public, it is mandatory to identify and alleviate these biases to avoid perpetuating stereotypes towards underrepresented groups. We present a novel prompt-tuning method for reducing biases in encoder models such as BERT or RoBERTa. Unlike other methods, we only train a small set of additional reusable token embeddings that can be concatenated to any input sequence to reduce bias in the outputs. We particularize this method to gender bias by providing a set of templates used for training the prompts. Evaluations on two benchmarks show that our method is on par with the state of the art while having a limited impact on language modeling ability.
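The idea of reusable token embeddings concatenated to any input can be illustrated with a minimal sketch. It assumes the prompt vectors are prepended (the abstract only says "concatenated"), and the function name `prepend_prompt` and all numeric values are hypothetical, not taken from the paper.

```python
from typing import List

Vector = List[float]


def prepend_prompt(prompt_embeddings: List[Vector],
                   input_embeddings: List[Vector]) -> List[Vector]:
    """Return the sequence a frozen encoder would receive: prompt vectors, then input vectors."""
    dim = len(prompt_embeddings[0])
    if any(len(v) != dim for v in prompt_embeddings + input_embeddings):
        raise ValueError("all embeddings must share the same dimension")
    # Copy the vectors so the shared prompt is never modified through the result.
    return [list(v) for v in prompt_embeddings] + [list(v) for v in input_embeddings]


# Hypothetical learned prompt: two vectors of dimension 3 (assumed values).
prompt = [[0.1, -0.2, 0.0], [0.3, 0.1, -0.1]]
sentence_a = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
sentence_b = [[0.5, 0.5, 0.0]]

# The same prompt is reused unchanged for different inputs.
extended_a = prepend_prompt(prompt, sentence_a)
extended_b = prepend_prompt(prompt, sentence_b)
```

In a real prompt-tuning setup only the prompt vectors would receive gradient updates while the encoder stays frozen; that training loop is outside the scope of this sketch.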

View details Open PDF

A Hybrid Machine Learning–Genetic Algorithm for Optimizing Surface-Mount Technology Planning

2026 | conference paper | Manual | Trusted AI

Authors: Adrian Petru Groza

We tackle the problem of improving the Surface-Mount Technology (SMT) process planning in an automotive manufacturing setting. Current simulations show low accuracy across production lines as the existing approach relies on predefined setups rather than adapting to product-specific configurations. We propose a hybrid framework that couples machine learning with a genetic algorithm to generate product-specific plans. Our solution involves three tasks: (i) assigning boards to lines, (ii) allocating components to Pick-and-Place (PnP) machines, and (iii) balancing workloads across machines. Our hybrid pipeline embeds supervised learning in a genetic optimizer. A multi-class classifier selects feasible PnP head configurations per Bill of Materials (BOM) part number (precision = 0.73). A genetic algorithm assigns components to compatible feeder tables/machines, while a regression model estimates table cycle times (R² = 0.88). The fitness jointly optimizes Components Placed per Hour (CPH) and Line Balancing (LB) under process constraints. Different mutation methods are explored, revealing that mutation based on balancing the workload by leveling the number of placements on the tables with minimum and maximum cycle time results in an LB of 0.83, with a CPH of 0.37 and an average delta cycle time of -3.27% across 105 part numbers.
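One possible reading of the workload-leveling mutation is sketched below. It assumes that "leveling" means moving placements from the table with the maximum cycle time to the table with the minimum cycle time until their placement counts are as equal as possible; the function name, table names, and numbers are hypothetical and not taken from the paper.

```python
from typing import Dict


def leveling_mutation(placements: Dict[str, int],
                      cycle_time: Dict[str, float]) -> Dict[str, int]:
    """Shift placements from the slowest table to the fastest one (assumed leveling rule)."""
    slowest = max(cycle_time, key=cycle_time.get)
    fastest = min(cycle_time, key=cycle_time.get)
    mutated = dict(placements)
    if slowest == fastest:
        return mutated
    # Move half of the surplus; never move placements towards the slower table.
    transfer = max(0, (mutated[slowest] - mutated[fastest]) // 2)
    mutated[slowest] -= transfer
    mutated[fastest] += transfer
    return mutated


# Hypothetical individual: placements per feeder table and estimated cycle times in seconds.
placements = {"T1": 120, "T2": 80, "T3": 100}
cycle_time = {"T1": 30.0, "T2": 21.5, "T3": 25.0}
mutated = leveling_mutation(placements, cycle_time)
```

In a full genetic algorithm the cycle times would be re-estimated after the mutation (the abstract uses a regression model for this) before the fitness is evaluated.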

View details Open PDF

ConvU-NExT: An Asymmetrical Encoder–Decoder for Denoising Low Dose CT

2026 | Article | Manual | Trusted AI

Authors: Adrian Petru Groza

Low-dose computed tomography (LDCT) is a medical imaging modality designed to minimize ionizing radiation exposure while maintaining the ability to produce detailed cross-sectional images. It is particularly valuable in scenarios requiring repeated imaging, such as cancer screening, follow-up examinations or pediatric diagnostics, where reducing radiation dose is critical to patient safety. For example, to reduce noise by half, four times the radiation dose is required in the slice. The goal is to achieve postprocessed LDCT images with comparable quality to those obtained from standard-dose CT imaging. We start with a brief overview of the CT procedures and their limitations. Then we introduce a novel denoising method based on an asymmetric integration of the ConvNeXt backbone with the U-Net architecture. This novel approach obtained 2–3 times less noise than the original LDCT, having a 10%–20% increase in performance compared to the U-Net implementation, checked against three metrics: MSE, SSIMLoss, and combinations of both. The results suggest that: (i) augmenting the images with specific noise, obtained from a water phantom CT scan test, while training yields superior results compared to generic noise augmentations; (ii) a larger kernel size better extracts features; and (iii) a smaller kernel size was mandatory for feature reconstruction.
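The dose and noise trade-off quoted in the abstract (half the noise costs four times the dose) follows from the usual assumption that image noise is dominated by photon-counting statistics, so that noise scales with the inverse square root of dose. A small sketch under that assumption, with hypothetical function names:

```python
import math


def relative_noise(dose_ratio: float) -> float:
    """Noise relative to baseline when the dose is multiplied by dose_ratio.

    Assumes noise is proportional to 1 / sqrt(dose).
    """
    if dose_ratio <= 0:
        raise ValueError("dose_ratio must be positive")
    return 1.0 / math.sqrt(dose_ratio)


def dose_ratio_for_noise(noise_ratio: float) -> float:
    """Dose multiplier needed to scale the noise by noise_ratio (same assumption)."""
    if noise_ratio <= 0:
        raise ValueError("noise_ratio must be positive")
    return 1.0 / noise_ratio ** 2


quadrupled_dose_noise = relative_noise(4.0)      # noise after multiplying the dose by 4
dose_for_half_noise = dose_ratio_for_noise(0.5)  # dose multiplier for half the noise
```

The same relation explains why lowering the dose raises noise quickly, which is the motivation for post-processing denoisers in LDCT.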

View details Open PDF

AlloyGraph: Data and Evaluation Results for Multi-Agent AI Superalloy Property Prediction

2026 | OpenAlex automated

Authors: Alexandru Lecu, Adrian Petru Groza

Training data (77 alloys from the Nickel Institute handbook), evaluation data (88 alloys from manufacturer datasheets), prediction results for six model configurations, chatbot evaluation benchmarks (250 MCQ questions, 100 RAGAS questions, 12 expert-graded questions), inverse design results (20 target specifications), and OWL ontology for the AlloyGraph platform. Associated repository: https://github.com/AlexLecu/AlloyGraph

View details

OCTA-Based Biomarker Characterization in nAMD

2026 | OpenAlex automated

Authors: Adrian Petru Groza

We aim to enhance ophthalmologists' decision-making when diagnosing Neovascular Age-Related Macular Degeneration (nAMD). We developed three tools to analyze Optical Coherence Tomography Angiography images: (1) extracting biomarkers such as mCNV area and vessel density using image processing; (2) generating a 3D visualization of the neovascularization for a better view of the affected regions; and (3) applying an ensemble of three white-box machine learning algorithms (decision tree, support vector machines and DL-Learner) for nAMD diagnosis. The learned expressions reached 100% accuracy for the training data and 68% accuracy in testing. The main advantage is that all the learned models are white-box, which ensures explainability and transparency, allowing clinicians to better understand the decision-making process.

View details Open PDF

Performance Evaluation of Large Language Models for Automated Knowledge Graph Generation

2026 | OpenAlex automated

Authors: Tudor Cioara, Anghel Ionuț

View details

Performance Evaluation of LLMs in Automated RDF Knowledge Graph Generation

2026 | OpenAlex automated

Authors: Tudor Cioara, Anghel Ionuț

Cloud systems generate large, heterogeneous log data containing critical infrastructure, application, and security information. Transforming these logs into RDF triples enables their integration into knowledge graphs, improving interpretability, root-cause analysis, and cross-service reasoning beyond what raw logs allow. Large Language Models (LLMs) offer a promising approach to automate RDF knowledge graph generation; however, their effectiveness on complex cloud logs remains largely unexplored. In this paper, we evaluate multiple LLM architectures and prompting strategies for automated RDF extraction using a controlled framework with two pipelines for systematically processing semi-structured log data. The extraction pipeline integrates multiple LLMs to identify relevant entities and relationships, automatically generating subject-predicate-object triples. These outputs are evaluated using a dedicated validation pipeline with both syntactic and semantic metrics to assess accuracy, completeness, and quality. Due to the lack of public ground-truth datasets, we created a reference Log-to-KG dataset from OpenStack logs using manual annotation and ontology-driven methods, enabling an objective baseline. Our analysis shows that Few-Shot learning is the most effective strategy, with Llama achieving a 99.35% F1 score and 100% valid RDF output, while Qwen, NuExtract, and Gemma also perform well under Few-Shot prompting, with Chain-of-Thought approaches maintaining similar accuracy. One-Shot prompting offers a lighter but effective alternative, while Zero-Shot and advanced strategies such as Tree-of-Thought, Self-Critique, and Generate-Multiple perform substantially worse. These results highlight the importance of contextual examples and prompt design for accurate RDF extraction and reveal model-specific limitations across LLM architectures.
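To make the subject-predicate-object output and the syntactic validity check concrete, here is a minimal sketch that serializes one extracted fact as an N-Triples line and checks its shape. The namespace, the entity and predicate names, and the simplified regular expression are all assumptions for illustration; the pattern is far looser than the full N-Triples grammar and is not the paper's validation pipeline.

```python
import re

BASE = "http://example.org/log#"  # hypothetical namespace, not from the paper


def to_ntriple(subject: str, predicate: str, literal: str) -> str:
    """Serialize one fact as an N-Triples line with a plain string literal as object."""
    escaped = literal.replace("\\", "\\\\").replace('"', '\\"')
    return f'<{BASE}{subject}> <{BASE}{predicate}> "{escaped}" .'


# Simplified shape check: IRI, IRI, then an IRI or a quoted literal, then a final dot.
NTRIPLE_RE = re.compile(r'^<[^<>\s]+> <[^<>\s]+> (<[^<>\s]+>|"(?:[^"\\]|\\.)*") \.$')


def looks_like_ntriple(line: str) -> bool:
    """Return True when the line matches the simplified N-Triples shape above."""
    return NTRIPLE_RE.match(line) is not None


# Hypothetical fact extracted from a log line such as "instance_42 status changed to ACTIVE".
triple = to_ntriple("instance_42", "hasStatus", "ACTIVE")
valid = looks_like_ntriple(triple)
invalid = looks_like_ntriple("instance_42 hasStatus ACTIVE")
```

A production check would hand the output to a real RDF parser instead of a regular expression; the sketch only shows what "valid RDF output" means at the level of a single triple.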

View details Open PDF

Older people with mild cognitive impairment engaged by social robot-based intervention: Benefits shown in multicountry long term trials (Preprint)

2026 | OpenAlex automated

Authors: Tudor Cioara, Anghel Ionuț

BACKGROUND Loneliness and social isolation are among the risk factors that contribute to dementia. Technology-based interventions using socially assistive robots (SARs) and mobile apps may help not only to maintain cognitive functioning but also to support social connectedness and psychosocial wellbeing in older adults with mild cognitive impairment (MCI); however, evidence from multicountry trials remains limited.
OBJECTIVE The proof-of-concept (PoC) study evaluated the benefits of the engAGE platform, a hybrid intervention combining a social robot, mobile app, and wearable activity tracker, designed to support both social connectedness and cognitive functioning in older people with MCI across three European countries.
METHODS Older adults with MCI were recruited in Italy, Switzerland, and Norway. The 6-month intervention combined weekly robot-guided group sessions with daily tablet use and continuous activity tracker wear at home. Outcomes were assessed at baseline and post-test and included subjective memory complaints as the primary outcome (MAC-Q), and global cognition (MoCA), loneliness (UCLA), quality of life (QoL-AD and EQ-5D-5L VAS), and mental wellbeing (WEMWBS) as secondary outcomes. Usability and acceptance were also assessed through SUS and UTAUT after 3 and 6 months of intervention. Intra-group and inter-group differences in change were explored for each dimension to determine the effects of the intervention.
RESULTS Of 50 enrolled participants, 44 (36 assigned to the experimental group, EG, and 8 to the control group, CG) completed the final assessment and were included in the analyses. Subjective memory complaints decreased significantly from 26.41 (±2.23) to 25.22 (±3.19) in the EG, whereas they remained unchanged in the CG (26.75 ±0.71). MoCA scores remained stable overall (EG: 23.51 ±2.16 to 23.44 ±3.21), with no significant differences between groups. Psychosocial outcomes showed a mixed pattern: in the Italian EG, loneliness decreased significantly (UCLA: 45.18 ± 9.61 to 37.94 ± 6.56), whereas in Switzerland it worsened significantly (p=.021). Self-rated health (EQ-5D-5L-VAS) also improved significantly in the EG (p=.013), but no significant differences were detected between groups. Overall QoL-AD and WEMWBS scores remained broadly stable. SUS scores in the EG improved significantly (p=.020) from 58.81 (±18.17) to 68.54 (±18.54), reaching the commonly accepted usability threshold.
CONCLUSIONS The engAGE hybrid intervention, combining robot-guided group activities with home-based tablet use and activity monitoring, was delivered across three different socio-healthcare contexts and showed preliminary benefits in subjective memory, selected psychosocial measures, and usability.
CLINICALTRIAL ClinicalTrials.gov NCT06302686; https://clinicaltrials.gov/study/NCT06302686
INTERNATIONAL REGISTERED REPORT RR2-10.2196/67601

View details

Asynchronous federated learning with partial weights aggregation for energy consumption forecasting

2026 | OpenAlex automated

Authors: Anghel Ionuț, Tudor Cioara

Accurate energy forecasting is essential for grid stability, demand-side management, and efficient renewable integration. However, energy consumption data collected from smart meters may expose sensitive user information, thus raising privacy concerns. Federated Learning (FL) offers a privacy-preserving mechanism for collaborative model training without sharing raw data. However, conventional synchronous FL suffers from training delays caused by heterogeneous client availability and computational capabilities, while frequent exchange of model parameters can lead to communication overheads. To address these challenges, this paper proposes an asynchronous federated learning framework for energy forecasting that enables continuous global model updating without waiting for all clients to complete local training. We introduce a federated asynchronous adaptive aggregation mechanism, where client-specific learning rates are dynamically adjusted based on both update staleness and model performance contribution. A partial aggregation strategy is defined for a Long Short-Term Memory (LSTM) forecasting model that splits the local models' layers, allowing clients to exchange only a subset of the weights with the server. The proposed solution is evaluated using real-world energy consumption data from multiple consumers. Experimental results demonstrate that the proposed asynchronous adaptive strategy outperforms the classic FedAvg approach and maintains prediction accuracy relative to personalised FedAvg, while reducing communication costs. Additionally, the proposed method outperforms the classic FedAsync algorithm across all client groups, with statistically significant improvements in most cases.
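A rough sketch of how an asynchronous server step with staleness-aware mixing and partial aggregation might look is given below. It is an illustration under stated assumptions, not the paper's algorithm: the polynomial staleness discount, the layer names, and the function names are hypothetical, and the performance-contribution term mentioned in the abstract is omitted.

```python
from typing import Dict, List

Weights = Dict[str, List[float]]


def staleness_rate(base_rate: float, staleness: int, decay: float = 0.5) -> float:
    """Discount the mixing rate for stale updates (assumed polynomial form)."""
    return base_rate / (1.0 + staleness) ** decay


def async_partial_update(global_w: Weights, client_w: Weights,
                         shared_layers: List[str],
                         base_rate: float, staleness: int) -> Weights:
    """Mix one client's update into the global model, on the shared layers only."""
    alpha = staleness_rate(base_rate, staleness)
    updated = {name: list(values) for name, values in global_w.items()}
    for name in shared_layers:
        updated[name] = [(1.0 - alpha) * g + alpha * c
                         for g, c in zip(global_w[name], client_w[name])]
    return updated


# Hypothetical two-layer model: only the "lstm" layer is exchanged with the server.
global_model = {"lstm": [0.0, 0.0], "head": [1.0, 1.0]}
client_model = {"lstm": [2.0, 4.0], "head": [9.0, 9.0]}
new_global = async_partial_update(global_model, client_model, ["lstm"],
                                  base_rate=0.5, staleness=0)
```

Because the server applies each update as it arrives, no client has to wait for the others, and because only the shared layers are transmitted, the message size shrinks accordingly.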

View details

Colonic Polyp Detection with Object Detection Models

2026 | OpenAlex automated

Authors: Eugen Richard Ardelean

In recent years, deep learning has been applied more and more to medical image analysis. One such application of deep learning is the automated polyp detection in colonoscopy with the target of reducing miss rates. This study presents a comprehensive evaluation of nine state-of-the-art object detection models for colonic polyp detection: YOLOv8, YOLOv9, YOLOv10, YOLO11, YOLO12, YOLO26, RT-DETR, YOLO-World, and YOLOE. The models were evaluated on three publicly available datasets: CVC-ClinicDB, CVC-ColonDB, and ETIS-LaribPolypDB. All models were trained under standardized conditions using identical hyperparameters and data augmentation strategies to guarantee fair comparison. Performance was evaluated using multiple metrics: mAP@50, mAP@50–95, F1 score, precision, recall, inference time, and computational cost. YOLO11 demonstrated the best overall performance, achieving mAP@50 scores of 0.995, 0.944, and 0.978 on the three datasets respectively, while maintaining the fastest inference time of approximately 150 ms per image and the third-lowest computational cost at 21.3 GFLOPs. Cross-dataset generalization experiments revealed a significant loss of performance, with mAP@50 dropping by 20–40% when models were tested on an unseen dataset, highlighting the challenge of true generalization with limited datasets. Statistical analysis by polyp size showed that while all models achieved F1 scores exceeding 0.95 for large polyps, performance decreased to 0.60–0.85 for small polyps, indicating a limitation in detecting small lesions. The analysis of failure modes showed that missed detections, false positives and boundary errors constitute 60–75% of all failures, suggesting that domain adaptation of object detection models may be required.
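The mAP@50 metric counts a predicted box as a correct detection when its intersection over union (IoU) with a ground-truth box is at least 0.5. A minimal IoU sketch follows; the box layout `(x_min, y_min, x_max, y_max)` and the function names are assumptions made for this example.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max), assumed layout


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def is_match(pred: Box, truth: Box, threshold: float = 0.5) -> bool:
    """True when the prediction overlaps the ground truth enough to count at this threshold."""
    return iou(pred, truth) >= threshold


overlap = iou((0, 0, 2, 2), (1, 1, 3, 3))
half_box_match = is_match((0, 0, 10, 10), (0, 0, 10, 5))
```

For a fixed pixel offset, a small box loses IoU faster than a large one, which is one common explanation for weaker scores on small objects.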

View details

Multi-Agent Coverage Using Multiplicatively Weighted Energy Voronoi Partitions

2026 | OpenAlex automated

Authors: Lucian Bușoniu

We propose a coverage control method for energy-constrained multi-agent systems with single-integrator agent motion, in which agent energy discharges and recharges at constant rates. Unlike existing methods, a Multiplicatively Weighted Energy Voronoi (MWEV) partition ensures that each agent’s coverage region varies with energy in such a way that it vanishes when energy drops to a reserve level sufficient to reach a charging station. We show that the optima of an energy-aware coverage objective are the dynamically evolving MWEV centroids. A generalized Lloyd algorithm provably drives agents with remaining battery to these centroids, and reserve-level agents to their charging stations, under a switched two-timescale model with fast motion and slow energy dynamics. Agents repeatedly pause coverage as they get depleted, and then resume coverage upon recharging. This happens arbitrarily many times, leading to an infinite-horizon coverage setting. The method works well in simulations, where we also apply an alternative technique with a different problem formulation. We contrast coverage performance and average agent downtime between the two methods; e.g., downtime is 17.70% of the experiment duration for our MWEV technique, compared to 28.5% for the alternative. In addition, the robustness of the proposed approach is investigated under nonlinear battery discharge dynamics, position errors, and delayed energy information.
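The cell-assignment rule of a multiplicatively weighted Voronoi partition can be sketched as follows. The weight used here, energy above the reserve level, is an assumption chosen so that an agent's region vanishes at the reserve level as the abstract describes; the paper's exact weighting may differ, and all names and numbers are hypothetical.

```python
import math
from typing import List, Optional, Tuple

Point = Tuple[float, float]


def assign_cell(point: Point, positions: List[Point],
                energies: List[float], reserves: List[float]) -> Optional[int]:
    """Return the index of the agent whose weighted cell contains the point.

    A point goes to the agent minimizing distance / weight, with
    weight = energy - reserve (assumed). Agents at or below reserve own no region.
    """
    best, best_cost = None, math.inf
    for i, (pos, energy, reserve) in enumerate(zip(positions, energies, reserves)):
        weight = energy - reserve
        if weight <= 0:
            continue  # depleted agent: empty coverage region
        cost = math.dist(point, pos) / weight
        if cost < best_cost:
            best, best_cost = i, cost
    return best


positions = [(0.0, 0.0), (10.0, 0.0)]
# The agent with more usable energy covers a larger share of the segment between them.
owner_mid = assign_cell((4.0, 0.0), positions, [1.0, 3.0], [0.0, 0.0])
# Once agent 1 is at its reserve level, agent 0 takes over every point.
owner_depleted = assign_cell((9.0, 0.0), positions, [1.0, 0.2], [0.0, 0.2])
```

A Lloyd-style iteration would alternate this assignment with moving each active agent towards the centroid of its cell.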

View details

Fast Neural-Network Approximation of Active Target Search Under Uncertainty

2026 | OpenAlex automated

Authors: Lucian Bușoniu

We address the problem of searching for an unknown number of stationary targets at unknown positions with a mobile agent. A probability hypothesis density filter is used to estimate the expected number of targets under measurement uncertainty. Existing planners, such as Active Search (AS) and its Intermittent variant (ASI), achieve accurate detection but require costly online optimization. To reduce online computation, we propose to use a convolutional neural network to approximate AS or ASI decisions through direct inference. The network is trained on AS/ASI data using a multi-channel grid that encodes target beliefs, the agent position, visitation history, and boundary information. Simulations with uniform and clustered target distributions show that the network achieves detection rates comparable to AS or ASI while reducing computation by orders of magnitude.

View details Open PDF

Value iteration with stopping criterion: finite iterations, stability, and near-optimality guarantees

2026 | OpenAlex automated

Authors: Lucian Bușoniu

Value iteration (VI) is a cornerstone of dynamic programming that allows computing near-optimal feedback laws for general plant dynamics and cost functions. In practice, however, it must be stopped after finitely many iterations. This raises the question of when to stop the algorithm so that the resulting policies and value functions achieve desirable properties, like given near-optimality bounds and stability. In this context, we study deterministic, discrete-time systems with infinite-horizon (possibly discounted) costs whose inputs are generated by VI. We equip VI with a generalized stopping criterion that encompasses existing choices while allowing new ones. Our aim is to analyze the properties of the policies and value functions at the final iteration. Under mild assumptions, we first show that VI indeed terminates in a finite number of iterations. We then establish that the final policies are stabilizing by properly designing the stopping criterion, and derive explicit near-optimality bounds characterized by this choice. These results offer a design framework for the stopping criteria that balances computational effort with stability and performance guarantees.
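As an illustration of value iteration with a stopping rule on a deterministic discrete-time system, the sketch below stops when the largest change between successive value functions falls below a tolerance. This is one simple criterion chosen for the example, not the generalized criterion of the paper, and the toy system (a four-state chain with unit cost away from the goal) is hypothetical.

```python
from typing import Callable, List, Sequence, Tuple


def value_iteration(n_states: int, actions: Sequence[int],
                    step: Callable[[int, int], int],
                    cost: Callable[[int, int], float],
                    gamma: float = 1.0, tol: float = 1e-9,
                    max_iter: int = 10_000) -> Tuple[List[float], int]:
    """Iterate V(x) <- min_u [cost(x, u) + gamma * V(step(x, u))] until changes drop below tol."""
    values = [0.0] * n_states
    for k in range(1, max_iter + 1):
        new_values = [min(cost(x, u) + gamma * values[step(x, u)] for u in actions)
                      for x in range(n_states)]
        change = max(abs(a - b) for a, b in zip(new_values, values))
        values = new_values
        if change <= tol:  # assumed stopping criterion
            return values, k
    return values, max_iter


# Toy chain: states 0..3, state 0 is the goal; actions move left (-1) or stay (0).
def step(x: int, u: int) -> int:
    return max(0, x + u)


def cost(x: int, u: int) -> float:
    return 0.0 if x == 0 else 1.0


values, iterations = value_iteration(4, (-1, 0), step, cost)
```

On this chain the values converge to the number of steps needed to reach the goal, and the greedy policy with respect to the final values always moves left.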

View details Open PDF

Segmentation of the Retinal Vascular Network and Biomarker Quantification in OCTA Imaging

2026 | OpenAlex automated

Authors: Mihnea Jurca

View details

Forest Inspection Dataset: A Synthetic UAV Dataset for Semantic Segmentation of Forest Environments

2026 | OpenAlex automated

Authors: Sergiu Nedevschi

This work describes the Forest Inspection dataset, a synthetic aerial image collection designed for semantic segmentation of forest environments with an emphasis on UAV-based forest inspection. The dataset consists of high-resolution RGB images paired with dense pixel-level semantic labels covering 11 classes, including deciduous trees, coniferous trees, fallen trees, ground vegetation, dirt ground, rocks, sky, buildings, fences, and vehicles. Images were generated in AirSim using a photorealistic virtual forest environment and captured with simulated UAV flights at three altitudes (30 m, 50 m, 80 m) and three camera pitch angles (0°, 60°, 90°) to reproduce diverse observation conditions, under two weather settings: sunny and overcast. Each data sample includes the corresponding UAV pose metadata for spatial context. The dataset is provided in standard image and annotation formats, accompanied by a description of the scene configuration and acquisition parameters. This resource is intended to support the development and evaluation of semantic segmentation models and other computer vision methods for UAV-based forest scene understanding and inspection.

View details Open PDF

Improving Counting Accuracy of Postdisaster Visual Question Answering for Remote Sensing

2026 | OpenAlex automated

Authors: Sergiu Nedevschi

In post-disaster damage assessment, visual question answering (VQA) systems are essential in identifying the severity and scope of damage. However, counting-related tasks, such as determining the number of vehicles and flooded buildings, remain a significant challenge for current deep learning models. To address this issue, we propose DeVANet (DeBERTa Vision Attention Network), a novel architecture aimed at enhancing counting accuracy in VQA for post-disaster scenarios. We leverage DeBERTa for language modeling and introduce an innovative image embedding module, where local-global attention guides Vision Mamba features to achieve precise extraction of both small and large objects. Our fusion mechanism employs self-attention for both text and image data, followed by bidirectional cross-attention and co-attention to enhance multimodal integration. We tackle VQA as both a classification and regression problem by employing separate MLPs for each task: one handling discrete class predictions and the other generating continuous values for counting tasks. A joint loss function, combining weighted cross-entropy and negative binomial loss, ensures optimized performance across both tasks. Extensive experiments on the FloodNet and RescueNet datasets demonstrate that DeVANet achieves significant improvements in counting accuracy and overall VQA performance compared to state-of-the-art works, supported by detailed ablation studies that validate the effectiveness of each component in the architecture.

View details

Memetic-based Coordination of Distributed Storage Units Flexibility for Congestion Management

2026 | OpenAlex automated

Authors: Tudor Cioara, Anghel Ionuț

View details

P2P Energy Trading Coordination in Interconnected Microgrid Systems

2026 | OpenAlex automated

Authors: Tudor Cioara

View details

Edge-Oriented Orchestration of Energy Services Using Graph-Driven Swarm Intelligence

2026 | OpenAlex automated

Authors: Tudor Cioara, Vasile Ofrim

As smart grids increasingly depend on IoT devices and distributed energy management, they require decentralized, low-latency orchestration of energy services. We address this with a unified framework for edge-fog-cloud infrastructures tailored to smart energy systems. It features a graph-based data model that captures infrastructure and workload, enabling efficient topology exploration and task placement. Leveraging this model, a swarm-based heuristic algorithm handles task offloading in a resource-aware, latency-sensitive manner. Our framework ensures data interoperability via energy data space compliance and guarantees traceability using blockchain-based workload notarization. We validate our approach with a real-world KubeEdge deployment, demonstrating zero-downtime service migration under dynamic workloads while maintaining service continuity.

View details Open PDF

Energy forecasting under missing data: Comparative evaluation of augmented representations and decoder-only time-series imputation

2026 | OpenAlex automated | Trusted AI

Authors: Tudor Cioara, Mircea Gabriel Antonesi, Anghel Ionuț

Data-related issues, including missing values and irregular measurements, challenge the accuracy of short-term energy forecasting in smart grids. In data-scarce scenarios, two approaches are commonly considered, but their strengths and weaknesses are not fully mapped. Embedding-based models learn joint representations from heterogeneous data, compensating for the lack of time-series measurements via additional contextual or external sources, whereas imputation pipelines restore temporal continuity but may smooth variability or produce implausible values. To address these limitations, we propose a unified forecasting framework for energy systems that integrates a shared Temporal Fusion Transformer prediction with a controlled degradation protocol to simulate realistic missing-data patterns. This enables a fair and systematic comparison between two pipelines: representation-augmented learning and decoder-only time-series imputation. The former integrates TS2Vec temporal embeddings and BERT-based static contextual representations to provide a richer forecasting space without explicit reconstruction of missing values. The latter uses a Chronos-2 model to reconstruct missing time-series segments, followed by physics-based correction to enforce physically plausible outputs. We evaluate both pipelines under a controlled data degradation protocol to map the trade-offs between representation learning and data continuity restoration through imputation. We use real-world non-residential building electricity consumption and wind generation datasets. The imputation-based pipeline achieves a mean sMAPE of 10.14% and MAE of 8.43 kWh across 100 buildings, compared to 12.11% and 10.89 kWh for the representation-based approach (p<0.01, p<0.01, p<0.01). On the wind generation dataset, imputation also improves predictive accuracy (R²=0.870 vs. R²=0.794). 
However, representation-based models remain competitive in scenarios with irregular, spike-dominated, or event-driven consumption patterns where imputation provides limited additional benefits.
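The two error metrics reported above, sMAPE and MAE, can be computed as in the following sketch. Several sMAPE variants exist; the one used here (absolute error divided by the mean of the absolute actual and forecast values, averaged and expressed in percent) is an assumption, since the abstract does not state which definition was used. The sample values are hypothetical.

```python
from typing import Sequence


def mae(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Mean absolute error, in the unit of the data (for example kWh)."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def smape(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Symmetric mean absolute percentage error in percent (assumed variant)."""
    total = 0.0
    for a, f in zip(actual, forecast):
        denom = abs(a) + abs(f)
        # A pair of zeros contributes no error instead of dividing by zero.
        total += 0.0 if denom == 0 else 2.0 * abs(a - f) / denom
    return 100.0 * total / len(actual)


# Hypothetical hourly consumption values in kWh.
actual = [100.0, 200.0]
forecast = [110.0, 190.0]
error_kwh = mae(actual, forecast)
error_pct = smape(actual, forecast)
```

MAE keeps the physical unit and is dominated by large consumers, whereas sMAPE is scale-free, which is why the two are usually reported together across buildings of different sizes.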

View details Open PDF

Generative AI for IT Project Management: A Systematic Review and Future Research Agenda

2026 | OpenAlex automated

Authors: Tudor Cioara

The literature on Generative AI (GenAI) in Information Technology (IT) project management is currently fragmented, focusing mainly on isolated tools, specific process groups, or practitioners’ perspectives, without offering a comprehensive synthesis. Therefore, there is a lack of systematic reviews to guide researchers in effectively and responsibly leveraging GenAI, including emerging innovations such as AI agents. This paper aims to synthesize current knowledge on GenAI in IT project management, combining a PRISMA-compliant systematic review of the peer-reviewed literature, a complementary analysis of commercial and open-source platforms, and a forward-looking research agenda featuring our vision on agentic AI architectures for IT project management. For the systematic review based on academic sources, we used the Web of Science (WoS) database. Studies were eligible if published between 2021 and 2026 in English, as journal articles or conference proceedings, across major publishers (IEEE, Springer, Elsevier, MDPI, ACM, and others), and indexed under computer science, engineering, or AI categories in WoS. For industry-driven analysis, sources included vendor documentation, official product pages, and publicly accessible repository specifications, selected for relevance through manual search. The review reveals that while academic research remains largely focused on prompt-based applications of foundation models such as GPT, commercial and open-source platforms have progressed toward embedding GenAI as an operational capability within project workflows. Therefore, we consider that agentic architecture represents a promising future direction for enabling autonomous task execution, collaborative decision-making, and human–AI orchestration and integration across the project lifecycle.

View details Open PDF

Fine-Grained Complexity of Ontology Mediated Queries

2025 | Manual | Trusted AI

Authors: Cristina Feier

View details

Prompts and Prayers: the Rise of GPTheology

2025 | Manual | Center for AI Measurement

Authors: Adrian Petru Groza

Increasingly artificial intelligence (AI) has been cast in “god-like” roles (to name a few: film industry – Matrix, The Creator, Mission Impossible, Foundation, Dune etc.; literature – Children of Time, Permutation City, Neuromancer, I Have no Mouth and I Must Scream, Alphaville etc.). This trend has accelerated with the advent of sophisticated Large Language Models such as ChatGPT. For this phenomenon, where AI is perceived as divine, we use the term GPTheology, where ChatGPT and other AI models are treated as potential oracles of a semi-divine nature. This paper explores the emergence of GPTheology as a form of techno-religion, examining how narratives around AI echo traditional religious constructs. We draw on community narratives from online forums – Reddit – and recent projects – AI-powered Mazu Statue in Malaysia (Lu, 2025); “ShamAIn” Project in Korea (He-rim, 2025); AI Jesus in a Swiss Church (Kennedy, 2024). These examples show striking similarities to technological notions of the Singularity and the development of Artificial General Intelligence (AGI). Additionally, we analyse how daily interactions with AI are acquiring ritualistic associations and how AI-centric ideologies clash with or are integrated into established religions. This study uses a dataset of Reddit posts discussing AI to identify recurring themes of salvation, prophecy, and demonization surrounding AI. Our findings suggest that new belief systems are developing around AI, and this carries both philosophical and sociotechnical implications. Our paper critically analyses the benefits and dangers, as well as the social, political and ethical challenges of this development. This transdisciplinary inquiry highlights how AI and religion are increasingly intertwined, prompting necessary questions about humanity’s relationship with its creations and the future of belief.

View details Open PDF

A Comparative Survey of Social Bias in Text and Image Generation: Gaps, Directions and Compliance with the EU AI Act

2025 | Manual | Center for AI Measurement

Authors: Cristian Andrei Rad, Camelia Lemnaru

Generative artificial intelligence models, including large language models and image generation models, are increasingly deployed in socially impactful domains. However, these models often exhibit social biases that can amplify stereotypes and produce harmful, discriminatory outputs. In this paper, we present a modality-comparative survey of social bias in text and image generation, structured around four components: benchmarks, bias identification, measurement, and mitigation. We systematically analyze methodological parallels and divergences across the two modalities, highlighting emerging research trends and identifying gaps. Finally, we map current image generation research efforts to the EU AI Act’s technical requirements, offering insights into how the community can advance towards more fair, safe, and trustworthy systems.

View details

MCP-Orchestrated Multi-Agent System for Automated Disinformation Detection

2025 | conference paper | Manual | Trusted AI

Authors: Adrian Petru Groza, Alexandru Lecu

The large spread of disinformation across digital platforms creates significant challenges to information integrity. This paper presents a multi-agent system that uses relation extraction to detect disinformation in news articles, focusing on titles and short text snippets. The proposed Agentic AI system combines four agents: (i) a machine learning agent (logistic regression), (ii) a Wikipedia knowledge check agent (which relies on named entity recognition), (iii) a coherence detection agent (using LLM prompt engineering), and (iv) a web-scraped data analyzer that extracts relational triplets for fact checking. The system is orchestrated via the Model Context Protocol (MCP), offering shared context and live learning across components. Results demonstrate that the multi-agent ensemble achieves 95.3% accuracy with an F1 score of 0.964, significantly outperforming individual agents and traditional approaches. The weighted aggregation method, mathematically derived from individual agent misclassification rates, proves superior to algorithmic threshold optimization. The modular architecture makes the system easily scalable, while also maintaining details of the decision processes.
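One common way to derive voting weights from misclassification rates is the log-odds rule used in weighted majority voting, sketched below. Whether the paper uses this exact formula is not stated in the abstract, so the rule, the function names, and the error rates are all assumptions for illustration.

```python
import math
from typing import Sequence


def agent_weight(error_rate: float) -> float:
    """Log-odds weight: agents with lower error get larger weights (assumed rule)."""
    if not 0.0 < error_rate < 1.0:
        raise ValueError("error_rate must be strictly between 0 and 1")
    return math.log((1.0 - error_rate) / error_rate)


def weighted_vote(votes: Sequence[bool], error_rates: Sequence[float]) -> bool:
    """Return True (flag as disinformation) when the weighted sum of votes is positive."""
    score = sum(agent_weight(e) * (1.0 if v else -1.0)
                for v, e in zip(votes, error_rates))
    return score > 0.0


# Hypothetical error rates for four agents; only the most reliable one flags the article.
error_rates = [0.05, 0.30, 0.35, 0.40]
decision = weighted_vote([True, False, False, False], error_rates)
```

Under this rule a single highly reliable agent can outvote several weaker ones, and an agent that is no better than chance (error rate 0.5) receives zero weight.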

View details Open PDF