A Robot Image Anomaly Detection Model with Multimodal Cues under Partial Image Defects
Keywords:
Robot, GAN, FCN, Multimodal-Attention, image anomaly detection, GA-MMA-FCN
Abstract
Detecting anomalies in defective images within multimodal environments remains a critical challenge for robotic systems. Solving it has significant implications for environmental perception in mobile robots and for product quality inspection by industrial robots. This study addresses the challenge by proposing GA-MMA-FCN, a multimodal robot image anomaly detection model for images with defects that integrates multimodal fusion attention networks, generative adversarial networks, and fully connected networks. By jointly considering several perceptual modalities, such as images, text, and sound, the model captures crucial information efficiently, which enhances the precision and robustness of anomaly detection. In detailed experimental validation, the model significantly improves accuracy, recall, precision, AUC, and F1-score. The results demonstrate that the proposed GA-MMA-FCN model provides an efficient and reliable solution for robot image anomaly detection in multimodal environments and offers crucial support for practical applications in robotic systems.
License
Copyright (c) 2025 Journal of Information and Computing

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.