A Robot Image Anomaly Detection Model with Multimodal Cues under Partial Image Defects

Authors

  • Sangbing Tsai, International Engineering and Technology Institute

Keywords:

Robot, GAN, FCN, Multimodal-Attention, image anomaly detection, GA-MMA-FCN

Abstract

In robotic systems, detecting anomalies in defective images within multimodal environments remains a critical challenge, with significant implications for environmental perception in mobile robots and for product quality inspection by industrial robots. This study addresses the challenge by proposing GA-MMA-FCN, a multimodal robot image anomaly detection model for images with partial defects that integrates a multimodal fusion attention network, a generative adversarial network (GAN), and a fully connected network. By jointly considering several perceptual modalities, such as images, text, and sound, the model captures crucial information efficiently, improving the precision and robustness of anomaly detection. In detailed experimental validation, the model significantly improves accuracy, recall, precision, AUC, and F1-score. The results indicate that GA-MMA-FCN offers an efficient and reliable solution for robot image anomaly detection in multimodal environments and provides important support for practical applications in robotic systems.
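The abstract reports results in terms of accuracy, recall, precision, AUC, and F1-score. As a reading aid, the following is a minimal sketch of how these five metrics are conventionally computed for a binary anomaly detector. It is not the authors' evaluation code: the function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives; label 1 means 'anomalous'."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 from hard 0/1 predictions."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def roc_auc(y_true, scores):
    """AUC as the probability that a random anomalous sample receives a
    higher score than a random normal sample (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one sample of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): ground-truth labels and detector anomaly scores.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.2, 0.7]
threshold = 0.5  # assumed decision threshold, not taken from the paper
y_pred = [1 if s >= threshold else 0 for s in scores]

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, auc)
```

Accuracy, precision, recall, and F1 depend on the chosen threshold, whereas AUC is computed from the raw scores and is threshold-independent, which is one reason the two kinds of metric are usually reported together.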

Published

2025-02-23

Issue

Vol. 2 No. 4 (2025)

Section

Articles

How to Cite

Tsai, S. (2025). A Robot Image Anomaly Detection Model with Multimodal Cues under Partial Image Defects. Journal of Information and Computing, 2(4), 69-85. https://itip-submit.com/index.php/JIC/article/view/100