Infant facial expression recognition is important for early care because it helps parents and caregivers identify an infant's emotional state. In this study, we propose an improved YOLOv8 model for infant facial expression recognition. Compared with the original YOLOv8 model, the proposed model introduces the following improvements. First, the Swin Transformer V2 structure is introduced into the backbone network to improve the model’s ability to capture global features.
Second, the AKConv and RepViTBlock modules are added to the head structure to optimize feature extraction and fusion. Third, BiFPN is used as the feature pyramid network (FPN) module to enhance multi-scale feature fusion. With these improvements, the proposed YOLOv8 model can effectively handle the subtle differences and multi-scale changes in infant facial expressions, achieving high precision and high recall, which makes it suitable for recognizing complex and fine-grained facial expressions. The dataset consists of more than 9000 images, labeled as cry, happy, neutral, and back of head.
The experimental results show that the improved YOLOv8 model outperforms the original YOLOv8 and other mainstream facial expression recognition models.
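
As an illustration of the multi-scale fusion idea behind BiFPN mentioned above, the following is a minimal sketch of weighted feature fusion with non-negative, normalized weights, in the style commonly associated with BiFPN. It is not the implementation used in this study: the function name `fast_normalized_fusion`, the use of flat Python lists in place of feature maps, and the default `eps` value are all assumptions made for illustration.

```python
def fast_normalized_fusion(features, weights, eps=1e-4):
    """Fuse same-length feature vectors using normalized, non-negative weights.

    Hypothetical helper for illustration only; real feature maps would be
    tensors that have already been resized to a common resolution.
    """
    if not features or len(features) != len(weights):
        raise ValueError("features and weights must be non-empty and equal in length")
    length = len(features[0])
    if any(len(f) != length for f in features):
        raise ValueError("all feature vectors must have the same length")

    # Clamp each (learnable) weight at zero so no input contributes negatively.
    clamped = [max(0.0, w) for w in weights]
    # eps keeps the denominator non-zero when every weight is clamped to zero.
    total = sum(clamped) + eps

    # Weighted sum of the inputs at each position, divided by the weight total.
    return [
        sum(w * f[i] for w, f in zip(clamped, features)) / total
        for i in range(length)
    ]
```

For example, two inputs with equal weights are averaged, while an input whose weight is negative is effectively ignored after clamping.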