Article
Prediction of Animal Vocal Emotions using Convolutional Neural Network
The classification of animal vocal emotions is a growing field with significant implications for animal welfare, veterinary diagnostics, and conservation, driven by the need to interpret emotional states accurately from vocalizations. Repositories such as the Animal Vocalization Database (AVD) now catalog over 500,000 vocal samples across 200 species. However, existing manual analysis methods, such as spectrographic analysis and acoustic feature scoring, have three limitations: subjectivity, with inter-observer agreement averaging only 60%; poor scalability, because the processes are time-intensive; and noise interference, which reduces accuracy by up to 30% in field settings. To address these challenges, this study proposes a Deep Learning Convolutional Neural Network (DLCNN) that classifies animal vocalizations into four emotional classes: anger, disgust, fear, and purr. Preprocessing consists of Mel-Frequency Cepstral Coefficient (MFCC) feature extraction, which captures spectral characteristics with 13–40 coefficients per frame, followed by noise reduction and normalization for robustness. The existing methods evaluated are Support Vector Machine (SVM), which achieves 82% accuracy; K-Nearest Neighbors (KNN), which achieves 78% accuracy; Decision Tree Classifier (DTC); AdaBoost; and Linear Discriminant Analysis (LDA). These methods struggle with high-dimensional MFCC data and complex emotional patterns, often overfitting or underperforming on noisy inputs. The proposed DLCNN architecture stacks multiple convolutional layers to extract spatial feature hierarchies, followed by dense layers for classification; it is trained with a categorical cross-entropy loss and optimized with Adam. With data augmentation (e.g., pitch shifting and time stretching) to improve generalization, the DLCNN achieves 100% accuracy, precision, recall, and F1-score across all classes.
This approach outperforms existing methods by capturing intricate acoustic patterns, offering a scalable, automated solution for real-time animal vocal emotion classification.
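The accuracy, precision, recall, and F1-score named in the abstract can be computed per class from paired true and predicted labels. The following is a minimal sketch, assuming labels are plain lists of class-name strings; the function names and the convention of returning 0.0 for an undefined ratio are illustrative assumptions, not details taken from the study.

```python
from collections import Counter

# The four classes named in the abstract.
CLASSES = ["anger", "disgust", "fear", "purr"]


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_metrics(y_true, y_pred, classes=CLASSES):
    """Return {class: (precision, recall, f1)} for each class.

    Assumption: a ratio with a zero denominator is reported as 0.0.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("label lists must be of equal length")
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1   # correct prediction for class t
        else:
            fp[p] += 1   # p was predicted but is wrong
            fn[t] += 1   # t was the true class but was missed
    metrics = {}
    for c in classes:
        pred_total = tp[c] + fp[c]
        true_total = tp[c] + fn[c]
        precision = tp[c] / pred_total if pred_total else 0.0
        recall = tp[c] / true_total if true_total else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics[c] = (precision, recall, f1)
    return metrics
```

A result of 100% on all four metrics corresponds to every class mapping to `(1.0, 1.0, 1.0)`, which happens only when the predicted list equals the true list on the evaluated samples.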