Pertussis is highly contagious; thus, prompt identification of cases is essential to control outbreaks. Clinicians experienced with the disease can easily identify classic cases, where patients have bursts of rapid coughing followed by gasps, and a characteristic whooping sound. However, many clinicians have never seen a case, and thus may miss initial cases during an outbreak. The purpose of this project was to use voice-recognition software to distinguish pertussis coughs from croup and other coughs.
We collected a series of recordings representing pertussis, croup and miscellaneous coughing by children. We manually categorized coughs as either pertussis or non-pertussis, and extracted features for each category. We used Mel-frequency cepstral coefficients (MFCC), a sampling rate of 16 KHz, a frame Duration of 25 msec, and a frame rate of 10 msec. The coughs were filtered. Each cough was divided into 3 sections of proportion 3-4-3. The average of the 13 MFCCs for each section was computed and made into a 39-element feature vector used for the classification. We used the following machine learning algorithms: Neural Networks, K-Nearest Neighbor (KNN), and a 200 tree Random Forest (RF). Data were reserved for cross-validation of the KNN and RF. The Neural Network was trained 100 times, and the averaged results are presented.
After categorization, we had 16 examples of non-pertussis coughs and 31 examples of pertussis coughs. Over 90% of all pertussis coughs were properly classified as pertussis. The error rates were: Type I errors of 7%, 12%, and 25% and Type II errors of 8%, 0%, and 0%, using the Neural Network, Random Forest, and KNN, respectively.
Our results suggest that we can build a robust classifier to assist clinicians and the public to help identify pertussis cases in children presenting with typical symptoms.
Journal:PLoS One. 2013