Diagnosis of COVID-19 by sound-based analysis of vocal recordings

Carreiro-Martins, P.; Paixão, P.; Caires, I.; Rodrigues, A.; Matias, P.; Gamboa, H.; Carreiro, A.; Soares, F.; Gomez, P.; Sousa, J.; Neuparth, N.

doi:10.1016/j.pulmoe.2023.03.003

Commentary

The COVID-19 pandemic has had a significant impact on the world, causing widespread illness, death, and economic disruption. It continues to evolve as new virus variants emerge and governments respond with new measures. This situation has driven the development of a diverse range of diagnostic tests [1], which fall into three major categories: molecular tests, rapid antigen tests, and serology. Molecular tests, namely real-time reverse transcription polymerase chain reaction (RT-PCR), have been the reference standard, but all three categories are invasive, albeit minimally, and costly. Although at-home rapid antigen tests are now available, they are invasive and uncomfortable, their accuracy varies with how they are performed, and they must be readily accessible [1]. These characteristics may raise perceived barriers to COVID-19 testing and limit detection of the disease. Vaccines have been unevenly distributed and accepted, and cases are still increasing in some areas. It is unclear what the future holds for the pandemic, but it will likely continue to require effort from all involved.

SARS-CoV-2 infection primarily affects the respiratory tract, including the upper and lower airways, and contributes to the disruption of normal vocalisations. Given this, some research groups have attempted to develop more convenient and accessible COVID-19 diagnostic methods by using machine learning to analyse voice recordings and other audio signals, such as coughing, breathing sounds, and breathing rate. Voice and cough analysis is an attractive approach to screening for respiratory disease symptoms [2,3], as sound recordings are simple to acquire and non-invasive. However, because the differences in voice and cough characteristics are subtle, artificial intelligence is required to detect specific disease patterns, discard confounding factors that induce similar manifestations, and reduce the effects of environmental noise. Geographic and idiolectal linguistic variations can also affect analyses of these samples.

Most existing algorithms rely on crowdsourced audio sample databases. However, such databases do not ensure the quality of the recordings and contain only a limited number of positive cases. Notable examples of respiratory audio-based sample collection include studies that used the Coswara dataset [4] and the COVID-19 Sounds database [5]. The accuracy reported for the algorithms in these studies is debatable because the results derive from unreliable data sources, owing either to unconfirmed COVID-19 status or to an uncontrolled recording environment.

For this reason, some authors have chosen to conduct their own recording protocols to ensure the reliability of their results. With a sample of 70 SARS-CoV-2-positive patients and 70 healthy individuals, Robotti et al. [6] demonstrated that machine learning could accurately discriminate between the two groups (accuracy, 90.24%). Shimon et al. [7] included 57 patients (25 SARS-CoV-2-positive), achieving an average accuracy of 80%. Pinkas et al. [8] included recordings of 29 SARS-CoV-2-positive and 59 SARS-CoV-2-negative patients and achieved an accuracy of 79%.
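Accuracy figures from small cohorts are easier to interpret next to the accuracy of a trivial classifier that always predicts the larger class. The sketch below computes that baseline from the cohort sizes quoted above. It assumes that the 32 participants in Shimon et al. who were not SARS-CoV-2-positive were all negative; the function and variable names are illustrative and do not come from any of the cited studies.

```python
# Cohort sizes as quoted in the text: (positives, negatives).
# Assumption: in Shimon et al., the 57 - 25 = 32 remaining patients were negative.
cohorts = {
    "Robotti et al.": (70, 70),
    "Shimon et al.": (25, 57 - 25),
    "Pinkas et al.": (29, 59),
}


def majority_baseline(positives: int, negatives: int) -> float:
    """Accuracy of a classifier that always predicts the larger class."""
    return max(positives, negatives) / (positives + negatives)


baselines = {name: majority_baseline(p, n) for name, (p, n) in cohorts.items()}

for name, acc in baselines.items():
    print(f"{name}: majority-class baseline accuracy = {acc:.1%}")
```

Under these assumptions the baselines are 50.0%, about 56.1%, and about 67.0%, respectively, so an accuracy of 79% on an imbalanced cohort represents a smaller gain over chance than the same figure would on a balanced one.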

Matias et al. [9] combined crowdsourced databases with algorithms that assess the quality of voice recordings, in order to overcome the above-mentioned issues and detect SARS-CoV-2 infection using a more reliable and less noisy dataset. This approach reached accuracy values ranging from 75% to 84% on Coswara and from 67% to 81% on a subset of the COVID-19 Sounds dataset.
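To give a rough idea of what screening recordings for quality can look like, the sketch below rejects recordings that are nearly silent or heavily clipped. It is a hypothetical illustration and not the quality assessment method of the cited study: the two criteria, the thresholds, and all function names are assumptions, and the signals are synthetic rather than real recordings.

```python
import math


def rms(samples):
    """Root-mean-square level of samples scaled to [-1, 1]."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def clipping_ratio(samples, limit=0.99):
    """Fraction of samples at or beyond the clipping limit."""
    return sum(1 for s in samples if abs(s) >= limit) / len(samples)


def passes_quality_gate(samples, min_rms=0.01, max_clipping=0.01):
    """Hypothetical gate: reject empty, near-silent, or heavily clipped audio."""
    if not samples:
        return False
    return rms(samples) >= min_rms and clipping_ratio(samples) <= max_clipping


# Synthetic one-second signals at 8 kHz (illustrative only).
rate = 8000
tone = [0.3 * math.sin(2 * math.pi * 220 * t / rate) for t in range(rate)]
near_silent = [0.001 * s for s in tone]                  # far too quiet
clipped = [max(-1.0, min(1.0, 5.0 * s)) for s in tone]   # saturated

results = {
    name: passes_quality_gate(signal)
    for name, signal in [("tone", tone), ("near_silent", near_silent), ("clipped", clipped)]
}
print(results)
```

Only the clean tone passes this gate. A real pipeline would likely add further checks, such as background noise level or the presence of speech, before a recording is used to train or evaluate a model.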

The available literature suggests that machine learning tools using voice recordings and other audio signals could provide a non-invasive, low-cost way to screen for and flag SARS-CoV-2 infection, the results of which could later be confirmed with a clinically validated test. Such algorithms are powerful in their ability to automatically learn hidden patterns and decision rules from data. Combined with signal processing and expertise in feature engineering, they can form reliable tools to help track infection in a more decentralised way. This is particularly important in diseases where isolating infected individuals is critical.

Nevertheless, future studies should validate the robustness of these algorithms in real-world settings and assess the feasibility of adapting this approach to other respiratory infections, such as seasonal influenza. Machine learning tools could also be used to monitor chronic respiratory illnesses, including asthma, COPD, obstructive sleep apnoea, and tuberculosis [8]. In the future, this strategy might contribute to more decentralised screening for respiratory diseases, facilitating early diagnosis and monitoring of disease progression, potentially improving outcomes for patients with these conditions, and reducing pressure on the healthcare system.

References

[1]

A Benda, L Zerajic, A Ankita, E Cleary, Y Park, S. Pandey.

COVID-19 testing and diagnostics: a review of commercialized technologies for cost, convenience and quality of tests.

Sensors, 21 (2021), pp. 6581

https://www.mdpi.com/1424-8220/21/19/6581

[2]

A Windmon, M Minakshi, P Bharti, S Chellappan, M Johansson, BA Jenkins, et al.

TussisWatch: a smart-phone system to identify cough episodes as early symptoms of chronic obstructive pulmonary disease and congestive heart failure.

IEEE J Biomed Health Inform, 23 (2019), pp. 1566-1573

http://dx.doi.org/10.1109/JBHI.2018.2872038

[3]

GHR Botha, G Theron, RM Warren, M Klopper, K Dheda, PD van Helden, et al.

Detection of tuberculosis by automatic cough sound analysis.

Physiol Meas, 39 (2018), pp. 45005

https://doi.org/10.1088/1361-6579/aab6d0

[4]

N Sharma, P Krishnan, R Kumar, S Ramoji, SR Chetupalli, R Nirmala, et al.

Coswara - a database of breathing, cough, and voice sounds for COVID-19 diagnosis.

Proceedings of the annual conference of the International Speech Communication Association, INTERSPEECH. 2020, pp. 4811-4815

[5]

T Xia, D Spathis, C Brown, J Chauhan, A Grammenos, J Han, et al.

COVID-19 sounds: a large-scale audio dataset for digital respiratory screening.

Thirty-fifth conference on neural information processing systems datasets and benchmarks track (round 2), pp. 1-13

https://covid19.who.int/

[6]

C Robotti, G Costantini, G Saggio, V Cesarini, A Calastri, E Maiorano, et al.

Machine learning-based voice assessment for the detection of positive and recovered COVID-19 patients.

J Voice (2021)

http://dx.doi.org/10.1016/j.jvoice.2021.11.004

[7]

C Shimon, G Shafat, I Dangoor, A. Ben-Shitrit.

Artificial intelligence enabled preliminary diagnosis for COVID-19 from voice cues and questionnaires.

J Acoust Soc Am, 149 (2021), pp. 1120-1124

http://dx.doi.org/10.1121/10.0003434

https://asa.scitation.org/doi/10.1121/10.0003434

[8]

G Pinkas, Y Karny, A Malachi, G Barkai, G Bachar, V. Aharonson.

SARS-CoV-2 detection from voice.

IEEE Open J Eng Med Biol, 1 (2020), pp. 268-274

http://dx.doi.org/10.1109/OJEMB.2020.3026468

https://ieeexplore.ieee.org/document/9205643/

[9]

P Matias, J Costa, AV. Carreiro, H Gamboa, I Sousa, P Gomez, et al.

Clinically relevant sound-based features in COVID-19 identification: robustness assessment with a data-centric machine learning pipeline.

IEEE Access, 10 (2022), pp. 105149-105168
