Human versus AI in audiological practice: A comparative evaluation of ChatGPT and physician treatment decisions in idiopathic sudden sensorineural hearing loss

Caragli, Valeria; Pellegrino, Michele; Pingani, Luca; Galeazzi, Gian Maria; Soloperto, Davide; Genovese, Elisabetta; De Filippis, Cosimo; Marioni, Gino; Franz, Leonardo

doi:10.1371/journal.pone.0350549

Purpose: Artificial intelligence (AI) is increasingly applied in audiology, where it shows potential to support screening, diagnosis, and rehabilitation in auditory and vestibular disorders. However, its effectiveness in guiding complex therapeutic decisions, such as those for idiopathic sudden sensorineural hearing loss (ISSNHL), remains uncertain. This study compared treatment recommendations for ISSNHL made by medical doctors (MDs) with those proposed by an AI system (ChatGPT, GPT-4o model). The goal was to evaluate the concordance between human and AI decision-making, particularly regarding the administration of corticosteroids and adjunctive therapies, and to assess AI’s potential in clinical practice.

Methods: In this retrospective observational study, data from 86 patients diagnosed with ISSNHL were analysed. Treatment decisions by MDs were compared with those generated by ChatGPT using anonymized patient datasets and a series of sequential prompts simulating multidisciplinary discussion. Agreement between AI and MDs was assessed using Cohen’s Kappa coefficient.

Results: Overall, poor to fair agreement was observed between AI and clinician treatment decisions. ChatGPT did not recommend oral corticosteroids in 26 cases where MDs prescribed them (Kappa = 0.029) and recommended intravenous corticosteroids in 20 patients who were not treated with this approach by MDs (Kappa = 0.393). Discrepancies were also evident in recommendations for intratympanic steroids (Kappa = −0.044) and adjunctive therapies (Kappa = 0.035). These differences likely stem from the AI’s rigid adherence to generalized treatment protocols and limited contextual understanding.

Conclusion: While ChatGPT (GPT-4o) shows promise in generating structured, protocol-driven suggestions, its clinical decision-making capabilities for ISSNHL remain inferior to those of experienced physicians. The AI’s lack of individualized reasoning and context sensitivity resulted in frequent discordance with physician-led care. Thus, AI should currently be viewed as a supportive tool in audiological practice, with its integration requiring careful oversight and further validation in real-world clinical environments.
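
The abstract reports agreement as Cohen’s Kappa for each treatment category. As a rough illustration of how such a coefficient is computed for two sets of yes/no treatment decisions, here is a minimal sketch. The function name `cohen_kappa` and the toy decision lists are hypothetical and are not taken from the study or its data. A value of 1 indicates perfect agreement, 0 indicates agreement no better than chance, and negative values indicate agreement below chance.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same cases (illustrative helper)."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(rater_a)
    # Observed agreement: share of cases where both raters give the same label.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: sum over labels of the product of marginal frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if p_e == 1.0:
        raise ValueError("kappa undefined: both raters always use the same single label")
    return (p_o - p_e) / (1 - p_e)


# Hypothetical toy data (NOT from the study): 1 = treatment recommended, 0 = not.
md_decisions = [1, 1, 1, 1, 0, 0, 1, 0, 1, 1]
ai_decisions = [1, 0, 1, 0, 0, 1, 1, 0, 0, 1]
kappa = cohen_kappa(md_decisions, ai_decisions)
print(f"kappa = {kappa:.3f}")
```

In this toy example the two raters agree on 6 of 10 cases (observed agreement 0.6) while chance alone would give 0.5, so kappa comes out at about 0.2. This shows why raw percentage agreement can look much better than the chance-corrected coefficient.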

Human versus AI in audiological practice: A comparative evaluation of ChatGPT and physician treatment decisions in idiopathic sudden sensorineural hearing loss / Caragli, V., Pellegrino, M., Pingani, L., Galeazzi, G.M., Soloperto, D., Genovese, E., De Filippis, C., Marioni, G., Franz, L. - In: PLOS ONE. - ISSN 1932-6203. - 21:6(2026). [10.1371/journal.pone.0350549]



Publication year	2026
DOI	https://dx.doi.org/10.1371/journal.pone.0350549
WoS ID	WOS:001782083200012
Scopus ID	2-s2.0-105040548436
PubMed ID	42224219
Journal	PLOS ONE
Volume	21
Type	Abstract in journal

Files in this record:

File	Size	Format
journal.pone.0350549 (1).pdf (open access; type: VOR, publisher's published version; license: [IR] creative-commons)	261.03 kB	Adobe PDF


The metadata in IRIS UNIMORE are released under the Creative Commons CC0 1.0 Universal license, while publication files are released under the Attribution 4.0 International license (CC BY 4.0), unless otherwise indicated.