The AFS Endoscopic Classification of Esophago-Gastric Junction Integrity is Superior to the Hill Classification in Terms of Interobserver Variability
Visaggi, Pierfrancesco (Member of the Collaboration Group); De Bortoli, Nicola (Writing – Review & Editing)
2025-01-01
Abstract
Goals: To compare the interobserver variability of the American Foregut Society (AFS) endoscopic classification of esophago-gastric junction (EGJ) integrity with the Hill classification among endoscopists with varying expertise, assessing reproducibility in real-world clinical settings.
Background: Gastroesophageal reflux disease (GERD) is a common indication for esophagogastroduodenoscopy (EGD), yet endoscopic EGJ assessment often lacks standardization. The Hill classification, traditionally used, has limitations, including vague grade distinctions and poor interobserver reliability. The AFS classification, incorporating axial hiatal hernia length (L), hiatal aperture diameter (D), and gastroesophageal flap valve presence (F), offers a standardized protocol to improve consistency and stratification of EGJ disruption.
Study: This multicenter, prospective, blinded study involved 21 endoscopists (10 gastroenterologists, 11 surgeons) evaluating 48 deidentified EGJ video clips using both Hill and AFS classifications. Participants, with varied experience levels, graded EGJ integrity independently. Interobserver agreement was assessed using the Intraclass Correlation Coefficient (ICC) and Fleiss' Kappa (κ) for overall and individual components.
Results: The AFS classification demonstrated superior interobserver agreement (ICC=0.749) compared with the Hill classification (ICC=0.651) (P=0.002). AFS grade 4 showed the highest concordance (κ=0.730), while Hill grade 2 had the lowest (κ=0.239). Agreement was excellent among experienced AFS users (ICC=0.813) and good for first-time users (ICC=0.709) (P=0.003). Interobserver variability of the AFS classification among first-time users was significantly lower than the Hill classification variability (P=0.025). L and D components showed good agreement (ICC=0.729, 0.686), while F had moderate agreement (ICC=0.441). No significant specialty-based differences were observed (P=0.516).
Conclusions: The AFS classification offers lower interobserver variability and greater reproducibility than the Hill classification, enhancing diagnostic consistency in GERD evaluation across expertise levels and specialties.


