Validation test for algorithms to identify rheumatoid arthritis patients in the Tuscan healthcare administrative databases
Convertino, I;Cazzato, M;Lucenteforte, E;Valdiserra, G;Cappello, E;Ferraro, S;Tillati, S;Fornili, M;Lorenzoni, V;Trieste, L;Turchetti, G;Blandizzi, C;Mosca, M;Tuccori, M
2021-01-01
Abstract
Background: Population-based observational studies largely rely on healthcare administrative databases (HAD) as data sources for selecting patients. The algorithms used to identify patients must be tested in validation studies to assess the reliability of the selection.
Objectives: This section of the PATHFINDER study aimed to test the performance of different algorithms for extracting patients with rheumatoid arthritis (RA) from the HAD of Tuscany (Italy), and to estimate the time elapsed between the date of the actual RA diagnosis (medical chart) and the supposed date of diagnosis (HAD).
Methods: A sample of patients selected from the Tuscan HAD (extracted sample) was compared with the corresponding data in the medical charts of the Rheumatology ward of Pisa University Hospital (reference). Patients were classified as RA patients on the basis of four algorithms:
1) at least one visit from 2013 to 2016 AND the first bDMARD from 2014 to 2016 AND the RA ICD-9 code;
2) at least one visit from 2013 to 2016 AND the first bDMARD from 2014 to 2016 AND the disease tax exemption code;
3) at least one visit from 2013 to 2016 AND the first bDMARD from 2014 to 2016 AND the RA ICD-9 code AND the disease tax exemption code;
4) at least one visit from 2013 to 2016 AND the first bDMARD from 2014 to 2016 AND (the RA ICD-9 code OR the disease tax exemption code), whichever occurred first.
We estimated sensitivity, specificity, and positive and negative predictive values (PPV and NPV) for each algorithm. We also estimated the median time (interquartile range, IQR) between the date of RA diagnosis recorded in the reference and that recorded in the extracted sample.
Results: Overall, 277 patients gave their consent and were included in the reference. Of these, 103 had an RA diagnosis. All performance measures exceeded 0.70 for all the algorithms tested, with the exception of the sensitivity of the first and third algorithms: 0.53 (95% CI 0.43–0.63) and 0.37 (95% CI 0.28–0.47), respectively. The fourth algorithm selected 96 true RA patients and displayed a PPV of 0.78 (95% CI 0.70–0.85), sensitivity of 0.93 (95% CI 0.86–0.97), specificity of 0.84 (95% CI 0.78–0.90), and NPV of 0.95 (95% CI 0.91–0.98). The median delay with which the HAD detected the RA diagnosis, compared with the medical chart records, was 2.2 years (IQR 0.5–8.4).
Conclusions: All the algorithms tested for the identification of RA patients in the Tuscan HAD performed well, with the fourth performing best. The median time needed by the Tuscan HAD to capture the RA diagnosis has to be taken into account when interpreting the results of future effectiveness and safety analyses in this population.
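The four classification rules and the validity measures described above can be illustrated with a minimal Python sketch. It is not the study's code: the `PatientFlags` fields, the `classify` and `validity_measures` functions, and the reduction of each criterion to a boolean flag are all assumptions made for illustration. The abstract reports 96 true positives and 103 RA patients among 277 in the reference; the false-positive count of 27 used below is not reported and is only inferred here from the published PPV of 0.78.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PatientFlags:
    """Hypothetical per-patient flags derived from the HAD (illustrative only)."""
    visit_2013_2016: bool         # at least one visit from 2013 to 2016
    first_bdmard_2014_2016: bool  # first bDMARD from 2014 to 2016
    ra_icd9: bool                 # RA ICD-9 code present
    ra_exemption: bool            # disease tax exemption code present


def classify(p: PatientFlags, algorithm: int) -> bool:
    """Return True if the patient is classified as RA by the given algorithm (1-4)."""
    base = p.visit_2013_2016 and p.first_bdmard_2014_2016
    if algorithm == 1:
        return base and p.ra_icd9
    if algorithm == 2:
        return base and p.ra_exemption
    if algorithm == 3:
        return base and p.ra_icd9 and p.ra_exemption
    if algorithm == 4:
        # "Whichever occurred first" concerns the diagnosis date assigned to the
        # patient; this boolean sketch only covers the classification itself.
        return base and (p.ra_icd9 or p.ra_exemption)
    raise ValueError("algorithm must be 1, 2, 3 or 4")


def validity_measures(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute sensitivity, specificity, PPV and NPV from a 2x2 table."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Fourth algorithm: tp=96 and 103 RA patients out of 277 come from the abstract;
# fp=27 is an assumption back-calculated from the reported PPV.
fp_assumed = 27
measures = validity_measures(
    tp=96,
    fp=fp_assumed,
    fn=103 - 96,
    tn=277 - 103 - fp_assumed,
)
for name, value in measures.items():
    print(f"{name}: {value:.2f}")
```

Under this assumed false-positive count, the four measures round to the values reported for the fourth algorithm, which suggests the reconstructed 2x2 table is at least consistent with the abstract. The confidence intervals are not reproduced here because the abstract does not state which interval method was used.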