Journal article

Developing a Data and Analytics Platform to Enable a Breast Cancer Learning Health System at a Regional Cancer Center

Abstract

PURPOSE: This study documents the creation of an automated, longitudinal, and prospective data and analytics platform for breast cancer at a regional cancer center. This platform combines principles of data warehousing with natural language processing (NLP) to provide the integrated, timely, meaningful, high-quality, and actionable data required to establish a learning health system. METHODS: Data from six hospital information systems and one external data source were integrated on a nightly basis by automated extract/transform/load jobs. Free-text clinical documentation was processed using a commercial NLP engine. RESULTS: The platform contains 141 data elements for 7,019 patients with newly diagnosed breast cancer who received care at our regional cancer center from January 1, 2014, to June 3, 2022. Daily updating of the database takes an average of 56 minutes. Evaluation of the tuning of NLP jobs found overall high performance, with an F1 score of 1.0 for 19 variables and an F1 score above 0.95 for a further 16 variables. CONCLUSION: This study describes how data warehousing combined with NLP can be used to create a prospective data and analytics platform to enable a learning health system. Although the upfront time investment required to create the platform was considerable, now that it has been developed, daily data processing is completed automatically in less than an hour.
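The abstract reports NLP performance as per-variable F1 scores but does not describe the evaluation procedure. As a minimal sketch of the standard F1 definition only, the Python below computes F1 for one binary variable by comparing NLP output against a manually reviewed reference. The function names and the sample labels are hypothetical and are not taken from the study.

```python
from typing import Sequence


def f1_from_counts(tp: int, fp: int, fn: int) -> float:
    """Standard F1: 2*TP / (2*TP + FP + FN), the harmonic mean of precision and recall."""
    denominator = 2 * tp + fp + fn
    if denominator == 0:
        # No positives in either the prediction or the reference.
        # Returning 0.0 here is a convention chosen for this sketch.
        return 0.0
    return 2 * tp / denominator


def f1_for_variable(predicted: Sequence[bool], reference: Sequence[bool]) -> float:
    """Compare NLP-extracted labels with reference labels for one binary variable."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    tp = sum(p and r for p, r in zip(predicted, reference))
    fp = sum(p and not r for p, r in zip(predicted, reference))
    fn = sum(r and not p for p, r in zip(predicted, reference))
    return f1_from_counts(tp, fp, fn)


# Hypothetical labels for five documents (illustrative only, not study data).
nlp_output = [True, True, False, True, False]
chart_review = [True, True, False, False, True]
score = f1_for_variable(nlp_output, chart_review)
```

Under this definition, an F1 of 1.0 for a variable means the NLP output produced no false positives and no false negatives against the reference on the evaluated sample.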

Authors

Petch J; Kempainnen J; Pettengell C; Aviv S; Butler B; Pond G; Saha A; Bogach J; Allard-Coutu A; Sztur P

Journal

JCO Clinical Cancer Informatics, Vol. 7, No. 7

Publisher

American Society of Clinical Oncology (ASCO)

Publication Date

January 1, 2023

DOI

10.1200/cci.22.00182

ISSN

2473-4276
