Session Details: Colloquium 2023

Date & Time

Wednesday, September 6, 2023, 12:30 PM - 2:00 PM

Location Name

Pickwick

Session Type

Poster

Category

Evidence synthesis innovations and technology

Speakers

Xuxu Wei, Dongzhimen Hospital, Beijing University of Chinese Medicine
Ruijin Qiu
Hong Cai Shang

Authors

Wei X¹, Qiu R¹, Zhao C², You L¹, Fan X¹, Gu C¹, Chen Z¹, Zheng R¹, Shang H¹

¹Dongzhimen Hospital, Beijing University of Chinese Medicine, China
²Institute of Basic Research in Clinical Medicine, China Academy of Chinese Medical Sciences, China

Description

Background: Real-world data, including electronic medical records (EMRs), have the potential to provide evidence with high external validity. Clinical prediction models are one type of evidence that can take full advantage of such data. However, producing trustworthy evidence using real-world data can be challenging. To address this challenge, advanced data science methods are necessary, as they may offer opportunities to enhance the trustworthiness of evidence generated from real-world data.
Objectives: To propose a framework for transparent and traceable data science-based real-world evidence (DS-RWE) production, illustrated with an example of diagnostic model development for traditional Chinese medicine (TCM) syndrome pattern differentiation.
Methods: We propose a framework consisting of EMR repositories, an AI-engaged data cleaner, pre-defined clinical problems, transparent method rules, and a regularly updated evidence interface with a version control function (Figure 1). The framework allows efficient extraction of data from EMR repositories, with AI-engaged cleaning to ensure data accuracy. Pre-defined clinical problems ensure the specificity of the evidence produced, whereas transparent method rules support the trustworthiness of the results. The regularly updated evidence interface with version control keeps the evidence up to date and traceable.
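As an illustration of what "traceable" could mean for the version-controlled evidence interface, the following is a minimal sketch, not the authors' implementation. It assumes that each evidence release is identified by digests of the cleaned records and of the method rules, and that each release points back to its predecessor. The names `fingerprint`, `EvidenceVersion`, and `publish` are hypothetical.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import Optional


def fingerprint(obj) -> str:
    """Return a stable SHA-256 digest of a JSON-serialisable object."""
    payload = json.dumps(obj, sort_keys=True, ensure_ascii=False).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


@dataclass(frozen=True)
class EvidenceVersion:
    """One release of the evidence (hypothetical structure)."""
    number: int
    data_digest: str              # digest of the cleaned records used
    rules_digest: str             # digest of the method rules applied
    parent_digest: Optional[str]  # digest of the previous release, if any

    @property
    def digest(self) -> str:
        return fingerprint(
            [self.number, self.data_digest, self.rules_digest, self.parent_digest]
        )


def publish(records, rules, previous=None) -> EvidenceVersion:
    """Create a new release that is linked to the previous one."""
    number = 1 if previous is None else previous.number + 1
    parent = None if previous is None else previous.digest
    return EvidenceVersion(number, fingerprint(records), fingerprint(rules), parent)
```

Under these assumptions, a reader holding a release can check which data and which rules produced it, and can walk the chain of `parent_digest` values back to the first release.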
Results: Following the framework, we constructed a DS-RWE pipeline for TCM syndrome pattern diagnosis using 478,862 EMRs from 298,632 patients in our data repository. The AI-engaged data cleaner consists of Bidirectional Encoder Representations from Transformers (BERT), regular expressions, and a series of factor regulators. A pre-defined model development workflow for syndrome pattern prediction was established, and AutoGluon, an automated machine learning (AutoML) tool, was applied for model development. Updating all the diagnostic models took approximately one day.
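The abstract does not describe the regular expressions or factor regulators used in the data cleaner. The sketch below only illustrates how a rule-based extraction stage could stay transparent, by recording which named rule produced each value and where in the note it was found. The rule names, patterns, and the English example note are all hypothetical.

```python
import re

# Hypothetical extraction rules: (field name, compiled pattern).
# Each pattern captures one numeric value in group 1.
RULES = [
    (
        "temperature_c",
        re.compile(r"\btemp(?:erature)?\s*[:=]?\s*(\d{2}(?:\.\d+)?)\s*°?\s*C\b", re.IGNORECASE),
    ),
    (
        "heart_rate_bpm",
        re.compile(r"\b(?:HR|heart rate)\s*[:=]?\s*(\d{2,3})\s*(?:bpm|/min)", re.IGNORECASE),
    ),
]


def extract_fields(note: str) -> dict:
    """Extract structured values from free text, keeping provenance.

    For every rule that matches, store the parsed value, the name of the
    rule that produced it, and the character span of the matched number.
    """
    found = {}
    for name, pattern in RULES:
        match = pattern.search(note)
        if match:
            found[name] = {
                "value": float(match.group(1)),
                "rule": name,
                "span": match.span(1),
            }
    return found


if __name__ == "__main__":
    print(extract_fields("Temp: 37.8 °C, HR 92 bpm, poor appetite."))
```

Keeping the rule name and span with each value means any extracted field can be traced back to the exact text and rule that generated it.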
Conclusions: The proposed framework provides a transparent and traceable pipeline for producing trustworthy evidence from real-world data using data science methods. It may contribute to the development of more robust evidence production methods in real-world research, providing patients with trustworthy, up-to-date, and traceable evidence for their care.

Session Details

Transparent and traceable data science-based real-world evidence (DS-RWE) producing: framework and practice in traditional Chinese medicine