Construction of minimum data set for core outcome set in real-world settings: A data-driven approach
Background: Hundreds of core outcome sets (COS) have been developed to standardize outcome reporting in clinical trials, but their use in real-world settings remains limited because of constraints on data availability and the difficulty of data cleansing. Although a few COS have matching core data sets (CDS), these CDS generally describe the data elements but lack detail on how to locate the elements in a given real-world database and how each data element maps to the outcomes in the COS. This study aimed to propose a data-driven approach to building a minimum data set (MDS) for a COS.
Study design: We used a set of healthcare quality measures and a set of real-world data elements, both released by healthcare officials in China, as an illustrative example of building an MDS for a COS. We first compiled a data element set, extracting the source and description of each data element. We also compiled a list of quality measures along with their characteristics and calculation requirements. We then mapped the data elements to each quality measure according to its calculation requirements and created a relational database to store these mapping relationships. Finally, we built a platform on this relational database that provides an interface for selecting quality measures as input and presents the resulting MDS as output, calculated as the union of the data elements mapped to the selected measures.
Conclusion and implication: This is an ongoing project funded by the National Natural Science Foundation of China. The project innovatively applies the concept of an MDS to COS by adopting a data-driven method that constructs the MDS from existing real-world data instead of proposing an independent set of data elements based on expert consensus. This approach should ease the data cleansing effort for COS, save the cost of extra data collection, and promote more efficient use of existing real-world data.
Patient, public, and/or healthcare consumer involvement: With the methodology proposed in this study, more core outcome sets can be applied in real-world clinical practice instead of being largely restricted to clinical trials.
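Illustrative sketch: The mapping-and-union step described under Study design can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's actual platform: the table names, column names, measure names, and data element names are all hypothetical placeholders, and an in-memory SQLite database stands in for whatever relational database the project uses.

```python
import sqlite3
from typing import Iterable, List


def create_demo_db() -> sqlite3.Connection:
    """Build a tiny in-memory database with a hypothetical schema and rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE data_element (
            id INTEGER PRIMARY KEY,
            name TEXT UNIQUE NOT NULL,
            source TEXT NOT NULL
        );
        CREATE TABLE quality_measure (
            id INTEGER PRIMARY KEY,
            name TEXT UNIQUE NOT NULL
        );
        -- Many-to-many mapping: which elements each measure needs.
        CREATE TABLE measure_element (
            measure_id INTEGER NOT NULL REFERENCES quality_measure(id),
            element_id INTEGER NOT NULL REFERENCES data_element(id),
            PRIMARY KEY (measure_id, element_id)
        );
        """
    )
    # All rows below are invented placeholders for illustration only.
    conn.executemany(
        "INSERT INTO data_element (id, name, source) VALUES (?, ?, ?)",
        [
            (1, "patient_id", "hypothetical_source_a"),
            (2, "admission_date", "hypothetical_source_a"),
            (3, "discharge_date", "hypothetical_source_a"),
            (4, "discharge_status", "hypothetical_source_b"),
            (5, "procedure_code", "hypothetical_source_b"),
        ],
    )
    conn.executemany(
        "INSERT INTO quality_measure (id, name) VALUES (?, ?)",
        [(1, "measure_a"), (2, "measure_b"), (3, "measure_c")],
    )
    conn.executemany(
        "INSERT INTO measure_element (measure_id, element_id) VALUES (?, ?)",
        [(1, 1), (1, 2), (1, 3), (2, 1), (2, 3), (2, 4), (3, 1), (3, 5)],
    )
    return conn


def build_mds(conn: sqlite3.Connection, measure_names: Iterable[str]) -> List[str]:
    """Return the union of data elements mapped to the selected measures."""
    names = list(measure_names)
    if not names:
        return []
    placeholders = ",".join("?" for _ in names)
    rows = conn.execute(
        f"""
        SELECT DISTINCT de.name
        FROM data_element AS de
        JOIN measure_element AS me ON me.element_id = de.id
        JOIN quality_measure AS qm ON qm.id = me.measure_id
        WHERE qm.name IN ({placeholders})
        ORDER BY de.name
        """,
        names,
    ).fetchall()
    return [row[0] for row in rows]


conn = create_demo_db()
mds = build_mds(conn, ["measure_a", "measure_b"])
print(mds)
```

In this sketch, `SELECT DISTINCT` over the mapping table produces the union, so a data element shared by several selected measures appears only once in the resulting MDS.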