Metadata-Driven Creation of Data Marts From an EAV-Modeled Clinical Research Database

Generic clinical study data management systems can record data on an arbitrary number of parameters in an arbitrary number of clinical studies without requiring modification of the database schema. They achieve this by using an Entity-Attribute-Value (EAV) model for clinical data. While the EAV model is very flexible for building transaction-oriented systems for data entry and browsing of individual forms, EAV-modeled data is unsuitable for direct analytical processing, which is the focus of data marts. For analysis, such data must first be extracted and restructured appropriately. This paper describes how that process, which is non-trivial and highly error-prone when performed using non-systematic approaches, can be automated by judicious use of the study metadata, that is, the descriptions of measured parameters and their higher-level grouping. The metadata not only drives the process but is also exported along with the data to facilitate human interpretation of the extracted data.
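
The restructuring described above can be illustrated with a minimal sketch, which is not the system described in the paper. It assumes a hypothetical metadata dictionary that maps each attribute to a grouping (here called a "form"), a human-readable label, and a type converter, and it pivots EAV rows into one conventional table per group. All names (`METADATA`, `EAV_ROWS`, `build_marts`) and the sample values are invented for illustration.

```python
from collections import defaultdict

# Hypothetical study metadata: each attribute has a higher-level group ("form"),
# a descriptive label, and a converter for its stored text value.
METADATA = {
    "sbp": {"form": "vitals", "label": "Systolic BP (mmHg)", "cast": int},
    "dbp": {"form": "vitals", "label": "Diastolic BP (mmHg)", "cast": int},
    "hgb": {"form": "labs", "label": "Hemoglobin (g/dL)", "cast": float},
}

# Hypothetical EAV rows: (entity, attribute, value stored as text).
EAV_ROWS = [
    ("p1", "sbp", "120"),
    ("p1", "dbp", "80"),
    ("p2", "sbp", "135"),
    ("p1", "hgb", "13.5"),
]


def build_marts(eav_rows, metadata):
    """Pivot EAV rows into one wide table per metadata-defined group."""
    grouped = defaultdict(dict)  # form -> entity -> {attribute: typed value}
    for entity, attribute, raw in eav_rows:
        meta = metadata.get(attribute)
        if meta is None:
            # Fail loudly instead of silently dropping undescribed data.
            raise KeyError(f"no metadata for attribute {attribute!r}")
        grouped[meta["form"]].setdefault(entity, {})[attribute] = meta["cast"](raw)

    marts = {}
    for form, by_entity in grouped.items():
        # Column order comes from the metadata, not from the data.
        columns = [a for a, m in metadata.items() if m["form"] == form]
        marts[form] = {
            "columns": columns,
            # Labels are exported with the data to aid interpretation.
            "labels": {a: metadata[a]["label"] for a in columns},
            "rows": [
                {"entity": e, **{c: values.get(c) for c in columns}}
                for e, values in sorted(by_entity.items())
            ],
        }
    return marts
```

Calling `build_marts(EAV_ROWS, METADATA)` yields a "vitals" table and a "labs" table. Parameters never recorded for an entity appear as `None`, which is how the sparse EAV representation becomes a conventional one-column-per-parameter layout.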


A project of The Center for Medical Informatics, Yale University School of Medicine.