A common problem when integrating data sets from heterogeneous sources in a Data Warehouse is to provide a mapping of the data not only on a syntactic level, but also on a semantic level. In i2b2, every instance of an observation_fact is associated with an instance of a concept in concept_dimension. i2b2 concepts are rather straightforward. From a terminological perspective, they mainly consist of an identifier and a label. Furthermore, concepts are arranged hierarchically by a broader/narrower relation. In particular, there is no attribute defining the meaning of a concept. So, if there is a concept gender in a clinical trial and a second concept gender in a registry, there is no way of recognizing whether they are equivalent. In fact, from i2b2's perspective, they are as different from each other as any two unrelated concepts.
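The limitation described above can be illustrated with a minimal sketch. It is not i2b2 code: the field names are modelled on columns of concept_dimension, while the class, the helper functions and all codes and paths are hypothetical values invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    """Simplified stand-in for a row of concept_dimension (assumed shape)."""
    concept_cd: str    # identifier
    name_char: str     # human-readable label
    concept_path: str  # position in the broader/narrower hierarchy


def is_narrower(child: Concept, parent: Concept) -> bool:
    """True if `child` lies below `parent` in the path hierarchy."""
    return (child.concept_path != parent.concept_path
            and child.concept_path.startswith(parent.concept_path))


def same_concept(a: Concept, b: Concept) -> bool:
    """The only comparison available: identifiers match or they do not."""
    return a.concept_cd == b.concept_cd


# Hypothetical concepts from two sources (codes and paths are made up).
trial_demographics = Concept("TRIAL:DEMO", "Demographics",
                             "\\Trial\\Demographics\\")
trial_gender = Concept("TRIAL:GENDER", "Gender",
                       "\\Trial\\Demographics\\Gender\\")
registry_gender = Concept("REG:GENDER", "Gender",
                          "\\Registry\\Patient\\Gender\\")

# The hierarchy is expressible ...
print(is_narrower(trial_gender, trial_demographics))
# ... but equivalence of meaning is not: the labels agree, the concepts do not.
print(trial_gender.name_char == registry_gender.name_char)
print(same_concept(trial_gender, registry_gender))
```

Comparing labels is not a reliable substitute, since identical labels may denote different things and different labels may denote the same thing. Recognizing equivalence therefore requires information that is kept outside the concept itself.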
Medical terminologies, as human-curated ontologies, play an important part in providing a common understanding. In reality, however, most data elements imported from sources such as the clinical data management system, epidemiological registries, survey tools from longitudinal cohorts or the local EHR will not have an explicit annotation referencing such terminologies.
A Metadata Repository is a central registry of