Control Matching

The case-control matching algorithm matches a predefined set of case patients to patients in a control pool. Each case is matched to one or more controls based on the data points they have in common (age, gender, race, number of healthcare encounters, etc.). Before matching, each data point is binned and replaced by its bin value. The i2b2 user running the application specifies which fields of the PATIENT_DIMENSION table to bin and the number of intervals (bins) for each field. Numeric fields are binned using NTILE. Character string fields are ranked by frequency; any value whose rank is greater than the user-supplied threshold is converted to 0, meaning "other".
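The two binning rules described above can be sketched in Python as follows. This is an illustration only, not the application's code: the function names `ntile` and `frequency_bins` are hypothetical, `ntile` imitates the behavior of the SQL `NTILE` window function (bin sizes differ by at most one, with the larger bins first), and using the frequency rank itself as the bin value for retained strings is an assumption, as is breaking frequency ties alphabetically.

```python
from collections import Counter


def ntile(values, n_bins):
    """Assign each value a 1-based bin, imitating SQL NTILE(n_bins) OVER (ORDER BY value)."""
    # Indices of the values in ascending order of value.
    order = sorted(range(len(values)), key=lambda i: values[i])
    base, extra = divmod(len(values), n_bins)
    bins = [0] * len(values)
    pos = 0
    for b in range(1, n_bins + 1):
        # The first `extra` bins receive one additional row, as NTILE does.
        size = base + (1 if b <= extra else 0)
        for i in order[pos:pos + size]:
            bins[i] = b
        pos += size
    return bins


def frequency_bins(values, max_rank):
    """Rank string values by frequency (rank 1 = most frequent).

    Values ranked above max_rank become 0 ("other"). Using the rank as the
    bin value and breaking ties alphabetically are assumptions of this sketch.
    """
    counts = Counter(values)
    ranked = sorted(counts, key=lambda v: (-counts[v], v))
    rank_of = {v: r for r, v in enumerate(ranked, start=1)}
    return [rank_of[v] if rank_of[v] <= max_rank else 0 for v in values]


ages = [23, 35, 47, 51, 62, 70, 18, 44]
age_bins = ntile(ages, 4)

races = ["A", "B", "A", "C", "B", "A", "D"]
race_bins = frequency_bins(races, max_rank=2)
```

Note that, like `NTILE`, this sketch splits rows by position in the sorted order, so equal values can land in adjacent bins when they straddle a bin boundary.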

The bin results are stored in a temporary table allocated for each match request. Numeric values are binned three times: once with the requested number of intervals, and twice more with approximately 10% and 20% fewer intervals, respectively. For example, if the user requests that patient age be binned into 8 intervals, three bin results are generated, one each at 8, 7, and 6 intervals. The coarser bin results are used to match cases to controls when an exact match cannot be found, and they correspond to weaker match strengths: with fewer bins, each interval is wider, so a larger difference is allowed between a case and the controls matched to it. Three match strengths are hardcoded into the application: the requested number of intervals (match strength 1), approximately 10% fewer intervals (match strength 2), and approximately 20% fewer intervals (match strength 3).
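As a rough illustration of how the three interval counts relate, the sketch below derives them from the requested count. The source says only "approximately" 10% and 20% fewer, so the rounding rule here (round down, never below one bin) is an assumption chosen to reproduce the 8, 7, 6 example; the function name `strength_bin_counts` is hypothetical.

```python
def strength_bin_counts(requested):
    """Map match strength (1 = strongest) to the number of bin intervals.

    Rounding down with integer arithmetic is an assumption; it reproduces
    the documented example of 8 -> 8, 7, 6.
    """
    return {
        1: requested,
        2: max(1, requested * 9 // 10),  # roughly 10% fewer intervals
        3: max(1, requested * 8 // 10),  # roughly 20% fewer intervals
    }


counts_for_age = strength_bin_counts(8)
```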

Cases are then randomly matched to controls by comparing bin values, in order of decreasing match strength. Matching is performed as a single bulk join, and the results are ordered by strength. The desired number of controls per case is then selected into the final results, with the strongest matches selected first.
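The selection step can be pictured with the following in-memory sketch. It is not the application's join logic: the function name `match_cases`, the data layout (each patient carries one tuple of bin values per match strength), and the rule that a control is assigned to at most one case are all assumptions made for illustration.

```python
import random


def match_cases(cases, controls, per_case, seed=0):
    """Pick up to per_case controls for each case, strongest matches first.

    cases and controls map a patient id to {strength: tuple_of_bin_values}.
    Using each control at most once is an assumption of this sketch.
    """
    rng = random.Random(seed)
    candidates = []
    # "Bulk join": pair every case with every control whose bins agree
    # at a given strength, tagging each pair with a random tiebreaker.
    for strength in (1, 2, 3):
        for case_id, case_bins in cases.items():
            for ctrl_id, ctrl_bins in controls.items():
                if case_bins[strength] == ctrl_bins[strength]:
                    candidates.append((strength, rng.random(), case_id, ctrl_id))
    # Order by strength (1 is strongest), randomly within a strength.
    candidates.sort()
    matches = {case_id: [] for case_id in cases}
    used = set()
    for strength, _, case_id, ctrl_id in candidates:
        if ctrl_id in used or len(matches[case_id]) >= per_case:
            continue
        matches[case_id].append((ctrl_id, strength))
        used.add(ctrl_id)
    return matches


cases = {"c1": {1: (3, 1), 2: (3, 1), 3: (2, 1)}}
controls = {
    "k1": {1: (3, 1), 2: (3, 1), 3: (2, 1)},  # agrees at every strength
    "k2": {1: (4, 1), 2: (3, 1), 3: (2, 1)},  # agrees only at strengths 2 and 3
    "k3": {1: (5, 0), 2: (4, 0), 3: (3, 0)},  # never agrees
}
result = match_cases(cases, controls, per_case=2)
```

In this example the case receives `k1` at strength 1 and `k2` at strength 2, while `k3` is never selected.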

See documentation included with the release for application details.

Downloads

File                              Size     Version   Description   Date
Case_Control_Matcher_1.0b1.zip    184 KB   1.0b1     Beta          2013-10-28
