Case-Control Matching Algorithm

Case-Control Matching

The case-control matching algorithm matches a predefined set of case patients to patients in a control pool. A single case is matched to one or more controls based on data points in common (age, gender, race, number of healthcare encounters, etc.). Data points are binned and converted to their corresponding bin value prior to matching. The i2b2 user running the application indicates the fields to be binned in the PATIENT_DIMENSION table and the number of intervals (bins) for each field. Numeric fields are binned by quantiles specified by the investigator. Character string fields are ranked by frequency; any value with a rank greater than the user-supplied threshold is converted to 0 for "other".
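The two binning rules described above can be illustrated with a minimal Python sketch. It is not the released implementation, which ships as MS SQL Server stored procedures; the function names quantile_bins and frequency_rank_bins are hypothetical, and the "inclusive" quantile convention and the tie handling are assumptions made only for illustration.

```python
import statistics
from bisect import bisect_left
from collections import Counter


def quantile_bins(values, n_bins):
    """Map each numeric value to a bin number 1..n_bins using quantile cut points."""
    # statistics.quantiles returns n_bins - 1 cut points (assumed convention: "inclusive").
    cuts = statistics.quantiles(values, n=n_bins, method="inclusive")
    # A value equal to a cut point falls into the lower bin.
    return [bisect_left(cuts, v) + 1 for v in values]


def frequency_rank_bins(values, max_rank):
    """Map each string to its frequency rank (1 = most common), or 0 for "other"."""
    counts = Counter(values)
    # most_common() orders by descending count; equal counts keep first-seen order.
    ranks = {value: rank
             for rank, (value, _) in enumerate(counts.most_common(), start=1)}
    return [ranks[v] if ranks[v] <= max_rank else 0 for v in values]


ages = [10, 20, 30, 40, 50, 60, 70, 80]
age_bins = quantile_bins(ages, 4)

races = ["white", "white", "white", "black", "black", "asian"]
race_bins = frequency_rank_bins(races, max_rank=2)
```

In this sketch the eight ages fall two per quartile, and the least frequent race value, whose rank of 3 exceeds the threshold of 2, is converted to 0.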

The bin results are stored in a temporary table allocated for each match request. Numeric values are binned a total of three times: once with the original number of requested bin intervals, then twice more with the number of intervals approximately 10% and 20% smaller than the original, respectively. For example, if the user requests patient age binned into 8 quantiles, three bin results are generated, one each at 8, 7, and 6 quantiles. The decreasing number of quantiles corresponds to weaker match strengths when these bin values are used to match cases to controls because an exact match cannot be found. Decreasing the number of bins increases the size of each interval into which data are placed, and so increases the allowable difference between a case and the controls that may be matched to it.
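As a rough sketch of how the three bin counts relate, the Python snippet below derives the reduced counts from the requested one. The rounding rule (round to nearest, never below one bin) and the name relaxed_bin_counts are assumptions; the text only states that the reductions are approximately 10% and 20%.

```python
def relaxed_bin_counts(requested):
    """Return the three bin counts, strongest match level first."""
    levels = []
    for percent_kept in (100, 90, 80):
        # Assumed rule: round to the nearest integer, with a floor of one bin.
        count = max(1, (requested * percent_kept + 50) // 100)
        levels.append(count)
    return levels


age_levels = relaxed_bin_counts(8)
```

For a request of 8 quantiles this reproduces the 8, 7, and 6 quantile levels given in the example above.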

Cases are then randomly joined (matched) to controls by comparing bin values in order of decreasing match strength. Matching is accomplished in a bulk join step, and the results are ordered by strength. The desired number of controls per case is then selected into the final results, with the strongest matches selected first.
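The join-then-select step can be sketched in Python as follows. This is an illustration only and not the released stored procedures. The function name match_cases, the seeded random tie-break, and the rule that each control is used at most once are assumptions that the text does not confirm. Each patient carries one bin key per strength level, with index 0 the strongest.

```python
import random


def match_cases(cases, controls, per_case, seed=0):
    """cases/controls: dict of id -> list of bin keys, index 0 = strongest level."""
    rng = random.Random(seed)
    # Bulk "join": every (case, control) pair that agrees at some level,
    # tagged with the strongest level at which the bin keys are equal.
    pairs = []
    for case_id, case_keys in cases.items():
        for control_id, control_keys in controls.items():
            for level, (a, b) in enumerate(zip(case_keys, control_keys)):
                if a == b:
                    pairs.append((level, rng.random(), case_id, control_id))
                    break
    # Strongest matches first; the random number breaks ties within a level.
    pairs.sort()
    matched = {case_id: [] for case_id in cases}
    used = set()  # assumption: a control is assigned to at most one case
    for level, _, case_id, control_id in pairs:
        if control_id in used or len(matched[case_id]) >= per_case:
            continue
        matched[case_id].append((control_id, level))
        used.add(control_id)
    return matched


# Bin keys are (age_bin, sex) at 8, 7, and 6 age quantiles (made-up values).
cases = {"case_a": [(5, "F"), (4, "F"), (3, "F")]}
controls = {
    "ctrl_1": [(5, "F"), (4, "F"), (3, "F")],  # agrees at the strongest level
    "ctrl_2": [(6, "F"), (4, "F"), (3, "F")],  # agrees only after relaxing bins
    "ctrl_3": [(2, "M"), (2, "M"), (1, "M")],  # never agrees
}
result = match_cases(cases, controls, per_case=2)
```

Here case_a receives ctrl_1 at level 0 and ctrl_2 at level 1, while ctrl_3 is never selected because its bin keys differ at every level.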

See documentation included with the release for application details.

The first version of the Case-Control Matching algorithm is packaged as MS SQL Server stored procedure scripts that operate on an i2b2 1.6 instance.

Downloads

| File | Size | Version | Description | Date |
| --- | --- | --- | --- | --- |
| Case_Control_Matcher_1.0b1a.zip | 186 KB | 1.0b1a | Beta | 2013-10-28 |
