Data

Source Data

WMLAD draws administrative records from seven state agencies:

State AgencyRecord typeKey informationTime periodNumber of individuals
Employment Security DepartmentEmployment records for Unemployment Insurance-eligible jobsEarnings, hours worked, job characteristics, Unemployment Insurance receipt2000-20177,699,646 workers
Dept. of Social and Health Services; Health Care AuthorityClient program participation and benefit receipt recordsProgram participation, benefits received, race/ethnicity, sex, age, residential address2010-20174,968,258 clients
Dept. of HealthRecords on parents listed on WA birth certificatesRace/ethnicity, sex, age, education, residential address2010-2016896,558 parents
Sec. of StateVoter recordsVoting history, sex, age, residential address2006-20166,084,439 voters
Dept. of LicensingDrivers license registration recordsAge, sex, residential address2005-20168,367,317 licensees
WA State PatrolArrest recordsArrest information2000-2018777,416 people arrested

While there is substantial overlap between the populations covered by each data source, the component datasets of WMLAD represent distinct populations based on the administrative processes that lead individuals to be represented in each set of records. The table below represents the overlap between key WMLAD populations for the time period 2010-2016. Each column represents a key data source, and each row indicates what share of individuals in that data source are also included in a source designated in the column header.

 Driver’s license holdersRegistered votersWorkersDSHS clientsParents
% holding driver’s license100%83%80%66%87%
% registered to vote64%100%57%44%58%
% voted50%68%44%28%41%
% worked60%55%100%50%72%
% DSHS clients43%37%43%100%58%
% new parents10%9%11%11%100%

Data Construction

The UW team contracted with the Research and Data Analysis (RDA) division of Washington State’s Department of Social and Health Services (DSHS) to perform the matching required to construct WMLAD. RDA has capacity to perform high level, complex matching due to its two decades of experience in building and maintaining DSHS’s integrated client database (ICDB) which combines records across DSHS administrations with outside records from several other agencies, including UI records from ESD and criminal records from the state highway patrol. The project used the Link King record linkage program, a public-domain application which runs on SAS. Kevin Campbell, the Link King developer, joined RDA for this project and carried out the matching personally. Link King uses a combination of deterministic and probabilistic matching algorithms. The linking process included data standardization, blocking into similar records, identity matching, and quality review. Quality control measures included careful implementation of linkage software with an established pedigree where decision rules were modified for each linkage task based on manual review of random samples of identity groupings. As confirmation, linkage results were generally consistent with expectations given known overlap in clients from linked data sources.

Data Resources

From the raw data files, the WMLAD team has created a series of processed datasets suitable for analysis. We designate analytic datasets that may be useful for multiple inquiries as “Data Resources” and post static versions of them along with code and documentation. Data resources to date include: