Research Highlights Power of Administrative Data

by Christina Pena
November 30, 2015

The National Bureau of Economic Research (NBER) has published two working papers that demonstrate the value of using administrative data for research, with significant relevance for policymaking.

David N. Figlio and Krzysztof Karbownik of Northwestern University in Illinois and Kjell G. Salvanes of the Norwegian School of Economics note that dramatic increases in data storage capacity and computing speed in recent years have made government data collection more prevalent. They survey this goldmine for researchers in their working paper, “Education Research and Administrative Data.” The authors examine Scandinavia and Florida, both of which have relatively long histories of collecting administrative data across their populations:

  • In Norway, researchers have taken advantage of extensive longitudinal data that creates “a detailed picture of the causal relationship between schooling and earnings.” In particular, the authors cite research showing that early childhood education affects later experience in the labor market. Norway has also formalized a process for using these research findings to inform policymaking; politicians have decided to increase investment in preschool and other childhood-focused programs to improve higher education and labor market outcomes later in life.
  • In the United States, Florida has been on the cutting edge of using administrative data, linking K-12 records with postsecondary and workforce data. (See WDQC’s Blueprint survey for Florida’s progress on its longitudinal data systems.) These rich data sources have influenced policy in Florida, such as changes in teacher compensation tied to credential accumulation and classroom performance. Studies in Florida and other states, however, have gaps because individuals who move elsewhere in the United States can no longer be tracked by a single state’s data systems.

Figlio, Karbownik, and Salvanes go on to discuss the limitations of administrative data compared to surveys: limited flexibility; reliance on governments to use identifiers consistently and to open data sets to researchers; changes in definitions over time; and political concerns about security and confidentiality. They also discuss the utility of combining the two approaches. The paper concludes with an extensive and valuable list of studies from around the world that have used administrative databases.

In their working paper, “Using Linked Survey and Administrative Data to Better Measure Income: Implications for Poverty, Program Effectiveness and Holes in the Safety Net,” Bruce D. Meyer of the University of Chicago and Nikolas Mittag of the Center for Economic Research and Graduate Education and Economics Institute (CERGE-EI) in Prague show how research using linked administrative data reveals the weaknesses of relying on survey data alone.

The authors contend that the accuracy of the Current Population Survey (CPS) and other surveys has been declining, which in turn has adversely affected research that relies on those surveys for reporting income distribution, poverty, and unemployment. They note a sizeable decline in the percentage of households responding to interviewers, especially on income-related questions.

By comparing survey results against administrative records, the researchers discovered that program assistance receipt and amounts have been underreported in surveys, finding that with “the corrected data the poverty-reducing effect of all programs combined is nearly doubled while the effect of housing assistance is tripled.”

To compare survey results with administrative data, they used New York State records from the Supplemental Nutrition Assistance Program (SNAP) and Temporary Assistance for Needy Families (TANF), along with housing assistance records from the U.S. Department of Housing and Urban Development (HUD). They then matched these administrative records to CPS survey data and aggregated the results to the household level.
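The linkage step described above can be illustrated with a minimal sketch. This is not the authors' actual procedure, and all identifiers, column names, and figures here are hypothetical; it simply shows the general pattern of joining administrative program records to survey responses on a person identifier and then aggregating to the household level to detect underreporting:

```python
import pandas as pd

# Hypothetical administrative records: one row per person receiving benefits.
admin = pd.DataFrame({
    "person_id": [1, 2, 3, 5],
    "snap_amount": [150.0, 0.0, 200.0, 120.0],
})

# Hypothetical survey responses: self-reported receipt per surveyed person.
survey = pd.DataFrame({
    "person_id": [1, 2, 3, 4, 5],
    "household_id": [10, 10, 11, 11, 12],
    "reported_snap": [150.0, 0.0, 0.0, 0.0, 0.0],
})

# Link on the person identifier, keeping every surveyed person; people with
# no administrative record get a benefit amount of zero.
linked = survey.merge(admin, on="person_id", how="left")
linked = linked.fillna({"snap_amount": 0.0})

# Aggregate both the reported and the administrative amounts to households.
households = linked.groupby("household_id")[["reported_snap", "snap_amount"]].sum()

# Flag households whose administrative benefits exceed what the survey captured.
households["underreported"] = households["snap_amount"] > households["reported_snap"]
print(households)
```

In this toy data, household 10 reports its benefits accurately, while households 11 and 12 show receipt in the administrative records that the survey misses entirely, which is the kind of gap the paper quantifies.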

They found that the CPS misses one-third of housing assistance recipients, 40 percent of SNAP recipients, and 60 percent of TANF recipients. Amounts received by program recipients were also undercounted in surveys. They conclude that this underreporting has led research to understate how much government programs actually help households, and that “the safety net reaches far more people in need than the survey data suggest.”

Like the authors of the first paper, they acknowledge that administrative data are not perfect and often lack important information. They recommend linking survey data to administrative records to better understand where program support is not reaching people in need. They add that results from administrative data are instructive for improving survey design and interview procedures to reduce misreporting.

WDQC advocates using administrative data for both performance management and research, as in our recent Perkins recommendations on using wage data to measure employment outcomes, and encourages a wide range of efforts to improve practices with administrative and survey data.