Eye Spy A PSU! Automating Sampling Frame Construction from Aerial Images Using Machine Learning

Adam Eck
Trent D. Buskirk
Han Shao
Peter Stefek

Abstract

The availability of sampling frames is critical for the use of probability-based sampling methods in social science research. Extensive literature addresses how sampling frames can be constructed when the target population consists of people. Less understood is how sampling frames should be constructed when the population being studied consists of places, objects, or locations (POLs). In this paper, we propose an approach that employs machine learning to automate the construction of sampling frames of POLs from aerial (or satellite) images when the POLs of interest have distinctive visual characteristics (e.g., windmills, playgrounds, religious centers). Automating this process with machine learning reduces the time and monetary costs of having researchers manually review potentially thousands of aerial images to identify sampling units. We evaluate our approach using a case study that constructs sampling frames of windmills as POLs within the state of Iowa. We train convolutional neural networks to identify windmills within aerial images from the U.S. Department of Agriculture’s National Agriculture Imagery Program and find that our approach correctly identified about 80% of the windmills in the area of interest (1,521 out of 1,913 windmills across ten counties in Iowa) and ruled out 99% of locations lacking the POL of interest (out of more than 300,000). Thus, we achieved good coverage in the resulting sampling frames and suggest that any over-coverage could be removed with manual review of only a small number of images, rather than all of them, representing an approximate 98% reduction in the manual effort that would be required without machine learning.
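
The headline figures in the abstract can be checked with a little arithmetic. The sketch below is illustrative only and is not the authors' code. The windmill counts come from the abstract. The number of non-windmill locations (exactly 300,000) and the share ruled out (exactly 99%) are rounded assumptions, since the abstract reports only "over 300,000" and "99%". It also assumes one image per location and that only flagged images are reviewed by hand. The function names are hypothetical.

```python
# Counts reported in the abstract.
WINDMILLS_TOTAL = 1913   # windmills across the ten Iowa counties
WINDMILLS_FOUND = 1521   # windmills the model identified

# Assumptions for illustration: the abstract gives only "over 300,000"
# locations without a windmill and "99%" of them ruled out.
NEGATIVES_TOTAL = 300_000
SPECIFICITY = 0.99


def recall(true_positives: int, positives: int) -> float:
    """Share of actual windmills that the model identified (frame coverage)."""
    return true_positives / positives


def review_reduction(true_positives: int, positives: int,
                     negatives: int, specificity: float) -> float:
    """Share of manual review avoided when only flagged images are checked.

    Assumes one image per location, so reviewing everything by hand
    would mean looking at positives + negatives images.
    """
    false_positives = round(negatives * (1 - specificity))
    flagged = true_positives + false_positives
    return 1 - flagged / (positives + negatives)


coverage = recall(WINDMILLS_FOUND, WINDMILLS_TOTAL)
reduction = review_reduction(WINDMILLS_FOUND, WINDMILLS_TOTAL,
                             NEGATIVES_TOTAL, SPECIFICITY)
print(f"coverage (recall): {coverage:.1%}")
print(f"manual review avoided: {reduction:.1%}")
```

Under these rounded assumptions, coverage comes out just under 80% and the review workload falls by roughly 98%, in line with the figures the abstract reports. The exact values depend on the true counts given in the full paper.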

Article Details

How to Cite
Eck, A., Buskirk, T. D., Shao, H., & Stefek, P. (2026). Eye Spy A PSU! Automating Sampling Frame Construction from Aerial Images Using Machine Learning. Methods, Data, Analyses, 1–39. https://doi.org/10.25521/mda.724
Section
Research Report
Author Biography

Adam Eck, Oberlin College

David H. and Margaret W. Barker Associate Professor of Computer Science
Associate Chair, Computer Science Department
Chair, Data Science Integrative Concentration
Oberlin College