Importing Police-Public Contact Surveys Into R

12 October 2018

The Police-Public Contact Survey (PPCS), which “provides detailed information on the characteristics of persons who had some type of contact with police during the year, including those who contacted the police to report a crime or were pulled over in a traffic stop,” is a useful tool for examining trends such as racial disparities in police use of force or in requests for police assistance. The survey is released about every three years, and this post is a walkthrough on how to import every year of the survey into R. The output is a .Rda file for each year the survey was conducted.


Because the Institute for Social Research requires a login to download data from these surveys, files for each year will have to be downloaded manually.

To download these files, navigate to the pages for all of the survey’s years: 1999, 2002, 2005, 2008, 2011, and 2015. When more years of the survey are released, they will be listed on the National Crime Victimization Survey Series page. On the pages for the 1999 and 2002 iterations of the survey, under the “Download” dropdown, choose the “SAS” option, which will download a zipped folder. For all other years, choose the “Delimited” option from the download dropdown. Note that while some years do have an “R” specific download option, not all of them do, and for consistency the “R” option is not used here. Do not unzip these files! Create a folder with the path “ppcs/raw” and drag all the downloaded zip folders into it. Then, with “ppcs/raw” as the working directory, run the following code in R:


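#unzip files
#assumes the working directory is the "ppcs/raw" folder created above
#only pick up the downloaded zip folders, in case anything else is in the folder
files <- list.files(pattern = "\\.zip$")
sapply(files, unzip)
#delete the zip folders once their contents have been extracted
sapply(files, file.remove)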


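#read.SAScii comes from the SAScii package
library(SAScii)

#SAS imports take a long time.
#For each year, open the .sas file in a text editor and note two values:
#the line number where the "INPUT" statement begins (the third argument below)
#and LRECL, which is given in each .sas file
#The second argument is the location of the SAS import instructions
# LRECL=1247
# INPUT=1205
ppcs1999 <- read.SAScii("ICPSR_03151/DS0001/03151-0001-Data.txt", "ICPSR_03151/DS0001/", 1205, lrecl = 1247)

# LRECL=953
# INPUT=544
ppcs2002 <- read.SAScii("ICPSR_04273/DS0001/04273-0001-Data.txt", "ICPSR_04273/DS0001/", 544, lrecl = 953)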

ppcs2005 <- read.table('ICPSR_20020/DS0001/20020-0001-Data.tsv', sep = '\t', header = TRUE, stringsAsFactors = FALSE, fill = TRUE)
ppcs2008 <- read.table('ICPSR_32022/DS0001/32022-0001-Data.tsv', sep = '\t', header = TRUE, stringsAsFactors = FALSE, fill = TRUE)
ppcs2011 <- read.table('ICPSR_34276/DS0001/34276-0001-Data.tsv', sep = '\t', header = TRUE, stringsAsFactors = FALSE, fill = TRUE)
ppcs2015 <- read.table('ICPSR_36653/DS0001/36653-0001-Data.tsv', sep = '\t', header = TRUE, stringsAsFactors = FALSE, fill = TRUE)
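Finding the "INPUT" line and the LRECL value by hand gets tedious as new years are added. Here is a rough sketch of how the lookup could be automated. The helper name find_sas_params is made up for illustration, and it assumes the .sas file has a line that starts with INPUT and an LRECL=<number> entry somewhere, so check the result against the file by eye before trusting it.

```r
# Hypothetical helper: locate the INPUT line and the LRECL value in a .sas file
find_sas_params <- function(sas_path) {
  lines <- readLines(sas_path, warn = FALSE)

  # line number of the first line that begins with the INPUT statement
  input_line <- grep("^\\s*INPUT\\b", lines, ignore.case = TRUE, perl = TRUE)[1]

  # first "LRECL=<number>" occurrence; keep only the digits
  hits <- regmatches(
    lines,
    regexpr("LRECL\\s*=\\s*[0-9]+", lines, ignore.case = TRUE, perl = TRUE)
  )
  lrecl <- as.integer(gsub("[^0-9]", "", hits[1]))

  list(beginline = input_line, lrecl = lrecl)
}
```

The two values it returns correspond to the third argument and the lrecl argument of the read.SAScii calls above.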


Now you should have a ‘ppcsYYYY.Rda’ file for each year in a folder named rda.
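The import code above stops once the data frames are in memory; the step that writes them to disk is not shown. A minimal sketch of that step might look like the following, assuming the objects are named ppcs1999 through ppcs2015 as above and that the output folder is called rda. The helper save_ppcs is a made-up name, not part of any package.

```r
# Hypothetical helper: save each data frame in a named list to <out_dir>/<name>.Rda
save_ppcs <- function(datasets, out_dir = "rda") {
  dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)
  paths <- file.path(out_dir, paste0(names(datasets), ".Rda"))

  for (i in seq_along(datasets)) {
    nm <- names(datasets)[i]
    # store the object under its own name so load() restores e.g. `ppcs1999`
    env <- new.env()
    assign(nm, datasets[[i]], envir = env)
    save(list = nm, envir = env, file = paths[i])
  }

  invisible(paths)
}

# Usage, assuming the import code above has been run:
# years <- c(1999, 2002, 2005, 2008, 2011, 2015)
# save_ppcs(mget(paste0("ppcs", years)))
```

Saving each year under its own object name means load("rda/ppcs1999.Rda") brings back a data frame called ppcs1999.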

Note: this code works for all current years of data available on these surveys, but will have to be manually adjusted as new years are added. Be careful to check variable names for each year of data you plan to use.
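One quick way to check variable names across years is to build a table showing which variables appear in which data sets. The sketch below uses a made-up helper, compare_names, and toy data frames rather than real PPCS variables, so treat it as an illustration only.

```r
# Hypothetical helper: TRUE/FALSE matrix of variable names (rows) by data set (columns)
compare_names <- function(datasets) {
  all_vars <- sort(unique(unlist(lapply(datasets, names))))
  present <- sapply(datasets, function(d) all_vars %in% names(d))
  # force a matrix even when there is only one variable name
  matrix(present, nrow = length(all_vars),
         dimnames = list(all_vars, names(datasets)))
}

# Toy example (these are not real PPCS variable names)
toy <- list(a = data.frame(x = 1, y = 2), b = data.frame(x = 1, z = 3))
compare_names(toy)
```

With the real data, the input would be a named list of the imported data frames, and any row that is not TRUE across all columns flags a variable that is missing or renamed in some year.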
