Methodologies

Data Sources

Data                                              Source
Crime                                             City of St. Louis Metropolitan Police Department
City Boundary                                     City of St. Louis Open Data
Public Schools                                    Missouri Spatial Data Information Service
Census Tract (CT) Boundaries                      United States Census Bureau
Population                                        TIGER/Line with Selected Demographic and Economic Data, United States Census Bureau
Number of Persons Living Below the Poverty Line   2017 ACS 5-Year Estimates, United States Census Bureau
Median Household Income                           Data USA
Bars                                              OpenStreetMap

Table 1: Data sources

The crime data cover the summer months of June, July, and August 2016. The purpose of this project was to understand the spatial distribution of violent crime throughout the city when it is at its peak, and studies have shown that aggravated assault rates are higher in summer than in fall, winter, or spring (Lauritsen & White, 2014). The crime data were filtered in Microsoft Excel, imported into ArcMap as a CSV file, and plotted from their XY coordinates.
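The filtering step was done in Excel, but the same idea can be sketched in Python. The sketch below is illustrative only: the column names (date_occurred, x, y), the date format, and the sample rows are hypothetical and are not taken from the police department's file, and dropping rows without usable coordinates is an assumption about what the filtering involved.

```python
import csv
import io
from datetime import datetime

SUMMER_MONTHS = {6, 7, 8}  # June, July, August


def filter_summer_2016(rows, date_field="date_occurred",
                       x_field="x", y_field="y", date_format="%Y-%m-%d"):
    """Keep rows dated June-August 2016 that have numeric XY coordinates.

    Field names and date format are hypothetical placeholders.
    """
    kept = []
    for row in rows:
        when = datetime.strptime(row[date_field], date_format)
        if when.year != 2016 or when.month not in SUMMER_MONTHS:
            continue
        try:
            float(row[x_field])
            float(row[y_field])
        except ValueError:
            continue  # assumption: rows without coordinates cannot be mapped
        kept.append(row)
    return kept


# Made-up sample rows, for illustration only.
sample = io.StringIO(
    "date_occurred,x,y\n"
    "2016-06-15,900000.0,1020000.0\n"
    "2016-09-01,900100.0,1020100.0\n"
    "2016-07-04,,\n"
    "2015-07-04,900200.0,1020200.0\n"
)
kept = filter_summer_2016(list(csv.DictReader(sample)))
print(len(kept))
```

Only the first sample row survives: the others fall outside summer 2016 or lack coordinates.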

To understand the spatial distribution of crime at a macro scale (census tracts), demographic data from the United States Census Bureau were used alongside the crime data. Crime has been closely linked to income inequality and poverty, so we expected areas with higher poverty and lower median household income to experience higher rates of violent crime (Becker, 2007).

ArcMap and CrimeStat were the primary tools used for spatial analysis of crime in St. Louis. Five methods were used for the analysis:

  • Kernel Density;
  • Hotspot Analysis;
  • Space-time Analysis for Hotspots;
  • Grouping Analysis; and
  • Geographically Weighted Regression.

The projection used for all analyses was NAD 1983 StatePlane Missouri East FIPS 2401 Feet. The optimized hot spot analysis of the crime data produced an optimal cell size of 861 feet. The optimal fixed distance band, based on peak clustering, was 5166 feet.

Kernel Density Map

The kernel density map was created using CrimeStat. CrimeStat was preferred over ArcMap because its settings are more customizable. The bandwidth was set to 5166 feet, and the cell size was set to 250 feet to obtain a higher-resolution result. Geometric interval classification was used instead of natural breaks because kernel density values tend to be heavily skewed. Ten classes were used to provide greater visual detail.
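To make the roles of the bandwidth and the cell size concrete, here is a minimal pure-Python sketch of a fixed-bandwidth kernel density surface. It is not CrimeStat's implementation. The text does not say which kernel function was chosen, so the quartic kernel used here is an assumption, and the function names are hypothetical.

```python
import math

BANDWIDTH_FT = 5166.0  # fixed bandwidth reported in the text
CELL_SIZE_FT = 250.0   # output cell size reported in the text


def quartic_density(points, x, y, bandwidth=BANDWIDTH_FT):
    """Density at (x, y) using a quartic kernel (assumed, not from the source)."""
    h2 = bandwidth ** 2
    total = 0.0
    for px, py in points:
        d2 = (px - x) ** 2 + (py - y) ** 2
        if d2 < h2:  # points beyond the bandwidth contribute nothing
            total += 3.0 / (math.pi * h2) * (1.0 - d2 / h2) ** 2
    return total


def density_grid(points, xmin, ymin, xmax, ymax,
                 cell=CELL_SIZE_FT, bandwidth=BANDWIDTH_FT):
    """Evaluate the density at the center of every cell in the extent."""
    ncols = math.ceil((xmax - xmin) / cell)
    nrows = math.ceil((ymax - ymin) / cell)
    grid = []
    for r in range(nrows):
        cy = ymin + (r + 0.5) * cell
        grid.append([
            quartic_density(points, xmin + (c + 0.5) * cell, cy, bandwidth)
            for c in range(ncols)
        ])
    return grid
```

A smaller cell size only refines the grid on which the surface is evaluated. The bandwidth controls how far each incident's influence spreads, which is why the 250-foot cells give a higher-resolution picture of the same 5166-foot smoothing.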

Hot Spot Overlay Analysis

Optimized hot spots were created for bars, violent crimes, and poverty (CT). For each variable, hot spots at the 99% confidence level were selected using the ‘Select Layer By Attribute’ tool with the expression “Gi_Bin=3”. The selections were then intersected to create a map showing the areas where hot spots of bars, violent crimes, and poverty overlap. Additionally, public schools within a quarter mile of the overlapping areas were selected using the ‘Select Layer By Location’ tool.
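The overlay logic can be sketched without ArcMap as a set intersection followed by a distance test. This is a deliberate simplification rather than the workflow itself: it assumes all three hot spot layers share one set of cell IDs (the actual analysis intersected polygon layers, with poverty at the census tract level), it measures distance to cell centers rather than polygon edges, and all function and variable names are hypothetical.

```python
import math

QUARTER_MILE_FT = 1320.0  # 0.25 mi x 5280 ft/mi


def hot_cells(gi_bin_by_cell, hot_bin=3):
    """IDs of cells whose Gi_Bin equals 3 (hot spot, 99% confidence)."""
    return {cell for cell, gi_bin in gi_bin_by_cell.items() if gi_bin == hot_bin}


def overlapping_hot_cells(*layers):
    """Cells that are 99%-confidence hot spots in every layer."""
    return set.intersection(*(hot_cells(layer) for layer in layers))


def schools_near(schools, overlap, cell_centers, max_dist=QUARTER_MILE_FT):
    """Names of schools within max_dist feet of any overlapping cell center."""
    near = []
    for name, (sx, sy) in schools.items():
        if any(math.hypot(sx - cell_centers[c][0], sy - cell_centers[c][1]) <= max_dist
               for c in overlap):
            near.append(name)
    return sorted(near)
```

For example, with made-up inputs bars = {1: 3, 2: 3, 3: 0}, crimes = {1: 3, 2: 2, 3: 3}, and poverty = {1: 3, 2: 3, 3: 3}, only cell 1 is a hot spot in all three layers.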

Space-Time Analysis for Hotspots – Mapping Crime Trends

Crime trends were analyzed using the ‘Space-Time Cube’ tool together with the ‘Emerging Hot Spot Analysis’ tool. The time step interval for the space-time cube was one week, and the time step alignment was ‘END_TIME’. The same distance interval of 5166 feet was used. The resulting space-time cube was then passed to the ‘Emerging Hot Spot Analysis’ tool to create a map of crime trends.
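With ‘END_TIME’ alignment, time steps are anchored to the last event and counted backward, so any partial step falls at the start of the study period. The sketch below illustrates that binning in plain Python. It is an approximation of the idea, not the tool's code; the function names are hypothetical, and the exact handling of step boundaries in ArcGIS may differ.

```python
from datetime import date


def time_step_index(event_day, last_day, step_days=7):
    """Number of whole steps between an event and the last event.

    Index 0 is the most recent step; larger indexes are further back in time.
    """
    return (last_day - event_day).days // step_days


def count_per_step(event_days, step_days=7):
    """Count events per backward-aligned time step."""
    last_day = max(event_days)
    counts = {}
    for day in event_days:
        k = time_step_index(day, last_day, step_days)
        counts[k] = counts.get(k, 0) + 1
    return counts


# If the last event falls on 31 August 2016, an event on 1 June 2016
# lands 13 steps back, giving 14 weekly steps for the summer.
steps_back = time_step_index(date(2016, 6, 1), date(2016, 8, 31))
```

Because 1 June to 31 August spans 92 days, which is not a multiple of seven, the earliest step in this example covers a single day.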

Grouping Analysis

Demographic trends within each census tract were analyzed to see whether any variables stood out. The variables were population, median household income, and poverty. Three groups were created. Each group was then given an informal name based on the parallel box plot produced by the grouping analysis tool.
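As a rough illustration of how tracts can be sorted into three groups on these variables, the sketch below standardizes each variable and runs a basic k-means. It assumes no spatial constraint was applied, which the text does not state, and it is not the ArcMap tool's implementation. The seeding rule, the function names, and the six tract values are hypothetical and do not come from the study data.

```python
import math


def standardize(rows):
    """Convert each column to z-scores so no variable dominates by scale."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    sds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
           for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(rows, k=3, max_iter=100):
    """Basic k-means with deterministic farthest-first seeding (an assumption)."""
    centers = [rows[0]]
    while len(centers) < k:
        centers.append(max(rows, key=lambda r: min(sq_dist(r, c) for c in centers)))
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda j: sq_dist(r, centers[j]))
                      for r in rows]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for j in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == j]
            if members:
                centers[j] = [sum(c) / len(members) for c in zip(*members)]
    return labels


# Made-up tracts: (population, median household income, persons below poverty)
tracts = [
    (3000, 25000, 1200), (3200, 27000, 1100),
    (4000, 55000, 500), (4200, 52000, 450),
    (2500, 95000, 100), (2600, 99000, 120),
]
labels = kmeans(standardize(tracts), k=3)
```

Standardizing first matters because income is measured in tens of thousands while the other variables are much smaller; without it, income alone would drive the grouping.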