Introduction

Criminologists focus on explaining crime and criminal behavior. This necessarily requires an examination of individual decision-making within the context of social processes that occur over time. To complicate matters, social processes are multifaceted and include spatial, temporal, and cultural dimensions. It is difficult to collect data on and model these processes using traditional empirical approaches. To address these challenges, Epstein and various colleagues (Epstein 2006, 2008; Epstein and Axtell 1996) suggest a generative approach, one which examines how macroscopic regularities (e.g. crime patterns in space and time) develop from the actions and interactions of individuals. The primary instrument of such an approach is a computational laboratory in which the researcher creates an artificial society. The individuals in the artificial society, called agents, behave according to the assumptions of criminological (or other) theory. The agents in the model interact, and the researcher observes whether the outcomes in the artificial society match what the theory would predict. Agent-based modeling (ABM) is one type of generative approach.

The criminologist’s interest in using ABM is relatively recent, with the first studies appearing in the early to mid-2000s. Over the last decade, a number of papers using ABM as the primary methodology have appeared in top criminological journals such as Criminology (Birks et al. 2012; Weisburd et al. 2017), the Journal of Quantitative Criminology (Groff 2007a), and the Journal of Research in Crime and Delinquency (Birks et al. 2014; Johnson and Groff 2014; Pitcher and Johnson 2011), as well as a myriad of other well-respected outlets. In each of these publications, the researchers turned to ABM when they were unable to examine the topic of their research using traditional empirical approaches. The appearance of studies using ABM in disciplinary journals rather than methodologically specialized outlets such as the Journal of Artificial Societies and Social Simulation represents an important milestone in the diffusion of ABM. Specifically, it indicates that both peer reviewers and editors recognize the value of implementing thought experiments in a virtual world when the assumptions used are believable and the outcomes of the model can be shown to approximate theoretical or empirical patterns.

At this point in the development and application of ABM to criminological inquiry, it is reasonable to ask to what extent ABM is achieving its early promise. Balancing the tremendous potential of the methodology are legitimate questions: What theories have been tested? How do those employing ABM calibrate their models? To what extent, and how rigorously, do they address issues of validity? What contribution is ABM capable of making to knowledge building in criminology? What challenges remain for the methodology to reach its potential?

We examine these questions in the remainder of the paper. The paper begins with a brief introduction to ABM. We then discuss the challenges criminologists face when conducting empirical research with respect to the measurement of theoretical constructs, methodological approaches for studying dynamic processes, and the application of statistical techniques capable of accommodating complexity. Next, we explain how ABM can meet each of those challenges and, in doing so, provide an important testbed for both theory development and empirical experiments. The extent to which ABM successfully meets these challenges, especially the methodological adequacy with which models are implemented, directly affects the level of confidence researchers can have in the value of their findings and the use of the approach more generally. Accordingly, the primary aim of this paper is to take stock of the literature to date in order to inform and improve future research using ABM. To do this, we conduct a systematic review of the current literature that has employed ABM to discover: how models are calibrated, how model validity has been examined, what we have learned, and where current research is falling short.

A Brief Introduction to Agent-Based Modeling

ABM is a type of generative simulation modeling that allows for the creation of artificial worlds within a computational laboratory. Artificial worlds typically have agents and a landscape. The agents represent real-world entities such as people, organizations, and groups. Relevant theoretical foundations and the specific research questions examined determine which entities are implemented in the model and the particular aspects of their behavior that are included. For example, models of urban crime usually include agents that represent people who interact in a representation of a city of some kind. In a given situation, these agents can take on roles including offender, victim, bystander, or police officer. Just as each individual in a real population has their own set of characteristics, so too does each agent in a simulation. As in real life, these characteristics can also change over time. This ability to model heterogeneity in a dynamic way is a major strength of ABM as it can approximate the variation of real life. Agents are typically autonomous. That is, they have the ability to sense their environment, to make decisions based on their internal needs and the surrounding (artificial) environment, and to adapt to changing circumstances (Bonabeau 2002); there is no central controller. The decisions made by one agent, in turn, influence those of others. Agents’ actions may influence the decisions of others directly because of their presence or the actions they take, or indirectly by affecting the environment within which they act.
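
To make these ideas concrete, the following minimal sketch (in Python) shows an autonomous agent that senses its surroundings, decides, and adapts, with no central controller. All attributes, thresholds, and parameter values are illustrative assumptions rather than elements of any reviewed model.

```python
import random

class Agent:
    """A minimal autonomous agent that senses, decides, and adapts.

    All attributes and thresholds are illustrative assumptions, not taken
    from any of the models reviewed here.
    """

    def __init__(self, propensity):
        self.propensity = propensity  # heterogeneous across agents
        self.deterred = False         # internal state that can change over time

    def sense(self, environment):
        # Perceive only the local (artificial) environment.
        return environment["guardians_present"]

    def decide(self, environment):
        # The decision depends on internal state and perceived surroundings;
        # no central controller dictates behavior.
        if self.deterred or self.sense(environment):
            return "refrain"
        return "offend" if random.random() < self.propensity else "refrain"

    def adapt(self, outcome):
        # Adaptation: an agent that was interrupted becomes warier.
        if outcome == "interrupted":
            self.deterred = True

# A heterogeneous population: each agent draws its own propensity.
agents = [Agent(propensity=random.uniform(0.0, 0.2)) for _ in range(100)]
environment = {"guardians_present": False}
actions = [agent.decide(environment) for agent in agents]
print(actions.count("offend"), "of", len(agents), "agents chose to offend")
```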

The simulated landscape can be simple or complex and reflects the research question the modeler seeks to address. For example, in a model of gang members’ social networks, the gang member agents can make decisions based on a set of social network connections and thus may never have to move around a landscape at all. In other models, such as one concerned with street robbery, the likelihood of crime occurrence will be a function of the convergence of (victim, offender and police) agents in space and time and consequently, it is important that agents move across a landscape. Variation in this landscape will affect the likelihood and frequency of such convergence by constraining the locations agents can access and the routes they take (Brantingham and Brantingham 1993). As such, changes to the landscape may affect both the likelihood of (simulated) criminal activity and its concentration in space and time.

The allure of ABM for social scientists lies in its potential for aiding in discovery, increasing understanding and facilitating the formalization of theories (Gilbert and Troitzsch 2005). Indeed, part of the process of formalization is the distillation of theory into the most parsimonious elements to describe the phenomena of interest. These elements inform the construction of a society in which the agents act according to the theory. Even agents with relatively simple behavior rules can produce very complex and surprising outcomes due to the dynamic interactions among them (Gilbert and Troitzsch 2005). Consequently, researchers usually first run a base model and collect data to document processes and patterns in outcome variables under simple conditions. The modeler adds more complex behavioral rules or sets of agent characteristics only after the dynamics of a simple model are understood (Macy and Willer 2002). Such a systematic approach to adding complexity makes understanding the changes in outcome patterns easier. The approach of systematically changing only one aspect of the model at a time means that each version of the model can represent a specific experimental condition. Statistical techniques can then be used to identify significant differences between experimental conditions.
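
The incremental strategy just described can be operationalized as a set of model configurations that each differ from a base condition in exactly one respect. The sketch below illustrates this design; the run_model function is a hypothetical stand-in for a full simulation, and its parameters and effect sizes are invented for illustration.

```python
import random
from statistics import mean

def run_model(offender_share, guardians_active, seed):
    """Toy stand-in for a full simulation run; returns a simulated crime count.

    The parameters and effect sizes are invented for illustration only.
    """
    rng = random.Random(seed)
    crimes = 1000 * offender_share
    if guardians_active:
        crimes *= 0.7  # assumed guardianship effect
    return crimes * rng.uniform(0.9, 1.1)  # stochastic variation across runs

# The base condition comes first; each later condition changes ONE aspect.
conditions = {
    "base":           dict(offender_share=0.05, guardians_active=False),
    "add_guardians":  dict(offender_share=0.05, guardians_active=True),
    "more_offenders": dict(offender_share=0.10, guardians_active=False),
}

for name, params in conditions.items():
    outcomes = [run_model(seed=s, **params) for s in range(30)]
    print(f"{name:14s} mean crimes = {mean(outcomes):6.1f}")
```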

Potential of Agent-Based Modeling for Understanding Urban Crime

In what follows, we examine the potential of agent-based modeling to increase understanding of crime and criminal behavior by describing three widely recognized shortcomings of criminological research. We then discuss how ABM can address each. Research validity is central to this since it is critical to the construction of scientific knowledge, regardless of whether studies are conducted in the real world or an artificial one.

Measurement of Theoretical Constructs

Challenges associated with empirical measurement can have a debilitating effect on the strength of a theory (Johnson and Groff 2014). Construct validity refers to the match between the theoretical construct of interest and its measurement. Simply put, poor measurement leads to the inability to substantiate or falsify theories. The case of urban crime is a good example since “it is well known that data on crime and justice are bad” (Eck and Liu 2008, p. 416). Official crime data, victimization data, and self-report activity data all measure different aspects of crime and delinquency, and all have substantial shortcomings.

Unfortunately, crime is but one example. Many of our theories posit assumptions regarding “elusive concepts” that are critical to criminological theory (Farnworth et al. 1994, p. 32). Important concepts such as social control, learning processes, and opportunity are latent constructs measured indirectly through observation (Sullivan and McGloin 2014). The fact that these concepts represent processes presents additional measurement issues. Thus, the measurement of both the dependent variables and independent variables of interest to criminologists is difficult, expensive and often imperfect. It can sometimes also be unethical. To take an example, it would not be ethical to randomly assign people to travel through dangerous areas of a city in various states of inebriation to test the hypothesis that drunkenness increases the attractiveness of someone as a target for street robbery.

ABM can shed new light on the measurement of criminological constructs in several ways. First, the modeling process begins by carefully identifying theoretical perspectives that may be relevant to modeling the target behavior and identifying common elements across theories. With the key constructs identified, the modeler turns to the empirical evidence about each construct. To the extent that empirical evidence is strong and consistent, the behavior of agents in the agent-based model will reflect the state of the knowledge. However, where evidence is weak or non-existent ABM offers the potential to examine different operationalizations and to observe their effect on simulated outcomes. This activity can inform theory as well as the empirical research agenda by identifying what data are currently lacking and should be collected in future studies.

Second, the agent-based modeler can directly measure behaviors and perceptions of agents throughout model runs. In this way, what would be latent constructs in empirical research are explicit in ABM. This allows for a level of measurement precision not achievable in empirical research. Eck and Liu refer to this as the ability to investigate “hidden phenomena” (Eck and Liu 2008, p. 416). For example, in a model concerned with how guardianship affects the decision to commit a crime, a potential offender agent could perceive the likelihood that other agents present will intervene and those values can be collected. To collect such data in empirical research would be an enormous task, if it were possible at all. To the extent modelers carefully justify and document the details of how they represent a construct, subsequent modelers can build on that foundation and make systematic, incremental changes to it, as necessary. In this way, the explicit operationalization of constructs can help build cumulative scientific knowledge and advance theory.
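
As a hypothetical illustration of this kind of instrumentation, the sketch below logs a simulated offender's perceived guardianship at every decision. The perception and offending functions are invented for illustration; the point is that each value the agent uses can be recorded as it is generated.

```python
import random

class PotentialOffender:
    """Illustrative agent that logs its perceived guardianship at each decision.

    In an agent-based model these otherwise-latent values are directly
    observable: everything the agent uses can be recorded as it is generated.
    """

    def __init__(self, rng):
        self.rng = rng
        self.perception_log = []  # pairs of (perceived_guardianship, offended)

    def decide(self, others_present):
        # Invented perception function: more bystanders, more perceived risk.
        perceived = max(0.0, min(1.0, 0.2 * others_present + self.rng.gauss(0, 0.05)))
        offended = self.rng.random() < (1 - perceived) * 0.1
        self.perception_log.append((round(perceived, 3), offended))
        return offended

rng = random.Random(0)
agent = PotentialOffender(rng)
for _ in range(10):
    agent.decide(others_present=rng.randint(0, 5))
print(agent.perception_log)  # a complete record of a 'hidden phenomenon'
```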

The third potential contribution of ABM is that it allows modelers to develop different formal representations of conceptually identical (or similar) constructs to see if and how their definition influences model outcomes. Townsley and Johnson (2008) suggest that similar results from different formal representations of the same construct increase confidence in the inferences drawn. This is the case because outcomes would be shown to be implementation independent (i.e. not to depend on subtle variations in how they are implemented). When the formal representations of the same construct are different and produce different results, modelers can systematically investigate the differences in the formalization for clues as to how they relate to the results. Modelers can also examine how the formalization of a construct in one model translates when used in a different one (e.g. comparing a model that uses an abstract grid with one that uses a street network).

As modelers introduce additional formal representations of constructs, the interactions with existing model elements may change, producing different results (Townsley and Johnson 2008). This potential for complexity in simulation models is why so many modelers begin with a simple model and systematically add more variables. Using this procedure, modelers can identify the important role of intervening variables and subject them to systematic testing. One method of testing a model involves manipulating parameter values so that they reflect empirically or theoretically unsupported levels, to check that the model generates observably different outcomes than when realistic parameter values are used. For example, one might assign all agents a very high criminal propensity such as .90 (i.e. a 90 percent chance that they will offend in a given situation) to check that this affects model outcomes. In this way, ABM allows researchers to explore different aspects of theory and, in doing so, to gain a more nuanced view of how those different aspects interact with one another to produce different outcomes. Such an interactive investigation has the potential to reveal more about the veracity of a theory than single tests.
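
A sketch of such a check appears below: assigning every agent an implausibly high propensity (e.g., .90) should produce markedly different outcomes than a realistic value, and an automated assertion can flag when it does not. The model function and all values are illustrative assumptions.

```python
import random

def simulate_offending(propensity, n_agents=500, n_situations=100, seed=0):
    """Count offenses when every agent has the given criminal propensity.

    A toy stand-in for a full model; values are illustrative assumptions.
    """
    rng = random.Random(seed)
    return sum(
        rng.random() < propensity
        for _ in range(n_agents)
        for _ in range(n_situations)
    )

realistic = simulate_offending(propensity=0.05)
implausible = simulate_offending(propensity=0.90)
print(f"realistic (.05): {realistic} offenses")
print(f"implausible (.90): {implausible} offenses")
# If an implausible value does not move the outcome, the parameter may not
# be wired into the model correctly.
assert implausible > 5 * realistic, "propensity appears to have no effect"
```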

Methodology to Investigate Process

Beyond measurement issues, empirical methodologies have important limitations for testing theory, explaining crime and evaluating programs. In criminal justice research, the manipulation of variables is sometimes not ethical or simply too expensive, or policy-makers want to know the effects of a policy change immediately, rather than waiting for a new policy to develop over time. Researchers often cannot conduct randomized controlled trials, and instead have to conduct empirical evaluations using (weaker) quasi-experimental designs. These often lack a counterfactual to estimate what the expected outcomes would be (for the treatment group) absent treatment. As such, they are unable to support strong causal statements. Even when quasi-experimental designs using a matched comparison group are used, the possibility remains that the comparison group might differ on some unmeasured characteristic. Moreover, where randomized controlled experiments are possible, they can be expensive, take time to implement, can be difficult to conduct and can be fraught with ethical and professional pitfalls. In addition, a randomized experiment can only describe the relationship between treatment and outcome. Additional studies are necessary to identify the causal mechanisms through which an intervention brings about its effects.

ABM supports causal analysis and allows the investigation of mechanisms at the same time. It does this by creating a counterfactual without random assignment. Each version of an artificial society represents one realization of history as it might have occurred under one set of conditions. The baseline model is the counterfactual because it represents society without some intervention. An example would be a model of random patrol in an agent-based model concerned with the effect of hot spots policing on crime (Johnson 2009; Weisburd et al. 2017). After running the baseline model that uses a random patrol strategy and collecting the baseline outcome data, the “clock is turned back”, the experimental factor changed (e.g. a hotspot deployment strategy is implemented) and the model run again. In this way, the artificial society serves as its own counterfactual. Modelers can use ABM to test different assumptions by designing a set of experiments (i.e., what-if scenarios) that systematically vary one aspect of the model while holding the others constant. Assuming a model is stochastic, multiple runs of each condition will produce variations in outcomes and allow the effects of chance and uncertainty to be estimated.
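
The sketch below illustrates this experimental logic under assumed values: each replication reuses the same random seed for the baseline and treatment conditions, so the baseline run serves as the counterfactual for its paired treatment run, and variation across replications estimates the role of chance.

```python
import random
from statistics import mean, stdev

def run_patrol_model(strategy, seed):
    """Toy stand-in returning a simulated crime count under a patrol strategy.

    The strategies and effect size are illustrative assumptions.
    """
    rng = random.Random(seed)
    crimes = 200 + rng.gauss(0, 15)
    if strategy == "hotspot":
        crimes *= 0.85  # assumed deterrent effect, for illustration only
    return crimes

# "Turning the clock back": each replication reuses the same seed for both
# conditions, so every baseline run is the counterfactual for its paired
# treatment run.
seeds = range(50)
baseline = [run_patrol_model("random", s) for s in seeds]
treatment = [run_patrol_model("hotspot", s) for s in seeds]
diffs = [t - b for b, t in zip(baseline, treatment)]
print(f"mean effect = {mean(diffs):.1f} crimes (sd across runs = {stdev(diffs):.1f})")
```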

The difference between an agent-based model and a field experiment is that a field experiment manipulates factors assumed to influence the phenomenon of interest, whereas in ABM the scientist explicitly manipulates the model itself (Gilbert and Troitzsch 2005), and hence the theoretical mechanisms through which phenomena are thought to emerge. Moreover, unlike field experiments, simulated experiments can take place without concern for ethical violations and are inexpensive to implement. Of course, “… computer simulations are actually philosophical thought experiments, intuition pumps, not empirical experiments. They systematically explore a set of assumptions.” (Dennett 2004, p. 218). The discussion here is not intended to argue that ABM is better than traditional empirical methods, rather that it can play a valuable role as part of a research program utilizing a variety of methods (Eck and Liu 2008; Grimm and Railsback 2005). ABM should inform empirical research and vice versa.

Statistical Techniques

In addition to construct validity, empirical research is assessed in terms of statistical conclusion validity—the extent to which confidence should be placed in tests of hypotheses (Shadish et al. 2002). This requires the use of appropriate statistical techniques and data. Traditional empirical methods rely on statistical models with several shortcomings that prevent them from reflecting the complexity encountered in real life. First, they require simplifying assumptions about human behavior. This is in part because only limited data can be collected, and behavior observed, in empirical studies. Second, statistical models have difficulty accommodating the heterogeneity that characterizes populations and human decision-making. Third, they are not able to handle the temporal dynamics intrinsic to social interactions (e.g. even longitudinal sample surveys can only provide limited data on the sequential actions of respondents). These drawbacks can hinder both statistical conclusion and causal validity in empirical research.

In contrast, ABM does not require simplifying assumptions about human behavior (Eck and Liu 2008; Gilbert and Troitzsch 2005). Agents can sense and interpret their surroundings, and the modeler can collect data describing agent characteristics and thoughts throughout model runs. Each agent can have different characteristics and those characteristics can change in response to variation in the agent’s circumstances or decisions made. Finally, agent-based models can accommodate dynamic relationships and evolution. As a result, the outcome variables incorporate multi-layered decision-making of heterogeneous individuals. Agent-based models also allow the decisions of individuals at one time point to affect the decisions of others, and for attitudes and knowledge to change over time. The outcome of a model run emerges from the interactions of agents over time. This allows researchers to, for example, examine how informal social control develops, identify tipping points in group attitudes, and investigate the complex interaction between background factors and situational ones. Together, these characteristics provide greater confidence that the statistical relationships identified in the simulated outcome data are valid, given the model assumptions.

Unique Challenges to the Validity of Findings from ABM

This section builds upon an earlier publication by Townsley and Johnson (2008) in which they discuss (but do not examine) how the validity of simulation models might be demonstrated or assessed. In this publication, we rehearse the issues and then empirically assess the extent to which the published literature attends to them. As with empirical models, the validation of agent-based models is multi-layered and involves a variety of different dimensions of validity.

Software verification is the mechanism used by modelers to determine internal validity. As discussed, to avoid confounding effects, in an agent-based model the modeler should only change one aspect of the model at a time, holding all else constant. However, other threats arise in the form of coding errors or unexpected logic failures. Thus, increasing the internal validity of an agent-based model involves rigorous software verification (i.e., testing to ensure that the software operates as expected). As part of software verification, modelers should consider the ways in which (1) coding errors, (2) unidentified software-specific characteristics that might change the nature or timing of interactions in the model, and (3) inadvertent characteristics of the environment might systematically influence outcomes. Of particular importance is the need to ensure that there are no unintended interactions between the formal representation and the modeling platform employed (e.g. some platforms may constrain how agent rules are executed) that influence the outcomes generated (Edmonds and Hales 2003; Gilbert and Troitzsch 2005; Grimm and Railsback 2005; Townsley and Johnson 2008).

Empirical external validity concerns the extent to which the causal relationships persist across other people and places (Shadish et al. 2002). In the case of ABM, the question is slightly different and addresses whether relationships persist across different agents, behavior rules and landscapes. To examine this, modelers may use the same model but different environments or different agent behaviors. However, Townsley and Johnson (2008, p. 8) remain “skeptical that true generalizability will be established without relying on a wider community of scholars to actively test for this.” They conclude that replication, rather than incremental changes to the model by the original modeler, is the best way to ensure external validity in models.

Transparency in describing models is essential to replication. Of course, achieving the appropriate level of transparency is important for all research. In her review of the crime reduction evaluation literature, Gill (2014) notes that many primary evaluations which use a randomized controlled design lack descriptive validity, often failing to report important details on issues such as how randomization was achieved, deviation from the evaluation plan, and the attrition of participants. The absence of such detail makes the quality of studies difficult to assess and replication difficult—factors which are likely to impede progress in the field. This is especially true for descriptions of agent-based models since each model involves a large number of decisions. Evaluation of the extent to which those decisions appropriately reflect a theoretical viewpoint cannot occur without comprehensive documentation of both the model assumptions and their implementation (Townsley and Johnson 2008, p. 15). This degree of detail may not be possible in the main text of journal publications, but it is certainly possible in the supplemental on-line materials that journals now routinely offer (Grimm et al. 2006). Transparency of models is a key issue for ABM. Without it, modelers are unable to connect mechanisms and outcomes to demonstrate the validity of their models.

Statistical conclusion validity has an additional dimension unique to ABM. Agent-based models represent stochastic processes, and so outcomes for the same model will vary across runs, all else equal. For this reason, modelers typically run agent-based models many times to average out the effect of stochastic elements in the model. When analyzing outputs, they should use appropriate statistical tests to examine the distribution of outcomes and to examine correlations, or other expected patterns in the data. In what follows, we examine how many runs researchers employ, if and how they justify this, and the statistical tests they use to explore patterns in the simulated data.
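
One appropriate test for such run-level data is a permutation test on the distribution of outcomes across runs, as sketched below using invented outcome data for two model conditions.

```python
import random
from statistics import mean

def permutation_test(a, b, n_perm=10_000, seed=0):
    """Two-sided permutation test for a difference in mean outcomes."""
    rng = random.Random(seed)
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if abs(mean(pooled[:len(a)]) - mean(pooled[len(a):])) >= observed:
            hits += 1
    return hits / n_perm

# Invented outcome distributions from two model conditions, 30 runs each.
rng = random.Random(1)
condition_a = [rng.gauss(200, 15) for _ in range(30)]
condition_b = [rng.gauss(185, 15) for _ in range(30)]
print("p =", permutation_test(condition_a, condition_b))
```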

Empirical validity is related to statistical conclusion validity but unique to simulation modeling (Townsley and Johnson 2008). Empirical validity refers to the appropriate and accurate use of empirical knowledge to build a model, calibrate parameters and evaluate outcomes. Thus, determining the empirical validity of models depends on how much we know about the phenomenon of interest. Empirical knowledge can potentially inform every aspect of an agent-based model from the propensity of people to commit a crime to their choices about travel behavior. When the literature is not clear, such as when multiple studies find different values, researchers should test a range of values (Gilbert and Troitzsch 2005; Grimm and Railsback 2005; Werker and Brenner 2004) and examine the sensitivity of their model to different values. This should subsequently inform interpretation of their findings and inform the parameter values used in future models. When the empirical literature is silent on a particular parameter, modelers must make assumptions and in these cases the articulation of a clear rationale for these is very important, as is sensitivity testing. Interrogation of model results is thus a critical component of the validation process in ABM. When systematic variations in outcomes are observed for particular parameter settings, the extent to which these threaten the validity of the agent-based model should be explicitly addressed. Where simulated outcomes are unexpected or inconsistent with existing knowledge, the researcher should be transparent about this and assess whether they have value in strengthening theory (Gilbert and Troitzsch 2005; Grimm and Railsback 2005; Werker and Brenner 2004).

Empirical knowledge also provides the basis for evaluating the outcomes of models. Current practice is to compare the outcome patterns of models to known characteristics of crime patterns that represent stylized facts (Townsley and Johnson 2008) or statistical signatures (Gilbert 2008). Eck and Liu (2004) offer the following empirical regularities as useful benchmarks: (1) the high degree of crime clustering in certain places, (2) that relatively few offenders are responsible for a large proportion of crime, and (3) that victimization is concentrated among a small number of victims. Another example benchmark pattern is the near repeat phenomenon (Johnson et al. 2007; Morgan 2001): once a crime occurs, there is a temporary elevation in the risk of another crime of that type occurring nearby—an effect that decays over space and time. These stylized facts represent expected distributions rather than exact outcomes, and thus provide known patterns to use for validation. However, matching distributions is insufficient by itself for validation since more than one specification could produce similar patterns (Troitzsch 2004). This is the problem of equifinality.
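
As an illustration of validating against a stylized fact, the sketch below computes the share of simulated crimes falling in the top 5% of places, a quantity that can be compared against the well-documented concentration of crime at a small proportion of places. The simulated data and weighting scheme are invented for illustration.

```python
import random
from collections import Counter

def crime_concentration(crime_places, n_places, top_share=0.05):
    """Share of simulated crimes occurring in the top `top_share` of places."""
    counts = Counter(crime_places)
    ranked = sorted((counts.get(p, 0) for p in range(n_places)), reverse=True)
    n_top = max(1, int(n_places * top_share))
    return sum(ranked[:n_top]) / sum(ranked)

# Invented model output: 5,000 crimes assigned to 400 places, with a skewed
# weighting so that a few places attract most of the crime.
rng = random.Random(0)
n_places = 400
weights = [rng.paretovariate(1.5) for _ in range(n_places)]
simulated_crimes = rng.choices(range(n_places), weights=weights, k=5_000)

share = crime_concentration(simulated_crimes, n_places)
print(f"top 5% of places account for {share:.0%} of simulated crimes")
# Validation question: is this broadly consistent with the stylized fact of
# strong spatial clustering of crime at a small proportion of places?
```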

What is the State of the Evidence that ABM Can Achieve Its Potential?

In the previous sections, we discussed the potential for ABM to address many of the challenges faced by social science researchers. However, the degree to which studies using agent-based models have contributed to the body of knowledge, and the methodological adequacy of such studies, is at present an open question. Answering these questions requires a systematic review of the literature.

To our knowledge, no such review exists for ABM, and hence the aim of the current publication is to document the details of criminological agent-based models. To maintain focus, and to avoid comparing completely different types of models, we limited this review to agent-based models of everyday urban crime at the micro level. That is, those that model crime events that involve individual decision-making in particular (simulated) city situations. As such, studies were excluded if they focused on rural crime or crimes that occur in different environments such as maritime piracy, cybercrime, or poaching. The systematic review has three parts: (1) a summary of the state of the art in the ABM of urban crime; (2) identification of gaps in our knowledge; and (3) suggestions for improving the documentation and communication of models in the literature. This approach provides a foundation for the development of more explicit methodological templates for designing, analyzing, and reporting research conducted with agent-based models that focuses on urban crime patterns.

Methodology

Systematic Review Process

To identify a set of studies for review, we first conducted a systematic search of the literature. This was completed in spring 2015 and focused on agent-based models that examined urban crime (i.e., crime in a city environment) at the micro level. We focused on urban settings because they have a much higher density of human interaction than suburban or rural ones. The micro-level constraint reflected our focus on interactions between individuals in specific contexts. As part of our review protocol, we specified the electronic databases we would search (see below), a set of search terms to inform backward searches of the literature, and identified a set of key publications to guide forward searches. Backward searches involve using keywords to identify publications via search engines. Forward searches start from seminal articles and examine articles that have cited them. We describe these elements of the review along with our inclusion criteria below.

For the backward searches, to identify the keywords used, each of the investigators independently suggested a set of potential keywords, and a final list was derived through discussion. The keywords used were as follows:

agent based OR cellular automata OR complex system OR complexity science OR computer simulation OR emergence OR individual based mod* OR simulation

AND

anti social behavior* OR assault OR auto theft OR burglary OR crime OR delinquency OR disorder OR homicide OR incivilities OR property OR rape OR robbery OR theft OR violen*

We elected not to use the term urban as one of our search terms so as not to constrain too narrowly the publications we identified in our initial searches. We used the above terms to search the MetaLib search engine, which encompasses FRANCIS, GEOBASE, JSTOR, PubMed, SCOPUS, Web of Science and Zetoc. Additionally, we searched Google Scholar. During our initial searches, the terms disorder, property, rape and violen* proved problematic, returning thousands of physical science hits, with the initial search identifying 285,119 publications as candidates for review. A review of a random sample of the publications identified suggested that most would not meet our inclusion criteria (see below). It was therefore necessary to add the term crime after these terms in order to narrow the search to relevant material. We further narrowed the search by adding filters to exclude subject areas that were irrelevant to the current paper. These filters included de-selecting physical science topics such as chemistry, biology and physics, which were responsible for so many of the problematic hits, or actively selecting subject fields such as computer science, sociology, criminology, crime and law. The exact selection depended upon the search engine used, but in order to be as inclusive as possible, we chose to de-select fields known to be irrelevant rather than select potentially relevant ones whenever we could.

The forward search strategy involved the systematic examination of all works that cited a target work. For this search, we first identified a set of the five most influential publications concerned with the agent-based modeling of crime (Table 1). These are termed seed publications hereafter because they grow a tree of additional publications through citation. The seed publications were identified through Google searches for those with the highest citation counts, author knowledge and consultation with an acknowledged expert. Using Google Scholar’s Cited By functionality, we conducted a forward search in 2015 to identify every study that had cited one or more of the seed publications. A total of 128 studies were uncovered through the forward search, and each cited an average of 1.62 of the five seed publications, indicating a good deal of awareness of the seed publications across the citing studies.

Table 1 Publications used as seeds for forward searches (citation counts shown up to December 2014)

A preliminary filtering exercise involved reading the titles and abstracts of all studies, and the removal of duplicates. We established criteria to narrow the spectrum of models reviewed and in doing so, reduce the variance in the approaches to modeling crime. Only those studies that met the following inclusion criteria were selected for the secondary filtering exercise:

  1. Study must be in English.

  2. Study must have reported research by those who either created a simulation themselves or reviewed simulations that others had created.

  3. Study must have focused on urban crime patterns (excluding cybercrime and financial crimes such as fraud, embezzlement and forgery).

  4. Study must have focused on crime at the micro level.

  5. Study must have included autonomous agents.

We included both published and unpublished studies that appeared in a scholarly journal or a proceeding, a thesis or dissertation, a working paper, a technical report, or as a book. As a result of our initial sift, the list of potential studies identified for review was 171 items (“Appendix A”). One of the authors reviewed the full text of each of these publications to determine whether it did in fact meet the inclusion criteria specified above. During this second stage of the review, we also checked the reference lists of the studies to see if they identified any new sources that could be included in the review. However, we identified no additional items. For any paper (N = 15) about which the original reviewer was uncertain, a second reviewer also read the paper and the two reviewers jointly decided whether to include or exclude it. After in-depth review, 45 papers met our inclusion criteria (“Appendix B”).

We developed a coding scheme to extract important information about each study concerning the theoretical basis and the operational characteristics of each model including the following:

  • General characteristics (title, author, year, publication type: journal publication, book chapter, proceedings, book, report, working paper, thesis or dissertation)

  • Purpose of the model (simulate theory, test policy or both) and any theories used

  • Crime type investigated

  • Model level characteristics (software used, type of landscape, time steps, duration, size of world, size of spatial units, number of agents in the model, informal guardianship included, number of parameters, and the number of dependent variables)

  • Empirical data used to calibrate the model (e.g. the fraction of motivated offenders, guardians, police, and agent movement rules) and its provenance

  • Agent level characteristics (movement and offender decision-making)

  • Sensitivity testing (number of runs and the extent to which parameters were evaluated for their impact on model results)

  • Evaluation of model results (whether results were compared to an empirical, stylized, or theoretical distribution, and the statistical tests used)

We also gathered information documenting any justification given for the selection of parameters or decision making rules. We discuss the rationale for including each of the coded items in the next section with the respective findings.

To test the coding scheme and our ability to apply it consistently, all three reviewers coded the same three publications. There were no disagreements among reviewers but a few minor changes were made to the coding scheme and our descriptions of the variables included. All of the 45 publications noted above then underwent in-depth coding by one of the three authors using the finalized coding scheme.

Summarizing the State of the Art

We set out to reveal the state of the art of agent-based modeling of urban crime. We discuss what we uncovered and what it reveals in this section.

Authors, Publication Outlets and Timing

The identity of early adopters, publication outlets and the timing of publications using ABM offer some insight into its penetration into the discipline. Our examination of authorship revealed some interesting patterns, namely the importance of teams and thesis/dissertation work in developing the field and the dominance of a relatively small number of scholars (Table 2). Publication authorship ranged from one to seven authors with a mean of 2.9 authors, confirming that the development of agent-based models tends to be a collaborative exercise involving teams of scholars, often from a variety of disciplines (Eck and Liu 2008). The dominance of teams reflects the need for programming skills, which are less common in the social sciences than in the natural and computer sciences. Reinforcing the importance of mentoring in developing future scholars knowledgeable about ABM, only seven publications were single-authored, and five of those seven were dissertation related. Of the 75 different named authors across the 45 publications, 48 scholars appeared on only one paper; even so, this total represents a tiny proportion of all scholars in the field of criminology. Two teams, one led by Malleson and the other by Bosse, accounted for 33% of the publications.

Table 2 Authors with three or more publications

It is often challenging to get research using new methodologies published in peer-reviewed journals, especially top-tier journals (Richiardi et al. 2006). Yet, journals were the most frequent publication outlet, with almost two-thirds (65%) of the publications appearing in peer-reviewed journals. Six of the 45 publications reviewed here appeared in top-tier criminological journals such as Criminology (n = 2), the Journal of Experimental Criminology (n = 1), the Journal of Quantitative Criminology (n = 1), and the Journal of Research in Crime and Delinquency (n = 2). Moreover, another six appeared in similarly highly ranked journals in related disciplines such as Transactions in GIS (n = 1), Computers, Environment and Urban Systems (n = 3), and Environment and Planning B: Planning and Design (n = 2). Of the rest, 14% appeared as book chapters in collected volumes of research. About 9% of publications were in proceedings. PhD dissertations and master’s theses accounted for about 7%. The other 5% were technical reports and working papers. Looking at the year of publication, there was a steady increase between 2002 and 2009 (Fig. 1). The peak years were 2008, 2009, 2010, and 2012, with 6 publications each year. The number of publications declined in 2013 and 2014.

Fig. 1 Trend in publication date of urban crime ABMs

Motivation for Using ABM

One of the strengths of using ABM is the ability to undertake research into topics that would be difficult or impossible using empirical methods, so we examined whether the model focused on theory testing/exploration, policy simulation, or both. If the authors discussed the theoretical basis for the model, we noted the particular theories identified. If evaluation was the purpose, we noted the policy evaluated.

Approximately 40% of the simulations examined policy and 60% explored theory. Policy-focused publications tended to examine policing-related topics. Over half modeled different patrol strategies such as random, directed, hotspots and problem-oriented policing.

It is standard practice in ABM to use theory as a foundation for the selection of the agents included and the behavioral rules used. In terms of the theories used as the basis for agent behavior in the models, studies drew from opportunity theories most frequently (Table 3). Routine activity theory (Cohen and Felson 1979) was mentioned in 58% of studies; crime pattern theory (Brantingham and Brantingham 1984) in 29%; and the rational choice perspective (Clarke and Cornish 1985) in 27% of the studies reviewed. Social disorganization/collective efficacy/social cohesion contributed to at least one aspect of the model in 20% of the publications. Near repeat or repeat victimization (e.g. Johnson et al. 1997) informed 9% of publications. The extent to which the authors explicitly made the connection between theory and model structure in the text of the publication varied tremendously. Opportunity theories are arguably more explicit about human activity and event-level decision making than are traditional criminological theories, and consequently it is perhaps not surprising that they more naturally lend themselves to formalization within an agent-based model (Johnson and Groff 2014). However, to be clear, our results reveal more regarding the theories that form the basis for the assumptions codified in the models than the frequency of explicit tests of those theories.

Table 3 Theories mentioned as the basis for some aspect of the model and crime types investigated

The majority of models focused on a particular type of crime (Table 3). This is likely due to differences in targets and offender decision-making by crime type. Burglary (36%) was the most frequently modeled crime type, followed by robbery (18%) and drug crime (11%). Almost a third of models used all crime or did not specify a crime type (31%).

Model Implementation Decisions and How They Affect Construct Measurement

This section addresses model implementation decisions (as opposed to those pertaining to individual agents), which define the structure of the model and limit what it is possible for the model to test and explore.

Software

The software used to create the model has important implications for the length of development time. For example, programming a model from the ground up using languages such as Java or C++ requires more knowledge and ability than using an existing modeling framework such as Repast Simphony, which comes with the foundation work done (e.g. the core object classes are already created). A programmer using the latter can take advantage of the existing foundation to speed development and focus their efforts on model-specific programming.

Scholars used a wide variety of software to create the published models (Table 4). Repast, which was developed by Argonne National Laboratory, was used in 29% of the publications. Groff (2007a, b, 2008a) used a related program, Agent Analyst, in her research (n = 3). Esri (the developer of ArcGIS) and Argonne National Laboratory partnered to develop Agent Analyst by building on RepastPy, a Python-based version of Repast. RepastPy features a well-developed graphical user interface, which makes it easier to use than programming in a Java-based language, for example. Another 16% of publications used NetLogo, which is widely considered to be the easiest software to use for agent-based modeling (Shiflet and Shiflet 2014) and has a robust user community as well as several good tutorials. Two relatively prolific groups used software they developed in-house. Spatial Adaptive Crime Event Simulation (SPACES) is the in-house software developed by Wang while at the University of Cincinnati. LEADSTO is the in-house software developed by Bosse and colleagues at VU University Amsterdam.

Table 4 Software used to build the ABM

Researchers explored a variety of crime types using ABM. Residential burglary was the most frequently modeled, examined in 35% of studies; Malleson and colleagues authored 60% of all burglary studies. Street robbery was addressed by 19%, drug crime by 12% and total crime by 7% of studies. Surprisingly, 23% of the studies reviewed did not clearly state the type of urban crime they sought to simulate, even though the mechanisms that generate different forms of crime are likely to differ. This is the first mention of what will be a recurring theme throughout the results: the studies reviewed did not consistently report the basic information necessary to understand, evaluate or replicate the models implemented. Additionally, when authors wrote several publications examining the same crime type, they often used the same agent-based model, adjusting it to test a different scenario. In those cases, there was less variety in the modeling approach than the sheer number of publications might suggest.

Number of Agents

One of the many decisions agent-based modelers must make as they move from a conceptual model to implementation concerns the number of agents to include in the model. Typically, the decision reflects a balance between the theoretical population needed and the computational costs of adding more agents. In policy simulations, the number of agents may have implications for the veracity of the findings, especially in realistic simulations that attempt to reflect a particular place. To ease interpretation, we classified the number of agents into five categories (Table 5). Almost half of all publications used between 1 and 1,000 agents (47%). This likely reflects the constraints on computational power during the 2000s. Disappointingly, the largest category of publications consisted of those that did not specify the number of agents in the model (31%). In two-thirds of the studies (67%) the author(s) offered no justification for the agent population chosen. Almost a quarter (24%) provided an empirical source for their agent population and 9% provided a logical argument.

Table 5 Units of analysis and simulation sample size

Spatial Aspects

Many components make up the spatial aspects of a model, but perhaps the most fundamental is the type of landscape. We classified the type of landscapes used along a continuum from those that were not spatially explicit to those that used real-world data and a Geographical Information System (GIS). Characteristics coded included the type and size of spatial units and the dimensions of the virtual world.

In terms of coding the type of spatial units, the simplest model landscape is completely abstract. Such models will (for example) ignore the role of urban form and instead focus on some attribute(s) of agent interaction, such as the role of social networks in offending or victim behavior. Models can be abstract, but representative, as in those that use lines to represent a typical street network. Alternatively, they can be facsimiles of aspects of real-world environments, using real street centerline networks, for example. Landscape type coupled with the type and size of units has implications for spatial movement, spatial activity patterns and spatial interaction by agents in the model.

As shown in Table 5, 60% of the studies reviewed used a landscape that was not spatially explicit. This finding reflects a traditional bias toward the use of abstract or grid space. For example, Robert Axelrod advocates the “keep it simple, stupid” (KISS) principle (Axelrod 1997). Simplicity, at least in the initial development of models, is critical to allow interpretation of the complex and dynamic outcomes that emerge from even simple models. Simple landscapes were also more compatible with the limited software capability that existed until the mid to late 2000s, which prevented modelers from using GIS layers in popular packages such as NetLogo and Repast. Thirty-one percent used GIS layers or street centerlines as the basis for their landscapes. This is encouraging given mounting evidence from empirical research that the configuration of the street network influences crime pattern formation for both volume (e.g. Davies and Johnson 2015) and more serious crimes (Summers and Johnson 2017).

There is a close relationship between spatial units and the landscape. Thus, it makes sense that 49% of models used grid cells as their spatial units (Table 5). Another 29% used street networks and 9% used (mathematical) graphs or diagrams. Whether agents travel along street networks or across grid cells has implications for the structure of agent movement, activity and interaction. It also has implications for the generalizability of the model’s findings and its theoretical foundations.

We observed substantial variation in the size of the spatial units used and in the size of the world reported. The size of spatial units ranged from nodes to grid cells (with no reference to corresponding real-world size) to streets (again, with no reference to real-world length) to areal units such as blocks and neighborhoods. Corresponding world sizes were often equally vague. A significant number of studies did not report the size of the units (38%) or the size of the world (25%). This absence of specificity makes it impossible to say much about general patterns save one: a lack of reporting.

Temporal Aspects

Temporal aspects of an agent-based model include the time steps used to update agent activity or decision making, and the duration of a model run. For example, in a model with a time step of one simulated minute, agents will make decisions every minute of the simulated study period. Time increments represent the minimal temporal resolution of the model (i.e., the minimum time for which we can calculate measures of the phenomenon of interest). Ideally, the temporal units used would match the temporal resolution at which the investigator hypothesizes the phenomenon will unfold. However, historically modelers have sometimes chosen model increments that were less than ideal simply because of computing limitations. That is, if the agents made decisions as frequently as they do in real life, the model would run too slowly or not at all. Similarly, the duration of the model reflects a balance between important assumptions regarding how long it takes a pattern to emerge and computational limitations.

The publications we examined reported a wide variety of temporal units (Table 5). The extreme variation in temporal units required us to collapse our coding into four categories. Twenty-nine percent of models had time steps of one hour or less. However, and alarmingly, almost two-thirds of models (64%) did not report the size of the time step used at all.

There was also wide variation in the total duration of simulated time among models. In order to conserve computational time, a modeler typically chooses a length of time that is sufficient to produce stable patterns in the modeled phenomenon, but no longer. Patterns represent order in data (Grimm and Railsback 2005). As discussed above, one example of a pattern in crime data is the near repeat burglary phenomenon. Thus, if the purpose of the model is to examine near repeat burglary, the duration of the model can be fairly short, since near repeat patterns tend to emerge within four weeks or less. In that case, a simulation of 12 months should provide sufficient data to test for empirical regularities. On the other hand, patterns of relatively rare events such as homicide may take much longer to form. Modelers may also choose durations that capture temporal variations they have in mind, such as seasonality or other temporal cycles (Grimm and Railsback 2005).

In the models reviewed here, those that simulated a period of less than one year (20%) were as frequent as those that ran for one year or longer (20%) (Table 6). The most frequent duration was 30 days/one month, which was used in 13% of the models. Another twenty percent did not explicitly make a connection between model time and real time and simply reported time steps. In some cases, the models did not run for a specified time but had some other limiting factor (e.g. a specified number of crimes). In forty percent of studies, the authors did not discuss model duration.

Table 6 Duration and number of simulation runs

Accounting For Statistical Conclusion and Empirical Validity

Models vary in terms of parameter calibration strategies and whether they use empirical data for this purpose. In this section, we consider the more general methodological strategies taken to increase the validity of model results. A key question concerns the extent to which models are able to reproduce those phenomena they seek to simulate, and how this is established. Like statistical models, agent-based models cannot include all of the factors that might influence activity in the real-world and, as discussed above, there is a preference for parsimony in model building. Additionally, models incorporate stochastic effects either to model random factors that might influence crime pattern formation, such as the presence of a bystander at a given time, or to model uncertainty in agent decision making. Since this stochasticity will mean that model results vary from one run to the next, an important question concerns if and how modelers assess the consistency of model outcomes across simulation runs, and how they determine the number of runs necessary to produce stable estimates and to draw conclusions. There is a parallel here with statistical power analysis in empirical research, where the researcher seeks to establish how large a sample size is necessary to detect an effect, should one exist. The following section examines these issues.

A basic convention for dealing with stochasticity involves running the model a number of times, taking a measure of central tendency across those runs to represent the typical outcome, and examining the variation across runs to estimate consistency. Determining the number of runs needed to capture the variation in model results is not trivial, and it has significant resource implications, especially in terms of processing time. We examined how many publications detailed the number of runs and whether they articulated a justification for that number. In cases where the study authors reported multiple conditions or experiments, the coding reflects the lowest number reported. The largest proportion of publications, 44%, reported using between 11 and 100 runs (Table 6). Only 9% used more than 100 runs. Once again, the percentage of publications not reporting the information was considerable (27%). A small number of publications (almost 12%, n = 5) noted that they chose a number of runs that allowed them to produce stable results. Four explicitly mentioned the stochasticity inherent in agent-based models as a reason for the number of runs. Over three-quarters (76%) of publications did not provide a reason for the number of runs employed (Table 7).
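
One way to justify the number of runs, sketched below with an invented model, is to keep adding runs until the standard error of the mean outcome falls below a chosen tolerance, a simple analogue of a power calculation in empirical research.

```python
import random
from statistics import mean, stdev

def run_once(seed):
    """Toy stand-in for one stochastic model run."""
    return 200 + random.Random(seed).gauss(0, 15)

# Illustrative stopping rule: keep adding runs until the standard error of
# the mean outcome falls below a chosen tolerance.
tolerance = 1.0
outcomes = [run_once(s) for s in range(5)]  # minimum initial batch
n = len(outcomes)
while stdev(outcomes) / n ** 0.5 > tolerance:
    outcomes.append(run_once(n))
    n += 1
print(f"{n} runs needed; mean = {mean(outcomes):.1f} "
      f"(SE = {stdev(outcomes) / n ** 0.5:.2f})")
```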

Table 7 Statistical analysis of model outcomes

Determining the empirical validity of model results is another critical area in ABM because it addresses how well the model represents the target phenomena (Gilbert 2008). We examined two components of model validation in our coding—sensitivity analysis and outcome analysis. Most agent-based models are not fully calibrated using empirical data. Consequently, sensitivity analysis examines the extent to which the values of the parameters used in a simulation affect the outcomes. This activity provides important information about: (1) the robustness of model results (Gilbert 2008; Grimm and Railsback 2005; Manson 2001) to variation in model parameters that are of less theoretical importance (these should have little effect on model outcomes), (2) how well the model simulates the mechanisms articulated in the theory (Gilbert 2008), and (3) how sensitive it is to changes in key parameters that would compromise this (the model should be sensitive to such changes). Evaluation of the latter is via an interrogation of the parameter values and their effect on the model dynamics (Gilbert 2008). In practice, sensitivity analysis usually involves systematically varying initial parameter values and the random number seed (which ensures variation in stochasticity across simulation runs) and then exploring how changes in parameter values affect model results (Grimm and Railsback 2005; Manson 2001).

Another way to look at sensitivity analysis is through the lens of theory. This more qualitative approach is particularly well suited to abstract and middle-range models that do not model a particular empirical example. In this case, systematic variation of important parameters in the model should produce “patterns at the macro level that are expected and interpretable” (Gilbert 2008, p. 41) based on what theory would predict. Put differently, the manipulation of parameter values should produce simulated outcomes that are consistent with hypothesized changes. For example, halving the number of agents with criminal propensity should produce a significant crime reduction. If there were no crime reduction, this would suggest that there is something else going on in the model. The more complex the model, the more parameters involved. Modelers must typically balance the number of parameters tested against the significant time and effort involved, which is why they often focus on the parameters most likely to have an impact on model results, or those that are of particular theoretical interest.

The number of runs chosen to account for stochasticity also influences the amount of work involved in sensitivity analysis, since multiple values must be tested for each parameter investigated. For example, a simple approach might involve testing five different parameters using two additional values each (typically at least one of the new values would be higher and one lower than the baseline value). Practically, that translates into 10 additional model configurations, each with a different parameter value set. To finish the example, if the number of runs used to account for stochasticity is 100, then the 10 additional versions of the model would each require 100 runs, for a total of 1,000 simulation runs (Gilbert 2008). In our sample, almost half of the studies conducted at least a partial sweep of parameters, but 49% did not report any (Table 7). Only one publication examined the effects of varying all parameters used in the model.
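The arithmetic of this example can be made explicit in code. The sketch below performs a one-at-a-time sweep over five hypothetical parameters (the names and values are our own illustrative assumptions), testing two additional values per parameter with 100 runs per setting, which yields the 1,000 additional simulation runs described above:

```python
import random
import statistics

def run_model(seed, **params):
    """Illustrative stub: one simulation run returning an outcome measure."""
    rng = random.Random(seed)
    rate = 0.0002 * params.get("n_offenders", 50)  # placeholder dynamics
    return sum(rng.random() < rate for _ in range(1000))

# Baseline parameterization; names and values are purely illustrative.
baseline = {"n_offenders": 50, "n_police": 10, "target_density": 0.3,
            "guardian_effect": 0.5, "reward_mean": 1.0}

RUNS_PER_SETTING = 100
results = {}
for name, value in baseline.items():                # 5 parameters ...
    for new_value in (value * 0.5, value * 1.5):    # ... x 2 values = 10 settings
        params = dict(baseline, **{name: new_value})
        outcomes = [run_model(seed, **params)
                    for seed in range(RUNS_PER_SETTING)]
        results[(name, new_value)] = statistics.mean(outcomes)

# 10 settings x 100 runs each = 1,000 additional simulation runs in total.
```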

Evaluation of model outcomes is another critical component of empirical validity. Model validation often involves comparing the model results for a number of variables against empirical values or stylized patterns in empirical data. The particular strategy for validating model outcomes depends on the type of model. Gilbert (2008) provides a succinct summary of validation by model type. Abstract modelers often use ABM to formalize theory. Their aim is to describe how individual interactions can produce interpretable macro-level patterns, such as general trends in rates of crime over the last century. Middle-range agent-based models simulate particular social behavior, such as committing a specific type of crime or offender targeting strategies. Although such models lack any connection to specific places, typical landscapes are used to implement them, empirical data are employed to set agent characteristics (e.g. the number of police officers in the model, the age of offenders, and so on), and stylized patterns in crime data—such as the fact that crime clusters at places, or the near repeat phenomenon discussed above—can be used to validate their outcomes. Facsimile models exactly match a particular location in the real world and rely upon specific empirical data for both calibration and validation.

One challenge to validating agent-based models of crime stems from the problems associated with official crime data (Maguire 2002), chief among which is under-reporting to the police. Because of these shortcomings, many crimes occur but do not appear in official data. Consequently, we generally do not have an accurate baseline with which to compare the results of agent-based models of crime (Eck and Liu 2008). As a result, and as discussed above, most crime modelers validate their models by comparing patterns in model outcomes to stylized patterns that are widely recognized in empirical data (Birks et al. 2012, 2014; Eck and Liu 2008; Groff 2007a; Wang et al. 2008). The use of such distributions, as opposed to precise empirical patterns, is appealing because it allows generalization and is less susceptible to particular local conditions.

When discussing empirical validity, it is important to remember that modelers often use ABM when: (1) no empirical data exist, (2) little is known about patterns, or (3) empirical data are unreliable. This is because ABM offers the researcher a way of systematically testing theories in silico where (nonlinear) complex interactions and feedback loops would be difficult or impossible to anticipate in thought experiments, or where primary data collection is impractical. In those cases, validation is challenging and modelers often use a simple plausibility test. In other words, they examine to what extent the outcome of the model matches what the underlying theory would predict (Dowling 1999; Groff 2007a; Ostrom 1988; Wang et al. 2008). This strategy is similar to the one used to validate abstract models.

In practice, we found that even with the drawbacks of official crime data, modelers most frequently used empirical crime data (38%) to validate their models (Table 7). Stylized distributions were the next most frequently used (33%). About 24% of publications used theoretical distributions. Many used a combination of the three in their validation efforts.

We also identified the type of statistical tests used to validate results. Importantly, over half the publications (56%) did not mention using any specific statistical tests to validate results (Table 8). The remaining publications used a staggering variety of statistical tests to compare model results with a reference distribution or distributions. ANOVA was the most frequently used statistical technique but only 11% of publications employed it. Ripley’s K was mentioned in 7% of publications overall, but this test would only be appropriate for spatially explicit simulations. This variety likely reflects the different research questions examined and approaches taken to examine them. Surprisingly, no authors reported using the Kolmogorov–Smirnov test, a classic statistical test used for comparing distributions.

Table 8 Statistical tests used to compare model results with a reference distribution
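As an illustration of one such comparison, the following sketch applies the two-sample Kolmogorov–Smirnov test using SciPy's ks_2samp to synthetic placeholder data. Note that the exact KS test assumes continuous distributions, so applying it to counts gives only an approximate p-value:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Placeholder data: per-spatial-unit crime counts from a simulation and
# from recorded crime data for the same notional set of units.
simulated = rng.poisson(lam=3.0, size=500)
empirical = rng.poisson(lam=3.2, size=500)

# Two-sample KS test: could both samples come from the same distribution?
# With discrete counts the p-value is approximate; bootstrap or chi-square
# alternatives may be preferable in practice.
result = stats.ks_2samp(simulated, empirical)
print(f"KS statistic = {result.statistic:.3f}, p = {result.pvalue:.3f}")
```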

Measurement of Agents and their Behavior

ABM enables investigators to assign roles and behaviors to agents. It also allows the quantification and tracking of agent perceptions as concrete values. The target phenomenon of the model and the theoretical foundation chosen by the modeler determine the types of agents, as well as their characteristics and behaviors. In this section, we examine what types of agents were used and the behaviors they were assigned.

Types of Agents

We examined the decision-making rules used for each type of agent—offenders, police, and potential victims—and the number of agents included in the virtual society. Most models included a combination of the three agent types (victims, offenders and police). Offenders were included in all of the models, but 27% did not specify the number of offenders (Table 9). Police were included in approximately 53% of the models, but the number used went unspecified in 14% of models (Table 9). Forty percent of models had fewer than 100 offenders and 67% had fewer than 100 (or no) police.

Table 9 Numbers of offender and police agents

Models varied in the type and number of victim agents included. This information is a little more challenging to summarize since models could have one or several types of victim. However, we noted three different types of victims: mobile, static-dynamic, and static-stable. Mobile victims are people who move throughout the simulated landscape and are at risk of victimization. Static-dynamic victims do not move (e.g., buildings) but their attractiveness varies over time (e.g. residents are home at some points in time but not others). Static-stable victims do not move (e.g. buildings) and their attractiveness is time-stable. Table 10 shows the number and type of each class of victim modelled. The majority of models (56%) had mobile victims. Non-mobile victims with dynamic attractiveness levels (47%) appeared in more publications than non-mobile victims with static levels of attractiveness (38%). Across victim types, the largest single category comprised publications that failed to specify explicitly how many victim agents of a particular type were included in the model.

Table 10 Number of victim agents by type of victim
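The three victim types reduce to two binary properties, which a model might encode along the following lines. This sketch is an illustrative assumption of ours (the class, time units, and attractiveness values do not come from any reviewed model):

```python
from dataclasses import dataclass

@dataclass
class Victim:
    """Illustrative encoding of the three victim types coded in the review."""
    mobile: bool                  # does the victim move through the landscape?
    dynamic_attractiveness: bool  # does attractiveness vary over time?

    def attractiveness(self, tick, base=1.0):
        if self.dynamic_attractiveness:
            # e.g. a dwelling that is more attractive to a burglar when
            # unoccupied during notional working hours (values are invented)
            return base * (1.5 if 9 <= tick % 24 < 17 else 0.5)
        return base  # static-stable: time-invariant attractiveness

pedestrian = Victim(mobile=True, dynamic_attractiveness=False)   # mobile
dwelling = Victim(mobile=False, dynamic_attractiveness=True)     # static-dynamic
warehouse = Victim(mobile=False, dynamic_attractiveness=False)   # static-stable
```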

We also looked at whether study authors provided any justification for the number of agents included in their models, especially related to the overall agent population, the number of offenders, and the number of police. Disappointingly, the majority of studies failed to provide an empirical source or logical explanation for the number of agents used. Concerning the overall agent population, 9% provided a logical argument for the selection used, while 24% provided an empirical source (Table 11). For the number of offenders, 11% provided a logical argument, while 24% provided an empirical source. Police were not included in the models for 20 of the 45 publications. Of the 25 publications that did include police, 24% (n = 6) offered a logical argument and only 8% used empirical sources to inform the police population selected. One way to improve the empirical validity of agent-based models is to have their implementation informed by empirical data. While empirical findings do not yet exist to inform many modeling decisions and parameters, many others are available, and where they exist they should be used.

Table 11 Reasons for agent population, percent offenders and percent police officers

Representation of Offender Decision Making

Offender actions garnered the most attention from modelers. The modeling of offender decision-making followed two different approaches: rational or cognitive. The first, and most straightforward, approach draws on rational choice perspectives. Some models employed an economic benefit–cost calculus (Becker 1968) that assumes offenders have access to perfect information regarding potential risk, such as the proximity of guardians, and the potential reward. Other models drew on the rational choice perspective as described by Clarke and Cornish (1985), which incorporates “bounded rationality” and allows agents to make decisions based on the information they have. To do this, modelers used a stochastic term (or terms) to model the influence of uncertainty, imperfect information, or errors in decision making. This allowed agents to decide not to commit a crime even when, mathematically, the reward outweighed the risk. The largest proportion of models (67%) based offender decision-making on some variant of rational choice (Table 12). Authors were more likely (42%) to build models that assumed perfect (information and) rationality than bounded rationality (24%).

Table 12 Basis for offender decision-making in the model
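The difference between the two variants can be expressed in a few lines. In this hypothetical sketch, offend_perfect implements a deterministic Becker-style comparison, while offend_bounded perturbs the comparison with a Gaussian noise term standing in for imperfect information; the function names and the choice of noise distribution are our own assumptions:

```python
import random

def offend_perfect(reward, risk):
    """Becker-style calculus with perfect information: offend if and only
    if the expected reward exceeds the expected cost."""
    return reward > risk

def offend_bounded(reward, risk, noise_sd=0.5, rng=random):
    """Bounded rationality: a stochastic term perturbs the comparison, so an
    agent may forgo a 'profitable' crime or commit an 'unprofitable' one."""
    return reward + rng.gauss(0, noise_sd) > risk
```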

The second approach to offender decision-making is termed conceptual cognitive and is based on one of two frameworks. The original framework is the beliefs, desires and intentions (BDI) model (Rao and Georgeff 1995). Beliefs represent an individual’s perceptions about how the world works. Desires are the goals of an individual. Intentions are the individual’s deliberations. The BDI framework is mainly conceptual and lacks specific guidelines for implementation. It focuses on cognitive processes and assumes rational decision-making (Urban and Schmidt 2001). Some scholars question the necessity of having all three components while others feel strongly that three components are not enough to represent human decision-making (Rao and Georgeff 1995). In reaction to criticism of BDI, a broader framework that includes physical, emotional, cognitive, and social factors (PECS) was developed (Urban and Schmidt 2001). PECS divides agent behavior into two main categories, reactive and deliberative. Reactive behavior can be instinctive, learned, drive-controlled, or emotion-controlled and occurs without the need for conscious thought. Deliberative behavior, on the other hand, involves goal-directed behavior (see Malleson et al. 2010 for an example of PECS applied to burglary behavior). Advantages of the PECS framework include not requiring rational decision-making and not restricting the dimensions of decision-making as much as other models. Conceptual cognitive frameworks underpinned decision-making in 29% of models discussed. Of those models using a cognitive framework, just over 60% used PECS.
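To illustrate the reactive/deliberative distinction at the heart of PECS, consider the following schematic sketch. It is not the implementation of Malleson et al. (2010); the state variables and thresholds are invented purely for exposition:

```python
class PecsBurglar:
    """Schematic PECS-style agent: behavior is driven by internal state
    variables rather than a pure cost-benefit rule."""

    def __init__(self):
        # Illustrative physical/emotional drives on a 0-1 scale.
        self.wealth_need = 0.8
        self.fear = 0.2

    def step(self):
        if self.fear > 0.9:
            return "flee"          # reactive: emotion-controlled, no deliberation
        if self.wealth_need > 0.5:
            return "seek_target"   # deliberative: goal-directed behavior
        return "routine_activity"  # default non-criminal routine
```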

Representation of Agent Movement

For the models that specified agent movement and/or agent activity spaces, critical components of the model included the number of locations in an agent’s activity space and the type of movement algorithm used. Both decisions have clear implications for the representation of agent activity spaces and there is little empirical evidence upon which to draw.

The number of locations/nodes in agent activity spaces varied widely (Table 13). Almost half of the studies reviewed did not specify the number of nodes used (47%). Of those that did, many used none (24%), which would mean that agent activity was not anchored in any way, thereby reducing the ecological validity of the model. Others used only one (11%), which typically represented a home location. A larger proportion of models used three or four activity nodes (16%).

Table 13 Agent activity spaces and movement

Another aspect of the routine activity spaces of agents concerns the type of movement they undertake. We classified agent movement using five different categories: completely random, biased random, purposive (the agent is moving toward a destination), Lévy flight (a pattern observed in nature and in human movement) (Viswanathan et al. 1999), and exhaustive (agents consider the universe of potential destinations before deciding where they will move next).

Most modelers specified only one type of movement, but 10 of the publications specified two, with alternative algorithms used to test the effect of different types of movement on simulated outcomes. Half of the studies in our sample used purposive movement (Table 13). The majority of studies (56%) reported using some type of random movement (complete or biased). Models used exhaustive movement the least frequently. Sixty-nine percent of publications did not provide any references to support their operationalization of agent movement, which lowered the construct/empirical validity of that component of the model.
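Of the five movement categories, the Lévy flight is perhaps the least familiar. The sketch below draws heavy-tailed step lengths by inverse-transform sampling from a power law; the exponent mu and the minimum step length are illustrative assumptions rather than empirically calibrated values:

```python
import math
import random

def levy_step(mu=2.0, l_min=1.0, rng=random):
    """Draw a step length from a power-law tail p(l) ~ l^(-mu), l >= l_min,
    via inverse-transform sampling (valid for mu > 1)."""
    u = 1.0 - rng.random()  # uniform on (0, 1]; avoids a zero draw
    return l_min * u ** (-1.0 / (mu - 1.0))

def levy_move(x, y, rng=random):
    """One Levy-flight move: uniform random direction, heavy-tailed length."""
    angle = rng.uniform(0, 2 * math.pi)
    step = levy_step(rng=rng)
    return x + step * math.cos(angle), y + step * math.sin(angle)
```

Because most steps are short but occasional steps are very long, agents moving this way repeatedly revisit local areas while occasionally relocating, a pattern broadly consistent with the human mobility literature cited later in this paper.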

Discussion

This publication provides a systematic review of the state of the art in agent-based modeling of urban crime at the micro level. We examined 45 publications produced between 2002 and 2014 to discover the extent to which research using ABM is believable, how it is contributing to the body of knowledge, and what challenges remain for ABM to reach its potential. The findings are important to strengthening future studies that use ABM.

There were some trends in the descriptions of the models' purposes and theoretical foundations. Related to the purpose of models, a slight majority of simulation modelers set out to explore theory, but the extent to which authors clearly described how model structure represented theoretical principles varied widely. The vast majority of modelers relied on one or more opportunity theories to model crime. This may stem from the relative ease of representing important elements of the crime event and criminal decision-making as compared to traditional criminological theory (Johnson and Groff 2014). Policy-focused models overwhelmingly concerned policing and examined patrol strategies or situational crime prevention techniques.

We identified some strong commonalities in the characteristics of the models. Modelers make many decisions about the structure of the model as they operationalize the conceptual model. These decisions can have significant implications for the believability of the model behavior. One of the foundational decisions in developing a model is whether to include space and, if included, how to represent it. Most included space, and the most frequently used landscape/spatial unit combination was an aspatial grid with grid cells. This is not unexpected given that the combination is straightforward to model and less computationally intensive than, for example, modeling movement on a street network imported from a geographic information system. However, it does mean that the model may ignore important aspects of the urban environment that constrain human interaction.

Another foundational decision in operationalizing an agent-based model is how to represent time. This includes the amount of time represented by each moment in the model and the length of time the model runs. Ideally, the units of time used would reflect the temporal resolution under which the target phenomenon is hypothesized to unfold, and the model would be run until the phenomenon of interest stabilizes. Almost two-thirds of the studies reviewed did not report a time step but most reported the length of time the model ran. One year is commonly accepted as the minimum time for crime patterns to emerge, but a large number of models ran for less than one year (20%). Thus, if the outcomes of models are to be taken seriously, modelers need to provide more detail on the how and why of time.

The selection of the outcome variable of interest is another important decision. When thinking about offender behavior, it is logical that most models targeted specific types of crime because offender decisions incorporate different logic models depending on crime type (Clarke and Cornish 1985). About 64% of models investigated burglary, robbery or drug crime. To a large extent, the choice of crime types reflects the emphasis in the empirical literature. In other words, we simply know more about those types of crime. Worryingly, some studies did not specify a particular crime type, which calls the utility of their findings into question.

Most agent-based models were found to use a relatively small number of agents (between 101 and 1000). But more distressing is the fact that two-thirds did not offer a justification for the number chosen, which leaves the reader wondering why that number and not another, potentially undermining the value of the research.

The types of agents included in a model and their behaviors are a reflection of the crime type and the theoretical foundation used. The most important aspect is the justification of how the agent roles and behaviors reflect the theoretical foundation. This information was often partial or even omitted completely. We return to this topic later.

Sensitivity testing and validation are critical for evaluating the plausibility and accuracy of model results. Since agent decision-making is autonomous and there are stochastic components to many decisions, each time a model is run the results represent one potential realization of the interactions in the model. Thus, it is standard practice for modelers to use multiple runs and average the results to represent the most frequently observed outcome(s). Just 7% of models used a single run, but most did not provide a justification for the number of runs they chose. Most authors recognized the need to conduct sensitivity tests and almost half conducted at least a partial sweep of the parameter space, but comprehensive testing was rare.

Model validation can take different forms. The two most popular approaches were to compare simulated outcomes to empirical (38%) or stylized distributions (33%). Another quarter of modelers reported using the theoretical plausibility of the outcome to judge the validity of the model. There was no consensus on the type of statistical tests to apply, which varied widely. Moreover, over half of the studies reviewed (56%) did not report any statistical tests to compare the model results with a reference distribution.

Overall, the single most glaring fact to emerge from this review was that a significant number of publications did not include even basic details of the models. This is important for two reasons. First, it is very hard to understand how the model design connects to theory or to the model outcome(s) if one does not have a clear picture of what is happening in the model. “[C]lear articulation of constructs and formalism construction” are critical to maximizing construct validity in agent-based models (Townsley and Johnson 2008: 5). Model parameters, as well as the representations of processes in the model, should be clearly supported by theory or, at the very least, have their assumptions enumerated (Richiardi et al. 2006; Townsley and Johnson 2008). Second, without such detail it is impossible to replicate a model or critically assess it. Sharing source code is the most transparent basis for replication (Müller et al. 2014; Townsley and Johnson 2008), but source code alone would not provide the justification for the modeling choices made. The absence of detail in a large proportion of publications makes it difficult to draw strong conclusions about model validity and findings. Consequently, to inform future work we make observations regarding changes to the reporting of models that would strengthen the believability of model results.

Improving the Impact of Agent-Based Modeling

Based on our review, we make four suggestions for strengthening studies using ABM methodology:

  1. Enable replication by publishing complete descriptions of model implementation.

  2. Increase the transparency of model assumptions through more complete description.

  3. Broaden the array of theories investigated.

  4. Spotlight the need for additional empirical research to more accurately parameterize and test agent-based models.

Our first suggestion is that researchers who use agent-based modeling to examine crime should publish a complete model description including the intuition behind model design decisions, sensitivity testing, and validation strategies. In the field of medicine, reporting standards for primary evaluations (and systematic reviews of them) are well established (e.g. Schulz et al. 2010). The intent of such standards is to improve the quality and completeness of the detail provided in experimental studies, both to improve the studies themselves (including the authors' attention to important methodological details) and to better enable their replication. A good starting point for achieving the comprehensive documentation of agent-based models is to adopt an existing protocol or combination of protocols. The original protocol, developed in ecology, had three main sections: overview, design concepts, and details (ODD). The ODD + D protocol extends the basic ODD protocol to address the lack of attention to human decision-making in the original (Müller et al. 2013). In addition, the ODD + D protocol requires explication of the theoretical foundations of the model, which is particularly important in theory-driven disciplines such as criminology (“Appendix C”). Modelers who follow these protocols have reported some difficulties, but they end up with better-documented models than when not using a protocol. The studies reviewed here did not employ such protocols. They should.

In addition to using a protocol, we suggest that authors include a table in each publication that contains the elements coded for in this review. Although not exhaustive, these elements represent information basic to understanding the particulars of an agent-based model, and to replicating it. Historically, journal page limits restricted the amount of model description that authors were able to provide. The dilemma of presenting enough model detail to enable other researchers to understand and evaluate a model within the page limits of most journals is widely recognized (Castle and Crook 2006). Some have suggested that to support replication, the model specification be presented in a separate publication from the model results (Carley 1996). The relatively recent option of including supplementary online materials to accompany journal publications offers a clear avenue for providing this much-needed documentation of model details while ensuring it does not dominate the main text. Doing so would address the need for more complete and consistent standards in the reporting of agent-based models.

Our second suggestion relates to the first. We encourage authors to increase the transparency with which they describe the rationale for the choices they make during model design and implementation. In the studies we reviewed, authors often presented an element of agent or model design without offering the reader any reason for that choice. Providing the underlying assumptions, theoretical grounding or empirical evidence would increase understanding of how those decisions contribute to the model outcomes. Explicit descriptions of the reasoning behind modeling decisions would also contribute to the body of knowledge regarding the formalization of theory. Science is incremental. Each new model draws from the logic underpinning existing models or alters those models to explore particular aspects of theory. By carefully documenting each modeling decision, agent-based modelers could stimulate discussion and identify areas where theory is fuzzy and empirical evidence weak. At the same time, a body of evidence would begin to develop and could provide progress toward the formalization of theories in criminology. A critical step in this process is to use the formalized micro-level behaviors in a bottom-up model: if they produce the macro phenomena of interest, they have explained them. Epstein and Axtell (1996) suggest this constitutes a new form of explanation, which they term generative social science. However, none of this can occur without greater transparency regarding modeling decisions.

Thirdly, we suggest modelers both broaden the array of theories they use and spend more time diving deeper into existing theories. In terms of theoretical grounding, opportunity theories have dominated the agent-based modeling landscape to this point. Routine activity theory formed the theoretical foundation of models more than twice as often as any other single theory. Rational choice and crime pattern theory rounded out the top three most frequently used theories. Given the focus of opportunity theories on crime events and the comparatively concrete nature of their theoretical principles, their popularity is understandable. Largely, this is a natural artifact of our criteria for inclusion: opportunity theories are typically the theoretical framework for empirical micro-level studies, and thus it is perhaps not surprising that this is also true for agent-based investigations. At the same time, the Brantinghams' seminal (2004) work spurred the use of these theories by making the connections between opportunity theories and agent-based modeling explicit. However, there is still tremendous potential for agent-based models to examine other, dispositional and community theories of crime. Johnson and Groff (2014), for example, offer a clear description of how strain theory and collective efficacy theory could inform agent behavior in models.

Finally, there is a tremendous need for additional empirical research to better parameterize and design our models. For example, modelers who incorporated the routine activities of agents reported very little empirical basis for their choices regarding the number of nodes in an activity space, the type of movement undertaken among those nodes, or how much time should be spent at each. These are crucial questions to answer, and doing so will improve the ecological validity of models. With respect to such factors, we draw the reader's attention to a series of empirical studies concerned with human mobility. For example, using mobile phone data, Candia et al. (2008) found that people spend just over 70% of their time at four routine activity nodes, and 80% at ten such locations (see also Song et al. 2010). Similarly, transport studies and surveys (e.g. DfT 2013) ask respondents about their daily activity patterns, how long they spend travelling from one destination to another, and how long they spend at activity nodes. Surprisingly, none of the studies we reviewed stated that they drew on such data to calibrate their models. Consequently, we recommend conducting systematic reviews that synthesize what is and is not known about particular elements of human behavior that are necessary to program agent-based models. In this way, ABM will help guide the agenda for empirical research by explicitly identifying what is unknown about human mobility and decision making.

Related to offender decision making, we found little evidence cited in the current studies to illuminate how offenders evaluate the built and social environment. Only through basic research, or perhaps an in-depth and systematic literature review, can we address these issues.

In addition to challenges, our review uncovered evidence of the potential for ABM to have an impact on the strength of criminological theory. First, approximately 18% of the publications reviewed appeared in top-tier, peer-reviewed criminology journals. This indicates a high level of rigor in a significant portion of the scholarship and bodes well for exposing the methodology to a wide range of potential adopters. Second, as noted in earlier publications (Eck and Liu 2008), ABM tends to involve teams, often interdisciplinary ones, in which a criminologist participates with programmers/modelers from other disciplines to create models of social phenomena. These collaborations are important because multidisciplinary teams are necessary to further this particular enterprise. What began as a challenge (i.e., the lack of emphasis on computer programming skills in the coursework for social science majors) could become an advantage.

Conclusion

This initial systematic review of agent-based models of urban crime highlights some glaring issues and some cause for optimism. Many of the issues stem from the lack of transparency in communicating basic details of the models, which hampers clear understanding of what the findings of such research mean for theory and practice. It is critical that researchers publish complete model descriptions that include the basis for parameterization decisions, sensitivity testing and validation strategies. Achieving generative social science’s potential for strengthening theory and explanation requires transparency. Finally, we encourage modelers to incorporate empirical knowledge whenever possible when parameterizing models. Rigor, imagination and the skill of the researchers involved are the only limits on the future possibilities for ABM to inform both theory and practice.