Interactive Analysis and Decision Support with MATSim

Agent-Based Simulation Means Lots of Data
Agent-based transport demand models require managing and integrating data sources several orders of magnitude larger than those of traditional aggregate models. In a truly disaggregate demand description, as seen in our MATSim implementation for Singapore, spatial data represents individual buildings and land parcels, not zones; travel demand takes the form of a full activity diary with connecting trips for every individual, based on their personal demographic attributes, instead of an aggregate number of trips from zone to zone for a specific time period. For this reason, input data for an aggregate four-step (or related) demand model can generally be edited on a laptop, using standard spreadsheet software, whereas agent-based modeling requires the manipulation and synthesis of large stores of structured, hierarchical data, frequently exceeding the capacity of most personal computers.

How MATSim Stores Data
MATSim stores and retrieves data from XML, because XML reflects the hierarchical structure of objects in the simulation and is human-readable. However, general exploratory analysis of large XML data stores is poorly supported by most data analysis software packages, especially GIS-based systems. To perform analyses, expert knowledge of XML querying technologies like XPath and XQuery is required (or Java, if one performs more specialized analysis on the objects themselves). In our experience, this specialized knowledge is lacking in transport and urban spatial planning practice. Therefore, in most MATSim applications so far, authorities and other interested parties must formulate their desired analysis in advance and have expert consultants perform it. Any queries resulting from the analysis require another consultation cycle, and the client's perceived value declines because of both the lack of interactivity and the lack of a sense of model ownership. We believe this lack of a broadly supported exploratory data analysis interface, and of the customer experience such an interface can create, presents a considerable barrier to entry for many authorities and operators interested in using MATSim.

How Customers Interact With Data: Relational Databases, GUI-Driven Interaction
Most transport and urban spatial planning customers rely on mature, GUI-driven software, such as ArcGIS (ESRI, 2011), EMME/3 (INRO, 2015), the PTV transport planning suite (PTV, 2009), or even Microsoft Excel; all of these connect to relational databases and perform queries on large data sets. Many analysts can explicitly query databases using SQL (Structured Query Language); the ODBC (Open Database Connectivity) standard allows software to connect to any relational database regardless of the actual technology driving it. Importantly, many interactive exploratory data analysis software suites, like Tableau, Tibco Spotfire, SAS and the open source R project, support relational databases and ODBC.

Requirements of a Decision Support Interface to MATSim
The event stream produced by the MATSim mobility simulation represents the transport simulation process at the atomic level. It could be fed into a relational database, where an analyst fluent in procedural languages could process it in arbitrary ways. However, we expect more general use case scenarios, in which most analysts perform common tasks that can be standardized. To this end, we compiled requirements specifications for potential audiences and their use case scenarios, in order to arrive at a general interactive analysis and decision support framework that satisfies most of these requirements. We developed a set of Java classes that process MATSim input and output and produce tables in a relational database, together with an entity relationship diagram (ERD) that should be intuitive and useful to a large user audience.
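As a rough illustration of feeding the event stream into a relational table, the following sketch flattens an events XML document into one row per event using only the Java standard library (StAX). It is a minimal sketch and not the classes described in this chapter: the class `EventFlattener`, the row type `EventRow`, and the assumption that each `event` element carries `time`, `type`, `person` and `link` attributes are illustrative. The resulting rows could then be written to a database table, for example through JDBC batch inserts.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper: turns an events XML document into flat, table-like rows.
class EventFlattener {

    /** One flat row per event, ready for a bulk insert into an events table. */
    static final class EventRow {
        final double time;    // seconds after midnight (assumed attribute "time")
        final String type;    // e.g. "departure" or "arrival"
        final String person;  // null when the event has no "person" attribute
        final String link;    // null when the event has no "link" attribute

        EventRow(double time, String type, String person, String link) {
            this.time = time;
            this.type = type;
            this.person = person;
            this.link = link;
        }
    }

    /** Streams over the XML and keeps one row per <event> element. */
    static List<EventRow> flatten(String eventsXml) {
        List<EventRow> rows = new ArrayList<>();
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(eventsXml));
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && "event".equals(reader.getLocalName())) {
                    rows.add(new EventRow(
                            Double.parseDouble(reader.getAttributeValue(null, "time")),
                            reader.getAttributeValue(null, "type"),
                            reader.getAttributeValue(null, "person"),
                            reader.getAttributeValue(null, "link")));
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Malformed events XML", e);
        }
        return rows;
    }

    public static void main(String[] args) {
        // Made-up sample data, not taken from a real simulation run.
        String sample = "<events version=\"1.0\">"
                + "<event time=\"21600.0\" type=\"departure\" person=\"p1\" link=\"l2\"/>"
                + "<event time=\"22500.0\" type=\"arrival\" person=\"p1\" link=\"l9\"/>"
                + "</events>";
        for (EventRow r : flatten(sample)) {
            System.out.println(r.time + "," + r.type + "," + r.person + "," + r.link);
        }
    }
}
```

Streaming the file rather than loading it into memory matters here, because event files from large scenarios can be far larger than the available RAM.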

Users
This chapter presents a decision support tool geared to decision makers and researchers in the fields of transport planning and operations, spatial planning, and spatial economics and geography. Generally speaking, it should serve professionals interested in mobility and spatial analysis who understand transport modeling principles but do not have the expertise to operate an agent-based transport simulation directly. Currently, we envision the following stakeholders and some hypothetical questions for a decision-support system; the list is non-exhaustive and, we expect, will grow with time:
Transport Planners: How many trips occur where and when, and what is the activity purpose? What are the socio-demographic characteristics of people performing these trips and activities?
Urban Planners: What are the temporal usage patterns of buildings and the surrounding neighborhood? What is the flow from public transport stops to surrounding buildings?
Policy-Makers: What are the costs and benefits of a new public transport service? Who are the winners and losers when constructing a new road?
Public Transport Operators: What is the breakdown of ridership on specific bus lines?
Service Industry: Which customers are in catchment areas, separated by mode?

Functional Requirements
The decision support framework should facilitate classic transport appraisal methods, such as cost/benefit analysis and evaluation of the spatial impact of transport infrastructure and policy measures. The framework should allow any sort of spatial analysis at the finest level of granularity provided by the transport model: usually individual buildings or parcels, as well as public transport stops and selected links, such as count stations or tolled road segments. However, these geographic features should be indexed against transport zones, or other geographic areas of interest, to allow customized aggregation of results. Furthermore, it should capture all temporal aspects of the simulation; full temporal dynamics are a crucial part of the agent-based approach.
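The combination of spatial indexing and temporal detail described above can be sketched as follows. This is a minimal, hypothetical example: the class `ZoneHourAggregator`, the `Arrival` type and the `zone@hour` key format are assumptions made for illustration, not part of the framework. Each arrival at a building, parcel or stop is looked up in a facility-to-zone index and counted per zone and hour of day.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: aggregate facility-level arrivals to zones by hour.
class ZoneHourAggregator {

    /** Arrival of one agent at a facility (building, parcel or stop). */
    static final class Arrival {
        final String facilityId;
        final double timeSeconds; // seconds after midnight

        Arrival(String facilityId, double timeSeconds) {
            this.facilityId = facilityId;
            this.timeSeconds = timeSeconds;
        }
    }

    /**
     * Counts arrivals per zone and hour. Keys have the form "zone@HH".
     * Facilities missing from the index are counted under "UNKNOWN".
     * Times past midnight of the next day simply yield hours of 24 or more.
     */
    static Map<String, Integer> countByZoneAndHour(
            List<Arrival> arrivals, Map<String, String> facilityToZone) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Arrival a : arrivals) {
            String zone = facilityToZone.getOrDefault(a.facilityId, "UNKNOWN");
            int hour = (int) (a.timeSeconds / 3600);
            counts.merge(String.format("%s@%02d", zone, hour), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up index and arrivals, for illustration only.
        Map<String, String> index = new HashMap<>();
        index.put("building1", "zoneA");
        index.put("building2", "zoneA");
        index.put("stop7", "zoneB");
        List<Arrival> arrivals = Arrays.asList(
                new Arrival("building1", 7 * 3600 + 300),
                new Arrival("building2", 7 * 3600 + 1800),
                new Arrival("stop7", 8 * 3600));
        countByZoneAndHour(arrivals, index)
                .forEach((key, n) -> System.out.println(key + " -> " + n));
    }
}
```

Keeping the index separate from the arrivals means the same simulation output can be re-aggregated against a different zoning system by swapping the lookup table.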

General Framework for Decision Support
Figure 37.1 shows the general framework as we envision it: data from various sources feeds into a spatially-enabled database, with all geodata transformed to use the same spatial reference system (ideally, the same projection used for MATSim coordinates, allowing for simple distance calculations). Simple Java programs using the MATSim API and JDBC (Java Database Connectivity) produce XML input data for MATSim scenarios; events from these scenarios are fed back into the database. Analysts query the database to produce "data cubes", which are aggregations and queries across many database tables. These are designed for specific purposes, such as calibration and validation, location analysis, winner/loser analysis or other application-specific purposes.

Although the individual tables in our proposed ERD could already provide valuable insight into MATSim simulations, much richer analysis is possible when tapping relationships between different tables in the database. With the help of graphical query building software, little or no SQL knowledge is required to construct scripts that create customized data cubes. These cubes are fed into the business analytics software, which is designed for an audience with little programming background. Relying on the familiar paradigm of drag-and-drop interaction in a simple, well-documented GUI, the user constructs "dashboards" that summarize information and allow interactive aggregation or drilling down across multiple dimensions. Figure 37.3 shows a Tableau visualization comparing public transport ridership from a MATSim simulation to actual smart card data records (transformed into the travel diary format specified in the ERD). Figure 37.4 shows the SQL query used to produce the data frame driving the Tableau analysis. The query exploits the primary/foreign key relationships in the database to perform rapid joins between the different tables.
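To make the idea of a data cube built from key-based joins more concrete, the sketch below computes a small winner/loser summary in memory. It is a hypothetical illustration and not the query of Figure 37.4: the class `WinnerLoserCube`, the per-person travel time maps and the `personGroup` attribute are assumed. In the actual framework, the same join and aggregation would be expressed in SQL against the database tables.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: join two scenarios and a person table on person id,
// then aggregate the change in travel time per socio-demographic group.
class WinnerLoserCube {

    /**
     * Returns the mean change in travel time (policy minus baseline, seconds)
     * per group. Negative values indicate winners, positive values losers.
     * Persons missing from any of the three inputs are skipped (inner join).
     */
    static Map<String, Double> meanChangeByGroup(
            Map<String, Double> baselineSeconds,
            Map<String, Double> policySeconds,
            Map<String, String> personGroup) {
        Map<String, double[]> sums = new TreeMap<>(); // group -> {sum, count}
        for (Map.Entry<String, Double> e : baselineSeconds.entrySet()) {
            Double after = policySeconds.get(e.getKey());
            String group = personGroup.get(e.getKey());
            if (after == null || group == null) {
                continue; // no matching key in one of the joined tables
            }
            double[] acc = sums.computeIfAbsent(group, g -> new double[2]);
            acc[0] += after - e.getValue();
            acc[1] += 1;
        }
        Map<String, Double> means = new TreeMap<>();
        for (Map.Entry<String, double[]> e : sums.entrySet()) {
            means.put(e.getKey(), e.getValue()[0] / e.getValue()[1]);
        }
        return means;
    }

    public static void main(String[] args) {
        // Made-up values, for illustration only.
        Map<String, Double> baseline = new HashMap<>();
        baseline.put("p1", 1800.0);
        baseline.put("p2", 1200.0);
        Map<String, Double> policy = new HashMap<>();
        policy.put("p1", 1500.0);
        policy.put("p2", 1500.0);
        Map<String, String> groups = new HashMap<>();
        groups.put("p1", "low income");
        groups.put("p2", "high income");
        meanChangeByGroup(baseline, policy, groups)
                .forEach((g, d) -> System.out.println(g + ": " + d + " s"));
    }
}
```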

Diaries from Events
In the package contrib.analysis.travelsummary (Chapter 38), the reader can find a set of classes that will transform their MATSim simulation results into a set of travel diary tables, like those discussed in the preceding section. The package contains a simple GUI class that can be run to specify the input XML data files, the location in which to save the output CSV (Comma-Separated Values) files, and other information, such as a suffix appended to file names to identify different scenarios. These CSV files can be read into a relational database of choice, or queried directly in Tableau or other interactive analysis software.
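The basic idea of deriving diary rows from events can be sketched as below. This is not the code of the package mentioned above, only a simplified illustration under stated assumptions: the class `TripDiarySketch`, the `Event` type and the CSV column names are hypothetical, events are assumed to arrive in time order, and each departure/arrival pair is treated as one row, whereas a full travel diary would also chain legs into trips and attach activities.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: pair departure and arrival events into CSV diary rows.
class TripDiarySketch {

    /** Simplified event with only the fields this sketch needs. */
    static final class Event {
        final double time;   // seconds after midnight
        final String type;   // "departure" or "arrival"; other types are ignored
        final String person;
        final String mode;

        Event(double time, String type, String person, String mode) {
            this.time = time;
            this.type = type;
            this.person = person;
            this.mode = mode;
        }
    }

    /** Returns a header line followed by one CSV line per completed leg. */
    static List<String> toTripCsv(List<Event> events) {
        List<String> lines = new ArrayList<>();
        lines.add("person_id,mode,start_time,end_time,duration");
        Map<String, Event> openDepartures = new HashMap<>(); // person -> departure
        for (Event e : events) {
            if ("departure".equals(e.type)) {
                openDepartures.put(e.person, e);
            } else if ("arrival".equals(e.type)) {
                Event dep = openDepartures.remove(e.person);
                if (dep == null) {
                    continue; // arrival without a recorded departure
                }
                long start = (long) dep.time;
                long end = (long) e.time;
                lines.add(dep.person + "," + dep.mode + "," + start + ","
                        + end + "," + (end - start));
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        // Made-up events, for illustration only.
        List<Event> events = Arrays.asList(
                new Event(21600, "departure", "p1", "car"),
                new Event(22500, "arrival", "p1", "car"));
        toTripCsv(events).forEach(System.out::println);
    }
}
```

Writing plain CSV keeps the output independent of any particular database, which matches the workflow described above of loading the files into a database of choice or querying them directly.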

Figure 37.1: General framework of the decision support system.

Figure 37.3: Tableau visualization of public transport ridership from a MATSim simulation compared against actual smart card data records in Singapore.

Figure 37.4: A diagram showing how the tables from Figure 37.2 are joined together for visualization in business analytics software, e.g., Tableau, as shown in Figure 37.3. Source: (Erath et al., 2013)