Production of Topographic Maps with VGI: Quality Management and Automation

The most common way to use geographic information is to make maps. With the ever-growing amount of Volunteered Geographic Information (VGI), we have the opportunity to make many maps, but only automatic cartography (generalisation, stylisation, text placement) can handle such an amount of data with very frequent updates. This chapter reviews the recent proposals to adapt the current techniques for automatic cartography to VGI as the source data, focusing on the production of topographic base maps. The review includes methods to assess quality and the level of detail, which is necessary to handle data heterogeneity. The chapter also describes automatic techniques to generalise, harmonise and render VGI.


Introduction
Maps are now everywhere, from the Web to smartphones, and are no longer limited to paper maps for hiking or routing. But most of the maps provided to the general public are not good maps, so they are not as effective as they could be. Whether they are static or dynamic (i.e. pan and zoom allowed), on paper or on screens of variable sizes, good maps are maps where every feature is legible, and where the user can easily understand the geography behind the map and the message of the map. Making good maps manually requires cartographic skills. However, when the amount of data is huge, for instance with the world OpenStreetMap (OSM) dataset, mapmaking has to be automated. Automating mapmaking entails two steps to obtain a legible topographic map out of a geographic database: selecting the data and the styles to be used to portray them, and refining the content in order to reach a legible map, which is complex when scale decreases, as the space in which to put the map symbols and the text reduces. These steps require the automation of three main processes: map generalisation (the simplification and abstraction of map objects when scale decreases), text placement, and cartographic symbolisation or stylisation. How to optimally automate such processes is still a research question, but, in recent years, maps have been more and more often produced through complete or partial automation. The traditional actors of automated mapmaking are the national or regional mapping agencies, the private map editors and the GIS software vendors. These actors have been used to making their maps out of traditional geographic databases, but what happens if the source data are partly or totally derived from Volunteered Geographic Information (VGI)? VGI is geographic information, and past studies on its quality (Girres and Touya, 2010; Haklay, 2010) have shown that it was satisfactory for many uses, but quite heterogeneous.
Thus, the methods used for automated mapmaking should not be disrupted by the use of VGI as an input, but these methods need some adjustment to adapt to this new source of data: this adjustment is the topic of this chapter. Most of the problems presented here have been applied to the automated cartography of OSM, but we believe these problems and the proposed solutions also apply to different VGI sources, and even to cases where several VGI sources are combined into a map.
The next section of this chapter discusses the reasons why traditional automated mapping processes are not fully adapted to VGI, and is followed by a section that describes attempts to solve these problems by inferring the level of detail of VGI features. The fourth section then focuses on map generalisation, which may be the most complex of the cartographic processes. In the fifth section, the level of detail harmonisation needed for large scale maps is discussed, while generalisation is dedicated to medium or small scale maps. The sixth part of the chapter focuses on the assessment of the quality of map features prior to applying automatic processes. Finally, in the seventh part, the issues related to advanced map stylisation with VGI are discussed.

Why Are Traditional Automated Mapping Processes Not Fully Adapted to VGI?
Traditional automated mapping processes have been developed to process authoritative datasets, or at least datasets with consistent and homogeneous specifications, which is clearly not the case when VGI is used as (one of) the map source(s). The first problem is that VGI datasets suffer from level of detail (LoD) heterogeneities. For instance, there is no LoD specification in OSM, which allows contributors a great deal of freedom in capturing either detailed features (e.g. the cadastral LoD buildings from Figure 1) or less detailed features (e.g. the rough built-up areas or lake outlines in Figure 1) depending partly on their skills but mostly on the data source, as precise GPS tracks allow more precision than low-resolution satellite imagery. This heterogeneity leads to LoD inconsistencies, i.e. some very detailed features and some less detailed features might coexist on a map and share spatial relations (Figure 1). Maps produced by National Mapping Agencies (NMAs), on the other hand, are based on datasets with strict specifications, where all features share the same geometrical resolution or granularity, whether they belong to the same theme or not. Thus the processes used to automate the production of such high-quality maps are not capable of handling the inconsistencies shown in Figure 1. The main characteristic of VGI compared to traditional authoritative datasets is the heterogeneity of quality, with very-good-quality contributions and very-bad-quality ones. This is true for most types of VGI: for OSM first and foremost, as shown in seminal studies by Girres and Touya (2010) and Haklay (2010), but also for photo sharing platforms such as Flickr (Zielstra and Hochmair, 2013), or even for hiking route sharing platforms (Ivanovic et al., 2015). Data quality varies from theme to theme, but also from feature to feature in the same theme (Girres and Touya, 2010).
This is really different from authoritative datasets, where data quality is homogeneous, and cartography processes are developed in adaptation to this known quality. Among the quality indicators that can be heterogeneous with VGI, the most significant components are positional accuracy, thematic accuracy, completeness and logical consistency:
• Positional accuracy heterogeneity is, of course, a problem because it can increase the symbol overlap problems faced when creating a small scale map. Heterogeneity in positional accuracy might drive mapmakers to use incompatible features on the same scale, which can give a false picture of the reality and the relations among features.
• Heterogeneity in thematic accuracy is a problem because automated cartography relies on thematic information to classify the map features. The consequence of such heterogeneity is that processes should rely more on geometry and only use semantics when available.
• Completeness heterogeneity raises the problems of 'empty space' in the map. Empty spaces are useful to identify in automated mapmaking because they are excellent candidates to solve space conflicts during map generalisation or text placement. But, with VGI, empty might either mean really empty or just incomplete.
• Logical consistency heterogeneity is also a problem, because automated cartography uses, for instance, the topology of geographic networks to identify important features, and road symbolisation techniques require topologically correct networks.
Traditional NMA maps cover the classic themes of topographic maps, or road maps, and most automated mapmaking processes focus on roads, buildings, hydrography, relief or vegetation. VGI has a broader range of contributed geographic features; even OSM, which started as a free alternative to topographic maps, has been extended to cover amenities, shops or addresses. Thus an automated process to make maps with VGI needs to handle unusual themes as well as classic road and building datasets. Another particularity of VGI is the broader range of scales used to describe the world, from world views at very small scales (smaller than 1:100 000 000) to very large scales. For instance, OSM suggests the capture of zebra crossings or traffic signals that can only be displayed at very large scales. Some projects even extend the OSM framework to indoor mapping (Goetz and Zipf, 2011). In contrast, traditional automated mapmaking targets a small number of fixed scales, and, even when the maps are displayed in online tools, the number of scales available is often limited by the number of scales available for paper maps (Dumont et al., 2016). In addition to the issue of the large range of scales in VGI, it should be noted that most of the automated processes were never developed for scales that large or that small (e.g. the smallest scale produced by the French NMA only covers the whole French territory, excluding overseas territories).
Regarding symbology and stylisation, the automated processes are strongly related to the data and semantics. For instance, the choice of road symbols depends on the semantics of the road, and there has to be some consistency all along the road. When manipulating VGI data, how do we acquire these semantics? How do we handle the heterogeneities inherent to VGI?

LoD or Scale?
In cartography, the scale of a map is the ratio of the length of an object on the map to the length of the same object on the ground. But scale is also somehow related to map usage, and is then a proxy for map content. Maps around the scale of 1:25k are mainly used for hiking and contain information readable at this scale and useful for this purpose (e.g. footpaths, contour lines etc.); maps with a scale smaller than 1:500k are mainly used for road trips, and highlight the map themes related to roads. In contrast, it is too complex to assign a scale to VGI features, but here we consider the scale of a feature as the scale of the map at which this feature would be legible and legitimate.
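The definition of scale as a ratio can be made concrete with a small calculation. The sketch below is only an illustration: the function names and the 0.5 mm legibility threshold are assumptions, not values taken from this chapter.

```python
def map_length_mm(ground_length_m: float, scale_denominator: float) -> float:
    """Length on the map, in millimetres, of an object measured in metres
    on the ground, for a map at scale 1:scale_denominator."""
    return ground_length_m * 1000.0 / scale_denominator


def is_legible(ground_length_m: float, scale_denominator: float,
               min_size_mm: float = 0.5) -> bool:
    """Rough legibility test: the object must be at least min_size_mm long
    on the map. The 0.5 mm default is a hypothetical threshold."""
    return map_length_mm(ground_length_m, scale_denominator) >= min_size_mm


# A 250 m footpath on a 1:25k hiking map is drawn 10 mm long.
footpath_mm = map_length_mm(250, 25_000)
# A 10 m wide building is 0.4 mm at 1:25k and 0.02 mm at 1:500k.
building_25k = map_length_mm(10, 25_000)
building_500k = map_length_mm(10, 500_000)
```

With these numbers, the building falls below the assumed threshold at both scales, which is the kind of situation that calls for generalisation.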
LoD is a vaguer notion, which can be considered as the translation of map scale to geographic databases for which the scale is not fixed. Several factors affect the level of detail of geographic features:
• geometric resolution, i.e. the minimum distance between two vertices of the geometry, as an analogy with image resolution.
• geometric precision, i.e. the difference between the position in the database and the position in reality.
• granularity, i.e. the size of the smallest details in a geometry, such as the protrusions in the church in Figure 2 (left).
• semantic resolution, i.e. the amount of details in the semantic information attached to the geometric feature.
• conceptual schema, i.e. how much the ground truth information is abstracted; for instance, a wood abstracted by individual trees is more detailed than one abstracted by a polygon feature.
Thus it is difficult to infer LoD as a numerical value as one would for scale, so often categories are used, such as the LoD for 3D city models (Biljecki et al., 2014). Touya and Brando-Escobar (2013) proposed five categories for the LoD of OSM features, from Street level to Country level. Scales can then be assigned to features if a scale range is assigned to each LoD category, e.g. the city level is assigned a scale range going from 1:15k to 1:50k (Touya and Reimer, 2015). Reimer et al. (2014) inferred a scale equivalency for OSM features by studying the characteristics of features in existing maps at different scales: for a given map theme, the measure that best characterises the difference in features at different scales is determined. In the example of urban areas in Reimer et al. (2014), vertex frequency (number of vertices in the polygon ring divided by the polygon perimeter) was the determining characteristic (Figure 3). Then, by inverting Töpfer's radical law (Töpfer and Pillewizer, 1966), which defines the optimal number of map features at a scale given their number at a larger scale, and applying it to existing map features in the maps of NMAs, Reimer et al. (2014) were able to calculate the scale equivalency of any urban area in OSM.
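The two quantities mentioned above can be sketched in a few lines. This is a minimal illustration of vertex frequency and of the basic form of the radical law, n_target = n_source * sqrt(M_source / M_target), where M is a scale denominator. How Reimer et al. (2014) combine and calibrate them is not detailed here, so the function names and the way the law is inverted below are assumptions.

```python
import math


def vertex_frequency(ring):
    """Number of vertices divided by perimeter, for a closed polygon ring
    given as [(x, y), ...] without repeating the first vertex at the end."""
    n = len(ring)
    perimeter = sum(math.dist(ring[i], ring[(i + 1) % n]) for i in range(n))
    return n / perimeter


def radical_law(n_source, m_source, m_target):
    """Basic radical law: number of features to keep at scale 1:m_target,
    given n_source features at scale 1:m_source."""
    return n_source * math.sqrt(m_source / m_target)


def inverse_radical_law(n_source, m_source, n_target):
    """Inverted law: scale denominator at which n_target features would be
    expected, given n_source features at scale 1:m_source."""
    return m_source * (n_source / n_target) ** 2


# A square urban block with 100 m sides: 4 vertices over a 400 m perimeter.
square = [(0, 0), (100, 0), (100, 100), (0, 100)]
frequency = vertex_frequency(square)
# 100 features at 1:25k reduce to 50 features at 1:100k, and conversely.
kept = radical_law(100, 25_000, 100_000)
equivalent_scale = inverse_radical_law(100, 25_000, 50)
```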

Multiple Criteria Decision Method
We stated in Section 3.1 that LoD can be affected by a combination of five factors, all of which can be measured in a geographic dataset but are hardly comparable or can hardly be added. Multi-criteria decision methods are computational techniques that allow decision-making based on several criteria in those cases where a simple numerical value such as a mean is not a valid solution (Roy, 2005). Touya and Brando-Escobar (2013) propose a multi-criteria decision method to classify VGI features into LoD categories from street to country level. The method was improved by integrating elements from the scale equivalency in Touya and Reimer (2015). Some automatic results from the improved method are presented in Figure 4.
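To illustrate why a simple mean is not used, the sketch below lets each criterion vote for an ordinal LoD category on its own measurement scale and then aggregates the votes with a weighted median. This is a generic illustration of multi-criteria classification and not the method of Touya and Brando-Escobar (2013); the thresholds, the weights and the two intermediate category names are hypothetical.

```python
# Ordered LoD categories. Only "street", "city" and "country" are named in
# the text; the two intermediate names are placeholders.
LEVELS = ["street", "city", "intermediate_a", "intermediate_b", "country"]


def level_from_thresholds(value, thresholds):
    """Map a raw measurement to an ordinal level index.

    thresholds must be sorted in increasing order; the result is the index
    of the first threshold that the value does not exceed.
    """
    for index, threshold in enumerate(thresholds):
        if value <= threshold:
            return index
    return len(thresholds)


def weighted_median_level(votes):
    """Aggregate (level_index, weight) votes with a weighted median."""
    total = sum(weight for _, weight in votes)
    cumulated = 0.0
    for level, weight in sorted(votes):
        cumulated += weight
        if cumulated >= total / 2:
            return level
    raise ValueError("no votes given")


# Hypothetical feature: each criterion votes on its own scale (metres here).
votes = [
    (level_from_thresholds(3.0, [1, 5, 20, 100]), 2.0),    # geometric resolution
    (level_from_thresholds(0.8, [1, 5, 20, 100]), 1.0),    # granularity
    (level_from_thresholds(30.0, [2, 10, 50, 200]), 1.0),  # geometric precision
]
category = LEVELS[weighted_median_level(votes)]
```

The raw values are never added together; only their ordinal positions are compared, which is the property that makes this family of methods suitable for criteria that are hardly comparable.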

Current Generalisation in OpenStreetMap
Map generalisation is a complex process that simplifies and abstracts geographic information to produce a legible map at a given (smaller) scale. The problem of map generalisation automation has attracted research proposals for many years (see for instance Burghardt et al., 2014; Mackaness et al., 2007), and some mapping agencies are now able to use research results to produce maps with partial or total automation. One of the remaining challenges of automated generalisation research is to extend the current processes to make maps with VGI or maps that combine authoritative and user generated information.
If we look at the default maps available from OSM, there is almost no generalisation operation carried out on them. This is partly due to the philosophy of the OSM portal, which aims to show the content of the dataset rather than to display the best map possible. But it is also due to the difficulty of the generalisation process, which involves complex mechanisms that are not available in most mapping tools. However, some minimal selection operations are carried out in the default OSM map, using the semantics available to choose the zoom levels (i.e. scales) where features should be displayed. The piece of code below is extracted from the CartoCSS file used to render buildings in the default OSM map. It shows that standard buildings are displayed only for zoom levels greater than 13 (zoom levels are ordered from 0 for the whole world to 19 in OSM), and with a coloured outline at zoom levels greater than 15. Besides these minimal selection operations, there are very few proposals dedicated to the issues of generalising VGI at present (Sester et al., 2014). Klammer (2013) proposed some solutions for tile-based maps such as OSM, with each tile being generalised separately, but potential problems at tile junctions are not handled: generalisation often requires an analysis of the neighbouring objects, which is not possible at the edge of the tiles. Schmid and Janetzek (2013) proposed to generalise the OSM road network at small scales on-the-fly using important placenames in the dataset. However, most of the issues remain unsolved: how can we deal with the broad range of scales in generalisation processes, with the diversity of themes or with the heterogeneities in quality and LoD?
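The zoom-based selection rule described in this paragraph can be paraphrased in Python as follows. This is an illustrative sketch and not the CartoCSS of the OSM stylesheet: the function names and the style dictionary are hypothetical, and "greater than" is read strictly. The zoom-to-scale conversion assumes 256-pixel Web Mercator tiles, a 0.28 mm pixel and a position on the equator.

```python
def building_style(zoom: int):
    """Return a hypothetical style for standard buildings, or None when
    buildings are not drawn at this zoom level."""
    if zoom <= 13:
        return None  # buildings are only selected above zoom level 13
    style = {"fill": True, "outline": False}
    if zoom > 15:
        style["outline"] = True  # coloured outline above zoom level 15
    return style


def approx_scale_denominator(zoom: int) -> float:
    """Approximate scale denominator of a zoom level at the equator,
    assuming 256-pixel tiles and a 0.28 mm pixel."""
    return 559_082_264.029 / 2 ** zoom


styles = {zoom: building_style(zoom) for zoom in (12, 14, 16)}
scale_at_14 = approx_scale_denominator(14)  # roughly 1:34k
```

Such a rule only selects features by theme and zoom level; it does not modify any geometry, which is why it counts as minimal generalisation.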
The next two subsections address issues related to the range of scales and the diversity of themes with the generalisation of complex airports and railways from OSM. Section 4.4 addresses the generalisation of mashup maps with user generated content on top of reference datasets.

Generalisation of Complex Airports
Airports can be described in a great amount of detail in OSM, and contributors often use the OSM recommendations to capture airports as complex objects composed of runways, aprons where planes are parked, taxiways that connect aprons and runways, and terminal buildings. Figure 5 shows that such a complex structure is hard to represent legibly when the scale decreases, so generalisation algorithms dedicated to such structures must be used.
This subsection briefly describes a generalisation process presented in Touya and Girres (2014), where algorithms for the different types of features comprising airports are proposed, including, for instance, the decomposition of runways from polygons to lines. Here, we choose to focus on taxiway lines. Figure 5 shows that the junctions of taxiways are often complex, with shapes similar to slip roads. The first step in generalisation is to automatically characterise all of these complex junctions (see the coloured polygons on the right side of Figure 6) using the shapes of the lines, the angles of the connection and the number of connected taxiways. Then, each complex junction is simplified to a straight line crossing, removing all of the slip roads (Figure 6). Finally strokes are computed within the remaining taxiways. Strokes are groups of lines that follow the perceptual grouping principle of good continuity (Thomson and Richardson, 1999), like a continuous pen stroke, and have been used to simplify roads or rivers in the generalisation literature. Here, the smallest strokes are eliminated with a length threshold depending on map scale.
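The last step, eliminating the smallest strokes with a scale-dependent threshold, can be sketched as below. The sketch assumes that strokes have already been built and are given as polylines in metric coordinates; the function names and the 5 mm minimum map length are hypothetical and are not parameters from Touya and Girres (2014).

```python
import math


def polyline_length(points):
    """Ground length of a polyline given as [(x, y), ...] in metres."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))


def eliminate_short_strokes(strokes, scale_denominator, min_length_mm=5.0):
    """Keep only the strokes that are at least min_length_mm long once drawn
    at scale 1:scale_denominator."""
    min_ground_length_m = min_length_mm * scale_denominator / 1000.0
    return [s for s in strokes if polyline_length(s) >= min_ground_length_m]


strokes = [
    [(0, 0), (300, 0)],            # a 300 m taxiway stroke
    [(0, 0), (30, 40)],            # a 50 m connector
]
kept_25k = eliminate_short_strokes(strokes, 25_000)  # threshold: 125 m
kept_5k = eliminate_short_strokes(strokes, 5_000)    # threshold: 25 m
```

Because the threshold is expressed in map millimetres, the same call adapts to any target scale.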
When algorithms for taxiways, runways, aprons and terminals (see Touya and Girres, 2014) are chained, complete airports can be generalised; the results for OSM airports with different initial complexities are presented in Figure 7, showing that the flexibility of the algorithms allows for the management of LoD heterogeneity of OSM data.

Generalisation of Railway Networks
Airports are not the only geographic feature that is captured with a greater complexity in OSM. The OSM specifications advise capturing each railway, even in a train station or in marshalling yards where a great number of tracks may exist (Figure 8). The railway lines are often very close to each other and their symbols overlap very quickly when the scale decreases. In this case, a good generalisation process is able to handle different densities of parallel railways and simplify them while preserving the connections and the patterns of the railways. Railway networks are composed of two very different types of patterns: the main railway lines with a small number of parallel tracks, and the train station with complex structures of tracks. The best strategy is to handle both parts of the network separately with different methods (Touya and Girres, 2014; Savino and Touya, 2015). The simplest railways to generalise are the main railway lines: the parts where several railway tracks are close and parallel have to be identified automatically and then replaced by a single track when the symbols overlap (Savino and Touya, 2015). The results of this method for railways extracted from OSM in France are presented in Figure 9.
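The condition "when the symbols overlap" and the replacement of two parallel tracks by a single one can be illustrated with the simplified sketch below. It is not the algorithm of Savino and Touya (2015): it assumes that the two tracks have already been matched, have the same number of vertices and run in the same direction, and the symbol width is a hypothetical value.

```python
def symbols_overlap(track_spacing_m, symbol_width_mm, scale_denominator):
    """True when two parallel tracks are closer on the ground than the width
    covered by one line symbol at scale 1:scale_denominator."""
    symbol_ground_width_m = symbol_width_mm * scale_denominator / 1000.0
    return track_spacing_m < symbol_ground_width_m


def collapse_parallel(track_a, track_b):
    """Replace two matched parallel tracks by their centre line, averaging
    corresponding vertices (same vertex count and direction assumed)."""
    return [((xa + xb) / 2, (ya + yb) / 2)
            for (xa, ya), (xb, yb) in zip(track_a, track_b)]


track_a = [(0, 0), (10, 0)]
track_b = [(0, 4), (10, 4)]  # 4 m away from track_a

# A 0.5 mm symbol covers 12.5 m at 1:25k but only 2.5 m at 1:5k.
overlap_25k = symbols_overlap(4.0, 0.5, 25_000)
overlap_5k = symbols_overlap(4.0, 0.5, 5_000)
centre_line = collapse_parallel(track_a, track_b) if overlap_25k else None
```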
Regarding train stations, a typification operation is required. Typification simplifies a pattern of geographic features while preserving the characteristics of the pattern more than the position of the features taken individually. Several complementary typification algorithms are proposed in Touya and Girres (2014) and Savino and Touya (2015), and Figure 10 shows a result for a 1:25k map of a small train station.

Generalisation of a Combination of Authoritative Data and VGI
When VGI is used as a thematic layer on top of a map, as in Figure 11, which is extracted from the IGN application called 'Leisure area', the issues related to generalisation are different from those related to generalisation of VGI only. The background map can be generalised almost like a traditional topographic map, but the constraint is the preservation of the relations between the thematic layers and the background layers. If we use the example of Figure 11, the route should remain on top of the road, even if the road is generalised, which is likely to happen given the sharp bends at the top of the figure. Another example in Figure 11 is the spot of interest marked as n°2 in the figure, which is located on the summit of a large bend: if the bend is displaced by generalisation, which is a common side effect, the symbol should be adjusted accordingly. When the scale decreases, Duchêne (2014) states that such spatial relations should either be preserved or sometimes be abstracted to make them legible and understandable at the generalised scale. To enable this preservation or abstraction, the relevant spatial relations must be discovered and properly characterised, which is not an easy task, although proposals exist to model these relations with the introduction of implicit features such as bend summits, or to build an ontology of such spatial relations relevant for cartography.
Fig. 9: Main railways with parallel lanes collapsed to single lanes (Savino and Touya, 2015). ©OpenStreetMap contributors.

How can the LoD increase?
At large scales, e.g. maps at a 1:10k scale, there is no visualisation limitation for the very detailed features existing in OSM, and, as a consequence, map generalisation is not necessary. For instance, the very detailed railway networks described in Section 4 can have all of their lanes displayed without symbol overlaps at large scales. But the LoD inconsistencies illustrated in Figure 1 raise the problem of the representation of roughly digitised features at large scales. Most of the geographic meaning of maps is conveyed by relations between map features (Mackaness et al., 2014), so the solving of the problem of LoD inconsistencies should be focused on those relations that convey a specific meaning.
Following the ideas of Monmonier (1996), the way to increase the LoD of roughly digitised features is to caricature them in order to transform the improbable relations of features into probable relations. For the examples in Figure 12, a clearing would be introduced around the group of buildings, and the bus stop would be moved to the closest road. We call this operation, which artificially increases the LoD through probable spatial relations, LoD harmonisation (Touya and Baley, in press). However, there is no clue in the data as to the real shape of the clearing required in Figure 12: we only know that there must be one. This makes harmonisation tend more towards caricature and schematic mapping than towards realistic mapping. The map does not present real and precise shapes to the reader, but rather presents very probable spatial relations. The next section briefly describes some harmonisation operations and shows some results of their implementation on OSM data, while Section 5.3 discusses the problem of automatically chaining these harmonisation operations on a complete large scale map.

Harmonisation Operations
Different types of harmonisation operations are described by Touya and Baley (in press), and some of these are presented in this subsection. First, OSM contains some polygon features that represent functional sites such as schools, hospitals or commercial areas, which are themselves composed of other features also represented in OSM: buildings, roads, paths, parks, sports fields or helipads. For a clear understanding of what these zones mean in the map, the components should really be contained by the polygon, which is not always the case because the components are sometimes much more detailed than the zone itself. In this case, the harmonisation operation identifies the components that lie outside the zone and modifies the zone geometry so that it includes the missing components (Figure 13).
A similar problem might occur with land use/cover parcels that are often roughly digitised and some geographic features that should be inside the parcels. The most common example in OSM is the case of urban areas with buildings intersecting their limits or lying just outside. In such cases, the land use parcel geometry is extended by uniting the protruding geometries of the buildings just outside the area limits with the urban area geometry. The method is iterative, because new buildings can be found just outside once the geometry has been extended (see automatic results in Figure 14).
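The iterative nature of this extension can be shown with a deliberately simplified sketch in which geometries are rasterised into sets of grid cells instead of vector polygons. This is an assumption made to keep the example self-contained; the operation of Touya and Baley (in press) works on vector geometries. "Just outside" is read here as "within one cell of the area", which is also an assumption.

```python
def touches(area, building):
    """True when a building cell lies inside the area or in one of the
    eight cells surrounding an area cell."""
    return any((x + dx, y + dy) in area
               for (x, y) in building
               for dx in (-1, 0, 1)
               for dy in (-1, 0, 1))


def extend_area(area_cells, buildings):
    """Unite the area with every building intersecting it or lying just
    outside, and repeat until no further building can be added."""
    area = set(area_cells)
    remaining = [set(b) for b in buildings]
    changed = True
    while changed:
        changed = False
        for building in list(remaining):
            if touches(area, building):
                area |= building
                remaining.remove(building)
                changed = True  # the extended area may now reach new buildings
    return area


urban_area = {(x, y) for x in range(3) for y in range(3)}  # cells 0..2 x 0..2
buildings = [
    {(4, 1)},    # only reachable once the building at (3, 1) has been added
    {(3, 1)},    # just outside the initial area
    {(10, 10)},  # far away, must stay outside
]
extended = extend_area(urban_area, buildings)
```

The building at (4, 1) is only absorbed on the second pass, which is the situation that makes the method iterative.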
Another type of necessary harmonisation operation is disambiguation, which aims to remove spatial relations that should not exist in reality without knowing what the reality looks like. For instance, it is extremely unlikely to find a group of close buildings inside a forest without a clearing. When the forest has been roughly digitised and the buildings have a high LoD, we can infer the presence of a clearing and try to add it in the forest. The proposed operation determines where the overlaps exist between the buildings and the forest and then crops the newly created clearing with the edges of the network elements, which are often barriers for forests (Figure 15).
More examples of useful harmonisation operations can be found in Touya and Baley (in press).

How to Chain Harmonisation Operations
Harmonisation operations are the building blocks for deriving LoD harmonised large scale maps, but they are not enough, because several problems can occur:
• Harmonisation operations carried out on close parts of the map can affect each other and the last one can damage the previous harmonisations.
• Harmonisation operations that displace or distort features can cause legibility problems with other features of the map (e.g. symbol overlap).
• Harmonisation operations can be related to each other and the order of operations might have an impact; for instance, a displacement of a building that overlaps a riverbank (Figure 16) might put the building just outside the urban area, so the extension of the urban area should be implemented afterwards.
Similar problems occurred with the automation of map generalisation that first developed individual algorithms and then tried to combine them into complex processes (Harrie and Weibel, 2007; Regnauld et al., 2014). To harmonise the area shown in Figure 16, where multiple buildings overlap a riverbank, we therefore used an optimisation process inspired by map generalisation (Harrie, 1999; Sester, 2005), which combines the harmonisation of buildings that are close to each other into a least squares adjustment. Figure 16 shows that for each group of close buildings identified, all buildings have been jointly displaced, avoiding symbol overlap with the river and with other buildings.

Quality Assessment Taking into Account Crowdsourced Ground Truth Data
As mentioned in Section 2, automatic mapmaking processes require some consistency in data quality, or some kind of assessment of this quality if consistency is not achievable, which is the case with VGI. This section describes a study to assess the quality of OSM features, using ground truth data. In many studies, OSM is used as a proxy for VGI data; this study is not an exception, as OSM is a prime source of vector-encoded GI that can be directly used in cartographic processes. However, any effort in mapmaking using VGI data should expand its horizons to include other sources as well. Today, VGI comes from different sources and in many flavours, such as toponyms, GPS tracks, geotagged photographs, synchronous micro-blogging, social networking content, blogs, gaming spaces, sensor measurements, etc. All of these sources can either offer valuable geographic information complementary to OSM data (e.g. Geonames can provide a supplementary dataset to the OSM places) or be used as quality assessment tools (e.g. through the use of geotagged photographs from photo-sharing repositories). This latter case is the focus of this section. Geotagged photographs are, in a sense, in-situ observations of the ground reality and thus, if properly used, can help assess various quality factors of OSM data and improve the decisions in some of the cartographic processes analysed above. As explained, semantic mismatches, topological and positional errors and vague and ambiguous cases of overlaps and intersections should be expected when handling VGI. All these cases are challenging to disambiguate and can negatively affect the outcome of the cartographic processes.
Fig. 16: 1) Detection of LoD inconsistencies (in this case a building intersecting the riverbank); 2) clusters of close buildings are created around the identified inconsistencies; 3) each cluster is harmonised as a whole to remove overlaps without creating new ones. ©OpenStreetMap contributors.
When relying solely on VGI data for mapmaking, the ambiguous cases first need to be recognised and located, and then corrected or verified by the contributors themselves. Indeed, it has been documented that the positional quality of features improves as more contributors add data or modify a feature. However, participation biases (Antoniou and Schlieder, 2014) and the digital divide (Graham et al., 2014) can negatively affect a widespread effort of quality improvement. Hence, we need to devise methods, by using diverse VGI data, that can more easily identify and correct such potential sources of error before they enter the cartographic chain of processes: in a sense, the mixture of diverse VGI sources might counter-balance biases and errors from individual VGI sources.
Although there is no direct link between geotagged photographs and map scales, it can be inferred that, as geotagged photographs usually capture a small ground area from a close distance in high detail, they can be of help in large scale maps. In general, cases where geotagged photographs can provide better ground truth include the efforts to:
• verify if a feature exists (i.e. assess completeness)
• verify the type of a feature (i.e. assess thematic accuracy)
• verify the topology and the relationship between features (i.e. assess logical consistency)
• verify the state of a feature for a particular time-stamp (i.e. assess temporal accuracy).
Here, as a case study, we focus on the use of other VGI sources (i.e. Flickr geotagged photographs) to evaluate the validity of OSM Points of Interest (POIs) in three different scenarios trying to i) verify the OSM points that could not have been created through image interpretation as there are objects that obscure the view (i.e. trees and wooded areas), and whose OSM updates consequently normally require the physical presence of contributors on the ground; ii) disambiguate areas of overlapping OSM land use/land cover types at a given point in time (for more, see Antoniou et al., 2016); and iii) correct problematic POIs in terms of topo-semantic consistency.

Verify OSM POIs
One of the comparative advantages of VGI is that it can provide timely data for areas and cases where other sources cannot be equally effective. One such case is that of the areas where satellite imagery (a prominent way of capturing authoritative data) cannot provide the needed information, e.g. under wooded areas (Figure 17). Here, local knowledge by contributors is valuable, as in-situ observations can be an important source of information. In this context, geotagged images are well placed to play a significant role. For the verification of the OSM POIs, an online application has been developed that displays a geotagged photograph, retrieved using the Flickr API, and asks the user whether a specific POI could be recognised within approximately X meters (as computed from the locations of the POI and of the geotagged photograph) in the photograph. Thus, for example, the question has the form 'Do you see a monument about 2m away, in the photo below?' (for more on this, see Antoniou et al., 2016). Figure 18 shows a number of illustrative examples generated by the application.
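The distance that appears in the question can be computed from the two pairs of coordinates, for instance with the haversine formula. The sketch below is an assumption about how such a question could be generated and is not the code of the application described above; the function names and the coordinates are hypothetical, and no call to the Flickr API is made.

```python
import math


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in decimal
    degrees, on a sphere with a mean Earth radius of 6 371 km."""
    radius_m = 6_371_000.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * radius_m * math.asin(math.sqrt(a))


def verification_question(poi_type, poi_latlon, photo_latlon):
    """Build the question shown to the user next to the photograph."""
    distance = haversine_m(*poi_latlon, *photo_latlon)
    return (f"Do you see a {poi_type} about {round(distance)}m away, "
            "in the photo below?")


# Hypothetical POI and photograph locations.
poi = (48.8462, 2.3372)
photo = (48.8462, 2.3372)
question = verification_question("monument", poi, photo)
one_degree_m = haversine_m(0.0, 0.0, 0.0, 1.0)  # about 111.2 km on the equator
```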
A systematic fusion of diverse VGI sources can improve the quality of the data used for mapmaking not only in the initial phases of data gathering but also in a step-by-step implementation of cartographic processes as shown above. For example, in the case shown in Figure 15, geotagged photographs could be used to examine and verify if such openings in the forest really exist or if the constructions portrayed are hidden under the woods.

Verify OSM Land Use / Land Cover
The second case study for using geotagged photographs to evaluate a VGI dataset comes from the Land Use/Land Cover (LU/LC) domain. Here the challenge is to disambiguate inconsistencies regarding the actual LU/LC that arise from contradictory feature types that occur between different OSM layers, e.g. in the Landuse and the Natural OSM layers (a more thorough study can be found in Fonte et al., 2016). The LU/LC at each given point should be unambiguously retrieved: this requirement not only contributes to the overall quality of OSM and to the correct cartographic output but also enables the use of OSM data for the creation of LU/LC products. Here again, overlaps between different and contradictory LU/LC feature types create inconsistencies that could possibly be disambiguated with the use of geotagged photographs. For example, Figure 19 (left) shows the overlap of a closed construction site (purple polygon) and a residential road (green line) in OSM (green dots represent the locations of Flickr photographs). Although the VGI elements co-exist in the same VGI source (i.e. in OSM), it is obvious that it is not possible for both layers to correctly denote the actual land use of the area. The use of geotagged images could provide the necessary information to clarify the mismatch. In Figure 19 (right), a Flickr photograph taken within the polygon clearly shows that the area has been turned into a construction site. Additionally, a valuable characteristic of the VGI datasets used is the time information they contain: using the individual timestamps of features, it is possible to analyse and understand the currency of each feature, which could be valuable in updating the overlapping features that have outdated information.
Fig. 17: A satellite image of a sample area in Paris (left) and the polygons of wooded areas (right) for the same area (©IGN, France).
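The use of timestamps to judge currency can be sketched as follows. This is only an illustration under strong assumptions: each overlapping feature carries a last-edit timestamp and an LU/LC class, and a photograph has already been interpreted (by a person) as showing one class. The field names `id`, `lulc` and `timestamp` and the function name are hypothetical.

```python
from datetime import datetime


def stale_features(features, photo_taken, photo_class):
    """Return ids of overlapping features that contradict a more recent photograph.

    features: list of dicts with 'id', 'lulc' and 'timestamp' (ISO 8601 string).
    photo_taken: ISO 8601 string, date the photograph was taken.
    photo_class: LU/LC class a human interpreter saw in the photograph.
    """
    taken = datetime.fromisoformat(photo_taken)
    stale = []
    for feature in features:
        edited = datetime.fromisoformat(feature["timestamp"])
        # A feature is a candidate for updating only if it disagrees with
        # the photograph AND was last edited before the photograph was taken.
        if feature["lulc"] != photo_class and edited < taken:
            stale.append(feature["id"])
    return stale
```

A feature that disagrees with the photograph but was edited after it is deliberately not flagged, since the photograph itself may then be the outdated source.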
With the two illustrations given in this and the previous section, it is shown that mixing independent VGI sources can prove a helpful way to spot possible errors, to evaluate the validity of features and to justify the implementation of various cartographic processes. In this context, the proactive disambiguation of vague cases at large scales can lead to correct decisions on the cartographic processes described above and avert the propagation of errors when moving to smaller scales.

Verifying and Correcting Topo-semantic (In)consistency
Topo-semantic consistency (Servigne et al., 2000) is a subset of logical consistency that concerns the correctness of the topological relationship between two objects according to their semantics. Topo-semantic consistency refers to the consistency of geographic objects with other geographic objects of the same theme (intra-theme consistency) or of other themes (inter-theme consistency). Inconsistency exists in VGI due to the absence of integrity constraints and, therefore, depends on the expertise of the data contributor. A map should not portray inconsistencies; thus, inconsistencies should be identified and resolved during the mapmaking process. Instead of correcting these errors in order to satisfy consistency blindly and without taking reality into account, correction can be based on ground truth provided by Flickr images, as explained earlier.
A number of tests can be applied in order to find inconsistencies in the OSM data between features from the same layer (e.g. two roads), or from different layers. Tests are based on consistency evaluation utilising topological relations that the data should satisfy, taking the data semantics captured by their attributes into account as well. In OSM, apart from the geometry capture, the existence of a plethora of tags provides a rich semantic dataset, and thus sophisticated topo-semantic relations can be explored. Here, we focus on POIs because they are more easily captured in photographs due to their dimensions. POIs that are problematic with regard to their position in comparison to other layers can be verified with Flickr images. If the Flickr images prove that the topo-semantic relation is correct, then no changes are made; otherwise the geometry (relative horizontal position) and/or the semantic information (Type tag) is updated according to the photograph. Finally, the topo-semantic relations are re-evaluated. A case study was performed with OSM data that cover the broader Paris area. According to this study, in the area of interest there are 22,527 OSM POIs with two main attribute tags related to their identity: Name and Type. Topological relations of POIs against other thematic layers are examined based on a number of checks, and errors will be examined utilising Flickr photographs. For example, it is important to investigate the topological relationship between POIs and buildings, examining whether POIs should be situated inside or outside building polygons. Initially POIs are clipped with the convex hull of the area covered by buildings, resulting in 60136 points. A number of points (21872) are situated inside the building polygons, 2338 (4%) are situated on the building boundaries and 35926 (60%) are situated outside.
It is examined whether the position of the POIs outside of the buildings is valid based on their semantics captured with the Type attribute. Based on this test, 30497 (85% of the initial estimate) can indeed be situated outside but 5429 (15% of the original estimate) should be situated inside the building polygons and need further investigation. Similarly, a number of points (24210) are situated inside the building polygons. Based on a similar test, 22047 (91%) can indeed be situated inside but 2163 (9%) should be situated outside the building polygons and need further investigation. In this study, the correct position of the points in relation to the buildings was decided according to common sense.
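The POI-versus-building test can be illustrated with a minimal sketch. It assumes planar coordinates and simple polygons given as vertex lists. The `EXPECTED` table is a hypothetical stand-in for the common-sense rules mentioned above. The ray-casting test is a standard technique chosen here for self-containment, not necessarily the one used in the case study.

```python
def _on_segment(p, a, b, eps=1e-9):
    """True if point p lies on segment ab (within a small tolerance)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    cross = (bx - ax) * (py - ay) - (by - ay) * (px - ax)
    if abs(cross) > eps:
        return False
    return (min(ax, bx) - eps <= px <= max(ax, bx) + eps
            and min(ay, by) - eps <= py <= max(ay, by) + eps)


def locate(point, polygon):
    """Classify a point as 'inside', 'boundary' or 'outside' a simple polygon."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        a, b = polygon[i], polygon[(i + 1) % n]
        if _on_segment(point, a, b):
            return "boundary"
        (ax, ay), (bx, by) = a, b
        # Ray casting: count edges crossed by a horizontal ray going right.
        if (ay > y) != (by > y):
            x_cross = ax + (y - ay) * (bx - ax) / (by - ay)
            if x < x_cross:
                inside = not inside
    return "inside" if inside else "outside"


# Hypothetical semantic rules: where a POI of a given Type is expected to be.
EXPECTED = {"restaurant": "inside", "bench": "outside", "bus_stop": "outside"}


def check_poi(poi_type, point, buildings):
    """Return (observed position, 'ok' or 'review') for one POI."""
    position = "outside"
    for polygon in buildings:
        found = locate(point, polygon)
        if found != "outside":
            position = found
            break
    expected = EXPECTED.get(poi_type)
    # Unknown types and boundary cases are left to further investigation.
    if expected is None or position == "boundary":
        return position, "review"
    verdict = "ok" if position == expected else "review"
    return position, verdict
```

POIs marked `review` are the ones that would then be checked against geotagged photographs.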
In another test, POIs that are semantically related to roads and railways are examined against the network geometry. Regarding POIs that are tagged as crossings (12612), 99.5% (12552) are situated on road intersections and only 60 of them (0.5%) have a different position and need checking. Regarding POIs that are tagged as traffic lights (2310), 99.2% (2292) are situated on the road intersections and only 18 of them (0.8%) have a different position that will be further checked. POIs that are tagged as 'level crossings' (209) and 'railway_crossing' (1) are situated on the rail network intersections. Points semantically related to the intersections of the rail and road network, such as level crossings, are checked in relation to the actual intersections of the road and rail network. Of the 1101 points, 949 (86%) are situated on the intersections while 152 (14%) have a different position and need further investigation. Of course map scale is also an important factor when judging distance. For example, the distance between network junctions and POIs tagged as crossings might be negligible in relation to scale.
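The role of scale in this check can be sketched as follows. The sketch assumes projected coordinates in metres, ways given as lists of node ids, and junctions approximated as nodes shared by more than one way. The 0.2 mm map-distance threshold is an illustrative assumption, not a value from the case study.

```python
import math
from collections import Counter


def junctions(ways):
    """Node ids used by more than one way: a simple proxy for intersections."""
    counts = Counter(node for way in ways for node in set(way))
    return {node for node, n in counts.items() if n > 1}


def tolerance_m(scale_denominator, map_tolerance_mm=0.2):
    """Ground distance (m) matching a map distance at scale 1:scale_denominator."""
    return map_tolerance_mm / 1000.0 * scale_denominator


def flag_crossings(pois, junction_coords, scale_denominator):
    """Return ids of crossing POIs farther from every junction than the tolerance.

    pois: dict mapping POI id to (x, y) in metres.
    junction_coords: list of (x, y) junction positions in metres.
    """
    tol = tolerance_m(scale_denominator)
    flagged = []
    for poi_id, (x, y) in pois.items():
        nearest = min(
            (math.hypot(x - jx, y - jy) for jx, jy in junction_coords),
            default=math.inf,
        )
        if nearest > tol:
            flagged.append(poi_id)
    return flagged
```

With these assumptions, a POI 3 m away from the nearest junction is flagged at 1:10 000 (tolerance 2 m) but not at 1:25 000 (tolerance 5 m), which mirrors the remark that the offset may be negligible at the target scale.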
The inspection of topo-semantic relations highlights areas where consistency is not fulfilled and should be corrected during the mapmaking process. Pre-processing based on topo-semantic relations limits the intervention of cartographers to only those cases that are problematic. Whereas an in situ visit costs time and money, the provision of ground truth through geotagged Flickr images is a welcome alternative solution emerging from the VGI universe.

VGI and Symbol Specification
This section discusses issues related to VGI symbolisation, and is more forward looking than the previous ones. As with the other previously described cartographic processes, the main issues regarding symbol specification with VGI are: what could be impacted by this new source of data, and what should be adapted and how? A reminder of the symbol specification process is given first. Then, we highlight aspects to be discussed and controlled to adapt this process to VGI.

The Symbol Specification Process
The symbol specification process occurs at the end of the global cartographic design process. At this stage, the input objects should be generalised for the expected map scale in order to be able to properly specify styles that are suitable at this scale. Traditional cartographic symbolisation, for instance in map series production, is based on historical knowledge of symbol specifications and cartographic practices and processes, related to a particular topographic style (Ory et al., 2015). Symbol specifications have also been considered as a user controllable problem in order to make personalised maps (Christophe, 2011). Research on style and symbol specification now focuses on processes inspired by computer graphics to mimic traditional cartographic symbolisation, or to apply artistic styles to maps (Christophe et al., 2016). The three main steps of the symbol specification process are:
• Legend specification: themes and semantic relations between map themes. It first requires that the legend be structured by semantic themes with semantic relationships (e.g. rivers and lakes are in the same legend theme and their symbols should be related).
• Style specification: signs for themes. This requires choosing and combining relevant graphic signs to enhance semantic relations on the map.
• Map rendering. The rendering step effectively applies the style specification to the cartographic objects on the map. It may involve complex rendering techniques, such as textures to render forest areas.
Tools such as Mapnik, which are used to make maps with OSM, do provide some basic rendering methods, including polygon texture fills or advanced text rendering, that could be extended to help users complete the three steps of symbol specification.

Discussion and Guidelines for Using VGI in Symbol Specification Processes
As for the other mapmaking processes, the first issue to address when using VGI in symbol specification processes is the adaptation of processes developed for consistent databases to the heterogeneity of VGI. This adaptation can be achieved by a characterisation of VGI features, i.e. its quality, semantics and LoD. But such characteristics of quality or LoD are no longer consistent on a given map theme, as each VGI feature might have its own quality or LoD. Thus a symbol specification for each map theme might not be possible with VGI. For the same map theme, for instance rivers, the symbol might be adapted to the quality, semantics and LoD of the features (e.g. darker shades of blue and wider symbols for rivers with more details/better quality). A typical use case of maps made with VGI is the mashup map with crowdsourced thematic data on top of existing reference data. In this case, the symbol specification for the reference background might have been designed independently from the thematic data; thus the addition of thematic VGI involves three problems:
• Management of contrasts: the thematic data should be more legible than the background and the contrast in the background should be altered to optimise the contrast with the thematic data.
• Preserving a topographic style: adding a crowdsourced thematic layer should not prevent the map reader from understanding the topographic style of the background.
• Visualising imprecision aspects: the thematic layer is both heterogeneous in terms of quality and different from the background. Thus the symbol specification should convey these differences as much as possible (see Chapter 9 by Skopeliti et al. (2017) regarding quality visualisation).
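The idea of adapting a river symbol to the quality or LoD of each feature (darker and wider for more detailed features) can be sketched as a per-feature style function. The score in [0, 1], the two colours and the width range are all illustrative assumptions. How such a score would be inferred from VGI features is outside the scope of this sketch.

```python
LIGHT_BLUE = (170, 211, 223)  # hypothetical colour for low-detail rivers
DARK_BLUE = (20, 60, 140)     # hypothetical colour for high-detail rivers


def lerp(a, b, t):
    """Linear interpolation between a and b for t in [0, 1]."""
    return a + (b - a) * t


def river_style(score, min_width=0.5, max_width=2.0):
    """Map a quality/LoD score in [0, 1] to a stroke colour and width.

    Scores outside [0, 1] are clamped. Widths are in arbitrary display units.
    """
    t = max(0.0, min(1.0, score))
    rgb = tuple(round(lerp(lo, hi, t)) for lo, hi in zip(LIGHT_BLUE, DARK_BLUE))
    return {
        "stroke": "#{:02x}{:02x}{:02x}".format(*rgb),
        "stroke_width": lerp(min_width, max_width, t),
    }
```

A continuous ramp is only one option; a small number of discrete classes may be easier for map readers to distinguish and to explain in a legend.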

Crowdsourcing the Symbol Specification Process?
The symbol or style specification process is user-driven, as the map purpose and the map user needs are translated into a legend and rendered on the map. In addition to using crowdsourced data in the map, a crowdsourced map could also involve greater interaction with the user during the mapmaking process: for example, a consensus decision among OSM contributors could be reached regarding the colour to use to render the forest areas in the standard display. Research on automated on-demand mapping tries to capture the needs of users through techniques such as ontologies and interactions (Balley et al., 2014), but allowing the users to choose the way crowdsourced data can be rendered in the legend and the map requires a step further in this direction.

Conclusions and Further Work
This chapter addressed the challenges of automated mapmaking using VGI as input data. VGI differs from traditional geographic databases because of heterogeneities in quality and LoD, and because of thematic diversity, so existing methods for automated mapmaking have to adapt to this situation. This chapter described a proposition to infer the LoD of VGI features to overcome heterogeneity, and then presented methods that use this inference to make maps at different scales using map generalisation or LoD harmonisation. The paper also proposed techniques to overcome the quality heterogeneity, which can alter the map legibility. Finally, the paper discussed how advanced stylisation techniques could be applied to VGI. There is much more work to be done, as automated mapmaking itself is a large research topic. The long-term goal is to design adaptive and completely automated cartographic processes, because the amount of data is too large for manual cartography, and the content has to be adapted to different needs and display devices. Beyond continuing to improve the methods presented here, it must be noted that generalisation and harmonisation operations do not handle quality heterogeneities yet, and we should investigate how such processes can adapt to quality information that can be inferred from VGI features similarly to the handling of LoD information discussed above. For instance, a forest imported from Corine Land Cover and one captured precisely with satellite imagery do not require the same simplification algorithms. The future diffusion of web maps will be based on vector maps using vector tiling, such as the OpenScienceMap project that provides a vector mapping of OSM. Such web maps will raise several research questions, such as that of the online triggering of generalisation and harmonisation processes, when such processes are mostly designed for offline processing. 
The question of tiled processing is also an issue, as mapmaking processes make considerable use of the geographic neighbourhood of features to choose the best process. The development of vector web maps will also enable user customisation of stylisation, which will require addressing the research issues discussed in the last section of this chapter.