Modelling the world in 3D from VGI/crowdsourced data

Within the last ten years, Volunteered Geographic Information (VGI) has developed rapidly and significantly influenced the world of GIScience. Most prominently, the OpenStreetMap (OSM) project maps our world in a level of detail never seen before in user-generated maps. In addition, most of the urban areas on our planet have been covered by hundreds of millions of photos uploaded to social media platforms such as Flickr, Instagram, Mapillary, Weibo and others. These data are directly or indirectly geo-referenced and can be used to extract 3D information and to model our world in 3D. At the current stage, several approaches are available to generate and visualise the 3D world, mainly using OpenStreetMap data. The resulting 3D buildings are unfortunately restricted to a coarse level of detail, since no further information about facade structure is available in OSM. This paper presents current work on the reconstruction of 3D buildings using data from both OSM and Flickr, whereby facade structure can be extracted from Flickr images and OSM footprints can be used to accelerate dense image matching and to improve the accuracy of geo-referencing.


Introduction
In recent years, the term Volunteered Geographic Information (VGI) has become popular. VGI describes how an ever-expanding range of users collaboratively collects geographic data (Goodchild 2007a). That is, hobbyists create geographic data based on personal measurements (via GPS etc.) and share them in a Web 2.0 community, resulting in a comprehensive data source with humans acting as remote sensors (Goodchild 2007b). With a global cast of volunteers, OpenStreetMap (OSM) is considered one of the most successful and popular VGI projects. In its current state, there are more than two million registered members (OSM 2015) who contribute to the rapid growth of OSM. Recent investigations of its completeness and quality have shown that urban areas in Central Europe in particular have already been mapped with an impressive level of detail (Neis et al. 2012). In those areas, OSM has moved well beyond mapping only the street network. For the continuous improvement of OSM it is crucial to enable the mapping of even more detailed, three-dimensional spatial information.
Several years ago, Over et al. (2010) investigated the possibility of creating a 3D virtual world from OSM data for different applications, and concluded that OSM has huge potential for fulfilling the requirements of CityGML LoD1 (Gröger et al. 2008), in which buildings are represented as block models at a regional scale. With the rapid development of OSM in recent years, sparked especially by the availability of high-resolution imagery from Bing since 2010, there has been an increase in building information in OSM, proving that volunteers do not only contribute roads or points of interest (POIs) to the database. According to the latest statistics (the values are derived from our internal OSM database, which is updated daily), the number of buildings in OSM is above 200 million, of which 18.4 million are building footprints in Germany. The research of Fan et al. (2014) demonstrated that the building footprint data in OSM has a high degree of completeness and semantic accuracy. In terms of positional accuracy, there is an offset of about four metres on average. With respect to shape, OSM building footprints are highly similar to those in authority data. Moreover, there is more and more information about building height and roof structures, which is required for 3D reconstruction. From this point of view, one can say that it is possible to model the virtual world in 3D from OSM data. The 3D data could be further enriched by introducing related information from other VGI projects, such as Flickr, WikiMapia, Panoramio, Instagram and Dronestagram.
This paper provides a detailed perspective on the generation of 3D city models from VGI data, based mainly on OSM data. First, it gives an overview of the data sources that could be used for 3D model generation; then the mechanism for generating 3D city models is described. Many of the proposed algorithms have been implemented within the OSM-3D and OpenBuildingModels projects, which are also introduced. Finally, the paper gives conclusions, a discussion and suggestions for future work.

VGI as a data source for 3D reconstruction
The earliest approach for sharing 3D models using the principle of 'everyone for everyone' is Google 3D Warehouse, launched on April 24, 2006. This shared repository contains user-generated 3D models of both geo-referenced real-world objects, such as churches or stadiums, and non-geo-referenced prototypical objects, such as trees, light posts or interior objects like furniture. The former also appear in Google Earth. In order to contribute voluntarily, users have to have a certain level of 3D modelling skill. The main focus of this repository does not lie in assembling 3D city models, as the non-geo-referenced objects seem to be more important in related work. They are, for example, used to improve methods of automatic object recognition in the field of laser scan classification or robotic vision. In addition, the 3D Warehouse models are being integrated in several commercial systems, such as design tools or simulation software. However, the number of 3D buildings and 3D city facilities, such as bridges, bus stations and fuel stations, has increased in recent years, thanks to the development of 3D modelling programs such as SketchUp and ESRI CityEngine, which make 3D editing easier and more effective.
In 2007, VGI was introduced by Goodchild (2007a,b) to describe the recent revolution of collaboratively created spatial information on Web 2.0. Almost at the same time, Microsoft Virtual Earth and Google Earth launched their pioneering projects in the spirit of VGI or crowdsourcing. The projects are called 3DVIA (Virtual Earth) and Building Maker (Google Earth). Both of them provide a model kit to create buildings, deriving the 3D geometry from a set of oblique (and proprietary) bird's-eye images of the same object from different perspectives. In contrast to the 3D Warehouse, these tools specifically aim at geo-referenced 3D building models only. They are intended for people who do not have knowledge of 3D modelling, but still want to contribute.
In addition to 3D Warehouse, there are several free-to-use 3D object repositories on the internet, for example OpenSceneryX, Archive3D or Shapeways. These projects emerged from entirely different communities with an interest in, for example, flight simulators or 3D printing. The contents usually lack a connection to the real world but can nonetheless be useful to enrich real 3D city model visualisations.
The above-mentioned projects serve to directly share and collect 3D models by means of crowdsourcing. The data used for 3D reconstruction might be commercial or authority data. In fact, 3D city models can also be reconstructed using 2D vector or image data contributed by crowdsourcing. The OSM community has not only captured roads and paths, but also more and more POIs, land use areas and even buildings. The latter can be extracted and extruded into 3D. At present, there are several projects that generate and visualise 3D buildings from OSM: OSM-3D, OSM Buildings, Glosm, OSM2World, etc. The major limitation of these projects is that the majority of buildings are only modelled at a coarse level of detail. When applying the concept of levels of detail (LoDs) introduced in CityGML (Gröger et al. 2008), these buildings are actually LoD1, i.e. they are reconstructed by extruding footprints with flat roofs. In OSM-3D, a number of buildings are modelled in LoD2 in cases where there are indications of their roof types. Furthermore, it is possible to integrate more detailed, though usually manually generated, buildings (LoD3 or LoD4) from other sources via OpenBuildingModels (Uden & Zipf 2012).
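The LoD1 extrusion mentioned above can be illustrated with a minimal sketch. It is not the code of any of the listed projects; the function name extrude_footprint and the mesh layout (a vertex list plus faces given as index lists) are assumptions made for illustration, and the footprint is assumed to be a simple ring given counter-clockwise without a repeated closing vertex.

```python
from typing import List, Tuple

Point2D = Tuple[float, float]
Point3D = Tuple[float, float, float]


def extrude_footprint(footprint: List[Point2D], height: float,
                      base: float = 0.0) -> Tuple[List[Point3D], List[List[int]]]:
    """Extrude a closed 2D footprint into a flat-roofed prism (block model).

    `footprint` lists the ring vertices once, counter-clockwise, without
    repeating the first vertex. Returns (vertices, faces); each face is a
    list of indices into `vertices`.
    """
    n = len(footprint)
    if n < 3:
        raise ValueError("a footprint needs at least three vertices")
    bottom = [(x, y, base) for x, y in footprint]
    top = [(x, y, base + height) for x, y in footprint]
    vertices = bottom + top
    # One quadrilateral wall per footprint edge.
    walls = [[i, (i + 1) % n, n + (i + 1) % n, n + i] for i in range(n)]
    floor = list(range(n - 1, -1, -1))  # reversed ring, so it faces downwards
    roof = list(range(n, 2 * n))        # same order as the footprint
    return vertices, walls + [floor, roof]


if __name__ == "__main__":
    square = [(0.0, 0.0), (10.0, 0.0), (10.0, 8.0), (0.0, 8.0)]
    verts, faces = extrude_footprint(square, height=7.0)
    print(len(verts), "vertices,", len(faces), "faces")
```

A footprint with n vertices yields 2n vertices and n + 2 faces. Footprints with holes or roof shapes other than flat are outside the scope of this sketch.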
Flickr is another VGI project that is often used for the reconstruction of 3D buildings. Preliminary experiments on reconstructing 3D scenes from Flickr imagery have been reported by Snavely et al. (2006; 2008) and Agarwal et al. (2011). However, Flickr imagery is almost untapped and unexploited by computer vision researchers, in particular when it comes to deriving representations suitable for GIS. A major reason is that the imagery is not in a form that is amenable to processing (Snavely et al. 2008). The photos are unstructured: they are taken in no particular order, and have an uncontrolled distribution of camera viewpoints and unclear positional accuracy. In addition, they are uncalibrated (Agarwal et al. 2011), with widely variable illumination, resolution and image quality. All this increases the difficulty of image registration and sparse 3D reconstruction. Furthermore, the existing approaches are based on dense image matching, which leads to high computation costs.

Generating 3D city models from OSM data
A city normally consists of a street network, land uses, buildings, point features and other elements. These are handled separately for 3D modelling and visualisation. It should be pointed out that a digital terrain model (DTM) is required from other sources (e.g. open data) for the 3D visualisation, because OSM does not contain any information about terrain.

Integrating OSM land uses within 3D Terrain Surface
In fact, it is hard to integrate OSM data within a 3D terrain surface because OSM data is recorded in 2D. The problem can be solved by draping the OSM data like a flexible net over the terrain surface, which is treated as a solid object, while preserving the characteristics of OSM features during the process (e.g. a football ground should be flat). In principle, there are three alternative ways to display OSM land-use data in 3D. The data can be displayed by mapping raster images onto a digital elevation model (DEM), by overlaying vector data on the DEM, or by combining the vector data and the DEM in an integrated triangulated irregular network (TIN). Schilling et al. (2007) proposed an approach to integrate the road surface into the triangulation of the DTM, which is represented by a set of TINs. This means that the road surface becomes a part of the TIN. The street network is treated as a layer consisting of a collection of polygons representing all the individual network segments. The borders of the polygons are integrated into the TIN as fully topological edges so that we can distinguish between triangles that are part of the street surface and the remaining triangles.
The resulting triangles within the polygon receive the attributes of the source features and can be coloured for visualisation. Another advantage of this approach is that all layers can be styled by the user on demand via a 3D styled layer descriptor (3D-SLD), which is an enhancement of the OGC SLD standard (Neubauer & Zipf 2007).
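How triangles might inherit the attributes of a source polygon can be sketched as follows. The sketch assumes that the polygon borders have already been inserted into the TIN as edges, so that every triangle lies entirely inside or entirely outside the polygon and a centroid test is sufficient. The function names and the attribute dictionary are hypothetical and are not taken from Schilling et al. (2007).

```python
from typing import Dict, List, Optional, Sequence, Tuple

Point = Tuple[float, float]


def point_in_polygon(pt: Point, polygon: Sequence[Point]) -> bool:
    """Even-odd ray casting test for a simple polygon (no holes)."""
    x, y = pt
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does a horizontal ray from pt towards +x cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def label_triangles(vertices: Sequence[Point],
                    triangles: Sequence[Tuple[int, int, int]],
                    polygon: Sequence[Point],
                    attributes: Dict[str, str]) -> List[Optional[Dict[str, str]]]:
    """Give each triangle the polygon's attributes if its centroid is inside."""
    labels: List[Optional[Dict[str, str]]] = []
    for a, b, c in triangles:
        cx = (vertices[a][0] + vertices[b][0] + vertices[c][0]) / 3.0
        cy = (vertices[a][1] + vertices[b][1] + vertices[c][1]) / 3.0
        labels.append(attributes if point_in_polygon((cx, cy), polygon) else None)
    return labels


if __name__ == "__main__":
    verts = [(0, 0), (1, 0), (1, 1), (0, 1), (2, 0), (2, 1)]
    tris = [(0, 1, 2), (0, 2, 3), (1, 4, 5), (1, 5, 2)]
    road = [(0, 0), (1, 0), (1, 1), (0, 1)]
    print(label_triangles(verts, tris, road, {"highway": "residential"}))
```

Only the horizontal coordinates are used here; the labelled triangles can then be coloured or styled per layer.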
After integrating the street network with the surface layer, the street surfaces within the DTM have to be smoothed and corrected, because linear features like ditches, smaller dikes, walls, the rims of terraces and especially the hard border edges of roads can only be represented insufficiently due to the low resolution of the DTM. An example can be seen in Figure 1. Sometimes the road sidelines seem to be frayed. On steep hillsides, the road surface is inclined sideways. The situation is of course even worse with lower-resolution DTM data sources.
Another common way to support linear features is to include break lines during the terrain triangulation, e.g. using Constrained Delaunay Triangulation (CDT). However, break lines are seldom available. Instead, one can correct the parts of the surface that represent areas which should actually be more or less flat.
A comparison between the situation before the correction and afterwards is shown in Figure 1. It is much more likely that the middle line takes a smooth course between the river and the hillside with approximately the same height, and that the profile is nearly horizontal, as can be seen on the right side, instead of being very bumpy and uneven.
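The text does not specify how the correction is computed. One simple way to obtain a smooth middle line and a horizontal cross profile is sketched below: the centre-line heights are smoothed with a centred moving average, and every sample of a cross-section is set to the smoothed centre height. The moving average, the window size and both function names are assumptions made for illustration and should not be read as the procedure used in OSM-3D.

```python
from typing import List


def smooth_centreline(heights: List[float], window: int = 3) -> List[float]:
    """Centred moving average; the window shrinks at both ends of the road."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    half = window // 2
    smoothed = []
    for i in range(len(heights)):
        lo = max(0, i - half)
        hi = min(len(heights), i + half + 1)
        smoothed.append(sum(heights[lo:hi]) / (hi - lo))
    return smoothed


def flatten_cross_sections(cross_sections: List[List[float]],
                           window: int = 3) -> List[List[float]]:
    """Make every cross profile horizontal at the smoothed centre height.

    Each cross-section lists terrain heights sampled across the road,
    from one sideline to the other; the middle sample is the centre line.
    """
    centre = [cs[len(cs) // 2] for cs in cross_sections]
    smoothed = smooth_centreline(centre, window)
    return [[h] * len(cs) for h, cs in zip(smoothed, cross_sections)]


if __name__ == "__main__":
    # Three cross-sections of a road on a hillside, heights in metres.
    sections = [[10.0, 11.0, 12.0], [10.0, 14.0, 15.0], [11.0, 11.0, 13.0]]
    for profile in flatten_cross_sections(sections):
        print(profile)
```

A larger window gives a smoother course along the road at the cost of deviating further from the original terrain heights.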

3D building objects
In OSM, building footprints are modelled as closed polygons. For creating 3D models, the height must be derived from other OSM attributes (called tags).
The key height, as well as the key building:height, ought to contain information about the height of a building. If such information is not available, the keys levels, building:levels and building:levels:aboveground can be used as an alternative to approximate the building height (by multiplying the number of levels by an average level height of 3.5 metres). The key building:min_levels also needs to be considered because it describes the elevation of a building (part) above the ground, i.e. the space between the ground and the building (part). When computing building geometries, it is also interesting to generate proper roof geometries. The keys building:roof:shape, building:roof:style and building:roof:type contain a semantic description of the roof shape, such as gabled roof or hipped roof. In contrast, the key building:roof is supposed to contain information about the material of the roof, although it often also contains roof shape information. Similarly, the key building:roof:material can contain information about the roof material. Besides those keys, there are also some other relevant keys for roof generation. The key building:roof:extent describes the extent of the roof, i.e. the actual distance between the roof edge and the building facade. The orientation of the roof is described by the key building:roof:orientation: if the roof ridge is parallel to the longer roof side, the value is along; otherwise the value is across. The generation of roof geometries for simple building footprints, that is footprints with a rectangular shape or those that consist of only four points, is straightforward and can be applied with adequate performance to the OSM dataset. For more complex roofs, such as those with holes or arbitrary shapes, the generation of roof geometries is quite challenging. Some early results have already been obtained by using procedural extrusion with skeleton computations, but so far a broad application of those algorithms to the whole of OSM is, on the one hand, very time consuming (slower by a factor of about 100) and, on the other hand, does not lead to satisfying results because of special cases and exceptions. A detailed description of the building generation process can be found in Götz and Zipf (2011).
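The height derivation described above can be summarised in a short sketch. It checks the explicit height keys first and falls back to the level keys multiplied by the average level height of 3.5 metres given in the text. The function name, the order in which the keys are tried and the handling of unparsable values are assumptions; values are assumed to be plain numbers in metres, and units inside the value are not handled.

```python
from typing import Mapping, Optional

LEVEL_HEIGHT_M = 3.5  # average level height quoted in the text

HEIGHT_KEYS = ("height", "building:height")
LEVEL_KEYS = ("levels", "building:levels", "building:levels:aboveground")


def _to_float(value: str) -> Optional[float]:
    """Parse a plain number; return None for anything else."""
    try:
        return float(value.strip().replace(",", "."))
    except ValueError:
        return None


def estimate_building_height(tags: Mapping[str, str]) -> Optional[float]:
    """Return a building height in metres, or None if no tag is usable."""
    # Prefer an explicitly mapped height.
    for key in HEIGHT_KEYS:
        if key in tags:
            height = _to_float(tags[key])
            if height is not None and height > 0:
                return height
    # Otherwise approximate the height from the number of levels.
    for key in LEVEL_KEYS:
        if key in tags:
            levels = _to_float(tags[key])
            if levels is not None and levels > 0:
                return levels * LEVEL_HEIGHT_M
    return None


if __name__ == "__main__":
    print(estimate_building_height({"building": "yes", "height": "12.5"}))
    print(estimate_building_height({"building": "yes", "building:levels": "4"}))
    print(estimate_building_height({"building": "yes"}))
```

Buildings for which the function returns None would need a default height or would have to be skipped; the source does not state which option is taken.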

3D indoor modelling
The 3D indoor environment of buildings can be generated from the indoor information mapped in OSM using IndoorOSM, an OSM-based indoor extension proposed by Götz and Zipf (2011). The schema follows existing OSM methodologies; thus, it only uses nodes, ways, relations and key-value pairs. That is, existing OSM editors, such as JOSM or Potlatch, are suitable for mapping IndoorOSM data. The schema is defined as follows: a whole building is represented as one OSM relation, whereas the different relation members (the children of the relation) are the different building levels (floors). A level itself consists of one or several closed way(s) representing the shell of the level, that is the outer boundary, and several other closed ways representing the inner building parts (e.g. rooms, corridors, etc.).
3D information such as the height of a level or the height of a building part is attached as a key-value pair to the corresponding OSM feature, using the key height and a corresponding value (e.g. 3 or 6 ft; the default unit is the metre). That is, for each level and its inner parts, a two-dimensional (2D) footprint geometry plus additional 3D information is available. Further semantic information, such as room names, level names, level numbers and so on, is attached as key-value pairs to the corresponding OSM feature.
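As a sketch of how the per-level height values could be turned into vertical extents, the following code parses a height value with an optional unit and stacks the levels from the ground upwards. Only the units m and ft are recognised, the lowest level is assumed to start at elevation 0, and levels are assumed to sit directly on top of each other; these assumptions, like the function names, are illustrative and not part of the IndoorOSM schema.

```python
import re
from typing import List, Tuple

FEET_TO_METRES = 0.3048

_HEIGHT_PATTERN = re.compile(r"\s*([0-9]+(?:\.[0-9]+)?)\s*(m|ft)?\s*")


def parse_height(value: str) -> float:
    """Convert a height value such as '3', '3.5 m' or '6 ft' to metres."""
    match = _HEIGHT_PATTERN.fullmatch(value)
    if match is None:
        raise ValueError(f"unrecognised height value: {value!r}")
    number = float(match.group(1))
    # Values without a unit are interpreted as metres (the default unit).
    return number * FEET_TO_METRES if match.group(2) == "ft" else number


def stack_levels(level_heights: List[str]) -> List[Tuple[float, float]]:
    """Return (floor, ceiling) elevations in metres, lowest level first."""
    spans = []
    floor = 0.0
    for raw in level_heights:
        height = parse_height(raw)
        spans.append((floor, floor + height))
        floor += height
    return spans


if __name__ == "__main__":
    for level, span in enumerate(stack_levels(["3", "3.5", "10 ft"])):
        print(level, span)
```

Each (floor, ceiling) pair can then be combined with the 2D footprint of the level shell or of a room to obtain a prism for that part of the building.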
In IndoorOSM, information about windows is provided by adding nodes to the OSM features that represent the level shells. The location of the node represents the 2D centre of the window (from a bird's-eye perspective). Information about the breadth, width and height is attached via corresponding keys.

Point features
In OSM, point features have been captured for an abundance of different locations, shops, restaurants, facilities, technical installations and so on. In part, they provide very detailed information, which enables applications that go far beyond the static display of map content. For some categories, a tagging schema has been established for storing typically useful information about a specific type of facility. The schema for restaurants, for instance, includes name, address, opening hours, cuisine, telephone number and URL of the homepage.
The primary OSM key for this kind of node is 'amenity'. The value describes the type, which can be used to assign an icon or symbol. The generic 'name' key may be used for an additional label. The amenity types have been divided into the categories accommodation, eating, education, enjoyment, health, money, post, public facilities, public transport, shop and traffic. Each category is provided as an individual layer through the Web3D Service.
For the 3D environment, these point features should be classified into two classes: points as additional attributes of buildings, and points indicating the location of city facilities. The first class of points can be integrated into their corresponding buildings by using text-matching algorithms. The second class (Figure 2a) stands for city facilities, such as bus stations, traffic signals, post boxes, trees and streetlights. These city facilities can be modelled in 3D by using generic 3D objects (Figure 2b), because they have unified shapes and sizes within the city (Gröger et al. 2008).
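The text-matching step for the first class of points is not specified further. A minimal sketch, assuming that the name of a point feature is compared with the names of candidate buildings, could use the similarity ratio of Python's standard difflib module. The function name, the use of the name attribute and the threshold of 0.8 are assumptions for illustration only.

```python
from difflib import SequenceMatcher
from typing import Dict, Optional


def best_building_match(poi_name: str,
                        building_names: Dict[int, str],
                        threshold: float = 0.8) -> Optional[int]:
    """Return the id of the building whose name is most similar to the POI.

    Names are compared case-insensitively. None is returned when no
    candidate reaches the similarity threshold.
    """
    best_id: Optional[int] = None
    best_score = threshold
    for building_id, building_name in building_names.items():
        score = SequenceMatcher(None, poi_name.lower(),
                                building_name.lower()).ratio()
        if score >= best_score:
            best_id, best_score = building_id, score
    return best_id


if __name__ == "__main__":
    buildings = {1: "Hotel Europa", 2: "City Library"}
    print(best_building_match("hotel europa", buildings))
    print(best_building_match("Fuel Station", buildings))
```

In practice the candidate buildings would presumably be restricted to those near the point, so that identical names in different parts of a city are not confused; that spatial filter is omitted here.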

The OSM-in-3D projects
The most advanced work in the context of creating 3D city models from VGI data is the OSM-3D project developed at Heidelberg University. It combines the extrusion of building footprints into the third dimension with a detailed integrated terrain model derived from SRTM height data. It provides the 3D data in a standardised manner through a Web 3D Service (W3DS). The OSM-3D W3DS supports different terrain generalisation levels and provides tiled 3D scenes, based on the requested point of view, in VRML, X3D, COLLADA or KML format. Tailored client software called XNavigator has also been developed, which automatically requests the data from the W3DS server and assembles complex 3D landscapes worldwide. This client also allows the integration of other OGC Web Services, such as a Web Feature Service (WFS), the OpenGIS Location Services or the Sensor Observation Service. Thus, for example, POIs or 3D routes can be included. The interoperability with different data sources (e.g. also CityGML), web services and targets has recently been examined within the OGC 3D Portrayal Interoperability Experiment. Figure 3 shows the user interface of XNavigator with 3D city models in Heidelberg, Germany. The wide applicability of the W3DS and XNavigator with heterogeneous data could be demonstrated.
3D buildings in OSM-3D are generated using the automated process described in Section 3.2. The drawback of this kind of automated approach is that the buildings can only be generated with coarse geometries. In other words, buildings can only be modelled as block models (LoD1) or models with roof structures (LoD2). Architectural details on facades unfortunately cannot be modelled. Aiming to acquire 3D buildings with detailed geometries (LoD3 and LoD4), the OpenBuildingModels project was launched in 2012. It is a web-based platform for uploading and sharing entire 3D building models. In line with this project, a user-friendly web interface (see Figure 4) has been developed, which allows: (i) uploading 3D building models (modelled by internet users) associated with a footprint in OSM (Figure 4a), and (ii) browsing, viewing and downloading existing models in the repository (Figure 4b). The processing of the OSM data and the setting up of a model repository in the first prototype comprise several steps. First, building footprints have to be derived from the OSM data separately and overlaid as a vector layer by importing the OSM data into a PostgreSQL/PostGIS database with the Osmosis tool. The building footprints can then be selected interactively and individually in the web client, with GeoServer deployed for data provision. By selecting a building footprint, users can upload their own 3D building model (with textures) created offline. At the same time, it is also possible to add attribute information.

Future work
Collaborative mapping in 3D is more difficult than in 2D, because basic knowledge of and skills in 3D modelling are essential when creating 3D buildings manually or in a semi-automated way. From this point of view, one cannot expect many contributions of 3D building models through data-sharing platforms such as 3D Warehouse and OpenBuildingModels. In order to obtain 3D buildings with detailed facade structures at regional, national and even global scales, an alternative solution has to be provided.
One possible solution might be the combination of two VGI projects: OSM and Flickr. Building footprints, height information and further semantic tags given in OSM will be used as known information for extracting facade geometries from dense, unorganised Flickr photos. In addition, attribute information from both OSM and Flickr is to be integrated into the 3D building structures. Thus, the resulting city models will not only be appropriate for visualisation tasks but also usable for further analysis, e.g. urban planning, emergency management and simulations of energy consumption. To achieve this, novel intelligent modelling concepts for 3D city models have to be developed that can cope with the growing needs and requirements arising in the area of geo-information science.
However, there are still many challenges that make 3D reconstruction from VGI data difficult. First of all, the data is heterogeneous in quality, completeness and accuracy. An automated approach may achieve good results in some regions but fail in others. Secondly, although there are a large number of images on Flickr for landmark buildings, there are still not enough to obtain robust results from dense image matching and thereby acquire detailed geometries of 3D buildings. Images on other crowdsourcing platforms (e.g. Wikipedia, Weibo and Twitter) may also be used for 3D reconstruction purposes. The third issue is the quality of the 3D buildings. This can be evaluated against authority data in regions where such data can be made available. However, the quality of 3D buildings created in this way is difficult to control at the source, because internet contributors differ widely in their ability to work with geo-data.

Figure 1: Comparison between the original terrain surface (left) and with the flattened road segment (right).

Figure 2: Generic objects in a city and an example of a 3D representation of a public phone booth.