Considerations of Privacy, Ethics and Legal Issues in Volunteered Geographic Information

Today almost any kind of User Generated Content (UGC) can be situated within a geographic context. Volunteered Geographic Information (VGI) can include many types of UGC, such as georeferenced photographs, social media and text, geographic data themselves, etc. There are legal, privacy and ethical issues raised by VGI, and at present these are not very well studied or understood despite the rise in popularity of VGI. This chapter will discuss, investigate and define some of the most prominent issues related to the legal, privacy and ethics topic within VGI. The chapter argues that these issues are not well understood by all of the actors in VGI, and in particular by the producers of this information as well as the users or consumers of this new data source. Creating a better understanding of these issues will be very important in the future development and evolution of VGI in society.


Introduction
The public collection and exchange of geospatial data and information as Volunteered Geographic Information (VGI) involve many privacy, legal and ethical issues (Blatt, 2015). These issues are exacerbated with the further distribution and dissemination of these data by third parties such as libraries, online data services, etc. In many examples of VGI, the collection of geographic data involves the use of location-based devices that record the identities, positions and movements of the contributors of the information. Other examples of VGI, such as social media, can embed geographic position into imagery, video, sound, text, message data, etc. These data and information objects can then be accessed by other citizens, systems and services. As crowdsourced geographic information becomes more prevalent in society today, more detailed spatial data are constantly being collected from citizens, particularly through the proliferation of spatially aware devices such as smartphones, smart devices and sensors. The major issue developing here is that these sources of spatial data can be combined or linked to other databases and data sources and can potentially expose sensitive private information, such as the personal data, living habits and health conditions of the citizen contributor themselves (Shen et al., 2016). The further usage, storage and integration of these data are often the subject of complex legal and ethical considerations.

The role of the citizen within privacy, legal and ethical issues in VGI
In this chapter we consider the position of the citizen and the VGI that they can generate, and we discuss the privacy, legal and ethical issues relating to the production of this VGI and its further usage. In VGI projects and activities the citizen is at the very core of almost all aspects of VGI data production, management, dissemination and usage. Yet we argue in this chapter that there is still a large gap in our understanding of the privacy, legal and ethical issues connected to these activities. VGI is still a relatively new field of research; consequently, there is not a great deal of published knowledge or guidance available on these issues in VGI.
Although VGI tends to be associated with the collection and supply of explicitly geographic material, such as OSM (see Chapters 3 and 4 - Mooney and Minghini, 2017; Touya et al., 2017) or citizen science projects (see Chapters 1 and 2 - Foody et al., 2017; See et al., 2017), it is certainly not limited to this type of material. By way of a short motivating example, we consider geotagged photographs. Geotagged photographs are not associated explicitly with VGI, in the sense that geotagging has become so implicit in the use of smartphones that most citizens may not be aware of this feature, i.e. that their holiday photographs, for example, are being geotagged when they take them and upload them to various social media sites. In this case, the information is volunteered passively (Fast and Rinner, 2014): the citizen does not realise that it is actually geographic information or that it can be reused and integrated with other geographic information. Indeed, many citizens are not aware that when they contribute geotagged photographs to, for example, a citizen science project, the downstream uses of those photographs cannot always be predicted, given the myriad of mashup tools and technologies available. Overall this means that although crowdsourced geographic information can be either volunteered, as in VGI, or harvested in a passive or ambient way (Stefanidis et al., 2013), for the most part citizens are not fully aware of the additional intelligence that can be elicited by the powerful combinations of software, cloud computing and data processing technologies available today. Dienlin and Trepte (2015) emphasise that even though citizens today have substantial concerns with regard to their online privacy, they often engage in self-disclosing behaviours that do not adequately reflect those concerns.
It is therefore necessary to attempt to highlight the types of privacy, ethical and legal issues that can be faced knowingly or unknowingly by citizens involved in VGI today.
The remainder of this chapter is organised as follows. In Section 2 we provide a brief discussion of the current understanding of the issues of privacy, ethical and legal frameworks in VGI today by considering simple actor/use case scenarios. In the three sections that follow it, we discuss privacy (Section 3), ethics (Section 4) and legal issues (Section 5). In Section 6 we summarise the chapter with some concluding remarks while highlighting future directions for this work.

Positioning the Issues of Privacy, Ethics and Legality in VGI
At the time of writing, the issues of privacy, ethics and legality in VGI have not received widespread or in-depth treatment by the research community. The exact nature of the VGI or data used and the use case to which it is applied may help to determine which legal, ethical and privacy issues are most prominent. When information about individual citizens is transferred and presented within a geographic context, the resulting profile information could be both 'highly revelatory and involuntary' (Scassa, 2013:5), and this can raise important privacy and ethical issues. The ability for VGI data and information to be mashed up or integrated with other VGI datasets, proprietary datasets or other information sources means that new sources of data are created. The privacy, ethics and legal issues that existed for the original VGI dataset may not have changed completely as a result of this transformation. In this section, we provide a simple table (Table 1) that situates privacy, ethics and legal issues for the principal actors involved in the collection, production and dissemination of VGI, namely citizens, national mapping agencies (NMAs), commercial companies, researchers and other entities such as small and medium-sized enterprises (SMEs). While this table is not a fully comprehensive overview of all of the possible actor interactions with privacy, ethics and legal issues, it will allow us to situate our discussions in the subsequent sections of this chapter. Each cell in the table provides a simple example of considerations that are made by the corresponding actor when producing, collecting, managing, using or disseminating VGI.
As we can see, there is some overlap in the table. All of the actors will confront and deal with many of the same privacy, ethics and legal issues but they will respond to these issues differently. For example, how an NMA deals with the liability and legal aspects of VGI will be different to how an academic researcher deals with the same problem. With these examples in mind we will now look at privacy (Section 3), ethics (Section 4) and legal issues (Section 5) in the next three sections.

Privacy Issues
Privacy is probably the best known of the three issues considered in this chapter, and protecting it is as important for VGI as for any other kind of data. Privacy of user data and information should be considered in the initial design of VGI systems and projects, as adding privacy protection to existing systems can be very cumbersome.

Understanding Privacy within the VGI context
Private data in the VGI context are any geographic data or information that can be linked to an individual contributor who created, collected or edited those data. Thus, to prevent VGI data being used to violate the privacy of individuals, we need to look at the character of the data and investigate the entire process from the collection of data to the submission of the VGI to data repositories, and then onwards to the usage of the data. The most efficient measure is not to collect private data at all or at least not to collect data that are linkable to individuals. If linkable private data are collected, it then becomes necessary to set up protection mechanisms to ensure that the data are only used according to the original purpose defined before the collection of the VGI started. As VGI data collections are considered a resource for new and maybe unforeseen usages and research, it becomes all the more important that these data do not provide linkable private data about individuals. The question that must be asked is whether location information in itself is private data or can be linked to individuals: the answer depends on the location accuracy. Many location data are accurate enough to be bound to one individual or to a small group of individuals, e.g. an office or home, and are sometimes even combined with precise time and date. There is no one-size-fits-all solution here; the collection of point-based geographic data for a specific purpose may need to have high geographic accuracy. With this requirement for accuracy comes a possibility that the geographic features close to the collected points could be used to infer other information.

Table 1

Who am I? I am …: [actor label missing in the source; from the list of actors in the text, the citizen]
An example of a privacy issue: [missing in the source]
An example of an ethical consideration: A contributor will report observations honestly and truthfully, and, to the best of their ability, they will contribute accurate information.
An example of the legal issues involved: The citizen will obtain consent or permission to survey, record or measure a specific area or geographic feature.

Who am I? I am …: A National Mapping Agency, Environmental Ministry, Geological Survey, etc.
An example of a privacy issue: Ensuring that the organisation gives careful consideration to the scale at which it provides geographic data and to the contents of the metadata attached to those data.
An example of an ethical consideration: The organisation will not knowingly provide false or inaccurate data or information nor report with bias on specific geographic themes.
An example of the legal issues involved: The organisation is bound by many legal requirements to produce mapping products. The organisation is legally bound to the quality of data produced and could be liable to consequences of the use of these data.

Who am I? I am …: A commercial mapping company
An example of a privacy issue: Privacy can mean keeping the information about the sources of the VGI hidden from public view.
An example of an ethical consideration: Acting responsibly, ensuring the privacy of citizens is maintained and that the VGI is distributed using an appropriate licence.
An example of the legal issues involved: The focus here is on the terms and conditions of the type of licence applied to the VGI, whether it be for the commercial usage of the data or the integration of the data with other data products.

Who am I? I am …: An academic or researcher looking to use VGI data
An example of a privacy issue: During the dissemination of research outputs, care must be taken not to expose the identities of or other private information related to the citizens who contributed to the VGI project. Patterns and inferences made about the contributors of the data must be carefully considered so as not to breach the privacy of those citizens.
An example of an ethical consideration: If carrying out an analysis/survey of citizens involved in VGI, the consent process ensures that individuals are voluntarily participating in the research with full knowledge of relevant risks and benefits.
An example of the legal issues involved: Many geography-based research projects can mix and integrate multiple datasets for investigation. It is important that the licences are compatible to allow this, so that future research results can be disseminated legally.

Who am I? I am …: An SME wishing to use VGI for an application or service that it is looking to develop and bring to market
An example of a privacy issue: VGI as a product must be used in such a way that the producer's privacy is protected from any potential future commercial exploitation.
An example of an ethical consideration: Using the data 'as is' without any embellishment or corrections to the VGI data that are untrue or incorrect.
An example of the legal issues involved: Ensuring that the licences and terms of use of the VGI data are compatible with commercial development. In some open licences it might be necessary that changes made to the VGI dataset/database by the SME also be made available under the same licence.

Approaches to Privacy Preservation in VGI
The guiding principle of privacy protection is to collect as little private data as possible. Cho (2014) argues that there must be privacy and legal protection for volunteers in VGI data collection and projects, otherwise 'the ensuing litigation may destroy the VGI model before it reaches its full potential'. Calderoni et al. (2015) remark that we, as citizens, are only starting to grasp the privacy risks associated with the constant tracking of our whereabouts by the very devices that we carry around with us. In order to continue using location-based services in the future without compromising personal privacy and security, there is an urgent need for privacy-friendly applications and protocols. There is some literature on privacy concerns in VGI and on possible solutions to them. A number of technological approaches are prevalent, including the popular approach of blurring or fuzzing the original data (Luther et al., 2009). Anonymising data and selectively revealing information according to volunteer preference is another approach (Kim et al., 2013). In the Geographic Privacy-Aware Knowledge Discovery and Delivery (GeoPKDD) project, Giannotti and Pedreschi (2008) investigated various scientific and technological issues of mobility data, open problems and roadmaps. They found that privacy issues related to Information and Communications Technology (ICT) can only be addressed through an alliance of technology, legal regulations and social norms. In the meantime, increasingly sophisticated privacy-preserving data mining techniques are being studied and need to be further developed. These approaches aim to achieve appropriate levels of anonymity by means of controlled transformation of data and/or patterns with limited distortion, to avoid undesired side effects on privacy while preserving the possibility of discovering useful patterns and trends.
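As an illustration of the blurring or fuzzing idea mentioned above, the following Python sketch displaces a reported point by a random distance and direction before it is stored or shared. It is not the method of Luther et al. (2009). The function names `blur_location` and `haversine_m`, the 500 m default radius and the sample coordinates are all assumptions made for this example, and the small-offset approximation used here is only reasonable for short displacements away from the poles.

```python
import math
import random

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, used for both helpers


def blur_location(lat, lon, max_offset_m=500.0, rng=None):
    """Return (lat, lon) moved by a random offset of at most max_offset_m.

    Hypothetical helper: uses a local flat-earth approximation, so it is
    only suitable for small offsets and latitudes away from the poles.
    """
    rng = rng or random.Random()
    # The square root spreads points uniformly over the disc area
    # rather than clustering them near the true location.
    distance = max_offset_m * math.sqrt(rng.random())
    bearing = rng.uniform(0.0, 2.0 * math.pi)
    dlat = (distance * math.cos(bearing)) / EARTH_RADIUS_M
    dlon = (distance * math.sin(bearing)) / (
        EARTH_RADIUS_M * math.cos(math.radians(lat))
    )
    return lat + math.degrees(dlat), lon + math.degrees(dlon)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres, used here only to check the offset."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


if __name__ == "__main__":
    true_point = (53.38, -6.59)  # arbitrary sample coordinates
    blurred = blur_location(*true_point, max_offset_m=500.0)
    print(blurred, haversine_m(*true_point, *blurred))
```

The choice of radius is the trade-off discussed in the text: a larger radius gives more protection but distorts the data more, so the value would have to be set per project and per purpose.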
The most common question asked about privacy in VGI is whether data collection services and systems can be enhanced so that the spatial data collected or generated by a contributor cannot be traced back to that individual contributor. The contributor should not be identifiable through their contributions to a VGI project; more precisely, the contributor should be identifiable within the VGI project (such as through a pseudonym username in a project) but their contribution should not be linkable to the personal and private data and information of the actual person. There is a need to consider the sensitivity of the privacy issues within contributions to VGI: are there situations where a contributor would prefer not to be linked to a set of contributions or a single contribution? In the capture of aerial imagery, geotagged photographs and street-level photography, people can also potentially be identifiable as subjects. There are thus many privacy issues, and these issues have not yet been adequately addressed.
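One way to picture the distinction drawn above, between being identifiable within a project and being linkable to a real person, is a keyed pseudonym. The sketch below is only an illustration under stated assumptions: the function name `pseudonymise`, the `user_` prefix, the 16-character truncation and the idea of a secret `project_key` held by the project operator are all hypothetical and are not taken from any VGI platform.

```python
import hashlib
import hmac


def pseudonymise(contributor_id: str, project_key: bytes) -> str:
    """Derive a stable project-level pseudonym from a private identifier.

    The same contributor always maps to the same pseudonym within one
    project (same key), but the pseudonym cannot be recomputed from the
    identifier without the key, and differs between projects.
    """
    digest = hmac.new(
        project_key, contributor_id.encode("utf-8"), hashlib.sha256
    ).hexdigest()
    return "user_" + digest[:16]


if __name__ == "__main__":
    key = b"example-secret-held-by-the-project"  # hypothetical key
    print(pseudonymise("jane.doe@example.org", key))
```

A pseudonym of this kind does not by itself prevent re-identification from the contributions themselves, for example from a cluster of edits around a home address, which is the linkage risk the text describes.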

Privacy for non-human subjects in VGI
Privacy can also be related to non-human subjects in VGI. Suppose there is a crowdsourcing or VGI campaign in the area of biodiversity and a very rare or precious plant species is found and geolocated. To protect this species (and potentially its habitat), this information needs to be kept private. But other species identified by the campaign may not need privacy. This example could also extend to similar scenarios for a geological survey. Suppose a contributor identifies the potential location of a precious metal; there might be very good reasons why this location and find must be kept private. The discussions above for both human privacy and the privacy of non-human subjects raise the question of whether contributions need to be checked manually for these privacy issues: is it necessary to moderate contributions for their privacy characteristics and not just their data quality aspects? The moderation question in VGI already raises many obstacles to its implementation (Neis and Zielstra, 2014). It might not be possible to automate this process to include the consideration of privacy aspects.
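The rare-species scenario can be sketched as a rule that coarsens coordinates only for records flagged as sensitive. This is a minimal sketch, assuming that someone has already curated the list of sensitive species by hand, which is exactly the moderation step the text suggests may be hard to automate. The names `SENSITIVE_SPECIES`, `snap_to_grid` and `generalise_if_sensitive`, the species labels and the 0.1 degree cell size are all hypothetical.

```python
import math

# Hypothetical, manually curated watch list of species needing protection.
SENSITIVE_SPECIES = {"rare orchid"}


def snap_to_grid(value, cell_size_deg):
    """Snap a coordinate to the lower edge of its grid cell."""
    return round(math.floor(value / cell_size_deg) * cell_size_deg, 6)


def generalise_if_sensitive(record, cell_size_deg=0.1):
    """Return a copy of the record, coarsened only if the species is sensitive."""
    result = dict(record)
    if record["species"] in SENSITIVE_SPECIES:
        result["lat"] = snap_to_grid(record["lat"], cell_size_deg)
        result["lon"] = snap_to_grid(record["lon"], cell_size_deg)
        result["generalised"] = True  # flag so users know precision was reduced
    return result


if __name__ == "__main__":
    sightings = [
        {"species": "rare orchid", "lat": 52.4567, "lon": -1.2345},
        {"species": "daisy", "lat": 52.4567, "lon": -1.2345},
    ]
    for sighting in sightings:
        print(generalise_if_sensitive(sighting))
```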
While the focus above has been on the individual VGI contributor, it is often the case that contributors to VGI projects are institutions and organisations that provide datasets for VGI; institutions or organisations must also be aware of and familiar with the licence terms within which they provide content.

Ethics Issues
As far back as the work of Mitchell and Draper (1983), the issue of ethics has been a subject of research discussion in geography. In their work, they indicate that geographers have not always been sensitive to ethical issues, and that, as geography researchers, one has to balance the obligations of understanding and knowledge with those of respecting the dignity and integrity of research subjects.

Key Ethical Issues in VGI
In VGI, the citizens who collect, manage and work with the data are very often the subject of research. Little work has been carried out specifically on ethics in VGI, although many studies on contributors have been published in the literature in the last few years (Granell and Ostermann, 2016). Hartter et al. (2013) outline that ethical standards in science require that research with human subjects respect individuals, commit to nondisclosure of participants' identities, minimise potential harm and ensure that the benefits and burdens of research be fairly distributed, and that subjects be informed of the full nature of the research so they can decide against participation if they wish. Ethical standards and plans now usually require ethics approval from funding review boards and research authorities. Luppicini (2010) introduces the term technoethics to refer to an interdisciplinary study of technological impacts on the morals and ethics in a society. Ethical conduct and social responsibility are important factors within contemporary society to maintain respect and harmony. Lingel and Bishop (2014) consider the 'labour ethics' surrounding VGI in terms not only of what is technically possible, but also of what is ethically responsible. The authors argue that the introduction of ethical considerations should not discourage the production of VGI within volunteer communities; rather, those involved in instigating this VGI or managing it must give careful consideration to how these communities are managed.
Ethical considerations arise for both the data producers (the volunteers) and the users (VGI project coordinator/platform operator). As before, the volunteers have to consider and adopt an ethical approach to their reporting of information and data. For example, in a disaster or crisis situation, this involves not engaging in the false reporting of damage, casualties, fatalities, etc. Indeed, volunteers must give ethical consideration to any information and data they provide that can lead to action by authorities such as emergency services (Haworth and Bruce, 2015). Volunteers who wilfully contribute false or misleading data or information not only undermine the VGI project in which they are involved, but also cause further distrust and suspicion among users about the quality and usability of VGI in general. From the coordinator side, the volunteer must be made aware of the purpose of the project that they are volunteering for; voluntary submissions must not be used for commercial purposes, or shared with other entities for different purposes, without the consent of the volunteers. At this point, it is clear that the consideration of ethics combines the issues of data privacy and the legal aspects of VGI; these issues are not easily disentangled from each other.

Summary of Ethical Issues
As communicated by Sula (2016), the key ways to respect ethics in data-based research include involving participants throughout the research process, avoiding collecting information that should remain private, notifying participants of their inclusion and providing them with options to correct or delete personal information, and using public channels to disseminate research, such as Open Data. Ethical research has the least possible impact on subjects, asking or collecting only as much as is needed to answer its questions. In the case of VGI research, the researchers involved may not know exactly what knowledge they are trying to extract or patterns they are trying to uncover; the data are being used in an exploratory way. In these circumstances, it seems nearly impossible to inform participants of all anticipated harms and benefits in advance.
Today, datasets collected through VGI and crowdsourced means have a potentially very long lifespan. Given the longevity of these datasets and their potential interoperability and integration with other datasets, researchers and scientists must, in general and where possible, avoid data with personally identifiable information or information that could later be used to identify participants in connection with other datasets, e.g. screennames, usernames, etc. The potential for unintended consequences is high, but it is entirely mitigated when no personally identifiable information is collected in the first place (Sula, 2016). The integration of many datasets with each other creates a brand new dataset that is essentially an unknown quantity in terms of its ethical characteristics. In this situation the creators of these new datasets must be conscious of how the new dataset will be used, distributed, analysed and even itself potentially integrated with other datasets in the future.

Legal Issues
In Olteanu-Raimond et al. (2017), one of the six obstacles described for NMAs in using VGI is the legal issue. The most relevant of these legal issues in using VGI are intellectual property and liability. With the growing trend towards open data, more and more public bodies have adopted open data policies. Generally there are two concepts of open data: one concept means that 'data and content can be freely used, modified, and shared by anyone for any purpose' and the other involves open source licensing applied to software. Intellectual property concerns both data producers and users. From the producers' point of view, it defines ownership rights of the data, licences, and how data can be used and under which conditions. From the users' point of view, it defines rules to enrich and disseminate the data.

Liability as a Legal Issue in VGI
Concerning liability, the main question is that of who is liable and under what circumstances if harm is caused, economic loss happens or incorrect decisions are taken. This issue is linked closely to the concerns with data quality, i.e. precision and accuracy. Liability can be different from country to country and from product to product. When crowdsourced data are used by a legally mandated organisation such as an NMA, what are the implications for that organisation? Does the NMA take all of the legal responsibility? Is there any citizen responsibility? Should there be? Indeed, Cho (2014:10) argues that there must be legal protection for volunteers in VGI data collection and projects, otherwise 'the ensuing litigation may destroy the VGI model before it reaches its full potential'. Rak et al. (2012) studied the integration of VGI into Canadian authoritative datasets from the liability point of view by proposing four primary risk management techniques to manage the risks resulting from such an incorporation. One of the most important and difficult of these risk management techniques sees the information provider being required to show that steps were taken to ensure the accuracy of VGI that has been integrated into their data.

Legal Issues Surrounding Data Licence Types
The type of licence applied to VGI data for their subsequent dissemination has an important influence on their usage. There are three main types of open data licences:
• Share alike licences, which require the derived datasets to be released with the same licence as the original one(s); the most famous such licence in the area of geographic information is the Open Database License (ODbL) used by OpenStreetMap (OSM).
• Open licences, which allow any type of use provided that the data provider is cited; they allow, for instance, commercial use of derived datasets. An example of such a licence is the French 'Licence ouverte', which is used to release governmental open data in France.
• Limited use open licences, which limit the use of the dataset to personal use, or non-commercial use. For instance, the IGN (the French mapping agency) releases its datasets openly for research and education purposes.
The choice of a licence conveys a political or commercial strategy, and the strategies of these licences might not be compatible. So what happens when projects with different strategies plan to merge their datasets? And what happens when one or more of these datasets are from VGI? It is useful at this point to provide a real-world example. The most typical case regarding geographic information is the following: how is it possible to integrate non-ODbL open data into OSM? The case of the French national address dataset is interesting to study, as it plans to integrate data from the IGN, which is a governmental administration, the French Post Office company, which is a public limited company, and OSM (Figure 1). All three already have address datasets updated by crowdsourcing communities. They also have different licensing strategies. OSM uses the ODbL while the French Post Office would prefer a licence that allows commercial use of derived datasets. Figure 1 shows a possible integration scenario for the architecture of the project and the licensing strategy. Two new datasets are created in this scenario: a common and central address dataset, and a copy of this dataset using the OSM technologies (in RDF format). The OSM-like copy is under the ODbL licence, which allows OSM contributions regarding addresses to be directly included, and vice versa. The common address dataset is under two licences: a limited open licence that only allows personal and non-commercial use of the data, and a charged licence for other uses. The OSM-like dataset is only a partial copy, as the French Post Office does not want to release all the information of its dataset (e.g. the standardised spelling of addresses). A quality control step is included in the common dataset to improve contributions through both field survey (by mail carriers and IGN surveyors) and automatic tools. In this scenario, different access desks are proposed for citizens, derived from existing tools.
The IGN desk, which fills the common address dataset, is dedicated to community-sourcing (from city administrations, firefighters, police officers, etc.); the Post Office desk, which also fills the common address dataset, is dedicated to citizens and administrations that report updates on addresses; and the OSM desk is based on OSM software, such as iD (see Note 1), and could fill both the common dataset and the OSM-like dataset. The tricky part of the integration scenario is that the contributions go to both datasets at the same time, so that the common dataset is not 'infected' by the ODbL. This architecture seeks to attract OSM contributors to this project, but the contributors should accept that their contribution will fill both address datasets, which have different licences.
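The licence clash described in this section can be made concrete with a toy compatibility check over the three licence types listed earlier. This is a deliberately simplified sketch and not a statement of what any real licence permits: it assumes that all share alike sources use the same licence, that a share alike source forces the derived dataset to be share alike, and that a limited use source forces the derived dataset to stay limited use. The names `LicenceType` and `derived_licence_allowed` are hypothetical.

```python
from enum import Enum
from typing import Iterable


class LicenceType(Enum):
    SHARE_ALIKE = "share alike"   # e.g. the ODbL in the text
    OPEN = "open"                 # attribution only
    LIMITED = "limited use"       # personal or non-commercial use only


def derived_licence_allowed(sources: Iterable[LicenceType], target: LicenceType) -> bool:
    """Check, under the simplified rules above, whether a dataset merged
    from `sources` could be released under `target`."""
    for source in sources:
        # A share alike source requires the same licence downstream.
        if source is LicenceType.SHARE_ALIKE and target is not LicenceType.SHARE_ALIKE:
            return False
        # A limited use source cannot be opened up by merging.
        if source is LicenceType.LIMITED and target is not LicenceType.LIMITED:
            return False
    return True


if __name__ == "__main__":
    mixed = [LicenceType.SHARE_ALIKE, LicenceType.LIMITED]
    for target in LicenceType:
        print(target.value, derived_licence_allowed(mixed, target))
```

Under these assumed rules, no single licence satisfies a merge of share alike and limited use sources, which is consistent with the scenario above keeping two separate datasets rather than one.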

Summary of Legal Issues in VGI
In summary, the legal issues in VGI must be considered from the side of both the data producers or collectors (i.e. the volunteers or citizens) and the users or facilitators (i.e. VGI project management, VGI data portal operators) of the data. From the position of the volunteer, their legal role and their contribution may not always be clearly defined, and this can potentially expose them to legal problems. On the other hand, if a data provider or data portal only facilitates the transfer of or access to VGI data, then who carries the legal responsibilities related to consequences of future use of these data? For example, submissions from volunteers to a VGI project may indicate natural hazards in a particular location or the vulnerabilities of a property. This (potentially false) information could be used by an insurance company to raise insurance premiums. Then, from the VGI project coordinators' side, to what extent must a portal/project coordinator provide a disclaimer about legal aspects? Under what circumstances can a portal be held liable for omissions (e.g. damaged areas not mapped during a disaster), and under what circumstances can mistakes (e.g. infrastructure shown to be intact that is actually broken, leading to inaccessibility) be challenged? In reality, there are no clear-cut answers to these questions at this point in time. Christin et al. (2011) indicate that the research community should provide open datasets that can serve as a baseline for performance, security and legal evaluation in order to begin addressing these critical issues.

Conclusions and Future Directions
In this chapter we have provided a brief overview and discussion of privacy, ethics and legal issues in the production, collection, storage, dissemination and integration of VGI. These are complex issues. As VGI continues to grow rapidly in terms of popularity amongst contributors and as an alternative or complementary source of spatial data for researchers, authoritative agencies, commercial companies, etc., these issues will become more prevalent and urgent. In their study of privacy concerns in the use of location-based services such as social media, Fodor and Brem (2015) found that privacy concerns do influence citizen adoption of these services but that the answer is more complex and multi-faceted than just a simple case of trusting such services. Even now, with VGI, new technologies are emerging all of the time, offering citizens new and exciting ways to generate and collect spatial data. Luppicini and So (2016) argue that in technologies such as the use of drones for collecting data and information, a lack of understanding of the factors of ethics and privacy often causes the prohibition of the use of these technologies. A lack of understanding rarely mitigates the issues themselves, but it can hinder the development of devices and technologies that can be used in many positive ways.
When VGI is collected and subsequently disseminated, it can be reused, displayed, integrated and transformed in a myriad of ways. The model for understanding what happens with data once they are released by the individual, or what this means on an aggregate scale, is thus fluid and uncertain (Hallinan et al., 2012). In reality, citizens often have a poor basis on which to form a picture of the data relationships, the consequences and the issues in VGI. Citizens often struggle to comprehend how these issues add to the importance of these data flows in relation to other social structures or issues. Hallinan et al. (2012:271) go on to argue that due to the complexity of the issues of privacy, ethics and legality, 'it appears that the public are being forced to act in an environment they have little template for approaching'. The concepts of VGI and Open Data are still relatively new. Consequently, it will take time for citizens to become deeply familiar with the issues discussed above. Christin et al. (2011) argue that at the moment, privacy research usually operates on either private or synthetic datasets. These datasets do not provide a common baseline against which new mechanisms for privacy, ethical and legal considerations can be harmonised or benchmarked. In any case, Torra and Navarro-Arribas (2014:277) indicate after their wide-scale review of the issues of data privacy online that the development of methods to protect citizens 'has to take into account the specificities of the data involved'. No two VGI datasets are the same; indeed, it can be the case that within a VGI dataset different objects might be collected by different citizens in different circumstances. VGI is an exciting and powerful source of geospatial data that is likely to continue growing. Understanding how to protect the citizen while enhancing their role in the production of VGI is a big research challenge for the next few years.
Indeed this research issue has not really been tackled at all by the research community at this point in time. Protection of the citizen's privacy and ethical rights under suitable legal conditions is very important. However, the frameworks or structures developed to implement these protections must not place insurmountable barriers to citizen participation in VGI. The act of being involved in VGI as citizens should continue to be a leisure activity pursued by those motivated to volunteer. There is a fine balance between, on the one hand, encouraging and fostering participation in VGI activities and, on the other hand, ensuring that the complex issues of privacy, ethics and legality are understood and adhered to by a potentially large cohort of individuals (Rak et al., 2012; Torra and Navarro-Arribas, 2014). Finding this balance will have a major influence on the future trajectory of VGI.

Notes
1 http://ideditor.com/