Creating a Corpus of Pilgrim Narratives: Experiences and Perspectives from the PILNAR Project

The PILNAR project was set up in order to prepare large sets of pilgrim narratives for analysis and interpretation. The narratives that pilgrims create on the Camino de Santiago give us a good insight into the dynamics of a changing ritual in a late-modern and superdiverse society. In this project, specialists from different backgrounds contributed to create a database that brings together narratives from a variety of different online and offline platforms. This chapter presents a retrospective on the process that led to the PILNAR database and the accompanying mobile application, and sketches possible directions in which the project could develop in the future.


Introduction
The pilgrimage to Santiago de Compostela, lovingly called the 'Camino' by its pilgrims, is an interesting site of research for a number of reasons. As its origins date back to the 10th century, its routes are spread across Europe and it is widely seen as one of the three major European pilgrimages, its opportunities for academic exploration are wide-ranging. Its significance has again become apparent in the last 20 to 30 years, when, after a period of relative quiet, pilgrims started to flock to Santiago once more. According to reports issued by the official pilgrim office in Santiago, the number of pilgrims who register at the Cathedral has grown from 2,491 in 1986 to 215,880 in 2013 (Informe estadístico, 2016). These new pilgrims have not just stepped into a traditional Catholic ritual. Rather, they have brought onto the Camino a contemporary take on what it means to be a pilgrim in a late-modern, superdiverse society. This means that the Camino is now a site where pilgrims explore the many different repertoires and traditions of ritual and seek encounters with the sacred in a myriad of different fields including religion, local architecture, conversations with strangers, physical endurance, and reflections on their personal past (Post, Pieper and Van Uden, 1998). Contemporary pilgrim profiles reflect the diversity of the current ritual and religious dynamics in the Netherlands, and pilgrim narratives can be explored as a unique point of entry in studying these dynamics.
There are different approaches to exploring the pilgrim identity that emerged at the end of the last century. One of the most productive approaches has proven to be based upon the narrative dimension of the pilgrim profile. For the pilgrim is a highly enthusiastic storyteller. S/he tells stories about the pilgrimage s/he has planned, informs family and friends about the pilgrimage while s/he is on the Camino, and will not miss an opportunity to recount the adventures for some time after the journey has been completed. These stories can take many forms and have many functions: they might be serious expressions of expectation, amusing anecdotes about eccentric pilgrims met along the way, personal contemplations upon life decisions, or modern reflections upon canonical Camino legends. This narrative inclination of the pilgrim opens up an opportunity for academic exploration of the contemporary pilgrim identity. It has nourished a long tradition of academic interest that draws on the stories pilgrims tell in order to better understand what drives the pilgrims on their journey, how their experiences shape them, and what transformation or affirmation they find on the Camino (Post, 2015; Van der Beek, 2015; Post, 1994).
As we entered the 21st century, however, both the character of the pilgrim profile and the creation and distribution of narratives changed. Not only has the great increase of pilgrims on the Camino led to an increase of pilgrim narratives, but the popularisation of online platforms has provided the opportunity for the creation of a potentially endless number of pilgrim narratives. As pilgrims took to the internet, their narratives became as diversified as their ritual identity had become. This growing collection of pilgrim narratives therefore became an interesting and fruitful site for academic inquiry. However, as we will see in this contribution, the resulting complexity and diversity of the genre of the pilgrim narrative complicates any initiative to take stock and create a corpus.
To tap this potential, an infrastructure was needed that would prepare pilgrim narratives for analysis and interpretation. This meant gathering pilgrim narratives from the different platforms on which they were created and distributed and curating them in a manner that would highlight their complexity and diversity. Preferably, this database would have the potential to grow along with the narratives that pilgrims would keep creating in future years. With these goals in mind, the PILNAR ('PILgrim NARratives') project was set up (PILNAR, 2016).
In this contribution we sketch our experiences in the PILNAR project. Although we also give some perspectives for the future, the focus is retrospective.

Team PILNAR
The initiative for the PILNAR project came from researchers at the Tilburg School of Humanities (Tilburg University), where the study of ritual in general, and pilgrimage in particular, has been a key interest. They also provided the coordinator of the project, in the person of Paul Post. Other Dutch parties also proved to be interested in the goals of the PILNAR project. One partner was found in the Meertens Institute (KNAW), the research institute for the study of Dutch language and culture. The Meertens Institute had previously studied Dutch religiosity and ritual in a large-scale documentation project on Dutch places of pilgrimage (BiN, 2016; Margry and Caspers, 2004), and had experience in gathering and curating large collections of narratives (Collections on meertens.knaw.nl 2016). Therefore, it was decided that the PILNAR database would be developed and curated within the infrastructure of the Meertens Institute.
A third obvious partner in the PILNAR project was the Dutch Society of Saint James. As the main Dutch society of Camino pilgrims (with nearly 14,000 members), they shared an interest in the creation of a database of pilgrim narratives, as well as in the academic study that would emerge from this collection (Het Genootschap van Sint Jacob, 2014). A last partner that contributed to PILNAR was the Museum Catharijneconvent, the Utrecht national museum for Christian art and culture. In 2008, the Catharijneconvent started collecting personal narratives of its visitors and creating online datasets which made them accessible to all interested parties. After an exhibition on the Camino in 2010, inspired by the 25th anniversary of the Society of Saint James, the museum started a collection of online pilgrim narratives (Pelgrims vertellen over hun tocht on catharijneverhalen.nl 2016).
In order to shape the proposed setup, an online infrastructure was needed that would support the complex wishes of the PILNAR team. To this end, CLARIN was approached. The funding for the start of a digital corpus of Dutch pilgrim narratives was awarded in 2011, and a year later the PILNAR project was started as a one-year CLARIN enterprise.

Creating the Corpus
From the start, the PILNAR team sought to create a corpus of pilgrim narratives from various sources, thereby mirroring the wide-ranging platforms and reflections that shape these stories. PILNAR aimed at including narratives in different forms and characters: written accounts in (offline) journals, individually created booklets, articles in periodicals, personal weblogs, discussions in magazines. The different PILNAR partners could contribute to this collection of narratives. For example, the Society of Saint James had access to the archive of their monthly magazine De Jacobsstaf as well as their monthly online newsletter Ultreia. Together, this collection would render a great number of pilgrim stories, ranging from personal accounts to practical information, and from spiritual reflections to historical background. The Catharijneconvent had available to them the set of stories uploaded by pilgrims onto their website (pelgrimsverhalen.nl).
Additionally, a website was created on which pilgrims themselves could upload any narrative that they deemed interesting for the PILNAR dataset (PILNAR, 2016). Pilgrims were approached to participate in the creation of the PILNAR dataset via different channels: the Society of Saint James placed calls for stories in their newsletter and on their website, the Catharijneconvent addressed the visitors of their website, and pilgrim bloggers were asked individually for permission to include their narratives. All of these channels would provide a solid base for the PILNAR database and allow for expansion in the future. In the last phases of the PILNAR project, Tilburg University took the initiative to develop a PILNAR app. This app would function as a mobile tool with which pilgrims could upload their narratives (directly) into the PILNAR database.

Using the Corpus
After the gathering of these narratives, the database needed some initial perspectives for structuring and using the corpus. To prepare the dataset for academics seeking to trace meaningful categories in the multiplicity of the contemporary pilgrim identity, the heuristic instrument of the 'Fields of the sacred' was used. This heuristic frame was developed in the 'Religion and Ritual' research group at Tilburg University (Post, 2010a; Post, 2010b; Post, 2010c; Post, 2011a; Post, 2011b; Post, 2011c). As a tool, these fields are used to locate sacred practices in the Netherlands; rather than explicitly demarcated areas, these fields should be understood as spheres or domains with a certain coherence in ideas, outlook, cultural practice, and ritual repertoire. The four fields of the sacred are the religious field, the field of marking and remembering, the cultural field (art, culture), and the field of leisure culture (sport, tourism). These four domains help to map the diversity in possibilities for engaging with the sacred when analysing contemporary ritual. Their use will be apparent with regard to the modern Camino pilgrim, whose diverse and layered appropriation of the ritual of pilgrimage calls for an approach that is similarly open. The implementation of the fields of the sacred in PILNAR took place mainly at the level of the choice of metadata. Narratives are manually labelled with keywords that indicate different fields (Figure 31.1). The religious field is indicated by keywords like 'God' or 'Ritual', the field of marking and remembering by keywords like 'diseased' or 'history', the cultural field by keywords like 'art' or 'culture', and the field of leisure by keywords like 'nature' or 'food'. These fields also play an important role in the development of the PILNAR app mentioned earlier.
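The relation between keywords and fields described above can be illustrated with a minimal sketch. It is not the PILNAR implementation: the lookup table contains only the eight example keywords named in the text, and the field identifiers and the function name fields_for are hypothetical names chosen for illustration.

```python
from collections import Counter

# Hypothetical lookup table: only the eight example keywords from the chapter
# are listed, and the field identifiers are illustrative names.
FIELD_BY_KEYWORD = {
    "god": "religious",
    "ritual": "religious",
    "diseased": "marking_and_remembering",
    "history": "marking_and_remembering",
    "art": "cultural",
    "culture": "cultural",
    "nature": "leisure",
    "food": "leisure",
}


def fields_for(keywords):
    """Count how often each field of the sacred is indicated by the keywords."""
    counts = Counter()
    for keyword in keywords:
        field = FIELD_BY_KEYWORD.get(keyword.strip().lower())
        if field is not None:  # keywords outside the table are ignored
            counts[field] += 1
    return counts


# A narrative labelled with three keywords can touch more than one field.
print(dict(fields_for(["God", "nature", "food"])))
```

Because a narrative may carry keywords from several fields, the sketch returns a count per field rather than a single label, in line with the open, layered approach described above.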

Technical Implementation
The PILNAR application takes advantage of the flexibility of CLARIN's CMDI specification. In total six profiles were used to describe the different types of PILNAR submissions to the project: JacobsstafVerhaal, Website, Pelgrimsverhaal, UserSubmission, VirtualCollection and Image. Each profile captures specific information, such as title, author, description or participant information, related to the type of material submitted to the project. As indicated above, keywords provide an important distinguishing feature to the PILNAR researchers and are represented across almost all profiles.
The PILNAR application provides a workspace to the project members where they can interact with metadata profiles. Depending upon the type of material, the user interface supports selection of profile-specific editors, i.e. each profile is associated with its own editor. Each of these editors is dynamically loaded when a user chooses to create or modify a specific type of resource. The resulting metadata and data files are stored on the server and automatically indexed to support search across all metadata files. This also allows distributions across facets to be displayed in customised user interface components, such as the keyword pie chart shown in Figure 31.1.
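The idea of a distribution across a facet, as used for the keyword pie chart, can be sketched in a few lines. This is an illustration under stated assumptions and not the project's Solr-based indexing: the function name facet_distribution, the dictionary layout of a record and the three sample records are all invented for the example; only the profile names are taken from the text.

```python
from collections import Counter


def facet_distribution(records, facet):
    """Return {value: share of all values} for one facet across the records."""
    counts = Counter()
    for record in records:
        values = record.get(facet, [])
        if isinstance(values, str):  # single-valued facet, e.g. the profile
            values = [values]
        counts.update(values)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {value: count / total for value, count in counts.items()}


# Invented sample records; real metadata would follow the CMDI profiles.
records = [
    {"profile": "JacobsstafVerhaal", "keywords": ["history", "God"]},
    {"profile": "UserSubmission", "keywords": ["nature"]},
    {"profile": "Pelgrimsverhaal", "keywords": ["nature", "food"]},
]

keyword_shares = facet_distribution(records, "keywords")
profile_shares = facet_distribution(records, "profile")
```

Each share corresponds to one slice of a pie chart; a search index would normally return the underlying counts directly, so the loop here only stands in for that step.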
For those interested in the implementation details of PILNAR, it is worth noting that the user interface is implemented in Flash, the workspace backend is implemented in Java and is delivered through a Tomcat server, and the indexing and search procedures take advantage of Apache Solr.

Bumps in the Road
As the project took shape, different issues started to rear their heads. These difficulties can be clustered in three categories: the variety of the source material (including their different property rights); difficulties in the communication between more content-oriented partners and the technical infrastructure; and challenges in the curation process and other unforeseen dynamics during the process of the project itself (cf. the development of the PILNAR app). A short elaboration on these 'bumps in the road' may illustrate the clusters.
First, the variety in sources created complications in the construction of a single structure to include all desired narratives. For example, the archive of De Jacobsstaf, although digitally available, posed difficulties, because it contains many forms of narratives. It proved difficult to select the texts included in the magazine that would be eligible for inclusion in PILNAR: personal accounts and thematic reflections were obviously wanted, but should discussions of movies and books be included? Or editorial pieces? Or experiences on Spanish language courses? Another problem that relates to the character of the narratives' source came from the upload website. Pilgrims who uploaded their own stories via this specially created website used the fields provided in a way that was not expected by the PILNAR team. This resulted in a lot of material that had to be converted manually before it could be included in the final dataset. Yet another type of difficulty was presented by the set of texts collected on the Catharijneconvent website, which used a different set of privacy regulations than the PILNAR project. This meant that before the collection of the Catharijneconvent could be imported, every single author who contributed to this collection had to be approached individually; this proved a task too extensive to perform within the PILNAR project.
A different cluster of problems was related to the cooperation between 'Tilburg' as the centre for content and project coordination and 'Amsterdam', with the Meertens Institute as the centre for the computational infrastructure (with in the background a critical CLARIN representative keeping track of time and financial planning). A first issue is well-known and broadly recognised in multidisciplinary projects like this one: communication often proves complicated between partners with such different backgrounds and outlooks. During the project, it proved difficult to explain the possibilities and limitations of the chosen technical setup of the database to the Tilburg sub-team, while, on the other end, the specific character of the different sources and deliverables was hard to communicate to the team that would structure and present the data in the database. A second issue in this cluster was very practical. Although a set of milestones and work packages had been agreed upon in the initial project description, the variety of chosen materials (as described above) made it impossible to plan the stages of the project in detail. Choices needed to be made, and, in the end, only a part of the sources was incorporated. Also, the curation of the material in the database was preliminary. However, the project staff underlined the pioneering character of the project: it was all about making a start on a narratives corpus.
A third cluster of problems revolved around the method of structuring the collected texts for the end user. As discussed above, the inspiration for the PILNAR heuristics was the structure of the four fields of the sacred. Due to the somewhat flexible nature of these fields and to the complex character of the data, the fields proved more difficult to apply to specific texts than was foreseen. These and other internal dynamics of the project played their role in difficulties faced by PILNAR. In different phases of the project, new insights came to the fore. Some of these could be incorporated into the project, while others could no longer be included. Inspired by the PhD project of Suzanne van der Beek on online pilgrim narratives, the PILNAR app was initiated in one of the last stages of the project. As this was not a foreseen development, PILNAR could not support the implementation of the stories gathered by this app directly into the database.

Results and the Future of PILNAR
Despite the difficulties encountered by PILNAR, the project resulted in a functioning database with potential for growth and expansion (CLARIN -PILNAR, 2015). As it stands, pilgrims are still sending in their narratives, a new Jacobsstaf and Ultreia appear monthly, and we are exploring the possibilities for importing narratives from popular platforms like Facebook and Twitter. Since 2013, Suzanne van der Beek has become one of the main users of the database as an academic tool. Her PhD project on the narrative identity construction of contemporary pilgrims benefits from a central collection of pilgrim stories that explores the width and depth of the complex pilgrim identity.
Perhaps the most interesting development in the PILNAR project, and the most concrete opportunity to guarantee continuity for the PILNAR database, is the creation of the PILNAR app (PILNAR Pelgrimsdagboek, 2014): in 2014, a mobile application was created that was supposed to give pilgrims the possibility to upload their narratives directly onto the database as they walked or cycled to Santiago de Compostela. The idea for this extension came from our colleague Dr Suleman Shahid, who also designed the structure of the app. Shahid's main interest lies in creating technologies that respond to the specific needs of marginal groups of users. In the case of the Camino pilgrims, we decided to respond to the pilgrims' need for a platform on which they could exchange stories exclusively about their pilgrimages. Rather than letting pilgrims become distracted by unrelated stories on platforms like Facebook or Twitter, the PILNAR Pilgrim Diary (Figure 31.2) provides an environment that invites reflection upon the pilgrim's journey via different categories and inspirational questions. The keywords that can be added by users are the same as the ones used in the PILNAR database. The app would thus combine the continuation of the PILNAR database and the media-specific needs of its pilgrim users.
In an early phase of creating the app, it became apparent that a structural connection to the PILNAR infrastructure would not be possible within the time and budget we had for the PILNAR app. This direct connection remains a wish, but is not currently an urgent issue. Pilgrims using the app see it and the database as part of the same project. The two are connected by their name, and in our communication the two are mentioned in the same breath. In this way, PILNAR travels not only across different platforms, but also across different countries in the pockets of pilgrims to Santiago de Compostela.

About the Authors
Suzanne van der Beek is a PhD researcher at the Department of Culture Studies of Tilburg University (NL), and underwent undergraduate training at the universities of Amsterdam, Leiden, and Lille. Her research is on the identity construction of Dutch pilgrims on the Camino to Santiago de Compostela. In particular, she studies the dynamics of online narratives in relation to the appropriation of a contemporary pilgrim profile.
Paul Post studied theology and liturgical studies in Utrecht and Christian art and archaeology in Rome. He is professor of Ritual Studies at Tilburg University (NL; School of Humanities, Department of Culture Studies). He is vice dean for research and director of the Graduate School of Humanities. His main interests are in the fields of ritual, popular religion and (post)modern developments in ritual, on which he has published books and articles. In recent years the focus of his research has been on ritual space and place, and cyber ritual.
Marc Kemps-Snijders is Head of Technical Development at the Meertens Institute. He has been involved in infrastructure development right from the start of the CLARIN project at both the European and the national level in areas such as semantic interoperability, services and workflows, information retrieval, and research data management.