The Perseids Platform: Scholarship for all!

It is rarely possible for students, members of the public, and other non-traditional scholars to access ancient documents such as texts, inscriptions, and manuscripts. The Perseids collaborative editing platform offers a gateway to scholarship which is open to all, regardless of native language, background, and level of expertise. Within this fully integrated online environment, participants can view, edit, translate, and annotate ancient documents (texts, manuscripts, inscriptions), while contributing to an ever-growing repository of open access humanities data sets. The variety of tasks available, which ranges from textual criticism to annotation of personal names and geographical entities, ensures the inclusion of all participants, and offers them a chance to learn and perform increasingly difficult tasks as they gain expertise. At the core of the platform is the integration of two pre-existing frameworks, the Son of Suda Online (SoSOL) and the CITE services. The former enables collaborative editing by providing workflow tools on top of a Git-based revision control system, supporting a built-in, versioned review process whereby each contribution is examined by a board and receives feedback before being approved for publication. The latter provides standards and APIs designed to link the resources, provide a citation scheme for texts and images, and support dynamic presentation of digital editions. Together, along with a variety of other integrated tools, standards, and services, these systems enable the Perseids platform to support communities of participants, who can collaborate based on participation in a class, shared interest in a literary work, or interest in a category of documents.
How to cite this book chapter: Almas, B and Beaulieu, M-C. 2016. The Perseids Platform: Scholarship for all!. In: Bodard, G & Romanello, M (eds.) Digital Classics Outside the Echo-Chamber: Teaching, Knowledge Exchange & Public Engagement, Pp. 171–186. London: Ubiquity Press. DOI: http://dx.doi.org/10.5334/bat.j. License: CC-BY 4.0.


Introduction
Current practice in Digital Humanities focuses heavily on crowdsourcing as a model for producing knowledge, vetting contributions, and in general, dealing with large datasets. 1 In Classics, the rapid growth of digital repositories and the increasing number of largely unedited and untranslated documents available online for processing (thousands of Greek and Latin inscriptions, 900 medieval manuscripts from e-codices, and 250 from the Walters Art Museum, to name only a few) practically force us to abandon traditional single-scholar approaches and adapt to the realities of the digital age. 2 The needs are simple, yet challenging: we must process more documents, but we must do so in a manner that is sustainable, upholds the established standards of quality in the discipline, and will result in the production of fully interoperable, transferable data.
The advantages of the method are obvious: a well-organized crowdsourcing effort can accomplish far more work than any lone scholar, and the work ultimately produced benefits from the variety of perspectives among its contributors. Furthermore, crowdsourcing helps break down the social and geographical barriers that have long kept ancient documents and scholarship in a limited number of hands. However, crowdsourcing also poses numerous challenges, such as ensuring the reliability and consistency of the data being produced and sustaining projects over time by renewing and growing the user base. 3 Finally, as crowdsourcing is more and more frequently practiced in classroom settings, questions arise with respect to pedagogy. 4 Does crowdsourcing change traditional teaching methods in Classics, and if so, how? In this new disciplinary landscape, what is the relationship between teaching and scholarship?
The Perseids platform, 5 nested within the Perseus Digital Library, 6 offers an online collaborative editing and annotation environment in which to test different approaches to crowdsourcing and pedagogy. Users can form communities based on participation in a class, in a research project, or individual interest in a particular type of document or question. The flexibility of the platform and the variety of tools offered ensure that users at every level of expertise and from a variety of fields can undertake editing and annotation work. Such broad participation leads us to rethink the role that scholarship plays in pedagogy, and in general the role that Classical scholarship can play in engaging Perseids users with the past.

The Audience of Classics Scholarship
Classics scholarship, in the form of interpretative essays, critical editions, and other forms of highly specialized publications, has long been strictly targeted to established scholars such as university professors and other professionals in the field. The objective of such publications is generally understood to bring knowledge further by engaging specialists in a conversation among themselves. This is certainly useful: in all fields, it is important for new discoveries and new ideas to be examined carefully by those who know the most about the field and therefore can cast a critical eye at the work being done and express informed opinions.
Yet Classics, by its very nature as a field that encompasses disciplines as diverse as history, philosophy, archaeology, art history, rhetoric, grammar, linguistics, and many others, engages a large body of stakeholders who are not specialists. In fact, this is precisely the reason why the discipline of Classics has been conceived as part of the core educational curriculum in the West until recently. For this reason also, the teaching of Latin (and to a lesser degree, Greek) has recently regained some of its popularity in high schools, as parents and educators search for means to introduce children to the Humanities and the study of languages. 7 The question then becomes, how to accommodate these different aspects of the field without compromising either quality or accessibility, and in general, how to promote the study of Classics?
The Perseus Digital Library, within which the Perseids platform is nested, has long served such a diverse audience. Perseus' broad mission is 'to make the full record of humanity – linguistic sources, physical artifacts, historical spaces – as intellectually accessible as possible to every human being, regardless of linguistic or cultural background'. 8 Naturally, such a mission can never be fully realized, yet the infrastructure that we design now will materially enable or constrain how the next generation will be able to read languages from the past, scrutinize ancient artifacts, and explore historical spaces. With these goals and caveats in mind, the Perseids collaborative editing platform was designed to enable a broad audience to contribute to Perseus, and in general to participate in the creation of knowledge in the Humanities. 9 The Perseids platform makes a range of tasks available to its users, from micro-tasks to multi-step editorial projects. 10 Students can undertake entire editorial tasks individually or in groups, as was done in Marie-Claire Beaulieu's Medieval Latin class in 2013 with the Tisch Miscellany Collection, a group of manuscript leaves and folios from early printed books preserved in the Tisch Library at Tufts University. 11 This project served as a test bed for tasks that were later to be made available in an integrated workflow on the Perseids platform. Students have now started using the Perseids platform for such tasks. Editing and translation work has started on a 14th-century compendium of English Forest Law preserved in the Tisch Library (see Fig. 1), 12 and we intend to finish the edition and translation of the Tisch Miscellany Collection.
Within these broad tasks, students can be assigned micro-tasks such as morpho-syntactic analysis through treebanks and named entity annotation through a variety of means including Perseids interfaces, through data imported from Google Spreadsheets or via integration with tools from the Alpheios and Pelagios projects. 13 By opening up the possibility for a wide range of external and third party tools to be used for annotation, we test different approaches to scholarship and pedagogy but recognize that an integrated fluid user experience is essential to successful uptake and use of the platform. For this reason, we have also now integrated the Arethusa client-side annotation framework, which enables rapid development of new interfaces for different types of annotations and documents, within a single consistent user interface paradigm (see Fig. 2). 14 The methodology behind the development of the Perseids platform is consistent with the project's goals for openness and accessibility. Underlying all architectural decisions is the premise that all texts and data produced on the platform must be fully accessible to the creator of the data at any time, and also available to other users of the platform.
There are different aspects to accessibility. First, in terms of user access, all that is required to create an account on the platform is an account with a Social Identity Provider that supports the OpenID protocol. 15 The most common type of account for this is a Google email address, but Yahoo and AOL addresses are also accepted and additional OpenID provider services can be added. We have also included support for authentication via a user's educational institution, through support for the SAML/Shibboleth protocol. 16 The user can choose to link this account with her social identity, so that if the user changes institutions, she can retain a consistent single identity on the platform associated with her publications. Support for the OpenID and SAML/Shibboleth protocols also puts the amount of private information made available to the Perseids platform and its end users under the user's own control. Perseids never has access to authentication credentials (such as passwords) and the only information a user is required to provide is a nickname for their user ID. Although a user may choose to provide their email address, full name, and affiliation, this is purely optional.
Next, in terms of legality, all the data produced on Perseids is published under the Creative Commons CC-BY-SA license. 17 In addition, at no time is the data locked into a closed database under proprietary formats. Instead, we use the git version control system to store and manage all texts and data, 18 and while the current deployment of the platform uses a git repository which is local to the infrastructure components that read and write data, there are various means by which this data can be retrieved by the end users.
All publications at any stage of editing are downloadable via links in the Perseids user interface, serialized according to standard and widely accepted data formats (the standards used will be discussed further below). In addition, version history and comments for any given file are available to any user of the platform through the user interface. No access controls are imposed on either download or history and commenting functionality, although in order to accommodate needs for use in the classroom, we have given the user the responsibility to share links to their publications, rather than advertising them broadly through the user interface. By the end of the first implementation phase of the project, we will also establish a public-facing clone of the master branch of the local git repository on the GitHub platform. This branch contains all committed publications, i.e. those which are no longer under review. Finally, the tools themselves used to create, curate, and annotate texts and data are all open source components, available in public version control repositories such as GitHub, SourceForge, and Bitbucket. Public contributions to these code bases are encouraged and, subject to review by Perseids project staff, will be accepted and deployed on the platform. Should a user or set of users wish to add functionality to one or more of these tools that is not deemed to be in line with the project priorities or goals, these users are free at any time to fork the code bases and deploy their own version of the tools, taking their data with them. The tools themselves are connected via documented APIs and standard RESTful web protocols.
A different avenue we are pursuing for accessibility is via integration with other projects in the domain. The Europeana network of Ancient Greek and Latin Epigraphy (EAGLE) project 19 has set up a multilingual wiki for the enrichment and enhancement of epigraphic images and texts, to provide a basis for future translations of inscriptions into other European languages. However, the wiki approach with open editing practices is a new model for traditional scholarship in the field, and Perseids is integrated with the EAGLE wiki to provide an alternate review workflow which allows translations to go through an editorial board. Perseids is in this case serving as a bridge between fully open wiki editing models and more closed review circles, by providing an open platform to enable peer and board review for wiki-based publication.

Pedagogy and Scholarship
When Perseids is used in class to edit, translate, and analyze ancient documents, traditional pedagogical models give way to a new model in which the teacher becomes a collaborator, guiding students through the process of research. In such a pedagogical setting, traditional top-down teaching methods where the teacher is in control of projects and outcomes and individual students all produce work on the same texts are set aside. Rather, work is produced in small teams or as a broad group with distinct tasks. Such a pedagogical method has proved motivating for students, who expressed enthusiasm at the idea of producing original work. Furthermore, this collaborative method can easily be combined with more traditional lectures or drills in order to vary classroom activities and stimulate different types of learners, from the more passive to the more pro-active students. 20 Collaborative teaching methods also provide a new model for evaluating student work. While traditional assignments are produced on a one-time basis and usually go through only one grading cycle, collaborative assignments can be evaluated multiple times, formally or informally. The review component of the Perseids platform is an extension of the Son of Suda Online (SoSOL) application, which was developed by the Papyri.info project (where it is called the Papyrological Editor). 21 This tool supports a workflow that leverages the Git version control system and in which a publication can consist of multiple linked documents, each identified by a stable Uniform Resource Identifier (URI). 22 There are no fixed editions; everything is potentially in flux, but each change is carefully recorded and vetted via an approval process which passes the documents in a publication through one or more targeted review boards made up of editors and community members.
In a classroom setting, the board can be the teacher and teaching assistants, while in other projects the board can be composed of invited experts, or simply all the members of a project team. Such a review method is commonly used in research and forms the basis of the peer-review system on which modern scholarship relies. In a classroom setting, multiple reviews allow students to learn from their mistakes and correct them, cut down on stress, and allow for a longer-term formative experience that shifts the emphasis from grades to learning. The students and the teacher make sure the task is done optimally rather than simply evaluating how it meets a certain standard at first try. Furthermore, the process serves to train students as researchers as they get to experience the many stages of review through which any work of scholarship must go before publication.
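The review cycle described above can be illustrated with a small model. The following Python sketch is not SoSOL's implementation; the class names, the `approvals_needed` threshold, and the example identifier are all hypothetical, chosen only to show how a publication might keep every revision while passing repeatedly through a board until it is approved.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    EDITING = "editing"
    SUBMITTED = "submitted"
    APPROVED = "approved"


@dataclass
class Revision:
    author: str
    text: str
    comment: str


@dataclass
class Publication:
    """Hypothetical model of a reviewed publication (not the SoSOL data model)."""
    uri: str                       # stable identifier for the document
    board: set                     # names of the people allowed to review
    approvals_needed: int = 1      # assumed rule: a fixed number of approvals
    status: Status = Status.EDITING
    history: list = field(default_factory=list)   # every revision is kept
    feedback: list = field(default_factory=list)  # (reviewer, note) pairs
    approvals: set = field(default_factory=set)

    def revise(self, author, text, comment):
        # Nothing is ever final: a new revision reopens the publication.
        self.history.append(Revision(author, text, comment))
        self.status = Status.EDITING
        self.approvals.clear()

    def submit(self):
        if not self.history:
            raise ValueError("nothing to submit")
        self.status = Status.SUBMITTED

    def review(self, reviewer, approve, note=""):
        if self.status is not Status.SUBMITTED:
            raise ValueError("publication is not under review")
        if reviewer not in self.board:
            raise PermissionError(f"{reviewer} is not on the board")
        if note:
            self.feedback.append((reviewer, note))
        if approve:
            self.approvals.add(reviewer)
            if len(self.approvals) >= self.approvals_needed:
                self.status = Status.APPROVED
        else:
            # Returned to the author with feedback for another cycle.
            self.status = Status.EDITING
            self.approvals.clear()


# A classroom board: the teacher and a teaching assistant must both approve.
pub = Publication(uri="urn:example:forest-law-transcription",
                  board={"teacher", "assistant"}, approvals_needed=2)
pub.revise("student", "first draft", "initial transcription")
pub.submit()
pub.review("teacher", approve=False, note="please re-check line 3")
pub.revise("student", "second draft", "corrected line 3")
pub.submit()
pub.review("teacher", approve=True)
pub.review("assistant", approve=True)
print(pub.status.value, len(pub.history), len(pub.feedback))
```

In this sketch the rejected first draft stays in `history` alongside the approved one, which mirrors the point made above: students can learn from each round of feedback, and the record of how the work developed is never discarded.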
Our aim is to support a wide range of publication types for the texts and data produced on the Perseids platform, from micro or nano publications 23 to full-fledged digital editions which adhere to scholarly standards such as those outlined by publications like RIDE. 24 Some examples of micro publications already supported include the commentary annotations produced by Marie-Claire Beaulieu's mythology students and published on the Perseus Digital Library, 25 additions and corrections to linked data sets like the Perseus Lexical Inventory, 26 and named entity and date annotations produced via the use of tools like TimeMapper and preserved via ingest into the Perseids repository from Google spreadsheets. 27 We also support complete digital editions composed of multiple documents including transcriptions and translations of source text in TEI XML, complete morpho-syntactic annotations on the text in the form of treebanks, translation alignments, accompanying commentary, bibliography, and other related information. The Fragmentary Texts and Bodin prototypes are demonstrations of possible web-based presentations of such editions, always backed by the raw XML and annotations that are the substance of the data behind the edition. 28 Through tools like Arethusa, mentioned previously, we are also exploring approaches to living publications, where members of the community can distribute links to their work in progress and invite feedback. The goal is to support as small or large a contribution to the scholarly discourse as an individual is willing and able to make, always preserving the history of the work and recording provenance information according to standards for research data like the PROV ontology so that work is credited and attributable. 29 Perseids team members have also been participating in international research data efforts like those of the Research Data Alliance to ensure we are informed of and follow the best practices in the scientific and scholarly communities for preservation and publication of the data that makes up our publications. 30
Collaborative teaching methods also give permanence to student work. While traditional classroom assignments will usually get thrown out after the semester is over and are rarely expanded into further work, the work done through Perseids is published to the web in the form of editions, translations, and annotations, where it serves to support further scholarship and learning activities. The students thus learn by producing concrete results, much in the way students in the sciences learn by participating in experimentation in laboratories or students in trade schools learn by working on actual products. In this way, Perseids not only democratizes access to publication, but also gives value to small contributions as well as to large ones. 31 For this reason, Perseids will soon implement an e-portfolio module, 32 which will pull together all of a user's contributions to scholarship through the platform and make them available as a body of work. The module can be used as a tool for global classroom evaluation, for capstone projects, or as material for graduate school and job applications.

What is Scholarship?
Opening up participation in scholarship in this way brings us to ask the question, what exactly is scholarship, and do these methods change it? According to the American Heritage Dictionary, scholarship is 'the methods, discipline, and attainments of a scholar or scholars' . According to this definition, scholarship is about the ways in which knowledge is produced as well as the result of this production, namely, contributions to the advancement of knowledge. Thus, over the centuries, scientific research methods have evolved in order to ensure maximum accuracy of the results. These methods apply to all science, regardless of the field, whether it is in the Humanities or Natural Sciences. At their core, they involve relatively simple principles that correlate the collection of accurate data and its interpretation. In Classics, these principles are seen at work in language training, which is essential in order to understand and use the information provided by ancient documents to the fullest, and in the principles that guide the conduct of textual criticism, archaeological digs, etc. All these methods are designed to collect data in a way that is as accurate as possible so as to form the basis for sound interpretation.
Perseids helps Classicists to uphold these scholarly standards and to make them accessible to a broader population than ever before. Many of the tools offered on Perseids facilitate language acquisition, such as morpho-syntactic analysis through treebanking, which provides a visual and kinetic method to analyze language, as words and clauses are moved around the screen to show sentence structure. This is an effective method not only to gain understanding of language, but also to display one's understanding of a sentence or group of sentences and justify interpretations. In this way, morpho-syntactic analysis offers learning tools that are also intrinsically scholarly. 33 Similarly, alignment tools developed by the Alpheios project offer readers who have no knowledge of the original language insight into the text itself, and also allow translators to justify their choices. 34 As for textual criticism, the editing tools available in Perseids are designed to make the collation of textual data as transparent as possible. The Imgspect tool, the design of which was inspired by the Image Citation Tool of the Homer Multitext Project, allows us to link transcriptions directly to an image of the document (this is particularly helpful in the case of inscriptions and manuscripts) in order to justify readings and show the evidence directly to the audience. 35 Furthermore, transcriptions of inscriptions and manuscripts are encoded following the EpiDoc standards, which are ultimately based on the Leiden conventions, long accepted in the field as the scientific standard for presenting epigraphical and manuscript texts. 36 Finally, the built-in review process available in Perseids and the collaborative focus of the platform help to uphold the standards of peer review, which are crucial in establishing credibility in any field.
The Canonical Text Services Uniform Resource Name (CTS-URN) specification, developed by Chris Blackwell and Neel Smith, working with the Center for Hellenic Studies, is a core standard of the Perseids platform and enables us to connect the work we do today with the long-standing tradition of scholarly citation in the classics. 37 The CTS-URN specification allows us to translate canonical citations such as Thuc. 2.44 and Liv. 1.34 (which have long described chapter 44 of book 2 of Thucydides and chapter 34 of book 1 of Livy) into a technology-independent and machine-actionable form. 38 These URNs, when combined with the http://data.perseus.org namespace, allow us to provide persistent, stable, resolvable identifiers for any canonical text, passage, or even word as the target for our annotations in a manner that adheres to best practices for linked data. 39 We make this data interoperable and sharable by other projects in the field by serializing all annotations, from the simple identification of named entities, to commentaries on texts, to the more complex morpho-syntactic analyses, according to the Open Annotation (OA) data model, including provenance information for the creators, contributors, and reviewers of these annotations. 40 The inclusion of provenance information allows consumers of the data to make their own quality assessments about the data. In addition to the persistent URIs for the primary source texts which are the target of the annotation, we can reference and contribute to other established data sets in the domain, such as the Pleiades Gazetteer and the annotations on ancient places aggregated by the Pelagios Project. 41
In addition to these methodological principles, the American Heritage Dictionary offers a second definition for scholarship, explaining it as the knowledge resulting from study and research in a particular field. Thus, according to this definition, Perseids users become scholars by the very fact that they engage in such activities. We note that the dictionary does not characterize scholarship or scholars as possessing definitive knowledge on a topic. Rather, both definitions imply a process that results from the practice of scholarly activities through which knowledge is produced, with new data and new interpretations constantly replacing old ones. Knowledge advances on a continuum, with contributions big and small paving the way to discovery. In Perseids, this aspect of scholarship is represented in the equal value given to all contributions, whether they are large editing tasks or the correction of typographical errors. 42 For this reason, all contributions will appear in a user's e-portfolio, showing all the forms scholarship can take.
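The CTS-URN citation scheme and the Open Annotation serialization described in this section can be made concrete with a short Python sketch. Several details here are assumptions rather than statements about the Perseids code base: the work identifier `tlg0003.tlg001` is used as the identifier for Thucydides' History, the `citations/` path under data.perseus.org is an assumed resolver pattern, the Pleiades URI is an illustrative body only, the user URIs are placeholders, and `ex:reviewedBy` is a hypothetical property standing in for whatever provenance vocabulary a project adopts.

```python
import json
from dataclasses import dataclass
from typing import Optional

# Assumed resolver prefix under the data.perseus.org namespace (illustrative).
PERSEUS_CITATIONS = "http://data.perseus.org/citations/"


@dataclass(frozen=True)
class CtsUrn:
    """A CTS URN of the form urn:cts:<namespace>:<work>[:<passage>]."""
    namespace: str
    work: str
    passage: Optional[str] = None

    @classmethod
    def parse(cls, urn: str) -> "CtsUrn":
        parts = urn.split(":")
        if len(parts) not in (4, 5) or parts[:2] != ["urn", "cts"]:
            raise ValueError(f"not a CTS URN: {urn!r}")
        if not parts[2] or not parts[3]:
            raise ValueError(f"missing namespace or work in {urn!r}")
        passage = parts[4] if len(parts) == 5 and parts[4] else None
        return cls(parts[2], parts[3], passage)

    def __str__(self) -> str:
        base = f"urn:cts:{self.namespace}:{self.work}"
        return f"{base}:{self.passage}" if self.passage else base

    def uri(self) -> str:
        # Combine the URN with the namespace to get a resolvable identifier.
        return PERSEUS_CITATIONS + str(self)


def place_annotation(target: CtsUrn, place_uri: str,
                     creator: str, reviewers: list) -> dict:
    """Build a minimal OA-style annotation linking a passage to a place."""
    return {
        "@type": "oa:Annotation",
        "oa:hasTarget": {"@id": target.uri()},
        "oa:hasBody": {"@id": place_uri},
        "oa:motivatedBy": {"@id": "oa:identifying"},
        "oa:annotatedBy": {"@id": creator},
        # Hypothetical property: OA itself does not define a reviewer term.
        "ex:reviewedBy": [{"@id": r} for r in reviewers],
    }


# The canonical citation "Thuc. 2.44" expressed as a CTS URN.
thuc = CtsUrn.parse("urn:cts:greekLit:tlg0003.tlg001:2.44")
annotation = place_annotation(
    thuc,
    "https://pleiades.stoa.org/places/579885",   # illustrative body only
    "https://example.org/users/student",         # placeholder creator URI
    ["https://example.org/users/teacher"],       # placeholder reviewer URI
)
print(json.dumps(annotation, indent=2))
```

Because the target is a URI built from the URN rather than a page number or a file path, the same annotation remains valid whichever edition or interface is used to display the passage, and the creator and reviewer fields travel with it so that consumers can weigh its reliability for themselves.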

Conclusions
Perseids permits the practice of scholarship in the traditional sense of the word. Yet Perseids also transforms scholarship by giving individuals at all levels of expertise broad access not only to the practice of scholarship, but also to its valued outcome, publication. We encourage and enable users to take responsibility for the scholarly data they produce on the platform, offering the opportunity for it to be published, after review, by the Perseus Digital Library as part of a larger collective body of work while also leaving them free to take it with them and re-imagine its publication by itself or as part of other projects, such as that of the EAGLE network. As a result of this process, the undergraduate students who currently make up the large majority of our user base learn that they are part of a global community of interconnected scholars sharing the responsibility for making more ancient texts than ever before available for analysis and study by all, while still upholding the long-established standards of quality of the discipline. Some of these students begin to see themselves as research partners with their professors, rather than just as students completing an assignment, and discover that they are empowered to publish and disseminate their knowledge in a wide variety of forms and venues. As undergraduates in multidisciplinary courses of study, many of our users will go on to fields well outside of traditional academic structures and we hope that we have planted a seed they will carry with them, leading to their future engagement in opportunities for scholarship, whatever they may be.

Notes
Orlandi et al. 2014.
20 The pedagogical models illustrated in the use of Perseids in the classroom, including meaningful research collaborations between student and professor, project-based student teamwork, and student publication of their contributions at earlier stages, have been highlighted elsewhere as key potential contributions of digital humanities pedagogy to broader humanities teaching; see Bonds 2014 and Hirsch 2012.
21 SoSOL on GitHub: <https://github.com/sosol/sosol>; for more on the development and current status of work on SoSOL see Baumann 2013; Papyri.info: <http://papyri.info/>; Papyrological Editor: <http://papyri.info/editor/>.
22 Git: <http://git-scm.com/>; for a good definition of URIs, see Wikipedia, 'Uniform resource identifier': <http://en.wikipedia.org/wiki/Uniform_resource_identifier>.
23 Nanopub Guidelines: <http://nanopub.org/guidelines/working_draft/>. For more on the definition of micro or nano-publications, see Groth et al. 2010 and Clark et al. 2014. For an initial consideration of some of their potential for publication within the humanities, see Drucker 2013 and Hall 2013.
24 Krohn 2014.
31 For a similar discussion of the importance of recognizing and publishing both student contributions and smaller forms of scholarly publication, see Blackwell & Martin 2009; Presner 2012.