Beyond Toponyms. Conceptualising space in narratives using networks and ontologies.
Despite its importance as a fundamental category for the development of narrative action, the conceptualisation of space has long been neglected in narratological research, since it poses substantial problems for modelling: the creation of space in narratives is often based on implicit information. Rather than constructing a given, mathematical space beforehand, stories tend to develop their setting in relation to their characters, who constitute space through their actions.
In addition, spatial information can be used for descriptions that do not contribute to the creation of the setting of a story. Recent narratological studies have therefore proposed to differentiate between event regions and mentioned spatial objects (Dennerlein 2009).
Given these preliminaries, any attempt to establish digital methods for the analysis of spatial information in narratives has to cope with complex semantic structures. We therefore propose combining a lexicological approach (1) with a relation extraction approach (2), both of which we outline in this paper.
(1) A basic way to detect spatial information is to extract toponyms from a text, which can then be visualised on a map. However, toponyms are not the only relevant spatial information in a text. Since we are dealing with German texts, we use the GermaNet ontology (Hamp/Feldweg 1997, Henrich/Hinrichs 2010) to build wordlists of place nouns that are divided into sub-categories (architecture, landscape, etc.) and combine them with the toponyms extracted by Named Entity Recognition.
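The lexicon-based step (1) can be illustrated with a minimal sketch. It assumes that the category wordlists and the toponym list already exist; the names `PLACE_NOUNS`, `TOPONYMS`, `build_lexicon` and `tag_place_markers` are hypothetical, the word entries are invented examples rather than actual GermaNet or NER output, and lemmatisation is omitted for brevity.

```python
import re

# Hypothetical wordlists; in the approach described above they would be
# derived from GermaNet and refined semi-automatically.
PLACE_NOUNS = {
    "architecture": {"haus", "bahnhof", "tempel"},
    "nature": {"wald", "fluss", "gebirge"},
    "transport": {"eisenbahn", "schiff"},
}

# Hypothetical toponyms, standing in for Named Entity Recognition output.
TOPONYMS = {"kalkutta", "bombay"}


def build_lexicon(place_nouns, toponyms):
    """Merge category wordlists and toponyms into one lookup table."""
    lexicon = {}
    for category, words in place_nouns.items():
        for word in words:
            lexicon[word] = category
    # Toponyms get their own category and override any wordlist entry.
    for name in toponyms:
        lexicon[name] = "toponym"
    return lexicon


def tag_place_markers(text, lexicon):
    """Return (token, category) pairs for every place marker in the text."""
    # Simplification: lower-cased surface forms are matched directly,
    # so inflected forms are missed without a lemmatiser.
    tokens = re.findall(r"\w+", text.lower())
    return [(token, lexicon[token]) for token in tokens if token in lexicon]


lexicon = build_lexicon(PLACE_NOUNS, TOPONYMS)
markers = tag_place_markers(
    "Der Zug verließ den Bahnhof von Kalkutta und fuhr durch den Wald.",
    lexicon,
)
```

In this sketch, toponyms and place nouns end up in a single lookup table, so later steps can treat both kinds of place marker uniformly while still keeping their category.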
(2) To classify spatial information, it has to be considered in its context. Although a rule-based approach that draws on dependency parsing and the detection of verb classes (verbs of motion vs. verbs of perception) appears promising for separating event regions from mentioned spatial objects, in this paper we restrict ourselves to collocation networks as they are used in the analysis of character relations in stage plays (e.g. Trilcke 2013). Instead of interacting characters, we use place markers as the nodes of the network. An edge is established between two place markers whenever they co-occur in the same sentence.
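A sentence-level co-occurrence network of this kind could be built roughly as follows. This is only a sketch under simplifying assumptions: the sentence splitter is naive, place markers are matched as lower-cased single tokens, and the function names `split_sentences` and `cooccurrence_edges` as well as the sample text are hypothetical.

```python
import re
from collections import Counter
from itertools import combinations


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' plus whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def cooccurrence_edges(text, markers):
    """Count how often two place markers appear in the same sentence.

    Returns a Counter mapping alphabetically ordered marker pairs to the
    number of sentences in which both markers occur.
    """
    edges = Counter()
    for sentence in split_sentences(text):
        tokens = set(re.findall(r"\w+", sentence.lower()))
        present = sorted(tokens & markers)
        # Every unordered pair of markers in the sentence forms one edge.
        for a, b in combinations(present, 2):
            edges[(a, b)] += 1
    return edges


# Invented sample text and marker set, for illustration only.
sample = (
    "The train left the station in Calcutta. "
    "The river ran past the station. "
    "Calcutta has a station by the river."
)
edges = cooccurrence_edges(sample, {"station", "calcutta", "river"})
```

The resulting weighted edge list can be exported to a network tool for layout and visualisation; the edge weight records in how many sentences a pair of markers co-occurs.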
Fig. 1 (created with Gephi, Bastian et al. 2009)
An example of this approach can be seen in figure 1, which contains a network from chapter 14 of Jules Verne’s Around the World in Eighty Days (place nouns have been translated into English). Toponyms that have been manually classified as event regions appear in brown, mentioned toponyms in red. Place nouns are divided into the sub-categories nature (green), architecture (grey) and transport (blue). The network not only visualises the relations between spatial markers, but can also serve as a starting point for testing hypotheses on their distribution (Do, for instance, mentioned spatial objects have a lower degree than event regions? Do the clusters of place markers indicate distinct settings of the story?).
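The first of these questions, whether mentioned spatial objects have a lower degree than event regions, could be approached by comparing mean node degrees per class. The sketch below assumes an edge list and manual class labels are available; the function names and all data are invented for illustration and do not reproduce the network in figure 1.

```python
from collections import defaultdict
from statistics import mean


def degrees(edges):
    """Compute the degree (number of distinct neighbours) of each node."""
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    return {node: len(ns) for node, ns in neighbours.items()}


def mean_degree_by_class(edges, labels):
    """Average the degree of all labelled nodes within each class."""
    deg = degrees(edges)
    grouped = defaultdict(list)
    for node, label in labels.items():
        # Labelled nodes without any edge count with degree 0.
        grouped[label].append(deg.get(node, 0))
    return {label: mean(values) for label, values in grouped.items()}


# Invented example data: an edge list and manual toponym classifications.
example_edges = [
    ("calcutta", "station"),
    ("calcutta", "river"),
    ("calcutta", "train"),
    ("london", "train"),
]
example_labels = {"calcutta": "event_region", "london": "mentioned"}
result = mean_degree_by_class(example_edges, example_labels)
```

A difference in mean degree on a single chapter is only an indication; testing the hypothesis properly would require more texts and a suitable significance test.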
However, this first approach towards a visualisation of space still has to overcome some difficulties. Firstly, even with the use of GermaNet, the wordlists have to be refined semi-automatically. Secondly, there is the problem of ambiguity (consider e.g. the terms “stream” or “railroad”), which has to be resolved by more elaborate semantic analyses. Thirdly, the co-occurrence of place markers can only serve as a proxy for spatial relations; here, more elaborate models have to be found.
Bastian, Mathieu / Heymann, Sebastien / Jacomy, Mathieu (2009): “Gephi: an open source software for exploring and manipulating networks”, in: International AAAI Conference on Weblogs and Social Media.
Dennerlein, Katrin (2009): Narratologie des Raumes. Berlin.
Hamp, Birgit / Feldweg, Helmut (1997): “GermaNet – a Lexical-Semantic Net for German”, in: Proceedings of the ACL workshop Automatic Information Extraction and Building of Lexical Semantic Resources for NLP Applications. Madrid.
Henrich, Verena / Hinrichs, Erhard (2010): “GernEdiT – The GermaNet Editing Tool”, in: Proceedings of the Seventh Conference on International Language Resources and Evaluation (LREC 2010). Valletta, Malta, May 2010, 2228-2235.
Trilcke, Peer (2013): “Social Network Analysis (SNA) als Methode einer textempirischen Literaturwissenschaft”, in: Ajouri / Mellmann / Rauen (Eds.): Empirie in der Literaturwissenschaft. Münster, 201-247.