Strengthening the Monitoring of Violations against Journalists through an Events-Based Methodology

Sustainable Development Goal (SDG) indicator 16.10.1 proposes an important monitoring agenda for the global recording of a range of violations against journalists as a means to prevent attacks on the communicative functions of journalism. However, the need for extensive collection of data on violations against journalists raises a number of methodological challenges. Our research shows the following issues must be addressed: the lack of conceptual consistency; the lack of methodological transparency; the need for sophisticated data categorisation and disaggregation to enable data to be merged from different sources; the need to establish links to understand causal and temporal relations between people and events; and the need to explore and utilize previously untapped data sources. To strengthen the monitoring of SDG 16.10.1, we propose developing a robust and reliable events-based methodology and a set of tools which can facilitate the monitoring of the full range of proposed 16.10.1 categories of violations, reconcile data from multiple sources in order to adhere to the established 16.10.1 category definitions, and further disaggregate the proposed 16.10.1 categories to provide more in-depth information on each instance of a violation. This, we argue, will ultimately contribute towards better understanding of the contextual circumstances and processes producing aggressions against journalists.


Introduction: The Problem of Adequate Monitoring
The UN Plan of Action on the Safety of Journalists and the Issue of Impunity states that recent years have shown "disquieting evidence of the scale and number of attacks against the physical safety of journalists and media workers as well as of incidents affecting their ability to exercise freedom of expression" (UN, 2012, p. 1). Perpetrators of these attacks span both state and non-state actors, such as formal government representatives and security forces as well as organized crime groups, militia, terrorist, and extra-state political groups. The types of attacks include "killings, death-threats, disappearances, abductions, hostage takings, arbitrary arrests, prosecutions and imprisonments, torture and inhuman and degrading treatment, harassment, intimidation, deportation, and confiscation of and damage to equipment and property" (Organization for Security and Co-Operation in Europe, 2012, p. 1). Research shows that journalists are typically targeted because of their work in holding power holders to account, for example when exposing corruption and organized crime and reporting in conflict zones (Horsley & Harrison, 2013; IFEX, 2015; UNESCO, 2018a). Attacks are carried out in a variety of societal contexts, ranging from conflict and war zones, increasingly fragile states or vulnerable regions, and countries undergoing political or economic shocks to relatively stable countries (Asal, Krain, Murdie, & Kennedy, 2016; Bjørnskov & Freytag, 2016; Brambila, 2017; Collinson, Wilson, & Thomson, 2014; Cottle, Sambrook, & Mosdell, 2016; Gohdes & Carey, 2017; Riddick, Thomson, Wilson, & Purdie, 2008; Taback & Coupland, 2006; VonDoepp & Young, 2013; Waisbord, 2002, 2007). Risk and hazard exist in both conflict and non-conflict situations and, worryingly, threats that intensify risk and hazard have more recently migrated online (Betz, Lisosky, & Henrichsen, 2015; Reporters Without Borders [RSF], 2018; UNESCO, 2018b).
Other factors affecting the incidence of attacks are also being recognised. These include gender (Ferrier, 2018; UNESCO, 2018b), the type of news medium the journalist works for, the beat covered, or if the journalist is local, foreign, and/or freelance (UNESCO, 2018b). Problematically, the majority of intimidatory and violent acts against journalists and freedom of expression are committed with impunity, meaning that the violations have no legal consequences and that perpetrators go unpunished (Committee to Protect Journalists [CPJ], 2019a; Horsley, 2011; Parmar, 2014; UNESCO, 2018b). Considering the multi-layered nature of problems of journalism safety, any efforts to address safety threats ultimately depend upon our ability to understand and measure the complexities and dynamics of journalistic risk and hazard.
The international community has increasingly come to recognise the safe practice of journalism as a prerequisite for sustainable and human rights-centred development. This is acknowledged not least in the SDGs Agenda, within which the occurrence of violations against the safety of journalists has been included as an indicator of Target 16.10, which aims to "ensure public access to information and protect fundamental freedoms," by recording "verified cases of killing, kidnapping, enforced disappearance, arbitrary detention and torture of journalists and other harmful acts" (Human Rights Council [HRC], 2018) through indicator 16.10.1. Indicator 16.10.1 will therefore be used to assess overall progress to the wider SDG 16, which seeks to "promote peaceful and inclusive societies for sustainable development, provide access to justice for all and build effective, accountable and inclusive institutions at all levels" (UN, 2019).
While the SDG agenda in this way opens up a path for the potential universal monitoring of violations against journalists, the requirements in terms of comprehensive data collection raise a number of methodological challenges that currently stand in the way of generating the data needed to achieve the formulated monitoring goals. Ultimately, adequate monitoring of occurrences of attacks on journalists is essential for understanding the complexity, scale, and nature of these problems and thus "a crucial step toward establishing an empirical evidence base that can serve to tailor interventions aimed at safeguarding journalists and their work" (Torsner, 2017, p. 129).
In short, these methodological challenges can be described as twofold. First, the availability of reliable quality data on a range of abuses is an issue. Here it is important to emphasise that the gathering of data on any type of abuse against journalists and the verification of its accuracy is a tremendously challenging undertaking that is being diligently carried out by a range of civil society actors. This process often involves having to gather data in the field from volatile and/or conflict-ridden societies (IFEX, 2011, pp. 20-22), and in contexts where powerful actors and vested interests are able to conceal or prevent information related to attacks on journalists from coming to light (RSF, 2019; Sullivan, 2018). Furthermore, institutionalised local mechanisms that could facilitate the systematic collection of data on abuses may be under development or completely absent in many contexts. Importantly, this extends beyond conflict situations to include developing and developed democracies (Pöyhtäri, 2016, p. 177; UNESCO, 2015). Moreover, data collected by local civil society organisations are rarely compiled into a common repository of data that can be used towards the monitoring of indicator 16.10.1 or for structural cross-country comparison or the domestic analysis of trends (see Gasteazoro, Gómez, & García, 2019, for an example of a regional initiative to monitor 16.10.1).
Secondly, the empirical measurability of indicator 16.10.1 is also a pinch point. The UN statistical commission, which is overseeing the work on operationalising the indicators, initially argued that the 16.10.1 measurement had some weaknesses and ranked it as a Tier III indicator (the weakest category). While being based on "internationally agreed standards [that] (UN, 2016, p. 55), the Tier III ranking of indicator 16.10.1 meant that no "established methodology and standards" existed for the indicator or that "a methodology/standards" were in development (UN, 2018, p. 3). Work has since been undertaken to "refine the methodology and expand the data collection scope of the indicator" and as a result, indicator 16.10.1 has been upgraded to a Tier II indicator (UN, 2018, p. 30). It is thus now regarded as "conceptually clear [with an] established methodology and standards available but data are not regularly produced by countries" (UNESCO, 2018c, p. 2). Consequently, even if a methodology for measuring and capturing data on threats and attacks against journalists is developed, the problem of limitations when it comes to access to reliable data still remains. Any attempt to improve monitoring should therefore ideally address the issue of generating quality data and establishing a methodology for systematising and comprehensively measuring safety threats concomitantly.
In practical terms, this article is concerned with methodological development aimed at contributing to the concrete measurement of the delineated 16.10.1 categories of violations against journalists. This immediate utility-oriented goal is, however, interlinked in important ways with a more overarching research agenda focusing on the task of developing methodologies of measurement to strengthen current data gathering so that it captures the contextual complexity necessary to understand problems of safety in a more comprehensive way. From a sociological perspective, understood to encompass the methodical examination of society, social interaction and patterns (Allan, 2006), generating understanding of the phenomenon of safety violations as complex is at the very heart of what the article seeks to contribute towards. Whereas "[f]actual research shows how things occur…sociology does not just consist of collecting facts" (Giddens, 2009, p. 10), sociology is concerned with "why things happen" (Giddens, 2009, p. 11) for the purpose of making sense of factual observations. Lacking a ground-level understanding of the facts of how violations against journalists are manifest in the real world will ultimately prevent any broader analysis into why the world is so constituted, and consequently how the causes and wider societal consequences of attacks on journalists should be assessed. It is thus against the background of such a wider line of inquiry, of tracking not only the incidence and nature of violations themselves (their manifestation), but also their causes and consequences that the methodological groundwork in this article is conducted (Torsner, 2019).
The challenges to achieving this are diverse and substantial, as addressing them requires that statistics on violations against journalists are not only systematically recorded as high-level categories of information, such as counts of the number of killed or imprisoned journalists within a country on a yearly basis. Such information also needs to be recorded in a way that can provide a disaggregated understanding of the context of each violation. This would also need to allow for the disaggregation of risk factors through the macro, meso, and micro sociological levels of analysis (Giddens, 2009; Ritzer, 2011) and therefore for understanding an environment hostile to free and independent journalism as arising from a continuum of patterns of influence emerging from interactions and "articulations between systems and actors, between structures and practices" (Ferreira & Serpa, 2017, p. 3, 2019). The 16.10.1 indicators ultimately produce categories of information that allow for the identification of the number of times journalists have been exposed to a specific type of violation (killing, arbitrary arrest, and so on). However, any comprehensive monitoring must be approached holistically, taking into account the multidimensional nature of safety problems as not only consisting of manifestations but also causes and consequences that go beyond the immediate consequences suffered by the individual journalist as a result of an attack. Indeed, such consequences influence the practice of journalism, for instance by giving rise to practices of self-censorship (Clark & Grech, 2017; Harrison & Pukallus, 2018), as well as society more broadly as journalistic voices are silenced.
Whereas this would require the systematic study of risk as produced by social actors including the state, economy, the law, and the institution of journalism itself, the aim of this article is narrower in terms of its particular focus on the improvement of the monitoring of violations of the safety of the individual journalist. Nevertheless, the article does so through the lens of sociological holism with the aim of preparing the ground for establishing a monitoring methodology that allows for the recording and subsequent understanding also of the reasons why violations occur and how the implications of such violations for society at large should be understood. To this end the events-based approach developed in this article meets this requirement of holism by serving as a tool for a more systematic and disaggregated methodological approach to generate and systematise information on violations against journalists.
To show how this is achieved, the article will first diagnose the limitations with extant data that is being used to track and record violations against journalists for the purpose of 16.10.1. Second, it examines possibilities for establishing an events-based methodology for monitoring SDG 16.10.1 in a way that generates high-quality data and allows for the merging of diverse information through the establishment of an ontological categorisation scheme.

Current Data Limitations Preventing Comprehensive SDG 16.10.1 Monitoring
To understand the empirical and methodological limitations of existing data we examined a selection of data sets that provide examples of international, regional and national level monitoring of violations against journalists (see Appendix 1 in the Supplementary File). We then studied the extent to which the categories of information recorded by monitoring organisations cover the five main SDG 16.10.1 violations categories (killings, kidnapping, enforced disappearance, arbitrary detention, and torture) as well as the sixth category of 'other harmful acts' (added to the categories of violations through the adoption of the Human Rights Council Resolution in 2018; HRC, 2018). Our findings show that there are three key areas that must be addressed to achieve effective monitoring of SDG 16.10.1 and to better understand the contextual circumstances producing attacks against journalists. These include: a) the issue of data coverage; b) the issue of data reconciliation and disparate definitions; and c) problems of data categorisation and systematisation.
a) The issue of data coverage
Our research shows that the violations category of killings is recorded in all data sets covered. Although illustrated here through a representative sample, the conclusion that killings are the violation most commonly monitored is consistent with findings presented elsewhere (see e.g., Torsner, 2017, 2019). Whereas the monitoring of lethal attacks against journalists is absolutely essential since it captures the most serious form of violation of journalistic expression, the argument here is that it is necessary to widen current monitoring to include the full range of physical and non-physical attacks perpetrated against journalists. This ultimately points to the need to respond to wider lines of inquiry such as uncovering how different societal contexts produce certain types of violations against journalists; sub-national and regional variations in violations; the types of violations facing different categories of journalists; and the range of responses to attacks (e.g., from families, peers, news organisations, civil society, and states). While these investigations lie beyond the scope of this article, it is nevertheless the aim here to build the foundations for these explorations by establishing a methodological infrastructure that enables such analyses to be conducted from the data. Indeed, any such wider analytical assessment on the nature and scope of challenges to the safety of journalists using only data on lethal violations "as a single indicator of risk" (Torsner, 2019, p. 128) may lead to incorrect conclusions with regards to trends and their real manifestation (see e.g., Landman & Carvalho, 2010, p. 50). If we look beyond the category of killings to the other SDG 16.10.1 categories, we see that there are substantial differences in the coverage of incident types, with certain categories of violations being recorded by some organisations but not by others.
Importantly, this points to the disparate nature of categories that are used to record violations. The fact that data sets covering a specific national context (such as La Fundación para la Libertad de Prensa) tend to record a wider range of categories than those covering international statistics (represented here by the CPJ) indicates the need also to facilitate the incorporation of data collected in a local context when monitoring SDG 16.10.1.
In addition to limitations related to data range and coverage, conceptual inconsistencies between different data sets are also preventing comprehensive monitoring of 16.10.1. Examining the data on lethal violations reveals that a number of definitional, methodological, and verification-related considerations lead various monitoring organisations to differing approaches when it comes to how, when and why they record a killing in their tallies, for example who is considered a journalist (only professional journalists, or also citizen journalists and bloggers). These considerations cause yearly statistics on killings within a country to differ between organisations (see e.g., IFEX, 2011; Sarikakis et al., 2017; Torsner, 2017).
b) The issue of data reconciliation and disparate definitions
Our findings also show that there is a lack of conceptual consistency across data sets, with numerous definitions being used to describe the same type of violation. Given that the rationale for the guidelines on the metadata for indicator 16.10.1 provided by the Office of the UN High Commissioner for Human Rights (OHCHR) is based on human rights provisions (such as the right to life and liberty), making extant data compatible with definitions adopted for monitoring 16.10.1 is key, and underlines the problem of conceptual inconsistency with extant data which also does not expressly adopt the 16.10.1 definitions. These definitional challenges become particularly clear when considering the category of 'other harmful acts' (HRC, 2018) where the disparate nature of definitions used makes any attempt to harmonize different data sets in a way that adheres to the 16.10.1 categorisation far from straightforward.

c) Problems of data categorisation and systematisation
Whereas generating more data on a wide range of different types of violations is key to strengthening the monitoring of 16.10.1, we also argue that improving monitoring is not simply a matter of gathering more data of the kind that already exists. Rather, methodological development is also needed with regards to data categorisation and systematisation. This can be illustrated through the statistics on instances of lethal violations which are commonly recorded as counts of the number of yearly occurrences of killings within a country. These figures are accompanied by varying levels of detail about the incident and its surrounding circumstances. In some cases, only the bare minimum facts (that a killing has occurred) are recorded, at least in structured form, while in others, a wider picture is put together. In most cases, a large amount of additional information is left as unstructured qualitative free text, which currently serves no purpose in terms of classification and wider monitoring efforts, but could be extremely useful to more systematically understand the bigger picture and to identify causal, temporal, and other relations between events, such as investigating the escalation of threats into full-scale killings.
From the perspective of trying to contextualise and understand why and how journalist murders occur, the recording of detailed information that goes beyond statistics that count the number of killings is thus particularly important. The CPJ records categories of information related to a killing, such as the type of perpetrator involved (e.g., military, political, or government actors), the types of topics covered by the journalist (e.g., corruption, human rights, or war), as well as whether they received threats prior to being murdered. CPJ also records the status of the judicial investigation into a killing through the categories of 'complete impunity,' 'partial justice,' and 'full justice' (CPJ, 2019b). While such further disaggregation of information related to a killing is very valuable, we argue that there is a need to systematically record additional sub-categories of information providing more in-depth information on each instance of a violation, which could for instance be used to map how acts of intimidation against a journalist might escalate into lethal violence.
In the following section, we develop a proposal for addressing these outlined data limitations by using an events-based methodology rather than the traditional person-centric approach, and demonstrate how such a methodology can effectively improve existing monitoring of violations against journalists.
An Events-Based Methodology for the Improved Monitoring of 16.10.1
Within social science analysis, events are commonly studied social phenomena ranging from macro social events (e.g., regime changes and civil unrest) to micro events affecting an individual (Landman & Carvalho, 2010). For the purpose of this article, an event is essentially understood as a violation of the rights of a journalist. By responding to the questions of what happened, when, and who was involved, an events-based measure can provide descriptive or numerical summaries of human rights events (Bollen, 1992, p. 37). Accordingly, the data can be disaggregated at the level of the violation, as well as at the level of the person (the individual journalist), which allows for the contextualisation and recording of related information in an in-depth manner. This may include: information about key actors involved in a violation and their interrelationships (victim, perpetrator, and witnesses); the time and place of the violation; and the systematic recording of multiple violations experienced by the same victim (e.g., detention, torture, and killing), or a single violation experienced by multiple victims (e.g., a bombing). This is illustrated in Figure 1, which shows an excerpt from a BBC (2018) report on the murder of Maltese journalist Daphne Caruana Galizia.
Applying the event-based approach, all the facets of this narrative statement can be represented, allowing for both the notion of a hierarchy of events (i.e., an event can contain sub-events) and the notion of chains of events (which can be causal and/or temporal). This can be achieved by categorising each incident (an overall scenario which involves one or more journalists) as an event which may contain further sub-events (e.g., torture during imprisonment) and which may have links to other sub-events (such as death resulting from the torture). In this way, multiple violations of a single person can be represented in a connected way, as well as the same event happening to multiple people. For example, the Caruana Galizia case could be represented as shown in Figure 2.
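To make the notions of event hierarchy and event chains concrete, the following is a minimal sketch in Python of how such a representation might be stored. The class name `Event`, its fields, and the helper functions `flatten` and `chain` are hypothetical and introduced here purely for illustration; they are not part of any existing monitoring tool. The example encodes only what the narrative on the Caruana Galizia case states: an intimidation event with two sub-events, followed in time by the murder and by judicial follow-up.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Event:
    """One violation or follow-up event (all field names are illustrative)."""
    label: str
    victims: List[str]
    date: Optional[str] = None  # None where no exact date is recorded
    sub_events: List["Event"] = field(default_factory=list)
    # Outgoing links as (relation, target); relation is "temporal" or "causal".
    links: List[Tuple[str, "Event"]] = field(default_factory=list)

    def add_sub_event(self, event: "Event") -> "Event":
        self.sub_events.append(event)
        return event

    def link_to(self, relation: str, target: "Event") -> None:
        if relation not in {"temporal", "causal"}:
            raise ValueError(f"unknown relation: {relation}")
        self.links.append((relation, target))


def flatten(event):
    """Yield an event followed by all of its nested sub-events, depth first."""
    yield event
    for sub in event.sub_events:
        yield from flatten(sub)


def chain(start):
    """Follow the first outgoing link of each event; return the labels in order."""
    labels, seen, current = [], set(), start
    while current is not None and id(current) not in seen:
        seen.add(id(current))  # guard against accidental cycles
        labels.append(current.label)
        current = current.links[0][1] if current.links else None
    return labels


victim = "Daphne Caruana Galizia"
intimidation = Event("intimidation", [victim])
intimidation.add_sub_event(Event("dog killing", [victim]))
intimidation.add_sub_event(Event("home burning attempts", [victim]))
murder = Event("murder (car bomb)", [victim])
follow_up = Event("judicial follow-up", [victim])

# Chronological, not necessarily causal, ordering.
intimidation.link_to("temporal", murder)
murder.link_to("temporal", follow_up)

print([e.label for e in flatten(intimidation)])
print(chain(intimidation))
```

Keeping the relation type on each link means that a temporal ordering can be recorded without asserting causation, which can be added later if an investigation establishes it.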
Figure 1. Excerpt from a BBC (2018) report on the murder of Daphne Caruana Galizia: "She was killed by a car bomb near her home in October. Her widely-read blog accused top politicians of corruption. One of her sons, Paul, said three pet dogs were killed and attempts were made to burn down the journalist's home."

Figure 2. Event-based representation of the case of Daphne Caruana Galizia, with an 'intimidation' event containing separate sub-events of 'dog killing' and 'home burning attempts,' followed chronologically (though not necessarily causally) by the 'murder' event (the car bomb), and subsequently by various judicial follow-up events. Note: Each of these events has a number of features attached to it (not depicted), such as the name of the victim, a date, and time.

Having illustrated how an events-based approach can facilitate the uncovering of deeper explanation and understanding of what happened and why it happened in a particular context, the article will now investigate how the events-based approach combined with an ontological classification scheme can address several problem areas identified in the initial data review.

How Can Developing an Events-Based Methodology Improve the Monitoring of Violations against Journalists?
As illustrated, the events-based approach provides a way to deal with the complex nature of a human rights violation and its recording, by putting the violation itself at the centre and allowing for its in-depth description. What might be considered a single violation (such as a killing) might upon closer examination be interrelated with several other events (such as various forms of intimidation in the case of Daphne Caruana Galizia). This is important in order to understand the progression of events: It is critical to know whether killings typically appear in isolation or as the final act in a series of violations gradually increasing in severity, and similarly to understand whether threats or more minor incidents gradually escalate into more serious ones. To deal with these types of relations between event types and for their categorisation, we use a shallow ontology as a form of hierarchical classification system. Gruber (1993, p. 199) originally defined an ontology as "an explicit specification of a conceptualization." In simple terms, an ontology can be considered as essentially a hierarchical structure with general categories at the top level, branching out in more specific subcategories at lower levels, as shown in Figure 3, where 'shooting' is a more specific subcategory of 'physical attack,' which is itself a subcategory of 'abuse.' There are several important things to note about the use of an ontology as a classification system. First, an ontology is typically a directed acyclic graph, not a tree, which means among other things that categories can be represented in multiple places simultaneously (multiple inheritance). For example, 'bombing' could be a subcategory of both 'murder' and 'collateral target.' Second, it enables information to be represented at varying levels of granularity. 
Some databases record quite broad categories (e.g., Mapping Media Freedom does not distinguish between arrest and imprisonment) while other information sources have more specific categories, making a clear distinction between those two things. This has the advantage that information can easily be aggregated in different ways, depending on the level of specificity required (e.g., one can look at all abuse as a single unit, or one can look specifically at all psychological abuse as a subset of this). In its simpler forms, ontological classification is compatible with a spreadsheet structure and can be used to create aggregated datasets that can then be semantically searched via potentially complex queries (see e.g., Maynard, Funk, & Lepori, 2017; Maynard, Roberts, Greenwood, Rout, & Bontcheva, 2017).
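The two properties discussed above, multiple inheritance and aggregation at varying levels of granularity, can be illustrated with a small Python sketch. The relations 'shooting' under 'physical attack' under 'abuse', and 'bombing' under both 'murder' and 'collateral target', follow the examples in the text; the placement of the remaining categories, the names `PARENTS`, `ancestors`, and `count_at`, and the sample records are assumptions made only for this illustration.

```python
from collections import Counter

# child -> list of parents. A category may have several parents, so the
# structure is a directed acyclic graph rather than a tree.
PARENTS = {
    "abuse": [],
    "physical attack": ["abuse"],
    "psychological abuse": ["abuse"],
    "shooting": ["physical attack"],
    "harassment": ["psychological abuse"],  # assumed placement
    "murder": ["physical attack"],          # assumed placement
    "collateral target": ["abuse"],         # assumed placement
    "bombing": ["murder", "collateral target"],
}


def ancestors(category):
    """Return every category reachable upward from `category` (excluding itself)."""
    seen = set()
    stack = list(PARENTS[category])
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(PARENTS[node])
    return seen


def count_at(targets, records):
    """Count records falling under each target category, at any depth."""
    counts = Counter()
    for category in records:
        lineage = ancestors(category) | {category}
        for target in targets:
            if target in lineage:
                counts[target] += 1
    return counts


# Hypothetical records, each tagged with its most specific known category.
records = ["shooting", "bombing", "harassment", "shooting"]

print(count_at(["abuse"], records))
print(count_at(["physical attack", "psychological abuse"], records))
```

Because the lineage of each record is a set, a 'bombing' record is counted once under 'abuse' even though it reaches that category along two paths.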

The Use of a Classification Hub as a Means to Merge Disparate Data Sources
To help mitigate these issues, we propose the adoption of an ontology as a central hub which enables the mapping of different categorisation schemes. We should note, however, that there is no real concept of a single correct ontology: as with the existing categorisation schemes used by the monitoring organisations, an ontology offers a subjective viewpoint. A good ontology is therefore one which adequately meets the needs of the situation and data. On the other hand, an ontology offers a flexible approach which solves the problem of noncommensurability by enabling mapping to existing categorisation and classification schemes.
As we see from Appendix 1 in the Supplementary File, different monitoring efforts may use different terms and classification systems. For example, one monitoring effort may consider online hate speech to be a particular kind of psychological threat, along with other verbal abuse, while another may consider it a particular kind of online threat along with doxxing, online censoring, etc. Similarly, one may use the term 'assassination' while another may use the term 'murder'; these may or may not represent the same set of events. Ideally, a standardised set of terms and schemes should be used by everyone, but a prescriptive strategy that dictates preferential terminology and classifications is simply impossible to enforce, and is highly problematic. Thus, we suggest a more flexible solution that allows monitoring organisations and researchers to enhance the existing data by mapping to an ontology-based solution.
As we have already mentioned, existing categorisation schemes for both killings and other acts of violence against journalists are insufficient for our purpose, because they are not comprehensive and because they differ widely, resulting in incommensurable data.

Figure 3. [Ontology excerpt; node labels: Harassment, Physical Attack, Torture, Sexual assault, Shooting, Surveillance, Psychological Abuse.]

Bearing in mind that we do not wish to impose a new subjective classification scheme, we turn instead to existing well-defined schemes from the fields of human rights and crime, namely HURIDOCS (Dueck, Guzman, & Verstappen, 2001) from the former (see Table 1) and the International Classification of Crime for Statistical Purposes (ICCS; UN Office on Drugs and Crime, 2015) from the latter (see Table 2). UNESCO has already started to investigate the ICCS in this respect (see guidelines on the metadata for indicator 16.10.1 provided by the OHCHR, 2018). We therefore propose there should be mapping to both schemes via an intermediary set of terms/classes, as shown in the Supplementary File. For example, currently there are no specific classes in either scheme for many of the kinds of violations we want to monitor, e.g., cyberbullying has only the general class (threat, harassment, psychological assault); exile has a specific class in HURIDOCS but not in ICCS (where it just falls under 'other deprivation of liberty') because it is not specifically a crime. The linking of existing classification systems to definitions of human rights violations such as HURIDOCS also helps to establish a link to the 16.10.1 category definitions. Crucially, such a link is lacking in current monitoring. Our approach therefore has the potential to embed 16.10.1 monitoring into the sustainable ongoing and institutional practice of official agencies such as HURIDOCS, assuming that it is possible to disaggregate victims in terms of their link to journalism.
A further benefit of adopting a semantic form of categorisation using an ontology-based classification system is that it enables representation at different levels of granularity and easy exchange between different datasets. As we have seen from the table in Appendix 1 in the Supplementary File, some schemes do not make subtle distinctions. Where this information is available (either through the existing scheme or through analysis of additional data on the event), our approach will enable us to make fine-grained distinctions; where it is not, we can simply assimilate data at a lesser granularity.
Table 1. HURIDOCS categories (excerpt; code followed by description).
(code not shown) Extra-judicial execution outside any legal proceedings
01010103 Legal execution (capital punishment)
01010104 Politically-motivated killing by non-state agent(s)
01010105 Murder (deliberate killing which ought to be seen as a common criminal act)

In Figure 4, we show a possible conceptual structure for mapping between existing categorisations, text, and databases. On the left we see information from a CPJ database. Blue boxes denote (existing) categories in the various schemes, with blue arrows connecting categories together, while red boxes denote instances of records. Thus, in the CPJ database we see an instance (Record 01) which is some text about the journalist Abay Hailu. In that database, the event has been categorised as 'Dangerous Assignment.' The horizontal blue arrow maps this category from the CPJ database to the HURIDOCS category 0101 (direct actions which violate the right to life) in the centre of the picture. This might seem an odd mapping, but HURIDOCS has no category that really fits 'Dangerous Assignment' specifically, so we map it at the highest level, which equates just to 'killing.' The HURIDOCS scheme depicts a couple of sub-classes of Category 01, such as Category 01010601. This can be linked with the textual description of Abay Hailu's death from CPJ, either manually or by automated Natural Language Processing tools, because both describe death in prison. On the right-hand side of the picture, we also show how one might link other schemes such as ICCS to the hub. Rather than linking directly from the CPJ record to ICCS, we simply link ICCS categories to HURIDOCS categories where possible, so that by extension, we can deduce the link from an instance of an event to the ICCS (and other classification schemes). This minimises the amount of work needed each time a new event is added.
So we can map ICCS Category 01 'Acts leading to death or intending to cause death' directly to HURIDOCS Category 0101, and we can link ICCS Category 0107 'Unlawful killing associated with armed conflict' directly to HURIDOCS Category 010103 'Killings in the context of conflict.' It is important to note that the scheme also enables the mapping of multiple sources of information together. Figure 4 shows the addition of the information from the free text description about Hailu's death, but we can add as many other sources as we want, such as additional news reports, information recorded in other databases, or even from social media. Discussion of this is beyond the scope of this report, but methods for information extraction and information mapping can be used to pull together the information into a single coherent representation. Finally, we touch briefly on the inter-related issue of information verification.
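The hub-style mapping described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionary and function names are hypothetical, the category links are the ones given in the text, and a HURIDOCS code is treated as a descendant of another code when it begins with that code.

```python
from __future__ import annotations

# Source-scheme categories are linked once to the HURIDOCS hub.
CPJ_TO_HURIDOCS = {"Dangerous Assignment": "0101"}

# Other schemes are also linked to the hub, never directly to records.
ICCS_TO_HURIDOCS = {
    "01": "0101",      # Acts leading to death or intending to cause death
    "0107": "010103",  # Unlawful killing associated with armed conflict
}


def iccs_for(huridocs_code: str) -> list[str]:
    """Deduce ICCS categories for an event from its HURIDOCS code.

    An ICCS category applies when its HURIDOCS counterpart is the event's
    code or an ancestor of it (assumed here to mean a code prefix).
    """
    return sorted(
        iccs
        for iccs, hub_code in ICCS_TO_HURIDOCS.items()
        if huridocs_code.startswith(hub_code)
    )


# A record enters the hub through its source category ...
record = {"id": "Record 01", "source": "CPJ", "category": "Dangerous Assignment"}
record["huridocs"] = CPJ_TO_HURIDOCS[record["category"]]

# ... and can be refined later, for example after analysis of its free text.
refined_code = "01010601"

iccs_initial = iccs_for(record["huridocs"])
iccs_refined = iccs_for(refined_code)
iccs_conflict = iccs_for("010103")
```

Because each scheme is linked to the hub only once, adding a new event requires a single link from the record to a HURIDOCS code, and the links to other schemes follow by lookup.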

Verification of Information
Indicator 16.10.1 also specifies that cases must be verified. This means that reported cases should contain a minimum set of relevant information on particular people and incidents, which have been reviewed by mandated bodies, mechanisms, and institutions, who in turn have found reasonable grounds to believe that a violation took place. One of the most critical problems in the monitoring of data on killings, and on other forms of violence against journalists, concerns the validity and reliability of this data. Many factors can affect the counts of violations and thus confuse the data, such as differences in what is counted. For example, the CPJ only considers cases where a direct link to journalism is proven, while others, such as RSF, also count prima facie links and unproven cases.
The reporting of killings and other events may be inaccurate due to deliberate disinformation, such as adjusting the numbers of harmed journalists, not reporting that a journalist was harmed, or falsely reporting that a journalist was not harmed. It may also simply be misinformation due to rumour, uncertainty or confusion (such as using different names for the same person), or due to differences in definitions and data collection methodologies (see for instance IFEX, 2011). Enormous research effort has recently been put into developing methods to recognise and categorise various forms of false information in news reports and social media (del Vicario et al., 2016; Kim, Tabibian, Oh, Schölkopf, & Gomez-Rodriguez, 2018; Tucker et al., 2018), and there are a number of research projects addressing this issue, such as WeVerify. Investigations into fake news and false information have also been undertaken by both the UK government (House of Commons, 2018) and the European Parliament (2019).
In this research, we focus on methods to deal with inaccurate or incorrect information. We propose to address the notion of information verification in our monitoring approach by developing a range of mechanisms for automatically assessing the likelihood of correctness. For this we can consider features such as the number of sources reporting the event, the nature of these sources (some sources are known to be more reliable than others), the similarity between the reports, and the nature of this similarity. We recommend including measures of: number of sources; type of sources (news, social media, eyewitness reports, etc.); reliability of the sources (a number of initiatives are focusing on this, such as the Global Disinformation Index, 2019; the Journalism Trust Initiative, 2018; and Media Bias Fact Check, 2019); and content reliability (for instance, a number of tools are being developed currently for verification of news, debunking, and fact-checking).
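One simple way to combine the number, type, and reliability of sources into a single likelihood is a noisy-OR combination, sketched below in Python. This technique is our illustrative choice and not a method specified in this article; it assumes that sources report independently, and the reliability values, class name, and function name are all hypothetical placeholders.

```python
from dataclasses import dataclass
from math import prod

# Hypothetical prior reliability per source type (placeholder values only).
TYPE_RELIABILITY = {"news": 0.7, "eyewitness": 0.5, "social_media": 0.3}
DEFAULT_RELIABILITY = 0.1  # assumed value for unknown source types


@dataclass(frozen=True)
class Report:
    source: str       # name of the outlet, organisation, or account
    source_type: str  # e.g. "news", "eyewitness", "social_media"


def likelihood_of_correctness(reports: list) -> float:
    """Noisy-OR estimate that an event occurred, given its reports.

    Each distinct source is counted once, so repeated reports from the same
    source do not inflate the score. Sources are assumed to be independent.
    """
    reliability_by_source = {
        r.source: TYPE_RELIABILITY.get(r.source_type, DEFAULT_RELIABILITY)
        for r in reports
    }
    return 1.0 - prod(1.0 - p for p in reliability_by_source.values())


single = likelihood_of_correctness([Report("Outlet A", "news")])
combined = likelihood_of_correctness(
    [
        Report("Outlet A", "news"),
        Report("Outlet A", "news"),          # duplicate source, counted once
        Report("Account B", "social_media"),
    ]
)
```

The independence assumption is the weak point of this sketch: reports that are near copies of one another should not count as separate confirmations, which is why the similarity between reports would need to be taken into account in any real tool.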
Finally, when the information in two or more sources conflicts, their reliability is inherently questionable, and this can be an additional factor to consider. In order to determine whether two records of an event can be matched or merged, we can consider each feature's importance (see Postma, Ilievski, & Vossen, 2018).
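To make the idea of feature importance concrete, the following Python sketch scores whether two records might describe the same event by weighting the features on which they agree. The feature names, weights, and function name are hypothetical, and exact equality stands in for whatever comparison a real tool would use (for example, fuzzy name matching).

```python
# Hypothetical feature weights expressing relative importance.
FEATURE_WEIGHTS = {"name": 0.5, "date": 0.3, "location": 0.2}


def match_score(record_a: dict, record_b: dict, weights: dict = FEATURE_WEIGHTS) -> float:
    """Weighted share of agreeing features, over features present in both records."""
    shared = [
        feature
        for feature in weights
        if record_a.get(feature) is not None and record_b.get(feature) is not None
    ]
    total = sum(weights[feature] for feature in shared)
    if total == 0:
        return 0.0  # nothing to compare, so no evidence of a match
    agreeing = sum(
        weights[feature] for feature in shared if record_a[feature] == record_b[feature]
    )
    return agreeing / total


# Hypothetical records from two different sources.
record_a = {"name": "Journalist A", "date": "2018-01-01", "location": "Town X"}
record_b = {"name": "Journalist A", "date": "2018-01-01", "location": "Town Y"}
record_c = {"name": "Journalist A", "date": None, "location": None}

score_ab = match_score(record_a, record_b)  # agree on name and date only
score_ac = match_score(record_a, record_c)  # only the name can be compared
```

A threshold on such a score could flag candidate merges for human review, while disagreement on a heavily weighted feature would signal a conflict between sources.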

Conclusions
In response to the current limitations with data that is being gathered on violations against journalists on the national, regional, and international levels, and the range of challenges in monitoring the 16.10.1 indicators, this article has suggested that an events-based methodology adopting an ontological classification scheme provides a new means to map disparate data sources relating to attacks on journalists. Such an approach represents a way forward in improving our understanding of the manifestations of violations against journalists as it captures the real-world complexity of these violations, while simultaneously making it possible to adhere to existing norms and schemes without trying to impose unwanted restraints on those who collect information in the field (often under adverse conditions) and organisations who maintain records of violations for monitoring purposes. We therefore propose to realise this events-based approach by means of methods and tools that aim to strengthen ongoing monitoring efforts by facilitating processes to generate, categorise, and systematise data on a wide range of violation types. This article provides a starting point and roadmap for envisioning and designing prototype tools and associated methodologies that we ultimately hope will contribute towards building a comprehensive evidence base to understand how and why violations occur in more depth, while also contributing towards addressing and redressing problems of safety in a more efficient way. Through this approach, there is no requirement for any current monitoring efforts to modify their practices, but rather, we propose there could be enhancement of their data through the use of text analytics and more complex classification and mapping schemes. This data enhancement applies equally to individual local monitoring efforts and to global, more encompassing schemes.
Nevertheless, there are a number of challenges and assumptions to be considered. First, an events-based methodology is a relatively fundamental change in thinking, which may not appeal to all. Second, local monitoring organisations must be open to ideas about collaborative working practices to improve monitoring efforts, which may involve sharing of data. Third, even if tools are provided, there is no guarantee that they will be used by relevant stakeholders. While the approach that we propose is only meant to serve to enhance existing information, it does require additional effort to use and understand. Related to this, it is important to understand that the use of automated tools is not without risk, particularly if not taken in its proper context. Natural language processing is certainly not infallible, and mistakes will be made by automated tools. Thus, there is an important element of caveat emptor. The same applies to verification tools, which again should only be used as a guide and not a solution; for example, a risk of inadvertent exclusion and inclusion applies if tools/accreditation are implemented and become de facto statements of trust across diverse information sources.
Moving forward, we see two key avenues to pursue. First, improved monitoring is required: Based on the needs and priorities of the community of monitoring organisations and/or individual or groups of monitoring civil society organisations, tools should be developed to address issues of data generation, categorisation and systematising, both for the systematic monitoring of 16.10.1, and for strengthening the monitoring capacity of local civil society organisations. Second, improved research and analysis of violations against journalists is required, addressing the need for data tools that can facilitate the comprehensive analysis of shifting safety trends for the purpose of better understanding the nature and dynamics of safety threats.