Open Access Journal

ISSN: 2183-2439

Article | Open Access

The News Crawler: A Big Data Approach to Local Information Ecosystems

Abstract:  In the past 20 years, Silicon Valley’s platforms and opaque algorithms have increasingly influenced civic discourse, helping Facebook, Twitter, and others extract and consolidate the revenues generated. That trend has reduced the profitability of local news organizations, but not the importance of locally created news reporting in residents’ day-to-day lives. The disruption of the economics and distribution of news has reduced, scattered, and diversified local news sources (digital-first newspapers, digital-only newsrooms, and television and radio broadcasters publishing online), making it difficult to inventory and understand the information health of communities, individually and in aggregate. Analysis of this national trend is often based on the geolocation of known news outlets as a proxy for community coverage. This measure does not accurately estimate the quality, scale, or diversity of topics provided to the community. This project is developing a scalable, semi-automated approach to describe digital news content along journalism-quality-focused standards. We propose identifying representative corpora and applying machine learning and natural language processing to estimate the extent to which news articles engage in multiple journalistic dimensions, including geographic relevancy, critical information needs, and equity of coverage.

Keywords:  critical information needs; information ecosystem; local news; machine learning; news deserts; United States



© Asma Khanom, Damon Kiesow, Matt Zdun, Chi-Ren Shyu. This is an open access article distributed under the terms of the Creative Commons Attribution 4.0 license, which permits any use, distribution, and reproduction of the work without further permission provided the original author(s) and source are credited.