Journalists rely on their credibility. One of the best ways to maintain credibility is to use data to back up claims. In this lecture, we will learn how to evaluate data for journalism.
Data Journalism Handbook
- Official data portals. The government’s willingness to release a given dataset will vary from country to country. A growing number of countries are launching data portals (inspired by the U.S.’s data.gov and the U.K.’s data.gov.uk) to promote the civic and commercial re-use of government information. An up-to-date, global index of such sites can be found at datacatalogs.org. Another handy site is the Guardian World Government Data, a metasearch engine that includes many international government data catalogs.
- The Data Hub. A community-driven resource run by the Open Knowledge Foundation that makes it easy to find, share and reuse openly available sources of data, especially in ways that are machine-automated.
- ScraperWiki. An online tool to make the process of extracting “useful bits of data easier so they can be reused in other apps, or rummaged through by journalists and researchers.” Most of the scrapers and their databases are public and can be re-used.
- The World Bank and United Nations data portals provide high-level indicators for all countries, often for many years in the past.
- A number of startups are emerging that aim to build communities around data sharing and resale. These include Buzzdata, a place to share and collaborate on private and public datasets, and data shops such as Infochimps and DataMarket.
- DataCouch. A place to upload, refine, share and visualize your data.
- An interesting Google subsidiary, Freebase, provides “an entity graph of people, places and things, built by a community that loves open data.”
- Research data. There are numerous national and disciplinary aggregators of research data, such as the UK Data Archive. While there will be lots of data that is free at the point of access, there will also be much data that requires a subscription, or which cannot be reused or redistributed without asking permission first.
- Tableau has some good online tutorials here.
- Check out more data visualization tools here.
Doing Journalism with Data: First Steps, Skills and Tools
Check out this free online data journalism course.
For more free courses on data journalism, click here.
Racial Patterns In Police Data
The Poynter Institute hosted a free webinar on June 19, 2017, called “Law, Order and Algorithms: Understanding Racial Patterns in Police Data.” You can watch the webinar here. Poynter senior faculty member Al Tompkins led the webinar, which covered:
- How to access the data
- How to interpret and analyze the data
- Key terms when looking at race and policing, for example, understanding the differences between disparate impact and discrimination
- Story ideas from the data
Before the webinar began, Tompkins used Facebook to share an article called “Black, Latino drivers more likely to be cited and arrested.” I highly encourage you to friend Tompkins on Facebook here.
The webinar walked journalists through the Stanford Open Policing Project. Cheryl Phillips and Sharad Goel discussed the records they analyzed, covering more than 60 million state patrol stops in 20 states between 2011 and 2015. The webinar evaluated the conclusions reporters can draw from the statistics and warned about conclusions that journalists should be cautious about drawing. Though data shows what happened, it doesn’t always explain why.
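To make "interpreting the data" concrete, here is a minimal sketch of how a reporter might compare outcome rates across groups of drivers. The records, the field names (`group`, `outcome`) and the helper `outcome_rate` are all invented for illustration; they are not the Stanford Open Policing Project's data or schema.

```python
from collections import Counter

# Toy records made up for this example. They are NOT real stop data,
# and the field names are hypothetical.
stops = [
    {"group": "A", "outcome": "citation"},
    {"group": "A", "outcome": "warning"},
    {"group": "A", "outcome": "warning"},
    {"group": "A", "outcome": "citation"},
    {"group": "B", "outcome": "citation"},
    {"group": "B", "outcome": "citation"},
    {"group": "B", "outcome": "citation"},
    {"group": "B", "outcome": "warning"},
]


def outcome_rate(records, outcome):
    """Return the share of stops in each group that ended in `outcome`."""
    totals = Counter(r["group"] for r in records)
    hits = Counter(r["group"] for r in records if r["outcome"] == outcome)
    return {group: hits[group] / totals[group] for group in totals}


rates = outcome_rate(stops, "citation")
for group, rate in sorted(rates.items()):
    print(f"Group {group}: {rate:.0%} of stops ended in a citation")
```

A gap between the two rates describes a disparity in outcomes. On its own it does not show why the gap exists, which is the kind of conclusion the webinar cautioned reporters about drawing too quickly.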
Polls & Random Sampling
The Pew Research Center’s website includes a valuable section called Fact Tank: News in the Numbers. Read their article “How can a survey of 1,000 people tell you what the whole U.S. thinks?” and watch this video about random sampling.
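As a rough illustration of why a sample of about 1,000 people can be informative, the sketch below computes the textbook 95% margin of error for a simple random sample. It assumes a truly random sample and the worst-case proportion of 50%; real polls use weighting and other adjustments that this formula ignores, and the function name `margin_of_error` is my own.

```python
import math


def margin_of_error(sample_size, proportion=0.5, z=1.96):
    """Approximate 95% margin of error for a simple random sample.

    Assumes a simple random sample; z=1.96 corresponds to 95% confidence,
    and proportion=0.5 gives the widest (most cautious) margin.
    """
    return z * math.sqrt(proportion * (1 - proportion) / sample_size)


for n in (100, 1000, 10000):
    points = margin_of_error(n) * 100
    print(f"n={n}: about +/-{points:.1f} percentage points")
# n=100: about +/-9.8 percentage points
# n=1000: about +/-3.1 percentage points
# n=10000: about +/-1.0 percentage points
```

Going from 100 to 1,000 respondents shrinks the margin considerably, while going from 1,000 to 10,000 buys much less, which helps explain why many national polls settle on samples of roughly 1,000.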
When Polls Are Wrong
A frighteningly common problem in data journalism is that writers misuse statistics and inadvertently spread misinformation. Data can be a wonderful storytelling tool, but you must understand the data before you select parts of it and write a concise narrative or headline.
Sanne Blauw is a journalist with a PhD in econometrics. Watch her TED Talk for examples of data journalism failures and how to avoid them.