By Abed Khooli
This short article covers three connected topics of equal importance. Rather than cover each in detail, I decided to address them together to underline how they relate and to stress the importance of using them in the right context. In short, development is the final goal, data journalism is the process and open data is a major ingredient.
Despite its recent rise to the surface, data journalism is not new. One can trace it back to cave drawings in the early ages (for example, how many deer we hunted), but it is practical to say that the precursors of data journalism predated the computer age. According to Simon Rogers (data journalist, now with Google), the Guardian had elements of data journalism as early as 1821. However, it was in the second half of the 20th century that computer-assisted reporting (CAR) and data journalism started to take steady steps towards their current shape. Whether it was for predicting American election results or working on investigative journalism, the availability of affordable computational resources (storage, computing and communication) made it possible and practical for journalists to acquire, analyze, visualize and publish data stories.
So, data journalism is basically turning raw data (numbers, texts, images, voice and video) into attractive and user-friendly stories. This is usually done by collecting raw data, cleaning data sets, performing statistical analysis and modeling, and visualizing the results (charts, timelines, maps and infographics). Some outputs may even turn into news applications, like those of ProPublica, FiveThirtyEight (US elections) and Liveuamap (open data map of conflicts). Summarizing heaps of data in a few numbers and a graph, or finding correlations and discovering stories (by ‘interviewing’ data), reveals insights and helps readers and viewers understand the news and make decisions based on real evidence.
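As a rough illustration of the collect, clean, analyze and visualize steps described above, here is a minimal Python sketch that uses only the standard library. The data set, the column names and the helper functions (`load_rows`, `clean_rows`, `summarize`, `text_bars`) are all invented for this example; a real newsroom workflow would typically start from a published data file and use richer charting tools.

```python
import csv
import io
import statistics

# Hypothetical raw data: the figures are invented purely for illustration.
RAW_CSV = """district,year,unemployment_rate
North,2015,12.5
South,2015, 18.0
East,2015,n/a
West,2015,25.5
North,2015,12.5
"""


def load_rows(text):
    """Collect: parse CSV text into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def clean_rows(rows):
    """Clean: skip non-numeric rates and repeated (district, year) pairs."""
    seen = set()
    cleaned = []
    for row in rows:
        key = (row["district"].strip(), row["year"].strip())
        try:
            rate = float(row["unemployment_rate"])
        except ValueError:
            continue  # e.g. "n/a": a value to ask the source about
        if key in seen:
            continue
        seen.add(key)
        cleaned.append({"district": key[0], "year": int(key[1]), "rate": rate})
    return cleaned


def summarize(rows):
    """Analyze: reduce the data set to a few headline numbers."""
    rates = [r["rate"] for r in rows]
    highest = max(rows, key=lambda r: r["rate"])
    return {
        "mean": statistics.mean(rates),
        "median": statistics.median(rates),
        "highest": highest["district"],
    }


def text_bars(rows):
    """Visualize: one '#' per whole percentage point, as a plain-text chart."""
    return [f"{r['district']:<6} {'#' * int(r['rate'])} {r['rate']}" for r in rows]


if __name__ == "__main__":
    data = clean_rows(load_rows(RAW_CSV))
    print(summarize(data))
    print("\n".join(text_bars(data)))
```

Note that the cleaning step silently drops the row with a missing value. In practice a journalist would want to record such gaps and follow up on them, since a missing figure can itself be part of the story.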
Journalists are usually good at extracting information from various sources. Data collection is not very different, but unlike information, data is stored differently and it takes some extra effort to acquire and manipulate. Since most of the raw data that matters to the community is gathered or generated by government agencies, the value of open data becomes obvious. Open data must be free to use, accessible, complete, timely and usable. As such, data journalism is not only a business model in itself, but it also helps build data literacy and awareness of data-driven innovation on the road to entrepreneurship and economic development.
Data verification and data ethics are at the core of data journalism work. Although the numbers don’t lie, sources may, either intentionally or as a result of human error. Data is not taken at face value, and a set of verification tools and techniques is applied during the data journalism process. As a valuable asset, data is also handled with care, which is where data ethics comes in. Data protection and personal privacy are two basic foundations of data journalism. This applies to open data as well (no personal data is released to the public).
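To make the idea of verification a little more concrete, the sketch below shows two simple sanity checks a journalist might run before trusting a data set: that percentages fall within a plausible range, and that the parts add up to the total the source reports. The function name `verify_records`, the field names and the sample figures are assumptions made up for this example, not a standard tool.

```python
def verify_records(records, reported_total, tolerance=0.01):
    """Return a list of human-readable problems found in the records.

    Each record is assumed to be a dict with a numeric 'count' and an
    optional 'share_percent' (hypothetical field names).
    """
    problems = []
    for i, rec in enumerate(records):
        share = rec.get("share_percent")
        if share is None:
            problems.append(f"row {i}: missing share_percent")
        elif not 0 <= share <= 100:
            problems.append(f"row {i}: share_percent {share} outside 0-100")

    # Cross-check: do the parts add up to the total the source reports?
    total = sum(rec["count"] for rec in records)
    if abs(total - reported_total) > tolerance:
        problems.append(
            f"counts sum to {total}, but source reports {reported_total}"
        )
    return problems


if __name__ == "__main__":
    # Invented sample: the second share is likely a typo for 60.0.
    sample = [
        {"count": 40, "share_percent": 40.0},
        {"count": 60, "share_percent": 160.0},
    ]
    for problem in verify_records(sample, reported_total=110):
        print(problem)
```

Checks like these do not prove that a data set is correct. They only flag entries that deserve a question to the source before publication.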
In the Middle East and North Africa (MENA), data journalism and open data are new to most Arab countries. Awareness and capacity building are essential to transform traditional journalism and encourage governments to adopt and execute true open data policies. As part of the IDRC project on data-driven innovation, the Center for Continuing Education at Birzeit University is implementing the first data journalism training in Palestine in October 2016. Unlike the upcoming data science training, the data journalism training will be offered in Arabic and will cover the standard process in addition to data verification, data ethics and the business case. We look forward to seeing some impact by next year.