What is Big Data?
We live in the age of data, and every moment of each day we are generating more. As I write this, a myriad of simple data points are being created. The three Vs of big data are really the simplest way to understand what big data actually is. The three Vs are Volume, Velocity and Variety. Volume is the most obvious: as its name implies, big data effectively means that there is a lot of it. The data is ‘low-intensity’ and is frequently unstructured, but there is a great volume of it. Velocity refers to the rate at which the data moves through the social system to which it pertains. That is the speed at which it is received, and the speed at which its receipt prompts a response. This is the terrain of concepts such as ‘real-time’ data. It is this velocity of data we are working with in making real-time, evidence-based decisions. Finally, Variety refers to the many types and sources of data. Indeed, one of the greatest potential contributions of data science to development and evaluation practice is new information.
Big data is not new, but the realisation of its value, together with the growing velocity and variety brought by the internet of things, means that the concept is rising in prominence.
How can this be used for Social Good?
At first glance, anything associated with the internet of things hardly seems to have a place in evaluation practice. How is remotely changing the temperature of an affluent person’s fridge relevant? It seems this could not be further across the digital divide.
Not so: big data is not only about consumer use data or Facebook interactions. Across the globe, more people now have access to a cell phone than to potable water. Indeed, some of the most exciting innovations in development practice involve the use of new data tools and predictive models in low-income contexts. As long as the processes are intentionally equity-oriented, there are a number of technical solutions to merge data science and evaluation practice for greater social good.
Big data, with its reach and frequency, can actually be far more representative of the realities of those living in low-income contexts than other data and information. Data sets are available across generations, allowing for comparisons across time on a far broader scale than conventionally available to evaluators and researchers. Qualitative and behavioural data for target populations is more accessible, and potentially a closer reflection of the unbiased truth.
One of the most interesting differentiators between big data and the data which evaluators would conventionally collect is the atomic and ‘non-reactive’ nature of big data. Big data by definition is collected digitally, and this takes out a lot of the bias of collecting data within a development context. There are pros and cons to this, as instead of asking for specific pieces of information, researchers will be finding proxies in the data for the things they hope to measure. Nonetheless, these new information sources are a compelling addition, bringing new objectivity to the evaluation space.
How can Big Data and digital tools contribute to the M&E space?
The question is really about how evaluators can use new big data sources, as well as the new tools arising along with the need to analyse large datasets. In a recent paper by York and Bamberger, the authors outline some applicable designs and concepts in the big data space which could be particularly useful to evaluators. Big data has given rise to new ways of looking at individual and community behaviour. With big data, we can better understand our climate and geography, as well as things such as animal migration patterns.
There are a number of potential benefits to evaluation practice from using new tools associated with data science. Using new methods to collect and hold data can bring down the cost of data collection and organisation. These tools can also provide more time with the data as collection is more efficient, and the data is more malleable. Evaluators can easily test assumptions and try different views to better understand the texture of the data and to allow findings to emerge. Data science allows for new levels of automation of analysis using AI, and data platforms and tools allow for this to happen in real time, so data can be analysed and visualised the moment it is collected (provided it has been verified!). This means that evaluators can turn their evaluation questions and findings into real working tools to improve development processes.
Big data speaks directly to a number of the things which prevent evaluation practitioners from conducting their ideal analysis. Big data can be available for whole populations at relatively low cost, and this can be aggregated, or available right down to the individual level. The possibilities for creating comparison groups are endless.
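To make the idea of building comparison groups from population-level data a little more concrete, here is a minimal sketch of one common approach: pairing each programme participant with the most similar non-participant. The records, the field name monthly_spend and the function match_comparison_group are all hypothetical, invented for illustration, and a real evaluation would match on many characteristics rather than one.

```python
def match_comparison_group(participants, pool, key):
    """Pair each participant with the closest unused record from the pool.

    Greedy one-to-one nearest-neighbour matching on a single covariate.
    This is an illustrative simplification, not a full matching design.
    """
    available = list(pool)  # copy so the caller's list is not modified
    matches = []
    for person in participants:
        if not available:
            break  # pool exhausted; remaining participants stay unmatched
        closest = min(available, key=lambda rec: abs(rec[key] - person[key]))
        available.remove(closest)
        matches.append((person, closest))
    return matches


# Hypothetical records: monthly mobile spend used as a proxy covariate.
participants = [
    {"id": "p1", "monthly_spend": 12.0},
    {"id": "p2", "monthly_spend": 30.0},
]
pool = [
    {"id": "c1", "monthly_spend": 11.0},
    {"id": "c2", "monthly_spend": 50.0},
    {"id": "c3", "monthly_spend": 28.5},
    {"id": "c4", "monthly_spend": 14.0},
]

matches = match_comparison_group(participants, pool, "monthly_spend")
matched_ids = [(p["id"], c["id"]) for p, c in matches]
print(matched_ids)
```

With whole-population data, the pool of candidate matches is far larger than in a conventional survey, which is what makes close matches more likely. The quality of the comparison still depends on choosing covariates that genuinely relate to the outcome being evaluated.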
Challenges and Bottlenecks
There are a number of issues with using this data responsibly and optimally. Data, apart from being an information source, is also a great asset, and the information is frequently ‘owned’ by the owners of the software or application.
The data is also frequently messy and incomplete, making it difficult to use.
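A first practical step with a messy dataset is often simply to measure how incomplete it is. The sketch below assumes records arrive as a list of dictionaries and treats a missing key, None or an empty string as "not usable". The field names and the function completeness_by_field are hypothetical, chosen only for illustration.

```python
def completeness_by_field(records, fields):
    """Return the share of records with a usable value for each field."""
    total = len(records)
    report = {}
    for field in fields:
        # A missing key, None, or an empty string all count as unusable.
        filled = sum(1 for rec in records if rec.get(field) not in (None, ""))
        report[field] = filled / total if total else 0.0
    return report


# Hypothetical transaction records with gaps of different kinds.
records = [
    {"user": "a", "district": "North", "amount": 5.0},
    {"user": "b", "district": "", "amount": 7.5},
    {"user": "c", "district": None, "amount": None},
    {"user": "d", "amount": 2.0},
]

report = completeness_by_field(records, ["user", "district", "amount"])
print(report)
```

A profile like this will not fix the data, but it shows which fields are reliable enough to analyse and which would need cleaning or a different source.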
Finally, there are good reasons to be sceptical about the authenticity of the data as a reflection of true and natural behaviour, which applies to human-generated big data. All digital tools through which data is collected are actually designed to generate certain behaviours. Thus, this data may in fact be the opposite of unbiased, and may be biased entirely in the direction of the system which created it!
Contexts and Applications
Big data already has multiple potential and current applications. Analysis of social media, along with satellite images, can provide early warning in the case of natural disasters. GPS mapping and crowdsourcing can provide assistance for emergency relief programmes. COVID-19 provided excellent examples of how the big data universe can be used to share important information and to keep people aware. Records of transactions, phone use, social media analysis and satellite imaging can assist in better understanding the behavioural patterns of low-income groups.
Next steps for Evaluators
There is a compelling argument that, for the worlds of big data and developmental evaluation to converge and so provide a whole new level of insight in development practice, development practitioners and evaluators need to work together to create the demand and institutional infrastructure for big data to be used to benefit their work.
Alongside these, new policies regarding data ownership and data use are required to ensure that data is available for the public good, and that no adverse outcomes arise from newly collected data being sold to parties with less than noble intentions.