Big Data in Education: Researchers’ Responsibilities

By Bridget Thomas (@DrBridgeQIP)


Big data’s growing influence has touched many areas of our lives, but it has also raised questions and concerns, particularly among education researchers.

Big data allows researchers to uncover patterns in data that might otherwise be invisible. This has led to several powerful advances, such as better treatments for disease, improvements in agriculture, and more timely and effective responses to natural disasters. The benefits of big data have even been highlighted in popular media, such as in the movie Moneyball, which dramatizes how the pioneering use of large datasets helped a general manager assemble a winning baseball team.

But the rise of big data has also prompted many to note its potential negative consequences. Within education, researchers have identified not only benefits to using big data, but also legitimate concerns. As they do with all data, education researchers have a responsibility to focus both on the integrity of their research using big data and on clear communication about this research to the public. Further, their communication with the public should focus not just on the research itself and its useful possibilities, but also on the precautions they are taking to ensure that the rise of big data does not negatively affect the education community.

Big data, defined

The term “big data” refers to very large and complex datasets—those datasets that have been described as “defying traditional data-processing applications” (National Academy of Education, 2017). Modern technologies allow us to capture information in previously unforeseen ways and transform it into digital data. This has resulted in datasets that are much larger and more complicated than anything seen before. From a research standpoint, big data changes data collection from an often lengthy and painstaking process to one that can happen nearly automatically, given the right connections to sources.

Big data in education: improving teaching and learning

Big data in education tends to fall into two major categories: administrative data and learning process data. Combining digital data from these two areas in innovative ways can allow researchers to identify patterns or correlations that may otherwise go unnoticed.

  • Administrative data can be demographic, behavioral, and achievement data and may include items such as attendance records, transcripts, and test scores.
  • Learning process data are continuous records of students’ behaviors and may include online assessments, keystrokes, or time latencies (e.g., the time it takes a student to respond to a question).

Innovative data analyses can lead to useful solutions to problems in schools and classrooms, uncover potential inequities in learning opportunities, and zero in on students’ needs in ways that reveal how to personalize learning more effectively. The overarching goal of this data collection and analysis is to expand possibilities for teaching and learning—including how to meet individual students’ needs.

Big data in education: legitimate concerns

While education researchers recognize that big data has many exciting possibilities, they have also identified potential problems with its use or misuse. These concerns tend to fall into three main categories: misinterpretation, inappropriate use, and data privacy and security.

  • Misinterpretation concerns center on the possibility that studies using big data may be misunderstood by readers, especially if the studies are distilled or simplified before reaching the public, and that these misinterpretations could lead to poorly informed decision-making.
  • Inappropriate use concerns suggest that the public nature and accessibility of some big data may lead to people using the data in ways that were not intended and that defy accepted research standards.
  • Data privacy and security concerns stem from the possibility that individuals’ personal information may not be properly protected, which could lead to data breaches or other inadvertent disclosures of private information.

As the education field continues to move toward greater use of big data, each of these issues should be specifically and consistently addressed. This can be accomplished through strong data governance, research standards, and other precautionary measures.

Researchers’ responsibilities: communication with the public

Education researchers must think not just about the research on big data, but also about how the public is receiving and reacting to this research. Public discussion of big data is frequently negative and inaccurate. Unlike the measured considerations of big data presented in academic articles, much of the communication about education-related big data to the public has encouraged skepticism and fear. It is not surprising that many parents and other stakeholders have developed negative views, given the frequent headlines that tout the “big dangers” of big data. The public less frequently encounters news that describes the potentially positive aspects of this education information or the clear standards that are in place to protect the privacy of personal information.

At the same time, researchers should work to ensure that members of the education community understand the legitimate concerns about big data and what we can all do to avoid or mitigate problems that may arise from misinterpretation, inappropriate use, and data privacy and security issues. Walking the fine line between explaining the intricacies of this difficult topic and communicating concisely and clearly is something education researchers must strive to master.

Big data is indeed a problem if it is used ineffectively, inappropriately, or by individuals who lack the requisite understanding of the subject’s complexities. But that is true of all research data. Data, in various forms, can reveal that something has happened, that a phenomenon exists, or that variables appear to have a relationship, but data cannot on their own reveal why. It is the responsibility of researchers, especially those in the public sphere, to provide the lenses that make research relevant and comprehensible to varied audiences, from parents and teachers to administrators and elected officials.

It is important for education researchers to make clear that they are using the same stringent research standards for big data analysis that they have adhered to with previous types of data. Additionally, they must communicate to the public that they are regularly discussing the potential hazards of big data and routinely updating methodologies and security protocols as projects and analyses become increasingly complex. The clearest path to public trust in the research process is via straightforward and detailed communication.