
Sambodhi

Overlaps in Data Transparency and Data Privacy 

Posted by: Aishwarya Bhatia
Category: Analytics and Visualization

For someone who is not an expert in research and data analysis, understanding the nuances of data transparency and data protection can be complicated. Transparency and privacy are usually understood as distinct ideas; within data analytics, however, their definitions and functions overlap. Together they create a middle ground in which both are equally important for producing reliable and accurate data that is collected ethically and sustainably.

In today’s world, where policymaking is at its best when backed by credible data, the way that data is collected and handled is crucial. Let us, then, map the data collection process to understand the interplay of transparency and privacy.

Data transparency is crucial at every stage of the data collection process. Before a survey goes to the field, it must be approved by an Institutional Review Board (IRB), a body that reviews and approves (or disapproves) any research study involving human beings. Because it is responsible for protecting the welfare, rights, and privacy of participants, an IRB requires transparent information about the research in order to approve, disapprove, monitor, or request modifications to the research activities that fall within its jurisdiction.

Once a project secures IRB clearance, participants are informed about the purpose of the study when the survey is conducted. This is done through a consent form, which plays a key role in safeguarding the respondents’ interests.

Any information that reveals a respondent’s identity, such as name, address, contact number, or email ID, is removed or masked in the database to safeguard their privacy. This process is called deidentification and is carried out before the data is shared with other project stakeholders. During this process, each respondent is assigned a unique ID, which becomes their identifier for all future purposes.
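The deidentification step described above can be sketched in a few lines of Python. This is only an illustrative sketch, not a description of any agency's actual pipeline: the field names, the `deidentify` function, and the idea of keeping a separate ID key are all assumptions made for the example.

```python
import uuid

# Hypothetical list of direct identifiers; the real list depends on the study.
IDENTIFYING_FIELDS = {"name", "address", "contact_number", "email"}


def deidentify(records):
    """Split survey records into shareable data and a private ID key.

    Returns (shareable, id_key):
      shareable: copies of the records without identifying fields,
                 each with a random respondent_id added.
      id_key:    maps respondent_id to the removed identifiers. Keeping
                 such a key is an assumption of this sketch; if kept, it
                 would be stored separately and never shared.
    """
    shareable, id_key = [], {}
    for record in records:
        respondent_id = uuid.uuid4().hex  # random, not derived from identity
        id_key[respondent_id] = {
            k: v for k, v in record.items() if k in IDENTIFYING_FIELDS
        }
        cleaned = {
            k: v for k, v in record.items() if k not in IDENTIFYING_FIELDS
        }
        cleaned["respondent_id"] = respondent_id
        shareable.append(cleaned)
    return shareable, id_key


# Made-up example records.
survey = [
    {"name": "Respondent A", "email": "a@example.org", "age": 34, "district": "North"},
    {"name": "Respondent B", "email": "b@example.org", "age": 41, "district": "South"},
]
shared, key = deidentify(survey)
```

Only `shared` would go to other stakeholders. Note that removing direct identifiers is a first step; combinations of the remaining fields can sometimes still point to a person, so what counts as identifying has to be judged per study.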

But first, why should data be transparent?  

For data to be credible, transparency must operate at multiple levels. It covers:

  • information about survey methodologies disclosed to IRB before the survey begins, 
  • information disclosed to the respondents, in the form of the consent form with elements such as the purpose of the study, compensation, and risks,
  • information disclosed to stakeholders, in the form of deidentified data, and 
  • information disclosed to the reader, in the form of comprehensive analysis, insightful observations and policy recommendations based on findings.  

Additionally, findings from such studies can be made available to the research ecosystem allowing: 

  • re-use of data by researchers, policymakers, students, and teachers, 
  • creation of insights based on multiple studies and answering questions on the generalizability of results, and  
  • replication and confirmation of published results. 

But if identity is made private, then how can data be transparent?  

This is where the overlap between transparency and privacy comes into play because data can be transparent while being private too. Let’s see how. 

Deidentification is a critical component of ethical research because it creates a safe environment for information to be shared for knowledge while protecting the privacy of the participants. It becomes even more important in sensitive research wherein traceable markers can lead to the identification of the respondent. For instance, in a survey about women’s experience of intimate partner violence in a rural setting, if deidentified data is not used, the participants can be tracked, and therefore made vulnerable.  

This is why, in a collaborative study, every stakeholder who has access to such data must follow due protocols. For instance, if a survey agency works in partnership with other research agencies, it should share only the deidentified data with them.

Even when study results are made public, no private information is shared with anybody, in line with the ethical protocols that govern research. Reporting results in generalized form, without disclosing any identifiers, is a very common way for researchers to present their findings.
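As a rough illustration of reporting generalized results, the sketch below turns deidentified records into group-level shares, so that the published figures carry no respondent-level rows at all. The records, the field names, and the `summarize_by_group` helper are hypothetical and exist only for this example.

```python
from collections import defaultdict


def summarize_by_group(records, group_field, outcome_field):
    """Return {group: share of records in which outcome_field is true}."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for record in records:
        group = record[group_field]
        totals[group] += 1
        if record[outcome_field]:
            positives[group] += 1
    return {group: positives[group] / totals[group] for group in totals}


# Made-up deidentified records.
deidentified = [
    {"respondent_id": "r1", "district": "North", "reported_outcome": True},
    {"respondent_id": "r2", "district": "North", "reported_outcome": False},
    {"respondent_id": "r3", "district": "South", "reported_outcome": True},
    {"respondent_id": "r4", "district": "South", "reported_outcome": True},
]
summary = summarize_by_group(deidentified, "district", "reported_outcome")
```

Here `summary` holds one share per district rather than one row per respondent. With groups as small as in this toy data, even a summary could hint at individuals, so real reports would normally rest on much larger groups.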

So, while transparency is an integral part of the data analysis process, ensuring data privacy is, and must always be, the priority. The goal for every study should be to maintain and sustain both for better and more reliable data.

References:  

https://dimewiki.worldbank.org/De-identification

Aishwarya Bhatia, Sambodhi
