In the realm of data-driven decision-making, the accuracy and reliability of data are critical. However, the prevailing skepticism around self-reported data often leads people to believe that alternative data collection methods, such as phone data, administrative records, or observational studies, are superior. In this blog, we aim to challenge these notions by highlighting the shortcomings of various data collection methods and emphasizing the unique strengths of self-reported data. By recognizing its comprehensiveness, verifiability, and inclusivity, we can understand why self-reported data surpasses other approaches and should be embraced.
Limitations of Alternative Data Collection Methods:
- Phone Data: Phone usage data may initially seem like good data: objective, ‘hard’ data that cannot be manipulated. However, it is not without its drawbacks. For example, the poorest individuals often use SIM cards that are not registered under their own names or frequently change SIM cards. Consequently, relying solely on phone data excludes a significant segment of the population, leading to an incomplete understanding of their behaviors and needs and potentially excluding them from services. We have even run into people in the field who are already gaming the phone data. We have recorded people reporting ‘I know I need to have my phone charged at all times’, ‘making phone calls in the morning gives you points’, and ‘making brief phone calls regularly to your business network is best’. These people are doing these things intentionally to maximize their perceived benefits. So, some seemingly good phone user behaviors may be faked.
- Administrative Records: While administrative records, such as those held by Microfinance Institutions (MFIs), may appear reliable, they too have their limitations. Automated algorithms used to assess creditworthiness or eligibility for services using existing administrative records, will by definition exclude those people who have not yet taken any loans. Using these administrative records may inadvertently perpetuate biases, and bring the MFI to keep lending to the same people who have performed before. This leads to the continued exclusion of groups other than those currently served from accessing crucial services. Those excluded are most likely vulnerable individuals and groups and they may perform very well for the services, but because they haven’t been served before, an MFI will not be able to find out how they will perform. Consequently, administrative records may not give all groups an equal chance and a fair probability, it will only manage the MFI’s risks fairly well.
- Observational Studies: Observational studies, which involve researchers directly observing and recording behaviors, have their own challenges. The presence of an observer can introduce the Hawthorne effect, where participants may alter their behavior, consciously or unconsciously, leading to biased data. Additionally, observational studies can be time-consuming, and costly, and may not capture the full range of experiences and perspectives. Here also, most studies find it hard to include the most vulnerable and most excluded groups. Thereby research based on observations can also perpetuate the exclusions.
The Strengths of Self-Reported Data:
- Comprehensive Insights: Self-reported data enables individuals to directly express their experiences, preferences, and needs. By actively involving participants in the data collection process, a more comprehensive understanding of various dimensions can be obtained. This depth of insight is crucial for making informed decisions and tailoring interventions to meet specific requirements.
- Verifiability: Contrary to the frequently voiced concerns that self-reported data cannot be proven, it can in most cases be validated through cross-checking with other sources or by triangulating responses. Verification mechanisms, such as follow-up interviews or cross-referencing with existing records, can enhance the credibility of self-reported data. Technological advancements have also provided transparent and tamper-proof systems for verifying and validating self-reported data, further enhancing its trustworthiness. In many cases, people can upload hard proof of their self-reported data, such as a receipt for a large transaction, a photo of the item purchased or a print screen of the savings account records.
- Inclusivity: Self-reported data has the unique ability to capture the perspectives and experiences of marginalized and vulnerable groups, who may be underrepresented in phone data, administrative records, or observational studies. By directly engaging with these individuals, their specific challenges and circumstances can be understood, leading to more inclusive policies and services. Self-reported data ensures that the voices of all individuals, irrespective of their socioeconomic status, are heard and considered in decision-making processes. (Importantly, there are ways for even illiterate people to report their own data, as long as they get a coach or a phone volunteer who helps them with entering the data and access to a smartphone.)
- It is hard to fake: When reporting about financial and economic data, particularly if you report in detail over an extended period of time, it is extremely hard to lie. Total payments coming in have to equal the total payments going out or the difference should be visible in an increase or decrease of reserves such as a savings account or cash stored at home. If someone tries to give the impression that they earn more, they may report fake income. But then that person needs to consistently keep in mind that fake income, report fake expenses, or report fake increases in accounts. In our experience, it is sheer impossible to fake consistently, and quite soon we can see in the data that there is something amiss.
Moreover, we have thousands of diaries data, and that has taught us what a ‘normal’ financial pattern looks like. We won’t give away the details of what we learned, but it is very easy to flag fake data.
- It makes no sense to fake: People who report their data themselves, in the first place do this to understand themselves better and find out about their own financial and economic patterns. For instance, people want to know their total income and how much each income source contributes. They want to know their total profit and their profit margin. They want to know if one or another income source makes a loss and which income source has the highest profit margin. If you report fake data, any of these calculations, totals, or proportions you will know are also fake. So faking would cost a lot of effort and as a result your main benefit of strengthening your economic activities, cannot be achieved. In conclusion, telling the truth is a rational and easy thing to do.
Self-reported data, despite common misconceptions, offers several advantages over alternative data collection methods. It provides comprehensive and verifiable insights, bridging gaps that other approaches often miss. By actively involving individuals in the data collection process, self-reported data promotes inclusivity and empowers individuals to share their experiences firsthand. It is crucial to embrace self-reported data as a valuable resource that provides a more accurate representation of individuals’ realities. By recognizing its strengths and implementing robust verification mechanisms, we can unlock its full potential to drive informed decision-making and create positive change across various domains.
By: Adonay Negash, Communications Officer