Surveys and the availability of public statistics remain a topic of major academic and political contestation in India. The delay of the Census, which was scheduled to happen in 2021, was raised as a matter of concern by technocrats and an expert committee of academics, which was later disbanded by the government. Apart from the Census, the release of survey data on crucial socio-economic indicators such as household consumption, education, and industries has also been delayed by the National Sample Survey Office (NSSO) and other government departments. The dearth of these statistical indicators creates uncertainty about the nature of India’s economic and social development, leading researchers to increasingly rely on non-state sources of data such as the Indian Human Development Survey (IHDS) and the Centre for Monitoring Indian Economy (CMIE), which runs the Consumer Pyramids Household Survey.
Similar to NSS data, the IHDS and CMIE provide estimates and trends on key socio-economic variables through nationally representative sample surveys. However, these are not the only sources of statistics in India produced by non-state actors. Over the last two decades, researchers and development organisations (such as multilateral institutions and policy consultancies) have begun carrying out customised surveys on various socio-economic or policy phenomena. I use the term “customised” because these surveys involve researchers or policy organisations designing their own questionnaires to study specific development questions that are of interest to them. Unlike national sample surveys, these customised surveys are confined to specific districts in a state, or at times spread across a small number of states.
Despite being local in scale, these surveys remain logistically complex as they involve interviewing thousands of respondents across large geographical areas. So these projects are often outsourced to private survey firms that manage operations on the ground. This has given rise to an industry of private survey firms that researchers and development organisations engage, through commercial agreements, to help design, execute, and manage customised surveys across India. These firms hire enumerators for data collection on short-term contracts to produce the desired statistics for their clients. This private industry that collects socio-economic data and the labour force of enumerators it relies on has not received adequate attention from scholars.
This article, based on prior and ongoing fieldwork, seeks to provide an overview of the customised socio-economic survey industry in India. My fieldwork includes participant-observation with enumerators contractually hired for these projects, and interviews with staff at the survey firms that manage them and with the researchers and development professionals who are their clients. The article provides insight into how an alternative system for statistics has emerged in India, and how this links to the use of randomised trials in the social sciences and the rise of philanthropy capital in the development sector.
Further, it analyses the labour and skills of enumerators who form the backbone of this industry, and how they are recruited and compensated. The implications of the data produced by this industry for public policy, and further employment avenues for enumerators are also discussed.
Origins of Customised Surveying
The traditional source of socio-economic statistics for researchers of India’s development trajectory has been secondary data produced by pan-India sample surveys such as the regular NSS rounds, the National Family Health Survey (NFHS), and, in more recent times, the IHDS and CMIE’s national surveys. They are “secondary” because researchers use data produced by another entity for analysis.
Starting in the early 2000s, researchers and development organisations began undertaking customised data collection of their own, based on specific questionnaires. Unlike secondary data surveys that are national and aim to collect data across variables to identify broad socio-economic trends, these customised surveys are local and the data collection is tailored to answer a pre-defined research question.
A primary reason for these surveys emerging as an alternative source of statistics is the use of randomised controlled trials, or RCTs, in development economics. An RCT is an empirical tool for impact evaluation, which involves randomly assigning a policy intervention (such as providing cash transfers) to a “treatment” group and comparing the outcomes of this group with those of a “control” group that does not receive the intervention (Banerjee and Duflo 2011). Drawing from research methods in medical science (Webber and Prouse 2018), the theory underpinning RCTs is that any change measured in the treatment group vis-à-vis the control group is the result of the intervention, since randomisation controls for other factors. Over the last decade, RCTs have become a widespread empirical tool to generate statistical evidence for the evaluation of development interventions (Wintrup 2022).
RCTs require undertaking a “field experiment”: a group of people in a particular area needs to be selected and surveyed before the intervention (known as a baseline survey). Following this, the intervention needs to be randomly administered to the treatment group, and an additional survey has to be conducted afterwards to gather data (known as an endline survey). As these experiments are localised and based on questionnaires framed by the researchers evaluating the intervention, customised data collection becomes necessary. Secondary data cannot be used since they are broader and do not pertain to the policy intervention being studied. Scholars have noted that the use of smaller samples and a focus on specific interventions that are central to RCTs require customised surveys to produce data for these evaluations (Reddy 2012).
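The logic of this design can be illustrated with a minimal sketch in Python. It uses simulated data rather than any real survey, and the function name, sample size, outcome scale, and effect size are all illustrative assumptions. Actual evaluations rely on more elaborate sampling and estimation than the simple difference in means shown here.

```python
import random
import statistics


def run_field_experiment(n_households=2000, true_effect=5.0, seed=0):
    """Simulate a baseline survey, random assignment, and an endline survey.

    All numbers are hypothetical; this is not drawn from any real study.
    """
    rng = random.Random(seed)

    # Baseline survey: an outcome (say, a test score) measured before
    # the intervention for every household in the sample.
    baseline = [rng.gauss(50.0, 10.0) for _ in range(n_households)]

    # Random assignment: shuffle household ids and give the intervention
    # to the first half (treatment); the rest form the control group.
    ids = list(range(n_households))
    rng.shuffle(ids)
    treated = set(ids[: n_households // 2])

    # Endline survey: the same outcome measured after the intervention.
    # Treated households receive the assumed effect plus random noise.
    endline = []
    for i in range(n_households):
        outcome = baseline[i] + rng.gauss(0.0, 5.0)
        if i in treated:
            outcome += true_effect
        endline.append(outcome)

    treat_ids = [i for i in range(n_households) if i in treated]
    control_ids = [i for i in range(n_households) if i not in treated]

    # Randomisation should leave the two groups similar at baseline.
    baseline_gap = (
        statistics.mean([baseline[i] for i in treat_ids])
        - statistics.mean([baseline[i] for i in control_ids])
    )

    # Estimated impact: difference in mean endline outcomes.
    estimated_effect = (
        statistics.mean([endline[i] for i in treat_ids])
        - statistics.mean([endline[i] for i in control_ids])
    )

    return {
        "n_treated": len(treat_ids),
        "n_control": len(control_ids),
        "baseline_gap": baseline_gap,
        "estimated_effect": estimated_effect,
    }


if __name__ == "__main__":
    print(run_field_experiment())
```

In this sketch the baseline gap between the two groups should be close to zero, which is what allows the endline difference to be read as the effect of the intervention. Both surveys have to cover the same households with the same questionnaire, which is why secondary data cannot substitute for them.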
Customised socio-economic surveys thus began on a large scale in India with the implementation of RCTs in development research. As the scope and frequency of such projects began increasing in the 2010s, researchers conducting these experiments felt the need for more professional and organised management of data collection activities in the field. This led to the creation of private survey firms to which researchers outsourced the hiring and compensation of enumerators, and the monitoring of their daily data collection in the field. These firms thus comprise an industry that can be engaged by researchers or development organisations around the world for local and tailored data collection in India. It should be noted that while this industry emerged due to RCTs, it has expanded and these firms now manage several non-RCT projects as well.
Another reason for the emergence and growth of customised surveys of this nature is changes in the funding sources for development research. Various participants I spoke to during my research mentioned that RCTs and evaluations involving customised surveys are increasingly demanded by philanthropic foundations that finance development research and interventions. This includes major donors like the Bill and Melinda Gates Foundation, which has influenced other donor organisations to emphasise customised data and randomised trials in the research and policy interventions that they finance.
It should be noted that philanthropy capital, of which family foundations and corporate social responsibility (CSR) are important components, is becoming an increasingly important source of finance, contributing approximately Rs. 1.3 lakh crore (nearly US$15 billion) to India’s social sector in FY 2024. Hence, a combination of intellectual transformations in social science research and changes in sources for funding them has led to the emergence of a customised socio-economic surveying industry in India. Given the number of RCTs and customised data collection projects taking place, this industry is arguably becoming as influential a source of statistics as the national sample surveys that provide secondary data.
Labour Force of Enumerators
Though they are smaller in scale than national surveys, customised surveys are logistically complex to execute. Given the large number of respondents and wide geographical area that needs to be covered in these surveys, a significant labour force of enumerators and supervisors needs to be mobilised to collect the required data. I spent several days between September and November 2024 carrying out ethnographic fieldwork with two teams of enumerators who were hired by a survey firm to carry out customised data collection for their clients. Both surveys took place in the National Capital Region and were part of RCTs. My fieldwork involved participant-observation with these enumerators as they engaged in data collection in the field, and offered some insights into how they are recruited, the skills they utilise while working, and the compensation they receive.
Most survey firms have a few field staff who are permanently employed by them. When the details and regional boundaries of a project are decided, the field staff begin hiring enumerators and supervisors. This is done by contacting a senior enumerator with whom the field staff have previously worked and who is from the state where the project will take place, or from an adjoining region. These senior enumerators contact people within their social networks, including family, friends, or former colleagues, and shortlist candidates. These candidates are interviewed by the field staff to select enumerators. The senior enumerators become supervisors in the survey.
These networks are often mediated through kinship relations of caste and village, and I occasionally observed these within teams. For instance, a supervisor in one survey who was from a village in Rajasthan and belonged to the Jat caste had recruited three men from his village who were also Jats. The senior enumerators are usually careful not to blindly rely on such networks. This supervisor told me that he only recruits those he thinks are capable of doing the job and often rejects candidates referred to him by his contacts in the village.
After recruitment, the enumerators are trained, and they then begin collecting data from respondents in the field. This work is complex and involves much more than mechanically administering the survey questionnaire. The enumerators need to build a rapport with respondents and ensure that they feel comfortable participating in the survey. This exercise is crucial because it determines whether the respondent will honestly provide the required data. To do this, the enumerators often use their tacit skills acquired from their own social experience and knowledge.
One of the surveys I observed was on education and involved interviewing the parents of school children. Before going to a household for an interview, the enumerators had to call the respondents and confirm an appointment. Most respondents for this survey were women. An enumerator that I spent time with would always address respondents as “didi” (“sister” in Hindi) and talk in a colloquial manner when he called them for an appointment. He explained that he deliberately spoke in this tone to create a feeling of familiarity with them. He said that if you used formal words like “ma’am”, the respondents became wary and did not feel comfortable participating in the survey.
It seemed that this strategy was working because most of the respondents he spoke to on the phone gave him time for an interview. I observed several instances like this where enumerators relied on their own social knowledge to build a rapport with respondents. Quantitative social scientists should be attentive to these skills, as the standard framework of “bias” for understanding the labour of enumerators (Di Maio and Fiala 2020) often does not capture these social complexities.
The compensation and working conditions of enumerators are managed by survey firms. Firms usually pay the enumerators and supervisors a daily wage rather than a piece rate per questionnaire. The founder of a firm explained to me that this was done to ensure there was no incentive for an enumerator to rush through and complete multiple interviews in a day, which can adversely impact data quality. The average wages in this industry are Rs. 900 to Rs. 1,100 per day for enumerators and Rs. 1,000 to Rs. 1,200 per day for supervisors. These wages cannot be revised even if clients wish to pay more. On average, they earn Rs. 30,000 to Rs. 35,000 a month, but the employment is irregular.
All enumerators and supervisors are contractually engaged for the length of the survey. According to one of my participants, this industry employs approximately 100,000 to 200,000 enumerators across India on a full or part-time basis. The emergence of RCTs and customised surveys as empirical tools has thus led to the growth of a significant labour force that is supported by data collection projects.
Accountability and Upward Mobility
This industry for gathering customised socio-economic data raises some important questions for researchers and policy-makers to consider. First, the data produced is usually under the proprietary ownership of the researchers who have designed the study, or at times a select group of bureaucrats whom the researchers may be working with. While this arrangement allows for new sources of knowledge and ideas to enter the government machinery, it remains outside the system of official statistics represented by the NSS and other data sources such as the NFHS or IHDS, which are publicly available.
In this context, it is crucial to consider how civil society or citizen groups can hold decision-makers accountable when policies are based on inaccessible data. Although research data may eventually become available to other academics, this delay hinders timely oversight, as policy decisions are often made much earlier.
Second, the labour force of enumerators merits further examination. Enumeration is not a straightforward exercise, and requires skilled labour to manage respondents and build an adequate rapport for data collection. Moreover, the work is irregular because survey projects do not run throughout the year. Hence, enumerators experience precarity and lack career progression.
There are different ways to address this issue. Researchers who commission such projects could provide certificates to the enumerators they have hired, as a way of formalising the professional experience and skills that they acquire through fieldwork. Further, state and union governments could offer more structural policy responses.
Scholars have pointed out that the shortage of regular and trained field enumerators in the NSS needs to be addressed to expand the capacity of India’s statistical machinery to produce frequent and high-quality socio-economic data (Kapoor 2019). The central and state governments, which are equal partners in NSS data collection, can consider hiring enumerators from this industry on long-term contracts or even as regular field investigators. This can ensure a steady supply of skilled enumerators who are well versed in the latest data collection tools to produce regular statistics on various socio-economic indicators that are currently not being captured. At the same time, it can facilitate stable employment for this diverse and precarious labour force.
Academics, development organisations, and policy-makers who use data produced by this industry must consider these questions. While this data is becoming increasingly important due to a dearth of official statistics, accountability mechanisms for policy decisions based on these proprietary datasets, as well as the livelihoods of enumerators who undertake the data collection for these projects, must be brought to the forefront. This will allow for a broader discussion on India’s statistical infrastructure, moving away from technical concerns to addressing the social and institutional factors that are necessary for a robust data ecosystem.
Vinayak Krishnan is a PhD scholar at the School of Global Studies, University of Sussex.