India is fortunate to have a rich tradition of public data collection and compilation. Government functionaries at the national, state, district, block and panchayat levels collect data on thousands of variables on population, land use, agricultural production, irrigation, stream flows, reservoir storage, groundwater level, employment and livestock; and almost all of it is meticulously aggregated and compiled at district and state level.
These routine data collection exercises are complemented by the quinquennial (every five years) Agriculture, Livestock and Minor Irrigation censuses and the decadal population census. Large-scale sample surveys routinely undertaken by the National Sample Survey Organization (NSSO) add richness to the census data. Compared to other developing countries, India’s public data collection and availability are much better; some of the detailed datasets – such as the census of more than 20 million minor irrigation structures – are not even available in much ‘richer’ water economies such as the United States.
As can be expected from any large-scale data collection process, the quality of our public data is highly variable, sometimes even inconsistent. But, overall, if analyses and interpretation are done keeping in mind some of the limitations, the datasets can be a precious resource at the meso and macro level. For all the effort and resources that go into the collection of this data, and the rich overview of the water and agriculture economy that the data can paint, India’s public datasets are largely underutilized.
Most national-level censuses fall under the responsibility of the concerned ministry of the Government of India. However, the actual execution and data collection cannot be done without support from state and local government departments. For the Minor Irrigation Census, for example, the Minor Irrigation (Statistics) Wing in the Ministry of Jal Shakti (formerly, Ministry of Water Resources, River Development and Ganga Rejuvenation) is the nodal agency at the Centre. All costs associated with conducting the census are borne by the central government. However, execution is the responsibility of the respective state water resource ministries/nodal departments.
The central ministry prepares the pro-forma schedule and conducts workshops in states to train enumerators. Some states take this exercise seriously; others may not. In some states, the irrigation department collects the data itself; in others, the work has been farmed out to private data collection agencies.
In the past, due to several reasons, some states have even failed to send back any data – for example, Rajasthan is missing from the first MI census. Gujarat, Maharashtra and three union territories (Chandigarh, Daman & Diu and Lakshadweep) were missing from the second census. Daman & Diu and Lakshadweep are also missing from the subsequent third, fourth and fifth censuses. All of these create variability and inconsistencies in data and make it difficult to compare and analyse.
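One practical response to such gaps is to restrict any cross-census comparison to the states and union territories that reported data in every round being compared. The sketch below is only illustrative: the `MISSING` table encodes just the gaps mentioned above (which may not be exhaustive), and the short `units` list and the `comparable_units` helper are hypothetical names introduced here.

```python
# States/UTs reported missing in each MI census round, as listed in the text.
# This is an assumption-laden illustration; the actual list may be longer.
MISSING = {
    1: {"Rajasthan"},
    2: {"Gujarat", "Maharashtra", "Chandigarh", "Daman & Diu", "Lakshadweep"},
    3: {"Daman & Diu", "Lakshadweep"},
    4: {"Daman & Diu", "Lakshadweep"},
    5: {"Daman & Diu", "Lakshadweep"},
}


def comparable_units(all_units, rounds, missing=MISSING):
    """Return the units that reported data in every one of the given rounds."""
    excluded = set()
    for census_round in rounds:
        excluded |= missing.get(census_round, set())
    return sorted(set(all_units) - excluded)


# Hypothetical, deliberately short list of units for illustration only.
units = ["Bihar", "Gujarat", "Lakshadweep", "Maharashtra", "Rajasthan"]

print(comparable_units(units, rounds=[1, 2]))     # ['Bihar']
print(comparable_units(units, rounds=[3, 4, 5]))  # ['Bihar', 'Gujarat', 'Maharashtra', 'Rajasthan']
```

Comparing only the common set of units avoids reading a missing state as a fall in the number of irrigation structures, at the cost of leaving some states out of the trend altogether.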
Given that, so far, most of the data collection has been done through paper-based enumeration schedules, there is usually a long lag between data collection and its final release. The report of the Fourth MI census – which had the reference year of 2005-06 – was released in 2014; the lag was reduced in the fifth census, where the reference year was 2013-14 and the report was published in 2017. In recent iterations, this issue is expected to be tackled through digitization of data collection and compilation, and the use of tablet-based surveys and appropriate tools for data management.
Another huge challenge with the data is that it is collected and compiled based on administrative boundaries (districts, blocks), which themselves keep changing. Any comparison of data across different censuses has to deal with this issue. While changes in administrative boundaries are perhaps inevitable, analysis would be much easier if disaggregated data were made accessible, so that it could be re-aggregated as per the new administrative divisions.
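To see why disaggregated data helps, consider a minimal sketch of re-aggregation. Everything in it is hypothetical: the block names, the counts, the district split and the `aggregate` helper are assumptions made purely for illustration, not actual census figures or an official method.

```python
from collections import defaultdict


def aggregate(records, unit_to_district):
    """Sum a block-level variable up to districts using the given mapping."""
    totals = defaultdict(int)
    for block, value in records.items():
        totals[unit_to_district[block]] += value
    return dict(totals)


# Hypothetical block-level counts of minor irrigation structures.
blocks = {"Block A": 1200, "Block B": 800, "Block C": 500}

# Hypothetical boundaries: District X is later split and Block C moves
# to a newly created District Y.
old_map = {"Block A": "District X", "Block B": "District X", "Block C": "District X"}
new_map = {"Block A": "District X", "Block B": "District X", "Block C": "District Y"}

print(aggregate(blocks, old_map))  # {'District X': 2500}
print(aggregate(blocks, new_map))  # {'District X': 2000, 'District Y': 500}
```

With only the district totals published, the old figure of 2500 cannot be divided between the two new districts; with block-level records, the same data can be regrouped under either set of boundaries.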
Most of the data is released as poorly scanned data tables in PDF format, and often needs to be painfully downloaded state-wise, or sometimes even by crop or district. Often, undertaking any analysis requires a costly and wasteful process of ‘re-digitalization’[1], which could easily be avoided.
Our hunch is that if the data were better organized and available for download in more ‘user-friendly’ formats, its utilization would improve manifold and it would start informing important policies and programs, both government-led and donor or civil-society driven.
Despite these limitations, India’s public ‘water and agriculture’ datasets can help nudge India’s public policy debates towards data-driven planning – at the macro level. These datasets, however, are unlikely to be very useful for the village or even small watershed-level planning.
For best results, data from large public datasets and micro-level field studies should be made interoperable so that the two can be combined to present a nuanced picture of the agrarian economy.
We conclude by highlighting some positive steps in the public data space that are encouraging and a ‘wish-list’ of what else needs to be done to nudge us in the right direction.
Availability of high-resolution remote sensing data is increasing by the day. Recently, Norway invested ~€37 million to make < 5m resolution satellite imagery for 64 countries open access to assist research and policymaking on deforestation. This includes imagery from Planet, KSAT, Airbus and historical SPOT imagery from 2002 onwards.
| Dataset | Agency and website | Rounds and available resources |
| --- | --- | --- |
| Population Census | Ministry of Home Affairs, GoI - https://censusindia.gov.in/ | First, non-synchronous (1865-72); Second (1881), Third (1891), Fourth (1901), Fifth (1911), Sixth (1921), Seventh (1931), Eighth (1941), Ninth (1951), Tenth (1961), Eleventh (1971), Twelfth (1981), Thirteenth (1991), Fourteenth (2001), Fifteenth (2011) |
| Agriculture Census | Ministry of Agriculture - http://agcensus.nic.in/ | First (1970-71), Second (1975-76), Third (1980-81), Fourth (1985-86), Fifth (1990-91), Sixth (1995-96), Seventh (2000-01), Eighth (2005-06), Ninth (2010-11), Tenth (2015-16): Phase I, Phase II; Input survey database - National tables, State tables, District tables |
| Minor Irrigation Census | Ministry of Jal Shakti - http://micensus.gov.in/ | First (1986-87); Second (1993-94): Part I, Part II, Part III, Part IV, Part V, Part VI; Third (2000-01); Fourth (2005-06); Fifth (2013-14); Sixth (2017-18): Data Collection Manual, Census of Water Bodies; MIC Dashboard: http://164.100.229.38/dashboard#/dashboard |
| Livestock Census | Department of Animal Husbandry and Dairying - http://dahd.nic.in/ | Sixteenth (1997), Seventeenth (2003), Eighteenth (2007), Nineteenth (2012), Twentieth (2019) |
| Economic Census | Ministry of Statistics and Programme Implementation - http://mospi.nic.in/ | Second (1980), Third (1990), Fourth (1998), Fifth (2005), Sixth (2013), Seventh (2019) |
| Groundwater Data | Central Ground Water Board - http://cgwb.gov.in/ | Blockwise GW Resources Assessment 2017; Dynamic Ground Water Resources of India – 2004, 2009, 2011, 2013, 2017 |
| Water storage in Major Reservoirs | Central Water Commission - http://cwc.gov.in/ | CWC Dashboard; Annual Reports 2003 to 2018; Reservoirs storage bulletin |
| Meteorological data | India Meteorological Department - https://mausam.imd.gov.in/ | |
| State-wise Agriculture statistics and Land use statistics | Directorate of Economics and Statistics, DAC & FW - https://eands.dacnet.nic.in/ | Agricultural statistics at a glance – 2014, 2015, 2016, 2017, 2018, 2019; LUS at a glance – Latest data (2006-07 to 2015-16), Previous data (1984-95 to 2005-06) |
Shilp Verma and Cheshta Rajora work with IWMI-Tata Policy Program. Manisha Shah is associated with Arghyam. Views expressed are personal.