If you are an economist, you are likely to find an audience today if you can claim to have used ‘big data analysis’. And if you say you have used satellite data in your analysis, you are guaranteed a receptive audience. That perhaps explains the use of satellite images in this year’s Economic Survey. In an otherwise bland survey, its night-light images of the country stood out.
Although those images went viral, the narrative around them fails to convince. The survey claimed that the change in night-time luminosity between 2012 and 2021 reflected an expansion in electrification, economic activity and urbanization, among other things. But the survey doesn’t tell us how it arrived at this list or the extent to which each of these factors contributed to the country’s change in night-time luminosity. It does not even provide basic details on how the maps were constructed.
The maps show a big jump in luminosity in some of the poorest parts of the country: Bihar, Uttar Pradesh, and West Bengal. What drove the change? Are the factors cited by the survey responsible? If so, why haven’t the per capita incomes of these states converged with the rest of the country? Or does the depicted change in luminosity result from the calibration technique the survey’s authors used to arrive at these images? After all, any analysis of satellite data typically relies on data cleaning, adjustments and assumptions.
One crucial assumption in this kind of analysis relates to the threshold used for showing lighted areas. If you select too tight a threshold, only exceptionally well-lit areas will appear as lighted. But if you select too loose a threshold, semi-lit or moonlit areas could also show up as bright spots. So comparisons over time using too loose a threshold could exaggerate luminosity in relatively less developed areas. Neither the survey maps nor the accompanying text provides any details on how the satellite data was processed to arrive at the final images.
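To see why the threshold matters, consider a minimal sketch in Python. The radiance values below are invented purely for illustration and do not come from the survey or from any satellite product; the function names and the two threshold values are likewise assumptions. The sketch measures the share of pixels counted as 'lit' in a dim region in two years, first under a loose threshold and then under a tight one.

```python
# Hypothetical radiance readings for eight pixels in a dim, less developed
# region. These numbers are invented for illustration only.
dim_region_2012 = [0.2, 0.4, 0.5, 0.8, 1.0, 1.5, 3.0, 6.0]
dim_region_2021 = [0.6, 0.9, 1.1, 1.4, 1.8, 2.5, 4.0, 7.0]


def lit_share(pixels, threshold):
    """Return the fraction of pixels at or above the radiance threshold."""
    lit = sum(1 for value in pixels if value >= threshold)
    return lit / len(pixels)


def change_in_lit_share(before, after, threshold):
    """Return the change in lit share between two periods."""
    return lit_share(after, threshold) - lit_share(before, threshold)


# A loose threshold counts faintly lit pixels as lit; a tight one does not.
loose = change_in_lit_share(dim_region_2012, dim_region_2021, threshold=0.5)
tight = change_in_lit_share(dim_region_2012, dim_region_2021, threshold=5.0)

print(f"Change in lit share, loose threshold: {loose:.2f}")
print(f"Change in lit share, tight threshold: {tight:.2f}")
```

With these made-up readings, the same underlying data show the lit area growing by a quarter of the region under the loose threshold and not growing at all under the tight one. Which threshold is realistic is an empirical question, which is why the choice needs to be disclosed.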
It is worth noting that even if satellite data is processed with great care—i.e., realistic thresholds are applied to segregate dark and lighted areas and requisite adjustments are made for the decay in satellite sensor capabilities over time—night light data could still fail to capture economic growth accurately. While some early studies using night-lights found a link between luminosity and economic activity across countries, later studies using more disaggregated data failed to find such links across regions. For instance, a 2016 research paper by Frank Bickenbach and his colleagues from the Kiel Institute for the World Economy showed that regional variations in night-lights did not tally with regional growth patterns in two large emerging markets, India and Brazil. The researchers used different sets of night-light data, and also looked at regional patterns in more developed markets. But their conclusion remained unchanged: night-lights fail to reflect trends in economic activity accurately.
This is not the first time that an economic survey has come up with questionable results using satellite data analysis. The second volume of the 2016-17 Economic Survey released in August 2017 had argued that India’s official figures underestimated urbanization, and suggested that satellite data on ‘built up area’ provided a better sense of the actual pace of urbanization in the country. The survey was not making an entirely original argument. Several influential think-tanks and multilateral organizations have argued for a long time that the Indian census underestimates urban growth. But the survey was perhaps the first official endorsement of that view.
Economists are interested in urbanization primarily because it is closely related to income growth and economic development. So a simple test to check whether an urban metric is appropriate is to see if more urbanized areas, as defined by that metric, are richer than less urbanized ones. The survey authors do not seem to have performed that test. But when Ajai Sreevatsan of Mint ran that check, he found that the built-up area metric fails that basic smell test (see ‘How much of India is actually urban’, 16 September 2017). On that metric (built-up area as measured by satellite data), poorer states such as Bihar and Uttar Pradesh had higher rates of ‘urbanization’ than richer states such as Gujarat or Karnataka! States such as Bihar and Uttar Pradesh have very high population densities and relatively high rates of population growth. So unless one is very careful in using satellite data, one could end up conflating population growth with economic activity or urbanization. Both the current and the previous survey fail to tell us how exactly they separated the population effect from other economic changes in their analysis of satellite data. The surveys’ claims would have been far more convincing if the survey authors had cared to conduct on-ground surveys to verify the results of their satellite image analysis.
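The smell test described above can be sketched in a few lines of Python. The five states and their figures below are hypothetical placeholders, not actual data for any Indian state, and the choice of a simple Pearson correlation is an assumption made for illustration rather than the method used in the Mint analysis. The idea is only to check whether the urbanization metric rises with income.

```python
from math import sqrt

# Invented figures for five hypothetical states:
# (built-up area share in %, per capita income index).
states = {
    "State A": (30.0, 60.0),
    "State B": (28.0, 70.0),
    "State C": (20.0, 120.0),
    "State D": (18.0, 150.0),
    "State E": (15.0, 140.0),
}


def pearson(xs, ys):
    """Return the Pearson correlation coefficient between two sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


built_up = [values[0] for values in states.values()]
income = [values[1] for values in states.values()]

r = pearson(built_up, income)

# A sensible urban metric should rise with income, so r should be positive.
passes_smell_test = r > 0

print(f"Correlation between built-up share and income: {r:.2f}")
print(f"Passes smell test: {passes_smell_test}")
```

In this invented example the correlation is negative, so the metric fails the test: the 'more urbanized' states are the poorer ones. A result like that would suggest the metric is picking up something other than development, such as population density.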
As a 2016 research paper by Dave Donaldson and Adam Storeygard argued, a lot depends on the kind of input (or training) data that is fed into a computer to classify images in such analyses. So, to train a computer to distinguish an urban sprawl from a rural settlement based on satellite images, one would need representative on-the-ground data on what urban and rural settlements look like. In heterogeneous regions, sampling requirements and costs can be significantly high. But this is a price that needs to be paid for the sake of accuracy. Without adequate ‘ground-truthing’, satellite data analysis can end up being horribly wrong.
Pramit Bhattacharya is a Chennai-based journalist. His Twitter handle is pramit_b