Basic Information

Position: Chancellor's Associate Professor, School of Information and Public Policy
Director, Global Policy Lab
Faculty Co-Director, Center for Effective Global Action
University of California, Berkeley
Contact: jblumenstock[@]berkeley[.]edu
My work focuses on using novel data and methods to improve the lives of disadvantaged people around the world. You can find a short professional bio here.

Selected Papers

Aiken, E, Bellue, S, Karlan, D, Udry, C, and Blumenstock, JE (2022). Machine Learning and Phone Data Can Improve the Targeting of Humanitarian Aid, Nature, 603: 864-870 [pdf]
Targeting is a central challenge in the administration of anti-poverty programs: given available data, how does one rapidly identify the individuals and families with the greatest need? Here we show that non-traditional “big” data from satellites and mobile phone networks can improve the targeting of anti-poverty programs. Our analysis compares outcomes – including exclusion errors, total social welfare, and measures of fairness – under different targeting regimes. Relative to other feasible targeting options, the machine learning approach reduces errors of exclusion by 4-21%. These results highlight the potential for new data sources to contribute to humanitarian response efforts, particularly in crisis settings when traditional data are missing or out of date.
Chi, G, Fang, S, and Blumenstock, JE (2022). Microestimates of Wealth and Poverty for all Low- and Middle-Income Countries, Proceedings of the National Academy of Sciences, 119(3), 1-11 [pdf]
Many critical policy decisions rely on data about the geographic distribution of wealth and poverty. Yet many poverty maps are out of date or exist only at very coarse levels of granularity. Here we develop the first microestimates of wealth and poverty that cover the populated surface of all 135 low- and middle-income countries (LMICs) at 2.4km resolution. The estimates are built by applying machine learning algorithms to vast and heterogeneous data from satellites, mobile phone networks, and topographic maps, as well as aggregated and de-identified connectivity data from Facebook. We train and calibrate the estimates using nationally-representative household survey data from 56 LMICs, then validate their accuracy using four independent sources of household survey data from 18 countries. We also provide confidence intervals for each microestimate to facilitate responsible downstream use...
Rolf, Simchowitz, Dean, Liu, Björkegren, Hardt, and Blumenstock (2020). Score-Based Classifiers for Welfare-Aware Machine Learning, International Conference on Machine Learning (ICML '20) [pdf]
While real-world decisions involve many competing objectives, algorithmic decisions are often evaluated with a single objective function. We study algorithmic policies which explicitly trade off between a private objective (such as profit) and a public objective (such as social welfare). We analyze a natural class of policies which trace an empirical Pareto frontier based on learned scores, and focus on how such decisions can be made in noisy or data-limited regimes. Our theoretical results characterize the optimal strategies in this class, bound the Pareto errors due to inaccuracies in the scores, and show an equivalence between optimal strategies and a rich class of fairness-constrained profit-maximizing policies.
Blumenstock, JE, Callen, M, and Ghani, T (2018). Why Do Defaults Affect Behavior? Experimental Evidence from Afghanistan, American Economic Review, 108 (10), 2868-2901 [pdf]
We report on an experiment examining why default options impact behavior. By randomly assigning employees to different varieties of a salary-linked savings account, we find that default enrollment increases participation by 40 percentage points -- an effect equivalent to providing a 50% matching incentive. We then use a series of experimental interventions to differentiate between explanations for the default effect, which we conclude is driven largely by present-biased preferences and the cognitive cost of thinking through different savings scenarios. Default assignment also changes employees' saving habits, and makes them more likely to actively decide to save after the study concludes.
Blumenstock, JE (2018). Don't forget people in the use of big data for development, Nature, 561 (7722), 170-172 [pdf]
Aid organizations, researchers and private companies are looking for ways to leverage the 'data revolution' to transform international development. In the rush to find technological solutions to complex global problems, however, there's a danger that we get distracted by the technology and lose track of the deeper issues that are unique to each local context... The CEO of a popular big-data platform recently described data science as "a blend of Red-Bull-fueled hacking and espresso-inspired statistics." In my view, the successful use of big data in development will require a data science that is considerably more humble than this version that has captured the popular imagination.

View all publications...

Working Papers / Active Projects

Migration and the Value of Social Networks - joint with Guanghua Chi and Xu Tan (Conditionally accepted, Review of Economic Studies)
How do social networks influence the decision to migrate? Prior work suggests two distinct mechanisms that have historically been difficult to differentiate: as a conduit of information, and as a source of social and economic support. We disentangle these mechanisms using a massive 'digital trace' dataset that allows us to observe the migration decisions made by millions of individuals over several years, as well as the complete social network of each person in the months before and after migration. These data allow us to establish a new set of stylized facts about the relationship between social networks and migration. Our main analysis indicates that the average migrant derives more social capital from 'interconnected' networks that provide social support than from 'extensive' networks that efficiently transmit information.
(Machine) Learning what Policies Value - joint with Daniel Björkegren and Samsun Knight (Revise and Resubmit, Review of Economic Studies)
When a policy prioritizes one person over another, is it because they benefit more, or because they are preferred? This paper develops a method to uncover the values consistent with observed allocation decisions. We use machine learning methods to estimate how much each individual benefits from an intervention, and then reconcile its allocation with (i) the welfare weights assigned to different people; (ii) heterogeneous treatment effects of the intervention; and (iii) weights on different outcomes. We demonstrate this approach by analyzing Mexico's PROGRESA anti-poverty program. The analysis reveals that while the program prioritized certain subgroups -- such as indigenous households -- the fact that those groups benefited more implies that they were in fact assigned a lower welfare weight. The PROGRESA case illustrates how the method makes it possible to audit existing policies, and to design future policies that better align with values.
Program Targeting with Machine Learning and Mobile Phone Data: Evidence from an Anti-Poverty Intervention in Afghanistan - joint with Emily Aiken, Guadalupe Bedoya, and Aidan Coville (Forthcoming, Journal of Development Economics)
Can mobile phone data improve program targeting? By combining rich survey data from a “big push” anti-poverty program in Afghanistan with detailed mobile phone logs from program beneficiaries, we study the extent to which machine learning methods can accurately differentiate ultra-poor households eligible for program benefits from ineligible households. We show that supervised learning methods leveraging mobile phone data can identify ultra-poor households nearly as accurately as survey-based measures of consumption and wealth; and that combining survey-based measures with mobile phone data produces classifications more accurate than those based on a single data source.
Manipulation-Proof Machine Learning - joint with Daniel Björkegren and Samsun Knight
An increasing number of decisions are guided by machine learning algorithms. In many settings, from consumer credit to criminal justice, those decisions are made by applying an estimator to data on an individual's observed behavior. But when consequential decisions are encoded in rules, individuals may strategically alter their behavior to achieve desired outcomes. This paper develops a new class of estimator that is stable under manipulation, even when the decision rule is fully transparent. We explicitly model the costs of manipulating different behaviors, and identify decision rules that are stable in equilibrium. Through a large field experiment in Kenya, we show that decision rules estimated with our strategy-robust method outperform those based on standard supervised learning approaches.
Religious adherence has been hard to study in part because it is hard to measure. We develop a new measure of religious adherence, which is granular in both time and space, using anonymized mobile phone transaction records. After validating the measure with traditional data, we show how it can shed light on the nature of religious adherence in Islamic societies. Exploiting random variation in climate, we find that as economic conditions in Afghanistan worsen, people become more religiously observant. The effects are most pronounced in areas where droughts have the biggest economic consequences, such as croplands without access to irrigation.
Many decisions that once were made by humans are now made using algorithms. These algorithms are typically designed with a single, profit-related objective in mind: Loan approval algorithms are designed to maximize profit, smart phone apps are optimized for engagement, and news feeds are optimized for clicks. However, these decisions have side effects: irresponsible payday loans, addictive apps, and fake news can harm individuals and society. This project develops and tests a new paradigm for prioritizing the social impact of an algorithmic decision from the start, rather than as an afterthought. The key insight is to leverage recent advances in machine learning -- which make it possible to predict who will benefit from a decision and how -- to design algorithms that balance those predicted benefits alongside traditional profit-related objectives.
We present experimental evidence on the economic impacts of mobile phone access. Our results are based on a randomized control trial in the Philippines, through which 14 isolated and previously unconnected villages were randomly assigned to either receive or not receive a new cellphone tower. Following a pre-analysis plan, we find that the introduction of mobile phones had large and significant impacts on household income and expenditure, particularly for wage workers. Mobile phone access also increased social connections within and between communities. However, there are no consistent impacts on market access, informedness, or subjective well-being. In post-specified analysis, we find suggestive evidence that the improved economic conditions are driven by increases in migration, remittances, and self-employment. Working paper available by request.
Mobile phones reduce the cost of communicating with existing social contacts, but do not eliminate frictions in forming new relationships. We report the findings of a two-sided randomized control trial in central Tanzania, centered on the production and distribution of a "yellow pages" phone directory with contact information for local enterprises. Enterprises randomly assigned to be listed in the directory receive more business calls, make more use of mobile money, and employ more workers. There is evidence of positive spillovers, as both listed and unlisted enterprises in treatment villages experience significant increases in sales relative to a pure control group. Households randomly assigned to receive copies of the directory make greater use of their phones for farming, are more likely to rent land and hire labor, have lower rates of crop failure, and sell crops for weakly higher prices. Willingness-to-pay to be listed in future directories is significantly higher for treated enterprises.