Interested in working on research with me?
I am always looking to work with students on projects that use novel data and methods to understand and address the causes and consequences of poverty. Representative projects can be found on my website and through the Data-Intensive Development Lab website.
I strongly prefer to work with PhD students who are enrolled at UC Berkeley, or with postdocs who have recently received a PhD. If this is you, please read the instructions below for how to get in touch with me. If you are interested in applying to the PhD program at the UC Berkeley School of Information, I have provided some brief advice for PhD applicants. In exceptional cases, I collaborate with students at other institutions, undergraduates, and non-students. If you think you fall into this category, send me an email explaining why.
Most of my projects require significant programming and statistical analysis, as well as the use of modern tools for managing large-scale datasets. Students will acquire many of these skills in the process of working on these projects. However, it is generally not feasible to work on a project unless you can already demonstrate mastery of one or more programming languages (such as Java, Python, PHP, C/C++/C#, or similar).
If you're interested in working with me or my lab, please email me with the following information:
- A cover letter describing your interests and experience. Please tell me which of the criteria below you have already met, and how. The best way to do this is to send me links to your code, papers, blog posts, etc.
- Transcript or list of relevant coursework (with grades received) in Computer Science, Statistics/Math, and Economics.
Qualified candidates are expected to meet several of the following recommended criteria:
- University-level coursework in computer science, and especially in algorithms, data structures, complexity, and databases.
- University-level coursework in statistics, econometrics, or applied mathematics, and proficiency with one or more packages for statistical analysis (such as R, MATLAB, or Stata).
- Prior experience with technologies for managing large datasets (>1TB), such as SQL, Hadoop/Hive, NoSQL, GraphLab, or Spark/Shark.
- Sysadmin experience, and proficiency with Linux-based command-line tools.
- Experience creating compelling data visualizations, and/or proficiency with frameworks such as D3, Processing, ArcGIS, or similar.
- Prior research experience on projects that led to peer-reviewed publications.