Interested in working on research with me?

I am always looking to work with exceptional students on projects related to the use of massive datasets to understand patterns of social and economic behavior. Representative projects can be found on my website and through the Data-Intensive Development Lab website.

I strongly prefer to work with PhD students who are enrolled at UC Berkeley, or with postdocs who have recently received a PhD. If this is you, please see the instructions below for how to get in touch with me. If you are interested in applying to the PhD program at the UC Berkeley School of Information, I have provided some brief advice for PhD applicants. In exceptional cases, I collaborate with students at other institutions, undergraduates, and non-students. If you think you fall into this category, send me an email explaining why.

For Current Students and Prospective Postdocs

Most of my projects require significant programming and statistical analysis, as well as the use of modern tools for managing large-scale datasets, such as Hadoop, GraphLab, and Spark. Students will acquire many of these skills in the process of working on these projects. However, it is generally not feasible to work on a project unless you can already demonstrate mastery of one or more programming languages (such as Java, Python, PHP, C/C++/C#, or similar).

If you're interested in working with me or my lab, please email me with the following information:

  1. A cover letter describing your interests and experience. Please tell me which of the criteria below you have already met, and how. The best way to do this is to send me links to your code, papers, blog posts, etc.
  2. Resume/CV.
  3. Transcript or list of relevant coursework (with grades received) in Computer Science, Statistics/Math, Economics.

Qualified candidates are expected to meet several of the following recommended criteria:

  • University-level coursework in computer science, and especially in algorithms, data structures, complexity, and databases.
  • University-level coursework in statistics, econometrics, or applied mathematics, and proficiency with one or more packages for statistical analysis (such as R, MATLAB, or Stata).
  • Prior experience with technologies for managing large datasets (>1TB), such as SQL, Hadoop/Hive, NoSQL, GraphLab, or Spark/Shark.
  • Sysadmin experience, and proficiency with Linux-based command-line tools.
  • Experience creating compelling data visualizations, and/or proficiency with frameworks such as D3, Processing, ArcGIS, or similar.
  • Prior research experience on projects that led to peer-reviewed publications.

For Prospective PhD Students

Are you interested in applying to the PhD program at the UC Berkeley School of Information? If so, please know that I do not directly admit students; rather, admissions decisions are made by a committee of faculty. This committee considers "the usual" criteria: research aptitude and experience (and publications), letters of recommendation, GRE scores, GPA, research statement, and personal statement.

If you do apply, please take the research statement very seriously, as we read them closely. There's no shortage of guides to writing this statement, but from my perspective it's important that you use the statement to clearly motivate and articulate a research question or idea, to situate it within work that has already been done by others, and to explain how/why you will be the one to answer it. This isn't a binding contract (you can work on something else entirely once you're admitted!), but we want to see that you know how to write clearly about research, and that you understand what you're getting yourself into. Also note that we are a small program and fit with faculty is important. If you are interested in working with me, your research statement should make clear how your research interests overlap with my own. However -- and this is perhaps unique to the School of Information -- you should also attempt to draw connections to the work of other faculty in the school from whom you would be interested in learning.

Finally, please know that every year I receive a large number of emails from prospective students. While I am happy to read these emails and learn of your interest, I generally do not meet with prospective students unless they have already been admitted to our program; there are just too many students for this to be feasible. If you're trying to get a sense of whether I would be a good fit as an advisor, please take a close look at the projects and papers described on this website and on my lab website. And if you decide to apply, please send me an email in the fall, and I will make sure to take a close look at your application. Thanks very much for your understanding!