Reception 6:30pm, lobby of 66 West 12th Street
Event 7:00-8:30pm, the auditorium at 12th Street, 66 West 12th Street
Data about mass violence can seem to offer insights into patterns: is
violence getting better, or worse, over time? Is violence directed more
against men or women? But in human rights data collection, we (usually)
don’t know what we don’t know — and worse, what we don’t know is
likely to be systematically different from what we do know.
This talk will explore the assumption that nearly every project using
data must make: that the data are representative of reality in the
world. We will explore how, contrary to the standard assumption,
statistical patterns in raw data tend to be quite different from
patterns in the world. Statistical patterns in data tend to reflect how
the data were collected rather than changes in the real-world phenomena
the data purport to represent.
Using analysis of killings in Iraq, homicides committed by police in the
US, killings in the conflict in Syria, and homicides in Colombia, we
will contrast patterns in raw data with estimates of total patterns of
violence, where the estimates correct for heterogeneous underreporting.
The talk will show how biases in raw data can be addressed through
estimation, and explain why it matters.
Patrick Ball has spent twenty-five years building databases and
conducting quantitative analysis for nine truth commissions, many
non-governmental organizations, and United Nations missions in El
Salvador, Ethiopia, Guatemala, Haiti, South Africa, Chad, Sri Lanka,
East Timor, Sierra Leone, Kosovo, Liberia, Perú,
Colombia, the Democratic Republic of Congo, and Syria. He has provided
expert testimony in the trials of three former heads of state —
Slobodan Milošević, Efraín Ríos Montt, and Hissène Habré — for
war crimes, crimes against humanity, and genocide. He co-founded the
non-profit [Human Rights Data Analysis Group](https://hrdag.org). These
days he hacks mostly in Python and R, but bash is always the first cut
(-d ' '). He thinks strings are not byte arrays.