Peter Hoff: Reckoning With Big Data to Tease Out Network Patterns
Duke statistician Peter Hoff finds great beauty in his chosen field: the precision, the order, the binary, true-or-false outcomes. But for him, statistics is also a gateway to a much larger world.
“To me, statistics is ultimately trying to address the problem of how we learn about the world around us,” Hoff said. “If you are a statistician and you develop an interesting statistical method, it could be applicable to biology, to social sciences, to physical sciences. You get exposure and you get to learn about all these different disciplines.”
Hoff, who joined the Duke Department of Statistical Science in July after 16 years on the faculty at the University of Washington, specializes in building statistical tools to analyze network or “relational” data. These types of data, which document the complex, changing sets of interactions between different individuals within a group, are currently popping up in all areas of research, from the social sciences to genomics.
Hoff’s tools are designed to extract patterns and meaning from these wide-ranging subjects, which can vary from friendships within social networks and relationships between countries on the international stage, to interactions between different sets of proteins within a cell.
“I’m interested in trying to understand patterns in these networks, and also what factors lead to the formation of ties between people, between countries, between objects in general,” he said.
Born in Michigan and raised in Indiana, Hoff always enjoyed math and science. But he first fell in love with statistics as a discipline while an undergraduate at Indiana University.
“Statistics ended up being the perfect thing for me because it was a way to do math and computation, which I enjoy aesthetically, but also it’s an avenue through which I can learn about lots of different types of science,” Hoff said.
After earning a doctorate in statistics at the University of Wisconsin, he joined the faculty at the University of Washington-Seattle in 2000. While there, he authored an introductory text on Bayesian statistics and started building tools for making sense of twenty-first-century data.
“The types of data that people are gathering now are different than the types of data that people gathered ten, twenty years ago,” Hoff said. “And so any time you have a new data structure, a new type of data, you need to develop new statistical methodologies for it.”
One of the beauties of creating these statistical tools, he says, is that sometimes the same approach can be applied to a great variety of subjects. He recently created a tool to sift out the correlations between gene expression levels of leukemia patients that is now being applied to changes in a fruit fly’s metabolism throughout its life cycle.
Hoff was brought to Duke as part of the Provost’s quantitative initiative, a $10 million investment in hiring world-class faculty specializing in statistics, mathematics, computer science and engineering. The initiative has a particular emphasis on attracting interdisciplinary researchers who are likely to have a broad impact across multiple disciplines at Duke, including the physical sciences, social sciences, engineering and medicine.
“Peter Hoff is an outstanding representation of what the quantitative initiative hopes to achieve,” said Lawrence Carin, Vice Provost for Research at Duke. “He’s one of the foremost statisticians in the world, and his research touches many other disciplines beyond statistics, particularly the social sciences and health.”
Hoff isn’t only interested in data as an abstract entity. In his teaching at Duke he also wants to connect statistics students with the nitty-gritty of data gathering, either by pairing them with scientists or by giving them the resources for building their own simple devices to collect and analyze data.
“As a statistician, I often get the data after it has been gathered by other scientists,” he said. “I would like to try to develop some projects where students in statistics as well as students in other departments such as computer science, engineering or biology are working with the tools that actually gather the data. Having them involved not just with the data analysis but also seeing what it’s like to gather data, and seeing all the challenges there, would be a great educational activity.”
This interest in hands-on learning, or “tinkering,” as he calls it, arose from basic woodworking projects with his nine-year-old son, a pastime that he originally thought had no link to the computer-bound world of statistics.
“We started by just building rudimentary boxes,” he said. “But gradually we started to make devices with temperature sensors or pressure sensors, things to gather data. And of course as soon as that happened I had to analyze the data.”