RAND Statistics Seminar Series

Modeling Massive Dynamic Graphs

Presented by Dr. Chris Volinsky
AT&T Research, Florham Park, NJ
Thursday, January 27, 2005, 10:30 a.m.
Forum m-1226-28, Santa Monica

Abstract

When studying large transactional networks such as telephone call detail data, credit card transactions, or web clickstream data, graphs are a convenient and informative way to represent the data. In these graphs, nodes represent the transactors and edges represent the transactions between them. When the edges carry time stamps, we have a "dynamic graph" in which edges are born and die over time. I will present a framework for representing and analyzing dynamic graphs, with a focus on the massive graphs found in telecommunications and Internet data. The framework uses three parameters to define an approximation to the massive graph, which allows us to prune noise from the graph. Compared with using the entire data set, the approximation actually performs better under certain predictive loss functions. In this talk I will demonstrate the application of this model to a telecommunications fraud problem, where we look for patterns in the graph that are associated with fraud.