by Manuel Gomez Rodriguez, Jure Leskovec, Andreas Krause
Abstract:
Information diffusion and virus propagation are fundamental processes taking place in networks. While it is often possible to directly observe when nodes become infected, observing individual transmissions (i.e., who infects whom or who influences whom) is typically very difficult. Furthermore, in many applications, the underlying network over which the diffusions and propagations spread is actually unobserved. We tackle these challenges by developing a method for tracing paths of diffusion and influence through networks and inferring the networks over which contagions propagate. Given the times when nodes adopt pieces of information or become infected, we identify the optimal network that best explains the observed infection times. Since the optimization problem is NP-hard to solve exactly, we develop an efficient approximation algorithm that scales to large datasets and in practice gives provably near-optimal performance. We demonstrate the effectiveness of our approach by tracing information cascades in a set of 170 million blogs and news articles over a one-year period to infer how information flows through the online media space. We find that the diffusion network of news tends to have a core-periphery structure, with a small set of core media sites that diffuse information to the rest of the Web. These sites tend to have stable circles of influence, with more general news media sites acting as connectors between them.
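To make the problem setup concrete, the following is a minimal sketch of greedy network inference from infection times. It is an illustration only, not the algorithm from the paper. The function name `infer_edges`, the parameters `k`, `alpha` and `eps`, the exponential transmission model, and the simplification that each node is explained by its single best parent per cascade are all assumptions made for this example.

```python
import math


def infer_edges(cascades, k, alpha=1.0, eps=1e-6):
    """Greedily pick up to k directed edges that best explain the cascades.

    cascades: list of dicts, each mapping node -> infection time.
    Hypothetical helper for illustration; not the paper's implementation.
    """
    # Candidate edges (u, v): u was infected before v in at least one cascade.
    candidates = set()
    for c in cascades:
        for u, tu in c.items():
            for v, tv in c.items():
                if tu < tv:
                    candidates.add((u, v))

    def weight(c, u, v):
        # Assumed transmission model: shorter delays are more likely.
        return math.exp(-(c[v] - c[u]) / alpha)

    # best[i][v] is the weight of the best parent found so far for node v
    # in cascade i; eps stands for an unexplained (external) infection.
    best = [{v: eps for v in c} for c in cascades]
    chosen = []

    for _ in range(k):
        top_gain, top_edge = 0.0, None
        # Sorted iteration keeps tie-breaking deterministic.
        for u, v in sorted(candidates):
            gain = 0.0
            for i, c in enumerate(cascades):
                if u in c and v in c and c[u] < c[v]:
                    w = weight(c, u, v)
                    if w > best[i][v]:
                        # Improvement in log-score if (u, v) explains v here.
                        gain += math.log(w) - math.log(best[i][v])
            if gain > top_gain:
                top_gain, top_edge = gain, (u, v)
        if top_edge is None:
            break  # no remaining edge improves the explanation
        u, v = top_edge
        chosen.append(top_edge)
        candidates.discard(top_edge)
        for i, c in enumerate(cascades):
            if u in c and v in c and c[u] < c[v]:
                best[i][v] = max(best[i][v], weight(c, u, v))
    return chosen
```

For example, given the toy cascades `[{"a": 0, "b": 1, "c": 2}, {"a": 0, "b": 1}, {"b": 0, "c": 1}]`, this sketch selects the edges a→b and b→c and then stops, because the remaining candidate a→c does not improve the score of any cascade.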
Reference:
Inferring Networks of Diffusion and Influence. M. G. Rodriguez, J. Leskovec, A. Krause. In Proc. ACM Conference on Knowledge Discovery in Databases (KDD), 2010. Best Research Paper Award Honorable Mention.
Bibtex Entry:
@inproceedings{gomezrodriguez10inferring,
	author = {Manuel Gomez Rodriguez and Jure Leskovec and Andreas Krause},
	booktitle = {Proc. ACM Conference on Knowledge Discovery in Databases (KDD)},
	title = {Inferring Networks of Diffusion and Influence},
	year = {2010}}