by Ryen W. White, Ahmed Hassan, Adish Singla, Eric Horvitz
Abstract:
Online services rely on unique identifiers of machines to tailor offerings to their users. An implicit assumption is made that each machine identifier maps to an individual. However, shared machines are common, leading to interwoven search histories and noisy signals for applications such as personalized search and advertising. We present methods for attributing search activity to individual searchers. Using ground truth data for a sample of almost four million U.S. Web searchers (containing both machine identifiers and person identifiers), we show that over half of the machine identifiers comprise the queries of multiple people. We characterize how features of topic, time, and other aspects, such as the complexity of the information sought, vary with the number of searchers on a machine, and show significant differences in all measures. Based on these insights, we develop models to accurately estimate when multiple people contribute to the logs ascribed to a single machine identifier. We also develop models to cluster search behavior on a machine, allowing us to attribute historical data accurately and automatically assign new search activity to the correct searcher. The findings have implications for the design of applications such as personalized search and advertising that rely heavily on machine identifiers to custom-tailor their services.
Reference:
From Devices to People: Attribution of Search Activity in Multi-User Settings. R. W. White, A. Hassan, A. Singla, E. Horvitz. In Proc. International World Wide Web Conference (WWW), 2014.
Bibtex Entry:
@inproceedings{ryen14multiusersearch,
	author = {Ryen W. White and Ahmed Hassan and Adish Singla and Eric Horvitz},
	booktitle = {Proc. International World Wide Web Conference (WWW)},
	month = {May},
	title = {From Devices to People: Attribution of Search Activity in Multi-User Settings},
	year = {2014}}