Collective Aspects of Privacy — Sensing and Localization in Online Networks

We study how user attributes such as location and biography can be inferred in online networks through proxy social sensing, that is, from a user's connections rather than from the user themselves. Using only the information shared by contacts who joined the network earlier, we evaluate how accurately a user's location can be predicted without their direct participation. We find that individuals can be localized with surprising precision (median error of ~68 km on the global map, versus ~6300 km in the null model), especially when many of their contacts have shared mobile data. This demonstrates that privacy in online networks is collectively determined, not individually controlled.
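As an illustration of how a median localization error in kilometres could be computed, the following minimal sketch uses the haversine great-circle distance between predicted and true coordinates. The function names, the (latitude, longitude) tuple format, and the spherical-Earth radius are assumptions made for this example; the abstract does not specify the distance measure actually used.

```python
import math
from statistics import median

# Assumption: a spherical Earth with mean radius 6371 km.
EARTH_RADIUS_KM = 6371.0


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, h)))


def median_error_km(predicted, actual):
    """Median distance between paired predicted and true (lat, lon) locations."""
    return median(haversine_km(p, t) for p, t in zip(predicted, actual))
```

The same hypothetical `median_error_km` helper can be applied to both the model's predictions and the null model's predictions, so that the two medians are directly comparable.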

We apply unsupervised techniques, namely modal city prediction for location and vector similarity for biographical attributes, and benchmark them against a randomized baseline (the null model). While biographical features are harder to infer, their predictability increases meaningfully as the number of disclosing connections grows. Our analysis also shows that broader disclosure behavior across the network systematically improves inference accuracy, highlighting how individual privacy is shaped by the behavior of others. This work introduces a new form of indirect localization, where network structure and peer behavior function as latent sensors for user attributes.
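The two unsupervised techniques and the randomized baseline can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the study's implementation: the function names are hypothetical, cosine similarity stands in for the unspecified "vector similarity", and the null model is assumed here to reassign contact lists to users at random.

```python
import math
import random
from collections import Counter


def predict_modal_city(contact_cities):
    """Predict a user's city as the most frequent city among their contacts.

    Returns None when no contact has disclosed a city. Ties are resolved in
    favor of the city encountered first (Counter.most_common ordering).
    """
    if not contact_cities:
        return None
    return Counter(contact_cities).most_common(1)[0][0]


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length numeric vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def null_model_predictions(contacts_by_user, rng):
    """Assumed null model: shuffle which contact list belongs to which user,
    then apply the same modal city prediction."""
    users = list(contacts_by_user)
    shuffled = [contacts_by_user[u] for u in users]
    rng.shuffle(shuffled)
    return {u: predict_modal_city(c) for u, c in zip(users, shuffled)}


# Hypothetical toy data: each user maps to the cities disclosed by their contacts.
contacts = {
    "alice": ["Zurich", "Berlin", "Zurich"],
    "bob": ["Tokyo"],
    "carol": [],
}
predictions = {u: predict_modal_city(c) for u, c in contacts.items()}
baseline = null_model_predictions(contacts, random.Random(0))
```

Comparing the accuracy of `predictions` against that of `baseline` gives a sense of how much of the signal comes from a user's actual connections rather than from the overall distribution of cities.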

This study provides the first empirical support for the shadow profile hypothesis, demonstrating that online networks can infer personal information about non-users or passive participants through the disclosures of others. It raises important questions about the nature of privacy in digital ecosystems, where user-level consent is insufficient to safeguard personal information in a socially connected world.
