Alexandria Digital Research Library

Battling Spam and Sybils on the Social Web

Author:
Wilson, Christo
Degree Grantor:
University of California, Santa Barbara. Computer Science
Degree Supervisor:
Ben Y. Zhao
Place of Publication:
[Santa Barbara, Calif.]
Publisher:
University of California, Santa Barbara
Creation Date:
2012
Issued Date:
2012
Topics:
Computer Science
Keywords:
Spam
Measurement
Sybils
Facebook
Online Social Networks
Computer Security
Genres:
Online resources and Dissertations, Academic
Dissertation:
Ph.D.--University of California, Santa Barbara, 2012
Description:

Since 2004, the social web has become a dominant force on the Internet. As of 2011, 65% of adults in the US used online social networking (OSN) sites, and this number continues to grow, both in the US and around the world. However, as OSNs gradually supplant email and instant messaging as the primary channel for online communication, the incentive for malicious users to attack these systems grows. Social spam and fake Sybil accounts are now the primary tools for online criminals looking to spread malware and steal personal information on OSNs. In this work, we take the first steps towards measuring, understanding, and defending against these threats to the social web.

We begin by conducting detailed studies of two of the largest OSNs in the world: Facebook and Renren. Quantifying the basic graph structural properties of these OSNs gives us a solid foundation of understanding on which to build further research. Our work goes beyond existing studies that are focused on static topologies by accounting for the relative importance of individual edges of the social graph. By analyzing visible and latent interactions between users, we show that not all edges in social graphs are equally important, and develop "interaction graphs" to capture these effects. Through simulations on real social graphs, we show that edge importance has a large effect on the performance of social applications. This result indicates that ongoing research into social applications and algorithms should take user interactions into account if it hopes to obtain realistic and accurate results.

Our baseline OSN measurements allow us to characterize the behavior of normal users in great detail, which opens a window of opportunity for identifying anomalies associated with malicious activity. As a first step towards understanding malicious activity on OSNs, we examine its most prominent outward symptom: spam. We analyze hundreds of millions of wall posts received by millions of Facebook users and develop a novel set of automated techniques to detect social spam. Our results show that a significant portion of the URLs shared on Facebook are spam, the majority of which link to malicious phishing websites. These spam attacks are organized into large, coordinated campaigns by criminals working behind the scenes. Analysis of the behavior of spamming accounts demonstrates that both fake, Sybil accounts and compromised normal accounts are used as tools to attack Facebook users.

Next, we turn our attention to the problem of Sybil accounts. Although our work on spam detection identified Sybils as a major threat to OSNs, at the time no practical solutions to this problem had been developed. To address this challenge, we use ground truth data provided by Renren Inc. to build a measurement-based Sybil detector. This system is currently deployed on the Renren OSN, and to date it has caught and banned millions of Sybils. Importantly, our detector operates in real time, meaning that Sybils are banned before they get a chance to generate harmful spam. We study the edge creation behavior of Sybils on Renren, and find that, contrary to prior conjecture, they do not form tight-knit communities. Instead, they integrate into the social graph just like normal users. This result confirms our hypothesis that existing Sybil community detectors from the literature are unlikely to succeed on today's OSNs.

In summary, our research makes two fundamental contributions to the study of OSNs. First, our work demonstrates the necessity of measurement-driven design of social systems. Repeatedly, our measurements have contradicted assumptions from prior work, and thus revealed new avenues of research. Second, we have discovered, quantified, and developed practical solutions for pressing OSN security problems. However, the social web continues to evolve, and the shape of its attack surface is constantly changing. Attackers will continue to innovate new and unexpected strategies to exploit OSNs and evade security mechanisms. Only by bringing all of our tools to bear (measurement, graph analysis, data mining, machine learning, and others) can computer scientists hope to defend against future threats to the social web.

Physical Description:
1 online resource (329 pages)
Format:
Text
Collection(s):
UCSB electronic theses and dissertations
ARK:
ark:/48907/f33776p7
ISBN:
9781267768018
Catalog System Number:
990039148390203776
Rights:
In Copyright
Copyright Holder:
Christo Wilson
Access: This item is restricted to on-campus access only. Please check our FAQs or contact UCSB Library staff if you need additional assistance.