Alexandria Digital Research Library

Analyzing and Processing Big Real Graphs

Author:
Zhao, Xiaohan
Degree Grantor:
University of California, Santa Barbara. Computer Science
Degree Supervisor:
Ben Y. Zhao
Place of Publication:
[Santa Barbara, Calif.]
Publisher:
University of California, Santa Barbara
Creation Date:
2014
Issued Date:
2014
Topics:
Computer Science
Keywords:
Approximation Algorithms
Dynamic Graph Models
Graph Analysis
Privacy
Graph
Genres:
Online resources and Dissertations, Academic
Dissertation:
Ph.D.--University of California, Santa Barbara, 2014
Description:

As fundamental abstractions of network structures, graphs are everywhere, ranging from biological protein interaction networks and Internet routing networks, to emerging online social networks. Studying graphs is critical to understanding the fundamental processes behind the networks, and of practical importance in experimental research. Although many studies on graphs have been carried out in decades, most of the work focused on small or synthetic graphs. In recent years, because of the unprecedented increase of existing networks and the emergence of new complex networks, more and more big real graphs are becoming available. Compared to the graphs studied in prior work, the graphs from these networks are significantly different in scale, level of dynamics and structure.

In this dissertation, we tackle three important graph research problems caused by the significant differences of the big real graphs: efficient node distance computation, graph dynamic analysis and modeling, and graph privacy.

First, we target on a fundamental graph analysis problem, i.e. node distance computation. As a primitive of graph analysis and network applications, the computation of shortest path or random walk distances is computationally expensive, and difficult to scale with the sheer size of big real graphs. To address the scalability issue, we design a novel node distance computation method, named graph coordinate systems, to efficiently estimate node distances with high accuracy.

Our second work is to understand and model the dynamic processes in big real graphs. Specifically, we propose methods to analyze graph dynamics at multiple network scales and explore temporal properties of network growth. Through measurements on Renren first two-year dynamic data, we find independent and predictable processes at different network levels, and detect self-similar properties in its edge creation process. Based on the observations, we propose a new dynamic graph model to capture both temporal and spatial properties. Calibrated with the Renren dataset, our model successfully produces synthetic graphs showing similar dynamic properties.

Finally, to address privacy issue in sharing graphs, we design a graph privacy system to guarantee the required level of privacy. The goal of our work is to design a system that can both maintain a meaningful graph structure and provide strong privacy guarantee. To navigate the tradeoff between the strength of privacy and graph structure utility, we propose a differentially-private graph model. Our rigorous proof shows that the graphs produced by the system can achieve the required level of privacy. By running the system on real graphs collected from Facebook, Internet, and Web, the results demonstrate that the generated synthetic graphs match the original graphs in terms of graph structural metrics and application-level performance.

In summary, to analyze and process the graphs from today's large complex networks, we work on three important problems, including efficiently computing node distances in massive graphs, analyzing and modeling high volume of dynamics in big real graphs, and protecting graph privacy in sharing graphs. We propose novel solutions to address these problems. Through our extensive experiments, we show that our designs perform consistently well on big real graphs.

Physical Description:
1 online resource (230 pages)
Format:
Text
Collection(s):
UCSB electronic theses and dissertations
ARK:
ark:/48907/f32r3pvm
ISBN:
9781321568899
Catalog System Number:
990045119160203776
Rights:
Inc.icon only.dark In Copyright
Copyright Holder:
Xiaohan Zhao
File Description
Access: Public access
Zhao_ucsb_0035D_12449.pdf pdf (Portable Document Format)