Alexandria Digital Research Library

Towards Querying and Mining of Large-Scale Networks

Author:
Khan, Arijit
Degree Grantor:
University of California, Santa Barbara. Computer Science
Degree Supervisor:
Xifeng Yan
Place of Publication:
[Santa Barbara, Calif.]
Publisher:
University of California, Santa Barbara
Creation Date:
2013
Issued Date:
2013
Topics:
Computer Science
Keywords:
Graph Query
Big Graphs
Graph Mining
Information Networks
Social Networks
Genres:
Online resources and Dissertations, Academic
Dissertation:
Ph.D.--University of California, Santa Barbara, 2013
Description:

With the advent of the internet, sources of data have increased dramatically, including the World Wide Web, social networks, knowledge graphs, and medical and government records. Oftentimes, relations exist among the entities in these data. Therefore, we observe structures in the data, but these structures are implicit and not as rigid or regular as those found in standard database systems. These semi-structured data are usually represented as large networks with labeled nodes and edges. Querying and mining of these linked datasets are essential for a wide range of emerging applications, such as viral marketing, web search, malware detection, image retrieval, and social network analysis. However, the complex combinations of structure and content, coupled with the massive volume of these data, raise several challenges that require new efforts for smarter and faster graph analysis.

My research interests span the emerging problems in large-scale, heterogeneous, semi-structured data, with a focus on querying and pattern mining in social and information networks using scalable algorithms and machine learning techniques. My research on large-scale graphs can be categorized into two broad directions: (1) querying of large-scale networks, including heterogeneous networks, uncertain graphs, and stream graphs, and (2) pattern mining over large graphs. In the domain of querying heterogeneous networks, due to noise and lack of schema, structured methods such as SPARQL (which require an underlying schema to formulate a query) are often too restrictive. Without knowing the exact structure of the data and the semantics of the entity labels and their relationships, can we still query them and obtain the relevant results? In addition, how do we query uncertain graphs and streams? In the area of graph pattern mining, what graph features should one extract in order to build an accurate and efficient classifier over large networks? From the perspective of advertising and viral marketing, what are the top-k most interesting itemsets and the top-k most influential persons in a social network? In my dissertation, I shall discuss our effective and efficient techniques to solve these emerging problems associated with querying and mining of complex big graphs.

Physical Description:
1 online resource (255 pages)
Format:
Text
Collection(s):
UCSB electronic theses and dissertations
ARK:
ark:/48907/f3w66hsq
ISBN:
9781303539176
Catalog System Number:
990040924710203776
Rights:
In Copyright
Copyright Holder:
Arijit Khan
Access: This item is restricted to on-campus access only.