Scalable approaches to communication and inference : Minimalistic strategies for measurement and coordination
- Degree Grantor:
- University of California, Santa Barbara. Electrical and Computer Engineering
- Degree Supervisor:
- Upamanyu Madhow
- Place of Publication:
- [Santa Barbara, Calif.]
- Publisher:
- University of California, Santa Barbara
- Creation Date:
- 2014
- Issued Date:
- 2014
- Topics:
- Computer Science, Web Studies, and Engineering, Electronics and Electrical
- Keywords:
- Random projections,
Scalable routing,
Estimation,
Twitter,
Geographic routing, and
Compressed sensing - Genres:
- Online resources and Dissertations, Academic
- Dissertation:
- Ph.D.--University of California, Santa Barbara, 2014
- Description:
Recent advances in technology have enabled large-scale systems in a wide variety of settings. We consider three examples which illustrate that by carefully reducing the size of problem from the very outset, we can provide efficient solutions to the engineering difficulties of communication and inference at scale.
The first example we provide is the problem of estimating a small number of continuous valued parameters from a high dimensional signal. Random projections have been shown to be effective in capturing sparse signals with few measurements. We study the effect of random projections for parameter estimation by focusing on two lower bounds on estimation error variance, the Ziv-Zakai bound (ZZB) and the Cramer-Rao bound (CRB). They reveal that when we ensure that certain geometries (shapes) in the signal manifold (the set of all possible values that the signal can take) are preserved, these bounds are also preserved up to an SNR penalty equal to the dimensionality reduction factor. We show how the SNR penalty results when viewed in conjunction with the threshold behavior of the ZZB can be used to accurately predict the minimum number of random projections needed to avoid large estimation errors. We take up the problem of frequency estimation to illustrate the above ideas and give: (i) number of random projections needed to preserve the identified geometries and (ii) an algorithm which attains the CRB when the number of random projections exceeds the ZZB based prediction.
The second problem we consider is that of geographic routing in mobile ad hoc networks. In order to forward a packet, a relay node needs position estimates of its neighbors and the destination. When the number of nodes in the network grows, the overhead required to inform far away nodes of changes in one's position overwhelms the network. We address this problem by directing updates to only a small subset of nodes in the network. Accordingly, we give a routing protocol that accommodates scenarios where the source and/or relay nodes do not have estimates of the destination's location. The protocol parameters are chosen to guarantee that the overhead fits within network resources and to ensure that the length of every routing trajectory is bounded within a constant factor of the shortest path. We verify these analytical design prescriptions using simulations.
The third problem is an exploration of the value of time for efficient inference on Twitter. Twitter as an online social medium is defined by its real-time nature. We therefore ask whether interests (e.g., sport fandom) can be identified from a user's tweet times alone by exploiting the known timing of "events" associated with the interest (e.g., game times of the sport team). Our results indicate that tweet times can be used to make reliable inferences on the interests of a large number of users. With a view to automate event identification, we develop an inference framework with minimal measurement (time of tweets) and processing requirements for accurately determining when a topic trends on Twitter from an aggregate Twitter feed related to the topic. We then dissect the identified trending times into groups, each associated with a different subtopic, using only userIDs of tweets made during trending times based on the intuition that when two events/trending-times share a large fraction of users, they likely correspond to the same subtopic. Our results from Twitter data obtained over six months illustrate that significant insights into topic-specific Twitter activity can be obtained within our frugal measurement and processing framework. These results suggest that time can be a compact and effective cue to cull data from massive online streams like Twitter.
- Physical Description:
- 1 online resource (234 pages)
- Format:
- Text
- Collection(s):
- UCSB electronic theses and dissertations
- Other Versions:
- http://gateway.proquest.com/openurl?url_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:dissertation&res_dat=xri:pqm&rft_dat=xri:pqdiss:3682926
- ARK:
- ark:/48907/f37s7kxp
- ISBN:
- 9781321568042
- Catalog System Number:
- 990045118450203776
- Copyright:
- Dineshkumar Karuppanna Gounder Ramasamy, 2014
- Rights:
- In Copyright
- Copyright Holder:
- Dineshkumar Karuppanna Gounder Ramasamy
File | Description |
---|---|
Access: Public access | |
KaruppannaGounderRamasamy_ucsb_0035D_12394.pdf | pdf (Portable Document Format) |