
From Nodes to Networks: Exploiting Autocorrelation to Improve Statistical Models of Relational Data

Jennifer Neville, University of Massachusetts Amherst

Friday, March 24
10:15am - SENSQ 5317
Students meet the speaker at 11:30am

Refreshments at 10:00am

Hosted by Alex Labrinidis

Abstract

Statistical relational learning is transforming the field of automated learning and discovery by moving beyond the conventional analysis of entities in isolation to analyze networks of interconnected entities. In domains such as bioinformatics, citation analysis, epidemiology, fraud detection, intelligence analysis, and web analytics, there is often limited information about any one entity in isolation; instead, it is the connections among entities that are of crucial importance to pattern discovery.

One of the most compelling reasons to use relational models is the ubiquitous presence of autocorrelation in relational datasets. Autocorrelation is a statistical dependency between the values of the same variable on related entities (e.g., hyperlinked web pages are likely to discuss the same topic), which can be exploited to improve predictions by learning models for collective inference. In this talk, I will discuss two graphical models I have developed for collective inference: relational dependency networks (RDNs) and latent group models (LGMs). RDNs are the first statistical relational models capable of learning cyclic autocorrelation dependencies. LGMs are the first models to exploit latent group structures to improve inference accuracy and efficiency.

To understand the performance differences between RDNs and LGMs, I have developed an extended bias-variance analysis framework that incorporates errors due to both learning and inference. Using this framework, I will demonstrate the effects of data characteristics on model performance and illustrate the mechanisms behind model performance that can be used to drive the development of improved models and algorithms.

Biography of Speaker

Jennifer Neville is a PhD candidate in the Department of Computer Science at the University of Massachusetts Amherst working with Professor David Jensen in the Knowledge Discovery Laboratory. Her research focuses on data mining and machine learning in relational data, with applications in bioinformatics, citation analysis, epidemiology, fraud detection, and web analytics. Jennifer received her B.S. with honors in 2000 and her M.S. in 2004 from the University of Massachusetts Amherst. She was awarded graduate research fellowships by both NSF and AT&T Laboratories.
