Evolving and Self-Managing Data Integration Systems
Dr. AnHai Doan, UIUC
Wednesday, April 14, 2004
11:00am - SENSQ 5317
Joint Pitt/CMU database group meeting
Abstract
Data integration is the problem of providing uniform query interfaces to disparate data sources, so that users can interact with the sources as if with a single source. This problem lies at the heart of efforts to build intelligent information agents, and to process information at enterprises, across government agencies, on the World-Wide Web, and on the envisioned Semantic Web. As such, the problem has received much attention from both the database and AI communities. Much progress has been made, but today data integration systems are still very hard to build and costly to operate. They must be told in tedious detail how to interact with data sources, and must be constantly modified to deal with changes at the sources.
In this talk I will describe the AIDA project, whose vision is autonomic, self-managing data integration systems: those that take only minutes to deploy, that require only minimal human coaching to rapidly reach and maintain competence, and that continuously improve over time. I will discuss some fundamental issues that arise, such as schema reconciliation and entity matching and fusion. I will also describe conceptually novel solutions that leverage the mass of users, extract and apply domain knowledge, and design source schemas for interoperability. Finally, I will show how machine learning techniques can be employed effectively in these solutions.
Bio
AnHai Doan is an assistant professor in the Department of Computer Science, University of Illinois at Urbana-Champaign. He obtained a Ph.D. from the University of Washington in 2002. His interests span databases and AI, with a current emphasis on schema and ontology matching, entity matching, autonomic data integration, integrating text and structured data, and machine learning. Selected recent honors include the William Chan Memorial Dissertation Award from the University of Washington, the ACM Dissertation Award in 2003, and inclusion on the list of teachers ranked as excellent by their students at the University of Illinois. Selected recent professional activities include co-editing the Special Issue on Semantic Integration for SIGMOD Record and AI Magazine, to appear in 2004.





