As part of an annual award from the University of Minnesota recognizing the university's top recent PhD graduates, Department of Computer Science assistant professor Dr. Xiaowei Jia received one of four best dissertation awards. The award is given in four categories: arts & humanities; biological and medical sciences; social & behavioral sciences and education; and physical sciences and engineering, the category in which Jia was recognized. His dissertation, entitled “Integrating Physics into Machine Learning for Monitoring Scientific Systems,” was published in summer of 2020.
Abstract: Machine learning (ML) has transformed all aspects of our life including how we make decisions, entertain ourselves, and interact with each other. The power of ML models lies in their ability to automatically extract useful patterns from complex data. Given the well-known success of ML in commercial domains, there is an increasing interest in using ML models for advancing scientific discovery. However, direct application of “black-box” ML models has met with limited success in scientific domains given that the data available for many scientific problems is far smaller than what is needed to effectively train advanced ML models. Moreover, in the absence of adequate information about the physical mechanisms of real-world processes, ML approaches are prone to false discoveries of patterns which look deceptively good on training data but cannot generalize to unseen scenarios. This thesis introduces a new generation of machine learning approaches which leverage accumulated scientific knowledge to solve problems of great scientific and societal relevance. We investigate multiple ways in which physical knowledge can be used in the design of ML models for effectively capturing underlying physical processes that are evolving and interacting at multiple scales. We also introduce new optimization strategies for ML models so that they can achieve higher accuracy with limited data and also preserve the correctness from a physical perspective. We will describe our technical innovations and show how they help address real-world challenges by focusing on applications from two disciplines: aquatic science and monitoring crops at scale. We first introduce a physics-guided machine learning framework, which explores a deep coupling of ML methods with scientific knowledge.
We show this approach can significantly outperform the state-of-the-art physics-based models and machine learning models in monitoring lake systems and river networks using limited training data while also maintaining consistency with known physical laws. Also, we greatly advance existing deep learning methods so that they can learn patterns from real-world data of greater complexity. These techniques have shown a lot of success in detecting primary crops in the US and tree crop plantations in Southeast Asia.