A topological heat map for data analysis (TopHeat)
Abstract
Modeling complex systems with complicated multi-scale and possibly high-dimensional geometric structure is an important scientific challenge. Topological Data Analysis (TDA) uses mathematics to represent complex systems. In particular, it adapts tools from algebra and topology to construct multi-scale summaries of both complicated geometric structures and high-dimensional phenomena. The main tools of TDA, the barcode and the persistence diagram, construct a quantitative and visual summary of a wide variety of data, including images, video, and audio signals. They can be applied to both static and dynamic data. The PI's Persistence Landscape is the most widely used tool for converting the barcode and the persistence diagram into a summary with which it is easy to perform statistical analysis and machine learning. For example, the Persistence Landscape can be used to detect subtle changes in the geometry between groups of images. It can also be combined with dimension reduction techniques such as principal component analysis and can be used for classification using support vector machines. However, while all of these are powerful summaries, they have the drawback that they are abstract representations, and interpreting them in terms of the starting data can be a time-intensive challenge requiring expert knowledge of both the data and the mathematical method. This proposal develops an entirely new approach to TDA. Instead of representing topological features in an abstract summary, they will be represented as a heat map on the starting data. This topological heat map will provide a visual representation of the topological and geometric features of the data that is easy to interpret. Furthermore, it will be extended in two important ways. It will be used to visualize the features discovered through previous analysis using TDA with statistics and machine learning, and it will be used as a summary for further statistical analysis and machine learning.
This tool will be broadly applicable to the study of a wide range of complex nonlinear systems relevant to scientific, Army, and DoD needs. In addition to developing algorithms and software to produce this topological heat map, the main part of the work will be the development of the underlying mathematical theory. In fact, a naive version of this heat map is easily obtained, but it is wildly unstable: a small change in the data can result in a drastically different heat map. Crucially, the proposed research will combine ideas from algebraic topology and functional analysis to produce a heat map that is provably stable. Small changes in the data will only result in small changes in the heat map, which is certainly a requirement for decision makers. Furthermore, the establishment of the underlying mathematics will facilitate the future development of additional advanced tools.
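To illustrate the kind of summary the abstract refers to, here is a minimal sketch of how a persistence landscape can be evaluated from a persistence diagram, using the standard definition: each interval (b, d) contributes a tent function max(0, min(t - b, d - t)), and the k-th landscape function at t is the k-th largest tent value. The function name `persistence_landscape`, its parameters, and the toy diagram are assumptions made for illustration; this is not the software proposed in the award, and it does not implement the topological heat map.

```python
import numpy as np


def persistence_landscape(diagram, grid, num_levels):
    """Evaluate the first `num_levels` landscape functions on `grid`.

    diagram    : iterable of (birth, death) pairs with finite death > birth
    grid       : 1-D sequence of sample points t
    num_levels : number of landscape functions lambda_1..lambda_K to return

    Returns an array of shape (num_levels, len(grid)).
    """
    diagram = np.asarray(diagram, dtype=float).reshape(-1, 2)
    grid = np.asarray(grid, dtype=float)

    births = diagram[:, 0][:, None]  # shape (n_intervals, 1)
    deaths = diagram[:, 1][:, None]

    # Tent function of every interval, sampled at every grid point.
    tents = np.maximum(0.0, np.minimum(grid[None, :] - births,
                                       deaths - grid[None, :]))

    # At each grid point, sort the tent values in decreasing order,
    # so row k-1 holds the k-th largest value.
    tents = -np.sort(-tents, axis=0)

    # Levels beyond the number of intervals are identically zero.
    landscape = np.zeros((num_levels, grid.size))
    k = min(num_levels, tents.shape[0])
    landscape[:k] = tents[:k]
    return landscape


# Toy diagram with two nested intervals (hypothetical data).
L = persistence_landscape([(0, 4), (1, 3)], [0, 1, 2, 3, 4], num_levels=3)
print(L[0])  # lambda_1: [0. 1. 2. 1. 0.]
print(L[1])  # lambda_2: [0. 0. 1. 0. 0.]
```

Because the result is a fixed-size array of real numbers, landscapes from several data sets can be averaged or passed to methods such as principal component analysis or a support vector machine, which is the property the abstract highlights.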
Document Details
- Document Type: DoD Grant Award
- Publication Date: Feb 25, 2019
- Source ID: W911NF1810307
Entities
People
- Peter Bubenik
Organizations
- Army Contracting Command
- United States Army
- University of Florida