Probabilistic Perspective on Topological Data Analysis

Workshop on Topology: Identifying Order in Complex Systems
Topic: Probabilistic Perspective on Topological Data Analysis
Speaker: Sayan Mukherjee
Affiliation: Duke University
Date: Wednesday, February 1
Time/Room: 2:00pm - 3:00pm / Rutgers University, Hill Center, Room 705

In this talk we discuss the recent area of topological data analysis (TDA) from a probabilistic perspective. The talk has two parts. The first part considers a classic object in topology and geometry, a (Whitney) stratified space. We formulate this object as a mixture model in the statistical sense, give an algorithm for clustering observations that belong to the same mixture component, and provide finite-sample bounds. A key idea in this procedure is relative persistent homology, which was developed in computational topology. In the second part we reconsider objects computed in computational topology as summary statistics of distributions. We focus on one popular object, the persistence diagram, and show that the space of such diagrams (with minor restrictions) admits a valid probability space. We also show that Fréchet means and variances for these summaries exist. This part of the talk argues that these summary statistics can be used for probabilistic reasoning and that classical statistical formalisms should apply. A practical implication of this work is that means and variances of persistence diagrams can be computed.