Statistical Learning with a Geometric Digraph Family

Artür Manukyan

Date and Time

30 October 2017 - 14:00

Venue

Engineering Building, A-511

Title: Statistical Learning with a Geometric Digraph Family

Abstract: A considerable portion of statistical learning methods model data sets as graphs. Proximity graphs offer solutions to many challenges in supervised and unsupervised statistical learning problems. Among these graphs, class cover catch digraphs (CCCDs) were first introduced to investigate the class cover problem (CCP) and were later employed in classification and clustering. However, this family of digraphs can be improved further to construct better classifiers and clustering algorithms. We use proximity catch digraphs (PCDs) to tackle popular problems in statistical learning, such as robustness, prototype selection, and determining the number of clusters. PCDs generalize CCCDs and have proven useful in spatial data analysis. We will investigate the performance of CCCDs and PCDs in both supervised and unsupervised statistical learning schemes, and discuss how these digraph families address some well-known real-life challenges. We show that CCCD classifiers perform relatively well when one class is more frequent than the others, an example of the class imbalance problem. Then, by using a barycentric coordinate system and extending Delaunay tessellations to partition R^d, we establish PCD-based classifiers and clustering methods that are robust to the class imbalance problem and have computationally tractable prototype sets, making them both appealing and fast. In addition, our clustering algorithms are parameter-free adaptations of an unsupervised version of CCCDs, namely cluster catch digraphs (CCDs). We partition data sets by incorporating spatial data analysis tools based on Ripley’s K function, which estimates the spatial intensity parameters of homogeneous Poisson processes. Such methods are crucial in real-life practice, where domain knowledge is often unavailable.

 

Biography: Artür Manukyan recently received his Ph.D. in Computational Science and Engineering, with a specialization in computational statistics, from Koç University, where he developed graph-based classification and clustering algorithms. Prior to this, he received a B.Sc. in Statistics from Yıldız Technical University and an M.Sc. in Computational Science and Engineering from Istanbul Technical University. His research interests include various areas of computational statistics, such as graph-theoretic statistical inference, statistical machine learning, and data visualization.

Yeditepe University, Industrial and Systems Engineering
26 Ağustos Yerleşimi, Kayışdağı Cad. 34755 Ataşehir, İstanbul

(216) 578 04 50 info@sye.yeditepe.edu.tr