September 13: Seminar – Arnab Auddy

Time and Location: September 13 (Tuesday) 11:30am-12:30pm in CH 212

Speaker: Arnab Auddy (Columbia University)

Title: Why and how to use orthogonally decomposable tensors for statistical learning

Abstract: As we encounter increasingly complex data generating mechanisms, it becomes necessary to model higher-order interactions among the observed variables. Orthogonally decomposable tensors provide a unified framework for such modeling in a number of interesting statistical problems. While they are a natural extension of the matrix SVD to tensors, they automatically enjoy much better identifiability properties. Moreover, a small perturbation affects each singular vector in isolation, so recovery of a singular vector does not depend on the gap between consecutive singular values. In addition to these attractive statistical properties, the tensor decomposition problem in this case presents intriguing computational challenges. To understand these better, we will explore some statistical-computational tradeoffs and describe tractable methods that provide rate-optimal estimators for the tensor singular vectors.
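As a minimal illustration of the objects in this talk (not the speaker's method, and with all names and parameters chosen for demonstration), the sketch below builds a symmetric, orthogonally decomposable order-3 tensor T = Σᵢ λᵢ vᵢ⊗vᵢ⊗vᵢ with orthonormal vᵢ, then recovers one component with the standard tensor power iteration u ← T(I, u, u)/‖T(I, u, u)‖:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes: dimension d, number of components k.
d, k = 8, 3
V, _ = np.linalg.qr(rng.standard_normal((d, k)))  # columns = orthonormal v_i
lam = np.array([3.0, 2.0, 1.0])                   # positive singular values

# T = sum_i lam_i * v_i (x) v_i (x) v_i  (symmetric, orthogonally decomposable)
T = np.einsum('i,ai,bi,ci->abc', lam, V, V, V)

def tensor_power_iteration(T, iters=100, seed=1):
    """Iterate u <- T(I, u, u) / ||T(I, u, u)|| from a random start."""
    u = np.random.default_rng(seed).standard_normal(T.shape[0])
    u /= np.linalg.norm(u)
    for _ in range(iters):
        u = np.einsum('abc,b,c->a', T, u, u)
        u /= np.linalg.norm(u)
    return u

u = tensor_power_iteration(T)
# For an odeco tensor, a generic start converges to one of the v_i,
# so u should be (nearly) parallel to some column of V:
print(np.max(np.abs(V.T @ u)))
```

Because the components are orthogonal, the iteration's fixed points are exactly the vᵢ, which is one concrete sense in which orthogonal decomposability aids recovery.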

April 24: Seminar – Yongdai Kim

Time and Location: April 24, 3-4pm in CH 212

Speaker: Yongdai Kim (Seoul National University, Korea)

Title: Fast learning with deep learning architectures for classification

Abstract: We derive fast convergence rates for a deep neural network (DNN) classifier with the rectified linear unit (ReLU) activation function learned using the hinge loss. We consider three cases for the true model: (1) a smooth decision boundary, (2) a smooth conditional class probability, and (3) the margin condition (i.e., the probability of inputs near the decision boundary is small). We show that the DNN classifier learned using the hinge loss achieves fast convergence rates in all three cases, provided that the architecture (i.e., the number of layers, the number of nodes, and the sparsity) is carefully selected. An important implication is that DNN architectures are very flexible and can be used in various cases without much modification. In addition, we consider a DNN classifier learned by minimizing the cross-entropy and give conditions for fast convergence rates. If time allows, computational algorithms for choosing the right size of deep architecture to achieve fast convergence rates will be discussed.

This is joint work with Ph.D. students Ilsang Ohn and Dongha Kim.
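For readers unfamiliar with the setup of the talk, the sketch below is a minimal illustrative example (not the speakers' estimator; data, network size, and learning rate are all invented) of a one-hidden-layer ReLU network trained with the hinge loss on labels y ∈ {−1, +1}:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: label is the sign of x1 + x2 (illustrative only).
X = rng.standard_normal((200, 2))
y = np.sign(X[:, 0] + X[:, 1])

# One hidden layer of ReLU units; f(x) = w2 . relu(W1 x + b1) + b2.
h = 16
W1 = rng.standard_normal((2, h)) * 0.5
b1 = np.zeros(h)
w2 = rng.standard_normal(h) * 0.5
b2 = 0.0

lr = 0.1
for _ in range(300):
    Z = X @ W1 + b1                    # pre-activations
    A = np.maximum(Z, 0)               # ReLU
    f = A @ w2 + b2                    # real-valued scores
    # Hinge loss max(0, 1 - y f): subgradient in f is -y on the margin region.
    g = np.where(y * f < 1, -y, 0.0) / len(y)
    # Backpropagate through the two layers.
    gw2 = A.T @ g
    gb2 = g.sum()
    gZ = np.outer(g, w2) * (Z > 0)
    gW1 = X.T @ gZ
    gb1 = gZ.sum(axis=0)
    W1 -= lr * gW1; b1 -= lr * gb1
    w2 -= lr * gw2; b2 -= lr * gb2

# Classify by the sign of the learned score function.
pred = np.sign(np.maximum(X @ W1 + b1, 0) @ w2 + b2)
acc = (pred == y).mean()
print(acc)
```

The hinge loss drives scores past the margin y f(x) ≥ 1 rather than fitting class probabilities, which is the loss under which the talk's fast rates are derived.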