Abstract: In this talk, I will present a general framework for sequence modeling that exploits the segmental structure of sequences. We first observe that segmental structure is a common pattern in many types of sequences, e.g., phrases in human languages. We then design a probabilistic model that considers all valid segmentations of a sequence, and we describe an efficient and exact dynamic programming algorithm for the forward and backward computations. Owing to this generality, the model can be used as a loss function in many sequence tasks. We demonstrate our approach on text segmentation, speech recognition, machine translation, and dialog policy learning. In addition to quantitative results, we also show that our approach can discover meaningful segments in their respective application contexts. (This is joint work with many of my previous and current collaborators.)
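To make the "sum over all valid segmentations" idea concrete, here is a minimal sketch of the kind of forward dynamic program the abstract alludes to. This is a generic illustration, not the speaker's actual model: the function name `forward_logsumexp` and the assumption that each segment carries an independent log-score are mine; a real system would compute segment scores with a learned model.

```python
import math

def forward_logsumexp(scores):
    """Sum over all segmentations, exactly, in O(T^2) time.

    scores[j][k] is the (hypothetical) log-score of treating positions
    j..k (inclusive) as a single segment.  Returns the log of the sum,
    over all 2^(T-1) segmentations of a length-T sequence, of the
    product of its segment scores (the partition function).
    """
    T = len(scores)
    alpha = [float("-inf")] * (T + 1)
    alpha[0] = 0.0  # the empty prefix has exactly one (empty) segmentation
    for t in range(1, T + 1):
        # A segmentation of the prefix of length t ends with a segment
        # spanning positions j..t-1, for some start j < t.
        terms = [alpha[j] + scores[j][t - 1] for j in range(t)]
        m = max(terms)  # log-sum-exp, stabilized for numerical safety
        alpha[t] = m + math.log(sum(math.exp(x - m) for x in terms))
    return alpha[T]
```

With all segment scores set to zero (i.e., every segment has score 1), the result is the log of the number of segmentations, 2^(T-1), which gives a quick sanity check of the recursion. The backward pass mentioned in the abstract is the mirror-image recursion over suffixes, and together they yield exact marginals for each candidate segment.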
Bio: Chong Wang is a research scientist at Google. Before Google, he worked at Microsoft Research and Baidu Silicon Valley AI Lab. He received his PhD from Princeton University. His research interests include machine learning and its applications to speech, translation, and natural language understanding. He has won several best paper awards at top machine learning conferences, and some of his work has gone into widely used products that serve users around the globe. His homepage is https://chongw.github.io