[AI Seminar] 7/28 John Wieting on Learning and Applications of Paraphrastic Representations for Natural Language

Speaker: John Wieting
Date: Tuesday 07/28/2020, 2pm-3pm
Zoom link: check e-mails from AI or CL lists or e-mail zong.56@osu.edu

Title: Learning and Applications of Paraphrastic Representations for Natural Language

Abstract: Representation learning has had a tremendous impact in machine learning and natural language processing (NLP), especially in recent years. Learned representations provide useful features needed for downstream tasks, allowing models to incorporate knowledge from billions of tokens of text. The result is better performance and generalization on many important problems of interest. This talk focuses on the problem of learning paraphrastic representations for units of language spanning from sub-words to full sentences – the latter being a focal point. Our primary goal is to learn models that can encode arbitrary word sequences into a vector with the property that sequences with similar semantics are near each other in the learned vector space, and that this property transfers across domains.

We first show several simple, but effective, models to learn word and sentence representations on noisy paraphrases automatically extracted from bilingual corpora. These models outperform contemporary models on a variety of semantic evaluations. We then propose techniques to enable deep networks to learn effective semantic representations, addressing a limitation of our prior work. We also automatically construct a large paraphrase corpus that improves the performance of all our studied models, especially those using deep architectures, and has found uses for a variety of generation tasks such as paraphrase generation and style-transfer.

We next propose models for multilingual paraphrastic sentence representations. Again, we first propose a simple and effective approach that outperforms more complicated methods on cross-lingual sentence similarity and mining bitext. We then propose a generative model that concentrates semantic information into a single interlingua representation and pushes information responsible for linguistic variation to separate language-specific representations. We show that this model has improved performance on both monolingual and cross-lingual tasks over prior work and successfully disentangles these two sources of information.

Finally, we apply our representations to the task of fine-tuning neural machine translation systems using minimum risk training. The conventional approach is to use BLEU (Papineni et al., 2002), since that is commonly used for evaluation. However, we found that using an embedding model to evaluate similarity allows the range of possible scores to be continuous and, as a result, introduces fine-grained distinctions between similar translations. The result is better performance on both human evaluations and BLEU score, along with faster convergence during training.
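As a rough illustration of the minimum risk training objective described above (a sketch under stated assumptions, not the speaker's implementation): the expected cost over a set of candidate translations can be computed with an embedding-based similarity in place of BLEU. The candidate vectors, the plain cosine similarity, and the function names below are hypothetical stand-ins for a learned sentence encoder, and the sharpness hyperparameter often used in minimum risk training is omitted.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def expected_risk(log_probs, cand_vecs, ref_vec):
    """Expected cost over a candidate set: sum_y Q(y) * (1 - sim(y, ref)).

    log_probs: model log-scores of the candidates (assumed given).
    cand_vecs: hypothetical sentence embeddings of the candidates.
    ref_vec:   hypothetical sentence embedding of the reference.
    """
    # Renormalize the model scores over the candidate set (softmax).
    top = max(log_probs)
    weights = [math.exp(lp - top) for lp in log_probs]
    total = sum(weights)
    probs = [w / total for w in weights]
    # Continuous cost: 1 - similarity, so near-paraphrases get a small cost.
    costs = [1.0 - cosine(v, ref_vec) for v in cand_vecs]
    return sum(p * c for p, c in zip(probs, costs))

# Toy example: the first candidate matches the reference, the second does not.
ref = [1.0, 0.0]
cands = [[1.0, 0.0], [0.0, 1.0]]
uniform = expected_risk([0.0, 0.0], cands, ref)
shifted = expected_risk([1.0, 0.0], cands, ref)
```

In this toy setting the risk drops when probability mass moves toward the candidate closer to the reference, which is the behavior the training objective rewards.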

Bio: John Wieting is a PhD candidate in the Language Technologies Institute at Carnegie Mellon University, supervised by Graham Neubig and Taylor Berg-Kirkpatrick. Previously he worked with Kevin Gimpel at the Toyota Technological Institute at Chicago, and completed his MS under the guidance of Dan Roth at the University of Illinois Urbana-Champaign. His research focuses on representation learning and its applications for natural language processing. He is also interested in language generation, with a particular interest in paraphrasing and related tasks.

[AI Seminar] 2/4 Yang Zhong on Discourse Level Factors for Sentence Deletion in Text Simplification

Speaker: Yang Zhong
Time: TUESDAY 02/04/2020, 4pm-5pm
Location: Dreese Lab 480

Discourse Level Factors for Sentence Deletion in Text Simplification
Yang Zhong, Chao Jiang, Wei Xu, Junyi Jessy Li
AAAI 2020
https://cocoxu.github.io/publications/AAAI-ZhongY.9975.pdf

Title: Discourse Level Factors for Sentence Deletion in Text Simplification

Abstract: In this talk, I will present our paper accepted at AAAI 2020. We conduct a data-driven study that analyzes and predicts sentence deletion, a prevalent but understudied phenomenon in document simplification, on a large English text simplification corpus. We inspect various discourse-level factors associated with sentence deletion, using a new manually annotated sentence alignment corpus we collected. We reveal that professional editors utilize different strategies to meet the readability standards of elementary and middle schools. To predict whether a sentence will be deleted during simplification to a certain level, we harness automatically aligned data to train a classification model. We find that discourse-level factors contribute to the challenging task of predicting sentence deletion for simplification.

Bio: Yang Zhong is a first-year Ph.D. student in the Department of Computer Science and Engineering, advised by Prof. Wei Xu. His research mainly focuses on stylistic variation in language and on document-level text simplification.

[AI Seminar] 11/20 Mikhail Belkin on It’s time to think again

Speaker: Mikhail Belkin, Dept. of CSE
Time: Wed 11/20, 4pm-5pm
Location: Dreese Lab 480

Title: It’s time to think again

Abstract: Deep learning has drastically changed the practice in many applied areas of AI. But an equally profound and not yet fully recognized impact of deep learning is in forcing us to rethink many deeply held beliefs and theoretical assumptions. I will discuss some of these ideas, such as over-fitting and capacity control, and show why they fail to adequately describe modern machine learning. I will point to the types of analyses we need to understand and develop modern practice.

Bio: Mikhail Belkin is a Professor in the departments of Computer Science and Engineering and Statistics at the Ohio State University. He received a PhD in mathematics from the University of Chicago in 2003. His research focuses on understanding structure in data, the principles of recovering such structures, and their computational, mathematical and statistical properties. His notable work includes algorithms such as Laplacian Eigenmaps and Manifold Regularization, which use ideas of classical differential geometry for analyzing non-linear high-dimensional data. He is the recipient of an NSF CAREER Award, and has served on the editorial boards of the Journal of Machine Learning Research and IEEE PAMI.

[AI Seminar] 11/6 Wei-Lun (Harry) Chao on Pseudo-LiDAR for Image-based 3D Object Detection in Autonomous Driving

Speaker: Wei-Lun (Harry) Chao
Time: Wed 11/6, 4pm-5pm
Location: Dreese Lab 480

Title: Pseudo-LiDAR for Image-based 3D Object Detection in Autonomous Driving

Abstract: Detecting objects such as cars and pedestrians in 3D plays an indispensable role in autonomous driving. Recent techniques excel with highly accurate detection rates, provided that the 3D input data is obtained from precise but expensive LiDAR technology. Approaches based on cheaper monocular or stereo imagery data have, until now, resulted in drastically lower accuracies — a gap that is commonly attributed to poor image-based depth estimation.

In this talk, I will show that it is not the quality of the data but its representation that accounts for the majority of the performance gap. Taking the inner workings of ConvNets into consideration, we propose to convert image-based depth maps to pseudo-LiDAR representations — essentially mimicking the LiDAR signal. With this representation, we can apply different existing LiDAR-based detection algorithms. On the popular KITTI benchmark, our approach significantly outperforms the existing state of the art in image-based performance — leading to a 300% relative improvement and halving the gap between image-based and LiDAR-based systems.
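The depth-map-to-point-cloud conversion mentioned above can be sketched roughly as follows, assuming a standard pinhole camera model. This is a minimal illustration, not the speaker's code: the function name and the intrinsics (fx, fy, cx, cy) are hypothetical, and the points stay in the camera frame, whereas a full pipeline would also transform them into the LiDAR coordinate frame.

```python
def depth_to_points(depth, fx, fy, cx, cy):
    """Back-project a dense depth map into 3D points in the camera frame.

    depth: list of rows, each a list of per-pixel depths (e.g. in meters).
    fx, fy: focal lengths in pixels; cx, cy: principal point in pixels.
    Pixel (u, v) with depth z maps to
        x = (u - cx) * z / fx,  y = (v - cy) * z / fy,  z = z.
    """
    points = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0:  # skip pixels without a valid depth estimate
                continue
            points.append(((u - cx) * z / fx, (v - cy) * z / fy, z))
    return points

# Toy 2x2 depth map with made-up intrinsics; one pixel has no valid depth.
depth = [[10.0, 10.0],
         [0.0, 20.0]]
cloud = depth_to_points(depth, fx=2.0, fy=2.0, cx=0.5, cy=0.5)
```

The resulting list of (x, y, z) tuples has the same form as a LiDAR point cloud, which is what lets existing LiDAR-based detectors consume it.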

I will then present methods to advance the pseudo-LiDAR framework through improvements in image-based depth estimation. Concretely, we adapt the network architecture and loss function to be more aligned with accurate depth estimation. Further, we explore the idea of leveraging cheaper but extremely sparse LiDAR sensors, which alone provide insufficient information for 3D detection, to de-bias our depth estimation. On the KITTI benchmark, our combined approach yields substantial improvements: a 40% relative improvement for far-away objects and performance comparable to LiDAR-based systems for nearby objects. I will conclude the talk with research challenges and opportunities in robust perception for autonomous driving.

Bio: Wei-Lun (Harry) Chao is an Assistant Professor in Computer Science and Engineering at the Ohio State University. His research interests are in machine learning and its applications to computer vision, natural language processing, artificial intelligence, and healthcare. His recent work has focused on robust robotic perception and large-scale visual understanding in the wild. Prior to joining OSU, he was a Postdoctoral Associate in Computer Science at Cornell University working with Prof. Kilian Q. Weinberger and Prof. Mark Campbell. He received a Ph.D. degree in Computer Science from the University of Southern California, supervised by Prof. Fei Sha.

[AI Seminar] 10/28 Marie-Catherine de Marneffe on Do you know that there’s still a chance? Identifying speaker commitment for natural language understanding

Speaker: Marie-Catherine de Marneffe, Department of Linguistics
Time: MONDAY 10/28, 4pm-5pm
Location: Dreese Lab 480

Title: Do you know that there’s still a chance? Identifying speaker commitment for natural language understanding.

Abstract:
When we communicate, we infer a lot beyond the literal meaning of the words we hear or read. In particular, our understanding of an utterance depends on assessing the extent to which the speaker stands by the event she describes. An unadorned declarative like “The cancer has spread” conveys firm speaker commitment to the cancer having spread, whereas “There are some indicators that the cancer has spread” imbues the claim with uncertainty. It is not only the absence vs. presence of embedding material that determines whether or not a speaker is committed to the event described: from (1) we will infer that the speaker is committed to there *being* war, whereas in (2) we will infer the speaker is committed to relocating species *not being* a panacea, even though the clauses that describe the events in (1) and (2) are both embedded under “(s)he doesn’t believe”.

(1) The problem, I’m afraid, with my colleague here, he really doesn’t believe that it’s war.

(2) Transplanting an ecosystem can be risky, as history shows. Hellmann doesn’t believe that relocating species threatened by climate change is a panacea.

In this talk, I will first illustrate how looking at pragmatic information of what speakers are committed to can improve NLP applications. Previous work has tried to predict the outcome of contests (such as the Oscars or elections) from tweets. I will show that by distinguishing tweets that convey firm speaker commitment toward a given outcome (e.g., “Dunkirk will win Best Picture in 2018”) from ones that only suggest the outcome (e.g., “Dunkirk might have a shot at the 2018 Oscars”) or tweets that convey the negation of the event (“Dunkirk is good but not academy level good for the Oscars”), we can outperform previous methods. Second, I will evaluate current models of speaker commitment, using the CommitmentBank, a dataset of naturally occurring discourses developed to deepen our understanding of the factors at play in identifying speaker commitment. We found that a linguistically informed model outperforms an LSTM-based one, suggesting that linguistic knowledge is needed to achieve robust language understanding. Both models, however, fail to generalize to the diverse linguistic constructions present in natural language, highlighting directions for improvement.

Bio: Marie-Catherine de Marneffe is an Associate Professor in Linguistics at The Ohio State University. She received her PhD from Stanford University in December 2012 under the supervision of Christopher D. Manning. She is developing computational linguistic methods that capture what is conveyed by speakers beyond the literal meaning of the words they say. Primarily she wants to ground meanings in corpus data, and show how such meanings can drive pragmatic inference. She has also worked on Recognizing Textual Entailment and contributed to defining the Stanford Dependencies and the Universal Dependencies representations. She is the recipient of a Google Research Faculty award, an NSF CRII award, and recently an NSF CAREER award. She serves as a member of the NAACL board.

[AI Seminar] 9/25 Fan Bai on Structured Minimally Supervised Learning for Neural Relation Extraction

Title: Structured Minimally Supervised Learning for Neural Relation Extraction

Abstract: In this talk, I will describe an effort to extract structured knowledge from text, without relying on slow and expensive human labeling (accepted to NAACL 2019). Our approach combines the benefits of learned representations and structured learning, and accurately predicts sentence-level relation mentions given only proposition-level supervision from a knowledge base. By explicitly reasoning about missing data during learning, this method enables large-scale training of convolutional neural networks while mitigating the issue of label noise inherent in distant supervision. Our approach achieves state-of-the-art results on minimally supervised sentential relation extraction, outperforming a number of baselines, including a competitive approach that uses the attention layer of a purely neural model.

Bio: Fan Bai is a third-year PhD student in the Department of Computer Science and Engineering, advised by Prof. Alan Ritter. His research mainly focuses on extracting structured knowledge from large corpora under distant supervision.

[AI Seminar] 9/11 Denis R. Newman-Griffis on Diving for Pearls: Indexing Mobility Information in Clinical Records with a Neural Relevance Tagger

Title: Diving for Pearls: Indexing Mobility Information in Clinical Records with a Neural Relevance Tagger

Abstract: Locating sparse information in medical text that is relevant to reported functional limitations is a significant challenge in the US Social Security Administration’s (SSA) process of determining disability. In this talk, I will introduce HARE, a system for highlighting relevant information in document collections for retrieval and triage (accepted to EMNLP 2019), and describe applications of this tool to retrieve narrative descriptions of mobility limitations in NIH and SSA records. I will demonstrate that tagging for relevance at the token level achieves high recall on retrieving true mobility descriptions, and ranking documents by the number of predicted mobility-relevant segments achieves strong correlation with ranking by true mobility information. Additionally, I will show that static word embedding features and contextualized ELMo and BERT features yield substantially different patterns in system outputs, and describe several patterns identified through qualitative analysis that suggest clear directions for further research on improving indexing of functional status information.

Bio: Denis is a 6th-year PhD student in the Department of Computer Science and Engineering, studying with Dr. Eric Fosler-Lussier. He has been a Pre-Doctoral Fellow of the National Institutes of Health Clinical Center since 2015, and has led pioneering research on natural language processing methods for functional status information, particularly in the domain of mobility. His research areas include information extraction and retrieval, linguistic analysis of clinical data, and representation learning, and his work has been funded by the Intramural Program of the National Institutes of Health and the US Social Security Administration.

[AI Seminar] 08/28 Mounica Maddela on Multi-task Pairwise Neural Ranking for Hashtag Segmentation

Talk Title:
Multi-task Pairwise Neural Ranking for Hashtag Segmentation

Abstract:
Hashtags are often employed on social media and beyond to add metadata to a textual utterance with the goal of increasing discoverability, aiding search, or providing additional semantics. However, the semantic content of hashtags is not straightforward to infer as these represent ad-hoc conventions which frequently include multiple words joined together and can include abbreviations and unorthodox spellings. We build a dataset of 12,594 hashtags split into individual segments and propose a set of approaches for hashtag segmentation by framing it as a pairwise ranking problem between candidate segmentations. Our novel neural approaches demonstrate a 24.6% error reduction in hashtag segmentation accuracy compared to the current state-of-the-art method. Finally, we demonstrate that a deeper understanding of hashtag semantics obtained through segmentation is useful for downstream applications such as sentiment analysis, for which we achieved a 2.6% increase in average recall on the SemEval 2017 sentiment analysis dataset.
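To make the pairwise-ranking framing above concrete, here is a small sketch that enumerates candidate segmentations and picks the one that wins the most pairwise comparisons. It is only an illustration under loud assumptions: the talk's approach uses neural rankers, whereas the comparator here is a hypothetical lexicon-based toy score (toy_score) with a made-up vocabulary, and exhaustive enumeration is only practical for short hashtags.

```python
from itertools import combinations

def candidates(tag):
    """All ways to split `tag` into contiguous, non-empty segments."""
    n = len(tag)
    result = []
    for k in range(n):  # k = number of split points
        for cuts in combinations(range(1, n), k):
            bounds = (0,) + cuts + (n,)
            result.append([tag[i:j] for i, j in zip(bounds, bounds[1:])])
    return result

def toy_score(seg, vocab):
    # Hypothetical stand-in for a learned model: reward long known words
    # and penalize the number of segments.
    return sum(len(w) ** 2 for w in seg if w in vocab) - len(seg)

def best_by_pairwise_ranking(tag, vocab):
    """Return the candidate that wins the most pairwise comparisons."""
    cands = candidates(tag)
    wins = [0] * len(cands)
    for i, j in combinations(range(len(cands)), 2):
        si, sj = toy_score(cands[i], vocab), toy_score(cands[j], vocab)
        if si > sj:
            wins[i] += 1
        elif sj > si:
            wins[j] += 1
    return cands[max(range(len(cands)), key=wins.__getitem__)]

vocab = {"now", "playing", "no", "play", "ring"}  # made-up lexicon
best = best_by_pairwise_ranking("nowplaying", vocab)
```

Because the toy comparator is derived from a pointwise score, the winner here coincides with a simple argmax; a learned pairwise ranker would replace the comparison inside the loop.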

Bio:
Mounica is a third-year Ph.D. student working with Professor Wei Xu. Her research interests lie in NLP and more specifically in stylistics and social media.