Clippers Tuesday: Jie Zhao on product-related question answering

At Clippers on Tuesday, Jie Zhao will present work with Huan Sun on product-related question answering. Title and abstract below.

Title: Answer Retrieval on E-commerce Websites via Weakly Supervised Question Reformulation

Abstract: In this seminar, I will talk about our ongoing work on product-related question answering on E-commerce websites, which aims to retrieve answers from a large corpus of answer candidates. Our problem setting differs from traditional answer selection, where a small answer candidate set is pre-defined and state-of-the-art approaches generally adopt sophisticated models to match the semantics of the QA pairs. However, these methods become very expensive when the answer candidate set is large and dynamically increasing. In our work, we adopt a classic light-weight TF-IDF search scheme for efficiency reasons but aim at better retrieval results through question reformulation. One of the challenges here is the lack of direct labeled data with pairs. To address this, we look into the word-matching results of the existing QA pairs as weak supervision signals, and define different sub-tasks that 1) learn to focus attention on the question words, 2) infer words that will possibly occur in a true answer, and 3) use the results of the first two sub-tasks as a reformulated question to improve the final retrieval performance. We model the inter-relations among these sub-tasks and train them under a multi-task learning scheme. Preliminary results show our model has the potential to achieve better retrieval performance than existing baseline methods while guaranteeing lower search complexity. Currently, our model still does not perform very well on the second sub-task, possibly because of the large vocabulary space. We are exploring various learning strategies to further improve it. Any suggestions and comments will be appreciated.
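
To make the retrieval setup concrete, here is a minimal sketch of TF-IDF search with a reformulated query. It is an illustration only, not the model from the talk: the function names, the toy answers, and the hand-set attention and expansion weights are all assumptions standing in for what the first two sub-tasks would predict.

```python
import math
from collections import Counter

def tokenize(text):
    return text.lower().split()

def build_index(answers):
    """Return smoothed IDF weights and L2-normalised TF-IDF vectors."""
    docs = [Counter(tokenize(a)) for a in answers]
    n = len(docs)
    df = Counter(term for d in docs for term in d)
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = []
    for d in docs:
        vec = {t: tf * idf[t] for t, tf in d.items()}
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vectors.append({t: w / norm for t, w in vec.items()})
    return idf, vectors

def retrieve(query_weights, idf, vectors, top_k=3):
    """Rank answers by dot product with a weighted bag-of-words query."""
    q = {t: w * idf[t] for t, w in query_weights.items() if t in idf}
    scores = [sum(w * vec.get(t, 0.0) for t, w in q.items()) for vec in vectors]
    order = sorted(range(len(vectors)), key=lambda i: scores[i], reverse=True)
    return order[:top_k]

def reformulate(question, attention, expansion):
    """Re-weight question words (sub-task 1) and add likely answer words (sub-task 2)."""
    weights = {t: attention.get(t, 1.0) for t in tokenize(question)}
    for t, w in expansion.items():
        weights[t] = weights.get(t, 0.0) + w
    return weights

# Toy data (hypothetical).
answers = [
    "the battery lasts about ten hours per charge",
    "how long is the cable it is one meter long",
    "yes it comes with a case",
]
idf, vectors = build_index(answers)
question = "how long does it last"

# Unmodified question: every word gets weight 1.
plain = {t: 1.0 for t in tokenize(question)}

# Hand-set stand-ins for model output (assumptions, not learned values).
attention = {"how": 0.1, "long": 0.1, "does": 0.1, "it": 0.1, "last": 1.0}
expansion = {"lasts": 1.0, "hours": 1.0, "battery": 0.5}
reformulated = reformulate(question, attention, expansion)

plain_top = retrieve(plain, idf, vectors)[0]
reformulated_top = retrieve(reformulated, idf, vectors)[0]
```

On this toy corpus the plain question is pulled toward the cable answer because it shares "how" and "long", while the reformulated query ranks the battery answer first. The search step itself is unchanged, which is the efficiency argument made in the abstract.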

Clippers Tuesday: Evan Jaffe on Sequential Matching Networks for a Virtual Patient Dialogue System

At Clippers Tuesday, Evan Jaffe will be presenting work in progress using Sequential Matching Networks to do dialogue response selection.

The SMN architecture is designed to maintain dialogue history (using an RNN) and thus provide extended context. The task is formulated as ranking a set of k candidate responses, given a dialogue history. Preliminary results on a virtual patient dataset show good ranking accuracy (95% on dev) when the network chooses between the true next response and 9 randomly selected negative examples. However, this task may be too easy, so a few more challenging tests are worth exploring, including increasing the size of k and choosing more confusable candidates. An n-gram overlap could be a good baseline. Ultimately, using the SMN to rerank an n-best list coming from a CNN model (Jin et al., 2017) could prove beneficial, complementing the CNN with an ability to track previous turns. This history could be useful for questions with zero anaphora like, ‘What dose’, which crucially rely on previous turns for successful interpretation.
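
As a rough illustration of the n-gram overlap baseline mentioned above, the sketch below ranks k candidates by the fraction of their unigrams and bigrams that also occur in the dialogue history, and reports how often the true response is ranked first. The scoring function, the function names, and the toy dialogue are assumptions made for illustration; they are not taken from the virtual patient system.

```python
def ngrams(tokens, n):
    """Set of n-grams (as tuples) in a token list."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def overlap_score(history, candidate, max_n=2):
    """Sum over n of the share of candidate n-grams seen in any history turn."""
    cand = candidate.lower().split()
    turns = [turn.lower().split() for turn in history]
    score = 0.0
    for n in range(1, max_n + 1):
        cand_grams = ngrams(cand, n)
        if not cand_grams:
            continue
        hist_grams = set()
        for turn in turns:
            hist_grams |= ngrams(turn, n)
        score += len(cand_grams & hist_grams) / len(cand_grams)
    return score

def rank_candidates(history, candidates):
    """Candidate indices sorted from highest to lowest overlap."""
    return sorted(
        range(len(candidates)),
        key=lambda i: overlap_score(history, candidates[i]),
        reverse=True,
    )

def recall_at_1(examples):
    """Fraction of (history, candidates, gold_index) examples ranked correctly."""
    hits = sum(
        1 for history, candidates, gold in examples
        if rank_candidates(history, candidates)[0] == gold
    )
    return hits / len(examples)

# Toy example (hypothetical): the true next response is candidate 0.
history = ["i take ibuprofen for my back pain", "what dose"]
candidates = [
    "i take two hundred milligrams of ibuprofen",
    "my sister lives in ohio",
    "no allergies",
]
ranking = rank_candidates(history, candidates)
accuracy = recall_at_1([(history, candidates, 0)])
```

Here the elliptical turn "what dose" contributes nothing on its own; the correct candidate wins only because it overlaps with the earlier turn, which is the kind of history dependence the SMN is meant to capture.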

Clippers Tuesday: Symon Stevens Guille and Taylor Mahler on Ethics in NLP

At Clippers on Tuesday, Symon Stevens Guille will be presenting joint work with Taylor Mahler on ethics in NLP; abstract below.

I will present the beginnings of a research project between myself and Taylor Mahler on ethical NLP and data management. I’ll discuss results from several recent papers in NLP, particularly on dialects and sociolinguistic aspects of language use. I will also review the results of Mahler et al. (to appear), which illustrated several ways of fooling NLP systems into erroneously contradicting human-categorized sentiment. These results are complemented by case studies from media and civil rights investigations into the use, abuse, and (largely naive) processing of social media data by third parties, particularly the State.

Clippers Tuesday: Michael White on Dynamic Continuized CCG

At Clippers on Tuesday, I will give a revised and extended version of the talk on dynamic continuized CCG that I gave at TAG+. I’ll try out a new angle on explaining Charlow’s treatment of the exceptional scope of indefinites by way of comparison to DRT, and explore the implications of Barker & Shan’s continuation-based approach to coordination in this framework. I also plan to start with an extended Paul Davis moment on MadlyAmbiguous where I’ll demonstrate how visualization with t-SNE helps to explain how word embeddings work in MadlyAmbiguous’s new advanced mode.

Clippers today: Deblin Bagchi on Generative Adversarial Networks

At Clippers today, Deblin will be presenting the work on Generative Adversarial Networks that he and Adam Stiff have been doing over the summer.

Generative Adversarial Networks have been used extensively in computer vision to generate images from a noise distribution. It has been found that with conditional information, they can learn to map a source distribution to a target distribution. However, their expressive power remains untested in the domain of speech recognition.

Spectral mapping is a feature denoising technique where a model learns to predict clean speech from noisy speech. In this work, we explore the effectiveness of adversarial training on a feedforward network-based (as well as convolutional network-based) spectral mapper to predict clean speech frames from noisy context. However, we have run into some issues that we would like to share, and we would welcome comments and feedback on our future plans.

Clippers Tuesday: Marie-Catherine de Marneffe on Automatically Drawing Inferences

At Clippers Tuesday, Marie-Catherine de Marneffe will be giving a dry run of an upcoming invited talk at the University of Geneva.

Automatically drawing inferences

Marie-Catherine de Marneffe
Linguistics department
The Ohio State University

When faced with a piece of text, humans understand far more than just the literal meaning of the words in the text. In our interactions, much of what we communicate is not said explicitly but rather inferred. However, extracting information that is expressed without actually being said remains an issue for NLP. For instance, given (1) and (2), we want to derive that people will generally take that it is war from (1), but will take that relocating species threatened by climate change is not a panacea from (2), even though both events are embedded under “(s)he doesn’t believe”.

(1) The problem, I’m afraid, with my colleague here, he really doesn’t believe that it’s war.

(2) Transplanting an ecosystem can be risky, as history shows. Hellmann doesn’t believe that relocating species threatened by climate change is a panacea.

Automatically extracting systematic inferences of that kind is fundamental to a range of NLP tasks, including information extraction, opinion detection, and textual entailment. But surprisingly, at present the vast majority of information extraction systems work at the clause level and regard any event they find as true without taking into account the context in which the event appears in the sentence.

In this talk, I will discuss two case studies of extracting such inferences, to illustrate the general approach I take in my research: use linguistically-motivated features, conjoined with surface-level ones, to enable progress in achieving robust text understanding. First, I will look at how to automatically assess the veridicality of events — whether events described in a text are viewed as actual (as in (1)), non-actual (as in (2)) or uncertain. I will describe a statistical model that balances lexical features like hedges or negations with structural features and approximations of world knowledge, thereby providing a nuanced picture of the diverse factors that shape veridicality. Second, I will examine how to identify (dis)agreement in dialogue, where people rarely overtly (dis)agree with their interlocutor, but their opinion can nonetheless be inferred (in (1) for instance, we infer that the speaker disagrees with his colleague).

Clippers Tuesday: Micha Elsner on Speech Segmentation with Neural Nets; David King on Disambiguating Coordination Ambiguities

At Clippers Tuesday, Micha Elsner and David King will be giving practice talks for EMNLP and for the Explainable Computational Intelligence Workshop at INLG, respectively:

Speech segmentation with a neural net model of working memory
Micha Elsner and Cory Shain

We present the first unsupervised LSTM speech segmenter as a cognitive model of the acquisition of words from unsegmented input. Cognitive biases toward phonological and syntactic predictability in speech are rooted in the limitations of human memory (Baddeley et al., 1998); compressed representations are easier to acquire and retain in memory. To model the biases introduced by these memory limitations, our system uses an LSTM-based encoder-decoder with a small number of hidden units, then searches for a segmentation that minimizes autoencoding loss. Linguistically meaningful segments (e.g. words) should share regular patterns of features that facilitate decoder performance in comparison to random segmentations, and we show that our learner discovers these patterns when trained on either phoneme sequences or raw acoustics. To our knowledge, ours is the first fully unsupervised system to be able to segment both symbolic and acoustic representations of speech.

A Simple Method for Clarifying Sentences with Coordination Ambiguities
Michael White, Manjuan Duan and David King

We present a simple, broad coverage method for clarifying the meaning of sentences with coordination ambiguities, a frequent cause of parse errors. For each of the two most likely parses involving a coordination ambiguity, we produce a disambiguating paraphrase that splits the sentence in two, with one conjunct appearing in each half, so that the span of each conjunct becomes clearer. In a validation study, we show that the method enables meaning judgments to be crowd-sourced with good reliability, achieving 83% accuracy at 80% coverage.

Clippers Tuesday: Manirupa Das on Query Expansion for IR

At Clippers Tuesday, Manirupa will present “A Phrasal Embedding–based General Language Model for Query Expansion in Information Retrieval”:

Traditional knowledge graphs driven by knowledge bases can represent facts about and capture relationships among entities very well, thus performing quite accurately in factual information retrieval. However, in addressing the complex information needs of subjective queries requiring adaptive decision support, these systems can fall short as they are not able to fully capture novel associations among potentially key concepts. In this work, we explore a novel use of language model–based document ranking to develop a fully unsupervised method for query expansion by associating documents with novel related concepts extracted from the text. To achieve this we extend the word embedding-based generalized language model due to Ganguly et al. (2015) to employ phrasal embeddings, and evaluate its performance on an IR task using the TREC 2016 clinical decision support challenge dataset. Our model, used for query expansion both directly and via feedback loop, shows statistically significant improvement not just over various baselines utilizing standard MeSH terms and UMLS concepts for query expansion (Rivas et al., 2014), but also over our word embedding-based language model baseline, built on top of a standard Okapi BM25 based document retrieval system.
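
For readers unfamiliar with the retrieval layer underneath, the following is a minimal sketch of Okapi BM25 scoring with an expanded query. It does not implement the phrasal embedding model: the expansion terms are hand-picked placeholders for what such a model might propose, the documents are invented, and the IDF uses the common non-negative variant, which is an assumption about the setup.

```python
import math
from collections import Counter

def bm25_scores(query_terms, docs, k1=1.2, b=0.75):
    """Score each tokenised document against the query terms with Okapi BM25."""
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n
    df = Counter(t for d in docs for t in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        length_norm = k1 * (1 - b + b * len(d) / avgdl)
        s = 0.0
        for t in query_terms:
            if t not in tf:
                continue
            # Non-negative IDF variant (an assumption, not from the abstract).
            idf = math.log(1 + (n - df[t] + 0.5) / (df[t] + 0.5))
            s += idf * tf[t] * (k1 + 1) / (tf[t] + length_norm)
        scores.append(s)
    return scores

# Toy collection (hypothetical).
docs = [
    "patient with myocardial infarction treated with aspirin".split(),
    "heart attack symptoms and emergency care".split(),
    "seasonal allergies and antihistamines".split(),
]

query = ["heart", "attack"]
# Hand-picked related concepts standing in for model-proposed expansions.
expanded_query = query + ["myocardial", "infarction"]

base = bm25_scores(query, docs)
expanded = bm25_scores(expanded_query, docs)
```

With the original query the first document scores zero despite being relevant; adding the related concepts lets it be retrieved while leaving the second document's score unchanged.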

Clippers Tuesday: Joo-Kyung Kim on Cross-lingual Transfer Learning for POS Tagging

This Tuesday, Joo-Kyung Kim will be talking about his current work on cross-lingual transfer learning for POS tagging:

POS tagging is a relatively easy task given sufficient training examples, but since each language has its own vocabulary space, parallel corpora are usually required to utilize POS datasets in different languages for transfer learning. In this talk, I introduce a cross-lingual transfer learning model for POS tagging, which utilizes language-general and language-specific representations with auxiliary objectives such as language-adversarial training and language modeling. Evaluating on POS datasets from Universal Dependencies 1.4, I show preliminary results that the proposed model can be effectively used for cross-lingual transfer learning without any parallel corpora or gazetteers.