At Clippers on Tuesday, I will give a revised and extended version of the talk on dynamic continuized CCG that I gave at TAG+. In it, I’ll try out a new angle on explaining Charlow’s treatment of the exceptional scope of indefinites by comparing it to DRT, and I’ll explore the implications of Barker & Shan’s continuation-based approach to coordination in this framework. I also plan to start with an extended Paul Davis moment on MadlyAmbiguous, where I’ll demonstrate how visualization with t-SNE helps to explain how word embeddings work in MadlyAmbiguous’s new advanced mode.
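To give a flavor of the kind of visualization involved, here is a minimal sketch (not MadlyAmbiguous’s actual code) of projecting word embeddings down to two dimensions with scikit-learn’s t-SNE; the embeddings below are random stand-ins for real word vectors.

```python
import numpy as np
from sklearn.manifold import TSNE

# Hypothetical stand-ins for real word embeddings: 10 "words", 50 dimensions.
rng = np.random.default_rng(0)
embeddings = rng.standard_normal((10, 50))
words = [f"word{i}" for i in range(10)]

# t-SNE maps the 50-d vectors to 2-d points whose distances roughly
# preserve local neighborhood structure, so similar words end up plotted
# near each other. Perplexity must be smaller than the number of points.
coords = TSNE(n_components=2, perplexity=5,
              init="random", random_state=0).fit_transform(embeddings)

for w, (x, y) in zip(words, coords):
    print(f"{w}: ({x:.2f}, {y:.2f})")
```

With real embeddings, the resulting scatter plot makes clusters of semantically related words directly visible, which is what makes the technique useful for explaining embeddings to a general audience.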
At Clippers today, Deblin will present the work on Generative Adversarial Networks that he and Adam Stiff have been doing over the summer.
Generative Adversarial Networks have been used extensively in computer vision to generate images from a noise distribution, and it has been found that, given conditional information, they can learn to map a source distribution to a target distribution. However, their expressive power remains largely untested in the domain of speech recognition.
Spectral mapping is a feature-denoising technique in which a model learns to predict clean speech from noisy speech. In this work, we explore the effectiveness of adversarial training on feedforward and convolutional spectral mappers that predict clean speech frames from noisy context. We have run into some issues along the way, which we would like to share, and we welcome comments and feedback on our future plans.
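To make the spectral-mapping setup concrete, here is a minimal NumPy sketch (random, untrained weights; the context size, dimensions, and architecture are illustrative assumptions, not the actual model): a window of noisy context frames is flattened and passed through a small feedforward net that outputs one clean-frame estimate per time step.

```python
import numpy as np

def context_windows(spec, k=2):
    """Stack each frame with its +/- k neighbors (edge-padded) and flatten,
    so the mapper sees noisy context rather than a single frame."""
    T, F = spec.shape
    padded = np.pad(spec, ((k, k), (0, 0)), mode="edge")
    return np.stack([padded[t:t + 2 * k + 1].ravel() for t in range(T)])

rng = np.random.default_rng(0)
T, F, k, H = 50, 40, 2, 64            # frames, freq bins, context radius, hidden units
noisy = rng.standard_normal((T, F))   # stand-in for a noisy log-spectrogram

X = context_windows(noisy, k)         # (T, (2k+1)*F) = (50, 200)
W1 = rng.standard_normal((X.shape[1], H)) * 0.01
W2 = rng.standard_normal((H, F)) * 0.01

hidden = np.maximum(X @ W1, 0.0)      # ReLU hidden layer
clean_pred = hidden @ W2              # one predicted clean frame per time step
```

In training, `clean_pred` would be scored against the true clean frames, either with a plain regression loss or, in the adversarial setting, by a discriminator that tries to tell predicted clean frames from real ones.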
At Clippers Tuesday, Marie-Catherine de Marneffe will be giving a dry run of an upcoming invited talk at the University of Geneva.
Automatically drawing inferences
Marie-Catherine de Marneffe
The Ohio State University
When faced with a piece of text, humans understand far more than just the literal meaning of its words. In our interactions, much of what we communicate is not said explicitly but rather inferred. However, extracting information that is conveyed without actually being said remains a challenge for NLP. For instance, given (1) and (2), we want to derive that readers will generally take it from (1) that it is war, but will take it from (2) that relocating species threatened by climate change is not a panacea, even though both events are embedded under “(s)he doesn’t believe”.
(1) The problem, I’m afraid, with my colleague here, he really doesn’t believe that it’s war.
(2) Transplanting an ecosystem can be risky, as history shows. Hellmann doesn’t believe that relocating species threatened by climate change is a panacea.
Automatically extracting systematic inferences of that kind is fundamental to a range of NLP tasks, including information extraction, opinion detection, and textual entailment. But surprisingly, at present the vast majority of information extraction systems work at the clause level and regard any event they find as true without taking into account the context in which the event appears in the sentence.
In this talk, I will discuss two case studies of extracting such inferences, to illustrate the general approach I take in my research: use linguistically-motivated features, conjoined with surface-level ones, to enable progress in achieving robust text understanding. First, I will look at how to automatically assess the veridicality of events — whether events described in a text are viewed as actual (as in (1)), non-actual (as in (2)) or uncertain. I will describe a statistical model that balances lexical features like hedges or negations with structural features and approximations of world knowledge, thereby providing a nuanced picture of the diverse factors that shape veridicality. Second, I will examine how to identify (dis)agreement in dialogue, where people rarely overtly (dis)agree with their interlocutor, but their opinion can nonetheless be inferred (in (1) for instance, we infer that the speaker disagrees with his colleague).
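As a toy illustration of the lexical side of such a model, here is a sketch of hedge and negation cue counting; the word lists are tiny hypothetical stand-ins, not the actual feature set, and the real model combines these with structural features and approximations of world knowledge.

```python
# Toy cue lexicons; a real system would use richer, curated resources.
HEDGES = {"afraid", "perhaps", "possibly", "apparently", "maybe"}
NEGATIONS = {"not", "no", "never", "without"}

def veridicality_features(sentence):
    """Count hedge and negation cues in a sentence -- the kind of lexical
    features a veridicality classifier balances against structural ones."""
    tokens = sentence.lower().replace(",", " ").replace(".", " ").split()
    return {
        "n_hedges": sum(t in HEDGES for t in tokens),
        "n_negations": sum(t in NEGATIONS or t.endswith("n't") for t in tokens),
    }

ex1 = ("The problem, I'm afraid, with my colleague here, "
       "he really doesn't believe that it's war.")
print(veridicality_features(ex1))  # both a hedge ("afraid") and a negation
```

Counting cues alone cannot decide veridicality, of course, since the same negation yields opposite inferences in (1) and (2); that interaction is exactly what the structural features and world-knowledge approximations are there to capture.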