T-Th 9:35–10:55, Derby 29
Instructor: Michael White
Description
What makes Siri tick? How does Google Translate make sense of 100+ languages? And why do they sometimes fail to do what you intend?
In this course, you will gain insight into the fundamentals of how computers are used to represent, process and organize textual and spoken information. We will cover the theory and practice of human language technology, going behind the scenes of internet search engines, spam filters, spell and grammar checkers, dialogue systems, automatic translators and more, discussing both how they work and why they often don’t. We will also examine social and ethical issues such as privacy, job creation and loss due to language technologies, and the nature of consciousness and machine intelligence.
General Education Goals and Expected Learning Outcomes
The course satisfies the GE category Quantitative Reasoning, Mathematical or Logical Analysis. The goals of this category are for students to develop skills in quantitative literacy and logical reasoning, including the ability to identify valid arguments and use mathematical models. The expected learning outcomes are for students to comprehend mathematical concepts and methods adequate to construct valid arguments, understand inductive and deductive reasoning, and increase their general problem-solving skills.
The course satisfies the goals and learning outcomes by using natural language systems to motivate students to exercise and develop a range of basic skills in formal and computational analysis. The course philosophy is to ground abstract concepts in real world examples. We introduce strings, regular expressions, finite-state and context-free grammars, as well as probabilistic algorithms defined over these structures and techniques for probing and evaluating systems that rely on these algorithms. The course goes beyond merely subjective evaluation of systems, emphasizing analysis and reasoning to draw and argue for valid conclusions about the design, capabilities and behavior of natural language systems.
Carmen
We’ll be using the Carmen system for the schedule, homework and reading assignments. There will also be discussion forums for posting questions and providing feedback (comments, complaints or ideas) during the course.
Note that email from Carmen is sent to your official email address (Name.Number@osu.edu). You should read email sent to your official OSU account on a daily basis.
Readings
The textbook is also entitled (not coincidentally!) Language and Computers, by Markus Dickinson, Chris Brew and Detmar Meurers. We will also draw from the Natural Language Toolkit Book, entitled Natural Language Processing with Python: Analyzing Text with the Natural Language Toolkit, by Steven Bird, Ewan Klein, and Edward Loper. This book is freely available online.
Online quizzes will assess your understanding of the readings prior to the classes covering the material. Classes will be dedicated to in-class activities that explore selected topics in greater depth as well as topics not covered by the textbook.
Materials for in-class activities for each unit will be posted on Carmen, as will the slides presented in class. These slides are meant to aid classroom discussion and cannot replace actually being in class. Other readings may also be assigned periodically.
Requirements
The basic requirement is regular attendance in class and active participation. There will be one to two quizzes and (roughly) one homework assignment per textbook chapter, which will give you the opportunity to explore new aspects of the topics discussed in class. There will also be an essay on social/ethical considerations involving language technology. The midterm will be on the material covered in the first half of the class; the final will be on the material covered in the second half of the class, assuming the material from the first half as background knowledge.
3802H: For honors credit, the final two-part homework will constitute a group project on the topic of machine translation.
Grading
Grades will be assigned according to the following scheme:
- Quizzes (5%): Quizzes will be administered online through Carmen and are due by midnight of the day indicated. The quizzes are naturally open book, but you should finish the reading before attempting them, as only one attempt is allowed. They will be shut off automatically once the deadline is reached, so do not put them off until the last minute! Note that I do not promise to remind you when you have a quiz due; it is your responsibility to keep up with the schedule on Carmen. The lowest quiz grade will be dropped.
- Homework assignments (40%): Homework assignments are due by the beginning of class, in Carmen. PDF format is preferred. No late homeworks will be accepted. The lowest homework grade will be dropped.
Homeworks should be done individually. Homework problems are typically similar to ones explored in groups during class; as such, regular attendance and active participation are vital to doing well on the homeworks.
- Essay (15%): A 1000–1500 word essay on a topic dealing with the social implications of language technology.
- Midterm exam (20%): The midterm will be given in class on Tuesday, October 24.
- Final exam (20%): The final will be given on Friday, December 8 (8:00–9:45).
- Class participation (+5%): Given that the homeworks and exams reflect the material covered in class, attendance is essential for doing well in this course, as is your active participation in class discussion and in-class activities. As such, participation will contribute bonus credit of up to 5% to your grade, based on the number of in-class activities completed.
Grades will be assigned using the standard OSU scale.
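To see how the weights above combine, here is a rough sketch of the arithmetic in Python. It is only an illustration, not the official grade calculation: it assumes every score is a percentage from 0 to 100, that items within a category count equally, and that the participation bonus scales linearly with the number of in-class activities completed. None of these details is specified above, and the function and parameter names are made up for this example.

```python
# Category weights from the grading scheme (as fractions of the final grade).
WEIGHTS = {
    "quizzes": 0.05,
    "homework": 0.40,
    "essay": 0.15,
    "midterm": 0.20,
    "final": 0.20,
}
MAX_BONUS = 5.0  # participation bonus, in percentage points


def mean_dropping_lowest(scores):
    """Average percentage scores after dropping the single lowest one."""
    if not scores:
        return 0.0
    if len(scores) == 1:
        return float(scores[0])  # nothing to drop with only one score
    kept = sorted(scores)[1:]  # discard the lowest score
    return sum(kept) / len(kept)


def course_percentage(quizzes, homeworks, essay, midterm, final,
                      activities_done, activities_total):
    """Weighted course percentage plus participation bonus.

    Assumption: equal weighting within each category and a bonus that is
    linear in the share of in-class activities completed.
    """
    base = (
        WEIGHTS["quizzes"] * mean_dropping_lowest(quizzes)
        + WEIGHTS["homework"] * mean_dropping_lowest(homeworks)
        + WEIGHTS["essay"] * essay
        + WEIGHTS["midterm"] * midterm
        + WEIGHTS["final"] * final
    )
    bonus = MAX_BONUS * activities_done / activities_total
    return base + bonus


# Hypothetical student: lowest quiz (60) and lowest homework (50) are dropped.
example = course_percentage(
    quizzes=[100, 80, 60],
    homeworks=[90, 70, 50],
    essay=80,
    midterm=70,
    final=90,
    activities_done=8,
    activities_total=10,
)
print(round(example, 2))  # 84.5
```

In this made-up case the weighted categories contribute 80.5 points and participation adds 4, for 84.5 in total; the letter grade then follows from the standard OSU scale.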
Make-up Policy
If you know you won’t be able to make a deadline or exam, please see me before you miss the deadline or exam. If you miss the midterm or final, you will have to provide extensive written documentation for your excuse.
Class Etiquette
I expect you to respect one another, to respect me, and to respect yourself. To that end, I expect you to obey the following rules:
- Participate: share experiences, ask questions, express your opinions. Ask me to provide more information, send me emails or see me during office hours for help, clarification, or recommendations for further research.
- Do not read newspapers, materials from other classes, Facebook posts, email, etc. in class. Do not pack up early. Switch off your cell phone. If for some reason you must leave early or you have an important call coming in, notify me before class.
Policy on Academic Misconduct
As with any class at this university, students are required to follow the Ohio State Code of Student Conduct. In particular, note that students are not allowed to, among other things, submit plagiarized (copied but unacknowledged) work for credit. If any violation occurs, I am required to report the violation to the Committee on Academic Misconduct. See the Committee on Academic Misconduct’s Frequently Asked Questions.
Students with Disabilities
The University strives to make all learning experiences as accessible as possible. If you anticipate or experience academic barriers based on your disability (including mental health, chronic or temporary medical conditions), please let me know immediately so that we can privately discuss options. To establish reasonable accommodations, I may request that you register with Student Life Disability Services. After registration, make arrangements with me as soon as possible to discuss your accommodations so that they may be implemented in a timely fashion. SLDS contact information: slds@osu.edu; 614-292-3307; slds.osu.edu; 098 Baker Hall, 113 W. 12th Avenue.
Disclaimer
This syllabus is subject to change. All important changes will be made in writing (email), with ample time for adjustment.