Archive for Tools and Resources

The Paul Davis Moment (Archive)

Paper Reading and Writing

  • Semantic. If you’d like to be able to create your own math symbols in latex, specifically those with ligatures, try installing this package. (Dec 07)
  • Anti-Word. If you use a Linux machine pretty much exclusively, but get email attachments from people who use Windows products, then you might be interested in Anti-Word, which will convert .doc files to plain text. (Nov 07)
  • PrimoPdf. You can make PDFs of your MS Office documents for free with this nifty app. Get it here. (Oct 07)
  • Text Editing. The creator of the vim text editor gave a talk to the Google folks on efficient text editing: how to identify when you’re doing things inefficiently, and how to fix that. Emacs users can benefit, too. Find the talk at Google Video. (Mar 07)
  • yab2web. This facility makes it easy to publish BibTex entries as HTML, ideal for putting your publication list on your website. See Donna Byron’s website for an example. (March 06)
  • Kami PDF Reader. A PDF reader with annotation support, available as a Chrome plugin. (Oct 16)
  • IPA in HTML. The following website will get you started in publishing pages on the web with IPA fonts included (updated 09/2022):
  • HeVeA. A utility for converting very simple tex files into webpages. Appropriate for text-heavy, graphics-poor websites like online syllabi, course descriptions, etc. Already installed on the Linguistics department computers. (Jan 07)
  • Graphviz. A graph visualization toolkit. This software can help you make web-ready graphics of parse trees, etc. Could be useful for teaching parsing, grammar, syntax, etc. Find it at (Jan 07, updated 09/2022)
  • bibdesk. A point-and-click interface for creating your very own BibTex file. Reduces typos. Find it on SourceForge, at least for Mac. (Jan 07)
  • latex2rtf. Have a latex file and need a Windows document? Try this resource, which works with fair accuracy. Another option is to use OpenOffice, from which documents can be directly exported to pdf, or presentations to Flash or .ppt – but use with caution: fonts can get messy. (Jan 07)
  • BibTex Yourself. When you list a citation to one of your own papers on your website, be sure to put a BibTex entry right next to it. That way, others won’t mis-cite your work. (Jan 07)
  • Google’s BibTex resource. If you use Google Scholar to find academic articles, change the Preferences to have it provide a BibTex entry for the various resources it finds. Use with caution – a quick sample done in our meeting showed some errors – but it’s a good start. (Jan 07)
  • pdfpages. This is an easy way to embed pdf files within your own latex files. Find details in this document. Or, try Googling ‘pdfpages’. (October 06, updated 09/2022)
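The Graphviz entry above mentions web-ready graphics of parse trees. As a rough sketch of how that might work, the Python snippet below turns a small parse tree into DOT text, the plain-text input format that Graphviz reads. The nested-tuple tree representation, the function name tree_to_dot, and the example sentence are all made up for illustration; only the DOT syntax itself comes from Graphviz.

```python
from itertools import count


def tree_to_dot(tree):
    """Render a nested (label, child, ...) tuple as Graphviz DOT text.

    The tuple format is a hypothetical convention chosen for this sketch:
    a bare string is a leaf (a word), a tuple is (label, *children).
    """
    ids = count()
    lines = ["digraph parse {", "  node [shape=plaintext];"]

    def visit(node):
        node_id = f"n{next(ids)}"
        if isinstance(node, str):
            label, children = node, ()
        else:
            label, children = node[0], node[1:]
        # Escape backslashes and double quotes so the label stays valid DOT.
        escaped = label.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'  {node_id} [label="{escaped}"];')
        for child in children:
            child_id = visit(child)
            lines.append(f"  {node_id} -> {child_id};")
        return node_id

    visit(tree)
    lines.append("}")
    return "\n".join(lines)


# Hypothetical example tree for "dogs bark".
example = ("S", ("NP", "dogs"), ("VP", "bark"))
dot_text = tree_to_dot(example)
print(dot_text)
```

If Graphviz is installed, saving the printed text as tree.dot and running `dot -Tpng tree.dot -o tree.png` should produce an image of the tree.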


  • Recaptcha (discontinued). You know how, when you buy from Ticketmaster, you have to type in the words that appear all squirrely in the picture? Now you can use that same technology to hide your own email address on your webpage. This can help stop spam. (February 09)

Unix Tools

  • sshfs (discontinued). This unix application allows you to mount an entire remote filesystem over SSH. Then it’s easier to access your ling files from home. This website has details. It should be available on most Linux installations: try ‘apt-get install sshfs’. (May 07)


  • General Language Ontology. There is a recently acquired ontology on the ling server that may be useful for those who need a basic semantic representation of general concepts. Read more about the ontology, what concepts it encodes, and what it may be useful for at, and find the resource itself at /home/corpora/EN/cyc. You can also contact Stacey (s.bailey @ ling) for further information. (Feb 06)

Machine Learning

  • Machine Learning Slides. UC Berkeley’s RAD Lab has made slides and videos available on the web from a recent two-day short course on applied machine learning for its industrial affiliates: (Nov 07)


  • Statistics Primer. A good introductory text to basic statistics can be found at If you follow the link for VassarStats, you will find tools for calculating various statistics. (Sept 06)


  • Stinkpot. A repository of helpful hints on all kinds of tools we tend to use to do our work: Emacs, Python, Latex, Matlab… it’s a personal blog of a grad student at MIT who works on silly things like evolution. His version of the Paul Davis moment is something you might find helpful. (Dec 07)
  • MIT Workshop on Syntax. It’s not up as of this writing, but check on for a video of their one day workshop titled “Where Does Syntax Come From? Have We All Been Wrong?”, with guest speakers Sandiway Fong, Chris Manning, and Noam Chomsky, among others. (Nov 07)
  • Anonymous Feedback. Teachers might find it useful to allow their students to send them anonymous feedback. See Detmar’s example, and if you’d like, copy his on your own website. To do that, copy the entire directory on our department network: ~dm/public_html/feedback . Don’t forget to change all instances of the name and email address! (Jan 07)
  • AJAX. Not just a cleaning solution, it can solve your messy, slow, database-driven web page problems as well. For an overview, examples, and tutorial of how to use AJAX, see Scott’s slides (Jan 07).
  • text2onto. Automatically extracts a candidate concept hierarchy and instances from a corpus of plain text. Not fully functional, but possibly handy for small projects, or getting started with ontologies. See the website for more details. (Oct 05)
  • Corpora Mailing List. Sign up here to receive email regarding new corpora and corpus tools.
  • Website Accessibility. In constructing a website, it’s recommended (required at OSU, in fact) to make it accessible to people with disabilities. That means making sure that vision-impaired folks will be able to get your information by using a screen reader. To make sure your website is compliant, use a tool like Fangs to get an idea of what your website “sounds” like. (April 07)
  • Boolistic. Not just another search engine, this website may come in handy for those teaching boolean logic: Enter your search terms, then click on different parts of the Venn diagram to alter the search query. (Sept 05)
  • Idea Solicitation. What kind of project management tools would you like to see in the Linguistics department? How can we make group projects more manageable? Bring your suggestions to Clippers, or email the CL list. (Sept 05)
  • Language Generator. This website includes Perl code that will randomly generate ‘pointy-hair-boss mission statements’, as well as a link (near the end) to similar random language generators. (Oct 05)
  • Internships. It’s high time to start thinking about summer internships in CL. If you’re interested in working someplace like Microsoft or elsewhere on the West Coast, have a chat with Chris (cbrew @ ling), Eric (fosler @ ling), or Donna (dbyron @ ling) for information and contacts. (Jan 06)
  • CCG Parser. A new CCG parser and supertagger is available from Clark and Curran. You can find the software and related literature at: The CCG site. (September 06)
  • Firefox browser. The latest version supports many standards, including SVG, and there are nice, free extensions available, including (updated 09/2022):
    • Webdeveloper (live editing of html, css, etc.)
    • Aardvark (modify what’s displayed on any webpage, for doing screenshots etc.)
    • Greasemonkey: various neat user scripts
    • Firebug (Debugger and network traffic profiler)
  • Carmen Tip. Keep backups. The system can go down, and it can take you with it. Exporting and importing is relatively simple. (Feb 07)
  • Google Books. With a Google account, you can use their service to search through many books. You can’t necessarily read them from cover to cover, but it can be a helpful resource if you need to search for particular topics within a text. (Feb 07)
  • IR Systems. Two IR systems that are available for research purposes are Galago and Terrier. Each has its ups and downs, and both are worth exploring. Talk to Chris for more info. (Feb 09)
  • Speed Reading. To practice speed reading, find a freely available program called RSVP. It will take any webpage or document and present it to you, word by word, at the speed you set. Then increase the speed as you get better. (Feb 08)
  • mechanize. This Perl module will fill in form values in HTML documents automatically. (Jan 07)
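To give a feel for what automatic form filling involves, as in the mechanize entry above, here is a minimal sketch using only the Python standard library. It is not the Perl module's API: the class FormFields, the function fill_form, and the sample page are hypothetical, and the sketch handles only plain input elements, ignoring select boxes, textareas, and the actual submission step.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode


class FormFields(HTMLParser):
    """Collect the name/value pairs of named <input> elements in a page."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag == "input":
            attributes = dict(attrs)
            name = attributes.get("name")
            if name:
                # Inputs without a value attribute default to an empty string.
                self.fields[name] = attributes.get("value") or ""


def fill_form(html, values):
    """Return a URL-encoded query string with the given values filled in."""
    parser = FormFields()
    parser.feed(html)
    parser.close()
    filled = dict(parser.fields)
    for name, value in values.items():
        if name not in filled:
            raise KeyError(f"no input named {name!r} in the form")
        filled[name] = value
    return urlencode(filled)


# Hypothetical page with a two-field search form.
page = (
    '<form action="/search">'
    '<input name="q"><input name="lang" value="en">'
    '<input type="submit">'
    "</form>"
)
query = fill_form(page, {"q": "parse trees"})
print(query)
```

The resulting string is what a browser would send for this form; a full tool would also read the form's action and method and make the request.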