- Introduction to Information Retrieval
- Foundations of Statistical Natural Language Processing (sadly not free online, but deemed valuable)
- Mining of Massive Datasets
- Two books on Computational Semantics (Blackburn & Bos)
- Elements of Statistical Learning
- Data-Intensive Text Processing with MapReduce
That's the books. Now the other stuff:
- CMU's machine learning course. I might work through it after Ng's course is done.
- Apache Mahout [hnn]
- The PET parser [online demo] [article], part of DELPH-IN, which has a truly painfully formatted home page but looks promising
- Natural Language Engineering journal
- StackExchange discussion of NL parsers and starting points for NLP
- A list of what's in the Ubuntu NLP stack
- The Porter stemmer
- Apache OpenNLP - probably a good place to help out
- ANTLR
And that's it for NLP. For now. You know, I could spend a year or two internalizing this stuff and be the best natural language programmer on the planet.