Here is some of the publicly available code I've worked on over the years, along with some useful data resources I've collected.

Human evaluation data from the paper "Unsupervised joke generation from big data", presented at ACL 2013. The data can be found here (the included README file should explain everything). It could be useful for anyone who wants to train a supervised system on human-labeled jokes.

StormCpp is a wrapper for Storm that enables you to write native C++ code and still run it on Storm.

The Twitter FSD corpus is a collection of 50 million tweets with event annotations. It is a useful resource for measuring the performance of an event detection system.

OrngText is an add-on for the Orange software package that provides basic text mining functionality.