Abstract
We present the results of a content analysis of asthma-related Tweets, which were manually annotated for a number of content categories, including Experiencer (Self vs. Other vs. finer-grained distinctions), Medication, Symptoms, Non-English, Information, and Triggers, among others. We used this annotated corpus to train machine learning classifiers on unigram and bigram models of the text in order to automatically categorize Tweets according to the annotation scheme. We find that the unigram model best predicts the Tweets' categorization. We suggest that Twitter combined with NLP may provide a valuable tool for monitoring chronic conditions such as asthma.
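
As an illustration of the unigram and bigram representations mentioned above, the following minimal Python sketch shows one way such features could be derived from the text of a Tweet. The function name `ngram_counts`, the lowercasing and whitespace tokenization, and the example text are assumptions made for illustration and are not taken from the study.

```python
from collections import Counter


def ngram_counts(text, n):
    """Count contiguous word n-grams in a lowercased, whitespace-tokenized text.

    Hypothetical helper: the tokenization used in the study is not specified here.
    """
    tokens = text.lower().split()
    # Zipping n shifted copies of the token list yields every run of n adjacent tokens.
    grams = zip(*(tokens[i:] for i in range(n)))
    return Counter(" ".join(gram) for gram in grams)


# Made-up example text, not drawn from the annotated corpus.
tweet = "My asthma is bad today"
unigrams = ngram_counts(tweet, 1)  # single-word features
bigrams = ngram_counts(tweet, 2)   # adjacent word-pair features
```

Count dictionaries of this kind could then be passed to a classifier as feature vectors, one per Tweet and per annotation category.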