Hey Kartik,
This is a really nice article and a super intuitive way of explaining the concept of multi-label classification.
However, I see there is a label imbalance here. When I want to apply OneVsRest classification to a training set with 4 labels, should I try to balance my classes in the training set? (I need to annotate documents myself to generate the training set.) In other words, should I generate a training set that represents all the labels in question equally? Additionally, it would be amazing if you could point me to an article that explains how to generate a training set with minimal class imbalance.