Resolve false negatives

There are two ways to resolve a false negative, depending on the type of model and on the scenario:

Using rules

There are four things you can check:

  • Review the negative terms of the category in which the text should be classified and make sure that none of them appears in the text, since a negative term that appears excludes the category from the classification results.
  • If positive terms are defined, check whether any of them appears in the text. If none does, that is the source of the false negative. To resolve it, add to the list of positive terms a term that appears in the text.
  • Review the irrelevant terms in case any of them is decreasing the relevance of the category. If so, evaluate whether the term in question is necessary; if it isn't, it can be deleted from the list. The best approach, however, is to limit the context in which the irrelevant term applies so that it excludes the case that produces the false negative.
  • Add relevant terms or, if necessary, positive terms to the category to increase the relevance assigned to it.

This option can be applied to hybrid models and to rule-based ones.
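
The rule checks described in this section can be sketched as a small diagnostic script. This is only an illustration under stated assumptions: the category is represented as a plain dictionary with hypothetical keys (negative, positive, relevant, irrelevant), and terms are matched as whole words, ignoring case. The real rule engine may match terms differently (for example, using lemmas or context operators), so treat the output as a hint about where to look, not as a reproduction of the classifier.

```python
import re


def find_terms(terms, text):
    """Return the terms that occur in the text as whole words (case-insensitive)."""
    found = []
    for term in terms:
        pattern = r"\b" + re.escape(term) + r"\b"
        if re.search(pattern, text, flags=re.IGNORECASE):
            found.append(term)
    return found


def diagnose_false_negative(category, text):
    """List possible reasons why the text was not classified in the category.

    `category` is a hypothetical dict with optional term lists under the keys
    "negative", "positive", "relevant" and "irrelevant".
    """
    hints = []

    # Check 1: a negative term in the text excludes the category.
    negative = find_terms(category.get("negative", []), text)
    if negative:
        hints.append(f"negative terms present: {negative}")

    # Check 2: if positive terms are defined, at least one must appear.
    positive = category.get("positive", [])
    if positive and not find_terms(positive, text):
        hints.append("no positive term present")

    # Check 3: irrelevant terms may be lowering the relevance.
    irrelevant = find_terms(category.get("irrelevant", []), text)
    if irrelevant:
        hints.append(f"irrelevant terms present: {irrelevant}")

    # Check 4: without relevant terms there is nothing raising the relevance.
    if not find_terms(category.get("relevant", []), text):
        hints.append("no relevant term present")

    return hints


# Made-up example data, for illustration only.
sports = {
    "negative": ["recipe"],
    "positive": ["football"],
    "relevant": ["goal", "match"],
    "irrelevant": ["table"],
}
text = "The league table changed after the late goal in Sunday's match."
hints = diagnose_false_negative(sports, text)
for hint in hints:
    print(hint)
```

In this made-up example the script reports that no positive term is present and that the irrelevant term "table" appears, which points to the second and third checks in the list.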

Recommendation

It's important to remember that hybrid models carry out the statistical classification first and then apply the rule-based classification over its results. If one of the terms defined in the rules does not appear in the training text, the statistical classification may not assign the text to that category, and so the category will not appear in the final results either.

To prevent this, it's recommended to check that all the positive and relevant terms associated with a category also appear in the training text of the category, and to add any that don't.
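
A minimal sketch of that check, assuming the rule terms and the training text of a category are available as plain Python values (the function name and the sample data are hypothetical, and terms are matched as whole words, ignoring case, which may differ from how the model tokenizes text):

```python
import re


def missing_from_training(rule_terms, training_text):
    """Return the rule terms that never occur in the training text."""
    missing = []
    for term in rule_terms:
        pattern = r"\b" + re.escape(term) + r"\b"
        if not re.search(pattern, training_text, flags=re.IGNORECASE):
            missing.append(term)
    return missing


# Made-up example data, for illustration only.
training_text = "The striker scored a goal in the match."
positive_and_relevant = ["goal", "penalty", "match"]

missing = missing_from_training(positive_and_relevant, training_text)
print(missing)

# One simple way to fix the gap: append the missing terms to the training text.
if missing:
    training_text = training_text + "\n" + " ".join(missing)
```

After the missing terms are appended, running the check again on the updated training text should return an empty list.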

Using training texts

To resolve a false negative using training texts, just add the text that produced the false negative to the training text of the category in which it should be classified.

This option can be applied to hybrid models and to statistical ones.

Important

It's important to remember that modifying any category may change the relevance values the model assigns to the rest of the categories.
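
One way to keep track of such side effects is to classify the same test text before and after the change and compare the relevance values. The sketch below assumes the results have already been collected as dictionaries that map category names to relevance values; the function name, the tolerance parameter, and the numbers are hypothetical, and a category missing from one of the results is treated as having relevance 0.

```python
def relevance_changes(before, after, tolerance=0.0):
    """Map each category whose relevance changed to its (before, after) pair."""
    changes = {}
    for category in sorted(set(before) | set(after)):
        old = before.get(category, 0.0)
        new = after.get(category, 0.0)
        # Only report differences larger than the tolerance.
        if abs(new - old) > tolerance:
            changes[category] = (old, new)
    return changes


# Made-up relevance values for one test text, before and after editing a category.
before = {"Sports": 40.0, "Economy": 35.0, "Politics": 10.0}
after = {"Sports": 55.0, "Economy": 30.0, "Politics": 10.0}

changes = relevance_changes(before, after, tolerance=1.0)
for category, (old, new) in changes.items():
    print(f"{category}: {old} -> {new}")
```

Running this comparison over a set of texts that were already classified correctly helps confirm that fixing one false negative has not introduced new errors elsewhere.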