I'm using FastText and my own word embeddings on a set of documents to classify each word token as an abbreviation or not (Y/N).
When testing, words that do not have vectors (out-of-vocabulary, or OOV, words) are discarded and not included in the performance measures (precision, recall, etc.), which gives a misleading result. How do you handle this?
Should the words with NaN values still be included in the performance measures? Can the NaN values be replaced with a vector, and if so, how would you decide which vector?
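
To make the problem concrete, here is a minimal sketch with made-up labels (not my real data). `None` marks an OOV token for which no prediction could be made, and the helper `precision_recall` is only an illustration. Counting OOV tokens as "not an abbreviation" is just one possible convention, assumed here to show how much the scores can change compared with discarding them.

```python
def precision_recall(y_true, y_pred):
    """Precision and recall for the positive class (1 = abbreviation)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical toy data: 1 = abbreviation, 0 = not an abbreviation.
# None = OOV token, so the model produced no prediction for it.
y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 0, None, 1, 0, None]

# Option A: discard OOV tokens before scoring (what happens now).
kept = [(t, p) for t, p in zip(y_true, y_pred) if p is not None]
p_drop, r_drop = precision_recall([t for t, _ in kept], [p for _, p in kept])

# Option B: keep OOV tokens and count them as predicted negatives
# (an assumed convention; a fallback vector could be used instead).
filled = [0 if p is None else p for p in y_pred]
p_keep, r_keep = precision_recall(y_true, filled)

print(f"OOV discarded: precision={p_drop:.2f}, recall={r_drop:.2f}")
print(f"OOV kept:      precision={p_keep:.2f}, recall={r_keep:.2f}")
```

In this toy case, recall drops from 1.0 to 0.5 once the two OOV abbreviations are counted as misses, which is the kind of distortion I am worried about.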