DOC: Deep Open Classification of Text Documents

Traditional supervised learning makes the closed-world assumption: every class that appears in the test data must also have appeared in training. The same assumption underlies text classification. As learning is increasingly used in dynamic, open environments where some new or test documents may not belong to any of the training classes, identifying these novel documents during classification becomes an important problem, known as open-world classification or open classification. This paper proposes a novel deep-learning-based approach. The authors report that it dramatically outperforms existing state-of-the-art techniques.
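To make the problem statement concrete, here is a minimal sketch of what an open classifier has to do: either assign a document to one of the training classes or reject it as belonging to none of them. The function name `open_classify`, the fixed `threshold` of 0.5, and the per-class scores are all assumptions made for illustration; the abstract does not describe the proposed model, so this should not be read as the paper's method.

```python
from typing import Dict, Optional


def open_classify(scores: Dict[str, float], threshold: float = 0.5) -> Optional[str]:
    """Return the best-scoring seen class, or None to reject the document.

    `scores` maps each training class to an independent confidence in [0, 1]
    (hypothetical values; any per-class scorer could produce them).
    """
    if not scores:
        return None
    best_class = max(scores, key=scores.get)
    # Reject when even the best seen class is not confident enough.
    return best_class if scores[best_class] >= threshold else None


# A document that clearly matches a training class.
seen = open_classify({"sports": 0.91, "politics": 0.12})
# A document that matches no training class well, so it is rejected.
novel = open_classify({"sports": 0.22, "politics": 0.31})
```

A closed-world classifier would always return the top-scoring class, so the second document would be mislabeled as "politics". The rejection option is what distinguishes the open setting.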

Lei Shu, Hu Xu, Bing Liu

Related Research

HDLTex: Hierarchical Deep Learning for Text Classification

The continually increasing number of documents produced each year necess.

Handwriting Classification for the Analysis of Art-Historical Documents

Digitized archives contain and preserve the knowledge of generations of .

Measuring the Novelty of Natural Language Text Using the Conjunctive Clauses of a Tsetlin Machine Text Classifier

Most supervised text classification approaches assume a closed world, co.

Text Classification with Novelty Detection

This paper studies the problem of detecting novel or unexpected instance.

Fast(er) Reconstruction of Shredded Text Documents via Self-Supervised Deep Asymmetric Metric Learning

The reconstruction of shredded documents consists in arranging the piece.

Deep Learning for Technical Document Classification

In large technology companies, the requirements for managing and organiz.

OpenGAN: Open-Set Recognition via Open Data Generation

Real-world machine learning systems need to analyze novel testing data t.
