SVM after LSTM deep learning model for text classification

Carvia Tech | October 24, 2019 | 2 min read


Here, we will learn how to add an SVM as the last layer when using an LSTM for text classification.

LSTM (Long Short-Term Memory) based models are well known for text classification and time series prediction. Text classification covers many kinds of problems: sentiment analysis, finding the polarity of sentences, and multi-class tasks such as toxic comment classification or support ticket classification. LSTMs are popular for text and time series because they are not plain feedforward networks: they are recurrent, and their gated cell state lets them carry information from earlier time steps forward. Since they can retain this history, they work very well on these problems.
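For intuition, here is a minimal sketch (the shapes and layer size are arbitrary): a Keras LSTM layer consumes an entire sequence of shape (batch, timesteps, features) and, by default, returns only its final hidden state, which summarizes the time steps it has seen.

import numpy as np
from tensorflow.keras.layers import LSTM

x = np.random.rand(2, 10, 8).astype('float32')  # batch of 2 sequences, 10 time steps, 8 features
h = LSTM(4)(x)                                  # only the final hidden state is returned
print(h.shape)                                  # (2, 4)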

Coming to SVM (Support Vector Machine), we may want to use an SVM-style classifier as the last layer of our deep learning model.

We will walk through an example based on LSTM with Keras. To approximate an SVM in the last layer, we give the final Dense layer a linear activation (so it outputs raw margin scores, as an SVM does) together with an l2 kernel regularizer, and we use hinge as the loss when compiling the model.
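The snippets below assume train_X is a 3-D array of already-vectorized sequences (samples, time steps, features) and train_Y holds the labels; neither is defined in this article, so as a hypothetical stand-in you could use random dummy data:

import numpy as np

num_samples, timesteps, features = 1000, 50, 100  # hypothetical sizes
train_X = np.random.rand(num_samples, timesteps, features).astype('float32')
# binary labels for hinge loss, encoded as -1/+1, shape (num_samples, 1)
train_Y = np.random.choice([-1, 1], size=(num_samples, 1)).astype('float32')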

SVM in last layer for binary classification
from tensorflow.keras.layers import Input, LSTM, Dense
from tensorflow.keras.models import Model
from tensorflow.keras.regularizers import l2

inp = Input((train_X.shape[1], train_X.shape[2]))
lstm = LSTM(1, return_sequences=False)(inp)  # a single LSTM unit keeps the example small
# linear activation outputs raw margin scores; l2 on the weights mimics the SVM objective
output = Dense(train_Y.shape[1], activation='linear', kernel_regularizer=l2(0.01))(lstm)

model = Model(inputs=inp, outputs=output)
model.compile(loss='hinge', optimizer='adam', metrics=['accuracy'])  # hinge expects -1/+1 targets
model.fit(train_X, train_Y, validation_split=0.20, epochs=2, batch_size=50)
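Because the last layer is linear, model.predict returns raw margin scores rather than probabilities; the predicted class is just the sign of the score. A small sketch, assuming a test_X array shaped like train_X:

import numpy as np

scores = model.predict(test_X)              # raw margins, shape (n, 1)
pred_labels = np.where(scores >= 0, 1, -1)  # positive margin -> class +1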

hinge can be used when we are working on binary text classification. But if we are working on multi-class text classification, we should consider using categorical_hinge instead.
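categorical_hinge expects one-hot encoded targets (one column per class). Keras's to_categorical does this conversion; a quick example:

import numpy as np
from tensorflow.keras.utils import to_categorical

class_ids = np.array([0, 2, 1, 2])                 # integer labels for a 3-class problem
train_Y = to_categorical(class_ids, num_classes=3)
# train_Y -> [[1,0,0], [0,0,1], [0,1,0], [0,0,1]]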

SVM in last layer for multi-class classification
# Input, LSTM, Dense, Model and l2 imported as in the previous snippet
inp = Input((train_X.shape[1], train_X.shape[2]))
lstm = LSTM(1, return_sequences=False)(inp)
output = Dense(train_Y.shape[1], activation='linear', kernel_regularizer=l2(0.01))(lstm)

model = Model(inputs=inp, outputs=output)
model.compile(loss='categorical_hinge', optimizer='adam', metrics=['accuracy'])
model.fit(train_X, train_Y, validation_split=0.20, epochs=2, batch_size=50)
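As in the binary case, the linear outputs are raw per-class scores, not probabilities; take the argmax to recover the predicted class (again assuming a hypothetical test_X):

scores = model.predict(test_X)        # shape (n, num_classes), raw scores
pred_classes = scores.argmax(axis=1)  # index of the highest-scoring class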

Hope this helped. Thanks for reading this article.

