Text-based Malicious Domain Names Detection Based on Variational Autoencoder And Supervised Learning - Citegraph

Paper Info

Title
Text-based Malicious Domain Names Detection Based on Variational Autoencoder And Supervised Learning

Abstract
With the rapid development of information technology, the adoption of information systems in industry and institutions has become increasingly common. However, attacks that use zombie networks to access a host and cause it to shut down have become frequent in recent years. Domain names play a significant role in connecting to a server and are therefore considered key to detecting these attacks. In this paper, we propose a text-based method to convert domain names into numeric features, based on term frequency-inverse document frequency (TF-IDF). We then adopt a variational autoencoder (VAE), consisting of an encoder and a decoder, to extract hidden information from these features. Moreover, by collapsing the Gaussian distribution of these features at the hidden layer to its mean, we visualize the distribution of domain names. After that, we adopt a supervised learning model, a convolutional neural network (CNN), to classify domain names as malicious or benign. We train the model using feature vectors from the VAE. Finally, the scheme achieves a validation accuracy of 0.868 for malicious domain name detection.
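
The first step of the abstract, turning domain names into TF-IDF features, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how domain names are tokenized, so the character bigrams used here, the smoothed IDF formula, the function names `char_ngrams` and `tfidf_vectors`, and the sample domains are all assumptions made for illustration.

```python
import math
from collections import Counter


def char_ngrams(domain, n=2):
    """Split a domain name into overlapping character n-grams (assumed tokenization)."""
    name = domain.lower()
    return [name[i:i + n] for i in range(len(name) - n + 1)]


def tfidf_vectors(domains, n=2):
    """Return (vocabulary, vectors) with one TF-IDF vector per domain name."""
    docs = [Counter(char_ngrams(d, n)) for d in domains]
    vocab = sorted({gram for doc in docs for gram in doc})
    # Document frequency: number of domain names that contain each n-gram.
    df = Counter(gram for doc in docs for gram in doc)
    total = len(docs)
    # Smoothed IDF (an assumed variant): log((1 + N) / (1 + df)) + 1.
    idf = {g: math.log((1 + total) / (1 + df[g])) + 1.0 for g in vocab}
    vectors = []
    for doc in docs:
        length = sum(doc.values()) or 1  # avoid division by zero for very short names
        vectors.append([doc[g] / length * idf[g] for g in vocab])
    return vocab, vectors


# Hypothetical sample domains, not taken from the paper's dataset.
domains = ["google.com", "goog1e-login.com", "xkqzvbtw.net"]
vocab, vectors = tfidf_vectors(domains)
```

Each row of `vectors` has one entry per bigram in `vocab`; bigrams that occur in few domain names receive a larger weight than common ones such as `co`. Vectors of this kind would then serve as input to the VAE encoder described in the abstract.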

Year	DOI	Venue
2020	10.1109/CISS48834.2020.1570601577	2020 54th Annual Conference on Information Sciences and Systems (CISS)
Keywords	DocType	ISBN
malicious domain names detection,VAE,cybersecurity,machine learning	Conference	978-1-7281-8831-7
Citations	PageRank	References
0	0.34	5
Authors
3

Authors (3 rows)

Name	Order	Citations	PageRank
Yuwei Sun	1	2	2.78
Ng S. T. Chong	2	0	0.34
Hideya Ochiai	3	3	3.13
