HIT4Mal: Hybrid image transformation for malware classification / Vu, Duc‐Ly; Nguyen, Trong‐kha; Nguyen, Tam V.; Nguyen, Tu N.; Massacci, Fabio; Phung, Phu H. - In: TRANSACTIONS ON EMERGING TELECOMMUNICATIONS TECHNOLOGIES. - ISSN 2161-3915. - 31:11(2020), pp. e378901-e3789015. [10.1002/ett.3789]
HIT4Mal: Hybrid image transformation for malware classification
Vu, Duc‐Ly; Massacci, Fabio
2020-01-01
Abstract
Modern malware employs a variety of detection-avoidance techniques to bypass state‐of‐the‐art detection methods. An emerging approach to this problem combines image transformation with machine learning models to classify and detect malware. However, existing works in this field apply only simple image transformations, which do not consider how color encoding and pixel rendering techniques affect the performance of machine learning classifiers. In this article, we propose a novel approach to encoding and arranging the bytes of binary files into images. The resulting images contain statistical artifacts (e.g., entropy) and syntactic artifacts (e.g., strings), and their pixels are laid out along space‐filling curves. Thanks to these features, our encoding method surpasses existing methods, as demonstrated by extensive experiments. In particular, our proposed method achieved 93.01% accuracy using the combination of the entropy encoding and the character class scheme on the Hilbert curve.
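To make the idea in the abstract concrete, the following is a minimal sketch of how bytes might be placed along a Hilbert curve and colored by character class and local entropy. It is not the article's implementation: the function names, the five-way byte classification, the 32-byte entropy window, and the choice of color channels are all assumptions made for illustration.

```python
import math
from collections import Counter


def hilbert_d2xy(order, d):
    """Map index d along a Hilbert curve to (x, y) on a 2**order square grid."""
    x = y = 0
    t = d
    s = 1
    while s < (1 << order):
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:
            # Rotate/flip the quadrant so sub-curves join end to end.
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def char_class(b):
    """Coarse byte class index (hypothetical palette, not from the article)."""
    if b == 0x00:
        return 0  # null padding
    if b == 0xFF:
        return 4  # all-ones byte
    if 0x20 <= b <= 0x7E:
        return 2  # printable ASCII, e.g. embedded strings
    if b < 0x20:
        return 1  # control characters
    return 3      # remaining bytes (0x7F-0xFE)


def window_entropy(data, i, window=32):
    """Shannon entropy, in bits per byte, of a window around position i."""
    lo = max(0, i - window // 2)
    chunk = data[lo:lo + window]
    n = len(chunk)
    return -sum(c / n * math.log2(c / n) for c in Counter(chunk).values())


def encode(data, order):
    """Return a 2**order square image as rows of (R, G, B) tuples.

    Assumed channel layout: red = local entropy, blue = character class.
    """
    side = 1 << order
    img = [[(0, 0, 0)] * side for _ in range(side)]
    for i, b in enumerate(data[:side * side]):
        x, y = hilbert_d2xy(order, i)
        red = round(window_entropy(data, i) / 8 * 255)  # 8 bits is the maximum
        blue = char_class(b) * 63
        img[y][x] = (red, 0, blue)
    return img
```

Because consecutive indices on a Hilbert curve always land on neighboring pixels, bytes that are close together in the file stay close together in the image, which is the locality property that motivates using a space‐filling curve rather than simple row-by-row filling.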
File | Size | Format
---|---|---
ett.3789.pdf (access: archive managers only; type: publisher's layout; license: all rights reserved) | 8.1 MB | Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.