HIT4Mal: Hybrid image transformation for malware classification / Vu Duc, Ly; Nguyen, Trong‐kha; Nguyen, Tam V.; Nguyen, Tu N.; Massacci, Fabio; Phung, Phu H. - In: TRANSACTIONS ON EMERGING TELECOMMUNICATIONS TECHNOLOGIES. - ISSN 2161-3915. - 31:11(2020), pp. e378901-e3789015. [10.1002/ett.3789]

HIT4Mal: Hybrid image transformation for malware classification

Vu, Duc‐Ly; Massacci, Fabio
2020-01-01

Abstract

Modern malware employs a variety of evasion techniques to bypass state‐of‐the‐art detection methods. An emerging approach to this problem combines image transformation with machine learning models to classify and detect malware. However, existing work in this field applies only simple image transformations, which do not consider how color encoding and pixel rendering techniques affect the performance of machine learning classifiers. In this article, we propose a novel approach to encoding and arranging bytes from binary files into images. The resulting images capture statistical (e.g., entropy) and syntactic (e.g., strings) artifacts, and their pixels are laid out along space‐filling curves. Thanks to these features, our encoding method outperforms existing methods, as demonstrated by extensive experiments. In particular, our method achieved 93.01% accuracy when combining entropy encoding with the character class scheme on the Hilbert curve.
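As a rough illustration of the idea summarized in the abstract, the following Python sketch places the bytes of a file on a square grid along a Hilbert curve and colors each pixel either by a character class or by local Shannon entropy. It is not the authors' implementation: the function names, the color palette, the class boundaries, and the 32-byte entropy window are all assumptions chosen for this example, and the abstract does not specify how the two schemes are combined.

```python
import math
from collections import Counter


def hilbert_d2xy(order, d):
    """Map index d along a Hilbert curve to (x, y) on a 2**order square grid."""
    x = y = 0
    t = d
    s = 1
    n = 1 << order
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:
            # Rotate/flip the quadrant so consecutive indices stay adjacent.
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def shannon_entropy(window):
    """Shannon entropy of a byte sequence, in bits per byte (0.0 to 8.0)."""
    if not window:
        return 0.0
    n = len(window)
    counts = Counter(window)
    return -sum(c / n * math.log2(c / n) for c in counts.values())


def char_class_color(b):
    """Assumed (hypothetical) palette: one RGB color per byte class."""
    if b == 0x00:
        return (0, 0, 0)          # null byte
    if b == 0xFF:
        return (255, 255, 255)    # 0xFF padding
    if 0x20 <= b <= 0x7E:
        return (55, 126, 184)     # printable ASCII, e.g. embedded strings
    if b < 0x20 or b == 0x7F:
        return (77, 175, 74)      # ASCII control characters
    return (228, 26, 28)          # remaining high bytes


def bytes_to_image(data, order, scheme="class", window=32):
    """Return a 2**order x 2**order grid of RGB tuples (rows indexed by y)."""
    n = 1 << order
    grid = [[(0, 0, 0)] * n for _ in range(n)]
    for d, b in enumerate(data[: n * n]):
        if scheme == "entropy":
            # Gray level proportional to the entropy of a window around d.
            lo = max(0, d - window // 2)
            h = shannon_entropy(data[lo: lo + window])
            level = round(h / 8.0 * 255)
            color = (level, level, level)
        else:
            color = char_class_color(b)
        x, y = hilbert_d2xy(order, d)
        grid[y][x] = color
    return grid
```

Because the Hilbert curve keeps consecutive indices in neighboring cells, bytes that are close together in the file stay close together in the image, which is the usual motivation for preferring a space‐filling curve over a plain row‐by‐row layout. Calling `bytes_to_image(data, 8)` on the first 65,536 bytes of a file would give a 256 x 256 grid that can then be passed to an image library or a classifier.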
Year: 2020
Issue: 11
Authors: Vu Duc, Ly; Nguyen, Trong‐kha; Nguyen, Tam V.; Nguyen, Tu N.; Massacci, Fabio; Phung, Phu H.
Files in this item:
File: ett.3789.pdf
Access: Archive managers only
Type: Publisher's layout
License: All rights reserved
Size: 8.1 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11572/278038
Citations
  • PMC: not available
  • Scopus: 42
  • Web of Science (ISI): 34