Thesis: Android malware classification using deep learning
In addition, many papers convert API calls to vectors [9, 14, 70, 73], and using
these vectors as model input also produces good results. S. K. Sasidharan et al. [70]
trained a model using the Profile Hidden Markov Model (PHMM). API calls and methods
from malware in the Drebin dataset were transformed into an encoded list, with 70% of
the data used for training and 30% for testing. The model reached an accuracy of 94.5%
with a 7% false positive rate; precision and recall were 0.93 and 0.95, respectively.
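
The encoding and the 70/30 split described above can be illustrated with a minimal sketch. It is not the pipeline used in [70]: the API-call traces, the integer-ID scheme, and the helper names (`build_vocabulary`, `encode`, `split_train_test`) are all assumptions made for illustration.

```python
import random


def build_vocabulary(traces):
    """Assign each distinct API call an integer ID, starting at 1 (0 means unknown)."""
    vocab = {}
    for trace in traces:
        for call in trace:
            if call not in vocab:
                vocab[call] = len(vocab) + 1
    return vocab


def encode(trace, vocab):
    """Replace each API call with its integer ID; calls not in the vocabulary become 0."""
    return [vocab.get(call, 0) for call in trace]


def split_train_test(samples, train_percent=70, seed=0):
    """Shuffle a copy of the samples and split it, e.g. 70% training and 30% testing."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * train_percent // 100  # integer arithmetic avoids float rounding
    return shuffled[:cut], shuffled[cut:]


# Hypothetical API-call traces, one list per app (not taken from the Drebin dataset).
apps = [
    ["getDeviceId", "sendTextMessage", "getDeviceId"],
    ["openConnection", "getDeviceId"],
    ["sendTextMessage"],
    ["openConnection"],
    ["getDeviceId"],
    ["openConnection", "sendTextMessage"],
    ["getDeviceId", "openConnection"],
    ["sendTextMessage", "getDeviceId"],
    ["openConnection", "openConnection"],
    ["getDeviceId", "getDeviceId"],
]

vocab = build_vocabulary(apps)
encoded = [encode(trace, vocab) for trace in apps]
train, test = split_train_test(encoded, train_percent=70)
```

In practice the vocabulary would be built from the training portion only, so that calls seen only in test apps map to the unknown ID instead of leaking into training.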
Although opcodes are used less often than permissions and API calls, many studies
have relied on them exclusively for malware detection [32, 52, 53, 74, 75, 76, 77, 78,
79, 80, 81]. In one approach, the extracted opcodes were converted to grayscale images
and fed into a deep-learning model, resulting in a detection accuracy of 95.55% and a
classification accuracy of 89.96%. V. Sihag et al. [53] used opcodes to address the
problem of code obfuscation. Using the Random Forest algorithm, detection reached
98.8% accuracy on the Drebin dataset and the PRAGuard dataset of obfuscated apps, with
10,479 malware apps used. In [79], the authors proposed an effective opcode extraction
method and applied a Convolutional Neural Network for classification. k-max pooling
was used in the pooling phase, achieving an accuracy of more than 99%. M. Amin et
al. [80], on the other hand, vectorized the extracted opcodes through encoding and
trained deep neural networks such as Bidirectional Long Short-Term Memory (BiLSTM)
networks. With a dataset of more than 1.8 million apps, the paper reported an accuracy
of 99.9%.
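
Two of the operations mentioned in this paragraph, turning an opcode sequence into a grayscale image and k-max pooling, can be sketched as follows. The cited papers do not specify their exact mappings here, so this is an assumed, simplified version: each one-byte opcode value is used directly as a pixel intensity, and the function names and sample values are hypothetical.

```python
def opcodes_to_gray_image(opcodes, width):
    """Lay opcode values (0-255) out row by row as pixels, zero-padding the last row."""
    pixels = [op & 0xFF for op in opcodes]  # keep each value within one byte
    remainder = len(pixels) % width
    if remainder:
        pixels += [0] * (width - remainder)
    return [pixels[i:i + width] for i in range(0, len(pixels), width)]


def k_max_pooling(values, k):
    """Keep the k largest values of a sequence while preserving their original order."""
    top_indices = sorted(range(len(values)), key=lambda i: values[i], reverse=True)[:k]
    return [values[i] for i in sorted(top_indices)]


# Hypothetical opcode byte values for one method.
sample_opcodes = [0x12, 0x6E, 0x0E, 0x1A, 0x70]
image = opcodes_to_gray_image(sample_opcodes, width=2)

# Hypothetical activations from one convolutional filter.
pooled = k_max_pooling([3, 9, 1, 7, 5], k=3)
```

Unlike ordinary max pooling, which keeps a single value per window, k-max pooling returns a fixed-length output of the k strongest activations regardless of input length, which suits opcode sequences of varying size.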
Other feature groups are usually combined with permissions, API calls, or opcodes.
Because these groups often contain few features and are not available in every app,
they are difficult to use on their own. According to dblp statistics, only two papers
published since 2019 [82, 83] use the Intent feature independently. They report an
accuracy of 95.1% [82] and an F1-score of 97% [83]; however, the datasets are
self-collected, and the number of usable files in them is small.
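
For reference, the metrics quoted throughout this section (accuracy, precision, recall, F1-score, and false positive rate) all derive from the four confusion-matrix counts. The sketch below uses the standard definitions; the function name and the counts in the example are hypothetical and do not come from any cited paper.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute common binary-classification metrics from confusion-matrix counts.

    tp: malware flagged as malware      fp: benign flagged as malware
    tn: benign flagged as benign        fn: malware flagged as benign
    """
    total = tp + fp + tn + fn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "false_positive_rate": fp / (fp + tn) if (fp + tn) else 0.0,
    }


# Hypothetical counts for a test set of 200 apps.
metrics = classification_metrics(tp=90, fp=10, tn=95, fn=5)
```

Accuracy alone can look strong on an imbalanced dataset, which is why F1-score and the false positive rate are usually reported alongside it.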
Table 1.4 describes some API packages that commonly appear in Android malware
detection datasets.
Feature combination is common, and permissions and API calls appear frequently
because they play a crucial part in malware detection [14, 25, 33, 44, 84, 85, 86, 87,
88, 89, 90]. Many research papers report high effectiveness when using combined
feature groups, as shown by their evaluation results.
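
One simple way to combine feature groups is to concatenate a binary permission vector with a binary API-call vector. The sketch below assumes this approach; the cited papers may combine features differently. The short `PERMISSIONS` and `API_CALLS` lists and the helper names are illustrative assumptions, whereas a real feature set would hold hundreds of entries.

```python
def multi_hot(present, vocabulary):
    """Return a 0/1 vector marking which vocabulary items occur in `present`."""
    present = set(present)
    return [1 if item in present else 0 for item in vocabulary]


# Hypothetical, deliberately tiny feature vocabularies.
PERMISSIONS = [
    "android.permission.SEND_SMS",
    "android.permission.INTERNET",
    "android.permission.READ_CONTACTS",
]
API_CALLS = ["getDeviceId", "sendTextMessage", "openConnection"]


def combined_features(app_permissions, app_api_calls):
    """Concatenate the permission vector and the API-call vector for one app."""
    return multi_hot(app_permissions, PERMISSIONS) + multi_hot(app_api_calls, API_CALLS)


features = combined_features(
    ["android.permission.INTERNET"],
    ["getDeviceId", "openConnection"],
)
```

The resulting vector has one position per permission followed by one position per API call, so a single classifier can be trained on both groups at once.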