Journal Article

Performance Analysis of Isolation Forest Algorithm in Fraud Detection of Credit Card Transactions



Losses from fraud in e-commerce transactions, especially credit card transactions, continue to grow each year. One way to reduce the risk of fraudulent credit card transactions is to apply a detection technique to ongoing transactions. In its original state, credit card transaction data is unlabeled; fraud makes up a very small share of the training data, so the data is highly imbalanced; and fraud patterns keep changing. Isolation Forest is an unsupervised algorithm that detects anomalies efficiently. Previous studies analyzed the performance of Isolation Forest with the ROC-AUC metric, which can give misleading information. This study makes two contributions. The first is a performance analysis using both ROC-AUC and AUCPR (area under the precision-recall curve), which shows that a high ROC-AUC value does not guarantee that the model detects fraud reliably; AUCPR describes the model's ability to capture fraudulent data more appropriately. The second is a set of techniques for improving the performance of the Isolation Forest model: optimizing the amount of training data, the feature selection, the amount of fraud contamination, and the hyper-parameter settings at the modeling (training) stage. Experiments were carried out on a real-life dataset from ULB. The best results were obtained with a validation data split ratio of 60:40, the five most important features, only 60% of the fraud data, and hyper-parameters of 100 trees, a maximum of 128 samples, and a contamination of 0.001. The validation performance of this model is precision 0.809917, recall 0.710145, F1-score 0.756757, ROC-AUC 0.969728, and AUCPR 0.637993; the testing performance is precision 0.807143, recall 0.763514, F1-score 0.784722, ROC-AUC 0.97371, and AUCPR 0.759228.
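The abstract's point that a high ROC-AUC can coexist with weak fraud capture can be illustrated with a small sketch. The Python below is not the authors' code and does not use the ULB dataset. It builds a synthetic, highly imbalanced ranking (all scores are invented for illustration) and computes ROC-AUC and average precision, a common estimate of AUCPR, using the hypothetical helpers `roc_auc` and `average_precision`. It also checks that the reported F1-scores follow from the reported precision and recall.

```python
# Illustrative only: synthetic anomaly scores, not the ULB data or the paper's model.

def roc_auc(labels, scores):
    """Probability that a random fraud outscores a random normal (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean of the precision at each rank where a fraud is retrieved."""
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits


def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# 1000 normal transactions with even scores 0, 2, ..., 1998 (higher = more anomalous).
normal_scores = list(range(0, 2000, 2))
# 10 frauds with odd scores 1901, 1903, ..., 1919: each outranks at least 95% of
# the normals, yet 40 normals still score above every fraud.
fraud_scores = list(range(1901, 1920, 2))

labels = [0] * len(normal_scores) + [1] * len(fraud_scores)
scores = normal_scores + fraud_scores

auc = roc_auc(labels, scores)
ap = average_precision(labels, scores)
print(f"ROC-AUC = {auc:.4f}, average precision = {ap:.4f}")

# Consistency check on the figures reported in the abstract.
print(f"validation F1 = {f1(0.809917, 0.710145):.6f}")
print(f"testing F1    = {f1(0.807143, 0.763514):.6f}")
```

In this toy ranking the frauds sit near the top, so ROC-AUC is high, but the few dozen normals ranked above them keep precision low. That is the situation the abstract warns about. The abstract does not name a library. If scikit-learn's `IsolationForest` were used, the reported settings would presumably correspond to `n_estimators=100`, `max_samples=128`, and `contamination=0.001`.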


Availability

JKI5-013, JKI V6N2 Oktober 2020, Perpustakaan FT UPI YAI, Available

Detailed Information

Series Title
Khazanah Informatika : Jurnal Ilmu Komputer dan Informatika
Call Number
JKI V6N2 Oktober 2020
Publisher
Universitas Muhammadiyah Surakarta : Surakarta
Physical Description
pp. 165-175
Language
English
ISBN/ISSN
2621-038X
Classification
JKI
Content Type
-
Media Type
-
Carrier Type
-
Edition
Volume 6, Number 2, October 2020
Subject
Specific Detail Info
-
Statement of Responsibility

Other/Related Versions

No other versions available



