Publication:
Increasing the Performance of Machine Learning-Based IDSs on an Imbalanced and Up-to-Date Dataset

dc.contributor.authorBAYDOĞMUŞ, GÖZDE KARATAŞ
dc.contributor.authorDemir, Önder
dc.contributor.authorŞAHİNGÖZ, ÖZGÜR KORAY
dc.date.accessioned2022-11-16T07:14:19Z
dc.date.available2022-11-16T07:14:19Z
dc.date.issued2020
dc.description.abstractIn recent years, due to the extensive use of the Internet, the number of networked computers has been increasing in our daily lives. Weaknesses of the servers enable hackers to intrude on computers by using not only known but also new attack-types, which are more sophisticated and harder to detect. To protect the computers from them, Intrusion Detection System (IDS), which is trained with some machine learning techniques by using a pre-collected dataset, is one of the most preferred protection mechanisms. The used datasets were collected during a limited period in some specific networks and generally don't contain up-to-date data. Additionally, they are imbalanced and cannot hold sufficient data for all types of attacks. These imbalanced and outdated datasets decrease the efficiency of current IDSs, especially for rarely encountered attack types. In this paper, we propose six machine-learning-based IDSs by using K Nearest Neighbor, Random Forest, Gradient Boosting, Adaboost, Decision Tree, and Linear Discriminant Analysis algorithms. To implement a more realistic IDS, an up-to-date security dataset, CSE-CIC-IDS2018, is used instead of older and mostly worked datasets. The selected dataset is also imbalanced. Therefore, to increase the efficiency of the system depending on attack types and to decrease missed intrusions and false alarms, the imbalance ratio is reduced by using a synthetic data generation model called Synthetic Minority Oversampling TEchnique (SMOTE). Data generation is performed for minor classes, and their numbers are increased to the average data size via this technique. Experimental results demonstrated that the proposed approach considerably increases the detection rate for rarely encountered intrusions.en
dc.description.sponsorshipMarmara University
dc.identifier8
dc.identifier.citationKaratas, G., Demir, O., & Sahingoz, O. K. (2020). Increasing the performance of machine learning-based IDSs on an imbalanced and up-to-date dataset. IEEE Access, 8, 32150-32162.
dc.identifier.issn2169-3536
dc.identifier.scopus2-s2.0-85081101801
dc.identifier.urihttps://doi.org/10.1109/ACCESS.2020.2973219
dc.identifier.urihttps://hdl.handle.net/11413/7928
dc.identifier.wos000525419100028
dc.language.isoen
dc.publisherIEEE
dc.relation.journalIEEE Access
dc.rightsinfo:eu-repo/semantics/openAccess
dc.subjectIDS
dc.subjectIntrusion Detection
dc.subjectSMOTE
dc.subjectMachine Learning
dc.subjectCSE-CIC-IDS2018
dc.subjectImbalanced
dc.titleIncreasing the Performance of Machine Learning-Based IDSs on an Imbalanced and Up-to-Date Dataset
dc.typeArticle
dspace.entity.typePublication
local.indexed.atwos
local.indexed.atscopus
local.journal.endpage32162
local.journal.startpage32150
relation.isAuthorOfPublication4e820274-4a42-44ba-aced-ca58912c0424
relation.isAuthorOfPublicationc0dcce72-7c1e-4e9b-ae5c-5f3de0540a4d
relation.isAuthorOfPublication.latestForDiscovery4e820274-4a42-44ba-aced-ca58912c0424
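The abstract states that minority classes are oversampled with SMOTE until they reach "the average data size". The following is a minimal, standard-library-only sketch of that idea, not the authors' implementation. It assumes that "average data size" means the mean number of samples per class, and it uses a simplified SMOTE-style step (linear interpolation between a sample and one of its k nearest same-class neighbours). The function name `smote_to_average`, its parameters, and the toy data are hypothetical and chosen only for illustration.

```python
import random
from collections import Counter, defaultdict


def smote_to_average(X, y, k=5, seed=0):
    """Oversample each class below the mean class size up to that mean.

    Assumption: "average data size" = total samples // number of classes.
    Synthetic points are interpolated between a minority sample and one of
    its k nearest same-class neighbours (simplified SMOTE-style step).
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for features, label in zip(X, y):
        by_class[label].append(list(features))

    target = sum(len(rows) for rows in by_class.values()) // len(by_class)

    X_out, y_out = [list(f) for f in X], list(y)
    for label, rows in by_class.items():
        needed = target - len(rows)
        if needed <= 0 or len(rows) < 2:
            continue  # larger classes and single-sample classes are left as-is
        for _ in range(needed):
            base = rng.choice(rows)
            # Rank the other same-class samples by squared Euclidean distance.
            others = [r for r in rows if r is not base]
            others.sort(key=lambda r: sum((a - b) ** 2 for a, b in zip(base, r)))
            neighbour = rng.choice(others[:k])
            gap = rng.random()  # position along the segment base -> neighbour
            X_out.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
            y_out.append(label)
    return X_out, y_out


# Toy data (hypothetical): 8 "benign" flows and 2 "attack" flows.
X = [[float(i), 0.0] for i in range(8)] + [[1.0, 1.0], [2.0, 2.0]]
y = ["benign"] * 8 + ["attack"] * 2

X_new, y_new = smote_to_average(X, y)
counts = Counter(y_new)
```

In practice a maintained library implementation of SMOTE would normally be preferred, and the resampling would be applied to the training split only so that synthetic samples do not leak into evaluation data.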

Files

Original bundle

Name:
Tam Metin/Full Text
Size:
1.86 MB
Format:
Adobe Portable Document Format

License bundle

Name:
license.txt
Size:
1.71 KB
Format:
Item-specific license agreed upon to submission