Machine Learning Based Phishing Detection from URIs

Buber, Ebubekir; Demir, Önder; Diri, Banu; ŞAHİNGÖZ, ÖZGÜR KORAY

dc.contributor.author	Buber, Ebubekir
dc.contributor.author	Demir, Önder
dc.contributor.author	Diri, Banu
dc.contributor.author	ŞAHİNGÖZ, ÖZGÜR KORAY
dc.contributor.authorID	214903	tr_TR
dc.date.accessioned	2019-09-05T12:31:17Z
dc.date.available	2019-09-05T12:31:17Z
dc.date.issued	2017-12
dc.description.abstract	Due to the rapid growth of the Internet, users have shifted their preference from traditional shopping to electronic commerce. Instead of robbing banks or shops, criminals now try to find their victims in cyberspace using specific tricks. Exploiting the anonymous structure of the Internet, attackers devise new techniques, such as phishing, which deceive victims with false websites in order to collect sensitive information such as account IDs, usernames, and passwords. Determining whether a web page is legitimate or phishing is a very challenging problem because of the semantics-based structure of the attack, which mainly exploits computer users' vulnerabilities. Although software companies launch new anti-phishing products that use blacklists, heuristics, and visual and machine learning-based approaches, these products cannot prevent all phishing attacks. In this paper, a real-time anti-phishing system is proposed that uses seven different classification algorithms and natural language processing (NLP) based features. The system has the following properties that distinguish it from other studies in the literature: language independence, use of a large amount of phishing and legitimate data, real-time execution, detection of new websites, independence from third-party services, and use of feature-rich classifiers. To measure the performance of the system, a new dataset was constructed and the experiments were run on it. According to the experimental and comparative results from the implemented classification algorithms, the Random Forest algorithm with only NLP-based features gives the best performance, with an accuracy of 97.98% for detecting phishing URLs.	tr_TR
dc.identifier.uri	https://hdl.handle.net/11413/5244
dc.identifier.wos	000449892000024
dc.language.iso	en_US	tr_TR
dc.relation.journal	17th International Conference on Intelligent Systems Design and Applications	tr_TR
dc.rights	Attribution-NonCommercial-NoDerivs 3.0 United States	*
dc.rights.uri	http://creativecommons.org/licenses/by-nc-nd/3.0/us/	*
dc.subject	Cyber Security	tr_TR
dc.subject	Phishing Attack	tr_TR
dc.subject	Machine Learning	tr_TR
dc.subject	Classification Algorithms	tr_TR
dc.subject	Cyber Attack Detection	tr_TR
dc.subject	Siber Güvenlik	tr_TR
dc.subject	Kimlik Avı Saldırısı	tr_TR
dc.subject	Makine Öğrenme	tr_TR
dc.subject	Sınıflandırma Algoritmaları	tr_TR
dc.subject	Siber Saldırı Tespiti	tr_TR
dc.title	Machine Learning Based Phishing Detection from URIs	tr_TR
dc.type	conferenceObject	tr_TR
dspace.entity.type	Publication
local.indexed.at	WOS
relation.isAuthorOfPublication	c0dcce72-7c1e-4e9b-ae5c-5f3de0540a4d
relation.isAuthorOfPublication.latestForDiscovery	c0dcce72-7c1e-4e9b-ae5c-5f3de0540a4d

Collections

Bilgisayar Mühendisliği Bölümü / Department of Computer Engineering
WoS İndeksli Yayınlar / WoS Indexed Publications
