Person:
İLGEN, BAHAR

Job Title

Assistant Professor (Dr. Öğr. Üyesi)

Last Name

İLGEN

First Name

BAHAR

Search Results

Now showing 1 - 10 of 11
  • Publication
    Building up lexical sample dataset for Turkish word sense disambiguation
    (2012-07-02) Adalı, Eşref; Tantuğ, Ahmet Cüneyd; İLGEN, BAHAR; 141812; 8786; 21833
    Word Sense Disambiguation (WSD) has become an even more important research area in recent years with the widespread use of Natural Language Processing (NLP) applications. The WSD task has two variants: the “Lexical Sample” and “All Words” approaches. The Lexical Sample approach disambiguates the occurrences of a small sample of previously selected target words, while the All Words approach disambiguates every word in a piece of text. In the scope of this work, a Lexical Sample Dataset for Turkish has been prepared. As a first step, highly ambiguous words in Turkish were selected. Text samples were then collected for the chosen words, and five taggers annotated the word senses. This paper summarizes the step-by-step process of building a Lexical Sample Dataset for Turkish and presents the results of some experiments on it.
  • Publication
    Deep Learning Based Document Modeling for Personality Detection from Turkish Texts
    (2019-10-24) İLGEN, BAHAR; 141812
    The use of social media is increasing exponentially because it is the easiest and fastest way to share information between people or organizations. As a result of this broad use and the activity of people on social networks, a considerable amount of data is generated continuously. The availability of user-generated data makes it possible to analyze the personality of people. Personality is the most distinctive feature of an individual. The results of these analyses can be utilized in several ways. They help human resources recruitment units identify suitable candidates. Similar products and services can be offered to people who share similar personality characteristics. Personality traits help in the diagnosis of certain mental illnesses. In forensics, the personality traits of suspects can also help clarify a case. With the rapid dissemination of online documents in many different languages, the classification of these documents has become an important requirement. Machine Learning (ML) and Natural Language Processing (NLP) methods are used to classify this digitized data. In this study, current ML techniques and methodologies have been used to classify text documents and analyze personal characteristics from these datasets. As a result of classification, detailed information about the personality traits of the writer could be obtained. The results indicate that frequency-based analysis and the use of emotional words at the word level are very important in textual personality analysis.
  • Publication
    Exploring feature sets for Turkish word sense disambiguation
    (TUBİTAK Scientific & Technical Research Council Turkey, Ataturk Bulvarı No 221, Kavaklıdere, Ankara, 00000, Turkey, 2016) Adalı, Eşref; Tantuğ, Ahmet Cüneyd; İLGEN, BAHAR; 141812; 8786; 21833
    This paper presents an exploration and evaluation of a diverse set of features that influence word-sense disambiguation (WSD) performance. WSD has the potential to improve many natural language processing (NLP) tasks, as it is one of the most crucial steps in the area. It is known that exploiting effective features and removing redundant ones helps improve the results. There are two groups of feature sets used to disambiguate senses and select the most appropriate one among a set of candidates: collocational and bag-of-words (BoW) features. We examine the effects of using these two feature sets on the Turkish Lexical Sample Dataset (TLSD), which comprises the most ambiguous verb and noun samples. In addition, a joint setting of the feature groups has been applied to measure further improvement in the results. Our results suggest that the joint setting of features improves accuracy by up to 7%. The effective window size around the ambiguous words has been determined for the noun and verb sets. Additionally, the suggested feature set has been investigated on a different corpus that had been used in previous studies on Turkish WSD. The results of the experiments investigating diverse morphological groups show that the word root and the case marker are significant features for disambiguating senses.
  • Publication
    Multi-document summarization for Turkish news
    (IEEE, 345 E 47th St, New York, NY 10017 USA, 2017) Demirci, Ferhat; Karabudak, Engin; İLGEN, BAHAR; 141812
    In this paper, we introduce our multi-document summarization system for Turkish news. The aim of the summarization system is to build a single document from multiple previously collected news documents. The news items were collected from several Turkish news sources via Really Simple Syndication (RSS). They were separated into clusters according to their topics. We utilized the cosine similarity metric for the clustering process. Latent Semantic Analysis (LSA) has been used in the summarization phase. Multi-Document Summarization (MDS) differs from single-document summarization in that the issues of compression, speed, redundancy and passage selection are essential in the formation of ideal summaries. In this study, we utilized term frequency in document scoring, which let us select the sentences with a higher degree of importance. We used the ROUGE technique to evaluate the system, and our results show that the average of the system's recall and precision is 43%. Fifteen volunteers took part in the manual summarization phase. The low percentage is attributed to the texts being collected randomly without any editing. It has been observed that the number of sentences and the summarization rate affect the accuracy rate.
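As a rough illustration of the clustering step mentioned in the abstract above, the following sketch computes cosine similarity over raw term-frequency vectors and groups documents greedily. The function names, the whitespace tokenizer, the 0.5 threshold, and the single-pass grouping strategy are all assumptions made for illustration; the abstract does not say how the authors' system tokenizes text or forms clusters.

```python
import math
from collections import Counter


def term_frequencies(text):
    """Lowercase, split on whitespace, and count tokens (hypothetical tokenizer)."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def cluster_by_similarity(docs, threshold=0.5):
    """Greedy single pass: a document joins the first cluster whose first
    member is at least `threshold` similar, otherwise it starts a new cluster.
    The threshold value is an assumption, not taken from the paper."""
    vectors = [term_frequencies(d) for d in docs]
    clusters = []  # each cluster is a list of document indices
    for i, vec in enumerate(vectors):
        for cluster in clusters:
            if cosine_similarity(vec, vectors[cluster[0]]) >= threshold:
                cluster.append(i)
                break
        else:
            clusters.append([i])
    return clusters
```

Calling `cluster_by_similarity` on a list of news texts returns lists of indices, one list per topic cluster. A production system would likely use TF-IDF weighting and Turkish-aware tokenization rather than raw counts.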
  • Publication
    Investigation of Several Parameters on Goldbach Partitions
    (2011) Özkan, Derya; İLGEN, BAHAR; 141812
  • Publication
    A Comparative Study to Determine the Effective Window Size of Turkish Word Sense Disambiguation Systems
    (Springer, 233 Spring Street, New York, NY 10013, United States, 2013) Adalı, Eşref; Tantuğ, Ahmet Cüneyd; İLGEN, BAHAR; 141812; 8786; 21833
    In this paper, the effect of different windowing schemes on word sense disambiguation accuracy is presented. The Turkish Lexical Sample Dataset has been used in the experiments. We took the samples of ambiguous verbs and nouns from the dataset and used bag-of-word properties as context information. The experiments have been repeated for different window sizes with several machine learning algorithms. We follow a 2/3 splitting strategy (2/3 for training, 1/3 for testing) and determine the most frequently used words in the training part. After removing stop words, we repeated the experiments using the most frequent 100, 75, 50 and 25 content words of the training data. Our findings show that using the most frequent 75 words as features improves accuracy for Turkish verbs. Similar results have been obtained for Turkish nouns when we use the most frequent 100 words of the training set. Considering this information, selected algorithms have been tested on varying window sizes {30, 15, 10 and 5}. Our findings show that the Naive Bayes and Functional Tree methods yielded better accuracy results. The window size +/-5 gives the best average results for both the noun and the verb groups. It is observed that the best results of the two groups are 65.8 and 56% points above the most frequent sense baseline of the verb and noun groups respectively.
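The windowed bag-of-words setup described in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions: the helper names, the binary feature encoding, and the sample sentence are hypothetical, and the paper's actual preprocessing for Turkish (for example, morphological analysis) is not reproduced here.

```python
from collections import Counter


def context_window(tokens, target_index, size):
    """Return up to `size` tokens on each side of the target word,
    excluding the target itself."""
    left = tokens[max(0, target_index - size):target_index]
    right = tokens[target_index + 1:target_index + 1 + size]
    return left + right


def top_content_words(windows, stop_words, n):
    """Pick the n most frequent non-stop words across all training windows."""
    counts = Counter(
        word for window in windows for word in window if word not in stop_words
    )
    return [word for word, _ in counts.most_common(n)]


def bow_vector(window, vocabulary):
    """Binary bag-of-words vector: 1 if the vocabulary word occurs in the window.
    The binary encoding is an assumption; counts would work equally well."""
    present = set(window)
    return [1 if word in present else 0 for word in vocabulary]
```

With `size=5` this corresponds to the +/-5 window that the abstract reports as giving the best average results, and `n` plays the role of the 100, 75, 50 or 25 most frequent content words.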
  • Publication
    Face Detection & Recognition for Automatic Attendence System
    (2018) Sanli, Onur; İLGEN, BAHAR; 141812
    Human face recognition is an important part of biometric verification. Methods that utilize physical properties, such as the human face, have changed greatly since the emergence of image processing techniques. Human face recognition is widely used for verification purposes, in particular to verify whether learners attend lectures. Classical attendance confirmation wastes a lot of time. To address this time loss, an Attendance System with Face Recognition has been developed which automatically tracks the attendance status of the students. The Attendance System with Face Recognition performs the daily activities of attendance analysis, which is an important application of the face recognition task. By doing this, it saves time and effort in classrooms and meetings. In the scope of the proposed system, a camera attached to the front of the classroom continuously captures images of the students, detects the faces in the images, and compares them with the database to determine the participation of each student. Haar filtered AdaBoost is used to detect human faces in real time. Principal Component Analysis (PCA) and Local Binary Pattern Histograms (LBPH) algorithms have been used to identify the detected faces. The matched face is then used to mark course attendance. By using the Attendance System with Face Recognition, lecture time will be used more efficiently. Additionally, it will be possible to eliminate mistakes on attendance sheets.
  • Publication
    Exploring the effect of bag-of-words and bag-of-bigram features on Turkish word sense disambiguation
    (2014) Adalı, Eşref; İLGEN, BAHAR; 141812; 8786
    Feature selection in Word Sense Disambiguation (WSD) is as important as the selection of the algorithm used to remove sense ambiguity. Bag-of-word (BoW) features comprise the information of neighbors around the ambiguous target word without considering any relation between words. In this study, we investigate the effect of BoW features and Bag-of-bigrams (BoB) features on Turkish WSD and compare the results with collocational features. The results suggest that BoW features yield better accuracy than BoB features in all cases. According to the comparison results, collocational features are more effective than both the BoW and the BoB features for disambiguating word senses.
  • Publication
    The impact of collocational features in Turkish Word Sense Disambiguation.
    (2012-06-13) Adalı, Eşref; Tantuğ, Ahmet Cüneyd; İLGEN, BAHAR; 141812; 8786; 21833
    Word Sense Disambiguation (WSD) is the task of choosing the most appropriate sense of a word that has multiple senses in a given context. Collocational features acquired from the words neighboring the ambiguous word are one of the important knowledge sources in this area. This paper explores effective sets of collocational features in Turkish in order to obtain better Turkish WSD systems. A lexical sample dataset of highly polysemous nouns and verbs has been prepared as the initial step of the work. Several supervised learning algorithms have been tested on this data with different feature sets to select the best-performing features for both nouns and verbs in Turkish. We also investigated the impact of several collocational features of polysemous words and evaluated the performance of several supervised machine learning algorithms.
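To make the notion of collocational features concrete, here is a small sketch that records which word appears at each fixed position relative to the ambiguous target. Unlike bag-of-words features, the position of each neighbor is preserved. The function name, the default offsets, the `w-1`/`w+1` key format, and the padding token are assumptions for illustration; the abstract does not list the exact feature templates the authors used.

```python
def collocational_features(tokens, target_index, offsets=(-2, -1, 1, 2), pad="<PAD>"):
    """Map each relative offset to the token found at that position,
    or to `pad` when the position falls outside the sentence."""
    features = {}
    for offset in offsets:
        position = target_index + offset
        in_range = 0 <= position < len(tokens)
        # Keys look like "w-2", "w-1", "w+1", "w+2".
        features[f"w{offset:+d}"] = tokens[position] if in_range else pad
    return features
```

The resulting dictionary can be passed to any supervised learner that accepts categorical features, one feature per offset.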