Structure-Preserving and Query-Biased Document Summarisation for Web Searching

Pembe, F. Canan; Güngör, Tunga

Date

2009

Authors

Pembe, F. Canan

Güngör, Tunga

Publisher

Emerald Group Publishing Limited, Howard House, Wagon Lane, Bingley BD16 1WA, W Yorkshire, England

Type

Article

Abstract

Purpose - The purpose of this paper is to develop a new summarisation approach, namely structure-preserving and query-biased summarisation, to improve the effectiveness of web searching. During web searching, one aid for users is the document summaries provided in the search results. However, the summaries provided by current search engines have limitations in directing users to relevant documents.

Design/methodology/approach - The proposed system consists of two stages: document structure analysis and summarisation. In the first stage, a rule-based approach is used to identify the sectional hierarchies of web documents. In the second stage, query-biased summaries are created, making use of document structure both in the summarisation process and in the output summaries.

Findings - In structural processing, about 70 per cent accuracy in identifying document sectional hierarchies is obtained. The summarisation method is tested on a task-based evaluation method using English and Turkish document collections. The results show that the proposed method is a significant improvement over both unstructured query-biased summaries and Google snippets in terms of F-measure.

Practical implications - The proposed summarisation system can be incorporated into search engines. The structural processing technique also has applications in other information systems, such as browsing, outlining and indexing documents.

Originality/value - In the literature on summarisation, the effects of query-biased techniques and document structure are considered in only a few works and are researched separately. The research reported here differs from traditional approaches by combining these two aspects in a coherent framework. The work is also the first automatic summarisation study for Turkish targeting web search.
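
Illustrative sketch (not from the paper) - The abstract does not give the authors' rules or scoring formula, so the following Python fragment is only a rough illustration of what a query-biased summary that keeps section headings might look like. It assumes the document has already been split into (heading, text) sections, which stands in for the rule-based structure analysis stage. The function names, the term-overlap score and the heading bonus are all hypothetical choices made for this example.

```python
import re


def tokenize(text):
    # Lowercased word tokens; a deliberately naive stand-in for real
    # language-specific processing (e.g. Turkish morphology is ignored).
    return re.findall(r"\w+", text.lower())


def split_sentences(text):
    # Naive sentence splitter on ., ! or ? followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def summarise(sections, query, max_sentences=2):
    """Pick the sentences that share the most terms with the query and
    return them grouped under their original section headings.

    `sections` is a list of (heading, text) pairs in document order.
    This is an assumed input format, not the paper's representation.
    """
    query_terms = set(tokenize(query))
    scored = []
    for sec_idx, (heading, text) in enumerate(sections):
        # Hypothetical bonus: query terms in the heading lift its sentences.
        heading_hits = len(query_terms & set(tokenize(heading)))
        for sent_idx, sentence in enumerate(split_sentences(text)):
            hits = len(query_terms & set(tokenize(sentence)))
            if hits:
                scored.append((hits + heading_hits, sec_idx, sent_idx, sentence))

    # Highest score first; ties are broken by document order.
    top = sorted(scored, key=lambda t: (-t[0], t[1], t[2]))[:max_sentences]

    # Re-emit the chosen sentences in document order, under their headings,
    # so the summary keeps the document's sectional structure.
    summary = {}
    for _, sec_idx, _, sentence in sorted(top, key=lambda t: (t[1], t[2])):
        summary.setdefault(sections[sec_idx][0], []).append(sentence)
    return summary


if __name__ == "__main__":
    example = [
        ("Introduction", "Web search is common. Summaries help users."),
        ("Turkish morphology",
         "Turkish is agglutinative. Stemming affects search quality."),
    ]
    for heading, sentences in summarise(example, "Turkish search", 3).items():
        print(heading)
        for sentence in sentences:
            print("  " + sentence)
```

Grouping the selected sentences under their headings, rather than returning a flat snippet, is the part meant to echo the "structure-preserving" idea; the scoring itself is plain term overlap and should not be read as the method evaluated in the paper.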

ISSN

1468-4527

Keywords

Data Structures, Document Delivery, Markup Languages, Search Engines, Worldwide Web, Extraction

URI

http://hdl.handle.net/11413/1283

Collections

Department of Computer Engineering
Scopus Indexed Publications
WoS Indexed Publications
