Dissertation Summary: Document geometric layout analysis based on adaptive threshold

Number of pages: 26

Text recognition has been researched and applied for many years. A recognition system processes a page in four main steps: the input page image is preprocessed, then passed to page analysis, whose output is the input to the recognition step, and finally post-processed. The quality of the result depends mainly on two of these steps: page analysis and recognition. Recognition of printed text is now almost completely solved (ABBYY’s FineReader 12.0 commercial product recognizes printed text in many languages, and the Vietnamese recognition software VnDOCR 4.0 of the Hanoi Information Technology Institute achieves accuracy above 98%). Page analysis, however, remains a major challenge both in Vietnam and worldwide, and it continues to attract many researchers. An international page analysis contest is held every two years to promote the development of page analysis algorithms. These facts motivated the dissertation to look for effective solutions to the page analysis problem.
Many page analysis algorithms have been developed in recent years, especially algorithms that follow a hybrid approach. The proposed algorithms have different strengths and weaknesses, but most of them still suffer from two basic errors: over-segmentation, in which a correct text area is split into smaller areas so that the information of text lines or paragraphs is misread or lost, and under-segmentation, in which text areas belonging to different text columns or paragraphs are merged together. The objective of the dissertation is therefore to study and develop page analysis algorithms that reduce both types of error at the same time.
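The two error types named above can be made concrete with a small sketch. It is only an illustration, not the evaluation method used in the dissertation: the box representation, the function names, and the `min_overlap` threshold are all assumptions made for this example. A ground-truth region that is matched by more than one detected region counts as over-segmented, and a detected region that matches more than one ground-truth region counts as under-segmented.

```python
from typing import List, Tuple

# A region is an axis-aligned box: (left, top, right, bottom).
Box = Tuple[int, int, int, int]


def area(box: Box) -> int:
    """Area of a box; degenerate or inverted boxes have area 0."""
    return max(0, box[2] - box[0]) * max(0, box[3] - box[1])


def intersection(a: Box, b: Box) -> int:
    """Area shared by two boxes."""
    return area((max(a[0], b[0]), max(a[1], b[1]),
                 min(a[2], b[2]), min(a[3], b[3])))


def matches(a: Box, b: Box, min_overlap: float) -> bool:
    """True if the shared area is a significant part of the smaller box.

    min_overlap is a hypothetical threshold chosen for this sketch.
    """
    shared = intersection(a, b)
    return shared > 0 and shared >= min_overlap * min(area(a), area(b))


def count_segmentation_errors(truth: List[Box], detected: List[Box],
                              min_overlap: float = 0.1) -> Tuple[int, int]:
    """Return (over_segmented, under_segmented) region counts."""
    # Over-segmentation: one true region split across several detections.
    over = sum(
        1 for t in truth
        if sum(matches(t, d, min_overlap) for d in detected) > 1
    )
    # Under-segmentation: one detection covering several true regions.
    under = sum(
        1 for d in detected
        if sum(matches(t, d, min_overlap) for t in truth) > 1
    )
    return over, under


# Two true paragraphs stacked vertically.
paragraphs = [(0, 0, 100, 50), (0, 60, 100, 110)]

# The first paragraph is split in two: one over-segmentation error.
split_result = [(0, 0, 100, 20), (0, 25, 100, 50), (0, 60, 100, 110)]

# Both paragraphs are merged into one block: one under-segmentation error.
merged_result = [(0, 0, 100, 110)]

print(count_segmentation_errors(paragraphs, split_result))
print(count_segmentation_errors(paragraphs, merged_result))
```

An algorithm that fixes one error type by simply merging or splitting more aggressively tends to increase the other count, which is why the dissertation aims to reduce both together.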
Page analysis is a very broad topic, so the dissertation limits its scope to page images of text written in Latin script, specifically English, and focuses on the analysis of text areas. It does not address the detection and structural analysis of tables, the detection of image areas, or the analysis of logical structure. Within this scope, the dissertation achieved the following results:
1. A solution that speeds up the background detection algorithm.
2. An adaptive parameterization method that reduces the effect of font size and font type on page analysis results.
3. A new solution for detecting and using separator objects in page analysis algorithms.
4. A new solution that separates text areas into paragraphs based on context analysis.
