A Hierarchical Cluster Tree Approach Leveraging Delaunay Triangulation

Cristian Avatavului, Costin-Anton Boiangiu

Abstract


This paper presents a robust technique for organizing document page images hierarchically using Delaunay triangulation. The core of the approach is a cluster tree that represents the page's content, built from the arrangement of layout elements and the distances between them. The technique partitions the page into distinct clusters corresponding to images, titles, and paragraphs. The resulting hierarchy, founded on the cluster tree, provides a stable and dependable representation of the document layout, which in turn speeds up document understanding and analysis.
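
As a rough illustration of the general idea, and not the authors' actual algorithm, the sketch below builds a binary cluster tree over a handful of 2D points by merging along Delaunay edges from shortest to longest. Everything in it is an assumption made for illustration: the points stand in for layout element centroids, the helper names (`delaunay_edges`, `cluster_tree`, `leaves`) are hypothetical, the triangulation is a plain Bowyer-Watson implementation that does not handle degenerate input (duplicate, collinear, or cocircular points), and the merge rule is ordinary single-linkage clustering. Restricting the merges to Delaunay edges is safe for single linkage because the Euclidean minimum spanning tree is a subgraph of the Delaunay triangulation.

```python
from itertools import combinations
from math import dist


def in_circumcircle(tri, p, pts):
    """True if point p lies strictly inside the circumcircle of triangle tri."""
    (ax, ay), (bx, by), (cx, cy) = (pts[i] for i in tri)
    d = 2 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    if d == 0:  # degenerate (collinear) triangle: no finite circumcircle
        return False
    a2, b2, c2 = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
    ux = (a2 * (by - cy) + b2 * (cy - ay) + c2 * (ay - by)) / d
    uy = (a2 * (cx - bx) + b2 * (ax - cx) + c2 * (bx - ax)) / d
    return (p[0] - ux) ** 2 + (p[1] - uy) ** 2 < (ax - ux) ** 2 + (ay - uy) ** 2


def delaunay_edges(points):
    """Bowyer-Watson triangulation; returns edges as index pairs (a, b), a < b."""
    n = len(points)
    xs, ys = [p[0] for p in points], [p[1] for p in points]
    span = max(max(xs) - min(xs), max(ys) - min(ys), 1.0)
    cx, cy = (max(xs) + min(xs)) / 2, (max(ys) + min(ys)) / 2
    # Three far-away helper vertices forming a triangle around all points.
    pts = list(points) + [(cx - 20 * span, cy - 10 * span),
                          (cx + 20 * span, cy - 10 * span),
                          (cx, cy + 20 * span)]
    triangles = {(n, n + 1, n + 2)}
    for i in range(n):
        bad = {t for t in triangles if in_circumcircle(t, pts[i], pts)}
        # Edges used by exactly one bad triangle bound the cavity to re-fill.
        count = {}
        for t in bad:
            for e in combinations(t, 2):
                count[e] = count.get(e, 0) + 1
        triangles -= bad
        for (a, b), c in count.items():
            if c == 1:
                triangles.add(tuple(sorted((a, b, i))))
    # Keep only edges between real points, dropping the helper vertices.
    return {(a, b) for t in triangles for a, b in combinations(t, 2)
            if a < n and b < n}


def cluster_tree(points):
    """Single-linkage tree: leaves are point indices, inner nodes are
    (left_subtree, right_subtree, merge_distance) tuples."""
    edges = sorted(delaunay_edges(points),
                   key=lambda e: dist(points[e[0]], points[e[1]]))
    parent = list(range(len(points)))       # union-find forest
    node = {i: i for i in range(len(points))}  # root index -> subtree

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for a, b in edges:
        ra, rb = find(a), find(b)
        if ra != rb:  # this edge joins two clusters: record the merge
            parent[rb] = ra
            node[ra] = (node[ra], node.pop(rb), dist(points[a], points[b]))
    return node[find(0)]


def leaves(tree):
    """List the point indices stored under a subtree."""
    if isinstance(tree, int):
        return [tree]
    left, right, _ = tree
    return leaves(left) + leaves(right)


# Hypothetical centroids: two tight groups that are far apart on the page.
centroids = [(0, 0), (2, 1), (1, 3), (50, 0), (52, 2), (51, 4)]
root = cluster_tree(centroids)
print(sorted(leaves(root[0])), sorted(leaves(root[1])), root[2])
```

In this toy input the last merge joins the two groups, so the two children of the root correspond to the two groups of centroids. A real layout analysis pipeline would also need rules for labeling clusters as images, titles, or paragraphs, which this sketch does not attempt.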


Keywords


hierarchical clustering, document image layout analysis, Delaunay triangulation, cluster tree formation, layout element segmentation, advanced document image processing


(C) 2010-2022 EduSoft