Web-based Document Image Analysis Environment

ICT Solution for Automatic Document Image Analysis and Exploitation

Document image analysis (DIA) is a complex process consisting of many different tasks (image processing, segmentation, script analysis, printed text or handwriting recognition, word spotting, table analysis, form recognition, logical structure recognition, etc.) applied in a cascade. A major difficulty is selecting the appropriate algorithms, tuning their parameters, and combining the methods to get the best possible results.
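To make the idea of a cascade concrete, here is a minimal sketch in Python. It is not taken from the DIAE platform: the function names (`binarize`, `segment_lines`, `run_cascade`), the toy image, and the threshold values are all assumptions chosen for illustration. Two simple stand-in steps are chained so that the output of one becomes the input of the next, and changing a single parameter changes the final result.

```python
from typing import Callable, List, Tuple

# Hypothetical toy representation: a grayscale image as rows of pixel
# values from 0 (black) to 255 (white). Real DIA methods work on real images.
Image = List[List[int]]


def binarize(image: Image, threshold: int = 128) -> Image:
    """Mark pixels darker than the threshold as ink (1), others as background (0)."""
    return [[1 if px < threshold else 0 for px in row] for row in image]


def segment_lines(binary: Image) -> List[Tuple[int, int]]:
    """Return (first_row, last_row) for each run of consecutive rows containing ink."""
    lines = []
    start = None
    for y, row in enumerate(binary):
        has_ink = any(row)
        if has_ink and start is None:
            start = y                      # a new text line begins
        elif not has_ink and start is not None:
            lines.append((start, y - 1))   # the current text line ends
            start = None
    if start is not None:                  # ink reaches the last row
        lines.append((start, len(binary) - 1))
    return lines


def run_cascade(data, steps: List[Callable]):
    """Apply each step to the output of the previous one."""
    for step in steps:
        data = step(data)
    return data


page = [
    [255, 255, 255, 255],
    [255,  10,  20, 255],
    [255,  30, 255, 255],
    [255, 255, 255, 255],
    [255, 255,  40, 255],
]

lines = run_cascade(page, [lambda img: binarize(img, threshold=128), segment_lines])
print(lines)  # [(1, 2), (4, 4)]

# Same cascade, stricter threshold: the fainter ink is lost before segmentation.
strict = run_cascade(page, [lambda img: binarize(img, threshold=25), segment_lines])
print(strict)  # [(1, 1)]
```

With the stricter threshold, the segmentation step finds one line instead of two, even though the segmentation code itself is unchanged. This is a small instance of the tuning difficulty described above: an early parameter choice propagates through every later step of the cascade.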

The Web-based DIAE project develops a web platform that provides easy access to publicly available DIA methods. The platform is intended for private or public organizations that need to extract valuable information from scanned documents. This information can then be used to cover internal needs or offered as a service to customers.


More concretely, the platform includes two major features: 

  1. The possibility for users without programming skills to define workflows suited to their needs by combining numerous assessed DIA algorithms provided by the research community.
  2. An innovative consumption-based pricing model that adapts to the user's needs, the complexity of the process and the nature of the documents.


The project relies on the collaboration of two partners (DIVA Research Group and Docetis) who bring complementary skills to address the targeted goals.


The DIAE project is funded by Innosuisse under project no. 26986.2 PFES-ES.