Intelligent Document Capture with Ephesoft, Second Edition

Glossary


The following terms are commonly used when implementing document capture:

  • Batch class: A definition of document types, associated fields, extraction rules, monitored folders, and e-mails for a specified workflow

  • Batch instance: The pages being processed in the workflow

  • Classification: Determining the type of document being processed

  • CMIS: Content Management Interoperability Services

  • CMS: Content Management System

  • DMS: Document Management System, another term for CMS

  • ECM: Enterprise Content Management, an enterprise application for managing a large number of documents

  • Extraction: Retrieving information from documents

  • Fixed form: A type of form where the positions and dimensions of the fields are always the same

  • HA: High Availability, a term applied to online applications, services, or technologies that are designed to resist failure and therefore remain accessible

  • Hand print: Hand-written text

  • ICR: Intelligent Character Recognition

  • IDC: Intelligent Document Capture

  • Indexing: The process of defining field values for a particular document instance

  • KV: A key-value pair

  • Lucene: Apache Lucene, an open source full-text search engine library

  • Machine print: Text that is printed by a machine (not hand-written)

  • Metadata: Information about a document that is associated with that document but not stored in the body of the document itself

  • OCR: Optical Character Recognition

  • OOTB: Out-of-the-box, refers to the default configuration of an application

  • Regex: A regular expression, syntax for defining a pattern of text

  • Separation: The process of determining the start and end of documents, given a set of page images

  • UI: User Interface

  • WSDL: Web Services Description Language
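
Several of the terms above (Regex, KV, Extraction, Metadata) fit together in practice, and a small sketch may make the relationship concrete. The following Python example is only an illustration under stated assumptions: the sample page text, the field names, and the `extract_fields` helper are all hypothetical, and the code does not use or represent Ephesoft's own extraction API. It shows key-value extraction in its simplest form, where a regex locates a key and a second regex captures the value that follows it.

```python
import re

# Hypothetical OCR output for one page (assumption; not produced by Ephesoft).
PAGE_TEXT = """ACME Corp
Invoice Number: INV-20417
Invoice Date: 2015-03-12
Total Due: 1,284.50
"""

# Hypothetical field rules: each field maps a key pattern to a value pattern.
FIELD_RULES = {
    "invoice_number": (r"Invoice\s+Number", r"[A-Z]{3}-\d+"),
    "invoice_date": (r"Invoice\s+Date", r"\d{4}-\d{2}-\d{2}"),
    "total_due": (r"Total\s+Due", r"[\d,]+\.\d{2}"),
}


def extract_fields(text, rules):
    """Return a dict of field name -> value found after the matching key."""
    fields = {}
    for name, (key_pattern, value_pattern) in rules.items():
        # The key, a colon, then the captured value form one KV pair.
        match = re.search(rf"{key_pattern}\s*:\s*({value_pattern})", text)
        if match:
            fields[name] = match.group(1)
    return fields


# The extracted values become metadata for the document instance.
metadata = extract_fields(PAGE_TEXT, FIELD_RULES)
print(metadata)
```

Fields whose key or value pattern does not match are simply left out of the result, which mirrors the common case where an operator fills in missing values during indexing.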