User Lexicons

Operation

Don’t confuse (user) lexicons with font dictionaries! (User) lexicons are linguistic databases that assist recognition, whereas font dictionaries contain character shapes learnt during the interactive OCR phase.

User lexicons are word lists containing any term that does not occur in the “basic”, general-purpose lexicons. Think, for instance, of technical, scientific, legal or other company-specific terms.

Example 1: the OCR language is English. My user lexicon contains American city names such as “Poughkeepsie” and “Massapequa”. The OCR process makes good use of the user lexicon.

Example 2: the OCR language is English. My user lexicon contains French proper names such as “Auxerres”, “François” and “Véllères”. The OCR process only uses a portion of the user lexicon. Words that contain symbols not covered by the English character set are ignored: “Auxerres” will be used, but “François” and “Véllères” will not be used by the OCR process.

Example 3: the OCR language is English. My user lexicon contains Russian terms. The OCR process will ignore the user lexicon altogether: the English character set does not include the “Cyrillic” alphabet.
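
The rule behind the three examples can be sketched in a few lines of Python. This is only an illustration, not how Readiris works internally: the function name usable_entries, the simplified ENGLISH_CHARSET (ASCII letters plus apostrophe and hyphen) and the sample Russian terms are all assumptions made for the sketch.

```python
import string

# Assumption: a simplified stand-in for the English character set.
# The real character set used by the OCR engine may differ.
ENGLISH_CHARSET = set(string.ascii_letters) | {"'", "-"}


def usable_entries(lexicon, charset):
    """Keep only the words whose characters all belong to charset."""
    return [word for word in lexicon if all(ch in charset for ch in word)]


# Example 1: American city names, all usable.
us_cities = ["Poughkeepsie", "Massapequa"]
# Example 2: French proper names, only partly usable.
french_names = ["Auxerres", "François", "Véllères"]
# Example 3: Russian terms (hypothetical sample words), none usable.
russian_terms = ["Москва", "Казань"]

for lexicon in (us_cities, french_names, russian_terms):
    print(usable_entries(lexicon, ENGLISH_CHARSET))
```

With these sample lists, the first lexicon is kept in full, the second keeps only “Auxerres”, and the third comes back empty, which mirrors Examples 1 to 3.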

Tip: Readiris Corporate allows you to activate multiple languages simultaneously!

How to...?

Readiris Corporate includes the User Lexicon Editor utility, which allows you to create and maintain user lexicons.

Tip: the tooltip of the Language button indicates the active user lexicon.