iDocuments Intelligent Capture Administration Guide (v42.0)


First published on: 26 March 2025

 

Introduction

This administration guide helps companies configure and use OCR functionality within iDocuments.

Overview

iDocuments Intelligent Capture uses optical character recognition (OCR) to automatically capture documents using machine learning (ML) and artificial intelligence (AI). This allows information to be extracted and key header and line data auto-populated to reduce manual data entry and keying errors.

 

For purchase invoices, OCR allows suppliers to email purchase invoice PDFs to one or more email addresses for each company, where they'll be imported into the system on a scheduled cycle. Each supplier layout has a template that identifies key header and line fields, such as supplier name, invoice number, and purchase order number. This allows the system to identify scanned purchase invoices by supplier and auto-populate the 'Rapid Entry' form with these key fields. 

 

For sales orders, OCR allows customers to email purchase order PDFs to one or more email addresses for each company, where they'll be imported into the system on a scheduled cycle. Each customer layout has a template that identifies key header and line fields, such as customer name, order number, and delivery address. This allows the system to identify scanned sales orders by customer and auto-populate the 'Fast Track' order form with these key fields. 

 

For goods receipt notes, OCR allows suppliers to email goods receipt PDFs to one or more email addresses for each company, where they'll be imported into the system on a scheduled cycle. Each supplier layout has a template that identifies key header and line fields, such as supplier name, receipt number, and purchase order number. This allows the system to generate a new draft receipt document. 

 

Scanned PDFs will automatically be attached as a link within the iDocuments transaction, and the system will record additional auditing information not on the document, such as scan dates and times. 

 

If the OCR can’t read a document, users can train the OCR for the required fields. 

 

All PDF, image, and text documents attached to transactions outside the OCR processing will be automatically passed for OCR via an overnight task so they can be included in the intelligent archive searches. 

 

Identifying Documents Requiring Training

On the dashboard, the 'Overdue' tile displays links based on actions and user security, including a link when the OCR can’t read a document (as seen below). In this scenario, the user needs to train the OCR to read the document so that subsequent documents process automatically.

 

 

 This link can be used as a shortcut to the Intelligent Capture screen.

Intelligent Capture Screen – Access

As well as the page rights required for the Intelligent Capture screen, there are other restrictions based on user access.


To upload a document manually

  • The user needs to have a group that is available for that document type (PI, SO, GRN)

To view documents and see the homepage links

  • If a user has page rights to a Sales listing page (View All or Fast Track), they can see the Sales Documents
  • If a user has page rights to a PI listing page (View All or Rapid Invoices), they can see the PI Documents
  • If a user has page rights to a Receipts listing page (View All) or has the administrator, secondary admin, or finance approver role, they can see the Receipt Documents
  • If a user has the administrator, secondary admin, or finance approver role, they can see the attachments

Intelligent Capture Screen – Search Pane

The left side of the screen offers selectable options. The right side shows a list of documents that have been sent for OCR processing.

 

 

 

Upload document 

This function uploads an example document to aid with training, or processes an urgent document directly so that it skips the mailbox extraction steps. 

 

Filter 

This function improves searching by allowing users to select various criteria, such as a date range, process range (i.e., purchase invoice, sales order, attachment, or goods receipt note), and the OCR. After entering the criteria, click ‘Search’. Documents that meet the criteria will appear on the right side of the screen.

 

This function can also search for a specific document by the partial or full document number. 

 

Partner rules 

This allows users to select a business partner and choose the option to force training when documents are received. When ‘Force Training’ is chosen, that business partner’s documents will remain in the training area, even after being trained. Thereafter, users must select and progress these documents manually before they’ll be processed into the system. 

 

To create a rule, type the partial or full email address of the supplier who would normally send the document, and the OCR will search. Select the supplier from the list, then click ‘+ Force Training’ to enable the rule. To remove this rule, simply click the x in the red box. 

 

Intelligent Capture Screen – Results Pane

 

On the right side of the screen are the OCR statistics and the documents returned. 

 

Statistics 

When refreshed, the OCR statistics are shown by current stage, with the number of documents, the count of header fields versus those matched, and a success percentage. The statistics reflect the past 28 days of data from documents that never required training (i.e., were auto exported) and have been coded and submitted within iDocuments.

 

The pie chart also shows the processing times of the distinct stages.  

 

Statistics for an individual document can be seen in the Actions > Document screen.

 

 

 

Document details 

This area shows each document's status and options in the following information columns. 

 

 

 

  • Date – Date received for processing 
  • From – Shows whether the upload was MANUAL or the email address it came from 
  • Subject – Displays ‘Invoice Processor’ (if the upload was manual) or the email subject 
  • Attachment – The attachment file name 
  • Process – Purchase invoice, sales order, attachment, or goods receipt note 
  • Result – Latest step completed 
    • Sent for processing – On its way to OCR 
    • Currently processing – At OCR 
    • Exporting – Being sent back from OCR 
    • Exported – Returned from OCR 
    • Draft invoice created – Ideal endpoint of purchase invoices being processed. (Note: If credit note phrases are found, the credit note flag is updated and the payment terms aren’t applied, so the due date is the same as the created date.) 
    • Draft sales order created – Ideal endpoint of sales orders being processed 
    • Goods receipt note – Ideal endpoint of goods receipt notes being processed 
    • Attachment created – Ideal endpoint of attachments being processed 
    • Attachments > No attachments – Email has been received for the attachment process, but nothing is attached 
    • Training required – Document can’t create a draft, likely because the supplier/customer and document number can’t be identified 
    • Duplicate – The exact document details have been seen before, so it’s highlighted as a duplicate, which allows it to be deleted or skipped for OCR 
    • Processing error – An issue occurred in processing; error messages are available in the details column 

 

 

    • Failed – Like a processing error, but the error is unknown, so processing stops 
    • Deleting – Delete has been pressed, so the document will be at this status for a short while 
    • Deleted – The document was deleted, so it will be at this status and must be resubmitted if needed again 
  • Actions – This has various hyperlinks based on the document status: 
    • Document – This shows the history of the OCR and is the only link shown if everything processed OK 
    • Training – Allows the document image to be trained (see the 'OCR Training' section) 
    • Restart – Allows the document to be sent for reprocessing (e.g., if 10 invoices are sent and the training is done for 1, the other 9 can reprocess and pick up those changes) 
    • Reroute – Cancels the entry for the mailbox company it was assigned to and allows the document to be sent to a different company. It doesn’t have to be resent to another mailbox and processed via OCR again, meaning a single AP email and OCR can be used for multiple companies, where only occasional changes are needed. (Note: The OCR doesn’t remember the changes, so next time, it will still be sent to the company for that mailbox.) 
    • Skip OCR – Allows problematic documents (e.g., error state or training required) to progress with no data captured so they can be processed manually 
    • Delete – As email extraction takes any PDF from the email, it’s possible that statements or other attachments are sent, and they can be deleted 
  • Details – This column shows system errors, which typically occur only during initial testing when configuring the mailbox extraction or initial setup 
  • Document – This is the document number that has been captured 

 

OCR Training

The Intelligent Capture screen lists the documents to be trained. These are indicated by ‘Training Required’ in the Result column. 

 

Any document that has gone through the OCR without the supplier/customer and document number being captured must have additional training (dates and values can be blank). 

 

When a document is selected, the system checks whether another user is working on it. Because a user can simply close the training browser, the system checks whether any user has previously accessed the document. If so, a warning message appears to indicate that multiple users may be trying to access the same document. 

 


 

Training Page

The training page presented will show the iDocuments fields on the left. On the right is the scanned image from the supplier. On this screen, the OCR will be taught where to look for the pieces of information required to process this document. 

v32 removed the login prompt required on first use, so this page now opens and logs in automatically.

 

 

v35 added a resize option to the training grid for the lines so that it can be extended.

 

iDocuments will automatically process as many fields as possible by looking at the fields on the left to find that information on the right. Users can then correct or alter those results as needed. (Note: iDocuments looks for variations of wording; for example, if a user searches for ‘invoice number’, iDocuments also looks for ‘invoice no’, ‘invoice no.’, ‘invoice #’, ‘reference number’, etc.) 

 

Change the view of the document with the buttons in the grey bar that move the image side to side or increase/decrease the size to make it easier to view.  

 

Processing – Purchase Invoice and Sales Order

These are the steps for purchase invoices and/or sales orders. Goods receipt notes follow a similar process but with slightly different fields. 

 

  1. Click in the supplier/customer identification box so it turns green.  

  2. On the right side of the screen, click in the field that will identify this supplier/customer. This will place a green box around that field.  

This field is based on one of seven fields found in the 'Master Partner Data' in the ERP system: Federal tax ID, Telephone 1, Telephone 2, Mobile, Fax, Email, and Company reg. no (only in UK versions). 

 

If none of this information appears in the document, the 'Alias' field can be used to reference a unique piece of data for this supplier/customer (e.g., the code or name exactly matched to the ERP system, ignoring spaces). iDocuments will then look up all the master partner data in the ERP system to ensure the selected field is listed for a valid supplier. If the identification field is validated, iDocuments will update the name field with the information from the ERP system. 

 

The OCR process will also check if the tax code on the line is set as ‘Acquisition Reverse’ in SAP and, if so, automatically set the tax rate to 0%. 

 

  3. Click a field on the left to turn it green, then find and click the relevant field on the image to link it for this template. 

  4. Repeat this for each of the required fields.  

  5. Some fields, such as currency, may not be available on the image. They may be left blank and will be pulled from the master partner data in the ERP system where possible.  

  6. Begin building the lines next, following the same process as above but selecting the first line information. The line details will be highlighted in blue on the image.  

  7. If there are multiple lines, clicking the code field on the second line will replicate all the lines showing on the document. 

 

(Note: Additional lines or missing details can be manually entered if they’re not being captured automatically.) 

 

Post Processing – Automatic Actions

Item/GL codes:

 

The export process from the OCR first checks whether the captured code matches one in the ERP, then checks the catalogue code for partners that have a cross-reference, then the description against the last-used item, and finally the specific name in the ERP. 

 

For the history check, when only the line description is captured, previous invoices are checked; if they match on the same group and the same description, the process can determine which item is most likely.

 

If no history exists, then the process will initially check against master data using the exact description.
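
The matching order described above can be pictured as a chain of fallbacks. The following Python sketch is illustrative only: the function name `resolve_item_code`, the dictionary-based lookups, and the sample data are assumptions made for this example and aren't part of iDocuments or the ERP.

```python
from typing import Optional


def resolve_item_code(
    captured_code: Optional[str],
    description: Optional[str],
    erp_codes: set[str],
    catalogue_xref: dict[str, str],
    history: dict[str, str],
    erp_names: dict[str, str],
) -> Optional[str]:
    """Return the first item/GL code found, trying each source in turn."""
    # 1. The captured code matches an ERP code directly.
    if captured_code and captured_code in erp_codes:
        return captured_code
    # 2. The captured code is a catalogue code with a cross-reference.
    if captured_code and captured_code in catalogue_xref:
        return catalogue_xref[captured_code]
    if description:
        # 3. The description was seen in history: use the last-used code.
        if description in history:
            return history[description]
        # 4. The description exactly matches a name in the master data.
        if description in erp_names:
            return erp_names[description]
    # No source matched in this sketch.
    return None


# Hypothetical sample data for illustration only.
erp_codes = {"A100", "A200"}
catalogue_xref = {"SUP-77": "A200"}
history = {"Blue widget": "A100"}
erp_names = {"Green widget": "A300"}

print(resolve_item_code("SUP-77", None, erp_codes, catalogue_xref, history, erp_names))
print(resolve_item_code(None, "Green widget", erp_codes, catalogue_xref, history, erp_names))
```

Each step is only tried when the previous one finds nothing, which is why a code captured during good initial training short-circuits the slower description-based checks.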

 

Any blank descriptions that are passed through will be automatically replaced with a default so that they do not remain blank in iDocuments. 

 

For purchase invoices, the default description is "AP Invoice", "-", and then the supplier code.

 

 

For sales orders, just the line numbering is added (e.g., "Line 1", "Line 2").
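
As a rough illustration of the two defaults above, the Python sketch below builds a replacement description for a blank line. The function name `default_description`, the process codes `"PI"` and `"SO"` as arguments, and the choice to join the purchase invoice parts with no extra spaces are all assumptions for this example; the exact spacing produced by iDocuments may differ.

```python
def default_description(process: str, line_number: int, supplier_code: str = "") -> str:
    """Build a placeholder description for a line whose description is blank."""
    if process == "PI":
        # Assumption: the three parts are concatenated without spaces.
        return "AP Invoice" + "-" + supplier_code
    if process == "SO":
        return f"Line {line_number}"
    # The guide states no default for other process types.
    raise ValueError(f"no default description defined for process {process!r}")


def fill_blank(description: str, process: str, line_number: int, supplier_code: str = "") -> str:
    """Keep a captured description; replace it only when it is blank."""
    if description.strip():
        return description
    return default_description(process, line_number, supplier_code)


print(fill_blank("", "PI", 1, "S0001"))
print(fill_blank("   ", "SO", 2))
print(fill_blank("Consulting fee", "PI", 3, "S0001"))
```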

 

Tax codes:

 

Suppliers won't know your ERP tax codes, so the OCR export process will 1) check the most frequent use of any item/GL found and use that rate or 2) determine a rate percentage based on the extracted net and gross line amounts. 

 

The rate is then validated against the rules below, in order, until a tax code is determined.

 

  1. Some company settings may be initially applied (e.g., "Tax Liable Default Code")
  2. Check the iDocuments history for the most commonly used code for that rate (+/- 1.00)
  3. Check the ERP master data for a suitable rate (+/- 1.00)
  4. Check the code from the ERP supplier configuration (if it's a sales order, this is the default)
  5. Use the company setting "Default VAT Code/Default Sales VAT Code"

Therefore, the initial training is the most important factor in determining whether an item has its tax code correctly set, as one of the first checks is the history.
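
To make the rate calculation and the +/- 1.00 tolerance concrete, here is a minimal Python sketch. It assumes the tolerance is measured in percentage points and that, among the codes within tolerance, the closest rate wins; it doesn't model the usage-frequency check. The function names and the tax codes `S1`, `R1`, and `Z0` are hypothetical.

```python
from typing import Optional


def implied_rate(net: float, gross: float) -> float:
    """Tax rate percentage implied by a line's net and gross amounts."""
    if net == 0:
        raise ValueError("net amount must be non-zero")
    return (gross - net) / net * 100


def match_tax_code(
    rate: float, candidates: dict[str, float], tolerance: float = 1.0
) -> Optional[str]:
    """Return the code whose rate is closest to `rate`, within the tolerance."""
    best_code: Optional[str] = None
    best_diff = tolerance
    for code, code_rate in candidates.items():
        diff = abs(code_rate - rate)
        # Accept a candidate inside the tolerance that is at least as close
        # as the best one seen so far.
        if diff <= best_diff:
            best_code, best_diff = code, diff
    return best_code


# Hypothetical tax codes and their rates.
codes = {"S1": 20.0, "R1": 5.0, "Z0": 0.0}

rate = implied_rate(net=83.33, gross=100.00)  # roughly 20%, not exact due to rounding
print(match_tax_code(rate, codes))
```

The tolerance matters because supplier documents carry rounded amounts, so the implied rate rarely lands on an exact percentage.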

 

Processing – Goods Receipt

For goods receipt documents, the OCR will determine the main details.

 

 

 

If the supplier, receipt number reference, and order number are found, the document will export. 

 

On export, it will check the order number, create the header, link to open order lines if there are any, and set the “creator” to the same as the PO. The receipt will be a draft with the receipt note attached. 

 

If it can’t link to the PO, it will stay in the OCR log page. You can search for a PO to manually link the document.

 


 

The total from the scan will be populated in the 'Scan Total' field for the receipt, so receipts that aren’t created by OCR will have a blank amount. 

 


 

This field is useful to see if the total of the document scanned equals the total of the receipt value, as initially it will be all outstanding lines. Once the correct lines and values remain, the values should tally. 

 


 

If there are rounding issues, the receipt can still be processed as all validation remains against the total field. 

 

This document type is always held in the intelligent archive, so any portion of the OCR can be used as search criteria. 

 

Document Progression

When completed, the fields required on the left of the screen should be populated. This is also identifiable by a green tick on the image thumbnail. 

 

 

 

The document can be progressed. 

 

 

 

  1. To exit without saving, click ‘Return Document’ in the top left corner.  

  2. To save the information and remain in the training screen, click ‘Save Document’. 

  3. Click ‘Progress Document’ to progress the document through the OCR. 

    • This will automatically save any template training that has occurred. 

 

Advanced Actions

This section of the Intelligent Capture page is exclusively for users with admin rights. It requires a password but is shown below for reference.  

 


 

The ‘re-run training’ button will resend all documents set to ‘Training required’ back through the automatic OCR for automatic matching and export. For example, one supplier sends 20 invoices, and they all need training. You can train the first one, then send the other 19 through again without going into each one. 

 

You can also use ‘Send selected for OCR’ or, for documents that shouldn’t be processed, ‘Delete Selected Lines’, rather than clicking and waiting for the refresh each time when handling multiple documents. 

 

After unlocking, the following tick-box options are available. 

 


 

‘Enable batch operations’ – Enables the three buttons above to be shown to standard users 

 

‘Enable restart’ – Shows and hides the restart button on the lines 

 

‘Enable rerouting’ – Allows an entry to be changed to another accessible company 

 

‘Set posting date to tax date’ – Normally, the posting date is the current date; enabling this option sets the posting date to the tax date, so it matches the date of the invoice 

 

‘Duplicate number check enabled’ – Checks the reference number (Purchase Invoice number or Customer Sales Order number) for duplicates and highlights them with a duplicate status 

 


 

‘Duplicate check all companies’ – Checks the reference number across all linked companies; a document can then be manually rerouted to a different company. Duplicate documents can be deleted or, as part of the training screen, can have their document numbers altered. 

 


 

If no PO is found, ‘line grouping codes’ allow a GL or item code and description to be entered. A single line is then created for the total amount, rather than using the lines captured by OCR. (Note: This applies to all documents without a PO.) If a PO is matched, it will use the lines on the PO. 

 

‘Address validation’ – Can be configured but is an externally licensed product; its key is entered here. For example, the first line and the postcode can be used to get back the fully formatted lines of the address, which is sometimes better than the OCR capturing the whole address for sales orders 

 

The buttons offer further options: 

‘Configure languages’ – Allows the mappings between the ERP list of countries and the more limited list of OCR languages to be selected 

 


‘Licence’ – Allows for a check on the licence usage, although it’s for information only  

 

‘Update advanced password’ – Allows users to change the password for the advanced actions section

 

‘Lock’ – Relocks the advanced actions section  

 

Email Configuration

‘Configure mailboxes’ – Allows the email account(s) that will extract documents for OCR to be configured. 

 

  

There are various email options, with the most frequent being the Office 365 options. Details are provided on configuring the advanced settings in the OAuth section. 

 

Email extraction can be disabled when required. 

 

'Default group' – Allows users to set a default group for every email address entry (e.g., when iDocuments can’t determine the group via the automatically captured PO). 

 

A batch type can be set per email address to allow different form templates to be used. 

 

'Save email details' – Stores the main information from the email as a comments field within the document and the entire email body as an attachment text file. (Note: Increased storage is required.) 

 

'Save all attachments' – Allows attachments that aren’t PDFs to be stored as attachments. 

 

The settings tab can define the allowed file types for these attachments to avoid unwanted images, etc. in email signatures. 

 


An email address can be entered to receive notifications when errors occur in extracting emails. This address can't be the same as the mailbox, as that would create a loop.

 

If users can’t prevent certain emails but don’t want them captured each time, they can put the addresses or subjects into the right-side panel; iDocuments then ignores those emails in the extraction process. 

 

Once the email has been created, this screen will display a ‘Last run’ time. 

 

 

The ‘Show result’ button can be used to check if the last processing succeeded or if an error was generated. 

 

Template Configuration


 

Each section (e.g., ‘Purchase Invoice’) allows users to modify the wording the OCR looks for on documents when they’re sent to OCR. (Note: Modifications must be created separately for each language.) 

 

These items are in this section: 

  • To start, select the date format to look for in the invoices received (as seen below) 
    • ‘Document’ refers to reading the date as shown on the document being processed 
  • Select the language for the OCR for localisation settings 
  • Click 'Template' to open the screen to make modifications 

 


 

On the screen shown above, there are several columns in the 'Fields' tab:  

  • Section – Refers to where the field is located (at the header or line level)  
  • Field – This is the database name and can’t be altered  
  • Name – This is the label name for the field and may be different per template (e.g., in a different language) 
  • Hidden – If there’s a tick box, the field isn’t mandatory and can be hidden  
  • Headers – Displays a list of values the OCR will search for. For example, the OCR will search for gross, total amount, grand total, total due, and total amount due. Admins can add options to this list for invoices to be picked up by the OCR, but common phrases should be avoided.  
  • Exceptions – This is a list of values to ignore 

 

Note: User Defined Fields (UDFs) have additional validation, in line with the standard formatting fields. 

  • UDF type validation – Date, string, and numeric fields are validated. If the data isn't the right type, the UDF will NOT be saved.
  • UDF length validation – Applies to the "string" type, which could be a free text or a lookup value. Both have a maximum length; if it's exceeded, the UDF will NOT be saved, and the user will have to fill it in on rapid entry. This avoids partial information being sent through.
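
The two UDF checks can be sketched as follows. This is a simplified illustration in Python, not the product's implementation: the function name `validate_udf`, the type names passed as strings, and the `%d/%m/%Y` date format are assumptions for this example.

```python
from datetime import datetime
from typing import Optional


def validate_udf(
    value: str,
    udf_type: str,
    max_length: Optional[int] = None,
    date_format: str = "%d/%m/%Y",  # assumed format for this sketch
) -> bool:
    """Return True if the captured value may be saved to the UDF."""
    if udf_type == "date":
        try:
            datetime.strptime(value, date_format)
            return True
        except ValueError:
            return False
    if udf_type == "numeric":
        try:
            float(value)
            return True
        except ValueError:
            return False
    if udf_type == "string":
        # A value that is too long is rejected outright rather than truncated,
        # so that partial information is never saved.
        return max_length is None or len(value) <= max_length
    return False  # unknown type: don't save


print(validate_udf("31/01/2025", "date"))
print(validate_udf("ABCDEFGHIJK", "string", max_length=10))
```

Rejecting an over-long string instead of truncating it mirrors the rule above: the field stays empty and the user completes it on rapid entry.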

 

On the identification mappings tab is the list of fields users can check to determine which supplier to hold this template against. Within this configuration, users can define an SQL filter to make the list of suppliers easier to sort (e.g., they could limit the search to include only certain card codes or specific UDFs). 

 


 

On the settings tab is a field for ‘Max Suppliers to Show’. This allows users to create identifiers, limiting search results and preventing a too-long list of suppliers. The tick box allows users to force an identifier for supplier selection. 

 


 

Credit notes are automatically determined if the gross amount is negative, but detection phrases can be entered to help identify credits by specific text in the document. 

 

(Note: This searches the first 50% of the OCR data from the first page, as that is the most likely area to find these phrases. That isn’t necessarily the top half of the page; where the text falls in the OCR data depends on how the page was constructed.) 
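
The credit note rules above can be approximated in a few lines of Python. In this sketch, the "first 50% of OCR data" is modelled as the first half of the characters in the first page's text, and phrase matching is case-insensitive; both are assumptions, as is the function name `is_credit_note`.

```python
def is_credit_note(gross: float, first_page_text: str, phrases: list[str]) -> bool:
    """Flag a document as a credit note by amount or by detection phrase."""
    # A negative gross amount is treated as a credit automatically.
    if gross < 0:
        return True
    # Assumption: only the first half of the first page's OCR text is searched.
    head = first_page_text[: len(first_page_text) // 2].lower()
    return any(phrase.lower() in head for phrase in phrases)


phrases = ["credit note", "credit memo"]

early = "CREDIT NOTE 1234 " + "line detail " * 20
late = "line detail " * 20 + " credit note"

print(is_credit_note(150.0, early, phrases))
print(is_credit_note(150.0, late, phrases))
print(is_credit_note(-150.0, "", phrases))
```

The second call shows why phrase position matters: the same wording is ignored when it falls in the later half of the OCR data.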

 

‘Use Strict Formatting’ is used for the line information table part of the document and means the OCR will go through additional processes to improve the line data. This requires more training and slows down processing significantly, so it’s rarely recommended. 

 

After completing the fields, click save and then close to finish. 

 

Attachments-Only Processing

When a mailbox process is created for ‘Attachments’, the system will expect emails with a specific subject line. 

 

The attachments will be linked to the document identified by that email subject. The email subject should be the document type (PO, GRN, GR, PI, SO, or SI), then a space, then the reference number. If not uniquely matched, it will show on the home page as ‘Pending document link’ when logged in as an admin or finance approver, where it can be linked manually.
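
The expected subject format lends itself to a simple pattern check. The Python sketch below assumes the subject is case-sensitive, contains exactly one space, and that the reference number itself has no spaces; the function name `parse_attachment_subject` is hypothetical.

```python
import re
from typing import Optional

# Document type, one space, then the reference number.
SUBJECT_PATTERN = re.compile(r"^(PO|GRN|GR|PI|SO|SI) (\S+)$")


def parse_attachment_subject(subject: str) -> Optional[tuple[str, str]]:
    """Split a subject into (document type, reference), or None if it doesn't fit."""
    match = SUBJECT_PATTERN.match(subject.strip())
    if match is None:
        return None
    return match.group(1), match.group(2)


print(parse_attachment_subject("PO 12345"))
print(parse_attachment_subject("Invoice 12345"))
```

A subject that returns `None` here corresponds to the case where the attachment can't be matched and would need to be linked manually.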

 

 

 

Selecting the document link allows the user to search for the document type and then the document reference number. The ‘Link’ button assigns the attachment to that document. 

 


 

(Note: The history entry for these attachments will be based on the base documents, so any goods receipt attachment will have the user who attached it as the PO creator.) 

 

 

 

 



  

Last modified: 05/02/2025/11:12 am
