iDocuments Intelligent Capture Administration Guide (v43.0) - V6 Beta
First published on: 20 March 2024
Introduction
This administration guide helps companies configure and use OCR functionality within iDocuments.
Overview
iDocuments Intelligent Capture uses optical character recognition (OCR) to automatically capture documents using machine learning (ML) and artificial intelligence (AI).
This allows information to be extracted and key header and line data auto-populated to reduce manual data entry and keying errors.
For purchase invoices, OCR allows suppliers to email purchase invoice PDFs to one or more email addresses for each company, where they'll be imported into the system on a scheduled cycle.
- Each supplier layout has a template that identifies key header and line fields, such as supplier name, invoice number, and purchase order number.
- This allows the system to identify scanned purchase invoices by supplier and auto-populate the 'Rapid Entry' form with these key fields.
For sales orders, OCR allows customers to email purchase order PDFs to one or more email addresses for each company, where they'll be imported into the system on a scheduled cycle.
- Each customer layout has a template that identifies key header and line fields, such as customer name, order number, and delivery address.
- This allows the system to identify scanned sales orders by customer and auto-populate the sales order form with these key fields.
For goods receipt notes, OCR allows suppliers to email goods receipt PDFs to one or more email addresses for each company, where they'll be imported into the system on a scheduled cycle.
- Each supplier layout has a template that identifies key header and line fields, such as supplier name, receipt number, and purchase order number.
- This allows the system to generate a new draft receipt document.
Scanned PDFs will automatically be attached as a link in the iDocuments transaction.
The system will record additional auditing information not on the document, such as scan dates and times.
If the OCR can’t read a document, users can train the OCR for the required fields.
All PDF, image, and text documents attached to transactions outside the OCR processing will be automatically passed for OCR via an overnight task so they can be included in the intelligent archive searches.
Identifying Documents Requiring Training
On the dashboard, the 'Overdue' tile displays links based on actions and user security, including a link when the OCR can’t read a document (as seen below).
In this scenario, the user will need to train the OCR to read the document so subsequent documents will process automatically.
This link can be used as a shortcut to the Intelligent Capture screen.
Intelligent Capture Screen – Access
In addition to the page rights required for the intelligent capture screen, there are other restrictions based on user access.
To upload a document manually:
- The user needs to have a group available for that document type (PI, SO, GRN)
To view documents and see the homepage links:
- If a user has page rights to the sales order (SO) listing page, they can see the SO documents
- If a user has page rights to a purchase invoice (PI) listing page, they can see the PI documents
- If a user has page rights to a goods receipts listing page or is an administrator/secondary admin/finance approver, they can see the receipt documents
- If a user is an administrator/secondary admin/finance approver, they can see the attachments
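The viewing rules above can be read as a small decision table. The Python sketch below is illustrative only: the page-right and role labels (such as `sales_order_listing` and `finance_approver`) are hypothetical names chosen for the example, not identifiers taken from iDocuments.

```python
# Hypothetical labels for page rights and roles; iDocuments' own identifiers may differ.
ELEVATED_ROLES = {"administrator", "secondary_admin", "finance_approver"}


def visible_document_types(page_rights: set, roles: set) -> set:
    """Return the document types a user can see on the Intelligent Capture screen."""
    elevated = bool(roles & ELEVATED_ROLES)
    visible = set()
    if "sales_order_listing" in page_rights:
        visible.add("SO")
    if "purchase_invoice_listing" in page_rights:
        visible.add("PI")
    # Receipts are visible through page rights or through an elevated role.
    if "goods_receipt_listing" in page_rights or elevated:
        visible.add("GRN")
    # Attachments are visible to elevated roles only.
    if elevated:
        visible.add("ATTACHMENT")
    return visible
```

For example, a finance approver with no listing page rights would, under these assumptions, see receipt documents and attachments but not sales orders or purchase invoices.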
Intelligent Capture Screen – Header options
On the top of the screen are different options for adding, actioning, and configuring documents.
Add or upload document
Typically, documents are processed by extracting them from an email mailbox, but you can upload a document manually, skipping the mailbox extraction steps, to aid with training or to process an urgent document.
You'll see the upload's current status after selecting the transaction type and selecting the files to upload.
The top will show the upload queue progress.
Actions
These options let you select one or more rows and then restart, skip OCR, reroute, or delete them.
See the line actions below for more information.
Filters
Selection buttons aid searching by letting users select various criteria, such as:
- Dynamic date ranges
- OCR status
- Transaction types (purchase invoice, sales order, attachment, or goods receipt note)
- Status
Users can also use the filter option to further limit the results by date range and email details and to search for a specific document by the partial or full document number.
Statistics
When refreshed, the OCR statistics show the documents at each current stage, the count of header fields compared with those matched, and a success percentage.
The statistics reflect the past 28 days of data from documents that never required training (i.e., were auto-exported) and have been coded and submitted in iDocuments.
The pie chart also shows the processing times of the distinct stages.
Individual document stats can be seen in the Actions > Document screen.
Configuration/Mailbox
See the advanced section below.
Intelligent Capture Screen – Results Pane
Document details
This pane shows each document's status and options alongside the information columns.
- Actions – This has various hyperlinks based on the document status:
- Selecting a document – Double-clicking the row shows the image and document attributes.
- Information – From the right-click option, you can see information about the document (i.e., the information that was captured and what was exported after any training changes, plus the timing audit of that process from start to finish).
- Training – See the OCR Training section below, which covers training the document image.
- Restart – Allows the document to be sent for reprocessing (e.g., if 10 invoices are sent and the training is done for 1, the other 9 can be reprocessed to pick up those changes).
- Skip OCR – Allows problematic documents (e.g., in an error state or requiring training) to proceed with no data captured so they can be processed manually.
- Reroute – Cancels the entry for the mailbox company it was assigned to and allows the document to be sent to a different company. The document doesn’t have to be resent to another mailbox and processed via OCR again, meaning a single AP email and OCR can be used for multiple companies where only occasional changes are needed. (Note: The OCR doesn’t remember the changes, so next time, the document will still be sent to the company for that mailbox.)
- Delete – As email extraction takes any PDF from the email, statements or other unwanted attachments may be captured; these can be deleted.
- Link – Allows a document ID to be used to attach the document where the OCR couldn't find it via the mail subject.
- Sent – Date received for processing.
- Sender – Shows whether the upload was MANUAL or the email address it came from.
- Subject – Displays ‘Invoice Processor’ (if the upload was manual) or the email subject.
- Attachment – The attachment file name.
- Type – Purchase invoice, sales order, attachment, or goods receipt note.
- Result – Latest step completed
- Sent for processing – On its way to OCR
- Currently processing – At OCR
- Exporting – Being sent back from OCR
- Exported – Returned from OCR
- Draft invoice created – Ideal endpoint of purchase invoices being processed.
- Note: If credit note phrases are found, the credit note flag is updated and the payment terms aren’t applied, so the due date is the same as the created date.
- Draft sales order created – Ideal endpoint of sales orders being processed.
- Goods receipt note – Ideal endpoint of goods receipt notes being processed.
- Attachment created – Ideal endpoint of attachments being processed.
- Attachments > No attachments – Email has been received for the attachment process, but nothing is attached.
- Training required – Document can’t create a draft, likely because the supplier/customer and document number can’t be identified.
- Duplicate – The exact document details have been seen before, so it’s highlighted as a duplicate, which allows it to be deleted or skipped for OCR.
- Processing error – An issue occurred in processing; error messages are available in the details column.
- Failed – Like a processing error, but the cause is unknown, so processing stops.
- Deleting – Delete has been pressed, so the document will be at this status for a short while.
- Deleted – The document was deleted, so it will remain at this status and must be resubmitted if needed again.
- Details – This column shows system errors; these typically occur only during initial testing, when the mailbox extraction or initial setup is being configured.
- Document – This is the document number that has been captured.
OCR Training
The Intelligent Capture screen presents a list of documents to be trained.
These are identified by ‘Training Required’ in the Result column.
Any document that has gone through the OCR without capturing the supplier/customer and document number must have additional training (dates and values can be blank).
When a document is selected, the system checks whether any other user has it open.
The system also checks whether a user has previously accessed the document (for example, if a user simply closed the training browser). If so, a warning message appears to indicate that multiple users are trying to access the same document.
Training Page
The training page shows the iDocuments fields on the left.
On the right is the scanned image from the supplier. On this screen, the OCR will be taught where to look for the pieces of information required to process this document.
V32: Removed the login prompt required on first use, so this page will open and log in automatically.
V35: Added a resize option to the training grid for the lines so they can be extended.
iDocuments will automatically process as many fields as possible by looking at the fields on the left to find that information on the right.
Users can then correct or alter those results as needed. (iDocuments looks for variations of wording; for example, if a user searches for ‘invoice number’, iDocuments also looks for ‘invoice no’, ‘invoice no.’, ‘invoice #’, ‘reference number’, etc.)
Change the view of the document with the buttons in the grey bar that move the image side to side or increase/decrease the size to make it easier to view.
Processing – Purchase Invoice and Sales Order
These are the steps for purchase invoices and/or sales orders. Goods receipt notes follow a similar process but with slightly different fields.
- Click in the supplier/customer identification box so it turns green.
- On the right side of the screen, click in the field that will identify this supplier/customer.
- This will place a green box around that field.
- This field is based on one of seven fields found in the 'Master Partner Data' in the ERP system: Federal tax ID, Telephone 1, Telephone 2, Mobile, Fax, Email, Company reg. no (only in UK versions).
- If none of this information is stored in the document, the 'Alias' field can be used to reference a unique piece of data for this supplier/customer (e.g., the code or name exactly matched to the ERP system, ignoring spaces).
iDocuments will then look up all the master partner data in the ERP system to ensure the selected field is listed for a valid supplier. If the identification field is validated, iDocuments will update the name field with the information from the ERP system.
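As a rough illustration of this lookup, the Python sketch below compares the text selected on the image against each partner's identification fields, ignoring spaces. The record layout, the field names, and the choice to ignore spaces for every field (the guide states this only for the alias) are assumptions made for the example; they do not reflect the actual ERP schema or the iDocuments implementation.

```python
from typing import Optional

# Hypothetical partner records; the keys are illustrative, not the ERP schema.
PARTNERS = [
    {"code": "S1001", "name": "Acme Supplies Ltd", "tax_id": "GB123456789",
     "phone1": "020 7946 0000", "email": "ap@acme.example"},
    {"code": "S1002", "name": "Northwind Traders", "tax_id": "GB987654321"},
]

# The seven identification fields, followed by the alias candidates (code or name).
IDENTIFIER_FIELDS = ("tax_id", "phone1", "phone2", "mobile", "fax", "email", "company_reg_no")
ALIAS_FIELDS = ("code", "name")


def strip_spaces(value: str) -> str:
    """Remove all whitespace so '020 7946 0000' and '02079460000' compare equal."""
    return "".join(value.split())


def find_partner(selected_text: str, partners: list) -> Optional[dict]:
    """Return the first partner with a field that exactly matches the selected text."""
    target = strip_spaces(selected_text)
    for partner in partners:
        for field in IDENTIFIER_FIELDS + ALIAS_FIELDS:
            value = partner.get(field)
            if value and strip_spaces(value) == target:
                return partner
    return None
```

A match would then allow the name field to be filled from the partner record; no match would leave the identification unvalidated.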
The OCR process will also check if the tax code on the line is set as ‘Acquisition Reverse in SAP’ and automatically set the tax amount to 0%.
- Click a field on the left to turn it green, then find and click the relevant field on the image to link it for this template.
- Repeat this for each of the required fields.
- Some fields, such as currency, may not be available on the image.
- They may be left blank and will be pulled from the master partner data in the ERP system where possible.
- Begin building the lines next, following the same process as above but selecting the first line information.
- The line details will be highlighted in blue on the image.
- If there are multiple lines, clicking the code field on the second line will replicate all the lines showing on the document.
(Note: Additional lines or missing details can be manually entered if they’re not being captured automatically.)
Post Processing – Automatic Actions
Item / GL Codes:
The export process from the OCR checks the following in order: whether the code matches one in the ERP; then the catalogue code, for those who have a cross-reference; then the description, using the item last used for it; then the item name, if it matches an item.
For the history check, when only the line description is captured, previous invoices are checked; if they match on the same group and the same description, the process can tell which item is most likely.
If no history exists, the process initially checks against master data using the exact description.
Any blank descriptions that are passed through are automatically replaced with a default so that they do not remain blank in iDocuments.
For purchase invoices, the default description is "AP Invoice", then "-", then the supplier code.
For sales orders, just the line numbering is added (e.g., "Line 1", "Line 2").
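The matching order for item/GL codes described above can be sketched as a simple fallback chain. This is a minimal Python illustration under stated assumptions: the dictionaries standing in for the ERP item list, the catalogue cross-reference, and the last-used history are hypothetical structures, and the real export process may differ.

```python
from typing import Optional


def resolve_item_code(
    line: dict,
    erp_items: dict,       # hypothetical: ERP item/GL code -> item name
    catalogue_xref: dict,  # hypothetical: supplier catalogue code -> ERP code
    last_used: dict,       # hypothetical: line description -> ERP code last used for it
) -> Optional[str]:
    """Return an ERP code for a captured line, trying each check in turn."""
    code = line.get("code")
    description = line.get("description")
    # 1. The captured code matches an ERP code directly.
    if code in erp_items:
        return code
    # 2. The captured code is a catalogue code with a cross-reference.
    if code in catalogue_xref:
        return catalogue_xref[code]
    # 3. The description was used before; reuse the code last used for it.
    if description in last_used:
        return last_used[description]
    # 4. The description exactly matches an item name in master data.
    for item_code, item_name in erp_items.items():
        if description and item_name == description:
            return item_code
    return None
```

A result of `None` here would correspond to a line that needs manual coding.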
Tax codes:
Similarly, suppliers won't know your ERP tax codes, so the OCR export process checks the most frequent usage of any item/GL found and uses that rate, or it determines a rate percentage from the extracted net and gross line amounts. The rate is then validated against the rules below, in order, until a tax code is determined.
- Some company settings may be applied initially (e.g., a Tax Liable Default Code may be set)
- Check the iDocuments history for the most commonly used code for that rate (+/- 1.00)
- Check the ERP master data for a suitable rate (+/- 1.00)
- Check the code from the ERP supplier configuration (if a sales order, this would be the default)
- Use the company setting "Default VAT Code / Default Sales VAT Code"
The initial training therefore has the greatest effect on whether an item's tax code is set correctly, as one of the first checks looks at the history.
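To make the rate calculation and fallback order concrete, here is a minimal Python sketch. It assumes the rate is derived as a percentage of the net amount and that "+/- 1.00" means one percentage point either side; the function names and data structures are hypothetical, and the sketch leaves out the item/GL frequency check and the initial company settings.

```python
from collections import Counter
from typing import Optional

TOLERANCE = 1.00  # assumed to mean percentage points either side of the implied rate


def implied_rate(net: float, gross: float) -> Optional[float]:
    """Derive a tax rate percentage from the net and gross line amounts."""
    if net == 0:
        return None
    return (gross - net) * 100.0 / net


def pick_tax_code(rate, history, erp_rates, supplier_default=None, company_default=None):
    """Try history, then ERP master data, then the supplier and company defaults.

    history   -- hypothetical list of (tax_code, rate) pairs from earlier documents
    erp_rates -- hypothetical mapping of ERP tax code -> rate percentage
    """
    if rate is not None:
        # Most commonly used code in history whose rate is within tolerance.
        nearby = Counter(code for code, used in history if abs(used - rate) <= TOLERANCE)
        if nearby:
            return nearby.most_common(1)[0][0]
        # Otherwise, any ERP code with a suitable rate.
        for code, erp_rate in erp_rates.items():
            if abs(erp_rate - rate) <= TOLERANCE:
                return code
    return supplier_default or company_default
```

For a line with a net amount of 100.00 and a gross amount of 120.00, the implied rate is 20%, so a history dominated by one 20% code would select that code before the ERP master data is consulted.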
Processing – Goods Receipt
For goods receipt documents, the OCR will determine the main details.
If the supplier, receipt number reference, and order number are found, the document will export.
On export, it will:
- Check the order number
- Create the header
- Link to open order lines if there are any
- Set the “creator” to the same as the PO
The receipt will be a draft with the receipt note attached.
If it can’t link to the PO, it will stay in the OCR log page. You can search for a PO to manually link the document.
The total from the scan will be populated in the 'Scan Total' field for the receipt, so those that aren’t created by OCR will have a blank amount.
This field is useful for checking whether the scanned document total equals the receipt value, because the receipt initially includes all outstanding lines. Once only the correct lines and values remain, the two totals should tally.
If there are rounding issues, the receipt can still be processed as all validation remains against the total field.
This document type is always held in the intelligent archive, so any portion of the OCR can be used as search criteria.
Document Progression
When completed, the fields required on the left of the screen should be populated. This is also identifiable by a green tick on the image thumbnail.
The document can be progressed.
- To exit without saving, click ‘Return Document’ in the top left corner.
- To save the information and remain in the training screen, click ‘Save Document’.
- Click ‘Progress Document’ to progress the document through the OCR.
- This will automatically save any template training that has occurred.
Advanced Actions - Configuration/Mailbox
These options in the Intelligent Capture page are exclusively for users with admin rights and require a password.
To access, click unlock, enter the password, and click continue.
Note: You can update the password for the advanced action settings here.
Configuration
After unlocking, the below options are available.
Rules
- ‘Posting date is tax date’ – Normally, the posting date is the current date, but setting the tax date will make it the same date as the invoice.
- ‘Document number check’ – Checks the reference number for duplicates and highlights them with a duplicate status.
- ‘Check all companies’ – Checks the reference number across all linked companies and allows the document to be manually rerouted to a different company.
- Duplicate documents can be deleted or, as part of the training screen, have their document numbers altered.
Group lines by account or items
If no PO is found, the ‘line grouping codes’ allow you to enter a GL or item code and description.
This creates a single line for the total amount rather than using the lines captured by OCR.
Note: This applies to all documents without a PO. If a PO is matched, it will use the lines on the PO.
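The line grouping behaviour can be summarised in a few lines of Python. This is a sketch only: the function name, the dictionary keys, and the way the total is passed in are assumptions for illustration, and the guide doesn't state which total (net or gross) is used.

```python
def build_draft_lines(ocr_lines, po_lines, total, grouping_code=None, grouping_description=""):
    """Choose the lines for a draft document.

    A matched PO wins; otherwise a configured grouping code yields one line
    for the total; otherwise the lines captured by OCR are used as they are.
    """
    if po_lines:
        return list(po_lines)
    if grouping_code:
        return [{"code": grouping_code, "description": grouping_description, "amount": total}]
    return list(ocr_lines)
```

With a grouping code configured and no PO found, a three-line invoice would therefore produce a single draft line carrying the total amount.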
Languages
This lets you select the mappings between the ERP list of countries and the more limited list of OCR languages.
Partner rules
This allows you to select a business partner and force training when any documents are received.
You should use this to allow partners' documents to be sent for training. That way, problematic documents will remain in the training screen where manual selection can be used as part of the processing.
To create a rule, type the supplier name and select from the dropdown to enable the rule. To remove the rule, simply click the slider and the supplier will be removed from the rules.
Template Configuration
Each section (e.g., ‘Purchase Invoice’) allows users to modify the wording the OCR looks for on invoices when it’s being sent to OCR. (Note: Modifications must be created separately for each language.)
These items are in this section:
- Select the date format to look for in the invoices received (as seen below)
- ‘Document’ refers to reading the date as shown on the document being processed
- Select the language for the OCR for localisation settings
- Click 'Template' to open the screen to make modifications
On the screen shown above, there are several columns in the 'Fields' tab:
- Section – Refers to where the field is located (at the header or line level)
- Field – This is the database name and can’t be altered
- Name – This is the label name for the field and may be different per template (e.g., in a different language)
- Hidden – If there’s a tick box, the field isn’t mandatory and can be hidden
- Headers – Displays a list of values the OCR will search for
- For example, the OCR will search for gross, total amount, grand total, total due, and total amount due
- Admins can add options to this list for invoices to be picked up by the OCR, but common phrases should be avoided
- Exceptions – This is a list of values to ignore
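One way to picture how the Headers and Exceptions lists might interact is the Python sketch below. The matching approach (case-insensitive substring search, longest header first, exceptions checked before headers) and the sample exception phrase are assumptions made for illustration; the guide doesn't describe how the OCR actually applies these lists.

```python
from typing import Optional

# Header phrases are the examples given above; the exception phrase is hypothetical.
GROSS_FIELD = {
    "headers": ["gross", "total amount", "grand total", "total due", "total amount due"],
    "exceptions": ["total due date"],
}


def match_header(line: str, field: dict) -> Optional[str]:
    """Return the header phrase found in a line of OCR text, or None."""
    text = line.lower()
    # A line containing an exception phrase is ignored outright.
    if any(exception in text for exception in field["exceptions"]):
        return None
    # Prefer the longest phrase so 'total amount due' beats 'total amount'.
    for header in sorted(field["headers"], key=len, reverse=True):
        if header in text:
            return header
    return None
```

Because the search is by substring, a short common phrase such as "gross" would also match a line like "Gross weight", which illustrates why common phrases are best avoided in the Headers list.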
Note: User-defined fields (UDFs) have additional validation, in line with the standard formatted fields.
- UDF-type validation – For example, date/string/numeric fields are validated
- In this instance, the UDF will NOT be saved if the data isn't the right type
- UDF-length validation for the "string" type, which could be free text or a lookup value, both of which have a maximum length
- If the maximum length is exceeded, the UDF will NOT be saved, so the user will have to fill it in on rapid entry; this prevents partial information going through
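The UDF checks above amount to "reject rather than truncate". The following Python sketch shows that idea under stated assumptions: the type names, the default date format, and the function signature are hypothetical and are not taken from iDocuments.

```python
from datetime import datetime
from typing import Optional


def validate_udf(value: str, udf_type: str, max_length: Optional[int] = None,
                 date_format: str = "%d/%m/%Y") -> bool:
    """Return True only if the captured value may be saved to the UDF."""
    if udf_type == "numeric":
        try:
            float(value)
        except ValueError:
            return False
        return True
    if udf_type == "date":
        try:
            datetime.strptime(value, date_format)
        except ValueError:
            return False
        return True
    if udf_type == "string":
        # Over-length values are rejected whole, never truncated.
        return max_length is None or len(value) <= max_length
    return False
```

A value that fails validation would simply be left empty for the user to complete on rapid entry.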
The lookup mappings section lists the fields users can check to determine which supplier to hold this template against.
Within this configuration, users can define an SQL filter to make the list of suppliers easier to sort (e.g., they could limit the search to include only certain card codes or specific UDFs).
Also here is a field for ‘Matches’, where you can specify how many suppliers to return, which limits lookups that would otherwise draw a long list of suppliers. There's also an option to allow only an identifier to be used for selection, so a supplier can't simply be selected without any training being completed.
Credit notes are automatically determined if the gross amount is negative, but detection phrases can be entered to help identify credits by specific text in the document.
(Note: This searches the first 50% of OCR data from the first page as the most likely area to find these phrases, but that isn’t necessarily the top half, as it depends on how the page has been constructed as to where it is in the OCR data.)
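A minimal Python sketch of this credit note check follows. It assumes the "first 50% of OCR data" can be approximated by the first half of the first page's extracted text by character count, and that phrase matching is case-insensitive; both are assumptions for illustration, and the function name is hypothetical.

```python
from typing import Iterable, Optional


def is_credit_note(gross: Optional[float], first_page_text: str, phrases: Iterable[str]) -> bool:
    """Flag a document as a credit note by negative gross or a detection phrase."""
    if gross is not None and gross < 0:
        return True
    # Only the first half of the first page's OCR text is searched.
    head = first_page_text[: len(first_page_text) // 2].lower()
    return any(phrase.lower() in head for phrase in phrases)
```

Note that a phrase appearing only late in the OCR data would not trigger the flag, which matches the caveat that position in the data is not the same as position on the page.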
‘Use Strict Formatting’ is used for the line information table part of the document and means the OCR will go through additional processes to improve the line data. This requires more training and slows down processing significantly, so it’s rarely recommended.
After completing the fields, click save and then close to finish.
Email Configuration
‘Mailboxes’ is where you configure the email account(s) that will extract documents for OCR.
Rules at the top of the page define the allowed file types for attachments, which avoids capturing unwanted files such as images in email signatures.
There are various email options, with the most frequent being Office 365. Details are provided on configuring the advanced settings in the OAuth section.
Email extraction can be disabled when required via the active toggle switch.
A default 'Group' – Allows users to set a default group for every email address entry (e.g., when iDocuments can’t determine the group via the automatically captured PO).
A batch type can be set per email address to allow you to use different form templates.
You can enter an email address to receive notifications when errors occur in extracting emails. (This can't be the same address as the mailbox, as that would create a loop.)
'Save email details' – Stores the main information from the email as a comments field within the document and the entire email body as an attachment text file. (Note: Increased storage is required.)
'Save all attachments' – Allows attachments that aren’t PDFs to be stored as attachments.
If users can’t prevent certain emails but don’t want them captured each time, they can put the addresses or subjects into the right-side panel; iDocuments will ignore those emails in the extraction process.
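The allowed file types and the ignore panel together act as a filter on incoming attachments. The Python sketch below illustrates one possible reading: senders are compared exactly (ignoring case), ignored subjects are matched as case-insensitive substrings, and file types are compared by extension. These matching rules and the function name are assumptions, not documented behaviour.

```python
import os


def should_extract(sender: str, subject: str, filename: str,
                   allowed_extensions: set, ignored_senders: set, ignored_subjects: set) -> bool:
    """Decide whether an email attachment should be passed on for OCR."""
    if sender.lower() in {address.lower() for address in ignored_senders}:
        return False
    if any(phrase.lower() in subject.lower() for phrase in ignored_subjects):
        return False
    # os.path.splitext returns the extension including the leading dot.
    extension = os.path.splitext(filename)[1].lower()
    return extension in {ext.lower() for ext in allowed_extensions}
```

Under these assumptions, a signature image such as `logo.png` is dropped when only `.pdf` is allowed, and any email whose subject contains an ignored phrase is skipped entirely.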
Once the mailbox entry has been created, you can run a test from this screen, which displays a ‘Last connection’ result.
Attachments-Only Processing - awaiting QA/Beta testing
When a mailbox process is created for ‘Attachments’, the system will expect emails with a specific subject line.
The attachments will be linked to the document identified by that email subject.
The email subject should be the document type (PO, GRN, GR, PI, SO, or SI), then a space, then the reference number.
If not uniquely matched, it will show on the home page as ‘Pending document link’ when logged in as an admin or finance approver, where it can be linked manually.
Selecting the document link allows the user to search for the document type and then the document reference number. The ‘Link’ button assigns the attachment to that document.
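The expected subject format (document type, a space, then the reference number) can be checked with a short regular expression. This Python sketch assumes upper-case type codes, exactly one separating space, and a reference that runs to the end of the subject; the function name is hypothetical, and the actual matching in iDocuments may be more lenient.

```python
import re
from typing import Optional, Tuple

# Document type, one space, then a reference that starts with a non-space character.
SUBJECT_PATTERN = re.compile(r"(PO|GRN|GR|PI|SO|SI) (\S.*)")


def parse_attachment_subject(subject: str) -> Optional[Tuple[str, str]]:
    """Split an email subject into (document type, reference), or return None."""
    match = SUBJECT_PATTERN.fullmatch(subject.strip())
    if match is None:
        return None
    return match.group(1), match.group(2)
```

A subject that doesn't parse, or that parses but doesn't match exactly one document, would correspond to the ‘Pending document link’ case described above.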
(Note: The history entry for these attachments will be based on the base documents, so any goods receipt attachment will have the user who attached it as the PO creator.)
Last modified: 06/25/2025, 9:00 am