MyScript’s recognition technology is very flexible. While the default configurations support common use cases, this page explains how you can fine-tune them to address specific needs.
Resources are pieces of knowledge that should be attached to the recognition engine to make it able to recognize a given language or content.
An Alphabet knowledge (AK) is a resource that enables the engine to recognize individual characters for a given language and a given writing style. Default configurations include a cursive AK for each supported language.
A Linguistic knowledge (LK) is a resource that provides the engine with linguistic information for a given language. It allows the recognition engine to improve its accuracy by favoring words from its lexicon that are the most likely to occur. Default configurations include an LK for each supported language.
A lexicon is a resource that lists words that can be recognized in addition to those included in linguistic knowledge resources.
A subset knowledge (SK) is a resource that restricts the number of text characters that the engine shall attempt to recognize. It thus corresponds to a restriction of an AK resource.
A math grammar is a resource that restricts the number of math symbols and rules that the engine shall be able to process.
For on-device use, we deliver two different sets of ready-to-use recognition resources with associated configurations: the standard ones and the lite ones.
The standard resources should meet most of your recognition needs. However, there may be specific situations where resource file size matters: resource files add to the overall footprint of the OS and so reduce the space available for user data. On low-end devices, you might also need the recognition process to be faster and/or to use less CPU and battery.
Lite recognition resources can address these needs: in addition to being smaller, they increase recognition speed. Be aware, however, that using them might slightly decrease recognition accuracy.
The choice between lite and standard resources is therefore a trade-off between speed and CPU/battery savings on the one hand, and accuracy on the other.
MyScript Developer Portal lets you download recognition assets to support a wide range of languages, as well as math, raw-content and diagram use cases. Each pack comes with the two ready-to-use standard and lite configurations that will work in most cases.
Step 1 Download your language pack(s) as described above and, if needed, the content type package as well: diagram, math, raw-content.
Step 2 Install the pack(s) in your application project:
The packs consist of a *.zip archive containing the following folders to be extracted in your project path:
- a configuration folder (recognition-assets/conf) containing the standard resources configuration (*.conf),
- a configuration folder (recognition-assets/conf-lite) containing the lite resources configuration (*.conf),
- a folder containing the resource files (*.res).
Step 3 Modify the engine configuration in your application code:
Set the value of the configuration-manager.search-path key to the folder(s) containing your configuration file(s) (*.conf):
- To use the standard resources, set the value to the conf folder, for instance: zip://${packageCodePath}!/assets/conf
- To use the lite ones, set the value to the conf-lite folder, for instance: zip://${packageCodePath}!/assets/conf-lite
conf
or the conf-lite
ones
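As an illustration of Step 3, the following Python sketch builds the search-path value for either set of resources. It is only a sketch: the helper name search_path, the example package path and the plain dictionary standing in for the engine configuration are assumptions made for this example, and the actual call that stores the value depends on the platform binding you use.

```python
# Key name taken from the text above.
SEARCH_PATH_KEY = "configuration-manager.search-path"


def search_path(package_code_path: str, lite: bool = False) -> str:
    """Return the zip:// location of the conf or conf-lite folder in the package.

    Hypothetical helper: it only mirrors the two example values given above.
    """
    folder = "conf-lite" if lite else "conf"
    return f"zip://{package_code_path}!/assets/{folder}"


# A plain dict stands in for the engine configuration (assumption: the real
# SDK call that sets this key is platform-specific and is not shown here).
engine_settings = {
    SEARCH_PATH_KEY: [search_path("/data/app/example/base.apk", lite=True)],
}
print(engine_settings[SEARCH_PATH_KEY][0])
```

Keeping the standard/lite choice behind a single flag makes it easy to switch sets per device class without touching the rest of the setup code.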
There are a few situations where you may want to adapt these provided configurations:
You are building a form application and want to restrict some fields to accept only certain types of symbols, such as alphanumerical symbols, digits or even capital letters. In this case, consider building and attaching a subset knowledge.
An LK is not mandatory, but not attaching one often results in a significant accuracy drop. It may be relevant to build your own LK if you do not expect users to write full meaningful words, for instance if you plan to filter a list with a few letters.
You can build and attach your own custom lexicons.
A customized SK can be useful in a form application, for example, to restrict the authorized characters of an email field to alphanumerical characters, @ and a few allowed punctuation signs.
You can build and attach your own custom subset knowledge.
In education use cases, a custom math grammar can prove very useful to adapt the recognition to a given math level (for instance, only digits and basic operators for pupils).
You can build and attach your own custom math grammars.
As explained in the runtime part of the guide, iink SDK consumes configuration files, a standardized way to provide the right parameters and knowledge to recognize a specific type of content.
The resource packs that we deliver contain the corresponding configuration files, which can be used as explained in the previous section. This section focuses on using configuration files with customized resources.
To deploy and use a configuration, you need to:
- Deploy the *.conf file with your application, along with all the resource files that it references (make sure that all paths are correct).
- Add the folder containing the *.conf file to the paths stored in the engine configuration for the configuration-manager.search-path key.
- Make sure that the values of the text.configuration.bundle and text.configuration.name keys match your text configuration bundle and configuration item name (see example below).
A configuration file is a text file with a *.conf extension. It is composed of a header (identifying a configuration bundle) and one or more named configuration items (defining configuration names) separated by empty lines.
Here is an example:
# Bundle header
Bundle-Version: 2.1
Bundle-Name: en_US
Configuration-Script:
AddResDir ../resources/

# Configuration item #1
Name: text
Type: Text
Configuration-Script:
AddResource en_US/en_US-ak-cur.res
AddResource en_US/en_US-lk-text.res
SetTextListSize 1
SetWordListSize 5
SetCharListSize 1

# Configuration item #2
Name: text-no-candidate
Type: Text
Configuration-Script:
AddResource en_US/en_US-ak-cur.res
AddResource en_US/en_US-lk-text.res
SetTextListSize 1
SetWordListSize 1
SetCharListSize 1
Explanations:
- Lines starting with # and ! are considered as comments and ignored.
- Commands are placed under Configuration-Script.
- Bundle-Name is the name of the bundle. This is what iink SDK expects as a possible value for the text.configuration.bundle configuration key. In this example, it would be en_US.
- Name defines a configuration item. It is one of the names that iink SDK expects as a possible value for the text.configuration.name configuration key. In this example, it could be text or text-no-candidate. A given engine can only be configured with a single configuration item for each type of recognizer at any point in time.
- Possible values for the Type key are: Text, Math, Shape and Analyzer. They correspond to the types of content that the core MyScript technology is able to recognize.
The table below lists some possible configuration commands (to be placed under Configuration-Script):
Configuration item type | Syntax | Argument |
---|---|---|
All | AddResDir DIRECTORY | Folder that the engine shall consider for resource files relative paths |
All | AddResource FILE | Name of an individual resource file to attach |
Text | SetCharListSize N | An integer between 1 and 20, representing the number of character candidates to keep |
Text | SetWordListSize N | An integer between 1 and 20, representing the number of word candidates to keep |
Text | SetTextListSize N | An integer between 1 and 20, representing the number of text candidates to keep |
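To make the file layout and the command constraints above more concrete, here is a minimal Python sketch that splits a configuration text into its bundle header and configuration items, then checks that the candidate list sizes fall within 1 to 20. It is not the parser used by iink SDK: the names Section, parse_conf and check_list_sizes are hypothetical, and the sketch assumes that sections are separated by empty lines and that every line following Configuration-Script: within a section is a command.

```python
from dataclasses import dataclass, field


@dataclass
class Section:
    """Bundle header or configuration item: its keys plus its script commands."""
    keys: dict = field(default_factory=dict)
    commands: list = field(default_factory=list)


def parse_conf(text: str) -> list:
    """Split a *.conf text into sections separated by empty lines."""
    sections, current, in_script = [], None, False
    for raw in text.splitlines():
        line = raw.strip()
        if not line:                      # an empty line closes the current section
            current, in_script = None, False
            continue
        if line[0] in "#!":               # comment lines are ignored
            continue
        if current is None:               # first meaningful line opens a new section
            current = Section()
            sections.append(current)
        if line == "Configuration-Script:":
            in_script = True
        elif in_script:                   # "Command argument"
            name, _, arg = line.partition(" ")
            current.commands.append((name, arg.strip()))
        else:                             # "Key: value"
            key, _, value = line.partition(":")
            current.keys[key.strip()] = value.strip()
    return sections


LIST_SIZE_COMMANDS = {"SetCharListSize", "SetWordListSize", "SetTextListSize"}


def check_list_sizes(section: Section) -> list:
    """Report list-size commands whose argument is not an integer in 1..20."""
    errors = []
    for name, arg in section.commands:
        if name in LIST_SIZE_COMMANDS and not (arg.isdecimal() and 1 <= int(arg) <= 20):
            errors.append(f"{name} {arg}: expected an integer between 1 and 20")
    return errors


SAMPLE = """\
# Bundle header
Bundle-Version: 2.1
Bundle-Name: en_US
Configuration-Script:
AddResDir ../resources/

# Configuration item
Name: text
Type: Text
Configuration-Script:
AddResource en_US/en_US-ak-cur.res
AddResource en_US/en_US-lk-text.res
SetWordListSize 5
"""

header, *items = parse_conf(SAMPLE)
print(header.keys["Bundle-Name"], [item.keys["Name"] for item in items])
print(check_list_sizes(items[0]))
```

A check like this can be run at build time to list the bundle and item names that are valid values for the text.configuration.bundle and text.configuration.name keys.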
The following table lists the types of configuration items that you need to provide for iink SDK to support its different content types:
Content type | Required configuration item types |
---|---|
Text | Text |
Math | Math |
Diagram | Text + Shape + Analyzer |
Drawing | None |
Text Document | Text + Math + Shape + Analyzer |
Raw Content | Text [1] + Shape + Analyzer |
[1] If you enable text recognition on raw content, you need the corresponding language pack (see configuration).
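The mapping above can also be expressed as data, which makes it easy to check a deployment before creating a content part. The Python sketch below is only an illustration: REQUIRED_ITEM_TYPES and missing_item_types are hypothetical names, not part of iink SDK.

```python
# Required configuration item types per content type, copied from the table above.
REQUIRED_ITEM_TYPES = {
    "Text": {"Text"},
    "Math": {"Math"},
    "Diagram": {"Text", "Shape", "Analyzer"},
    "Drawing": set(),
    "Text Document": {"Text", "Math", "Shape", "Analyzer"},
    # Text is listed for Raw Content with a note: it applies when text
    # recognition is enabled, which requires the matching language pack.
    "Raw Content": {"Text", "Shape", "Analyzer"},
}


def missing_item_types(content_type: str, available: list) -> list:
    """Return the required item types that are not in the available ones."""
    return sorted(REQUIRED_ITEM_TYPES[content_type] - set(available))


print(missing_item_types("Diagram", ["Text", "Shape"]))
```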
Resources are attached in the Configuration-Script part of the configuration items by using the AddResource command.
For example, in the case of an en_US AK, you would write:
AddResource en_US/en_US-ak-cur.res
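Since a configuration only works if every referenced resource file can be found, it can help to list the paths that a script points to before deploying. The Python sketch below does this under two assumptions that the text does not confirm: that an AddResDir folder is relative to the folder holding the *.conf file, and that each AddResource name is relative to the most recent AddResDir. The function name resolve_resources is hypothetical.

```python
import posixpath


def resolve_resources(conf_dir: str, commands: list) -> list:
    """Return the path each AddResource command points to.

    Assumption: AddResDir is relative to the folder holding the *.conf file,
    and resource names are relative to the most recent AddResDir.
    """
    res_dir = conf_dir
    resolved = []
    for name, arg in commands:
        if name == "AddResDir":
            res_dir = posixpath.normpath(posixpath.join(conf_dir, arg))
        elif name == "AddResource":
            resolved.append(posixpath.join(res_dir, arg))
    return resolved


# Commands taken from the en_US example: bundle header first, then one item.
commands = [
    ("AddResDir", "../resources/"),
    ("AddResource", "en_US/en_US-ak-cur.res"),
    ("AddResource", "en_US/en_US-lk-text.res"),
]
for path in resolve_resources("recognition-assets/conf", commands):
    print(path)
```

Each printed path can then be compared with the files actually packaged with the application.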