MyScript Cloud WebSockets API architecture

MyScript Cloud exposes the WebSockets API version 4, enabling remote access to the Interactive Ink technology.

Requirements

This guide assumes you are familiar with the MyScript technology web concepts.

The WebSockets API is the lowest-level API available in client-server mode, and this section assumes that you are familiar with interactive ink concepts. To use it, you have to capture, render and edit strokes and rich content on the client side with your own software. If you want to integrate handwriting recognition into a web application, consider using the client-side libraries instead.

Overall architecture

The Interactive Ink WebSocket API is built for modern browsers and good network conditions. Because handwriting recognition is very CPU-intensive, it cannot be deployed client-side. The client is therefore used only for ink capture and content rendering, and all recognition is done server-side.

To ensure good recognition and accurate gesture detection, stroke positions have to be exactly the same on the client and on the server. The protocol is built on the assumption that the server knows the exact position of strokes and content on the user's display. As a consequence, even though the server sends SVG content and patches, these cannot be restyled client-side with CSS. You have to use JavaScript exclusively to manipulate the content inside the editor.

Content / Content Package

Interactive Ink SDK introduces the notions of content package and content part. Currently, only MATH and TEXT content parts can be manipulated through the Cloud API. In brief, you create an editor that creates a content package containing a single content part. It is not possible to add several content parts to a Cloud content package.
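
The one-package, one-part constraint can be modelled on the client as follows. This is only a minimal sketch: the class and method names (ContentPackage, ContentPart, create_part) are hypothetical and are not part of the MyScript API. Only the TEXT and MATH types and the single-part rule come from the description above.

```python
from dataclasses import dataclass
from typing import Optional

# Content types the Cloud API can currently manipulate, per this section.
SUPPORTED_TYPES = {"TEXT", "MATH"}


@dataclass
class ContentPart:
    """Hypothetical client-side record of a content part."""
    content_type: str

    def __post_init__(self) -> None:
        if self.content_type not in SUPPORTED_TYPES:
            raise ValueError(f"unsupported content type: {self.content_type}")


class ContentPackage:
    """Hypothetical client-side model of a package holding at most one part."""

    def __init__(self) -> None:
        self.part: Optional[ContentPart] = None

    def create_part(self, content_type: str) -> ContentPart:
        # A Cloud content package cannot hold several content parts.
        if self.part is not None:
            raise RuntimeError("this package already contains a content part")
        self.part = ContentPart(content_type)
        return self.part
```

Rejecting a second part locally keeps the client model aligned with what the server accepts.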

Lifecycle of recognition

Before using the WebSockets API, you have to understand the lifecycle of handwriting recognition.

Step 1: Open a connection with the server and give the context of the recognition

You have to provide information such as the type of content you want to recognize (TEXT or MATH), the language, the size of your input and other parameters.
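
As an illustration, the context could be gathered into a single JSON message before it is sent over the connection. The sketch below is an assumption, not the documented protocol: the message name "context", the field names and the language code format are all hypothetical, and the real message layout must be taken from the API reference.

```python
import json


def build_context_message(content_type: str, language: str,
                          width_px: int, height_px: int) -> str:
    """Serialize the recognition context as JSON (all field names are hypothetical)."""
    if content_type not in ("TEXT", "MATH"):
        raise ValueError("content type must be TEXT or MATH")
    if width_px <= 0 or height_px <= 0:
        raise ValueError("input size must be positive")
    return json.dumps({
        "type": "context",  # hypothetical message name
        "contentType": content_type,
        "language": language,
        "viewSize": {"width": width_px, "height": height_px},
    })
```

For example, build_context_message("TEXT", "en_US", 800, 600) returns a string that can be passed to the send function of any WebSocket client.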

Step 2: Capture the user input

User input is what we call a stroke, i.e. a series of points, each with the timestamp of its capture. More precisely, user input is a pending stroke, meaning that it has not yet been processed by the server recognition engine. The application capturing strokes generally renders these temporary strokes to give immediate feedback to the user.
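
A pending stroke can be represented as parallel lists of coordinates and timestamps. The sketch below assumes a generic pointer down / move / up event model. The names Stroke and StrokeCapture are hypothetical, as is the choice of milliseconds for the timestamps.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Stroke:
    """A series of points, each with the timestamp of its capture."""
    x: List[float] = field(default_factory=list)
    y: List[float] = field(default_factory=list)
    t: List[int] = field(default_factory=list)  # capture times, assumed in ms

    def add_point(self, x: float, y: float, timestamp_ms: int) -> None:
        if self.t and timestamp_ms < self.t[-1]:
            raise ValueError("timestamps must not go backwards")
        self.x.append(x)
        self.y.append(y)
        self.t.append(timestamp_ms)


class StrokeCapture:
    """Collects pointer events into a pending stroke (hypothetical helper)."""

    def __init__(self) -> None:
        self.current: Optional[Stroke] = None

    def pointer_down(self, x: float, y: float, timestamp_ms: int) -> None:
        self.current = Stroke()
        self.current.add_point(x, y, timestamp_ms)

    def pointer_move(self, x: float, y: float, timestamp_ms: int) -> None:
        if self.current is not None:  # ignore moves outside a stroke
            self.current.add_point(x, y, timestamp_ms)

    def pointer_up(self, x: float, y: float, timestamp_ms: int) -> Stroke:
        if self.current is None:
            raise RuntimeError("pointer_up without pointer_down")
        self.current.add_point(x, y, timestamp_ms)
        stroke, self.current = self.current, None
        return stroke  # pending: not yet processed by the server
```

The stroke returned by pointer_up is the one to render temporarily and then send to the server.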

Step 3: Send the stroke to the server

The stroke is sent to the server, which acknowledges its reception. The server then answers with an SVG patch containing the temporary stroke to display to the user. A good practice is to replace your temporary rendering of the stroke with the content of this patch.
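
The good practice above can be sketched as a small client-side layer that keeps temporary strokes until the matching patch arrives. The message names ("addStroke", "ack", "svgPatch") and the fields used here are hypothetical placeholders, not the documented protocol, and the send argument stands for the send function of whichever WebSocket client you use.

```python
import json
from typing import Callable, Dict, List, Set


class StrokeLayer:
    """Tracks locally drawn strokes until the server's SVG patch replaces them."""

    def __init__(self) -> None:
        self.temporary: Dict[str, dict] = {}  # drawn locally, awaiting the patch
        self.acknowledged: Set[str] = set()   # reception confirmed by the server
        self.svg: List[str] = []              # SVG fragments sent by the server

    def send_stroke(self, stroke_id: str, stroke: dict,
                    send: Callable[[str], None]) -> None:
        self.temporary[stroke_id] = stroke
        send(json.dumps({"type": "addStroke", "id": stroke_id, "stroke": stroke}))

    def on_message(self, raw: str) -> None:
        msg = json.loads(raw)
        if msg["type"] == "ack":
            self.acknowledged.add(msg["id"])
        elif msg["type"] == "svgPatch":
            # Drop the temporary rendering and keep the server's version.
            self.temporary.pop(msg["id"], None)
            self.svg.append(msg["svg"])
```

Keeping the server's SVG as the only persistent rendering means the display matches the stroke positions known to the server.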

Step 4: Detect gestures and recognize content

The server runs gesture detection as soon as it acknowledges the reception of the stroke, then tries to recognize the content. At each step, it sends the client a message reporting the current state.
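
A client may want to track where each stroke stands in this sequence. The sketch below assumes that the state messages arrive strictly in the order described and uses hypothetical phase names. The actual state values and their ordering guarantees are defined by the server and are not given in this section.

```python
from enum import Enum


class Phase(Enum):
    """Hypothetical processing phases of a stroke, in the order described above."""
    SENT = 0
    ACKNOWLEDGED = 1
    GESTURES_CHECKED = 2
    RECOGNIZED = 3


def advance(current: Phase, reported: Phase) -> Phase:
    """Accept a reported phase only if it directly follows the current one."""
    if reported.value != current.value + 1:
        raise ValueError(
            f"unexpected transition {current.name} -> {reported.name}")
    return reported
```

If the server can skip or repeat states, the strict check in advance should be relaxed accordingly.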

Step 5: Share user interactions

While writing, the user may want to undo, redo or clear the input zone. The client has to share each of these actions with the server. It is also possible to ask for an export, in which case the server answers with the last recognized result in the requested format. You can also import content in a format supported by the content part. Finally, the user may want to convert the content from its handwritten form to a digital one, called typeset. This action is called conversion, and the server answers with a patch containing text and glyphs instead of strokes.
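
The actions listed above can be turned into messages with one small helper. As before, this is a sketch under assumptions: the message type strings and the "mimeType" and "data" fields are hypothetical, chosen only to show that export and import need a format while the other actions carry no payload.

```python
import json
from typing import Optional

# Actions named in this section; the message layout itself is hypothetical.
ACTIONS = {"undo", "redo", "clear", "export", "import", "convert"}


def build_action(action: str, mime_type: Optional[str] = None,
                 data: Optional[str] = None) -> str:
    """Build the JSON message that shares a user action with the server."""
    if action not in ACTIONS:
        raise ValueError(f"unknown action: {action}")
    message = {"type": action}
    if action in ("export", "import"):
        if mime_type is None:
            raise ValueError(f"{action} requires a format")
        message["mimeType"] = mime_type
    if action == "import":
        if data is None:
            raise ValueError("import requires the content to import")
        message["data"] = data
    return json.dumps(message)
```

For example, build_action("undo") produces a payload-free message, while build_action("export", mime_type="text/plain") names the desired export format.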
