Signal Cruncher’s XONBOT Text Matching

 

What is the XTM?

Signal Cruncher’s AI application “XONBOT Text Matching” (XTM for short) evaluates large volumes of text.


What can the XTM do?

Example: tenders
Tenders are largely text-based and therefore difficult to evaluate automatically. To find suitable articles for the positions in a tender, the tender texts normally have to be evaluated manually. This is often very time-consuming, and the XTM can reduce that effort. In this example, the task of the XTM is to assign article numbers to tender positions automatically.

What do we do?

The basis is historical data from previous manual assignments, provided as one large table. For each new tender, the task is to automatically find the right article number for each position. Several “recommendations” are proposed for each position. In its simplest variant, the task is to assign an article number to each descriptive text.
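
Proposing several recommendations per position can be pictured as a top-k selection over model scores. The following is only a minimal sketch under assumptions, not the XTM's actual code: the function name `top_k_recommendations`, the article numbers and the scores are all invented for illustration.

```python
import math


def top_k_recommendations(scores, k=3):
    """Return the k (article number, probability) pairs with the highest probability.

    `scores` maps a hypothetical article number to a raw model score.
    A softmax turns the scores into probabilities so that each
    recommendation can be shown together with a confidence value.
    """
    max_score = max(scores.values())
    # Subtract the maximum before exponentiating for numerical stability.
    exp_scores = {a: math.exp(s - max_score) for a, s in scores.items()}
    total = sum(exp_scores.values())
    probabilities = {a: e / total for a, e in exp_scores.items()}
    ranked = sorted(probabilities.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k]


# Invented scores for a single tender position.
example_scores = {"A-1001": 2.1, "A-2040": 0.3, "A-3117": 1.4, "A-4500": -0.8}
recommendations = top_k_recommendations(example_scores, k=3)
for article, probability in recommendations:
    print(f"{article}: {probability:.2f}")
```

In the simplest variant described above, only the first entry of the ranked list would be used.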

We achieve a prediction quality of 80-90%.

How do we do that?

A model is trained on the historical data so that it maps each tender position to its assigned article. The trained model can then be applied to new data. Neural networks serve as the model.

Our concept includes the following approaches:

  1. Pre-processing: Kept light, since modern neural networks need little pre-processing thanks to automatic feature extraction. The core step is the identification of references. This can be done via empirical scoring using keywords (quality over 90%).
  2. Classification: Learns the direct assignment from description to article number. This only works for “top sellers”, i.e. articles that appear several times in the transactions, and covers almost half of all transactions.
  3. Regression: Learns distances between the descriptions and the article master data, using triplet learning on a description together with the master data of a positive and a negative reference article. Each description is then assigned the article with the smallest distance. This covers the “long tail”, i.e. also assignments to articles that appeared rarely or never in the historical data.
  4. Shopping basket analysis: Analyzes which combinations of articles usually appear together in tenders. This is used to improve the classification quality.
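
The regression step can be illustrated with a common formulation of the triplet loss. This is a sketch under assumptions, not the XTM's implementation: the source does not state the distance measure or the loss, so Euclidean distance and a margin of 1.0 are assumed here, and the vectors are written by hand where a neural network would normally produce them. All names and article numbers are invented.

```python
import math


def euclidean_distance(u, v):
    """Euclidean distance between two equally long vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Triplet margin loss: max(0, d(anchor, positive) - d(anchor, negative) + margin).

    The loss is zero once the positive article is closer to the
    description (anchor) than the negative article by at least `margin`.
    """
    return max(
        0.0,
        euclidean_distance(anchor, positive)
        - euclidean_distance(anchor, negative)
        + margin,
    )


def nearest_article(description_vec, article_vecs):
    """Assign the article whose vector has the smallest distance."""
    return min(
        article_vecs,
        key=lambda article: euclidean_distance(description_vec, article_vecs[article]),
    )


# Hand-written toy embeddings (assumption: a trained network would supply these).
description = [0.0, 0.0]
articles = {"A-1001": [0.0, 1.0], "A-2040": [3.0, 4.0]}

# Correct article is close, wrong article is far away: no loss.
print(triplet_loss(description, articles["A-1001"], articles["A-2040"]))
# Roles swapped: the loss is positive and training would push the vectors apart.
print(triplet_loss(description, articles["A-2040"], articles["A-1001"]))
print(nearest_article(description, articles))
```

Because the assignment only needs a vector for each article's master data, it also works for articles that never appeared in the historical assignments.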

The implementation

The XTM is provided as a Docker container and can therefore be installed quickly on all major platforms such as Windows, Linux and macOS. Communication takes place via a REST interface served by an integrated web server. For training, a file is uploaded via POST. The solution can be installed on-premise or operated as a SaaS solution.
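
The upload for training could look roughly like the following sketch. The actual XTM endpoints and file format are not documented here, so the URL `http://localhost:8080/train`, the CSV layout and the content type are purely hypothetical placeholders. The example only builds the request and does not send it.

```python
import urllib.request


def build_training_upload(url, csv_bytes):
    """Build (but do not send) a POST request that uploads training data."""
    return urllib.request.Request(
        url,
        data=csv_bytes,
        headers={"Content-Type": "text/csv"},
        method="POST",
    )


# Invented sample row: description text and manually assigned article number.
sample_csv = b"description;article_number\nsteel pipe DN50 galvanized;A-1001\n"

# Hypothetical endpoint; replace with the address of the running container.
request = build_training_upload("http://localhost:8080/train", sample_csv)
print(request.get_method(), request.full_url)

# To send it against a running container, one would call:
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```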

What else can you use the XTM for?

The solution can be transferred to a wide variety of tasks, in the simplest case to the classification of descriptive texts. The regression approach can also be used here: one of the descriptions assigned to each article is used as its reference description, and the distance between each description and all articles is then learned via triplet learning. The articles can also be replaced by other objects, for example texts, which greatly expands the field of application.
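
The assignment via reference descriptions can be sketched as a nearest-reference lookup. In this illustration a simple Jaccard distance on word sets stands in for the distance that would be learned via triplet learning. That substitution is an assumption made to keep the example self-contained, and the labels and texts are invented.

```python
def jaccard_distance(text_a, text_b):
    """1 minus the share of words two texts have in common (0.0 = same word set)."""
    tokens_a = set(text_a.lower().split())
    tokens_b = set(text_b.lower().split())
    union = tokens_a | tokens_b
    if not union:
        return 0.0
    return 1.0 - len(tokens_a & tokens_b) / len(union)


def assign_label(text, references):
    """Return the label whose reference description is closest to `text`."""
    return min(references, key=lambda label: jaccard_distance(text, references[label]))


# One invented reference description per target (here: text categories).
references = {
    "invoice": "invoice for delivered goods with payment terms",
    "complaint": "customer complaint about damaged delivery",
}
print(assign_label("complaint about a damaged parcel", references))
```

Swapping `jaccard_distance` for a learned distance leaves the lookup itself unchanged.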

Idea

Would you like to learn more about our XTM or do you have ideas for another use case?
Then feel free to contact us.