Ranking and Resolver

The Kore.ai NLP engine uses Machine Learning, Fundamental Meaning, and Knowledge Graph (if the bot has one) models to match intents. All three engines deliver their findings to the Kore.ai Ranking and Resolver component as either definitive (exact) matches or probable (possible) matches. The Ranking and Resolver component then determines the final winner of the entire NLP computation.

Working

The NLP engine takes a hybrid approach, combining Machine Learning, Fundamental Meaning, and Knowledge Graph (if the bot has one) models to score the matching intents on relevance. Each model classifies a user utterance as either a Possible Match or a Definitive Match.

Definitive Matches get high confidence scores and are assumed to be perfect matches for the user utterance. In a published bot, if the user input matches a single Definitive Match, the bot directly executes the task. If the utterance matches multiple Definitive Matches, they are sent to the end-user as options to choose from.

Possible Matches, on the other hand, are intents that score reasonably well against the user input but do not inspire enough confidence to be termed exact matches. Internally, the system further classifies possible matches into good and unsure matches based on their scores. In a published bot, if an end-user utterance generates possible matches, the bot sends these matches to the end-user as "Did you mean?" suggestions.

The Ranking and Resolver component ascertains the winning intent across the engines. When the platform cannot ascertain a single winning intent for a user utterance, it initiates one of these two system dialogs:

Disambiguation Dialog: Initiated when more than one Definitive match is returned across engines. In this scenario, the bot asks the user to choose a Definitive match to execute. You can customize the message shown to the user from the NLP Standard Responses.
Did You Mean Dialog: Initiated if the Ranking and Resolver returns more than one winner, or if the only winning intent is an FAQ whose KG engine score is between the lower and upper thresholds. This dialog lets the user know that the bot found a match to an intent that it is not entirely sure about and asks the user to select an option to proceed. In this scenario, the developer should identify these utterances and train the bot further. You can customize the message shown to the user from the NLP Standard Responses.
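
The dialog selection described above can be sketched in a few lines of Python. This is only an illustration of the rules as stated in this section, not the platform's implementation: the `Match` class, its fields, and the returned labels are hypothetical names. It also assumes that a single winner that is not an unsure FAQ is executed directly, and that the default intent is used when there is no winner.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Match:
    """Hypothetical container for one winner returned by Ranking and Resolver."""
    intent: str
    engine: str            # "ML", "FM", or "KG"
    definitive: bool       # True for a Definitive match, False for a Possible match
    is_faq: bool = False   # True when the intent is a Knowledge Graph FAQ
    unsure: bool = False   # True when the KG score lies between the lower and upper thresholds


def next_action(winners: List[Match]) -> str:
    """Map the list of winners to the bot's next step."""
    if not winners:
        return "default_intent"
    definitive = [m for m in winners if m.definitive]
    if len(definitive) > 1:
        # More than one Definitive match across engines.
        return "disambiguation_dialog"
    if len(winners) > 1:
        # More than one winner, but not several Definitive matches.
        return "did_you_mean_dialog"
    only = winners[0]
    if only.is_faq and only.unsure:
        # A single FAQ whose score sits between the two KG thresholds.
        return "did_you_mean_dialog"
    return "execute"
```

For example, `next_action([Match("Find ATM", "FM", True)])` returns `"execute"`, while two Definitive matches from different engines return `"disambiguation_dialog"`.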

Learn more about model scores and resolver.

Thresholds & Configuration

To configure the Ranking and Resolver Engine, follow these steps:

Open the bot for which you want to configure thresholds.
Select the Build tab from the top menu.
From the left navigation click Natural Language > Thresholds & Configurations.
The Ranking & Resolver Engine section lets you configure the following settings:
- Prefer Definitive Matches prioritizes definitive matches over probable matches. This setting is enabled by default, and you can disable it. When enabled, definitive matches win and probable matches are discarded; if there is no definitive match, the probable matches are rescored. When disabled, all matches, definitive and probable, are considered for rescoring, and the end-user gets to choose the right intent in case of any ambiguity.
- Rescoring of Intents can be turned off so that all the qualified intents from the different intent engines are treated as winning intents. If only one intent qualifies, it is considered the winner. If more than one qualifies, the user is presented with the results for disambiguation.
- Proximity of Probable Matches defines the maximum difference allowed between the top-scoring intent and the next possible intents for them to be considered equally important. Before v7.3 of the platform, this setting was under the Fundamental Meaning section.
- Dependency Parsing Model enables rescoring of intents by the Ranking and Resolver engine as well as intent recognition by the Fundamental Meaning model. This configuration is disabled by default and needs to be enabled explicitly. See below for details.
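
As a rough sketch of how the first two settings change what reaches the end-user, consider the following Python. The `Candidate` type and both function names are hypothetical, and the scores used with them would be made-up values; the code only mirrors the behavior described in the list above and is not the platform's code.

```python
from typing import List, NamedTuple, Tuple


class Candidate(NamedTuple):
    """Hypothetical record for one match reported by an engine."""
    intent: str
    score: float
    definitive: bool


def select_pool(candidates: List[Candidate], prefer_definitive: bool = True) -> List[Candidate]:
    """Apply the Prefer Definitive Matches setting."""
    definitive = [c for c in candidates if c.definitive]
    if prefer_definitive and definitive:
        # Definitive matches win; probable matches are discarded.
        return definitive
    # No definitive match, or the setting is disabled: all matches are rescored.
    return list(candidates)


def winners_without_rescoring(qualified: List[Candidate]) -> Tuple[List[Candidate], bool]:
    """With Rescoring of Intents turned off, every qualified intent is a winner.

    Returns the winners and whether the user must choose among them.
    """
    winners = list(qualified)
    return winners, len(winners) > 1
```

With one definitive and one probable candidate, `select_pool` keeps only the definitive one by default and keeps both when `prefer_definitive=False`.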

Dependency Parsing Model

The platform has two models for scoring intents by the Fundamental Meaning Engine and the Ranking & Resolver Engine:

The first model predominantly relies on the presence of words, the position of words in the utterance, etc. to determine the intents and is scored solely by the Fundamental Meaning Engine. This is the default setting.
The second model is based on the dependency matrix where the intent detection is based on the words, their relative position, and most importantly the dependency between the keywords in the sentence. Under this model, intents are scored by the Fundamental Meaning Engine and then rescored by the Ranking and Resolver Engine.

Dependency Parsing Model can be enabled and configured from the Ranking and Resolver section under Training > Thresholds & Configurations.

Note: This feature was introduced in v7.3 of the platform and is supported only for select languages; see the list of supported languages.

Dependency Parsing Model can be configured as follows:

Minimum Match Score defines the minimum score required to qualify an intent as a probable match. It can be set to a value between 0.0 and 1.0, with the default set to 0.5.
Advanced Configurations are used to customize the model by changing the weights and scores associated with various parameters. This opens a JSON editor where you can enter valid JSON. Click Restore to Default Configurations to get the default threshold settings in a JSON structure. Change these settings only if you are aware of the consequences.
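
The effect of Minimum Match Score can be illustrated with a small Python sketch. The function names and the sample scores are hypothetical, and the sketch assumes that a score equal to the threshold qualifies, which this page does not specify. It does not model the Advanced Configurations JSON.

```python
from typing import Dict

DEFAULT_MIN_MATCH_SCORE = 0.5  # default stated in this section


def validate_min_match_score(value: float) -> float:
    """Reject values outside the documented 0.0 to 1.0 range."""
    if not 0.0 <= value <= 1.0:
        raise ValueError("Minimum Match Score must be between 0.0 and 1.0")
    return value


def probable_matches(scores: Dict[str, float],
                     min_match_score: float = DEFAULT_MIN_MATCH_SCORE) -> Dict[str, float]:
    """Keep only intents whose score reaches the minimum match score."""
    threshold = validate_min_match_score(min_match_score)
    # Assumption: a score equal to the threshold still qualifies.
    return {intent: s for intent, s in scores.items() if s >= threshold}
```

With the made-up scores `{"Find ATM": 0.72, "Transfer Funds": 0.31}` and the default threshold, only `Find ATM` remains a probable match.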

NLP Detection

Natural language analysis results in one of the following scenarios:

NLP Analysis identifying a Definitive match with the FM, ML, or KG engine.
NLP Analysis with multiple engines returning probable matches and the resolver selecting a single match.
NLP Analysis with multiple engines returning probable matches and the resolver returning multiple results.
NLP Analysis with no match.

Each of the above cases is discussed in this section.

To understand NLP detection, let us use the example of a Bank bot with the following details:

The bot consists of 5 Dialog Tasks and a Default Dialog.
The intents are trained with Synonyms, Patterns, and ML utterances.
The bot consists of a knowledge graph defined with 86 FAQs distributed in 4 top-level terms.

Scenario 1 – FM Identifying a Definitive Match

The Fundamental Meaning (FM) model identified the utterance as a Definitive match.
The Machine Learning (ML) model also identified it as a Possible match.
The score returned for the identified task is 6 times higher than the other intent scores. Also, all the words in the intent name are present in the user utterance. Thus, the FM model termed it a Definitive match.
The ML model matches the Find ATM intent as a Probable match.

Scenario 2 – ML Identifying a Definitive Match

The ML Model returns a Definitive match with other models returning no match.
The FM model could not identify this task because none of the words in the task name Transfer Funds matched the words in the user utterance.

Scenario 3 – KG Identifying a Definitive Match

The user utterance is "How do I make transfer money to a London account?"
The user utterance contains all the terms required to match the Knowledge Graph intent path Transfer, Money, International.
The word London, which the user used in the utterance, is identified as a synonym of the term International.
Because 100% of the path terms matched, the path qualified. As part of confidence scoring, the terms in the user query are found to be similar to those of the actual Knowledge Graph question. Thus, the engine returns a score of 100.
Because the score returned is 100, the intent is marked as a Definitive match and selected.
The FM engine found a Probable match because the key term Transfer is present in the user utterance.
The ML engine also found a Probable match because the utterance did not fully match any trained utterance.
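
The path-term check in this scenario can be approximated with the Python sketch below. It is a simplified illustration, not the Knowledge Graph engine: the function name is hypothetical, and it uses plain lowercase word matching, whereas the real engine may apply further language processing.

```python
import re
from typing import Dict, List


def path_coverage(utterance: str, path_terms: List[str],
                  synonyms: Dict[str, List[str]]) -> float:
    """Return the percentage of path terms found in the utterance.

    A term counts as matched if the term itself or one of its synonyms
    appears as a word in the utterance.
    """
    if not path_terms:
        raise ValueError("path_terms must not be empty")
    words = set(re.findall(r"[a-z]+", utterance.lower()))
    matched = 0
    for term in path_terms:
        accepted = {term.lower()} | {s.lower() for s in synonyms.get(term, [])}
        if words & accepted:
            matched += 1
    return 100.0 * matched / len(path_terms)


utterance = "How do I make transfer money to a London account?"
path = ["Transfer", "Money", "International"]
synonyms = {"International": ["London"]}

coverage = path_coverage(utterance, path, synonyms)  # 100.0
```

Without the London synonym, only two of the three path terms match and the path no longer reaches 100% coverage.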

Scenario 4 – Multiple Engines Returning Probable Match

All 3 engines returned possible matches and no definitive match.
The ML Model has 1 possible match and the FM Model has 2 possible matches, of which 1 is common. The Knowledge Graph has 1 possible match. All the possible matches identified are re-ranked in the Ranking and Resolver.
The Ranking and Resolver component returned the highest score for the single match (Task name – When can I start making payments using BillPay plus?) from the Knowledge Graph engine. The scores for the other probable matches do not fall within 2% of the top score and are thus ignored. The winner, in this case, is the query returned by the KG engine, and it is presented to the user.
Though most of the keywords in the user utterance map to the keywords in the KG query, this is still not a definitive match because:
- The number of path terms matched is not 100%.
- The KG engine returned a score of 64.72%. Had we used the word Billpay instead of bill pay, the score would have been 87.71% (still not a 100% match).
- Because the score falls between the 60% and 80% thresholds, the query is presented as part of the Did-you-mean dialog and not as a definitive winner. If the score were above 80%, the platform would have given the response without reconfirming through the Did-you-mean dialog.
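
The threshold behavior in this scenario can be summarized with a short Python sketch. The function name and the returned labels are hypothetical. The 60% and 80% values are the ones quoted in this scenario; the handling of a score of exactly 80 and of scores below 60 is an assumption, since the text does not spell it out.

```python
def kg_outcome(score: float, lower: float = 60.0, upper: float = 80.0) -> str:
    """Classify a KG engine score (in percent) against the two thresholds."""
    if score > upper:
        # Above the upper threshold: respond without reconfirming.
        return "respond_directly"
    if score >= lower:
        # Between the thresholds: present as a Did-you-mean suggestion.
        return "did_you_mean"
    # Assumption: below the lower threshold the FAQ does not qualify.
    return "not_qualified"
```

Here `kg_outcome(64.72)` returns `"did_you_mean"`, and `kg_outcome(87.71)` returns `"respond_directly"`, matching the two scores discussed above.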

Scenario 5 – Resolver Returning Multiple Results

All the engines detected probable matches.
KG returned 2 possible paths.
Ranking and Resolver found that the scores of the 2 queries are less than 2% apart.
Both Knowledge Graph intents are selected and presented to the user as Did-you-mean suggestions.
Both paths were selected because the terms in both paths matched and the score for each path is more than 60%.
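
A minimal Python sketch of the proximity check follows. The function name and the sample scores are made up for illustration, and the sketch assumes that the 2% proximity is measured relative to the top score; the platform may compute the difference differently.

```python
from typing import Dict, List


def within_proximity(scores: Dict[str, float], proximity_pct: float = 2.0) -> List[str]:
    """Return the intents whose score is within proximity_pct of the top score."""
    if not scores:
        return []
    top = max(scores.values())
    # Assumption: proximity is a percentage of the top score.
    cutoff = top * (1 - proximity_pct / 100)
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return [intent for intent, s in ranked if s >= cutoff]


# Hypothetical scores: two FAQs close together and one task further behind.
winners = within_proximity({"FAQ A": 71.0, "FAQ B": 70.2, "Task C": 55.0})
# winners == ["FAQ A", "FAQ B"]
```

When more than one intent survives the check, as here, the bot would present them as Did-you-mean suggestions; when only one survives, that intent is the single winner, as in Scenario 4.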

Scenario 6 – No match

None of the engines could identify any trained intent or Knowledge Graph intent.
In this scenario, the default intent is triggered.
