Knowledge Graph

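Kore.ai's Knowledge Graph helps you turn static FAQ text into an intelligent, personalized conversational experience. It goes beyond the usual practice of capturing FAQs as flat question-answer pairs. Instead, the Knowledge Graph lets you create an ontological structure of key domain terms and associate them with context-specific questions and their alternatives, synonyms, and machine-learning-enabled traits. Once trained by the platform, this graph enables an intelligent FAQ experience.

This document explains the concepts, terminology, and implementation of the Knowledge Graph. For a use-case-driven approach to the Knowledge Graph, refer here.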

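Why Knowledge Graph

Users can phrase the same query in many ways. Anticipating and manually adding every alternative question is a difficult task.

Kore.ai designed the Knowledge Graph around nodes, tags, and synonyms, which makes it easier to cover all the possible matches. Trained with nodes, tags, and synonyms, the Knowledge Graph can handle a wide range of alternative questions.

Whenever a user asks a question, the node names in the Knowledge Graph are checked and matched against keywords from the user utterance. Node names, tags, and synonyms are checked and, based on the score, questions are shortlisted as likely matches or intents. The shortlisted questions are then compared with the actual user utterance to arrive at the best possible intent to present to the user. The response takes the form of either a simple response or the execution of a dialog task.

This way, by adding a few completely different alternative questions to an FAQ and providing appropriate tags, synonyms, and node names, you can also match questions the bot was never trained on. The performance and intelligence of the Knowledge Graph depend on how you train it with appropriate node names, tags, and synonyms.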

Terminology

This section is intended to familiarize you with the terms used in building a Knowledge Graph.

Jump directly to KG Creation.

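Terms or Nodes

Terms or nodes are the building blocks of an ontology and are used to define the fundamental concepts and categories of a business domain.

As shown in the image below, you can organize the terms in the left pane of the Bot Ontology window hierarchically to represent the flow of information in your organization. You can create, organize, edit, and delete terms from there. The platform limits a Knowledge Graph to a maximum of 20,000 nodes and 50,000 FAQs.

For easier representation, some special nodes are identified by the following names: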

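Root Node

The root node is the topmost term of your Bot Ontology. A Knowledge Graph has exactly one root node, and all other nodes in the ontology are its child nodes. The root node takes the name of the bot by default, but you can change it if needed. This node is not used for node qualification or processing; path qualification starts from the first-level nodes. Placing FAQs directly under the root node is not advisable, but if you must, restrict the number of FAQs at the root node to a maximum of 100.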

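First-level Nodes

The nodes immediately below the root node are known as first-level nodes. A collection can have any number of first-level nodes. It is recommended to reserve first-level nodes for high-level terms such as the names of departments or functions, for example, Personal Banking, Online Banking, and Corporate Banking.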

Leaf Node

Any node to which a question-answer set or a dialog task is added is called a leaf node, regardless of its level.

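Node Relation

Depending on its position in the ontology, a node is referred to as a first-level node, a second-level node, and so on. Put simply, a first-level node is a category that can have one or more subcategories under it, called second-level nodes.

For example, Loan is the first-level node for Home Loan and Personal Loan. Personal Loan can in turn have two subcategory nodes: Rate and Fees, and Help and Support.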
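
The Loan example above can be pictured as a small tree. The following is only an illustrative sketch: the `Node` class and the `paths` helper are hypothetical names introduced here and are not part of the Kore.ai platform.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical in-memory stand-in for a Knowledge Graph term/node."""
    name: str
    children: List["Node"] = field(default_factory=list)


def paths(node, prefix=()):
    """Yield every path of node names that starts at `node`."""
    current = prefix + (node.name,)
    yield current
    for child in node.children:
        yield from paths(child, current)


# The Loan example: one first-level node with two second-level nodes,
# one of which has two subcategory nodes of its own.
loan = Node("Loan", [
    Node("Home Loan"),
    Node("Personal Loan", [Node("Rate and Fees"), Node("Help and Support")]),
])

all_paths = list(paths(loan))
for path in all_paths:
    print(" > ".join(path))
```

Each printed line is one path through the hierarchy, starting at the first-level node.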

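Note: This hierarchical organization of nodes is a convenience for keeping related questions together. The Knowledge Graph engine does not consider parent-child relations when evaluating questions for a match. The hierarchy does not influence FAQ matching in any way, because all nodes are treated the same regardless of their position in the FAQ organization.

You can add custom tags to each term/node. Tags work exactly like terms, but they are not displayed in the Knowledge Graph ontology, to avoid clutter. As with terms, you can add synonyms and traits to tags.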

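Synonyms

Users use a variety of alternatives for the terms in your ontology. The Knowledge Graph lets you add synonyms to a term to cover all of its alternative forms. Adding synonyms also reduces the need to train the bot with alternative questions.

For example, the Internet Banking node may have the following synonyms added to it: Online Banking, e-banking, Cyberbanking, and Web banking.

When you add a synonym for a term in the Knowledge Graph, you can add it as a local or a global synonym. Local synonyms (or path-level synonyms) apply to the term only in that particular path, whereas global synonyms (or Knowledge Graph synonyms) apply to the term even when it appears on any other path in the ontology.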
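
The difference in scope between local and global synonyms can be sketched as a lookup keyed by term alone (global) or by path and term (local). This is a rough illustration, not the platform's storage format: the dictionaries, the `synonyms_for` helper, and the local synonym "netbanking" are all assumptions made for the example.

```python
# Global (Knowledge Graph) synonyms: apply to the term on every path.
GLOBAL_SYNONYMS = {
    "internet banking": {"online banking", "e-banking", "web banking"},
}

# Local (path-level) synonyms: apply to the term only on the given path.
# "netbanking" is a made-up synonym used purely for illustration.
LOCAL_SYNONYMS = {
    (("personal banking", "internet banking"), "internet banking"): {"netbanking"},
}


def synonyms_for(term, path):
    """Return the synonyms of `term` that are in effect on `path`."""
    found = set(GLOBAL_SYNONYMS.get(term, set()))
    found |= LOCAL_SYNONYMS.get((tuple(path), term), set())
    return found


on_personal = synonyms_for("internet banking", ["personal banking", "internet banking"])
on_corporate = synonyms_for("internet banking", ["corporate banking", "internet banking"])
print(sorted(on_personal))
print(sorted(on_corporate))
```

In this sketch the local synonym is visible only on the Personal Banking path, while the global synonyms are returned for both paths.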

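From release 7.2 onward, you can enable the use of Bot Synonyms inside the Knowledge Graph engine for path qualification and question matching. With this setting enabled, you do not need to recreate the same set of synonyms in both Bot Synonyms and KG Synonyms.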

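Traits

Note: From v7.0, Traits replace the Classes of v6.4 and earlier.

A trait is a collection of typical end-user utterances that define the nature of a question when the user asks for information related to a particular intent. See here for more on traits.

A trait can be applied to multiple terms across your Bot Ontology.

Note: Traits also help filter nodes based on associated user utterances. If the user types an utterance that is present in a trait, the bot searches only the nodes to which that trait is applied. If the utterance is present in any other node to which the trait is not applied, the bot ignores that node.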
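
The filtering behavior described in the note can be sketched as follows. This is a simplified illustration under stated assumptions: on the platform traits are machine-learning-enabled, whereas this sketch detects a trait by plain substring matching, and the trait name, sample phrases, and node names are all hypothetical.

```python
# Hypothetical trait: name -> sample utterance phrases that signal it.
TRAITS = {"issue": ["not working", "unable to", "failed"]}

# Hypothetical nodes and the traits applied to each.
NODE_TRAITS = {
    "Internet Banking": {"issue"},
    "Home Loan": set(),
    "Credit Card": {"issue"},
}


def detect_traits(utterance):
    """Return the traits whose sample phrases occur in the utterance."""
    text = utterance.lower()
    return {name for name, phrases in TRAITS.items()
            if any(phrase in text for phrase in phrases)}


def candidate_nodes(utterance):
    """Search only nodes carrying a detected trait; otherwise search all nodes."""
    detected = detect_traits(utterance)
    if not detected:
        return sorted(NODE_TRAITS)
    return sorted(node for node, traits in NODE_TRAITS.items() if traits & detected)


print(candidate_nodes("My login is not working"))
print(candidate_nodes("What is the interest rate?"))
```

When a trait is detected, nodes without that trait (here, Home Loan) are left out of the search; when no trait is detected, every node remains a candidate.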

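Intent

A bot can respond to a user's question either by executing a dialog task or with an FAQ.

FAQ: The question-answer pairs must be added to the relevant nodes in your Bot Ontology. A maximum of 50,000 FAQs is permitted.
- Different users ask the same question differently; to support this, you can associate multiple alternative forms with each question.
- Preceding an alternative question with || lets you enter patterns for FAQs (release 7.2 onward).
Task: Linking a dialog task to a KG intent lets you combine the capabilities of the Knowledge Graph and dialog tasks to handle FAQs that involve complex conversations.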

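Improving Performance

The Knowledge Graph engine works with its default settings, but as a bot developer you can fine-tune the performance of the KG engine in several ways:

Configure the Knowledge Graph by defining terms, synonyms, primary and alternative questions, or user utterances. Although the hierarchy does not affect KG engine performance, it helps organize and guide the working of the KG engine.
Set the following parameters:
- Path Coverage: Define the minimum percentage of terms in the user's utterance that must be present in a path for it to qualify for further scoring.
- Definite Score for KG: Define the minimum score at which a KG intent match is considered a definite match, so that any other intent matches found are discarded.
- Minimum and Definitive Level for Knowledge Tasks: Define the minimum and definitive thresholds for identifying and responding to a knowledge task.
- KG Suggestions Count: Define the maximum number of KG/FAQ suggestions to present when a definite KG intent match is not available.
- Proximity of Suggested Matches: Define the maximum difference allowed between the top-scoring question and the next suggested question for them to be considered equally important.
The platform provides default values for these thresholds, and you can customize them from Natural Language > Training > Thresholds & Configurations.
Qualify Contextual Paths: This ensures that the bot context is populated with the terms/nodes of the matched intent and retains them, which makes the user experience smoother.
Traits: As mentioned earlier, traits can be used to qualify nodes/terms even if the user utterance does not contain the term/node. Traits also help filter the suggested intent list.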
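
One way to picture how the score-related thresholds above could interact is sketched below. This is an interpretation, not the platform's implementation: the `Thresholds` class, the `select` function, and all numeric values are illustrative assumptions (they are not the platform defaults), and proximity is read here as "within a given distance of the top score".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    # Illustrative values only; these are NOT the platform defaults.
    minimum_level: float = 60.0      # below this, a match is ignored
    definitive_level: float = 93.0   # at or above this, a match is definite
    suggestions_count: int = 3       # maximum number of suggestions to show
    proximity: float = 5.0           # maximum gap from the top score


def select(scored, t=Thresholds()):
    """scored: list of (question, score on a 0-100 scale).

    Returns ("definite", [q]), ("suggestions", [q, ...]) or ("no_match", []).
    """
    ranked = sorted((s for s in scored if s[1] >= t.minimum_level),
                    key=lambda s: s[1], reverse=True)
    if not ranked:
        return ("no_match", [])
    top_question, top_score = ranked[0]
    if top_score >= t.definitive_level:
        # A definite match discards every other candidate.
        return ("definite", [top_question])
    # No definite match: suggest the questions that score close to the top one.
    close = [q for q, score in ranked if top_score - score <= t.proximity]
    return ("suggestions", close[:t.suggestions_count])


print(select([("a", 95), ("b", 80)]))
print(select([("a", 80), ("b", 78), ("c", 70), ("d", 50)]))
```

Path coverage is not modeled here because it acts earlier, when paths are qualified for scoring.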

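Working of KG Engine

The Knowledge Graph engine uses a two-step approach to extract the right response to a user utterance. It combines a search-driven intent detection process with rule-based filtering. The settings for path coverage (the percentage of terms needed) and term usage (mandatory or optional) in the user utterance help with the initial filtering of FAQ intents. Tokenization and an n-gram-based cosine scoring model then help satisfy the final search criteria.

Training the Knowledge Graph involves the following steps:

- All the terms/nodes, along with their synonyms, are identified and indexed.
- Using these indices, a flattened path is established for each KG intent.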
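
The two training steps can be sketched roughly as follows. The `build_index` function and its data shapes are assumptions made for illustration; the source does not describe how the engine stores its indices.

```python
def build_index(intents, synonyms):
    """Index terms and synonyms, then flatten each intent's path.

    intents:  {question: [terms on its path]}
    synonyms: {term: [synonyms of that term]}
    Returns (term_index, flat_paths). Every term maps to an integer index,
    each synonym shares the index of its term, and every intent maps to the
    set of indices found on its path.
    """
    term_index = {}
    next_id = 0
    for path in intents.values():
        for term in path:
            if term not in term_index:
                term_index[term] = next_id
                next_id += 1
            for synonym in synonyms.get(term, []):
                term_index.setdefault(synonym, term_index[term])
    flat_paths = {question: frozenset(term_index[term] for term in path)
                  for question, path in intents.items()}
    return term_index, flat_paths


# Hypothetical FAQs and synonym used only for this example.
intents = {
    "How do I apply for a home loan?": ["loan", "home loan"],
    "What are the personal loan fees?": ["loan", "personal loan", "fees"],
}
synonyms = {"fees": ["charges"]}

term_index, flat_paths = build_index(intents, synonyms)
print(term_index)
print(flat_paths)
```

Because a synonym shares the index of its term, an utterance that says "charges" would resolve to the same path entry as one that says "fees".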

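Once the Knowledge Graph engine receives a user utterance:

- The user utterance and the KG nodes/terms are tokenized, and n-grams are extracted (the Knowledge Graph engine supports up to quad-grams).
- The tokens are mapped to the KG nodes/terms to obtain their respective indices.
- A path comparison between the user utterance and the KG nodes/terms establishes the qualified paths for that utterance. This step takes into account the path coverage and term usage mentioned above.
- From the list of questions in the qualified paths, the best match is picked based on cosine scoring.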
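
The runtime steps above can be approximated in a few lines. This is a minimal sketch under several assumptions: path coverage is read here as the share of a path's terms found in the utterance, the 0.5 default is illustrative, term usage (mandatory or optional) and synonyms are omitted, and the tokenizer and n-gram count vectors are simple stand-ins for whatever the engine does internally.

```python
import math
import re
from collections import Counter

MAX_N = 4  # the engine is described as supporting up to quad-grams


def tokens(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def ngrams(words, max_n=MAX_N):
    """Return all n-grams of the word list for n = 1..max_n."""
    return [" ".join(words[i:i + n])
            for n in range(1, max_n + 1)
            for i in range(len(words) - n + 1)]


def cosine(a, b):
    """Cosine similarity between the n-gram count vectors of two texts."""
    va, vb = Counter(ngrams(tokens(a))), Counter(ngrams(tokens(b)))
    dot = sum(va[g] * vb[g] for g in va.keys() & vb.keys())
    norm = (math.sqrt(sum(c * c for c in va.values()))
            * math.sqrt(sum(c * c for c in vb.values())))
    return dot / norm if norm else 0.0


def best_match(utterance, faqs, path_coverage=0.5):
    """faqs: {question: [path terms]}. Returns (question, score) or None."""
    grams = set(ngrams(tokens(utterance)))
    # Qualify a path when enough of its terms appear in the utterance.
    qualified = [q for q, path in faqs.items()
                 if path and sum(t in grams for t in path) / len(path) >= path_coverage]
    if not qualified:
        return None
    # Among questions on qualified paths, pick the highest cosine score.
    return max(((q, cosine(utterance, q)) for q in qualified), key=lambda p: p[1])


# Hypothetical FAQs used only for this example.
faqs = {
    "How do I apply for a home loan?": ["loan", "home loan"],
    "What are the personal loan fees?": ["loan", "personal loan", "fees"],
}
result = best_match("apply for home loan", faqs)
print(result)
```

In this example only the home loan path qualifies, because the utterance covers just one of the three terms on the personal loan path.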
