A robot that understands natural language knowledge graph queries (part 1)

SHAUNA, 2021-02-09. Last updated 2021-08-11

This is part 1 of a series of posts as I explore how to let people access and modify the knowledge graph in a robot using natural language. This post covers the general idea of a natural language interface to a knowledge graph database, then describes how I implemented the first step: named entity recognition. Further details will follow in future posts. Stay tuned!

A brief intro

After much preparation in the form of the recent paper, it is finally time to actually design and implement knowledge exchange interactions. The next research project aims to let the robot automatically perform suitable voice and GUI interactions in a knowledge exchange scenario.

The project is inspired by the work of Heenan et al.[1], where they applied a model of human greeting by Kendon[2] to design a state machine for social greeting interactions in robots. We also wanted to derive something similar to a state machine, or at least a set of rules for choosing pre-designed interactions, that can be implemented as robot programs. This way, the robot can really interact with humans without our intervention.

Of course, this means that we can no longer rely on Wizard-of-Oz techniques to test our design. First, we will want to test the effectiveness of our state machine (or rules). It makes no sense to use staged interactions here. Second, it would be impossible to manually do all of the following in time:

  1. read the current situation
  2. convert it to graph database commands
  3. type in the commands
  4. observe the results
  5. decide on what speech and interface to use according to our design
  6. press the remote control buttons to tell the robot what to do

It would be an absolute nightmare for the experimenter!

Anyway, I set out to explore how I might automate these steps, especially steps 1-4. I wanted the user to be able to search, add, modify, and delete things from a graph database by speaking. In research lingo, I wanted to implement a natural language interface for a knowledge graph (KG) database.

A natural language interface for a KG database

There is a lot of recent research on this topic, given that both natural language processing and knowledge graphs are hot right now. Most of the works I referenced can be (very roughly) summarized into the same three steps:

  1. Extract parts of speech as “evidence[3]” for relations and entities
  2. Construct potential semantic relations based on the evidence
  3. Connect the relations together as a graph, which can in turn be translated into queries for graph databases.
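
To make the three steps concrete, here is a toy trace for the question "Do you know anyone who studies human-robot interaction?" All of the data structures and the final query are illustrative, not the actual implementation:

```python
# Step 1: evidence extracted from the sentence
evidence = {"relations": ["studies"], "entities": ["human-robot interaction"]}

# Step 2: a candidate semantic relation; "?person" marks the unknown argument
relation = ("?person", "studies", "human-robot interaction")

# Step 3: the relation rendered as a graph-database (Cypher-style) query
query = "MATCH (p)-[:STUDIES]->(t {name: 'human-robot interaction'}) RETURN p"
print(relation[0])  # the unknown that the query will resolve
```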

Some works use more complex methods (such as Hu et al.[3:1]) for steps 1 and 2, but the most straightforward ones rely on named entity recognition (NER) to provide entities[4], then use grammar parsers to group entities and identify relations[5].

The more complex methods make fewer assumptions about the sentence and require less manual processing. However, for the simple laboratory setting we are working in, a simple algorithm that can handle basic sentence variations should be enough. Therefore, I settled on a simpler approach:

  1. An NER model treats both relations and entities as named entities.
  2. The Hungarian algorithm matches entities to their suitable slots in the relations, leaving only the unknowns.
  3. Simple heuristics suffice to connect and merge relation clauses into the final graph structure.
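
As a sketch of step 2: assigning extracted entities to the argument slots of a relation is an assignment problem over some cost (e.g. distance in the sentence). A real implementation would use the Hungarian algorithm (e.g. scipy's `linear_sum_assignment`); for a tiny illustration, a brute-force search over permutations finds the same optimal assignment. The cost values below are made up:

```python
from itertools import permutations

def best_assignment(cost):
    """cost[i][j] = cost of putting entity i into slot j.
    Returns a tuple perm where entity i goes into slot perm[i]."""
    n = len(cost)
    best = None
    for perm in permutations(range(n)):
        total = sum(cost[i][perm[i]] for i in range(n))
        if best is None or total < best[0]:
            best = (total, perm)
    return best[1]

# Entity 0 is "close" to slot 1, entity 1 is "close" to slot 0
cost = [[3, 1],
        [0, 4]]
print(best_assignment(cost))  # -> (1, 0): entity 0 fills slot 1, entity 1 fills slot 0
```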

Although it all seems very abstract and complicated at this point, there are actually many existing frameworks that help implement each step. Everything I’ve done will be open-sourced once the project is complete (sometime around March 2021). So below I will only describe the most essential parts of the process.

Step 1: NER

For the NER part, Chatette can help generate a pretty decent dataset. Rasa NLU makes it easy to train an NER model and implement custom actions in Python. Neo4j is an easy-to-use graph database with a Python API, and its Cypher query language is simple enough to generate from Python code.
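
For a taste of how a recognized relation/entity pair could become a Cypher query, here is a minimal sketch. The node labels (`Person`, `Topic`), relationship type, and property names are hypothetical placeholders for whatever schema the graph actually uses:

```python
def build_query(relation: str, entity: str):
    """Build a parameterized Cypher query plus its parameter dict."""
    query = (
        f"MATCH (p:Person)-[:{relation.upper()}]->(t:Topic {{name: $name}}) "
        "RETURN p.name"
    )
    return query, {"name": entity}

query, params = build_query("study", "human-robot interaction")
print(query)
# MATCH (p:Person)-[:STUDY]->(t:Topic {name: $name}) RETURN p.name
```

With the official neo4j Python driver, the pair can then be executed as `session.run(query, params)`; parameterizing the entity value avoids injection problems from free-form speech input.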

Dataset generation with Chatette

Chatette can, according to its documentation, “generate large datasets of example data for Natural Language Understanding tasks without too much of a headache.” [6]

The syntax of Chatette is simple enough. You define synonyms using ~[name], and define slots using @[name]. What I ended up doing was very conventional stuff, without touching any of the fancy syntax.

The important point is that I defined slots for both entities and relations, as in the simplified example below:


@[HRI]
    human-robot interaction

%[ask question]
    Do you know anyone who @[study] @[HRI]?

This allows me to avoid building complex rules for a Chinese syntax parser (which was not readily available) to detect relationships, at the expense of potentially missing out on some relations. I will make up for that using some heuristics in step 3.

Training NER models using Rasa NLU

To use the generated dataset with Rasa, I first had to do some format conversion. In the latest version of Rasa, they switched their dataset format from JSON and Markdown to YAML. This posed a problem, as Chatette does not support exporting YAML.

Fortunately, Chatette supports old Rasa Markdown, and Rasa provided a tool to convert Markdown datasets to YAML. All I had to do was to specify rasamd as the output format in Chatette, and convert it using the command line tool.

chatette -a rasamd -f -o [temp_dir] ask.chatette
rasa data convert nlu -f yaml --data=[temp_dir] --out=[output_dir]

However, Chinese text poses yet another challenge — there’s no whitespace to delimit the words. Fortunately, Rasa provides a spaCy-based pipeline that supports Chinese text.
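
The pipeline can then be configured in Rasa's config.yml roughly as follows. This is a sketch: the exact component options and defaults depend on the Rasa version in use.

```yaml
language: zh

pipeline:
  - name: SpacyNLP
    model: zh_core_web_md
  - name: SpacyTokenizer
  - name: SpacyFeaturizer
  - name: RegexFeaturizer
  - name: DIETClassifier
    epochs: 100
```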

pip install spacy zh_core_web_md
python -m spacy link zh_core_web_md zh --force

The training is fairly straightforward. All I had to do was run the following commands to train and test the bot:

python -m rasa train
python -m rasa interactive

If all goes well, when I input “Got anyone here who studies human-robot interaction,” the model should label the sentence as:

Got anyone here who [studies](study) [human-robot interaction](HRI)?
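
The labeled output above uses the Markdown-style annotation `[text](label)`. Parsing it back into (text, label) pairs takes one regular expression; this is just for illustration, since in practice Rasa returns the extracted entities as structured JSON:

```python
import re

def parse_annotations(text):
    """Extract (entity text, label) pairs from Markdown-style annotations."""
    return re.findall(r"\[([^\]]+)\]\(([^)]+)\)", text)

print(parse_annotations(
    "Got anyone here who [studies](study) [human-robot interaction](HRI)?"
))
# -> [('studies', 'study'), ('human-robot interaction', 'HRI')]
```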

  1. B. Heenan, S. Greenberg, S. Aghel-Manesh, and E. Sharlin, “Designing social greetings in human robot interaction,” in Proceedings of the 2014 conference on Designing interactive systems, Vancouver BC Canada, Jun. 2014, pp. 855–864, doi: 10.1145/2598510.2598513. ↩︎

  2. A. Kendon, Conducting Interaction: Patterns of Behavior in Focused Encounters. Cambridge University Press, 1990. ↩︎

  3. S. Hu, L. Zou, J. X. Yu, H. Wang, and D. Zhao, “Answering Natural Language Questions by Subgraph Matching over Knowledge Graphs,” IEEE Trans. Knowl. Data Eng., vol. 30, no. 5, pp. 824–837, May 2018, doi: 10.1109/TKDE.2017.2766634. ↩︎ ↩︎

  4. C. Sun, “A Natural Language Interface for Querying Graph Databases,” p. 36. ↩︎

  5. F. Li and H. V. Jagadish, “Constructing an interactive natural language interface for relational databases,” Proc. VLDB Endow., vol. 8, no. 1, pp. 73–84, Sep. 2014, doi: 10.14778/2735461.2735468. ↩︎

  6. Not that it is completely headache-free, as I have experienced. ↩︎