Natural-p: Programming in Unconstrained Natural Language

Author: Sung-Young Jung
Date: 4 Aug 2011

Natural-p is a system that reads unconstrained natural language (English) and executes it like a computer programming language. In effect, the natural language text itself is interpreted as the source code of a programming language, and the system performs the task using inferences similar to those a human makes when reading the sentences.

The target users are non-programmers. Natural-p is a system to which users can teach procedural knowledge in natural language. This technology is essential for teachable agents that learn tasks from users, authoring systems to which non-programmers can add procedural knowledge, humanoid robots that a human can teach what to do and how to do it, and electronic devices that users can teach what they want done.

  • Real World Text: A Physics Example
  • Loops: Summing a Sequence of Numbers
  • Function: Summing a Sequence of Numbers
  • Condition: Choosing a Bigger Number
  • Example: Teachable Agents
  • Example: Addition Procedure of Two Numbers
  • Screen Capture of Natural-p
  • Remark

    Real World Text: A Physics Example

    Natural-p is derived from Natural-k (Thesis, presentation), which was designed to read physics problem statements from textbooks and solve them. Natural-p can therefore read real-world text to perform tasks. The following problem is from physics. The system can read it, interpret it, solve it, and answer the question.

    The following is the output of executing Natural-p. The internal inference results are displayed, and the answer is in the last sentence. Everything is in natural language, so it is readable by humans, which helps a user find a bug if one exists.

    Such inferences require background knowledge. Background knowledge can be written in natural language in the form of rules. A background rule has conditions on the left side and actions on the right side. The system finds a background rule whose condition is matched by an input sentence, and triggers that rule. Natural-p compares the parse trees of the two sentences to decide whether they match. The system uses WordNet to match 'skateboarder' to 'object', and there are a few predefined variables for units, such as 'unitSpeed' for 'm/s' and 'unitAngle' for 'degrees'.
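    The matching step can be illustrated with a small Python sketch. It is a simplification under stated assumptions, not Natural-p's implementation: it compares flat token lists instead of parse trees, a hand-written table (HYPERNYMS) stands in for WordNet, and the example sentence, the numeric slot name 'number1', and all function names are hypothetical. Only 'unitSpeed' and 'unitAngle' come from the text.

```python
# Hypothetical stand-in for WordNet: each word maps to its more general terms.
HYPERNYMS = {"skateboarder": {"person", "object"}}

# Unit variables named in the text; a real system would list more units.
UNIT_VARIABLES = {"unitSpeed": {"m/s"}, "unitAngle": {"degrees"}}


def is_number(token):
    """Return True if the token can be read as a number."""
    try:
        float(token)
        return True
    except ValueError:
        return False


def match_token(pattern_token, input_token, bindings):
    """Return True if one condition token accepts one input token."""
    if pattern_token in UNIT_VARIABLES:
        if input_token in UNIT_VARIABLES[pattern_token]:
            bindings[pattern_token] = input_token
            return True
        return False
    if pattern_token.startswith("number"):  # assumed numeric slot, e.g. number1
        if is_number(input_token):
            bindings[pattern_token] = float(input_token)
            return True
        return False
    if pattern_token == input_token:
        return True
    # A general word in the rule accepts any more specific input word.
    if pattern_token in HYPERNYMS.get(input_token, set()):
        bindings[pattern_token] = input_token
        return True
    return False


def match_condition(condition, sentence):
    """Match a rule condition against a sentence; return bindings or None."""
    pattern, tokens = condition.split(), sentence.split()
    if len(pattern) != len(tokens):
        return None
    bindings = {}
    for pattern_token, input_token in zip(pattern, tokens):
        if not match_token(pattern_token, input_token, bindings):
            return None
    return bindings


bindings = match_condition(
    "the object moves at number1 unitSpeed",
    "the skateboarder moves at 5.4 m/s",
)
```

    Here bindings records that 'object' was matched by 'skateboarder', 'number1' by 5.4, and 'unitSpeed' by 'm/s'. The same sentence with 'degrees' in place of 'm/s' does not match, because 'unitSpeed' accepts only speed units.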

    The first input sentence matched the first and second background rules, which generated two facts: one about the magnitude and one about the direction of the velocity. The third rule was matched by these two generated facts together with the second input sentence. The answer was generated after solving the equation in that rule.

    This type of language can be called "a rule-based language". However, a rule in the system works much like a function, except that it has no function name or argument list. The conditions in the rule determine whether the rule is triggered. Matched words in the conditions pass values from the caller to the rule body, much like argument passing.

    In this way, the input sentences were interpreted as program code. More programming-style code is shown below.

    Loops: Summing a Sequence of Numbers

    Summing a sequence of numbers requires a set of steps and a loop. This is procedural knowledge, and it can be represented in natural language. Natural-p allows a more programming-style format for looping. I adopted Python's indentation notation to represent a block of code. At the end of the block, the 'Repeat until...' line checks the repeat condition; if the condition is satisfied, the system moves back to the beginning of the block.

    The triple quotes ('''...''') above are Python's multi-line string notation. The following is the output of the system. The variables in the working memory are displayed inside curly braces {...}. The output shows each step of execution; the final summation value, at which the loop stopped, was 15.

    The actual calculation for each line is done by triggering a rule whose right side contains an equation. The first line of the input matches the condition of the first rule, and the rule is triggered. The equation on the right side of the rule is interpreted by Python, so users can write any kind of equation inside a rule. In this way, the number of predefined commands is minimized. In addition, the definition of each calculation is itself written in natural language, so users can redefine it or add different expressions.
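    A minimal sketch of this idea in Python follows. The rule wording, the RULES table, and the execute function are illustrative assumptions, and conditions are matched by exact string comparison instead of parse trees. The point is only that the right side of a rule is an ordinary Python expression evaluated against the working memory.

```python
# Hypothetical rules: natural language condition -> Python equation.
RULES = {
    "add the number to the sum": "sum = sum + number",
    "increase the number by one": "number = number + 1",
}


def execute(sentence, memory):
    """Trigger the rule matching the sentence and run its equation."""
    key = sentence.strip().rstrip(".").lower()
    equation = RULES.get(key)
    if equation is None:
        raise LookupError(f"no rule matches: {sentence!r}")
    target, expression = (part.strip() for part in equation.split("=", 1))
    # eval is used for illustration only; it is unsafe on untrusted input.
    memory[target] = eval(expression, {"__builtins__": {}}, dict(memory))
    return memory


memory = {"sum": 0, "number": 1}
execute("Add the number to the sum.", memory)
execute("Increase the number by one.", memory)
print(memory)  # {'sum': 1, 'number': 2}
```

    Because the equations live in data and not in the interpreter, adding a new calculation means adding a rule, not a new built-in command.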

    Natural-p has a few predefined control commands in its kernel. For the repeating loop, the kernel provides the predefined function 'repeat_block', but the actual condition checking is done outside the command. The natural language definition and the condition being checked can therefore be modified or extended easily, which allows various types of loops to be defined. Users work with the natural language definitions and generally do not need to know the predefined kernel commands.
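    The separation between the loop command and its condition might look like the following Python sketch. Only the name 'repeat_block' comes from the text; its signature, the step functions, and the max_rounds guard are assumptions. The starting number 1, the limit 12, and the final sum 15 follow the values mentioned in this article, but the exact wording of the original input is not reproduced here, so the repeat condition is a guess.

```python
def repeat_block(block, should_repeat, memory, max_rounds=1000):
    """Run the block, then repeat it while should_repeat(memory) holds.

    The condition is a separate callable, so new loop types only need a
    new condition, not a new kernel command. Returns the number of rounds.
    """
    rounds = 0
    while rounds < max_rounds:
        for step in block:
            step(memory)
        rounds += 1
        if not should_repeat(memory):
            break
    return rounds


def add_number(memory):
    memory["sum"] += memory["number"]


def next_number(memory):
    memory["number"] += 1


def sum_not_bigger_than_limit(memory):
    # Assumed repeat condition: keep going while the sum is at most the limit.
    return memory["sum"] <= memory["limit"]


memory = {"sum": 0, "number": 1, "limit": 12}
rounds = repeat_block([add_number, next_number], sum_not_bigger_than_limit, memory)
print(rounds, memory["sum"])  # 5 15
```

    The running sum goes 1, 3, 6, 10, 15, and the loop stops at 15 because that is the first value bigger than 12.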

    Function: Summing a Sequence of Numbers

    If a task is modularized, it can be called with a single sentence. If the previous loop example can be called like a function, it becomes easy to reuse.

    Now the summation algorithm is written in a background rule. The beginning number (1) and the final number (12) in the previous example were replaced with variables (number1 and number2). The values given in the new input (1 and 10) match the words (number1 and number2) in the condition, and they work as argument values.

    The right side of a rule is interpreted line by line, like input sentences, and each line can trigger other matching rules, as in a function body. If there is no matching rule for a line, the line is inserted as a fact into the context memory. In short, a background rule behaves like a function. Its conditions correspond to a function identifier and arguments. A word matched by the input works like a function argument, and the matched value is used on the right side of the rule.
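    The function-like behavior can be sketched in Python as follows. This is an illustration under assumptions: the rule text, the VARIABLES set, and the bind and interpret functions are hypothetical, and matching is done on flat token lists. It shows the two behaviors described above: matched words are passed into the rule body like arguments, and a line that triggers no rule is stored as a fact.

```python
VARIABLES = {"number1", "number2"}  # words treated as argument slots


def bind(condition, sentence):
    """Return {variable: value} if the sentence fits the condition, else None."""
    pattern, tokens = condition.split(), sentence.split()
    if len(pattern) != len(tokens):
        return None
    bindings = {}
    for pattern_token, token in zip(pattern, tokens):
        if pattern_token in VARIABLES:
            bindings[pattern_token] = token
        elif pattern_token != token:
            return None
    return bindings


def interpret(sentence, rules, facts):
    """Trigger the first matching rule; otherwise store the sentence as a fact."""
    for condition, body in rules:
        bindings = bind(condition, sentence)
        if bindings is not None:
            for line in body:
                # Substitute the matched values, then interpret the line
                # exactly like an input sentence.
                words = [bindings.get(word, word) for word in line.split()]
                interpret(" ".join(words), rules, facts)
            return
    facts.append(sentence)


# Hypothetical background rule: (condition, right side).
rules = [
    (
        "sum the numbers from number1 to number2",
        ["the first number is number1", "the last number is number2"],
    ),
]
facts = []
interpret("sum the numbers from 1 to 10", rules, facts)
print(facts)  # ['the first number is 1', 'the last number is 10']
```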

    A screen capture of the Function example appears in the section "Screen Capture of Natural-p".

    Condition: Choosing a Bigger Number

    An if-else condition can be used as in Python. The system finds a matching truth rule for the condition part of a sentence.

    A truth rule has a hypothesis on the left side and conditions on the right side. The condition part of an if-else sentence is matched against the hypothesis of a truth rule, and the conditions on the right side are then tested to validate the hypothesis. The right side of a truth rule can be an equation evaluated by Python, or it can include conditions that trigger other truth rules.

    Such conditions are themselves defined in natural language, and users can add new definitions. In this way, condition expressions can be written in natural language without requiring a large number of predefined conditions.
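    A truth rule of this kind could be sketched in Python as below. The rule wording, the TRUTH_RULES table, and the helper names are assumptions made for illustration. A condition on the right side is first tried as the hypothesis of another truth rule, and only if none matches is it evaluated as a Python expression.

```python
VARIABLES = {"number1", "number2"}

# Hypothetical truth rules: (hypothesis, conditions on the right side).
TRUTH_RULES = [
    ("number1 is bigger than number2", ["number1 > number2"]),
    ("number1 is smaller than number2", ["number2 is bigger than number1"]),
]


def bind(hypothesis, statement):
    """Return {variable: value} if the statement fits the hypothesis."""
    pattern, tokens = hypothesis.split(), statement.split()
    if len(pattern) != len(tokens):
        return None
    bindings = {}
    for pattern_token, token in zip(pattern, tokens):
        if pattern_token in VARIABLES:
            bindings[pattern_token] = token
        elif pattern_token != token:
            return None
    return bindings


def is_true(statement):
    """Validate a statement with truth rules, falling back to Python."""
    for hypothesis, conditions in TRUTH_RULES:
        bindings = bind(hypothesis, statement)
        if bindings is None:
            continue
        filled = [
            " ".join(bindings.get(word, word) for word in condition.split())
            for condition in conditions
        ]
        return all(is_true(condition) for condition in filled)
    # No truth rule matched: treat the statement as a Python expression.
    # eval is for illustration only and is unsafe on untrusted input.
    return bool(eval(statement, {"__builtins__": {}}, {}))


def choose_bigger(a, b):
    """If-else driven by a natural language condition."""
    if is_true(f"{a} is bigger than {b}"):
        return a
    else:
        return b


print(choose_bigger(7, 3))  # 7
```

    The second rule shows chaining: "3 is smaller than 7" is validated by rewriting it as "7 is bigger than 3", which the first rule reduces to the Python expression 7 > 3.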

    Example: Teachable Agents

    In a real-life situation, it would be better if a human could simply tell an agent (a humanoid robot or a software agent) to do some task. If the agent does not know how to do the task, the user can tell it how, and the agent learns the task. The agent will then try the learned task, but may fail to do it correctly. In that case, the user tells it how to do better. This can be repeated until the agent can do the task correctly.

    The following is an example task: cleaning a table. Let's assume that there are two dishes on the table and you want a robot to clean it. All you have to do is say what you want: "Clean the table". In the example below, you don't have to say the first line if the robot's visual sensors can recognize the dishes. The first line was added to simulate the robot's environment.

    The task of cleaning a table consists of sub tasks: checking the objects on the table, approaching, grabbing, moving, putting, and repeating. Let's assume that the robot knows most of the sub tasks but has never learned how to clean a table. The robot will then say "I don't know how to do it", or it will do nothing. You can tell the robot how to clean the table in natural language as follows.

    If the robot knows all the sub tasks (how to check objects on the table, how to approach, how to grab, how to move, ...), it will be able to perform the task correctly. If it does not know some of the sub tasks, it will fail. You then have to teach more specific knowledge, narrowing down to individual sub tasks, until the robot understands everything about the task. Let's assume that the robot knows everything except how to check the number of objects on the table (the condition of the if sentence above). What you have to do is add how to check the condition:

    The last two rules in the above example are knowledge for counting the number of objects on the table; such information could instead be received from the robot's visual system. They were added only to simulate the robot's environment. The robot now has all the knowledge required to clean a table, so it can repeatedly move the dishes off the table. The following is the output of the example. The robot checked the objects on the table and moved them one by one until there was no dish left on the table:

    In this way, a user can teach a robot (or an agent) how to do tasks in natural language.
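    The teaching cycle described in this section can be simulated with a short Python sketch. Everything in it is hypothetical: the Agent class, the task names, the 'shelf' destination, and the world dictionary are stand-ins chosen for illustration, and procedures are taught as Python data instead of natural language sentences. It shows an agent that refuses an unknown task, is taught the procedure in terms of sub tasks it already knows, and then repeats them until no dish is left.

```python
class Agent:
    """Toy teachable agent: built-in primitives plus taught procedures."""

    def __init__(self):
        self.primitives = {}  # task -> callable(world)
        self.procedures = {}  # task -> (sub task names, optional loop condition)

    def teach(self, task, steps, while_condition=None):
        """Store a procedure as a list of sub tasks the agent should run."""
        self.procedures[task] = (list(steps), while_condition)

    def perform(self, task, world, log):
        """Run a task; return False as soon as something is unknown."""
        if task in self.primitives:
            self.primitives[task](world)
            log.append(task)
            return True
        if task not in self.procedures:
            log.append(f"I don't know how to {task}")
            return False
        steps, condition = self.procedures[task]
        # With a condition, repeat while it holds; without one, run once.
        while condition is None or condition(world):
            for step in steps:
                if not self.perform(step, world, log):
                    return False
            if condition is None:
                break
        return True


def grab_a_dish(world):
    world["hand"] = world["table"].pop(0)


def put_the_dish_away(world):
    world["shelf"].append(world["hand"])
    world["hand"] = None


robot = Agent()
robot.primitives["grab a dish"] = grab_a_dish
robot.primitives["put the dish away"] = put_the_dish_away

# Simulated environment with two dishes on the table.
world = {"table": ["dish 1", "dish 2"], "hand": None, "shelf": []}
log = []

robot.perform("clean the table", world, log)  # not taught yet

robot.teach(
    "clean the table",
    ["grab a dish", "put the dish away"],
    while_condition=lambda w: len(w["table"]) > 0,
)
robot.perform("clean the table", world, log)
```

    After the second call, the log starts with "I don't know how to clean the table", followed by two rounds of grabbing and putting away, and the table list is empty.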

    A screen capture of the teachable agent example appears in the section "Screen Capture of Natural-p".

    Example: Addition Procedure of Two Numbers

    Now, let's take a look at a slightly more complicated procedure. Adding two numbers digit by digit requires several sub procedures, such as extracting a digit, summing two digits and a carry, indexing, and looping. This example shows that natural language programming works well at this level of complexity. A natural language program can itself be read like comments, so reading the program code should be enough to understand it.

    These are the input sentences that demonstrate the addition procedure.

    This is the main background function.

    There are more background functions: to get the biggest number, to extract a digit, to sum three numbers, to get the number of digits, and to assign a digit to a specific position of a number.
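    For comparison, the same digit-by-digit procedure can be written directly in Python. This sketch is not a transcription of the Natural-p rules; the function names are assumptions chosen to mirror the helper functions listed above, and it handles non-negative integers only.

```python
def number_of_digits(number):
    """Count the decimal digits of a non-negative integer."""
    return len(str(number))


def extract_digit(number, position):
    """Return the digit at a position, where position 1 is the ones digit."""
    return (number // 10 ** (position - 1)) % 10


def assign_digit(number, position, digit):
    """Return the number with the digit at the given position replaced."""
    place = 10 ** (position - 1)
    return number + (digit - extract_digit(number, position)) * place


def sum_three(first, second, third):
    """Sum two digits and a carry."""
    return first + second + third


def add_by_digits(a, b):
    """Add two non-negative integers one digit at a time."""
    length = number_of_digits(max(a, b))  # digits of the biggest number
    result, carry = 0, 0
    for position in range(1, length + 1):
        total = sum_three(
            extract_digit(a, position), extract_digit(b, position), carry
        )
        result = assign_digit(result, position, total % 10)
        carry = total // 10
    if carry:
        # A leftover carry becomes a new leading digit.
        result = assign_digit(result, length + 1, carry)
    return result


print(add_by_digits(478, 95))  # 573
```

    Each helper corresponds to one sentence-level operation, which is roughly how the natural language version decomposes the task into separate background functions.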

    The output of the system is as follows. The sum of the two numbers (256 and 46) is 302. You can see the final sum in the bottom line.

    Screen Capture of Natural-p

    Source code in Natural-p is natural language inside a string (shown in green below). The current version is being developed in Python. The image below is a screen capture of the Function example shown above.

    The execution result:

    This is a screen capture of the teachable agent example:

    The execution result:

    Remark

    As shown above, the system was able to interpret and execute both real-world text and programming-style natural language. The rules are written in natural language without constraints, so users have almost complete freedom in using it.

    The ultimate goal is to develop this system further so that non-programmer users can teach the system procedural tasks in natural language. For example, a human can teach a robot how and when to clean a desk. Students can learn by teaching a tutoring system what they learned in class. An Internet user can teach an agent how to search the Internet using multiple resources such as Google, Wikipedia, and dictionaries. A cellphone user can teach the phone not to disturb them in class, in a meeting, or in a theater. In this way, Natural-p will enable non-programmer users to do tasks that only programmers were able to do in the past.


    Copyright by author.
    All rights reserved by the author.