Todai Robot Project


Answering the examination questions of social studies and Natural Language Processing

In examinations of social studies subjects such as world history or modern society, the various kinds of knowledge needed to derive correct answers are expressed in natural language. Our project therefore focuses on research in natural language processing, the technology for analyzing information written in natural language by computer. Here, we introduce the representative research themes in this field.

Textual entailment recognition

In social studies subjects, there are many questions that require knowledge to solve, so-called “memorization quizzes”. A typical question is as follows.

Choose the most appropriate sentence that describes military systems and soldiers.
(1) The Janissaries were the standing army of the Ottoman Empire.
(2) After the Punic Wars, the farming class, who had served as hoplites, were economically affluent.
(2009 Academic Year Main Examination: World History B)

In order to find the correct answer to a question like this, it is necessary to judge whether each choice is a historical fact or not. Since historical facts are written in textbooks and reference books, we would be able to choose the correct options if we memorized the contents of those books.

It may appear that computers are adept at solving this kind of question because they can memorize vast amounts of data. Computers are indeed excellent at rote learning of data (memorization in an exact, word-for-word manner), but this does not necessarily mean that they acquire the concepts the data expresses. What is required, therefore, is a system capable of making judgments based on those concepts (i.e. knowledge), rather than rote learning of the contents of textbooks or reference books. Understanding the concepts written in natural language and using them as knowledge amounts to recognizing the meaning of natural language. This is an important theme in natural language processing research and a difficult, unsolved challenge; it is also a key technology for various applications involving natural language.

In natural language processing, the technology that, given two sentences t1 and t2, recognizes whether “if t1 is assumed true, t2 is considered to be true too” holds is called textual entailment recognition.
Questions like the one above can be solved by applying textual entailment recognition. For example, a textbook contains the following explanation relating to answer choice 1 of the previous question.

Ottoman Empire - A great power of Mediterranean
…The Janissaries were the standing army under the emperor, which were consisting of military bands, a corps of engineers, artillery units, musketeers, etc., and it was a precursor of the modern military which had developed in Europe later.
(2007 Academic Year Textbook: World History B, Tokyo Shoseki publishing)

According to this description, (1) should be chosen as the correct answer.
This amounts to the following processing by textual entailment recognition.

t1: The Janissaries were the standing army under the emperor, which were consisting of military bands, a corps of engineers, artillery units, musketeers, etc., and it was a precursor of the modern military which had developed in Europe later.
t2: The Janissaries were the standing army of the Ottoman Empire.

This is a natural judgment for a human. However, a computer cannot make this judgment so readily. We are now advancing research on techniques for performing textual entailment recognition with high precision.
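
To illustrate why this judgment is hard for a computer, the following is a minimal sketch of a naive word-overlap baseline applied to the t1 and t2 above. It is not the method used in this project; the stopword list, the function names and the threshold of 0.8 are all assumptions made for this illustration.

```python
import re

# Very small stopword list, chosen only for this illustration (an assumption).
STOPWORDS = {"the", "a", "an", "of", "were", "was", "it", "and", "which",
             "etc", "had", "in"}

def content_words(sentence):
    """Lowercase the sentence and return its set of non-stopword tokens."""
    tokens = re.findall(r"[a-z]+", sentence.lower())
    return {tok for tok in tokens if tok not in STOPWORDS}

def coverage(t1, t2):
    """Fraction of the content words of t2 that also appear in t1."""
    words_t2 = content_words(t2)
    if not words_t2:
        return 0.0
    return len(words_t2 & content_words(t1)) / len(words_t2)

def entails(t1, t2, threshold=0.8):
    """Hypothetical decision rule: guess entailment when coverage is high."""
    return coverage(t1, t2) >= threshold

t1 = ("The Janissaries were the standing army under the emperor, which were "
      "consisting of military bands, a corps of engineers, artillery units, "
      "musketeers, etc., and it was a precursor of the modern military which "
      "had developed in Europe later.")
t2 = "The Janissaries were the standing army of the Ottoman Empire."

print(coverage(t1, t2))  # 0.6: "ottoman" and "empire" do not occur in t1
print(entails(t1, t2))   # False
```

The baseline rejects the pair because the words “Ottoman Empire” appear only in the textbook heading, not in t1 itself. Surface matching alone is therefore not enough, even for a pair that a human judges immediately.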

Moreover, we organize RITE (Recognizing Inference in TExt), a task-oriented workshop on the theme of textual entailment recognition, within NTCIR (NII Test Collection for IR Systems), a series of evaluation workshops. RITE also offers evaluation data created from questions of the National Center Test for University Admissions, and through research on textual entailment recognition we have been pursuing ways of solving questions that require knowledge.

Question answering

The technology for replying to questions posed in natural language is called question answering, and it has been studied for many years in the fields of information retrieval and natural language processing. In question answering research, questions are first classified by type. The easiest type to answer is the factoid question, whose answer is a noun. For example, for the question “What is the highest mountain in Japan?” the expected answer is the name of a mountain; this is a typical factoid question. Questions like this can be solved by applying search technology, for instance by looking for the name of a mountain that co-occurs with the keywords “Japan” and “highest”. Other research addresses technologies for answering questions that ask about the reason or cause of an event, or that ask for the definition of a concept.
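
The co-occurrence search described above can be sketched as follows. This is only an illustration under stated assumptions: the mini-corpus, the candidate list and the function name are invented here, and a real system would search a large document collection and detect candidate names automatically.

```python
import re
from collections import Counter

# Hypothetical mini-corpus and candidate answers, invented for illustration.
DOCUMENTS = [
    "Mount Fuji is the highest mountain in Japan.",
    "Mount Kita is the second highest peak in Japan after Mount Fuji.",
    "Mount Everest is the highest mountain in the world.",
    "Mount Fuji attracts many climbers every summer.",
]
CANDIDATES = ["Mount Fuji", "Mount Kita", "Mount Everest"]

def answer_factoid(keywords, documents, candidates):
    """Return the candidate that most often co-occurs with all keywords."""
    counts = Counter()
    for doc in documents:
        words = set(re.findall(r"[a-z]+", doc.lower()))
        # Only documents containing every keyword count as evidence.
        if all(k.lower() in words for k in keywords):
            for cand in candidates:
                if cand.lower() in doc.lower():
                    counts[cand] += 1
    if not counts:
        return None
    return counts.most_common(1)[0][0]

print(answer_factoid(["Japan", "highest"], DOCUMENTS, CANDIDATES))  # Mount Fuji
```

Note that the second document also matches both keywords although it names a different mountain; the sketch succeeds only because the correct answer co-occurs more often, which is one reason why simple search is not sufficient for harder questions.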

University examinations also contain many questions that can be reduced to question answering of this kind. A typical example is a question asking for the year in which an event occurred (for example, “In which year did the Kamakura shogunate start?”). In many cases, however, university examination questions are based on a combination of two or more events, so a high percentage of correct answers cannot be obtained simply by applying present question-answering technology as it is. The following question is an example.

In China at the end of the 18th century, a religious society which advocated the world’s end time based on the Buddhist eschatology with a forecast of Buddha Maitreya’s advent rose for a revolution of the lower world, at the bordering area of Sichuan and Hubei etc. And it was suppressed mainly by self-defense organizations, such as Hsiang-Yung.
Answer the name of this rebellion the religious society caused.
(University of Tokyo, 2009 entrance exam, World History)

In this question, in addition to keywords such as “China”, “the Buddhist eschatology” and “religious society”, events such as “advocated the world’s end time”, “rose” and “suppressed” are described, and an understanding of the interrelation among them is necessary. Research is advancing on improving the accuracy of question answering and on technology capable of answering such complicated questions correctly.

Moreover, the type of question explained in the section on textual entailment recognition can also be regarded as a type of question answering, if each choice is considered individually as a “true or false” question.
However, questions that ask whether something is true or not (yes/no questions) have not been researched much, because they are unexpectedly difficult to answer with present question-answering technology. In parallel with research on textual entailment recognition, we have been advancing research on answering such questions using question-answering technology.

Inference based on knowledge

In textual entailment recognition, t2 can be judged “to be true” or “not to be true” given t1, but it cannot be judged to be “false”. If t1 and t2 are logically inconsistent, we can say “t2 is false” based on t1.  However, our study of actual examination questions shows that there are not many cases in which t2 can be judged to be logically inconsistent with t1.  Consider, for example, the following question.

Choose the one correct sentence concerning events that occurred during the 8th century.
(1) Pepin destroyed the Kingdom of the Lombards.
(2) The reign of Harun al-Rashid began.
(2009 Academic Year Main Examination: World History B)

Textbooks explain that “Karl, the child of Pepin III, destroyed the Kingdom of the Lombards”, so we can tell that choice 1 is incorrect (the correct answer is choice 2). However, although a human would judge from the explanation “Pepin’s child destroyed X” that “Pepin destroyed X” is incorrect, the two statements are not logically inconsistent (the two of them might have destroyed X together).
It is therefore very hard for a computer to say with confidence that choice 1 is “incorrect”.

Thus, as an approach from a different angle, we can consider a method that derives an inconsistency by combining two or more pieces of knowledge. In the above example, suppose we know that Pepin (Pepin the Short) lived from A.D. 714 to A.D. 768 and that the Lombard kingdom existed from A.D. 568 to A.D. 774. Since the Lombard kingdom came to an end after Pepin’s lifetime, we can conclude that choice 1 is inconsistent. That is, the inconsistency can be inferred by combining knowledge such as the periods during which a person or state existed, location information, and the conditions that the event “destroy” must satisfy.
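
The temporal part of this inference can be sketched as follows. The years are those given in the text above; the data layout, the names `Span`, `EXISTENCE` and `could_have_destroyed`, and the simplification of the “destroy” condition to a single check on years are assumptions made for this illustration, not the project's actual knowledge representation.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Span:
    """Closed interval of years (A.D.) during which an entity existed."""
    start: int
    end: int

# Years taken from the text above; the dictionary layout is an assumption.
EXISTENCE = {
    "Pepin": Span(714, 768),
    "Kingdom of the Lombards": Span(568, 774),
}

def could_have_destroyed(agent, target, existence):
    """Check one necessary condition of the event 'destroy':
    the agent must exist in the year in which the target ceases to exist."""
    a, t = existence[agent], existence[target]
    return a.start <= t.end <= a.end

# The kingdom ended in 774, after Pepin's lifetime (714 to 768).
print(could_have_destroyed("Pepin", "Kingdom of the Lombards", EXISTENCE))  # False
```

A result of False means that the statement contradicts the stored knowledge, so choice 1 can be rejected. A result of True would only mean that this one condition is satisfied, not that the statement is a historical fact.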

Databases that structure such knowledge are called ontologies.  To answer questions in subjects such as world history correctly, it is necessary to construct a large-scale ontology that comprehensively covers the content of high-school-level textbooks. Constructing the ontology is very difficult in itself, and it is also necessary to consider in what form knowledge is stored in the ontology, how pieces of knowledge are combined, and how inconsistencies are derived.  That is, knowledge described in natural language must be rearranged into a form that a computer can use for automatic inference.  We are researching the design of knowledge representation and inference methods, together with ontology development, mainly for world history.

Research on fundamental technologies in natural language processing

To achieve high-level natural language processing as described above, fundamental technology that can analyze syntactic and semantic structures with high precision is indispensable. Research on natural language processing has so far targeted clean text data with a fairly fixed format, such as newspaper text, so when present technology is applied as is to other kinds of text, analysis does not work well in many cases. Moreover, it has so far been very difficult to define a research theme when starting research on the meaning of natural language. In this project, although the world of examination questions is limited in a way, we are advancing research toward highly precise analysis and deep semantic analysis.

Deep Parsing

Dependency parsing is widely used in the analysis of Japanese, but it is insufficient for semantic parsing. A syntactic parsing system is therefore required that analyzes the detailed structure of a sentence (deep syntactic structure) with high precision and computes a meaning representation based on formal logic (e.g. predicate logic).

In order to output meaning representations based on formal logic from the syntactic parsing of Japanese sentences, we have been advancing research on a syntactic parsing system for Japanese based on Combinatory Categorial Grammar (CCG). CCG was proposed in order to describe the grammar of natural language formally, and highly precise syntactic parsing systems have been realized for English. To apply this theory to the analysis of Japanese, we have been working on the theory of grammar, the development of a large-scale Japanese grammar, the implementation of a syntactic parsing system, and so on.
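
How a CCG derivation yields a predicate-logic meaning representation can be sketched with the two basic application rules. This is a toy illustration, not the project's parser: it uses a three-word English lexicon instead of a Japanese grammar, and the names `Functor`, `Item`, `forward_apply` and `backward_apply` are assumptions introduced here.

```python
from typing import NamedTuple

class Functor(NamedTuple):
    """Complex category: 'result' is obtained by consuming 'arg'.
    slash '/' looks for the argument to the right, '\\' to the left."""
    result: object
    slash: str
    arg: object

class Item(NamedTuple):
    cat: object  # an atomic category (str) or a Functor
    sem: object  # a logical term (str) or a Python function that builds one

def forward_apply(left, right):
    """Forward application:  X/Y  Y  =>  X"""
    if (isinstance(left.cat, Functor) and left.cat.slash == "/"
            and left.cat.arg == right.cat):
        return Item(left.cat.result, left.sem(right.sem))
    return None

def backward_apply(left, right):
    """Backward application:  Y  X\\Y  =>  X"""
    if (isinstance(right.cat, Functor) and right.cat.slash == "\\"
            and right.cat.arg == left.cat):
        return Item(right.cat.result, right.sem(left.sem))
    return None

# Toy lexicon (an assumption): a transitive verb has category (S\NP)/NP.
VP = Functor("S", "\\", "NP")
LEXICON = {
    "Karl": Item("NP", "karl"),
    "Lombards": Item("NP", "lombards"),
    "destroyed": Item(Functor(VP, "/", "NP"),
                      lambda obj: lambda subj: f"destroy({subj}, {obj})"),
}

# Derivation for "Karl destroyed Lombards".
verb_phrase = forward_apply(LEXICON["destroyed"], LEXICON["Lombards"])
sentence = backward_apply(LEXICON["Karl"], verb_phrase)
print(sentence.cat, sentence.sem)  # S destroy(karl, lombards)
```

Each word carries both a syntactic category and a piece of meaning, so the logical form is built step by step as the categories combine. A full CCG parser also needs further rules, such as composition and type raising, and a way to choose among competing derivations.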

Coreference/Anaphora Resolution

In present natural language processing, it is common to analyze each sentence independently; however, this method cannot correctly recognize the meaning of the texts of actual examination questions or of sentences in textbooks. As an example, consider the following explanation from the textbook.

Ottoman Empire - A great power of Mediterranean
…The Janissaries were the standing army under the emperor, and were consisting of military bands, a corps of engineers, artillery units, musketeers, etc., and it was a precursor of the modern military which had developed in Europe later.

Although the words “Ottoman Empire” do not appear in this sentence, humans reading the text understand that it is a description of the Ottoman Empire. It follows that “the emperor” here is “the emperor of the Ottoman Empire.”

Thus, when reading sentences in order from the beginning, a human understands each sentence based on an understanding of the preceding sentences. The technology for realizing this is called context analysis. However, there is still no common view on how context should be analyzed, and various approaches have been studied.

In the example above, the technology that judges that “the emperor” actually indicates “the emperor of the Ottoman Empire” is called coreference resolution. It is very difficult to perform coreference resolution for all words. In natural language processing it is therefore common to simplify the task by targeting only words of limited types (for example, the names of persons or countries) or words in limited positions, such as the subject or object of a predicate. However, when coreference resolution is applied in such a limited manner, a text like the one in the above example cannot be understood and the question cannot be answered correctly.
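
The judgment in this example can be sketched with a simple heuristic: link a role noun such as “the emperor” to the most recently mentioned entity of a compatible type, including entities from the heading. This is an illustrative assumption, not the approach taken in this project; the table `ROLE_OWNER_TYPE`, the entity types and the function name are all hypothetical.

```python
# Hypothetical knowledge: the type of entity that each role noun belongs to.
ROLE_OWNER_TYPE = {"emperor": "state", "king": "state", "capital": "state"}

def resolve_role(mention, context_entities):
    """Link a definite role noun such as 'the emperor' to the most recently
    mentioned entity of a compatible type, searching backwards."""
    head = mention.lower().split()[-1]
    wanted = ROLE_OWNER_TYPE.get(head)
    if wanted is None:
        return None
    for name, entity_type in reversed(context_entities):
        if entity_type == wanted:
            return f"{head} of the {name}"
    return None

# Entities seen so far, in reading order: the heading, then the sentence subject.
context = [("Ottoman Empire", "state"), ("Janissaries", "army")]
print(resolve_role("the emperor", context))  # emperor of the Ottoman Empire
```

The sketch works here only because the heading is included in the context and the entity types are given by hand. In real text, deciding which entities are in context and what types they have is itself part of the problem.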

In this project, we approach the problem of coreference resolution from the viewpoint of what context and passage comprehension is required to answer examination questions. We are now advancing data analysis, task design, and the construction of evaluation data.

Other research topics

When university examinations are analyzed from the viewpoint of natural language processing, various interesting research themes are found. Besides the above, many challenges such as the following have been found.
  • Recognition of the particular cases to which an abstract or figurative expression applies
  • Summarization in accordance with an intention or a viewpoint
  • Analysis of text that includes formulas or symbols
  • Recognition and inference of temporal and spatial information
We are looking for collaborators. If you are interested in this project, please contact us.