Research based on BoB

The BoB project, along with the dialogues collected in its course, has given rise to various research activities conducted at the KRDB Research Centre (have a look at the BoB-related publications by Raffaella Bernardi and Manuel Kirschner). When these activities culminate in research theses, we will briefly describe them on this page.

We are always interested in learning about your research activities involving the BoB dialogue data. Please get in touch and have your research presented on this page!

Question and Answer Classifier for closed domain Interactive Question Answering

(2009 EMLCT Masters thesis by Đinh Lê Thành)


Natural language processing has made great progress in recent years, thanks to the application of statistical approaches and to the large amounts of data available for training systems. This progress is driven by the various evaluation campaigns, which allow systems to be compared and progress to be measured. These evaluations are mostly based on data sets artificially developed by the organizers of the campaigns. In our work we show that, though useful, these data sets are biased, and that there is a need for data generated in a more natural setting by real users. As case studies we consider question classification: in particular, the classification of question types needed in Question Answering systems, and the classification of follow-up questions into topic continuation and topic shift, needed in Interactive Question Answering. We evaluate classifiers first on TREC data and then on a corpus of real users' data. In both cases the performance of the classifiers drops significantly, showing the need to work on more user-centered systems. The results also show that the classifiers could be better fine-tuned by taking into account the new challenges that real users' data pose to NLP systems. We leave this for future research.
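To make the question-type classification task concrete, here is a minimal illustrative sketch of a rule-based classifier over surface cues. It is an assumption-laden toy, not the statistical classifier evaluated in the thesis; the category labels and rules are simplified placeholders.

```python
# Toy question-type classifier based on surface cues only.
# The thesis evaluates trained statistical classifiers on TREC and on
# real-user data; these hand-written rules are a simplified stand-in.

def classify_question(question: str) -> str:
    """Assign a coarse expected-answer type from the question's opening words."""
    q = question.lower().strip()
    if q.startswith(("who", "whose")):
        return "PERSON"
    if q.startswith("where"):
        return "LOCATION"
    if q.startswith("when"):
        return "DATE"
    # Check the longer "how many/much" patterns before the bare "how".
    if q.startswith(("how many", "how much")):
        return "NUMBER"
    if q.startswith(("what", "which", "how", "why")):
        return "DESCRIPTION"
    return "OTHER"

print(classify_question("Who runs the library help desk?"))  # PERSON
print(classify_question("How many books can I borrow?"))     # NUMBER
```

Rules like these perform reasonably on well-formed TREC-style questions, which is precisely why the drop on real users' ill-formed, elliptical input is instructive.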

Further information

The BoB question classifier code repository

Deep analysis in IQA: evaluation on real users' dialogues

(2009 EMLCT Masters thesis by Zorana Ratkovic)


Interactive Question Answering (IQA) is a natural and cohesive way for a user to obtain information by interacting with a system in natural language. With advances in Natural Language Processing, research in the field of IQA has started to focus on the role of semantics and discourse structure in these systems. The need for a deeper analysis, one that examines the syntax and semantics of the questions and the answers, is evident. Such deeper analysis allows us to model the context of the interaction. I look at a current closed-domain IQA system based on Logistic Regression modeling. This system uses superficial, non-semantically motivated features. I propose adding deep analysis and semantic features in order to improve the system and to show the need for such analysis. Particular attention is paid to the so-called follow-up questions (questions that the user poses after having received some answer from the system) and to the role of context. I propose that adding these linguistically richer features will prove beneficial, thereby showing the need for such analysis in IQA systems.

The Structure of Real User-System Dialogues in Interactive Question Answering

(2010 PhD thesis by Manuel Kirschner)


When users engage in (typed) conversations with an Interactive Question Answering (IQA) system, user questions are typically not asked in isolation. The questions' context, i.e., the preceding interactions, should be useful for understanding Follow-Up Questions (FU Qs) and helping the system pinpoint the correct answer. In this work, we study how much context, and what elements of it, should be considered to answer FU Qs. We harness Logistic Regression Models (LRMs), both for learning which aspects of dialogue structure are relevant to answering FU Qs, and for comparing the accuracy with which the resulting IQA systems can correctly answer these questions. Unlike much of the related research in IQA, which uses artificial collections of user questions, our work is based on real user-system dialogues we collected via a chatbot-inspired help-desk IQA system we deployed on the web site of our University library.

Our statistical modeling experiments integrate a wide array of shallow and deep features, each describing a specific relation that holds between two utterances (i.e., user questions or system answers). These relations are based on lexical similarity, as proposed in the Question Answering literature for mapping questions to their correct answers, and on different theories of discourse and dialogue coherence. The experimental results demonstrate which of the proposed features hold up against empirical evidence from realistic IQA dialogue data. In a nutshell, the best LRMs for describing IQA dialogue structure combine shallow and deep utterance-utterance relations; moreover, the best models distinguish different FU Q types, and we show that this classification can be done implicitly and automatically using the same set of shallow and deep features we use for mapping FU Qs to their correct answers.
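As an illustration of one shallow utterance-utterance relation of the kind such models combine, the sketch below computes lexical (cosine) similarity between a follow-up question and a candidate answer and passes it through a logistic function. The feature set, weight, and bias here are made-up placeholders for exposition, not the fitted models from the thesis.

```python
# Illustrative sketch: a single shallow feature (bag-of-words cosine
# similarity between two utterances) scored with a logistic function.
# The weight w and bias b are arbitrary placeholders, not fitted values.
import math
from collections import Counter

def cosine_similarity(a: str, b: str) -> float:
    """Bag-of-words cosine similarity between two utterances."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    na = math.sqrt(sum(c * c for c in va.values()))
    nb = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (na * nb) if na and nb else 0.0

def answer_score(fu_question: str, candidate_answer: str,
                 w: float = 4.0, b: float = -1.0) -> float:
    """Logistic-regression-style score for one candidate answer,
    here using a single lexical-similarity feature."""
    x = cosine_similarity(fu_question, candidate_answer)
    return 1.0 / (1.0 + math.exp(-(w * x + b)))
```

In the actual models many such shallow and deep features enter the regression jointly, and answer selection ranks every candidate answer by its score.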

The implications of this work are two-fold. For the dialogue and discourse research community, concerned with theories of text coherence, we provide clues as to which automatically implementable theories of inter-utterance coherence hold up empirically in realistic IQA dialogues. On the other hand, the IQA research community could benefit from our results for learning how to automatically distinguish different types of FU Qs, and how to formulate answer pinpointing strategies for each particular FU Q type. More specifically, our work is a practical study of how a real IQA system can tackle the problem of context fusion, and as a result, improve the accuracy of selecting the correct answer to FU Qs.

Further information

Manuel's personal homepage with his PhD thesis and related publications. The data set (dialogue snippet set) used for the machine learning experiments in the thesis is based on a subset of the BoB dialogue corpus, namely the English dialogues gathered between September 2008 and June 2009. The snippet set can be obtained under the same terms and conditions as the full BoB dialogue corpus.