recent comments

recent articles

  • How long would it take to read Wikipedia?

    Almer S. Tigelaar 21 / 02 / 2012

    Wikipedia has become the de facto encyclopedia on the Internet. A traditional encyclopedia spans many textbook volumes which would take any normal person ages to read. Few people would likely engage in such an endeavor. However, since Wikipedia is readily accessible: should you take up the challenge?

    read more 0 comments
  • Life in a Day

    Almer S. Tigelaar 09 / 02 / 2012

    The premise behind the YouTube documentary “Life in a Day” is interesting: invite everyone around the world to shoot video on one specific day: July 24th 2010. Have people upload their raw footage and edit it so it becomes a short, ninety minute, documentary that chronicles a single day on our planet. Does this extreme form of crowdsourcing actually work?

    read more 0 comments
  • Top 8 Prejudices about Americans

    Almer S. Tigelaar 07 / 02 / 2012

    When travelling abroad it is difficult to go with an open mind. Despite our best efforts we bring with us an excess of prejudice shaped by our own culture and view of the destination country. So to it was for me when I visited the United States. When coming back, people at home are very insistent that you play into their prejudice regarding where you’ve been as well, perhaps as a means of reinforcing their own identity.

    read more 0 comments

Category: Journals

Automatic Summarisation of Discussion Fora

Almer S. Tigelaar 24 / 03 / 2010, 23:59

Automatic Summarisation of Discussion Fora
Tigelaar, A. S. & Akker, R. op den & Hiemstra, D.
Natural Language Engineering Volume 16, Issue 2, 2010, ISSN 1351-3249, (pp. 161-192).

View at Cambridge Journals On-Line
View in Repository

Abstract
Web-based discussion fora proliferate on the Internet. These fora consist of threads about specific matters. Existing forum search facilities provide an easy way for finding threads of interest. However, understanding the content of threads is not always trivial. This problem becomes more pressing as threads become longer. It frustrates users that are looking for specific information and also makes it more difficult to make valuable contributions to a discussion. We postulate that having a concise summary of a thread would greatly help forum users. But, how would we best create such summaries? In this paper, we present an automated method of summarising threads in discussion fora. Compared with summarisation of unstructured texts and spoken dialogues, the structural characteristics of threads give important advantages. We studied how to best exploit these characteristics. Messages in threads contain both explicit and implicit references to each other and are structured. Therefore, we term the threads hierarchical dialogues. Our proposed summarisation algorithm produces one summary of an hierarchical dialogue by ‘cherry-picking’ sentences out of the original messages that make up a thread. We try to select sentences usable for obtaining an overview of the discussion. Our method is built around a set of heuristics based on observations of real fora discussions. The data used for this research was in Dutch, but the developed method equally applies to other languages. We evaluated our approach using a prototype. Users judged our summariser as very useful, half of them indicating they would use it regularly or always when visiting fora.

read more 0 comments