Category: Research

More Efficient Query-Based Sampling

Almer S. Tigelaar 09 / 09 / 2009, 11:00

Talk Abstract
Imagine that we have just a free-form text search field as the interface to a search engine, and that we can only parse the results it returns. What can we do to get an idea of what this search engine has to offer content-wise? This is the well-known problem of acquiring a resource description when faced with uncooperative servers. We present a new approach to solving this problem, which is less bandwidth-intensive than previous approaches, such as query-based document sampling.

Presented at University of Lugano (USI), September 7th 2009, Lugano, Switzerland.
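
For readers unfamiliar with the baseline mentioned in the abstract, here is a minimal sketch of classic query-based document sampling, not of the new approach presented in the talk. The toy index, the `toy_search` function, and all parameter names are hypothetical stand-ins for a real uncooperative search engine.

```python
import random
from collections import Counter

# Hypothetical document collection hidden behind the search interface.
TOY_INDEX = [
    "distributed search engines merge results",
    "web search engines crawl the web",
    "horse race commentary by a virtual agent",
    "a virtual agent reports on a horse race",
]

def toy_search(query, top_n=2):
    """Stand-in for an uncooperative engine: returns at most top_n matching documents."""
    return [doc for doc in TOY_INDEX if query in doc.split()][:top_n]

def query_based_sample(search, seed_term, max_queries=20, rng=None):
    """Build a resource description by repeatedly sending single-term queries."""
    rng = rng or random.Random(0)
    sampled, seen = [], set()
    vocabulary = Counter()  # term frequencies act as the resource description
    term = seed_term
    for _ in range(max_queries):
        for doc in search(term):
            if doc not in seen:
                seen.add(doc)
                sampled.append(doc)
                vocabulary.update(doc.split())
        if not vocabulary:
            break  # the seed term matched nothing, so there is no term to continue with
        # Pick the next query term from what has been learned so far.
        term = rng.choice(sorted(vocabulary))
    return sampled, vocabulary

sampled_docs, description = query_based_sample(toy_search, "search")
```

Every query in this loop downloads full documents, which is where the bandwidth cost of the baseline comes from.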


CTIT-NICE: Automatic Discussion Summarization

Almer S. Tigelaar 14 / 04 / 2009, 11:00

Automatic Discussion Summarization: A Study of Internet Fora
Presented at CTIT-NICE Symposium on analysing chats and forum discussions, April 14th 2009, Enschede, The Netherlands.

View Related Publication


Kien Tjin-Kam-Jet: Result Merging for Efficient Distributed Information Retrieval

Almer S. Tigelaar 03 / 04 / 2009, 16:00

Result Merging for Efficient Distributed Information Retrieval
by Kien T.E. Tjin-Kam-Jet

View in Repository

Abstract
Centralized Web search has difficulties with crawling and indexing the Visible Web. The Invisible Web is estimated to contain much more content, and this content is even more difficult to crawl.
Metasearch, a form of distributed search, is a possible solution. However, a major problem is how to merge the results from several search engines into a single result list. We train two types of Support Vector Machines (SVMs): a regression model and a preference classification model. Round Robin (RR) is used as our merging baseline. We varied the number of search engines being merged, the selection policy, and the document collection size of the engines. Our findings show that RR is the fastest method and that, in a few cases, it performs as well as regression-SVM. Both SVM methods are much slower, but judging by retrieval performance, regression-SVM is the best of the three methods. The choice of which method to use depends strongly on the usage scenario. In most cases, we recommend using regression-SVM.
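
As an illustration of the Round Robin baseline named in the abstract, the sketch below interleaves ranked lists one rank at a time. The function name and the choice to skip duplicate documents are assumptions made for this example, not details taken from the thesis.

```python
from itertools import zip_longest

def round_robin_merge(result_lists):
    """Interleave ranked result lists: all rank-1 results first, then rank 2, and so on."""
    merged, seen = [], set()
    for tier in zip_longest(*result_lists):
        for doc in tier:
            # None marks an engine that ran out of results; duplicates are skipped (assumption).
            if doc is not None and doc not in seen:
                seen.add(doc)
                merged.append(doc)
    return merged

merged = round_robin_merge([["a", "b", "c"], ["b", "d"], ["e"]])
# merged == ["a", "b", "e", "d", "c"]
```

No scoring or training is involved, which is consistent with RR being the fastest of the three methods.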


Sander Bockting: Collection Selection for Distributed Web Search

Almer S. Tigelaar 16 / 02 / 2009, 15:00

Collection Selection for Distributed Web Search using Highly Discriminative Keys, Query-driven Indexing and PageRank.
by Sander Bockting

View in Repository
Graduation Photos

Abstract
Current popular web search engines, such as Google, Live Search and Yahoo!, rely on crawling to build an index of the World Wide Web. Crawling is a continuous process to keep the index fresh and generates an enormous amount of data traffic. By far the largest part of the web remains unindexed, because crawlers are unaware of the existence of web pages and they have difficulties crawling dynamically generated content. These problems were the main motivation to research distributed web search.

We assume that web sites, or peers, can index a collection consisting of local content, but possibly also content from other web sites. Peers cooperate with a broker by sending it a part of their index. By receiving indices from many peers, the broker gains a global overview of the peers’ content. When a user poses a query to a broker, the broker selects a few peers to which it forwards the query. The selected peers should be likely to yield a good result set with many relevant documents. The result sets are merged at the broker and sent to the user. This research focuses on collection selection, which corresponds to the selection of the most promising peers. Highly discriminative keys are employed as a strategy to select those peers. A highly discriminative key is a term set which is an index entry at the broker. The key is highly discriminative with respect to the collections because the posting lists pointing to the collections are relatively small. Query-driven indexing is applied to reduce the index size by only storing index entries that are part of popular queries. A PageRank-like algorithm is also tested to assign scores to collections that can be used for ranking.

The Sophos prototype was developed to test these methods. Sophos was evaluated on different aspects, such as collection selection performance and index sizes. The performance of the methods is compared to a baseline that applies language modeling to merged documents in collections. The results show that Sophos can outperform the baseline with ad-hoc queries on a web-based test set. Query-driven indexing is able to substantially reduce index sizes at the cost of a small loss in collection selection performance. We also found large differences in the difficulty of answering queries on various corpus splits.
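
The broker-side idea described above can be sketched roughly as follows. This is a heavily simplified illustration and not the Sophos implementation: the function names, the thresholds `max_key_size` and `max_postings`, the assumption that all terms of a peer co-occur, and the scoring rule (sum of matched key sizes) are all hypothetical.

```python
from collections import Counter
from itertools import combinations

def build_broker_index(peer_terms, max_key_size=2, max_postings=1, popular_queries=None):
    """Map term sets (keys) to the peers that contain them, keeping only rare keys."""
    postings = {}
    for peer, terms in peer_terms.items():
        for size in range(1, max_key_size + 1):
            for key in combinations(sorted(terms), size):
                postings.setdefault(frozenset(key), set()).add(peer)
    # A key is treated as highly discriminative when its posting list is small.
    index = {k: p for k, p in postings.items() if len(p) <= max_postings}
    if popular_queries is not None:
        # Query-driven indexing: keep only keys that occur in a popular query.
        index = {k: p for k, p in index.items()
                 if any(k <= set(q) for q in popular_queries)}
    return index

def select_peers(index, query_terms, top_k=2):
    """Score peers by the keys contained in the query; larger keys count more (assumption)."""
    query = set(query_terms)
    scores = Counter()
    for key, peers in index.items():
        if key <= query:
            for peer in peers:
                scores[peer] += len(key)
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [peer for peer, _ in ranked[:top_k]]

peers = {"p1": {"horse", "race"}, "p2": {"horse", "news"}, "p3": {"web", "search"}}
index = build_broker_index(peers)
chosen = select_peers(index, ["horse", "race"])  # ["p1"]
```

In this toy setup the single term "horse" is dropped from the index because two peers share it, while the pair {"horse", "race"} survives and points only to `p1`.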


Matching Queries to Frequently Asked Questions: Search Functionality for the MRSA Web-Portal

Almer S. Tigelaar 02 / 02 / 2009, 17:00

Matching Queries to Frequently Asked Questions: Search Functionality for the MRSA Web-Portal
Tigelaar, A. S. & Akker, R. op den & Verhoeven, F.
In Proceedings of DIR 2009, Enschede, The Netherlands (pp. 26-33).

View in Repository

Abstract
As part of the long-term EUREGIO MRSA-net project, a system was developed that enables health care workers and the general public to quickly find answers to their questions regarding the MRSA pathogen. This paper focuses on how these questions can be answered using Information Retrieval (IR) and Natural Language Processing (NLP) techniques on a Frequently-Asked-Questions-style (FAQ) database.

Presented at the Dutch-Belgian Information Retrieval Workshop 2009


DEIRA: A Dynamic Engaging Intelligent Reporter Agent (demo paper)

Almer S. Tigelaar 10 / 10 / 2008, 10:00

DEIRA: A Dynamic Engaging Intelligent Reporter Agent (demo paper)
Knoppel, F. L. A. & Tigelaar, A. S. & Oude Bos, D. & Alofs, T.
In Proceedings of BNAIC 2008, Enschede, The Netherlands (pp. 393-394).

View in Repository

Abstract
DEIRA is an embodied agent with a highly modular design, supporting several domains such as real time virtual horse race commentary, robosoccer commentary and virtual storytelling. Domain-specific information is processed to have the agent act on emotion and produce a compelling report on the situation, using synthesized speech and facial expressions. This paper briefly describes the features of the agent.

Presented at Belgian-Netherlands Conference on Artificial Intelligence 2008 on October 18th 2008 in Enschede, The Netherlands.


Automatic Discussion Summarization: A Study of Internet Fora

Almer S. Tigelaar 11 / 07 / 2008, 14:00

Automatic Discussion Summarization: A Study of Internet Fora
Tigelaar, A. S. [Master's Thesis], Supervised by Akker, R. op den.

View in Repository

Abstract
The purpose of this research was to find automated methods to summarize discussions held on Internet fora. A second goal was to build a functional prototype implementing these methods. This explorative study tries to find which technologies and methods can be usefully combined into an automatic discussion summarizer. The focus of this research is on two types of threads: Problem-Solution and Statement-Discussion. Although Dutch is the main language used, much of the presented work is also applicable to other languages. Compared to summarization of unstructured texts (and spoken dialogs), the structural characteristics of threads give important advantages. We studied how these characteristics of discussion threads can be exploited. Messages in threads contain explicit and implicit references to each other. They also have a relatively structured internal make-up. Therefore, we call the threads hierarchical dialogues. The algorithm produces one summary of a hierarchical dialogue by cherry-picking sentences out of the original messages that make up the thread. For sentence selection we try to find the main focus of the discussion, which can be used to obtain an overview of the discussion. The system is built around a set of heuristics based on observations of real discussions. We developed a functioning prototype. The performance of this system was evaluated for Dutch only, but the system also supports English. Various aspects of parts of the system and the methods developed were evaluated. Much can be done to improve the current approach, although the idea of building a summarization system in the way presented in this thesis is feasible.
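
To make the idea of cherry-picking sentences from a thread concrete, here is a minimal extractive sketch. It does not reproduce the heuristics of the thesis: scoring sentences by average word frequency across the thread is a swapped-in, much simpler technique, and the function and parameter names are hypothetical.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase a text and return its alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())

def summarize_thread(messages, max_sentences=2):
    """Pick the sentences whose words are most frequent across the whole thread."""
    sentences = []
    for message in messages:
        parts = re.split(r"(?<=[.!?])\s+", message)
        sentences.extend(part.strip() for part in parts if part.strip())
    frequency = Counter(word for s in sentences for word in tokenize(s))

    def score(sentence):
        words = tokenize(sentence)
        return sum(frequency[w] for w in words) / len(words) if words else 0.0

    # Rank by score (ties broken by position), then restore thread order.
    best = sorted(range(len(sentences)), key=lambda i: (-score(sentences[i]), i))
    return [sentences[i] for i in sorted(best[:max_sentences])]

thread = [
    "My printer will not print. Any ideas?",
    "Reinstall the printer driver.",
    "Thanks, the printer works now!",
]
summary = summarize_thread(thread)
```

Keeping the selected sentences in their original order preserves the question-then-answer flow of a Problem-Solution thread.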


Trackside DEIRA: A Dynamic Engaging Intelligent Reporter Agent (demo paper)

Almer S. Tigelaar 15 / 05 / 2008, 15:00

Trackside DEIRA: A Dynamic Engaging Intelligent Reporter Agent (demo paper)
Knoppel, F. L. A. & Tigelaar, A. S. & Oude Bos, D. & Alofs, T. & Ruttkay, Z.
In Proceedings of AAMAS 2008, Estoril, Portugal.

View in Repository
Download from IFAAMAS

Abstract
DEIRA is a virtual agent commenting on virtual horse races in real time. DEIRA analyses the state of the race, acts on emotion and comments about the situation in a believable and engaging way, using synthesized speech and facial expressions. This paper briefly describes the features of this embodied conversational agent.

Demo given at the International Conference on Autonomous Agents and Multiagent Systems on May 15th 2008 in Estoril, Portugal.


Trackside DEIRA: A Dynamic Engaging Intelligent Reporter Agent (full paper)

Almer S. Tigelaar 14 / 05 / 2008, 11:00

Trackside DEIRA: A Dynamic Engaging Intelligent Reporter Agent (full paper)
Knoppel, F. L. A. & Tigelaar, A. S. & Oude Bos, D. & Alofs, T. & Ruttkay, Z.
In Proceedings of AAMAS 2008, Estoril, Portugal (pp. 112-119).

View in Repository
View in ACM Digital Library

Abstract
DEIRA is a virtual agent commenting on virtual horse races in real time. DEIRA analyses the state of the race, acts emotionally and comments about the situation in a believable and engaging way, using synthesized speech and facial expressions. In this paper we discuss the challenges, explain the computational models for the cognitive, emotional and communicative behavior, and report on implementation and feedback from users.

Presented at the International Conference on Autonomous Agents and Multiagent Systems on May 14th 2008 in Estoril, Portugal.


Trackside DEIRA: A Virtual Horse Race Reporter (demo paper)

Almer S. Tigelaar 08 / 01 / 2008, 14:00

Trackside DEIRA: A Virtual Horse Race Reporter (demo paper)
Tigelaar, A. S. & Knoppel, F. L. A. & Oude Bos, D. & Alofs, T. & Ruttkay, Z. & Nijholt, A.
Presented at INTETAIN 2008, Cancun, Mexico.
