The RAINBOW Project


RAINBOW: Reusable Architecture for INtelligent Brokering Of Web information access

(Further motivation for the name: the different modules for analysing web data should synergistically "shed light" on otherwise "obscure" web content, much as the different colours of the rainbow join together to form visible light.)

June 2007: The Ex information extraction tool has reached stability and is being used in the K-Space and MedIEQ projects.

May 2006: Due to the breakdown of our server and the migration to an 'emergency' one, some applications that rely on, e.g., Java (such as the Sesame repository) no longer work.

1 January 2006: Two new EU projects started, in which the current Rainbow team actively participates.

  • The K-Space Network of Excellence focuses on multimedia semantics; the synergistic analysis of images and texts, addressed in Rainbow, is one of the relevant topics there.
  • The MedIEQ project deals with quality labelling of medical websites using information extraction, also taking advantage of website navigation and classification techniques. It thus directly follows up on previous Rainbow research, in a specific application domain.
As we are now quite busy with these projects, it may happen that we won't update this page very often.

December 2005 - February 2006, section Bibliography
Two overview publications describing Rainbow were published. One is the habilitation thesis of V. Svatek, Sv05d, and the other is a short paper presenting a summary of the project, Sv06a.

14-16 September, 2005: the First International Workshop on Representation and Analysis of Web Space (RAWS-05) was organised by the Rainbow team, with support from our CSF grant.

August 29, 2005, section Bibliography
Added a reference to a paper at the Web Intelligence conference, La05d, and to three papers to be presented at the RAWS-05 workshop: Kr05a, La05c and Sv05c.

June 9, 2005, section Downloads and shows
Added a reference to the collection of Rainbow service descriptions.

June 9, 2005, section Bibliography
Reference to paper at ICML workshop, Ne05b, added.

April 18, 2005, section Bibliography
References to two papers at Dateso workshop, La05b and Ne05a, added.

April 18, 2005, section Downloads and shows
Demo of an information extractor trained on bike offer pages; see also the documentation. Contact Martin Labsky for details. Also refer to the underlying paper, La05b.

February 23, 2005, section Downloads and shows
Updated the collection of training data with semantic tags (103 pages from bicycle catalogues, in XHTML). The collection is tagged in two ways: once with HTML tags that are visible (in colour) in a browser, and once with semantic XML tags. Contact Martin Labsky for details. Also refer to the underlying paper, La05a.

January 17, 2005, section Bibliography
Five more references added. An up-to-date overview of project achievements is the Dagstuhl seminar paper (La05a).