Title Semantic-Enabled Web Crawler and Parser
OW2 project Trusite
OW2 project URL www.tsr.net
Other OW2 projects and URL (optional)

Keywords Semantic, Web Crawler, Web Parser

With the growth of the Internet, a vast amount of information is now
available online. Many applications have been developed to collect data
(e.g. web pages) from the Internet and to discover useful information or
knowledge in the collected data. The most popular examples of such
applications are search engines such as Google and Bing. In most cases,
however, the crawlers used in these applications cannot understand the
contents of web pages, so they simply download whole pages and leave the
parsing work to humans or other applications, and that parsing work is
very tedious. Recently, Semantic Web technologies have been proposed and
adopted in many applications. On the Semantic Web, computers can
understand the contents of web pages. It would therefore be very useful to
build a semantic-enabled web crawler and parser that can understand the
contents of the pages it collects. The goal of this topic/project is to
build such a web crawler and parser.

Main Topic Contact Person Name Junfeng ZHAO
Main Topic Contact Person e-mail zhaojf@sei.pku.edu.cn
Other Topic Contact Person(s) Name(s) (optional)

Other Topic Contact e-mail(s) (optional)

Estimated Workload (total, in man-months) 16
Targeted Contestants master/PhD

  • Yuri Glickman, Fraunhofer FOKUS

