Are we wasting our time?
Tim Regan
timregan at microsoft.com
Fri Sep 24 13:03:32 UTC 2004
Hi All,
I'm not yet up to date on progress here, so sorry if I'm talking
rubbish.
Steve is exactly the kind of HPfGU member who will benefit from the
work here. The fact that he and others are already using Paul's
archive and automated text searching tools to circumvent Yahoomort
shows what a huge demand there is for the catalogue. It's great news.
The problem it presents, I think, is one of publicity. It suggests
to me that the detail and the timescales of the cataloguing project
are not widely understood among HPfGU members. Thus Steve and others
are spending time building ad-hoc solutions when they could wait ???
months for the results of the catalogue to start to become available.
Steve's plan also makes me wonder if we should make a full install
of the messages, the metadata, and the search tools available for
those who would prefer them to a web search interface. In any case,
some of the technologies Paul's planning to use (e.g. a SQL server)
are not the kind of applications that most users will be comfortable
installing on their own machines, so for the majority of people the
search interface we are working towards will be their best option.
As Anne pointed out, a search tool and a catalogue are very different
beasts. Engines like Google are not just running a text search; the
clever bit is the ranking of the results. That's why we are in the
counter-intuitive position where it is easier to search billions of
documents on the web than the documents on one's own hard drive.
But the real difference, and the reason cataloguing is hard, is the
meta information. If I have a "new" theory on Harry's name and
search through all the HPfGU posts for the words "Harry" and "name", I
will get back thousands and thousands of hits. Which are
important? A text search can't tell me, and that's where the catalogue
will prove invaluable.
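
To make the contrast concrete, here is a minimal Python sketch. The
sample posts, the "topics" field, and the "name-theories" label are all
made up for illustration; they are assumptions, not the real catalogue
schema or anything Paul has built. It only shows why a plain text match
and a catalogued topic can return quite different sets of posts.

```python
# Hypothetical sample data: the fields and the topic label are
# illustrative only, not the actual HPfGU catalogue schema.
posts = [
    {"id": 1, "text": "Harry said his name to the hat", "topics": []},
    {"id": 2, "text": "What does the name Harry Potter mean? Etymology ideas",
     "topics": ["name-theories"]},
    {"id": 3, "text": "Harry could not name a single goblin rebellion",
     "topics": []},
    {"id": 4, "text": "Origins of 'Potter' and naming conventions in the books",
     "topics": ["name-theories"]},
]


def text_search(posts, words):
    """Return ids of posts whose text contains every word (case-insensitive)."""
    lowered = [w.lower() for w in words]
    return [p["id"] for p in posts
            if all(w in p["text"].lower() for w in lowered)]


def catalogue_lookup(posts, topic):
    """Return ids of posts a (hypothetical) cataloguer tagged with the topic."""
    return [p["id"] for p in posts if topic in p["topics"]]


text_hits = text_search(posts, ["Harry", "name"])
catalogue_hits = catalogue_lookup(posts, "name-theories")
print("text search:", text_hits)
print("catalogue:  ", catalogue_hits)
```

In this toy data the text search returns two posts that merely mention
both words and misses one that is on topic but never uses them, whereas
the catalogue lookup returns only the posts someone judged relevant.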
Cheers,
Dumbledad.