Are we wasting our time?

Tim Regan timregan at microsoft.com
Fri Sep 24 13:03:32 UTC 2004


Hi All,

I'm not yet up to date on progress here, so sorry if I'm talking 
rubbish.

Steve is exactly the kind of HPfGU member who will benefit from the 
work here. The fact that he and others are already using Paul's 
archive and automated text searching tools to circumvent Yahoomort 
shows what a huge demand there is for the catalogue. It's great news.

The problem it presents, I think, is one of publicity. It suggests 
to me that the detail and the timescales of the cataloguing project 
are not widely understood among HPfGU members. Thus Steve and others 
are spending time building ad-hoc solutions when they could wait ??? 
months for the results of the catalogue to start to become available.

Steve's plan also makes me wonder if we should make a full install 
of the messages, the metadata, and the search tools available for 
those who would prefer them to a web search interface. In any case, 
some of the technologies Paul's planning to use (e.g. a SQL server) 
are not the kind of applications that most users will be comfortable 
installing on their own machines, so for the majority of people the 
search interface we are working towards will be their best option.

As Anne pointed out, a search tool and a catalogue are very different 
beasts. Engines like Google are not just running a text search; the 
clever bit is the ranking of the results. That's why we are in the 
counter-intuitive position where it is easier to search billions of 
documents on the web than the documents on one's own hard drive.

But the real difference, and the reason cataloguing is hard, is the 
meta information. If I have a "new" theory on Harry's name and 
search through all the HPfGU posts for the text "Harry" and "name" I 
will get back thousands and thousands of successful hits. Which are 
important? Who knows, and that's where the catalogue will prove 
invaluable.
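
To make the contrast concrete, here is a rough sketch in Python. The 
posts, the field names and the "name-meanings" category are all made 
up for illustration; none of them come from the real archive or from 
whatever schema the catalogue ends up using.

```python
# Hypothetical posts: the "topics" field and the "name-meanings" label are
# illustrative assumptions, not the catalogue's actual metadata.
posts = [
    {"id": 1, "text": "Harry said his name aloud in the corridor.",
     "topics": []},
    {"id": 2, "text": "What does the name Harry Potter signify?",
     "topics": ["name-meanings"]},
    {"id": 3, "text": "Harry forgot the name of the spell.",
     "topics": []},
    {"id": 4, "text": "Etymology of Potter and its links to potter's field.",
     "topics": ["name-meanings"]},
]


def text_search(posts, *words):
    """Return ids of posts whose text contains every word, ignoring case."""
    lowered = [w.lower() for w in words]
    return [p["id"] for p in posts
            if all(w in p["text"].lower() for w in lowered)]


def catalogue_search(posts, topic):
    """Return ids of posts that a cataloguer has filed under the topic."""
    return [p["id"] for p in posts if topic in p["topics"]]


text_hits = text_search(posts, "Harry", "name")
catalogue_hits = catalogue_search(posts, "name-meanings")
print(text_hits)       # posts that merely mention both words
print(catalogue_hits)  # posts a human judged to be about the topic
```

In this toy data the text search returns posts 1 and 3, which only 
mention the words in passing, and misses post 4, which is on topic 
but never says "Harry" or "name". The catalogue lookup depends 
entirely on someone having assigned the topic, which is the hard 
part.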

Cheers,

Dumbledad.




