9.3 Searching
Experience with previous large-scale hypermedia systems (our own experience with Videotex, but see also Frank Halasz's words regarding NoteCards in Section 3.5.1) strongly suggested that searching was a key feature of any large-scale system. Consequently, it was decided that searching would have to be built right into Hyper-G (as opposed to first-generation systems like Gopher and WWW where searching is merely an add-on that has to be provided by the server administrator).
The Hyper-G server is thus built on top of a powerful object-oriented database engine rather than a simple file system (see next chapter). Every Hyper-G object (including documents, links, and collections) can be searched for. Moreover, searching is seamlessly integrated with browsing. For example, a user may first navigate the collection hierarchy until a set of interesting collections is found, then issue a query with the search scope limited to these collections, and use location feedback to see the positions of the hits in the collection hierarchy. The user may then look at some of the documents returned, display a local map showing the links around a document (including links pointing to it), follow a few hyperlinks, return to the collection hierarchy to explore the 'neighborhood', re-focus the query on a different set of collections or search terms, and so forth.
Every Hyper-G object has a set of attributes (meta-information) attached to it, some of which are indexed so that they can be searched very quickly (e.g., title, keywords, author, creation/modification time; see also Section E.1). Boolean combinations (AND, OR, AND NOT), prefix searches, and range searches (greater than, smaller than) are supported. Non-indexed attributes cannot be searched on their own; they can only be used in conjunction with index searches to further reduce the set of matches, but they also allow regular expression searches.
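The two-stage behavior described above (index lookups first, non-indexed filtering second) can be illustrated with a minimal Python sketch. This is not Hyper-G's implementation or query syntax: the object table, the attribute names, and all function names are hypothetical, and the index here is a plain dictionary scanned linearly, where a real engine would presumably use a sorted structure for prefix and range lookups.

```python
import re

# Hypothetical objects (id -> attributes); not actual Hyper-G data.
OBJECTS = {
    1: {"title": "Hyper-G Overview", "author": "kappe", "created": 1994, "note": "draft v2"},
    2: {"title": "Hypermedia Systems", "author": "maurer", "created": 1992, "note": "final"},
    3: {"title": "Gopher Guide", "author": "kappe", "created": 1995, "note": "draft v1"},
}
INDEXED = {"title", "author", "created"}  # "note" is deliberately left non-indexed


def build_index(objects, indexed):
    """Map each indexed attribute to {value: set of object ids}."""
    index = {attr: {} for attr in indexed}
    for oid, attrs in objects.items():
        for attr in indexed:
            if attr in attrs:
                index[attr].setdefault(attrs[attr], set()).add(oid)
    return index


def exact(index, attr, value):
    """Ids whose indexed attribute equals value."""
    return set(index[attr].get(value, set()))


def prefix(index, attr, pre):
    """Ids whose indexed attribute starts with pre (case-insensitive)."""
    pre = pre.lower()
    return {oid for value, ids in index[attr].items()
            if str(value).lower().startswith(pre) for oid in ids}


def greater_than(index, attr, bound):
    """Range search: ids whose indexed attribute is greater than bound."""
    return {oid for value, ids in index[attr].items() if value > bound for oid in ids}


def refine(objects, candidates, attr, pattern):
    """Filter an existing hit set by a regular expression on a non-indexed attribute."""
    rx = re.compile(pattern)
    return {oid for oid in candidates if rx.search(str(objects[oid].get(attr, "")))}


index = build_index(OBJECTS, INDEXED)
hyper = prefix(index, "title", "hyper")
by_kappe = exact(index, "author", "kappe")

both = hyper & by_kappe          # AND
either = hyper | by_kappe        # OR
not_kappe = hyper - by_kappe     # AND NOT
recent_drafts = refine(OBJECTS, greater_than(index, "created", 1993), "note", r"v2$")
```

Boolean combinations map directly onto set operations on the id sets returned by the index, and `refine` only ever sees a set that an index search has already produced, which mirrors the restriction on non-indexed attributes.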
In addition to the search on meta-information, every text document inserted into the Hyper-G server is accessible by a full text search. The Hyper-G full text engine supports weighted boolean searches, nearest-neighbor searches, prefix searches, stemming and stop-words, and returns a scored list of matches.
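As a rough illustration of how stop-words, stemming, term weights, and a scored result list fit together, consider the following Python sketch. It makes several assumptions that the text does not confirm: the stop-word list is invented, `stem` is a crude suffix stripper rather than a real stemming algorithm, and the score is simply a weighted term frequency over an OR of the query terms. The actual scoring of the Hyper-G full text engine is not described here.

```python
import re
from collections import Counter

# Illustrative stop-word list; the real engine's list is not given in the text.
STOP_WORDS = {"the", "a", "an", "of", "and", "is", "in", "to"}


def stem(word):
    """Crude suffix stripping for illustration only; not a real stemmer."""
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def tokenize(text):
    """Lowercase, split into words, drop stop-words, and stem the rest."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return [stem(w) for w in words if w not in STOP_WORDS]


def build_fulltext_index(docs):
    """Map each document id to a Counter of its stemmed terms."""
    return {doc_id: Counter(tokenize(text)) for doc_id, text in docs.items()}


def search(index, weighted_query):
    """Score documents by weighted term frequency; return hits sorted by score."""
    scored = []
    for doc_id, counts in index.items():
        score = 0.0
        for word, weight in weighted_query.items():
            for term in tokenize(word):
                score += weight * counts[term]
        if score > 0:
            scored.append((doc_id, score))
    # Highest score first; ties broken by document id for a stable order.
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


DOCS = {
    "a": "Searching the collections and searching links",
    "b": "Links in a collection",
    "c": "The Gopher server",
}
ft_index = build_fulltext_index(DOCS)
results = search(ft_index, {"searching": 2.0, "links": 1.0})
```

Because both documents and query terms pass through the same `tokenize` step, a query for "searching" matches any document containing a word with the same stem, and documents with no matching term are left out of the scored list entirely.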
Both kinds of searches can be performed either on the whole server or restricted to a subset of the collection hierarchy. In the latter case, any set of collections can be specified, and only hits that are members (recursively) of at least one of them are returned. The searched set of collections may also reside on different servers, which means that consistent cross-server searches can be performed. It is possible to construct collections (for example in one's personal home collection) that contain as members a set of collections on different servers, and query them all in one request.
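The scope restriction can be sketched in Python as a reachability test over the collection hierarchy. The membership table, the `server:name` identifier scheme, and the function names below are assumptions made for illustration. In particular, the sketch holds all collections in one dictionary, whereas a real cross-server search would have to forward the query to each server involved.

```python
# Hypothetical membership table: collection -> direct members.
# Identifiers carry a made-up "server:" prefix to suggest different servers.
MEMBERS = {
    "serverA:papers": ["serverA:doc1", "serverA:drafts"],
    "serverA:drafts": ["serverA:doc2"],
    "serverB:manuals": ["serverB:doc3"],
    # A personal collection whose members are collections on two servers.
    "serverA:home": ["serverA:papers", "serverB:manuals"],
}


def descendants(scope, members):
    """Return every object that is a (recursive) member of any collection in scope."""
    seen = set()
    stack = list(scope)
    while stack:
        current = stack.pop()
        for child in members.get(current, []):
            if child not in seen:  # guards against shared members and cycles
                seen.add(child)
                stack.append(child)
    return seen


def restrict(hits, scope, members):
    """Keep only hits that are recursive members of at least one scope collection."""
    allowed = descendants(scope, members)
    return [hit for hit in hits if hit in allowed]


# Hits as they might come back from an unrestricted search (invented ids).
HITS = ["serverA:doc1", "serverA:doc2", "serverB:doc3", "serverB:doc9"]

in_drafts = restrict(HITS, ["serverA:drafts"], MEMBERS)
in_two = restrict(HITS, ["serverA:drafts", "serverB:manuals"], MEMBERS)
in_home = restrict(HITS, ["serverA:home"], MEMBERS)
```

Searching the single collection `serverA:home` gives the same effect as naming `serverA:papers` and `serverB:manuals` separately, which corresponds to the personal-collection use described above.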
