I've been discussing with some people the possibility of linking other services to OpenGuides. One thing that occurred to me is that a fuzzy node name lookup would be useful. Here's a user story:
Users of website X want to be able to associate nodes on OpenGuides site G with items on X. They go to a node lookup service offered by G and type the name of something they are looking for. They are then presented with a list of nodes on G that approximately match their search term.
Specifically, I'm thinking of search results graded into exact matches, partial matches and approximate matches, the way IMDb does it:
http://uk.imdb.com/find?q=Hideous
I don't know anything about fuzzy matching techniques. Can anyone suggest any pointers?
On Sat 14 May 2005, Earle Martin <openguides@downlode.org> wrote:
> I don't know anything about fuzzy matching techniques. Can anyone suggest any pointers?
See fuzzy_title_match() in CGI::Wiki -
Returns a (possibly empty) hash whose keys are the node names and whose values are the scores in some kind of relevance-scoring system I haven't entirely come up with yet.
Note that even if an exact match is found, any other similar enough matches will also be returned. However, any exact match is guaranteed to have the highest relevance score.
The matching is done against "canonicalised" forms of the search string and the node titles in the database: stripping vowels, repeated letters and non-word characters, and lowercasing.
supports_fuzzy_searches() will tell you whether your search backend can do this. I'm fairly sure that both of the backends used by OpenGuides can.
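
To make the canonicalisation idea above concrete, here is a rough sketch in Python. It is not the CGI::Wiki code (which is Perl): the function names canonicalise() and fuzzy_lookup(), the order of the stripping steps, the ASCII-only handling and the 3/2/1 scores are all assumptions made up for illustration. It only shows how comparing canonical forms could give the exact / approximate / partial grading asked about.

```python
import re


def canonicalise(text: str) -> str:
    """Reduce a title to a rough canonical form (hypothetical helper).

    Assumed order of steps: lowercase, drop non-word characters,
    strip vowels, then collapse runs of a repeated character.
    """
    text = text.lower()
    text = re.sub(r"[^a-z0-9]", "", text)   # non-word characters (ASCII only here)
    text = re.sub(r"[aeiou]", "", text)     # vowels
    text = re.sub(r"(.)\1+", r"\1", text)   # repeated characters
    return text


def fuzzy_lookup(query: str, titles: list[str]) -> dict[str, int]:
    """Return {title: score} for titles that roughly match the query.

    The scores are invented for this sketch:
    3 = exact match, 2 = canonical forms are equal,
    1 = the query's canonical form is contained in the title's.
    """
    target = canonicalise(query)
    scores: dict[str, int] = {}
    for title in titles:
        canon = canonicalise(title)
        if title == query:
            scores[title] = 3
        elif canon == target:
            scores[title] = 2
        elif target and target in canon:
            scores[title] = 1
    return scores


if __name__ == "__main__":
    nodes = [
        "Rooster's Tavern",
        "Roosters Tavern",
        "The Rooster's Tavern Annexe",
        "Blue Anchor",
    ]
    matches = fuzzy_lookup("Roosters Tavern", nodes)
    # Print best matches first.
    for title, score in sorted(matches.items(), key=lambda kv: -kv[1]):
        print(score, title)
```

As in the description above, an exact match always gets the highest score, and other similar enough titles are still returned alongside it. A real lookup service would more likely call fuzzy_title_match() on the wiki object than reimplement any of this.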
Kake
openguides-dev@lists.openguides.org