Ivor Williams wrote:
Working on the search has caused me to mull on a particular problem:
Which searches will currently find "King's head"?
king     Yes, ' is a non-word character which matches \b
kings    No
king's   Yes
Ideally, I think we want the middle one to work as well.
You have to normalise data before stuffing it into the database, and normalise user input before comparing it to the database.
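As a rough sketch of what I mean (Python purely for illustration; the function names are made up and this is not the existing search code): store the words of each name with the apostrophe both dropped and treated as a word break, and drop apostrophes from the user's input before comparing. Under those assumptions all three searches find "King's head".

```python
import re

def _words(text, apostrophe_replacement):
    """Lower-case the text, rewrite apostrophes, and return the alphanumeric words."""
    text = text.lower().replace("\u2019", "'")   # treat a curly apostrophe like a plain one
    text = text.replace("'", apostrophe_replacement)
    return re.findall(r"[a-z0-9]+", text)

def index_words(name):
    """Words to store for a name: apostrophes dropped AND treated as word breaks."""
    return set(_words(name, "")) | set(_words(name, " "))

def query_words(query):
    """Words to look up for user input: apostrophes dropped."""
    return _words(query, "")

def finds(query, name):
    """True if every normalised query word is among the stored words for the name."""
    stored = index_words(name)
    return all(word in stored for word in query_words(query))

for query in ("king", "kings", "king's"):
    print(query, finds(query, "King's head"))
```

Storing both forms costs a few extra index entries per name, but it means the query side only ever needs one normalisation.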
Also, I'm wondering about having a list of standard abbreviations somewhere, which gets applied in-line as part of the node_name_to_node_title munging:
ave => avenue
    also av
ct => court
gdns => gardens
    also gdn
hse => house
rd => road
st => street
st => saint ...oops!
pa => pass => passage
gt => great
x => cross (eg Charing X, Kings X)
lwr => lower
up => upper
sq => square
cir => circ => circle or circus
pk => park
la => lane
TCR ;-)
stn => station
jcn => jct => junction
comm => common
va => vale
lvl => level
hth => heath
br => bridge
ga => gate
cnr => corner
bor => borough
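
Applying such a list could look something like the sketch below. It is in Python for illustration only, the names ABBREVIATIONS and expand_abbreviations are hypothetical, and it is not meant as the actual node_name_to_node_title code. Only a handful of unambiguous entries from the list are included; "st" is left out on purpose, since it can be either street or saint and needs a smarter rule than a straight lookup.

```python
import re

# A few unambiguous entries from the list above (illustrative subset only).
# "st" is deliberately omitted: it can mean "street" or "saint".
ABBREVIATIONS = {
    "ave": "avenue",
    "av": "avenue",
    "ct": "court",
    "gdns": "gardens",
    "hse": "house",
    "rd": "road",
    "gt": "great",
    "x": "cross",
    "sq": "square",
    "stn": "station",
}

def expand_abbreviations(name):
    """Expand whole-word abbreviations, swallowing an optional trailing full stop."""
    def replace(match):
        word = match.group(1)
        expansion = ABBREVIATIONS.get(word.lower())
        if expansion is None:
            return match.group(0)            # not an abbreviation: leave untouched
        # Keep the capitalisation of the first letter as typed.
        return expansion.capitalize() if word[0].isupper() else expansion
    return re.sub(r"\b([A-Za-z]+)\b\.?", replace, name)

print(expand_abbreviations("Charing X Rd"))
print(expand_abbreviations("St John's Rd"))
```

Matching on whole words only means that "Stratford" or "Avery" are not touched by the "st" or "av" style entries.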