| 5 | | - To reduce impact of matching sub-words with "%id%" while still matching "house" to both "house" and "houses" we could explicitly match against something like "%house[s $,.\"']". In most cases (for English) any ending besides a plural 's' will change the meaning for the word ("person" vs "personal"). Could define another filter that provides the endings to match against based on language. |
| 6 | | - Similarly, could detect whether there is any whitespace in the query and if there is then add whitespace (plus punctuation) around short terms (< 4 letters?). For example "%[^ ]cat[s $,.\"']%" The reason to condition this on the query having whitespace is to not break in foreign languages without whitespace. Again these patterns should probably be able to be language specific and filterable. The reason for not doing this for all words is to improve matching against compound words ("house" can match "treehouse"). |
| | 5 | - To reduce the impact of "%id%"-style patterns matching sub-words, while still matching "house" to both "house" and "houses", we could explicitly match against something like `"%house[s $,.\"']"`. In most cases (for English), any ending other than a plural 's' changes the meaning of the word ("person" vs "personal"). We could define another filter that provides the endings to match against, based on language. |
| | 6 | - Similarly, we could detect whether the query contains any whitespace and, if it does, require whitespace (or punctuation) around short terms (< 4 letters?). For example: `"%[^ ]cat[s $,.\"']%"`. Conditioning this on the query containing whitespace avoids breaking languages that are written without whitespace between words. Again, these patterns should probably be language-specific and filterable. The reason for not doing this for all words is to keep matching compound words ("house" can match "treehouse"). |
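
The two ideas above can be sketched as a small pattern builder. This is only an illustration in Python regular expressions, not the LIKE-style patterns quoted in the notes. The names `build_term_pattern`, `PLURAL_ENDINGS`, and `SHORT_TERM_LENGTH` are hypothetical, the English-only endings table is an assumption, and the leading boundary follows the prose description (whitespace or punctuation before a short term).

```python
import re

# Hypothetical per-language table of allowed endings (assumption: English only).
PLURAL_ENDINGS = {"en": "s?"}

# A term must be followed by whitespace, punctuation, or the end of the text.
TRAILING_BOUNDARY = r"(?=[\s,.\"']|$)"

# A short term must also be preceded by whitespace, punctuation, or the start.
LEADING_BOUNDARY = r"(?:^|(?<=[\s,.\"']))"

# Terms shorter than this are treated as "short" (the notes suggest < 4 letters).
SHORT_TERM_LENGTH = 4


def build_term_pattern(term, query, lang="en"):
    """Compile a regex for one search term of the given query."""
    ending = PLURAL_ENDINGS.get(lang, "")
    pattern = re.escape(term) + ending + TRAILING_BOUNDARY
    # Only anchor the start of short terms, and only when the query itself
    # contains whitespace, so languages written without spaces still match.
    if len(term) < SHORT_TERM_LENGTH and re.search(r"\s", query):
        pattern = LEADING_BOUNDARY + pattern
    return re.compile(pattern, re.IGNORECASE)


house = build_term_pattern("house", "house")
print(bool(house.search("Two houses for sale")))  # plural ending allowed
print(bool(house.search("a treehouse")))          # compound word still matches
print(bool(house.search("household items")))      # other endings are rejected

cat = build_term_pattern("cat", "black cat")
print(bool(cat.search("my cats.")))               # short term with boundaries
print(bool(cat.search("concatenate")))            # sub-word match is rejected
```

Long terms keep an open left edge so that compound words still match, while short terms in whitespace-separated queries are anchored on both sides. A language filter could swap in a different endings table.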