mirror of https://github.com/discourse/discourse.git (synced 2024-12-14 00:33:49 +08:00)
FEATURE: allow disabling of extra term injection in search
Search has a feature where we take over from the PostgreSQL tokenizer and attempt to inject more words into search. For example, sam.i.am will inject the words i and am. This is not ideal because there are many edge cases and it can cause extreme index bloat. This commit is an opening move that makes the behavior configurable. Over the next few weeks we will evaluate it and decide whether to disable it by default or simply remove it.
parent 5f5dd9ea67
commit ae520b62e4

@@ -17,6 +17,8 @@ class SearchIndexer
   end
 
   def self.inject_extra_terms(raw)
+    return raw if !SiteSetting.search_inject_extra_terms
+
     # insert some extra words for I.am.a.word so "word" is tokenized
     # I.am.a.word becomes I.am.a.word am a word
     raw.gsub(/[^[:space:]]*[\.]+[^[:space:]]*/) do |with_dot|
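
The hunk above shows only the guard clause and the first line of the `gsub` block, so the block body is not visible here. As a rough, self-contained sketch of the behavior the comments describe (`I.am.a.word` becomes `I.am.a.word am a word`), the following assumes the block splits each dotted token on `.` and appends every segment after the first. `DemoSettings` is a hypothetical stand-in for `SiteSetting`, and the block body is an assumption for illustration, not the code from this commit.

```ruby
# Hypothetical stand-in for SiteSetting so the sketch runs outside Discourse.
module DemoSettings
  class << self
    attr_accessor :search_inject_extra_terms
  end
  self.search_inject_extra_terms = true
end

def inject_extra_terms(raw)
  # Guard mirrored from the diff: return the text untouched when disabled.
  return raw if !DemoSettings.search_inject_extra_terms

  # Same regex as the diff: any whitespace-delimited token containing a dot.
  raw.gsub(/[^[:space:]]*[\.]+[^[:space:]]*/) do |with_dot|
    # ASSUMPTION: split on dots and append all segments after the first.
    extras = with_dot.split(".").reject(&:empty?).drop(1)
    extras.empty? ? with_dot : "#{with_dot} #{extras.join(" ")}"
  end
end

puts inject_extra_terms("sam.i.am")

DemoSettings.search_inject_extra_terms = false
puts inject_extra_terms("sam.i.am")
```

With the flag on, every dotted token is stored once in full and again as its trailing segments, which is where the index growth mentioned in the commit message comes from. With the flag off, the text passes through unchanged and tokenization is left entirely to PostgreSQL.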

@@ -1730,6 +1730,9 @@ backups:
     hidden: true
 
 search:
+  search_inject_extra_terms:
+    default: true
+    hidden: true
   min_search_term_length:
     client: true
     default: 3