Mirror of https://github.com/discourse/discourse.git (synced 2025-01-18 16:52:45 +08:00)
FIX: Further reduce the input of to_tsvector (#15716)
Random strings can result in much longer tsvectors. For example, parsing
a ~600 kB Base64 string can produce a tsvector of over 1 MB, which exceeds
the maximum size of a tsvector (1_048_576 bytes).
Follow-up-to: 823c3f09d4
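To make the failure mode concrete, here is a minimal sketch (not part of the commit) that measures how large a tsvector PostgreSQL builds from a large Base64 string. It assumes the pg gem and a reachable database; the "discourse_development" database name is a placeholder. On inputs like this the query either reports a size near the 1_048_576-byte limit or fails with PostgreSQL's "string is too long for tsvector" error, which is the failure this commit guards against.

require "pg"
require "securerandom"

# Hypothetical connection settings; adjust for your environment.
conn = PG.connect(dbname: "discourse_development")

# ~600 kB of Base64: pack("m0") encodes without newlines, turning 450_000
# random bytes into exactly 600_000 characters of [A-Za-z0-9+/=].
base64_blob = [SecureRandom.random_bytes(450_000)].pack("m0")

begin
  bytes = conn.exec_params(
    "SELECT pg_column_size(to_tsvector('english', $1)) AS bytes",
    [base64_blob]
  ).first["bytes"]
  puts "tsvector size: #{bytes} bytes (hard limit is 1_048_576)"
rescue PG::Error => e
  # An oversized result makes PostgreSQL raise rather than return the tsvector.
  puts "to_tsvector failed: #{e.message}"
ensure
  conn.close
end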
This commit is contained in:
parent e92f57255d
commit 820fea835c
@@ -120,11 +120,11 @@ class SearchIndexer
       a_weight: topic_title,
       b_weight: category_name,
       c_weight: topic_tags,
-      # Length of a tsvector must be less than 1_048_576 bytes.
-      # The difference between the max ouptut limit and imposed input limit
-      # accounts for the fact that sometimes the output tsvector may be
-      # slighlty longer than the input.
-      d_weight: scrub_html_for_search(cooked)[0..1_000_000]
+      # The tsvector resulted from parsing a string can be double the size of
+      # the original string. Since there is no way to estimate the length of
+      # the expected tsvector, we limit the input to ~50% of the maximum
+      # length of a tsvector (1_048_576 bytes).
+      d_weight: scrub_html_for_search(cooked)[0..600_000]
     ) do |params|
       params["private_message"] = private_message
     end
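The slicing rule itself can be exercised without a database. The sketch below is a standalone illustration, not Discourse code (the constant names are made up): it applies the same [0..600_000] cap the diff adds, keeping the input at a little over half of the 1_048_576-byte tsvector limit so that even a roughly doubled output stays under it.

# Maximum tsvector size imposed by PostgreSQL, and the input cap chosen by the
# commit (leaving headroom for the output tsvector to grow past the input).
MAX_TSVECTOR_BYTES = 1_048_576
SEARCH_INPUT_LIMIT = 600_000

# Mirrors the diff: String#[] with a range slices by character index, so this
# keeps at most 600_001 characters of the scrubbed post HTML.
def capped_search_text(scrubbed)
  scrubbed[0..SEARCH_INPUT_LIMIT]
end

long_text = "a" * 2_000_000                            # stand-in for scrub_html_for_search(cooked)
puts capped_search_text(long_text).length              # => 600001
puts SEARCH_INPUT_LIMIT.fdiv(MAX_TSVECTOR_BYTES).round(2)  # => 0.57, the "~50%" from the comment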