Constants should always be only assigned once. The logical OR assignment
of a constant is a relic of the past before we used zeitwerk for
autoloading and had bugs where a file could be loaded twice resulting in
constant redefinition warnings.
This improves the `TextSentinel` so that we don't consider CJK text as being uppercase and thus failing the validator.
It also optimizes the entropy computation by using native ruby `.bytes` to get all the bytes from the text.
It also tweaks the `seems_pronounceable?` and `seems_unpretentious?` check to use the `\p{Alnum}` unicode regexp group to account for non-latin languages.
Reference - https://meta.discourse.org/t/body-seems-unclear-error-when-users-are-typing-in-chinese/88715
Inspired by https://github.com/discourse/discourse/pull/27900
Co-authored-by: Paulo Magalhaes <mentalstring@gmail.com>
Over the years we accrued many spelling mistakes in the code base.
This PR attempts to fix spelling mistakes and typos in all areas of the code that are extremely safe to change
- comments
- test descriptions
- other low risk areas
nil was converted to "" and the matching regex would return [] and then be converted to nil with max usage.
Example exception:
```
NoMethodError (undefined method `<=' for nil:NilClass)
lib/text_sentinel.rb:71:in `seems_unpretentious?'
lib/validators/quality_title_validator.rb:13:in `validate_each'
lib/topic_creator.rb:25:in `valid?'
```
It used to simply say "title is invalid" without giving any hint what
the problem could be. This commit adds different errors messages for
all caps titles, low entropy titles or titles with very long words.
This reduces chances of errors where consumers of strings mutate inputs
and reduces memory usage of the app.
Test suite passes now, but there may be some stuff left, so we will run
a few sites on a branch prior to merging
introduce a couple of custom validators
fix minor discrepancies in tests
copy I18n error message keys to default location
clean up validation invocation
move some responsibilities out of validator into class