The previous method for reused the PrettyText logic which applied the
watched word logic, but had the unwanted effect of cooking the text too.
This meant that regular text values were converted to HTML.
Follow up to commit 5a4c35f627.
Fixes a flaky spec:
```
1) WordWatcher.word_matcher_regexp format of the result regexp is correct when watched_words_regular_expressions = true
Failure/Error: expect(regexp.inspect).to eq("/(#{word1})|(#{word2})/i")
expected: "/(word35)|(word36)/i"
got: "/(word36)|(word35)/i"
(compared using ==)
# ./spec/services/word_watcher_spec.rb:19:in `block (4 levels) in <main>'
```
Censored watched words were not censored inside the title of an inline
oneboxes. Malicious users could exploit this behaviour to insert bad
words. The same issue has been fixed for regular Oneboxes in commit
d184fe59ca.
The cache was causing state to leak between tests since the `WatchedWord` record in the DB would have been rolled back but `WordWatcher` still had the word in the cache.
The generated regular expressions did not contain \b which matched
every text that contained the word, even if it was only a substring of
a word.
For example, if "art" was a watched word a post containing word
"artist" matched.
It was not clear that replace watched words can be used to replace text
with URLs. This introduces a new watched word type that makes it easier
to understand.
Watched words are always regular expressions, despite watched_words_
_regular_expressions being enabled or not. Internally, wildcard
characters are replaced with a regular expression that matches any non
whitespace character.
* FIX: Hide tag watched words if tagging is disabled
These 'autotag' words were shown even if tagging was disabled.
* FIX: Make autotag watched words case insensitive
This commit also fixes the bug when no tag was applied if no other tag
was already present.
Previously watched words ignored topic titles when applying auto tagging rules.
Also copy has been improved to reflect how the system behaves.
The text hints that we are only watching first post now
This commit includes other various improvements to watched words.
auto_silence_first_post_regex site setting was removed because it overlapped
with 'require approval' watched words.
- Client-side censoring fixed for non-chrome browsers. (Regular expression rewritten to avoid lookback)
- Regex generation is now done on the server, to reduce repeated logic, and make it easier to extend in plugins
- Censor tests are moved to ruby, to ensure everything works end-to-end
- If "watched words regular expressions" is enabled, warn the admin when the generated regex is invalid
This commit contains 3 features:
- FEATURE: Allow downloading watched words
This introduces a button that allows admins to download watched words per action in a `.txt` file.
- FEATURE: Allow clearing watched words in bulk
This adds a "Clear All" button that clears all deleted words per action (e.g. block, flag etc.)
- FEATURE: List all blocked words contained in the post when it's blocked
When a post is rejected because it contains one or more blocked words, the error message now lists all the blocked words contained in the post.
-------
This also changes the format of the file for importing watched words from `.csv` to `.txt` so it becomes inconsistent with the extension of the file when watched words are exported.
This reduces chances of errors where consumers of strings mutate inputs
and reduces memory usage of the app.
Test suite passes now, but there may be some stuff left, so we will run
a few sites on a branch prior to merging