This commit fixes the follow quality issue with `PostSearchData#raw_data`:
1. URLs are being tokenized and links with similar href and characters
are being duplicated in the raw data.
`Post#cooked`:
```
<p><a href=\"https://meta.discourse.org/some.png\" class=\"onebox\" target=\"_blank\" rel=\"nofollow noopener\">https://meta.discourse.org/some.png</a></p>
```
`PostSearchData#raw_data` Before:
```
This is a test topic 0 Uncategorized https://meta.discourse.org/some.png discourse org/some png https://meta.discourse.org/some.png discourse org/some png
```
`PostSearchData#raw_data` After:
```
This is a test topic 0 Uncategorized https://meta.discourse.org/some.png meta discourse org
```
2. Ligthbox being included in search pollutes the
`PostSearchData#raw_data` unncessarily.
From 28 March 2018 to 28 March 2019, searches for the term `image` on
`meta.discourse.org` had a click through rate of 2.1%. Non-lightboxed images are not included in indexing for search yet we were indexing content within a lightbox. Also, search for terms like `image` was affected we were using `Pasted image` as the filename for
uploads that were pasted.
`Post#cooked`
```
<p>Let me see how I can fix this image<br>\n<div class=\"lightbox-wrapper\"><a class=\"lightbox\" href=\"https://meta.discourse.org/some.png\" title=\"some.png\" rel=\"nofollow noopener\"><img src=\"https://meta.discourse.org/some.png\" width=\"275\" height=\"299\"><div class=\"meta\">\n<svg class=\"fa d-icon d-icon-far-image svg-icon\" aria-hidden=\"true\"><use xlink:href=\"#far-image\"></use></svg><span class=\"filename\">some.png</span><span class=\"informations\">1750×2000</span><svg class=\"fa d-icon d-icon-discourse-expand svg-icon\" aria-hidden=\"true\"><use xlink:href=\"#discourse-expand\"></use></svg>\n</div></a></div></p>
```
`PostSearchData#raw_data` Before:
```
This is a test topic 0 Uncategorized Let me see how I can fix this image some.png png https://meta.discourse.org/some.png discourse org/some png some.png png 1750×2000
```
`PostSearchData#raw_data` After:
```
This is a test topic 0 Uncategorized Let me see how I can fix this image
```
In terms of indexing performance, we now have to parse the given HTML
through nokogiri twice. However performance is not a huge worry here since a string length of 194170 takes only 30ms
to scrub plus the indexing takes place in a background job.
Includes support for flags, reviewable users and queued posts, with REST API
backwards compatibility.
Co-Authored-By: romanrizzi <romanalejandro@gmail.com>
Co-Authored-By: jjaffeux <j.jaffeux@gmail.com>
* This is causing certain posts to appear in searches incorrectly as `PostSearchData#raw_data` contains the outdated title, category name and tag names.
Migrates email user options to a new data structure, where `email_always`, `email_direct` and `email_private_messages` are replace by
* `email_messages_level`, with options: `always`, `only_when_away` and `never` (defaults to `always`)
* `email_level`, with options: `always`, `only_when_away` and `never` (defaults to `only_when_away`)
* FEATURE: Account for `ignored_users` when merging two users
## Why?
This is part of the [Ability to ignore a user feature](https://meta.discourse.org/t/ability-to-ignore-a-user/110254/8).
When we merge two users, we need to account for merging their list of `ignored_users` too.
The webpush gem by default sets the expiration date of the JWT token to exactly 24 hours in the future. That's not really needed because the token isn't reused. And it might cause UnauthorizedRegistration if the server's clock isn't 100% correct, because the maximum allowed value is 24 hours.
Previously it would unhide their post but leave them silenced.
This fix also cleans up some of the helper classes to make it easier
to pass extra data to the silencing code (for example, a link to the
post that caused the user to be silenced.)
This patch also refactors the auto_silence specs to avoid using
stubs.
Previously the push notification code path was not tested for notification
collapsing. This happens if you get multiple replies to a topic you are
watching.
Previously we would notify on small actions if they were whispers
this inconsistently lead to all sorts of problems including
- collapsed "N replies" after assign
- empty push notifications
New behavior adds an api to explicitly send push notifications as well
if needed: create_notification_alert
Changes to functionality
- Removed syncing of user metadata including gender, location etc.
These are no longer available to standard Facebook applications.
- Removed the remote 'revoke' functionality. No other providers have
it, and it does not appear to be standard practice in other apps.
- The 'facebook_no_email' event is no longer logged. The system can
cope fine with a missing email address.
Data is migrated to the new user_associated_accounts table.
facebook_user_infos can be dropped once we are confident the data has
been migrated successfully.
4481836 introduced accent stipping in search_indexer,
but we need to strip it from the query itself as well
TODO in search with diacritics:
- Still need to fix excerpts on search page
- need to support accent stripping in in_topic search
- need to make sure that in:title works correctly
- need to fix "word boldening" in titles
This splits off the logic between SSO keys used incoming vs outgoing, it allows to far better restrict who is allowed to log in using a site.
This allows for better auditing of the SSO provider feature
- By default, behaviour is not changed: tags are made lowercase upon creation and edit.
- If force_lowercase_tags is disabled, then mixed case tags are allowed.
- Tags must remain case-insensitively unique. This is enforced by ActiveRecord and Postgres.
- A migration is added to provide a `UNIQUE` index on `lower(name)`. Migration includes a safety to correct any current tags that do not meet the criteria.
- A `where_name` scope is added to `models/tag.rb`, to allow easy case-insensitive lookups. This is used instead of `Tag.where(name: "blah")`.
- URLs remain lowercase. Mixed case URLs are functional, but have the lowercase equivalent as the canonical.
* Add revision number to notification url
* Pop modal on route change
* Add semicolon
* Ensure modal pops even when navigating within a topic
* Ensure modal pops when visiting from other page
* Fix eslint errors
* Fix prettier errors
* Add callback for notification item click
* Remove stray revisionUrl function
* Rename to afterRouteComplete
* Phase 0 for user-selectable theme components
- Drops `key` column from the `themes` table
- Drops `theme_key` column from the `user_options` table
- Adds `theme_ids` (array of ints default []) column to the `user_options` table and migrates data from `theme_key` to the new column.
- Removes the `default_theme_key` site setting and adds `default_theme_id` instead.
- Replaces `theme_key` cookie with a new one called `theme_ids`
- no longer need Theme.settings_for_client
Introduce new patterns for direct sql that are safe and fast.
MiniSql is not prone to memory bloat that can happen with direct PG usage.
It also has an extremely fast materializer and very a convenient API
- DB.exec(sql, *params) => runs sql returns row count
- DB.query(sql, *params) => runs sql returns usable objects (not a hash)
- DB.query_hash(sql, *params) => runs sql returns an array of hashes
- DB.query_single(sql, *params) => runs sql and returns a flat one dimensional array
- DB.build(sql) => returns a sql builder
See more at: https://github.com/discourse/mini_sql
This updates tests to use latest rails 5 practice
and updates ALL dependencies that could be updated
Performance testing shows that performance has not regressed
if anything it is marginally faster now.