If a user always read all group messages, we will never update the
`first_pm_unread_at` column since the previous query will not return the
group_user. Instead, we should update `first_pm_unread_at` to the
current timestamp if the user has read everything.
Follow-up to 9b75d95fc6
In c6ceda8c, a bug was introduced where an admin searching for his own
private messages will actually end up searching through all private
messages on the site.
Follow-up to c6ceda8c4e
This PR introduces a few important changes to secure media redaction in emails. First of all, two new site settings have been introduced:
* `secure_media_allow_embed_images_in_emails`: If enabled we will embed secure images in emails instead of redacting them.
* `secure_media_max_email_embed_image_size_kb`: The cap to the size of the secure image we will embed, defaulting to 1mb, so the email does not become too big. Max is 10mb. Works in tandem with `email_total_attachment_size_limit_kb`.
`Email::Sender` will now attach images to the email based on these settings. The sender will also call `inline_secure_images` in `Email::Styles` after secure media is redacted and attachments are added to replace redaction messages with attached images. I went with attachment and `cid` URLs because base64 image support is _still_ flaky in email clients.
All redaction of secure media is now handled in `Email::Styles` and calls out to `PrettyText.strip_secure_media` to do the actual stripping and replacing with placeholders. `app/mailers/group_smtp_mailer.rb` and `app/mailers/user_notifications.rb` no longer do any stripping because they are earlier in the pipeline than `Email::Styles`.
Finally the redaction notice has been restyled and includes a link to the media that the user can click, which will show it to them if they have the necessary permissions.
![image](https://user-images.githubusercontent.com/920448/92341012-b9a2c380-f0ff-11ea-860e-b376b4528357.png)
- Lets child components extend color definitions
- Includes default theme color definitions
- Fails gracefully on color stylesheet SCSS errors
- Includes theme variables when extending colors
There is a request spec that was ignored with the `xit` flag almost a
year ago and every time you generate the api docs with
```
rake rswag:specs:swaggerize
```
it shows the output of this pending test and I guess I finally got sick
of looking at it, so here is a fix for it.
Original Commit: d84c34ad75
This PR ensures that new bookmarks cannot be created for deleted posts and topics, and also makes sure that if a bookmark was created and then the topic deleted that the show topic page does not error from trying to retrieve the bookmark reminder at.
It is possible that a user could exist without an email, if so we should
not enqueue a job to download their gravatar.
This commit resolves this error that can occur:
```
Job exception: undefined method `email' for nil:NilClass
/var/www/discourse/app/models/user.rb:1204:in `email'
/var/www/discourse/app/jobs/regular/update_gravatar.rb:12:in `execute'
```
This commit also fixes the original spec which actually was wrong. The
job never enqueued in the original spec and so the gravatar was never
actually updated and the test was checking if the two values were the
same, but they were both null and never updated, so of course they were
the same!
A new test has also been added to make sure the gravatar job isn't
enqueued when a user's email is missing.
DEV: add plugin hooks for silence message parameters
Allows plugins to add, and update extra silence message params for custom
i18n vars
Allows plugins to override system messages via `message_title` and
`message_raw` parameters. We can later expose these params where necessary via event
hooks. Expose the parameter for the on user_silenced trigger.
When a category is removed from `auto_watch_category` we are removing
CategoryUser. However, there are still TopicUser with notification level
set to `watching` which was inherited from Category.
We should move them back to `regular` unless they were modified by a user.
* FEATURE: Use predictable filenames inside the user archive export
* FEATURE: Include badges in user archive export
* FEATURE: Add user_visits table to the user archive export
Previously in some cases the test suite could fail due to a bad entry in
redis from previous tests
This ensures the correct cache is expired when needed
Additionally improves performance of the redis check
"en_US" doesn't contain most of the translations, so it falls back to "en". But that behavior stopped translation overrides to work for pluralized strings in "en_US", because it relies on existing translations. This fixes it by looking up the existing translation in all fallback locales.
This is in preparation for improvements to the user archive export data.
Some refactors happened along the way, including calling the different _export methods 'components' of the zip file.
Additionally, make the test for post export much more comprehensive.
Copy sources:
app/jobs/regular/export_csv_file.rb
spec/jobs/export_csv_file_spec.rb
This commit adds a new site setting "allowed_onebox_iframes". By default, all onebox iframes are allowed. When the list of domains is restricted, Onebox will automatically skip engines which require those domains, and use a fallback engine.
Originally disabled in 0c0192e. Upload specs now use separate paths for each spec worker.
Fixes an issue in UploadRecovery#recover_from_local – it didn't take into account the testing infix (e.g. test_0) in the uploads/tombstone paths.
With the addition of `PostSearchData#private_message`, a partial
index consisting of only search data from regular posts can be created.
The partial index helps to speed up searches on large sites since PG
will not have to do an index scan on the entire search data index which
has shown to be a bottle neck.
These fields are required when using the UI and if `suspend_until`
params isn't used the user never is actually suspended so we should
require these fields when suspending a user.
This commit is addressing an issue where it is possible that there could
be multiple topic timer jobs running to close a topic or a weird race
condition state causing a topic that was just closed to be re-opened.
By removing the logic from the Topic Timer model into the Topic Timer
controller endpoint we isolate the code that is used for setting an
auto-open or an auto-close timer to just that functionality making the
topic timer background jobs safer if multiple are running.
Possibly in the future if we would like this logic back in the model a
refactor will be needed where we actually pass in the auto-close and
auto-open action instead of mixing it with the close and open
action that is currently being passed to the controller.
When someone wants to add > 1000 users at once they will hit a timeout.
Therefore, we should introduce limit and inform the user when limit is exceeded.
* ensure emails don't have spaces
* import banned users as suspended for 1k yrs
* upgrade users to TL2 if they have comments
* topic: import views, closed and pinned info
* import messages
* encode vanilla usernames for permalinks. Vanilla usernames can contain spaces and special characters.
* parse Vanilla's new rich body format
The filter noops if an incorrect username is passed. This filter is not
exposed as part of the UI but is only used when an admin transitions
from a search within a user's personal messages to the full page search.
Follow-up to 4b30799054.
Renamed from `private_messages` to `personal_messages` without
deprecation because the `private_messages` advanced search filter never
worked in the first place when it was implemented.
This also ensures that restoring a backup works when it was created with the wrong upload paths in the time between ab4c0a4970 (shortly after v2.6.0.beta1) and this fix.
This was likely introduced with the refactor to make ColorSchemeColor a database object. Add a test so this doesn't happen again.
Also test other basics of the WizardSerializer.
For some reason, the .as_json left Ruby objects in; I solved this with a round trip through JSON during the test.
Because we allow all the other flag types on a deleted post we should be
able to send a pm to the user letting them know why we deleted their
post.
Bug report:
https://meta.discourse.org/t/-/161156
Like "default watching" and "default tracking" categories option now the "regular" categories support is added. It will be useful for sites that are muted by default. The user option will be displayed only if `mute_all_categories_by_default` site setting is enabled.
* FIX: Unlike own posts on ownership transfer
If a user has liked a post that has passed the
`post_undo_action_window_mins` system setting window and you transfer ownership
of that post to that user you will be the owner of a post that you have
liked, but cannot unlike resulting in a weird UI behavior. This commit
fixes this issue.
The existing tests didn't check for the timeout window for unliking
posts so I added that in.
I couldn't find a good way to do this logic inside of the guardian class
so rather than duplicating behavior of the `PostActionDestroyer` class
inside of the `PostOwnerChanger` I decided to pass in a "bypass"
variable that could be used to check if the calling class is the
'post_owner_changer' and bypass the guardian instead. I went this route
because the guardian `can_delete_post_action` method has no way of
distinguishing how to allow a user to be able to unlike their own posts
after the timeout window but only on a post owner change.
* use an options hash instead
The rotp gem is currently pinned to version 5.1.0 and this will bump it
up to version 6.0.1.
Follow up to: 85d4370f79
because this issue we were waiting on is now closed:
https://github.com/mdp/rotp/issues/98
Because version 6 is now encoding the params I needed to update the
tests as well.
Enabling the moderators_manage_categories_and_groups site setting will allow moderator users to create/manage groups.
* show New Group form to moderators
* Allow moderators to update groups and read logs, where appropriate
* Rename site setting from create -> manage
* improved tests
* Migration should rename old log entries
* Log group changes, even if those changes mean you can no longer see the group
* Slight reshuffle
* RouteTo /g if they no longer have permissions to view group
Themes can now declare custom colors that get compiled in core's color definitions stylesheet, thus allowing themes to better support dark/light color schemes.
For example, if you need your theme to use tertiary for an element in a light color scheme and quaternary in a dark scheme, you can add the following SCSS to your theme's `color_definitions.scss` file:
```
:root {
--mytheme-tertiary-or-quaternary: #{dark-light-choose($tertiary, $quaternary)};
}
```
And then use the `--mytheme-tertiary-or-quaternary` variable as the color property of that element. You can also use this file to add color variables that use SCSS color transformation functions (lighten, darken, saturate, etc.) without compromising your theme's compatibility with different color schemes.
There is an fk to user_profile that can make destroying uploads fail
if they happen to be set as user profile.
This ensures we clear this information when destroying uploads.
There are more relationships, but this makes some more progress.
Convert all IMAP logging to write to a database table for easier inspection. These logs are cleaned up daily if they are > 5 days old.
Logs can easily be watched in dev by setting DISCOURSE_DEV_LOG_LEVEL=\"debug\" and running tail -f development.log | grep IMAP
It's possible that the original topic image is broken in some form, so
we shouldn't try and generate a topic thumbnail for it. The fix will
prevent the generate_topic_thumbnails job being enqueued every time the
topic is viewed.
For sites that are configured to mute some or all categories and tags
for users by default, groups can now be configured to set members'
notification level to normal from the group manage UI.
* FEATURE: don't notify about changed tags for a private message
Only staff members observing specific tag should receive a notification
* FIX: remove other category which is not used
* FIX: improved specs to ensure that revise was succesful
Previously we did an early return if either SiteSetting.tagging_enabled or SiteSetting.allow_staff_to_tag_pms was false when updating the email on the IMAP server -- however this also stopped us from archiving or deleting emails if either of these were disabled.
The category model already has a default value for `color` and
`text_color` so they don't need to be required via the API. The ember UI
already requires that colors be selected.
The name of the category also doesn't need to be required when updating
the category either because we are already passing in the id for the
category we want to change.
These changes improve the api experience because you no longer have to
lookup the category name, color, or text color before updating a single
category attribute. When creating a category the name is still required.
https://meta.discourse.org/t/-/132424/2
This fixes an issue where a non-default theme set to use the base color
scheme (i.e. the theme had an empty `color_scheme_id`) was loading the
default theme's color scheme instead.
Sometime parallel spec if failing with error:
```
NoMethodError:
undefined method `data' for nil:NilClass
# ./spec/models/topic_tracking_state_spec.rb:339:in `block (4 levels) in <main>'
```
I have a theory that it might be related to instance variables in before block
Adds functionality to reflect topic delete in Discourse to IMAP inbox (Gmail only for now) and reflecting Gmail deletes in Discourse.
Adding lots of tests, various refactors and code improvements.
When Discourse topic is destroyed in PostDestroyer mark the topic incoming email as imap_sync: true, and do the opposite when post is recovered.
The `/directory_items` route needs to have a .json url, but the rails
url helper `_path` doesn't return the format of the route.
I tried passing in a format options to `directory_items_path`. Which
works in the rails console
```
[8] pry(main)> directory_items_path(params.merge(:format => :json))
=> "/directory_items.json?page=1"
```
but when I added that some logic to the controller it comes out as
```
/directory_items?format=json&page=1
```
(which is actually how I expect it to work based on how you pass in the
format param). Anyways, because I couldn't figure out how to pass a
format to the `_path` helper I just used URI.parse to append `.json`
manually.
When we run the S3 inventory, mark uploads that exist as verified true, those that don't as verified false, and uploads not included in the check / not yet checked as verified nil.
The UI prevents users from trying to create tags on topics when they
don't have permission, but if you are trying to add tags to a topic via
the API and you don't have permission before this change it would
silently succeed in creating the topic, but it wouldn't have any tags.
Now a 422 error will be returned with an error message when trying to
create a topic with tags when tagging is disabled or you don't have
enough trust level to add tags to a topic.
Bug report: https://meta.discourse.org/t/-/70525/14
Normally, secure media urls are linked like `/secure-media-uploads/...`. In this case, uploads were already being linked correctly.
But sometimes (e.g. when pulling hotlinked onebox images) secure media is referenced with a full domain name (`//example.com/secure-media-uploads`). This commit ensures that those uploads are also linked correctly.
When a tab is open but left unattended for a while, the red, green, and blue
pills tend to go out of sync.
So whevener we open the notifications menu, we sync up the notification count
(eg. blue and green pills) with the server.
However, the reviewable count (eg. the red pill) is not a notification and
is located in the hamburger menu. This commit adds a new route on the server
side to retrieve the reviewable count for the current user and a ping
(refreshReviewableCount) from the client side to sync the reviewable count
whenever they open the hamburger menu.
REFACTOR: I also refactored the hamburger-menu widget code to prevent repetitive uses
of "this.".
PERF: I improved the performance of the 'notify_reviewable' job by doing only 1 query
to the database to retrieve all the pending reviewables and then tallying based on the
various rights.
Similar to `advanced_filter` I introduced `advanced_order`.
I needed a new option because default orders are evaluated after advanced_filter so I couldn't use it.
Also, that part is a little bit more generic
```
elsif word =~ /order:\w+/
@order = word.gsub('order:', '').to_sym
nil
```
After those changes, I can use them in plugins in this way:
```
Search.advanced_order(:votes) do |posts|
posts.reorder("COALESCE((SELECT dvvc.counter FROM discourse_voting_vote_counters dvvc WHERE dvvc.topic_id = subquery.topic_id), 0) DESC")
end
```
* FEATURE: set notification levels when added to a group
This feature allows admins and group owners to define default
category and tag tracking levels that will be applied to user
preferences automatically at the time when users are added to the
group. Users are free to change those preferences afterwards.
When removed from a group, the user's notification preferences aren't
changed.
In the near future, we will be swtiching to PG headlines to generate the
search blurb. As such, we need to replace audio and video links in the
raw data used for headline generation. This also means that we avoid
replacing links each time we need to generate the blurb.
Previously we would unconditionally keep all images downloaded via pull_hotlinked_images, even if they are later removed from the post. This commit removes that logic, and relies on the existing link_post_uploads process to pick up the downloaded images in `cooked`. Specs are added to ensure this is working correctly for regular hotlinked images, and for oneboxes.
This commit should cause no functional change
- Split into functions to avoid deep nesting
- Register custom field type, and remove manual json parse/serialize
- Recover from deleted upload records
Also adds a test to ensure pull_hotlinked_images redownloads secure images only once
The assign plugin is one of two situations where a post can be both a whisper and a small-action. Check the action_code field to filter out small-actions.
* Fixed an issue I introduced in the last PR where I am just archiving everything regardless of whether it is actually archived in Discourse man_facepalming
* Refactor group list_mailboxes IMAP code to use providers, add specs, and add provider code to get the correct prodivder
A first step to adding automatic dark mode color scheme switching. Adds a new SCSS file at `color_definitions.scss` that serves to output all SCSS color variables as CSS custom properties. And replaces all SCSS color variables with the new CSS custom properties throughout the stylesheets.
This is an alpha feature at this point, can only be enabled via console using the `default_dark_mode_color_scheme_id` site setting.
Apparently latest.json and latest.rss are not routed to the same
controller methods. This change allows for any passed in query
parameters to actually be applied to the rss route.
This came in as a request on meta:
https://meta.discourse.org/t/-/155812/6
Currently, we only reset `email_digests`, `email_level` and `email_messages_level` when the user wants to unsubscribe from all email.
`mailing_list_mode` should be reset as well
Adds a imap_group_id column to IncomingEmail to deal with an issue where we were trying to update emails in the mailbox, calling IncomingEmail.where(imap_sync: true). However UID and UIDVALIDITY could be the same across accounts. So if group A used IMAP details for Gmail account A, and group B used IMAP details for Gmail account B, and both tried to sync changes to an email with UID of 3 (e.g. changing Labels), one account could affect the other. This even applied to Archiving!
Also in this PR:
* Fix error occurring if we do a uid_fetch and no emails are returned
* Allow for creating labels within the target mailbox (previously we would not do this, only use existing labels)
* Improve consistency for log messages
* Add specs for generic IMAP provider (Gmail specs still to come)
* Add custom archiving support for Gmail
* Only use Message-ID for uniqueness of IncomingEmail if it was generated by us
* Various refactors and improvements
* DEV: Show message when cannot invite user to PM
When inviting a user to a PM return a message that says, "Sorry, this
user can't be invited." if they have been muted or are not in a users
allowed pm users list.
* Minor refactor & improved some text
For the following conditions, the TopicUser.bookmarked column was not updated correctly:
* When a bookmark was auto-deleted because the reminder was sent
* When a bookmark was auto-deleted because the owner of the bookmark replied to the topic
This adds another migration to fix the out-of-sync column and also some refactors to BookmarkManager to allow for more of these delete cases. BookmarkManager is used instead of directly destroying the bookmark in PostCreator and BookmarkReminderNotificationHandler.
This changes PG text search to only match the given title against
lexemes that are formed from the title. Likewise, the given raw will
only be matched against lexemes that are formed from the post's raw.
If a user posted a topic and Akismet decided it was spam, the topic gets deleted and put into the review queue. If a category moderator for that category marked the post/topic as "Not Spam" the topic did not get recovered correctly because Guardian.new(@user).can_review_topic?(@post.topic) returned false incorrectly because the topic was deleted.
We do prefix matching in search so there is no need to inject the extra
terms.
Before:
```
"'discourse':10,11 'discourse.org':10,11 'org':10,11 'test':8A,10,11 'test.discourse.org':10,11 'titl':4A 'uncategor':9B"
```
After:
```
"'discourse.org':10,11 'org':10,11 'test':8A 'test.discourse.org':10,11 'titl':4A 'uncategor':9B"
```
There was a bug that even when `email_digest` was set to false but
`digest_after_minutes` was positive, we were not displaying correct
status.
In addition, the message is improved when the user is unsubscribed +
unsubscribe from all is hidden.
This reverts commit 84de643c04.
Users are using the search endpoint as public API even though it is
meant to be internal. Revert for now while we figure out the path
forward on providing a more stable API to end users.
This adds a special filter to topic lists that will filter to tracked and
watched categories.
To use it you can visit:
`https://sitename/?filter=tracked`
`https://sitename/unread?filter=tracked`
and so on
Note, we do not include explicitly tracked and watched topics **outside** of
the tracked categories and tags.
We can consider a `filter=all_tracked` to cover this edge case.
Before this commit if you were bulk removing group members and passed in
a user who wasn't currently a member of that group the whole request
would fail. This change will return a 200 response now listing the users
that were removed and those that were skipped.
To reproduce the initial issue here:
1. A user makes a post, which discourse-akismet marks as spam (I cheated and called `DiscourseAkismet::PostsBouncer.new.send(:mark_as_spam, post)` for this)
2. The post lands in the review queue
3. The category the topic is in has a `reviewable_by_group_id`
4. A user in that group goes and looks at the Review queue, decides the post is not spam, and clicks Not Spam
5. Weird stuff happens because the `PostDestroyer#recover` method didn't handle this (the user who clicked Not Spam was not the owner of the post and was not a staff member, so the post didn't get un-destroyed and post counts didn't get updated)
Now users who belong to a group who can review a category now have the ability to recover/delete posts fully.
This adds an option to "delete on owner reply" to bookmarks. If you select this option in the modal, then reply to the topic the bookmark is in, the bookmark will be deleted on reply.
This PR also changes the checkboxes for these additional bookmark options to an Integer column in the DB with a combobox to select the option you want.
The use cases are:
* Sometimes I will bookmark the topics to read it later. In this case we definitely don’t need to keep the bookmark after I replied to it.
* Sometimes I will read the topic in mobile and I will prefer to reply in PC later. Or I may have to do some research before reply. So I will bookmark it for reply later.
* FEATURE: Allow List for PMs
This feature adds a new user setting that is disabled by default that
allows them to specify a list of users that are allowed to send them
private messages. This way they don't have to maintain a large list of
users they don't want to here from and instead just list the people they
know they do want. Staff will still always be able to send messages to
the user.
* Update PR based on feedback
In 1bd8a075, a hidden site setting was added that causes Email::Styles
to treat its input as a complete document in all cases.
This commit enables that setting by default.
Some tests were removed that were broken by this change. They tested the
behaviour of applying email styles to empty strings. They weren't useful
because:
* Sending empty email is not something we ever intend to do,
* They were testing incidental behaviour - there are lots of
valid ways to process the empty string,
* Their intent wasn't clear from their descriptions,
It seems there was a discrepancy in that background images were attached
to the full slug category class: `category-:slug-:id` and our body class
only had `category-:slug`.
This fix adds support for both formats.
* Remove unneeded bookmark name index.
* Change bookmark search query to use post_search_data. This allows searching on topic title and post content
* Tweak the style/layout of the bookmark list so the search looks better and the whole page fits better on mobile.
* Added scopes UI
* Create scopes when creating a new API key
* Show scopes on the API key show route
* Apply scopes on API requests
* Extend scopes from plugins
* Add missing scopes. A mapping can be associated with multiple controller actions
* Only send scopes if the use global key option is disabled. Use the discourse plugin registry to add new scopes
* Add not null validations and index for api_key_id
* Annotate model
* DEV: Move default mappings to ApiKeyScope
* Remove unused attribute and improve UI for existing keys
* Support multiple parameters separated by a comma
It's possible through an import or other means to have images larger
than the current max allowed image size in the db.
If this happens the thumbnail generation job will keep running
indefinitely trying to download a new copy of the original but
discarding it because it is larger than the max_file_size eventually
causing this error
`Job exception: undefined method `path' for nil:NilClass`
because the newly downloaded image is now nil.
This fix stops the enqueuing of the `GenerateTopicThumbnails` job for
all images that happen to be larger than the max image size.
Considering document length in search introduced too much variance in
our search results such that it makes certain searches better but at the
same time made certain searches worst. Instead, we want to have a more
determistic way of ranking search so that it is easier to reason about
why a post is rank higher in search than another.
The long term plan to tackle repeated terms is to restrict the number of
positions for a given lexeme in our search index.
Follow up to d8c796bc4.
Note that his change increases query time by around 40% in the following
benchmark against `dev.discourse.org` but this is a tradeoff that has to be taken so that relevance
search is accurate.
```
require 'benchmark/ips'
Benchmark.ips do |x|
x.config(time: 10, warmup: 2)
x.report("current aggregate search query") do
DB.exec <<~SQL
SELECT "posts"."id", "posts"."user_id", "posts"."topic_id", "posts"."post_number", "posts"."raw", "posts"."cooked", "posts"."created_at", "posts"."updated_at", "posts"."reply_to_post_number", "posts"."reply_count", "posts"."quote_count", "posts"."deleted_at", "posts"."off_topic_count", "posts"."like_count", "posts"."incoming_link_count", "posts"."bookmark_count", "posts"."score", "posts"."reads", "posts"."post_type", "posts"."sort_order", "posts"."last_editor_id", "posts"."hidden", "posts"."hidden_reason_id", "posts"."notify_moderators_count", "posts"."spam_count", "posts"."illegal_count", "posts"."inappropriate_count", "posts"."last_version_at", "posts"."user_deleted", "posts"."reply_to_user_id", "posts"."percent_rank", "posts"."notify_user_count", "posts"."like_score", "posts"."deleted_by_id", "posts"."edit_reason", "posts"."word_count", "posts"."version", "posts"."cook_method", "posts"."wiki", "posts"."baked_at", "posts"."baked_version", "posts"."hidden_at", "posts"."self_edits", "posts"."reply_quoted", "posts"."via_email", "posts"."raw_email", "posts"."public_version", "posts"."action_code", "posts"."locked_by_id", "posts"."image_upload_id" FROM "posts" JOIN (SELECT *, row_number() over() row_number FROM (SELECT topics.id, min(posts.post_number) post_number FROM "posts" INNER JOIN "post_search_data" ON "post_search_data"."post_id" = "posts"."id" INNER JOIN "topics" ON "topics"."id" = "posts"."topic_id" AND ("topics"."deleted_at" IS NULL) LEFT JOIN categories ON categories.id = topics.category_id WHERE ("posts"."deleted_at" IS NULL) AND "posts"."post_type" IN (1, 2, 3, 4) AND (topics.visible) AND (topics.archetype <> 'private_message') AND (post_search_data.search_data @@ TO_TSQUERY('english', '''postgres'':*ABCD')) AND (categories.id NOT IN (
SELECT categories.id WHERE categories.search_priority = 1
)
) AND ((categories.id IS NULL) OR (NOT categories.read_restricted)) GROUP BY topics.id ORDER BY MAX((
TS_RANK_CD(
post_search_data.search_data,
TO_TSQUERY('english', '''postgres'':*ABCD'),
1|32
) *
(
CASE categories.search_priority
WHEN 2
THEN 0.6
WHEN 3
THEN 0.8
WHEN 4
THEN 1.2
WHEN 5
THEN 1.4
ELSE
CASE WHEN topics.closed
THEN 0.9
ELSE 1
END
END
)
)
) DESC, topics.bumped_at DESC LIMIT 51 OFFSET 0) xxx) x ON x.id = posts.topic_id AND x.post_number = posts.post_number WHERE ("posts"."deleted_at" IS NULL) ORDER BY row_number;
SQL
end
x.report("current aggregate search query with proper ranking") do
DB.exec <<~SQL
SELECT "posts"."id", "posts"."user_id", "posts"."topic_id", "posts"."post_number", "posts"."raw", "posts"."cooked", "posts"."created_at", "posts"."updated_at", "posts"."reply_to_post_number", "posts"."reply_count", "posts"."quote_count", "posts"."deleted_at", "posts"."off_topic_count", "posts"."like_count", "posts"."incoming_link_count", "posts"."bookmark_count", "posts"."score", "posts"."reads", "posts"."post_type", "posts"."sort_order", "posts"."last_editor_id", "posts"."hidden", "posts"."hidden_reason_id", "posts"."notify_moderators_count", "posts"."spam_count", "posts"."illegal_count", "posts"."inappropriate_count", "posts"."last_version_at", "posts"."user_deleted", "posts"."reply_to_user_id", "posts"."percent_rank", "posts"."notify_user_count", "posts"."like_score", "posts"."deleted_by_id", "posts"."edit_reason", "posts"."word_count", "posts"."version", "posts"."cook_method", "posts"."wiki", "posts"."baked_at", "posts"."baked_version", "posts"."hidden_at", "posts"."self_edits", "posts"."reply_quoted", "posts"."via_email", "posts"."raw_email", "posts"."public_version", "posts"."action_code", "posts"."locked_by_id", "posts"."image_upload_id" FROM "posts" JOIN (SELECT *, row_number() over() row_number FROM (SELECT subquery.topic_id id, (ARRAY_AGG(subquery.post_number ORDER BY rank DESC, bumped_at DESC))[1] post_number, MAX(subquery.rank) rank, MAX(subquery.bumped_at) bumped_at FROM (SELECT "posts"."id", "posts"."user_id", "posts"."topic_id", "posts"."post_number", "posts"."raw", "posts"."cooked", "posts"."created_at", "posts"."updated_at", "posts"."reply_to_post_number", "posts"."reply_count", "posts"."quote_count", "posts"."deleted_at", "posts"."off_topic_count", "posts"."like_count", "posts"."incoming_link_count", "posts"."bookmark_count", "posts"."score", "posts"."reads", "posts"."post_type", "posts"."sort_order", "posts"."last_editor_id", "posts"."hidden", "posts"."hidden_reason_id", "posts"."notify_moderators_count", "posts"."spam_count", "posts"."illegal_count", "posts"."inappropriate_count", "posts"."last_version_at", "posts"."user_deleted", "posts"."reply_to_user_id", "posts"."percent_rank", "posts"."notify_user_count", "posts"."like_score", "posts"."deleted_by_id", "posts"."edit_reason", "posts"."word_count", "posts"."version", "posts"."cook_method", "posts"."wiki", "posts"."baked_at", "posts"."baked_version", "posts"."hidden_at", "posts"."self_edits", "posts"."reply_quoted", "posts"."via_email", "posts"."raw_email", "posts"."public_version", "posts"."action_code", "posts"."locked_by_id", "posts"."image_upload_id", (
TS_RANK_CD(
post_search_data.search_data,
TO_TSQUERY('english', '''postgres'':*ABCD'),
1|32
) *
(
CASE categories.search_priority
WHEN 2
THEN 0.6
WHEN 3
THEN 0.8
WHEN 4
THEN 1.2
WHEN 5
THEN 1.4
ELSE
CASE WHEN topics.closed
THEN 0.9
ELSE 1
END
END
)
)
rank, topics.bumped_at bumped_at FROM "posts" INNER JOIN "post_search_data" ON "post_search_data"."post_id" = "posts"."id" INNER JOIN "topics" ON "topics"."id" = "posts"."topic_id" AND ("topics"."deleted_at" IS NULL) LEFT JOIN categories ON categories.id = topics.category_id WHERE ("posts"."deleted_at" IS NULL) AND "posts"."post_type" IN (1, 2, 3, 4) AND (topics.visible) AND (topics.archetype <> 'private_message') AND (post_search_data.search_data @@ TO_TSQUERY('english', '''postgres'':*ABCD')) AND (categories.id NOT IN (
SELECT categories.id WHERE categories.search_priority = 1
)
) AND ((categories.id IS NULL) OR (NOT categories.read_restricted))) subquery GROUP BY subquery.topic_id ORDER BY rank DESC, bumped_at DESC LIMIT 51 OFFSET 0) xxx) x ON x.id = posts.topic_id AND x.post_number = posts.post_number WHERE ("posts"."deleted_at" IS NULL) ORDER BY row_number;
SQL
end
x.compare!
end
```
```
Warming up --------------------------------------
current aggregate search query
1.000 i/100ms
current aggregate search query with proper ranking
1.000 i/100ms
Calculating -------------------------------------
current aggregate search query
18.040 (± 0.0%) i/s - 181.000 in 10.035241s
current aggregate search query with proper ranking
12.992 (± 0.0%) i/s - 130.000 in 10.007214s
Comparison:
current aggregate search query: 18.0 i/s
current aggregate search query with proper ranking: 13.0 i/s - 1.39x (± 0.00) slower
```
Indexing query strings in URLS produces inconsistent results in PG and
pollutes the search data for really little gain.
The following seems to work as expected...
```
discourse_development=# SELECT TO_TSVECTOR('https://www.discourse.org?test=2&test2=3');
to_tsvector
------------------------------------------------------
'2':3 '3':5 'test':2 'test2':4 'www.discourse.org':1
```
However, once a path is present
```
discourse_development=# SELECT TO_TSVECTOR('https://www.discourse.org/latest?test=2&test2=3');
to_tsvector
----------------------------------------------------------------------------------------------
'/latest?test=2&test2=3':3 'www.discourse.org':2 'www.discourse.org/latest?test=2&test2=3':1
```
The lexeme contains both the path and the query string.
```
discourse_development=# SELECT alias, lexemes FROM TS_DEBUG('www.discourse.org');
alias | lexemes
-------+---------------------
host | {www.discourse.org}
discourse_development=# SELECT TO_TSVECTOR('www.discourse.org');
to_tsvector
-----------------------
'www.discourse.org':1
```
Given the above lexeme, we will inject additional lexeme by splitting
the host on `.`. The actual tsvector stored will look something like
```
tsvector
---------------------------------------
'discourse':1 'discourse.org':1 'org':1 'www':1 'www.discourse.org':1
```