This PR improves the performance of the `most_replied_to_users` method on the `UserSummary` model.
### Old Query
```ruby
post_query
.joins(
"JOIN posts replies ON posts.topic_id = replies.topic_id AND posts.reply_to_post_number = replies.post_number",
)
# We are removing replies by @user, but we can simplify this by getting the using the user_id on the posts.
.where("replies.user_id <> ?", @user.id)
.group("replies.user_id")
.order("COUNT(*) DESC")
.limit(MAX_SUMMARY_RESULTS)
.pluck("replies.user_id, COUNT(*)")
.each { |r| replied_users[r[0]] = r[1] }
```
### Old Query with corrections
```ruby
post_query
.joins(
"JOIN posts replies ON posts.topic_id = replies.topic_id AND replies.reply_to_post_number = posts.post_number",
)
# Remove replies by @user but instead look on loaded posts (we do this so we don't count self replies)
.where("replies.user_id <> posts.user_id")
.group("replies.user_id")
.order("COUNT(*) DESC")
.limit(MAX_SUMMARY_RESULTS)
.pluck("replies.user_id, COUNT(*)")
.each { |r| replied_users[r[0]] = r[1] }
```
### New Query
```ruby
post_query
.joins(
"JOIN posts replies ON posts.topic_id = replies.topic_id AND posts.reply_to_post_number = replies.post_number",
)
# Only include regular posts in our joins, this makes sure we don't have the bloat of loading private messages
.joins(
"JOIN topics ON replies.topic_id = topics.id AND topics.archetype <> 'private_message'",
)
# Only include visible post types, so exclude posts like whispers, etc
.joins(
"AND replies.post_type IN (#{Topic.visible_post_types(@user, include_moderator_actions: false).join(",")})",
)
.where("replies.user_id <> posts.user_id")
.group("replies.user_id")
.order("COUNT(*) DESC")
.limit(MAX_SUMMARY_RESULTS)
.pluck("replies.user_id, COUNT(*)")
.each { |r| replied_users[r[0]] = r[1] }
```
# Conclusion
`most_replied_to_users` was untested, so I introduced a test for the logic, and have confirmed that it passes on both the new query **AND** the old query.
Thank you @danielwaterworth for the debugging assistance.
We will be collecting the logo URL and the site's default locale values along with existing basic details to display the site on the Discourse Discover listing page. It will be included only if the site is opted-in by enabling the "`include_in_discourse_discover`" site setting.
Also, we no longer going to use `about.json` and `site/statistics.json` endpoints retrieve these data. We will be using only the `site/basic-info.json` endpoint.
When a user is manually deactivated, they should not be deleted by our background job that purges inactive users.
In addition, site settings keywords should accept an array of keywords.
Previously the problem check registry simply looked at the subclasses of ProblemCheck. This was causing some confusion in environments where eager loading is not enabled, as the registry would appear empty as a result of the classes never being referenced (and thus never loaded.)
This PR changes the approach to a more explicit one. I followed other implementations (bookmarkable and hashtag autocomplete.) As a bonus, this now has a neat plugin entry point as well.
Why this change?
This is a follow up to 897be75941.
When updating `net-smtp` from `0.4.x` to `0.5.x`, our test suite passed
but the error `ArgumentError: SMTP-AUTH requested but missing user name`
was being thrown in production leading to emails being failed to send
out via SMTP.
This commit adds a test to ensure that our production SMTP settings will
at least attemp to connect to an SMTP server.
## Why this change?
The previous implementation of the method generated the query to find the relevant topics and iterated over the results, processing them.
This behavior made difficult reusing or changing the query logic in classes extending `CategoryList`.
This commit extracts the query logic into another method called `relevant_topics_query ` which can be reused or overwritten in descendant classes.
This was originally introduced in #26071, but that PR was closed, because the requirements changed. This PR lifts only the relevant parts, since they are a prerequisite for the new admin notice system.
This enables the following in Discourse AI
```
plugin.register_modifier(:chat_allowed_bot_user_ids) do |user_ids, guardian|
if guardian.user
mentionables = AiPersona.mentionables(user: guardian.user)
allowed_bot_ids = mentionables.map { |mentionable| mentionable[:user_id] }
user_ids.concat(allowed_bot_ids)
end
user_ids
end
```
some bots that are id < 0 need to be discoverable in search otherwise people can not talk to them.
---------
Co-authored-by: Joffrey JAFFEUX <j.jaffeux@gmail.com>
When "lazy load categories" is enabled and parent_category_id was set,
the query fetching categories contained a contradiction filtering both
by parent_category_id and parent_category_id = NULL.
When categories are loaded by the frontend, the parent category is
looked up by ID and the `parentCategory` is set with the result. If the
categories returned are not in order, the parent category may miss.
We were using `OFFSET`/`LIMIT` to query topics without an 'ORDER'. Without an explicit order, postgres makes no guarantees about which rows will be returned for each query. This commit adds `ORDER BY id ASC` so that our sitemaps behave consistently.
All our link validation, and conversion from url -> route/model/query is expensive and prone to bugs. Instead, if people enter a link, we can just use it as-is.
Originally all this extra logic was added to handle unusual situations like `/safe-mode`, `/my/...`, etc. However, all of these are now handled correctly by our Ember router, so there is no need for it.
Now, we just pass the user-supplied `href` directly to the SectionLink component, and let Ember handle routing to it when clicked.
The only functional change here is that we no longer validate internal links by parsing them with the Ember router. But I'd argue this is fine, because the previous logic would cause both false positives (e.g. `/t/123` would be valid, even if topic 123 doesn't exist), and false negatives (for routes which are server-side only, like the new AI share pages).
We were incorrectly using `return` in a block which was causing exceptions at runtime. These exceptions were not causing much issues as they are in defer block.
While working on writing a test for this specific case, I noticed that our `upsert_custom_fields` function was using rails `update_all` which is not updating the `updated_at` timestamp. This commit also fixes it and adds a test for it.
In #26122 we promoted all problem checks defined as class methods on AdminDashboardData to their own first-class ProblemCheck instances.
This PR continues that by promoting problem checks that are implemented as blocks as well. This includes updating a couple plugins that have problem checks.
This is a follow up to e2da72b76c.
Why this change?
According to https://web.dev/articles/preload-critical-assets,
> By preloading a certain resource, you are telling the browser that you would like to fetch it sooner than the browser would otherwise discover it because you are certain that it is important for the current page.
The preload resource hint is meant to tell the browser to fetch
resources that it would not discover upfront or early. However, we are
not using it the right way because we are literally adding the resource
hint right before a `<script>` tag which means the browser would have
discovered the resource even without the resource hint.
What does this change do?
This commit removes the preload resource hint which are added right
before script tags since the optimization here is highly questionable at the expense of making
our initial DOM larger.
We never use that information and this also fixes an issue with the BCC plugin which ends up triggering a rate-limit because we were publishing a "NEW_PRIVATE_MESSAGE" to the user sending the BCC for every recipients 💥
Internal - t/118283
This commit operates at three levels of abstraction:
1. We want to prevent user history rows from being unbounded in size.
This commit adds rails validations to limit the sizes of columns on
user_histories,
2. However, we don't want to prevent certain actions from being
completed if these columns are too long. In those cases, we truncate
the values that are given and store the truncated versions,
3. For endpoints that perform staff actions, we can further control
what is permitted by explicitly validating the params that are given
before attempting the action,
In AdminDashboardData we have a bunch of problem checks implemented as methods on that class. This PR absolves it of the responsibility by promoting each of those checks to a first class ProblemCheck. This way each of them can have their own priority and arbitrary functionality can be isolated in its own class.
Think "extract class" refactoring over and over. Since they were all moved we can also get rid of the @@problem_syms class variable which was basically the old version of the registry now replaced by ProblemCheck.realtime.
In addition AdminDashboardData::Problem value object has been entirely replaced with the new ProblemCheck::Problem (with compatible API).
Lastly, I added some RSpec matchers to simplify testing of problem checks and provide helpful error messages when assertions fail.
There are a couple of reasons for this.
The first one is practical, and related to eager loading. Since /lib is not eager loaded, when the application boots, ProblemCheck["identifier"] will be nil because the child classes aren't loaded.
The second one is more conceptual. There turns out to be a lot of inter-dependencies between the part of the problem check system that live in /app and the parts that live in /lib, which probably suggests it should all go in /app.
This change creates a user setting that they can toggle if
they don't want to receive unread notifications when someone closes a
topic they have read and are watching/tracking it.
Why this change?
There are two problematic queries in question here when loading
notifications in various tabs in the user menu:
```
SELECT "notifications".*
FROM "notifications"
LEFT JOIN topics ON notifications.topic_id = topics.id
WHERE "notifications"."user_id" = 1338 AND (topics.id IS NULL OR topics.deleted_at IS NULL)
ORDER BY notifications.high_priority AND NOT notifications.read DESC,
NOT notifications.read AND notifications.notification_type NOT IN (5,19,25) DESC,
notifications.created_at DESC
LIMIT 30;
```
and
```
EXPLAIN ANALYZE SELECT "notifications".*
FROM "notifications"
LEFT JOIN topics ON notifications.topic_id = topics.id
WHERE "notifications"."user_id" = 1338
AND (topics.id IS NULL OR topics.deleted_at IS NULL)
AND "notifications"."notification_type" IN (5, 19, 25)
ORDER BY notifications.high_priority AND NOT notifications.read DESC, NOT notifications.read DESC, notifications.created_at DESC LIMIT 30;
```
For a particular user, the queries takes about 40ms and 26ms
respectively on one of our production instance where the user has 10K notifications while the site has 600K notifications in total.
What does this change do?
1. Adds the `index_notifications_user_menu_ordering` index to the `notifications` table which is
indexed on `(user_id, (high_priority AND NOT read) DESC, (NOT read)
DESC, created_at DESC)`.
1. Adds a second index `index_notifications_user_menu_ordering_deprioritized_likes` to the `notifications`
table which is indexed on `(user_id, (high_priority AND NOT read) DESC, (NOT read AND notification_type NOT IN (5,19,25)) DESC, created_at DESC)`. Note that we have to hardcode the like typed notifications type here as it is being used in an ordering clause.
With the two indexes above, both queries complete in roughly 0.2ms. While I acknowledge that there will be some overhead in insert,update or delete operations. I believe this trade-off is worth it since viewing notifications in the user menu is something that is at the core of using a Discourse forum so we should optimise this experience as much as possible.
Why this change?
Prior to this change, the `CategoryList#find_relevant_topics` method was
loading and allocating all `CategoryFeaturedTopic` records in the
database to eventually only just use its `category_id` and `topic_id`
column. On a site with many `CategoryFeaturedTopic` records, the loading
of the ActiveRecord objects is a source of bottleneck.
The other problem with the `CategoryList#find_relevant_topics` method is
that it is unconditionally loading all records from the database even if
the user does not have access to the category. This again is wasteful.
What does this change do?
This commit makes it such that `CategoryList#find_relevant_topics` is
called only after `CategoryList#find_categories` in the `CategoryList#initialize`
method so that we can filter featured topics against categories that the
user has access to.
The second change is that Instead of loading `CategoryFeaturedTopic` records, we make an
inner join agains the `topics` table instead and skip any allocation of
`CatgoryFeaturedTopic` ActiveRecord objects.
As part of problem checks refactoring, we're moving some data to be DB backed. In this PR it's the tracking of problem check execution. When was it last run, when was the last problem, when should it run next, how many consecutive checks had problems, etc.
This allows us to implement the perform_every feature in scheduled problem checks for checks that don't need to be run every 10 minutes.
Users can hide their public profile and presence information by checking
“Hide my public profile and presence features” on the
`u/{username}/preferences/interface` page. In that case, we also don't
want to return user status from the server.
This work has been started in https://github.com/discourse/discourse/pull/23946.
The current PR fixes all the remaining places in Core.
Note that the actual fix is quite simple – a5802f484d.
But we had a fair amount of duplication in the code responsible for
the user status serialization, so I had to dry that up first. The refactoring
as well as adding some additional tests is the main part of this PR.
Now forums can enroll their sites to be showcased in the Discourse [Discover](https://discourse.org/discover) directory. Once they enable the site setting `include_in_discourse_discover` to enroll their forum the `CallDiscourseHub` job will ping the `api.discourse.org/api/discover/enroll` endpoint. Then the Discourse Hub will fetch the basic details from the forum and add it to the review queue. If the site is approved then the forum details will be displayed in the `/discover` page.