This commit updates `S3Inventory#files` to ignore S3 inventory files
which have a `last_modified` timestamp which are not at least 2 days
older than `BackupMetadata.last_restore_date` timestamp.
This check was previously only in `Jobs::EnsureS3UploadsExistence` but
`S3Inventory` can also be used via Rake tasks so this protection needs
to be in `S3Inventory` and not in the scheduled job.
This PR introduces a basic AdminNotice model to store these notices. Admin notices are categorized by their source/type (currently only notices from problem check.) They also have a priority.
When converting a PM to a public topic (and vice versa), if there was a validation error (like a topic already used, or a tag required or not allowed) the error message wasn't bubbled up nor shown to the user.
This fix ensures we properly stop the conversion whenever a validation error happens and bubble up the errors back to the user so they can be informed.
Internal ref - t/128795
In a large forum with millions of users and millions of user_fields
updating the list of dropdown user field options will result in a
502 now due to the large number of fields.
This commit moves the indexing into a job.
Previously, when the new site was created and after the first admin login, no one will receive notifications to review the user approval queue since only the moderators would receive the PMs about it. Also, this PR will change the "pending_users_reminder_delay_minutes" site setting to 5 minutes while the site is in bootstrap mode.
This adds a hidden site setting of `skip_email_bulk_invites`
If set to `true`, the `BulkInvite` job will pass the value to `Invite`, meaning the generated invite wont trigger an email notification being sent to the newly invited user.
(This is useful if you want to manage the sending of the invite emails outside of Discourse.)
Previously the problem check registry simply looked at the subclasses of ProblemCheck. This was causing some confusion in environments where eager loading is not enabled, as the registry would appear empty as a result of the classes never being referenced (and thus never loaded.)
This PR changes the approach to a more explicit one. I followed other implementations (bookmarkable and hashtag autocomplete.) As a bonus, this now has a neat plugin entry point as well.
The build is broken due to some changes not being staged when I pushed the previous PR. The assertions that check that a job has been scheduled needs to be updated to reflect the new name.
Doing the following renames:
Jobs::ProblemChecks → Jobs::RunProblemChecks
Jobs::ProblemCheck → Jobs::RunProblemCheck
This is to disambiguate the ProblemCheck class name, ease fuzzy finding, and avoid needing to use :: in a bunch of places.
As part of problem checks refactoring, we're moving some data to be DB backed. In this PR it's the tracking of problem check execution. When was it last run, when was the last problem, when should it run next, how many consecutive checks had problems, etc.
This allows us to implement the perform_every feature in scheduled problem checks for checks that don't need to be run every 10 minutes.
This PR adds a new scheduled problem check that simply tries to connect to Twitter OAuth endpoint to check that it's working. It is using the default retry strategy of 2 retries 30 seconds apart.
Now forums can enroll their sites to be showcased in the Discourse [Discover](https://discourse.org/discover) directory. Once they enable the site setting `include_in_discourse_discover` to enroll their forum the `CallDiscourseHub` job will ping the `api.discourse.org/api/discover/enroll` endpoint. Then the Discourse Hub will fetch the basic details from the forum and add it to the review queue. If the site is approved then the forum details will be displayed in the `/discover` page.
Also, remove experimental setting and simply use top_menu for feature detection
This means that when people eventually enable the hot top menu, there will
be topics in it
Co-authored-by: Alan Guo Xiang Tan <gxtan1990@gmail.com>
Previously, problem checks were all added as either class methods or blocks in AdminDashboardData. Another set of class methods were used to add and run problem checks.
As of this PR, problem checks are promoted to first-class citizens. Each problem check receives their own class. This class of course contains the implementation for running the check, but also configuration items like retry strategies (for scheduled checks.)
In addition, the parent class ProblemCheck also serves as a registry for checks. For example we can get a list of all existing check classes through ProblemCheck.checks, or just the ones running on a schedule through ProblemCheck.scheduled.
After this refactor, the task of adding a new check is significantly simplified. You add a class that inherits ProblemCheck, you implement it, add a test, and you're good to go.
A while ago we increased group SMTP read and open timeouts
to address issues we were seeing with Gmail sometimes giving
really long timeouts for these values. The commit was:
3e639e4aa7
Now, we want to increase all SMTP read timeouts to 30s,
since the 5s is too low sometimes, and the ruby Net::SMTP
stdlib also defaults to 30s.
Also, we want to slightly tweak the group smtp email job
not to fail if the IncomingEmail log fails to create, or if
a ReadTimeout is encountered, to avoid retrying the job in sidekiq
again and sending the same email out.
We just completed the 3.2 release, which marks a good time to drop some previously deprecated columns.
Since the column has been marked in ignored_columns, it has been inaccessible to application code since then. There's a tiny risk that this might break a Data Explorer query, but given the nature of the column, the years of disuse, and the fact that such a breakage wouldn't be critical, we accept it.
When exporting a csv file and the size of the file exceeded the
max_export_file_size_kb it will still send the PM that the export
succeeded with a broken link to a missing export file. This change
ensures that a failed message will be sent instead.
We're changing the implementation of trust levels to use groups. Part of this is to have site settings that reference trust levels use groups instead. It converts the min_trust_level_to_tag_topics site setting to tag_topic_allowed_groups.
We have all these calls to Group.refresh_automatic_groups! littered throughout the tests. Including tests that are seemingly unrelated to groups. This is because automatic group memberships aren't fabricated when making a vanilla user. There are two places where you'd want to use this:
You have fabricated a user that needs a certain trust level (which is now based on group membership.)
You need the system user to have a certain trust level.
In the first case, we can pass refresh_auto_groups: true to the fabricator instead. This is a more lightweight operation that only considers a single user, instead of all users in all groups.
The second case is no longer a thing after #25400.
We're changing the implementation of trust levels to use groups. Part of this is to have site settings that reference trust levels use groups instead. It converts the min_trust_level_to_tag_topics site setting to tag_topic_allowed_groups.
Why this change?
On CI, we have been seeing the "handles job concurrency" job timing out
on CI after 45 seconds. Upon closer inspection of `Jobs::Base#perform`
when cluster concurrency has been set, we see that a thread is spun up
to extend the expiring of a redis key by 120 seconds every 60 seconds
while the job is still being executed. The thread looks like this before
the fix:
```
keepalive_thread =
Thread.new do
while parent_thread.alive? && !finished
Discourse.redis.without_namespace.expire(cluster_concurrency_redis_key, 120)
sleep 60
end
end
```
In an ensure block of `Jobs::Base#perform`, the thread is stop by doing
something like this:
```
finished = true
keepalive_thread.wakeup
keepalive_thread.join
```
If the thread is sleeping, `keepalive_thread.wakeup` will stop the
`sleep` method and run the next iteration causing the thread to
complete. However, there is a timing issue at play here. If
`keepalive_thread.wakeup` is called at a time when the thread is not
sleeping, it will have no effect and the thread may end up sleeping for
60 seconds which is longer than our timeout on CI of 45 seconds.
What does this change do?
1. Change `sleep 60` to sleep in intervals of 1 second checking if the
job has been finished each time.
2. Add `use_redis_snapshotting` to `Jobs::Base` spec since Redis is
involved in scheduling and we want to ensure we don't leak Redis
keys.
3. Add `ConcurrentJob.stop!` and `thread.join` to `ensure` block in "handles job concurrency"
test since a failing expectation will cause us to not clean up the
thread we created in the test.
We're changing the implementation of trust levels to use groups. Part of this is to have site settings that reference trust levels use groups instead. It converts the min_trust_to_flag_posts site setting to flag_post_allowed_groups.
Note: In the original setting, "posts" is plural. I have changed this to "post" singular in the new setting to match others.
This change converts the min_trust_to_create_topic site setting to
create_topic_allowed_groups.
See: https://meta.discourse.org/t/283408
- Hides the old setting
- Adds the new site setting
- Add a deprecation warning
- Updates to use the new setting
- Adds a migration to fill in the new setting if the old setting was
changed
- Adds an entry to the site_setting.keywords section
- Updates tests to account for the new change
- After a couple of months, we will remove the min_trust_to_create_topicsetting entirely.
Internal ref: /t/117248
The most common thing that we do with fab! is:
fab!(:thing) { Fabricate(:thing) }
This commit adds a shorthand for this which is just simply:
fab!(:thing)
i.e. If you omit the block, then, by default, you'll get a `Fabricate`d object using the fabricator of the same name.
This commit fixes an issue where when some actions were done
(deleting/recovering post, moving posts) we updated the
topic_users.bookmarked column to the wrong value. This was happening
because the SyncTopicUserBookmarked job was not taking into account
Topic level bookmarks, so if there was a Topic bookmark and no
Post bookmarks for a user in the topic, they would have
topic_users.bookmarked set to false, which meant the bookmark would
no longer show in the /bookmarks list.
To reproduce before the fix:
* Bookmark a topic and don’t bookmark any posts within
* Delete or recover any post in the topic
c.f. https://meta.discourse.org/t/disappearing-bookmarks-and-expected-behavior-of-bookmarks/264670/36
We updated scheduled admin checks to run concurrently in their own jobs. The main reason for this was so that we can implement re-check functionality for especially flaky checks (e.g. group e-mail credentials check.)
This works in the following way:
1. The check declares its retry policy using class methods.
2. A block can be yielded to if there are problems, but before they are committed to Redis.
3. The job uses this block to either a) schedule a retry if there are any remaining or b) do nothing and let the check commit.
This PR does some preparatory refactoring of scheduled admin checks in order for us to be able to do custom retry strategies for some of them.
Instead of running all checks in sequence inside a single, scheduled job, the scheduled job spawns one new job per check.
In order to be concurrency-safe, we need to change the existing Redis data structure from a string (of serialized JSON) to a list of strings (of serialized JSON).
If a codeblock contains **exactly** the same markdown as an image which has been retrieved by the 'pull hotlinked' job, then it will be replaced with the new URL. This commit adds failing (skipped) tests for this issue.