Commit Graph

1501 Commits

Author SHA1 Message Date
Sam
3fd3a76422
FIX: we introduced a Jobs::UserEmail which broke consistency checks (#30409)
Fix ensures all classes are rooted and there is a spec that will catch
failures next time
2024-12-22 21:33:47 +11:00
Alan Guo Xiang Tan
9812407f76
FIX: Redo Sidekiq monitoring to restart stuck sidekiq processes (#30198)
This commit reimplements how we monitor Sidekiq processes that are
forked from the Unicorn master process. Prior to this change, we rely on
`Jobs::Heartbeat` to enqueue a `Jobs::RunHeartbeat` job every 3 minutes.
The `Jobs::RunHeartbeat` job then sets a Redis key with a timestamp. In
the Unicorn master process, we then fetch the timestamp that has been set
by the job from Redis every 30 minutes. If the timestamp has not been
updated for more than 30 minutes, we restart the Sidekiq process. The
fundamental flaw with this approach is that it fails to consider
deployments with multiple hosts and multiple Sidekiq processes. A
sidekiq process on a host may be in a bad state but the heartbeat check
will not restart the process because the `Jobs::RunHeartbeat` job is
still being executed by the working Sidekiq processes on other hosts.

In order to properly ensure that stuck Sidekiq processs are restarted,
we now rely on the [Sidekiq::ProcessSet](https://github.com/sidekiq/sidekiq/wiki/API#processes)
API that is supported by Sidekiq. The API provides us with "near real-time (updated every 5 sec)
info about the current set of Sidekiq processes running". The API
provides useful information like the hostname, pid and also when Sidekiq
last did its own heartbeat check. With that information, we can easily
determine if a Sidekiq process needs to be restarted from the Unicorn
master process.
2024-12-18 12:48:50 +08:00
Kelv
04ba5baec0
DEV: ensure rebaking works even when some users have inconsistent data (#30261)
* DEV: add db consistency check for UserEmail

* DEV: add db consistency check for UserAvatar

* DEV: ignore inconsistent data related to user avatars when deciding whether to rebake old posts


Co-authored-by: Alan Guo Xiang Tan <gxtan1990@gmail.com>

---------

Co-authored-by: Alan Guo Xiang Tan <gxtan1990@gmail.com>
2024-12-16 19:48:25 +08:00
Alan Guo Xiang Tan
f35128c6ed
DEV: Fix broken sidekiq logging due to eeb01ea0de (#30199) 2024-12-10 17:01:25 +08:00
Alan Guo Xiang Tan
eeb01ea0de
DEV: Remove unnecessary thread in Jobs::Base::JobInstrumenter take 2 (#30195)
This reverts commit 766ff723f8.

Ensure that we create the sidekiq log file first before opening it for
logging. This avoids any issue of the log file not being present when we
initialize an instance of the `Logger`.
2024-12-10 12:44:56 +08:00
Alan Guo Xiang Tan
766ff723f8
Revert "DEV: Remove unnecessary thread in Jobs::Base::JobInstrumenter (#30179)" (#30193)
This reverts commit 1670ffe82d.
2024-12-10 09:24:40 +08:00
Alan Guo Xiang Tan
1670ffe82d
DEV: Remove unnecessary thread in Jobs::Base::JobInstrumenter (#30179)
In `Jobs::Base::JobInstrumenter.raw_log`, we were creating an instance
of `Queue` and then pushing messages to the queue before popping it off
the queue in a thread. However, this complexity is not necessary when
we can just write directly to the logger without much overhead. This is
how all logging is done in other parts of the app as well.
2024-12-10 06:29:46 +08:00
Bianca Nenciu
5e734516db
DEV: Drop DISCOURSE_LIVE_SLOTS_SIDEKIQ_LIMIT (#29920)
This was used to track jobs that may leak memory, but proved to be too
noisy and not very useful.
2024-11-26 07:21:14 +11:00
Juan David Martínez Cubillos
08440b0035
DEV: Add tl3_custom_promotions plugin modifier to tl3_promotions.rb (#29834)
* DEV: Add tl3_custom_promotions plugin modifier to tl3_promotions.rb

* added tests

* added tests for demotions

* changed argument order in test
2024-11-22 15:28:43 -05:00
Bianca Nenciu
250a145361
DEV: Fix undefined variable (#29876)
Follow up to commit 429cf656e7.
2024-11-21 20:23:20 +02:00
Bianca Nenciu
429cf656e7
FIX: Use FinalDestination::HTTP to push notifications (#29858)
Sometimes `Jobs::PushNotification` gets stuck, probably because of the
network call. This commit replaces `Excon` with `FinalDestination::HTTP`
which is safer.
2024-11-21 14:11:51 +11:00
Angus McLeod
ec7de0fd68
Require permitted scopes when registering a client (#29718) 2024-11-19 15:28:04 -05:00
Keegan George
eb2992a628
REVERT: Check for features sooner (#29746) 2024-11-13 11:30:34 -08:00
Keegan George
8dc474952b
DEV: Check for features sooner (#29745)
This PR updates the Job for checking new features from the features feed from every day to every hour. This allows for easier rollout of features.
2024-11-13 10:08:13 -08:00
Alan Guo Xiang Tan
322a3be2db
DEV: Remove logical OR assignment of constants (#29201)
Constants should always be only assigned once. The logical OR assignment
of a constant is a relic of the past before we used zeitwerk for
autoloading and had bugs where a file could be loaded twice resulting in
constant redefinition warnings.
2024-10-16 10:09:07 +08:00
Yuvaraj J
65a1e149ad
FIX: Notify mailing list subscribers on category change (#28811)
cf. https://meta.discourse.org/t/email-notifications-dont-get-sent-on-category-change-for-mailing-list-mode-users/308096
2024-10-11 14:47:39 +02:00
Alan Guo Xiang Tan
c1f25cdf5b
FIX: Unicorn master and Sidekiq reopening logs at the same time (#29137)
In our production environment, we have been seeing Sidekiq processes
getting stuck randomly when a USR1 signal is sent to the Unicorn master
process. We have not been able to identify the root cause of why the
Sidekiq process gets stuck. We however noticed that when the Unicorn
master process receives a USR1 signal, it will reopen the logs for the
Unicorn master process first before sending a USR1 signal for the
Unicorn worker processes to reopen the logs. We figured that we should
do the same for the Sidekiq process as well when a USR1 signal.

In this commit, we introduce an arbitrary delay of 1 second before we
the Sidekiq process reopens its log files so as to allow enough time for the Unicorn
master to finish reopening it logs first.

We also do not send reopen logs for the Sidekiq process if the `DISCOURSE_LOG_SIDEKIQ`
env is not present because there is no need to reopen any logs.
2024-10-10 08:01:40 +08:00
Ted Johansson
e60876ce49
FIX: Appropriately handle uninstalled problem checks (#28771)
When running checks, we look to the existing problem check trackers and try to grab their ProblemCheck classes.

In some cases this is no longer in the problem check repository, e.g. when the check was part of a plugin that has been uninstalled.

In the case where the check was scheduled, this would lead to an error in one of the jobs
2024-09-18 10:11:52 +08:00
Ted Johansson
776b4ec8e2
DEV: Remove old problem check system - Part 1 (#28772)
We're now using the new, database-backed problem check system. This PR removes parts of the old, Redis-backed system that is now defunct.
2024-09-06 17:00:25 +08:00
Loïc Guitaut
e94707acdf DEV: Drop WithServiceHelper
This patch removes the `with_service` helper from the code base.
Instead, we can pass a block with actions directly to the `.call` method
of a service.

This simplifies how to use services:
- use `.call` without a block to run the service and get its result
  object.
- use `.call` with a block of actions to run the service and execute
  arbitrary code depending on the service outcome.

It also means a service is now “self-contained” and can be used anywhere
without having to include a helper or whatever.
2024-09-05 09:58:20 +02:00
Osama Sayegh
280adda09c
FEATURE: Support designating multiple groups as mods on category (#28655)
Currently, categories support designating only 1 group as a moderation group on the category. This commit removes the one group limitation and makes it possible to designate multiple groups as mods on a category.

Internal topic: t/124648.
2024-09-04 04:38:46 +03:00
Renato Atilio
54d6e52607
FIX: chat mailer log noise (#28616)
Fixes the log noise caused by a deprecation notice
2024-08-29 11:39:08 -03:00
Bianca Nenciu
95b09dd777
DEV: Log live slots of Sidekiq jobs (#28600)
Introduce a new log line for Sidekiq jobs that consume more than
`DISCOURSE_LIVE_SLOTS_SIDEKIQ_LIMIT` live slots. This is useful to
track down jobs that may leak memory.

This is enabled only when Sidekiq's job instrumenter is enabled (set
`DISCOURSE_LOG_SIDEKIQ` to `1`).
2024-08-29 12:23:27 +03:00
Loïc Guitaut
0636855706 DEV: Allow using an AR relation as a model in services
This patch allows using an AR relation as a model in services without
fetching associated records. It will just check if the relation is empty
or not. In the former case, the execution will stop at that point, as
expected.
2024-08-20 16:32:46 +02:00
Martin Brennan
c120c446da
DEV: Cleanup empty method in job (#28395)
Followup 624dc87321
2024-08-16 14:10:46 +08:00
Krzysztof Kotlarek
e82e255531
FIX: serialize Flags instead of PostActionType (#28362)
### Why?
Before, all flags were static. Therefore, they were stored in class variables and serialized by SiteSerializer. Recently, we added an option for admins to add their own flags or disable existing flags. Therefore, the class variable had to be dropped because it was unsafe for a multisite environment. However, it started causing performance problems. 

### Solution
When a new Flag system is used, instead of using PostActionType, we can serialize Flags and use fragment cache for performance reasons. 

At the same time, we are still supporting deprecated `replace_flags` API call. When it is used, we fall back to the old solution and the admin cannot add custom flags. In a couple of months, we will be able to drop that API function and clean that code properly. However, because it may still be used, redis cache was introduced to improve performance.

To test backward compatibility you can add this code to any plugin
```ruby
  replace_flags do |flag_settings|
    flag_settings.add(
      4,
      :inappropriate,
      topic_type: true,
      notify_type: true,
      auto_action_type: true,
    )
    flag_settings.add(1001, :trolling, topic_type: true, notify_type: true, auto_action_type: true)
  end
```
2024-08-14 12:13:46 +10:00
Krzysztof Kotlarek
559c9dfe0a
REVERT: FIX: serialize Flags instead of PostActionType (#28334) 2024-08-13 18:32:11 +10:00
David Battersby
0954ae70a6
FEATURE: add delay to native push notifications (#28314)
This change ensures native push notifications respect the site setting for push_notification_time_window_mins. Previously only web push notifications would account for the delay, now we can bring more consistency between Discourse in browser vs Hub, by applying the same delay strategy to both forms of push notifications.
2024-08-13 12:12:05 +04:00
Krzysztof Kotlarek
094052c1ff
FIX: serialize Flags instead of PostActionType (#28259)
### Why?
Before, all flags were static. Therefore, they were stored in class variables and serialized by SiteSerializer. Recently, we added an option for admins to add their own flags or disable existing flags. Therefore, the class variable had to be dropped because it was unsafe for a multisite environment. However, it started causing performance problems. 

### Solution
When a new Flag system is used, instead of using PostActionType, we can serialize Flags and use fragment cache for performance reasons. 

At the same time, we are still supporting deprecated `replace_flags` API call. When it is used, we fall back to the old solution and the admin cannot add custom flags. In a couple of months, we will be able to drop that API function and clean that code properly. However, because it may still be used, redis cache was introduced to improve performance.

To test backward compatibility you can add this code to any plugin
```ruby
  replace_flags do |flag_settings|
    flag_settings.add(
      4,
      :inappropriate,
      topic_type: true,
      notify_type: true,
      auto_action_type: true,
    )
    flag_settings.add(1001, :trolling, topic_type: true, notify_type: true, auto_action_type: true)
  end
```
2024-08-13 11:22:37 +10:00
Alan Guo Xiang Tan
1a09d6b246
FEATURE: Add live_slots_(start|finish) for Sidekiq perf logging (#28260)
This information is helpful in debugging memory spikes when Sidekiq
processes jobs.
2024-08-07 15:48:24 +08:00
Natalie Tay
624dc87321
DEV: Avoid initializing max_image_size_kb in initializer (#28209)
Since this might initialize to the default db's values
rather than that of the site.
2024-08-02 23:15:14 +08:00
David Battersby
6ec8728ebf
DEV: refactor live notifications setting in user preferences (#28145)
This change is mainly a refactor of the desktop notifications service to improve readability and have standardised values for tracking state for current user in regards to the Notification API and Push API.

Also improves readability when handling push notification jobs, especially in scenarios where the push_notification_time_window_mins site setting is set to 0, which will allow sending push notifications instantly.
2024-08-02 17:25:15 +04:00
锦心
319075e4dd
FIX: Ensure JsLocaleHelper to not output deprecated translations (#28037)
* FIX: Ensure JsLocaleHelper to obly outputs up-to-date translations

The old implementation forgot to filter out deprecated
translations, causing these translations to incorrectly override the new
locale in the frontend.

This commit fills in the forgotten where clause, filtering only the
up-to-date part.

Related meta topic: https://meta.discourse.org/t/outdated-translation-replacement-causing-missing-translation/314352
2024-07-29 15:21:25 +08:00
Alan Guo Xiang Tan
5a37fa3760
FIX: Fix Jobs::Onceoff.enqueue_all undefined method for nilClass error (#28073)
In development, classes are lazy loaded so `Jobs::Onceoff.onceoff_job_klasses`
may not have been set. This is not a problem in production cause stuff
is eager loaded.

Follow-up to f4d06f195d
2024-07-25 15:52:42 +08:00
Alan Guo Xiang Tan
f4d06f195d
PERF: Avoid using ObjectSpace.each_object in Jobs::Onceoff.enqueue_all (#28072)
We are investigating a memory leak in Sidekiq and saw the following line
when comparing heap dumps over time.

`Allocated IMEMO 14775 objects of size 591000/7389528 (in bytes) at:
/var/www/discourse/app/jobs/onceoff/onceoff.rb:36`

That line in question was doing a `.select { |klass| klass < self  }` on
`ObjectSpace.each_object(Class)`. This for some reason is allocating a
whole bunch of `IMEMO` objects which are instruction sequence objects.

Instead of diving deeper into why this might be leaking, we can just
save our time by switching to an implementation that is more efficient
and does not require looping through a ton of objects.
2024-07-25 13:30:56 +08:00
Guhyoun Nam
784c04ea81
FEATURE: Add Mechanism to redeliver all failed webhook events (#27609)
Background:
In order to redrive failed webhook events, an operator has to go through and click on each. This PR is adding a mechanism to retry all failed events to help resolve issues quickly once the underlying failure has been resolved.

What is the change?:
Previously, we had to redeliver each webhook event. This merge is adding a 'Redeliver Failed' button next to the webhook event filter to redeliver all failed events. If there is no failed webhook events to redeliver, 'Redeliver Failed' gets disabled. If you click it, a window pops up to confirm the operator. Failed webhook events will be added to the queue and webhook event list will show the redelivering progress. Every minute, a job will be ran to go through 20 events to redeliver. Every hour, a job will cleanup the redelivering events which have been stored more than 8 hours.
2024-07-08 15:43:16 -05:00
Keegan George
ea58140032
DEV: Remove summarization code (#27373) 2024-07-02 08:51:47 -07:00
Sam
61610a61fa
FIX: disallow concurrent downloads of hotlinked images (#27676) 2024-07-02 10:06:46 +01:00
Jan Cernik
6599b85a75
DEV: Block accidental serialization of entire AR models (#27668) 2024-07-01 17:08:48 -03:00
Martin Brennan
ffc99253fa
DEV: Resolve TODO comments for martin-brennan
I am changing many of these to notes or resolving them as is,
most of these I have not actively worked on in years so someone
else can work on them when we get to these areas again.
2024-07-01 15:32:30 +10:00
Gabriel Grubba
f3a89620a1
FEATURE: Add WebHookEventsDailyAggregate (#27542)
* FEATURE: Add WebHookEventsDailyAggregate

Add WebHookEventsDailyAggregate model to store daily aggregates of web hook events.
Add AggregateWebHooksEvents job to aggregate web hook events daily.
Add spec for WebHookEventsDailyAggregate model.

* DEV: Update annotations for web_hook_events_daily_aggregate.rb

* DEV: Update app/jobs/scheduled/aggregate_web_hooks_events.rb

Co-authored-by: Martin Brennan <martin@discourse.org>

* DEV: Address review feedback

Solves:
- https://github.com/discourse/discourse/pull/27542#discussion_r1646961101
- https://github.com/discourse/discourse/pull/27542#discussion_r1646958890
- https://github.com/discourse/discourse/pull/27542#discussion_r1646976808
- https://github.com/discourse/discourse/pull/27542#discussion_r1646979846
- https://github.com/discourse/discourse/pull/27542#discussion_r1646981036

* A11Y: Add translation to retain_web_hook_events_aggregate_days key

* FEATURE: Purge old web hook events daily aggregate

Solves: https://github.com/discourse/discourse/pull/27542#discussion_r1646961101

* DEV:  Update tests for web_hook_events_daily_aggregate

Update WebHookEventsDailyAggregate to not use save! at the end
Solves: https://github.com/discourse/discourse/pull/27542#discussion_r1646984601

* PERF: Change job query to use WebHook table instead of WebHookEvent table

* DEV: Update tests to use `fab!`

* DEV: Address code review feedback.

Add idempotency to job
Add has_many to WebHook

* DEV: add test case for job and change job query

* DEV: Change AggregateWebHooksEvents job test name

---------

Co-authored-by: Martin Brennan <martin@discourse.org>
2024-06-25 13:56:47 -03:00
Alan Guo Xiang Tan
adc824a9bc
FIX: Jobs::EnsureS3UploadsExistence broken for multisite (#27401)
This is a follow-up to 8cf4ed5f88.
2024-06-10 16:26:39 +08:00
Alan Guo Xiang Tan
8cf4ed5f88
DEV: Introduce hidden s3_inventory_bucket site setting (#27304)
This commit introduces a hidden `s3_inventory_bucket` site setting which
replaces the `enable_s3_inventory` and `s3_configure_inventory_policy`
site setting.

The reason `enable_s3_inventory` and `s3_configure_inventory_policy`
site settings are removed is because this feature has technically been
broken since it was introduced. When the `enable_s3_inventory` feature
is turned on, the app will because configure a daily inventory policy for the
`s3_upload_bucket` bucket and store the inventories under a prefix in
the bucket. The problem here is that once the inventories are created,
there is nothing cleaning up all these inventories so whoever that has
enabled this feature would have been paying the cost of storing a whole
bunch of inventory files which are never used. Given that we have not
received any complains about inventory files inflating S3 storage costs,
we think that it is very likely that this feature is no longer being
used and we are looking to drop support for this feature in the not too
distance future.

For now, we will still support a hidden `s3_inventory_bucket` site
setting which site administrators can configure via the
`DISCOURSE_S3_INVENTORY_BUCKET` env.
2024-06-10 13:16:00 +08:00
Loïc Guitaut
2a28cda15c DEV: Update to lastest rubocop-discourse 2024-05-27 18:06:14 +02:00
Mark VanLandingham
971b66e440
DEV: Move webhook event header modifier for redelivery-recalucation (#27177) 2024-05-24 10:37:10 -05:00
Alan Guo Xiang Tan
df16ab0758
FIX: S3Inventory to ignore files older than last backup restore date (#27166)
This commit updates `S3Inventory#files` to ignore S3 inventory files
which have a `last_modified` timestamp which are not at least 2 days
older than `BackupMetadata.last_restore_date` timestamp.

This check was previously only in `Jobs::EnsureS3UploadsExistence` but
`S3Inventory` can also be used via Rake tasks so this protection needs
to be in `S3Inventory` and not in the scheduled job.
2024-05-24 10:54:06 +08:00
Jeff Wong
3a3ee5e04a
DEV: replace .each with .find_each for paginated queries (#27159)
Large batches of reviewables may require paginated queries.
2024-05-23 15:42:21 -07:00
Ted Johansson
3137e60653
DEV: Database backed admin notices (#26192)
This PR introduces a basic AdminNotice model to store these notices. Admin notices are categorized by their source/type (currently only notices from problem check.) They also have a priority.
2024-05-23 09:29:08 +08:00
Régis Hanol
958437e7dd
FIX: send activity summaries based on "last seen" (#27035)
instead of "last emailed" so that people getting email notifications (from a watched topic for example) also get the activity summaries.

Context - https://meta.discourse.org/t/activity-summary-not-sent-if-other-emails-are-sent/293040

Internal Ref - t/125582

Improvement over 95885645d9
2024-05-22 10:23:03 +02:00
Isaac Janzen
ede0fa5802
DEV: Update bulk-invite logs and PM template (#27057)
# Preview

<img width="754" alt="Screenshot 2024-05-17 at 8 50 03 AM" src="https://github.com/discourse/discourse/assets/50783505/6710234f-0195-42be-b70e-9d57ba48bb4a">


# New Logs

```
[2024-05-17 08:49:54 -0600] Invalid User Field 'backend name' for 'foobarbing@gmail.com'
[2024-05-17 08:49:54 -0600] Invalid Email 'test
[2024-05-17 08:49:54 -0600] Invalid Email 'this@$@**.com
```
2024-05-17 12:21:21 -06:00