Commit Graph

24 Commits

Author SHA1 Message Date
Sam
654959faa4
DEV: reduce amount of errors logged when notifying on flags (#20472)
Forcing distributed muted to raise when a notify reviewable job is running
leads to excessive errors in the logs under many conditions.

The new pattern

1. Optimises the counting of reviewables so it is a lot faster
2. Holds the distributed lock for 2 minutes (max)


The downside is the job queue can get blocked up when tons of notify
reviewables are running at the same time. However this should be very
rare in the real world, as we only notify when stuff is flagged which
is fairly infrequent.

This also give a fair bit more time for the notifications which may be
a little slow on large sites with tons of mods.
2023-03-01 08:58:32 +11:00
jbrw
f8863b0f98
DEV: Limit concurrency of NotifyReviewables job (#19968)
Under scenarios of extremely high load where large numbers of `Reviewable*` items are being created, it has been observed that multiple instances of the `NotifyReviewable` job may run simultaneously.

These jobs will work satisfactorily if the concurrency is limited to 1, and the different types of jobs (items reviewable by admins, vs moderators, vs particular groups, etc.) are run eventually.

This change introduces a new option to `DistributedMutex` which allows the `max_get_lock_attempts` to be specified. If the number is exceeded an error will be raised, which will cause Sidekiq to requeue the job. Sidekiq has existing logic to back-off on retry times for jobs that have failed multiple times.
2023-01-25 15:19:11 -05:00
David Taylor
6417173082
DEV: Apply syntax_tree formatting to lib/* 2023-01-09 12:10:19 +00:00
Michael Brown
618fb5b34d FIX: properly count DistributedMutex locking attempts
When originally introduced, `attempts` was only used in the read-only check
context.

With the introduction of the exponential backoff in cda370db, `attempts` was
also used to count loop iterations, but was left inside the if block instead of
being incremented every loop, meaning the exponential backoff was only
happening when the site was recently readonly.

Co-authored-by: jbrw <jamie.wilson@discourse.org>
2022-12-13 17:27:13 -05:00
David Taylor
2422ca0e67
PERF: Add exponential backoff for DistributedMutex (#17886)
Polling every 0.001s can cause extreme load on the redis instance, especially in scenarios where multiple app instances are waiting on the same lock. This commit introduces an exponential backoff starting from 0.001s and reaching a maximum interval of 1s.

Previously `CHECK_READONLY_ATTEMPTS` was 10, and resulted in a block for 0.01s. Under the new logic, 10 attempts take more than 1s. Therefore CHECK_READONLY_ATTEMPTS is reduced to 5, bringing its total time to around 0.031s
2022-08-12 18:39:01 +01:00
Jarek Radosz
87353faac6
DEV: Implement distributed mutex in lua (#16228)
The rationale behind this was mostly to stop using `redis.synchronize` (now removed in redis gem 4.6)
2022-07-11 14:16:37 +02:00
Sam
2df3c65ba9
FIX: add support for pipelined and multi redis commands (#16682)
Latest redis interoduces a block form of multi / pipelined, this was incorrectly
passed through and not namespaced.

Fix also updates logster, we held off on upgrading it due to missing functions
2022-05-10 08:19:02 +10:00
Daniel Waterworth
7c7098c700 FIX: Off-by-one error setting the distributed mutex key to expire
Accounting for fractional seconds, a distributed mutex can be held for
almost a full second longer than its validity.

For example: if we grab the lock at 10.5 seconds passed the epoch with a
validity of 5 seconds, the lock would be released at 16 seconds passed
the epoch. However, in this case assuming that all other processing
takes a negligible amount of time, the key would be expired at 15.5
seconds passed the epoch.

Using expireat, the key is now expired exactly when the lock is released.
2020-02-03 14:54:50 +00:00
Joffrey JAFFEUX
0d3d2c43a0
DEV: s/\$redis/Discourse\.redis (#8431)
This commit also adds a rubocop rule to prevent global variables.
2019-12-03 10:05:53 +01:00
Daniel Waterworth
1fdba2c5b2 FIX: Harden DistributedMutex
Threadsafety

  Since we use the same redis connection in multiple threads, a rogue
  transaction in another thread can trample the connection state
  (watched keys) that we need to acquire and release the lock properly.

  This is fixed by preventing other threads from using the connection
  when we are performing these actions.

Off-by-one error

  A distributed mutex is now consistently determined to be expired if
  the current time is strictly greater than the expire time.

Unwatch before transaction

  Since the redis connection is used by so much of the code, it is
  difficult to ensure that any watched keys have been cleared. In order
  to defend against this rogue connection state, an unwatch has been
  added before locking and unlocking.

Logging

  Hopefully this log message is more clear.
2019-10-02 13:00:41 +00:00
Neil Lalonde
5f87089b67 FIX: remove dependency on present? in distributed_mutex lib 2019-08-07 15:39:51 -04:00
Sam Saffron
67f5ad5ac0 FEATURE: allow post process mutex to be held longer
Previously we would only hold the post process mutex for 1 minute, that is
not enough when processing a post with lots of images. This raises the bar
to 10 minutes.

It also cleans up error reporting around distributed mutexes expiring. We
used to double report.
2019-08-05 11:57:35 +10:00
Daniel Waterworth
20bc4a38a5
FIX: DistributedMutex (#7953) 2019-08-01 09:12:05 +01:00
Sam Saffron
30990006a9 DEV: enable frozen string literal on all files
This reduces chances of errors where consumers of strings mutate inputs
and reduces memory usage of the app.

Test suite passes now, but there may be some stuff left, so we will run
a few sites on a branch prior to merging
2019-05-13 09:31:32 +08:00
Gerhard Schlager
ff26b4ed9b DEV: Prevent warning about already initialized constant 2019-02-28 21:57:20 +01:00
Guo Xiang Tan
4d31b425e3 DEV: Validity of distributed mutex configurable once per instance.
Follow up to 4f9e5e19c8.
2019-02-20 09:29:45 +08:00
Guo Xiang Tan
f2efa0da66
DEV: Allow validity of lock to be customizable for DistributedMutex. (#7025)
- Allows a user to override the default lock validity of 60 seconds.
- Also clean up test which was leaking a redis key
2019-02-20 09:23:42 +08:00
Sam
e0be5145cf FIX: correct readonly timeout
So it only applies in readonly mode
2018-09-20 15:15:46 +10:00
Sam
e0e6dae6a7 minor cleanup to previous commit from code review 2018-09-19 16:07:29 +10:00
Sam
5302709343 FIX: in redis readonly raise an exception from DistributedMutex
If we detect redis is in readonly we can not correctly get a mutex
raise an exception to notify caller

When getting optimized images avoid the distributed mutex unless
for some reason it is the first call and we need to generate a thumb

In redis readonly no thumbnails will be generated
2018-09-19 15:50:58 +10:00
Guo Xiang Tan
5012d46cbd Add rubocop to our build. (#5004) 2017-07-28 10:20:09 +09:00
Sam
9147af1d62 FIX: eliminate race condition creating posts
FIX: correct message bus posting
2014-07-30 14:18:01 +10:00
Sam
63f4a0e050 Tighten API, add spec for recovery, keep mutex semantics 2014-04-14 10:51:46 +10:00
Vikhyat Korrapati
56ee1ac569 Extract scheduler cross-process locking into DistributedMutex. 2014-04-13 00:05:46 +05:30