Commit Graph

10368 Commits

Author SHA1 Message Date
Alan Guo Xiang Tan
859d61003e
DEV: API to register custom request rate limiting conditions (#30239)
This commit adds the `add_request_rate_limiter` plugin API, which allows plugins to add custom rate limiters on top of the default rate limiters, which rate limit requests by a user's id or the request's IP address.

Example to add a rate limiter that rate limits all requests from the same country under the same rate limit bucket:

```
add_request_rate_limiter(
  identifier: :country,
  key: ->(request) { "country/#{DiscourseIpInfo.get(request.ip)[:country]}" },
  activate_when: ->(request) { DiscourseIpInfo.get(request.ip)[:country].present? },
)
```
2024-12-23 09:57:18 +08:00
Keegan George
380910aedd
DEV: Cleanup todos from codebase (#30394)
This PR involves cleaning up the codebase from my (@keegangeorge's) todos. 

In particular:
- Remove Form Template related todos (these are no longer in the roadmap)
- Remove old left-over AI summarization related code after moving to AI (https://github.com/discourse/discourse-ai/pull/658)
- Update one form template related spec
2024-12-19 18:22:33 -08:00
Sam
c315e26485
FIX: handle more thread pool edge cases (#30392)
* Split `shutdown` into two separate methods for better control:
  - `shutdown` - signals threads to stop accepting new work
  - `wait_for_termination` - waits for threads to finish (with optional timeout)

* Add tracking of busy threads via `@busy_threads` Set
* Make idle_time parameter optional with 30-second default
* Improve thread spawning logic:
  - Spawn initial thread immediately when work is posted
  - Spawn additional threads when all threads are busy and work is queued
* Fix race condition in work distribution
* Add busy thread count to stats output
* Add test coverage for zero min_threads configuration

This commit makes the ThreadPool more reliable, easier to use, and adds 
better visibility into its internal state.
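
The split between `shutdown` and `wait_for_termination` can be illustrated with a minimal sketch. This is not Discourse's `ThreadPool`: the `TinyPool` class, its fixed thread count, and the `timeout:` keyword are assumptions made for illustration only.

```ruby
# Illustrative only: a fixed-size pool, not Discourse's ThreadPool.
class TinyPool
  def initialize(size: 2)
    @queue = Queue.new
    @threads = Array.new(size) do
      Thread.new do
        # Queue#pop returns nil once the queue is closed and drained.
        while (task = @queue.pop)
          task.call
        end
      end
    end
  end

  # Raises ClosedQueueError after shutdown has been called.
  def post(&block)
    @queue << block
  end

  # Stop accepting new work; tasks that are already queued still run.
  def shutdown
    @queue.close
  end

  # Wait for the workers to finish; returns false if the timeout elapsed first.
  def wait_for_termination(timeout: nil)
    deadline = timeout && Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
    @threads.all? do |thread|
      remaining = deadline && [deadline - Process.clock_gettime(Process::CLOCK_MONOTONIC), 0].max
      thread.join(remaining)
    end
  end
end

results = Queue.new
pool = TinyPool.new(size: 2)
5.times { |i| pool.post { results << i * i } }
pool.shutdown
finished = pool.wait_for_termination(timeout: 5)
```

Keeping the two steps separate lets a caller signal several pools to stop first and only then block on each of them.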

---------

Co-authored-by: Alan Guo Xiang Tan <gxtan1990@gmail.com>
2024-12-20 11:50:00 +11:00
Sam
efa50a4da2
FEATURE: ThreadPool implementation (#30364)
This commit introduces a new ThreadPool class that provides efficient worker
thread management for background tasks. Key features include:

- Dynamic scaling from min to max threads based on workload
- Proper database connection management in multisite setup
- Graceful shutdown with task completion
- Robust error handling and logging
- FIFO task processing with a managed queue
- Configurable idle timeout for worker threads

The implementation is thoroughly tested, including stress tests, error
scenarios, and multisite compatibility.
2024-12-20 07:37:12 +11:00
=
6cd964306f Bump version to v3.4.0.beta4-dev 2024-12-19 13:22:05 -03:00
=
bc4ab613ce Bump version to v3.4.0.beta3 2024-12-19 13:22:04 -03:00
Krzysztof Kotlarek
95564a3df2 SECURITY: Moderators cannot see user emails.
Unless `moderators_view_emails` SiteSetting is enabled, moderators should not be able to discover users’ emails.
2024-12-19 13:13:18 -03:00
Alan Guo Xiang Tan
e4e5db57f0
DEV: Fix undefined method check_email_sync_heartbeat in unicorn conf (#30360)
This is a follow-up to 9812407f76
2024-12-19 10:10:11 +08:00
Loïc Guitaut
133a648d9b DEV: Fix policy classes delegating their #call method in services
There’s currently a bug when using a dedicated class as a policy in
services: if that class delegates its `#call` method (to an underlying
strategy object for example), then an error will be raised saying steps
aren’t allowed to provide default parameters.

This should not happen, and this patch fixes that issue.
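
One plausible way such a check can misfire is sketched below. It assumes the check inspects `Method#parameters`, which the commit message does not state; `ExplicitPolicy`, `StrategyPolicy`, and `non_required_params?` are hypothetical names.

```ruby
# Hypothetical classes; not taken from the Discourse code base.
class ExplicitPolicy
  def call(context)
    context[:allowed]
  end
end

class StrategyPolicy
  def initialize(strategy)
    @strategy = strategy
  end

  # Forwards every argument to the underlying strategy object.
  def call(...)
    @strategy.call(...)
  end
end

# True when the signature contains anything other than required positional
# arguments or a block, which is what a forwarding signature looks like.
def non_required_params?(callable)
  callable.method(:call).parameters.any? { |type, _| !%i[req block].include?(type) }
end

explicit = non_required_params?(ExplicitPolicy.new)
delegated = non_required_params?(StrategyPolicy.new(ExplicitPolicy.new))
```

The delegating `#call` reports rest-style parameters even though the method it forwards to takes a single required argument, so a naive signature check would reject it.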
2024-12-18 09:59:40 +01:00
Alan Guo Xiang Tan
9812407f76
FIX: Redo Sidekiq monitoring to restart stuck sidekiq processes (#30198)
This commit reimplements how we monitor Sidekiq processes that are
forked from the Unicorn master process. Prior to this change, we relied on
`Jobs::Heartbeat` to enqueue a `Jobs::RunHeartbeat` job every 3 minutes.
The `Jobs::RunHeartbeat` job then set a Redis key with a timestamp. In
the Unicorn master process, we then fetched the timestamp that had been set
by the job from Redis every 30 minutes. If the timestamp had not been
updated for more than 30 minutes, we restarted the Sidekiq process. The
fundamental flaw with this approach is that it fails to consider
deployments with multiple hosts and multiple Sidekiq processes. A
sidekiq process on a host may be in a bad state but the heartbeat check
will not restart the process because the `Jobs::RunHeartbeat` job is
still being executed by the working Sidekiq processes on other hosts.

In order to properly ensure that stuck Sidekiq processes are restarted,
we now rely on the [Sidekiq::ProcessSet](https://github.com/sidekiq/sidekiq/wiki/API#processes)
API that is supported by Sidekiq. The API provides us with "near real-time (updated every 5 sec)
info about the current set of Sidekiq processes running". The API
provides useful information like the hostname, pid and also when Sidekiq
last did its own heartbeat check. With that information, we can easily
determine if a Sidekiq process needs to be restarted from the Unicorn
master process.
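
A rough sketch of the staleness check described above follows. Plain hashes stand in for the entries of `Sidekiq::ProcessSet` so that it runs without Sidekiq; the `stale_sidekiq_pids` helper and the 60-second threshold are assumptions, not values from this commit.

```ruby
# Assumed threshold in seconds; the real value is not given in the commit message.
MAX_BEAT_AGE = 60

# Returns the pids on `hostname` whose last heartbeat is older than max_age.
def stale_sidekiq_pids(processes, hostname:, now: Time.now.to_f, max_age: MAX_BEAT_AGE)
  processes
    .select { |process| process["hostname"] == hostname }
    .select { |process| now - process["beat"] > max_age }
    .map { |process| process["pid"] }
end

now = 1_000_000.0
processes = [
  { "hostname" => "web-1", "pid" => 101, "beat" => now - 5 },
  { "hostname" => "web-1", "pid" => 102, "beat" => now - 300 },
  { "hostname" => "web-2", "pid" => 201, "beat" => now - 300 },
]
stale = stale_sidekiq_pids(processes, hostname: "web-1", now: now)
```

Filtering by hostname first means each Unicorn master only acts on the Sidekiq processes it forked, which is the case the old Redis timestamp could not distinguish.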
2024-12-18 12:48:50 +08:00
Sam
4437aced91
FIX: use relations for new_in_category (#30313)
`new_in_category` was using `first` instead of `limit`

This meant it returned an array, which cannot easily be operated on in a modifier.

This ensures we always give the modifier a relation, with the notable exception of suggested topics.
2024-12-17 16:39:07 +11:00
David Taylor
ea9cdf7d47
DEV: Compile theme raw-hbr to modules (#30299)
Previously, theme hbr files were compiled to an IIFE, which would be executed before the app is booted. That is causing silenced deprecations to be printed, because the deprecation-workflow isn't set up when the IIFE is run.

This commit updates the theme compiler so that it matches the ember-cli-based raw-hbs compiler. Templates are output to normal modules, which will then be loaded by the existing `eager-load-raw-templates` initializer. This runs after the app has started booting.
2024-12-16 17:31:49 +00:00
Gerhard Schlager
6b3e28216c
FEATURE: Allow pausing of restore before DB migration and uploads are restored (#30269)
This can be helpful if you need to fix problems in the DB before the DB gets migrated as well as before uploads are restored.
2024-12-16 12:50:08 +01:00
David Taylor
ce8c2ef6d9
Revert "DEV: prioritize new email styles over existing, to make customization easier (#30244)" (#30297)
This reverts commit 9694dc6cb0.

Some of our previous email styling depended on this 'incorrect' ordering, so the change caused some text to become illegible. Reverting while we work out a better solution.
2024-12-16 11:16:17 +00:00
Loïc Guitaut
9e9abe0a82 DEV: Unify params access in services
Currently, there are two ways (kind of) for accessing `params` inside a
service:
- when there is no contract or it hasn’t been reached yet, `params` is
  just the hash that was provided to the service. To access a key, you
  have to use the bracket notation `params[:my_key]`.
- when there is a contract and it has been executed successfully,
  `params` now references the contract and the attributes are accessible
  using methods (`params.my_key`).

This patch unifies how `params` exposes its attributes. Now, even if
there is no contract at all in a service, `params` will expose its
attributes through methods, that way things are more consistent.

This patch also makes sure there is always a `params` object available
even when no `params` key is provided to the service (this allows a
contract to fail because its attributes are blank instead of having the
service raising an error because it doesn’t find `params` in its context).
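
The unified access can be pictured with a small wrapper. This is a sketch under assumptions, not the object used by Discourse services: `ParamsSketch` is a hypothetical name, and keeping bracket access alongside method access is a choice made here for illustration.

```ruby
# Hypothetical wrapper, not the class used by Discourse services.
class ParamsSketch
  def initialize(attributes = nil)
    # A missing `params` key still yields a usable (empty) object.
    @attributes = (attributes || {}).transform_keys(&:to_sym)
  end

  def [](key)
    @attributes[key.to_sym]
  end

  def respond_to_missing?(name, include_private = false)
    @attributes.key?(name) || super
  end

  # Expose each known attribute as a reader method.
  def method_missing(name, *args)
    return @attributes[name] if args.empty? && @attributes.key?(name)
    super
  end
end

params = ParamsSketch.new(my_key: 42)
empty = ParamsSketch.new
```

With an object like this, `params.my_key` works whether or not a contract has run, and an absent `params` key produces an empty object instead of an error.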
2024-12-13 11:13:18 +01:00
Alan Guo Xiang Tan
ebfc33b556
DEV: Remove line of code that does not work (#30258)
We can't delete the file from disk as some of the assets are still
served by the app instead of going through the S3 bucket. It is a bug we
need to fix but it also means this ENV is unsafe now. Just drop the env
until we ensure all assets requested by the app are requested from the
S3 bucket directly.
2024-12-13 09:36:51 +08:00
Kris
9694dc6cb0
DEV: prioritize new email styles over existing, to make customization easier (#30244) 2024-12-12 11:42:50 -05:00
Loïc Guitaut
a589b48f9a DEV: Display better output when inspecting service steps
This patch aims to improve the steps inspector output:
- The service class name is displayed at the top.
- Next to each step is displayed the time it took to run said step.
- Steps that didn’t run are hidden.
- `#inspect` automatically outputs the error when it is present.
2024-12-12 15:21:10 +01:00
Régis Hanol
44cabc3569
FIX: proper details / summary excerpt (#30229)
It doesn't make much sense to have the content of a `<details>` in an excerpt so I replaced them with "▶ summary" instead.

That way, they can't be (ab)used in user cards for example.

Reference - https://meta.discourse.org/t/335094
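
A simplified sketch of the replacement, assuming a regex is good enough for the example (an HTML parser is the more robust choice, and nested `<details>` elements are not handled); `excerpt_details` is a hypothetical helper name.

```ruby
# Regex-based simplification; real HTML is better handled with a parser.
DETAILS_PATTERN = %r{<details[^>]*>\s*<summary[^>]*>(.*?)</summary>.*?</details>}m

# Replaces each <details> element with "▶ " followed by its summary text.
def excerpt_details(html)
  html.gsub(DETAILS_PATTERN) { "▶ #{Regexp.last_match(1).strip}" }
end

html = "<p>Intro</p><details><summary>Spoiler</summary><p>hidden text</p></details>"
excerpt = excerpt_details(html)
```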
2024-12-12 09:09:49 +01:00
Bianca Nenciu
a835fd99bd FIX: Truncate bookmarks.name when remapping
The new name may be too long for the bookmarks.name column and raise an
exception. This change allows the remapper to truncate the new value to
fit (truncates to 100 characters).
2024-12-11 18:53:17 -05:00
Alan Guo Xiang Tan
c97d1d7c59
DEV: Remove max compression level for brotli in assets.rake (#30220)
The `max_compress?` logic is totally broken at least when used for
brotli compression because we are only seeing 4 assets subjected to the
max compression level in production. Instead of fixing the broken logic,
we should just drop this unnecessary complexity cause things are easier
to reason about when we only have one compression level to deal with
across all assets.
2024-12-11 14:01:33 +08:00
Alan Guo Xiang Tan
19321a0b86
DEV: Fix s3:upload_assets not logging newlines (#30219)
Follow-up to 9a2e31b9af
2024-12-11 12:59:17 +08:00
Alan Guo Xiang Tan
9a2e31b9af
DEV: Use a Logger for s3:upload_assets (#30218)
Now that we run the `upload` method in different threads, we need to
synchronize writes to `STDOUT` which we can do so by using a `Logger`.
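
A minimal sketch of the idea, writing to a `StringIO` instead of `STDOUT` so the result can be inspected. The message-only formatter and the log lines are assumptions, not the ones used by the rake task.

```ruby
require "logger"
require "stringio"

output = StringIO.new
logger = Logger.new(output)
# Emit only the message followed by a newline (assumed format).
logger.formatter = proc { |_severity, _time, _progname, message| "#{message}\n" }

# Several threads share one Logger, which serializes writes to its device.
threads = 4.times.map do |worker|
  Thread.new do
    25.times { |i| logger.info("worker #{worker} uploaded asset-#{i}.js") }
  end
end
threads.each(&:join)

lines = output.string.lines
```

Each line arrives intact because `Logger` takes a lock around every write, which plain `puts` from several threads does not guarantee.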

Follow-up to 49e8353959
2024-12-11 11:48:06 +08:00
Alan Guo Xiang Tan
49e8353959
FIX: s3:upload_assets was uploading some source maps twice (#30216)
This is because Sprocket's manifest already contains the source maps.
The easy and safe fix here is to just use a `Set` to prevent
duplications.
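
The `Set` approach can be sketched as follows. The file names are made up, and the manifest is assumed to already list some of the source maps.

```ruby
require "set"

# Hypothetical file lists; the manifest is assumed to already include a map.
manifest_assets = ["app.js", "app.js.map", "vendor.js"]
source_maps = ["app.js.map", "vendor.js.map"]

# A Set ignores paths that were already added, so each file is queued once.
assets_to_upload = Set.new
(manifest_assets + source_maps).each { |path| assets_to_upload << path }
```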
2024-12-11 11:19:38 +08:00
Bianca Nenciu
b9f8a77d9b
DEV: Upload assets to S3 in parallel (#30210)
In my local setup (with Minio), this uploads the assets to S3 ~40% faster.
2024-12-11 10:51:05 +08:00
Alan Guo Xiang Tan
864b7b6bc8
DEV: Fix flaky test (#30215)
The test was flaky and failing with the following errors:

```
Failure/Error:
  klass
    .connection
    .select_raw(relation.arel) do |result, _|
      result.type_map = DB.type_map
      result.nfields == 1 ? result.column_values(0) : result.values
    end

NoMethodError:
  undefined method `select_raw' for nil

./lib/freedom_patches/fast_pluck.rb:60:in `pluck'
./vendor/bundle/ruby/3.3.0/gems/activerecord-7.2.2.1/lib/active_record/relation/calculations.rb:354:in `pick'
./app/models/web_crawler_request.rb:27:in `request_id'
./app/models/web_crawler_request.rb:31:in `rescue in request_id'
./app/models/web_crawler_request.rb:26:in `request_id'
./app/models/web_crawler_request.rb:19:in `write_cache!'
./app/models/concerns/cached_counting.rb:135:in `block (3 levels) in flush_to_db'
./vendor/bundle/ruby/3.3.0/gems/rails_multisite-6.1.0/lib/rails_multisite/connection_management/null_instance.rb:49:in `with_connection'
./vendor/bundle/ruby/3.3.0/gems/rails_multisite-6.1.0/lib/rails_multisite/connection_management.rb:21:in `with_connection'
./app/models/concerns/cached_counting.rb:134:in `block (2 levels) in flush_to_db'
./app/models/concerns/cached_counting.rb:124:in `each'
./app/models/concerns/cached_counting.rb:124:in `block in flush_to_db'
./lib/distributed_mutex.rb:53:in `block in synchronize'
./lib/distributed_mutex.rb:49:in `synchronize'
./lib/distributed_mutex.rb:49:in `synchronize'
./lib/distributed_mutex.rb:34:in `synchronize'
./app/models/concerns/cached_counting.rb:120:in `flush_to_db'
./app/models/concerns/cached_counting.rb:187:in `perform_increment!'
./app/models/web_crawler_request.rb:15:in `increment!'
./lib/middleware/request_tracker.rb:74:in `log_request'
./lib/middleware/request_tracker.rb:409:in `block in log_later'
./lib/scheduler/defer.rb:125:in `block in do_work'
./vendor/bundle/ruby/3.3.0/gems/rails_multisite-6.1.0/lib/rails_multisite/connection_management/null_instance.rb:49:in `with_connection'
./vendor/bundle/ruby/3.3.0/gems/rails_multisite-6.1.0/lib/rails_multisite/connection_management.rb:21:in `with_connection'
./lib/scheduler/defer.rb:119:in `do_work'
./lib/scheduler/defer.rb:105:in `block (2 levels) in start_thread'
```

This was due to running the defer thread in an async manner which is
actually not representative of the production environment. It also
revealed a spot in our code base where writes are happening in a GET
request which can cause requests to fail if ActiveRecord is in readonly
mode.
2024-12-11 10:12:58 +08:00
Alan Guo Xiang Tan
eeb01ea0de
DEV: Remove unnecessary thread in Jobs::Base::JobInstrumenter take 2 (#30195)
This reverts commit 766ff723f8.

Ensure that we create the sidekiq log file first before opening it for
logging. This avoids any issue of the log file not being present when we
initialize an instance of the `Logger`.
2024-12-10 12:44:56 +08:00
Michael Brown
c546111703 DEV: add the notion of a 'crawler identifier' in anonymous_cache
We identify and deny blocked crawlers here in anonymous_cache.

Separating the notion of the crawler identifier here lets plugins perform an
override if they perform more advanced detection.
2024-12-09 13:40:22 -05:00
Osama Sayegh
976aca68f6
FEATURE: Restrict profile visibility of low-trust users (#29981)
We've seen abuse of user profiles in some communities, where bios and other fields are used in malicious ways, such as malware distribution. A common pattern between all the abuse cases we've seen is that the malicious actors tend to have 0 posts and have a low trust level.

To eliminate this abuse vector, or at least make it much less effective, we're making the following changes to user profiles:

1. Anonymous, TL0 and TL1 users cannot see any user profiles for users with 0 posts except for staff users
2. Anonymous and TL0 users can only see profiles of TL1 users and above

Users can always see their own profile, and they can still hide their profiles via the "Hide my public profile" preference. Staff can always see any user's profile.
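
The rules above can be expressed as a small predicate. This is a sketch under assumptions, not the guardian code from this commit: `ProfileUser` and `can_see_profile?` are hypothetical, "except for staff users" is read as applying to the profile being viewed, and the "Hide my public profile" preference is left out.

```ruby
# Hypothetical struct; field names are assumptions for this sketch.
ProfileUser = Struct.new(:id, :trust_level, :post_count, :staff, keyword_init: true)

# viewer is nil for anonymous visitors.
def can_see_profile?(viewer, target)
  return true if viewer && (viewer.staff || viewer.id == target.id)

  viewer_level = viewer ? viewer.trust_level : -1 # anonymous ranks below TL0

  # Rule 1: anonymous, TL0 and TL1 viewers cannot see non-staff profiles with 0 posts.
  return false if viewer_level <= 1 && target.post_count.zero? && !target.staff

  # Rule 2: anonymous and TL0 viewers only see profiles of TL1 users and above.
  return false if viewer_level <= 0 && target.trust_level < 1

  true
end

regular = ProfileUser.new(id: 1, trust_level: 2, post_count: 5, staff: false)
lurker = ProfileUser.new(id: 2, trust_level: 2, post_count: 0, staff: false)
newcomer = ProfileUser.new(id: 3, trust_level: 0, post_count: 3, staff: false)
basic = ProfileUser.new(id: 4, trust_level: 1, post_count: 1, staff: false)
admin = ProfileUser.new(id: 5, trust_level: 4, post_count: 0, staff: true)
```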

Internal topic: t/142853.
2024-12-09 13:07:59 +03:00
Alan Guo Xiang Tan
25ce1f3399
PERF: Don't execute a git command each time we log a log line (#30177)
We already have a `GIT_VERSION` constant in `DiscourseLogstashLogger` so
we can just use that.
2024-12-09 11:11:03 +08:00
Martin Brennan
8a89a77248
FIX: Discard empty bundles for reviewables (#30121)
Followup c7e471d35a

It is currently possible to add a bundle (which is a collection
of actions used for a dropdown on the client) for a reviewable
via actions.add_bundle and then never add any actions to it.

This causes the client to explode, as seen in the referenced
commit, because of the way our store expects to resolve objects
referenced by ID that are passed down by the serializer, which
then causes Ember to have an unrecoverable render error.

Fixing this on the serializer level is not really possible because
of all the ActiveModel::Serializer magic that serializes
objects by ID reference when doing things like has_many.
`Reviewable#actions_for` is a better place to do this anyway,
because this is the main location where the bundles and actions
are built for every action via the serializer.
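
The filtering step amounts to something like the sketch below. `Bundle` and `discard_empty_bundles` are hypothetical stand-ins, not the objects built by `Reviewable#actions_for`.

```ruby
# Hypothetical struct standing in for a reviewable bundle and its actions.
Bundle = Struct.new(:id, :actions)

# Drop bundles that never had an action added to them.
def discard_empty_bundles(bundles)
  bundles.reject { |bundle| bundle.actions.empty? }
end

bundles = [Bundle.new("approve", [:approve_post]), Bundle.new("reject", [])]
kept = discard_empty_bundles(bundles).map(&:id)
```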
2024-12-05 15:41:13 +10:00
Krzysztof Kotlarek
28b4ff6cb6
FIX: update flag reason message with default value (#30026)
Currently only system flags are translated. When we send a message to the user that their post was deleted because of a custom flag, we should default to the custom flag name.
2024-12-04 14:46:52 +11:00
Kris
60826162b5
A11Y: remove redundant alt text from github oneboxes (#30083) 2024-12-04 12:25:03 +11:00
Gary Pendergast
2513339955
FEATURE: Show when a badge has been granted for a post (#29696)
Co-authored-by: Joffrey JAFFEUX <j.jaffeux@gmail.com>
Co-authored-by: Jarek Radosz <jradosz@gmail.com>
Co-authored-by: Alan Guo Xiang Tan <gxtan1990@gmail.com>
2024-12-03 13:43:27 +11:00
Kelv
435fbb7408
DEV: unsilence deprecation warning for old Font Awesome icons (#30028)
* DEV: unsilence deprecation warnings for old Font Awesome icon names

* update fa-user to user font awesome icon name
* update pencil-alt to pencil font awesome 6 icon name
2024-12-03 10:28:39 +08:00
David Taylor
b47ae6d437
UX: Strip multiline comments in github oneboxes (#30040)
We were already stripping comments from GitHub issue/PR oneboxes, but the regex was not correctly matching multiline comments.
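
The commit message does not show the regex itself, so the following only illustrates the general pitfall in Ruby: without the `m` flag, `.` does not match newlines. The sample body text is made up.

```ruby
body = "Fixes the bug.\n<!--\nPlease describe\nyour change\n-->\nThanks! <!-- inline -->"

# Without /m, `.` does not match newlines, so only the inline comment is removed.
single_line = body.gsub(/<!--.*?-->/, "")
# With /m, `.` also matches newlines and the multiline comment is removed too.
multi_line = body.gsub(/<!--.*?-->/m, "")
```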
2024-12-02 18:08:55 +00:00
Alan Guo Xiang Tan
44a81069ac
DEV: Avoid creating system message when system user initiates restore (#30027)
There is no point creating a message for the system user since it is a
non-human user.
2024-12-02 16:13:38 +08:00
Régis Hanol
7d58793759
DEV: deduplicate inline styles in emails (#30015)
In order to limit issues with duplicate inline CSS definitions, this will now deduplicate inline CSS styles with the "last-to-be-defined-wins" strategy.

Also removes unnecessary whitespace in inline styles.

Context - https://meta.discourse.org/t/resolve-final-styles-in-email-notifications/310219
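
A minimal sketch of the "last-to-be-defined-wins" strategy on a single `style` attribute. `dedupe_inline_style` is a hypothetical helper, and splitting on `;` is a simplification that would break values containing semicolons, such as data URIs.

```ruby
# Keeps one declaration per property (the last one) and strips extra whitespace.
def dedupe_inline_style(style)
  declarations = {}
  style.split(";").each do |declaration|
    property, value = declaration.split(":", 2)
    next if value.nil?

    property = property.strip.downcase
    value = value.strip
    next if property.empty? || value.empty?

    # Re-insert so the surviving declaration keeps the position of the last one.
    declarations.delete(property)
    declarations[property] = value
  end
  declarations.map { |property, value| "#{property}:#{value}" }.join(";")
end

deduped = dedupe_inline_style("color: red;  margin: 0 ; color: blue;")
```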

Co-authored-by: Thomas Kalka <thomas.kalka@gmail.com>
2024-11-30 16:38:45 +01:00
Régis Hanol
20d46c9583
PERF: only diff HTML / Markdown when needed (#30014)
When serializing the `body_changes` in the `PostRevisionSerializer`, we create two diffs: one for the `cooked` and another one for the `raw` version of the post.

Inside `DiscourseDiff`, we generate both `html` and `markdown` diffs when we only need the `html` diffs for the `cooked` version of the post and the `markdown` diff for the `raw` version of the post.

This solves the issue reported in https://meta.discourse.org/t/server-error-accessing-topic-revisions-on-a-specific-topic/339185 where some revisions would return 500 because of an `ArgumentError : Attributes per element limit exceeded` exception when trying to generate the `html` diff on a very large `raw`.
2024-11-30 16:30:30 +01:00
Jarek Radosz
85ead5ac7a
Revert "FIX: deduplicate css in mails (#30003)" (#30013)
This reverts commit 6e726d436f.

The specs were failing in the original PR but the CI didn't run.
2024-11-30 15:32:32 +01:00
Thomas Kalka
6e726d436f
FIX: deduplicate css in mails (#30003)
Feature: Resolve final styles in email notifications

Context - https://meta.discourse.org/t/resolve-final-styles-in-email-notifications/310219
2024-11-30 14:51:02 +01:00
Hoa Nguyen
607dd2cbd8
DEV: improve the plugin:spec rake task (#29050)
Allow the plugin:spec task to receive a test file path, rather than always running all of the plugin's tests.
2024-11-29 06:33:14 +11:00
Bianca Nenciu
5abee8ac6b
DEV: Log number of live slots used by requests (#29884) 2024-11-28 18:25:48 +02:00
Loïc Guitaut
88f1b3b195 DEV: Try fixing flaky spec related to Scheduler::Defer
Checking if a connection is available is probably not enough, when the
connection is available, we should still verify it’s not stale.
2024-11-28 15:30:13 +01:00
Loïc Guitaut
f69f0211df DEV: Fix flaky spec related to Scheduler::Defer
In some cases in CI env, it seems the AR connection isn’t available and
the `ensure` block is executed. It’s calling `#verify!` on the
connection, so it can fail sometimes. This is probably why
`#clear_active_connections!` was failing too sometimes.

Here, we just check the connection is present before clearing the
connections.
2024-11-28 11:46:52 +01:00
Sam
07813ba83c
DEV: fix hanging spec (#29974) 2024-11-28 11:06:19 +08:00
Sam
72132c35fb
DEV: fix flaky spec (#29972)
Spec was flaky cause work could still be in pipeline after the defer
length is 0. Our length denotes the backlog, not the in progress
count.

This adds a mechanism for gracefully stopping the queue and avoids
wait_for calls
2024-11-28 11:21:35 +11:00
Angus McLeod
6acf673f8d
FIX: topic post counts for webhook post_destroyed event (#29853)
* FIX: topic post counts for webhook post_destroyed event

- Generate webhook data after posts are destroyed
- Don't count user_deleted posts

* Remove unnecessary conditional
2024-11-27 11:36:51 -08:00
Loïc Guitaut
fac6147039 DEV: Verify DB connection before trying to clear active connections 2024-11-27 18:12:11 +01:00
Loïc Guitaut
d6bec460a8 DEV: Upgrade Rails to version 7.2 2024-11-27 10:48:47 +01:00