This commit replaces the `full_name_required` setting with a new `full_name_requirement` setting to allow more flexibility with the name field in the signup form. The new setting has 2 options, "Required at signup" and "Optional at signup", which are equivalent to the true/false possibilities of the old setting, and a third option "Hidden at signup" that hides the name field from the signup form, making it effectively optional too.
New sites will have the "Hidden at signup" option as the default option, and existing site will continue to use the option that maps to their current configuration.
Internal topic: t/136746.
This commit adds the `add_request_rate_limiter` plugin API which allows plugins to add custom rate limiters on top of the default rate limiters which requests by a user's id or the request's IP address.
Example to add a rate limiter that rate limits all requests from Googlebot under the same rate limit bucket:
```
add_request_rate_limiter(
identifier: :country,
key: ->(request) { "country/#{DiscourseIpInfo.get(request.ip)[:country]}" },
activate_when: ->(request) { DiscourseIpInfo.get(request.ip)[:country].present? },
)
```
This PR involves cleaning up the codebase from my (@keegangeorge's) todos.
In particular:
- Remove Form Template related todos (these are no longer in the roadmap)
- Remove old left-over AI summarization related code after moving to AI (https://github.com/discourse/discourse-ai/pull/658)
- Update one form template related spec
* Split `shutdown` into two separate methods for better control:
- `shutdown` - signals threads to stop accepting new work
- `wait_for_termination` - waits for threads to finish (with optional timeout)
* Add tracking of busy threads via `@busy_threads` Set
* Make idle_time parameter optional with 30-second default
* Improve thread spawning logic:
- Spawn initial thread immediately when work is posted
- Spawn additional threads when all threads are busy and work is queued
* Fix race condition in work distribution
* Add busy thread count to stats output
* Add test coverage for zero min_threads configuration
This commit makes the ThreadPool more reliable, easier to use, and adds
better visibility into its internal state.
---------
Co-authored-by: Alan Guo Xiang Tan <gxtan1990@gmail.com>
This commit introduces a new ThreadPool class that provides efficient worker
thread management for background tasks. Key features include:
- Dynamic scaling from min to max threads based on workload
- Proper database connection management in multisite setup
- Graceful shutdown with task completion
- Robust error handling and logging
- FIFO task processing with a managed queue
- Configurable idle timeout for worker threads
The implementation is thoroughly tested, including stress tests, error
scenarios, and multisite compatibility.
There’s currently a bug when using a dedicated class as a policy in
services: if that class delegates its `#call` method (to an underlying
strategy object for example), then an error will be raised saying steps
aren’t allowed to provide default parameters.
This should not happen, and this patch fixes that issue.
This commit reimplements how we monitor Sidekiq processes that are
forked from the Unicorn master process. Prior to this change, we rely on
`Jobs::Heartbeat` to enqueue a `Jobs::RunHeartbeat` job every 3 minutes.
The `Jobs::RunHeartbeat` job then sets a Redis key with a timestamp. In
the Unicorn master process, we then fetch the timestamp that has been set
by the job from Redis every 30 minutes. If the timestamp has not been
updated for more than 30 minutes, we restart the Sidekiq process. The
fundamental flaw with this approach is that it fails to consider
deployments with multiple hosts and multiple Sidekiq processes. A
sidekiq process on a host may be in a bad state but the heartbeat check
will not restart the process because the `Jobs::RunHeartbeat` job is
still being executed by the working Sidekiq processes on other hosts.
In order to properly ensure that stuck Sidekiq processs are restarted,
we now rely on the [Sidekiq::ProcessSet](https://github.com/sidekiq/sidekiq/wiki/API#processes)
API that is supported by Sidekiq. The API provides us with "near real-time (updated every 5 sec)
info about the current set of Sidekiq processes running". The API
provides useful information like the hostname, pid and also when Sidekiq
last did its own heartbeat check. With that information, we can easily
determine if a Sidekiq process needs to be restarted from the Unicorn
master process.
`new_in_category` was using `first` instead of `limit`
This meant it gets an array and that means that you can not operate on it easily in a modifier.
This ensures we always give the modifier a relation, with the notable exception of suggested topics.
Previously, theme hbr files were compiled to an IIFE, which would be executed before the app is booted. That is causing silenced deprecations to be printed, because the deprecation-workflow isn't set up when the IIFE is run.
This commit updates the theme compiler so that it matches the ember-cli-based raw-hbs compiler. Templates are output to normal modules, which will then be loaded by the existing `eager-load-raw-templates` initializer. This runs after the app has started booting.
This reverts commit 9694dc6cb0.
Some of our previous email styling depended on this 'incorrect' ordering, so the change caused some text to become illegible. Reverting while we work out a better solution
Currently, there are two ways (kind of) for accessing `params` inside a
service:
- when there is no contract or it hasn’t been reached yet, `params` is
just the hash that was provided to the service. To access a key, you
have to use the bracket notation `params[:my_key]`.
- when there is a contract and it has been executed successfully,
`params` now references the contract and the attributes are accessible
using methods (`params.my_key`).
This patch unifies how `params` exposes its attributes. Now, even if
there is no contract at all in a service, `params` will expose its
attributes through methods, that way things are more consistent.
This patch also makes sure there is always a `params` object available
even when no `params` key is provided to the service (this allows a
contract to fail because its attributes are blank instead of having the
service raising an error because it doesn’t find `params` in its context).
We can't delete the file from disk as some of the assets are still
served by the app instead of going through the S3 bucket. It is a bug we
need to fix but it also means this ENV is unsafe now. Just drop the env
until we ensure all assets requested by the app are requested from the
S3 bucket directly.
This patch aims to improve the steps inspector output:
- The service class name is displayed at the top.
- Next to each step is displayed the time it took to run said step.
- Steps that didn’t run are hidden.
- `#inspect` automatically outputs the error when it is present.
It doesn't make much sense to have the content of a `<details>` in an excerpt so I replaced them with "▶ summary" instead.
That way, they can't be (ab)used in user cards for example.
Reference - https://meta.discourse.org/t/335094
The new name may be too long for the bookmarks.name column and raise an
exception. This changes allows the remapper to truncate the new value to
fit (truncates to 100 characters).
The `max_compress?` logic is totally broken at least when used for
brotli compression because we are only seeing 4 assets subjected to the
max compression level in production. Instead of fixing the broken logic,
we should just drop this unnecessary complexity cause things are easier
to reason about when we only have one compression level to deal with
across all assets.
Now that we run the `upload` method in different threads, we need to
synchronize writes to `STDOUT` which we can do so by using a `Logger`.
Follow-up to 49e8353959
The test was flaky and failing with the following errors:
```
Failure/Error:
klass
.connection
.select_raw(relation.arel) do |result, _|
result.type_map = DB.type_map
result.nfields == 1 ? result.column_values(0) : result.values
end
NoMethodError:
undefined method `select_raw' for nil
./lib/freedom_patches/fast_pluck.rb:60:in `pluck'
./vendor/bundle/ruby/3.3.0/gems/activerecord-7.2.2.1/lib/active_record/relation/calculations.rb:354:in `pick'
./app/models/web_crawler_request.rb:27:in `request_id'
./app/models/web_crawler_request.rb:31:in `rescue in request_id'
./app/models/web_crawler_request.rb:26:in `request_id'
./app/models/web_crawler_request.rb:19:in `write_cache!'
./app/models/concerns/cached_counting.rb:135:in `block (3 levels) in flush_to_db'
./vendor/bundle/ruby/3.3.0/gems/rails_multisite-6.1.0/lib/rails_multisite/connection_management/null_instance.rb:49:in `with_connection'
./vendor/bundle/ruby/3.3.0/gems/rails_multisite-6.1.0/lib/rails_multisite/connection_management.rb:21:in `with_connection'
./app/models/concerns/cached_counting.rb:134:in `block (2 levels) in flush_to_db'
./app/models/concerns/cached_counting.rb:124:in `each'
./app/models/concerns/cached_counting.rb:124:in `block in flush_to_db'
./lib/distributed_mutex.rb:53:in `block in synchronize'
./lib/distributed_mutex.rb:49:in `synchronize'
./lib/distributed_mutex.rb:49:in `synchronize'
./lib/distributed_mutex.rb:34:in `synchronize'
./app/models/concerns/cached_counting.rb:120:in `flush_to_db'
./app/models/concerns/cached_counting.rb:187:in `perform_increment!'
./app/models/web_crawler_request.rb:15:in `increment!'
./lib/middleware/request_tracker.rb:74:in `log_request'
./lib/middleware/request_tracker.rb:409:in `block in log_later'
./lib/scheduler/defer.rb:125:in `block in do_work'
./vendor/bundle/ruby/3.3.0/gems/rails_multisite-6.1.0/lib/rails_multisite/connection_management/null_instance.rb:49:in `with_connection'
./vendor/bundle/ruby/3.3.0/gems/rails_multisite-6.1.0/lib/rails_multisite/connection_management.rb:21:in `with_connection'
./lib/scheduler/defer.rb:119:in `do_work'
./lib/scheduler/defer.rb:105:in `block (2 levels) in start_thread'
```
This was due to running the defer thread in an async manner which is
actually no representative of the production environment. It also
revealed a spot in our code base where writes are happening in a GET
request which can cause requests to fail if ActiveRecord is in readonly
mode.
This reverts commit 766ff723f8.
Ensure that we create the sidekiq log file first before opening it for
logging. This avoids any issue of the log file not being present when we
initialize an instance of the `Logger`.
We identify and deny blocked crawlers here in anonymous_cache.
Separating the notion of the crawler identifier here lets plugins perform an
override if they perform more advanced detection.
We've seen in some communities abuse of user profile where bios and other fields are used in malicious ways, such as malware distribution. A common pattern between all the abuse cases we've seen is that the malicious actors tend to have 0 posts and have a low trust level.
To eliminate this abuse vector, or at least make it much less effective, we're making the following changes to user profiles:
1. Anonymous, TL0 and TL1 users cannot see any user profiles for users with 0 posts except for staff users
2. Anonymous and TL0 users can only see profiles of TL1 users and above
Users can always see their own profile, and they can still hide their profiles via the "Hide my public profile" preference. Staff can always see any user's profile.
Internal topic: t/142853.
Followup c7e471d35a
It is currently possible to add a bundle (which is a collection
of actions used for a dropdown on the client) for a reviewable
via actions.add_bundle and then never add any actions to it.
This causes the client to explode, as seen in the referenced
commit, because of the way our store expects to resolve objects
referenced by ID that are passed down by the serializer, which
then causes Ember to have an unrecoverable render error.
Fixing this on the serializer level is not really possible because
of all the ActiveModel::Serializer magic that serializes
objects by ID reference when doing things like has_many.
`Reviewable#actions_for` is a better place to do this anyway,
because this is the main location where the bundles and actions
are built for every action via the serializer.
Currently only system flags are translated. When we send message to the user that their post was deleted because of custom flag, we should default to custom flag name.
* DEV: unsilence deprecation warnings for old Font Awesome icon names
* update fa-user to user font awesome icon name
* update pencil-alt to pencil font awesome 6 icon name
In order to limit issues with duplicate inline CSS definitions, this will now deduplicate inline CSS styles with the "last-to-be-defined-wins" strategy.
Also removes unecessary whitespaces in inline styles.
Context - https://meta.discourse.org/t/resolve-final-styles-in-email-notifications/310219
Co-authored-by: Thomas Kalka <thomas.kalka@gmail.com>
When serializing the `body_changes` in the `PostRevisionSerializer`, we create two diffs: one for the `cooked` and another one for the `raw` version of the post.
Inside `DiscourseDiff`, we generate both `html` and `markdown` diffs when we only need the `html` diffs for the `cooked` version of the post and the `markdown` diff for the `raw` version of the post.
This solves the issue repored in https://meta.discourse.org/t/server-error-accessing-topic-revisions-on-a-specific-topic/339185 where some revisions would return 500 because of a `ArgumentError : Attributes per element limit exceeded` exception when trying to generate the `html` diff on a very large `raw`.
In some cases in CI env, it seems the AR connection isn’t available and
the `ensure` block is executed. It’s calling `#verify!` on the
connection, so it can fail sometimes. This is probably why
`#clear_active_connections!` was failing too sometimes.
Here, we just check the connection is present before clearing the
connections.
Spec was flaky cause work could still be in pipeline after the defer
length is 0. Our length denotes the backlog, not the in progress
count.
This adds a mechanism for gracefully stopping the queue and avoids
wait_for callse