This commit adds a `MiniSchedulerLongRunningJobLogger` class which will
poll every 60 seconds for mini_scheduler jobs which are stuck. When it
detects that a job is stuck, it will log a warning message with the
current backtrace of the thread that is executing the job.
Note that for scheduled jobs which are executed at a frequency of less
than 30 minutes, we will log when the job has been executing for 30
minutes.
For scheduled jobs executed at a frequency of less than 2 hours, we will
log when the job has been executing for a duration greater than its
specified frequency.
For scheduled jobs executed at a frequency greater than 2 hours, we will
log as long as the job has been executing for more than 2 hours.
This commit patches `Net::HTTP` to reduce the default timeouts of 60
seconds when we are processing a request. There are certain routes in
Discourse which makes external requests and if the proper timeouts are
not set, we risk having the Unicorn master process force restarting the
Unicorn workers once the `30` seconds timeout is reached. This can
potentially become a vector for DoS attacks and this commit is aimed at
reducing the risk here.
Previously, we couldn't change the user agent name dynamically for onebox requests. In this commit, a new hidden site setting `onebox_user_agent` is created to override the default user agent value specified in the [initializer](c333e9d6e6/config/initializers/100-onebox_options.rb (L15)).
Co-authored-by: Régis Hanol <regis@hanol.fr>
`after_commit` should be used before refreshing processes to be sure that the database is already updated.
Also, MessageBus is used instead of events as MessageBus works correctly with many processes;
This site setting has always been experimental and hidden since it was
added 7 years ago. Drop it to simplify the way we enable logging in a
logstash friendly way.
This commit updates the `101-lograge` initializer to unsubscribe from
log events logged by `ActionView::LogSubscriber`. This is what the `lograge`
gem has been doing but its implementation is not compatible with Rails
7.1 and we started getting noise in our logs when lograge is enabled.
This commit monkey patches `Rails::Rack::Logger` to not log reqeust
information like `Started GET "/service-worker.js" for 127.0.0.1 at 2024-07-05 15:28:12 +0800`
when lograge is enabled. This was previously excluded by a monkey patch
in the `lograge` gem but that monkey patch has since broke and the gem
is unmaintained.
This commit rewrites `DiscourseLogstashLogger` to not be an instance
of `LogstashLogger`. The reason we don't want it to be an instance of
`LogstashLogger` is because we want the new logger to be chained to
Logster's logger which can then pass down useful information like the
request's env and error backtraces which Logster has already gathered.
Note that this commit does not bother to maintain backwards
compatibility and drops the `LOGSTASH_URI` and `UNICORN_LOGSTASH_URI`
ENV variables which were previously used to configure the destination in
which `logstash-logger` would send the logs to. Instead, we introduce
the `ENABLE_LOGSTASH_LOGGER` ENV variable to replace both ENV and remove
the need for the log paths to be specified. Note that the previous
feature was considered experimental as stated in d888d3c54c
and the new feature should be considered experimental as well. The code
may be moved into a plugin in the future.
Followup 2f2da72747
This commit moves topic view tracking from happening
every time a Topic is requested, which is susceptible
to inflating numbers of views from web crawlers, to
our request tracker middleware.
In this new location, topic views are only tracked when
the following headers are sent:
* HTTP_DISCOURSE_TRACK_VIEW - This is sent on every page navigation when
clicking around the ember app. We count these as browser page views
because we know it comes from the AJAX call in our app. The topic ID
is extracted from HTTP_DISCOURSE_TRACK_VIEW_TOPIC_ID
* HTTP_DISCOURSE_DEFERRED_TRACK_VIEW - Sent when MessageBus initializes
after first loading the page to count the initial page load view. The
topic ID is extracted from HTTP_DISCOURSE_DEFERRED_TRACK_VIEW.
This will bring topic views more in line with the change we
made to page views in the referenced commit and result in
more realistic topic view counts.
* DEV: Upgrade Rails to 7.1
* FIX: Remove references to `Rails.logger.chained`
`Rails.logger.chained` was provided by Logster before Rails 7.1
introduced their broadcast logger. Now all the loggers are added to
`Rails.logger.broadcasts`.
Some code in our initializers was still using `chained` instead of
`broadcasts`.
* DEV: Make parameters optional to all FakeLogger methods
* FIX: Set `override_level` on Logster loggers (#27519)
A followup to f595d599dd
* FIX: Don’t duplicate Rack response
---------
Co-authored-by: Jarek Radosz <jradosz@gmail.com>
* DEV: Upgrade Rails to 7.1
* FIX: Remove references to `Rails.logger.chained`
`Rails.logger.chained` was provided by Logster before Rails 7.1
introduced their broadcast logger. Now all the loggers are added to
`Rails.logger.broadcasts`.
Some code in our initializers was still using `chained` instead of
`broadcasts`.
* DEV: Make parameters optional to all FakeLogger methods
* FIX: Set `override_level` on Logster loggers (#27519)
A followup to f595d599dd
* FIX: Don’t duplicate Rack response
---------
Co-authored-by: Jarek Radosz <jradosz@gmail.com>
* Revert "FIX: Set `override_level` on Logster loggers (#27519)"
This reverts commit c1b0488c54.
* Revert "DEV: Make parameters optional to all FakeLogger methods"
This reverts commit 3318dad7b4.
* Revert "FIX: Remove references to `Rails.logger.chained`"
This reverts commit f595d599dd.
* Revert "DEV: Upgrade Rails to 7.1"
This reverts commit 081b00391e.
`Rails.logger.chained` was provided by Logster before Rails 7.1
introduced their broadcast logger. Now all the loggers are added to
`Rails.logger.broadcasts`.
Some code in our initializers was still using `chained` instead of
`broadcasts`.
This commits introduces the `sidekiq_report_long_running_jobs_minutes`
global setting which allows a site administrator to log a warning in the
Rails log when a Sidekiq job has been running for too long.
The warning is logged with the backtrace of the thread that is
processing the Sidekiq job to make it easier to figure out what a
sidekiq job is stuck on.
This commit introduces a hidden `s3_inventory_bucket` site setting which
replaces the `enable_s3_inventory` and `s3_configure_inventory_policy`
site setting.
The reason `enable_s3_inventory` and `s3_configure_inventory_policy`
site settings are removed is because this feature has technically been
broken since it was introduced. When the `enable_s3_inventory` feature
is turned on, the app will because configure a daily inventory policy for the
`s3_upload_bucket` bucket and store the inventories under a prefix in
the bucket. The problem here is that once the inventories are created,
there is nothing cleaning up all these inventories so whoever that has
enabled this feature would have been paying the cost of storing a whole
bunch of inventory files which are never used. Given that we have not
received any complains about inventory files inflating S3 storage costs,
we think that it is very likely that this feature is no longer being
used and we are looking to drop support for this feature in the not too
distance future.
For now, we will still support a hidden `s3_inventory_bucket` site
setting which site administrators can configure via the
`DISCOURSE_S3_INVENTORY_BUCKET` env.
Our 'page_view_crawler' / 'page_view_anon' metrics are based purely on the User Agent sent by clients. This means that 'badly behaved' bots which are imitating real user agents are counted towards 'anon' page views.
This commit introduces a new method of tracking visitors. When an initial HTML request is made, we assume it is a 'non-browser' request (i.e. a bot). Then, once the JS application has booted, we notify the server to count it as a 'browser' request. This reliance on a JavaScript-capable browser matches up more closely to dedicated analytics systems like Google Analytics.
Existing data collection and graphs are unchanged. Data collected via the new technique is available in a new 'experimental' report.
- Run the CSP-nonce-related middlewares on the generated response
- Fix the readonly mode checking to avoid empty strings being passed (the `check_readonly_mode` before_action will not execute in the case of these re-dispatched exceptions)
- Move the BlockRequestsMiddleware cookie-setting to the middleware, so that it is included even for unusual HTML responses like these exceptions
This is to enable :array type attributes for Contract
attributes in services, this is a followup to the move
of services from chat to core here:
cab178a405
Co-authored-by: Joffrey JAFFEUX <j.jaffeux@gmail.com>
Why this change?
This regressed in 6e9fbb5bab because we
had a `request.xhr?` check before we decide to block requests. However,
there could not none-xhr requests which we need to block as well at the
end of each system test when `@@block_requests` is true.
This also reverts commit 6437f27f90.
Why this change?
This reverts 725561cf4b as it did not
address the root cause of the problem even though it fixed the failing tests we were seeing
when running `bundle exec rspec --tag ~type:multisite --order random:776 spec/system/admin_customize_form_templates_spec.rb spec/system/admin_sidebar_navigation_spec.rb spec/system/admin_site_setting_search_spec.rb spec/system/composer/dont_feed_the_trolls_popup_spec.rb spec/system/composer/review_media_unless_trust_level_spec.rb spec/system/create_account_spec.rb spec/system/editing_sidebar_tags_navigation_spec.rb spec/system/email_change_spec.rb spec/system/emojis/emoji_deny_list_spec.rb spec/system/group_activity_spec.rb spec/system/hashtag_autocomplete_spec.rb spec/system/network_disconnected_spec.rb spec/system/post_menu_spec.rb spec/system/post_small_action_spec.rb spec/system/tags_intersection_spec.rb spec/system/topic_list_focus_spec.rb spec/system/topic_page_spec.rb spec/system/user_page/user_profile_info_panel_spec.rb spec/system/viewing_group_members_spec.rb spec/system/viewing_navigation_menu_preferences_spec.rb`.
The root cause here is that `before_action`s added to a controller is
order dependent. As such, some requests were not setting the cookie
because the `before_action` callback was not even hit as a prior
`before_action` callbacks has raised an error such as the `check_xhr`
`before_action` callback.
To resolve the problem, we need to add the `prepend: true` option in
our monkey patch of `ApplicationController` to ensure that the
`before_action` callback which we have added is always run first.
This change also makes a couple of changes:
1. Improve the response body when a request is blocked by the `BlockRequestsMiddleware` middleware
so that it makes debugging easier.
2. Only set the cookies for non-xhr HTML format requests. Setting it for
other formats is kind of pointless.
Why this change?
We noticed that running `LOAD_PLUGINS=1 rspec --seed=38855 plugins/chat/spec/system/chat_new_message_spec.rb` locally
results in the system tests randomly failing. When we inspected the
request logs closely, we noticed that a `/presence/get` request from a
previous rspec example was being processed when a new rspec example is
already being run. We know it was from the previous rspec example
because inspecting the auth token showed the request using the auth
token of a user from the previous example. However, when a request using
an auth token from a previous example is used it ends up logging out the
same user on the server side because the user id in the cookie is the same
due to the use of `fab!`.
I did some research and there is apparently no way to wait until all
inflight requests by the browser has completed through capybara or
selenium. Therefore, we will add an identifier by attaching a cookie to all non-xhr requests so that
xhr requests which are triggered subsequently will contain the cookie in the request.
In the `BlockRequestsMiddleware` middleware, we will then reject any
requests when the value of the identifier in the cookie does not match the current rspec's example
location.
To see the problem locally, change `Auth::DefaultCurrentUserProvider.find_v1_auth_cookie` to the following:
```
def self.find_v1_auth_cookie(env)
return env[DECRYPTED_AUTH_COOKIE] if env.key?(DECRYPTED_AUTH_COOKIE)
env[DECRYPTED_AUTH_COOKIE] = begin
request = ActionDispatch::Request.new(env)
cookie = request.cookies[TOKEN_COOKIE]
# don't even initialize a cookie jar if we don't have a cookie at all
if cookie&.valid_encoding? && cookie.present?
puts "#{env["REQUEST_PATH"]} #{request.cookie_jar.encrypted[TOKEN_COOKIE]&.with_indifferent_access}"
request.cookie_jar.encrypted[TOKEN_COOKIE]&.with_indifferent_access
end
end
end
```
After which run the following command: `LOAD_PLUGINS=1 rspec --format documentation --seed=38855 plugins/chat/spec/system/chat_new_message_spec.rb`
It takes a few tries but the last spec should fail and you should see something like this:
```
assets/chunk.c16f6ba8b6824baa47ac.d41d8cd9.js {"token"=>"37d995a4b65395d3b343ec70fff915b4", "user_id"=>3382, "username"=>"bruce0", "trust_level"=>1, "issued_at"=>1708591735}
/assets/chunk.050148142e1d2dc992dd.d41d8cd9.js {"token"=>"37d995a4b65395d3b343ec70fff915b4", "user_id"=>3382, "username"=>"bruce0", "trust_level"=>1, "issued_at"=>1708591735}
/chat/api/channels/527/messages {"token"=>"37d995a4b65395d3b343ec70fff915b4", "user_id"=>3382, "username"=>"bruce0", "trust_level"=>1, "issued_at"=>1708591735}
/uploads/default/test_0/optimized/1X/_129430568242d1b7f853bb13ebea28b3f6af4e7_2_512x512.png {"token"=>"37d995a4b65395d3b343ec70fff915b4", "user_id"=>3382, "username"=>"bruce0", "trust_level"=>1, "issued_at"=>1708591735}
redirects to existing chat channel
redirects to chat channel if recipients param is missing (PENDING: Temporarily skipped with xit)
with multiple users
/favicon.ico {"token"=>"9a75c114c4d3401509a23d240f0a46d4", "user_id"=>3382, "username"=>"bruce0", "trust_level"=>1, "issued_at"=>1708591736}
/chat/new-message {"token"=>"9a75c114c4d3401509a23d240f0a46d4", "user_id"=>3382, "username"=>"bruce0", "trust_level"=>1, "issued_at"=>1708591736}
/presence/get {"token"=>"37d995a4b65395d3b343ec70fff915b4", "user_id"=>3382, "username"=>"bruce0", "trust_level"=>1, "issued_at"=>1708591735}
```
Note how the `/presence/get` request is using a token from the previous example.
Co-authored-by: David Taylor <david@taylorhq.com>
This commit introduces the possibility to stream messages. To allow plugins to use streaming this commit also ships a `ChatSDK` library to allow to interact with few parts of discourse chat.
```ruby
ChatSDK::Message.create_with_stream(raw: "test") do |helper|
5.times do |i|
is_streaming = helper.stream(raw: "more #{i}")
next if !is_streaming
sleep 2
end
end
```
This commit also introduces all the frontend parts:
- messages can now be marked as streaming
- when streaming their content will be updated when a new content is appended
- a special UI will be showing (a blinking indicator)
- a cancel button allows the user to stop the streaming, when cancelled `helper.stream(...)` will return `false`, and the plugin can decide exit early
The strict-dynamic CSP directive is supported in all our target browsers, and makes for a much simpler configuration. Instead of allowlisting paths, we use a per-request nonce to authorize `<script>` tags, and then those scripts are allowed to load additional scripts (or add additional inline scripts) without restriction.
This becomes especially useful when admins want to add external scripts like Google Tag Manager, or advertising scripts, which then go on to load a ton of other scripts.
All script tags introduced via themes will automatically have the nonce attribute applied, so it should be zero-effort for theme developers. Plugins *may* need some changes if they are inserting their own script tags.
This commit introduces a strict-dynamic-based CSP behind an experimental `content_security_policy_strict_dynamic` site setting.
Why this change?
We have been debugging flaky system tests and noticed in https://github.com/discourse/discourse/actions/runs/7911902047/job/21596791343?pr=25690
that ActiveRecord connection checkout timeouts are encountered because
the Capybara server thread is processing requests even after
`Capybara.reset_session!` and ActiveRecord's `teardown_fixtures` have already been call.
The theory here is that an inflight request can still hit the Capybara
server even after `Capybara.reset_session!` has been called and end up
eating up an ActiveRecord connection for too long and also messing with
the database outside of a transaction.
What does this change do?
This change adds a `BlockRequestsMiddleware` middleware in the test
environment which is enabled to reject all incoming requests at the end
of each system test and before `Capybara.reset_session!` is called. At
the start of each RSpec test, the middleware is disabled again.
These routes were previously rendered using Rails, and had a fairly fragile 2fa implementation in vanilla-js. This commit refactors the routes to be handled in the Ember app, removes the custom vanilla-js bundles, and leans on our centralized 2fa implementation. It also introduces a set of system specs for the behavior.
We add `Access-Control-Allow-Origin: *` to all asset requests which are requested via a configured CDN. This is particularly important now that we're using browser-native `import()` to load the highlightjs bundle. Unfortunately, user-configurable 'cors_origins' site setting was overriding the wldcard value on CDN assets and causing CORS errors.
This commit updates the logic to give the `*` value precedence, and adds a spec for the situation. It also invalidates the cache of hljs assets (because CDNs will have cached the bad Access-Control-Allow-Origin header).
The rack-cors middleware is also slightly tweaked so that it is always inserted. This makes things easier to test and more consistent.
This value is included when generating static asset URLs. Updating the value will allow site operators to invalidate all asset urls to recover from configuration issues which may have been cached by CDNs/browsers.
Why this change?
As the number of themes which the Discourse team supports officially
grows, we want to ensure that changes made to Discourse core do not
break the plugins. As such, we are adding a step to our Github actions
test job to run the QUnit tests for all official themes.
What does this change do?
This change adds a new job to our tests Github actions workflow to run the QUnit
tests for all official plugins. This is achieved with the following
changes:
1. Update `testem.js` to rely on the `THEME_TEST_PAGES` env variable to set the
`test_page` option when running theme QUnit tests with testem. The
`test_page` option [allows an array to be specified](https://github.com/testem/testem#multiple-test-pages) such that tests for
multiple pages can be run at the same time. We are relying on a ENV variable
because the `testem` CLI does not support passing a list of pages
to the `--test_page` option.
2. Support a `/testem-theme-qunit/:testem_id/theme-qunit` Rails route in the development environment. This
is done because testem prefixes the path with a unique ID to the configured `test_page` URL.
This is problematic for us because we proxy all testem requests to the
Rails server and testem's proxy configuration option does not allow us
to easily rewrite the URL to remove the prefix. Therefore, we configure a proxy in testem to prefix `theme-qunit` requests with
`/testem-theme-qunit` which can then be easily identified by the Rails server and routed accordingly.
3. Update `qunit:test` to support a `THEME_IDS` environment variable
which will allow it to run QUnit tests for multiple themes at the
same time.
4. Support `bin/rake themes:qunit[ids,"<theme_id>|<theme_id>"]` to run
the QUnit tests for multiple themes at the same time.
5. Adds a `themes:qunit_all_official` Rake task which runs the QUnit
tests for all the official themes.
- Remove the wildcard crawler. This was already excluding almost all file types, but the exclude list was missing '.gjs' which meant those files were unnecessarily being hoisted into the `public/` directory during precompile
- Automatically include all ember-cli-generated assets without needing them to be listed. The main motivation for this change is to allow us to start using async imports via Embroider/Webpack. The filenames for those new async bundles will not be known in advance.
- Skips sprockets fingerprinting on Embroider/Webpack chunk JS files. Their filenames already include a fingerprint, and having sprockets change the filenames will cause problems for the async import feature (where filenames are included deep inside js bundles)
This commit also updates our ember-cli build so that it skips building plugin tests in the production environment. This should provide a slight build speed improvement.
Why this change?
This ensures that malicious requests cannot end up causing the logs to
quickly fill up. The default chosen is sufficient for most legitimate
requests to the Discourse application.
When truncation happens, parsing of logs in supported format like
lograge may break down.
Why this change?
This is a follow up to e8f7b62752.
Tracking of GC stats didn't really belong in the `MethodProfiler` class
so we want to extract that concern into its own class.
As part of this PR, the `track_gc_stat_per_request` site setting has
also been renamed to `instrument_gc_stat_per_request`.