Why this change?
We have been seeing specs time out on GitHub CI, but we are unable to debug
those timeouts due to a lack of information. This change prints out the
backtraces of all threads right before a spec times out on CI.
What does this change do?
1. Starts a thread on CI which will wait for a spec to start running.
2. Once a spec starts running, the thread sleeps for
`PER_SPEC_TIMEOUT_SECONDS - 1` seconds.
3. After sleeping, the thread checks if the spec is still running and, if it
is, prints out the backtraces of all threads. Otherwise, the thread does
nothing and runs the next loop.
4. At the end of each spec run, we ensure that the thread is back in a
waiting state for the next spec run to start.
Note that there is no need for us to tear down or clean up the thread
since the process terminates after running all the tests.
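A rough sketch of the flow described above (the constant value, the Queue-based hand-off, and the hook placement are assumptions rather than the exact implementation):
```ruby
# Hypothetical sketch of the CI spec-timeout watchdog; not the exact code.
PER_SPEC_TIMEOUT_SECONDS = 45 # assumed value

if ENV["GITHUB_ACTIONS"]
  spec_started = Queue.new
  spec_finished = Queue.new

  Thread.new do
    loop do
      spec_started.pop # block until an example begins running
      sleep PER_SPEC_TIMEOUT_SECONDS - 1

      if spec_finished.empty?
        # The spec is still running right before it would time out, so dump
        # the backtrace of every live thread for debugging.
        Thread.list.each do |thread|
          puts "\n#{thread.inspect}:"
          puts (thread.backtrace || []).join("\n")
        end
      else
        spec_finished.pop # the spec finished in time; wait for the next one
      end
    end
  end

  RSpec.configure do |config|
    config.around do |example|
      spec_started << true
      example.run
      spec_finished << true
    end
  end
end
```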
Why this change?
We have been looking into a flaky system test in one of our plugins
where the DB transaction flow can be messed up from time to time. Our
debugging effort is complicated by the fact that `test-prof` starts a
DB transaction in a `before(:all)` block, which makes it hard to properly
log information. Allowing test-prof to be disabled completely should make
it easier for us to isolate the problem we are investigating.
What does this change do?
1. Avoid loading test-prof files if the `PREFABRICATION` env var has been set
to `0`.
2. Set `PREFABRICATION=0` for plugin system tests in GitHub Actions.
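A minimal sketch of the guard, assuming the standard test-prof recipe requires are what gets skipped:
```ruby
# Sketch: skip loading test-prof's prefabrication recipes when disabled.
if ENV["PREFABRICATION"] != "0"
  require "test_prof/recipes/rspec/before_all"
  require "test_prof/recipes/rspec/let_it_be"
end
```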
What we have now:
```
~~~~~~~ SERVER EXCEPTIONS ~~~~~~~Error encountered while proccessing /tag/tag24/l/latest ArgumentError: wrong number of arguments (given 1, expected 0) /__w/discourse/discourse/lib/site_setting_extension.rb:521:in `block in setup_methods'
```
What we actually want:
```
~~~~~~~ SERVER EXCEPTIONS ~~~~~~~
Error encountered while proccessing /tag/tag24/l/latest ArgumentError: wrong number of arguments (given 1, expected 0) /__w/discourse/discourse/lib/site_setting_extension.rb:521:in `block in setup_methods'
```
Why this change?
We have been running into flaky tests which seem to be related to
AR transaction problems. However, we are not able to reproduce this
locally and do not have sufficient information on our builds to
debug the problem.
What does this change do?
Note that the following changes only apply when `ENV["GITHUB_ACTIONS"]` is
present.
This change introduces an RSpec around hook when `capture_log: true` has
been set for a test. The responsibility of the hook is to capture the
ActiveRecord debug logs and print them out.
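A hedged sketch of such a hook (whether the captured logs are printed unconditionally or only on failure is an assumption):
```ruby
# Sketch of the capture_log around hook; only active on GitHub Actions.
if ENV["GITHUB_ACTIONS"]
  RSpec.configure do |config|
    config.around(:each, capture_log: true) do |example|
      io = StringIO.new
      original_logger = ActiveRecord::Base.logger
      ActiveRecord::Base.logger =
        ActiveSupport::Logger.new(io).tap { |logger| logger.level = Logger::DEBUG }

      example.run

      # Print the captured ActiveRecord debug logs so they show up in the
      # CI output for this example.
      puts io.string
    ensure
      ActiveRecord::Base.logger = original_logger
    end
  end
end
```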
Why this change?
When running system tests on our CI, we have been occasionally seeing
server errors like:
```
Error encountered while proccessing /stylesheets/desktop_e58cf7f686aab173f9b778797f241913c2833c39.css
NoMethodError: undefined method `+' for nil:NilClass
/__w/discourse/discourse/vendor/bundle/ruby/3.2.0/gems/actionpack-7.0.7/lib/action_dispatch/journey/path/pattern.rb:139:in `[]'
/__w/discourse/discourse/vendor/bundle/ruby/3.2.0/gems/actionpack-7.0.7/lib/action_dispatch/journey/router.rb:127:in `block (2 levels) in find_routes'
/__w/discourse/discourse/vendor/bundle/ruby/3.2.0/gems/actionpack-7.0.7/lib/action_dispatch/journey/router.rb:126:in `each'
/__w/discourse/discourse/vendor/bundle/ruby/3.2.0/gems/actionpack-7.0.7/lib/action_dispatch/journey/router.rb:126:in `each_with_index'
/__w/discourse/discourse/vendor/bundle/ruby/3.2.0/gems/actionpack-7.0.7/lib/action_dispatch/journey/router.rb:126:in `block in find_routes'
/__w/discourse/discourse/vendor/bundle/ruby/3.2.0/gems/actionpack-7.0.7/lib/action_dispatch/journey/router.rb:123:in `map!'
/__w/discourse/discourse/vendor/bundle/ruby/3.2.0/gems/actionpack-7.0.7/lib/action_dispatch/journey/router.rb:123:in `find_routes'
/__w/discourse/discourse/vendor/bundle/ruby/3.2.0/gems/actionpack-7.0.7/lib/action_dispatch/journey/router.rb:32:in `serve'
/__w/discourse/discourse/vendor/bundle/ruby/3.2.0/gems/actionpack-7.0.7/lib/action_dispatch/routing/route_set.rb:852:in `call'
```
While looking through various Rails issues related to the error above, I
came across https://github.com/rails/rails/pull/27647 which is a fix to
fully initialize routes before the first request is handled. However,
the routes are only fully initialized if `config.eager_load` is set
to `true`. There is no reason why `config.eager_load` shouldn't be `true` in
the CI environment, and this is what a new Rails 7.1 app is generated with.
What does this change do?
Enable `config.eager_load` when `ENV["CI"]` is present.
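A minimal sketch of the change, assuming it lives in `config/environments/test.rb`:
```ruby
# config/environments/test.rb (sketch)
Rails.application.configure do
  # Fully eager load the app in CI so that routes are completely initialized
  # before the first request is served.
  config.eager_load = ENV["CI"].present?
end
```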
Follow-up to f5ca96528d
Why this change?
`RSpec.current_example.metadata[:extra_failure_lines]` can be `nil` and
calling `<<` on `nil` is not a good idea.
What does this change do?
Set `RSpec.current_example.metadata[:extra_failure_lines]` to `""` as
long as there are exceptions.
Why this change?
When running system tests with all official plugins installed, we have
encountered instances where the system tests will hang. When dumping the
backtraces of the threads, we can see that the main thread running the
tests is stuck in a deadlock with the puma thread while serving a
request.
The deadlock happens when the main thread acquires the `ActiveSupport::Concurrency::LoadInterlockAwareMonitor`
lock first in `ActiveRecord::ConnectionAdapters::AbstractAdapter` before acquiring another `Monitor` lock in
`ActiveRecord::ModelSchema`. The Puma thread acquires the
`Monitor` lock in `ActiveRecord::ModelSchema` first before acquiring the
`ActiveSupport::Concurrency::LoadInterlockAwareMonitor` lock.
What does this change do?
To work around this problem, we preload the schema cache of all models
before running system tests so that the `Monitor` lock in `ActiveRecord::ModelSchema`
does not need to be acquired while a test is running.
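One possible way to preload the schema up front (an illustrative sketch, not necessarily how the commit does it):
```ruby
# Sketch: force every model's schema to load before system tests start so the
# Monitor in ActiveRecord::ModelSchema is never acquired lazily mid-test.
RSpec.configure do |config|
  config.before(:suite) do
    Rails.application.eager_load!

    ActiveRecord::Base.descendants.each do |model|
      next if model.abstract_class? || !model.table_exists?
      model.define_attribute_methods # triggers loading of the model's columns
    end
  end
end
```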
Why this change?
Previously, we were attaching any server exception to the RSpec
example's `Exception#cause` by doing `example.exception.cause =
RspecErrorTracker.last_exception`. However, this is problematic because
it relies on RSpec internal implementation details where RSpec will
print out the exception's cause. The other problem is that when RSpec
prints out the exception cause, it only includes a single line of
backtrace which isn't very helpful sometimes.
While tracking the last exception works OK for request specs, it does not
work for system specs where multiple requests can be triggered in an example,
potentially leading to multiple exceptions. Knowing all the exceptions which
happened during the requests is important for us when it comes to debugging
system test failures.
What does this change do?
`RspecErrorTracker` now tracks all exceptions that occur during an
RSpec example run. All the exceptions, including the full backtrace of
each exception, are printed out as part of the example's `extra_failure_lines` metadata.
Example:
```
Failures:
1) Shortcuts | mark all read when chat is open when pressing shift+esc marks all channels read
Failure/Error: expect(page).to have_content("all read messagasd")
expected to find text "all read messagasd" in "Topics\nMy Posts\nReview\nAdmin\nMore\nCategories\nAmazing Category 0\nAmazing Category 1\nAmazing Category 2\nUncategorized\nAll categories\nConfigure defaults\nMessages\nInbox\nMy threads\nChannels\nKino Buffs 2\nMusic Lodge 0\nMusic Lodge 1\nPersonal chat\nMusic Lodge 1\nChat settings have been set to retain channel messages for 90 days.\nToday\nbruce6\n2:46 pm\nall read message 0\nbruce7\n2:46 pm\nall read message 1\nbruce8\n2:46 pm\nall read message 2\nbruce9\n2:46 pm\nall read message 3\nbruce10\n2:46 pm\nall read message 4\nbruce11\n2:46 pm\nall read message 5\nbruce12\n2:46 pm\nall read message 6\nbruce13\n2:46 pm\nall read message 7\nbruce14\n2:46 pm\nall read message 8\nbruce15\n2:46 pm\nall read message 9\nShowing all messages"
[Screenshot Image]: /home/tgxworld/work/discourse/tmp/capybara/failures_r_spec_example_groups_shortcuts_mark_all_read_when_chat_is_open_when_pressing_shift_esc_marks_all_channels_read_236.png
~~~~~~~ SERVER EXCEPTIONS ~~~~~~~
Error encountered while proccessing /stylesheets/desktop_theme_1_5dba82f48b7d6e4a9d54ffd915712811591356b7.css
RuntimeError: boom
/home/tgxworld/work/discourse/app/controllers/application_controller.rb:996:in `set_cross_origin_opener_policy_header'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/activesupport-7.0.7/lib/active_support/callbacks.rb:400:in `block in make_lambda'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/activesupport-7.0.7/lib/active_support/callbacks.rb:236:in `block in halting_and_conditional'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/activesupport-7.0.7/lib/active_support/callbacks.rb:599:in `block in invoke_after'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/activesupport-7.0.7/lib/active_support/callbacks.rb:599:in `each'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/activesupport-7.0.7/lib/active_support/callbacks.rb:599:in `invoke_after'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/activesupport-7.0.7/lib/active_support/callbacks.rb:133:in `block in run_callbacks'
/home/tgxworld/work/discourse/app/controllers/application_controller.rb:423:in `block in with_resolved_locale'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/i18n-1.14.1/lib/i18n.rb:322:in `with_locale'
/home/tgxworld/work/discourse/app/controllers/application_controller.rb:423:in `with_resolved_locale'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/activesupport-7.0.7/lib/active_support/callbacks.rb:127:in `block in run_callbacks'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/activesupport-7.0.7/lib/active_support/callbacks.rb:138:in `run_callbacks'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/actionpack-7.0.7/lib/abstract_controller/callbacks.rb:233:in `process_action'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/actionpack-7.0.7/lib/action_controller/metal/rescue.rb:23:in `process_action'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/actionpack-7.0.7/lib/action_controller/metal/instrumentation.rb:67:in `block in process_action'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/activesupport-7.0.7/lib/active_support/notifications.rb:206:in `block in instrument'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/activesupport-7.0.7/lib/active_support/notifications/instrumenter.rb:24:in `instrument'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/activesupport-7.0.7/lib/active_support/notifications.rb:206:in `instrument'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/actionpack-7.0.7/lib/action_controller/metal/instrumentation.rb:66:in `process_action'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/actionpack-7.0.7/lib/action_controller/metal/params_wrapper.rb:259:in `process_action'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/activerecord-7.0.7/lib/active_record/railties/controller_runtime.rb:27:in `process_action'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/actionpack-7.0.7/lib/abstract_controller/base.rb:151:in `process'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/actionview-7.0.7/lib/action_view/rendering.rb:39:in `process'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/actionpack-7.0.7/lib/action_controller/metal.rb:188:in `dispatch'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/actionpack-7.0.7/lib/action_controller/metal.rb:251:in `dispatch'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/actionpack-7.0.7/lib/action_dispatch/routing/route_set.rb:49:in `dispatch'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/actionpack-7.0.7/lib/action_dispatch/routing/route_set.rb:32:in `serve'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/actionpack-7.0.7/lib/action_dispatch/journey/router.rb:50:in `block in serve'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/actionpack-7.0.7/lib/action_dispatch/journey/router.rb:32:in `each'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/actionpack-7.0.7/lib/action_dispatch/journey/router.rb:32:in `serve'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/actionpack-7.0.7/lib/action_dispatch/routing/route_set.rb:852:in `call'
/home/tgxworld/work/discourse/lib/middleware/omniauth_bypass_middleware.rb:64:in `call'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/rack-2.2.8/lib/rack/tempfile_reaper.rb:15:in `call'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/rack-2.2.8/lib/rack/conditional_get.rb:27:in `call'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/rack-2.2.8/lib/rack/head.rb:12:in `call'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/actionpack-7.0.7/lib/action_dispatch/http/permissions_policy.rb:38:in `call'
/home/tgxworld/work/discourse/lib/content_security_policy/middleware.rb:12:in `call'
/home/tgxworld/work/discourse/lib/middleware/anonymous_cache.rb:351:in `call'
/home/tgxworld/work/discourse/lib/middleware/gtm_script_nonce_injector.rb:10:in `call'
/home/tgxworld/work/discourse/spec/rails_helper.rb:47:in `call'
/home/tgxworld/work/discourse/config/initializers/008-rack-cors.rb:14:in `call'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/rack-2.2.8/lib/rack/session/abstract/id.rb:266:in `context'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/rack-2.2.8/lib/rack/session/abstract/id.rb:260:in `call'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/actionpack-7.0.7/lib/action_dispatch/middleware/cookies.rb:704:in `call'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/actionpack-7.0.7/lib/action_dispatch/middleware/callbacks.rb:27:in `block in call'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/activesupport-7.0.7/lib/active_support/callbacks.rb:99:in `run_callbacks'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/actionpack-7.0.7/lib/action_dispatch/middleware/callbacks.rb:26:in `call'
/home/tgxworld/work/discourse/plugins/discourse-geoblocking/lib/geoblocking_middleware.rb:24:in `call'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/actionpack-7.0.7/lib/action_dispatch/middleware/debug_exceptions.rb:28:in `call'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/actionpack-7.0.7/lib/action_dispatch/middleware/show_exceptions.rb:29:in `call'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/railties-7.0.7/lib/rails/rack/logger.rb:40:in `call_app'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/railties-7.0.7/lib/rails/rack/logger.rb:27:in `call'
/home/tgxworld/work/discourse/config/initializers/100-quiet_logger.rb:20:in `call'
/home/tgxworld/work/discourse/config/initializers/100-silence_logger.rb:29:in `call'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/actionpack-7.0.7/lib/action_dispatch/middleware/remote_ip.rb:93:in `call'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/actionpack-7.0.7/lib/action_dispatch/middleware/request_id.rb:26:in `call'
/home/tgxworld/work/discourse/lib/middleware/enforce_hostname.rb:24:in `call'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/rack-2.2.8/lib/rack/method_override.rb:24:in `call'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/actionpack-7.0.7/lib/action_dispatch/middleware/executor.rb:14:in `call'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/actionpack-7.0.7/lib/action_dispatch/middleware/static.rb:23:in `call'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/rack-2.2.8/lib/rack/sendfile.rb:110:in `call'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/actionpack-7.0.7/lib/action_dispatch/middleware/host_authorization.rb:131:in `call'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/message_bus-4.3.8/lib/message_bus/rack/middleware.rb:60:in `call'
/home/tgxworld/work/discourse/lib/middleware/request_tracker.rb:233:in `call'
/home/tgxworld/work/discourse/config/initializers/200-first_middlewares.rb:27:in `call'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/railties-7.0.7/lib/rails/engine.rb:530:in `call'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/rack-2.2.8/lib/rack/urlmap.rb:74:in `block in call'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/rack-2.2.8/lib/rack/urlmap.rb:58:in `each'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/rack-2.2.8/lib/rack/urlmap.rb:58:in `call'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/rack-2.2.8/lib/rack/builder.rb:244:in `call'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/capybara-3.39.2/lib/capybara/server/animation_disabler.rb:25:in `call'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/capybara-3.39.2/lib/capybara/server/middleware.rb:60:in `call'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/puma-6.4.0/lib/puma/configuration.rb:272:in `call'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/puma-6.4.0/lib/puma/request.rb:100:in `block in handle_request'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/puma-6.4.0/lib/puma/thread_pool.rb:378:in `with_force_shutdown'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/puma-6.4.0/lib/puma/request.rb:99:in `handle_request'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/puma-6.4.0/lib/puma/server.rb:443:in `process_client'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/puma-6.4.0/lib/puma/server.rb:241:in `block in run'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/puma-6.4.0/lib/puma/thread_pool.rb:155:in `block in spawn_thread'
~~~~~~~ END SERVER EXCEPTIONS ~~~~~~~
~~~~~~~ JS LOGS ~~~~~~~
http://localhost:31337/stylesheets/desktop_theme_1_5dba82f48b7d6e4a9d54ffd915712811591356b7.css?__ws=localhost - Failed to load resource: the server responded with a status of 500 (Internal Server Error)
~~~~~ END JS LOGS ~~~~~
```
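For context, a simplified sketch of the kind of exception tracking described above (the real `RspecErrorTracker` middleware in `spec/rails_helper.rb` differs in its details):
```ruby
# Simplified sketch of a Rack middleware that records every server-side
# exception raised while an example is running, keyed by request path.
class RspecErrorTracker
  class << self
    attr_accessor :exceptions
  end

  def initialize(app)
    @app = app
  end

  def call(env)
    @app.call(env)
  rescue => e
    (self.class.exceptions ||= {})[env["PATH_INFO"]] = e
    raise
  end
end

RSpec.configure do |config|
  config.after do |example|
    exceptions = RspecErrorTracker.exceptions

    if exceptions.present?
      # extra_failure_lines is assumed to be an Array here.
      lines = example.metadata[:extra_failure_lines] ||= []
      lines << "~~~~~~~ SERVER EXCEPTIONS ~~~~~~~"

      exceptions.each do |path, exception|
        lines << "Error encountered while processing #{path}"
        lines << "#{exception.class}: #{exception.message}"
        lines.concat(exception.backtrace || [])
      end

      lines << "~~~~~~~ END SERVER EXCEPTIONS ~~~~~~~"
    end

    RspecErrorTracker.exceptions = nil
  end
end
```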
Why this change?
By default, `Capybara.default_max_wait_time` is set to `2`. However,
this is not a high enough default for Discourse, as certain requests like
creating a post can take upwards of 2 seconds even on a high-end desktop
CPU like the Ryzen 5950X. Therefore, we have decided to double the default max wait time.
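In config terms this amounts to something like the following (the env var override is an assumption):
```ruby
# Double Capybara's default of 2 seconds; allow overriding via env.
Capybara.default_max_wait_time = (ENV["CAPYBARA_DEFAULT_MAX_WAIT_TIME"] || 4).to_i
```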
Why this change?
The code changes introduced in 5b91dc1844
resulted in errors being raised when `session.quit` is called when using
multiple sessions. From my debugging, this seems to be attributed to the
fact that the change resulted in multiple sessions sharing
the same instance of `Selenium::WebDriver::Remote::Http::Default`. While
sharing the same instance should in theory be fine, the problem is
that `Selenium::WebDriver::Driver` mutates the `server_url` of the
client in `Selenium::WebDriver::Remote::Bridge`. This is problematic
because each session created by Capybara relies on a different server
URL, and this mutation causes all sorts of weird errors to occur.
To reproduce the problem, run `LOAD_PLUGINS=1 rspec plugins/chat/spec/system/send_message_spec.rb:76`
locally while excluding the changes in this commit.
The most common thing that we do with `fab!` is:
`fab!(:thing) { Fabricate(:thing) }`
This commit adds a shorthand for this, which is simply:
`fab!(:thing)`
i.e. if you omit the block, then, by default, you'll get a `Fabricate`d object using the fabricator of the same name.
Why this change?
When using a remote Capybara driver configured through the
`CAPYBARA_REMOTE_DRIVER_URL` env var, WebMock thinks that requests to it are
external requests and blocks them. As such, we need to add the URL to
WebMock's allowlist.
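A minimal sketch of the allowlisting, assuming `WebMock.disable_net_connect!` is already used elsewhere in the spec setup:
```ruby
# Sketch: let WebMock pass requests to the remote Capybara driver through.
if ENV["CAPYBARA_REMOTE_DRIVER_URL"].present?
  WebMock.disable_net_connect!(
    allow_localhost: true,
    allow: [URI(ENV["CAPYBARA_REMOTE_DRIVER_URL"]).host],
  )
end
```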
Why this change?
When running in a Docker container, we want to bind the Rails server
started by Capybara to 0.0.0.0 instead of localhost. This is done via
the `server_host` config for Capybara which can now be configured via
the `CAPYBARA_SERVER_HOST` env.
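A minimal sketch of the configuration:
```ruby
# Sketch: bind Capybara's Rails server to e.g. 0.0.0.0 when running in Docker.
Capybara.server_host = ENV["CAPYBARA_SERVER_HOST"] if ENV["CAPYBARA_SERVER_HOST"].present?
```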
What motivated this change?
We are currently working on allowing system tests to be run within a
Docker container. While system tests are usually run in Chrome's headless
mode, it is useful to also be able to run them with Chrome in
non-headless mode. However, running a GUI application from within
a Docker container is not usually recommended and, from our research,
quite difficult. As such, we want to allow running system tests against
a remote browser.
For example, one can run a `chromedriver` server on localhost and then
configure Capybara to connect to the `chromedriver` from within the
container.
What does this change do?
This change adds support for a `CAPYBARA_REMOTE_DRIVER_URL` env variable
which will switch Capybara to use the remote driver instead of the
`chrome` driver. Currently, we expect the remote driver to be a
`chromedriver` server.
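A hedged sketch of how such a remote driver could be registered (the driver name and option handling are assumptions):
```ruby
# Sketch: point Capybara at a remote chromedriver server instead of a local
# chrome driver when CAPYBARA_REMOTE_DRIVER_URL is set.
if ENV["CAPYBARA_REMOTE_DRIVER_URL"].present?
  Capybara.register_driver :selenium_remote_chrome do |app|
    Capybara::Selenium::Driver.new(
      app,
      browser: :remote,
      url: ENV["CAPYBARA_REMOTE_DRIVER_URL"],
      options: Selenium::WebDriver::Chrome::Options.new,
    )
  end

  Capybara.javascript_driver = :selenium_remote_chrome
end
```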
This commit adds some system specs to test uploads with
direct-to-S3 single and multipart uploads via uppy. This
is done with minio as a local S3 replacement. We are doing
this to catch regressions when uppy dependencies need to
be upgraded or we change uppy upload code, since before
this there was no way, outside of manual testing, to know whether
these changes would cause regressions.
Minio's server lifecycle and the installed binaries are managed
by the https://github.com/discourse/minio_runner gem, though the
binaries are already installed on the discourse_test image we run
GitHub CI from.
These tests will only run in CI unless you specifically use the
CI=1 or RUN_S3_SYSTEM_SPECS=1 env vars.
For a history of experimentation here see https://github.com/discourse/discourse/pull/22381
Related PRs:
* https://github.com/discourse/minio_runner/pull/1
* https://github.com/discourse/minio_runner/pull/2
* https://github.com/discourse/minio_runner/pull/3
What is the problem here?
The `selenium-webdriver` gem is responsible for downloading the
right version of the `chromedriver` binary, and it downloads it into the
`~/.cache/selenium` folder. The problem here is that when a user runs `bin/turbo_rspec spec/system`
for the first time, all of the processes will try to download the
`chromedriver` binary to the same path at the same time, which leads
to concurrency errors.
What is the fix here?
Before running any RSpec suite, we first check if the `~/.cache/selenium`
folder is present. If it is not present, we use a file system lock to
download the `chromedriver` binary so that other processes that run
afterwards will not need to install the `chromedriver` binary.
The long term fix here is to get `selenium-manager` to download the `chromedriver` binary to a unique path for each
process but the `--cache-path` option for `selenium-manager` is currently not supported in `selenium-webdriver`.
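A sketch of the locking idea (the lock file path is arbitrary and `ensure_chromedriver_installed!` is a hypothetical stand-in for whatever actually triggers the download):
```ruby
# Sketch: make sure only one turbo_rspec process downloads chromedriver.
selenium_cache = File.expand_path("~/.cache/selenium")

unless Dir.exist?(selenium_cache)
  File.open("/tmp/chromedriver_install.lock", File::RDWR | File::CREAT) do |lock|
    lock.flock(File::LOCK_EX) # all other processes block here

    # Re-check after acquiring the lock: another process may have already
    # finished the download while we were waiting.
    ensure_chromedriver_installed! unless Dir.exist?(selenium_cache)
  end
end
```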
We can no longer use Webdriver - SeleniumHQ/selenium#11066. Bumping selenium-webdriver did the trick, as well as manually setting the user_agent for mobile system specs. Unsure what changed to make this necessary, but it is necessary to get the app to boot in mobile view.
Why this change?
By default in the test environment, MessageBus uses the memory backend,
which means all messages are stored in an in-memory data structure. However,
this in-memory data structure is not cleared after each system test, so we
risk leaking state between system tests.
Similarly, for the defer queue which processes work in another thread, we
want to ensure that the defer queue processes everything it has to do
before the transaction is rolled back.
If a Selenium finder takes the full wait duration to resolve, that means it has been written inefficiently. Most likely a matcher has been negated incorrectly.
This commit introduces a patch which will raise an error in this situation so that we can catch these issues while developing specs.
This commit also fixes chat's `visit_thread` helper. It was spinning on `has_css?(".chat-skeleton")` for the full Selenium wait duration and then returning false, because the thread is often already fully loaded before `has_css?` is even called. It's now updated to only look for the final expected state.
What is the problem here?
In multiple controllers, we accept a `limit` param but do not
impose any upper bound on the values being accepted. Without an upper
bound, we may be allowing arbitrary users to generate DB queries
which may end up exhausting the resources on the server.
What is the fix here?
A new `fetch_limit_from_params` helper method is introduced in
`ApplicationController` that can be used by controller actions to safely
get the limit from the params, since a default limit and a maximum limit
have to be set. When an invalid limit param is encountered, the server
responds with a 400 response code.
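A rough sketch of what such a helper could look like (the actual signature and error handling in `ApplicationController` may differ):
```ruby
# Illustrative sketch of a limit-clamping helper for controller actions.
def fetch_limit_from_params(params: self.params, default:, max:)
  raise "default limit cannot be greater than max limit" if default > max

  limit = params[:limit].present? ? params[:limit].to_i : default

  # Discourse::InvalidParameters is rescued into a 400 response.
  raise Discourse::InvalidParameters.new(:limit) if limit <= 0 || limit > max

  limit
end

# Usage in an action:
#   limit = fetch_limit_from_params(default: 30, max: 100)
#   topics = Topic.limit(limit)
```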
* CHROME_LOAD_EXTENSIONS_MANIFEST - An env var with a path to a file
that contains one path per line. These are paths to extensions installed
in Chrome that the user wants to load while running system specs.
Useful for running things like Ember Inspector.
* CHROME_DISABLE_FORCE_DEVICE_SCALE_FACTOR - On some systems the
--force-device-scale-factor=1 argument makes the Chrome UI
super small; this adds a way to disable it.
This commit makes sure we don't load all data into memory when doing CSV exports.
The most important change here is to the recently introduced export of chat
messages (3ea31f4). We were loading all data into memory in the first version; with
this commit that's no longer the case.
Speaking of the older exports, some of them already use find_each, and they worked as
expected without loading all data into memory, and they will keep working as
expected after this commit.
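The pattern is essentially the following (illustrative only; the real exporter classes and column lists differ):
```ruby
# Stream rows in batches of 1000 instead of loading the whole table at once.
def write_chat_messages(csv)
  Chat::Message.find_each(batch_size: 1000) do |message|
    csv << [message.id, message.user_id, message.created_at, message.message]
  end
end
```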
In general, I made sure this change didn't break other CSV exports, first manually, and
then by writing system specs for them. Sadly, I haven't yet managed to make those
specs stable; they work fine locally but are flaky in GitHub Actions, so I've disabled them
for now.
I'll be making more changes to the CSV exports code soon, and those system specs will be
very helpful. I'll be running them locally, and I hope I'll manage to make them stable
while doing that work.
Instead of having to remember every time, just always wait until the
current transaction (if it exists) has committed before clearing any
DistributedCache.
The only exception to this is caches that aren't caching things from
postgres.
This means we have to do the test setup after setting the test
transaction, because doing the test setup involves clearing caches.
Reapplying this - it now doesn't use after_commit if skip_db is set
Instead of having to remember every time, just always wait until the
current transaction (if it exists) has committed before clearing any
DistributedCache.
The only exception to this is caches that aren't caching things from
postgres.
This means we have to do the test setup after setting the test
transaction, because doing the test setup involves clearing caches.
Introduced in cec68b3e2c,
this is flaky because if you click the back button before
the route has fully transitioned to the loaded thread,
we end up going to the history entry _before_ the thread list,
which ends up being the channel.
We need to make sure that everything is loaded for the
thread first, meaning the skeleton is not there.
Also excludes some noise from the capybara logs (image load failures).
Clicking on TOC heading anchors in a subfolder setup was breaking the current URL for users.
Besides the fix, this change introduces the ability to test the subfolder setup in system specs.
It seems like the overhead of GPU acceleration is not worth it and is
slowing down our system tests. Locally the following command completes
in `2 minutes 8.6 seconds` with GPU disabled as compared to `2 minutes 45.4 seconds` with GPU enabled.
## What is the problem?
MessageBus by default uses long polling, which keeps a connection
open for 25 seconds by default. The problem here is that Capybara does not know about these
connections being kept open by MessageBus and hence does not know how
to stop these connections at the end of each test. As a result, the long polling MessageBus connections are kept open by the browser and we hit Chrome's limit of 6 concurrent requests per host, so any new request made in the browser is marked as "pending" until a slot is freed up. Since we keep a MessageBus long polling connection open for 25 seconds, our finders in Capybara end up hitting Capybara's wait timeout, causing the tests to fail.
## What is the fix?
Since we can't rely on Capybara to close the existing MessageBus
connections, we manually execute a script to stop all MessageBus
connections after each system test. The following script was used to verify
the fix:
```
for i in {1..10}; do
echo "Running iteration $i"
PARALLEL_TEST_PROCESSORS=8 CAPYBARA_DEFAULT_MAX_WAIT_TIME=10 bin/turbo_rspec --seed=34908 --profile --verbose --format documentation spec/system
if [ $? -ne 0 ]; then
echo "Error encountered on iteration $i"
exit 1
fi
done
echo "All 10 iterations completed successfully"
```
Without the fix, the script fails consistently in the first few iterations. Running in non-headless mode with the "network" tab open will reveal the requests that are marked as pending.
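A hedged sketch of the cleanup hook described above (the exact hook Discourse uses may differ):
```ruby
# Sketch: stop client-side MessageBus long polling after every system test so
# lingering connections don't eat into Chrome's 6-connections-per-host limit.
RSpec.configure do |config|
  config.after(:each, type: :system) do
    page.execute_script("if (window.MessageBus) { window.MessageBus.stop(); }")
  end
end
```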