Why this change?
On CI, we have been seeing flaky system tests because ActiveRecord is
unable to checkout a connection. This patch is meant to help us debug
which thread is not returning the connection to the queue.
Why this change?
We are getting the following error on CI:
`Text file busy -
/github/home/.cache/selenium/chromedriver/linux64/121.0.6167.85/chromedriver`
This happens when two processes tries to download the chromedriver at
the same time. I'm not entirely sure why the previous implementatio is
not locking properly since we still saw the `Text file busy` error but
by accquring the lock before we even check for the existence of
`~/.cache/selenium`, we should be able to eliminate the chance of two
processes trying to download the chromedriver binary at the same time.
Why this change?
In workflow runs, we have seen processes being stuck on a flock lock and
I'm guessing because we are using `"w"` when opening the file which the
ruby documentation advises against as it states "don't use "w" because it truncates the file before lock."
Stuck workflow run: https://github.com/discourse/discourse/actions/runs/7690278010/job/20953851469
This patch allows running system specs on an aarch64 Linux system
(typically our `discourse_dev` docker image).
As Chrome isn’t available for the aarch64 architecture (yet), we have to
rely on Firefox instead. This has some drawbacks like not being able to
access the browser logs like we do with the Chrome webdriver.
Why this change?
We have been seeing specs time out on GitHub CI but the problem is that
we are unable to debug those timeouts due to a lack of information. This
change seeks to print out the backtraces of all threads right before a
spec times out on CI.
What does this change do?
1. Starts a thread on CI which will wait for a spec to start running.
1. Once a spec starts running, the thread will sleep for
`PER_SPEC_TIMEOUT_SECONDS -1` seconds.
1. After sleeping, the thread checks if the spec is still running and
prints out the backtraces of all threads if it is. Otherwise, the
thread does nothing and runs the next loop.
1. At the end of each spec run, we ensure that the thread is in a
waiting state for the next spec run to start.
Note that there is no need for us to teardown or cleanup the thread
since the process terminates after running all the tests.
Why this change?
We have been looking into a flaky system tests in one of our plugins
where the DB transaction flow can be messed up from time to time. Our
debugging effort is complicated by that fact that `test-prof` starts a
DB transaction in a `before(:all)` block which makes it hard to properly
log information. By allowing test-prof to be disabled completely, we
believe it will make it easier for us to isolate the problem we are
investigating.
What does this change do?
1. Avoid loading test-prof files if `PREFABRICATION` env has been set to
`0`.
2. Set `PREFABRICATION=0` for plugin system tests in Github actions
What we have now:
```
~~~~~~~ SERVER EXCEPTIONS ~~~~~~~Error encountered while proccessing /tag/tag24/l/latest ArgumentError: wrong number of arguments (given 1, expected 0) /__w/discourse/discourse/lib/site_setting_extension.rb:521:in `block in setup_methods'
```
What we actually want:
```
~~~~~~~ SERVER EXCEPTIONS ~~~~~~~
Error encountered while proccessing /tag/tag24/l/latest ArgumentError: wrong number of arguments (given 1, expected 0) /__w/discourse/discourse/lib/site_setting_extension.rb:521:in `block in setup_methods'
```
Why this change?
We have been running into flaky tests which seems to be related to
AR transaction problems. However, we are not able to reproduce this
locally and do not have sufficient information on our builds now to
debug the problem.
What does this change do?
Noe the following changes only applies when `ENV["GITHUB_ACTIONS"]` is
present.
This change introduces an RSpec around hook when `capture_log: true` has
been set for a test. The responsibility of the hook is to capture the
ActiveRecord debug logs and print them out.
Why this change?
When running system tests on our CI, we have been occasionally seeing
server errors like:
```
Error encountered while proccessing /stylesheets/desktop_e58cf7f686aab173f9b778797f241913c2833c39.css
NoMethodError: undefined method `+' for nil:NilClass
/__w/discourse/discourse/vendor/bundle/ruby/3.2.0/gems/actionpack-7.0.7/lib/action_dispatch/journey/path/pattern.rb:139:in `[]'
/__w/discourse/discourse/vendor/bundle/ruby/3.2.0/gems/actionpack-7.0.7/lib/action_dispatch/journey/router.rb:127:in `block (2 levels) in find_routes'
/__w/discourse/discourse/vendor/bundle/ruby/3.2.0/gems/actionpack-7.0.7/lib/action_dispatch/journey/router.rb:126:in `each'
/__w/discourse/discourse/vendor/bundle/ruby/3.2.0/gems/actionpack-7.0.7/lib/action_dispatch/journey/router.rb:126:in `each_with_index'
/__w/discourse/discourse/vendor/bundle/ruby/3.2.0/gems/actionpack-7.0.7/lib/action_dispatch/journey/router.rb:126:in `block in find_routes'
/__w/discourse/discourse/vendor/bundle/ruby/3.2.0/gems/actionpack-7.0.7/lib/action_dispatch/journey/router.rb:123:in `map!'
/__w/discourse/discourse/vendor/bundle/ruby/3.2.0/gems/actionpack-7.0.7/lib/action_dispatch/journey/router.rb:123:in `find_routes'
/__w/discourse/discourse/vendor/bundle/ruby/3.2.0/gems/actionpack-7.0.7/lib/action_dispatch/journey/router.rb:32:in `serve'
/__w/discourse/discourse/vendor/bundle/ruby/3.2.0/gems/actionpack-7.0.7/lib/action_dispatch/routing/route_set.rb:852:in `call'
```
While looking through various Rails issues related to the error above, I
came across https://github.com/rails/rails/pull/27647 which is a fix to
fully initialize routes before the first request is handled. However,
the routes are only fully initialize only if `config.eager_load` is set
to `true`. There is no reason why `config.eager_load` shouldn't be `true` in the
CI environment and this is what a new Rails 7.1 app is generated with.
What does this change do?
Enable `config.eager_load` when `env["CI"]` is present
Follow-up to f5ca96528d
Why this change?
`RSpec.current_example.metadata[:extra_failure_lines]` can be `nil` and
calling `<<` on `nil` is not a good idea.
What does this change do?
Set `RSpec.current_example.metadata[:extra_failure_lines]` to `""` as
long as there are exceptions.
Why this change?
When running system tests with all official plugins installed, we have
encountered instances where the system tests will hang. When dumping the
backtraces of the threads, we can see that the main thread running the
tests is stuck in a deadlock with the puma thread while serving a
request.
The deadlock happens when the main thread acquires the `ActiveSupport::Concurrency::LoadInterlockAwareMonitor`
lock first in `ActiveRecord::ConnectionAdapters::AbstractAdapter` before acquring another `Monitor` lock in
`ActiveRecord::ModelSchema`. In the Puma thread, it acquires the
`Monitor` lock in `ActiveRecord::ModelSchema` first before acquring the
`ActiveSupport::Concurrency::LoadInterlockAwareMonitor` lock.
What does this change do?
To workaround this problem, we will preload all model schema cache
before running system tests such that the `Monitor` lock in `ActiveRecord::ModelSchema`
will not be acquired.
Why this change?
Previously, we were attaching any server exception to the RSpec
example's `Exception#cause` by doing `example.exception.cause =
RspecErrorTracker.last_exception`. However, this is problematic because
it relies on RSpec internal implementation details where RSpec will
print out the exception's cause. The other problem is that when RSpec
prints out the exception cause, it only includes a single line of
backtrace which isn't very helpful sometimes.
While this change of tracking the last exception works OK for request
specs, it doesn't not work for system specs where multiple requests can
be triggered in an example potentially leading to multiple exceptions.
Knowing all the exceptions which happened in the request is important
for us when it comes to debugging system test failures.
What does this change do?
`RspecErrorTracker` now tracks all exceptions that occurs during an
RSpec example run. All the exceptions including the fullback trace of
each exception is printed out as part of the example's `extra_failure_lines` metadata.
Example:
```
Failures:
1) Shortcuts | mark all read when chat is open when pressing shift+esc marks all channels read
Failure/Error: expect(page).to have_content("all read messagasd")
expected to find text "all read messagasd" in "Topics\nMy Posts\nReview\nAdmin\nMore\nCategories\nAmazing Category 0\nAmazing Category 1\nAmazing Category 2\nUncategorized\nAll categories\nConfigure defaults\nMessages\nInbox\nMy threads\nChannels\nKino Buffs 2\nMusic Lodge 0\nMusic Lodge 1\nPersonal chat\nMusic Lodge 1\nChat settings have been set to retain channel messages for 90 days.\nToday\nbruce6\n2:46 pm\nall read message 0\nbruce7\n2:46 pm\nall read message 1\nbruce8\n2:46 pm\nall read message 2\nbruce9\n2:46 pm\nall read message 3\nbruce10\n2:46 pm\nall read message 4\nbruce11\n2:46 pm\nall read message 5\nbruce12\n2:46 pm\nall read message 6\nbruce13\n2:46 pm\nall read message 7\nbruce14\n2:46 pm\nall read message 8\nbruce15\n2:46 pm\nall read message 9\nShowing all messages"
[Screenshot Image]: /home/tgxworld/work/discourse/tmp/capybara/failures_r_spec_example_groups_shortcuts_mark_all_read_when_chat_is_open_when_pressing_shift_esc_marks_all_channels_read_236.png
~~~~~~~ SERVER EXCEPTIONS ~~~~~~~
Error encountered while proccessing /stylesheets/desktop_theme_1_5dba82f48b7d6e4a9d54ffd915712811591356b7.css
RuntimeError: boom
/home/tgxworld/work/discourse/app/controllers/application_controller.rb:996:in `set_cross_origin_opener_policy_header'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/activesupport-7.0.7/lib/active_support/callbacks.rb:400:in `block in make_lambda'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/activesupport-7.0.7/lib/active_support/callbacks.rb:236:in `block in halting_and_conditional'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/activesupport-7.0.7/lib/active_support/callbacks.rb:599:in `block in invoke_after'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/activesupport-7.0.7/lib/active_support/callbacks.rb:599:in `each'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/activesupport-7.0.7/lib/active_support/callbacks.rb:599:in `invoke_after'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/activesupport-7.0.7/lib/active_support/callbacks.rb:133:in `block in run_callbacks'
/home/tgxworld/work/discourse/app/controllers/application_controller.rb:423:in `block in with_resolved_locale'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/i18n-1.14.1/lib/i18n.rb:322:in `with_locale'
/home/tgxworld/work/discourse/app/controllers/application_controller.rb:423:in `with_resolved_locale'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/activesupport-7.0.7/lib/active_support/callbacks.rb:127:in `block in run_callbacks'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/activesupport-7.0.7/lib/active_support/callbacks.rb:138:in `run_callbacks'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/actionpack-7.0.7/lib/abstract_controller/callbacks.rb:233:in `process_action'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/actionpack-7.0.7/lib/action_controller/metal/rescue.rb:23:in `process_action'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/actionpack-7.0.7/lib/action_controller/metal/instrumentation.rb:67:in `block in process_action'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/activesupport-7.0.7/lib/active_support/notifications.rb:206:in `block in instrument'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/activesupport-7.0.7/lib/active_support/notifications/instrumenter.rb:24:in `instrument'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/activesupport-7.0.7/lib/active_support/notifications.rb:206:in `instrument'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/actionpack-7.0.7/lib/action_controller/metal/instrumentation.rb:66:in `process_action'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/actionpack-7.0.7/lib/action_controller/metal/params_wrapper.rb:259:in `process_action'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/activerecord-7.0.7/lib/active_record/railties/controller_runtime.rb:27:in `process_action'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/actionpack-7.0.7/lib/abstract_controller/base.rb:151:in `process'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/actionview-7.0.7/lib/action_view/rendering.rb:39:in `process'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/actionpack-7.0.7/lib/action_controller/metal.rb:188:in `dispatch'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/actionpack-7.0.7/lib/action_controller/metal.rb:251:in `dispatch'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/actionpack-7.0.7/lib/action_dispatch/routing/route_set.rb:49:in `dispatch'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/actionpack-7.0.7/lib/action_dispatch/routing/route_set.rb:32:in `serve'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/actionpack-7.0.7/lib/action_dispatch/journey/router.rb:50:in `block in serve'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/actionpack-7.0.7/lib/action_dispatch/journey/router.rb:32:in `each'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/actionpack-7.0.7/lib/action_dispatch/journey/router.rb:32:in `serve'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/actionpack-7.0.7/lib/action_dispatch/routing/route_set.rb:852:in `call'
/home/tgxworld/work/discourse/lib/middleware/omniauth_bypass_middleware.rb:64:in `call'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/rack-2.2.8/lib/rack/tempfile_reaper.rb:15:in `call'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/rack-2.2.8/lib/rack/conditional_get.rb:27:in `call'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/rack-2.2.8/lib/rack/head.rb:12:in `call'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/actionpack-7.0.7/lib/action_dispatch/http/permissions_policy.rb:38:in `call'
/home/tgxworld/work/discourse/lib/content_security_policy/middleware.rb:12:in `call'
/home/tgxworld/work/discourse/lib/middleware/anonymous_cache.rb:351:in `call'
/home/tgxworld/work/discourse/lib/middleware/gtm_script_nonce_injector.rb:10:in `call'
/home/tgxworld/work/discourse/spec/rails_helper.rb:47:in `call'
/home/tgxworld/work/discourse/config/initializers/008-rack-cors.rb:14:in `call'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/rack-2.2.8/lib/rack/session/abstract/id.rb:266:in `context'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/rack-2.2.8/lib/rack/session/abstract/id.rb:260:in `call'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/actionpack-7.0.7/lib/action_dispatch/middleware/cookies.rb:704:in `call'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/actionpack-7.0.7/lib/action_dispatch/middleware/callbacks.rb:27:in `block in call'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/activesupport-7.0.7/lib/active_support/callbacks.rb:99:in `run_callbacks'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/actionpack-7.0.7/lib/action_dispatch/middleware/callbacks.rb:26:in `call'
/home/tgxworld/work/discourse/plugins/discourse-geoblocking/lib/geoblocking_middleware.rb:24:in `call'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/actionpack-7.0.7/lib/action_dispatch/middleware/debug_exceptions.rb:28:in `call'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/actionpack-7.0.7/lib/action_dispatch/middleware/show_exceptions.rb:29:in `call'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/railties-7.0.7/lib/rails/rack/logger.rb:40:in `call_app'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/railties-7.0.7/lib/rails/rack/logger.rb:27:in `call'
/home/tgxworld/work/discourse/config/initializers/100-quiet_logger.rb:20:in `call'
/home/tgxworld/work/discourse/config/initializers/100-silence_logger.rb:29:in `call'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/actionpack-7.0.7/lib/action_dispatch/middleware/remote_ip.rb:93:in `call'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/actionpack-7.0.7/lib/action_dispatch/middleware/request_id.rb:26:in `call'
/home/tgxworld/work/discourse/lib/middleware/enforce_hostname.rb:24:in `call'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/rack-2.2.8/lib/rack/method_override.rb:24:in `call'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/actionpack-7.0.7/lib/action_dispatch/middleware/executor.rb:14:in `call'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/actionpack-7.0.7/lib/action_dispatch/middleware/static.rb:23:in `call'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/rack-2.2.8/lib/rack/sendfile.rb:110:in `call'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/actionpack-7.0.7/lib/action_dispatch/middleware/host_authorization.rb:131:in `call'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/message_bus-4.3.8/lib/message_bus/rack/middleware.rb:60:in `call'
/home/tgxworld/work/discourse/lib/middleware/request_tracker.rb:233:in `call'
/home/tgxworld/work/discourse/config/initializers/200-first_middlewares.rb:27:in `call'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/railties-7.0.7/lib/rails/engine.rb:530:in `call'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/rack-2.2.8/lib/rack/urlmap.rb:74:in `block in call'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/rack-2.2.8/lib/rack/urlmap.rb:58:in `each'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/rack-2.2.8/lib/rack/urlmap.rb:58:in `call'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/rack-2.2.8/lib/rack/builder.rb:244:in `call'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/capybara-3.39.2/lib/capybara/server/animation_disabler.rb:25:in `call'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/capybara-3.39.2/lib/capybara/server/middleware.rb:60:in `call'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/puma-6.4.0/lib/puma/configuration.rb:272:in `call'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/puma-6.4.0/lib/puma/request.rb💯in `block in handle_request'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/puma-6.4.0/lib/puma/thread_pool.rb:378:in `with_force_shutdown'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/puma-6.4.0/lib/puma/request.rb:99:in `handle_request'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/puma-6.4.0/lib/puma/server.rb:443:in `process_client'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/puma-6.4.0/lib/puma/server.rb:241:in `block in run'
/home/tgxworld/.asdf/installs/ruby/3.2.2/lib/ruby/gems/3.2.0/gems/puma-6.4.0/lib/puma/thread_pool.rb:155:in `block in spawn_thread'
~~~~~~~ END SERVER EXCEPTIONS ~~~~~~~
~~~~~~~ JS LOGS ~~~~~~~
http://localhost:31337/stylesheets/desktop_theme_1_5dba82f48b7d6e4a9d54ffd915712811591356b7.css?__ws=localhost - Failed to load resource: the server responded with a status of 500 (Internal Server Error)
~~~~~ END JS LOGS ~~~~~
```
Why this change?
By default, `Capybara.default_max_wait_time` is set to `2`. However,
this is not a high enough default for Discourse as certain requests like
creating a post can take upwards of 2 seconds even on a high end desktop
CPU like the Ryzen 5950x. Therefore, we have decided to double the default max wait time.
Why this change?
The code changes introduced in 5b91dc1844
resulted in errors being raised when `session.quit` is called when using
multiple sessions. From my debugging, this seems to be attributed to the
fact that the change introduced resulted in multiple sessions sharing
the same instance of `Selenium::WebDriver::Remote::Http::Default`. While
sharing the same instance in theory should be fine, but the problem is
that `Selenium::WebDriver::Driver` will mutate the `server_url` of the
client in `Selenium::WebDriver::Remote::Bridge`. This is problematic
because each session created by capbyara relies on a different server
URL and this mutation causes all sorts of weird errors to occur.
To reproduce the problem, run `LOAD_PLUGINS=1 rspec plugins/chat/spec/system/send_message_spec.rb:76`
locally while excluding the changes in this commit.
The most common thing that we do with fab! is:
fab!(:thing) { Fabricate(:thing) }
This commit adds a shorthand for this which is just simply:
fab!(:thing)
i.e. If you omit the block, then, by default, you'll get a `Fabricate`d object using the fabricator of the same name.
Why this change?
When using a remote capybara driver configured through the
`CAPYBARA_REMOTE_DRIVER_URL` env, webmock is thinking that is an
external request and blocking it. As such, we need to set the URL to the
allowlist for webmock.
Why this change?
When running in a Docker container, we want to bind the Rails server
started by Capybara to 0.0.0.0 instead of localhost. This is done via
the `server_host` config for Capybara which can now be configured via
the `CAPYBARA_SERVER_HOST` env.
What motivated this change?
We are currently working on allowing system tests to be run within a
Docker container. While system tests are usually ran in chrome headless
mode, it is useful to also be able to run the system tests with chrome
in the non-headless mode. However, running a GUI application from within
a docker container is not usually recommended and from our research
quite difficult. As such, we want to allow running system tests against
a remote browser.
For example, one can run a `chromedriver` server on localhost and then
configure Capybara to connect to the `chromedriver` from within the
container.
What does this change do?
This change adds support for a `CAPYBARA_REMOTE_DRIVER_URL` env variable
which will switch Capybara to use the remote driver instead of the
`chrome` driver. Currently, we expect the remote driver to be a
`chromedriver` server.
This commit adds some system specs to test uploads with
direct to S3 single and multipart uploads via uppy. This
is done with minio as a local S3 replacement. We are doing
this to catch regressions when uppy dependencies need to
be upgraded or we change uppy upload code, since before
this there was no way to know outside manual testing whether
these changes would cause regressions.
Minio's server lifecycle and the installed binaries are managed
by the https://github.com/discourse/minio_runner gem, though the
binaries are already installed on the discourse_test image we run
GitHub CI from.
These tests will only run in CI unless you specifically use the
CI=1 or RUN_S3_SYSTEM_SPECS=1 env vars.
For a history of experimentation here see https://github.com/discourse/discourse/pull/22381
Related PRs:
* https://github.com/discourse/minio_runner/pull/1
* https://github.com/discourse/minio_runner/pull/2
* https://github.com/discourse/minio_runner/pull/3
What is the problem here?
The `selenium-webdriver` gem is responsible for downloading the
right version of the `chromedriver` binary and it downloads it into the
`~/.cache/selenium` folder. THe problem here is that when a user runs `bin/turbo_rspec spec/system`
for the first time, all of the processes will try to download the
`chromedriver` binary to the same path at the same time and will lead
to concurrency errors.
What is the fix here?
Before running any RSpec suite, we first check if the `.cache/selenium`
folder is present. If it is not present, we use a file system lock to
download the `chromedriver` binary such that other processes that runs
after will not need to install the `chromedriver` binary.
The long term fix here is to get `selenium-manager` to download the `chromedriver` binary to a unique path for each
process but the `--cache-path` option for `selenium-manager` is currently not supported in `selenium-webdriver`.
We can no long user Webdriver - SeleniumHQ/selenium#11066. Bumping selenium-webdriver did the trick, as well as manually setting the user_agent for mobile system specs. Unsure what changed to make this necessary, but it is necessary to get the app to boot in mobile view.
Why this change?
By default in the test environment, MessageBus used the memory backend
which means all messages are stored in an in-memory data structure. However,
the in-memory data structure is not cleared after each system test so we
have the potential to be leaking stuff between system tests.
Similarly for the defer queue which process work in another thread, we
want to ensure that the defer queue processes everything it has to do
before the transaction is rolled back.
If a selenium finder takes the full wait duration to resolve, that means it has been written inefficiently. Most likely a matcher has been negated incorrectly.
This commit introduces a patch which will raise an error in this situation so that we can catch the issues while developing specs.
This commit also fixes chat's visit_thread helper. It was spinning on `has_css?(".chat-skeleton")` for the full selenium wait duration, and then returns false. That's because the thread is often already fully loaded before `has_css?` is even called. It's now updated to only look for the final expected state.
What is the problem here?
In multiple controllers, we are accepting a `limit` params but do not
impose any upper bound on the values being accepted. Without an upper
bound, we may be allowing arbituary users from generating DB queries
which may end up exhausing the resources on the server.
What is the fix here?
A new `fetch_limit_from_params` helper method is introduced in
`ApplicationController` that can be used by controller actions to safely
get the limit from the params as a default limit and maximum limit has
to be set. When an invalid limit params is encountered, the server will
respond with the 400 response code.
* CHROME_LOAD_EXTENSIONS_MANIFEST - An env var with a path to a file
that contains one path per line. These are paths to extensions installed
in chrome that the user wants to load while running system specs.
Useful to run things like Ember Inspector.
* CHROME_DISABLE_FORCE_DEVICE_SCALE_FACTOR - On some systems the
--force-device-scale-factor=1 argument makes the UI for chrome
super small, add a way to disable this.
This commit makes sure we don't load all data into memory when doing CSV exports.
The most important change here made to the recently introduced export of chat
messages (3ea31f4). We were loading all data into memory in the first version, with
this commit it's not the case anymore.
Speaking of old exports. Some of them already use find_each, and it worked as
expected, without loading all data into memory. And it will proceed working as
expected after this commit.
In general, I made sure this change didn't break other CSV exports, first manually, and
then by writing system specs for them. Sadly, I haven't managed yet to make those
specs stable, they work fine locally, but flaky in GitHub actions, so I've disabled them
for now.
I'll be making more changes to the CSV exports code soon, those system specs will be
very helpful. I'll be running them locally, and I hope I'll manage to make them stable
while doing that work.
Instead of having to remember every time, just always wait until the
current transaction (if it exists) has committed before clearing any
DistributedCache.
The only exception to this is caches that aren't caching things from
postgres.
This means we have to do the test setup after setting the test
transaction, because doing the test setup involves clearing caches.
Reapplying this - it now doesn't use after_commit if skip_db is set
Instead of having to remember every time, just always wait until the
current transaction (if it exists) has committed before clearing any
DistributedCache.
The only exception to this is caches that aren't caching things from
postgres.
This means we have to do the test setup after setting the test
transaction, because doing the test setup involves clearing caches.
Introduced in cec68b3e2c,
this is flaky because if you click the back button before
the route is fully transitioned to the loaded thread,
we end up going to the history _before_ the thread list,
which ends up being the channel.
We need to make sure that everything is loaded for the
thread first, meaning the skeleton is not there.
Also exclude some noise from the capybara logs (image load failures)
Clicking on TOC heading anchors in a subfolder setup was breaking the current URL for users.
Other than the fix this change introduces the ability to test the subfolder setup in system specs.
It seems like the overhead of GPU acceleration is not worth it and is
slowing down our system tests. Locally the following command completes
in `2 minutes 8.6 seconds` with GPU disabled as compared to `2 minutes 45.4 seconds` with GPU enabled.
## What is the problem?
MessageBus by default uses long polling which keeps a connection
open for 25 seconds by default. The problem here is that Capybara does not know about these
connections being kept opened by MessageBus and hence does not know how
to stop these connections at the end of each test. As a result, the long polling MessageBus connections are kept opened by the browser and we hit chrome's limit of 6 concurrent requests per host, new request made in the browser is marked as "pending" until a request is freed up. Since we keep a MessageBus long polling connection opened for 25 seconds, our finders in Capybara end up hitting Capybara's wait time out causing the tests to fail.
## What is the fix?
Since we can't rely on Capybara to close all the existing Capybara
connections, we manually execute a script to stop all MessageBus
connections after each system test.
```
for i in {1..10}; do
echo "Running iteration $i"
PARALLEL_TEST_PROCESSORS=8 CAPYBARA_DEFAULT_MAX_WAIT_TIME=10 bin/turbo_rspec --seed=34908 --profile --verbose --format documentation spec/system
if [ $? -ne 0 ]; then
echo "Error encountered on iteration $i"
exit 1
fi
done
echo "All 10 iterations completed successfully"
```
Without the fix, the script fails consistently in the first few iterations. Running in non-headless mode with the "network" tab opened will reveal the requests that are marked as pending.
Rescuing them still makes timing-out tests fail but doesn't break `after` spec cleanup (which could trigger more errors) Using custom error class to avoid any other possible timeout-catching code.
Also:
* remove an unnecessary `.select { |x| x.size > 0 }`
* fix a typo in a test title
Why is this change required?
By default, `RSpec` comes with a `--profile=[COUNT]` option as well but
enabling that option means that the entire test suite needs to be
executed. This does not work so well for `turbo_rspec` which splits our
test files into various "buckets" for the tests to be executed in
multiple processes. Therefore, this commit adds a similar
`--profile=[COUNT]` option to `turbo_rspec` but will only profile the
tests being executed. Examples:
`LOAD_PLUGINS=1 bin/turbo_rspec --profile plugins/*/spec/system`
or
`LOAD_PLUGINS=1 bin/turbo_rspec --profile=20 plugins/*/spec/system`
What is this change?
This change is an attempt to avoid flakiness in our tests due to
animations being enabled in our tests. An example of flakiness caused by
animations is when the `find(selector).click` pattern is used. When
`find(selector)` returns the node, its position may have changed if the
element is still moving. However, the `click` method will end up
clicking on the old position.
Either way, there is no need for us to make system tests even more
complicated by enabling animations.
Usage:
```
CHROME_DEV_TOOLS=bottom bundle exec rspec /path/to/system/spec
```
This commit also regroups common chrome options under `apply_base_chrome_options`, and removes the size of the mobile window which was incorrect. browser_log param is also passed to mobile chrome options.
New headless shares the same implementation as the chrome browser
instead of being a separate implementation of its own.
See https://developer.chrome.com/articles/new-headless/ for more
details
Co-authored-by: Rafael dos Santos Silva <xfalcox@gmail.com>
This commit also includes two changes to the rails helper which make tests more consistent on different devices. With this change the failure was reproducible locally and not only on CI:
```
options.add_argument("--force-device-scale-factor=1")
```
The fix itself is quite simple and attempts to find safe click coordinates, the previous solution could fail depending on the size of the sidebar.
* Color for turbo_rspec in CI (`progress` and `documentation` formats)
* Show "DONE" only when `documentation` formatter is used
* Fix formatting
* Collapse RSpec commands
* Add line wrapping to the `progress` formatter (to mitigate GH Actions issue)
This pull request is a full overhaul of the chat-composer and contains various improvements to the thread panel. They have been grouped in the same PR as lots of improvements/fixes to the thread panel needed an improved composer. This is meant as a first step.
### New features included in this PR
- A resizable side panel
- A clear dropzone area for uploads
- A simplified design for image uploads, this is only a first step towards more redesign of this area in the future
### Notable fixes in this PR
- Correct placeholder in thread panel
- Allows to edit the last message of a thread with arrow up
- Correctly focus composer when replying to a message
- The reply indicator is added instantly in the channel when starting a thread
- Prevents a large variety of bug where the composer could bug and prevent sending message or would clear your input while it has content
### Technical notes
To achieve this PR, three important changes have been made:
- `<ChatComposer>` has been fully rewritten and is now a glimmer component
- The chat composer now takes a `ChatMessage` as input which can directly be used in other operations, it simplifies a lot of logic as we are always working a with a `ChatMessage`
- `TextareaInteractor` has been created to wrap the existing `TextareaTextManipulation` mixin, it will make future migrations easier and allow us to have a less polluted `<ChatComposer>`
Note ".chat-live-pane" has been renamed ".chat-channel"
Design for upload dropzone is from @chapoi
Similar spirit to e195e6f614,
this moves the Bookmarkable registration to DiscoursePluginRegistry
so plugins which are not enabled do not register additional
bookmarkable classes.
This commit introduces the skeleton of the chat thread UI. The
structure of the components looks like this. Its done this way
so the side panel can be used for other things as well if we wish,
not just for threads:
```
.main-chat-outlet
<ChatLivePane />
<ChatSidePanel>
<-- rendered with {{outlet}} -->
<ChatThread />
</ChatSidePanel>
```
Later on the `ChatThreadList` will be rendered here as well.
Now, when you go to a channel you can open a thread by clicking
on either the Open Thread message action button or by clicking on
the reply indicator. This will take you to a route like `chat/c/:slug/:channelId/t/:threadId`.
This works on mobile as well.
This commit includes basic serializers and routes for threads,
as well as a new `ChatThreadsManager` service in JS that caches
threads for a channel the same way the channel threads manager does.
The chat messages inside the thread are intentionally left out
until a later PR.
**NOTE: These changes are gated behind the site setting enable_experimental_chat_threaded_discussions
and the threading_enabled boolean on a ChatChannel**
* DEV: Rnemae channel path to just c
Also swap the channel id and channel slug params to be consistent with core.
* linting
* channel_path
* Drop slugify helper and channel route without slug
* Request slug and route models through the channel model if possible
* DEV: Pass messageId as a dynamic segment instead of a query param
* Ensure change is backwards-compatible
* drop query param from oneboxes
* Correctly extract channelId from routes
* Better route organization using siblings for regular and near-message
* Ensures sessions are unique even when using parallelism
* prevents didReceiveAttrs to clear input mid test
* we disable animations in capybara so sometimes the message was barely showing
* adds wait
* ensures finished loading
* is it causing more harm than good?
* this check is slowing things for no reason
* actually target the button
* more resilient select chat message
* apply similar fix to bookmark
* fix
---------
Co-authored-by: Joffrey JAFFEUX <j.jaffeux@gmail.com>
In dev/prod, these are absorbed by unicorn. Most commonly, they occur when a client interrupts a message-bus long-polling request.
Also reverts the EPIPE workaround introduced in 011c9b9973
Having this set to ALL pollutes the JS system spec
logs with a bunch of unnecessary noise like this:
> "PresenceChannel '/chat-user/core/1' dropped message (received 315, expecting 246), resyncing..."
Or:
> "DEPRECATION: The \u003Cdiscourse@component:plugin-connector::ember1112>#save computed property was just overridden. This removes the computed property and replaces it with a plain value, and has been deprecated.
Now, we will only log errors. To configure this set
the `SELENIUM_BROWSER_LOG_LEVEL` env var.
Our working theory is that system tests on Github run on much less
powerful hardware as compared to running the tests on our work machines.
Hopefully, increasing the wait time now will help reduce some flakes
that we're seeing on Github.
It should fix flakeys we have due to using_session. This commit is also fixing tests which were failing constantly with treadsafe enabled.
A test has also bene skipped as the issue couldn't be found so far.
More info: https://github.com/teamcapybara/capybara#threadsafe-mode
Previously, browser logs would be printed to STDOUT halfway through the test run. This commit changes the behaviour so that the logs are included in the failure summary along with other rspec failure information.
Currently the `turbo:spec` task will fail when encountering system
tests as Capypara tries to use the same port for each process.
This simple change uses the same strategy as for databases, by just
incrementing the port number by `TEST_ENV_NUMBER` for each process.
We are all in on system specs, so this commit moves all the chat quoting acceptance tests (some of which have been skipped for a while) into system specs.
Follow up to a review in #18937, this commit changes the HashtagAutocompleteService to no longer use class variables to register hashtag data sources or types in context priority order. This is to address multisite concerns, where one site could e.g. have chat disabled and another might not. The filtered plugin registers I added will not be included if the plugin is disabled.
This ensures that all system tests are starting from a clean state and
not leak state between requests. Note that we have to simplify flush the
Redis db here because it is not pratical to manually clean up Redis keys
in system tests.
Will make your test run in an emulated iPhone 12 Pro view. It means you can now use `click(delay: 0.5)` to emulate some long press or that `mobile_view=1` will be set automatically.
Usage:
```
it "works", mobile: true do
visit("/")
end
```
Note: `window-size=390,950` is different than native iPhone 12 Pro size, but due to minimum browser size and the automated browser alert at the top of the view, this was the best size I could find.
This commit fleshes out and adds functionality for the new `#hashtag` search and
lookup system, still hidden behind the `enable_experimental_hashtag_autocomplete`
feature flag.
**Serverside**
We have two plugin API registration methods that are used to define data sources
(`register_hashtag_data_source`) and hashtag result type priorities depending on
the context (`register_hashtag_type_in_context`). Reading the comments in plugin.rb
should make it clear what these are doing. Reading the `HashtagAutocompleteService`
in full will likely help a lot as well.
Each data source is responsible for providing its own **lookup** and **search**
method that returns hashtag results based on the arguments provided. For example,
the category hashtag data source has to take into account parent categories and
how they relate, and each data source has to define their own icon to use for the
hashtag, and so on.
The `Site` serializer has two new attributes that source data from `HashtagAutocompleteService`.
There is `hashtag_icons` that is just a simple array of all the different icons that
can be used for allowlisting in our markdown pipeline, and there is `hashtag_context_configurations`
that is used to store the type priority orders for each registered context.
When sending emails, we cannot render the SVG icons for hashtags, so
we need to change the HTML hashtags to the normal `#hashtag` text.
**Markdown**
The `hashtag-autocomplete.js` file is where I have added the new `hashtag-autocomplete`
markdown rule, and like all of our rules this is used to cook the raw text on both the clientside
and on the serverside using MiniRacer. Only on the server side do we actually reach out to
the database with the `hashtagLookup` function, on the clientside we just render a plainer
version of the hashtag HTML. Only in the composer preview do we do further lookups based
on this.
This rule is the first one (that I can find) that uses the `currentUser` based on a passed
in `user_id` for guardian checks in markdown rendering code. This is the `last_editor_id`
for both the post and chat message. In some cases we need to cook without a user present,
so the `Discourse.system_user` is used in this case.
**Chat Channels**
This also contains the changes required for chat so that chat channels can be used
as a data source for hashtag searches and lookups. This data source will only be
used when `enable_experimental_hashtag_autocomplete` is `true`, so we don't have
to worry about channel results suddenly turning up.
------
**Known Rough Edges**
- Onebox excerpts will not render the icon svg/use tags, I plan to address that in a follow up PR
- Selecting a hashtag + pressing the Quote button will result in weird behaviour, I plan to address that in a follow up PR
- Mixed hashtag contexts for hashtags without a type suffix will not work correctly, e.g. #ux which is both a category and a channel slug will resolve to a category when used inside a post or within a [chat] transcript in that post. Users can get around this manually by adding the correct suffix, for example ::channel. We may get to this at some point in future
- Icons will not show for the hashtags in emails since SVG support is so terrible in email (this is not likely to be resolved, but still noting for posterity)
- Additional refinements and review fixes wil
This commit adds a new `/hashtag/search` endpoint and both
relevant JS and ruby plugin APIs to handle plugins adding their
own data sources and priority orders for types of things to search
when `#` is pressed.
A `context` param is added to `setupHashtagAutocomplete` which
a corresponding chat PR https://github.com/discourse/discourse-chat/pull/1302
will now use.
The UI calls `registerHashtagSearchParam` for each context that will
require a `#` search (e.g. the topic composer), for each type of record that
the context needs to search for, as well as a priority order for that type. Core
uses this call to add the `category` and `tag` data sources to the topic composer.
The `register_hashtag_data_source` ruby plugin API call is for plugins to
add a new data source for the hashtag searching endpoint, e.g. discourse-chat
may add a `channel` data source.
This functionality is hidden behind the `enable_experimental_hashtag_autocomplete`
flag, except for the change to `setupHashtagAutocomplete` since only core and
discourse-chat are using that function. Note this PR does **not** include required
changes for hashtag lookup or new styling.
This commit introduces rails system tests run with chromedriver, selenium,
and headless chrome to our testing toolbox.
We use the `webdrivers` gem and `selenium-webdriver` which is what
the latest Rails uses so the tests run locally and in CI out of the box.
You can use `SELENIUM_VERBOSE_DRIVER_LOGS=1` to show extra
verbose logs of what selenium is doing to communicate with the system
tests.
By default JS logs are verbose so errors from JS are shown when
running system tests, you can disable this with
`SELENIUM_DISABLE_VERBOSE_JS_LOGS=1`
You can use `SELENIUM_HEADLESS=0` to run the system
tests inside a chrome browser instead of headless, which can be useful to debug things
and see what the spec sees. See note above about `bin/ember-cli` to avoid
surprises.
I have modified `bin/turbo_rspec` to exclude `spec/system` by default,
support for parallel system specs is a little shaky right now and we don't
want them slowing down the turbo by default either.
### PageObjects and System Tests
To make querying and inspecting parts of the page easier
and more reusable inbetween system tests, we are using the
concept of [PageObjects](https://www.selenium.dev/documentation/test_practices/encouraged/page_object_models/) in
our system tests. A "Page" here is generally corresponds to
an overarching ember route, e.g. "Topic" for `/t/324345/some-topic`,
and this contains logic for querying components within the topic
such as "Posts".
I have also split "Modals" into their own entity. Further down the
line we may want to explore creating independent "Component"
contexts.
Capybara DSL should be included in each PageObject class,
reference for this can be found at https://rubydoc.info/github/teamcapybara/capybara/master#the-dsl
For system tests, since they are so slow, we want to focus on
the "happy path" and not do every different possible context
and branch check using them. They are meant to be overarching
tests that check a number of things are correct using the full stack
from JS and ember to rails to ruby and then the database.
### CI Setup
Whenever a system spec fails, a screenshot
is taken and a build artifact is produced _after the entire CI run is complete_,
which can be downloaded from the Actions UI in the repo.
Most importantly, a step to build the Ember app using Ember CLI
is needed, otherwise the JS assets cannot be found by capybara:
```
- name: Build Ember CLI
run: bin/ember-cli --build
```
A new `--build` argument has been added to `bin/ember-cli` for this
case, which is not needed locally if you already have the discourse
rails server running via `bin/ember-cli -u` since the whole server is built and
set up by default.
Co-authored-by: David Taylor <david@taylorhq.com>
The cache was causing state to leak between tests since the `WatchedWord` record in the DB would have been rolled back but `WordWatcher` still had the word in the cache.
This pull request follows on from https://github.com/discourse/discourse/pull/16308. This one does the following:
* Changes `BookmarkQuery` to allow for querying more than just Post and Topic bookmarkables
* Introduces a `Bookmark.register_bookmarkable` method which requires a model, serializer, fields and preload includes for searching. These registered `Bookmarkable` types are then used when validating new bookmarks, and also when determining which serializer to use for the bookmark list. The `Post` and `Topic` bookmarkables are registered by default.
* Adds new specific types for Post and Topic bookmark serializers along with preloading of associations in `UserBookmarkList`
* Changes to the user bookmark list template to allow for more generic bookmarkable types alongside the Post and Topic ones which need to display in a particular way
All of these changes are gated behind the `use_polymorphic_bookmarks` site setting, apart from the .hbs changes where I have updated the original `UserBookmarkSerializer` with some stub methods.
Following this PR will be several plugin PRs (for assign, chat, encrypt) that will register their own bookmarkable types or otherwise alter the bookmark serializers in their own way, also gated behind `use_polymorphic_bookmarks`.
This commit also removes `BookmarkQuery.preloaded_custom_fields` and the functionality surrounding it. It was added in 0cd502a558 but only used by one plugin (discourse-assign) where it has since been removed, and is now used by no plugins. We don't need it anymore.
Previously cached counting made redis calls in main thread and performed
the flush in main thread.
This could lead to pathological states in extreme heavy load.
This refactor reduces load and cleans up the interface
Also:
* Remove an unused method (#fill_email)
* Replace a method that was used just once (#generate_username) with `SecureRandom.alphanumeric`
* Remove an obsolete dev puma `tmp/restart` file logic
* File.exists? is deprecated and removed in Ruby 3.2 in favor of
File.exist?
* Dir.exists? is deprecated and removed in Ruby 3.2 in favor of
Dir.exist?
OmniAuth test mode is disabled by default, so that we can integration-test the omniauth strategies. Sometimes, we manually enable test mode for specific specs. This commit ensures that test_mode is always disabled again after each spec.
We can fake redis transactions so that `fab!` works for redis and PG
data, but it's too slow to be used indiscriminately. Instead, you can
opt into it with the `use_redis_snapshotting` helper.
Insofar as snapshotting allows us to `fab!` more things, it provides a
speedup.