discourse/lib/turbo_tests.rb

# frozen_string_literal: true

require "bundler/setup"
require "open3"
require "fileutils"
require "json"
require "rspec"
require "rails"
require File.expand_path("../../config/environment", __FILE__)
require "parallel_tests"
require "parallel_tests/rspec/runner"
require "./lib/turbo_tests/reporter"
require "./lib/turbo_tests/runner"
require "./lib/turbo_tests/json_rows_formatter"
require "./lib/turbo_tests/documentation_formatter"
RSpec.configure do |config|
  # this is an unusual config option because it is used by the formatter, not just the runner
  config.full_cause_backtrace = true
end

module TurboTests
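  # Minimal stand-in for an exception raised in a worker process. It subclasses
  # Exception (rather than being a Struct) because RSpec's ExceptionPresenter expects a
  # real Exception when rendering failures. `.from_obj` rebuilds the exception (and its
  # cause chain) from a serialized hash, using an anonymous subclass so that `.name`
  # reports the original exception's class name.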
  class FakeException < Exception
    attr_reader :backtrace, :message, :cause

    def initialize(backtrace, message, cause)
      @backtrace = backtrace
      @message = message
      @cause = cause
    end

    def self.from_obj(obj)
      if obj
        obj = obj.symbolize_keys

        klass = Class.new(FakeException) { define_singleton_method(:name) { obj[:class_name] } }

        klass.new(obj[:backtrace], obj[:message], FakeException.from_obj(obj[:cause]))
      end
    end
  end
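
  # Mirrors the RSpec::Core::Example::ExecutionResult interface as far as the
  # reporter-side formatters need it; `.from_obj` rebuilds it from the hash serialized
  # by a worker process.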
  FakeExecutionResult =
    Struct.new(
      :example_skipped?,
      :pending_message,
      :status,
      :pending_fixed?,
      :exception,
      :pending_exception,
    )
  class FakeExecutionResult
    def self.from_obj(obj)
      obj = obj.symbolize_keys
      new(
        obj[:example_skipped?],
        obj[:pending_message],
        obj[:status].to_sym,
        obj[:pending_fixed?],
        FakeException.from_obj(obj[:exception]),
        FakeException.from_obj(obj[:pending_exception]),
      )
    end
  end
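
  # Stand-in for RSpec::Core::Example carrying only the attributes the reporter's
  # formatters use. Each example is serialized to a plain hash in a worker process and
  # rebuilt here with `.from_obj` so the formatters can render it.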
  FakeExample =
    Struct.new(
      :execution_result,
      :location,
      :description,
      :full_description,
      :metadata,
      :location_rerun_argument,
      :process_id,
      :command_string,
    )
  class FakeExample
    def self.from_obj(obj, process_id:, command_string:)
      obj = obj.symbolize_keys
      metadata = obj[:metadata].symbolize_keys

      metadata[:shared_group_inclusion_backtrace].map! do |frame|
        frame = frame.symbolize_keys
        RSpec::Core::SharedExampleGroupInclusionStackFrame.new(
          frame[:shared_group_name],
          frame[:inclusion_location],
        )
      end

      new(
        FakeExecutionResult.from_obj(obj[:execution_result]),
        obj[:location],
        obj[:description],
        obj[:full_description],
        metadata,
        obj[:location_rerun_argument],
        process_id,
        command_string,
      )
    end
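
    # RSpec formatters consume notification objects, so wrap the rebuilt example in an
    # ExampleNotification before handing it to them.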
    def notification
      RSpec::Core::Notifications::ExampleNotification.for(self)
    end
  end
end