Why is this change required?
We've been seeing flaky tests due to server errors on CI but are unable
to debug it because we do not log any of the errors. This change gives
us a fighting chance the next time we encounter a server error during
system test runs.
See
https://github.com/discourse/discourse/actions/runs/5459248864/jobs/9935049920?pr=22424
for an example of server errors encountered during system tests.
Using the runtime information, we will be able to more efficiently group
the test files across the test processes hence leading to better
utilization of resources.
Using the runtime information, we will be able to more efficiently group
the test files across the test processes hence leading to better
utilization of resources.
4dd053a69c addressed most of the
instability we were seeing with system tests on CI and locally. Let's
try pushing the number of parallel processes up to squeeze as much time
savings as possible from the runner.
We're running on pretty crappy hardware on Github's CI and this has an
impact on the stability of our system tests on CI. Therefore, we are
bumping `CABPYARA_DEFAULT_MAX_WAIT_TIME` to 10 seconds to account for
the less than ideal hardware we're running the system tests on.
This change trades off speed for stability but speed is already bad on
CI so stability is more important for our case.
## How does this work?
Any time a lint rule is added or changed, you can run `yarn lint:fix` to handle all the auto-fixable situations.
But not all lints are auto-fixable -- for those, lint-to-the-future has tooling to automatically ignore present violations.
An alias has been added for lint-to-the-future to ignore new violations, `yarn lttf:ignore`.
The command will add lint-ignore declarations throughout all the files with present violations, which should then be committed.
An excerpt from lint-to-the-future's [README](https://github.com/mansona/lint-to-the-future#lint-to-the-future-dashboard):
> The point of Lint to the Future is to allow you to progressively update your codebase using new lint rules without overwhelming you with the task. You can easily ignore lint rules using project-based ignores in your config files but that doesn't prevent you from making the same errors in new files.
> We chose to do the ignores on a file basis as it is a perfect balance and it means that the tracking/graphing aspects of Lint to the Future provide you with achievable goals, especially in large codebases.
## How do I view progress?
lint-to-the-future provides graphs of violations-over-time per lint rule in a dashboard format, so we can track how well we're doing at cleaning up the violations.
To view the dashboard locally, run `yarn lint-progress` and visit `http://localhost:8084` (or whatever the port it chose, as it will choose a new port if 8084 is preoccupied)
Also there is a `list` command which shows a JSON object of:
```ts
{
[date: string]: { // yyyy-mm-dd
[pluginName: string]: {
[fileName: string]: string[]; // list of files with violations
}
}
}
```
```bash
yarn lint-to-the-future list --stdout
```
## What about lint-todo?
Lint todo is another system available for both eslint and ember-template-lint that _forces_ folks to "leave things better than they found them" by being transparent / line-specific ignoring of violations.
It was decided that for _this_ project, it made more sense, and would be less disruptive to new contributors to have the ignore declarations explicitly defined in each file (whereas in lint-todo, they are hidden).
To effectively use lint-todo, a whole team needs to agree to the workflow, and in open source, we want "just anyway" to be able to contribute, and throwing surprises at them can deter contributions.
In production, `eager_load=true`. This sometimes leads to boot errors which are not present in dev/test environments. Running `zeitwerk:check` in CI will help us to pick up on any errors early.
This commit also introduces a `DISCOURSE_ZEITWERK_EAGER_LOAD` environment variable to make it easier to toggle the behaviour when developing locally.
This is a combined work of Martin Brennan, Loïc Guitaut, and Joffrey Jaffeux.
---
This commit implements a base service object when working in chat. The documentation is available at https://discourse.github.io/discourse/chat/backend/Chat/Service.html
Generating documentation has been made as part of this commit with a bigger goal in mind of generally making it easier to dive into the chat project.
Working with services generally involves 3 parts:
- The service object itself, which is a series of steps where few of them are specialized (model, transaction, policy)
```ruby
class UpdateAge
include Chat::Service::Base
model :user, :fetch_user
policy :can_see_user
contract
step :update_age
class Contract
attribute :age, :integer
end
def fetch_user(user_id:, **)
User.find_by(id: user_id)
end
def can_see_user(guardian:, **)
guardian.can_see_user(user)
end
def update_age(age:, **)
user.update!(age: age)
end
end
```
- The `with_service` controller helper, handling success and failure of the service within a service and making easy to return proper response to it from the controller
```ruby
def update
with_service(UpdateAge) do
on_success { render_serialized(result.user, BasicUserSerializer, root: "user") }
end
end
```
- Rspec matchers and steps inspector, improving the dev experience while creating specs for a service
```ruby
RSpec.describe(UpdateAge) do
subject(:result) do
described_class.call(guardian: guardian, user_id: user.id, age: age)
end
fab!(:user) { Fabricate(:user) }
fab!(:current_user) { Fabricate(:admin) }
let(:guardian) { Guardian.new(current_user) }
let(:age) { 1 }
it { expect(user.reload.age).to eq(age) }
end
```
Note in case of unexpected failure in your spec, the output will give all the relevant information:
```
1) UpdateAge when no channel_id is given is expected to fail to find a model named 'user'
Failure/Error: it { is_expected.to fail_to_find_a_model(:user) }
Expected model 'foo' (key: 'result.model.user') was not found in the result object.
[1/4] [model] 'user' ❌
[2/4] [policy] 'can_see_user'
[3/4] [contract] 'default'
[4/4] [step] 'update_age'
/Users/joffreyjaffeux/Code/pr-discourse/plugins/chat/app/services/update_age.rb:32:in `fetch_user': missing keyword: :user_id (ArgumentError)
from /Users/joffreyjaffeux/Code/pr-discourse/plugins/chat/app/services/base.rb:202:in `instance_exec'
from /Users/joffreyjaffeux/Code/pr-discourse/plugins/chat/app/services/base.rb:202:in `call'
from /Users/joffreyjaffeux/Code/pr-discourse/plugins/chat/app/services/base.rb:219:in `call'
from /Users/joffreyjaffeux/Code/pr-discourse/plugins/chat/app/services/base.rb:417:in `block in run!'
from /Users/joffreyjaffeux/Code/pr-discourse/plugins/chat/app/services/base.rb:417:in `each'
from /Users/joffreyjaffeux/Code/pr-discourse/plugins/chat/app/services/base.rb:417:in `run!'
from /Users/joffreyjaffeux/Code/pr-discourse/plugins/chat/app/services/base.rb:411:in `run'
from <internal:kernel>:90:in `tap'
from /Users/joffreyjaffeux/Code/pr-discourse/plugins/chat/app/services/base.rb:302:in `call'
from /Users/joffreyjaffeux/Code/pr-discourse/plugins/chat/spec/services/update_age_spec.rb:15:in `block (3 levels) in <main>'
```
This broke because of directory ownership errors during `git ls-files`. This commit fixes the permissions and adds bash flags so that those kind of errors will blow up the step in future.
The `git` version in our discourse_test docker image was recently updated to include a permissions check before running any git commands. For this to pass, the owner of the discourse directory needs to match the user running any git commands.
Under GitHub actions, by default the working directory is created with uid=1000 as the owner. We run all our tests as `root`, so this mismatch causes git to raise the permissions error. We can't switch to run the entire workflow as the `discourse (uid=1000)` user because our discourse_test image is not configured to allow `discourse` access to postgres/redis directories. For now, this commit updates the working directory's owner to match the user running the workflow.
Note this commit also slightly changes internal API: channel instead of getChannel and updateCurrentUserChannelNotificationsSettings instead of updateCurrentUserChatChannelNotificationsSettings.
Also destroyChannel takes a second param which is the name confirmation instead of an optional object containing this confirmation. This is to enforce the fact that it's required.
In the future a top level jsdoc config file could be used instead of the hack tempfile, but while it's only an experiment for chat, it's probably good enough.
This commit introduces the necessary gems and config, but adds all our ruby code directories to the `--ignore-files` list.
Future commits will apply syntax_tree to parts of the codebase, removing the ignore patterns as we go
Our working theory is that system tests on Github run on much less
powerful hardware as compared to running the tests on our work machines.
Hopefully, increasing the wait time now will help reduce some flakes
that we're seeing on Github.
These screenshots are located at paths like:
/__w/discourse/discourse/tmp/capybara/failures_r_spec_example_groups_quoting_chat_message_transcripts_copying_quote_transcripts_with_the_clipboard_quotes_multiple_chat_messages_into_a_topic_134.png
not /tmp/screenshots. This should fix the issue. Also makes plugin system specs
use documentation format and profile.