Commit Graph

25 Commits

Author SHA1 Message Date
Arkshine
1fffb236b2 FIX: crawler requests exceptions for non UTF-8 user agents with invalid bytes 2024-06-11 14:02:46 +02:00
Osama Sayegh
361992bb74
FIX: Apply crawler rate limits to cached requests (#27174)
This commit moves the logic for crawler rate limits out of the application controller and into the request tracker middleware. The reason for this move is to apply rate limits to all crawler requests instead of just the requests that make it to the application controller. Some requests are served early from the middleware stack without reaching the Rails app for performance reasons (e.g. `AnonymousCache`) which results in crawlers getting 200 responses even though they've reached their limits and should be getting 429 responses.

Internal topic: t/128810.
2024-05-27 16:26:35 +03:00
David Taylor
ece0150cb7
FIX: Ensure RequestTracker handles bubbled exceptions correctly (#26940)
This can happen for various reasons including rate limiting and middleware bugs. This should resolve the warning we're seeing in the logs

```
RequestTracker.get_data failed : NoMethodError : undefined method `[]' for nil:NilClass
```
2024-05-08 16:08:39 +01:00
David Taylor
2f2da72747
FEATURE: Add experimental tracking of 'real browser' pageviews (#26647)
Our 'page_view_crawler' / 'page_view_anon' metrics are based purely on the User Agent sent by clients. This means that 'badly behaved' bots which are imitating real user agents are counted towards 'anon' page views.

This commit introduces a new method of tracking visitors. When an initial HTML request is made, we assume it is a 'non-browser' request (i.e. a bot). Then, once the JS application has booted, we notify the server to count it as a 'browser' request. This reliance on a JavaScript-capable browser matches up more closely to dedicated analytics systems like Google Analytics.

Existing data collection and graphs are unchanged. Data collected via the new technique is available in a new 'experimental' report.
2024-04-25 11:00:01 +01:00
Martin Brennan
6bcbe56116
DEV: Use freeze_time_safe in more places (#25949)
Followup to 120a2f70a9,
uses new method to avoid time-based spec flakiness
2024-03-01 10:07:35 +10:00
dependabot[bot]
f087234ff7
Build(deps-dev): Bump rubocop from 1.60.2 to 1.61.0 (#25958)
* Build(deps-dev): Bump rubocop from 1.60.2 to 1.61.0

Bumps [rubocop](https://github.com/rubocop/rubocop) from 1.60.2 to 1.61.0.
- [Release notes](https://github.com/rubocop/rubocop/releases)
- [Changelog](https://github.com/rubocop/rubocop/blob/master/CHANGELOG.md)
- [Commits](https://github.com/rubocop/rubocop/compare/v1.60.2...v1.61.0)

---
updated-dependencies:
- dependency-name: rubocop
  dependency-type: indirect
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* fix the issue

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Jarek Radosz <jradosz@gmail.com>
2024-02-29 14:09:49 +01:00
David Taylor
a562214f56
FIX: Update global rate limiter keys/messages to clarify user vs ip (#25264) 2024-01-15 19:54:50 +00:00
Sam
b09422428d
DEV: update syntax tree to latest (#24623)
update format to latest syntax tree
2023-11-29 16:38:07 +11:00
Alan Guo Xiang Tan
773b22e8d0
DEV: Seperate concerns of tracking GC stat from MethodProfiler (#22921)
Why this change?

This is a follow up to e8f7b62752.
Tracking of GC stats didn't really belong in the `MethodProfiler` class
so we want to extract that concern into its own class.

As part of this PR, the `track_gc_stat_per_request` site setting has
also been renamed to `instrument_gc_stat_per_request`.
2023-08-02 10:46:37 +08:00
OsamaSayegh
0976c8fad6
SECURITY: Don't reuse CSP nonce between anonymous requests 2023-07-28 12:53:44 +01:00
Martin Brennan
9174716737
DEV: Remove Discourse.redis.delete_prefixed (#22103)
This method is a huge footgun in production, since it calls
the Redis KEYS command. From the Redis documentation at
https://redis.io/commands/keys/:

> Warning: consider KEYS as a command that should only be used in
production environments with extreme care. It may ruin performance when
it is executed against large databases. This command is intended for
debugging and special operations, such as changing your keyspace layout.
Don't use KEYS in your regular application code.

Since we were only using `delete_prefixed` in specs (now that we
removed the usage in production in 24ec06ff85)
we can remove this and instead rely on `use_redis_snapshotting` on the
particular tests that need this kind of clearing functionality.
2023-06-16 12:44:35 +10:00
Sam
83f1a13374
DEV: stop leaking data into tables during test (#21403)
This amends it so our cached counting reliant specs run in synchronize mode

When running async there are situations where data is left over in the table
after a transactional test. This means that repeat runs of the test suite
fail.
2023-05-06 07:15:33 +10:00
David Taylor
798b4bb604
FIX: Ensure anon-cached values are never returned for API requests (#20021)
Under some situations, we would inadvertently return a public (unauthenticated) result to an authenticated API request. This commit adds the `Api-Key` header to our anonymous cache bypass logic.
2023-01-26 13:26:29 +00:00
David Taylor
cb932d6ee1
DEV: Apply syntax_tree formatting to spec/* 2023-01-09 11:49:28 +00:00
David Taylor
66e8a35b4d
DEV: Include message-bus request type in HTTP request data (#19762) 2023-01-06 11:26:18 +00:00
Bianca Nenciu
3048d3d07d
FEATURE: Track API and user API requests (#19186)
Adds stats for API and user API requests similar to regular page views.
This comes with a new report to visualize API requests per day like the
consolidated page views one.
2022-11-29 13:07:42 +02:00
Vinoth Kannan
076abe46fa
FEATURE: new site setting to set locale from cookie for anonymous users. (#18377)
This new hidden default-disabled site setting `set_locale_from_cookie` will set locale from anonymous user's cookie value.
2022-09-27 14:26:06 +05:30
Loïc Guitaut
3eaac56797 DEV: Use proper wording for contexts in specs 2022-08-04 11:05:02 +02:00
Phil Pirozhkov
493d437e79
Add RSpec 4 compatibility (#17652)
* Remove outdated option

04078317ba

* Use the non-globally exposed RSpec syntax

https://github.com/rspec/rspec-core/pull/2803

* Use the non-globally exposed RSpec syntax, cont

https://github.com/rspec/rspec-core/pull/2803

* Comply to strict predicate matchers

See:
 - https://github.com/rspec/rspec-expectations/pull/1195
 - https://github.com/rspec/rspec-expectations/pull/1196
 - https://github.com/rspec/rspec-expectations/pull/1277
2022-07-28 10:27:38 +08:00
Jarek Radosz
1a5dbbf430
FIX: Correctly handle invalid auth cookies (#16995)
Previously it would blow up on invalid utf byte sequences. This was a source of spec flakiness.
2022-06-07 13:00:25 +02:00
Jarek Radosz
3f0e767106
DEV: Use FakeLogger in RequestTracker specs (#16640)
`TestLogger` was responsible for some flaky specs runs:

```
Error during failsafe response: undefined method `debug' for #<TestLogger:0x0000556c4b942cf0 @warnings=1>
Did you mean?  debugger
```

This commit also cleans up other uses of `FakeLogger`
2022-05-05 09:53:54 +08:00
David Taylor
8f786268be
SECURITY: Ensure user-agent-based responses are cached separately (#16475) 2022-04-14 14:25:52 +01:00
David Taylor
c9dab6fd08
DEV: Automatically require 'rails_helper' in all specs (#16077)
It's very easy to forget to add `require 'rails_helper'` at the top of every core/plugin spec file, and omissions can cause some very confusing/sporadic errors.

By setting this flag in `.rspec`, we can remove the need for `require 'rails_helper'` entirely.
2022-03-01 17:50:50 +00:00
Sam
d4d3580761
PERF: perform all cached counting in background (#15991)
Previously cached counting made redis calls in main thread and performed
the flush in main thread.

This could lead to pathological states in extreme heavy load.

This refactor reduces load and cleans up the interface
2022-02-22 16:45:25 +00:00
Jarek Radosz
45cc16098d
DEV: Move spec/components to spec/lib (#15987)
Lib specs were inexplicably split into two directories (`lib` and `components`)

This moves them all into `lib`.
2022-02-18 19:41:54 +01:00