Commit Graph

25 Commits

Author SHA1 Message Date
Alan Guo Xiang Tan
859d61003e
DEV: API to register custom request rate limiting conditions (#30239)
This commit adds the `add_request_rate_limiter` plugin API which allows plugins to add custom rate limiters on top of the default rate limiters which requests by a user's id or the request's IP address.

Example to add a rate limiter that rate limits all requests from Googlebot under the same rate limit bucket:

```
add_request_rate_limiter(
  identifier: :country,
  key: ->(request) { "country/#{DiscourseIpInfo.get(request.ip)[:country]}" },
  activate_when: ->(request) { DiscourseIpInfo.get(request.ip)[:country].present? },
)
```
2024-12-23 09:57:18 +08:00
Loïc Guitaut
d637bd6519
DEV: Don’t replace Rails logger in specs (#29721)
Instead of replacing the Rails logger in specs, we can instead use
`#broadcast_to` which has been introduced in Rails 7.
2024-11-13 08:47:39 +08:00
Alan Guo Xiang Tan
ed6c9d1545
DEV: Call Discourse.redis.flushdb after the end of each test (#29117)
There have been too many flaky tests as a result of leaking state in
Redis so it is easier to resolve them by ensuring we flush Redis'
database.

Locally on my machine, calling `Discourse.redis.flushdb` takes around
0.1ms which means this change will have very little impact on test
runtimes.
2024-10-09 07:19:31 +08:00
Martin Brennan
527f02e99f
FEATURE: Only count topic views for explicit/deferred tracked views (#27533)
Followup 2f2da72747

This commit moves topic view tracking from happening
every time a Topic is requested, which is susceptible
to inflating numbers of views from web crawlers, to
our request tracker middleware.

In this new location, topic views are only tracked when
the following headers are sent:

* HTTP_DISCOURSE_TRACK_VIEW - This is sent on every page navigation when
  clicking around the ember app. We count these as browser page views
  because we know it comes from the AJAX call in our app. The topic ID
  is extracted from HTTP_DISCOURSE_TRACK_VIEW_TOPIC_ID
* HTTP_DISCOURSE_DEFERRED_TRACK_VIEW - Sent when MessageBus initializes
  after first loading the page to count the initial page load view. The
  topic ID is extracted from HTTP_DISCOURSE_DEFERRED_TRACK_VIEW.

This will bring topic views more in line with the change we
made to page views in the referenced commit and result in
more realistic topic view counts.
2024-07-03 10:38:49 +10:00
Arkshine
1fffb236b2 FIX: crawler requests exceptions for non UTF-8 user agents with invalid bytes 2024-06-11 14:02:46 +02:00
Osama Sayegh
361992bb74
FIX: Apply crawler rate limits to cached requests (#27174)
This commit moves the logic for crawler rate limits out of the application controller and into the request tracker middleware. The reason for this move is to apply rate limits to all crawler requests instead of just the requests that make it to the application controller. Some requests are served early from the middleware stack without reaching the Rails app for performance reasons (e.g. `AnonymousCache`) which results in crawlers getting 200 responses even though they've reached their limits and should be getting 429 responses.

Internal topic: t/128810.
2024-05-27 16:26:35 +03:00
David Taylor
ece0150cb7
FIX: Ensure RequestTracker handles bubbled exceptions correctly (#26940)
This can happen for various reasons including rate limiting and middleware bugs. This should resolve the warning we're seeing in the logs

```
RequestTracker.get_data failed : NoMethodError : undefined method `[]' for nil:NilClass
```
2024-05-08 16:08:39 +01:00
David Taylor
2f2da72747
FEATURE: Add experimental tracking of 'real browser' pageviews (#26647)
Our 'page_view_crawler' / 'page_view_anon' metrics are based purely on the User Agent sent by clients. This means that 'badly behaved' bots which are imitating real user agents are counted towards 'anon' page views.

This commit introduces a new method of tracking visitors. When an initial HTML request is made, we assume it is a 'non-browser' request (i.e. a bot). Then, once the JS application has booted, we notify the server to count it as a 'browser' request. This reliance on a JavaScript-capable browser matches up more closely to dedicated analytics systems like Google Analytics.

Existing data collection and graphs are unchanged. Data collected via the new technique is available in a new 'experimental' report.
2024-04-25 11:00:01 +01:00
Martin Brennan
6bcbe56116
DEV: Use freeze_time_safe in more places (#25949)
Followup to 120a2f70a9,
uses new method to avoid time-based spec flakiness
2024-03-01 10:07:35 +10:00
dependabot[bot]
f087234ff7
Build(deps-dev): Bump rubocop from 1.60.2 to 1.61.0 (#25958)
* Build(deps-dev): Bump rubocop from 1.60.2 to 1.61.0

Bumps [rubocop](https://github.com/rubocop/rubocop) from 1.60.2 to 1.61.0.
- [Release notes](https://github.com/rubocop/rubocop/releases)
- [Changelog](https://github.com/rubocop/rubocop/blob/master/CHANGELOG.md)
- [Commits](https://github.com/rubocop/rubocop/compare/v1.60.2...v1.61.0)

---
updated-dependencies:
- dependency-name: rubocop
  dependency-type: indirect
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* fix the issue

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Jarek Radosz <jradosz@gmail.com>
2024-02-29 14:09:49 +01:00
David Taylor
a562214f56
FIX: Update global rate limiter keys/messages to clarify user vs ip (#25264) 2024-01-15 19:54:50 +00:00
Sam
b09422428d
DEV: update syntax tree to latest (#24623)
update format to latest syntax tree
2023-11-29 16:38:07 +11:00
Alan Guo Xiang Tan
773b22e8d0
DEV: Seperate concerns of tracking GC stat from MethodProfiler (#22921)
Why this change?

This is a follow up to e8f7b62752.
Tracking of GC stats didn't really belong in the `MethodProfiler` class
so we want to extract that concern into its own class.

As part of this PR, the `track_gc_stat_per_request` site setting has
also been renamed to `instrument_gc_stat_per_request`.
2023-08-02 10:46:37 +08:00
OsamaSayegh
0976c8fad6
SECURITY: Don't reuse CSP nonce between anonymous requests 2023-07-28 12:53:44 +01:00
Martin Brennan
9174716737
DEV: Remove Discourse.redis.delete_prefixed (#22103)
This method is a huge footgun in production, since it calls
the Redis KEYS command. From the Redis documentation at
https://redis.io/commands/keys/:

> Warning: consider KEYS as a command that should only be used in
production environments with extreme care. It may ruin performance when
it is executed against large databases. This command is intended for
debugging and special operations, such as changing your keyspace layout.
Don't use KEYS in your regular application code.

Since we were only using `delete_prefixed` in specs (now that we
removed the usage in production in 24ec06ff85)
we can remove this and instead rely on `use_redis_snapshotting` on the
particular tests that need this kind of clearing functionality.
2023-06-16 12:44:35 +10:00
Sam
83f1a13374
DEV: stop leaking data into tables during test (#21403)
This amends it so our cached counting reliant specs run in synchronize mode

When running async there are situations where data is left over in the table
after a transactional test. This means that repeat runs of the test suite
fail.
2023-05-06 07:15:33 +10:00
David Taylor
cb932d6ee1
DEV: Apply syntax_tree formatting to spec/* 2023-01-09 11:49:28 +00:00
David Taylor
66e8a35b4d
DEV: Include message-bus request type in HTTP request data (#19762) 2023-01-06 11:26:18 +00:00
Bianca Nenciu
3048d3d07d
FEATURE: Track API and user API requests (#19186)
Adds stats for API and user API requests similar to regular page views.
This comes with a new report to visualize API requests per day like the
consolidated page views one.
2022-11-29 13:07:42 +02:00
Loïc Guitaut
3eaac56797 DEV: Use proper wording for contexts in specs 2022-08-04 11:05:02 +02:00
Phil Pirozhkov
493d437e79
Add RSpec 4 compatibility (#17652)
* Remove outdated option

04078317ba

* Use the non-globally exposed RSpec syntax

https://github.com/rspec/rspec-core/pull/2803

* Use the non-globally exposed RSpec syntax, cont

https://github.com/rspec/rspec-core/pull/2803

* Comply to strict predicate matchers

See:
 - https://github.com/rspec/rspec-expectations/pull/1195
 - https://github.com/rspec/rspec-expectations/pull/1196
 - https://github.com/rspec/rspec-expectations/pull/1277
2022-07-28 10:27:38 +08:00
Jarek Radosz
3f0e767106
DEV: Use FakeLogger in RequestTracker specs (#16640)
`TestLogger` was responsible for some flaky specs runs:

```
Error during failsafe response: undefined method `debug' for #<TestLogger:0x0000556c4b942cf0 @warnings=1>
Did you mean?  debugger
```

This commit also cleans up other uses of `FakeLogger`
2022-05-05 09:53:54 +08:00
David Taylor
c9dab6fd08
DEV: Automatically require 'rails_helper' in all specs (#16077)
It's very easy to forget to add `require 'rails_helper'` at the top of every core/plugin spec file, and omissions can cause some very confusing/sporadic errors.

By setting this flag in `.rspec`, we can remove the need for `require 'rails_helper'` entirely.
2022-03-01 17:50:50 +00:00
Sam
d4d3580761
PERF: perform all cached counting in background (#15991)
Previously cached counting made redis calls in main thread and performed
the flush in main thread.

This could lead to pathological states in extreme heavy load.

This refactor reduces load and cleans up the interface
2022-02-22 16:45:25 +00:00
Jarek Radosz
45cc16098d
DEV: Move spec/components to spec/lib (#15987)
Lib specs were inexplicably split into two directories (`lib` and `components`)

This moves them all into `lib`.
2022-02-18 19:41:54 +01:00