Commit Graph

124 Commits

Author SHA1 Message Date
Penar Musaraj
74869b8a7f FIX: Do not consider mobile app traffic as crawler visits
Followup to a4eb523a
2019-11-04 09:16:50 -05:00
Daniel Waterworth
563253e9ed FIX: Fix options given to per-minute rate limiter
Previously the options for the per-minute and per-10-second rate
limiters were the same.
2019-09-20 10:48:59 +01:00
Sam Saffron
ed00f35306 FEATURE: improve performance of anonymous cache
This commit introduces 2 features:

1. DISCOURSE_COMPRESS_ANON_CACHE (true|false, default false): this allows
you to optionally compress the anon cache body entries in Redis, can be
useful for high load sites with Redis that lives on a separate server to
to webs

2. DISCOURSE_ANON_CACHE_STORE_THRESHOLD (default 2), only pop entries into
redis if we observe them more than N times. This avoids situations where
a crawler can walk a big pile of topics and store them all in Redis never
to be used. Our default anon cache time for topics is only 60 seconds. Anon
cache is in place to avoid the "slashdot" effect where a single topic is
hit by 100s of people in one minute.
2019-09-04 17:18:32 +10:00
Sam Saffron
b9954b53bb FIX: report cached controller and action to loggers
Previously we would treat all cached hits in anon cache as "other"

This hinders analysis of cache performance and makes logging inaccurate
2019-09-03 10:55:16 +10:00
Sam Saffron
08743e8ac0 FEATURE: anon cache reports data to loggers
This allows custom plugins such as prometheus exporter to log how many
requests are stored in the anon cache vs used by the anon cache.

This metric allows us to fine tune cache behaviors
2019-09-02 18:45:35 +10:00
Régis Hanol
75eebc904e FEATURE: new 'Discourse-Render' HTTP header 2019-08-30 20:45:18 +02:00
Sam Saffron
8cea78c833 Revert "FEATURE: Protect against replay attacks when using TLS 1.3 0-RTT (#8020)"
This reverts commit 39c31a3d76.

Sorry about this, we have decided againse supporting 0-RTT directly in
core, this can be supported with similar hacks to this commit in a
plugin.

That said, we recommend against using a 0-RTT proxy for the Discourse
app due to inherit risk of replay attacks.
2019-08-26 08:56:49 +10:00
Rafael dos Santos Silva
39c31a3d76
FEATURE: Protect against replay attacks when using TLS 1.3 0-RTT (#8020) 2019-08-23 11:52:47 -03:00
David Taylor
f4aa6096ab FIX: Convert omniauth authenticator names to symbols before comparing
This is necessary because some auth plugins define their name as a string
2019-08-14 12:57:11 +01:00
David Taylor
1a8fee11a0 DEV: If only one auth provider is enabled allow GET request
In this case, the auth provider is acting as a SSO provider, and can be trusted to maintain its own CSRF protections.
2019-08-12 11:03:05 +01:00
David Taylor
750802bf56
UX: Improve error handling for common OmniAuth exceptions (#7991)
This displays more useful messages for the most common issues we see:
- CSRF (when the user switches browser)
- Invalid IAT (when the server clock is wrong)
- OAuth::Unauthorized for OAuth1 providers, when the credentials are incorrect

This commit also stops earlier for disabled authenticators. Now we stop at the request phase, rather than the callback phase.
2019-08-12 10:55:02 +01:00
Sam Saffron
1f47ed1ea3 PERF: message_bus will be deferred by server when flooded
The message_bus performs a fair amount of work prior to hijacking requests
this change ensures that if there is a situation where the server is flooded
message_bus will inform client to back off for 30 seconds + random(120 secs)

This back-off is ultra cheap and happens very early in the middleware.

It corrects a situation where a flood to message bus could cause the app
to become unresponsive

MessageBus update is here to ensure message_bus gem properly respects
Retry-After header and status 429.

Under normal state this code should never trigger, to disable raise the
value of DISCOURSE_REJECT_MESSAGE_BUS_QUEUE_SECONDS, default is to tell
message bus to go away if we are queueing for 100ms or longer
2019-08-09 17:48:01 +10:00
David Taylor
3b8c468832 SECURITY: Require POST with CSRF token for OmniAuth request phase 2019-08-08 11:58:00 +01:00
Sam Saffron
62141b6316 FEATURE: enable_performance_http_headers for performance diagnostics
This adds support for DISCOURSE_ENABLE_PERFORMANCE_HTTP_HEADERS
when set to `true` this will turn on performance related headers

```text
X-Redis-Calls: 10     # number of redis calls
X-Redis-Time: 1.02    # redis time in seconds
X-Sql-Commands: 102   # number of SQL commands
X-Sql-Time: 1.02      # duration in SQL in seconds
X-Queue-Time: 1.01    # time the request sat in queue (depends on NGINX)
```

To get queue time NGINX must provide: HTTP_X_REQUEST_START

We do not recommend you enable this without thinking, it exposes information
about what your page is doing, usually you would only enable this if you
intend to strip off the headers further down the stream in a proxy
2019-06-05 16:08:11 +10:00
Maja Komel
42809f4d69 FIX: use crawler layout when saving url in Wayback Machine (#7667) 2019-06-03 12:13:32 +10:00
Sam Saffron
30990006a9 DEV: enable frozen string literal on all files
This reduces chances of errors where consumers of strings mutate inputs
and reduces memory usage of the app.

Test suite passes now, but there may be some stuff left, so we will run
a few sites on a branch prior to merging
2019-05-13 09:31:32 +08:00
Penar Musaraj
a4eb523af6 Track Discourse user agent pageviews as crawler
Since 5bfe051e, Discourse user agents are marked as non-crawlers (to avoid accidental blacklisting). This makes sure pageviews for these agents are tracked as crawler hits.
2019-05-08 10:38:55 -04:00
David Taylor
8963f1af30
FEATURE: Optional detailed performance logging for Sidekiq jobs (#7091)
By default, this does nothing. Two environment variables are available:

- `DISCOURSE_LOG_SIDEKIQ`

  Set to `"1"` to enable logging. This will log all completed jobs to `log/rails/sidekiq.log`, along with various db/redis/network statistics. This is useful to track down poorly performing jobs.

- `DISCOURSE_LOG_SIDEKIQ_INTERVAL`

  (seconds) Check running jobs periodically, and log their current duration. They will appear in the logs with `status:pending`. This is useful to track down jobs which take a long time, then crash sidekiq before completing.
2019-03-05 11:19:11 +00:00
Guo Xiang Tan
c732ae9ca9 FIX: Don't update User#last_seen_at when PG is in readonly. 2019-01-21 13:29:29 +08:00
Guo Xiang Tan
e2a20d90fe FIX: Don't log request when Discourse is in readonly due to PG. 2019-01-21 11:04:32 +08:00
Sam
a19170a4c2 DEV: avoid require_dependency for some libs
This avoids require dependency on method_profiler and anon cache.

It means that if there is any change to these files the reloader will not pick it up.

Previously the reloader was picking up the anon cache twice causing it to double load on boot.

This caused warnings.

Long term my plan is to give up on require dependency and instead use:

https://github.com/Shopify/autoload_reloader
2018-12-31 10:53:30 +11:00
Sam
939b82ef0c DEV: correct intermittent test failure
ActionController::BadRequest can not be re-dispatched, under some conditions
we are getting this vs InvalidParameterError in the following test

59c56bd20f/spec/requests/application_controller_spec.rb (L34-L62)
2018-12-13 18:27:13 +11:00
David Taylor
c7c56af397
FEATURE: Allow connecting associated accounts when two-factor is enabled (#6754)
Previously the 'reconnect' process was a bit magic - IF you were already logged into discourse, and followed the auth flow, your account would be reconnected and you would be 'logged in again'.

Now, we explicitly check for a reconnect=true parameter when the flow is started, store it in the session, and then only follow the reconnect logic if that variable is present. Setting this parameter also skips the 'logged in again' step, which means reconnect now works with 2fa enabled.
2018-12-11 13:19:00 +00:00
Sam
955cdad649 FIX: exec_params needs instrumentation
the method no longer routes to "exec" or "async_exec" in latest PG so we
need to explicitly intercept
2018-12-10 14:28:10 +11:00
David Taylor
4e010382cc REFACTOR: Initialize auth providers after plugin.activate!
Also added some helpful functionality for plugin developers:
- Raises RuntimeException if the auth provider has been registered too late
- Logs use of deprecated parameters
2018-11-30 16:58:18 +00:00
Sam
e7001f879a SECURITY: enforce hostname to match discourse hostname
This ensures that the hostname rails uses for various helpers always matches
the Discourse hostname
2018-11-15 15:23:06 +11:00
Sam
5b630f3188 FIX: stop logging every time invalid params are sent
Previously we were logging warning for invalid encoded params, this can
cause a log flood
2018-10-05 14:33:19 +10:00
Neil Lalonde
526ffc4966 FIX: error in response body to blocked crawlers, showing 500 Internal Server Error with status of 403 2018-09-14 15:40:20 -04:00
Neil Lalonde
b87a089822 FIX: don't block api requests when whitelisted_crawler_user_agents is set 2018-09-14 15:40:20 -04:00
Sam
168ffd8384 FEATURE: group warnings about IP level rate limiting 2018-08-13 14:38:20 +10:00
Osama Sayegh
0b7ed8ffaf FEATURE: backend support for user-selectable components
* FEATURE: backend support for user-selectable components

* fix problems with previewing default theme

* rename preview_key => preview_theme_id

* omit default theme from child themes dropdown and try a different fix

* cache & freeze stylesheets arrays
2018-08-08 14:46:34 +10:00
Sam
379384ae1e FIX: never block /srv/status which is used for health checks
This route is also very cheap so blocking it is not required

It is still rate limited and so on elsewhere
2018-07-18 12:37:01 +10:00
OsamaSayegh
decf1f27cf FEATURE: Groundwork for user-selectable theme components
* Phase 0 for user-selectable theme components

- Drops `key` column from the `themes` table
- Drops `theme_key` column from the `user_options` table
- Adds `theme_ids` (array of ints default []) column to the `user_options` table and migrates data from `theme_key` to the new column.
- Removes the `default_theme_key` site setting and adds `default_theme_id` instead.
- Replaces `theme_key` cookie with a new one called `theme_ids`
- no longer need Theme.settings_for_client
2018-07-12 14:18:21 +10:00
Sam
e72fd7ae4e FIX: move crawler blocking into anon cache
This refinement of previous fix moves the crawler blocking into
anonymous cache

This ensures we never poison the cache incorrectly when blocking crawlers
2018-07-04 11:14:43 +10:00
Sam
7f98ed69cd FIX: move crawler blocking to app controller
We need access to site settings in multisite, we do not have access
yet if we attempt to get them in request tracker middleware
2018-07-04 10:30:50 +10:00
Neil Lalonde
e8a6323bea remove crawler blocking until multisite support 2018-07-03 17:54:45 -04:00
Sam
035312d501 FIX: specify path for dosp cookie 2018-04-24 11:07:58 -04:00
Sam
ded84a4b58 PERF: improve performance once logged in rate limiter hits
If "logged in" is being forced anonymous on certain routes, trigger
the protection for any requests that spend 50ms queueing

This means that ...

1. You need to trip it by having 3 requests take longer than 1 second in 10 second interval
2. Once tripped, if your route is still spending 50m queueuing it will continue to be protected

This means that site will continue to function with almost no delays while it is scaling up to handle the new load
2018-04-23 11:55:25 +10:00
Sam
4810ce3607 correct regression 2018-04-18 21:04:08 +10:00
Sam
59cd7894d9 FEATURE: if site is under extreme load show anon view
If a particular path is being hit extremely hard by logged on users,
revert to anonymous cached view.

This will only come into effect if 3 requests queue for longer than 2 seconds
on a *single* path.

This can happen if a URL is shared with the entire forum base and everyone
is logged on
2018-04-18 16:58:57 +10:00
Neil Lalonde
b87fa6d749 FIX: blacklisted crawlers could get through by omitting the accept header 2018-04-17 12:39:30 -04:00
Sam
9980f18d86 FEATURE: track request queueing as early as possible 2018-04-17 18:06:17 +10:00
Neil Lalonde
4d12ff2e8a when writing cache, remove elements from the user agents list. also return a message and content type when blocking a crawler. 2018-03-27 13:44:14 -04:00
Neil Lalonde
a84bb81ab5 only applies to get html requests 2018-03-22 17:57:44 -04:00
Neil Lalonde
ced7e9a691 FEATURE: control which web crawlers can access using a whitelist or blacklist 2018-03-22 15:41:02 -04:00
Sam
0134e41286 FEATURE: detect when client thinks user is logged on but is not
This cleans up an error condition where UI thinks a user is logged on
but the user is not. If this happens user will be prompted to refresh.
2018-03-06 16:49:31 +11:00
Sam
f0d5f83424 FEATURE: limit assets less that non asset paths
By default assets can be requested up to 200 times per 10 seconds
from the app, this includes CSS and avatars
2018-03-06 15:20:39 +11:00
Sam
f295a18e94 FIX: stop double counting net calls in logs 2018-02-28 10:45:11 +11:00
Sam
ca1a3f37e3 FEATURE: add instrumentation for all external net calls 2018-02-21 15:20:29 +11:00
Guo Xiang Tan
3e835047da Remove "already initialized" constant warning. 2018-02-13 08:55:15 +08:00