discourse

mirror of https://github.com/discourse/discourse.git synced 2024-12-21 06:03:42 +08:00

Author	SHA1	Message	Date
Michael Brown	c546111703	DEV: add the notion of a 'crawler identifier' in anonymous_cache We identify and deny blocked crawlers here in anonymous_cache. Separating the notion of the crawler identifier here lets plugins perform an override if they perform more advanced detection.	2024-12-09 13:40:22 -05:00
Bianca Nenciu	e081cc14fb	SECURITY: Use different anon cache keys for XHR requests XHR requests are handled differently by the application and the responses do not have any preloaded data so the cache key needs to differentiate between those requests.	2024-10-07 11:48:45 +08:00
Arkshine	1fffb236b2	FIX: crawler requests exceptions for non UTF-8 user agents with invalid bytes	2024-06-11 14:02:46 +02:00
Jarek Radosz	6a66dc1cfb	DEV: Fix Lint/BooleanSymbol (#24747 )	2023-12-06 13:19:09 +01:00
Martin Brennan	30d5e752d7	DEV: Revert guardian changes (#24742 ) I took the wrong approach here, need to rethink. * Revert "FIX: Use Guardian.basic_user instead of new (anon) (#24705)" This reverts commit `9057272ee2`. * Revert "DEV: Remove unnecessary method_missing from GuardianUser (#24735)" This reverts commit `a5d4bf6dd2`. * Revert "DEV: Improve Guardian devex (#24706)" This reverts commit `77b6a038ba`. * Revert "FIX: Introduce Guardian::BasicUser for oneboxing checks (#24681)" This reverts commit `de983796e1`.	2023-12-06 16:37:32 +10:00
Martin Brennan	9057272ee2	FIX: Use Guardian.basic_user instead of new (anon) (#24705 ) c.f. `de983796e1` There will soon be additional login_required checks for Guardian, and the intent of many checks by automated systems is better fulfilled by using BasicUser, which simulates a logged in TL0 forum user, rather than an anon user. In some cases the use of anon still makes sense (e.g. anonymous_cache), and in that case the more explicit `Guardian.anon_user` is used	2023-12-06 11:56:21 +10:00
OsamaSayegh	0976c8fad6	SECURITY: Don't reuse CSP nonce between anonymous requests	2023-07-28 12:53:44 +01:00
David Taylor	798b4bb604	FIX: Ensure anon-cached values are never returned for API requests (#20021 ) Under some situations, we would inadvertently return a public (unauthenticated) result to an authenticated API request. This commit adds the `Api-Key` header to our anonymous cache bypass logic.	2023-01-26 13:26:29 +00:00
Daniel Waterworth	666536cbd1	DEV: Prefer \A and \z over ^ and $ in regexes (#19936 )	2023-01-20 12:52:49 -06:00
David Taylor	6417173082	DEV: Apply syntax_tree formatting to `lib/*`	2023-01-09 12:10:19 +00:00
Vinoth Kannan	076abe46fa	FEATURE: new site setting to set locale from cookie for anonymous users. (#18377 ) This new hidden default-disabled site setting `set_locale_from_cookie` will set locale from anonymous user's cookie value.	2022-09-27 14:26:06 +05:30
Loïc Guitaut	008b700a3f	DEV: Upgrade to Rails 7 This patch upgrades Rails to version 7.0.2.4.	2022-04-28 11:51:03 +02:00
David Taylor	8f786268be	SECURITY: Ensure user-agent-based responses are cached separately (#16475 )	2022-04-14 14:25:52 +01:00
David Taylor	11c93342dc	DEV: Consolidate Redis evalsha logic into DiscourseRedis::EvalHelper (#15957 )	2022-02-15 16:06:12 +00:00
Osama Sayegh	b86127ad12	FEATURE: Apply rate limits per user instead of IP for trusted users (#14706 ) Currently, Discourse rate limits all incoming requests by the IP address they originate from regardless of the user making the request. This can be frustrating if there are multiple users using Discourse simultaneously while sharing the same IP address (e.g. employees in an office). This commit implements a new feature to make Discourse apply rate limits by user id rather than IP address for users at or higher than the configured trust level (1 is the default). For example, let's say a Discourse instance is configured to allow 200 requests per minute per IP address, and we have 10 users at trust level 4 using Discourse simultaneously from the same IP address. Before this feature, the 10 users could only make a total of 200 requests per minute before they got rate limited. But with the new feature, each user is allowed to make 200 requests per minute because the rate limits are applied on user id rather than the IP address. The minimum trust level for applying user-id-based rate limits can be configured by the `skip_per_ip_rate_limit_trust_level` global setting. The default is 1, but it can be changed by either adding the `DISCOURSE_SKIP_PER_IP_RATE_LIMIT_TRUST_LEVEL` environment variable with the desired value to your `app.yml`, or changing the setting's value in the `discourse.conf` file. Requests made with API keys are still rate limited by IP address and the relevant global settings that control API keys rate limits. Before this commit, Discourse's auth cookie (`_t`) was simply a 32 characters string that Discourse used to lookup the current user from the database and the cookie contained no additional information about the user. However, we had to change the cookie content in this commit so we could identify the user from the cookie without making a database query before the rate limits logic and avoid introducing a bottleneck on busy sites. Besides the 32 characters auth token, the cookie now includes the user id, trust level and the cookie's generation date, and we encrypt/sign the cookie to prevent tampering. Internal ticket number: t54739.	2021-11-17 23:27:30 +03:00
Rafael dos Santos Silva	6645243a26	SECURITY: Disallow caching of MIME/Content-Type errors (#14907 ) This will signal intermediary proxies and/or misconfigured CDNs to not cache those error responses.	2021-11-12 15:52:25 -03:00
Dan Ungureanu	69f0f48dc0	DEV: Fix rubocop issues (#14715 )	2021-10-27 11:39:28 +03:00
David Taylor	7a52ce0d6d	FIX: Strip `discourse-logged-in` header during `force_anonymous!` (#14533 ) When the anonymous cache forces users into anonymous mode, it strips the cookies from their request. However, the discourse-logged-in header from the JS client remained. When the discourse-logged-in header is present without any valid auth_token, the current_user_provider [marks the request as 'logged out'](`dbbfad7ed0/lib/auth/default_current_user_provider.rb (L125-L125)`), and a [discourse-logged-out header is returned to the client](`dbbfad7ed0/lib/middleware/request_tracker.rb (L209-L211)`). This causes the JS app to [popup a "you were logged out" modal](`dbbfad7ed0/app/assets/javascripts/discourse/app/components/d-document.js (L29-L29)`), which is very disruptive. This commit strips the discourse-logged-in header from the request at the same time as the auth cookie.	2021-10-07 12:31:42 +01:00
Alan Guo Xiang Tan	8e3691d537	PERF: Eager load Theme associations in Stylesheet Manager. Before this change, calling `StyleSheet::Manager.stylesheet_details` for the first time resulted in multiple queries to the database. This is because the code was modelled in a way where each `Theme` was loaded from the database one at a time. This PR restructures the code such that it allows us to load all the theme records in a single query. It also allows us to eager load the required associations upfront. In order to achieve this, I removed the support of loading multiple themes per request. It was initially added to support user selectable theme components but the feature was never completed and abandoned because it wasn't a feature that we thought was worth building.	2021-06-21 11:06:58 +08:00
Alan Guo Xiang Tan	38b6b098bc	FIX: Bypass `AnonymousCache` for `/srv/status` route. (#11491 ) `/srv/status` routes should not be cached at all. Also, we want to decouple the route from Redis which `AnonymousCache` relies on. The `/srv/status` should continue to return a success response even if Redis is down.	2020-12-16 16:47:46 +11:00
Sam	a6d9adf346	DEV: ensure queue_time and background_requests are floats (#10901 ) GlobalSetting can end up with a String and we expect a Float	2020-10-13 18:08:38 +11:00
Sam	32393f72b1	PERF: backoff background requests when overloaded (#10888 ) When the server gets overloaded and lots of requests start queuing server will attempt to shed load by returning 429 errors on background requests. The client can flag a request as background by setting the header: `Discourse-Background` to `true` Out-of-the-box we shed load when the queue time goes above 0.5 seconds. The only request we shed at the moment is the request to load up a new post when someone posts to a topic. We can extend this as we go with a more general pattern on the client. Previous to this change, rate limiting would "break" the post stream which would make suggested topics vanish and users would have to scroll the page to see more posts in the topic. Server needs this protection for cases where tons of clients are navigated to a topic and a new post is made. This can lead to a self inflicted denial of service if enough clients are viewing the topic. Due to the internal security design of Discourse it is hard for a large number of clients to share a channel where we would pass the full post body via the message bus. It also renames (and deprecates) triggerNewPostInStream to triggerNewPostsInStream This allows us to load a batch of new posts cleanly, so the controller can keep track of a backlog Co-authored-by: Joffrey JAFFEUX <j.jaffeux@gmail.com>	2020-10-13 16:56:03 +11:00
Guo Xiang Tan	fe83baa9b3	FIX: Exclude `DELETE` methods from invalid request with payload. Follow-up `105d560177` Our client side code is sending params as part of the request payload so that is going to be tricky to fix.	2020-08-03 17:05:11 +08:00
Guo Xiang Tan	105d560177	SECURITY: 413 for GET, HEAD or DELETE requests with payload.	2020-08-03 14:21:33 +08:00
David Taylor	c09b5807f3	FIX: Include resolved locale in anonymous cache key (#10289 ) This only applies when set_locale_from_accept_language_header is enabled	2020-07-22 18:00:07 +01:00
David Taylor	19814c5e81	FIX: Allow CSP to work correctly for non-default hostnames/schemes (#9180 ) - Define the CSP based on the requested domain / scheme (respecting force_https) - Update EnforceHostname middleware to allow secondary domains, add specs - Add URL scheme to anon cache key so that CSP headers are cached correctly	2020-03-19 19:54:42 +00:00
Robin Ward	895d5cb592	FIX: Anonymous cache regression	2019-12-05 15:07:48 -05:00
Robin Ward	532fea1460	DEV: Provide API for anonymous cache segments (#8455 ) This can be used from a plugin that needs to establish something new in the anonymous cache. For example `is_ie` for an internet explorer plugin.	2019-12-05 14:57:18 -05:00
Joffrey JAFFEUX	0d3d2c43a0	DEV: s/\$redis/Discourse\.redis (#8431 ) This commit also adds a rubocop rule to prevent global variables.	2019-12-03 10:05:53 +01:00
Penar Musaraj	74869b8a7f	FIX: Do not consider mobile app traffic as crawler visits Followup to `a4eb523a`	2019-11-04 09:16:50 -05:00
Sam Saffron	ed00f35306	FEATURE: improve performance of anonymous cache This commit introduces 2 features: 1. DISCOURSE_COMPRESS_ANON_CACHE (true|false, default false): this allows you to optionally compress the anon cache body entries in Redis, can be useful for high load sites with Redis that lives on a separate server from the web servers 2. DISCOURSE_ANON_CACHE_STORE_THRESHOLD (default 2), only pop entries into redis if we observe them more than N times. This avoids situations where a crawler can walk a big pile of topics and store them all in Redis never to be used. Our default anon cache time for topics is only 60 seconds. Anon cache is in place to avoid the "slashdot" effect where a single topic is hit by 100s of people in one minute.	2019-09-04 17:18:32 +10:00
Sam Saffron	b9954b53bb	FIX: report cached controller and action to loggers Previously we would treat all cached hits in anon cache as "other" This hinders analysis of cache performance and makes logging inaccurate	2019-09-03 10:55:16 +10:00
Sam Saffron	08743e8ac0	FEATURE: anon cache reports data to loggers This allows custom plugins such as prometheus exporter to log how many requests are stored in the anon cache vs used by the anon cache. This metric allows us to fine tune cache behaviors	2019-09-02 18:45:35 +10:00
Régis Hanol	75eebc904e	FEATURE: new 'Discourse-Render' HTTP header	2019-08-30 20:45:18 +02:00
Maja Komel	42809f4d69	FIX: use crawler layout when saving url in Wayback Machine (#7667 )	2019-06-03 12:13:32 +10:00
Penar Musaraj	a4eb523af6	Track Discourse user agent pageviews as crawler Since `5bfe051e`, Discourse user agents are marked as non-crawlers (to avoid accidental blacklisting). This makes sure pageviews for these agents are tracked as crawler hits.	2019-05-08 10:38:55 -04:00
Neil Lalonde	526ffc4966	FIX: error in response body to blocked crawlers, showing 500 Internal Server Error with status of 403	2018-09-14 15:40:20 -04:00
Neil Lalonde	b87a089822	FIX: don't block api requests when whitelisted_crawler_user_agents is set	2018-09-14 15:40:20 -04:00
Osama Sayegh	0b7ed8ffaf	FEATURE: backend support for user-selectable components * FEATURE: backend support for user-selectable components * fix problems with previewing default theme * rename preview_key => preview_theme_id * omit default theme from child themes dropdown and try a different fix * cache & freeze stylesheets arrays	2018-08-08 14:46:34 +10:00
Sam	379384ae1e	FIX: never block /srv/status which is used for health checks This route is also very cheap so blocking it is not required It is still rate limited and so on elsewhere	2018-07-18 12:37:01 +10:00
OsamaSayegh	decf1f27cf	FEATURE: Groundwork for user-selectable theme components * Phase 0 for user-selectable theme components - Drops `key` column from the `themes` table - Drops `theme_key` column from the `user_options` table - Adds `theme_ids` (array of ints default []) column to the `user_options` table and migrates data from `theme_key` to the new column. - Removes the `default_theme_key` site setting and adds `default_theme_id` instead. - Replaces `theme_key` cookie with a new one called `theme_ids` - no longer need Theme.settings_for_client	2018-07-12 14:18:21 +10:00
Sam	e72fd7ae4e	FIX: move crawler blocking into anon cache This refinement of previous fix moves the crawler blocking into anonymous cache This ensures we never poison the cache incorrectly when blocking crawlers	2018-07-04 11:14:43 +10:00
Sam	035312d501	FIX: specify path for dosp cookie	2018-04-24 11:07:58 -04:00
Sam	ded84a4b58	PERF: improve performance once logged in rate limiter hits If "logged in" is being forced anonymous on certain routes, trigger the protection for any requests that spend 50ms queueing This means that ... 1. You need to trip it by having 3 requests take longer than 1 second in a 10 second interval 2. Once tripped, if your route is still spending 50ms queueing it will continue to be protected This means that site will continue to function with almost no delays while it is scaling up to handle the new load	2018-04-23 11:55:25 +10:00
Sam	59cd7894d9	FEATURE: if site is under extreme load show anon view If a particular path is being hit extremely hard by logged on users, revert to anonymous cached view. This will only come into effect if 3 requests queue for longer than 2 seconds on a single path. This can happen if a URL is shared with the entire forum base and everyone is logged on	2018-04-18 16:58:57 +10:00
Guo Xiang Tan	5012d46cbd	Add rubocop to our build. (#5004 )	2017-07-28 10:20:09 +09:00
Sam	bdb848b4f3	Split the theme_key so we extract the key from seq	2017-06-15 14:09:44 -04:00
Sam	ac1f84d3e1	SECURITY: theme key should be an anon cache breaker	2017-06-15 09:36:27 -04:00
Sam	39a524aac8	FEATURE: brotli cdn bypass for assets Allow CDNS that strip out brotli encoding to use brotli regardless	2016-12-05 13:57:09 +11:00
Robin Ward	a9823ab59a	FIX: Use a cookie to bypass the anon cache	2015-10-28 17:16:56 -04:00
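
Several of the commits above (`e081cc14fb`, `8f786268be`, `c09b5807f3`, `19814c5e81`, `532fea1460`, `ac1f84d3e1`) share one idea: anything that changes the rendered response must also change the anonymous cache key. The Ruby sketch below illustrates that idea only. It is not taken from the Discourse source, and the module name `AnonCacheKeySketch`, the `register_segment` method, the `ANON_CACHE_` prefix and the choice of segments are all hypothetical.

```ruby
require "digest"

# Hypothetical sketch, not Discourse's implementation: every name here is
# invented for illustration.
module AnonCacheKeySketch
  # Each segment derives one value from the Rack env. Two requests that
  # differ in any segment must never share a cached response.
  @segments = {
    "scheme" => ->(env) { env["rack.url_scheme"].to_s },
    "xhr" => ->(env) { env["HTTP_X_REQUESTED_WITH"] == "XMLHttpRequest" ? "t" : "f" },
  }

  # Lets other code (for example a plugin) add its own segment.
  def self.register_segment(name, &block)
    @segments[name] = block
  end

  # Builds the key from the request URL parts plus every registered segment.
  def self.cache_key(env)
    parts = @segments.map { |name, fn| "#{name}=#{fn.call(env)}" }
    path = "#{env["HTTP_HOST"]}#{env["PATH_INFO"]}?#{env["QUERY_STRING"]}"
    "ANON_CACHE_#{Digest::SHA1.hexdigest(path + "|" + parts.join("|"))}"
  end
end

env = {
  "rack.url_scheme" => "https",
  "HTTP_HOST" => "forum.example.com",
  "PATH_INFO" => "/t/1",
  "QUERY_STRING" => "",
}
puts AnonCacheKeySketch.cache_key(env)
puts AnonCacheKeySketch.cache_key(env.merge("HTTP_X_REQUESTED_WITH" => "XMLHttpRequest"))
```

The two printed keys differ because the `xhr` segment differs. A plugin-style segment, such as the `is_ie` example mentioned in `532fea1460`, could be added in this sketch with `AnonCacheKeySketch.register_segment("is_ie") { |env| ... }`.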


62 Commits