Some sites have a large number of categories and fetching the category
IDs or category topic IDs just to build another query can take a long
time or resources (i.e. memory).
Why this change?
As part of our ongoing efforts to security harden the Discourse
application, we are adding the `cross_origin_opener_policy_header` site setting
which allows the `Cross-Origin-Opener-Policy` response header to be set on requests
that preloads the Discourse application. In more technical terms, only
GET requests that are not json or xhr will have the response header set.
The `cross_origin_opener_policy_header` site setting is hidden for now
for testing purposes and will either be released as a public site
setting or be remove if we decide to be opinionated and ship a default
for the `Cross-Origin-Opener-Policy` response header.
What is the problem here?
In multiple controllers, we are accepting a `limit` params but do not
impose any upper bound on the values being accepted. Without an upper
bound, we may be allowing arbituary users from generating DB queries
which may end up exhausing the resources on the server.
What is the fix here?
A new `fetch_limit_from_params` helper method is introduced in
`ApplicationController` that can be used by controller actions to safely
get the limit from the params as a default limit and maximum limit has
to be set. When an invalid limit params is encountered, the server will
respond with the 400 response code.
Some plugins call require_plugin with a wrong argument instead of the
plugin name. In a production environment that used to be a warning, but
there is no reason to keep it like that in a testing environment
because the issue will continue to be ignored.
]When changing fonts in the `/wizard/steps/styling` step of
the wizard, users would not see the font loaded straight away,
having to switch to another one then back to the original to
see the result. This is because we are using canvas to render
the style preview and this fails with a Chrome-based intervention
when font loading is taking too long:
> [Intervention] Slow network is detected. See
https://www.chromestatus.com/feature/5636954674692096 for more details.
Fallback font will be used while loading:
https://sea2.discourse-cdn.com/business7/fonts/Roboto-Bold.ttf?v=0.0.9
We can get around this by manually loading the fonts selected using
the FontFace JS API when the user selects them and before rerendering
the canvas. This just requires preloading more information about the
fonts if the user is admin so the wizard can query this data.
This corrects two issues:
1. We were double serializing topic tracking state (as_json calls were not cached)
2. We were inefficiently serializing items by instantiating extra objects
What does this change do?
This commit the client to override the navigation menu setting
configured by the site temporarily based on the value of the
`navigation_menu` query param. The new query param replaces the old
`enable_sidebar` query param.
Why do we need this change?
The motivation here is to allow theme maintainers to quickly preview
what the site will look like with the various navigation menu site
setting.
## Why do we need this change?
When loading the ember app, [MessageBus does not start polling immediately](f31f0b70f8/app/assets/javascripts/discourse/app/initializers/message-bus.js (L71-L81)) and instead waits for `document.readyState` to be `complete`. What this means is that if there are new messages being created while we have yet to start polling, those messages will not be received by the client.
With sidebar being the default navigation menu, the counts derived from `topic-tracking-state.js` on the client side is prominently displayed on every page. Therefore, we want to ensure that we are not dropping any messages on the channels that `topic-tracking-state.js` subscribes to.
## What does this change do?
This includes the `MessageBus.last_id`s for the MessageBus channels which `topic-tracking-state.js` subscribes to as part of the preloaded data when loading a page. The last ids are then used when we subscribe the MessageBus channels so that messages which are published before MessageBus starts polling will not be missed.
## Review Notes
1. See https://github.com/discourse/message_bus#client-support for documentation about subscribing from a given message id.
* Revert "Revert "FEATURE: Preload resources via link header (#18475)" (#18511)"
This reverts commit 95a57f7e0c.
* put behind feature flag
* env -> global setting
* declare global setting
* forgot one spot
Experiment moving from preload tags in the document head to preload information the the response headers.
While this is a minor improvement in most browsers (headers are parsed before the response body), this allows smart proxies like Cloudflare to "learn" from those headers and build HTTP 103 Early Hints for subsequent requests to the same URI, which will allow the user agent to download and parse our JS/CSS while we are waiting for the server to generate and stream the HTML response.
Co-authored-by: Penar Musaraj <pmusaraj@gmail.com>
- `no_custom` -> `no_themes` (history: before themes existed, we had a similar tool called 'customizations')
- `only_official` -> `no_unofficial_plugins` (matches format of `no_themes` and `no_plugins`, and makes it clear that this doesn't affect themes)
- `?safe_mode=no_themes%2C%no_plugins` -> `?safe_mode=no_themes,no_plugins` (the query portion of a URL does not require commas to be encoded. This is much nicer to read)
- If `no_plugins` is chosen from `/safe-mode` the URL generated will omit the superfluous `no_unofficial_plugins` flag
- Some tweaks to copy on `/safe-mode`
Some errors (e.g. InvalidAccess) are rendered with `include_ember: true`. Booting the ember app requires that the 'preload' data is rendered in the HTML.
If a particular route was configured to `skip_before_action :preload_json`, and then went on to raise an InvalidAccess error, then we'd attempt to render the Ember app without the preload json. This led to a blank screen and a client-side error.
This commit ensures that error pages will fallback to the no_ember view if there is no preload data. It also adds a sanity check in `discourse-bootstrap` so that it's easier for us to identify similar errors in future.
The title had to be added both on the 404 page generated by the server
side, displayed when the user reaches a bad page directly and the 404
page rendered by Ember when a user reaches a missing topic while
navigating the forum.
Discourse has the Discourse Connect Provider protocol that makes it possible to
use a Discourse instance as an identity provider for external sites. As a
natural extension to this protocol, this PR adds a new feature that makes it
possible to use Discourse as a 2FA provider as well as an identity provider.
The rationale for this change is that it's very difficult to implement 2FA
support in a website and if you have multiple websites that need to have 2FA,
it's unrealistic to build and maintain a separate 2FA implementation for each
one. But with this change, you can piggyback on Discourse to take care of all
the 2FA details for you for as many sites as you wish.
To use Discourse as a 2FA provider, you'll need to follow this guide:
https://meta.discourse.org/t/-/32974. It walks you through what you need to
implement on your end/site and how to configure your Discourse instance. Once
you're done, there is only one additional thing you need to do which is to
include `require_2fa=true` in the payload that you send to Discourse.
When Discourse sees `require_2fa=true`, it'll prompt the user to confirm their
2FA using whatever methods they've enabled (TOTP or security keys), and once
they confirm they'll be redirected back to the return URL you've configured and
the payload will contain `confirmed_2fa=true`. If the user has no 2FA methods
enabled however, the payload will not contain `confirmed_2fa`, but it will
contain `no_2fa_methods=true`.
You'll need to be careful to re-run all the security checks and ensure the user
can still access the resource on your site after they return from Discourse.
This is very important because there's nothing that guarantees the user that
will come back from Discourse after they confirm 2FA is the same user that
you've redirected to Discourse.
Internal ticket: t62183.
* FIX: Redirect if Discourse-Xhr-Redirect is present
`handleRedirect` was passed an wrong argument type (a string) instead of
a jqXHR object and missed the fields checked in condition, thus always
evaluating to `false`.
* FIX: Add `errors` field if group update confirmation
An explicit confirmation about the effect of the group update is
required if the default notification level changes. Previously, if the
confirmation was missing the API endpoint failed silently returning
a 200 response code and a `user_count` field. This change ensures that
a proper error code is returned (422), a descriptive error message and
the additional information in the `user_count` field.
This commit also refactors the API endpoint to use the
`Discourse-Xhr-Redirect` header to redirect the user if the group is
no longer visible.
The `blocked onebox domains` setting lets site owners change what sites
are allowed to be oneboxed. When a link is entered into a post,
Discourse checks the domain of the link against that setting and blocks
the onebox if the domain is blocked. But if there's a chain of
redirects, then only the final destination website is checked against
the site setting.
This commit amends that behavior so that every website in the redirect
chain is checked against the site setting, and if anything is blocked
the original link doesn't onebox at all in the post. The
`Discourse-No-Onebox` header is also checked in every response and the
onebox is blocked if the header is set to "1".
Additionally, Discourse will now include the `Discourse-No-Onebox`
header with every response if the site requires login to access content.
This is done to signal to a Discourse instance that it shouldn't attempt
to onebox other Discourse instances if they're login-only. Non-Discourse
websites can also use include that header if they don't wish to have
Discourse onebox their content.
Internal ticket: t59305.
2FA support in Discourse was added and grown gradually over the years: we first
added support for TOTP for logins, then we implemented backup codes, and last
but not least, security keys. 2FA usage was initially limited to logging in,
but it has been expanded and we now require 2FA for risky actions such as
adding a new admin to the site.
As a result of this gradual growth of the 2FA system, technical debt has
accumulated to the point where it has become difficult to require 2FA for more
actions. We now have 5 different 2FA UI implementations and each one has to
support all 3 2FA methods (TOTP, backup codes, and security keys) which makes
it difficult to maintain a consistent UX for these different implementations.
Moreover, there is a lot of repeated logic in the server-side code behind these
5 UI implementations which hinders maintainability even more.
This commit is the first step towards repaying the technical debt: it builds a
system that centralizes as much as possible of the 2FA server-side logic and
UI. The 2 main components of this system are:
1. A dedicated page for 2FA with support for all 3 methods.
2. A reusable server-side class that centralizes the 2FA logic (the
`SecondFactor::AuthManager` class).
From a top-level view, the 2FA flow in this new system looks like this:
1. User initiates an action that requires 2FA;
2. Server is aware that 2FA is required for this action, so it redirects the
user to the 2FA page if the user has a 2FA method, otherwise the action is
performed.
3. User submits the 2FA form on the page;
4. Server validates the 2FA and if it's successful, the action is performed and
the user is redirected to the previous page.
A more technically-detailed explanation/documentation of the new system is
available as a comment at the top of the `lib/second_factor/auth_manager.rb`
file. Please note that the details are not set in stone and will likely change
in the future, so please don't use the system in your plugins yet.
Since this is a new system that needs to be tested, we've decided to migrate
only the 2FA for adding a new admin to the new system at this time (in this
commit). Our plan is to gradually migrate the remaining 2FA implementations to
the new system.
For screenshots of the 2FA page, see PR #15377 on GitHub.
When redirecting to login, we store a destination_url cookie, which the user is then redirected to after login. We never want the user to be redirected to a JSON URL. Instead, we should return a 403 in these situations.
This should also be much less confusing for API consumers - a 403 is a better representation than a 302.
Under some conditions, these varied responses could lead to cache poisoning, hence the 'security' label.
Previously the Rails application would serve JSON data in place of HTML whenever Ember CLI requested an `application.html.erb`-rendered page. This commit removes that logic, and instead parses the HTML out of the standard response. This means that Rails doesn't need to customize its response for Ember CLI.
We have a couple of site setting, `slow_down_crawler_user_agents` and `slow_down_crawler_rate`, that are meant to allow site owners to signal to specific crawlers that they're crawling the site too aggressively and that they should slow down.
When a crawler is added to the `slow_down_crawler_user_agents` setting, Discourse currently adds a `Crawl-delay` directive for that crawler in `/robots.txt`. Unfortunately, many crawlers don't support the `Crawl-delay` directive in `/robots.txt` which leaves the site owners no options if a crawler is crawling the site too aggressively.
This PR replaces the `Crawl-delay` directive with proper rate limiting for crawlers added to the `slow_down_crawler_user_agents` list. On every request made by a non-logged in user, Discourse will check the User Agent string and if it contains one of the values of the `slow_down_crawler_user_agents` list, Discourse will only allow 1 request every N seconds for that User Agent (N is the value of the `slow_down_crawler_rate` setting) and the rest of requests made within the same interval will get a 429 response.
The `slow_down_crawler_user_agents` setting becomes quite dangerous with this PR since it could rate limit lots if not all of anonymous traffic if the setting is not used appropriately. So to protect against this scenario, we've added a couple of new validations to the setting when it's changed:
1) each value added to setting must 3 characters or longer
2) each value cannot be a substring of tokens found in popular browser User Agent. The current list of prohibited values is: apple, windows, linux, ubuntu, gecko, firefox, chrome, safari, applewebkit, webkit, mozilla, macintosh, khtml, intel, osx, os x, iphone, ipad and mac.
* FEATURE: Optionally send a 'noindex' header in non-canonical responses
This will be used in a SEO experiment.
Co-authored-by: David Taylor <david@taylorhq.com>
Currently, Discourse rate limits all incoming requests by the IP address they
originate from regardless of the user making the request. This can be
frustrating if there are multiple users using Discourse simultaneously while
sharing the same IP address (e.g. employees in an office).
This commit implements a new feature to make Discourse apply rate limits by
user id rather than IP address for users at or higher than the configured trust
level (1 is the default).
For example, let's say a Discourse instance is configured to allow 200 requests
per minute per IP address, and we have 10 users at trust level 4 using
Discourse simultaneously from the same IP address. Before this feature, the 10
users could only make a total of 200 requests per minute before they got rate
limited. But with the new feature, each user is allowed to make 200 requests
per minute because the rate limits are applied on user id rather than the IP
address.
The minimum trust level for applying user-id-based rate limits can be
configured by the `skip_per_ip_rate_limit_trust_level` global setting. The
default is 1, but it can be changed by either adding the
`DISCOURSE_SKIP_PER_IP_RATE_LIMIT_TRUST_LEVEL` environment variable with the
desired value to your `app.yml`, or changing the setting's value in the
`discourse.conf` file.
Requests made with API keys are still rate limited by IP address and the
relevant global settings that control API keys rate limits.
Before this commit, Discourse's auth cookie (`_t`) was simply a 32 characters
string that Discourse used to lookup the current user from the database and the
cookie contained no additional information about the user. However, we had to
change the cookie content in this commit so we could identify the user from the
cookie without making a database query before the rate limits logic and avoid
introducing a bottleneck on busy sites.
Besides the 32 characters auth token, the cookie now includes the user id,
trust level and the cookie's generation date, and we encrypt/sign the cookie to
prevent tampering.
Internal ticket number: t54739.
By default, Rails only includes the Vary:Accept header in responses when the Accept: header is included in the request. This means that proxies/browsers may cache a response to a request with a missing Accept header, and then later serve that cached version for a request which **does** supply the Accept header. This can lead to some very unexpected behavior in browsers.
This commit adds the Vary:Accept header for all requests, even if the Accept header is not present in the request. If a format parameter (e.g. `.json` suffix) is included in the path, then the Accept header is still omitted. (The format parameter takes precedence over any Accept: header, so the response is no longer varies based on the Accept header)
The exception page is shown before Ember can actually figure out what the final destination URL we're going to is.
This means that the new page is not present in the history stack, so if we attempt to use the history stack to go back, we will actually navigate back by two steps.
By instead forcing a navigation to the current URL, we achieve the goal of going "back" with no history mucking.
Unfortunately, the actual URL that was attempted is not available. Additionally, this only works for the on-screen back button and not the browser back.
Additionally, several modernizations of the exception page code were made.
Before this change, calling `StyleSheet::Manager.stylesheet_details`
for the first time resulted in multiple queries to the database. This is
because the code was modelled in a way where each `Theme` was loaded
from the database one at a time.
This PR restructures the code such that it allows us to load all the
theme records in a single query. It also allows us to eager load the
required associations upfront. In order to achieve this, I removed the
support of loading multiple themes per request. It was initially added
to support user selectable theme components but the feature was never
completed and abandoned because it wasn't a feature that we thought was
worth building.
The first thing we needed here was an enum rather than a boolean to determine how a directory_column was created. Now we have `automatic`, `user_field` and `plugin` directory columns.
This plugin API is assuming that the plugin has added a migration to a column to the `directory_items` table.
This was created to be initially used by discourse-solved. PR with API usage - https://github.com/discourse/discourse-solved/pull/137/
The `bootstrap.json` contains most preloaded information but some routes
provide extra information, such as invites.
This fixes the issue by having the preload request pass on the preloaded
data from the source page, which is then merged with the bootstrap's
preloaded data for the final HTML payload.
I merged this PR in yesterday, finally thinking this was done https://github.com/discourse/discourse/pull/12958 but then a wild performance regression occurred. These are the problem methods:
1aa20bd681/app/serializers/topic_tracking_state_serializer.rb (L13-L21)
Turns out date comparison is super expensive on the backend _as well as_ the frontend.
The fix was to just move the `treat_as_new_topic_start_date` into the SQL query rather than using the slower `UserOption#treat_as_new_topic_start_date` method in ruby. After this change, 1% of the total time is spent with the `created_in_new_period` comparison instead of ~20%.
----
History:
Original PR which had to be reverted **https://github.com/discourse/discourse/pull/12555**. See the description there for what this PR is achieving, plus below.
The issue with the original PR is addressed in 92ef54f402
If you went to the `x unread` link for a tag Chrome would freeze up and possibly crash, or eventually unfreeze after nearly 10 mins. Other routes for unread/new were similarly slow. From profiling the issue was the `sync` function of `topic-tracking-state.js`, which calls down to `isNew` which in turn calls `moment`, a change I had made in the PR above. The time it takes locally with ~1400 topics in the tracking state is 2.3 seconds.
To solve this issue, I have moved these calculations for "created in new period" and "unread not too old" into the tracking state serializer.
When I was looking at the profiler I also noticed this issue which was just compounding the problem. Every time we modify topic tracking state we recalculate the sidebar tracking/everything/tag counts. However this calls `forEachTracked` and `countTags` which can be quite expensive as they go through the whole tracking state (and were also calling the removed moment functions).
I added some logs and this was being called 30 times when navigating to a new /unread route because `sync` is being called from `build-topic-route` (one for each topic loaded due to pagination). So I just added a debounce here and it makes things even faster.
Finally, I changed topic tracking state to use a Map so our counts of the state keys is faster (Maps have .size whereas objects you have to do Object.keys(obj) which is O(n).)
<!-- NOTE: All pull requests should have tests (rspec in Ruby, qunit in JavaScript). If your code does not include test coverage, please include an explanation of why it was omitted. -->