This commit refactors the direct external upload routes (get presigned
put, complete external, create/abort/complete multipart) into a
helper which is then included in both BackupController and the
UploadController. This is done so UploadController doesn't need
strange backup logic added to it, and so each controller implementing
this helper can do their own validation/error handling nicely.
This is a follow up to e4350bb966
Currently, Discourse rate limits all incoming requests by the IP address they
originate from regardless of the user making the request. This can be
frustrating if there are multiple users using Discourse simultaneously while
sharing the same IP address (e.g. employees in an office).
This commit implements a new feature to make Discourse apply rate limits by
user id rather than IP address for users at or higher than the configured trust
level (1 is the default).
For example, let's say a Discourse instance is configured to allow 200 requests
per minute per IP address, and we have 10 users at trust level 4 using
Discourse simultaneously from the same IP address. Before this feature, the 10
users could only make a total of 200 requests per minute before they got rate
limited. But with the new feature, each user is allowed to make 200 requests
per minute because the rate limits are applied on user id rather than the IP
address.
The minimum trust level for applying user-id-based rate limits can be
configured by the `skip_per_ip_rate_limit_trust_level` global setting. The
default is 1, but it can be changed by either adding the
`DISCOURSE_SKIP_PER_IP_RATE_LIMIT_TRUST_LEVEL` environment variable with the
desired value to your `app.yml`, or changing the setting's value in the
`discourse.conf` file.
Requests made with API keys are still rate limited by IP address and the
relevant global settings that control API keys rate limits.
Before this commit, Discourse's auth cookie (`_t`) was simply a 32 characters
string that Discourse used to lookup the current user from the database and the
cookie contained no additional information about the user. However, we had to
change the cookie content in this commit so we could identify the user from the
cookie without making a database query before the rate limits logic and avoid
introducing a bottleneck on busy sites.
Besides the 32 characters auth token, the cookie now includes the user id,
trust level and the cookie's generation date, and we encrypt/sign the cookie to
prevent tampering.
Internal ticket number: t54739.
When 31035010af
was done it failed to take into account the case where the smtp_enabled
site setting was true, but the topic had no allowed groups / no
incoming email record, which caused errors for topics even with
nothing to do with group SMTP.
Uppy adds the file name as the "name" parameter in the
payload by default, which means that for things like the
emoji uploader which have a name param used by the controller,
that param will be passed as the file name. We already use
the existing file name if the name param is null, so this
commit just does further cleanup of the name param, removing
the extension if it is a filename so we don't end up with
emoji names like blah_png.
When there are multiple groups on a topic, we were selecting
the first from the topic allowed groups to act as the sender
email address when sending group SMTP replies via PostAlerter.
However, this was not ordered, and since there is no created_at
column on TopicAllowedGroup we cannot order this nicely, which
caused just a random group to be used (based on whatever postgres
decided it felt like that morning).
This commit changes the group used for SMTP sending to be the
group using the email_username of the to address of the first
incoming email for the topic, if there are more than one allowed
groups on the topic. Otherwise it just uses the only SMTP enabled
group.
Sometimes, a user may have a malformed email such as
`test@test.com<mailto:test@test.com` their email address,
and as a topic participant will be included as a CC email
when sending a GroupSmtpEmail. This causes the CC parsing to
fail and further down the line in Email::Sender the code
to check the CC addresses expects an array but gets a string
instead because of the parse failure.
Instead, we can just check if the CC addresses are valid
and drop them if they are not in the GroupSmtpEmail job.
This only affects multisite Discourse instances (where multiple forums are served from a single application server). The vast majority of self-hosted Discourse forums do not fall into this category.
On affected instances, this vulnerability could allow encrypted session cookies to be re-used between sites served by the same application instance.
Uppy and Resumable slice up their chunks differently, which causes a difference
in this algorithm. Let's take a 131.6MB file (137951695 bytes) with a 5MB (5242880 bytes)
chunk size. For resumable, there are 26 chunks, and uppy there are 27. This is
controlled by forceChunkSize in resumable which is false by default. The final
chunk size is 6879695 (chunk size + remainder) whereas in uppy it is 1636815 (just remainder).
This means that the current condition of uploaded_file_size + current_chunk_size >= total_size
is hit twice by uppy, because it uses a more correct number of chunks. This
can be solved for both uppy and resumable by checking the _previous_ chunk
number * chunk_size as the uploaded_file_size.
An example of what is happening before that change, using the current
chunk number to calculate uploaded_file_size.
chunk 26: resumable: uploaded_file_size (26 * 5242880) + current_chunk_size (6879695) = 143194575 >= total_size (137951695) ? YES
chunk 26: uppy: uploaded_file_size (26 * 5242880) + current_chunk_size (5242880) = 141557760 >= total_size (137951695) ? YES
chunk 27: uppy: uploaded_file_size (27 * 5242880) + current_chunk_size (1636815) = 143194575 >= total_size (137951695) ? YES
An example of what this looks like after the change, using the previous
chunk number to calculate uploaded_file_size:
chunk 26: resumable: uploaded_file_size (25 * 5242880) + current_chunk_size (6879695) = 137951695 >= total_size (137951695) ? YES
chunk 26: uppy: uploaded_file_size (25 * 5242880) + current_chunk_size (5242880) = 136314880 >= total_size (137951695) ? NO
chunk 27: uppy: uploaded_file_size (26 * 5242880) + current_chunk_size (1636815) = 137951695 >= total_size (137951695) ? YES
Same issue as 28b00dc6fc, the
Mocha::ExpectationError inherits from Exception instead
of StandardError so RspecErrorTracker does not show the
actual failed expectation in request specs, the status of
the response is just 500 with no further detail.
This commit adds the RailsMultisite middleware in test mode when Rails.configuration.multisite is true. This allows for much more realistic integration testing. The `multisite_spec.rb` file is rewritten to avoid needing to simulate a middleware stack.
This PR introduces a new `enable_experimental_backup_uploads` site setting (default false and hidden), which when enabled alongside `enable_direct_s3_uploads` will allow for direct S3 multipart uploads of backup .tar.gz files.
To make multipart external uploads work with both the S3BackupStore and the S3Store, I've had to move several methods out of S3Store and into S3Helper, including:
* presigned_url
* create_multipart
* abort_multipart
* complete_multipart
* presign_multipart_part
* list_multipart_parts
Then, S3Store and S3BackupStore either delegate directly to S3Helper or have their own special methods to call S3Helper for these methods. FileStore.temporary_upload_path has also removed its dependence on upload_path, and can now be used interchangeably between the stores. A similar change was made in the frontend as well, moving the multipart related JS code out of ComposerUppyUpload and into a mixin of its own, so it can also be used by UppyUploadMixin.
Some changes to ExternalUploadManager had to be made here as well. The backup direct uploads do not need an Upload record made for them in the database, so they can be moved to their final S3 resting place when completing the multipart upload.
This changeset is not perfect; it introduces some special cases in UploadController to handle backups that was previously in BackupController, because UploadController is where the multipart routes are located. A subsequent pull request will pull these routes into a module or some other sharing pattern, along with hooks, so the backup controller and the upload controller (and any future controllers that may need them) can include these routes in a nicer way.
We are only using list_multipart_parts right now in the
uploads controller for multipart uploads to check if the
upload exists; thus we don't need up to 1000 parts.
Also adding a note for future explorers that list_multipart_parts
only gets 1000 parts max, and adding params for max parts
and starting parts.
Support for invites alongside DiscourseConnect was added in 355d51af. This commit fixes the guardian method so that the bulk invite button functionality also works.
* FIX: Preserve field types when updating revision
When a post was edited quickly twice by the same user, the old post
revision was updated with the newest changes. To check if the change
was reverted (i.e. rename topic A to B and then back to A) a comparison
of the initial value and last value is performed. If the check passes
then the intermediary value is dismissed and only the initial value and
the last ones are preserved. Otherwise, the modification is dismissed
because the field returned to its initial value.
This used to work well for most fields, but failed for "tags" because
the field is an array and the values were transformed to strings to
perform the comparison.
* FIX: Reset last_editor_id if revision is reverted
If a post was revised and then the same revision was reverted,
last_editor_id was still set to the ID of the user who last edited the
post. This was a problem because the same person could then edit the
same post again and because it was the same user and same post, the
system attempted to update the last one (that did not exist anymore).
This commit introduces a new s3:ensure_cors_rules rake task
that is run as a prerequisite to s3:upload_assets. This rake
task calls out to the S3CorsRulesets class to ensure that
the 3 relevant sets of CORS rules are applied, depending on
site settings:
* assets
* direct S3 backups
* direct S3 uploads
This works for both Global S3 settings and Database S3 settings
(the latter set directly via SiteSetting).
As it is, only one rule can be applied, which is generally
the assets rule as it is called first. This commit changes
the ensure_cors! method to be able to apply new rules as
well as the existing ones.
This commit also slightly changes the existing rules to cover
direct S3 uploads via uppy, especially multipart, which requires
some more headers.
FinalDestination's follow_canonical mode used for embedded topics should work when canonical URLs are relative, as specified in [RFC 6596](https://datatracker.ietf.org/doc/html/rfc6596)
Previously, suppressed category topics are included in the digest emails if the user visited that topic before and the `TopicUser` record is created with any notification level except 'muted'.
We are no longer able to display the image returned by Instagram directly within a Discourse site (either in the composer, or within a cooked post within a topic), so:
- Display an image placeholder in the composer preview
- A cooked post should use an iframe to display the Instagram 'embed' content
* DEV: Output webmock errors in request specs
In request specs, if you had not properly mocked an external
HTTP call, you would end up with a 500 error with no further
information instead of your expected response code, with an
rspec output like this:
```
Failures:
1) UploadsController#generate_presigned_put when the store is external generates a presigned URL and creates an external upload stub
Failure/Error: expect(response.status).to eq(200)
expected: 200
got: 500
(compared using ==)
# ./spec/requests/uploads_controller_spec.rb:727:in `block (4 levels) in <top (required)>'
# ./spec/rails_helper.rb:280:in `block (2 levels) in <top (required)>'
```
This is not helpful at all when you want to find what you actually
failed to mock, which is shown straight away in non-request specs.
This commit introduces a rescue_from block in the application
controller to log this error, so we have a much nicer output that
helps the developer find the issue:
```
Failures:
1) UploadsController#generate_presigned_put when the store is external generates a presigned URL and creates an external upload stub
Failure/Error: expect(response.status).to eq(200)
expected: 200
got: 500
(compared using ==)
# ./spec/requests/uploads_controller_spec.rb:727:in `block (4 levels) in <top (required)>'
# ./spec/rails_helper.rb:280:in `block (2 levels) in <top (required)>'
# ------------------
# --- Caused by: ---
# WebMock::NetConnectNotAllowedError:
# Real HTTP connections are disabled. Unregistered request: GET https://s3-upload-bucket.s3.us-west-1.amazonaws.com/?cors with headers {'Accept'=>'*/*', 'Accept-Encoding'=>'', 'Authorization'=>'AWS4-HMAC-SHA256 Credential=some key/20211101/us-west-1/s3/aws4_request, SignedHeaders=host;user-agent;x-amz-content-sha256;x-amz-date, Signature=test', 'Host'=>'s3-upload-bucket.s3.us-west-1.amazonaws.com', 'User-Agent'=>'aws-sdk-ruby3/3.121.2 ruby/2.7.1 x86_64-linux aws-sdk-s3/1.96.1', 'X-Amz-Content-Sha256'=>'test', 'X-Amz-Date'=>'20211101T035113Z'}
#
# You can stub this request with the following snippet:
#
# stub_request(:get, "https://s3-upload-bucket.s3.us-west-1.amazonaws.com/?cors").
# with(
# headers: {
# 'Accept'=>'*/*',
# 'Accept-Encoding'=>'',
# 'Authorization'=>'AWS4-HMAC-SHA256 Credential=some key/20211101/us-west-1/s3/aws4_request, SignedHeaders=host;user-agent;x-amz-content-sha256;x-amz-date, Signature=test',
# 'Host'=>'s3-upload-bucket.s3.us-west-1.amazonaws.com',
# 'User-Agent'=>'aws-sdk-ruby3/3.121.2 ruby/2.7.1 x86_64-linux aws-sdk-s3/1.96.1',
# 'X-Amz-Content-Sha256'=>'test',
# 'X-Amz-Date'=>'20211101T035113Z'
# }).
# to_return(status: 200, body: "", headers: {})
#
# registered request stubs:
#
# stub_request(:head, "https://s3-upload-bucket.s3.us-west-1.amazonaws.com/")
#
# ============================================================
```
* DEV: Require webmock in application controller if rails.env.test
* DEV: Rescue from StandardError and NetConnectNotAllowedError
The `白名单` term becomes `名单 白名单` after it is processed by
cppjieba in :query mode. However, `白名单` is not tokenized as such by cppjieba when it
appears in a string of text. Therefore, this may lead to failed matches as
the search data generated while indexing may not contain all of the
terms generated by :query mode. We've decided to maintain parity for now
such that both indexing and querying uses the same :mix mode. This may
lead to less accurate search but our plan is to properly support CJK
search in the future.
In preparation for adding automatic CORS rules creation
for direct S3 uploads, I am adding tests here and moving the
CORS rule definitions into a dedicated class so they are all
in the one place.
There is a problem with ensure_cors! as well -- if there is
already a CORS rule defined (presumably the asset one) then
we do nothing and do not apply the new rule. This means that
the S3BackupStore.ensure_cors method does nothing right now
if the assets rule is already defined, and it will mean the
same for any direct S3 upload rules I add for uppy. We need
to be able to add more rules, not just one.
This is not a problem on our hosting because we define the
rules at an infra level.
* FIX: allowed_theme_ids should not be persisted in GlobalSettings
It was observed that the memoized value of `GlobalSetting.allowed_theme_ids` would be persisted across requests, which could lead to unpredictable/undesired behaviours in a multisite environment.
This change moves that logic out of GlobalSettings so that the returned theme IDs are correct for the current site.
Uses get_set_cache, which ultimately uses DistributedCache, which will take care of multisite issues for us.
* DEV: Sanitize HTML admin inputs
This PR adds on-save HTML sanitization for:
Client site settings
translation overrides
badges descriptions
user fields descriptions
I used Rails's SafeListSanitizer, which [accepts the following HTML tags and attributes](018cf54073/lib/rails/html/sanitizer.rb (L108))
* Make sure that the sanitization logic doesn't corrupt settings with special characters
This PR doesn't change any behavior, but just removes code that wasn't in use. This is a pretty dangerous place to change, since it gets called during user's registration. At the same time the refactoring is very straightforward, it's clear that this code wasn't doing any work (it still needs to be double-checked during review though). Also, the test coverage of UserNameSuggester is good.
By default, Rails only includes the Vary:Accept header in responses when the Accept: header is included in the request. This means that proxies/browsers may cache a response to a request with a missing Accept header, and then later serve that cached version for a request which **does** supply the Accept header. This can lead to some very unexpected behavior in browsers.
This commit adds the Vary:Accept header for all requests, even if the Accept header is not present in the request. If a format parameter (e.g. `.json` suffix) is included in the path, then the Accept header is still omitted. (The format parameter takes precedence over any Accept: header, so the response is no longer varies based on the Accept header)
Not reseting the registry could lead to assets still being registered for example.
This flakky spec was reprdocible with this call: `bundle exec rspec --seed 9472 spec/components/discourse_plugin_registry_spec.rb spec/components/svg_sprite/svg_sprite_spec.rb`
Which would trigger the following error:
```
Failures:
1) DiscoursePluginRegistry#register_asset registers vendored_core_pretty_text properly
Failure/Error: expect(registry.javascripts.count).to eq(0)
expected: 0
got: 1
(compared using ==)
# ./spec/components/discourse_plugin_registry_spec.rb:248:in `block (3 levels) in <top (required)>'
# ./spec/rails_helper.rb:280:in `block (2 levels) in <top (required)>'
# /Users/joffreyjaffeux/.gem/ruby/2.7.3/gems/webmock-3.14.0/lib/webmock/rspec.rb:37:in `block (2 levels) in <top (required)>'
```
When inviting a group to a topic, there may be members of
the group already in the topic as topic allowed users. These
can be safely removed from the topic, because they are implicitly
allowed in the topic based on their group membership.
Also, this prevents issues with group SMTP emails, which rely
on the topic_allowed_users of the topic to send to and cc's
for emails, and if there are members of the group as topic_allowed_users
then that complicates things and causes odd behaviour.
We also ensure that the OP of the topic is not removed from
the topic_allowed_users when a group they belong to is added,
as it will make it harder to add them back later.
User API keys (not the same thing as admin API keys) are currently
leaked to redis when rate limits are applied to them since redis is the
backend for rate limits in Discourse and the API keys are included in
the redis keys that are used to track usage of user API keys in the last
24 hours.
This commit stops the leak by using a SHA-256 representation of the user
API key instead of the key itself to form the redis key.
We don't need to manually delete the existing redis keys that contain
unhashed user API keys because they're not long-lived and will be
automatically deleted within 48 hours after this commit is deployed to
your Discourse instance.
This is a follow-up to https://github.com/discourse/discourse/pull/14541. This adds a hidden setting for restoring the old behavior for those users who rely on it. We'll likely deprecate this setting at some point in the future.
This commit removes the recipient's username from the
respond to / participants list that is shown at the bottom
of user notification emails. For example if the recipient's
username was jsmith, and there were participants ljones and
bmiller, we currently show this:
> "reply to this email to respond to jsmith, ljones, bmiller"
or
> "Participants: jsmith, ljones, bmiller"
However this is a bit redundant, as you are not replying to
yourself here if you are the recipient user. So we omit the
recipient user's username from this list, which is only used
in the text of the email and not elsewhere.
The method was only used for mega topics but it was redundant as the
first post can be determined from using the condition where
`Post#post_number` equal to one.
* FEATURE: Cache CORS preflight requests for 2h
Browsers will cache this for 5 seconds by default. If using MessageBus
in a different domain, Discourse will issue a new long polling, by
default, every 30s or so. This means we would be issuing a new preflight
request **every time**. This can be incredibly wasteful, so let's cache
the authorization in the client for 2h, which is the maximum Chromium
allows us as of today.
* fix tests
`/u/username/invited.json?filter=expired` and `/u/username/invited.json?filter=pending` APIs are already returning data to admins. However, the `can_see_invite_details?` boolean was false, which prevented the Ember frontend from showing the tabs correctly. This commit updates the guardian method to match reality.
* FIX: do not display add to calendar for past dates
There is no value in saving past dates into calendar
* FIX: remove postId and move ICS to frontend
PostId is not necessary and will make the solution more generic for dates which doesn't belong to a specific post.
Also, ICS file can be generated in JavaScript to avoid calling backend.
Sometimes administrators want to permanently delete posts and topics
from the database. To make sure that this is done for a good reasons,
administrators can do this only after one minute has passed since the
post was deleted or immediately if another administrator does it.
We don't want to be using emails as source for username and name suggestions in cases when it's possible that a user have no chance to intervene and correct a suggested username. It risks exposing email addresses.
Previosuly, quotes from original topics are rendered incorrectly since the moved posts are not rebaked.
Co-authored-by: Alan Guo Xiang Tan <gxtan1990@gmail.com>
* FIX: Remove List-Post email header
This header is used for mailing lists and can confuse some email clients
such as Thunderbird to display wrong replying options.
* FIX: Replace reply_key in email custom headers
Admins can add custom email headers from site settings. Email sender
will try to replace the reply key if %{reply_key} exists or remove the
header if a reply key does not exist.
Calling create_notification_alert could still send a notification to a
suspended user. This just moves the check if user is suspended right
before sending the notification.
Invite is used in two contexts, when inviting a new user to the forum
and when inviting an existent user to a topic. The first case is more
complex and it involves permission checks to ensure that new users can
be created. In the second case, it is enough to ensure that the topic
is visible for both users and that all preconditions are met.
One edge case is the invite to topic via email functionality which
checks for both conditions because first the user must be invited to
create an account first and then to the topic.
A side effect of these changes is that all site settings related to
invites refer to inviting new users only now.
We aren't translating these settings, so it makes more sense to move them into the code. I added an instance method so plugins can add mappings for custom reasons.
- Allow the `/presence/get` endpoint to return multiple channels in a single request (limited to 50)
- When multiple presence channels are initialized in a single Ember runloop, batch them into a single GET request
- Introduce the `presence-pretender` to allow easy testing of PresenceChannel-related features
- Introduce a `use_cache` boolean (default true) on the the server-side PresenceChannel initializer. Useful during testing.
Change the type for default_calendar to a string.
The type specified for the default calendar in the api docs wasn't a
valid type. The linting in the api docs repo reports:
```
`type` can be one of the following only: "object", "array", "string", "number", "integer", "boolean", "null".
```
This linting currently is only in the `discourse_api_docs` repo.
* DEV: Add include_subcategories param to api docs
Adding the `include_subcategories=true` query param to the
`/categories.json` api docs.
Follow up to: fe676f334a
* fix spec
This commit fixes the `eval` and `evalsha` commands/methods and any other methods that don't have a wrapper in `DiscourseRedis` and expect keyword arguments. I noticed this problem in Logster when I was trying to fetch some log messages in JSON format using the rails console and saw the messages were missing the `env` field. Logster uses the `eval` command to fetch messages `env`s:
dc351fd00f/lib/logster/redis_store.rb (L250-L253)
and that code was not fetching anything because `DiscourseRedis` didn't pass the `keys` keyword arg to the redis gem.
It allows saving local date to calendar.
Modal is giving option to pick between ics and google. User choice can be remembered as a default for the next actions.
* FEATURE: Return subcategories on categories endpoint
When using the API subcategories will now be returned nested inside of
each category response under the `subcategory_list` param. We already
return all the subcategory ids under the `subcategory_ids` param, but
you then would have to make multiple separate API calls to fetch each of
those subcategories. This way you can get **ALL** of the categories
along with their subcategories in a single API response.
The UI will not be affected by this change because you need to pass in
the `include_subcategories=true` param in order for subcategories to be
returned.
In a follow up PR I'll add the API scoping for fetching categories so
that a readonly API key can be used for the `/categories.json` endpoint. This
endpoint should be used instead of the `/site.json` endpoint for
fetching a sites categories and subcategories.
* Update PR based on feedback
- Have spec check for specific subcategory
- Move comparison check out of loop
- Only populate subcategory list if option present
- Remove empty array initialization
- Update api spec to allow null response
* More PR updates based on feedback
- Use a category serializer for the subcategory_list
- Don't include the subcategory_list param if empty
- For the spec check for the subcategory by id
- Fix spec to account for param not present when empty
Usually, when an email is received a user lookup is performed using the
email address found in the `From` header. When an email has an
`X-Original-From` header, if it is equal to `Reply-To` then it uses that
one instead. The comparison was sensitive to whitespaces and other
insignificant characters such as quotes because it reconstructed the
`From` header.
For the fixture added in this commit, it compared the reconstructed
`From` header `John Doe <johndoe@example.com>` with the `Reply-To`
header `"John Doe" <johndoe@example.com>`.
We relied on backticks to identify and replace site setting names with links. Unfortunately, some translations don't follow this convention, breaking this feature.
Additionally, this lets us linkify `category settings` and `watched words` without using HTML in the translations.
You may notice that I split the texts we want to linkify into two groups. I did this on purpose to emphasize those that should be translated (regular_links) from those who don't (site_settings_link). If you can think of a better solution, I'm open to suggestions.
* DEV: Remove HTML setting type and sanitization logic.
We concluded that we don't want settings to contain HTML, so I'm removing the setting type and sanitization logic. Additionally, we no longer allow the global-notice text to contain HTML.
I searched for usages of this setting type in the `all-the-plugins` repo and found none, so I haven't added a migration for existing settings.
* Mark Global notices containing links as HTML Safe.
FinalDestination now supports the `follow_canonical` option, which will perform an initial GET request, parse the canonical link if present, and perform a HEAD request to it.
We use this mode during embeds to avoid treating URLs with different query parameters as different topics.
Previously, while retrieving each upload urls in a post S3 CDN urls with different path in prefix (external urls technically) are considered as uploaded url. It created issue while checking missing uploads.
This commit resolves refactors can_invite_to? to use
can_invite_to_forum? for checking the site-wide permissions and then
perform topic specific checkups.
Similarly, can_invite_to? is always used with a topic object and this is
now enforced.
There was another problem before when `must_approve_users` site setting
was not checked when inviting users to forum, but was checked when
inviting to a topic.
Another minor security issue was that group owners could invite to
group topics even if they did not have the minimum trust level to do
it.