This commit moves the logic for crawler rate limits out of the application controller and into the request tracker middleware, so that rate limits apply to all crawler requests instead of just the ones that make it to the application controller. Some requests are served early from the middleware stack without reaching the Rails app for performance reasons (e.g. `AnonymousCache`), which results in crawlers getting 200 responses even though they've hit their limits and should be getting 429 responses.
Internal topic: t/128810.
This commit updates `S3Inventory#files` to ignore S3 inventory files
whose `last_modified` timestamp is not at least 2 days older than the
`BackupMetadata.last_restore_date` timestamp.
This check was previously only in `Jobs::EnsureS3UploadsExistence` but
`S3Inventory` can also be used via Rake tasks so this protection needs
to be in `S3Inventory` and not in the scheduled job.
* DEV: allow reply_by_email, visit_link_to_respond strings to be modified by plugins
* DEV: separate visit_link_to_respond and reply_by_email modifiers out
This PR adds bulk actions to the user's bookmarks.
With this feature, all users can select multiple bookmarks and either delete them or clear their reminders.
This fixes `PrettyText.make_all_links_absolute` to better handle subfolder installs.
In a subfolder install, when given the cooked version of a post, links to mentions already include the `Discourse.base_path` prefix. Prepending `Discourse.base_url` was therefore doubling the `Discourse.base_path`.
The issue was hidden by the specs, which were stubbing `Discourse.base_url` instead of relying on `Discourse.base_path`.
This fixes both the "algorithm" used in `PrettyText.make_all_links_absolute` to better handle this case and the specs, so they properly cover subfolder setups.
There are lots of changes in the specs due to a refactoring to use squiggly heredoc strings for easier reading and less escaping.
AuthProvider#enabled_setting=, used primarily by plugins, has been deprecated since version 2.9, in favour of Authenticator#enabled?. This PR confirms we are seeing no more usage and removes the method.
Whenever one creates, updates, or deletes a post, we should keep the `topic.word_count` counter in sync.
Context - https://meta.discourse.org/t/-/308062
This commit updates `FinalDestination#get` to not forward the
`Authorization` header on redirects, since most HTTP clients I tested,
like curl and wget, do not forward it.
This also fixes a recent problem in `DiscourseIpInfo.mmdb_download`
where we would fail to download the databases when both `GlobalSetting.maxmind_account_id` and
`GlobalSetting.maxmind_license_key` are set. The failure is due to
the bug above, as the redirected URL given by MaxMind does not accept
an `Authorization` header.
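For illustration, a minimal sketch of the redirect behaviour described above (assumed variable names, not the actual `FinalDestination` code):

```ruby
# Drop the Authorization header before following a redirect, mirroring what
# curl and wget do. `headers` is a plain Hash of request headers.
def headers_for_redirect(headers)
  headers.reject { |name, _| name.casecmp?("Authorization") }
end
```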
This reverts commit 0f4520867b.
This has led to two problems:
1. An incompatibility with Cloudflare's "auto minify" feature. They've deprecated this feature because of incompatibility with modern JS syntax. But unfortunately it will remain enabled on existing properties until 2024-08-05.
2. Discourse fails to boot in Safari 15. This is strange, because Safari does support all the required features in our production JS bundles. Even more strangely, things start working as soon as you open the developer tools. That suggests the cause could be a Safari bug rather than a simple incompatibility.
Reverting while we work out a path forward on both those issues.
In 95a82d608d, we lowered the default for
`Onebox.options.max_download_kb` from 10mb to 2mb for security hardening
purposes. However, this resulted in multiple bug reports where seemingly
normal URLs stopped being oneboxed. It turns out that lowering
`Onebox.options.max_download_kb` resulted in `Onebox::Helpers::DownloadTooLarge` being raised
more often in `Onebox::Helpers.fetch_response`, which
`Onebox::Helpers.fetch_html_doc` relies on. When
`Onebox::Helpers::DownloadTooLarge` is raised in
`Onebox::Helpers.fetch_response`, we throw away whatever response body
we have already downloaded at that point. This is not ideal
because Nokogiri can parse incomplete HTML documents, and there is a
really high chance that the incomplete HTML document still contains the
information we need for oneboxing.
Therefore, this commit updates `Onebox::Helpers.fetch_html_doc` to not
throw away the response body when its size exceeds
`Onebox.options.max_download_size`. Instead, we just take whatever
partial response we have and get Nokogiri to parse it.
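A minimal sketch of the idea (assuming a hypothetical chunked download helper; this is not the actual Onebox code):

```ruby
require "nokogiri"

# Keep whatever body was downloaded before the size limit was hit, and let
# Nokogiri parse the truncated document instead of discarding it.
def fetch_html_doc_sketch(url, max_bytes)
  body = +""
  begin
    download_in_chunks(url) do |chunk| # hypothetical streaming helper
      body << chunk
      raise Onebox::Helpers::DownloadTooLarge if body.bytesize > max_bytes
    end
  rescue Onebox::Helpers::DownloadTooLarge
    # fall through with the partial body instead of giving up
  end
  Nokogiri::HTML(body) # Nokogiri happily parses incomplete HTML
end
```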
This can happen for various reasons, including rate limiting and middleware bugs. This should resolve the warning we're seeing in the logs:
```
RequestTracker.get_data failed : NoMethodError : undefined method `[]' for nil:NilClass
```
decorator-transforms (https://github.com/ef4/decorator-transforms) is a modern replacement for babel's plugin-proposal-decorators. It provides a decorator implementation using modern browser features, without needing to enable babel's full suite of class feature transformations. This improves the developer experience and performance.
In local testing with Google's 'tachometer' tool, this reduces Discourse's 'init-to-render' time by around 3-4% (230ms -> 222ms).
It reduces our initial gzip'd JS payloads by 3.2% (2.43MB -> 2.35MB), or 7.5% (14.5MB -> 13.4MB) uncompressed.
This keeps coming up in user testing as something
we want to get rid of. The `navigation_menu` setting
has been set to sidebar by default for some time now,
and we are rolling out admin sidebar widely. It just
doesn't make sense to let people turn this off in
the first step of the wizard -- we _want_ people to
use the sidebar.
Whenever a post has already failed "lightweight" validations, we skip all the expensive validations (those that cook the post or run SQL queries) so that we reply as soon as possible.
Also skip validating polls when there's no "[/poll]" in the raw.
Internal ref - t/115890
This spec has been failing forever on my machine. I guess I have a "better" version of pngquant?
---------
Co-authored-by: Jarek Radosz <jradosz@gmail.com>
This commit adds an `isValidUrl` helper function to the context in
which theme migrations are run. This helper function makes it
easier for theme developers to check if a string is a valid URL or path
when writing theme migrations. This can be helpful when
migrating a string based setting to `type: objects` which contains `type:
string` properties with URL validations enabled.
This commit also introduces the `UrlHelper.is_valid_url?` method,
which checks that the URL string is actually of a valid format, instead of
only checking that the URL string is parseable, which is what `UrlHelper.relaxed_parse` does
and is not sufficient for our needs.
This is a follow up of 5fcb7c262d
It was missing the case where secure uploads is enabled, which creates a copy of the upload no matter what.
So this also checks the `original_sha1` of the uploads when checking for duplicates.
Our 'page_view_crawler' / 'page_view_anon' metrics are based purely on the User Agent sent by clients. This means that 'badly behaved' bots which are imitating real user agents are counted towards 'anon' page views.
This commit introduces a new method of tracking visitors. When an initial HTML request is made, we assume it is a 'non-browser' request (i.e. a bot). Then, once the JS application has booted, we notify the server to count it as a 'browser' request. This reliance on a JavaScript-capable browser matches up more closely to dedicated analytics systems like Google Analytics.
Existing data collection and graphs are unchanged. Data collected via the new technique is available in a new 'experimental' report.
LinkedIn has grandfathered its old OAuth2 provider. This can only be used by existing apps. New apps have to use the new OIDC provider.
This PR adds a linkedin_oidc provider to core. This will exist alongside the discourse-linkedin-auth plugin, which will be kept for those still using the deprecated provider.
For e-mails, secure uploads redacts all secure images, and later uses the access control post to re-attach allowed ones. We pass the ID of this post through the X-Discourse-Post-Id header. As the name suggests, this assumes there's only ever one access control post. This is not true for activity summary e-mails, as they summarize across posts.
This adds a new header, X-Discourse-Post-Ids, which is used the same way as the old header, but also works for the case where an e-mail is associated with multiple posts.
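The exact wire format isn't spelled out here, but a comma-separated list would look something like this (illustrative sketch only, with an assumed format):

```ruby
# Sending side: serialize the access control post ids into the new header.
headers["X-Discourse-Post-Ids"] = post_ids.join(",")

# Receiving side: parse them back out (falls back to an empty list).
post_ids = headers["X-Discourse-Post-Ids"].to_s.split(",").map(&:to_i)
```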
Automatically add `moderators` and `admins` auto groups to specific site settings.
In the new group-based permissions system, we just want to check the user's groups, since that more accurately reflects reality.
Affected settings:
- tag_topic_allowed_groups
- create_tag_allowed_groups
- send_email_messages_allowed_groups
- personal_message_enabled_groups
- here_mention_allowed_groups
- approve_unless_allowed_groups
- approve_new_topics_unless_allowed_groups
- skip_review_media_groups
- email_in_allowed_groups
- create_topic_allowed_groups
- edit_wiki_post_allowed_groups
- edit_post_allowed_groups
- self_wiki_allowed_groups
- flag_post_allowed_groups
- post_links_allowed_groups
- embedded_media_post_allowed_groups
- profile_background_allowed_groups
- user_card_background_allowed_groups
- invite_allowed_groups
- ignore_allowed_groups
- user_api_key_allowed_groups
This commit addresses an issue for sites where secure_uploads
is turned on after the site has been operating without it for
some time.
When uploads are linked as they are used inside a post,
we were unconditionally setting the access_control_post_id
to that post's ID whenever it was NULL and secure_uploads was enabled.
However, this causes issues if an upload has been used in a
few different places, especially if the upload was previously
used in a PM and marked secure: we end up with a case of
the upload using a public post for its access control, which
causes URLs to not use the /secure-uploads/ path in the post,
breaking things like image uploads.
We should only set the access_control_post_id if this post is the first place the
upload is referenced, so it cannot hijack uploads from other places.
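A rough sketch of the intended rule (using assumed model and column names; not the exact Discourse code):

```ruby
# Only claim the upload for access control if this post is the first place
# the upload is referenced.
if SiteSetting.secure_uploads? && upload.access_control_post_id.nil?
  referenced_elsewhere =
    UploadReference
      .where(upload_id: upload.id, target_type: "Post")
      .where.not(target_id: post.id)
      .exists?

  upload.update!(access_control_post_id: post.id) unless referenced_elsewhere
end
```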
This is to enable :array type attributes for Contract
attributes in services. This is a followup to the move
of services from chat to core here:
cab178a405
Co-authored-by: Joffrey JAFFEUX <j.jaffeux@gmail.com>
This commit changes enum typed theme objects properties to be optional.
Previously, an enum typed property was always required, but we have found
that this might not be ideal, so we want to change it.
This method name is a bit confusing; with_secure_uploads implies
it may take a block or return something with the uploads of the post,
and has_secure_uploads implies that it's checking whether the post
is linked to any secure uploads.
should_secure_uploads? communicates the true intent of this method --
which is to say whether uploads attached to this post should be
secure or not.
We will be collecting the logo URL and the site's default locale, along with the existing basic details, to display the site on the Discourse Discover listing page. The site will be included only if it has opted in by enabling the `include_in_discourse_discover` site setting.
Also, we are no longer going to use the `about.json` and `site/statistics.json` endpoints to retrieve this data. We will be using only the `site/basic-info.json` endpoint.
Why this change?
For a schema like this:
```
schema = {
name: "section",
properties: {
category_property: {
type: "categories",
required: true,
},
},
}
```
When the value of the property is set to an empty array, we are not
raising an error, but we should be, because the property is marked as
required.
Why this change?
This is a follow-up to 86b2e3a.
Basically, we want to allow people to select more than 1 group as well.
What does this change do?
1. Change `type: group` to `type: groups` and support `min` and `max`
validations for `type: groups`.
2. Fix the `<SchemaThemeSetting::Types::Groups>` component to support the
`min` and `max` validations and switch it to use the `<GroupChooser>` component
instead of the `<ComboBoxComponent>` component which previously only supported
selecting a single group.
The `TopicCreator` class has a `skip_validations` option that can force-create a topic without performing permission checks or validation rules. However, at the moment it doesn't skip validations that are related to tags, so topics that are created by the system or by some script can still fail if they use tags. This commit makes the `TopicCreator` class skip all tag-related checks when `skip_validations` is specified.
Internal topic: t/124280.
Why this change?
This is a follow-up to 86b2e3aa3e.
Basically, we want to allow people to select more than 1 category as well.
What does this change do?
1. Change `type: category` to `type: categories` and support `min` and `max`
validations for `type: categories`.
2. Fix the `<SchemaThemeSetting::Types::Categories>` component to support the
`min` and `max` validations and switch it to use the `<CategorySelector>` component
instead of the `<CategoryChooser>` component which only supports selecting one category.
At the moment, all topic `?page=` views are served with exactly identical page titles. If you search for something which is mentioned many times in the same Discourse topic, this makes for some very hard-to-understand search results! All the result titles are exactly the same, with no indication of why there are multiple results showing.
This commit adds a `- Page #` suffix to the titles in this situation. This lines up with our existing strategy for topic-list pagination.
Why this change?
While working on the tag selector for the theme object editor, I
realised that it is very likely that users will want to select
more than one tag. By supporting the selection of more than one
tag, we also get support for a single tag for free.
What does this change do?
1. Change `type: tag` to `type: tags` and support `min` and `max`
validations for `type: tags`.
2. Fix the `<SchemaThemeSetting::Types::Tags>` component to support the
`min` and `max` validations.
Why this change?
Fortunately or unfortunately, in Discourse core we mainly use `Tag#name`
to look up tags, not their ids. This assumption is built into the
frontend as well, so we need to use the tag's name instead of the id
here.
Followup to 3094f32ff5,
this fixes an issue with the logic in that commit where
we were returning false if any of the conditionals
were false, regardless of the type of `obj`. We should
have only done this if `obj` was a `PostAction`, which led
us to return false in cases where we were checking if the
user could edit their own post as anon.
This commit changes the API for registering the plugin config
page nav configuration from a server-side API to a JS one;
there is no need for it to be server-side.
It also makes some changes to allow for 2 different ways of displaying
navigation for plugin pages, depending on complexity:
* TOP - This is the best mode for simple plugins without a lot of different
custom configuration pages, and it reuses the grey horizontal nav bar
already used for admins.
* SIDEBAR - This is better for more complex plugins; likely this won't
be used in the near future, but it's readily available if needed
There is also a new AdminPluginConfigNavManager service to manage which
plugin the admin is actively viewing; otherwise we would have trouble
hiding the main plugin nav for admins when viewing a single plugin.
We were incorrectly using `return` in a block, which was causing exceptions at runtime. These exceptions were not causing many issues as they occur in a deferred block.
While writing a test for this specific case, I noticed that our `upsert_custom_fields` function was using Rails' `update_all`, which does not update the `updated_at` timestamp. This commit also fixes that and adds a test for it.
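For reference, a minimal illustration of the underlying Ruby behaviour (not the Discourse code itself): `return` inside a proc-style block has no enclosing method to return from once the block is invoked later, whereas `next` exits just the block:

```ruby
callback = proc do |value|
  next 0 if value.nil?     # safe: exits only this block invocation
  # return 0 if value.nil? # would raise LocalJumpError (no enclosing method to return from)
  value * 2
end

callback.call(3)   # => 6
callback.call(nil) # => 0
```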
We never use that information, and this also fixes an issue with the BCC plugin, which ends up triggering a rate limit because we were publishing a "NEW_PRIVATE_MESSAGE" to the user sending the BCC for every recipient 💥
Internal - t/118283
This commit moves the generation of category background CSS from the
server side to the client side. This simplifies the server side code
because it does not need to check which categories are visible to the
current user.
Why this change?
Previously, we identified that ActiveRecord's PostgreSQL adapter
executes 3 db queries each time a new connection is created. The 3 db
queries were identified when we looked at the `pg_stat_statements` table
on one of our multisite production clusters. At that time, the hypothesis
was that because we were aggressively reaping and creating connections,
the db queries executed each time a connection is created were wasting
resources on our database servers. However, we didn't see the needle
move much on our servers after deploying the patch, so we have decided to
drop this patch as it makes it harder for us to upgrade ActiveRecord in
the future.
This commit adds new plugin show routes (`/admin/plugins/:plugin_id`) as we move
towards every plugin having a consistent UI/landing page.
As part of this, we are introducing a consistent way for plugins
to show an inner sidebar in their config page, via a new plugin
API `register_admin_config_nav_routes`
This accepts an array of links with a label/text and an
Ember route. Once this commit is merged we can start the process
of conforming other plugins to follow this pattern, as well
as supporting a single-page version of this for simpler plugins
that don't require an inner sidebar.
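A hypothetical usage sketch based on the description above (the exact signature may differ), as it might appear in a plugin's `plugin.rb`:

```ruby
# Register inner-sidebar links for this plugin's config page.
register_admin_config_nav_routes(
  "my-plugin",
  [
    { label: "my_plugin.nav.settings", route: "adminPlugins.show.settings" },
    { label: "my_plugin.nav.reports", route: "adminPlugins.show.reports" },
  ]
)
```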
Part of /t/122841 internally
There are a couple of reasons for this.
The first one is practical, and related to eager loading. Since /lib is not eager loaded, when the application boots, ProblemCheck["identifier"] will be nil because the child classes aren't loaded.
The second one is more conceptual. There turn out to be a lot of inter-dependencies between the parts of the problem check system that live in /app and the parts that live in /lib, which probably suggests it should all go in /app.
Why this change?
Prior to this change, we were using `URI.regexp`, which was too strict as it
doesn't allow a URL path.
What does this change do?
Just parse the string using `URI.parse`, and if it doesn't raise an error
we consider the string to be a valid URL.
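A minimal sketch of the described approach (the real helper may do more than this):

```ruby
require "uri"

def valid_url?(string)
  URI.parse(string)
  true
rescue URI::InvalidURIError
  false
end

valid_url?("/t/some-topic/123") # => true, URL paths are accepted
valid_url?("not a url")         # => false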
## What?
Depending on the email software used, when you reply to an email that has some attachments, they will be sent along, since they're part of the embedded (replied to) email.
When Discourse processes the reply as an incoming email, it will automatically add all the (valid) attachments at the end of the post, including those that were sent as part of the "embedded reply".
This generates posts in Discourse with duplicate attachments 🙁
## How?
When processing attachments of an incoming email, before we add an attachment to the bottom of the post, we check it against all the previous uploads in the same topic. If there already is an `Upload` record, it means that it's a duplicate and it is _therefore_ skipped.
All the inline attachments are left untouched since they're more likely new attachments added by the sender.
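A rough sketch of the duplicate check (with assumed model and column names, not the exact receiver code): compare the incoming attachment's SHA1 against uploads already referenced by posts in the same topic.

```ruby
def duplicate_attachment?(topic, attachment_sha1)
  post_ids = Post.where(topic_id: topic.id).pluck(:id)
  existing_upload_ids =
    UploadReference.where(target_type: "Post", target_id: post_ids).pluck(:upload_id)

  Upload.where(id: existing_upload_ids, sha1: attachment_sha1).exists?
end
```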
Why this change?
This is a follow up to 408d2f8e69. When
`ActiveRecord::ConnectionAdapters::PostgreSQLAdapter#reload_type_map`
is called, we need to clear the type map cache, otherwise migrations
adding an array column will end up throwing errors.
Why this change?
This patch has been added to address the problems identified in https://github.com/rails/rails/issues/35311. For every
new connection created using the PostgreSQL adapter, 3 queries are executed to fetch type map information from the `pg_type`
system catalog, adding about 1ms of overhead to every connection creation.
On multisite clusters where connections are reaped more aggressively, the 3 queries executed
account for a significant portion of CPU usage on the PostgreSQL cluster. This patch works around the problem by
caching the type map in a class-level attribute to reuse across connections.
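A simplified sketch of the caching idea (not the actual patch; the method names below are placeholders):

```ruby
class CachedTypeMapAdapter
  class << self
    attr_accessor :cached_type_map
  end

  def type_map
    # Reuse the class-level cache instead of issuing the 3 pg_type queries
    # for every new connection.
    self.class.cached_type_map ||= build_type_map_from_pg_type # placeholder
  end
end
```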
Prior to this fix, even when the user was not part of a group allowed to send PMs, we would show the prompt: "You've replied to ... X times, did you know you could send them a personal message instead?"
Why this change?
Before this change, the error messages returned when validating theme
settings of typed objects were an array of arrays instead of just a flat
array.
Why this change?
Previously in cac60a2c6b, I added support
for `type: "category"` for a property in the theme objects schema. This
commit extends that work to add support for the types `topic`,
`post`, `group`, `upload` and `tag`.
Why this change?
```
some_setting:
default: 0
type: string
```
A theme setting like the above will cause an error to be thrown on the
server when importing the theme, because the default would be parsed as
an integer, which then causes an error to be thrown when we validate the
value of the setting.
What does this change do?
Convert the value to a string when working with string typed theme
settings.
Why this change?
This change adds validation of the default value for `type: objects` theme
settings when a setting theme field is uploaded. This helps theme
authors ensure that the objects they specify in the default
value adhere to the schema they have declared.
When an error is encountered in one of the objects, the error
message will look something like:
`"The property at JSON Pointer '/0/title' must be at least 5 characters
long."`
We use a JSON Pointer to reference the property in the object, which is
what most json-schema validators use as well.
What does this change do?
1. This commit once again changes the shape of the hash returned by
`ThemeSettingsObjectValidator.validate`. Instead of using the
property name as the key as before, we have decided to avoid
multiple levels of nesting and instead use a JSON Pointer as the key,
which helps to simplify the implementation (see the sketch below).
2. Introduces `ThemeSettingsObjectValidator.validate_objects` which
returns an array of validation error messages for all the objects
passed to the method.
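For illustration, the hash described in point 1 might look something like this (hypothetical keys and messages, not actual output):

```ruby
ThemeSettingsObjectValidator.validate(objects)
# => {
#      "/0/title" => ["must be at least 5 characters long"],
#      "/1/category_property" => ["must be present"],
#    }
```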
Before this commit, we had a yarn package set up in the root directory and also in `app/assets/javascripts`. That meant two `yarn install` calls and two `node_modules` directories. This commit merges them both into the root location, and updates references to node_modules.
A previous attempt can be found at https://github.com/discourse/discourse/pull/21172. This commit re-uses that script to merge the `yarn.lock` files.
Co-authored-by: Jarek Radosz <jradosz@gmail.com>
This PR adds a new scheduled problem check that simply tries to connect to the Twitter OAuth endpoint to check that it's working. It uses the default retry strategy of 2 retries, 30 seconds apart.
Now forums can enroll their sites to be showcased in the Discourse [Discover](https://discourse.org/discover) directory. Once they enable the site setting `include_in_discourse_discover` to enroll their forum, the `CallDiscourseHub` job will ping the `api.discourse.org/api/discover/enroll` endpoint. Then the Discourse Hub will fetch the basic details from the forum and add it to the review queue. If the site is approved, the forum details will be displayed on the `/discover` page.
Also, remove the experimental setting and simply use top_menu for feature detection.
This means that when people eventually enable the hot top menu, there will
be topics in it.
Co-authored-by: Alan Guo Xiang Tan <gxtan1990@gmail.com>
Previously, problem checks were all added as either class methods or blocks in AdminDashboardData. Another set of class methods was used to add and run problem checks.
As of this PR, problem checks are promoted to first-class citizens. Each problem check receives its own class. This class of course contains the implementation for running the check, but also configuration items like retry strategies (for scheduled checks).
In addition, the parent class ProblemCheck also serves as a registry for checks. For example, we can get a list of all existing check classes through ProblemCheck.checks, or just the ones running on a schedule through ProblemCheck.scheduled.
After this refactor, the task of adding a new check is significantly simplified. You add a class that inherits from ProblemCheck, implement it, add a test, and you're good to go.
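A hedged sketch of what adding a check might look like under this system (the method names are assumptions based on the description above, not the exact API):

```ruby
class ProblemCheck::MyServiceReachable < ProblemCheck
  # Configuration such as scheduling/retry strategy would live here,
  # per the description above.

  def call
    return no_problem if service_reachable?
    problem
  end

  private

  def service_reachable?
    # the actual connectivity check would go here
    true
  end
end
```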
Why this change?
The current shape of errors returns the error messages after they have been
translated, but there are cases where we want to customize the error
messages, and returning only translated error messages makes
such customization difficult. If we
wish to have the error messages in complete sentences like
"`some_property` property must be present in #link 1", this is not
possible at the moment with the current shape of the errors we return.
What does this change do?
This change introduces the `ThemeSettingsObjectValidator::ThemeSettingsObjectErrors`
and `ThemeSettingsObjectValidator::ThemeSettingsObjectError` classes to
hold the relevant error key and i18n translation options.
Why this change?
This change supports a property of `type: category` in the schema that
is declared for a theme setting object. Example:
```
sections:
type: objects
schema:
name: section
properties:
category_property:
type: category
```
The value of a property declared as `type: category` will have to be a
valid id of a row in the `categories` table.
What does this change do?
Adds a property value validation step for `type: category`. Care has
been taken to ensure that we do not spam the database with a ton of
requests if there are a lot of category typed properties. This is done by
walking through the entire object and collecting all the values for
category typed properties. After that, a single database query is
executed to determine which values are valid.
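An illustrative sketch of the "collect, then validate in one query" approach (assumed variable names; not the exact validator code):

```ruby
# Collect every category id referenced by category typed properties while
# walking the objects, then validate them all with a single query.
category_ids = collected_category_ids # e.g. [3, 7, 42], gathered while walking the objects
valid_ids = Category.where(id: category_ids).pluck(:id).to_set
invalid_ids = category_ids.reject { |id| valid_ids.include?(id) }
```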
Why this change?
This commit updates `ThemeSettingsObjectValidator` to validate a
property's value against the validations listed in the schema.
For string types, `min_length`, `max_length` and `url` are supported.
For integer and float types, `min` and `max` are supported.
Why this change?
This change adds property value validation to `ThemeSettingsObjectValidator`
for the following types: "string", "integer", "float", "boolean", "enum". Note
that this class is not being used anywhere yet and is still in
development.