* FIX: Use Category.secured(guardian) for hashtag datasource
Follow up to comments in #19219, changing the category
hashtag datasource to use Category.secured(guardian) instead
of Site.new(guardian).categories here since the latter does
more work for not much benefit, and the query time is the
same. Also eliminates some Hash -> Model back and forth
busywork. Add some more specs too.
* FIX: Server-side hashtag lookup cooking user loading
When we were using the PrettyText.options.currentUser
and parsing back and forth with JSON for the hashtag
lookups server-side, we had a bug where the user's
secure categories were not loaded since we never actually
loaded a User model from the database, only parsed it
from JSON.
This commit fixes the issue by instead using the
PretyText.options.userId and looking up the user directly
from the database when calling hashtag_lookup via the
PrettyText::Helpers code when cooking server-side. Added
the missing spec to check for this as well.
This commit fleshes out and adds functionality for the new `#hashtag` search and
lookup system, still hidden behind the `enable_experimental_hashtag_autocomplete`
feature flag.
**Serverside**
We have two plugin API registration methods that are used to define data sources
(`register_hashtag_data_source`) and hashtag result type priorities depending on
the context (`register_hashtag_type_in_context`). Reading the comments in plugin.rb
should make it clear what these are doing. Reading the `HashtagAutocompleteService`
in full will likely help a lot as well.
Each data source is responsible for providing its own **lookup** and **search**
method that returns hashtag results based on the arguments provided. For example,
the category hashtag data source has to take into account parent categories and
how they relate, and each data source has to define their own icon to use for the
hashtag, and so on.
The `Site` serializer has two new attributes that source data from `HashtagAutocompleteService`.
There is `hashtag_icons` that is just a simple array of all the different icons that
can be used for allowlisting in our markdown pipeline, and there is `hashtag_context_configurations`
that is used to store the type priority orders for each registered context.
When sending emails, we cannot render the SVG icons for hashtags, so
we need to change the HTML hashtags to the normal `#hashtag` text.
**Markdown**
The `hashtag-autocomplete.js` file is where I have added the new `hashtag-autocomplete`
markdown rule, and like all of our rules this is used to cook the raw text on both the clientside
and on the serverside using MiniRacer. Only on the server side do we actually reach out to
the database with the `hashtagLookup` function, on the clientside we just render a plainer
version of the hashtag HTML. Only in the composer preview do we do further lookups based
on this.
This rule is the first one (that I can find) that uses the `currentUser` based on a passed
in `user_id` for guardian checks in markdown rendering code. This is the `last_editor_id`
for both the post and chat message. In some cases we need to cook without a user present,
so the `Discourse.system_user` is used in this case.
**Chat Channels**
This also contains the changes required for chat so that chat channels can be used
as a data source for hashtag searches and lookups. This data source will only be
used when `enable_experimental_hashtag_autocomplete` is `true`, so we don't have
to worry about channel results suddenly turning up.
------
**Known Rough Edges**
- Onebox excerpts will not render the icon svg/use tags, I plan to address that in a follow up PR
- Selecting a hashtag + pressing the Quote button will result in weird behaviour, I plan to address that in a follow up PR
- Mixed hashtag contexts for hashtags without a type suffix will not work correctly, e.g. #ux which is both a category and a channel slug will resolve to a category when used inside a post or within a [chat] transcript in that post. Users can get around this manually by adding the correct suffix, for example ::channel. We may get to this at some point in future
- Icons will not show for the hashtags in emails since SVG support is so terrible in email (this is not likely to be resolved, but still noting for posterity)
- Additional refinements and review fixes wil
This commit renames all secure_media related settings to secure_uploads_* along with the associated functionality.
This is being done because "media" does not really cover it, we aren't just doing this for images and videos etc. but for all uploads in the site.
Additionally, in future we want to secure more types of uploads, and enable a kind of "mixed mode" where some uploads are secure and some are not, so keeping media in the name is just confusing.
This also keeps compatibility with the `secure-media-uploads` path, and changes new
secure URLs to be `secure-uploads`.
Deprecated settings:
* secure_media -> secure_uploads
* secure_media_allow_embed_images_in_emails -> secure_uploads_allow_embed_images_in_emails
* secure_media_max_email_embed_image_size_kb -> secure_uploads_max_email_embed_image_size_kb
* FIX: Use theme color for anchor icon
* FIX: Do not count anchor links
* FIX: Do not count hashtags links either
* DEV: Add tests for link_count
* FIX: Disable anchors in quotes and preview
* FIX: Try building some anchor slugs for unicode
* DEV: Fix tests
Category and tag hashtags used to be handled differently even though
most of the code was very similar. This design was the root cause of
multiple issues related to hashtags.
This commit reduces the number of requests (just one and debounced
better), removes the use of CSS classes which marked resolved hashtags,
simplifies a lot of the code as there is a single source of truth and
previous race condition fixes are now useless.
It also includes a very minor security fix which let unauthorized users
to guess hidden tags.
When secure media is enabled and an attachment is marked as secure we want to use the full url instead of the short-url so we get the same access control post protections as secure media uploads.
When pull_hotlinked_images tried to run on posts with secure media (which had already been downloaded from external sources) we were getting a 404 when trying to download the image because the secure endpoint doesn't allow anon downloads.
Also, we were getting into an infinite loop of pull_hotlinked_images because the job didn't consider the secure media URLs as "downloaded" already so it kept trying to download them over and over.
In this PR I have also refactored secure-media-upload URL checks and mutations into single source of truth in Upload, adding a SECURE_MEDIA_ROUTE constant to check URLs against too.
This PR introduces a new secure media setting. When enabled, it prevent unathorized access to media uploads (files of type image, video and audio). When the `login_required` setting is enabled, then all media uploads will be protected from unauthorized (anonymous) access. When `login_required`is disabled, only media in private messages will be protected from unauthorized access.
A few notes:
- the `prevent_anons_from_downloading_files` setting no longer applies to audio and video uploads
- the `secure_media` setting can only be enabled if S3 uploads are already enabled and configured
- upload records have a new column, `secure`, which is a boolean `true/false` of the upload's secure status
- when creating a public post with an upload that has already been uploaded and is marked as secure, the post creator will raise an error
- when enabling or disabling the setting on a site with existing uploads, the rake task `uploads:ensure_correct_acl` should be used to update all uploads' secure status and their ACL on S3
Zeitwerk simplifies working with dependencies in dev and makes it easier reloading class chains.
We no longer need to use Rails "require_dependency" anywhere and instead can just use standard
Ruby patterns to require files.
This is a far reaching change and we expect some followups here.
This reduces chances of errors where consumers of strings mutate inputs
and reduces memory usage of the app.
Test suite passes now, but there may be some stuff left, so we will run
a few sites on a branch prior to merging
We were looking up each mention one by one without any form of caching and that results
in a problem somewhat similar to an N+1. When we have to do alot of DB
lookups, it also increased the time spent in the V8 context which may
eventually lead to a timeout. The change here makes it such that mention lookups only does a single
DB query per post that happens outside of the V8 context.
* Adds primary user group as a class to quote
This feature addition will add the class `group-PRIMARY_USER_GROUP` to
the quote `aside`. `PRIMARY_USER_GROUP` will be the primary user group
of the user being quoted. This is similar to the class that is added to
a `topic-post`.
* Remove trailing whitespace
* Fix avatar in test
* Address PR comments
* Fix trailing whitespace