This commit fleshes out and adds functionality for the new `#hashtag` search and
lookup system, still hidden behind the `enable_experimental_hashtag_autocomplete`
feature flag.
**Serverside**
We have two plugin API registration methods that are used to define data sources
(`register_hashtag_data_source`) and hashtag result type priorities depending on
the context (`register_hashtag_type_in_context`). Reading the comments in plugin.rb
should make it clear what these are doing. Reading the `HashtagAutocompleteService`
in full will likely help a lot as well.
Each data source is responsible for providing its own **lookup** and **search**
method that returns hashtag results based on the arguments provided. For example,
the category hashtag data source has to take into account parent categories and
how they relate, and each data source has to define their own icon to use for the
hashtag, and so on.
The `Site` serializer has two new attributes that source data from `HashtagAutocompleteService`.
There is `hashtag_icons` that is just a simple array of all the different icons that
can be used for allowlisting in our markdown pipeline, and there is `hashtag_context_configurations`
that is used to store the type priority orders for each registered context.
When sending emails, we cannot render the SVG icons for hashtags, so
we need to change the HTML hashtags to the normal `#hashtag` text.
**Markdown**
The `hashtag-autocomplete.js` file is where I have added the new `hashtag-autocomplete`
markdown rule, and like all of our rules this is used to cook the raw text on both the clientside
and on the serverside using MiniRacer. Only on the server side do we actually reach out to
the database with the `hashtagLookup` function, on the clientside we just render a plainer
version of the hashtag HTML. Only in the composer preview do we do further lookups based
on this.
This rule is the first one (that I can find) that uses the `currentUser` based on a passed
in `user_id` for guardian checks in markdown rendering code. This is the `last_editor_id`
for both the post and chat message. In some cases we need to cook without a user present,
so the `Discourse.system_user` is used in this case.
**Chat Channels**
This also contains the changes required for chat so that chat channels can be used
as a data source for hashtag searches and lookups. This data source will only be
used when `enable_experimental_hashtag_autocomplete` is `true`, so we don't have
to worry about channel results suddenly turning up.
------
**Known Rough Edges**
- Onebox excerpts will not render the icon svg/use tags, I plan to address that in a follow up PR
- Selecting a hashtag + pressing the Quote button will result in weird behaviour, I plan to address that in a follow up PR
- Mixed hashtag contexts for hashtags without a type suffix will not work correctly, e.g. #ux which is both a category and a channel slug will resolve to a category when used inside a post or within a [chat] transcript in that post. Users can get around this manually by adding the correct suffix, for example ::channel. We may get to this at some point in future
- Icons will not show for the hashtags in emails since SVG support is so terrible in email (this is not likely to be resolved, but still noting for posterity)
- Additional refinements and review fixes wil
* FEATURE: API to update user's discourse connect external id
This adds a special handling of updates to DiscourseConnect external_id
in the general user update API endpoint.
Admins can create, update or delete a user SingleSignOn record using
PUT /u/:username.json
{
"external_ids": {
"discourse_connect": "new-external-id"
}
}
The problem was reported as a problem with changing theme in user preferences, after saving a new theme the previously set user status was disappearing (https://meta.discourse.org/t/user-status/240335/42). Turned out though that the problem was more wide, changing pretty much any setting in user preferences apart from user status itself led to clearing the status.
The previous sidebar default tags and categories implementation did not
allow for a user to configure their sidebar to have no categories or
tags. This commit changes how the defaults are applied. When a user is being created,
we create the SidebarSectionLink records based on the `default_sidebar_categories` and
`default_sidebar_tags` site settings. SidebarSectionLink records are
only created for categories and tags which the user has visibility on at
the point of user creation.
With this change, we're also adding the ability for admins to apply
changes to the `default_sidebar_categories` and `default_sidebar_tags`
site settings historically when changing their site setting. When a new
category/tag has been added to the default, the new category/tag will be
added to the sidebar for all users if the admin elects to apply the changes historically.
Like wise when a tag/category is removed, the tag/category will be
removed from the sidebar for all users if the admin elects to apply the
changes historically.
Internal Ref: /t/73500
Related to aeee7ed.
Before the change in aeee7ed, notifications for direct replies to your posts and notifications for replies in watched topics looked the same in the notifications menu -- they both used the arrow icon.
We decided in aeee7ed to distinguish them by changing "watched topics" notifications to use the bell icon because it was confusing for users who watch topics to see the same icon for direct replies and "watched topics". However, that change also means that non-power/new users who receive replies to topics _they create_ will get notifications with the bell icon because technically they're watching the topic, but the arrow icon is more appropriate for this case because we use it throughout the app to indicate "replies".
This commit adds a special-case so that if a user is watching a topic AND the topic is created by them, they receive notifications with the arrow icon (type `replied`) instead of the bell icon (type `posted`) for new posts in the topic.
Internal topic: t/79051.
Adds sorting for the HashtagAutocompleteService to
sort the results by case-insensitive text _within_
the type sort order specified by the params. This
should fix some flaky specs as well.
This commit adds a new `/hashtag/search` endpoint and both
relevant JS and ruby plugin APIs to handle plugins adding their
own data sources and priority orders for types of things to search
when `#` is pressed.
A `context` param is added to `setupHashtagAutocomplete` which
a corresponding chat PR https://github.com/discourse/discourse-chat/pull/1302
will now use.
The UI calls `registerHashtagSearchParam` for each context that will
require a `#` search (e.g. the topic composer), for each type of record that
the context needs to search for, as well as a priority order for that type. Core
uses this call to add the `category` and `tag` data sources to the topic composer.
The `register_hashtag_data_source` ruby plugin API call is for plugins to
add a new data source for the hashtag searching endpoint, e.g. discourse-chat
may add a `channel` data source.
This functionality is hidden behind the `enable_experimental_hashtag_autocomplete`
flag, except for the change to `setupHashtagAutocomplete` since only core and
discourse-chat are using that function. Note this PR does **not** include required
changes for hashtag lookup or new styling.
This commit introduces a new framework for building user tutorials as
popups using the Tippy JS library. Currently, the new framework is used
to replace the old notification spotlight and tips and show a new one
related to the topic timeline.
All popups follow the same structure and have a title, a description and
two buttons for either dismissing just the current tip or all of them
at once.
The state of all seen popups is stored in a user option. Updating
skip_new_user_tips will automatically update the list of seen popups
accordingly.
`delete_previous!` deletes existing topics even when we cannot send a new one due to the `limit_once_per` option. The dashboard problems PM gets deleted the next time the job runs (30 minutes), so the inbox could be empty when
admins click on the summary notification.
A user could receive more than a notification for the same post if they
watched both the categories and tags at different levels. This commit
makes sure that only the watching notification is created.
* Add DiscourseEvent before post notifications are created
If a user was granted a trust level, joined a group that granted a trust
level and left the group, the trust level was reset. This commit tries
to restore the last known trust level before joining the group by
looking into staff logs.
This commit also migrates old :change_trust_level user history records
to use previous_value and new_value fields.
The logic to determine what post excerpt to show for
a topic-level bookmark based on the last unread post
was complex and slow, so we decided to remove it and
always just use the first post excerpt.
This commit also fixes an issue where a couple of
instances of for_topic were missed when doing the
Bookmarkable refactors, so:
1. Clicking the topic bookmark link was not taking
the user to the last unread post
2. When replying to a topic where there was a topic
level bookmark with the auto delete preference
of "on owner reply", we were not removing the
bookmark from the UI correctly.
A test has been added for the former, the latter would
be quite time-consuming to test and not really worth
it considering it's quite an edge case UI bug.
Previously, for every bookmarked topic, all topic_user records were being preloaded. Only the current user's record is actually required.
This commit introduces a new `perform_custom_preload!` API which bookmarkables can use to add custom preloading logic. We use this in topic_bookmarkable to load just the topic_user data we need (in the same way as `topic_list.rb`).
Co-authored-by: Blake Erickson <o.blakeerickson@gmail.com>
862007fb18 introduced a change to the format that watched words are cached in Redis. Newly-deployed versions of the app were attempting to load the old-format data from Redis, leading to a server error. This commit introduces a CACHE_VERSION constant which we can easily bump when making changes to the cache schema.
* FEATURE: Add case-sensitivity flag to watched_words
Currently, all watched words are matched case-insensitively. This flag
allows a watched word to be flagged for case-sensitive matching.
To allow allow for backwards compatibility the flag is set to false by
default.
* FEATURE: Support case-sensitive creation of Watched Words via API
Extend admin creation and upload of Watched Words to support case
sensitive flag. This lays the ground work for supporting
case-insensitive matching of Watched Words.
Support for an extra column has also been introduced for the Watched
Words upload CSV file. The new column structure is as follows:
word,replacement,case_sentive
* FEATURE: Enable case-sensitive matching of Watched Words
WordWatcher's word_matcher_regexp now returns a list of regular
expressions instead of one case-insensitive regular expression.
With the ability to flag a Watched Word as case-sensitive, an action
can have words of both sensitivities.This makes the use of the global
Regexp::IGNORECASE flag added to all words problematic.
To get around platform limitations around the use of subexpression level
switches/flags, a list of regular expressions is returned instead, one for each
case sensitivity.
Word matching has also been updated to use this list of regular expressions
instead of one.
* FEATURE: Use case-sensitive regular expressions for Watched Words
Update Watched Words regular expressions matching and processing to handle
the extra metadata which comes along with the introduction of
case-sensitive Watched Words.
This allows case-sensitive Watched Words to matched as such.
* DEV: Simplify type casting of case-sensitive flag from uploads
Use builtin semantics instead of a custom method for converting
string case flags in uploaded Watched Words to boolean.
* UX: Add case-sensitivity details to Admin Watched Words UI
Update Watched Word form to include a toggle for case-sensitivity.
This also adds support for, case-sensitive testing and matching of Watched Word
in the admin UI.
* DEV: Code improvements from review feedback
- Extract watched word regex creation out to a utility function
- Make JS array presence check more explicit and readable
* DEV: Extract Watched Word regex creation to utility function
Clean-up work from review feedback. Reduce code duplication.
* DEV: Rename word_matcher_regexp to word_matcher_regexp_list
Since a list is returned now instead of a single regular expression,
change `word_matcher_regexp` to `word_matcher_regexp_list` to better communicate
this change.
* DEV: Incorporate WordWatcher updates from upstream
Resolve conflicts and ensure apply_to_text does not remove non-word characters in matches
that aren't at the beginning of the line.
This commit removes the ability to enable/disable the Sidebar on a per
user basis and introduces a site wide setting. For testing purposes, sidebar can be enabled/disabled via the `enable_sidebar=1` or `enable_sidebar=0` query param.
The previous method for reused the PrettyText logic which applied the
watched word logic, but had the unwanted effect of cooking the text too.
This meant that regular text values were converted to HTML.
Follow up to commit 5a4c35f627.
This is so we can join the Notification table onto the
Bookmark table. A slight refactor was needed to ensure
that the required values are always included and the
consumer does not need to think about this.
The discourse-chat and discourse-data-explorer plugins
will be updated to take advantage of this commit.
It makes more sense to use user_ids for the UserCommScreener
introduced in fa5f3e228c since
in most cases the ID will be available, not the username. This
was discovered while starting work on a plugin that will
use this. In the cases where only usernames are available
the extra query is negligble.
The idea behind this refactor is to centralise all of the user ignoring / muting / disallow PM checks in a single place, so they can be used consistently in core as well as for plugins like chat, while improving the main bulk of the checks to run in a single fast non-AR query.
Also fixed up the invite error when someone is muting/ignoring the user that is trying to invite them to the topic.
Mutating the `raw` variable like this would cause issues upstream, meaning that the modification is not persisted. Instead, we should allocate a new string like the other replacement methods.
If an image is oneboxed directly, then we should replace the onebox URL with a markdown image tag. This ensures that the wrapper link points to the downloaded version rather than the original.
This regressed in bf6f8299
Fixes a flaky spec:
```
1) WordWatcher.word_matcher_regexp format of the result regexp is correct when watched_words_regular_expressions = true
Failure/Error: expect(regexp.inspect).to eq("/(#{word1})|(#{word2})/i")
expected: "/(word35)|(word36)/i"
got: "/(word36)|(word35)/i"
(compared using ==)
# ./spec/services/word_watcher_spec.rb:19:in `block (4 levels) in <main>'
```
We have a `cleanup!` class method on bookmarks that deletes
bookmarks X days after their related record (post/topic) are
deleted. This commit changes this method to use the
registered_bookmarkables for this instead, and each bookmarkable
type can delete related bookmarks in their own way.
Due to some changes we started notifying via push notifications on other
families of notifications. There are a total of about 30 or so possible
notification you could get, some can be pushed.
This fallback means that if for any reason we are unable to find an icon
for a push notification we just fallback to the Discourse logo.
Also go with a simple reply icon for watching first post.
Note, that in production `image_url` can return an exception if an image is
missing. This is not the case in test / development.
Censored watched words were not censored inside the title of an inline
oneboxes. Malicious users could exploit this behaviour to insert bad
words. The same issue has been fixed for regular Oneboxes in commit
d184fe59ca.
Previously, with the default `editing_grace_period`, hotlinked images were pulled 5 minutes after a post is created. This delay was added to reduce the chance of automated edits clashing with user edits.
This commit refactors things so that we can pull hotlinked images immediately. URLs are immediately updated in the post's `cooked` HTML. The post's raw markdown is updated later, after the `editing_grace_period`.
This involves a number of behind-the-scenes changes including:
- Schedule Jobs::PullHotlinkedImages immediately after Jobs::ProcessPost. Move scheduling to after the `update_column` call to avoid race conditions
- Move raw changes into a separate job, which is delayed until after the ninja-edit window
- Move disable_if_low_on_disk_space logic into the `pull_hotlinked_images` job
- Move raw-parsing/replacing logic into `InlineUpload` so it can be easily be shared between `UpdateHotlinkedRaw` and `PullUserProfileHotlinkedImages`
This makes it easier to find PMs involving a particular user, for
example by searching for `in:messages thisUser` (previously, that query
would only return results in posts where `thisUser` was in the post body).
The cache was causing state to leak between tests since the `WatchedWord` record in the DB would have been rolled back but `WordWatcher` still had the word in the cache.
7a284164 previously switched the UserDestroyer to use find_each when iterating over UserHistory records. Unfortunately, since this logic is wrapped in a transaction, this didn't actually solve the memory usage problem. ActiveRecord maintains references to all modified models within a transaction.
This commit updates the logic to use a single SQL query, rather than updating models one-by-one
These validate/after_create/after_destroy methods were added
back in b8828d4a2d before
the RegisteredBookmarkable API and pattern was nailed down.
This commit updates BookmarkManager to call out to the
relevant bookmarkable for these and bookmark_metadata for
consistency.
We have not used anything related to bookmarks for PostAction
or UserAction records since 2020, bookmarks are their own thing
now. Deleting all this is just cleaning up old cruft.
Latest redis interoduces a block form of multi / pipelined, this was incorrectly
passed through and not namespaced.
Fix also updates logster, we held off on upgrading it due to missing functions
A bit of a mixed bag, this addresses several edge areas of bookmarks and makes them compatible with polymorphic bookmarks (hidden behind the `use_polymorphic_bookmarks` site setting). The main ones are:
* ExportUserArchive compatibility
* SyncTopicUserBookmarked job compatibility
* Sending different notifications for the bookmark reminders based on the bookmarkable type
* Import scripts compatibility
* BookmarkReminderNotificationHandler compatibility
This PR also refactors the `register_bookmarkable` API so it accepts a class descended from a `BaseBookmarkable` class instead. This was done because we kept having to add more and more lambdas/properties inline and it was very messy, so a factory pattern is cleaner. The classes can be tested independently as well.
Some later PRs will address some other areas like the discourse narrative bot, advanced search, reports, and the .ics endpoint for bookmarks.
This bug was causing double events to be fired as :user_badge_granted is already called when a `user_badge` is created. More over the signature of the block in the UserBadge code is `badge_id, user_id` not `badge, user_id`.
* hidden siteSetting to enable experimental sidebar
* user preference to enable experimental sidebar
* `experimental_sidebar_enabled` attribute for current user
* Empty glimmer component for Sidebar
This pull request follows on from https://github.com/discourse/discourse/pull/16308. This one does the following:
* Changes `BookmarkQuery` to allow for querying more than just Post and Topic bookmarkables
* Introduces a `Bookmark.register_bookmarkable` method which requires a model, serializer, fields and preload includes for searching. These registered `Bookmarkable` types are then used when validating new bookmarks, and also when determining which serializer to use for the bookmark list. The `Post` and `Topic` bookmarkables are registered by default.
* Adds new specific types for Post and Topic bookmark serializers along with preloading of associations in `UserBookmarkList`
* Changes to the user bookmark list template to allow for more generic bookmarkable types alongside the Post and Topic ones which need to display in a particular way
All of these changes are gated behind the `use_polymorphic_bookmarks` site setting, apart from the .hbs changes where I have updated the original `UserBookmarkSerializer` with some stub methods.
Following this PR will be several plugin PRs (for assign, chat, encrypt) that will register their own bookmarkable types or otherwise alter the bookmark serializers in their own way, also gated behind `use_polymorphic_bookmarks`.
This commit also removes `BookmarkQuery.preloaded_custom_fields` and the functionality surrounding it. It was added in 0cd502a558 but only used by one plugin (discourse-assign) where it has since been removed, and is now used by no plugins. We don't need it anymore.
`Scoped order is ignored, it's forced to be batch order.`
`find_each` ignores the `order` scope and triggers a warning in
production which is noisy.
Follow-up to 7a284164ce
This commit improves the logic for rolling up IPv4 screened IP
addresses and extending it for IPv6. IPv4 addresses will roll up only
up to /24. IPv6 can rollup to /48 at most. The log message that is
generated contains the list of original IPs and new subnet.
All users are members of the EVERYONE group, but this group is special and
is omitted from the group_users table. When checking permission we need to
make sure we also add a bypass.
This also fixes a very buggy test in post_alerter, it was confirming the
broken behavior due to fabricator flow.
When it defined the tag group the everyone group automatically had full access
then the additional permission fabricated just added one more group. After
fix was made to code the test started failing. Fabricators can be risky.
When emailing a group inbox and including other support-type
emails (or even just regular ones with autoresponders) in the
CC field, each automated reply to the group inbox triggered
more emails to be sent out to all CC addresses to notify them
of the new reply, which in turn caused more automated emails
to be sent to the group inbox.
This commit fixes the issue by preventing any emails being sent
by the PostAlerter when the new post has an incoming email record
which is_auto_generated, which we detect in Email::Receiver.
Fixes the issue where making a user x as owner of a post doesn't
cause the concerned topic to be listed in new owner's `My Posts`
top menu filter
per https://meta.discourse.org/t/199369
Under some conditions, replacing an `<img` with `![]()` can break rendering, and make the image disappear.
Context at https://meta.discourse.org/t/152801
The search_ignore_accents site setting can be used to make the search
indexer remove the accents before indexing the content. The unaccent
function from PostgreSQL is better than Ruby's unicode_normalize(:nfkd).
Discourse users and associated accounts are created or updated when a
user logins or connects the account using their account preferences.
This new API can be used to create associated accounts and users too,
if necessary.
This allows text editors to use correct syntax coloring for the heredoc sections.
Heredoc tag names we use:
languages: SQL, JS, RUBY, LUA, HTML, CSS, SCSS, SH, HBS, XML, YAML/YML, MF, ICS
other: MD, TEXT/TXT, RAW, EMAIL
We added this constraint in 5bd55acf83
but it is causing problems in hosted sites and is catching the
issue too far down the line. This commit removes the constraint
for now, and also fixes an issue found with PostDestroyer
which wasn't using the UserStatCountUpdater when updating post_count
and thus was causing negative numbers to occur.
Breakdown of fixes in this commit:
* `UserStat#topic_count` was not updated when visibility of
the topic changed.
* `UserStat#post_count` was not updated when post was hidden or
unhidden.
* `TopicConverter` was only incrementing or decrementing the counts by 1
even if a user has multiple posts in the topic.
* The commit turns off the verbose logging by default as it is just
noise to normal users who are not debugging this problem.
In ab5361d69a, we rescue from the PG error
but the transaction is already aborted causing any DB query after to
fail. As such, we avoid triggering the error in the first place by
checking that we would not be insertin a negative number into the
counter cache.
Follow-up to ab5361d69a
There are still spots in the code base which results in us trying to turn the post and topic count negative. However,
we have a job that runs on a daily basis which will correct the count. Therefore, avoid raising an error for now
and log the exception instead.
Random strings can result into much longer tsvectors. For example
parsing a Base64 string of ~600kb can result in a tsvector of over 1MB,
which is the maximum size of a tsvector.
Follow-up-to: 823c3f09d4
Ensures that `UserStat#post_count` and `UserStat#topic_count` does not
go below 0. When it does like it did now, we tend to have bugs in our
code since we're usually coding with the assumption that the count isn't
negative.
In order to support the constraints, our post and topic fabricators in
tests will now automatically increment the count for the respective
user's `UserStat` as well. We have to do this because our fabricators
bypasss `PostCreator` which holds the responsibility of updating `UserStat#post_count` and
`UserStat#topic_count`.
This commit fixes a bug where we our `HTMLScrubber` was only searching
for emoji img tags which contains only the "emoji" class. However, our emoji image tags
may contain more than just the "emoji" class like "only-emoji" when an
emoji exists by itself on a single line.
Also:
* Remove an unused method (#fill_email)
* Replace a method that was used just once (#generate_username) with `SecureRandom.alphanumeric`
* Remove an obsolete dev puma `tmp/restart` file logic
* File.exists? is deprecated and removed in Ruby 3.2 in favor of
File.exist?
* Dir.exists? is deprecated and removed in Ruby 3.2 in favor of
Dir.exist?
You can add callbacks that get called before updating an already consolidated notification or creating a consolidated one.
Instances of this rule can add callbacks to access the old notifications about to be destroyed or the consolidated one and add additional data inside the data hash versus having to execute extra queries when adding this logic inside the `set_mutations` block.
I plan to use this in an upcoming discourse-reactions PR, where I want to like a post without notifying the user, so I can instead create a reaction notification.
Additionally, we decouple the a11y attributes from the icon itself, which will let us extend the widget's icon without losing them.
This PR moves the behavior from the PostAlerter. We delete an existing liked notification and set the `username2` attribute to the previous `display_username`. We repeat this process unless the last one is old enough or it's not in the most recent ones.
We previously used ConsolidateNotifications with a threshold of 1 to re-use an existing notification and bump it to the top instead of creating a new one. It produces some jumpiness in the user notification list, and it relies on updating the `created_at` attribute, which is a bit hacky.
As a better alternative, we're introducing a new plan that deletes all the previous versions of the notification, then creates a new one.
We send the reminder using the GroupMessage class, which supports removing previous messages. We can't match them by raw because they could mention different moderators. Also, I had to change the subject to remove dynamically generated values, which is necessary for finding them.
This commit introduces a new site setting "google_oauth2_hd_groups". If enabled, group information will be fetched from Google during authentication, and stored in the Discourse database. These 'associated groups' can be connected to a Discourse group via the "Membership" tab of the group preferences UI.
The majority of the implementation is generic, so we will be able to add support to more authentication methods in the near future.
https://meta.discourse.org/t/managing-group-membership-via-authentication/175950
* REFACTOR: Improve support for consolidating notifications.
Before this commit, we didn't have a single way of consolidating notifications. For notifications like group summaries, we manually removed old ones before creating a new one. On the other hand, we used an after_create callback for likes and group membership requests, which caused unnecessary work, as we need to delete the record we created to replace it with a consolidated one.
We now have all the consolidation rules centralized in a single place: the consolidation planner class. Other parts of the app looking to create a consolidable notification can do so by calling Notification#consolidate_or_save!, instead of the default Notification#create! method.
Finally, we added two more rules: one for re-using existing group summaries and another for deleting duplicated dashboard problems PMs notifications when the user is tracking the moderator's inbox. Setting the threshold to one forces the planner to apply this rule every time.
I plan to add plugin support for adding custom rules in another PR to keep this one relatively small.
* DEV: Introduces a plugin API for consolidating notifications.
This commit removes the `Notification#filter_by_consolidation_data` scope since plugins could have to define their criteria. The Plan class now receives two blocks, one to query for an already consolidated notification, which we'll try to update, and another to query for existing ones to consolidate.
It also receives a consolidation window, which accepts an ActiveSupport::Duration object, and filter notifications created since that value.
This commit adds token_hash and scopes columns to email_tokens table.
token_hash is a replacement for the token column to avoid storing email
tokens in plaintext as it can pose a security risk. The new scope column
ensures that email tokens cannot be used to perform a different action
than the one intended.
To sum up, this commit:
* Adds token_hash and scope to email_tokens
* Reuses code that schedules critical_user_email
* Refactors EmailToken.confirm and EmailToken.atomic_confirm methods
* Periodically cleans old, unconfirmed or expired email tokens
Use @here to mention all users that were allowed to topic directly or
through group, who liked topics or read the topic. Only first 10 users
will be notified.
We are pushing /notification-alert/#{user_id} and /notification/#{user_id}
messages to MessageBus from both PostAlerter and User#publish_notification_state.
This can cause memory issues on large sites with many users. This commit
stems the bleeding by only sending these alert messages if the user
in question has been seen in the last 30 days, which eliminates a large
chunk of users on some sites.
When 31035010af
was done it failed to take into account the case where the smtp_enabled
site setting was true, but the topic had no allowed groups / no
incoming email record, which caused errors for topics even with
nothing to do with group SMTP.
When there are multiple groups on a topic, we were selecting
the first from the topic allowed groups to act as the sender
email address when sending group SMTP replies via PostAlerter.
However, this was not ordered, and since there is no created_at
column on TopicAllowedGroup we cannot order this nicely, which
caused just a random group to be used (based on whatever postgres
decided it felt like that morning).
This commit changes the group used for SMTP sending to be the
group using the email_username of the to address of the first
incoming email for the topic, if there are more than one allowed
groups on the topic. Otherwise it just uses the only SMTP enabled
group.
This PR introduces a new `enable_experimental_backup_uploads` site setting (default false and hidden), which when enabled alongside `enable_direct_s3_uploads` will allow for direct S3 multipart uploads of backup .tar.gz files.
To make multipart external uploads work with both the S3BackupStore and the S3Store, I've had to move several methods out of S3Store and into S3Helper, including:
* presigned_url
* create_multipart
* abort_multipart
* complete_multipart
* presign_multipart_part
* list_multipart_parts
Then, S3Store and S3BackupStore either delegate directly to S3Helper or have their own special methods to call S3Helper for these methods. FileStore.temporary_upload_path has also removed its dependence on upload_path, and can now be used interchangeably between the stores. A similar change was made in the frontend as well, moving the multipart related JS code out of ComposerUppyUpload and into a mixin of its own, so it can also be used by UppyUploadMixin.
Some changes to ExternalUploadManager had to be made here as well. The backup direct uploads do not need an Upload record made for them in the database, so they can be moved to their final S3 resting place when completing the multipart upload.
This changeset is not perfect; it introduces some special cases in UploadController to handle backups that was previously in BackupController, because UploadController is where the multipart routes are located. A subsequent pull request will pull these routes into a module or some other sharing pattern, along with hooks, so the backup controller and the upload controller (and any future controllers that may need them) can include these routes in a nicer way.
This commit introduces a new s3:ensure_cors_rules rake task
that is run as a prerequisite to s3:upload_assets. This rake
task calls out to the S3CorsRulesets class to ensure that
the 3 relevant sets of CORS rules are applied, depending on
site settings:
* assets
* direct S3 backups
* direct S3 uploads
This works for both Global S3 settings and Database S3 settings
(the latter set directly via SiteSetting).
As it is, only one rule can be applied, which is generally
the assets rule as it is called first. This commit changes
the ensure_cors! method to be able to apply new rules as
well as the existing ones.
This commit also slightly changes the existing rules to cover
direct S3 uploads via uppy, especially multipart, which requires
some more headers.
Instead of using image-uploader, which relies on the old
UploadMixin, we can now use the uppy-image-uploader which
uses the new UppyUploadMixin which is stable enough and
supports both regular XHR uploads and direct S3 uploads,
controlled by a site setting (default to XHR).
At some point it may make sense to rename uppy-image-uploader
back to image-uploader, once we have gone through plugins
etc. and given a bit of deprecation time period.
This commit also fixes `for_private_message`, `for_site_setting`,
and `pasted` flags not being sent via uppy uploads onto the
UploadCreator, both via regular XHR uploads and also through
external/multipart uploads.
The uploaders changed are:
* site setting images
* badge images
* category logo
* category background
* group flair
* profile background
* profile card background
Calling create_notification_alert could still send a notification to a
suspended user. This just moves the check if user is suspended right
before sending the notification.
It allows saving local date to calendar.
Modal is giving option to pick between ics and google. User choice can be remembered as a default for the next actions.
Also promote the `create_notification_alert` and `push_notification`
methods from instance methods to class methods so that plugins can call
them. This is temporary until we add a more comprehensive API for
extending `PostAlerter`.
Discourse is sending regularly message to admins when potential problems are persisted. Most of the time they have exactly the same content. In that case, when there are no replies, the old one should be trashed before a new one is created.
This pull request introduces the endpoints required, and the JavaScript functionality in the `ComposerUppyUpload` mixin, for direct S3 multipart uploads. There are four new endpoints in the uploads controller:
* `create-multipart.json` - Creates the multipart upload in S3 along with an `ExternalUploadStub` record, storing information about the file in the same way as `generate-presigned-put.json` does for regular direct S3 uploads
* `batch-presign-multipart-parts.json` - Takes a list of part numbers and the unique identifier for an `ExternalUploadStub` record, and generates the presigned URLs for those parts if the multipart upload still exists and if the user has permission to access that upload
* `complete-multipart.json` - Completes the multipart upload in S3. Needs the full list of part numbers and their associated ETags which are returned when the part is uploaded to the presigned URL above. Only works if the user has permission to access the associated `ExternalUploadStub` record and the multipart upload still exists.
After we confirm the upload is complete in S3, we go through the regular `UploadCreator` flow, the same as `complete-external-upload.json`, and promote the temporary upload S3 into a full `Upload` record, moving it to its final destination.
* `abort-multipart.json` - Aborts the multipart upload on S3 and destroys the `ExternalUploadStub` record if the user has permission to access that upload.
Also added are a few new columns to `ExternalUploadStub`:
* multipart - Whether or not this is a multipart upload
* external_upload_identifier - The "upload ID" for an S3 multipart upload
* filesize - The size of the file when the `create-multipart.json` or `generate-presigned-put.json` is called. This is used for validation.
When the user completes a direct S3 upload, either regular or multipart, we take the `filesize` that was captured when the `ExternalUploadStub` was first created and compare it with the final `Content-Length` size of the file where it is stored in S3. Then, if the two do not match, we throw an error, delete the file on S3, and ban the user from uploading files for N (default 5) minutes. This would only happen if the user uploads a different file than what they first specified, or in the case of multipart uploads uploaded larger chunks than needed. This is done to prevent abuse of S3 storage by bad actors.
Also included in this PR is an update to vendor/uppy.js. This has been built locally from the latest uppy source at d613b849a6. This must be done so that I can get my multipart upload changes into Discourse. When the Uppy team cuts a proper release, we can bump the package.json versions instead.
While merging two user accounts don't merge the source user's email address if the target user is not a human.
Co-authored-by: Alan Guo Xiang Tan <gxtan1990@gmail.com>
Long posts may have `cooked` fields that produce tsvectors longer than
the maximum size of 1MiB (1,048,576 bytes). This commit uses just the
first million characters of the scrubbed cooked text for indexing.
Reducing the size to exactly 1MB (1_048_576) is not sufficient because
sometimes the output tsvector may be longer than the input and this
gives us some breathing room.
We shouldn't be checking if a user is allowed to do an action in the logger. We should be checking it just before we perform the action. In fact, guardians in the logger can make things even worse in case of a security bug. Let's say we forgot to check user's permissions before performing some action, but we still have a call to the guardian in the logger. In this case, a user would perform the action anyway, and this action wouldn't even be logged!
I've checked all cases and I confirm that we're safe to delete this calls from the logger.
I've added two calls to guardians in admin/user_controller. We didn't have security bugs there, because regular users can't access admin/... routes at all. But it's good to have calls to guardian in these methods anyway, neighboring methods have them.
This adds a few different things to allow for direct S3 uploads using uppy. **These changes are still not the default.** There are hidden `enable_experimental_image_uploader` and `enable_direct_s3_uploads` settings that must be turned on for any of this code to be used, and even if they are turned on only the User Card Background for the user profile actually uses uppy-image-uploader.
A new `ExternalUploadStub` model and database table is introduced in this pull request. This is used to keep track of uploads that are uploaded to a temporary location in S3 with the direct to S3 code, and they are eventually deleted a) when the direct upload is completed and b) after a certain time period of not being used.
### Starting a direct S3 upload
When an S3 direct upload is initiated with uppy, we first request a presigned PUT URL from the new `generate-presigned-put` endpoint in `UploadsController`. This generates an S3 key in the `temp` folder inside the correct bucket path, along with any metadata from the clientside (e.g. the SHA1 checksum described below). This will also create an `ExternalUploadStub` and store the details of the temp object key and the file being uploaded.
Once the clientside has this URL, uppy will upload the file direct to S3 using the presigned URL. Once the upload is complete we go to the next stage.
### Completing a direct S3 upload
Once the upload to S3 is done we call the new `complete-external-upload` route with the unique identifier of the `ExternalUploadStub` created earlier. Only the user who made the stub can complete the external upload. One of two paths is followed via the `ExternalUploadManager`.
1. If the object in S3 is too large (currently 100mb defined by `ExternalUploadManager::DOWNLOAD_LIMIT`) we do not download and generate the SHA1 for that file. Instead we create the `Upload` record via `UploadCreator` and simply copy it to its final destination on S3 then delete the initial temp file. Several modifications to `UploadCreator` have been made to accommodate this.
2. If the object in S3 is small enough, we download it. When the temporary S3 file is downloaded, we compare the SHA1 checksum generated by the browser with the actual SHA1 checksum of the file generated by ruby. The browser SHA1 checksum is stored on the object in S3 with metadata, and is generated via the `UppyChecksum` plugin. Keep in mind that some browsers will not generate this due to compatibility or other issues.
We then follow the normal `UploadCreator` path with one exception. To cut down on having to re-upload the file again, if there are no changes (such as resizing etc) to the file in `UploadCreator` we follow the same copy + delete temp path that we do for files that are too large.
3. Finally we return the serialized upload record back to the client
There are several errors that could happen that are handled by `UploadsController` as well.
Also in this PR is some refactoring of `displayErrorForUpload` to handle both uppy and jquery file uploader errors.
Configuring staged users to watch categories and tags is a way to sign
them up to get many emails. These emails may be unwanted and get marked
as spam, hurting the site's email deliverability.
Users can opt-in to email notifications by logging on to their
account and configuring their own preferences.
If staff need to be able to configure these preferences on behalf of
staged users, the "allow changing staged user tracking" site setting
can be enabled. Default is to not allow it.
Co-authored-by: Alan Guo Xiang Tan <gxtan1990@gmail.com>
There was a bug with changing timestamps using the topic wrench button. Under some circumstances, a topic was disappearing from the top of the latest tab after changing timestamps. Steps to reproduce:
- Choose a topic on the latest tab (the topic should be created some time ago, but has recent posts)
- Change topic timestamps (for example, move them one day forward):
- Go back to the latest tab and see that topic has disappeared.
This PR fixes this. We were setting topic.bumped_at to the timestamp user specified on the modal. This is incorrect. Instead, we should be setting topic.bumped_at to the created_at timestamp of the last regular (not a whisper and so on) post on the topic.
Currently when bulk-awarding a badge that can be granted multiple times, users in the CSV file are granted the badge once no matter how many times they're listed in the file and only if they don't have the badge already.
This PR adds a new option to the Badge Bulk Award feature so that it's possible to grant users a badge even if they already have the badge and as many times as they appear in the CSV file.
User flair was given by user's primary group. This PR separates the
two, adds a new field to the user model for flair group ID and users
can select their flair from user preferences now.
For other private messages we have the site setting
personal_email_time_window_seconds (default 20s) which allows
people to edit their post etc. before the email is sent.
This PR makes the Jobs::GroupSmtpEmail enqueuer in the
PostAlerter use the same delay.
<!-- NOTE: All pull requests should have tests (rspec in Ruby, qunit in JavaScript). If your code does not include test coverage, please include an explanation of why it was omitted. -->
This PR backtracks a fair bit on this one https://github.com/discourse/discourse/pull/13220/files.
Instead of sending the group SMTP email for each user via `UserNotifications`, we are changing to send only one email with the existing `Jobs::GroupSmtpEmail` job and `GroupSmtpMailer`. We are changing this job and mailer along with `PostAlerter` to make the first topic allowed user the `to_address` for the email and any other `topic_allowed_users` to be the CC address on the email. This is to cut down on emails sent via SMTP, which is subject to daily limits from providers such as Gmail. We log these details in the `EmailLog` table now.
In addition to this, we have changed `PostAlerter` to no longer rely on incoming email email addresses for sending the `GroupSmtpEmail` job. This was unreliable as a user's email could have changed in the meantime. Also it was a little overcomplicated to use the incoming email records -- it is far simpler to reason about to just use topic allowed users.
This also adds a fix to include cc_addresses in the EmailLog.addressed_to_user scope.
The generated regular expressions did not contain \b which matched
every text that contained the word, even if it was only a substring of
a word.
For example, if "art" was a watched word a post containing word
"artist" matched.
Subclasses must call #delete_user_actions inside build_actions to support user deletion. The method adds a delete user bundle, which has a delete and a delete + block option. Every subclass is responsible for implementing these actions.
Notifying about a tag change sometimes resulted in loading a large
number of users in memory just to perform an exclusion. This commit
prefers to do inclusion (i.e. instead of exclude users X, do include
users in groups Y) and does it in SQL to avoid fetching unnecessary
data that is later discarded.
When a group only has SMTP enabled and not IMAP, we do not
want to enqueue the :group_smtp_email job because using the group's
SMTP credentials for sending user_private_message emails is
handled by the UserNotifications class.
We do not want the :group_smtp_email job to be enqueued because
that uses a reply key instead of the group.email_username
for the reply-to address which is not what we want for SMTP
only, and also creates an IncomingEmail record to prevent IMAP
double syncing which we do not need either.
There is an open question about what happens when IMAP is
enabled after SMTP has been enabled for a while, and also questions
around whether we could do away with :group_smtp_email altogether
and handle everything via EmailLog and UserNotifications, adding
additional columns to the former and modifying the Imap::Sync
class to take this into account...a lot more further testing
for IMAP needs to be done to answer those questions.
For now, this fix should be sufficient to get the correct
reply-to address for user_private_response messages sent in
response to emails sent directly to the group's
email_username SMTP address.
Co-authored-by: Alan Guo Xiang Tan <gxtan1990@gmail.com>
This PR changes the `UserNotification` class to send outbound `user_private_message` using the group's SMTP settings, but only if:
* The first allowed_group on the topic has SMTP configured and enabled
* SiteSetting.enable_smtp is true
* The group does not have IMAP enabled, if this is enabled the `GroupSMTPMailer` handles things
The email is sent using the group's `email_username` as both the `from` and `reply-to` address, so when the user replies from their email it will go through the group's SMTP inbox, which needs to have email forwarding set up to send the message on to a location (such as a hosted site email address like meta@discoursemail.com) where it can be POSTed into discourse's handle_mail route.
Also includes a fix to `EmailReceiver#group_incoming_emails_regex` to include the `group.email_username` so the group does not get a staged user created and invited to the topic (which was a problem for IMAP), as well as updating `Group.find_by_email` to find using the `email_username` as well for inbound emails with that as the TO address.
#### Note
This is safe to merge without impacting anyone seriously. If people had SMTP enabled for a group they would have IMAP enabled too currently, and that is a very small amount of users because IMAP is an alpha product, and also because the UserNotification change has a guard to make sure it is not used if IMAP is enabled for the group. The existing IMAP tests work, and I tested this functionality by manually POSTing replies to the SMTP address into my local discourse.
There will probably be more work needed on this, but it needs to be tested further in a real hosted environment to continue.