This commit adds an index for the query which the chat plugin executes
multiple times when preloading user data in `Chat::ChatChannelFetcher.unread_counts`.
Sample query plan from a query I grabbed from one of our production
instance.
Before:
```
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
GroupAggregate (cost=10.77..696.67 rows=7 width=16) (actual time=7.735..7.736 rows=0 loops=1)
Group Key: cc.id
-> Nested Loop (cost=10.77..696.54 rows=12 width=8) (actual time=7.734..7.735 rows=0 loops=1)
Join Filter: (cc.id = cm.chat_channel_id)
-> Nested Loop (cost=0.56..76.44 rows=1 width=16) (actual time=0.011..0.037 rows=7 loops=1)
-> Index Only Scan using chat_channels_pkey on chat_channels cc (cost=0.28..22.08 rows=7 width=8) (actual time=0.004..0.014 rows=7 loops=1)
Index Cond: (id = ANY ('{192,300,228,727,8,612,1633}'::bigint[]))
Heap Fetches: 0
-> Index Scan using user_chat_channel_unique_memberships on user_chat_channel_memberships uccm (cost=0.28..7.73 rows=1 width=8) (actual time=0.003..0.003 rows=1 loops=7)
Index Cond: ((user_id = 1338) AND (chat_channel_id = cc.id))
-> Bitmap Heap Scan on chat_messages cm (cost=10.21..618.98 rows=89 width=12) (actual time=1.096..1.097 rows=0 loops=7)
Recheck Cond: (chat_channel_id = uccm.chat_channel_id)
Filter: ((deleted_at IS NULL) AND (user_id <> 1338) AND (id > COALESCE(uccm.last_read_message_id, 0)))
Rows Removed by Filter: 2085
Heap Blocks: exact=7106
-> Bitmap Index Scan on index_chat_messages_on_chat_channel_id_and_created_at (cost=0.00..10.19 rows=270 width=0) (actual time=0.114..0.114 rows=2085 loops=7)
Index Cond: (chat_channel_id = uccm.chat_channel_id)
Planning Time: 0.408 ms
Execution Time: 7.762 ms
(19 rows)
```
After:
```
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
GroupAggregate (cost=5.84..367.39 rows=7 width=16) (actual time=0.130..0.131 rows=0 loops=1)
Group Key: cc.id
-> Nested Loop (cost=5.84..367.26 rows=12 width=8) (actual time=0.129..0.130 rows=0 loops=1)
Join Filter: (cc.id = cm.chat_channel_id)
-> Nested Loop (cost=0.56..76.44 rows=1 width=16) (actual time=0.038..0.069 rows=7 loops=1)
-> Index Only Scan using chat_channels_pkey on chat_channels cc (cost=0.28..22.08 rows=7 width=8) (actual time=0.011..0.022 rows=7 loops=1)
Index Cond: (id = ANY ('{192,300,228,727,8,612,1633}'::bigint[]))
Heap Fetches: 0
-> Index Scan using user_chat_channel_unique_memberships on user_chat_channel_memberships uccm (cost=0.28..7.73 rows=1 width=8) (actual time=0.006..0.006 rows=1 loops=7)
Index Cond: ((user_id = 1338) AND (chat_channel_id = cc.id))
-> Bitmap Heap Scan on chat_messages cm (cost=5.28..289.71 rows=89 width=12) (actual time=0.008..0.008 rows=0 loops=7)
Recheck Cond: ((chat_channel_id = uccm.chat_channel_id) AND (id > COALESCE(uccm.last_read_message_id, 0)) AND (deleted_at IS NULL))
Filter: (user_id <> 1338)
-> Bitmap Index Scan on index_chat_messages_on_chat_channel_id_and_id (cost=0.00..5.26 rows=90 width=0) (actual time=0.008..0.008 rows=0 loops=7)
Index Cond: ((chat_channel_id = uccm.chat_channel_id) AND (id > COALESCE(uccm.last_read_message_id, 0)))
Planning Time: 1.217 ms
Execution Time: 0.188 ms
(17 rows)
```
In both ChatMessage#rebake! and in ChatMessageProcessor
when we were calling ChatMessage.cook we were missing the
user_id to cook with, which causes missed hashtag cooks
because of missing permissions.
There is no need to duplicate check chat messages when they are being
edited but not having their message text changed. This was leading to
a validation error when adding/removing an upload but not changing the
message text.
Only allow maximum of 6000 characters for chat messages when they
are created or edited. A hidden setting can control this limit,
6000 is the default.
There is also a migration here to truncate any existing messages to
6000 characters if the message is already over that and if the
chat_messages table exists. We also set cooked_version to NULL
for those messages so we can identify them for rebake.
This commit fleshes out and adds functionality for the new `#hashtag` search and
lookup system, still hidden behind the `enable_experimental_hashtag_autocomplete`
feature flag.
**Serverside**
We have two plugin API registration methods that are used to define data sources
(`register_hashtag_data_source`) and hashtag result type priorities depending on
the context (`register_hashtag_type_in_context`). Reading the comments in plugin.rb
should make it clear what these are doing. Reading the `HashtagAutocompleteService`
in full will likely help a lot as well.
Each data source is responsible for providing its own **lookup** and **search**
method that returns hashtag results based on the arguments provided. For example,
the category hashtag data source has to take into account parent categories and
how they relate, and each data source has to define their own icon to use for the
hashtag, and so on.
The `Site` serializer has two new attributes that source data from `HashtagAutocompleteService`.
There is `hashtag_icons` that is just a simple array of all the different icons that
can be used for allowlisting in our markdown pipeline, and there is `hashtag_context_configurations`
that is used to store the type priority orders for each registered context.
When sending emails, we cannot render the SVG icons for hashtags, so
we need to change the HTML hashtags to the normal `#hashtag` text.
**Markdown**
The `hashtag-autocomplete.js` file is where I have added the new `hashtag-autocomplete`
markdown rule, and like all of our rules this is used to cook the raw text on both the clientside
and on the serverside using MiniRacer. Only on the server side do we actually reach out to
the database with the `hashtagLookup` function, on the clientside we just render a plainer
version of the hashtag HTML. Only in the composer preview do we do further lookups based
on this.
This rule is the first one (that I can find) that uses the `currentUser` based on a passed
in `user_id` for guardian checks in markdown rendering code. This is the `last_editor_id`
for both the post and chat message. In some cases we need to cook without a user present,
so the `Discourse.system_user` is used in this case.
**Chat Channels**
This also contains the changes required for chat so that chat channels can be used
as a data source for hashtag searches and lookups. This data source will only be
used when `enable_experimental_hashtag_autocomplete` is `true`, so we don't have
to worry about channel results suddenly turning up.
------
**Known Rough Edges**
- Onebox excerpts will not render the icon svg/use tags, I plan to address that in a follow up PR
- Selecting a hashtag + pressing the Quote button will result in weird behaviour, I plan to address that in a follow up PR
- Mixed hashtag contexts for hashtags without a type suffix will not work correctly, e.g. #ux which is both a category and a channel slug will resolve to a category when used inside a post or within a [chat] transcript in that post. Users can get around this manually by adding the correct suffix, for example ::channel. We may get to this at some point in future
- Icons will not show for the hashtags in emails since SVG support is so terrible in email (this is not likely to be resolved, but still noting for posterity)
- Additional refinements and review fixes wil
Follow up to 766bcbc684
Makes ChatMessage.last_editor_id and ChatMessageRevision.user_id
NOT NULL since they are always filled in now and the last commit
had a migration to backfill this data.
This commit adds last_editor_id to ChatMessage for parity with Post in
core, as well as adding user_id to the ChatMessageRevision record since
we need to know who is making edits and revisions to messages, in case
in future we want to allow more than just the current user to edit chat
messages. The backfill for data here simply uses the record's creating
user ID, but in future if we allow other people to edit the messages it
will use their ID.
This is a followup of the previous refactor where we created two new
models to handle all the dedicated logic that was present in the
`ChatChannel` model.
For the sake of consistency, `DMChannel` has been renamed to
`DirectMessageChannel` and the previous `DirectMessageChannel` model is
now named `DirectMessage`. This should help reasoning about direct
messages.