If a post contains domain with a word that stems to a non prefix single
words will not match it.
For example: in happy.com, `happy` stems to `happi`. Thus searches for happy
will not find URLs with it included.
This bloats the index a tiny bit, but impact is limited.
Will require a full reindex of search to take effect.
When we are done refining search we can consider a full version bump.
Previously to_tsquery would split terms and join with &
In PG 14 terms are split and use <-> which means followed directly by.
In PG 13:
discourse_test=# SELECT to_tsquery('english', '''hello world''');
to_tsquery
---------------------
'hello' & 'world'
(1 row)
In PG 14:
discourse_test=# SELECT to_tsquery('english', '''hello world''');
to_tsquery
---------------------
'hello' <-> 'world'
(1 row)
Change is very unobtrosive, we simply amend our to_tsquery to behave like
it used to behave and make no use of the `<->` operator
More detail at: https://akorotkov.github.io/blog/2021/05/22/pg-14-query-parsing/
Note that plainto_tsquery used elsewhere in Discourse keeps the exact
same function.
This also corrects a faulty test that was passing by a fluke on older
version of PG
Currently, when doing `@mention` for users we have 0 tolerance for typos and misspellings.
With this patch, if a user search doesn't return enough results we go and use `pg_trgm` features to try and find more matches based on trigrams of usernames and names.
It also introduces GiST indexes on those fields in order to improve performance of this search, going from 130ms down to 15ms in my tests.
This is all gated in a feature flag and can be enabled by running `SiteSetting.user_search_similar_results = true` in the rails console.
Posts with self-mentions aren't updated with username updates. This happens
because mention `UserAction` entries aren't logged for self-mentions.
This change updates the lookup of `Post` and `PostRevision` with mentions to bypass
`UserAction` entries.
1. What is the problem here?
When a user's reviewables count changes, the changes are published via
MessageBus in a background Sidekiq job which means there is a delay before the
client receives the MessageBus message with the updated count. During
the time the reviewables count for a user has been updated and the time
when the client receives the MessageBus message with the updated count,
a user may view the reviewables list in the user menu. When that happens, the number of
reviewables in the list may be out of sync with the count shown.
2. What is the fix?
Going forward, the response for the `ReviewablesController#user_menu_list` action will include the user's reviewables count as
the `reviewables_count` attribute. This is then used by the client side
to update the user's reviewables count to ensure that the reviewables
list and count are kept in sync.
Previous regex did not allow for cases where a lexeme contains a : (colon)
This can happen when parsing URLs. New algorithm allows for this.
Test was amended to more clearly call out index problems
Due to the way templates work, the incorrect variable (user instead of item) was not causing any error, and just failing silently to display the avatar.
This commit is also providing a basic spec for completion of users and groups.
* DEV: Rnemae channel path to just c
Also swap the channel id and channel slug params to be consistent with core.
* linting
* channel_path
* Drop slugify helper and channel route without slug
* Request slug and route models through the channel model if possible
* DEV: Pass messageId as a dynamic segment instead of a query param
* Ensure change is backwards-compatible
* drop query param from oneboxes
* Correctly extract channelId from routes
* Better route organization using siblings for regular and near-message
* Ensures sessions are unique even when using parallelism
* prevents didReceiveAttrs to clear input mid test
* we disable animations in capybara so sometimes the message was barely showing
* adds wait
* ensures finished loading
* is it causing more harm than good?
* this check is slowing things for no reason
* actually target the button
* more resilient select chat message
* apply similar fix to bookmark
* fix
---------
Co-authored-by: Joffrey JAFFEUX <j.jaffeux@gmail.com>
We've had a couple of problems with the R2 gem where it generated a broken RTL CSS bundle that caused a badly broken layout when Discourse is used in an RTL language, see a3ce93b and 5926386. For this reason, we're replacing R2 with `rtlcss` that can handle modern CSS features better than R2 does.
`rltcss` is written in JS and available as an npm package. Calling the `rltcss` from rubyland is done via the `rtlcss_wrapper` gem which contains a distributable copy of the `rtlcss` package and loads/calls it with Mini Racer. See https://github.com/discourse/rtlcss_wrapper for more details.
Internal topic: t/76263.
`--d-hover` is calculated to be equivalent to primary-100 in light mode, or primary-low in dark mode
`--d-selected` is calculated to be equivalent to primary-low in light mode, or primary-100 in dark mode
`lib/color_math` is introduced to provide some utilities for making these calculations.
## Why do we need this change?
When loading the ember app, [MessageBus does not start polling immediately](f31f0b70f8/app/assets/javascripts/discourse/app/initializers/message-bus.js (L71-L81)) and instead waits for `document.readyState` to be `complete`. What this means is that if there are new messages being created while we have yet to start polling, those messages will not be received by the client.
With sidebar being the default navigation menu, the counts derived from `topic-tracking-state.js` on the client side is prominently displayed on every page. Therefore, we want to ensure that we are not dropping any messages on the channels that `topic-tracking-state.js` subscribes to.
## What does this change do?
This includes the `MessageBus.last_id`s for the MessageBus channels which `topic-tracking-state.js` subscribes to as part of the preloaded data when loading a page. The last ids are then used when we subscribe the MessageBus channels so that messages which are published before MessageBus starts polling will not be missed.
## Review Notes
1. See https://github.com/discourse/message_bus#client-support for documentation about subscribing from a given message id.
When we introduce new color scheme colors, they are not immediately persisted to the database for all color schemes. Previously, this meant that they would be unavailable in the admin UI for editing. The only way to work with the new colors was to create a new color scheme.
This commit updates the serializer so that all colors are serialized, even if they are not yet persisted to the database for the current scheme. This means that they now show up in the admin UI and can be edited.
* DEV: Change default bootstrap min users for private sites
Private sites should have a lower min users to escape bootstrap mode.
* reset back to 50 if site is changed to public, added some tests
* fix formatting
* Remove comment
* Move constant declaration
* Update config/initializers/014-track-setting-changes.rb
Shaving a bit of repetition
Co-authored-by: Jarek Radosz <jradosz@gmail.com>
* Remove commented out code
* stree
---------
Co-authored-by: Jarek Radosz <jradosz@gmail.com>
The new `prioritize_exact_search_match` can be used to force the search
algorithm to prioritize exact term matches in title when ranking results.
This is scoped narrowly to titles for cases such as a topic titled:
"organisation chart" and a search of "org chart".
If we scoped this wider, all discussion about "org chart" would float to
the top and leave a very common title de-prioritized.
This is a hidden site setting and it has some performance impact due
to double ranking.
That said, performance impact is somewhat mitigated cause ranking on
title alone is a very cheap operation.
Currently `Topic#pm_topic_count` is a count of all personal messages tagged for a given tag. As a result, any user with access to PM tags can poll a sensitive tag to determine if a new personal message has been created using that tag even if the user does not have access to the personal message. We classify this as a minor leak in sensitive information.
With this commit, `Topic#pm_topic_count` is hidden from users by default unless the `display_personal_messages_tag_counts` site setting is enabled.
* FEATURE: allow restricting duplication in search index
This introduces the site setting `max_duplicate_search_index_terms`.
Using this number we limit the amount of duplication in our search index.
This allows us to more correctly weight title searches, so bloated posts
don't unfairly bump to the top of search results.
This feature is completely disabled by default and behind a site setting
We will experiment with it first. Note entire search index must be rebuilt
for it to take effect.
---------
Co-authored-by: Alan Guo Xiang Tan <gxtan1990@gmail.com>
In some situations, these HTTP calls would cause some cache to warmup and send a `/distributed_hash` message-bus message. We can avoid tracking those by passing a specific channel name to `track_publish`.
Using a shared channel with per-message permissions means that every client is updated with the channel's 'last_id', even if there are no messages available to them. Per-user channel names avoid this problem - the last_id will only be incremented when there is a message for the given user.
This commits adds a database migration to limit the user status to 100
characters, limits the user status in the UI and makes sure that the
emoji is valid.
Follow up to commit b6f75e231c.
This simplifies the crawler-linkback-list to only be a point of reference to the actual DiscussionForumPosting objects.
See "Summary page": https://developers.google.com/search/docs/advanced/structured-data/carousel?hl=en#summary-page
> [It] defines an ItemList, where each ListItem has only three properties: @type (set to ListItem), position (the position in the list), and url (the URL of a page with full details about that item).
This replaces the position declared as `#123` with the more simple version `123`.
The property position may be of type Integer or Text. A value of type Integer, or more precise of type Text which simply casts to integer, is sufficient here.
See: https://schema.org/position
In category-view the topic-list already uses this notation for the position of topics:
`<meta itemprop="position" content="123">`
Many users seems surprised by prefix matching in search leading to
unexpected results.
Over the years we always would return results starting with a search term
and not expect exact matches.
Meaning a search for `abra` would find `abracadabra`
This introduces the Site Setting `enable_search_prefix_matching` which
defaults to true. (behavior unchanged)
We plan to experiment on select sites with exact matches to see if the
results are less surprising
When the `tags_listed_by_group` site setting is disable, we were seeing
the N+1 queries problem when multiple `CategoryTag` records are listed.
This commit fixes that by ensuring that we are not filtering through the
category `tags` association after the association has been eager loaded.
* FIX: Ensure soft-deleted topics can be deleted
The topic was not found during the deletion process because it was
deleted and `@post.topic` was nil.
* DEV: Use @topic instead of finding the topic every time
Under some situations, we would inadvertently return a public (unauthenticated) result to an authenticated API request. This commit adds the `Api-Key` header to our anonymous cache bypass logic.
This assertion was failing in internal builds. I can repro locally if I
set `foobarbaz` to be created after `quxbarbaz`.
For now, I think this complication in the test is unnecessary, hence this
removes the `quxbarbaz` case.
The check used to be necessary because we validated the referrer too and
this bypass was a workaround a bug that is present in some browsers that
do not send the correct referrer.
When creating a group membership request, there is no character
limit on the 'reason' field. This can be potentially be used by
an attacker to create enormous amount of data in the database.
Co-authored-by: Ted Johansson <ted@discourse.org>