Commit Graph

10325 Commits

Author SHA1 Message Date
Sam
651476e89e
FIX: domain searches not working properly for URLs (#20136)
If a post contains domain with a word that stems to a non prefix single
words will not match it.

For example: in happy.com, `happy` stems to `happi`. Thus searches for happy
will not find URLs with it included.

This bloats the index a tiny bit, but impact is limited.

Will require a full reindex of search to take effect. 

When we are done refining search we can consider a full version bump.
2023-02-03 09:55:28 +11:00
Sam
1dba1aca27
FIX: add support for PG 14 and up (#20137)
Previously to_tsquery would split terms and join with &

In PG 14 terms are split and use <-> which means followed directly by.

In PG 13:

discourse_test=# SELECT to_tsquery('english', '''hello world''');
     to_tsquery
---------------------
 'hello' & 'world'
(1 row)

In PG 14:

discourse_test=# SELECT to_tsquery('english', '''hello world''');
     to_tsquery
---------------------
 'hello' <-> 'world'
(1 row)


Change is very unobtrosive, we simply amend our to_tsquery to behave like
it used to behave and make no use of the `<->` operator


More detail at: https://akorotkov.github.io/blog/2021/05/22/pg-14-query-parsing/

Note that plainto_tsquery used elsewhere in Discourse keeps the exact
same function.

This also corrects a faulty test that was passing by a fluke on older
version of PG
2023-02-03 08:11:25 +11:00
Rafael dos Santos Silva
e4fd3d9850
FIX: Better ordering of similar user search suggestions (#20142)
* FIX: Better ordering of similar user search suggestions
2023-02-02 14:39:44 -03:00
Rafael dos Santos Silva
14cf8eacf1
FEATURE: Use similarity in user search (#20112)
Currently, when doing `@mention` for users we have 0 tolerance for typos and misspellings.

With this patch, if a user search doesn't return enough results we go and use `pg_trgm` features to try and find more matches based on trigrams of usernames and names.

It also introduces GiST indexes on those fields in order to improve performance of this search, going from 130ms down to 15ms in my tests.

This is all gated in a feature flag and can be enabled by running  `SiteSetting.user_search_similar_results = true` in the rails console.
2023-02-02 13:35:04 -03:00
David Taylor
54f165beae DEV: Correct syntax_tree violations 2023-02-02 13:03:11 +00:00
Selase Krakani
2e78045af1
FIX: Extend username updates to self-mentions (#20071)
Posts with self-mentions aren't updated with username updates. This happens
because mention `UserAction` entries aren't logged for self-mentions.

This change updates the lookup of `Post` and `PostRevision` with mentions to bypass
`UserAction` entries.
2023-02-02 12:33:42 +00:00
Alan Guo Xiang Tan
ce531913a8
FIX: Sync user's reviewables count when loading reviewables list (#20128)
1. What is the problem here?

When a user's reviewables count changes, the changes are published via
MessageBus in a background Sidekiq job which means there is a delay before the
client receives the MessageBus message with the updated count. During
the time the reviewables count for a user has been updated and the time
when the client receives the MessageBus message with the updated count,
a user may view the reviewables list in the user menu. When that happens, the number of
reviewables in the list may be out of sync with the count shown.

2. What is the fix?

Going forward, the response for the `ReviewablesController#user_menu_list` action will include the user's reviewables count as
the `reviewables_count` attribute. This is then used by the client side
to update the user's reviewables count to ensure that the reviewables
list and count are kept in sync.
2023-02-02 10:19:51 +08:00
Sam
4570118a63
FIX: search index duplicate parser matching is too restrictive (#20129)
Previous regex did not allow for cases where a lexeme contains a : (colon)

This can happen when parsing URLs. New algorithm allows for this.
Test was amended to more clearly call out index problems
2023-02-02 12:17:19 +11:00
Joffrey JAFFEUX
df50df041a
FIX: corrects a regression hiding avatar in user selector (#20107)
Due to the way templates work, the incorrect variable (user instead of item) was not causing any error, and just failing silently to display the avatar.

This commit is also providing a basic spec for completion of users and groups.
2023-02-01 16:42:39 +01:00
Roman Rizzi
5c699e4384
DEV: Pass messageId as a dynamic segment instead of a query param (#20013)
* DEV: Rnemae channel path to just c

Also swap the channel id and channel slug params to be consistent with core.

* linting

* channel_path

* Drop slugify helper and channel route without slug

* Request slug and route models through the channel model if possible

* DEV: Pass messageId as a dynamic segment instead of a query param

* Ensure change is backwards-compatible

* drop query param from oneboxes

* Correctly extract channelId from routes

* Better route organization using siblings for regular and near-message

* Ensures sessions are unique even when using parallelism

* prevents didReceiveAttrs to clear input mid test

* we disable animations in capybara so sometimes the message was barely showing

* adds wait

* ensures finished loading

* is it causing more harm than good?

* this check is slowing things for no reason

* actually target the button

* more resilient select chat message

* apply similar fix to bookmark

* fix

---------

Co-authored-by: Joffrey JAFFEUX <j.jaffeux@gmail.com>
2023-02-01 12:39:23 -03:00
Osama Sayegh
f94951147e
FIX: Replace R2 gem with rtlcss for generating RTL CSS (#19636)
We've had a couple of problems with the R2 gem where it generated a broken RTL CSS bundle that caused a badly broken layout when Discourse is used in an RTL language, see a3ce93b and 5926386. For this reason, we're replacing R2 with `rtlcss` that can handle modern CSS features better than R2 does.

`rltcss` is written in JS and available as an npm package. Calling the `rltcss` from rubyland is done via the `rtlcss_wrapper` gem which contains a distributable copy of the `rtlcss` package and loads/calls it with Mini Racer. See https://github.com/discourse/rtlcss_wrapper for more details.

Internal topic: t/76263.
2023-02-01 14:21:15 +03:00
David Taylor
66256c15bd
UX: Calculate missing hover/selected colors from existing colors (#20105)
`--d-hover` is calculated to be equivalent to primary-100 in light mode, or primary-low in dark mode

`--d-selected` is calculated to be equivalent to primary-low in light mode, or primary-100 in dark mode

`lib/color_math` is introduced to provide some utilities for making these calculations.
2023-02-01 09:55:21 +00:00
Alan Guo Xiang Tan
07ef828db9
DEV: Improve MessageBus subscriptions for TopicTrackingState (#19767)
## Why do we need this change? 

When loading the ember app, [MessageBus does not start polling immediately](f31f0b70f8/app/assets/javascripts/discourse/app/initializers/message-bus.js (L71-L81)) and instead waits for `document.readyState` to be `complete`. What this means is that if there are new messages being created while we have yet to start polling, those messages will not be received by the client.

With sidebar being the default navigation menu, the counts derived from `topic-tracking-state.js` on the client side is prominently displayed on every page. Therefore, we want to ensure that we are not dropping any messages on the channels that `topic-tracking-state.js` subscribes to.  

## What does this change do? 

This includes the `MessageBus.last_id`s for the MessageBus channels which `topic-tracking-state.js` subscribes to as part of the preloaded data when loading a page. The last ids are then used when we subscribe the MessageBus channels so that messages which are published before MessageBus starts polling will not be missed.

## Review Notes

1. See https://github.com/discourse/message_bus#client-support for documentation about subscribing from a given message id.
2023-02-01 07:18:45 +08:00
Alan Guo Xiang Tan
f1ea2a2509
DEV: Add validator for search_ranking_weights site setting (#20088)
Follow-up to 6934edd97c
2023-02-01 06:43:41 +08:00
David Taylor
c760efc924
FIX: Allow non-persisted color-scheme colors to be edited (#20104)
When we introduce new color scheme colors, they are not immediately persisted to the database for all color schemes. Previously, this meant that they would be unavailable in the admin UI for editing. The only way to work with the new colors was to create a new color scheme.

This commit updates the serializer so that all colors are serialized, even if they are not yet persisted to the database for the current scheme. This means that they now show up in the admin UI and can be edited.
2023-01-31 17:10:32 +00:00
Blake Erickson
64986244d7
DEV: Change default bootstrap min users for private sites (#19810)
* DEV: Change default bootstrap min users for private sites

Private sites should have a lower min users to escape bootstrap mode.

* reset back to 50 if site is changed to public, added some tests

* fix formatting

* Remove comment

* Move constant declaration

* Update config/initializers/014-track-setting-changes.rb

Shaving a bit of repetition

Co-authored-by: Jarek Radosz <jradosz@gmail.com>

* Remove commented out code

* stree

---------

Co-authored-by: Jarek Radosz <jradosz@gmail.com>
2023-01-31 09:09:03 -07:00
Jan Cernik
06817bd94f
FIX: Category permission change not creating a log (#20027)
It didn't create a log if the category was public { "everyone" => 1 }
2023-01-31 10:15:17 -03:00
Ghassan Maslamani
96a6bb69b5
FIX: vimeo iframe url when data-original-href is missing (#18894) 2023-01-31 12:00:27 +01:00
Harry Wood
bdf8815b71
DEV: Add a test for create_post in import scripts (#18893)
Add some testing of the `create_post` method in ImportScripts::Base

Basic test of Post creation and (if enabled) the bbcode_to_md call.
2023-01-31 11:10:06 +01:00
Joffrey JAFFEUX
dfba155c54
DEV: skip failing spec (#20095) 2023-01-31 10:58:50 +01:00
Sam
c5345d0e54
FEATURE: prioritize_exact_search_title_match hidden setting (#20089)
The new `prioritize_exact_search_match` can be used to force the search
algorithm to prioritize exact term matches in title when ranking results.

This is scoped narrowly to titles for cases such as a topic titled:

"organisation chart" and a search of "org chart".

If we scoped this wider, all discussion about "org chart" would float to
the top and leave a very common title de-prioritized.

This is a hidden site setting and it has some performance impact due
to double ranking.

That said, performance impact is somewhat mitigated cause ranking on
title alone is a very cheap operation.
2023-01-31 16:34:01 +11:00
Alan Guo Xiang Tan
f31f0b70f8
SECURITY: Hide PM count for tags by default (#20061)
Currently `Topic#pm_topic_count` is a count of all personal messages tagged for a given tag. As a result, any user with access to PM tags can poll a sensitive tag to determine if a new personal message has been created using that tag even if the user does not have access to the personal message. We classify this as a minor leak in sensitive information.

With this commit, `Topic#pm_topic_count` is hidden from users by default unless the `display_personal_messages_tag_counts` site setting is enabled.
2023-01-31 12:08:23 +08:00
Sam
07679888c8
FEATURE: allow restricting duplication in search index (#20062)
* FEATURE: allow restricting duplication in search index

This introduces the site setting `max_duplicate_search_index_terms`.
Using this number we limit the amount of duplication in our search index.

This allows us to more correctly weight title searches, so bloated posts
don't unfairly bump to the top of search results.

This feature is completely disabled by default and behind a site setting

We will experiment with it first. Note entire search index must be rebuilt
for it to take effect.


---------

Co-authored-by: Alan Guo Xiang Tan <gxtan1990@gmail.com>
2023-01-31 12:41:31 +11:00
Alan Guo Xiang Tan
c5c72a74b7
DEV: Fix flaky test due to a lack of deterministic ordering (#20087) 2023-01-31 08:57:34 +08:00
Alan Guo Xiang Tan
6934edd97c
DEV: Add hidden site setting to configure search ranking weights (#20086)
This site setting is mostly experimental at this point.
2023-01-31 08:57:13 +08:00
Sam
5d669d8aa2
Revert "FEATURE: hidden site setting to disable search prefix matching (#20058)" (#20073)
This reverts commit 64f7b97d08.

Too many side effects for this setting, we have decided to remove it
2023-01-31 07:39:23 +08:00
David Taylor
4d12bdfdcb
DEV: Fix user_status_controller_spec flakiness (#20083)
In some situations, these HTTP calls would cause some cache to warmup and send a `/distributed_hash` message-bus message. We can avoid tracking those by passing a specific channel name to `track_publish`.
2023-01-30 22:42:47 +00:00
Joffrey JAFFEUX
a4c32e3970
DEV: attempts to fix flakey spec (#20075) 2023-01-30 21:47:44 +01:00
Joffrey JAFFEUX
137f28e0d6
DEV: skip spec failing on CI (#20077) 2023-01-30 21:47:31 +01:00
David Taylor
fa7f8d8e1b
DEV: Update user-status test to assert message-bus channels (#20068)
This test appears to be flaky. This assertion should help us track down the reason.
2023-01-30 13:54:44 +00:00
David Taylor
79bea9464c
PERF: Move user-tips and narrative to per-user messagebus channels (#19773)
Using a shared channel with per-message permissions means that every client is updated with the channel's 'last_id', even if there are no messages available to them. Per-user channel names avoid this problem - the last_id will only be incremented when there is a message for the given user.
2023-01-30 11:48:09 +00:00
Bianca Nenciu
23a74ecf8f
FIX: Truncate existing user status to 100 chars (#20044)
This commits adds a database migration to limit the user status to 100
characters, limits the user status in the UI and makes sure that the
emoji is valid.

Follow up to commit b6f75e231c.
2023-01-30 10:49:08 +02:00
Ayke Halder
9f14d643a5
DEV: use structured data in crawler-linkback-list for referencing only (#16237)
This simplifies the crawler-linkback-list to only be a point of reference to the actual DiscussionForumPosting objects.

See "Summary page": https://developers.google.com/search/docs/advanced/structured-data/carousel?hl=en#summary-page
> [It] defines an ItemList, where each ListItem has only three properties: @type (set to ListItem), position (the position in the list), and url (the URL of a page with full details about that item).
2023-01-30 08:26:55 +01:00
Ayke Halder
137dbaf0dc
DEV: declare post position as simple number in structured data (#16231)
This replaces the position declared as `#123` with the more simple version `123`.

The property position may be of type Integer or Text. A value of type Integer, or more precise of type Text which simply casts to integer, is sufficient here.
See: https://schema.org/position

In category-view the topic-list already uses this notation for the position of topics:
`<meta itemprop="position" content="123">`
2023-01-30 08:07:04 +01:00
Sam
64f7b97d08
FEATURE: hidden site setting to disable search prefix matching (#20058)
Many users seems surprised by prefix matching in search leading to
unexpected results.

Over the years we always would return results starting with a search term
and not expect exact matches.

Meaning a search for `abra` would find `abracadabra`

This introduces the Site Setting `enable_search_prefix_matching` which
defaults to true. (behavior unchanged)

We plan to experiment on select sites with exact matches to see if the
results are less surprising
2023-01-30 12:44:40 +08:00
Alan Guo Xiang Tan
7ec6e6b3d0
PERF: N+1 queries on /tags with multiple categories tags (#19906)
When the `tags_listed_by_group` site setting is disable, we were seeing
the N+1 queries problem when multiple `CategoryTag` records are listed.
This commit fixes that by ensuring that we are not filtering through the
category `tags` association after the association has been eager loaded.
2023-01-30 08:53:17 +08:00
Keegan George
a4c68d4a2e
FIX: Failing system spec for rate limited search (#20046) 2023-01-27 12:14:29 -08:00
Sam
2c8dfc3dbc
FEATURE: rate limit anon searches per second (#19708) 2023-01-27 10:05:27 -08:00
Bianca Nenciu
b6f75e231c
FIX: Limit user status to 100 characters (#20040)
* FIX: Limit user status to 100 characters

* FIX: Make sure the emoji is valid
2023-01-27 16:32:27 +02:00
Bianca Nenciu
8fc11215e1
FIX: Ensure soft-deleted topics can be deleted (#19802)
* FIX: Ensure soft-deleted topics can be deleted

The topic was not found during the deletion process because it was
deleted and `@post.topic` was nil.

* DEV: Use @topic instead of finding the topic every time
2023-01-27 16:15:33 +02:00
Martin Brennan
48eb8d5f5a
Revert "DEV: Delete dead Topic#incoming_email_addresses code (#19970)" (#20037)
This reverts commit 88a972c61b.

It's actually used in some plugins.
2023-01-27 11:27:15 +10:00
Martin Brennan
c8f8d9dbb6
DEV: Change slugs/generate endpoint from GET to POST (#19984)
Followup on feedback on PR #19928
https://github.com/discourse/discourse/pull/19928#discussion_r1083687839,
it makes more sense to have this endpoint as a POST rather than
a GET.
2023-01-27 10:58:33 +10:00
David Taylor
798b4bb604
FIX: Ensure anon-cached values are never returned for API requests (#20021)
Under some situations, we would inadvertently return a public (unauthenticated) result to an authenticated API request. This commit adds the `Api-Key` header to our anonymous cache bypass logic.
2023-01-26 13:26:29 +00:00
Penar Musaraj
60990aab55
DEV: Fix flakey assertion in test (#20011)
This assertion was failing in internal builds. I can repro locally if I
set `foobarbaz` to be created after `quxbarbaz`.

For now, I think this complication in the test is unnecessary, hence this
removes the `quxbarbaz` case.
2023-01-25 13:24:13 -05:00
Bianca Nenciu
c186a46910
SECURITY: Prevent XSS in local oneboxes (#20008)
Co-authored-by: OsamaSayegh <asooomaasoooma90@gmail.com>
2023-01-25 19:17:21 +02:00
Bianca Nenciu
f55e0fe791
SECURITY: Update to exclude tag topic filter (#20006)
Ignores tags specified in exclude_tag topics param that a user does not
have access to.

Co-authored-by: Blake Erickson <o.blakeerickson@gmail.com>
2023-01-25 18:56:22 +02:00
Bianca Nenciu
105fee978d
SECURITY: only show restricted tag lists to authorized users (#20004)
Co-authored-by: Penar Musaraj <pmusaraj@gmail.com>
2023-01-25 18:55:55 +02:00
Bianca Nenciu
cd7c8861ae
SECURITY: Remove bypass for base_url (#19995)
The check used to be necessary because we validated the referrer too and
this bypass was a workaround a bug that is present in some browsers that
do not send the correct referrer.
2023-01-25 13:50:45 +02:00
Natalie Tay
d5745d34c2
SECURITY: Limit the character count of group membership requests (#19993)
When creating a group membership request, there is no character
limit on the 'reason' field. This can be potentially be used by
an attacker to create enormous amount of data in the database.

Co-authored-by: Ted Johansson <ted@discourse.org>
2023-01-25 13:50:33 +02:00
Natalie Tay
f91ac52a22
SECURITY: Limit the length of drafts (#19989)
Co-authored-by: Loïc Guitaut <loic@discourse.org>
2023-01-25 13:50:21 +02:00