discourse

mirror of https://github.com/discourse/discourse.git synced 2024-11-24 02:09:54 +08:00

Author	SHA1	Message	Date
Daniel Waterworth	1a95543e93	PERF: Don't use unaccent on string literals (#28120 ) unaccent isn't marked as a pure function, so it gets evaluated per row instead of once.	2024-07-29 15:37:25 -05:00
Sérgio Saquetim	4b20021033	DEV: Restrict `include:unlisted` search option to users that can view unlisted topics (#27977 )	2024-07-18 16:33:14 -03:00
Sérgio Saquetim	6a3e12a39c	FEATURE: Include advanced search option to include unlisted topics in the results (#27958 ) --------- Co-authored-by: Régis Hanol <regis@hanol.fr>	2024-07-18 13:43:53 -03:00
Isaac Janzen	005f623c42	DEV: Add `user_agent` column to `search_logs` (#27742 ) Add a new column - `user_agent` - to the `SearchLog` table. This column can be null as we are only allowing the user-agent string to have a max length of 2000 characters. In the case the user-agent string surpasses the max characters allowed, we simply nullify the value, and save/write the log as normal.	2024-07-05 14:05:00 -05:00
Régis Hanol	a56321efb5	FIX: topic search order When using the full page search and filtering down to a specific topic, the sort order was overwritten to be by "post_number". This was confusing because we allow different types of sort order in the full search page. This fixes it by only sorting by post_number when there's no "global" sort order defined. Since the "new topic map" uses the search endpoint behind the scenes, this also fixes the "most likes" popup. Context - https://meta.discourse.org/t/searching-order-seems-to-be-broken-when-searching-in-topic/312303	2024-06-27 18:13:26 +02:00
Sam	dc8249c08a	FEATURE: align with /filter and allow multiple category search (#27440 ) This introduces the syntax of `category:a,b,c` which will search across multiple categories. Previously there was no way to allow search across a wide selection of categories.	2024-06-12 16:06:04 +10:00
Loïc Guitaut	2a28cda15c	DEV: Update to latest rubocop-discourse	2024-05-27 18:06:14 +02:00
Jan Cernik	9fb888923d	FIX: Do not show hidden posts in search results (#26800 )	2024-04-29 12:32:02 -03:00
Régis Hanol	f7a1272fa4	DEV: cleanup custom filters to prevent leaks Ensures we clean up any custom filters added in the specs to prevent any leaks when running the specs. Follow up to https://github.com/discourse/discourse/pull/26770#discussion_r1582464760	2024-04-29 16:11:12 +02:00
Bianca Nenciu	9199c52e5e	FIX: Load categories with search topic results (#25700 ) Add categories to the serialized search results together with the topics when lazy load categories is enabled. This is necessary in order for the results to be rendered correctly and display the category information.	2024-02-21 17:29:47 +02:00
Alan Guo Xiang Tan	e61608d080	FIX: Remap postgres text search proximity operator (#25497 ) Why this change? Since `1dba1aca27`, we have been remapping the `<->` proximity operator in a tsquery to `&`. However, there is another variant of it which follows the `<N>` pattern. For example, the following text "end-to-end" will eventually result in the following tsquery `end-to-end:* <-> end:* <2> end:*` being generated by Postgres. Before this fix, the tsquery is remapped to `end-to-end:* & end:* <2> end:*` by us. This requires the search data which we store to contain `end` at exactly 2 positions apart. Due to the way we limit the number of duplicates in our search data, the search term may end up not matching anything. In `bd32912c5e`, we made it such that we do not allow any duplicates when indexing a topic's title. Therefore, search for `end-to-end` against a topic title with `end-to-end` will never match because our index will only contain one `end` term. What does this change do? We will remap the `<N>` variant of the proximity operator.	2024-02-01 07:20:46 +08:00
Martin Brennan	146da75fd7	FEATURE: Add setting & preference for search sort default order (#24428 ) This commit adds a new `search_default_sort_order` site setting, set to "relevance" by default, that controls the default sort order for the full page /search route. If the user changes the order in the dropdown on that page, we remember their preference automatically, and it takes precedence over the site setting as a default from then on. This way people who prefer e.g. Latest Post as their default can make it so.	2023-11-20 10:43:58 +10:00
Sam	f25849501d	FEATURE: allow consumers to parse a search string (#23528 ) This extends search so it can have consumers that: 1. Can split off "term" from various advanced filters and orders 2. Can build a relation of either order or filter It also moves a lot of stuff around in the search class for clarity. Two new APIs are exposed: `.apply_filter` to apply all the special filters to a posts/topics relation `.apply_order` to force a particular order (eg: order:latest) This can then be used by semantic search in Discourse AI	2023-09-12 16:21:01 +10:00
Mark VanLandingham	730f652255	DEV: Add plugin modifier locations for user search locations (#23169 )	2023-08-21 12:23:42 -05:00
Canapin	b3c722f2f7	FIX: `created:@` search keyword for uppercase usernames (#22878 ) The filter wasn't working if the username had uppercase letters.	2023-08-02 15:28:17 -04:00
Sérgio Saquetim	908117e270	DEV: Added modifier hooks to allow plugins to tweak how categories and groups are fetched (#21837 ) This commit adds modifiers that allow plugins to change how categories and groups are prefetched into the application and listed in the respective controllers. Possible use cases: - prevent some categories/groups from being prefetched when the application loads for performance reasons. - prevent some categories/groups from being listed in their respective index pages.	2023-05-30 18:41:50 -03:00
Sam	b2e3084205	FEATURE: allow searching for oldest topics (#21715 ) In some cases reverse chronological can be very important. - Oldest post by sam - Oldest topic by sam Prior to these new filters we had no way of searching for them. Now the 2 new orders `order:oldest` and `order:oldest_topic` can be used to find oldest topics and posts * Update spec/lib/search_spec.rb Co-authored-by: Alan Guo Xiang Tan <gxtan1990@gmail.com> * Update spec/lib/search_spec.rb Co-authored-by: Alan Guo Xiang Tan <gxtan1990@gmail.com> --------- Co-authored-by: Alan Guo Xiang Tan <gxtan1990@gmail.com>	2023-05-24 18:26:36 +10:00
Mark VanLandingham	96e3c5e102	DEV: Add hidden site setting to control search page size (#21640 )	2023-05-18 15:30:08 -05:00
Bianca Nenciu	d6534bdb11	DEV: Fix test (#21283 ) Apostrophe-like characters (for example, ’ and ') are transformed to the ASCII apostrophe (') regardless of search_ignore_accents.	2023-05-04 17:04:26 +03:00
Sam	c63551d227	FEATURE: search_rank_sort_priorities modifier (#21329 ) This new modifier can be used by plugins to modify search ordering. Specifically plugins such as discourse_solved can amend search ordering so solved topics bump to the top. Also correct edge case where low and high sort priority categories did not order correctly when it came to closed/archived	2023-05-02 16:36:36 +10:00
Ted Johansson	25a226279a	DEV: Replace #pluck_first freedom patch with AR #pick in core (#19893 ) The #pluck_first freedom patch, first introduced by @danielwaterworth has served us well, and is used widely throughout both core and plugins. It seems to have been a common enough use case that Rails 6 introduced its own method #pick with the exact same implementation. This allows us to retire the freedom patch and switch over to the built-in ActiveRecord method. There is no replacement for #pluck_first!, but a quick search shows we are using this in a very limited capacity, and in some cases incorrectly (by assuming a nil return rather than an exception), which can quite easily be replaced with #pick plus some extra handling.	2023-02-13 12:39:45 +08:00
Sam	5d28cb709a	FIX: de-prioritize archived topics (#20161 ) Previously due to an error archived topics were more prominent in search than closed topics. This amends our internal logic to ensure archived topics are bumped down the list.	2023-02-03 13:23:27 +11:00
Sam	1dba1aca27	FIX: add support for PG 14 and up (#20137 ) Previously to_tsquery would split terms and join with &. In PG 14 terms are split and use <-> which means followed directly by. In PG 13: discourse_test=# SELECT to_tsquery('english', '''hello world'''); to_tsquery --------------------- 'hello' & 'world' (1 row) In PG 14: discourse_test=# SELECT to_tsquery('english', '''hello world'''); to_tsquery --------------------- 'hello' <-> 'world' (1 row) Change is very unobtrusive, we simply amend our to_tsquery to behave like it used to behave and make no use of the `<->` operator. More detail at: https://akorotkov.github.io/blog/2021/05/22/pg-14-query-parsing/ Note that plainto_tsquery used elsewhere in Discourse keeps the exact same function. This also corrects a faulty test that was passing by a fluke on older versions of PG	2023-02-03 08:11:25 +11:00
Sam	c5345d0e54	FEATURE: prioritize_exact_search_title_match hidden setting (#20089 ) The new `prioritize_exact_search_title_match` can be used to force the search algorithm to prioritize exact term matches in title when ranking results. This is scoped narrowly to titles for cases such as a topic titled: "organisation chart" and a search of "org chart". If we scoped this wider, all discussion about "org chart" would float to the top and leave a very common title de-prioritized. This is a hidden site setting and it has some performance impact due to double ranking. That said, performance impact is somewhat mitigated because ranking on title alone is a very cheap operation.	2023-01-31 16:34:01 +11:00
Alan Guo Xiang Tan	6934edd97c	DEV: Add hidden site setting to configure search ranking weights (#20086 ) This site setting is mostly experimental at this point.	2023-01-31 08:57:13 +08:00
Sam	5d669d8aa2	Revert "FEATURE: hidden site setting to disable search prefix matching (#20058 )" (#20073 ) This reverts commit `64f7b97d08`. Too many side effects for this setting, we have decided to remove it	2023-01-31 07:39:23 +08:00
Sam	64f7b97d08	FEATURE: hidden site setting to disable search prefix matching (#20058 ) Many users seem surprised by prefix matching in search leading to unexpected results. Over the years we always would return results starting with a search term and not expect exact matches. Meaning a search for `abra` would find `abracadabra`. This introduces the Site Setting `enable_search_prefix_matching` which defaults to true. (behavior unchanged) We plan to experiment on select sites with exact matches to see if the results are less surprising	2023-01-30 12:44:40 +08:00
Daniel Waterworth	666536cbd1	DEV: Prefer \A and \z over ^ and $ in regexes (#19936 )	2023-01-20 12:52:49 -06:00
Sérgio Saquetim	0feb9ad341	DEV: Added callback to change the query used to filter groups in search (#19884 ) Added plugin registry that will allow adding callbacks that can change the query that is used to filter groups while running a search.	2023-01-16 15:48:00 -03:00
Bianca Nenciu	fb780c50fd	FIX: Replace all quote-like unicodes with quotes (#19714 ) If unaccent is called with quote-like Unicode characters then it can generate invalid queries because some of the transformed quotes by unaccent are not escaped and to_tsquery fails because of bad input. This commit replaces more quote-like Unicode characters before unaccent is called.	2023-01-09 19:19:51 +02:00
David Taylor	6417173082	DEV: Apply syntax_tree formatting to `lib/*`	2023-01-09 12:10:19 +00:00
Bianca Nenciu	17b7ab0d7b	FIX: Make sure generated tsqueries are valid (#19368 ) The tsquery used for searching is generated using both functions from Ruby and Postgresql (for example, unaccent function). Depending on the term used, it generated an invalid tsquery. For example "can’t" generated "''can''t''" instead of "''can''''t''".	2022-12-12 17:57:20 +02:00
Du Jiajun	41e6b516e5	FIX: Support unicode in search filter @username (#18804 )	2022-11-16 10:42:37 +01:00
Daniel Waterworth	167181f4b7	DEV: Quote values when constructing SQL (#18827 ) All of these cases should already be safe, but still good to quote for "defense in depth".	2022-11-01 14:05:13 -05:00
Bianca Nenciu	9db8f00b3d	FEATURE: Create upload_references table (#16146 ) This table holds associations between uploads and other models. This can be used to prevent removing uploads that are still in use. * DEV: Create upload_references * DEV: Use UploadReference instead of PostUpload * DEV: Use UploadReference for SiteSetting * DEV: Use UploadReference for Badge * DEV: Use UploadReference for Category * DEV: Use UploadReference for CustomEmoji * DEV: Use UploadReference for Group * DEV: Use UploadReference for ThemeField * DEV: Use UploadReference for ThemeSetting * DEV: Use UploadReference for User * DEV: Use UploadReference for UserAvatar * DEV: Use UploadReference for UserExport * DEV: Use UploadReference for UserProfile * DEV: Add method to extract uploads from raw text * DEV: Use UploadReference for Draft * DEV: Use UploadReference for ReviewableQueuedPost * DEV: Use UploadReference for UserProfile's bio_raw * DEV: Do not copy user uploads to upload references * DEV: Copy post uploads again after deploy * DEV: Use created_at and updated_at from uploads table * FIX: Check if upload site setting is empty * DEV: Copy user uploads to upload references * DEV: Make upload extraction less strict	2022-06-09 09:24:30 +10:00
Penar Musaraj	8222810099	FIX: Limits for PM and group header search (#16887 ) When searching for PMs or PMs in a group inbox, results in the header search were not being limited to 5 with a "More" link to the full page search. This PR fixes that. It also simplifies the logic and updates the search API docs to include recently added `in:messages` and `group_messages:groupname` options.	2022-05-24 11:31:24 -04:00
Martin Brennan	fcc2e7ebbf	FEATURE: Promote polymorphic bookmarks to default and migrate (#16729 ) This commit migrates all bookmarks to be polymorphic (using the bookmarkable_id and bookmarkable_type) columns. It also deletes all the old code guarded behind the use_polymorphic_bookmarks setting and changes that setting to true for all sites and by default for the sake of plugins. No data is deleted in the migrations, the old post_id and for_topic columns for bookmarks will be dropped later on.	2022-05-23 10:07:15 +10:00
Martin Brennan	955d47bbd0	FIX: Use polymorphic bookmarks for in:bookmarks search (#16684 ) This commit makes sure the in:bookmarks post advanced search filter works with polymorphic bookmarks.	2022-05-10 09:08:01 +10:00
Martin Brennan	222c8d9b6a	FEATURE: Polymorphic bookmarks pt. 3 (reminders, imports, exports, refactors) (#16591 ) A bit of a mixed bag, this addresses several edge areas of bookmarks and makes them compatible with polymorphic bookmarks (hidden behind the `use_polymorphic_bookmarks` site setting). The main ones are: * ExportUserArchive compatibility * SyncTopicUserBookmarked job compatibility * Sending different notifications for the bookmark reminders based on the bookmarkable type * Import scripts compatibility * BookmarkReminderNotificationHandler compatibility This PR also refactors the `register_bookmarkable` API so it accepts a class descended from a `BaseBookmarkable` class instead. This was done because we kept having to add more and more lambdas/properties inline and it was very messy, so a factory pattern is cleaner. The classes can be tested independently as well. Some later PRs will address some other areas like the discourse narrative bot, advanced search, reports, and the .ics endpoint for bookmarks.	2022-05-09 09:37:23 +10:00
Penar Musaraj	b266a36967	FEATURE: Add `group_messages:` keyword to advanced search (#16584 )	2022-04-28 10:47:40 -04:00
Penar Musaraj	eebce8f80a	FEATURE: Add in:messages search modifier (#16567 ) This adds `in:messages` as a synonym for `in:personal` and sets it up as our default nomenclature (`in:personal` will still work).	2022-04-26 16:47:01 -04:00
Bianca Nenciu	6eb3d658ca	FIX: Do not wrap unaccent around tsqueries (#16284 ) tsqueries use quotes and having other characters that when unaccented become quotes results in invalid tsqueries.	2022-03-25 19:10:05 +02:00
Bianca Nenciu	34b4b53bac	FEATURE: Use Postgres unaccent to ignore accents (#16100 ) The search_ignore_accents site setting can be used to make the search indexer remove the accents before indexing the content. The unaccent function from PostgreSQL is better than Ruby's unicode_normalize(:nfkd).	2022-03-07 23:03:10 +02:00
Alan Guo Xiang Tan	930f51e175	FEATURE: Split up text segmentation for Chinese and Japanese. * Chinese segmentation will continue to rely on cppjieba * Japanese segmentation will use our port of TinySegmenter * Korean currently does not rely on segmentation which was dropped in `c677877e4f` * SiteSetting.search_tokenize_chinese_japanese_korean has been split into SiteSetting.search_tokenize_chinese and SiteSetting.search_tokenize_japanese respectively	2022-02-07 09:21:14 +08:00
Alan Guo Xiang Tan	fff8b98485	SECURITY: Advanced group search did not respect visibility of groups.	2022-01-10 13:49:26 +08:00
Penar Musaraj	d99deaf1ab	FEATURE: show recent searches in quick search panel (#15024 )	2021-11-25 15:44:15 -05:00
Penar Musaraj	20f5474be9	FEATURE: Log only topic/post search queries in search log (#14994 )	2021-11-18 09:21:12 +08:00
Alan Guo Xiang Tan	a03c48b720	FIX: Use the same mode for chinese search when indexing and querying. (#14780 ) The `白名单` term becomes `名单白名单` after it is processed by cppjieba in :query mode. However, `白名单` is not tokenized as such by cppjieba when it appears in a string of text. Therefore, this may lead to failed matches as the search data generated while indexing may not contain all of the terms generated by :query mode. We've decided to maintain parity for now such that both indexing and querying use the same :mix mode. This may lead to less accurate search but our plan is to properly support CJK search in the future.	2021-11-01 10:14:47 +08:00
Dan Ungureanu	f003e31e2f	PERF: Optimize search in private messages query (#14660 ) * PERF: Remove JOIN on categories for PM search JOIN on categories is not needed when searching in private messages as PMs are not categorized. * DEV: Use == for string comparison * PERF: Optimize query for allowed topic groups There was a query that checked for all topics a user or their groups were allowed to see. This used UNION between topic_allowed_users and topic_allowed_groups which was very inefficient. That was replaced with an OR condition that checks in either table more efficiently.	2021-10-26 10:16:38 +03:00
Alan Guo Xiang Tan	6544e3b02a	DEV: Remove useless ordering when searching within a topic. (#14676 ) Searching within a topic currently does not make use of PG search and we're simply doing an `ilike` against the post raw. Furthermore, `Post#post_number` is already unique within a topic so the other ordering will never ever be used. This change simply makes the query cleaner to read.	2021-10-22 10:38:21 +08:00
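
The proximity-operator remapping described in `1dba1aca27` and `e61608d080` can be illustrated with a minimal sketch. This is not the code from either commit: the function name `remap_proximity_operators` and the regular expression are hypothetical, and the sketch assumes the tsquery text contains no quoted lexeme that itself includes `<->` or `<N>`.

```python
import re

# Matches both "followed by" operators that Postgres 14+ may emit in a
# tsquery: `<->` (directly followed by) and `<N>` (followed at distance N).
PROXIMITY_OPERATOR = re.compile(r"<(?:-|\d+)>")


def remap_proximity_operators(tsquery: str) -> str:
    """Replace every `<->` or `<N>` operator with a plain `&` (AND).

    Hypothetical helper for illustration only. It works on the raw text
    and does not parse quoted lexemes.
    """
    return PROXIMITY_OPERATOR.sub("&", tsquery)


# The "end-to-end" example from the commit message above.
print(remap_proximity_operators("end-to-end:* <-> end:* <2> end:*"))
```

With both variants remapped, the query only requires each term to be present, so it no longer depends on `end` appearing at a fixed distance in the indexed data.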

345 Commits