Commit Graph

6333 Commits

Author SHA1 Message Date
Guo Xiang Tan
cfd507822f
PERF: Improve quality of PostSearchData#raw_data. (#7275)
This commit fixes the follow quality issue with `PostSearchData#raw_data`:

1. URLs are being tokenized and links with similar href and characters
are being duplicated in the raw data.

`Post#cooked`:

```
<p><a href=\"https://meta.discourse.org/some.png\" class=\"onebox\" target=\"_blank\" rel=\"nofollow noopener\">https://meta.discourse.org/some.png</a></p>
```

`PostSearchData#raw_data` Before:

```
This is a test topic 0 Uncategorized https://meta.discourse.org/some.png discourse org/some png https://meta.discourse.org/some.png discourse org/some png
```

`PostSearchData#raw_data` After:

```
This is a test topic 0 Uncategorized https://meta.discourse.org/some.png meta discourse org
```

2. Ligthbox being included in search pollutes the
`PostSearchData#raw_data` unncessarily.

From 28 March 2018 to 28 March 2019, searches for the term `image` on
`meta.discourse.org` had a click through rate of 2.1%. Non-lightboxed images are not included in indexing for search yet we were indexing content within a lightbox. Also, search for terms like `image` was affected we were using `Pasted image` as the filename for
uploads that were pasted.

`Post#cooked`

```
<p>Let me see how I can fix this image<br>\n<div class=\"lightbox-wrapper\"><a class=\"lightbox\" href=\"https://meta.discourse.org/some.png\" title=\"some.png\" rel=\"nofollow noopener\"><img src=\"https://meta.discourse.org/some.png\" width=\"275\" height=\"299\"><div class=\"meta\">\n<svg class=\"fa d-icon d-icon-far-image svg-icon\" aria-hidden=\"true\"><use xlink:href=\"#far-image\"></use></svg><span class=\"filename\">some.png</span><span class=\"informations\">1750×2000</span><svg class=\"fa d-icon d-icon-discourse-expand svg-icon\" aria-hidden=\"true\"><use xlink:href=\"#discourse-expand\"></use></svg>\n</div></a></div></p>
```

`PostSearchData#raw_data` Before:

```
This is a test topic 0 Uncategorized Let me see how I can fix this image some.png png https://meta.discourse.org/some.png discourse org/some png some.png png 1750×2000
```

`PostSearchData#raw_data` After:

```
This is a test topic 0 Uncategorized Let me see how I can fix this image
```

In terms of indexing performance, we now have to parse the given HTML
through nokogiri twice. However performance is not a huge worry here since a string length of 194170 takes only 30ms
to scrub plus the indexing takes place in a background job.
2019-04-01 10:14:29 +08:00
Guo Xiang Tan
7ac76fe935 DEV: Remove warning.
Library has already been loaded in application.rb.
2019-04-01 10:11:08 +08:00
Guo Xiang Tan
daeda80ada
FIX: Don't index posts with empty Post#raw for search. (#7263)
* DEV: Remove unnecessary join in `Jobs::ReindexSearch`.

* FIX: Don't index posts with empty `Post#raw` for search.
2019-04-01 10:06:27 +08:00
Guo Xiang Tan
730ebdfcba DEV: Refactor Jobs::EmitWebHookEvent specs. 2019-04-01 09:46:39 +08:00
Vinoth Kannan
904ba266cf SPEC: Add test case in emit_web_hook_event_spec for commit 4c6bfb9 2019-03-31 16:28:40 +05:30
Bianca Nenciu
034b8a7ecc FIX: Let users delete topics.
Follow-up to 31053f30de.
2019-03-29 22:00:36 +02:00
Robin Ward
370355d754 FIX: Allow users with posts to be rejected 2019-03-29 13:53:46 -04:00
Bianca Nenciu
31053f30de FEATURE: Let users delete their own topics. (#7267) 2019-03-29 17:10:05 +01:00
Maja Komel
4a3daacb1b FIX: reset embedding settings when no embeddable host, log host changes (#7264) 2019-03-29 17:05:51 +01:00
Roman Rizzi
7740b1570b FIX: Avoid the deleted_at scope when recovering a topic from a recently recovered post 2019-03-29 09:40:15 -04:00
Tarek Khalil
b1cb95fc23
FEATURE: Introduce ignore duration selection (#7266)
* FEATURE: Introducing new UI for tracking User's ignored or muted states
2019-03-29 10:14:53 +00:00
Guo Xiang Tan
f458cba4cb FIX: Admin search logs should filter by date instead of timestamp.
The client side filters by date so it is confusion when the data changes as each second passes.
2019-03-29 11:50:25 +08:00
Guo Xiang Tan
d8faf5f79e FIX: SearcLog.term_details generating incorrect data because of case.
Also match on equality rather than "LIKE ?" which is quite strange.
2019-03-29 11:31:01 +08:00
Guo Xiang Tan
8c2fa99f78 FIX: Remove :term from admin/search_logs/term/:term route.
Search log terms is a string that can contain characters like `/` which
messes with the route.
2019-03-29 09:48:20 +08:00
Robin Ward
118f98b6ed Linting error 2019-03-28 12:52:46 -04:00
Robin Ward
b58867b6e9 FEATURE: New 'Reviewable' model to make reviewable items generic
Includes support for flags, reviewable users and queued posts, with REST API
backwards compatibility.

Co-Authored-By: romanrizzi <romanalejandro@gmail.com>
Co-Authored-By: jjaffeux <j.jaffeux@gmail.com>
2019-03-28 12:45:10 -04:00
Sam Saffron
9ebabc1de8 FEATURE: unconditionally update Topic updated_at when posts change in topic
Previously we would bypass touching `Topic.updated_at` for whispers and post
recovery / deletions.

This meant that certain types of caching can not be done where we rely on
this information for cache accuracy.

For example if we know we have zero unread topics as of yesterday and whisper
is made I need to bump this date so the cache remains accurate

This is only half of a larger change but provides the groundwork.

Confirmed none of our serializers leak out Topic.updated_at so this is safe
spot for this info

At the moment edits still do not change this but it is not relevant for the
unread cache.

This commit also cleans up some specs to use the new `eq_time` matcher for
millisecond fidelity comparison of times

Previously `freeze_time` would fudge this which is not that clean.
2019-03-28 17:28:01 +11:00
David Taylor
95d5819218 FIX: Re-download hotlinked optimized images (#7249)
* FIX: Download local images, even if download remote is disabled
2019-03-27 21:31:12 +01:00
Tim Lange
12181599db FIX: Staff action records now also accepts action_name as filter (#7256) 2019-03-27 21:29:15 +01:00
Bianca Nenciu
a9798f0c47
FEATURE: Add page for all group membership requests. (#6909) 2019-03-27 13:30:59 +02:00
Tarek Khalil
ef2362a30f
FEATURE: Introducing new UI for changing User's notification levels (#7248)
* FEATURE: Introducing new UI for tracking User's ignored or muted states
2019-03-27 09:41:50 +00:00
Gerhard Schlager
4f04ae5692 FIX: Failed to show details about some bounced emails
Bounces sent to reply_by_email_address could not be found.
2019-03-26 18:00:27 +01:00
Tarek Khalil
41563ba6b2
FIX: flaky test in reports (#7255)
* FIX: flaky test in reports
2019-03-26 13:23:57 +00:00
Guo Xiang Tan
4774633dac DEV: Remove StatsSocket.
Removed in favor of https://github.com/discourse/discourse-prometheus.
2019-03-26 18:16:58 +08:00
Gerhard Schlager
dc90133d29 FIX: Forcing permissions of seeded categories shouldn't fail
Less restrictive permissions of subcategories could make the seeding of categories fail.
2019-03-26 10:39:07 +01:00
Guo Xiang Tan
dae0bb4c67 FIX: Post blurb incorrect when search contains a phrase match.
If the blurb generated is not around the search term, we will not be
able to highlight it on the client side.
2019-03-26 17:01:52 +08:00
Guo Xiang Tan
1799820256 DEV: Improve search phrase spec to show that it actually works. 2019-03-26 16:31:15 +08:00
Guo Xiang Tan
bf57f39353 DEV: Remove code that is not used. 2019-03-26 15:36:26 +08:00
Tim Lange
5a9dd923cc FIX: Onebox discourse user not respecting enable names (#7245) 2019-03-25 12:50:14 +05:30
Guo Xiang Tan
ac661e856a
FEATURE: Allow categories to be prioritized/deprioritized in search. (#7209) 2019-03-25 10:59:55 +08:00
Vinoth Kannan
b8bd031648 FIX: Always include custom fields in CategorySerializer
even if it is empty
2019-03-25 07:59:56 +05:30
Sam Saffron
40ac895ef7 SECURITY: properly validate return URL for SSO
Previously carefully crafted URLs could redirect off site
2019-03-25 09:02:42 +11:00
David Taylor
bb3f8e32e5 DEV: Run pull_hotlinked_images onebox specs without synchronous jobs
This was changed in fa5a1586, and caused a mutex to lock up, adding 60 seconds to the test suite.
2019-03-22 19:27:37 +00:00
Neil Lalonde
399e937a38 FIX: prevent sending multiple summary emails due to Sidekiq delays 2019-03-22 12:34:34 -04:00
Penar Musaraj
51e08feb7e DEV: Refactor icons used in lightbox HTML
Uses <svg> elements instead of hacky CSS pseudoelements

Adds a migration to mark posts with lightboxes as needing a rebake
2019-03-22 11:52:06 -04:00
David Taylor
a9d5ffbe3d FIX: Prevent critical emails bypassing disable, and improve email test logic
- The test_email job is removed, because it was always being run synchronously (not in sidekiq)
- 34b29f62 added a bypass for critical emails, to match the spec. This removes the bypass, and removes the spec.
- This adapts the specs for 72ffabf6, so that they check for emails being sent
- This reimplements c2797921, allowing test emails to be sent even when emails are disabled
2019-03-22 17:28:43 +08:00
Guo Xiang Tan
839a54b97b FIX: Destroy OptimizedImage record even if Upload record is invalid. 2019-03-22 16:47:06 +08:00
Guo Xiang Tan
19c3c25db1 FIX: Handle BBCode in migrate_to_s3 task as well. 2019-03-22 16:47:06 +08:00
David Taylor
3f9e7eb326 FIX: Respect the disable_emails=non-staff site setting correctly
This reverts commit c279792130.

This commit inadvertently removed all of the non-staff email logic, rather than just for the 'test email' button. 

https://meta.discourse.org/t/112231/5
2019-03-21 21:44:14 +00:00
Neil Lalonde
1812a38f0a FIX: upload watched words should use UTF-8 2019-03-21 13:46:16 -04:00
Maja Komel
34730a0b16 UX: show if webhook is disabled (#7217)
+ show in staff logs when webhook is created/updated/destroyed
2019-03-21 16:13:09 +01:00
Tarek Khalil
605530a77f FEATURE: Include muted users count within the ignored users report (#7230) 2019-03-21 14:31:45 +01:00
Tarek Khalil
a31a35b334 FEATURE: Ignored user notification behaviour should be as a muted user (#7227) 2019-03-21 12:15:34 +01:00
Tim Lange
f7b156ffbd UX: Better emoji escaping for topic title (#7218)
* FIX: Fixed failing discourse-prometheus-alert-receiver plugin specs
2019-03-21 09:11:33 +01:00
Gerhard Schlager
64bf4d4483 DEV: Add spec for reusing category permalink
Follow-up to f3c76ad482
2019-03-20 23:38:59 +01:00
Tarek Khalil
1dd0fa0c4e
REFACTOR: Move redundant ignored user check into guardian (#7219)
* REFACTOR: Move redundant ignored user check into guardian
2019-03-20 19:55:46 +00:00
Tarek Khalil
ed73cc60a9 FIX: Staff should be allowed to ignore users (#7216) 2019-03-20 15:47:13 +01:00
Tarek Khalil
5852e86226 FEATURE: Only allow TL2 Users to ignore other users (#7212) 2019-03-20 15:02:33 +01:00
Tarek Khalil
3b59ff0d02 [FEATURE] Disallow ignoring self, admins or moderators users (#7202) 2019-03-20 11:18:46 +01:00
Tarek Khalil
fed2dd9148 FEATURE: Add scheduled job to purge expired ignored users (#7211) 2019-03-20 11:01:43 +01:00