The default ranking options ranks by the number of matches which is
highly problematic when posts are stuffed with a keyword. The ranking
will now be divided by the document length which is a much fairer way to
rank.
This commit fixes the follow quality issue with `PostSearchData#raw_data`:
1. URLs are being tokenized and links with similar href and characters
are being duplicated in the raw data.
`Post#cooked`:
```
<p><a href=\"https://meta.discourse.org/some.png\" class=\"onebox\" target=\"_blank\" rel=\"nofollow noopener\">https://meta.discourse.org/some.png</a></p>
```
`PostSearchData#raw_data` Before:
```
This is a test topic 0 Uncategorized https://meta.discourse.org/some.png discourse org/some png https://meta.discourse.org/some.png discourse org/some png
```
`PostSearchData#raw_data` After:
```
This is a test topic 0 Uncategorized https://meta.discourse.org/some.png meta discourse org
```
2. Ligthbox being included in search pollutes the
`PostSearchData#raw_data` unncessarily.
From 28 March 2018 to 28 March 2019, searches for the term `image` on
`meta.discourse.org` had a click through rate of 2.1%. Non-lightboxed images are not included in indexing for search yet we were indexing content within a lightbox. Also, search for terms like `image` was affected we were using `Pasted image` as the filename for
uploads that were pasted.
`Post#cooked`
```
<p>Let me see how I can fix this image<br>\n<div class=\"lightbox-wrapper\"><a class=\"lightbox\" href=\"https://meta.discourse.org/some.png\" title=\"some.png\" rel=\"nofollow noopener\"><img src=\"https://meta.discourse.org/some.png\" width=\"275\" height=\"299\"><div class=\"meta\">\n<svg class=\"fa d-icon d-icon-far-image svg-icon\" aria-hidden=\"true\"><use xlink:href=\"#far-image\"></use></svg><span class=\"filename\">some.png</span><span class=\"informations\">1750×2000</span><svg class=\"fa d-icon d-icon-discourse-expand svg-icon\" aria-hidden=\"true\"><use xlink:href=\"#discourse-expand\"></use></svg>\n</div></a></div></p>
```
`PostSearchData#raw_data` Before:
```
This is a test topic 0 Uncategorized Let me see how I can fix this image some.png png https://meta.discourse.org/some.png discourse org/some png some.png png 1750×2000
```
`PostSearchData#raw_data` After:
```
This is a test topic 0 Uncategorized Let me see how I can fix this image
```
In terms of indexing performance, we now have to parse the given HTML
through nokogiri twice. However performance is not a huge worry here since a string length of 194170 takes only 30ms
to scrub plus the indexing takes place in a background job.
Includes support for flags, reviewable users and queued posts, with REST API
backwards compatibility.
Co-Authored-By: romanrizzi <romanalejandro@gmail.com>
Co-Authored-By: jjaffeux <j.jaffeux@gmail.com>
Previously we would bypass touching `Topic.updated_at` for whispers and post
recovery / deletions.
This meant that certain types of caching can not be done where we rely on
this information for cache accuracy.
For example if we know we have zero unread topics as of yesterday and whisper
is made I need to bump this date so the cache remains accurate
This is only half of a larger change but provides the groundwork.
Confirmed none of our serializers leak out Topic.updated_at so this is safe
spot for this info
At the moment edits still do not change this but it is not relevant for the
unread cache.
This commit also cleans up some specs to use the new `eq_time` matcher for
millisecond fidelity comparison of times
Previously `freeze_time` would fudge this which is not that clean.
Previously we relied on the provider name matching the name of the icon. Now icon names are explicitly set. Plugin providers which do not define an icon will get the default "sign-in-alt" icon
- The test_email job is removed, because it was always being run synchronously (not in sidekiq)
- 34b29f62 added a bypass for critical emails, to match the spec. This removes the bypass, and removes the spec.
- This adapts the specs for 72ffabf6, so that they check for emails being sent
- This reimplements c2797921, allowing test emails to be sent even when emails are disabled
- s3_force_path_style was added as a Minio specific url scheme but it has never been well supported in our code base.
- Our new migrate_to_s3 rake task does not work reliably with path style urls too
- Minio has also added support for virtual style requests i.e the same scheme as AWS S3/DO Spaces so we can rely on that instead of using path style requests.
- Add migration to drop s3_force_path_style from the site_settings table
Fixes two issues:
1. Redirecting to an external origin's path after login did not work
2. User would be erroneously redirected to the external origin after logout
https://meta.discourse.org/t/109755
* improved emoji support
- always optimize images as part of the task
- use the unicode standard ordering/naming for sections
* UX: more height for when there are recently used
Migrates email user options to a new data structure, where `email_always`, `email_direct` and `email_private_messages` are replace by
* `email_messages_level`, with options: `always`, `only_when_away` and `never` (defaults to `always`)
* `email_level`, with options: `always`, `only_when_away` and `never` (defaults to `only_when_away`)
This test did not support 'no auth' use case and other auth methods except 'login'. I fixed it by simply making the call to start() in the right way.
As shown in the source code of Net::SMTP (https://github.com/ruby/ruby/blob/ruby_2_5/lib/net/smtp.rb#L452), the start() function does accept the 'user' and 'secret' arguments. Also, in do_start() function (https://github.com/ruby/ruby/blob/ruby_2_5/lib/net/smtp.rb#L542), it automatically checks the auth method and args, skips the authentication if 'user' is not provided, and selects the right auth method from 'plain', 'login' or 'cram_md5'. This is exactly all of what we should do in a connection test and the odd 'auth_login' call in the previous code makes problems.
BTW, I am using 'localhost' as the third argument, which is the same as the default value in start(). This parameter is the domain address sent along with the 'ehlo' command in SMTP protocol. I have seen some documents, e.g. https://github.com/tpn/msmtp/blob/master/doc/msmtp.1#L455, saying that 'localhost' is fine. It works for me.
* First take
* Add support for sprites in themes
Automatically register any custom icons added via themes or plugins
* Fix theme sprite caching
* Simplify test
* Update lib/svg_sprite/svg_sprite.rb
Co-Authored-By: pmusaraj <pmusaraj@gmail.com>
* Fix /svg-sprite/search request
It is not a setting, and only relevant in specs. The new API is:
```
Jobs.run_later! # jobs will be thrown on the queue
Jobs.run_immediately! # jobs will run right away, avoid the queue
```
If the existing email address for a user ends in `.invalid`, we should take the email address from an authentication payload, and replace the invalid address. This typically happens when we import users from a system without email addresses.
This commit also adds some extensibility so that plugin authenticators can define `always_update_user_email?`
When using the api and you provide an http header based api key any other
auth based information (username, external_id, or user_id) passed in as
query params will not be used and vice versa.
Followup to f03b293e6a
We can only be sure that an email is sent when we get a mailer in
`ActionMailer::Deliveries`. A couple of tests were actually incorrect
because it didn't flow through our email sender where there are more
conditions in determining whether an email is sent or not.
Previously if you wanted to have jobs execute in test mode, you'd have
to do `SiteSetting.queue_jobs = false`, because the opposite of queue
is to execute.
I found this very confusing, so I created a test helper called
`run_jobs_synchronously!` which is much more clear about what it does.
- Notices are visible only by poster and trust level 2+ users.
- Notices are not generated for non-human or staged users.
- Notices are deleted when post is deleted.
Now you can also make authenticated API requests by passing the
`api_key` and `api_username` in the HTTP header instead of query params.
The new header values are: `Api-key` and `Api-Username`.
Here is an example in cURL:
``` text
curl -i -sS -X POST "http://127.0.0.1:3000/categories" \
-H "Content-Type: multipart/form-data;" \
-H "Api-Key: 7aa202bec1ff70563bc0a3d102feac0a7dd2af96b5b772a9feaf27485f9d31a2" \
-H "Api-Username: system" \
-F "name=7c1c0ed93583cba7124b745d1bd56b32" \
-F "color=49d9e9" \
-F "text_color=f0fcfd"
```
There is also support for `Api-User-Id` and `Api-User-External-Id`
instead of specifying the username along with the key.
When a plugin registers a language and sets fallbackLocale="en", fallback strings were missing. This commit strips any duplicate ":en" symbols when loading merged translations.
If you reply to an email with the word "mute" a topic will be muted
If you reply to an email with the word "track" a topic will be tracked
If you reply to an email with the word "watch" a topic will be watched
These ninja command can help advanced mailing list ex-users, saves a trip
to the website
By default, this does nothing. Two environment variables are available:
- `DISCOURSE_LOG_SIDEKIQ`
Set to `"1"` to enable logging. This will log all completed jobs to `log/rails/sidekiq.log`, along with various db/redis/network statistics. This is useful to track down poorly performing jobs.
- `DISCOURSE_LOG_SIDEKIQ_INTERVAL`
(seconds) Check running jobs periodically, and log their current duration. They will appear in the logs with `status:pending`. This is useful to track down jobs which take a long time, then crash sidekiq before completing.
Uses github.com/discourse/moment-timezone-names-translations to translate timezone names.
Plugins can also provide their own timezone name translations.
- Fixes deprecation regarding usage of BigDecimal in dev
- Handle edge case where query_hash would clear a non existent result
- Minor perf improvement to query_single
Most important thing though is that we are now on the latest gem
Previously with had `in:title` and `in:first` search shortcuts for
searching in first post or title only. They are a bit of handful to type.
This add 2 shortcuts (t and f) for searching titles of first posts.
This commit also cleans up all advanced filters, they were not properly
regex terminated allowing for weird clauses like `in:firstinator` acting
the same as `in:first`
State was being stored in a class variable, so was not consistent across processes. This commit moves the storage to redis. The change only affects development environments.
When a new post is triggered via message bus post stream will attempt to load
it, previously the `/topic/TOPIC_ID/posts.json` would unconditionally include
suggested topics, this caused excessive load on the server.
New pattern defaults to exclude suggested and related topics from this API
unless people explicitly ask for suggested.
- overrides :region and uses :endpoint when SiteSetting.s3_endpoint is provided
- Now, we can use the new rake task with DigitalOcean Spaces
- I've tested that it's compatible with/without bucket folder path
- I've tested that it's compatible with S3 and it doesn't break S3 for non-default regions
- follow-up on 97e17fe0
When the S3 store was enabled, we were only applying the S3 CDN.
So all images stored locally, like the emojis, were never put on the local CDN.
Fixed a bunch of CookedPostProcessor test by adding a call to 'optimize_urls'
in order to get final URLs.
I also removed the unnecessary PrettyText.add_s3_cdn method since this is already
handled in the CookedPostProcessor.
If a theme setting contained invalid SCSS, it would cause an error 500 on the site, with no way to recover. This commit stops loading theme settings in the core stylesheets, and instead only loads the color scheme variables. This change also makes `common/foundation/variables.scss` available to themes without an explicit import.
- These advanced fields are hidden behind an 'advanced' button, so will not affect normal use
- The editor has been refactored into a component, and styling cleaned up so menu items do not overlap on small screens
- Styling has been added to indicate which fields are in use for a theme
- Icons have been added to identify which fields have errors
This adjusts 53d592ad by @tgxworld
- Adds Sidekiq.upause_all! to unpause all sites
- Adds Sidekiq.paused_dbs to list dbs that are currently paused
- Handles some edge cases where unpause thread could extend expiry on
sites that were unpaused from a different process
- Ensures tests always terminates background thread used for pause
keepalive
tar exits with status 1 when uploads are modified or deleted by a sidekiq job, so we need to treat it like status 0.
According to the documentation it should be safe to ignore status 1 ("Some files differ"):
> If tar was given `--create', `--append' or `--update' option, this exit code means that some files were changed while being archived and so the resulting archive does not contain the exact copy of the file set.
Status 2 ("Fatal error") still results in an exception.
Treating TIFF and BMP as images cause us to add them to IMG tags, this is very inconsistent across browsers.
You can still upload these files they will simply not be displayed in IMG tags.
Previously it would unhide their post but leave them silenced.
This fix also cleans up some of the helper classes to make it easier
to pass extra data to the silencing code (for example, a link to the
post that caused the user to be silenced.)
This patch also refactors the auto_silence specs to avoid using
stubs.