* Support private uploads in S3
* Use localStore for local avatars
* Add job to update private upload ACL on S3
* Test multisite paths
* update ACL for private uploads in migrate_to_s3 task
The site settings beginning with "topic views heat" and "topic post like
heat" are set to defaults when installing Discourse, but there has not
been a process or guidance for updating these values based on
community activity.
This feature will update them once a month. The low, medium, and
high settings will be based on the minimums of the 45th, 25th, and
10th percentile topics respectively, so that 45% of topics will have
some "heat".
Disable automatic changes with the automatic_topic_heat_values setting.
This does two things
1. Our "index grace period" has been wound down to 1 day, there is no point
keeping a bloated index for a week, usually when people delete stuff they
mean for it to be removed
2. We were never dropping deleted posts from the index, only posts from
deleted topics
These changes speed up search a tiny bit and reduce background work.
The `AutoQueueHandler` will ignore really old flags. In that case, don't
notify the user that the moderator is looking into it. They probably
never saw it because it didn't meet the reviewable minimum priority.
Historically we would keep the user data export posts around but delete
the uploads.
This leaves a lot of broken uploads in the system.
This rake task allows us to clean up old mess.
This reduces chances of errors where consumers of strings mutate inputs
and reduces memory usage of the app.
Test suite passes now, but there may be some stuff left, so we will run
a few sites on a branch prior to merging
We found score hard to understand. It is still there behind the scenes
for sorting purposes, but it is no longer shown.
You can now filter by minimum priority (low, med, high) instead of
score.
This removes all uses of both `send` and `public_send` from consumers of
SiteSetting and instead introduces a `get` helper for dynamic lookup
This leads to much cleaner and safer code long term as we are always explicit
to test that a site setting is really there before sending an arbitrary
string to the class
It also removes a couple of risky stubs from the auth provider test
After careful analysis of large data-sets it became apparent that avg_time
had no impact whatsoever on "best of" topic scoring. Calculating avg_time
was a very costly operation especially on large databases.
We have some longer term plans of introducing other weighting that is read
time based into our scoring for "best of" and "top" topics, but in the
interim to stop a large amount of work that is not achieving any value we
are removing the jobs.
Column removal will follow once we decide on a new replacement metric.
`Upload#url` is more likely and can change from time to time. When it
does changes, we don't want to have to look through multiple tables to
ensure that the URLs are all up to date. Instead, we simply associate
uploads properly to `UserProfile` so that it does not have to replicate
the URLs in the table.
* Fix handling SNS notifications for AWS SES
This fixes detection of email bounce by:
- removing hard requirement for email ID, ID in webhook msg never equals this in email_log
- gets bounce_score from user stats instead of nonexistent field in webhook msg
* Remove empty line
* Prettify access to EmailLog for parsing SNS notification
Co-Authored-By: SystemZ <SystemZ@users.noreply.github.com>
If you turn it on now, default all users to approved since they were
previously. Also support approving a user that doesn't have a reviewable
record (it will be created first.)
This also includes a refactor to move class method calls to
`DiscourseEvent` into an initializer. Otherwise the load order of
classes makes a difference in the test environment and some settings
might be triggered and others not, randomly.
Theme developers can include any number of scss files within the /scss/ directory of a theme. These can then be imported from the main common/desktop/mobile scss.
also add a hard limit of 1000 users per job run so we do not clog the
scheduler
destroyer.destroy has a transaction and this can have some serious complications
with the open record set find_each has going
"Rejecting" a user in the queue is equivalent to deleting them, which
would then making it impossible to review rejected users. Now we store
information about the user in the payload so if they are deleted things
still display in the Rejected view.
Secondly, if a user is destroyed outside of the review queue, it will
now automatically "Reject" that queue item.
Conversely, if a user is deactivated the reviewable should automatically
be rejected.
Before this fix, if a user was not active they'd still show in the
review queue but without an "Approve" button which was confusing.
If the post ids keep loading, we might end up in a situations where
we're always loading the same post ids over and over again without
indexing anything new.
Follow up to daeda80ada.
This commit fixes the follow quality issue with `PostSearchData#raw_data`:
1. URLs are being tokenized and links with similar href and characters
are being duplicated in the raw data.
`Post#cooked`:
```
<p><a href=\"https://meta.discourse.org/some.png\" class=\"onebox\" target=\"_blank\" rel=\"nofollow noopener\">https://meta.discourse.org/some.png</a></p>
```
`PostSearchData#raw_data` Before:
```
This is a test topic 0 Uncategorized https://meta.discourse.org/some.png discourse org/some png https://meta.discourse.org/some.png discourse org/some png
```
`PostSearchData#raw_data` After:
```
This is a test topic 0 Uncategorized https://meta.discourse.org/some.png meta discourse org
```
2. Ligthbox being included in search pollutes the
`PostSearchData#raw_data` unncessarily.
From 28 March 2018 to 28 March 2019, searches for the term `image` on
`meta.discourse.org` had a click through rate of 2.1%. Non-lightboxed images are not included in indexing for search yet we were indexing content within a lightbox. Also, search for terms like `image` was affected we were using `Pasted image` as the filename for
uploads that were pasted.
`Post#cooked`
```
<p>Let me see how I can fix this image<br>\n<div class=\"lightbox-wrapper\"><a class=\"lightbox\" href=\"https://meta.discourse.org/some.png\" title=\"some.png\" rel=\"nofollow noopener\"><img src=\"https://meta.discourse.org/some.png\" width=\"275\" height=\"299\"><div class=\"meta\">\n<svg class=\"fa d-icon d-icon-far-image svg-icon\" aria-hidden=\"true\"><use xlink:href=\"#far-image\"></use></svg><span class=\"filename\">some.png</span><span class=\"informations\">1750×2000</span><svg class=\"fa d-icon d-icon-discourse-expand svg-icon\" aria-hidden=\"true\"><use xlink:href=\"#discourse-expand\"></use></svg>\n</div></a></div></p>
```
`PostSearchData#raw_data` Before:
```
This is a test topic 0 Uncategorized Let me see how I can fix this image some.png png https://meta.discourse.org/some.png discourse org/some png some.png png 1750×2000
```
`PostSearchData#raw_data` After:
```
This is a test topic 0 Uncategorized Let me see how I can fix this image
```
In terms of indexing performance, we now have to parse the given HTML
through nokogiri twice. However performance is not a huge worry here since a string length of 194170 takes only 30ms
to scrub plus the indexing takes place in a background job.
Includes support for flags, reviewable users and queued posts, with REST API
backwards compatibility.
Co-Authored-By: romanrizzi <romanalejandro@gmail.com>
Co-Authored-By: jjaffeux <j.jaffeux@gmail.com>
- The test_email job is removed, because it was always being run synchronously (not in sidekiq)
- 34b29f62 added a bypass for critical emails, to match the spec. This removes the bypass, and removes the spec.
- This adapts the specs for 72ffabf6, so that they check for emails being sent
- This reimplements c2797921, allowing test emails to be sent even when emails are disabled
* Remove use of 0 in favor of `TrustLevel.levels[:newuser]`.
* Consolidate two tests into a single one.
* Test that disabling the feature works.
* Avoid loading full ActiveRecord object in test when we only need to
know the existence of the record.
Migrates email user options to a new data structure, where `email_always`, `email_direct` and `email_private_messages` are replace by
* `email_messages_level`, with options: `always`, `only_when_away` and `never` (defaults to `always`)
* `email_level`, with options: `always`, `only_when_away` and `never` (defaults to `only_when_away`)
* FEATURE: Add `IgnoredUsersSummary` daily job
## Why?
This is part of the [Ability to ignore a user feature](https://meta.discourse.org/t/ability-to-ignore-a-user/110254/8).
We want to:
1. Send an automatic group PM that goes out to moderators
2. When {x} users have Ignored the same user, threshold defined by a site setting, default of 5
3. Only send this message every X days which is defined by another site setting
It is not a setting, and only relevant in specs. The new API is:
```
Jobs.run_later! # jobs will be thrown on the queue
Jobs.run_immediately! # jobs will run right away, avoid the queue
```
We can only be sure that an email is sent when we get a mailer in
`ActionMailer::Deliveries`. A couple of tests were actually incorrect
because it didn't flow through our email sender where there are more
conditions in determining whether an email is sent or not.
- Open the log file in "append" mode. This avoids issues if the file does not exist (and matches standard rails log behavior)
- Correctly parse the interval logging environment variable
By default, this does nothing. Two environment variables are available:
- `DISCOURSE_LOG_SIDEKIQ`
Set to `"1"` to enable logging. This will log all completed jobs to `log/rails/sidekiq.log`, along with various db/redis/network statistics. This is useful to track down poorly performing jobs.
- `DISCOURSE_LOG_SIDEKIQ_INTERVAL`
(seconds) Check running jobs periodically, and log their current duration. They will appear in the logs with `status:pending`. This is useful to track down jobs which take a long time, then crash sidekiq before completing.
There was a situation where if:
* There were new flags to review that met the visibility threshold
AND
* There were old flags that *didn't* meet the threshold
THEN
a pending flags notification would be sent out. This fixes that case.
Staff should not be notified of flags if they do not meet the threshold
and are old.
This commit introduces an ultra low priority queue for post rebakes. This
way rebakes can never interfere with regular sidekiq processing for cases
where we perform a large scale rebake.
Additionally it allows Post.rebake_old to be run with rate_limiter: false
to avoid triggering the limiter when rebaking. This is handy for cases
where you want to just force the full rebake and not wait for it to trickle
This allows us to run regular rebakes without starving the normal queue.
It additionally adds the ability to specify queue with `Jobs.enqueue` so
we can specifically queue a job with lower priority using the `queue` arg.
* Dashboard doesn't timeout anymore when Amazon S3 is used for backups
* Storage stats are now a proper report with the same caching rules
* Changing the backup_location, s3_backup_bucket or creating and deleting backups removes the report from the cache
* It shows the number of backups and the backup location
* It shows the used space for the correct backup location instead of always showing used space on local storage
* It shows the date of the last backup as relative date
* Doing it in a post migration was a bad idea
because the migration will fail if the site
is down while trying to download uploads
which points to the instance. This mainly
affects self-hosters using `discourse_docker`
where `./launcher rebuild` will take the
existing container down.
In the past onceoff was forcing inline download of gravatars,
this can be so expensive that it will never finish
This fix ensures it only marks avatars stale which will be picked
up by regular schedules
Last week we added support for dual stack urls but did not remap the
the old records in the uploads and optimized images table
This caused a few minor edge cases worst was that if you rebaked old
images S3 CDN was not repopulated.
If sidekiq is paused or Discourse is in readonly continue to queue
heartbeats
If we do not do that then a master process can end up reaping sidekiq
workers and causing various badness
This also impacts restore which can do weird stuff TM in cases like this
Introduces a hidden setting (default is 0.1) that erodes bounce score
every time we send an email. This means that erratic failures are less
painful cause system auto corrects
Previously we used width and height for thumbnails, new code ensures
1. We auto correct width and height
2. We added extra columns for thumbnail_width and height, this is determined
by actual upload and no longer passed in as a side effect
3. Optimized Image now stores filesize which can be used for analysis, decisions
Also
- fixes Android image manifest as a side effect
- fixes issue where a thumbnail generated that is smaller than the upload is no longer used
- moderation tab
- sorting/pagination
- improved third party reports support
- trending charts
- better perf
- many fixes
- refactoring
- new reports
Co-Authored-By: Simon Cossar <scossar@users.noreply.github.com>
### navigate_to_first_post_after_read setting for categories
When enabled on categories logged on users will return to OP after
reading the entire category. (useful for documentation categories)
### num_auto_bump_daily
Set a number of topics that will automatically bump daily on a category.
- Every 15 minutes we will check if any category has this setting
- Categories with the setting are shuffled
- We exclude pinned, closed, category description and archived topics
- Maximum of 1 topic for the list of categories is bumped till limit reached per category
- We always try to bump oldest first
- Limit is elastic using a RateLimiter that ensures that we only bump N per day
Also some minor organisation on category settings
Froze strings on category.rb
Introduce new patterns for direct sql that are safe and fast.
MiniSql is not prone to memory bloat that can happen with direct PG usage.
It also has an extremely fast materializer and very a convenient API
- DB.exec(sql, *params) => runs sql returns row count
- DB.query(sql, *params) => runs sql returns usable objects (not a hash)
- DB.query_hash(sql, *params) => runs sql returns an array of hashes
- DB.query_single(sql, *params) => runs sql and returns a flat one dimensional array
- DB.build(sql) => returns a sql builder
See more at: https://github.com/discourse/mini_sql
* don't update search index if post belongs to deleted topic
* log errors instead of crashing when updating post or revision fails
* update mentions even when the href attribute is missing
* run the background job with low priority
* replace username in all notifications
* update `action_code_who` used by small action posts
- remove inactive user report and replace with posts
- clean up internals so grouping by week happens on client
- when switching periods old report was not destroyed leading to bugs
- calculate trend based on previous interval ... not previous 30 days
- show percentages for mau/dau
- be more careful about utc date usage
- show uniqu and click through rate on search panel
- publish key of report with report so we only load the correct one
- subscribe earlier in channel in case of concurrency issues
* FEATURE: Update avatars in posts and revisions when user gets renamed
* FIX: Replace username in deleted posts when user gets renamed
* FEATURE: Replace username in notifications when user gets renamed
FEATURE: Update mentions and quotes when user gets merged
* Feature: Push notifications for Android
Notification config for desktop and mobile are merged.
Desktop notifications stay as they are for desktop views.
If mobile mode, push notifications are enabled.
Added push notification subscriptions in their own table, rather than through
custom fields.
Notification banner prompts appear for both mobile and desktop when enabled.
* `rescue nil` is a really bad pattern to use in our code base.
We should rescue errors that we expect the code to throw and
not rescue everything because we're unsure of what errors the
code would throw. This would reduce the amount of pain we face
when debugging why something isn't working as expexted. I've
been bitten countless of times by errors being swallowed as a
result during debugging sessions.
This feature can be enabled by choosing a destination for the
`shared drafts category` site setting.
* Staff members can create shared drafts, choosing a destination
category for the topic when it is published.
* Shared Drafts can be viewed in their category, or above the
topic list for the destination category where it will end up.
* When the shared draft is ready, it can be published to the
appropriate category by clicking a button on the topic view.
* When published, Drafts change their timestamps to the current
time, and any edits to the original post are removed.
Also
- Significantly improved search ranking, title is treated most strongly
- Adds tag names to the index
- Run search re-indexer more aggressively
- Re-index topic and all posts on category change
* This exposes the token in the Sidekiq dashboard which can be
viewed by an admin and defeats the purpose of using a token
in the download backup email ink.
* SPEC: PollFeedJob parsing atom feed
* add FeedItemAccessor
It is to provide a consistent interface to access a feed item's tag
content.
* add FeedElementInstaller
to install non-standard and non-namespaced feed elements
* FEATURE: replace SimpleRSS with Ruby RSS module
* get FinalDestination and download with Excon
* support namespaced element with FeedElementInstaller
This refactors handling of s3 so it can be specified via GlobalSetting
This means that in a multisite environment you can configure s3 uploads
without actual sites knowing credentials in s3
It is a critical setting for situations where assets are mirrored to s3.
Setting a user's default groups based on their email address should only be done once, ie. when they confirm their email address.
Previously we were doing this everytime we'd save a user record 🤷
Figuring out what unread topics a user has is a very expensive
operation over time.
Users can easily accumulate 10s of thousands of tracking state rows
(1 for every topic they ever visit)
When figuring out what a user has that is unread we need to join
the tracking state records to the topic table. This can very quickly
lead to cases where you need to scan through the entire topic table.
This commit optimises it so we always keep track of the "first" date
a user has unread topics. Then we can easily filter out all earlier
topics from the join.
We use pg functions, instead of nested queries here to assist the
planner.