### General Changes and Duplication
* We now consider a post `with_secure_media?` if it is in a read-restricted category.
* When uploading we now set an upload's secure status straight away.
* When uploading if `SiteSetting.secure_media` is enabled, we do not check to see if the upload already exists using the `sha1` digest of the upload. The `sha1` column of the upload is filled with a `SecureRandom.hex(20)` value which is the same length as `Upload::SHA1_LENGTH`. The `original_sha1` column is filled with the _real_ sha1 digest of the file.
* Whether an upload `should_be_secure?` is now determined by whether the `access_control_post` is `with_secure_media?` (if there is no access control post then we leave the secure status as is).
* When serializing the upload, we now cook the URL if the upload is secure. This is so it shows up correctly in the composer preview, because we set secure status on upload.
### Viewing Secure Media
* The secure-media-upload URL will take the post that the upload is attached to into account via `Guardian.can_see?` for access permissions
* If there is no `access_control_post` then we just deliver the media. This should be a rare occurrance and shouldn't cause issues as the `access_control_post` is set when `link_post_uploads` is called via `CookedPostProcessor`
### Removed
We no longer do any of these because we do not reuse uploads by sha1 if secure media is enabled.
* We no longer have a way to prevent cross-posting of a secure upload from a private context to a public context.
* We no longer have to set `secure: false` for uploads when uploading for a theme component.
- Allow revoking keys without deleting them
- Auto-revoke keys after a period of no use (default 6 months)
- Allow multiple keys per user
- Allow attaching a description to each key, for easier auditing
- Log changes to keys in the staff action log
- Move all key management to one place, and improve the UI
Adds the settings:
raw_email_max_length, raw_rejected_email_max_length, delete_rejected_email_after_days.
These settings control retention of the "raw" emails logs.
raw_email_max_length ensures that if we get incoming email that is huge we will truncate it removing uploads from the raw log.
raw_rejected_email_max_length introduces an even more aggressive truncation for rejected incoming mail.
delete_rejected_email_after_days controls how many days we will keep rejected emails for (default 90)
Previously every hour we would run a full scan of the entire DB searching
for expired uploads that need to be moved to the tombstone folder.
This commit amends it so we only run the job 2 times per clean_orpha_uploads_grace_period_hours
There is a upper bound of 7 days so even if the grace period is set really
high it will still run at least once a week.
By default we have a 48 grace period so this amends it to run this cleanup
daily instead of hourly. This eliminates 23 times we run this ultra expensive
query.
Zeitwerk simplifies working with dependencies in dev and makes it easier reloading class chains.
We no longer need to use Rails "require_dependency" anywhere and instead can just use standard
Ruby patterns to require files.
This is a far reaching change and we expect some followups here.
Previously, calculating thresholds for reviewables was done based on the
50th and 85th percentile across all reviewables. However, many forum
owners provided feedback that these thresholds were too easy to hit, in
particular when it came to auto hiding content.
The calculation has been adjusted to base the priorities on reviewables
that have a minimum of 2 scores (flags). This should push the amount of
flags required to hide something higher then before.
On forums with very few flags you don't want to calculate averages
because they won't be very useful. Stick with the defaults until we hit
15 reviewables at least.
This reverts commit e805d44965.
We now have mechanisms in place to ensure heartbeat will always
be scheduled even if the scheduler is overloaded per: 098f938b
Databases can have a lot of user actions, self joining and running an
aggregate on millions of rows can be very costly
This optimisation will reduce the regular window of consistency down to 13
hours, this ensures the job runs much faster
* FIX: Heartbeat check per sidekiq process
* Rename method
* Remove heartbeat queues of previous bootups
* Regis feedback
* Refactor before_start
* Update lib/demon/sidekiq.rb
Co-Authored-By: Régis Hanol <regis@hanol.fr>
* Update lib/demon/sidekiq.rb
Co-Authored-By: Régis Hanol <regis@hanol.fr>
* Expire redis keys after 3600 seconds
* Don't use redis to store the list of queues
The site settings beginning with "topic views heat" and "topic post like
heat" are set to defaults when installing Discourse, but there has not
been a process or guidance for updating these values based on
community activity.
This feature will update them once a month. The low, medium, and
high settings will be based on the minimums of the 45th, 25th, and
10th percentile topics respectively, so that 45% of topics will have
some "heat".
Disable automatic changes with the automatic_topic_heat_values setting.
This does two things
1. Our "index grace period" has been wound down to 1 day, there is no point
keeping a bloated index for a week, usually when people delete stuff they
mean for it to be removed
2. We were never dropping deleted posts from the index, only posts from
deleted topics
These changes speed up search a tiny bit and reduce background work.
The `AutoQueueHandler` will ignore really old flags. In that case, don't
notify the user that the moderator is looking into it. They probably
never saw it because it didn't meet the reviewable minimum priority.
This reduces chances of errors where consumers of strings mutate inputs
and reduces memory usage of the app.
Test suite passes now, but there may be some stuff left, so we will run
a few sites on a branch prior to merging
We found score hard to understand. It is still there behind the scenes
for sorting purposes, but it is no longer shown.
You can now filter by minimum priority (low, med, high) instead of
score.
This removes all uses of both `send` and `public_send` from consumers of
SiteSetting and instead introduces a `get` helper for dynamic lookup
This leads to much cleaner and safer code long term as we are always explicit
to test that a site setting is really there before sending an arbitrary
string to the class
It also removes a couple of risky stubs from the auth provider test
After careful analysis of large data-sets it became apparent that avg_time
had no impact whatsoever on "best of" topic scoring. Calculating avg_time
was a very costly operation especially on large databases.
We have some longer term plans of introducing other weighting that is read
time based into our scoring for "best of" and "top" topics, but in the
interim to stop a large amount of work that is not achieving any value we
are removing the jobs.
Column removal will follow once we decide on a new replacement metric.
`Upload#url` is more likely and can change from time to time. When it
does changes, we don't want to have to look through multiple tables to
ensure that the URLs are all up to date. Instead, we simply associate
uploads properly to `UserProfile` so that it does not have to replicate
the URLs in the table.
also add a hard limit of 1000 users per job run so we do not clog the
scheduler
destroyer.destroy has a transaction and this can have some serious complications
with the open record set find_each has going
Conversely, if a user is deactivated the reviewable should automatically
be rejected.
Before this fix, if a user was not active they'd still show in the
review queue but without an "Approve" button which was confusing.
If the post ids keep loading, we might end up in a situations where
we're always loading the same post ids over and over again without
indexing anything new.
Follow up to daeda80ada.
Includes support for flags, reviewable users and queued posts, with REST API
backwards compatibility.
Co-Authored-By: romanrizzi <romanalejandro@gmail.com>
Co-Authored-By: jjaffeux <j.jaffeux@gmail.com>
* Remove use of 0 in favor of `TrustLevel.levels[:newuser]`.
* Consolidate two tests into a single one.
* Test that disabling the feature works.
* Avoid loading full ActiveRecord object in test when we only need to
know the existence of the record.
* FEATURE: Add `IgnoredUsersSummary` daily job
## Why?
This is part of the [Ability to ignore a user feature](https://meta.discourse.org/t/ability-to-ignore-a-user/110254/8).
We want to:
1. Send an automatic group PM that goes out to moderators
2. When {x} users have Ignored the same user, threshold defined by a site setting, default of 5
3. Only send this message every X days which is defined by another site setting
There was a situation where if:
* There were new flags to review that met the visibility threshold
AND
* There were old flags that *didn't* meet the threshold
THEN
a pending flags notification would be sent out. This fixes that case.
Staff should not be notified of flags if they do not meet the threshold
and are old.
This commit introduces an ultra low priority queue for post rebakes. This
way rebakes can never interfere with regular sidekiq processing for cases
where we perform a large scale rebake.
Additionally it allows Post.rebake_old to be run with rate_limiter: false
to avoid triggering the limiter when rebaking. This is handy for cases
where you want to just force the full rebake and not wait for it to trickle
This allows us to run regular rebakes without starving the normal queue.
It additionally adds the ability to specify queue with `Jobs.enqueue` so
we can specifically queue a job with lower priority using the `queue` arg.
- moderation tab
- sorting/pagination
- improved third party reports support
- trending charts
- better perf
- many fixes
- refactoring
- new reports
Co-Authored-By: Simon Cossar <scossar@users.noreply.github.com>
### navigate_to_first_post_after_read setting for categories
When enabled on categories logged on users will return to OP after
reading the entire category. (useful for documentation categories)
### num_auto_bump_daily
Set a number of topics that will automatically bump daily on a category.
- Every 15 minutes we will check if any category has this setting
- Categories with the setting are shuffled
- We exclude pinned, closed, category description and archived topics
- Maximum of 1 topic for the list of categories is bumped till limit reached per category
- We always try to bump oldest first
- Limit is elastic using a RateLimiter that ensures that we only bump N per day
Also some minor organisation on category settings
Froze strings on category.rb
Introduce new patterns for direct sql that are safe and fast.
MiniSql is not prone to memory bloat that can happen with direct PG usage.
It also has an extremely fast materializer and very a convenient API
- DB.exec(sql, *params) => runs sql returns row count
- DB.query(sql, *params) => runs sql returns usable objects (not a hash)
- DB.query_hash(sql, *params) => runs sql returns an array of hashes
- DB.query_single(sql, *params) => runs sql and returns a flat one dimensional array
- DB.build(sql) => returns a sql builder
See more at: https://github.com/discourse/mini_sql
Also
- Significantly improved search ranking, title is treated most strongly
- Adds tag names to the index
- Run search re-indexer more aggressively
- Re-index topic and all posts on category change
* SPEC: PollFeedJob parsing atom feed
* add FeedItemAccessor
It is to provide a consistent interface to access a feed item's tag
content.
* add FeedElementInstaller
to install non-standard and non-namespaced feed elements
* FEATURE: replace SimpleRSS with Ruby RSS module
* get FinalDestination and download with Excon
* support namespaced element with FeedElementInstaller
This refactors handling of s3 so it can be specified via GlobalSetting
This means that in a multisite environment you can configure s3 uploads
without actual sites knowing credentials in s3
It is a critical setting for situations where assets are mirrored to s3.
Figuring out what unread topics a user has is a very expensive
operation over time.
Users can easily accumulate 10s of thousands of tracking state rows
(1 for every topic they ever visit)
When figuring out what a user has that is unread we need to join
the tracking state records to the topic table. This can very quickly
lead to cases where you need to scan through the entire topic table.
This commit optimises it so we always keep track of the "first" date
a user has unread topics. Then we can easily filter out all earlier
topics from the join.
We use pg functions, instead of nested queries here to assist the
planner.