Scrubbing an ASCII-8BIT string isn't ever going to remove anything, because
there's no code point that isn't valid 8-bit ASCII. Since we'd really
prefer it if everything were UTF-8 anyway, we'll just assume, for now, that
whatever comes out of SimpleRSS is probably UTF-8, and just nuke anything
that isn't a valid UTF-8 codepoint.
Of course, the *real* bug here is that SimpleRSS [unilaterally converts
everything to
ASCII-8BIT](https://github.com/cardmagic/simple-rss/issues/15). It's
presumably *far* too much to ask that it detects the encoding of the source
RSS feed and marks the parsed strings with the correct encoding...
This reduces the number of scans that the db has to do in the query
to fetch orphan uploads. Futheremore, we were not batching our
records which bloats memory.
- All unsubscribes go to the exact same page
- You may unsubscribe from watching a category on that page
- You no longer need to be logged in to unsubscribe from a topic
- Simplified footer on emails
* Rearrange frontend to account for mailing list mode
* Allow update of user preference for mailing list frequency
* Add mailing list frequency estimate
* Simplify frequency estimate; disable activity summary for mailing list mode
* Remove combined updates
* Add specs for enqueue mailing list mode job
* Write mailing list method for mailer
* Fix linting error
* Account for stale topics
* Add translations for default mailing list setting
* One query for mailing list topics
* Fix failing spec
* WIP
* Flesh out html template
* First pass at text-based mailing list summary
* Add user avatar
* Properly format posts for mailing list
* Move make_all_links_absolute into Email::Styles
* Apply first_seen_at to user
* Send mailing list email summary hourly based on first_seen_at
* Branch and test cleanup
* Use existing mailing list mode estimate
* Fix failing specs
- Show bounce score on user admin page
- Added reset bounce score button on user admin page
- Only whitelisted email types are sent to emails with high bounce score
- FIX: properly detect bounces even when there is no TO: header in the email
- Don't desactivate a user when reaching the bounce threshold
As it stands we load up user records quite frequently on the topic pages,
this in turn pulls all the columns for the users being selected, just to
discard them after they are loaded
New structure keeps all options in a discrete table, this is better organised
and allows us to easily add more column without worrying about bloating the
user table
There were actually 2 bugs:
1/ Calling '.try(:key)' on a hash doesn't work. So EmailLogs were never associated to a user.
2/ Turns out that we update the 'user.last_emailed_at' whenever we create an EmailLog (in the 'after_create' callback).
So we need to always create an EmailLog (whenever the email is sent or skipped).
There are people who have RSS feeds set up that do HTTPS -> HTTP
redirects which throw errors. Since RSS feeds are all configured
by admins I think it's OK if they allow an unsafe redirect as the
content is public anyway. This will reduce many server side errors.
We cap new and unread at 2/5th of SiteSetting.max_tracked_new_unread
This dynamic capping is applied under 2 conditions:
1. New capping is applied once every 15 minutes in the periodical job, this effectively ensures that usually even super active sites are capped at 200 new items
2. Unread capping is applied if a user hits max_tracked_new_unread,
meaning if new + unread == 500, we defer a job that runs within 15 minutes that will cap user at 200 unread
This logic ensures that at worst case a user gets "bad" numbers for 15 minutes and then the system goes ahead and fixes itself up
FEATURE: allow backup interval of up to 30 days
FIX: if a custom file exists in backup directory look at its date
FEATURE: site setting automatic_backups_enabled default true
SimpleRss is unreliable with parsing RSS feeds that contain German Umlauts.
For example this feed http://www.lauffeuer-lb.de/api/v2/articles.xml can't be
parsed by SimpleRss. Discourse's logs are full of
```
Job exception: Wrapped Encoding::CompatibilityError: incompatible character encodings: ASCII-8BIT and UTF-8
Job exception: incompatible character encodings: ASCII-8BIT and UTF-8
```
The embedding fails because the feed can't be parsed.
This change forces the encoding (using #scrub) which prevents the numerous
encoding errors.
- new hidden site setting 'migrate_to_new_scheme' (defaults to false)
- new rake tasks to toggle migration to new scheme
- FIX: migrate_to_new_scheme also works with CDN
- PERF: improve perf of the DbHelper.remap method
- REFACTOR: UrlHelper is now a class
Changed internals so trust levels are referred to with
TrustLevel[1], TrustLevel[2] etc.
This gives us much better flexibility naming trust levels, these names
are meant to be controlled by various communities.
Now that post numbers are monotonically increasing we should not need this job
Stuff should just self correct as users browser along
Corrected the job not to reset the disimissed posts in case we need it
Introduced badge triggers, introduced concept of badge that happens due to a post but has the post hidden
Delta badge grant happens once a minute, backed by redis
This change has a tradeoff.
It increases our backscatter vulnerability - the subject could have spammy content - but it's extremely valuable to the user to know exactly which message was rejected.
If you sent two at the same time, and only one was rejected, you would have no way of knowing which worked and which to resend without going to the website (which is what email-in is trying to avoid, kinda).
BUGFIX: User locale was used index data
BUGFIX: missing Norwegian fulltext config
FEATURE: store the text used to index stuff in fulltext (for diagnostics / in page search)
FEATURE: re-index posts when locale changes (in bg job)
FEATURE: allow reindexing by trucating post_search_data
Note: I removed japanese specific config cause it requires custom pg config,
happy to add it once our base docker config ships with it
The default 5 minutes may add too much lag for some sites used to mailing list performance.
Unfortunately, this seems to require restarting the server for the change to be noticed - is there any way to avoid that, or otherwise should this be noted in the setting text?
Feature to allow each imported post to be created using a different discourse
username. A possible use case of this is a multi-author blog where discourse
is being used to track comments. This feature allows authors to receive
updates when someone leaves a comment on one of their articles because each of
the imported posts can be created using the discourse username of the author.
Ideally it would be a menu selection to select POP3, POP3S, and potentially other future protocols like IMAP if desired, but I didn't want to deal with data migration at this point. And then I was going to have a checkbox for "Secure" (on by default, obviously), but that was very hard to word as to how it was different given everything else referred to pop3s and I couldn't change that either. So I settled on a preference:
pop3s_polling_insecure: "Poll using plain text POP3 without SSL"
Off by default.
This makes it very clear that as to what turning on that checkbox will be, and by calling it "insecure" makes sure people will think twice before turning it on.
I have not attempted to do any of the translations of the preference, I'm ot sure how you handle that.