Commit Graph

30 Commits

Author SHA1 Message Date
Sam Saffron
30990006a9 DEV: enable frozen string literal on all files
This reduces chances of errors where consumers of strings mutate inputs
and reduces memory usage of the app.

Test suite passes now, but there may be some stuff left, so we will run
a few sites on a branch prior to merging
2019-05-13 09:31:32 +08:00
Gerhard Schlager
b73950692b FIX: Parsing non-existent feed should not fail 2018-08-10 18:37:14 +02:00
Gerhard Schlager
a115aae45f Use rchardet instead of charlock_holmes gem 2018-08-01 10:41:20 +02:00
Gerhard Schlager
5d421fb946 FIX: Try respecting charset in HTTP header of RSS feed 2018-08-01 10:41:20 +02:00
Gerhard Schlager
ff942ed2f3 FIX: Try detecting encoding of RSS feed 2018-08-01 10:41:20 +02:00
Arpit Jalan
1841dd48dc FIX: revert utf-8 encode changes 2018-05-20 17:50:36 +05:30
Arpit Jalan
9512796ef6 FIX: check for blank response when polling feed 2018-05-18 17:00:44 +05:30
Arpit Jalan
146f634c8f FIX: UTF-8 encode feed response body 2018-05-16 12:59:23 +05:30
Sam
d8b4627fc8 we have to define this for tests to pass 2018-02-15 13:30:34 +11:00
Sam
b5b866aab3 oops 2018-02-15 13:13:31 +11:00
Sam
c89b42c488 PERF: only require the rss library if used
Before:

Total allocated: 257909321 bytes (2514134 objects)
Total retained:  39681579 bytes (343387 objects)

allocated memory by gem
-----------------------------------
  42875979  rss

retained memory by gem
-----------------------------------
   2080188  rss

retained objects by gem
-----------------------------------
     13052  rss

After:

Total allocated: 210562047 bytes (2252030 objects)
Total retained:  37433816 bytes (328635 objects)

----

So, 2 less megabytes on boot and 13000 objects stuck in ruby heaps forever.
2018-02-15 13:11:33 +11:00
Kyle Zhao
5f318a5241 FEATURE: Replace SimpleRSS with Ruby RSS module (#5311)
* SPEC: PollFeedJob parsing atom feed

* add FeedItemAccessor

It is to provide a consistent interface to access a feed item's tag
content.

* add FeedElementInstaller

to install non-standard and non-namespaced feed elements

* FEATURE: replace SimpleRSS with Ruby RSS module

* get FinalDestination and download with Excon

* support namespaced element with FeedElementInstaller
2017-12-06 10:45:09 +11:00
Kyle Zhao
ac666ddf17 PollFeed: check 'content:encoded' for content first 2017-10-02 01:16:11 -04:00
Neil Lalonde
9edc490d3f FIX: remove memoized values in jobs 2017-05-22 16:26:30 -04:00
Robin Ward
4db76796b9 FEATURE: Setting to poll feeds more frequently 2017-05-10 14:30:12 -04:00
Robin Ward
6e4ba8a33e Catch RSS Parsing errors 2017-05-09 15:07:06 -04:00
Robin Ward
424fc8e2e2 FIX: Don't raise an error if the RSS endpoint is 404 2016-12-05 12:29:14 -05:00
Matt Palmer
394cd43d77 Scrub only after converting strings to UTF-8
Scrubbing an ASCII-8BIT string isn't ever going to remove anything, because
there's no code point that isn't valid 8-bit ASCII.  Since we'd really
prefer it if everything were UTF-8 anyway, we'll just assume, for now, that
whatever comes out of SimpleRSS is probably UTF-8, and just nuke anything
that isn't a valid UTF-8 codepoint.

Of course, the *real* bug here is that SimpleRSS [unilaterally converts
everything to
ASCII-8BIT](https://github.com/cardmagic/simple-rss/issues/15).  It's
presumably *far* too much to ask that it detects the encoding of the source
RSS feed and marks the parsed strings with the correct encoding...
2016-08-25 16:09:07 +10:00
Guo Xiang Tan
bc4087b9bb FIX: RSS description might be nil. 2016-03-07 17:42:17 +08:00
Robin Ward
40934e595a FIX: Some RSS feeds do unsafe redirects
There are people who have RSS feeds set up that do HTTPS -> HTTP
redirects which throw errors. Since RSS feeds are all configured
by admins I think it's OK if they allow an unsafe redirect as the
content is public anyway. This will reduce many server side errors.
2015-09-18 13:25:09 -04:00
5minpause
4ee1bc6320 Changes RSS item creation to prevent encoding errors
SimpleRss is unreliable with parsing RSS feeds that contain German Umlauts.
For example this feed http://www.lauffeuer-lb.de/api/v2/articles.xml can't be
parsed by SimpleRss. Discourse's logs are full of

```
Job exception: Wrapped Encoding::CompatibilityError: incompatible character encodings: ASCII-8BIT and UTF-8
Job exception: incompatible character encodings: ASCII-8BIT and UTF-8
```

The embedding fails because the feed can't be parsed.

This change forces the encoding (using #scrub)  which prevents the numerous
encoding errors.
2015-07-20 14:30:42 +02:00
Neil Lalonde
4e857bfa12 FIX: add http:// to feed_polling_url if it's missing 2014-08-19 17:57:14 -04:00
Sam
198731de23 FIX: 100% cpu while parsing feeds 2014-07-02 13:53:04 +10:00
Justin Leveck
a78df3d57d Add custom embed_by_username feature
Feature to allow each imported post to be created using a different discourse
username. A possible use case of this is a multi-author blog where discourse
is being used to track comments. This feature allows authors to receive
updates when someone leaves a comment on one of their articles because each of
the imported posts can be created using the discourse username of the author.
2014-06-09 12:35:38 -07:00
Louis Rose
1574485443 Perform the where(...).first to find_by(...) refactoring.
This refactoring was automated using the command: bundle exec "ruby refactorings/where_dot_first_to_find_by/app.rb"
2014-05-06 14:41:59 +01:00
Sam
e1f293ad66 FEATURE: new scheduler
Removed sidetiq, introduced new scheduler

- add basic UI
- add schedule discover
- add scheduling in initializer
2014-02-06 10:26:16 +11:00
Robin Ward
ed2e53bb06 FIX: Support feeds with description as well as content 2014-01-02 14:29:27 -05:00
Robin Ward
4f8aed295a FEATURE: Embeddable Discourse comments, now with simple-rss instead of feedzirra 2013-12-31 15:01:22 -05:00
Robin Ward
62db063e1e Revert "Support for Embeddable Comments via IFRAME" - it depends on Curl
which not every server has. Have to rethink this.

This reverts commit e3e4c62887.
2013-12-31 12:52:31 -05:00
Robin Ward
e3e4c62887 Support for Embeddable Comments via IFRAME 2013-12-31 12:26:24 -05:00