Commit Graph

51 Commits

Author SHA1 Message Date
Natalie Tay
aac9f43038
Only block domains at the final destination (#15689)
In an earlier PR, we decided that we only want to block a domain if 
the blocked domain in the SiteSetting is the final destination (/t/59305). That 
PR used `FinalDestination#get`. `resolve` however is used several places
 but blocks domains along the redirect chain when certain options are provided.

This commit changes the default options for `resolve` to not do that. Existing
users of `FinalDestination#resolve` are
- `Oneboxer#external_onebox`
- our onebox helper `fetch_html_doc`, which is used in amazon, standard embed 
and youtube
  - these folks already go through `Oneboxer#external_onebox` which already
  blocks correctly
2022-01-31 15:35:12 +08:00
Peter Zhu
c5fd8c42db
DEV: Fix methods removed in Ruby 3.2 (#15459)
* File.exists? is deprecated and removed in Ruby 3.2 in favor of
File.exist?
* Dir.exists? is deprecated and removed in Ruby 3.2 in favor of
Dir.exist?
2022-01-05 18:45:08 +01:00
David Taylor
03998e0a29
FIX: Use CDN URL for internal onebox avatars (#15077)
This commit will also trigger a background rebake for all existing posts with internal oneboxes
2021-11-25 12:07:34 +00:00
jbrw
aec125b617
FIX: Display Instagram Oneboxes in an iframe (#14789)
We are no longer able to display the image returned by Instagram directly within a Discourse site (either in the composer, or within a cooked post within a topic), so:

- Display an image placeholder in the composer preview
- A cooked post should use an iframe to display the Instagram 'embed' content
2021-11-02 14:34:51 -04:00
Bianca Nenciu
d184fe59ca
FEATURE: Censor Oneboxes (#12902)
Previously onebox content was not passed by the censor regex, meaning you could sneak in censored words via onebox.
2021-06-03 11:39:12 +10:00
jbrw
a24b6daa87
FIX: An unresolved blank uri should attempt an alternate Oneboxing strategy, if available (#13070) 2021-05-14 15:23:20 -04:00
jbrw
19182b1386
DEV: Oneboxer wildcard subdomains (#13015)
* DEV: Allow wildcards in Oneboxer optional domain Site Settings

Allows a wildcard to be used as a subdomain on Oneboxer-related SiteSettings, e.g.:

- `force_get_hosts`
- `cache_onebox_response_body_domains`
- `force_custom_user_agent_hosts`

* DEV: fix typos

* FIX: Try doing a GET after receiving a 500 error from a HEAD

By default we try to do a `HEAD` requests. If this results in a 500 error response, we should try to do a `GET`

* DEV: `force_get_hosts` should be a hidden setting

* DEV: Oneboxer Strategies

Have an alternative oneboxing ‘strategy’ (i.e., set of options) to use when an attempt to generate a Onebox fails. Keep track of any non-default strategies that were used on a particular host, and use that strategy for that host in the future.

Initially, the alternate strategy (`force_get_and_ua`) forces the FinalDestination step of Oneboxing to do a `GET` rather than `HEAD`, and forces a custom user agent.

* DEV: change stubbed return code

The stubbed status code needs to be a value not recognized by FinalDestination
2021-05-13 15:48:35 -04:00
Bianca Nenciu
21d1ee1065
FIX: Use Nokogiri and Loofah consistently (#12693)
CookedPostProcessor used Loofah to parse the cooked content of a post
and Nokogiri to parse cooked Oneboxes. Even though Loofah is built on
top of Nokogiri, replacing an element from the cooked post (a Nokogiri
node) with a parsed onebox (a Loofah node) produced a strange result
which included XML namespaces. Removing the mix and using Loofah
to parse Oneboxes fixed the problem.
2021-04-14 18:09:55 +03:00
jbrw
50252d803e
DEV: stub youtube embed requests (#12637)
* DEV: stub youtube embed requests

* DEV: Ignore redirects on youtube.com when oneboxing
2021-04-07 13:32:27 -04:00
jbrw
68d0916eb5
FEATURE: Oneboxer cache response body (#12562)
* FEATURE: Cache successful HTTP GET requests during Oneboxing

Some oneboxes may fail if when making excessive and/or odd requests against the target domains. This change provides a simple mechanism to cache the results of succesful GET requests as part of the oneboxing process, with the goal of reducing repeated requests and ultimately improving the rate of successful oneboxing.

To enable:

Set `SiteSetting.cache_onebox_response_body` to `true`

Add the domains you’re interesting in caching to `SiteSetting. cache_onebox_response_body_domains` e.g. `example.com|example.org|example.net`

Optionally set `SiteSetting.cache_onebox_user_agent` to a user agent string of your choice to use when making requests against domains in the above list.

* FIX: Swap order of duration and value in redis call

The correct order for `setex` arguments is `key`, `duration`, and `value`.

Duration and value had been flipped, however the code would not have thrown an error because we were caching the value of `1.day.to_i` for a period of 1 seconds… The intention appears to be to set a value of 1 (purely as a flag) for a period of 1 day.
2021-03-31 13:19:34 -04:00
jbrw
aed97c7bab
FIX: Add amazon sites to force_get_hosts (#12341)
It has been observed that doing a HEAD against an Amazon store URL may result in a 405 error being returned.

Skipping the HEAD request may result in an improved oneboxing experience when requesting these URLs.
2021-03-10 14:42:17 -05:00
jbrw
fff8a24f2b
FIX: Don’t display error if only error is a missing image (#12216)
`Onebox.preview` can return 0-to-n errors, where the errors are missing OpenGraph attributes (e.g. title, description, image, etc.). If any of these attributes are missing, we construct an error message and attach it to the Oneboxer preview HTML. The error message is something like:

 “Sorry, we were unable to generate a preview for this web page, because the following oEmbed / OpenGraph tags could not be found: description, image”

However, if the only missing tag is `image` we don’t need to display the error, as we have enough other data (title, description, etc.) to construct a useful/complete Onebox.
2021-02-25 14:30:40 -05:00
Dan Ungureanu
2d51833ca9
FIX: Make Oneboxer#apply insert block Oneboxes correctly (#11449)
It used to insert block Oneboxes inside paragraphs which resulted in
invalid HTML. This needed an additional parsing for removal of empty
paragraphs and the resulting HTML could still be invalid.

This commit ensure that block Oneboxes are inserted correctly, by
splitting the paragraph containing the link and putting the block
between the two. Paragraphs left with nothing but whitespaces will
be removed.

Follow up to 7f3a30d79f.
2020-12-14 17:49:37 +02:00
jbrw
331236d6d7
Onebox improved error handling and support for Instagram Access Tokens (#11253)
* FEATURE: display error if Oneboxing fails due to HTTP error

- display warning if onebox URL is unresolvable
- display warning if attributes are missing

* FEATURE: Use new Instagram oEmbed endpoint if access token is configured

Instagram requires an Access Token to access their oEmbed endpoint. The requirements (from https://developers.facebook.com/docs/instagram/oembed/) are as follows:

- a Facebook Developer account, which you can create at developers.facebook.com
- a registered Facebook app
- the oEmbed Product added to the app
- an Access Token
- The Facebook app must be in Live Mode

The generated Access Token, once added to SiteSetting.facebook_app_access_token, will be passed to onebox. Onebox can then use this token to access the oEmbed endpoint to generate a onebox for Instagram.

* DEV: update user agent string

* DEV: don’t do HEAD requests against news.yahoo.com

* DEV: Bump onebox version from 2.1.5 to 2.1.6

* DEV: Avoid re-reading templates

* DEV: Tweaks to onebox mustache templates

* DEV: simplified error message for missing onebox data

* Apply suggestions from code review
Co-authored-by: Gerhard Schlager <mail@gerhard-schlager.at>
2020-11-18 12:55:16 -05:00
Daniel Waterworth
721ee36425
Replace base_uri with base_path (#10879)
DEV: Replace instances of Discourse.base_uri with Discourse.base_path

This is clearer because the base_uri is actually just a path prefix. This continues the work started in 555f467.
2020-10-09 12:51:24 +01:00
David Taylor
a3577435f7
FEATURE: Additional control of iframes in oneboxes (#10523)
This commit adds a new site setting "allowed_onebox_iframes". By default, all onebox iframes are allowed. When the list of domains is restricted, Onebox will automatically skip engines which require those domains, and use a fallback engine.
2020-08-27 20:12:13 +01:00
Krzysztof Kotlarek
e0d9232259
FIX: use allowlist and blocklist terminology (#10209)
This is a PR of the renaming whitelist to allowlist and blacklist to the blocklist.
2020-07-27 10:23:54 +10:00
Régis Hanol
91c89df68a FIX: onebox local topic when using slug-less URL
When linking to a topic in the same Discourse, we try to onebox the link to show the title
and other various information depending on whether it's a "standard" or "inline" onebox.

However, we were not properly detecting links to topics that had no slugs (eg. https://meta.discourse.org/t/1234).
2020-06-23 17:18:38 +02:00
Penar Musaraj
4b6a47be48 DEV: do not persist force_custom_user_agent_hosts setting
Followup to f029e2
2020-02-06 11:56:54 -05:00
Penar Musaraj
f029e2eaf6 FEATURE: Add site setting for specific hosts using custom user agent when oneboxing
Followup to #00c406
2020-02-06 10:32:42 -05:00
Martin Brennan
901054fd75
FIX: Cache failed onebox URL request server-side (#8421)
We already cache failed onebox URL requests client-side, we now want to cache this on the server-side for extra protection. failed onebox previews will be cached for 1 hour, and any more requests for that URL will fail with a 404 status. Forcing a rebake via the Rebake HTML action will delete the failed URL cache (like how the oneboxer preview cache is deleted).
2019-11-28 07:48:29 +10:00
Krzysztof Kotlarek
427d54b2b0 DEV: Upgrading Discourse to Zeitwerk (#8098)
Zeitwerk simplifies working with dependencies in dev and makes it easier reloading class chains. 

We no longer need to use Rails "require_dependency" anywhere and instead can just use standard 
Ruby patterns to require files.

This is a far reaching change and we expect some followups here.
2019-10-02 14:01:53 +10:00
Penar Musaraj
3debdc8131 SECURITY: XSS when oneboxing user profile location field
The XSS here is only possible if CSP is disabled. Low impact since CSP is enabled by default in SiteSettings.
2019-09-17 16:12:50 -04:00
Sam Saffron
4ea21fa2d0 DEV: use #frozen_string_literal: true on all spec
This change both speeds up specs (less strings to allocate) and helps catch
cases where methods in Discourse are mutating inputs.

Overall we will be migrating everything to use #frozen_string_literal: true
it will take a while, but this is the first and safest move in this direction
2019-04-30 10:27:42 +10:00
Tim Lange
5a9dd923cc FIX: Onebox discourse user not respecting enable names (#7245) 2019-03-25 12:50:14 +05:30
Bianca Nenciu
4e0533a20b FIX: Generate Onebox for posts of type moderator_action. (#6466) 2018-10-10 18:39:03 +08:00
Arpit Jalan
fadcd36f92 FIX: do not treat ignore_redirects domains as blacklisted
This fix prevents domains present in `ignore_redirects` to be treated as
blacklisted domains and makes sure that onboxing happens for those domains.
Issue reported here: https://meta.discourse.org/t/steam-store-oneboxing-no-longer-works/97266
2018-09-18 10:38:02 +05:30
Bianca Nenciu
b6963b8ffb FIX: Ignore OneBox blacklisted domains. 2018-08-27 20:40:55 +02:00
Régis Hanol
3c8b43bb01 FIX: non-oneboxed links on separate lines should stay on separate lines 2018-04-11 21:33:45 +02:00
Vinoth Kannan
58bb3967e5 SECURITY: Oneboxer should escape the URL before processing 2018-03-15 19:57:55 +05:30
Régis Hanol
3be0294465 FIX: local post onebox was always pointing to 1st post 2018-02-26 16:05:35 +01:00
Régis Hanol
7d7f6faf40 FIX: properly render emojis in local oneboxes 2018-02-26 11:16:53 +01:00
Régis Hanol
0559a4736a FIX: don't double request when downloading a file 2018-02-24 12:35:57 +01:00
Régis Hanol
0799831dbe FIX: use the avatar of the post rather than the topic in local oneboxes 2018-02-20 19:49:39 +01:00
Régis Hanol
60ec483caa FIX: include title in local onebox when linking to a different topic 2018-02-19 22:40:14 +01:00
Sam
f028ffaf29 SECURITY: correct local onebox category checks
Also removes ugly "source_topic_id" from cooked posts

Patch was authored by @zogstrip

Signed-off-by: Sam <sam.saffron@gmail.com>
2018-02-14 10:40:46 +11:00
Robin Ward
a8d1e44943 FIX: Onebox will do a HEAD request first for redirects 2017-05-22 16:52:26 -04:00
Guo Xiang Tan
04016f0dec Support Ruby 2.4. 2017-04-15 12:29:00 +08:00
Régis Hanol
3841cd9a7f FEATURE: onebox everything by default
FEATURE: new 'max_oneboxes_per_post' site setting
FEATURE: change onebox whitelist to a blacklist
PERF: debounce the loading of oneboxes
PERF: improve perf of mention links in preview
FIX: sort loading of custom oneboxer
2016-10-24 12:46:22 +02:00
Andy Waite
3e50313fdc Prepare for separation of RSpec helper files
Since rspec-rails 3, the default installation creates two helper files:
* `spec_helper.rb`
* `rails_helper.rb`

`spec_helper.rb` is intended as a way of running specs that do not
require Rails, whereas `rails_helper.rb` loads Rails (as Discourse's
current `spec_helper.rb` does).

For more information:

https://www.relishapp.com/rspec/rspec-rails/docs/upgrade#default-helper-files

In this commit, I've simply replaced all instances of `spec_helper` with
`rails_helper`, and renamed the original `spec_helper.rb`.

This brings the Discourse project closer to the standard usage of RSpec
in a Rails app.

At present, every spec relies on loading Rails, but there are likely
many that don't need to. In a future pull request, I hope to introduce a
separate, minimal `spec_helper.rb` which can be used in tests which
don't rely on Rails.
2015-12-01 20:39:42 +00:00
Luciano Sousa
0fd98b56d8 few components with rspec3 syntax 2015-01-09 13:34:37 -03:00
Sam
0bc3525b10 BUGFIX: more robust onebox implementation 2014-05-28 17:15:10 +10:00
Robin Ward
e453bfa073 Work in progress: Swap out onebox code for onebox gem 2014-01-29 14:14:07 -05:00
Sam
33e3ad1603 clean up onebox application so it uses a single code path
use fragments for oneboxes
strip parent <p> if <div> is in it
clean some tests
2013-04-10 17:52:38 +10:00
Robin Ward
babcfe6234 Cache oneboxes in Redis now instead of postgres. 2013-03-21 13:11:54 -04:00
Tijmen Brommet
ba04e72a99 Fix weirdly indented code 2013-03-02 00:48:49 +01:00
Tijmen Brommet
eaef66a423 Fix display host for onebox 2013-03-02 00:46:55 +01:00
Gosha Arinich
cafc75b238 remove trailing whitespaces ❤️ 2013-02-26 07:31:35 +03:00
Jaime Iniesta
6995e75d41 Replace Hpricot with Nokogiri 2013-02-14 11:35:50 +01:00
Sam Saffron
0f88947279 fix onebox for your own site 2013-02-06 16:22:11 +11:00