discourse

mirror of https://github.com/discourse/discourse.git synced 2024-11-23 02:50:00 +08:00

Author	SHA1	Message	Date
jbrw	2f28ba318c	FEATURE: Onebox can match engines based on the content_type (#13876) * FEATURE: Onebox can match engines based on the content_type `FinalDestination` now returns the `content_type` of a resolved URL. `Oneboxer` passes this value to `Onebox` itself. Onebox engines can now specify a `matches_content_type` regex of content_types that the engine can handle, regardless of the URL. `ImageOnebox` will match URLs with a content type of `image/png`, `jpg`, `gif`, `bmp`, `tif`, etc. This will allow images that exist at a URL without a file type extension to be correctly rendered, assuming a valid `content_type` is returned.	2021-07-30 13:36:30 -04:00
jbrw	09d23a37a5	DEV: Add default `Accept-Language` to FinalDestination requests (#13817) Not specifying an `Accept-Language` should be equivalent to specifying an `Accept-Language` of `*`, however some webservers seem to prefer it if we are explicit about being able to handle a response of content in any language.	2021-07-22 15:49:59 +10:00
Joffrey JAFFEUX	e50b7e9111	SECURITY: ensures timeouts are correctly used on connect (#13455)	2021-06-21 17:34:01 +02:00
jbrw	19182b1386	DEV: Oneboxer wildcard subdomains (#13015) * DEV: Allow wildcards in Oneboxer optional domain Site Settings Allows a wildcard to be used as a subdomain on Oneboxer-related SiteSettings, e.g.: - `force_get_hosts` - `cache_onebox_response_body_domains` - `force_custom_user_agent_hosts` * DEV: fix typos * FIX: Try doing a GET after receiving a 500 error from a HEAD By default we try to do a `HEAD` request. If this results in a 500 error response, we should try to do a `GET`. * DEV: `force_get_hosts` should be a hidden setting * DEV: Oneboxer Strategies Have an alternative oneboxing ‘strategy’ (i.e., set of options) to use when an attempt to generate a Onebox fails. Keep track of any non-default strategies that were used on a particular host, and use that strategy for that host in the future. Initially, the alternate strategy (`force_get_and_ua`) forces the FinalDestination step of Oneboxing to do a `GET` rather than `HEAD`, and forces a custom user agent. * DEV: change stubbed return code The stubbed status code needs to be a value not recognized by FinalDestination	2021-05-13 15:48:35 -04:00
Joffrey JAFFEUX	64dda7112d	FIX: correctly use timeouts in `FileHelper` and `FinalDestination` (#12921) Previous refactors have lost usage of read_timeout in `FileHelper.download` and `FinalDestination` was incorrectly using `Net::HTTP.start` by setting `open_timeout` in the block instead of directly during the invocation. Couldn't figure how to write a good test for this without slowing the spec.	2021-05-03 09:21:11 +02:00
jbrw	68d0916eb5	FEATURE: Oneboxer cache response body (#12562) * FEATURE: Cache successful HTTP GET requests during Oneboxing Some oneboxes may fail when making excessive and/or odd requests against the target domains. This change provides a simple mechanism to cache the results of successful GET requests as part of the oneboxing process, with the goal of reducing repeated requests and ultimately improving the rate of successful oneboxing. To enable: Set `SiteSetting.cache_onebox_response_body` to `true`. Add the domains you’re interested in caching to `SiteSetting.cache_onebox_response_body_domains`, e.g. `example.com|example.org|example.net`. Optionally set `SiteSetting.cache_onebox_user_agent` to a user agent string of your choice to use when making requests against domains in the above list. * FIX: Swap order of duration and value in redis call The correct order for `setex` arguments is `key`, `duration`, and `value`. Duration and value had been flipped, however the code would not have thrown an error because we were caching the value of `1.day.to_i` for a period of 1 second… The intention appears to be to set a value of 1 (purely as a flag) for a period of 1 day.	2021-03-31 13:19:34 -04:00
jbrw	331236d6d7	Onebox improved error handling and support for Instagram Access Tokens (#11253) * FEATURE: display error if Oneboxing fails due to HTTP error - display warning if onebox URL is unresolvable - display warning if attributes are missing * FEATURE: Use new Instagram oEmbed endpoint if access token is configured Instagram requires an Access Token to access their oEmbed endpoint. The requirements (from https://developers.facebook.com/docs/instagram/oembed/) are as follows: - a Facebook Developer account, which you can create at developers.facebook.com - a registered Facebook app - the oEmbed Product added to the app - an Access Token - The Facebook app must be in Live Mode The generated Access Token, once added to SiteSetting.facebook_app_access_token, will be passed to onebox. Onebox can then use this token to access the oEmbed endpoint to generate a onebox for Instagram. * DEV: update user agent string * DEV: don’t do HEAD requests against news.yahoo.com * DEV: Bump onebox version from 2.1.5 to 2.1.6 * DEV: Avoid re-reading templates * DEV: Tweaks to onebox mustache templates * DEV: simplified error message for missing onebox data * Apply suggestions from code review Co-authored-by: Gerhard Schlager <mail@gerhard-schlager.at>	2020-11-18 12:55:16 -05:00
Krzysztof Kotlarek	e0d9232259	FIX: use allowlist and blocklist terminology (#10209) This PR renames whitelist to allowlist and blacklist to blocklist.	2020-07-27 10:23:54 +10:00
Martin Brennan	edbc356593	FIX: Replace deprecated URI.encode, URI.escape, URI.unescape and URI.unencode (#8528) The following methods have long been deprecated in ruby due to flaws in their implementation per http://blade.nagaokaut.ac.jp/cgi-bin/vframe.rb/ruby/ruby-core/29293?29179-31097: URI.escape, URI.unescape, URI.encode, URI.unencode. escape/encode are just aliases for one another. This PR uses the Addressable gem to replace these methods with its own encode, unencode, and encode_component methods where appropriate. I have put all references to Addressable::URI here into the UrlHelper to keep them corralled in one place to make changes to this implementation easier. Addressable is now also an explicit gem dependency.	2019-12-12 12:49:21 +10:00
Joffrey JAFFEUX	0d3d2c43a0	DEV: s/\$redis/Discourse\.redis (#8431) This commit also adds a rubocop rule to prevent global variables.	2019-12-03 10:05:53 +01:00
Arpit Jalan	00c406520e	FEATURE: allow FinalDestination to use custom user agent for specific hosts	2019-11-07 14:47:51 +05:30
Arpit Jalan	e90aac11cb	fix the build	2019-08-07 16:39:58 +05:30
Arpit Jalan	b0e781e2d4	FIX: do not follow redirect on same host with path /login or /session	2019-08-07 16:26:55 +05:30
Sam Saffron	7429700389	FIX: ensure we can download maxmind without redis or db config This also corrects FileHelper.download so it supports "follow_redirect" correctly (it used to always follow 1 redirect) and adds a `validate_url` param that will bypass all uri validation if set to false (default is true)	2019-05-28 10:28:57 +10:00
Sam Saffron	30990006a9	DEV: enable frozen string literal on all files This reduces chances of errors where consumers of strings mutate inputs and reduces memory usage of the app. Test suite passes now, but there may be some stuff left, so we will run a few sites on a branch prior to merging	2019-05-13 09:31:32 +08:00
Gerhard Schlager	92df6890df	FIX: GET request didn't use headers	2019-03-08 21:36:49 +01:00
Sam	cfddfa6de2	SECURITY: bypass long GET requests In some rare cases we would check URLs with very large payloads; this ensures we always bypass and do not read entire payloads	2019-02-27 14:51:28 +11:00
Arpit Jalan	1ab91f0474	FIX: preserve github fragment URL	2018-12-19 12:34:47 +05:30
Guo Xiang Tan	8dc1463ab3	Enable `Lint/ShadowingOuterLocalVariable` for Rubocop.	2018-09-04 10:16:42 +08:00
Bianca Nenciu	b6963b8ffb	FIX: Ignore OneBox blacklisted domains.	2018-08-27 20:40:55 +02:00
Régis Hanol	de92913bf4	FIX: store the topic links using the cooked upload url	2018-08-14 12:23:32 +02:00
Robin Ward	7058205f70	FIX: Broken specs	2018-07-24 12:00:34 -04:00
Robin Ward	236243f38a	SECURITY: Consider `0.0.0.0` a private IP	2018-07-24 11:16:27 -04:00
Guo Xiang Tan	d43895e2a0	Don't log 404s for `FinalDestination`. * We can't do anything about 404s	2018-05-25 10:11:16 +08:00
Guo Xiang Tan	142571bba0	Remove use of `rescue nil`. * `rescue nil` is a really bad pattern to use in our code base. We should rescue errors that we expect the code to throw and not rescue everything because we're unsure of what errors the code would throw. This would reduce the amount of pain we face when debugging why something isn't working as expected. I've been bitten countless times by errors being swallowed as a result during debugging sessions.	2018-04-02 13:52:51 +08:00
Guo Xiang Tan	ee69d58a59	FIX: Tests could get stuck in an infinite loop if they fail to resolve the IP of a hostname.	2018-03-28 14:49:05 +08:00
Gerhard Schlager	4a54c09e46	FIX: Retry with GET request when HEAD fails with error 400	2018-02-27 12:07:16 +01:00
Régis Hanol	0559a4736a	FIX: don't double request when downloading a file	2018-02-24 12:35:57 +01:00
Gerhard Schlager	b6277e208b	FIX: Cookies header didn't have the right format	2018-02-19 12:46:57 +01:00
Sam	fa5880e04f	PERF: ability to crawl for titles without extra HEAD req Also, introduces a much more aggressive timeout for title crawling and introduces gzip to body that is crawled	2018-01-29 15:40:12 +11:00
Gerhard Schlager	e30851e45a	Move escape_uri method to a more suitable place	2017-12-12 20:17:46 +01:00
Régis Hanol	de037da731	FIX: FinalDestination's small_get method wasn't using proper request headers	2017-11-17 17:24:35 +01:00
Régis Hanol	aebcd56300	FIX: try a GET for error code 406	2017-11-17 16:59:51 +01:00
Régis Hanol	221ff24418	SQL != Ruby	2017-11-17 16:12:20 +01:00
Régis Hanol	a0fc8bd924	don't log 404s to gravatar.com	2017-11-17 15:38:26 +01:00
Sam	3ac7d041ae	UX: generic onebox treats all square images as avatars and renders them smaller	2017-11-13 11:21:19 +11:00
Gerhard Schlager	d1f257d275	FinalDestination should only log when verbose is enabled	2017-10-31 17:16:59 +01:00
Gerhard Schlager	8c27f28dcb	add more logging to FinalDestination	2017-10-31 12:26:35 +01:00
Sam Saffron	8185b8cb06	FEATURE: cache https redirects per hostname If a hostname does an https redirect we cache that so next lookup does not incur it. Also, only rate limit per ip once per final destination. Raise final destination protection to 1000 ip lookups an hour.	2017-10-17 16:22:54 +11:00
Sam	70bb2aa426	FEATURE: allow specifying s3 config via globals This refactors handling of s3 so it can be specified via GlobalSetting. This means that in a multisite environment you can configure s3 uploads without actual sites knowing credentials in s3. It is a critical setting for situations where assets are mirrored to s3.	2017-10-06 16:20:01 +11:00
Sam	8ecf313a81	FIX: correctly raise errors when downloads fail This corrects an issue where we are hitting Gravatar for 404 over and over. Also ensures file download properly reports errors.	2017-09-28 16:35:43 +10:00
Guo Xiang Tan	5324c01209	FIX: Don't raise an error if reading from URL times out.	2017-09-27 14:53:22 +08:00
Guo Xiang Tan	367fb1c524	FIX: Onebox fails on encoded URL. https://meta.discourse.org/t/onebox-breaks-if-theres-chinese-text-in-url/67364	2017-09-26 18:34:54 +08:00
Joffrey JAFFEUX	6cd8203686	FIX: allows onebox to force GET hosts returning wrong headers on HEAD	2017-08-08 11:44:27 +02:00
Arpit Jalan	b059a0f789	extract url escaping to a dedicated class method and improved tests	2017-07-29 22:16:51 +05:30
Arpit Jalan	1fe553873c	FIX: preserve fragment identifier when escaping url	2017-07-29 17:22:45 +05:30
Guo Xiang Tan	5012d46cbd	Add rubocop to our build. (#5004)	2017-07-28 10:20:09 +09:00
Guo Xiang Tan	b534778f46	FIX: Escape URL before attempting to resolve it.	2017-07-18 10:04:24 +09:00
Guo Xiang Tan	089a1bd3be	Specify the error that we want to ignore instead of rescuing all errors.	2017-07-18 09:55:52 +09:00
Robin Ward	db485ae0da	FIX: Support for skipping redirects on certain domains (like steam)	2017-06-26 15:38:43 -04:00

62 Commits