Commit Graph

17 Commits

Dan Ungureanu
4e46732346
FEATURE: Implement browser update in crawler view (#12448)
browser-update script does not work correctly in some very old browsers
because the contents of <noscript> are not accessible from JavaScript.
For these browsers, the server can display the crawler page and add the
browser update notice.

Simply loading the browser-update script in the crawler view is not a
solution because that means all crawlers will also see it.
2021-03-22 19:41:42 +02:00
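
A minimal sketch of the server-side check this describes; the cutoff pattern and method name are illustrative assumptions, not Discourse's actual code:

    # Hypothetical sketch: browsers too old to run the browser-update
    # script (their <noscript> contents are unreachable from JavaScript)
    # get the crawler view with a static update notice instead.
    OLD_BROWSER_PATTERN = /MSIE [1-8]\.|Trident\/[1-4]\./  # assumed cutoff

    def serve_crawler_view_with_update_notice?(user_agent)
      user_agent.to_s.match?(OLD_BROWSER_PATTERN)
    end
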
Krzysztof Kotlarek
e0d9232259
FIX: use allowlist and blocklist terminology (#10209)
This PR renames whitelist to allowlist and blacklist to blocklist.
2020-07-27 10:23:54 +10:00
Dan Ungureanu
3ed6a0e904
FIX: Detect Wayback Machine using user agent (#9777) 2020-05-14 21:10:07 +10:00
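
A hedged sketch of the idea: recognize the Wayback Machine from its user agent (which typically mentions archive.org) rather than from the request host; the helper name is illustrative:

    # Sketch: the Wayback Machine crawler identifies itself in the user
    # agent, so the check need not depend on the request hostname.
    def wayback_machine?(user_agent)
      user_agent.to_s.downcase.include?("archive.org")
    end
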
Gerhard Schlager
d12f2580de FIX: Serve crawler view to Google PageSpeed 2019-11-27 22:15:34 +01:00
Krzysztof Kotlarek
427d54b2b0 DEV: Upgrading Discourse to Zeitwerk (#8098)
Zeitwerk simplifies working with dependencies in dev and makes reloading class chains easier.

We no longer need to use Rails "require_dependency" anywhere and instead can just use standard 
Ruby patterns to require files.

This is a far reaching change and we expect some followups here.
2019-10-02 14:01:53 +10:00
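
As a sketch of what the change means in practice (illustrative, not the actual diff):

    # Before Zeitwerk, reload-aware loading needed Rails' helper:
    #
    #   require_dependency 'crawler_detection'
    #
    # With Zeitwerk, file paths map to constant names by convention
    # (lib/crawler_detection.rb <-> CrawlerDetection), so the constant
    # autoloads on first reference and the require_dependency call can
    # simply be deleted.
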
Maja Komel
42809f4d69 FIX: use crawler layout when saving url in Wayback Machine (#7667) 2019-06-03 12:13:32 +10:00
Penar Musaraj
8f2c442435 Fix tests 2019-05-08 09:58:47 -04:00
Sam Saffron
4ea21fa2d0 DEV: use #frozen_string_literal: true on all spec
This change both speeds up specs (fewer strings to allocate) and helps catch
cases where methods in Discourse are mutating inputs.

Overall we will be migrating everything to use #frozen_string_literal: true.
It will take a while, but this is the first and safest move in this direction.
2019-04-30 10:27:42 +10:00
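
For illustration, the magic comment in action (a generic Ruby example, not from the diff):

    # frozen_string_literal: true

    greeting = "hello"
    greeting << " world"  # raises FrozenError: can't modify frozen String
    # Without the magic comment the append would silently mutate the
    # literal, hiding bugs where a method mutates its input.
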
Sam
f66efc601d FIX: Cubot Android devices were detected as crawlers 2018-06-21 10:56:46 +10:00
Neil Lalonde
ced7e9a691 FEATURE: control which web crawlers can access using a whitelist or blacklist 2018-03-22 15:41:02 -04:00
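
A minimal sketch of list-based crawler control, assuming pipe-delimited settings (these are the lists later renamed in commit e0d9232259; method and parameter names here are hypothetical):

    # Sketch: with an allowlist set, only matching crawlers get through;
    # otherwise anything on the blocklist is rejected.
    def crawler_allowed?(user_agent, allowlist, blocklist)
      ua = user_agent.to_s.downcase
      if !allowlist.empty?
        allowlist.downcase.split('|').any? { |word| ua.include?(word) }
      else
        blocklist.downcase.split('|').none? { |word| ua.include?(word) }
      end
    end

    crawler_allowed?("Googlebot/2.1", "googlebot|bingbot", "")  # => true
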
Sam
7b562d2f46 FEATURE: much improved and simplified crawler detection
- phase one: does it match 'trident|webkit|gecko|chrome|safari|msie|opera'?
    If yes, it is possibly a browser.

- phase two: does it match 'rss|bot|spider|crawler|facebook|archive|wayback|ping|monitor'?
    If yes, it is probably a crawler.

Based off: https://gist.github.com/SamSaffron/6cfad7ea3e6df321ffb7a84f93720a53
2018-01-16 15:41:45 +11:00
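
The gist as runnable Ruby, with the two patterns taken verbatim from the message above (the method name is illustrative):

    BROWSER_PATTERN = /trident|webkit|gecko|chrome|safari|msie|opera/
    CRAWLER_PATTERN = /rss|bot|spider|crawler|facebook|archive|wayback|ping|monitor/

    def crawler?(user_agent)
      ua = user_agent.to_s.downcase
      return true unless BROWSER_PATTERN.match?(ua)  # phase one: not even browser-like
      CRAWLER_PATTERN.match?(ua)                     # phase two: crawler keywords win
    end
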
Sam
f6fdc1ebe8 FEATURE: flexible crawler detection
You can use the crawler user agents site setting to amend which user agents
are considered crawlers, based on a string match in the user agent.

Also slightly improves the performance of crawler detection.
2017-09-29 12:31:50 +10:00
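
A sketch of the setting-driven match, assuming a pipe-delimited value (the setting value shown is an example, not a default):

    # Building one case-insensitive regex from the setting up front keeps
    # the per-request check to a single match (the slight performance win
    # the message mentions).
    crawler_user_agents = "googlebot|bingbot|yandexbot"  # example setting value
    CRAWLER_RE = Regexp.new(
      crawler_user_agents.split('|').map { |s| Regexp.escape(s) }.join('|'),
      Regexp::IGNORECASE
    )

    CRAWLER_RE.match?("Mozilla/5.0 (compatible; YandexBot/3.0)")  # => true
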
Robin Ward
2a4006fe0c Add YandexBot to our list of crawlers 2016-07-26 13:21:37 -04:00
Andy Waite
3e50313fdc Prepare for separation of RSpec helper files
Since rspec-rails 3, the default installation creates two helper files:
* `spec_helper.rb`
* `rails_helper.rb`

`spec_helper.rb` is intended as a way of running specs that do not
require Rails, whereas `rails_helper.rb` loads Rails (as Discourse's
current `spec_helper.rb` does).

For more information:

https://www.relishapp.com/rspec/rspec-rails/docs/upgrade#default-helper-files

In this commit, I've simply replaced all instances of `spec_helper` with
`rails_helper`, and renamed the original `spec_helper.rb`.

This brings the Discourse project closer to the standard usage of RSpec
in a Rails app.

At present, every spec relies on loading Rails, but there are likely
many that don't need to. In a future pull request, I hope to introduce a
separate, minimal `spec_helper.rb` which can be used in tests which
don't rely on Rails.
2015-12-01 20:39:42 +00:00
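
The post-split layout, roughly, following the standard rspec-rails arrangement this commit moves toward (contents abridged; two separate files shown in one sketch):

    # spec/rails_helper.rb -- loads Rails; specs that need Rails require this
    ENV['RAILS_ENV'] ||= 'test'
    require File.expand_path('../../config/environment', __FILE__)
    require 'rspec/rails'

    # spec/spec_helper.rb -- Rails-free; a future minimal helper could let
    # plain-Ruby specs skip booting Rails entirely
    RSpec.configure do |config|
      config.order = :random
    end
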
Luciano Sousa
0fd98b56d8 Convert a few components to RSpec 3 syntax 2015-01-09 13:34:37 -03:00
Vikhyat Korrapati
e3702ecb30 Improved crawler detection: add Twitterbot, Facebook, curl, Bing, Baidu. 2014-03-16 19:30:20 +05:30
Robin Ward
c4b5455c21 REFACTOR: Rename GooglebotDetection to CrawlerDetection because we
will likely whitelist more crawlers in the future.
2014-02-20 16:07:02 -05:00