DEV: add the notion of a 'crawler identifier' in anonymous_cache

We identify and deny blocked crawlers here in anonymous_cache. Extracting the crawler identifier into its own method lets plugins override it when they perform more advanced detection.
2024-12-20 11:43:54 +08:00 · 2024-12-05 16:24:21 -05:00 · 2024-12-05 16:24:21 -05:00 · c546111703
commit c546111703
parent 6e54696003
1 changed file with 5 additions and 1 deletion
--- a/lib/middleware/anonymous_cache.rb
+++ b/lib/middleware/anonymous_cache.rb
@@ -78,13 +78,17 @@ module Middleware
        @request = request || Rack::Request.new(@env)
      end

+      def crawler_identifier
+        @user_agent
+      end
+
      def blocked_crawler?
        @request.get? && !@request.xhr? && !@request.path.ends_with?("robots.txt") &&
          !@request.path.ends_with?("srv/status") &&
          @request[Auth::DefaultCurrentUserProvider::API_KEY].nil? &&
          @env[Auth::DefaultCurrentUserProvider::USER_API_KEY].nil? &&
          @env[Auth::DefaultCurrentUserProvider::HEADER_API_KEY].nil? &&
-          CrawlerDetection.is_blocked_crawler?(@user_agent)
+          CrawlerDetection.is_blocked_crawler?(crawler_identifier)
      end

      # rubocop:disable Lint/BooleanSymbol
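
To illustrate the override the commit message describes, here is a minimal sketch of how a plugin might replace `crawler_identifier` using `Module#prepend`. Everything in it other than the method names `crawler_identifier` and `blocked_crawler?` is an assumption made for illustration: `HelperSketch` is a stand-in for the helper in lib/middleware/anonymous_cache.rb, `BLOCKED` stands in for `CrawlerDetection`, and the `HTTP_X_DETECTED_BOT` header is hypothetical.

```ruby
# Stand-in for the helper class in lib/middleware/anonymous_cache.rb.
# This is NOT the real class; it only mirrors the shape needed for the example.
class HelperSketch
  # Hypothetical blocklist standing in for CrawlerDetection.is_blocked_crawler?
  BLOCKED = ["badbot"].freeze

  def initialize(env)
    @env = env
    @user_agent = env["HTTP_USER_AGENT"]
  end

  # Default behaviour introduced by this commit: the raw user agent.
  def crawler_identifier
    @user_agent
  end

  # Simplified check: only the identifier lookup matters for this sketch.
  def blocked_crawler?
    identifier = crawler_identifier.to_s.downcase
    BLOCKED.any? { |name| identifier.include?(name) }
  end
end

# Hypothetical plugin module: prefer a label set by an upstream detector
# (assumed header name), falling back to the default user agent.
module AdvancedCrawlerIdentifier
  def crawler_identifier
    @env["HTTP_X_DETECTED_BOT"] || super
  end
end

HelperSketch.prepend(AdvancedCrawlerIdentifier)

spoofed = HelperSketch.new("HTTP_USER_AGENT" => "Mozilla/5.0", "HTTP_X_DETECTED_BOT" => "BadBot")
plain = HelperSketch.new("HTTP_USER_AGENT" => "Mozilla/5.0")

puts spoofed.blocked_crawler? # true: the override supplies a blocked identifier
puts plain.blocked_crawler?   # false: falls back to the user agent
```

Because `blocked_crawler?` now calls `crawler_identifier` rather than reading `@user_agent` directly, an override like this changes which value is checked without touching the rest of the blocking logic.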