discourse/app/jobs/scheduled/clean_up_crawler_stats.rb
Sam 5f64fd0a21 DEV: remove exec_sql and replace with mini_sql
Introduce new patterns for direct sql that are safe and fast.

MiniSql is not prone to the memory bloat that can happen with direct PG usage.
It also has an extremely fast materializer and a very convenient API.

- DB.exec(sql, *params) => runs sql, returns the row count
- DB.query(sql, *params) => runs sql, returns usable objects (not hashes)
- DB.query_hash(sql, *params) => runs sql, returns an array of hashes
- DB.query_single(sql, *params) => runs sql, returns a flat one-dimensional array
- DB.build(sql) => returns a sql builder
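
To make the return shapes in the list above concrete, here is a plain-Ruby sketch. It does not call MiniSql or touch a database; the column names, the sample rows, and the `Struct` stand-in are all assumptions made up for illustration, and the real objects returned by `DB.query` may differ in detail.

```ruby
# Plain-Ruby illustration only: no database and no MiniSql involved.
# Hypothetical result set: two columns, two rows.
columns = [:id, :name]
rows = [[1, "googlebot"], [2, "bingbot"]]

# Shape described for DB.query_hash: an array of hashes, one per row.
as_hashes = rows.map { |row| columns.zip(row).to_h }

# Shape described for DB.query_single: one flat, one-dimensional array.
as_flat = rows.flatten

# Shape described for DB.query: objects with one reader per column
# (a Struct is used here purely as a stand-in).
row_class = Struct.new(*columns)
as_objects = rows.map { |row| row_class.new(*row) }

p as_hashes
p as_flat
p as_objects.map(&:name)
```

The flat form is convenient when selecting a single column, since the result can be used directly as a list of values.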

See more at: https://github.com/discourse/mini_sql
2018-06-19 16:13:36 +10:00

27 lines
717 B
Ruby

module Jobs
  class CleanUpCrawlerStats < Jobs::Scheduled
    every 1.day

    def execute(args)
      WebCrawlerRequest.where('date < ?', WebCrawlerRequest.max_record_age.ago).delete_all

      # keep count of only the top user agents
      DB.exec <<~SQL
        WITH ranked_requests AS (
          SELECT row_number() OVER (ORDER BY count DESC) as row_number, id
            FROM web_crawler_requests
           WHERE date = '#{1.day.ago.strftime("%Y-%m-%d")}'
        )
        DELETE FROM web_crawler_requests
        WHERE id IN (
          SELECT ranked_requests.id
            FROM ranked_requests
           WHERE row_number > #{WebCrawlerRequest.max_records_per_day}
        )
      SQL
    end
  end
end
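
The job above does two things: it deletes rows older than the maximum record age, then for yesterday's rows it keeps only the highest-count entries. The sketch below models that logic on an in-memory array so the ranking step is easier to follow. It is not the job's implementation: the row struct, `prune_crawler_rows`, and the limits of 30 days and 2 rows are hypothetical, and the real limits come from `WebCrawlerRequest.max_record_age` and `WebCrawlerRequest.max_records_per_day`.

```ruby
require "date"

# Hypothetical in-memory stand-in for a web_crawler_requests row.
crawler_row = Struct.new(:id, :date, :user_agent, :count)

# Mirrors the job's two steps on a plain array:
#   1. drop rows older than max_age_days,
#   2. among yesterday's rows, keep only the max_per_day highest counts.
# As with row_number() in SQL, the order among equal counts is unspecified.
def prune_crawler_rows(rows, today:, max_age_days:, max_per_day:)
  cutoff = today - max_age_days
  yesterday = today - 1

  fresh = rows.reject { |r| r.date < cutoff }

  ranked = fresh.select { |r| r.date == yesterday }.sort_by { |r| -r.count }
  excess_ids = ranked.drop(max_per_day).map(&:id)

  fresh.reject { |r| excess_ids.include?(r.id) }
end

today = Date.new(2018, 6, 19)
rows = [
  crawler_row.new(1, today - 40, "OldBot", 9),  # older than the cutoff
  crawler_row.new(2, today - 1,  "BotA",   50), # yesterday, rank 1
  crawler_row.new(3, today - 1,  "BotB",   5),  # yesterday, rank 3
  crawler_row.new(4, today - 1,  "BotC",   20), # yesterday, rank 2
  crawler_row.new(5, today,      "BotA",   1),  # today, not ranked
]

kept = prune_crawler_rows(rows, today: today, max_age_days: 30, max_per_day: 2)
puts kept.map(&:id).inspect # prints [2, 4, 5]
```

Only yesterday's rows are ranked, so rows from other days that fall inside the age window are left alone, which matches the `WHERE date = ...` filter in the CTE.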