19 Commits

Author SHA1 Message Date
Robin Ward
b23fc2bf84 Helper to find the final destination for a URL 2017-05-22 15:52:41 -04:00
Robin Ward
773445b8df FIX: Topic Crawling should only crawl HTTP/S urls 2017-05-22 11:57:20 -04:00
Robin Ward
ea9f93dcc5 FIX: Don't crawl non-http/s links 2017-05-19 16:57:41 -04:00
Sam
add6e12ce4 FIX: topic links with long titles can not be crawled
0..255 == 256 numbers; the column fits 255
2015-08-18 17:34:46 +10:00
Robin Ward
1434e46ed2 FIX: Excon was wrapping our ReadOnly exception
This was preventing the crawling of many topic links
2015-05-27 14:29:52 -04:00
Sam
cd9e499b77 Don't try loading embeds on deleted topics 2015-05-06 16:53:28 +10:00
Sam
bb20f64cb2 use standard error so it's easier to catch 2015-03-23 12:20:50 +11:00
Akshay
6301a43d57 Not initializing variable for looping if unused in loop 2014-08-15 03:24:55 +05:30
Sam
a2e2d0e886 Merge pull request #2316 from mutiny/refactor-where-first
Refactor `where(...).first` to `find_by(...)`
2014-05-08 09:10:45 +10:00
Camille Roux
f14c71b9d4 Fix the Amazon links regex 2014-05-06 19:19:32 +02:00
Camille Roux
e77e7f23ca Update the Amazon links regexp
Added all the countries displayed in the Amazon footer
2014-05-06 18:36:07 +02:00
Louis Rose
1574485443 Perform the where(...).first to find_by(...) refactoring.
This refactoring was automated using the command: bundle exec "ruby refactorings/where_dot_first_to_find_by/app.rb"
2014-05-06 14:41:59 +01:00
Robin Ward
a57f802048 If there's a TopicEmbed record for a url, we don't have to crawl it.
This should help sites like Boing Boing where sometimes links are
crawled before saved in WordPress.
2014-04-17 14:00:22 -04:00
Robin Ward
e80851b0fa Special case: When crawling a link to an image, just put the filename as
the title.
2014-04-10 13:45:13 -04:00
Robin Ward
99e2bab62d Use update_all to prevent after_commit from executing again. 2014-04-10 13:19:57 -04:00
Robin Ward
aa63868d5e FIX: Problem crawling amazon titles 2014-04-08 16:39:47 -04:00
Robin Ward
1e3faddfe4 FIX: Change crawl size to 10k. Youtube for example doesn't work with the
first 1k
2014-04-07 16:03:47 -04:00
Robin Ward
7e0028ba50 FIX: Don't crawl in test mode, raise correct exception when parameters
are missing
2014-04-07 14:38:18 -04:00
Robin Ward
7e3ea5d644 Support for crawling topic links 2014-04-07 14:08:34 -04:00