Commit Graph

26 Commits

Author SHA1 Message Date
David Taylor
5a003715d3
DEV: Apply syntax_tree formatting to app/* 2023-01-09 14:14:59 +00:00
Régis Hanol
aa1138ff71
FIX: reindex_search job should work on model with no search data ()
Lots of changes but it's mostly a refactoring.

The interesting part that was fix are the 'load_problem_<model>_ids' methods.
They will now return records with no search data associated so they can be properly indexed for the search.
This "bad" state usually happens after a migration.
2021-01-25 11:23:36 +01:00
Guo Xiang Tan
069a109cbb
DEV: Require scheduled job in development to avoid loading file twice.
This removes the need to memoize constant in order to avoid the "warning: already initialized constant".
2020-09-01 10:14:40 +08:00
Guo Xiang Tan
3766122a82
DEV: Allow developmental post search index versions. 2020-07-23 15:19:46 +08:00
Guo Xiang Tan
609ba50fe8
DEV: Add more granularity to SearchIndexer versions.
Sometimes, we just want to reindex a specific model and not all the
things.
2020-07-23 14:24:06 +08:00
Sam Saffron
9ffc022cf4
DEV: improve verbose mode for reindexer
This makes the verbose mode provide a bit of progress notification
while reindexing as it can take many hours to do a giant site
2020-06-24 17:29:45 +10:00
Sam Saffron
dcad720a4c
DEV: add optional verbose logging to re-index job
This verbose logging can be useful when executing the job by hand
for debugging purposes

In general people will not use this
2020-06-24 15:37:08 +10:00
Sam Saffron
451e9c4bb9
DEV: minor SQL formatting change
Moved join prior to left join to make query less confusing.

Has no material impact on performance.
2020-05-12 16:55:42 +10:00
Joffrey JAFFEUX
b6ef473a31
DEV: prevents redefinition of various constants ()
CLEANUP_GRACE_PERIOD
MAX_AWARDED
POLL_MAILBOX_TIMEOUT_ERROR_KEY
2019-10-17 09:18:07 +02:00
Krzysztof Kotlarek
427d54b2b0 DEV: Upgrading Discourse to Zeitwerk ()
Zeitwerk simplifies working with dependencies in dev and makes it easier reloading class chains. 

We no longer need to use Rails "require_dependency" anywhere and instead can just use standard 
Ruby patterns to require files.

This is a far reaching change and we expect some followups here.
2019-10-02 14:01:53 +10:00
Sam Saffron
22abad4151 PERF: stop reindexing and skipping deleted posts 2019-06-04 17:53:35 +10:00
Sam Saffron
74c4f926fc FIX: drop deleted posts from search index
This does two things

1. Our "index grace period" has been wound down to 1 day, there is no point
keeping a bloated index for a week, usually when people delete stuff they
mean for it to be removed

2. We were never dropping deleted posts from the index, only posts from
deleted topics

These changes speed up search a tiny bit and reduce background work.
2019-06-04 17:19:59 +10:00
Sam Saffron
77300c1d8d DEV: reindex old data in a more consistent way
Previously we were grabbing arbitrary rows in many cases which makes
diagnosing issues in the indexer more complex
2019-06-04 11:47:24 +10:00
Sam Saffron
30990006a9 DEV: enable frozen string literal on all files
This reduces chances of errors where consumers of strings mutate inputs
and reduces memory usage of the app.

Test suite passes now, but there may be some stuff left, so we will run
a few sites on a branch prior to merging
2019-05-13 09:31:32 +08:00
Guo Xiang Tan
108c231d1c FIX: Clean up topic_search_data of trashed topics.
This keeps the index and table smaller.
2019-04-08 16:53:39 +08:00
Guo Xiang Tan
c4997ce85f FIX: Don't try and reindex posts that have been trashed. 2019-04-08 16:38:43 +08:00
Guo Xiang Tan
d151425353
PERF: Delete search data of posts from trashed topics periodically. ()
This keeps both the index and table smaller.
2019-04-03 10:10:41 +08:00
Guo Xiang Tan
d8704c11ca PERF: Better use of index when queueing a topci for search reindex.
Also move `Search::INDEX_VERSION` to `SearchIndexer` which is where the
version is actually being used.
2019-04-02 09:53:37 +08:00
Guo Xiang Tan
aa2311a7b0 FIX: Don't reindex posts belonging to a deleted topic for search.
Posts belonging to a deleted topic can't be index for search so we need
to avoid loading those post ids.
2019-04-02 07:36:53 +08:00
Guo Xiang Tan
3fc5dbb045 FIX: Don't attempt to reindex posts that have an empty raw.
If the post ids keep loading, we might end up in a situations where
we're always loading the same post ids over and over again without
indexing anything new.

Follow up to daeda80ada.
2019-04-02 07:13:33 +08:00
Guo Xiang Tan
daeda80ada
FIX: Don't index posts with empty Post#raw for search. ()
* DEV: Remove unnecessary join in `Jobs::ReindexSearch`.

* FIX: Don't index posts with empty `Post#raw` for search.
2019-04-01 10:06:27 +08:00
Sam
86d12bd44b FEATURE: search within title using in:title
Also

- Significantly improved search ranking, title is treated most strongly
- Adds tag names to the index
- Run search re-indexer more aggressively
- Re-index topic and all posts on category change
2018-02-20 14:41:21 +11:00
Neil Lalonde
2c56f8df7c FEATURE: show tags in search results 2017-08-25 11:52:59 -04:00
Sam Saffron
56f7b4e01e PERF: reindex search data without loading large post counts 2017-08-16 08:18:59 -04:00
Erick Guan
6e59149a77 FIX: rebuild index when engine replaced () 2017-08-16 07:38:34 -04:00
Sam
760e9a756d PERF: push reindex job to daily 2014-07-01 10:09:55 +10:00