PERF: Improve database query perf when loading topics for a category. (#14416)

* PERF: Improve database query perf when loading topics for a category.

Instead of left joining the `topics` table against `categories` by filtering with `categories.id`,
we can improve the query plan by filtering against `topics.category_id`
first before joining which helps to reduce the number of rows in the
topics table that has to be joined against the other tables and also
make better use of our existing index.

The following is a before and after of the query plan for a category
with many subcategories.

Before:

```
                                                                                                       QUERY PLAN

-------------------------------------------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------
 Limit  (cost=1.28..747.09 rows=30 width=12) (actual time=85.502..2453.727 rows=30 loops=1)
   ->  Nested Loop Left Join  (cost=1.28..566518.36 rows=22788 width=12) (actual time=85.501..2453.722 rows=30 loops=1)
         Join Filter: (category_users.category_id = topics.category_id)
         Filter: ((topics.category_id = 11) OR (COALESCE(category_users.notification_level, 1) <> 0) OR (tu.notification_level > 1))
         ->  Nested Loop Left Join  (cost=1.00..566001.58 rows=22866 width=20) (actual time=85.494..2453.702 rows=30 loops=1)
               Filter: ((COALESCE(tu.notification_level, 1) > 0) AND ((topics.category_id <> 11) OR (topics.pinned_at IS NULL) OR ((t
opics.pinned_at <= tu.cleared_pinned_at) AND (tu.cleared_pinned_at IS NOT NULL))))
               Rows Removed by Filter: 1
               ->  Nested Loop  (cost=0.57..528561.75 rows=68606 width=24) (actual time=85.472..2453.562 rows=31 loops=1)
                     Join Filter: ((topics.category_id = categories.id) AND ((categories.topic_id <> topics.id) OR (categories.id = 1
1)))
                     Rows Removed by Join Filter: 13938306
                     ->  Index Scan using index_topics_on_bumped_at on topics  (cost=0.42..100480.05 rows=715549 width=24) (actual ti
me=0.010..633.015 rows=464623 loops=1)
                           Filter: ((deleted_at IS NULL) AND ((archetype)::text <> 'private_message'::text))
                           Rows Removed by Filter: 105321
                     ->  Materialize  (cost=0.14..36.04 rows=30 width=8) (actual time=0.000..0.002 rows=30 loops=464623)
                           ->  Index Scan using categories_pkey on categories  (cost=0.14..35.89 rows=30 width=8) (actual time=0.006.
.0.040 rows=30 loops=1)
                                 Index Cond: (id = ANY ('{11,53,57,55,54,56,112,94,107,115,116,117,97,95,102,103,101,105,99,114,106,1
13,104,98,100,96,108,109,110,111}'::integer[]))
               ->  Index Scan using index_topic_users_on_topic_id_and_user_id on topic_users tu  (cost=0.43..0.53 rows=1 width=16) (a
ctual time=0.004..0.004 rows=0 loops=31)
                     Index Cond: ((topic_id = topics.id) AND (user_id = 1103877))
         ->  Materialize  (cost=0.28..2.30 rows=1 width=8) (actual time=0.000..0.000 rows=0 loops=30)
               ->  Index Scan using index_category_users_on_user_id_and_last_seen_at on category_users  (cost=0.28..2.29 rows=1 width
=8) (actual time=0.004..0.004 rows=0 loops=1)
                     Index Cond: (user_id = 1103877)
 Planning Time: 1.359 ms
 Execution Time: 2453.765 ms
(23 rows)
```

After:

```
                                                                                                                            QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=1.28..438.55 rows=30 width=12) (actual time=38.297..657.215 rows=30 loops=1)
   ->  Nested Loop Left Join  (cost=1.28..195944.68 rows=13443 width=12) (actual time=38.296..657.211 rows=30 loops=1)
         Filter: ((categories.topic_id <> topics.id) OR (topics.category_id = 11))
         Rows Removed by Filter: 29
         ->  Nested Loop Left Join  (cost=1.13..193462.59 rows=13443 width=16) (actual time=38.289..657.092 rows=59 loops=1)
               Join Filter: (category_users.category_id = topics.category_id)
               Filter: ((topics.category_id = 11) OR (COALESCE(category_users.notification_level, 1) <> 0) OR (tu.notification_level > 1))
               ->  Nested Loop Left Join  (cost=0.85..193156.79 rows=13489 width=20) (actual time=38.282..657.059 rows=59 loops=1)
                     Filter: ((COALESCE(tu.notification_level, 1) > 0) AND ((topics.category_id <> 11) OR (topics.pinned_at IS NULL) OR ((topics.pinned_at <= tu.cleared_pinned_at) AND (tu.cleared_pinned_at IS NOT NULL))))
                     Rows Removed by Filter: 1
                     ->  Index Scan using index_topics_on_bumped_at on topics  (cost=0.42..134521.06 rows=40470 width=24) (actual time=38.267..656.850 rows=60 loops=1)
                           Filter: ((deleted_at IS NULL) AND ((archetype)::text <> 'private_message'::text) AND (category_id = ANY ('{11,53,57,55,54,56,112,94,107,115,116,117,97,95,102,103,101,105,99,114,106,113,104,98,100,96,108,109,110,111}'::integer[])))
                           Rows Removed by Filter: 569895
                     ->  Index Scan using index_topic_users_on_topic_id_and_user_id on topic_users tu  (cost=0.43..1.43 rows=1 width=16) (actual time=0.003..0.003 rows=0 loops=60)
                           Index Cond: ((topic_id = topics.id) AND (user_id = 1103877))
               ->  Materialize  (cost=0.28..2.30 rows=1 width=8) (actual time=0.000..0.000 rows=0 loops=59)
                     ->  Index Scan using index_category_users_on_user_id_and_last_seen_at on category_users  (cost=0.28..2.29 rows=1 width=8) (actual time=0.004..0.004 rows=0 loops=1)
                           Index Cond: (user_id = 1103877)
         ->  Index Scan using categories_pkey on categories  (cost=0.14..0.17 rows=1 width=8) (actual time=0.001..0.001 rows=1 loops=59)
               Index Cond: (id = topics.category_id)
 Planning Time: 1.633 ms
 Execution Time: 657.255 ms
(22 rows)
```

* PERF: Optimize index on topics bumped_at.

Replace `index_topics_on_bumped_at` index with a partial index on `Topic#bumped_at` filtered by archetype since there is already another index that covers private topics.
This commit is contained in:
Alan Guo Xiang Tan 2021-09-28 10:05:00 +08:00 committed by GitHub
parent ba17d9106e
commit cd64e88711
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
4 changed files with 111 additions and 17 deletions

View File

@ -1846,7 +1846,7 @@ end
# idx_topics_user_id_deleted_at (user_id) WHERE (deleted_at IS NULL)
# idxtopicslug (slug) WHERE ((deleted_at IS NULL) AND (slug IS NOT NULL))
# index_topics_on_bannered_until (bannered_until) WHERE (bannered_until IS NOT NULL)
# index_topics_on_bumped_at (bumped_at)
# index_topics_on_bumped_at_public (bumped_at) WHERE ((deleted_at IS NULL) AND ((archetype)::text <> 'private_message'::text))
# index_topics_on_created_at_and_visible (created_at,visible) WHERE ((deleted_at IS NULL) AND ((archetype)::text <> 'private_message'::text))
# index_topics_on_id_and_deleted_at (id,deleted_at)
# index_topics_on_id_filtered_banner (id) UNIQUE WHERE (((archetype)::text = 'banner'::text) AND (deleted_at IS NULL))

View File

@ -0,0 +1,27 @@
# frozen_string_literal: true
class AlterBumpedAtIndexesOnTopics < ActiveRecord::Migration[6.1]
disable_ddl_transaction!
def up
execute(<<~SQL)
CREATE INDEX CONCURRENTLY IF NOT EXISTS index_topics_on_bumped_at_public
ON topics (bumped_at)
WHERE ((deleted_at IS NULL) AND ((archetype)::text <> 'private_message'::text));
SQL
execute(<<~SQL)
DROP INDEX IF EXISTS index_topics_on_bumped_at;
SQL
# The following index is known to have not been properly renamed. Drop it if
# exists just in case.
execute(<<~SQL)
DROP INDEX IF EXISTS index_forum_threads_on_bumped_at;
SQL
end
def down
raise ActiveRecord::IrreversibleMigration
end
end

View File

@ -625,11 +625,11 @@ class TopicQuery
@options[:category_id] = category_id
if category_id
if options[:no_subcategories]
result = result.where('categories.id = ?', category_id)
result = result.where('topics.category_id = ?', category_id)
else
result = result.where("categories.id IN (?)", Category.subcategory_ids(category_id))
result = result.where("topics.category_id IN (?)", Category.subcategory_ids(category_id))
if !SiteSetting.show_category_definitions_in_topic_lists
result = result.where("categories.topic_id <> topics.id OR categories.id = ?", category_id)
result = result.where("categories.topic_id <> topics.id OR topics.category_id = ?", category_id)
end
end
result = result.references(:categories)

View File

@ -268,33 +268,100 @@ describe TopicQuery do
let!(:subcategory) { Fabricate(:category_with_definition, parent_category_id: category.id) }
let(:subsubcategory) { Fabricate(:category_with_definition, parent_category_id: subcategory.id) }
# Not used in assertions but fabricated to ensure we're not leaking topics
# across categories
let!(:_category) { Fabricate(:category_with_definition) }
let!(:_subcategory) { Fabricate(:category_with_definition, parent_category_id: _category.id) }
it "works with subcategories" do
expect(TopicQuery.new(moderator, category: category.id).list_latest.topics.size).to eq(1)
expect(TopicQuery.new(moderator, category: subcategory.id).list_latest.topics.size).to eq(1)
expect(TopicQuery.new(moderator, category: category.id, no_subcategories: true).list_latest.topics.size).to eq(1)
expect(
TopicQuery
.new(moderator, category: category.id)
.list_latest.topics
).to contain_exactly(category.topic)
expect(
TopicQuery
.new(moderator, category: subcategory.id)
.list_latest.topics
).to contain_exactly(subcategory.topic)
expect(
TopicQuery
.new(moderator, category: category.id, no_subcategories: true)
.list_latest.topics
).to contain_exactly(category.topic)
end
it "shows a subcategory definition topic in its parent list with the right site setting" do
SiteSetting.show_category_definitions_in_topic_lists = true
expect(TopicQuery.new(moderator, category: category.id).list_latest.topics.size).to eq(2)
expect(
TopicQuery
.new(moderator, category: category.id)
.list_latest.topics
).to contain_exactly(category.topic, subcategory.topic)
end
it "works with subsubcategories" do
SiteSetting.max_category_nesting = 3
Fabricate(:topic, category: category)
Fabricate(:topic, category: subcategory)
Fabricate(:topic, category: subsubcategory)
category_topic = Fabricate(:topic, category: category)
subcategory_topic = Fabricate(:topic, category: subcategory)
subsubcategory_topic = Fabricate(:topic, category: subsubcategory)
SiteSetting.max_category_nesting = 2
expect(TopicQuery.new(moderator, category: category.id).list_latest.topics.size).to eq(3)
expect(TopicQuery.new(moderator, category: subcategory.id).list_latest.topics.size).to eq(3)
expect(TopicQuery.new(moderator, category: subsubcategory.id).list_latest.topics.size).to eq(2)
expect(
TopicQuery
.new(moderator, category: category.id)
.list_latest.topics
).to contain_exactly(category.topic, category_topic, subcategory_topic)
expect(
TopicQuery
.new(moderator, category: subcategory.id)
.list_latest.topics
).to contain_exactly(
subcategory.topic,
subcategory_topic,
subsubcategory_topic
)
expect(
TopicQuery
.new(moderator, category: subsubcategory.id)
.list_latest.topics
).to contain_exactly(subsubcategory.topic, subsubcategory_topic)
SiteSetting.max_category_nesting = 3
expect(TopicQuery.new(moderator, category: category.id).list_latest.topics.size).to eq(4)
expect(TopicQuery.new(moderator, category: subcategory.id).list_latest.topics.size).to eq(3)
expect(TopicQuery.new(moderator, category: subsubcategory.id).list_latest.topics.size).to eq(2)
expect(
TopicQuery
.new(moderator, category: category.id)
.list_latest.topics
).to contain_exactly(
category.topic,
category_topic,
subcategory_topic,
subsubcategory_topic
)
expect(
TopicQuery
.new(moderator, category: subcategory.id)
.list_latest.topics
).to contain_exactly(
subcategory.topic,
subcategory_topic,
subsubcategory_topic
)
expect(
TopicQuery
.new(moderator, category: subsubcategory.id)
.list_latest.topics
).to contain_exactly(subsubcategory.topic, subsubcategory_topic)
end
end
end