discourse/spec
Krzysztof Kotlarek bd301c3f08
FIX: improve performance of UserStat.ensure_consistency (#21044) (#21121)
Optimize `UserStatpost_read_count` calculation.

In addition, tests were updated to fail when code is not evaluated. Creation of PostTiming was updating `post_read_count`. Count it has to be reset to ensure that ensure_consitency correctly calculates result.

Extracting users seen in the last hour to separate Common Table Expression reduces the amount of processed rows.

Before
```
Update on user_stats  (cost=267492.07..270822.95 rows=2900 width=174) (actual time=12606.121..12606.127 rows=0 loops=1)
  ->  Hash Join  (cost=267492.07..270822.95 rows=2900 width=174) (actual time=12561.814..12603.689 rows=10 loops=1)
        Hash Cond: (user_stats.user_id = x.user_id)
        Join Filter: (x.c <> user_stats.posts_read_count)
        Rows Removed by Join Filter: 67
        ->  Seq Scan on user_stats  (cost=0.00..3125.34 rows=75534 width=134) (actual time=0.014..39.173 rows=75534 loops=1)
        ->  Hash  (cost=267455.80..267455.80 rows=2901 width=48) (actual time=12558.613..12558.617 rows=77 loops=1)
              Buckets: 4096  Batches: 1  Memory Usage: 39kB
              ->  Subquery Scan on x  (cost=267376.03..267455.80 rows=2901 width=48) (actual time=12168.601..12558.572 rows=77 loops=1)
                    ->  GroupAggregate  (cost=267376.03..267426.79 rows=2901 width=12) (actual time=12168.595..12558.525 rows=77 loops=1)
                          Group Key: pt.user_id
                          ->  Sort  (cost=267376.03..267383.28 rows=2901 width=4) (actual time=12100.490..12352.106 rows=2072830 loops=1)
                                Sort Key: pt.user_id
                                Sort Method: external merge  Disk: 28488kB
                                ->  Nested Loop  (cost=1.28..267209.18 rows=2901 width=4) (actual time=0.040..11528.680 rows=2072830 loops=1)
                                      ->  Nested Loop  (cost=0.86..261390.02 rows=13159 width=8) (actual time=0.030..3492.887 rows=3581648 loops=1)
                                            ->  Index Scan using index_users_on_last_seen_at on users u  (cost=0.42..89.71 rows=28 width=4) (actual time=0.010..0.201 rows=78 loops=1)
                                                  Index Cond: (last_seen_at > '2023-04-11 00:22:49.555537'::timestamp without time zone)
                                            ->  Index Scan using index_post_timings_on_user_id on post_timings pt  (cost=0.44..9287.60 rows=4455 width=8) (actual time=0.081..38.542 rows=45919 loops=78)
                                                  Index Cond: (user_id = u.id)
                                      ->  Index Scan using forum_threads_pkey on topics t  (cost=0.42..0.44 rows=1 width=4) (actual time=0.002..0.002 rows=1 loops=3581648)
                                            Index Cond: (id = pt.topic_id)
                                            Filter: ((deleted_at IS NULL) AND ((archetype)::text = 'regular'::text))
                                            Rows Removed by Filter: 0
Planning Time: 0.692 ms
Execution Time: 12612.587 ms
```
After
```
Update on user_stats  (cost=9473.60..12804.30 rows=2828 width=174) (actual time=677.724..677.729 rows=0 loops=1)
  ->  Hash Join  (cost=9473.60..12804.30 rows=2828 width=174) (actual time=672.536..677.706 rows=1 loops=1)
        Hash Cond: (user_stats.user_id = x.user_id)
        Join Filter: (x.c <> user_stats.posts_read_count)
        Rows Removed by Join Filter: 54
        ->  Seq Scan on user_stats  (cost=0.00..3125.34 rows=75534 width=134) (actual time=0.012..23.977 rows=75534 loops=1)
        ->  Hash  (cost=9438.24..9438.24 rows=2829 width=48) (actual time=647.818..647.822 rows=55 loops=1)
              Buckets: 4096  Batches: 1  Memory Usage: 37kB
              ->  Subquery Scan on x  (cost=9381.66..9438.24 rows=2829 width=48) (actual time=647.409..647.805 rows=55 loops=1)
                    ->  HashAggregate  (cost=9381.66..9409.95 rows=2829 width=12) (actual time=647.403..647.786 rows=55 loops=1)
                          Group Key: pt.user_id
                          Batches: 1  Memory Usage: 121kB
                          ->  Nested Loop  (cost=1.86..9367.51 rows=2829 width=4) (actual time=0.056..625.245 rows=120022 loops=1)
                                ->  Nested Loop  (cost=1.44..3692.96 rows=12832 width=8) (actual time=0.047..171.754 rows=217440 loops=1)
                                      ->  Nested Loop  (cost=1.00..254.63 rows=25 width=12) (actual time=0.030..1.407 rows=56 loops=1)
                                            Join Filter: (u.id = user_stats_1.user_id)
                                            ->  Nested Loop  (cost=0.71..243.08 rows=25 width=8) (actual time=0.018..1.207 rows=87 loops=1)
                                                  ->  Index Scan using index_users_on_last_seen_at on users u  (cost=0.42..86.71 rows=27 width=4) (actual time=0.009..0.156 rows=87 loops=1)
                                                        Index Cond: (last_seen_at > '2023-04-11 00:47:07.437568'::timestamp without time zone)
                                                  ->  Index Only Scan using user_stats_pkey on user_stats us  (cost=0.29..5.79 rows=1 width=4) (actual time=0.011..0.011 rows=1 loops=87)
                                                        Index Cond: (user_id = u.id)
                                                        Heap Fetches: 87
                                            ->  Index Scan using user_stats_pkey on user_stats user_stats_1  (cost=0.29..0.45 rows=1 width=4) (actual time=0.002..0.002 rows=1 loops=87)
                                                  Index Cond: (user_id = us.user_id)
                                                  Filter: (posts_read_count < 10000)
                                                  Rows Removed by Filter: 0
                                      ->  Index Scan using index_post_timings_on_user_id on post_timings pt  (cost=0.44..92.98 rows=4455 width=8) (actual time=0.036..2.492 rows=3883 loops=56)
                                            Index Cond: (user_id = user_stats_1.user_id)
                                ->  Index Scan using forum_threads_pkey on topics t  (cost=0.42..0.44 rows=1 width=4) (actual time=0.002..0.002 rows=1 loops=217440)
                                      Index Cond: (id = pt.topic_id)
                                      Filter: ((deleted_at IS NULL) AND ((archetype)::text = 'regular'::text))
                                      Rows Removed by Filter: 0
Planning Time: 1.406 ms
Execution Time: 677.817 ms
```
2023-04-18 10:04:50 +10:00
..
fabricators DEV: Apply syntax_tree formatting to spec/* 2023-01-09 11:49:28 +00:00
fixtures DEV: Apply syntax_tree formatting to spec/* 2023-01-09 11:49:28 +00:00
helpers DEV: Apply syntax_tree formatting to spec/* 2023-01-09 11:49:28 +00:00
import_export DEV: Apply syntax_tree formatting to spec/* 2023-01-09 11:49:28 +00:00
initializers DEV: Apply syntax_tree formatting to spec/* 2023-01-09 11:49:28 +00:00
integration FIX: Query UploadReference in UploadSecurity for existing uploads (#19917) 2023-01-25 13:48:49 +02:00
integrity FIX: Fix incorrect hashtag setting migration (#19857) 2023-01-25 13:48:49 +02:00
jobs DEV: Fix threading error when running jobs immediately in system tests (#19811) 2023-01-10 13:41:25 +08:00
lib SECURITY: strip xlink:href from uploaded SVGs (#21058) 2023-04-11 14:15:41 -04:00
mailers DEV: Apply syntax_tree formatting to spec/* 2023-01-09 11:49:28 +00:00
models FIX: improve performance of UserStat.ensure_consistency (#21044) (#21121) 2023-04-18 10:04:50 +10:00
multisite DEV: Apply syntax_tree formatting to spec/* 2023-01-09 11:49:28 +00:00
requests SECURITY: Limit URL length for theme remote (stable) (#20788) 2023-03-23 12:07:02 +00:00
script/import_scripts DEV: Apply syntax_tree formatting to spec/* 2023-01-09 11:49:28 +00:00
serializers SECURITY: Default tags to show count of topics in unrestricted categories (#19929) 2023-01-20 11:59:37 +08:00
services SECURITY: Add FinalDestination::FastImage that's SSRF safe 2023-03-16 16:25:48 -06:00
support DEV: Introduce stub_ip_lookup spec helper (#20571) 2023-03-09 08:46:41 +08:00
system FIX: Failing system spec for rate limited search (#20046) 2023-02-01 19:05:58 -08:00
tasks DEV: Apply syntax_tree formatting to spec/* 2023-01-09 11:49:28 +00:00
views DEV: Apply syntax_tree formatting to spec/* 2023-01-09 11:49:28 +00:00
rails_helper.rb DEV: Apply syntax_tree formatting to spec/* 2023-01-09 11:49:28 +00:00
regenerate_swagger_docs DEV: Add API docs for uploads and API doc watcher (#15387) 2021-12-23 08:40:15 +10:00
swagger_helper.rb DEV: Apply syntax_tree formatting to spec/* 2023-01-09 11:49:28 +00:00