discourse

mirror of https://github.com/discourse/discourse.git synced 2024-11-26 06:40:23 +08:00

Author	SHA1	Message	Date
Alan Guo Xiang Tan	c1f25cdf5b	FIX: Unicorn master and Sidekiq reopening logs at the same time (#29137) In our production environment, we have been seeing Sidekiq processes getting stuck randomly when a USR1 signal is sent to the Unicorn master process. We have not been able to identify the root cause of why the Sidekiq process gets stuck. We however noticed that when the Unicorn master process receives a USR1 signal, it will reopen the logs for the Unicorn master process first before sending a USR1 signal for the Unicorn worker processes to reopen the logs. We figured that we should do the same for the Sidekiq process as well when a USR1 signal is received. In this commit, we introduce an arbitrary delay of 1 second before the Sidekiq process reopens its log files so as to allow enough time for the Unicorn master to finish reopening its logs first. We also do not send reopen logs for the Sidekiq process if the `DISCOURSE_LOG_SIDEKIQ` env is not present because there is no need to reopen any logs.	2024-10-10 08:01:40 +08:00
Renato Atilio	54d6e52607	FIX: chat mailer log noise (#28616) Fixes the log noise caused by a deprecation notice	2024-08-29 11:39:08 -03:00
Bianca Nenciu	95b09dd777	DEV: Log live slots of Sidekiq jobs (#28600) Introduce a new log line for Sidekiq jobs that consume more than `DISCOURSE_LIVE_SLOTS_SIDEKIQ_LIMIT` live slots. This is useful to track down jobs that may leak memory. This is enabled only when Sidekiq's job instrumenter is enabled (set `DISCOURSE_LOG_SIDEKIQ` to `1`).	2024-08-29 12:23:27 +03:00
Alan Guo Xiang Tan	1a09d6b246	FEATURE: Add `live_slots_(start|finish)` for Sidekiq perf logging (#28260) This information is helpful in debugging memory spikes when Sidekiq processes jobs.	2024-08-07 15:48:24 +08:00
Daniel Waterworth	13083d03ae	DEV: Async category search for sidebar modal (#25686)	2024-02-20 11:24:30 -06:00
Alan Guo Xiang Tan	043ba1d179	DEV: Fix job cluster concurrency spec timing out (#25035) Why this change? On CI, we have been seeing the "handles job concurrency" job timing out after 45 seconds. Upon closer inspection of `Jobs::Base#perform` when cluster concurrency has been set, we see that a thread is spun up to extend the expiry of a Redis key by 120 seconds every 60 seconds while the job is still being executed. The thread looks like this before the fix: ``` keepalive_thread = Thread.new do while parent_thread.alive? && !finished Discourse.redis.without_namespace.expire(cluster_concurrency_redis_key, 120) sleep 60 end end ``` In an ensure block of `Jobs::Base#perform`, the thread is stopped by doing something like this: ``` finished = true keepalive_thread.wakeup keepalive_thread.join ``` If the thread is sleeping, `keepalive_thread.wakeup` will stop the `sleep` method and run the next iteration causing the thread to complete. However, there is a timing issue at play here. If `keepalive_thread.wakeup` is called at a time when the thread is not sleeping, it will have no effect and the thread may end up sleeping for 60 seconds which is longer than our timeout on CI of 45 seconds. What does this change do? 1. Change `sleep 60` to sleep in intervals of 1 second checking if the job has been finished each time. 2. Add `use_redis_snapshotting` to `Jobs::Base` spec since Redis is involved in scheduling and we want to ensure we don't leak Redis keys. 3. Add `ConcurrentJob.stop!` and `thread.join` to `ensure` block in "handles job concurrency" test since a failing expectation will cause us to not clean up the thread we created in the test.	2023-12-26 14:47:03 +08:00
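
The sliced-sleep idea from the commit above can be sketched roughly as follows. This is not the code from `Jobs::Base#perform`; `KeepaliveDemo`, its `slice:` parameter, and the `extensions` counter (a stand-in for the Redis `expire` call) are hypothetical names used only for illustration.

```ruby
# Hypothetical illustration: a keepalive thread that sleeps in short slices
# so that setting a flag stops it promptly, without relying on Thread#wakeup.
class KeepaliveDemo
  attr_reader :extensions

  def initialize
    @extensions = 0   # stand-in for the number of Redis key expiry extensions
    @finished = false
  end

  # interval: seconds between extensions; slice: length of each short sleep
  def start(interval: 60, slice: 1)
    slices = (interval / slice.to_f).ceil
    @thread = Thread.new do
      until @finished
        @extensions += 1 # assumption: the real code extends a Redis key here
        # Check the flag between short sleeps instead of one long `sleep 60`.
        slices.times do
          break if @finished
          sleep slice
        end
      end
    end
  end

  def stop
    @finished = true
    @thread.join # returns within roughly one slice, not a full interval
  end
end
```

With this shape, `stop` waits for at most about one slice, regardless of whether the thread happened to be sleeping when the flag was set.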
Sam	eb603b246b	PERF: limit anonymization to 1 per cluster (#21992) Anonymization is among the most expensive operations we can perform with extreme potential to impact the database. To mitigate risk we only allow a single anonymization across the entire cluster concurrently. This commit introduces support for `cluster_concurrency 1`. When you set that on a Job it will only allow 1 concurrent execution per cluster.	2023-06-14 08:30:23 +10:00
Daniel Waterworth	666536cbd1	DEV: Prefer \A and \z over ^ and $ in regexes (#19936)	2023-01-20 12:52:49 -06:00
Alan Guo Xiang Tan	8a7b62b126	DEV: Fix threading error when running jobs immediately in system tests (#19811) ``` class Jobs::DummyDelayedJob < Jobs::Base def execute(args = {}) end end RSpec.describe "Jobs.run_immediately!" do before { Jobs.run_immediately! } it "explodes" do current_user = Fabricate(:user) Jobs.enqueue_in(1.seconds, :dummy_delayed_job) sign_in(current_user) end end ``` The test above will fail with the following error if `ActiveRecord::Base.connection_handler.clear_active_connections!` is called before the configured Capybara server checks out a connection from the connection pool. ``` ActiveRecord::ActiveRecordError: Cannot expire connection, it is owned by a different thread: #<Thread:0x00007f437391df58@puma srv tp 001 /home/tgxworld/.asdf/installs/ruby/3.1.3/lib/ruby/gems/3.1.0/gems/puma-6.0.2/lib/puma/thread_pool.rb:106 sleep_forever>. Current thread: #<Thread:0x00007f437d6cfc60 run>. ``` We're not exactly sure if this is an ActiveRecord bug or not but we've invested too much time into investigating this problem. Fundamentally, we also no longer understand why `ActiveRecord::Base.connection_handler.clear_active_connections!` is being called in an ensure block within `Jobs::Base#perform` which was added in `ceddb6e0da` 10 years ago. This commit moves the logic for running jobs immediately out of the `Jobs::Base#perform` method into another `Jobs::Base#perform_immediately` method such that `ActiveRecord::Base.connection_handler.clear_active_connections!` is not called. This change will only impact the test environment.	2023-01-10 13:41:25 +08:00
David Taylor	5a003715d3	DEV: Apply syntax_tree formatting to `app/*`	2023-01-09 14:14:59 +00:00
Alan Guo Xiang Tan	7d41e980c9	FIX: Uninitialized class variable error in sidekiq (#17227) Follow-up to `4199ada1ce`	2022-06-24 14:17:39 +10:00
Martin Brennan	3f5e19c62a	FIX: Typo in log_thread (#17226) Follow up to `4199ada1ce`	2022-06-24 12:12:30 +08:00
Alan Guo Xiang Tan	4199ada1ce	DEV: Ensure Sidekiq logging thread is always running (#17211)	2022-06-24 10:28:18 +08:00
Jarek Radosz	2fc70c5572	DEV: Correctly tag heredocs (#16061) This allows text editors to use correct syntax coloring for the heredoc sections. Heredoc tag names we use: languages: SQL, JS, RUBY, LUA, HTML, CSS, SCSS, SH, HBS, XML, YAML/YML, MF, ICS other: MD, TEXT/TXT, RAW, EMAIL	2022-02-28 20:50:55 +01:00
Alan Guo Xiang Tan	6f03b2694d	DEV: Fix typo. (#15857)	2022-02-08 09:04:53 +08:00
David Taylor	15cff27bfe	DEV: Stringify keys of nested hashes in job arguments (#15850) This provides symmetry with the `.with_indifferent_access` usage in `Jobs#perform`, which is also recursive.	2022-02-07 20:28:45 +00:00
David Taylor	c8c23ba557	DEV: Introduce deprecation warning for non-json Job arguments (#15842) This commit introduces our own handling and warning for Sidekiq's new 'non-json-serializable' warning. This decouples us from Sidekiq's own deprecation cycle, and allows us to use our own deprecation system. It also means that the dump/parse happens in test mode, which will help us to catch occurrences before they reach production.	2022-02-07 17:59:55 +00:00
David Taylor	f53d70ac63	DEV: Ensure `delay_for` and `queue` are not passed as job arguments (#15824) This regressed in `3a85c4d680` because deep_stringify_keys makes a copy of the `opts` hash	2022-02-04 20:11:03 +00:00
David Taylor	3a85c4d680	DEV: Ensure Sidekiq job arguments have stringified keys The latest version of Sidekiq introduced a warning when jobs are queued with arguments which 'do not stringify to JSON safely'. In the vast majority of cases, this is because a hash is passed with symbols as keys. When those args are passed to the job, the keys will be stringified. Our job wrapper already takes care of this issue by calling '.with_indifferent_access' on the args before passing them to `#execute`, so we don't need to change anything about our use. All we need to do is satisfy Sidekiq's warning system by 'stringifying' all the keys before enqueuing the job.	2022-02-04 18:28:18 +00:00
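
To illustrate why symbol keys "do not stringify to JSON safely", here is a minimal plain-Ruby sketch. The helpers `deep_stringify_keys` and `json_safe?` below are hypothetical stand-ins written for this example (the real code uses ActiveSupport's `deep_stringify_keys`), and the check is only an approximation of the kind of round-trip comparison described above.

```ruby
require "json"

# Hypothetical helper: recursively convert hash keys to strings,
# including hashes nested inside hashes and arrays.
def deep_stringify_keys(value)
  case value
  when Hash
    value.each_with_object({}) { |(k, v), out| out[k.to_s] = deep_stringify_keys(v) }
  when Array
    value.map { |v| deep_stringify_keys(v) }
  else
    value
  end
end

# Hypothetical check: arguments are "JSON safe" if a dump/parse round trip
# returns an equal structure.
def json_safe?(args)
  JSON.parse(JSON.generate(args)) == args
end

args = { post_id: 1, opts: { force: true } }
json_safe?(args)                      # symbol keys come back as strings, so not equal
json_safe?(deep_stringify_keys(args)) # string keys survive the round trip
```

Stringifying before enqueueing means the job receives the same structure it was enqueued with, which is what the warning is guarding against.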
Osama Sayegh	228264d17c	Revert "DEV: add routes_lazy_route to boost boot-up time (#14545)" (#14581) This reverts commit `f5cf647e57`. The gem breaks usage of Rails URL helpers when used outside views and controllers, for example in `88ecb83382/app/models/upload.rb (L239-L242)` the `upload_short_path` method call fails with an undefined method exception when this gem is enabled.	2021-10-12 17:30:38 +03:00
Sam	f5cf647e57	DEV: add routes_lazy_route to boost boot-up time (#14545) The lazy route initialization cuts down boot time of rails. On my local system it cuts out 200ms of boot time taking me from 3.2 to 3 seconds. This is not a radically enormous amount of time, but paper cuts add up, and a faster boot in dev will make everyone happy. TBD if we want to also include this in production. Gem is heavily maintained by @amatsuda, last commit 3 days ago.	2021-10-11 13:22:13 +11:00
David Taylor	c69bb5d5be	DEV: Always enqueue sidekiq jobs after database transaction commit (#11293) When jobs are enqueued inside a transaction, it's possible that they will be executed before the necessary data is available in the database. This commit ensures all jobs are enqueued in an ActiveRecord after_commit hook. One potential downside here is if the job fails to enqueue, the transaction will no longer be aborted. However, the chance of that happening is reasonably low, and the impact is significantly lower than the current issue where jobs are scheduled before their data is ready.	2020-12-08 11:05:01 +11:00
Guo Xiang Tan	c6202af005	Update rubocop to 2.3.1.	2020-07-24 17:19:21 +08:00
David Taylor	8a3d9d7036	DEV: Run jobs sequentially in test mode (#9897) When running jobs in tests, we use `Jobs.run_immediately!`. This means that jobs are run synchronously when they are enqueued. Jobs sometimes enqueue other jobs, which are also executed synchronously. This means that the outermost job will block until the inner jobs have finished executing. In some cases (e.g. process_post with hotlinked images) this can lead to a deadlock. This commit changes the behavior slightly. Now we will never run jobs inside other jobs. Instead, we will queue them up and run them sequentially in the order they were enqueued. As a whole, they are still executed synchronously. Consider the example ```ruby class Jobs::InnerJob < Jobs::Base def execute(args) puts "Running inner job" end end class Jobs::OuterJob < Jobs::Base def execute(args) puts "Starting outer job" Jobs.enqueue(:inner_job) puts "Finished outer job" end end Jobs.enqueue(:outer_job) puts "All jobs complete" ``` The old behavior would result in: ``` Starting outer job Running inner job Finished outer job All jobs complete ``` The new behavior will result in: ``` Starting outer job Finished outer job Running inner job All jobs complete ```	2020-05-28 12:52:27 +01:00
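
The "queue them up and run them sequentially" behavior described above could be sketched as follows. `MiniJobs` is a hypothetical, self-contained stand-in and not the actual `Jobs` module; it takes blocks instead of job class names and records messages in a `log` array instead of printing them.

```ruby
# Hypothetical illustration: jobs enqueued while another job is running are
# deferred and drained in FIFO order by the outermost enqueue call.
module MiniJobs
  @queue = []
  @running = false
  @log = []

  class << self
    attr_reader :log

    def enqueue(&job)
      @queue << job
      return if @running # already inside a job: defer instead of nesting

      @running = true
      begin
        # Drain the queue, including jobs added while earlier jobs ran.
        @queue.shift.call until @queue.empty?
      ensure
        @running = false
      end
    end
  end
end

MiniJobs.enqueue do
  MiniJobs.log << "Starting outer job"
  MiniJobs.enqueue { MiniJobs.log << "Running inner job" }
  MiniJobs.log << "Finished outer job"
end
MiniJobs.log << "All jobs complete"
```

The inner job runs only after the outer job returns, yet everything still completes before the outermost `enqueue` call returns, so the caller keeps the synchronous semantics.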
Sam Saffron	28292d2759	PERF: avoid shelling to get hostname aggressively Previously we had many places in the app that called `hostname` to get hostname of a server. This commit replaces the pattern in 2 ways 1. We cache the result in `Discourse.os_hostname` so it is only ever called once 2. We prefer to use Socket.gethostname which avoids making a shell command This improves performance as we are not spawning hostname processes throughout the app lifetime	2020-02-18 15:13:19 +11:00
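
A minimal sketch of the caching pattern described above, assuming a memoized module method. `HostInfo` is a hypothetical name; the actual `Discourse.os_hostname` implementation may differ, for example in how it handles errors.

```ruby
require "socket"

# Hypothetical illustration: look up the hostname once via the standard
# library (no `hostname` subprocess) and reuse the cached value afterwards.
module HostInfo
  def self.os_hostname
    @os_hostname ||= Socket.gethostname
  end
end

HostInfo.os_hostname # first call performs the lookup
HostInfo.os_hostname # later calls return the cached string
```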
Dan Ungureanu	086b46051c	FIX: Zeitwerk-related fixes for jobs. (#8187)	2019-10-14 13:03:22 +03:00
Krzysztof Kotlarek	427d54b2b0	DEV: Upgrading Discourse to Zeitwerk (#8098) Zeitwerk simplifies working with dependencies in dev and makes it easier reloading class chains. We no longer need to use Rails "require_dependency" anywhere and instead can just use standard Ruby patterns to require files. This is a far reaching change and we expect some followups here.	2019-10-02 14:01:53 +10:00
Gerhard Schlager	b788948985	FEATURE: English locale with international date formats Makes en_US the new default locale	2019-05-20 13:47:20 +02:00
Sam Saffron	30990006a9	DEV: enable frozen string literal on all files This reduces chances of errors where consumers of strings mutate inputs and reduces memory usage of the app. Test suite passes now, but there may be some stuff left, so we will run a few sites on a branch prior to merging	2019-05-13 09:31:32 +08:00
Robin Ward	b58867b6e9	FEATURE: New 'Reviewable' model to make reviewable items generic Includes support for flags, reviewable users and queued posts, with REST API backwards compatibility. Co-Authored-By: romanrizzi <romanalejandro@gmail.com> Co-Authored-By: jjaffeux <j.jaffeux@gmail.com>	2019-03-28 12:45:10 -04:00
Robin Ward	fa5a158683	REFACTOR: Move `queue_jobs` out of `SiteSetting` It is not a setting, and only relevant in specs. The new API is: ``` Jobs.run_later! # jobs will be thrown on the queue Jobs.run_immediately! # jobs will run right away, avoid the queue ```	2019-03-14 10:47:38 -04:00
David Taylor	0a4562253e	DEV: Add 'starting' event to sidekiq log when interval logging enabled	2019-03-08 10:56:36 +00:00
David Taylor	e2510d79cc	DEV: Improve thread-safety of sidekiq logging	2019-03-08 10:31:49 +00:00
David Taylor	9db05a895a	DEV: Add job_id to sidekiq log This makes it easier to correlate 'pending' logs from the same job	2019-03-08 09:16:13 +00:00
David Taylor	df474bceee	DEV: Further sidekiq logging stability improvements - Open the log file in "append" mode. This avoids issues if the file does not exist (and matches standard rails log behavior) - Correctly parse the interval logging environment variable	2019-03-06 12:50:15 +00:00
David Taylor	fe62de68dd	DEV: Correct sidekiq logging to avoid thread leak	2019-03-06 10:11:31 +00:00
David Taylor	8b30ed5b7a	DEV: Serialize the job parameters in sidekiq logs Otherwise this can lead to some very large data structures when processing the logs later	2019-03-05 17:44:49 +00:00
David Taylor	8963f1af30	FEATURE: Optional detailed performance logging for Sidekiq jobs (#7091) By default, this does nothing. Two environment variables are available: - `DISCOURSE_LOG_SIDEKIQ` Set to `"1"` to enable logging. This will log all completed jobs to `log/rails/sidekiq.log`, along with various db/redis/network statistics. This is useful to track down poorly performing jobs. - `DISCOURSE_LOG_SIDEKIQ_INTERVAL` (seconds) Check running jobs periodically, and log their current duration. They will appear in the logs with `status:pending`. This is useful to track down jobs which take a long time, then crash sidekiq before completing.	2019-03-05 11:19:11 +00:00
Sam	e08a3f719c	FEATURE: push post rebake regular task to low priority queue This allows us to run regular rebakes without starving the normal queue. It additionally adds the ability to specify queue with `Jobs.enqueue` so we can specifically queue a job with lower priority using the `queue` arg.	2019-01-09 08:57:20 +11:00
Sam	e4498d2a8a	FIX: keep db and job correctly in multisite logs This ensures we report job and db correctly, previously we were only reporting this on default	2018-09-04 16:05:44 +10:00
Sam	44cf3cf975	FIX: queue heartbeats in readonly modes If sidekiq is paused or Discourse is in readonly continue to queue heartbeats If we do not do that then a master process can end up reaping sidekiq workers and causing various badness This also impacts restore which can do weird stuff TM in cases like this	2018-08-29 12:36:59 +10:00
Sam	1f17b84b63	FEATURE: more context for error reporting on jobs fails	2018-08-16 12:38:49 +10:00
Neil Lalonde	4ad7ce70ce	REFACTOR: extract scheduler to the mini_scheduler gem	2018-07-31 17:12:55 -04:00
Sam	7c5a71e929	DEV: allow queue_jobs = false in dev your mileage may vary	2017-10-31 13:50:58 +11:00
Guo Xiang Tan	9dcb11f553	Fix the build.	2017-10-11 17:45:19 +08:00
Guo Xiang Tan	09721090a3	FIX: Ensure that we revert back to default connection after running jobs.	2017-10-11 17:17:03 +08:00
Guo Xiang Tan	59aeb0bc56	FIX: Sidekiq hot reloading wasn't working in dev. https://meta.discourse.org/t/webhooks-sidekiq-issue-on-dev-instance/71129 * Remove code that is no longer required as well.	2017-10-09 18:23:25 +08:00
Guo Xiang Tan	5012d46cbd	Add rubocop to our build. (#5004)	2017-07-28 10:20:09 +09:00
Régis Hanol	7db2083d45	FIX: 'cancel_scheduled_job' was deleting all jobs in multisite	2016-08-12 13:10:52 +02:00
Régis Hanol	5dcdfb9777	ensure default locale is 'en' instead of nil	2016-06-30 17:37:00 +02:00

72 Commits