Improve queue & process & stacktrace (#24636)
Although this PR mixes several features, it is not that large, and the
features are all related.
More than 70 of its lines are for a toy "test queue", so the PR is
quite simple.
Major features:
1. Allow site admin to clear a queue (remove all items in a queue)
    * Because there is no transaction, the "unique queue" could be corrupted
      in rare cases, and that is unfixable.
    * e.g. the item is in the "set" but not in the "list", so the item could
      never be pushed into the queue again.
    * Now a site admin can simply clear the queue; everything then becomes
      correct, and the lost items can be re-pushed into the queue by future
      operations.
3. Split "admin/monitor" into separate pages
4. Allow downloading a diagnosis report
    * Historically, many users have reported that the Gitea queue gets
      stuck, or that Gitea's CPU usage is at 100%
    * With a diagnosis report, maintainers can see clearly what is happening
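
The "unique queue" failure described in item 1 can be illustrated with a minimal in-memory sketch. This is not Gitea's queue implementation: the `uniqueQueue` type, its fields, and its methods are hypothetical names chosen for illustration, and the corrupted state is simulated by hand rather than produced by a real non-transactional backend.

```go
package main

import "fmt"

// uniqueQueue is a hypothetical stand-in for a "unique queue":
// the set rejects duplicates and the list holds the pending items.
type uniqueQueue struct {
	set  map[string]struct{}
	list []string
}

func newUniqueQueue() *uniqueQueue {
	return &uniqueQueue{set: map[string]struct{}{}}
}

// Push queues item unless the set already contains it,
// and reports whether the item was queued.
func (q *uniqueQueue) Push(item string) bool {
	if _, exists := q.set[item]; exists {
		return false
	}
	q.set[item] = struct{}{}
	q.list = append(q.list, item)
	return true
}

// Clear removes every item from both the set and the list.
func (q *uniqueQueue) Clear() {
	q.set = map[string]struct{}{}
	q.list = nil
}

func main() {
	q := newUniqueQueue()

	// Simulate the corrupted state: the item reached the set
	// but never reached the list.
	q.set["repo-1"] = struct{}{}

	fmt.Println(q.Push("repo-1")) // false: the stale set entry blocks the push

	// Clearing the queue resets both structures, so a later push succeeds.
	q.Clear()
	fmt.Println(q.Push("repo-1")) // true
	fmt.Println(len(q.list))      // 1
}
```

In this sketch the stale set entry is the only thing blocking the push, which is why clearing both structures is enough to recover.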
The diagnosis report sample:
[gitea-diagnosis-20230510-192913.zip](https://github.com/go-gitea/gitea/files/11441346/gitea-diagnosis-20230510-192913.zip).
Use `go tool pprof profile.dat` to view the report.
Screenshots:
![image](https://github.com/go-gitea/gitea/assets/2114189/320659b4-2eda-4def-8dc0-5ea08d578063)
![image](https://github.com/go-gitea/gitea/assets/2114189/c5c46fae-9dc0-44ca-8cd3-57beedc5035e)
![image](https://github.com/go-gitea/gitea/assets/2114189/6168a811-42a1-4e64-a263-0617a6c8c4fe)
---------
Co-authored-by: Jason Song <i@wolfogre.com>
Co-authored-by: Giteabot <teabot@gitea.io>
2023-05-11 15:45:47 +08:00
// Copyright 2023 The Gitea Authors.
// SPDX-License-Identifier: MIT

package admin

import (
	"archive/zip"
	"fmt"
	"runtime/pprof"
	"time"

	"code.gitea.io/gitea/modules/httplib"
	"code.gitea.io/gitea/services/context"
)

// MonitorDiagnosis streams a zip archive to the client that contains a
// goroutine dump taken before CPU profiling, the CPU profile itself, a
// goroutine dump taken afterwards, and a heap profile.
func MonitorDiagnosis(ctx *context.Context) {
	// Clamp the CPU profiling duration to the range [5, 300] seconds.
	seconds := ctx.FormInt64("seconds")
	if seconds <= 5 {
		seconds = 5
	}
	if seconds > 300 {
		seconds = 300
	}

	httplib.ServeSetHeaders(ctx.Resp, &httplib.ServeHeaderOptions{
		ContentType: "application/zip",
		Disposition: "attachment",
		Filename:    fmt.Sprintf("gitea-diagnosis-%s.zip", time.Now().Format("20060102-150405")),
	})

	zipWriter := zip.NewWriter(ctx.Resp)
	defer zipWriter.Close()

	// Goroutine dump before profiling (debug=1 writes a text format).
	f, err := zipWriter.CreateHeader(&zip.FileHeader{Name: "goroutine-before.txt", Method: zip.Deflate, Modified: time.Now()})
	if err != nil {
		ctx.ServerError("Failed to create zip file", err)
		return
	}
	_ = pprof.Lookup("goroutine").WriteTo(f, 1)

	// CPU profile, sampled for the requested number of seconds.
	f, err = zipWriter.CreateHeader(&zip.FileHeader{Name: "cpu-profile.dat", Method: zip.Deflate, Modified: time.Now()})
	if err != nil {
		ctx.ServerError("Failed to create zip file", err)
		return
	}

	err = pprof.StartCPUProfile(f)
	if err == nil {
		time.Sleep(time.Duration(seconds) * time.Second)
		pprof.StopCPUProfile()
	} else {
		// Profiling could not start: store the error text in the zip entry instead.
		_, _ = f.Write([]byte(err.Error()))
	}

	// Goroutine dump after profiling.
	f, err = zipWriter.CreateHeader(&zip.FileHeader{Name: "goroutine-after.txt", Method: zip.Deflate, Modified: time.Now()})
	if err != nil {
		ctx.ServerError("Failed to create zip file", err)
		return
	}
	_ = pprof.Lookup("goroutine").WriteTo(f, 1)

	// Heap profile (debug=0 writes the binary pprof format).
	f, err = zipWriter.CreateHeader(&zip.FileHeader{Name: "heap.dat", Method: zip.Deflate, Modified: time.Now()})
	if err != nil {
		ctx.ServerError("Failed to create zip file", err)
		return
	}
	_ = pprof.Lookup("heap").WriteTo(f, 0)
}