Commit Graph

23 Commits

Author SHA1 Message Date
Bruno Sofiato
f64fbd9b74
Updated tokenizer to better matching when search for code snippets (#32261)
This PR improves the accuracy of Gitea's code search. 

Currently, Gitea does not consider statements such as
`onsole.log("hello")` as hits when the user searches for `log`. The
culprit is how both ES and Bleve are tokenizing the file contents (in
both cases, `console.log` is a whole token).

In ES' case, we changed the tokenizer to
[simple_pattern_split](https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-simplepatternsplit-tokenizer.html#:~:text=The%20simple_pattern_split%20tokenizer%20uses%20a,the%20tokenization%20is%20generally%20faster.).
In such a case, tokens are words formed by digits and letters. In
Bleve's case, it employs a
[letter](https://blevesearch.com/docs/Tokenizers/) tokenizer.

Resolves #32220

---------

Signed-off-by: Bruno Sofiato <bruno.sofiato@gmail.com>
2024-11-06 20:51:20 +00:00
Bruno Sofiato
900ac62251
Allow code search by filename (#32210)
Some checks are pending
release-nightly / nightly-binary (push) Waiting to run
release-nightly / nightly-docker-rootful (push) Waiting to run
release-nightly / nightly-docker-rootless (push) Waiting to run
This is a large and complex PR, so let me explain in detail its changes.

First, I had to create new index mappings for Bleve and ElasticSerach as
the current ones do not support search by filename. This requires Gitea
to recreate the code search indexes (I do not know if this is a breaking
change, but I feel it deserves a heads-up).

I've used [this
approach](https://www.elastic.co/guide/en/elasticsearch/reference/7.17/analysis-pathhierarchy-tokenizer.html)
to model the filename index. It allows us to efficiently search for both
the full path and the name of a file. Bleve, however, does not support
this out-of-box, so I had to code a brand new [token
filter](https://blevesearch.com/docs/Token-Filters/) to generate the
search terms.

I also did an overhaul in the `indexer_test.go` file. It now asserts the
order of the expected results (this is important since matches based on
the name of a file are more relevant than those based on its content).
I've added new test scenarios that deal with searching by filename. They
use a new repo included in the Gitea fixture.

The screenshot below depicts how Gitea shows the search results. It
shows results based on content in the same way as the current version
does. In matches based on the filename, the first seven lines of the
file contents are shown (BTW, this is how GitHub does it).


![image](https://github.com/user-attachments/assets/9d938d86-1a8d-4f89-8644-1921a473e858)

Resolves #32096

---------

Signed-off-by: Bruno Sofiato <bruno.sofiato@gmail.com>
2024-10-11 23:35:04 +00:00
yp05327
70b7df0e5e
Support repo license (#24872)
Close #278
Close #24076

## Solutions:
- Use
[google/licenseclassifier](https://github.com/google/licenseclassifier/)
Test result between
[google/licensecheck](https://github.com/google/licensecheck) and
[go-license-detector](https://github.com/go-enry/go-license-detector):
https://github.com/go-gitea/gitea/pull/24872#issuecomment-1560361167
Test result between
[google/licensecheck](https://github.com/google/licensecheck) and
[google/licenseclassifier](https://github.com/google/licenseclassifier/):
https://github.com/go-gitea/gitea/pull/24872#issuecomment-1576092178
- Generate License Convert Name List to avoid import license templates
with same contents
Gitea automatically get latest license data from[
spdx/license-list-data](https://github.com/spdx/license-list-data).
But unfortunately, some license templates have same contents. #20915
[click here to see the
list](https://github.com/go-gitea/gitea/pull/24872#issuecomment-1584141684)
So we will generate a list of these license templates with same contents
and create a new file to save the result when using `make
generate-license`. (Need to decide the save path)
- Save License info into a new table `repo_license`
Can easily support searching repo by license in the future.

## Screen shot
Single License:

![image](https://github.com/go-gitea/gitea/assets/18380374/41260bd7-0b4c-4038-8592-508706cffa9f)

Multiple Licenses:

![image](https://github.com/go-gitea/gitea/assets/18380374/34ce2f73-7e18-446b-9b96-ecc4fb61bd70)

Triggers:
- [x] Push commit to default branch
- [x] Create repo
- [x] Mirror repo
- [x] When Default Branch is changed, licenses should be updated

Todo:
- [x] Save Licenses info in to DB when there's a change to license file
in the commit
- [x] DB Migration
- [x] A nominal test?
- [x] Select which library to
use(https://github.com/go-gitea/gitea/pull/24872#issuecomment-1560361167)
- [x] API Support
- [x] Add repo license table
- ~Select license in settings if there are several licenses(Not
recommended)~
- License board(later, not in this PR)

![image](https://github.com/go-gitea/gitea/assets/18380374/2c3c3bf8-bcc2-4c6d-8ce0-81d1a9733878)

---------

Co-authored-by: silverwind <me@silverwind.io>
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: Denys Konovalov <kontakt@denyskon.de>
Co-authored-by: delvh <dev.lh@web.de>
Co-authored-by: KN4CK3R <admin@oldschoolhack.me>
Co-authored-by: 6543 <6543@obermui.de>
Co-authored-by: 6543 <m.huber@kithara.com>
Co-authored-by: a1012112796 <1012112796@qq.com>
Co-authored-by: techknowlogick <techknowlogick@gitea.com>
2024-10-01 15:25:08 -04:00
Lunny Xiao
8b92eba21f
Fix agit automerge (#31207) 2024-08-20 14:17:21 +08:00
Rowan Bohde
1310649331
render plain text file if the LFS object doesn't exist (#31812)
We had an issue where a repo was using LFS to store a file, but the user
did not push the file. When trying to view the file, Gitea returned a
500 HTTP status code referencing `ErrLFSObjectNotExist`. It appears the
intent was the render this file as plain text, but the conditional was
flipped. I've also added a test to verify that the file is rendered as
plain text.
2024-08-15 05:50:09 +08:00
Rafael
c1f76aea45
Use raw Wiki links for non-renderable Wiki files (#30273)
In Wiki pages, short-links created to local Wiki files were always
expanded as regular Wiki Links. In particular, if a link wanted to point
to a file that Gitea doesn't know how to render (e.g, a .zip file), a
user following the link would be silently redirected to the Wiki's home
page.

This change makes short-links* in Wiki pages be expanded to raw wiki
links, so these local wiki files may be accessed without manually
accessing their URL.

* only short-links ending in a file extension that isn't renderable are
affected.

Closes #27121.

Signed-off-by: Rafael Girão <rafael.s.girao@tecnico.ulisboa.pt>
Co-authored-by: silverwind <me@silverwind.io>
2024-04-10 17:49:57 +00:00
Zettat123
c42083a339
Allow non-admin users to delete review requests (#29057)
Fix #14459

The following users can add/remove review requests of a PR
- the poster of the PR
- the owner or collaborators of the repository
- members with read permission on the pull requests unit
2024-02-24 12:38:43 +00:00
Adam Majer
075c4c89ee
tests: missing refs/ in bare repositories (#28844)
Git 2.43.0 will not detect a git repository as valid without refs/
subdirectory present. `git gc` cleans this up and puts it in
packed-refs. We must keep refs/ non-empty.
2024-01-19 15:12:21 +08:00
Mihir Joshi
b8270240bf
Fix reverting a merge commit failing (#28794)
Fixes #22236

---
Error occurring currently while trying to revert commit using read-tree
-m approach:
> 2022/12/26 16:04:43 ...rvices/pull/patch.go:240:AttemptThreeWayMerge()
[E] [63a9c61a] Unable to run read-tree -m! Error: exit status 128 -
fatal: this operation must be run in a work tree
> 	 - fatal: this operation must be run in a work tree

We need to clone a non-bare repository for `git read-tree -m` to work.

bb371aee6e
adds support to create a non-bare cloned temporary upload repository.

After cloning a non-bare temporary upload repository, we [set default
index](https://github.com/go-gitea/gitea/blob/main/services/repository/files/cherry_pick.go#L37)
(`git read-tree HEAD`).
This operation ends up resetting the git index file (see investigation
details below), due to which, we need to call `git update-index
--refresh` afterward.


Here's the diff of the index file before and after we execute
SetDefaultIndex: https://www.diffchecker.com/hyOP3eJy/

Notice the **ctime**, **mtime** are set to 0 after SetDefaultIndex.

You can reproduce the same behavior using these steps:
```bash
$ git clone https://try.gitea.io/me-heer/test.git -s -b main
$ cd test
$ git read-tree HEAD
$ git read-tree -m 1f085d7ed8 1f085d7ed8 9933caed00
error: Entry '1' not uptodate. Cannot merge.
```

After which, we can fix like this:
```
$ git update-index --refresh
$ git read-tree -m 1f085d7ed8 1f085d7ed8 9933caed00
```
2024-01-16 15:06:51 +00:00
Lunny Xiao
6e87a44034
Allow get release download files and lfs files with oauth2 token format (#26430)
Fix #26165
Fix #25257
2023-10-01 10:41:52 +00:00
Nanguan Lin
da50be7360
Replace 'userxx' with 'orgxx' in all test files when the user type is org (#27052)
Currently 'userxx' and 'orgxx' are both used as username in test files
when the user type is org, which is confusing. This PR replaces all
'userxx' with 'orgxx' when the user type is org(`user.type==1`).
Some non-trivial changes
1. Rename `user3` dir to `org3` in `tests/git-repositories-meta` 
2. Change `end` in `issue reference` because 'org3' is one char shorter
than 'user3'

![ksnip_20230913-112819](https://github.com/go-gitea/gitea/assets/70063547/442988c5-4cf4-49b8-aa01-4dd6bf0ca954)
3. Change the search result number of `user/repo2` because
`user3/repo21` can't be searched now

![ksnip_20230913-112931](https://github.com/go-gitea/gitea/assets/70063547/d9ebeba4-479f-4110-9a85-825efbc981fd)
4. Change the first org name getting from API because the result is
ordered by alphabet asc and now `org 17` is before `org25`
![JW8U7NIO(J$H
_YCRB36H)T](https://github.com/go-gitea/gitea/assets/70063547/f55a685c-cf24-40e5-a87f-3a2327319548)
![)KFD411O4I8RB5ZOH7E0
Z3](https://github.com/go-gitea/gitea/assets/70063547/a0dc3299-249c-46f6-91cb-d15d4ee88dd5)

Other modifications are just find all and replace all.
Unit tests with SQLite are all passed.

---------

Co-authored-by: caicandong <1290147055@qq.com>
2023-09-14 02:59:53 +00:00
sebastian-sauer
55532061c8
Add commits dropdown in PR files view and allow commit by commit review (#25528)
This PR adds a new dropdown to select a commit or a commit range
(shift-click like github) of a Pull Request.
After selection of a commit only the changes of this commit will be shown.
When selecting a range of commits the diff of this range is shown.

This allows to review a PR commit by commit or by viewing only commit ranges.
The "Show changes since your last review" mechanism github uses is implemented, too.
When reviewing a single commit or a commit range the "Viewed" functionality is disabled.

## Screenshots

### The commit dropdown

![image](https://github.com/go-gitea/gitea/assets/51889757/0db3ae62-1272-436c-be64-4730c5d611e3)

### Selecting a commit range

![image](https://github.com/go-gitea/gitea/assets/51889757/ad81eedb-8437-42b0-8073-2d940c25fe8f)

### Show changes of a single commit only

![image](https://github.com/go-gitea/gitea/assets/51889757/6b1a113b-73ef-4ecc-adf6-bc2340bb8f97)

### Show changes of a commit range

![image](https://github.com/go-gitea/gitea/assets/51889757/6401b358-cd66-4c09-8baa-6cf6177f23a7)


Fixes https://github.com/go-gitea/gitea/issues/20989
Fixes https://github.com/go-gitea/gitea/issues/19263

---------

Co-authored-by: silverwind <me@silverwind.io>
Co-authored-by: KN4CK3R <admin@oldschoolhack.me>
Co-authored-by: wxiaoguang <wxiaoguang@gmail.com>
Co-authored-by: delvh <dev.lh@web.de>
2023-07-28 21:18:12 +02:00
Lunny Xiao
de981c39e6
Fix bug of branches API with tests (#25578)
Fix #25558
Extract from #22743

This PR added a repository's check when creating/deleting branches via
API. Mirror repository and archive repository cannot do that.
2023-07-01 10:52:52 +08:00
Kyle D
8220e50b56
Substitute variables in path names of template repos too (#25294)
### Summary

Extend the template variable substitution to replace file paths. This
can be helpful for setting up log files & directories that should match
the repository name.

### PR Changes

 - Move files matching glob pattern when setting up repos from template
- For security, added ~escaping~ sanitization for cross-platform support
and to prevent directory traversal (thanks @silverwind for the
reference)
 - Added unit testing for escaping function 
- Fixed the integration tests for repo template generation by passing
the repo_template_id
- Updated the integration testfiles to add some variable substitution &
assert the outputs

I had to fix the existing repo template integration test and extend it
to add a check for variable substitutions.

Example:

![image](https://github.com/go-gitea/gitea/assets/12700993/621feb09-0ef3-460e-afa8-da74cd84fa4e)
2023-06-20 21:14:47 +00:00
wxiaoguang
ce9c1ddc4c
Remove git sample files and ignore them (#24271) 2023-04-22 20:29:29 +08:00
oliverpool
bb2783860b
fix calReleaseNumCommitsBehind (#24148)
`repoCtx.CommitsCount` is not reliably the commit count of the default
branch (Repository.GetCommitsCount depends on what is currently
displayed).

For instance on the releases page the commit count is correct:
https://codeberg.org/Codeberg/pages-server/releases


![2023-04-15-215027](https://user-images.githubusercontent.com/3864879/232250500-6c05dc00-7030-4ec9-87f1-18c7797d36bf.png)

However it is not on the single page:
https://codeberg.org/Codeberg/pages-server/releases/tag/v4.6.2


![2023-04-15-215036](https://user-images.githubusercontent.com/3864879/232250503-620c8038-7c2c-45a1-b99d-cb994ef955a6.png)

This PR fixes this by removing a "fast branch" which was using this
field (I think this field should be removed, since it is a bit
unpredictable - but this would mean a larger refactoring PR).

_contributed in the context of @forgejo_

---------

Co-authored-by: Giteabot <teabot@gitea.io>
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
2023-04-18 21:11:17 +02:00
Nick
6aef9e0a2f
Replace repo.namedBlob by git.TreeEntry. (#22898)
`namedBlob` turned out to be a poor imitation of a `TreeEntry`. Using
the latter directly shortens this code.

This partially undoes https://github.com/go-gitea/gitea/pull/23152/,
which I found a merge conflict with, and also expands the test it added
to cover the subtle README-in-a-subfolder case.
2023-03-15 16:51:39 -05:00
Nick
52e24167e5
Test renderReadmeFile (#23185)
Add test coverage to the important features of
[`routers.web.repo.renderReadmeFile`](067b0c2664/routers/web/repo/view.go (L273));
namely that:

- it can handle looking in docs/, .gitea/, and .github/
- it can handle choosing between multiple competing READMEs
- it prefers the localized README to the markdown README to the
plaintext README
- it can handle broken symlinks when processing all the options
- it uses the name of the symlink, not the name of the target of the
symlink
2023-03-09 09:24:23 +08:00
yp05327
699f20234b
Use correct README link to render the README (#23152)
`renderReadmeFile` needs `readmeTreelink` as parameter but gets
`treeLink`.
The values of them look like as following:
`treeLink`:  `/{OwnerName}/{RepoName}/src/branch/{BranchName}`
`readmeTreelink`:
`/{OwnerName}/{RepoName}/src/branch/{BranchName}/{ReadmeFileName}`

`path.Dir` in

8540fc45b1/routers/web/repo/view.go (L316)
should convert `readmeTreelink` into
`/{OwnerName}/{RepoName}/src/branch/{BranchName}` instead of the current
`/{OwnerName}/{RepoName}/src/branch`.

Fixes #23151

---------

Co-authored-by: Jason Song <i@wolfogre.com>
Co-authored-by: John Olheiser <john.olheiser@gmail.com>
Co-authored-by: silverwind <me@silverwind.io>
2023-03-03 18:01:33 +08:00
Kyle D
2b3f12f6fd
Use beforeCommit instead of baseCommit (#22949)
Replaces: https://github.com/go-gitea/gitea/pull/22947
Fixes https://github.com/go-gitea/gitea/issues/22946
Probably related to https://github.com/go-gitea/gitea/issues/19530

Basically, many of the diffs were broken because they were comparing to
the base commit, where a 3-dot diff should be comparing to the [last
common
ancestor](https://matthew-brett.github.io/pydagogue/git_diff_dots.html).

This should have an integration test so that we don’t run into this
issue again.

---------

Co-authored-by: Jonathan Tran <jonnytran@gmail.com>
2023-02-20 11:56:07 +08:00
Nick
a2779def36 Test views of LFS files (#22196) 2022-12-23 07:41:56 +08:00
Lunny Xiao
36a2d2f919
Add a simple test for external renderer (#20033)
Fix #16402
2022-12-12 20:45:21 +08:00
Kyle D
c8ded77680
Kd/ci playwright go test (#20123)
* Add initial playwright config

* Simplify Makefile

* Simplify Makefile

* Use correct config files

* Update playwright settings

* Fix package-lock file

* Don't use test logger for e2e tests

* fix frontend lint

* Allow passing TEST_LOGGER variable

* Init postgres database

* use standard gitea env variables

* Update playwright

* update drone

* Move empty env var to commands

* Cleanup

* Move integrations to subfolder

* tests integrations to tests integraton

* Run e2e tests with go test

* Fix linting

* install CI deps

* Add files to ESlint

* Fix drone typo

* Don't log to console in CI

* Use go test http server

* Add build step before tests

* Move shared init function to common package

* fix drone

* Clean up tests

* Fix linting

* Better mocking for page + version string

* Cleanup test generation

* Remove dependency on gitea binary

* Fix linting

* add initial support for running specific tests

* Add ACCEPT_VISUAL variable

* don't require git-lfs

* Add initial documentation

* Review feedback

* Add logged in session test

* Attempt fixing drone race

* Cleanup and bump version

* Bump deps

* Review feedback

* simplify installation

* Fix ci

* Update install docs
2022-09-02 15:18:23 -04:00