This is the more correct implementation of 23dadc0d86 (#4179)... I think. This commit effectively undoes the revert in 8848df9c5d, but with corrections to the logic.
We *do* need to use the original request path (the path the browser knows) for redirects, since they are external, and rewrites are only internal.
However, if the path was rewritten to a non-canonical path, we should not redirect to canonicalize that, since rewrites are intentional by the site owner. Canonicalizing the path involves modifying only the suffix (base element, or filename) of the path. Thus, if a rewrite involves only the prefix (like how handle_path strips a path prefix), then we can (hopefully!) safely redirect using the original URI since the filename was not rewritten.
So basically, if rewrites modify the filename, we should not canonicalize those requests. If rewrites only modify another part of the path (commonly a prefix), we should be OK to redirect.
Templates are parsed at request-time (like they are in the templates middleware) to allow live changes to the template while the server is running. Fixes race condition.
Also refactored use of a buffer so a buffer put back in the pool will not continue to be used (written to client) in the meantime.
A couple of benchmarks removed due to refactor, which is fine, since we know pooling helps here.
* fileserver: Fix `file` matcher with empty `try_files`
Fixes https://github.com/caddyserver/caddy/issues/4146
If `TryFiles` is empty, we fill it with `r.URL.Path`. In this case, this is `/`. Then later, in `prepareFilePath()`, we run the replacer (which turns `{path}` into `/` at that point) but `file` remains the original value (and the placeholder is still the placeholder there).
So then `strings.HasSuffix(file, "/")` will be `false` for the placeholder, but `true` for the empty `TryFiles` codepath, because `file` was `/` due to being set to the actual request value beforehand.
This means that `suffix` becomes `//` in that case, so after `sanitizedPathJoin`, it becomes `./`, so `strictFileExists`'s `strings.HasSuffix(file, separator)` codepath will return true.
I think we should change the `m.TryFiles == nil` codepath to `m.TryFiles = []string{"{http.request.uri.path}"}` for consistency. (And maybe consider hoisting this to `Provision` cause there's no point doing this on every request). I don't think this "optimization" of directly using `r.URL.Path` is so valuable, cause it causes this edgecase with directories.
* Update modules/caddyhttp/fileserver/matcher.go
Co-authored-by: Matt Holt <mholt@users.noreply.github.com>
Co-authored-by: Matt Holt <mholt@users.noreply.github.com>
After reading a question about the `handle_response` feature of `reverse_proxy`, I realized that we didn't have a way of serving an arbitrary file with a status code other than 200. This is an issue in situations where you want to serve a custom error page in routes that are not errors, like the aforementioned `handle_response`, where you may want to retain the status code returned by the proxy but write a response with content from a file.
This feature is super simple, basically if a status code is configured (can be a status code number, or a placeholder string) then that status will be written out before serving the file - if we write the status code first, then the stdlib won't write its own (only the first HTTP status header wins).
* encode: implement prefer setting
* encode: minimum_length configurable via caddyfile
* encode: configurable content-types which to encode
* file_server: support precompressed files
* encode: use ReponseMatcher for conditional encoding of content
* linting error & documentation of encode.PrecompressedOrder
* encode: allow just one response matcher
also change the namespace of the encoders back, I accidently changed to precompressed >.>
default matchers include a * to match to any charset, that may be appended
* rounding of the PR
* added integration tests for new caddyfile directives
* improved various doc strings (punctuation and typos)
* added json tag for file_server precompress order and encode matcher
* file_server: add vary header, remove accept-ranges when serving precompressed files
* encode: move Suffix implementation to precompressed modules
At some point we changed how paths are represented down the function calls of browse listings and forgot to update the canGoUp logic. I think this is right? It's simpler now.
* fastcgi: Set PATH_INFO to file matcher remainder as fallback
* fastcgi: Avoid changing scriptName when not necessary
* Stylistic tweaks
Co-authored-by: Matthew Holt <mholt@users.noreply.github.com>
* ci: Use golangci's github action for linting
Signed-off-by: Dave Henderson <dhenderson@gmail.com>
* Fix most of the staticcheck lint errors
Signed-off-by: Dave Henderson <dhenderson@gmail.com>
* Fix the prealloc lint errors
Signed-off-by: Dave Henderson <dhenderson@gmail.com>
* Fix the misspell lint errors
Signed-off-by: Dave Henderson <dhenderson@gmail.com>
* Fix the varcheck lint errors
Signed-off-by: Dave Henderson <dhenderson@gmail.com>
* Fix the errcheck lint errors
Signed-off-by: Dave Henderson <dhenderson@gmail.com>
* Fix the bodyclose lint errors
Signed-off-by: Dave Henderson <dhenderson@gmail.com>
* Fix the deadcode lint errors
Signed-off-by: Dave Henderson <dhenderson@gmail.com>
* Fix the unused lint errors
Signed-off-by: Dave Henderson <dhenderson@gmail.com>
* Fix the gosec lint errors
Signed-off-by: Dave Henderson <dhenderson@gmail.com>
* Fix the gosimple lint errors
Signed-off-by: Dave Henderson <dhenderson@gmail.com>
* Fix the ineffassign lint errors
Signed-off-by: Dave Henderson <dhenderson@gmail.com>
* Fix the staticcheck lint errors
Signed-off-by: Dave Henderson <dhenderson@gmail.com>
* Revert the misspell change, use a neutral English
Signed-off-by: Dave Henderson <dhenderson@gmail.com>
* Remove broken golangci-lint CI job
Signed-off-by: Dave Henderson <dhenderson@gmail.com>
* Re-add errantly-removed weakrand initialization
Signed-off-by: Dave Henderson <dhenderson@gmail.com>
* don't break the loop and return
* Removing extra handling for null rootKey
* unignore RegisterModule/RegisterAdapter
Co-authored-by: Mohammed Al Sahaf <msaa1990@gmail.com>
* single-line log message
Co-authored-by: Matt Holt <mholt@users.noreply.github.com>
* Fix lint after a1808b0dbf209c615e438a496d257ce5e3acdce2 was merged
Signed-off-by: Dave Henderson <dhenderson@gmail.com>
* Revert ticker change, ignore it instead
Signed-off-by: Dave Henderson <dhenderson@gmail.com>
* Ignore some of the write errors
Signed-off-by: Dave Henderson <dhenderson@gmail.com>
* Remove blank line
Signed-off-by: Dave Henderson <dhenderson@gmail.com>
* Use lifetime
Signed-off-by: Dave Henderson <dhenderson@gmail.com>
* close immediately
Co-authored-by: Matt Holt <mholt@users.noreply.github.com>
* Preallocate configVals
Signed-off-by: Dave Henderson <dhenderson@gmail.com>
* Update modules/caddytls/distributedstek/distributedstek.go
Co-authored-by: Mohammed Al Sahaf <msaa1990@gmail.com>
Co-authored-by: Matt Holt <mholt@users.noreply.github.com>
* fileserver: Improve and clarify file hiding logic
* Oops, forgot to run integration tests
* Make this one integration test OS-agnostic
* See if this appeases the Windows gods
* D'oh
* fileserver: Fix try_files for directories, windows fix
* fileserver: Add new file type placeholder, refactoring, tests
* fileserver: Review cleanup
* fileserver: Flip the return args order
Now, a filename to hide that is specified without a path separator will
count as hidden if it appears in any component of the file path (not
only the last component); semantically, this means hiding a file by only
its name (without any part of a path) will hide both files and folders,
e.g. hiding ".git" will hide "/.git" and also "/.git/foo".
We also do prefix matching so that hiding "/.git" will hide "/.git"
and "/.git/foo" but not "/.gitignore".
The remaining logic is a globular match like before.
* fileserver: First attempt to fix failing test on Linux
I think I updated the wrong test case before
* Make new test function
I guess what we really are trying to test is the case insensitivity of
firstSplit. So a new test function is better for that.
We can't use a positional index on an original string that we got from
its lower-cased equivalent. Implement our own IndexFold() function b/c
the std lib does not have one.
Previously, matching by trying files other than the actual path of the
URI was:
file {
try_files <files...>
}
Now, the same can be done in one line:
file <files...>
As before, an empty file matcher:
file
still matches if the request URI exists as a file in the site root.
* matcher: Add `split_path` option to file matcher; used in php_fastcgi
* matcher: Skip try_files split if not the final part of the filename
* matcher: Add MatchFile tests
* matcher: Clarify SplitPath godoc
* pki: Initial commit of PKI app (WIP) (see #2502 and #3021)
* pki: Ability to use root/intermediates, and sign with root
* pki: Fix benign misnamings left over from copy+paste
* pki: Only install root if not already trusted
* Make HTTPS port the default; all names use auto-HTTPS; bug fixes
* Fix build - what happened to our CI tests??
* Fix go.mod
This is a breaking change primarily in two areas:
- Storage paths for certificates have changed
- Slight changes to JSON config parameters
Huge improvements in this commit, to be detailed more in
the release notes.
The upcoming PKI app will be powered by Smallstep libraries.
Previously, all matchers in a route would be evaluated before any
handlers were executed, and a composite route of the matching routes
would be created. This made rewrites especially tricky, since the only
way to defer later matchers' evaluation was to wrap them in a subroute,
or to invoke a "rehandle" which often caused bugs.
Instead, this new sequential design evaluates each route's matchers then
its handlers in lock-step; matcher-handlers-matcher-handlers...
If the first matching route consists of a rewrite, then the second route
will be evaluated against the rewritten request, rather than the original
one, and so on.
This should do away with any need for rehandling.
I've also taken this opportunity to avoid adding new values to the
request context in the handler chain, as this creates a copy of the
Request struct, which may possibly lead to bugs like it has in the past
(see PR #1542, PR #1481, and maybe issue #2463). We now add all the
expected context values in the top-level handler at the server, then
any new values can be added to the variable table via the VarsCtxKey
context key, or just the GetVar/SetVar functions. In particular, we are
using this facility to convey dial information in the reverse proxy.
Had to be careful in one place as the middleware compilation logic has
changed, and moved a bit. We no longer compile a middleware chain per-
request; instead, we can compile it at provision-time, and defer only the
evaluation of matchers to request-time, which should slightly improve
performance. Doing this, however, we take advantage of multiple function
closures, and we also changed the use of HandlerFunc (function pointer)
to Handler (interface)... this led to a situation where, if we aren't
careful, allows one request routed a certain way to permanently change
the "next" handler for all/most other requests! We avoid this by making
a copy of the interface value (which is a lightweight pointer copy) and
using exclusively that within our wrapped handlers. This way, the
original stack frame is preserved in a "read-only" fashion. The comments
in the code describe this phenomenon.
This may very well be a breaking change for some configurations, however
I do not expect it to impact many people. I will make it clear in the
release notes that this change has occurred.
This commit goes a long way toward making automated documentation of
Caddy config and Caddy modules possible. It's a broad, sweeping change,
but mostly internal. It allows us to automatically generate docs for all
Caddy modules (including future third-party ones) and make them viewable
on a web page; it also doubles as godoc comments.
As such, this commit makes significant progress in migrating the docs
from our temporary wiki page toward our new website which is still under
construction.
With this change, all host modules will use ctx.LoadModule() and pass in
both the struct pointer and the field name as a string. This allows the
reflect package to read the struct tag from that field so that it can
get the necessary information like the module namespace and the inline
key.
This has the nice side-effect of unifying the code and documentation. It
also simplifies module loading, and handles several variations on field
types for raw module fields (i.e. variations on json.RawMessage, such as
arrays and maps).
I also renamed ModuleInfo.Name -> ModuleInfo.ID, to make it clear that
the ID is the "full name" which includes both the module namespace and
the name. This clarity is helpful when describing module hierarchy.
As of this change, Caddy modules are no longer an experimental design.
I think the architecture is good enough to go forward.
* logging: Initial implementation
* logging: More encoder formats, better defaults
* logging: Fix repetition bug with FilterEncoder; add more presets
* logging: DiscardWriter; delete or no-op logs that discard their output
* logging: Add http.handlers.log module; enhance Replacer methods
The Replacer interface has new methods to customize how to handle empty
or unrecognized placeholders. Closes#2815.
* logging: Overhaul HTTP logging, fix bugs, improve filtering, etc.
* logging: General cleanup, begin transitioning to using new loggers
* Fixes after merge conflict
* file_server: Make tests work on Windows
* caddyfile: Fix escaping when character is not escapable
We only escape certain characters depending on inside or outside of
quotes (mainly newlines and quotes). We don't want everyone to have to
escape Windows file paths like C:\\Windows\\... but we can't drop the
\ either if it's just C:\Windows\...
See: https://stackoverflow.com/a/12518877/1048862
For example, trying to check the existence of "/www/index.php/index.php"
fails but not with an os.IsNotExist()-type error. So we have to assume
that a file that cannot be successfully stat'ed at all does not exist.
- Rename http.var.* -> http.vars.* to be more consistent
- Prefixing a path matcher with * now invokes simple suffix matching
- Handlers and matchers that need a root path default to {http.vars.root}
- Clean replacer output on the file matcher's file selection suffix
Use piles from which to draw config values.
Module values can return their name, so now we can do two-way mapping
from value to name and name to value; whereas before we could only map
name to value. This was problematic with the Caddyfile adapter since
it receives values and needs to know the name to put in the config.
Along with several other changes, such as renaming caddyhttp.ServerRoute
to caddyhttp.Route, exporting some types that were not exported before,
and tweaking the caddytls TLS values to be more consistent.
Notably, we also now disable automatic cert management for names which
already have a cert (manually) loaded into the cache. These names no
longer need to be specified in the "skip_certificates" field of the
automatic HTTPS config, because they will be skipped automatically.
* optimized functions for inlining
* added note regarding ResponseWriterWrapper
* optimzed browseWrite* methods for FileServer
* created benchmarks for comparison
* creating browseListing instance in each function
* created benchmarks for openResponseWriter
* removed benchmarks of old implementations
* implemented sync.Pool for byte buffers
* using global sync.Pool for writing JSON/HTML
Differentiating middleware and responders has one benefit, namely that
it's clear which module provides the response, but even then it's not
a great advantage. Linear handler config makes a little more sense,
giving greater flexibility and simplifying the core a bit, even though
it's slightly awkward that handlers which are responders may not use
the 'next' handler that is passed in at all.
- Fix static responder so it doesn't replace its own headers config,
and instead replaces the actual response header values
- caddyhttp.ResponseRecorder type optionally buffers response
- Add interface guards to ensure regexp matchers get provisioned
- Use default HTTP port if one is not explicitly set
- Encode middleware writes status code 200 if not written upstream
- Templates and markdown only try to execute on text responses
- Static file server sets Content-Type based on file extension only
(this whole thing -- MIME sniffing, etc -- needs more configurability)