doc_src/string: Add a small regex reference

This isn't nearly all of
it (https://pcre.org/current/doc/html/pcre2syntax.html), but it should
cover the most-used features.

[ci skip]
Fabian Homborg 2018-12-01 09:54:05 +01:00
parent d20b3c688b
commit 1785af156b

@@ -126,6 +126,50 @@ Both the `match` and `replace` subcommand support regular expressions when used
In general, special characters are special by default, so `a+` matches one or more "a"s, while `a\+` matches an "a" and then a "+". `(a+)` matches one or more "a"s in a capturing group (`(?:XXXX)` denotes a non-capturing group). For the replacement parameter of `replace`, `$n` refers to the n-th group of the match. In the match parameter, `\n` (e.g. `\1`) refers back to groups.
Repetitions:
- `*` matches 0 or more repetitions of the previous expression
- `+` matches 1 or more
- `?` matches 0 or 1
- `{n}` matches exactly n (where n is a number)
- `{n,m}` matches at least n and no more than m
- `{n,}` matches n or more
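As a rough illustration of the repetition operators, here is a small sketch using `string match -r`. The input strings are made up for the example, and the trailing comments describe what each call should print.

```fish
# Each call prints the part of the input that the pattern matched.
string match -r 'a*' 'aaab'         # 0 or more "a"s: prints aaa
string match -r 'a+b' 'aaab'        # 1 or more "a"s, then "b": prints aaab
string match -r 'colou?r' 'color'   # the "u" is optional: prints color
string match -r '\d{4}' '20181201'  # exactly 4 digits: prints 2018
string match -r '\d{2,3}' '12345'   # 2 to 3 digits: prints 123
string match -r '\d{3,}' '12345'    # 3 or more digits: prints 12345
```

Repetitions are greedy by default, which is why `{2,3}` takes three digits here rather than two.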
Some of the more important character classes:
- `.` any character except newline
- `\d` a decimal digit and `\D`, not a decimal digit
- `\s` whitespace and `\S`, not whitespace
- `\w` a "word" character and `\W`, a "non-word" character
- `[...]` (where "..." is some characters) is a character set
- `[^...]` is the inverse of the given character set
- `[x-y]` is the range of characters from x to y
- `[[:xxx:]]` is a named character set
- `[[:^xxx:]]` is the inverse of a named character set
- `[[:alnum:]]` : "alphanumeric"
- `[[:alpha:]]` : "alphabetic"
- `[[:ascii:]]` : "0-127"
- `[[:blank:]]` : "space or tab"
- `[[:cntrl:]]` : "control character"
- `[[:digit:]]` : "decimal digit"
- `[[:graph:]]` : "printing, excluding space"
- `[[:lower:]]` : "lower case letter"
- `[[:print:]]` : "printing, including space"
- `[[:punct:]]` : "printing, excluding alphanumeric"
- `[[:space:]]` : "white space"
- `[[:upper:]]` : "upper case letter"
- `[[:word:]]` : "same as \w"
- `[[:xdigit:]]` : "hexadecimal digit"
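The character classes above could be tried out along these lines. This is only a sketch with made-up inputs, and the comments describe the expected first match.

```fish
string match -r '\d+' 'fish 3.0'             # decimal digits: prints 3
string match -r '\S+' 'hello world'          # non-whitespace: prints hello
string match -r '[aeiou]+' 'queue'           # character set: prints ueue
string match -r '[^0-9]+' 'abc123'           # inverted set: prints abc
string match -r '[a-f]+' 'deadbeef!'         # range: prints deadbeef
string match -r '[[:upper:]]+' 'fishSHELL'   # named set: prints SHELL
string match -r '[[:^digit:]]+' 'abc123'     # inverted named set: prints abc
```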
Groups:
- `(...)` is a capturing group
- `(?:...)` is a non-capturing group
- `\n` is a backreference (where n is the number of the group, starting with 1)
- `$n` is a reference from the replacement expression to a group in the match expression.
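A short sketch of how groups behave in practice, with inputs invented for the example. With `string match -r`, the whole match is printed first, followed by each capturing group on its own line.

```fish
# Three capturing groups: prints 2:34:56, then 2, 34 and 56
string match -r '(\d\d?):(\d\d):(\d\d)' 2:34:56

# The non-capturing group is not printed separately: prints ababc, then c
string match -r '(?:ab)+(c)' 'ababc'

# \1 refers back to the first group, so this finds a doubled
# character: prints ll, then l
string match -r '(\w)\1' 'hello'

# $1 and $2 in the replacement refer to groups in the match
# expression: prints right left
string replace -r '(\w+)\s+(\w+)' '$2 $1' 'left right'
```

Note the single quotes around the replacement, which keep fish from expanding `$1` and `$2` as variables.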
Other constructs:
- `\b` denotes a word boundary, `\B` is not a word boundary.
- `^` is the start of the string or line, `$` the end.
- `|` is "alternation", i.e. the "or".
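Word boundaries, anchors and alternation might be used as in the following sketch. The inputs are made up for illustration.

```fish
string match -r '\bcat\b' 'a cat sat'      # whole word: prints cat
string match -r '\bcat\b' 'concatenate'    # no word boundary: prints nothing, returns 1
string match -r '^fish' 'fish shell'       # anchored at the start: prints fish
string match -r 'shell$' 'fish shell'      # anchored at the end: prints shell
string match -r 'cat|dog|fish' 'nice dog'  # alternation: prints dog
```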
\subsection string-example Examples
\fish{cli-dark}