Mirror of https://github.com/fish-shell/fish-shell.git (synced 2025-02-21 04:22:40 +08:00)
Allow { } for command grouping, like begin / end
For compound commands we already have begin/end, but:

> it is long, which is not convenient for the command line
> it is different than {}, which shell users have been using for >50 years

The difference from {} can break muscle memory and add extra steps when I'm trying to write simple commands that work in any shell. Fix that by embracing the traditional style too.

---

Since { and } have always been special syntax in fish, we can also allow

    {
    }

    { echo }

which I find intuitive even without having used a shell that supports this (like zsh). The downside is that this doesn't work in some other shells. The upside is in aesthetics and convenience (this is for interactive use). Not completely sure about this.

---

This implementation adds a hack to the tokenizer: '{' usually starts a brace expansion. Make it a compound command when in command position (not something the tokenizer would normally know). We need to disable this when parsing a freestanding argument list (as in "complete somecmd -a "{true,false}"").

It's not really clear what "read -t" should do. For now, keep the existing behavior (don't parse compound statements).

Add another hack to increase backwards compatibility: parse something like "{ foo }" as a brace statement only if it has a space after the opening brace. This style is less likely to be used for brace expansion. Perhaps we can change this in a future PR.

Use separate terminal token types for braces; we could make the left brace an ordinary string token, but since string tokens undergo unescaping during expansion etc., every such place would need to know whether it's dealing with a command or an argument. Certainly possible, but it seems simpler (especially for tab-completions) to strip braces in the parser. We could change this.

---

In future we could allow the following alternative syntax (which is invalid today):

    if true {
    }

    if true; {
    }

Closes #10895
Closes #10898
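A brief illustration of the syntax this commit enables (fish; the exact output formatting is not shown here):

```fish
# Brace grouping in command position, equivalent to begin/end.
# Note the space after '{' required for backwards compatibility.
{ echo 1; echo 2 }

# Multi-line form:
{
    echo a
    echo b
}

# Without the space, '{...}' in argument position is still brace expansion:
echo {a,b}
```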
This commit is contained in:
parent
349f62cd7c
commit
1bf2b43d30
@@ -3,9 +3,11 @@ fish 4.1.0 (released ???)

Notable improvements and fixes
------------------------------
- Compound commands (``begin; echo 1; echo 2; end``) can now be abbreviated using braces (``{ echo 1; echo 2 }``), like in other shells.

Deprecations and removed features
---------------------------------
- Tokens like ``{ echo, echo }`` in command position are no longer interpreted as brace expansion but as a compound command.

Scripting improvements
----------------------
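The deprecation note above can be sketched concretely (fish; illustrative only):

```fish
# Argument position: braces still expand as before.
echo {a,b}

# Command position, with a space after '{': now parsed as a compound
# command rather than brace expansion.
{ echo a; echo b }
```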
@@ -9,6 +9,7 @@ Synopsis

.. synopsis::

    begin; [COMMANDS ...]; end
    { [COMMANDS ...] }

Description
-----------

@@ -21,6 +22,8 @@ The block is unconditionally executed. ``begin; ...; end`` is equivalent to ``if

``begin`` does not change the current exit status itself. After the block has completed, ``$status`` will be set to the status returned by the most recent command.

Some other shells only support the ``{ [COMMANDS ...] ; }`` notation.

The **-h** or **--help** option displays help about using this command.

Example
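A hedged sketch of the equivalence the documentation describes, including a redirection applied to the whole block (fish; the output path is illustrative):

```fish
# These two are equivalent ways to group commands and redirect
# the group's combined output:
begin; echo 1; echo 2; end > /tmp/out

{ echo 1; echo 2 } > /tmp/out
```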
@@ -53,7 +53,7 @@ lexer_rules = [
     # Hack: treat the "[ expr ]" alias of builtin test as command token (not as grammar
     # metacharacter). This works because we write it without spaces in the grammar (like
     # "[OPTIONS]").
-    (r"\. |! |\[ | \]", Name.Constant),
+    (r"\. |! |\[ | \]|\{ | \}", Name.Constant),
     # Statement separators.
     (r"\n", Text.Whitespace),
     (r";", Punctuation),
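The regex change above can be sanity-checked in isolation; a minimal Python sketch (the helper name is illustrative, not from the fish source):

```python
import re

# The amended Pygments rule from the diff: ". ", "! ", "[ ", " ]" were
# already treated as command-like constants, and the change adds "{ "
# and " }". The spaces are significant: they are how the rule avoids
# matching brace expansion like "{a,b}".
BRACE_RULE = re.compile(r"\. |! |\[ | \]|\{ | \}")

def highlights_as_constant(text: str) -> bool:
    """Return True if the lexer rule would fire at the start of `text`."""
    return BRACE_RULE.match(text) is not None

# A brace statement's opening token matches; brace expansion does not.
print(highlights_as_constant("{ echo hi }"))  # True
print(highlights_as_constant("{a,b}"))        # False
```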
124
src/ast.rs
@ -21,7 +21,7 @@ use crate::parse_tree::ParseToken;
|
||||
use crate::tests::prelude::*;
|
||||
use crate::tokenizer::{
|
||||
variable_assignment_equals_pos, TokFlags, TokenType, Tokenizer, TokenizerError,
|
||||
TOK_ACCEPT_UNFINISHED, TOK_CONTINUE_AFTER_ERROR, TOK_SHOW_COMMENTS,
|
||||
TOK_ACCEPT_UNFINISHED, TOK_ARGUMENT_LIST, TOK_CONTINUE_AFTER_ERROR, TOK_SHOW_COMMENTS,
|
||||
};
|
||||
use crate::wchar::prelude::*;
|
||||
use std::borrow::Cow;
|
||||
@ -203,6 +203,9 @@ pub trait ConcreteNode {
|
||||
fn as_block_statement(&self) -> Option<&BlockStatement> {
|
||||
None
|
||||
}
|
||||
fn as_brace_statement(&self) -> Option<&BraceStatement> {
|
||||
None
|
||||
}
|
||||
fn as_if_clause(&self) -> Option<&IfClause> {
|
||||
None
|
||||
}
|
||||
@ -321,6 +324,9 @@ trait ConcreteNodeMut {
|
||||
fn as_mut_block_statement(&mut self) -> Option<&mut BlockStatement> {
|
||||
None
|
||||
}
|
||||
fn as_mut_brace_statement(&mut self) -> Option<&mut BraceStatement> {
|
||||
None
|
||||
}
|
||||
fn as_mut_if_clause(&mut self) -> Option<&mut IfClause> {
|
||||
None
|
||||
}
|
||||
@ -1028,6 +1034,9 @@ macro_rules! set_parent_of_union_field {
|
||||
} else if matches!($self.$field_name, StatementVariant::BlockStatement(_)) {
|
||||
$self.$field_name.as_mut_block_statement().parent = Some($self);
|
||||
$self.$field_name.as_mut_block_statement().set_parents();
|
||||
} else if matches!($self.$field_name, StatementVariant::BraceStatement(_)) {
|
||||
$self.$field_name.as_mut_brace_statement().parent = Some($self);
|
||||
$self.$field_name.as_mut_brace_statement().set_parents();
|
||||
} else if matches!($self.$field_name, StatementVariant::IfStatement(_)) {
|
||||
$self.$field_name.as_mut_if_statement().parent = Some($self);
|
||||
$self.$field_name.as_mut_if_statement().set_parents();
|
||||
@ -1247,11 +1256,12 @@ impl CheckParse for JobConjunction {
|
||||
fn can_be_parsed(pop: &mut Populator<'_>) -> bool {
|
||||
let token = pop.peek_token(0);
|
||||
// These keywords end a job list.
|
||||
token.typ == ParseTokenType::string
|
||||
&& !matches!(
|
||||
token.keyword,
|
||||
ParseKeyword::kw_case | ParseKeyword::kw_end | ParseKeyword::kw_else
|
||||
)
|
||||
token.typ == ParseTokenType::left_brace
|
||||
|| (token.typ == ParseTokenType::string
|
||||
&& !matches!(
|
||||
token.keyword,
|
||||
ParseKeyword::kw_case | ParseKeyword::kw_end | ParseKeyword::kw_else
|
||||
))
|
||||
}
|
||||
}
|
||||
|
||||
@ -1399,6 +1409,37 @@ impl ConcreteNodeMut for BlockStatement {
|
||||
}
|
||||
}
|
||||
|
||||
#[derive(Default, Debug)]
|
||||
pub struct BraceStatement {
|
||||
parent: Option<*const dyn Node>,
|
||||
/// The opening brace, in command position.
|
||||
pub left_brace: TokenLeftBrace,
|
||||
/// List of jobs in this block.
|
||||
pub jobs: JobList,
|
||||
/// The closing brace.
|
||||
pub right_brace: TokenRightBrace,
|
||||
/// Arguments and redirections associated with the block.
|
||||
pub args_or_redirs: ArgumentOrRedirectionList,
|
||||
}
|
||||
implement_node!(BraceStatement, branch, brace_statement);
|
||||
implement_acceptor_for_branch!(
|
||||
BraceStatement,
|
||||
(left_brace: (TokenLeftBrace)),
|
||||
(jobs: (JobList)),
|
||||
(right_brace: (TokenRightBrace)),
|
||||
(args_or_redirs: (ArgumentOrRedirectionList)),
|
||||
);
|
||||
impl ConcreteNode for BraceStatement {
|
||||
fn as_brace_statement(&self) -> Option<&BraceStatement> {
|
||||
Some(self)
|
||||
}
|
||||
}
|
||||
impl ConcreteNodeMut for BraceStatement {
|
||||
fn as_mut_brace_statement(&mut self) -> Option<&mut BraceStatement> {
|
||||
Some(self)
|
||||
}
|
||||
}
|
||||
|
||||
#[derive(Default, Debug)]
|
||||
pub struct IfClause {
|
||||
parent: Option<*const dyn Node>,
|
||||
@ -1772,7 +1813,10 @@ impl CheckParse for AndorJob {
|
||||
// Check that the argument to and/or is a string that's not help. Otherwise
|
||||
// it's either 'and --help' or a naked 'and', and not part of this list.
|
||||
let next_token = pop.peek_token(1);
|
||||
next_token.typ == ParseTokenType::string && !next_token.is_help_argument
|
||||
matches!(
|
||||
next_token.typ,
|
||||
ParseTokenType::string | ParseTokenType::left_brace
|
||||
) && !next_token.is_help_argument
|
||||
}
|
||||
}
|
||||
|
||||
@ -1894,7 +1938,7 @@ impl CheckParse for VariableAssignment {
|
||||
// What is the token after it?
|
||||
match pop.peek_type(1) {
|
||||
// We have `a= cmd` and should treat it as a variable assignment.
|
||||
ParseTokenType::string => true,
|
||||
ParseTokenType::string | ParseTokenType::left_brace => true,
|
||||
// We have `a=` which is OK if we are allowing incomplete, an error otherwise.
|
||||
ParseTokenType::terminate => pop.allow_incomplete(),
|
||||
// We have e.g. `a= >` which is an error.
|
||||
@ -1966,6 +2010,8 @@ define_token_node!(String_, string);
|
||||
define_token_node!(TokenBackground, background);
|
||||
define_token_node!(TokenConjunction, andand, oror);
|
||||
define_token_node!(TokenPipe, pipe);
|
||||
define_token_node!(TokenLeftBrace, left_brace);
|
||||
define_token_node!(TokenRightBrace, right_brace);
|
||||
define_token_node!(TokenRedirection, redirection);
|
||||
|
||||
define_keyword_node!(DecoratedStatementDecorator, kw_command, kw_builtin, kw_exec);
|
||||
@ -2236,6 +2282,7 @@ pub enum StatementVariant {
|
||||
None,
|
||||
NotStatement(Box<NotStatement>),
|
||||
BlockStatement(Box<BlockStatement>),
|
||||
BraceStatement(Box<BraceStatement>),
|
||||
IfStatement(Box<IfStatement>),
|
||||
SwitchStatement(Box<SwitchStatement>),
|
||||
DecoratedStatement(DecoratedStatement),
|
||||
@ -2253,6 +2300,7 @@ impl Acceptor for StatementVariant {
|
||||
StatementVariant::None => panic!("cannot visit null statement"),
|
||||
StatementVariant::NotStatement(node) => node.accept(visitor, reversed),
|
||||
StatementVariant::BlockStatement(node) => node.accept(visitor, reversed),
|
||||
StatementVariant::BraceStatement(node) => node.accept(visitor, reversed),
|
||||
StatementVariant::IfStatement(node) => node.accept(visitor, reversed),
|
||||
StatementVariant::SwitchStatement(node) => node.accept(visitor, reversed),
|
||||
StatementVariant::DecoratedStatement(node) => node.accept(visitor, reversed),
|
||||
@ -2265,6 +2313,7 @@ impl AcceptorMut for StatementVariant {
|
||||
StatementVariant::None => panic!("cannot visit null statement"),
|
||||
StatementVariant::NotStatement(node) => node.accept_mut(visitor, reversed),
|
||||
StatementVariant::BlockStatement(node) => node.accept_mut(visitor, reversed),
|
||||
StatementVariant::BraceStatement(node) => node.accept_mut(visitor, reversed),
|
||||
StatementVariant::IfStatement(node) => node.accept_mut(visitor, reversed),
|
||||
StatementVariant::SwitchStatement(node) => node.accept_mut(visitor, reversed),
|
||||
StatementVariant::DecoratedStatement(node) => node.accept_mut(visitor, reversed),
|
||||
@ -2292,6 +2341,12 @@ impl StatementVariant {
|
||||
_ => None,
|
||||
}
|
||||
}
|
||||
pub fn as_brace_statement(&self) -> Option<&BraceStatement> {
|
||||
match self {
|
||||
StatementVariant::BraceStatement(node) => Some(node),
|
||||
_ => None,
|
||||
}
|
||||
}
|
||||
pub fn as_if_statement(&self) -> Option<&IfStatement> {
|
||||
match self {
|
||||
StatementVariant::IfStatement(node) => Some(node),
|
||||
@ -2316,6 +2371,7 @@ impl StatementVariant {
|
||||
StatementVariant::None => panic!("cannot visit null statement"),
|
||||
StatementVariant::NotStatement(node) => &**node,
|
||||
StatementVariant::BlockStatement(node) => &**node,
|
||||
StatementVariant::BraceStatement(node) => &**node,
|
||||
StatementVariant::IfStatement(node) => &**node,
|
||||
StatementVariant::SwitchStatement(node) => &**node,
|
||||
StatementVariant::DecoratedStatement(node) => node,
|
||||
@ -2333,6 +2389,12 @@ impl StatementVariant {
|
||||
_ => panic!(),
|
||||
}
|
||||
}
|
||||
fn as_mut_brace_statement(&mut self) -> &mut BraceStatement {
|
||||
match self {
|
||||
StatementVariant::BraceStatement(node) => node,
|
||||
_ => panic!(),
|
||||
}
|
||||
}
|
||||
fn as_mut_if_statement(&mut self) -> &mut IfStatement {
|
||||
match self {
|
||||
StatementVariant::IfStatement(node) => node,
|
||||
@ -2371,6 +2433,7 @@ pub fn ast_type_to_string(t: Type) -> &'static wstr {
|
||||
Type::function_header => L!("function_header"),
|
||||
Type::begin_header => L!("begin_header"),
|
||||
Type::block_statement => L!("block_statement"),
|
||||
Type::brace_statement => L!("brace_statement"),
|
||||
Type::if_clause => L!("if_clause"),
|
||||
Type::elseif_clause => L!("elseif_clause"),
|
||||
Type::elseif_clause_list => L!("elseif_clause_list"),
|
||||
@ -2629,13 +2692,17 @@ impl<'a> TokenStream<'a> {
|
||||
// The maximum number of lookahead supported.
|
||||
const MAX_LOOKAHEAD: usize = 2;
|
||||
|
||||
fn new(src: &'a wstr, flags: ParseTreeFlags) -> Self {
|
||||
fn new(src: &'a wstr, flags: ParseTreeFlags, freestanding_arguments: bool) -> Self {
|
||||
let mut flags = TokFlags::from(flags);
|
||||
if freestanding_arguments {
|
||||
flags |= TOK_ARGUMENT_LIST;
|
||||
}
|
||||
Self {
|
||||
lookahead: [ParseToken::new(ParseTokenType::invalid); Self::MAX_LOOKAHEAD],
|
||||
start: 0,
|
||||
count: 0,
|
||||
src,
|
||||
tok: Tokenizer::new(src, TokFlags::from(flags)),
|
||||
tok: Tokenizer::new(src, flags),
|
||||
comment_ranges: vec![],
|
||||
}
|
||||
}
|
||||
@ -2931,6 +2998,20 @@ impl<'s> NodeVisitorMut for Populator<'s> {
|
||||
return;
|
||||
};
|
||||
|
||||
let token = &error.token;
|
||||
// TODO: maybe extend this to other tokenizer errors?
|
||||
if token.typ == ParseTokenType::tokenizer_error
|
||||
&& token.tok_error == TokenizerError::closing_unopened_brace
|
||||
{
|
||||
parse_error_range!(
|
||||
self,
|
||||
token.range(),
|
||||
ParseErrorCode::unbalancing_brace,
|
||||
"%s",
|
||||
<TokenizerError as Into<&wstr>>::into(token.tok_error)
|
||||
);
|
||||
}
|
||||
|
||||
// We believe the node is some sort of block statement. Attempt to find a source range
|
||||
// for the block's keyword (for, if, etc) and a user-presentable description. This
|
||||
// is used to provide better error messages. Note at this point the parse tree is
|
||||
@ -2989,7 +3070,7 @@ impl<'s> NodeVisitorMut for Populator<'s> {
|
||||
} else {
|
||||
parse_error!(
|
||||
self,
|
||||
error.token,
|
||||
token,
|
||||
ParseErrorCode::generic,
|
||||
"Expected %ls, but found %ls",
|
||||
keywords_user_presentable_description(error.allowed_keywords),
|
||||
@ -3095,7 +3176,7 @@ impl<'s> Populator<'s> {
|
||||
flags,
|
||||
semis: vec![],
|
||||
errors: vec![],
|
||||
tokens: TokenStream::new(src, flags),
|
||||
tokens: TokenStream::new(src, flags, top_type == Type::freestanding_argument_list),
|
||||
top_type,
|
||||
unwinding: false,
|
||||
any_error: false,
|
||||
@ -3550,6 +3631,19 @@ impl<'s> Populator<'s> {
|
||||
|
||||
fn new_decorated_statement(slf: &mut Populator<'_>) -> StatementVariant {
|
||||
let embedded = slf.allocate_visit::<DecoratedStatement>();
|
||||
if !slf.unwinding && slf.peek_token(0).typ == ParseTokenType::left_brace {
|
||||
parse_error!(
|
||||
slf,
|
||||
slf.peek_token(0),
|
||||
ParseErrorCode::generic,
|
||||
"Expected %s, but found %ls",
|
||||
token_type_user_presentable_description(
|
||||
ParseTokenType::end,
|
||||
ParseKeyword::none
|
||||
),
|
||||
slf.peek_token(0).user_presentable_description()
|
||||
);
|
||||
}
|
||||
StatementVariant::DecoratedStatement(*embedded)
|
||||
}
|
||||
|
||||
@ -3557,6 +3651,9 @@ impl<'s> Populator<'s> {
|
||||
// This may happen if we just have a 'time' prefix.
|
||||
// Construct a decorated statement, which will be unsourced.
|
||||
self.allocate_visit::<DecoratedStatement>();
|
||||
} else if self.peek_token(0).typ == ParseTokenType::left_brace {
|
||||
let embedded = self.allocate_visit::<BraceStatement>();
|
||||
return StatementVariant::BraceStatement(embedded);
|
||||
} else if self.peek_token(0).typ != ParseTokenType::string {
|
||||
// We may be unwinding already; do not produce another error.
|
||||
// For example in `true | and`.
|
||||
@ -3957,6 +4054,8 @@ impl From<TokenType> for ParseTokenType {
|
||||
TokenType::oror => ParseTokenType::oror,
|
||||
TokenType::end => ParseTokenType::end,
|
||||
TokenType::background => ParseTokenType::background,
|
||||
TokenType::left_brace => ParseTokenType::left_brace,
|
||||
TokenType::right_brace => ParseTokenType::right_brace,
|
||||
TokenType::redirect => ParseTokenType::redirection,
|
||||
TokenType::error => ParseTokenType::tokenizer_error,
|
||||
TokenType::comment => ParseTokenType::comment,
|
||||
@ -4042,6 +4141,7 @@ pub enum Type {
|
||||
function_header,
|
||||
begin_header,
|
||||
block_statement,
|
||||
brace_statement,
|
||||
if_clause,
|
||||
elseif_clause,
|
||||
elseif_clause_list,
|
||||
|
@ -22,6 +22,7 @@ use crate::wcstringutil::join_strings;
|
||||
use std::ops::Range;
|
||||
|
||||
/// Which part of the command buffer are we operating on.
|
||||
#[derive(Eq, PartialEq)]
|
||||
enum TextScope {
|
||||
String,
|
||||
Job,
|
||||
@ -103,6 +104,7 @@ fn replace_part(
|
||||
fn write_part(
|
||||
parser: &Parser,
|
||||
range: Range<usize>,
|
||||
range_is_single_token: bool,
|
||||
cut_at_cursor: bool,
|
||||
token_mode: Option<TokenMode>,
|
||||
buffer: &wstr,
|
||||
@ -121,19 +123,8 @@ fn write_part(
|
||||
return;
|
||||
};
|
||||
|
||||
let buff = &buffer[range];
|
||||
let mut tok = Tokenizer::new(buff, TOK_ACCEPT_UNFINISHED);
|
||||
let mut args = vec![];
|
||||
while let Some(token) = tok.next() {
|
||||
if cut_at_cursor && token.end() >= pos {
|
||||
break;
|
||||
}
|
||||
if token.type_ != TokenType::string {
|
||||
continue;
|
||||
}
|
||||
|
||||
let token_text = tok.text_of(&token);
|
||||
|
||||
let mut add_token = |token_text: &wstr| {
|
||||
match token_mode {
|
||||
TokenMode::Expanded => {
|
||||
const COMMANDLINE_TOKENS_MAX_EXPANSION: usize = 512;
|
||||
@ -175,7 +166,26 @@ fn write_part(
|
||||
args.push(Completion::from_completion(unescaped));
|
||||
}
|
||||
}
|
||||
}
|
||||
};
|
||||
|
||||
let buff = &buffer[range];
|
||||
if range_is_single_token {
|
||||
add_token(buff);
|
||||
} else {
|
||||
let mut tok = Tokenizer::new(buff, TOK_ACCEPT_UNFINISHED);
|
||||
while let Some(token) = tok.next() {
|
||||
if cut_at_cursor && token.end() >= pos {
|
||||
break;
|
||||
}
|
||||
if token.type_ != TokenType::string {
|
||||
continue;
|
||||
}
|
||||
|
||||
let token_text = tok.text_of(&token);
|
||||
add_token(token_text);
|
||||
}
|
||||
};
|
||||
|
||||
for arg in args {
|
||||
streams.out.appendln(arg.completion);
|
||||
}
|
||||
@ -642,6 +652,7 @@ pub fn commandline(parser: &Parser, streams: &mut IoStreams, args: &mut [&wstr])
|
||||
write_part(
|
||||
parser,
|
||||
range,
|
||||
buffer_part == TextScope::Token,
|
||||
cut_at_cursor,
|
||||
token_mode,
|
||||
current_buffer,
|
||||
|
@ -76,6 +76,9 @@ struct PrettyPrinterState<'source, 'ast> {
|
||||
// present in the ast.
|
||||
gaps: Vec<SourceRange>,
|
||||
|
||||
// Sorted set of source offsets of brace statements that span multiple lines.
|
||||
multi_line_brace_statement_locations: Vec<usize>,
|
||||
|
||||
// The sorted set of source offsets of nl_semi_t which should be set as semis, not newlines.
|
||||
// This is computed ahead of time for convenience.
|
||||
preferred_semi_locations: Vec<usize>,
|
||||
@ -120,11 +123,14 @@ impl<'source, 'ast> PrettyPrinter<'source, 'ast> {
|
||||
// Start with true to ignore leading empty lines.
|
||||
gap_text_mask_newline: true,
|
||||
gaps: vec![],
|
||||
multi_line_brace_statement_locations: vec![],
|
||||
preferred_semi_locations: vec![],
|
||||
errors: None,
|
||||
},
|
||||
};
|
||||
zelf.state.gaps = zelf.compute_gaps();
|
||||
zelf.state.multi_line_brace_statement_locations =
|
||||
zelf.compute_multi_line_brace_statement_locations();
|
||||
zelf.state.preferred_semi_locations = zelf.compute_preferred_semi_locations();
|
||||
zelf
|
||||
}
|
||||
@ -224,6 +230,23 @@ impl<'source, 'ast> PrettyPrinter<'source, 'ast> {
|
||||
}
|
||||
}
|
||||
|
||||
// `{ x; y; }` gets semis if the input uses semis and it spans only one line.
|
||||
for node in Traversal::new(self.ast.top()) {
|
||||
let Some(brace_statement) = node.as_brace_statement() else {
|
||||
continue;
|
||||
};
|
||||
if self
|
||||
.state
|
||||
.multi_line_brace_statement_locations
|
||||
.binary_search(&brace_statement.source_range().start())
|
||||
.is_err()
|
||||
{
|
||||
for job in &brace_statement.jobs {
|
||||
job.semi_nl.as_ref().map(&mut mark_semi_from_input);
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
// `x ; and y` gets semis if it has them already, and they are on the same line.
|
||||
for node in Traversal::new(self.ast.top()) {
|
||||
let Some(job_list) = node.as_job_list() else {
|
||||
@ -259,9 +282,41 @@ impl<'source, 'ast> PrettyPrinter<'source, 'ast> {
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
result.sort_unstable();
|
||||
result
|
||||
}
|
||||
|
||||
fn compute_multi_line_brace_statement_locations(&self) -> Vec<usize> {
|
||||
let mut result = vec![];
|
||||
let newline_offsets: Vec<usize> = self
|
||||
.state
|
||||
.source
|
||||
.char_indices()
|
||||
.filter_map(|(i, c)| (c == '\n').then_some(i))
|
||||
.collect();
|
||||
let mut next_newline = 0;
|
||||
for node in Traversal::new(self.ast.top()) {
|
||||
let Some(brace_statement) = node.as_brace_statement() else {
|
||||
continue;
|
||||
};
|
||||
while next_newline != newline_offsets.len()
|
||||
&& newline_offsets[next_newline] < brace_statement.source_range().start()
|
||||
{
|
||||
next_newline += 1;
|
||||
}
|
||||
let contains_newline = next_newline != newline_offsets.len() && {
|
||||
let newline_offset = newline_offsets[next_newline];
|
||||
assert!(newline_offset >= brace_statement.source_range().start());
|
||||
newline_offset < brace_statement.source_range().end()
|
||||
};
|
||||
if contains_newline {
|
||||
result.push(brace_statement.source_range().start());
|
||||
}
|
||||
}
|
||||
assert!(result.is_sorted_by(|l, r| Some(l.cmp(r))));
|
||||
result
|
||||
}
|
||||
}
|
||||
|
||||
impl<'source, 'ast> PrettyPrinterState<'source, 'ast> {
|
||||
@ -617,6 +672,42 @@ impl<'source, 'ast> PrettyPrinterState<'source, 'ast> {
|
||||
}
|
||||
}
|
||||
|
||||
fn is_multi_line_brace(&self, node: &dyn ast::Token) -> bool {
|
||||
node.parent()
|
||||
.unwrap()
|
||||
.as_brace_statement()
|
||||
.is_some_and(|brace_statement| {
|
||||
self.multi_line_brace_statement_locations
|
||||
.binary_search(&brace_statement.source_range().start())
|
||||
.is_ok()
|
||||
})
|
||||
}
|
||||
fn visit_left_brace(&mut self, node: &dyn ast::Token) {
|
||||
let range = node.source_range();
|
||||
let flags = self.gap_text_flags_before_node(node.as_node());
|
||||
if self.is_multi_line_brace(node) && !self.at_line_start() {
|
||||
self.emit_newline();
|
||||
}
|
||||
self.current_indent = self.indent(range.start());
|
||||
self.emit_space_or_indent(flags);
|
||||
self.output.push('{');
|
||||
}
|
||||
fn visit_right_brace(&mut self, node: &dyn ast::Token) {
|
||||
let range = node.source_range();
|
||||
let flags = self.gap_text_flags_before_node(node.as_node());
|
||||
self.emit_gap_text_before(range, flags);
|
||||
if self.is_multi_line_brace(node) {
|
||||
self.current_indent = self.indent(range.start());
|
||||
if !self.at_line_start() {
|
||||
self.emit_newline();
|
||||
}
|
||||
self.emit_space_or_indent(flags);
|
||||
self.output.push('}');
|
||||
} else {
|
||||
self.emit_node_text(node.as_node());
|
||||
}
|
||||
}
|
||||
|
||||
fn visit_redirection(&mut self, node: &ast::Redirection) {
|
||||
// No space between a redirection operator and its target (#2899).
|
||||
let Some(orange) = node.oper.range() else {
|
||||
@ -684,11 +775,12 @@ impl<'source, 'ast> NodeVisitor<'_> for PrettyPrinterState<'source, 'ast> {
|
||||
return;
|
||||
}
|
||||
if let Some(token) = node.as_token() {
|
||||
if token.token_type() == ParseTokenType::end {
|
||||
self.visit_semi_nl(token);
|
||||
return;
|
||||
match token.token_type() {
|
||||
ParseTokenType::end => self.visit_semi_nl(token),
|
||||
ParseTokenType::left_brace => self.visit_left_brace(token),
|
||||
ParseTokenType::right_brace => self.visit_right_brace(token),
|
||||
_ => self.emit_node_text(node),
|
||||
}
|
||||
self.emit_node_text(node);
|
||||
return;
|
||||
}
|
||||
match node.typ() {
|
||||
|
@ -20,6 +20,7 @@ use crate::reader::ReaderConfig;
|
||||
use crate::reader::{reader_pop, reader_push, reader_readline};
|
||||
use crate::tokenizer::Tokenizer;
|
||||
use crate::tokenizer::TOK_ACCEPT_UNFINISHED;
|
||||
use crate::tokenizer::TOK_ARGUMENT_LIST;
|
||||
use crate::wcstringutil::split_about;
|
||||
use crate::wcstringutil::split_string_tok;
|
||||
use crate::wutil;
|
||||
@ -644,7 +645,7 @@ pub fn read(parser: &Parser, streams: &mut IoStreams, argv: &mut [&wstr]) -> Opt
|
||||
}
|
||||
|
||||
if opts.tokenize {
|
||||
let mut tok = Tokenizer::new(&buff, TOK_ACCEPT_UNFINISHED);
|
||||
let mut tok = Tokenizer::new(&buff, TOK_ACCEPT_UNFINISHED | TOK_ARGUMENT_LIST);
|
||||
if opts.array {
|
||||
// Array mode: assign each token as a separate element of the sole var.
|
||||
let mut tokens = vec![];
|
||||
|
@ -13,6 +13,7 @@ use crate::{
|
||||
ast::unescape_keyword,
|
||||
common::charptr2wcstring,
|
||||
reader::{get_quote, is_backslashed},
|
||||
tokenizer::is_brace_statement,
|
||||
util::wcsfilecmp,
|
||||
wutil::sprintf,
|
||||
};
|
||||
@ -663,7 +664,20 @@ impl<'ctx> Completer<'ctx> {
|
||||
|
||||
// Get all the arguments.
|
||||
let mut tokens = Vec::new();
|
||||
parse_util_process_extent(&cmdline, position_in_statement, Some(&mut tokens));
|
||||
{
|
||||
let proc_range =
|
||||
parse_util_process_extent(&cmdline, position_in_statement, Some(&mut tokens));
|
||||
let start = proc_range.start;
|
||||
if start != 0
|
||||
&& cmdline.as_char_slice()[start - 1] == '{'
|
||||
&& (start == cmdline.len()
|
||||
|| !is_brace_statement(cmdline.as_char_slice().get(start).copied()))
|
||||
{
|
||||
// We don't want to suggest commands here, since this command line parses as
|
||||
// brace expansion.
|
||||
return;
|
||||
}
|
||||
}
|
||||
let actual_token_count = tokens.len();
|
||||
|
||||
// Hack: fix autosuggestion by removing prefixing "and"s #6249.
|
||||
|
@ -1,8 +1,9 @@
|
||||
//! Functions for syntax highlighting.
|
||||
use crate::abbrs::{self, with_abbrs};
|
||||
use crate::ast::{
|
||||
self, Argument, Ast, BlockStatement, BlockStatementHeaderVariant, DecoratedStatement, Keyword,
|
||||
Leaf, List, Node, NodeVisitor, Redirection, Token, Type, VariableAssignment,
|
||||
self, Argument, Ast, BlockStatement, BlockStatementHeaderVariant, BraceStatement,
|
||||
DecoratedStatement, Keyword, Leaf, List, Node, NodeVisitor, Redirection, Token, Type,
|
||||
VariableAssignment,
|
||||
};
|
||||
use crate::builtins::shared::builtin_exists;
|
||||
use crate::color::RgbColor;
|
||||
@ -869,6 +870,9 @@ impl<'s> Highlighter<'s> {
|
||||
ParseTokenType::end | ParseTokenType::pipe | ParseTokenType::background => {
|
||||
role = HighlightRole::statement_terminator
|
||||
}
|
||||
ParseTokenType::left_brace | ParseTokenType::right_brace => {
|
||||
role = HighlightRole::keyword;
|
||||
}
|
||||
ParseTokenType::andand | ParseTokenType::oror => role = HighlightRole::operat,
|
||||
ParseTokenType::string => {
|
||||
// Assume all strings are params. This handles e.g. the variables a for header or
|
||||
@ -1063,6 +1067,12 @@ impl<'s> Highlighter<'s> {
|
||||
self.visit(&block.end);
|
||||
self.pending_variables.truncate(pending_variables_count);
|
||||
}
|
||||
fn visit_brace_statement(&mut self, brace_statement: &BraceStatement) {
|
||||
self.visit(&brace_statement.left_brace);
|
||||
self.visit(&brace_statement.args_or_redirs);
|
||||
self.visit(&brace_statement.jobs);
|
||||
self.visit(&brace_statement.right_brace);
|
||||
}
|
||||
}
|
||||
|
||||
/// Return whether a string contains a command substitution.
|
||||
@ -1121,6 +1131,7 @@ impl<'s, 'a> NodeVisitor<'a> for Highlighter<'s> {
|
||||
self.visit_decorated_statement(node.as_decorated_statement().unwrap())
|
||||
}
|
||||
Type::block_statement => self.visit_block_statement(node.as_block_statement().unwrap()),
|
||||
Type::brace_statement => self.visit_brace_statement(node.as_brace_statement().unwrap()),
|
||||
// Default implementation is to just visit children.
|
||||
_ => self.visit_children(node),
|
||||
}
|
||||
|
@ -66,6 +66,8 @@ pub enum ParseTokenType {
|
||||
// Terminal types.
|
||||
string,
|
||||
pipe,
|
||||
left_brace,
|
||||
right_brace,
|
||||
redirection,
|
||||
background,
|
||||
andand,
|
||||
@ -135,6 +137,7 @@ pub enum ParseErrorCode {
|
||||
unbalancing_end, // end outside of block
|
||||
unbalancing_else, // else outside of if
|
||||
unbalancing_case, // case outside of switch
|
||||
unbalancing_brace, // } outside of {
|
||||
bare_variable_assignment, // a=b without command
|
||||
andor_in_pipeline, // "and" or "or" after a pipe
|
||||
}
|
||||
@ -207,6 +210,8 @@ impl ParseTokenType {
|
||||
ParseTokenType::background => L!("ParseTokenType::background"),
|
||||
ParseTokenType::end => L!("ParseTokenType::end"),
|
||||
ParseTokenType::pipe => L!("ParseTokenType::pipe"),
|
||||
ParseTokenType::left_brace => L!("ParseTokenType::lbrace"),
|
||||
ParseTokenType::right_brace => L!("ParseTokenType::rbrace"),
|
||||
ParseTokenType::redirection => L!("ParseTokenType::redirection"),
|
||||
ParseTokenType::string => L!("ParseTokenType::string"),
|
||||
ParseTokenType::andand => L!("ParseTokenType::andand"),
|
||||
@ -426,6 +431,8 @@ pub fn token_type_user_presentable_description(
|
||||
ParseTokenType::pipe => L!("a pipe").to_owned(),
|
||||
ParseTokenType::redirection => L!("a redirection").to_owned(),
|
||||
ParseTokenType::background => L!("a '&'").to_owned(),
|
||||
ParseTokenType::left_brace => L!("a '{'").to_owned(),
|
||||
ParseTokenType::right_brace => L!("a '}'").to_owned(),
|
||||
ParseTokenType::andand => L!("'&&'").to_owned(),
|
||||
ParseTokenType::oror => L!("'||'").to_owned(),
|
||||
ParseTokenType::end => L!("end of the statement").to_owned(),
|
||||
@ -529,7 +536,3 @@ pub const ERROR_BAD_COMMAND_ASSIGN_ERR_MSG: &str =
|
||||
/// Error message for a command like `time foo &`.
|
||||
pub const ERROR_TIME_BACKGROUND: &str =
|
||||
"'time' is not supported for background jobs. Consider using 'command time'.";
|
||||
|
||||
/// Error issued on { echo; echo }.
|
||||
pub const ERROR_NO_BRACE_GROUPING: &str =
|
||||
"'{ ... }' is not supported for grouping commands. Please use 'begin; ...; end'";
|
||||
|
@ -28,9 +28,9 @@ use crate::job_group::JobGroup;
|
||||
use crate::operation_context::OperationContext;
|
||||
use crate::parse_constants::{
|
||||
parse_error_offset_source_start, ParseError, ParseErrorCode, ParseErrorList, ParseKeyword,
|
||||
ParseTokenType, StatementDecoration, CALL_STACK_LIMIT_EXCEEDED_ERR_MSG,
|
||||
ERROR_NO_BRACE_GROUPING, ERROR_TIME_BACKGROUND, FAILED_EXPANSION_VARIABLE_NAME_ERR_MSG,
|
||||
ILLEGAL_FD_ERR_MSG, INFINITE_FUNC_RECURSION_ERR_MSG, WILDCARD_ERR_MSG,
|
||||
ParseTokenType, StatementDecoration, CALL_STACK_LIMIT_EXCEEDED_ERR_MSG, ERROR_TIME_BACKGROUND,
|
||||
FAILED_EXPANSION_VARIABLE_NAME_ERR_MSG, ILLEGAL_FD_ERR_MSG, INFINITE_FUNC_RECURSION_ERR_MSG,
|
||||
WILDCARD_ERR_MSG,
|
||||
};
|
||||
use crate::parse_tree::{LineCounter, NodeRef, ParsedSourceRef};
|
||||
use crate::parse_util::parse_util_unescape_wildcards;
|
||||
@ -162,6 +162,9 @@ impl<'a> ExecutionContext {
|
||||
StatementVariant::BlockStatement(block) => {
|
||||
self.run_block_statement(ctx, block, associated_block)
|
||||
}
|
||||
StatementVariant::BraceStatement(brace_statement) => {
|
||||
self.run_begin_statement(ctx, &brace_statement.jobs)
|
||||
}
|
||||
StatementVariant::IfStatement(ifstat) => {
|
||||
self.run_if_statement(ctx, ifstat, associated_block)
|
||||
}
|
||||
@ -363,10 +366,6 @@ impl<'a> ExecutionContext {
|
||||
}
|
||||
}
|
||||
|
||||
if cmd.as_char_slice().first() == Some(&'{' /*}*/) {
|
||||
error.push_utfstr(&wgettext!(ERROR_NO_BRACE_GROUPING));
|
||||
}
|
||||
|
||||
// Here we want to report an error (so it shows a backtrace).
|
||||
// If the handler printed text, that's already shown, so error will be empty.
|
||||
report_error_formatted!(
|
||||
@@ -569,6 +568,7 @@ impl<'a> ExecutionContext {
        // type safety (in case we add more specific statement types).
        match &job.statement.contents {
            StatementVariant::BlockStatement(stmt) => no_redirs(&stmt.args_or_redirs),
            StatementVariant::BraceStatement(stmt) => no_redirs(&stmt.args_or_redirs),
            StatementVariant::SwitchStatement(stmt) => no_redirs(&stmt.args_or_redirs),
            StatementVariant::IfStatement(stmt) => no_redirs(&stmt.args_or_redirs),
            StatementVariant::NotStatement(_) | StatementVariant::DecoratedStatement(_) => {
@@ -688,6 +688,7 @@ impl<'a> ExecutionContext {
                self.populate_not_process(ctx, job, proc, not_statement)
            }
            StatementVariant::BlockStatement(_)
            | StatementVariant::BraceStatement(_)
            | StatementVariant::IfStatement(_)
            | StatementVariant::SwitchStatement(_) => {
                self.populate_block_process(ctx, proc, statement, specific_statement)
@@ -852,6 +853,7 @@ impl<'a> ExecutionContext {
        // TODO: args_or_redirs should be available without resolving the statement type.
        let args_or_redirs = match specific_statement {
            StatementVariant::BlockStatement(block_statement) => &block_statement.args_or_redirs,
            StatementVariant::BraceStatement(brace_statement) => &brace_statement.args_or_redirs,
            StatementVariant::IfStatement(if_statement) => &if_statement.args_or_redirs,
            StatementVariant::SwitchStatement(switch_statement) => &switch_statement.args_or_redirs,
            _ => panic!("Unexpected block node type"),
@@ -1593,6 +1595,9 @@ impl<'a> ExecutionContext {
            StatementVariant::BlockStatement(block_statement) => {
                self.run_block_statement(ctx, block_statement, associated_block)
            }
            StatementVariant::BraceStatement(brace_statement) => {
                self.run_begin_statement(ctx, &brace_statement.jobs)
            }
            StatementVariant::IfStatement(ifstmt) => {
                self.run_if_statement(ctx, ifstmt, associated_block)
            }
@@ -1923,6 +1928,7 @@ type AstArgsList<'a> = Vec<&'a ast::Argument>;
fn type_is_redirectable_block(typ: ast::Type) -> bool {
    [
        ast::Type::block_statement,
        ast::Type::brace_statement,
        ast::Type::if_statement,
        ast::Type::switch_statement,
    ]
@@ -1961,6 +1967,9 @@ fn profiling_cmd_name_for_redirectable_block(
                BlockStatementHeaderVariant::None => panic!("Unexpected block header type"),
            }
        }
        StatementVariant::BraceStatement(brace_statement) => {
            brace_statement.left_brace.source_range().start()
        }
        StatementVariant::IfStatement(ifstmt) => {
            ifstmt.if_clause.condition.job.source_range().end()
        }
@@ -93,6 +93,7 @@ impl From<TokenizerError> for ParseErrorCode {
            }
            TokenizerError::unterminated_slice => ParseErrorCode::tokenizer_unterminated_slice,
            TokenizerError::unterminated_escape => ParseErrorCode::tokenizer_unterminated_escape,
            // To-do: maybe also unbalancing brace?
            _ => ParseErrorCode::tokenizer_other,
        }
    }

@@ -415,6 +415,8 @@ fn job_or_process_extent(
            | TokenType::background
            | TokenType::andand
            | TokenType::oror
            | TokenType::left_brace
            | TokenType::right_brace
                if (token.type_ != TokenType::pipe || process) =>
            {
                if tok_begin >= pos {
@@ -1049,9 +1051,9 @@ impl<'a> NodeVisitor<'a> for IndentVisitor<'a> {
                dec = if switchs.end.has_source() { 1 } else { 0 };
            }
            Type::token_base => {
                if node.parent().unwrap().typ() == Type::begin_header
                    && node.as_token().unwrap().token_type() == ParseTokenType::end
                {
                let token_type = node.as_token().unwrap().token_type();
                let parent_type = node.parent().unwrap().typ();
                if parent_type == Type::begin_header && token_type == ParseTokenType::end {
                    // The newline after "begin" is optional, so it is part of the header.
                    // The header is not in the indented block, so indent the newline here.
                    if node.source(self.src) == "\n" {
@@ -1059,6 +1061,11 @@ impl<'a> NodeVisitor<'a> for IndentVisitor<'a> {
                        dec = 1;
                    }
                }
                // if token_type == ParseTokenType::right_brace && parent_type == Type::brace_statement
                // {
                //     inc = 1;
                //     dec = 1;
                // }
            }
            _ => (),
        }
@@ -1229,6 +1236,15 @@ pub fn parse_util_detect_errors_in_ast(
        }
        errored |=
            detect_errors_in_block_redirection_list(&block.args_or_redirs, &mut out_errors);
    } else if let Some(brace_statement) = node.as_brace_statement() {
        // If our closing brace had no source, we are unsourced.
        if !brace_statement.right_brace.has_source() {
            has_unclosed_block = true;
        }
        errored |= detect_errors_in_block_redirection_list(
            &brace_statement.args_or_redirs,
            &mut out_errors,
        );
    } else if let Some(ifs) = node.as_if_statement() {
        // If our 'end' had no source, we are unsourced.
        if !ifs.end.has_source() {
@@ -1780,15 +1796,28 @@ fn detect_errors_in_block_redirection_list(
    args_or_redirs: &ast::ArgumentOrRedirectionList,
    out_errors: &mut Option<&mut ParseErrorList>,
) -> bool {
    if let Some(first_arg) = get_first_arg(args_or_redirs) {
    let Some(first_arg) = get_first_arg(args_or_redirs) else {
        return false;
    };
    if args_or_redirs
        .parent()
        .unwrap()
        .as_brace_statement()
        .is_some()
    {
        return append_syntax_error!(
            out_errors,
            first_arg.source_range().start(),
            first_arg.source_range().length(),
            END_ARG_ERR_MSG
            RIGHT_BRACE_ARG_ERR_MSG
        );
    }
    false
    append_syntax_error!(
        out_errors,
        first_arg.source_range().start(),
        first_arg.source_range().length(),
        END_ARG_ERR_MSG
    )
}
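The `detect_errors_in_block_redirection_list` rewrite above replaces a nested `if let` with Rust's `let ... else` early-return form, which keeps the happy path unindented. A minimal standalone illustration of the idiom (hypothetical function, not from the fish codebase):

```rust
// let-else: bind the value or bail out early, instead of nesting the
// whole function body inside an `if let` block.
fn first_even_times_ten(nums: &[i32]) -> Option<i32> {
    let Some(&first) = nums.iter().find(|&&n| n % 2 == 0) else {
        return None; // no even number: early exit, mirroring `return false;`
    };
    Some(first * 10)
}

fn main() {
    assert_eq!(first_even_times_ten(&[1, 4, 5]), Some(40));
    assert_eq!(first_even_times_ten(&[1, 3]), None);
    println!("ok");
}
```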

/// Given a string containing a variable expansion error, append an appropriate error to the errors
@@ -1898,6 +1927,7 @@ const BACKGROUND_IN_CONDITIONAL_ERROR_MSG: &str =

/// Error message for arguments to 'end'
const END_ARG_ERR_MSG: &str = "'end' does not take arguments. Did you forget a ';'?";
const RIGHT_BRACE_ARG_ERR_MSG: &str = "'}' does not take arguments. Did you forget a ';'?";

/// Error message when 'time' is in a pipeline.
const TIME_IN_PIPELINE_ERR_MSG: &str =

@@ -299,6 +299,22 @@ fn test_parser() {
        detect_errors!("true || \n") == Err(ParserTestErrorBits::INCOMPLETE),
        "unterminated conjunction not reported properly"
    );

    assert!(
        detect_errors!("begin ; echo hi; }") == Err(ParserTestErrorBits::ERROR),
        "closing of unopened brace statement not reported properly"
    );

    assert_eq!(
        detect_errors!("begin {"), // }
        Err(ParserTestErrorBits::INCOMPLETE),
        "brace after begin not reported properly"
    );
    assert_eq!(
        detect_errors!("a=b {"), // }
        Err(ParserTestErrorBits::INCOMPLETE),
        "brace after variable override not reported properly"
    );
}

#[test]
@@ -604,6 +620,8 @@ fn test_new_parser_errors() {
    validate!("case", ParseErrorCode::unbalancing_case);
    validate!("if true ; case ; end", ParseErrorCode::unbalancing_case);

    validate!("begin ; }", ParseErrorCode::unbalancing_brace);

    validate!("true | and", ParseErrorCode::andor_in_pipeline);

    validate!("a=", ParseErrorCode::bare_variable_assignment);

@@ -31,6 +31,43 @@ fn test_tokenizer() {
        assert!(t.next().is_none());
    }

    {
        let s = L!("{ echo");
        let mut t = Tokenizer::new(s, TokFlags(0));

        let token = t.next(); // {
        assert!(token.is_some());
        let token = token.unwrap();
        assert_eq!(token.type_, TokenType::left_brace);
        assert_eq!(token.length, 1);
        assert_eq!(t.text_of(&token), "{");

        let token = t.next(); // echo
        assert!(token.is_some());
        let token = token.unwrap();
        assert_eq!(token.type_, TokenType::string);
        assert_eq!(token.offset, 2);
        assert_eq!(token.length, 4);
        assert_eq!(t.text_of(&token), "echo");

        assert!(t.next().is_none());
    }

    {
        let s = L!("{echo, foo}");
        let mut t = Tokenizer::new(s, TokFlags(0));
        let token = t.next().unwrap();
        assert_eq!(token.type_, TokenType::string);
        assert_eq!(token.length, 11);
        assert!(t.next().is_none());
    }
    {
        let s = L!("{ echo; foo}");
        let mut t = Tokenizer::new(s, TokFlags(0));
        let token = t.next().unwrap();
        assert_eq!(token.type_, TokenType::left_brace);
    }

    let s = L!(concat!(
        "string <redirection 2>&1 'nested \"quoted\" '(string containing subshells ",
        "){and,brackets}$as[$well (as variable arrays)] not_a_redirect^ ^ ^^is_a_redirect ",

src/tokenizer.rs (101 changed lines)
@@ -1,14 +1,16 @@
//! A specialized tokenizer for tokenizing the fish language. In the future, the tokenizer should be
//! extended to support marks, tokenizing multiple strings and disposing of unused string segments.

use crate::ast::unescape_keyword;
use crate::common::valid_var_name_char;
use crate::future_feature_flags::{feature_test, FeatureFlag};
use crate::parse_constants::SOURCE_OFFSET_INVALID;
use crate::parser_keywords::parser_keywords_is_subcommand;
use crate::redirection::RedirectionMode;
use crate::wchar::prelude::*;
use libc::{STDIN_FILENO, STDOUT_FILENO};
use nix::fcntl::OFlag;
use std::ops::{BitAnd, BitAndAssign, BitOr, BitOrAssign, Not};
use std::ops::{BitAnd, BitAndAssign, BitOr, BitOrAssign, Not, Range};
use std::os::fd::RawFd;

/// Token types. XXX Why this isn't ParseTokenType, I'm not really sure.
@@ -26,6 +28,10 @@ pub enum TokenType {
    oror,
    /// End token (semicolon or newline, not literal end)
    end,
    /// opening brace of a compound statement
    left_brace,
    /// closing brace of a compound statement
    right_brace,
    /// redirection token
    redirect,
    /// send job to bg token
@@ -146,6 +152,10 @@ pub const TOK_SHOW_BLANK_LINES: TokFlags = TokFlags(4);
/// Make an effort to continue after an error.
pub const TOK_CONTINUE_AFTER_ERROR: TokFlags = TokFlags(8);

/// Consumers want to treat all tokens as arguments, so disable special handling at
/// command-position.
pub const TOK_ARGUMENT_LIST: TokFlags = TokFlags(16);

impl From<TokenizerError> for &'static wstr {
    fn from(err: TokenizerError) -> Self {
        match err {
@@ -178,7 +188,7 @@ impl From<TokenizerError> for &'static wstr {
                wgettext!("Unexpected '[' at this location")
            }
            TokenizerError::closing_unopened_brace => {
                wgettext!("Unexpected '}' for unopened brace expansion")
                wgettext!("Unexpected '}' for unopened brace")
            }
            TokenizerError::unterminated_brace => {
                wgettext!("Unexpected end of string, incomplete parameter expansion")
@@ -234,6 +244,9 @@ impl Tok {
    pub fn end(&self) -> usize {
        self.offset() + self.length()
    }
    pub fn range(&self) -> Range<usize> {
        self.offset()..self.end()
    }
    pub fn set_error_offset_within_token(&mut self, value: usize) {
        self.error_offset_within_token = value.try_into().unwrap();
    }
@@ -248,6 +261,11 @@ impl Tok {
    }
}

struct BraceStatementParser {
    at_command_position: bool,
    unclosed_brace_statements: usize,
}

/// The tokenizer struct.
pub struct Tokenizer<'c> {
    /// A pointer into the original string, showing where the next token begins.
@@ -256,6 +274,8 @@ pub struct Tokenizer<'c> {
    start: &'c wstr,
    /// Whether we have additional tokens.
    has_next: bool,
    /// Parser state regarding brace statements. None if reading an argument list.
    brace_statement_parser: Option<BraceStatementParser>,
    /// Whether incomplete tokens are accepted.
    accept_unfinished: bool,
    /// Whether comments should be returned.
@@ -270,6 +290,10 @@ pub struct Tokenizer<'c> {
    on_quote_toggle: Option<&'c mut dyn FnMut(usize)>,
}

pub(crate) fn is_brace_statement(next_char: Option<char>) -> bool {
    next_char.map_or(true, |next| next.is_ascii_whitespace() || next == ';')
}
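The `is_brace_statement` helper above captures the backwards-compatibility hack described in the commit message: a `{` only opens a brace statement when it is followed by whitespace, a `;`, or end of input; otherwise it is still a brace expansion. Combined with the tokenizer's command-position tracking, the decision can be sketched as a standalone function (hypothetical name, not the actual fish API):

```rust
// Sketch of the heuristic: '{' starts a compound statement only at command
// position AND when the next character suggests "{ ... }" rather than "{a,b}".
fn starts_brace_statement(at_command_position: bool, next_char: Option<char>) -> bool {
    at_command_position
        && next_char.map_or(true, |c| c.is_ascii_whitespace() || c == ';')
}

fn main() {
    assert!(starts_brace_statement(true, Some(' ')));   // "{ echo }"
    assert!(starts_brace_statement(true, Some(';')));   // "{;echo ...}"
    assert!(starts_brace_statement(true, None));        // "{" at end of input
    assert!(!starts_brace_statement(true, Some('a')));  // "{a,b}" expansion
    assert!(!starts_brace_statement(false, Some(' '))); // argument position
    println!("ok");
}
```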

impl<'c> Tokenizer<'c> {
    /// Constructor for a tokenizer. b is the string that is to be tokenized. It is not copied, and
    /// should not be freed by the caller until after the tokenizer is destroyed.
@@ -297,6 +321,12 @@ impl<'c> Tokenizer<'c> {
            token_cursor: 0,
            start,
            has_next: true,
            brace_statement_parser: (!(flags & TOK_ARGUMENT_LIST)).then_some(
                BraceStatementParser {
                    at_command_position: true,
                    unclosed_brace_statements: 0,
                },
            ),
            accept_unfinished: flags & TOK_ACCEPT_UNFINISHED,
            show_comments: flags & TOK_SHOW_COMMENTS,
            show_blank_lines: flags & TOK_SHOW_BLANK_LINES,
@@ -368,7 +398,8 @@ impl<'c> Iterator for Tokenizer<'c> {
            .get(self.token_cursor + 1)
            .copied();
        let buff = &self.start[self.token_cursor..];
        match this_char {
        let mut at_cmd_pos = false;
        let token = match this_char {
            '\0' => {
                self.has_next = false;
                None
@@ -380,6 +411,7 @@ impl<'c> Iterator for Tokenizer<'c> {
                result.offset = start_pos as u32;
                result.length = 1;
                self.token_cursor += 1;
                at_cmd_pos = true;
                // Hack: when we get a newline, swallow as many as we can. This compresses multiple
                // subsequent newlines into a single one.
                if !self.show_blank_lines {
@@ -393,6 +425,38 @@ impl<'c> Iterator for Tokenizer<'c> {
                }
                Some(result)
            }
            '{' if self.brace_statement_parser.as_ref()
                .is_some_and(|parser| parser.at_command_position)
                && is_brace_statement(self.start.as_char_slice().get(self.token_cursor + 1).copied())
                =>
            {
                self.brace_statement_parser.as_mut().unwrap().unclosed_brace_statements += 1;
                let mut result = Tok::new(TokenType::left_brace);
                result.offset = start_pos as u32;
                result.length = 1;
                self.token_cursor += 1;
                at_cmd_pos = true;
                Some(result)
            }
            '}' => {
                let brace_count = self.brace_statement_parser.as_mut()
                    .map(|parser| &mut parser.unclosed_brace_statements);
                if brace_count.as_ref().map_or(true, |count| **count == 0) {
                    return Some(self.call_error(
                        TokenizerError::closing_unopened_brace,
                        self.token_cursor,
                        self.token_cursor,
                        Some(1),
                        1,
                    ));
                }
                brace_count.map(|count| *count -= 1);
                let mut result = Tok::new(TokenType::right_brace);
                result.offset = start_pos as u32;
                result.length = 1;
                self.token_cursor += 1;
                Some(result)
            }
            '&' => {
                if next_char == Some('&') {
                    // && is and.
@@ -400,6 +464,7 @@ impl<'c> Iterator for Tokenizer<'c> {
                    result.offset = start_pos as u32;
                    result.length = 2;
                    self.token_cursor += 2;
                    at_cmd_pos = true;
                    Some(result)
                } else if next_char == Some('>') || next_char == Some('|') {
                    // &> and &| redirect both stdout and stderr.
@@ -409,12 +474,14 @@ impl<'c> Iterator for Tokenizer<'c> {
                    result.offset = start_pos as u32;
                    result.length = redir.consumed as u32;
                    self.token_cursor += redir.consumed;
                    at_cmd_pos = next_char == Some('|');
                    Some(result)
                } else {
                    let mut result = Tok::new(TokenType::background);
                    result.offset = start_pos as u32;
                    result.length = 1;
                    self.token_cursor += 1;
                    at_cmd_pos = true;
                    Some(result)
                }
            }
@@ -425,6 +492,7 @@ impl<'c> Iterator for Tokenizer<'c> {
                    result.offset = start_pos as u32;
                    result.length = 2;
                    self.token_cursor += 2;
                    at_cmd_pos = true;
                    Some(result)
                } else if next_char == Some('&') {
                    // |& is a bashism; in fish it's &|.
@@ -437,6 +505,7 @@ impl<'c> Iterator for Tokenizer<'c> {
                    result.offset = start_pos as u32;
                    result.length = pipe.consumed as u32;
                    self.token_cursor += pipe.consumed;
                    at_cmd_pos = true;
                    Some(result)
                }
            }
@@ -489,16 +558,31 @@ impl<'c> Iterator for Tokenizer<'c> {
                    result.offset = start_pos as u32;
                    result.length = redir_or_pipe.consumed as u32;
                    self.token_cursor += redir_or_pipe.consumed;
                    at_cmd_pos = redir_or_pipe.is_pipe;
                    Some(result)
                }
            }
            None => {
                // Not a redirection or pipe, so just a string.
                Some(self.read_string())
                let s = self.read_string();
                at_cmd_pos = self.brace_statement_parser.as_ref()
                    .is_some_and(|parser| parser.at_command_position) && {
                        let text = self.text_of(&s);
                        parser_keywords_is_subcommand(&unescape_keyword(
                            TokenType::string,
                            text,
                        )) || variable_assignment_equals_pos(text).is_some()
                    };
                Some(s)
            }
        }
        }
        };
        if let Some(parser) = self.brace_statement_parser.as_mut() {
            parser.at_command_position = at_cmd_pos;
        }
        token
    }
}
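The `'{'` and `'}'` arms above treat `unclosed_brace_statements` as a plain counter: an unmatched `}` is rejected with `closing_unopened_brace`, while a positive count left over at end of input means the statement is incomplete ("Expected a '}'"). A standalone sketch of that bookkeeping (hypothetical types, not from the commit):

```rust
// Hypothetical standalone model of the brace-statement counter.
#[derive(Debug, PartialEq)]
enum BraceEvent {
    Open,  // tokenizer emitted left_brace
    Close, // tokenizer emitted right_brace
}

// Returns the number of still-open brace statements, or an error for an
// unmatched close (analogous to TokenizerError::closing_unopened_brace).
fn check_balance(events: &[BraceEvent]) -> Result<usize, &'static str> {
    let mut unclosed = 0usize;
    for e in events {
        match e {
            BraceEvent::Open => unclosed += 1,
            BraceEvent::Close => {
                if unclosed == 0 {
                    return Err("Unexpected '}' for unopened brace");
                }
                unclosed -= 1;
            }
        }
    }
    Ok(unclosed) // > 0 means the input is incomplete
}

fn main() {
    use BraceEvent::*;
    assert_eq!(check_balance(&[Open, Close]), Ok(0)); // { ... }
    assert_eq!(check_balance(&[Open]), Ok(1));        // incomplete input
    assert!(check_balance(&[Close]).is_err());        // begin; }
    println!("ok");
}
```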

@@ -675,13 +759,8 @@ impl<'c> Tokenizer<'c> {
                    );
                }
                if brace_offsets.pop().is_none() {
                    return self.call_error(
                        TokenizerError::closing_unopened_brace,
                        self.token_cursor,
                        self.token_cursor,
                        Some(1),
                        1,
                    );
                    // Let the caller throw an error.
                    break;
                }
                if brace_offsets.is_empty() {
                    mode &= !TOK_MODE_CURLY_BRACES;

@@ -51,3 +51,158 @@ end

echo {a(echo ,)b}
#CHECK: {a,b}

e{cho,cho,cho}
# CHECK: echo echo

## Compound commands

{ echo compound; echo command; }
# CHECK: compound
# CHECK: command

{;echo -n start with\ ; echo semi; }
# CHECK: start with semi

{ echo no semi }
# CHECK: no semi

# Ambiguous cases

{ echo ,comma;}
# CHECK: ,comma

PATH= {echo no space}
# CHECKERR: fish: Unknown command: '{echo no space}'
# CHECKERR: {{.*}}/braces.fish (line {{\d+}}):
# CHECKERR: PATH= {echo no space}
# CHECKERR: ^~~~~~~~~~~~~~^

PATH= {echo comma, no space;}
# CHECKERR: fish: Unknown command: 'echo comma'
# CHECKERR: {{.*}}/braces.fish (line {{\d+}}):
# CHECKERR: PATH= {echo comma, no space;}
# CHECKERR: ^~~~~~~~~~~~~~~~~~~~~~^

# Ambiguous case with no space
{echo,hello}
# CHECK: hello

# Trailing tokens
set -l fish (status fish-path)
$fish -c '{ :; } true'
# CHECKERR: fish: '}' does not take arguments. Did you forget a ';'?
# CHECKERR: { :; } true
# CHECKERR: ^~~^

; { echo semi; }
# CHECK: semi

a=b { echo $a; }
# CHECK: b

time { :; }
# CHECKERR:
# CHECKERR: {{_+}}
# CHECKERR: Executed in {{.*}}
# CHECKERR: usr time {{.*}}
# CHECKERR: sys time {{.*}}

true & { echo background; }
# CHECK: background

true && { echo conjunction; }
# CHECK: conjunction

true; and { echo and; }
# CHECK: and

true | { echo pipe; }
# CHECK: pipe

true 2>| { echo stderrpipe; }
# CHECK: stderrpipe

false || { echo disjunction; }
# CHECK: disjunction

false; or { echo or; }
# CHECK: or

begin { echo begin }
end
# CHECK: begin

not { false; true }
echo $status
# CHECK: 1

! { false }
echo $status
# CHECK: 0

if { set -l a true; $a && true }
    echo if-true
end
# CHECK: if-true

{
    set -l condition true
    while $condition
        {
            echo while
            set condition false
        }
    end
}
# CHECK: while

{ { echo inner}
echo outer}
# CHECK: inner
# CHECK: outer

{

    echo leading blank lines
}
# CHECK: leading blank lines

complete foo -a '123 456'
complete -C 'foo {' | sed 1q
# CHECK: {{\{.*}}

complete -C '{'
echo nothing
# CHECK: nothing
complete -C '{ ' | grep ^if\t
# CHECK: if{{\t}}Evaluate block if condition is true

$fish -c '{'
# CHECKERR: fish: Expected a '}', but found end of the input

PATH= "{"
# CHECKERR: fish: Unknown command: '{'
# CHECKERR: {{.*}}/braces.fish (line {{\d+}}):
# CHECKERR: PATH= "{"
# CHECKERR: ^~^

$fish -c 'builtin {'
# CHECKERR: fish: Expected end of the statement, but found a '{'
# CHECKERR: builtin {
# CHECKERR: ^

$fish -c 'command {'
# CHECKERR: fish: Expected end of the statement, but found a '{'
# CHECKERR: command {
# CHECKERR: ^

$fish -c 'exec {'
# CHECKERR: fish: Expected end of the statement, but found a '{'
# CHECKERR: exec {
# CHECKERR: ^

$fish -c 'begin; }'
# CHECKERR: fish: Unexpected '}' for unopened brace
# CHECKERR: begin; }
# CHECKERR: ^

@@ -25,13 +25,6 @@ command -v nonexistent-command-1234
echo $status
#CHECK: 127


{ echo; echo }
# CHECKERR: {{.*}}: Unknown command: '{ echo; echo }'
# CHECKERR: {{.*}}: '{ ... }' is not supported for grouping commands. Please use 'begin; ...; end'
# CHECKERR: { echo; echo }
# CHECKERR: ^~~~~~~~~~~~~^

set -g PATH .
echo banana > foobar
foobar --banana

@@ -321,7 +321,7 @@ $fish -c 'echo {'
#CHECKERR: echo {
#CHECKERR: ^
$fish -c 'echo {}}'
#CHECKERR: fish: Unexpected '}' for unopened brace expansion
#CHECKERR: fish: Unexpected '}' for unopened brace
#CHECKERR: echo {}}
#CHECKERR: ^
printf '<%s>\n' ($fish -c 'command (asd)' 2>&1)

@@ -424,6 +424,66 @@ echo 'begin
# CHECK: {{^}} first-indented-word \
# CHECK: {{^}} second-indented-word

{
echo '{ no semi }'
# CHECK: { no semi }
echo '{ semi; }'
# CHECK: { semi; }

echo '{ multi; no semi }'
# CHECK: { multi; no semi }
echo '{ multi; semi; }'
# CHECK: { multi; semi; }

echo '{ conj && no semi }'
# CHECK: { conj && no semi }
echo '{ conj && semi; }'
# CHECK: { conj && semi; }

echo '{ }'
# CHECK: { }
echo '{ ; }'
# CHECK: { }

echo '
{
echo \\
# continuation comment
}'
# CHECK: {
# CHECK: {{^ }}echo \
# CHECK: {{^ }}# continuation comment
# TODO: This is currently broken; see the begin/end equivalent.
# CHECK: {{^ [}]}}

echo '{ { } }'
# CHECK: { { } }

echo '
{

{
}

}
'
# CHECK: {{^\{$}}
# CHECK: {{^ \{$}}
# CHECK: {{^ \}$}}
# CHECK: {{^\}$}}

echo '
{ level 1; {
level 2 } }
'
# TODO: Should add a line break here.
# CHECK: {{^{ level 1$}}
# CHECK: {{^ \{$}}
# CHECK: {{^ level 2$}}
# CHECK: {{^ \}$}}
# CHECK: {{^\}$}}
} | $fish_indent

echo 'multiline-\\
-word' | $fish_indent --check
echo $status #CHECK: 0