.TH SH-REGEX 1
.SH NAME
re, match \- shell script regular expression handling
.SH SYNOPSIS
.B load regex

.B match
.I regex
[
.IR arg ...
]
.br
.B ${re
.I op
.IR arg ...
.B }
.br
.SH DESCRIPTION
.I Regex
is a loadable module for
.IR sh (1)
that provides access to regular-expression
pattern matching and substitution.
For details of regular expression syntax in Inferno,
see
.IR regexp (6).
.I Regex
defines one builtin command,
.BR match ,
and one builtin substitution operator,
.BR re .
.B Match
gives a false exit status if its argument
.I regex
fails to match any
.IR arg .
.B Re
provides several operations, detailed below:
.TP 10
\f5${re g\fP \fIregexp\fP \fR[\fP \fIarg\fP\fR...\fP\fR]\fP\f5}\fP
Yields a list of each
.I arg
that matches
.IR regexp .
.TP
\f5${re v\fP \fIregexp\fP \fR[\fP \fIarg\fP\fR...\fP\fR]\fP\f5}\fP
Yields a list of each
.I arg
that does not match
.IR regexp .
.TP
\f5${re m\fP \fIregexp\fP \fIarg\fP\f5}\fP
Yields the portion of
.I arg
that matches
.IR regexp ,
or an empty list if there was no match.
.TP
\f5${re M\fP \fIregexp\fP \fIarg\fP\f5}\fP
Yields a list consisting of the portion
of 
.I arg
that matches
.IR regexp ,
followed by list elements giving the portion
of
.I arg
that matched each parenthesized subexpression
in turn.
.TP
\f5${re mg\fP \fIregexp\fP \fIarg\fP\f5}\fP
Similar to
.B re m
except that it applies the match consecutively
through
.IR arg ,
yielding a list of all the portions of
.I arg
that match
.IR regexp .
If a match is made to the null string,
no further matches are attempted.
.TP
\f5${re s\fP \fIregexp\fP \fIsubs\fP [ \fIarg\fP... ]\f5}\fP
For each
.IR arg ,
.B re s
replaces the first match of
.I regexp
(if any) with
.IR subs .
If
.I subs
contains a sequence of the form
.BI \e d
where
.I d
is a single decimal digit,
the text matched by the
.IR d th
parenthesized subexpression in
.I regexp
will be substituted in its place.
.B \e0
is substituted by the entire match.
If any other character follows a
backslash
.RB ( \e ),
that character will be substituted.
Arguments that contain no match for
.I regexp
are left unchanged.
.TP
\f5${re sg\fP \fIregexp\fP \fIsubs\fP [ \fIarg\fP... ]\f5}\fP
Similar to
.B re s
except that every match of
.I regexp
within each
.I arg
is replaced, rather than just the
first. Only one occurrence of the null string is
substituted.
.PP
.SH EXAMPLES
List all files in the current directory whose
names end in
.B .dis
or
.BR .sbl :
.EX
	ls -l ${re g '\e.(sbl|dis)$' *}
.EE
.PP
Break
.I string
up into its constituent characters,
putting the result in shell variable
.BR x :
.EX
	x = ${re mg '.|\en' \fIstring\fP}
.EE
.PP
Quote a string
.B s
so that it can be used as
a literal regular expression without worrying
about metacharacters:
.EX
	s = ${re sg '[*|[\e\e+.^$()?]' '\e\e\e0' $s}
.EE
.PP
Define a substitution function
.B pat2regexp
to convert shell-style
patterns into equivalent regular expressions
(e.g.
.RB `` ?.sbl* ''
would become
.RB `` ^.\e.sbl.*$ ''):
.EX
	load std
	subfn pat2regexp {
		result = '^' ^ ${re sg '\e*' '.*'
			${re sg '\e?' '.'
				${re sg '[()+\e\e.^$|]' '\e\e\e0' $*}
			}
		} ^ '$'
	}
.EE
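.PP
As a further sketch (not taken from the module's source),
the following combines
.B match
with
.B "re M"
and
.BR "re s" .
It assumes that
.IR sh-std (1)
is loaded to provide
.BR if ;
the variable name
.B s
and the file name
.B foo.b
are illustrative only.
Going by the descriptions above, the first
.B echo
should print the whole match followed by the two
parenthesized parts
.RB ( width=80 ,
.B width
and
.BR 80 ),
and the second should print
.BR foo.dis :
.EX
	load std
	load regex
	s = 'width=80'
	if {match '^[a-z]+=[0-9]+$' $s} {
		echo ${re M '([a-z]+)=([0-9]+)' $s}
	}
	echo ${re s '(.*)\e.b$' '\e1.dis' foo.b}
.EE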
.SH SOURCE
.B /appl/cmd/sh/regex.b
.SH SEE ALSO
.IR regexp (6),
.IR regex (2),
.IR sh (1),
.IR string (2),
.IR sh-std (1)