austingroupbugs.net.rss.xml - sfeed_tests - sfeed tests and RSS and Atom files
(HTM) git clone git://git.codemadness.org/sfeed_tests
---
austingroupbugs.net.rss.xml (80837B)
---
1 <?xml version="1.0" encoding="utf-8"?>
2 <!-- RSS generated by Flaimo.com RSS Builder [2022-02-18 13:56:11] --> <rss version="2.0" xmlns:im="http://purl.org/rss/1.0/item-images/" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" >
3 <channel>
4 <docs>https://www.austingroupbugs.net/</docs>
5 <description>Austin Group Defect Tracker - ISSUES</description>
6 <link>https://www.austingroupbugs.net/</link>
7 <title>Austin Group Defect Tracker - ISSUES</title>
8 <image>
9 <title>Austin Group Defect Tracker - ISSUES</title>
10 <url>https://www.austingroupbugs.net/images/mantis_logo_button.gif</url>
11 <link>https://www.austingroupbugs.net/</link>
12 <description>Austin Group Defect Tracker - ISSUES</description>
13 </image>
14 <category>All Projects</category>
15 <ttl>10</ttl>
16 <sy:updatePeriod>hourly</sy:updatePeriod>
17 <sy:updateFrequency>1</sy:updateFrequency>
18 <sy:updateBase>2022-02-18T13:56:11+00:00</sy:updateBase>
19 <item>
20 <title>0001538: what -s is poorly described, uses the word "quit"</title>
21 <link>https://www.austingroupbugs.net/view.php?id=1538</link>
22 <description>On:<br />
23 <a href="https://pubs.opengroup.org/onlinepubs/9699919799/utilities/what.html">https://pubs.opengroup.org/onlinepubs/9699919799/utilities/what.html</a> [<a href="https://pubs.opengroup.org/onlinepubs/9699919799/utilities/what.html" target="_blank">^</a>]<br />
24 The -s option for what is described as so:<br />
25 Quit after finding the first occurrence of the pattern in each file.<br />
26 I find the usage of the word 'quit' here unfortunate, as it can be read as exiting or terminating.<br />
27 Both "in each file" and the behavior of what on various BSDs lead me to believe "quit" isn't the best way to describe the behavior of -s, as "what -s foo bar" on the BSDs doesn't quit after finding a pattern in foo (it also checks bar afterwards). If I understand correctly, this may be correct behavior according to the standard.<br />
28 <br />
29 In the man pages on NetBSD and OpenBSD, they describe -s as follows:<br />
30 If the -s option is specified, only the first occurrence of an identification string in each file is printed.<br />
31 On FreeBSD, the following phrasing is used:<br />
32 Stop searching each file after the first match.<br />
33 I also checked the phrasing in Solaris 10 and AIX 7.2 but don't have access to test actual behavior. (So I don't know if on SysV, "what -s foo bar" doesn't check bar if a pattern is found in foo)<br />
34 <br />
35 I don't know if this is an Issue 7 or Issue 8 sort of thing to fix, so I have this marked as Issue 7.</description>
36 <guid>https://www.austingroupbugs.net/view.php?id=1538</guid>
37 <author>andras_farkas <andras_farkas@example.com></author>
38 <comments>https://www.austingroupbugs.net/view.php?id=1538#bugnotes</comments>
39 </item>
40 <item>
41 <title>0001558: require [^...] in addition to [!...] for bracket expression negation</title>
42 <link>https://www.austingroupbugs.net/view.php?id=1558</link>
43 <description>(page/line numbers above are from Draft 2.1)<br />
44 <br />
45 There's a very unfortunate difference between sh/fnmatch globs and regexps in that [^...] is used in regexps (and most other places/languages) and [!...] in sh/fnmatch wildcard patterns (and [~...] in rc/es though that's not relevant here).<br />
46 <br />
47 The only reason is that the Bourne shell had decided to keep ^ as a pipe operator for backward compatibility with the Thompson shell, and since it's not an error there to have [ / ] unmatched, ^ could not be used as negation operator inside bracket expressions.<br />
48 <br />
49 But POSIX sh is not compatible with the Bourne shell, and POSIX does not allow ^ to be treated specially and leaves [^...] unspecified, more or less explicitly allowing ^ to be used as negation there, as has been the case for decades. Most sh/fnmatch implementations have now moved on and allow [^...] for negation.<br />
50 <br />
51 Among common sh implementations, the only exceptions that I know are ksh88 (and pdksh and derivatives) and bosh (that one still treats ^ as a pipe operator except with -o posix).<br />
52 <br />
53 All of bash, zsh, yash, dash, BSDs (except those like OpenBSD, MirBSD that use pdksh) allow [^...]. Most fnmatch() implementations do (including on OpenBSD making it a discrepancy between sh and fnmatch()/find...).</description>
54 <guid>https://www.austingroupbugs.net/view.php?id=1558</guid>
55 <author>stephane <stephane@example.com></author>
56 <comments>https://www.austingroupbugs.net/view.php?id=1558#bugnotes</comments>
57 </item>
58 <item>
59 <title>0001542: A certain example for puts, fputs, time, localtime, and localtime_r isn't consistent between them all</title>
60 <link>https://www.austingroupbugs.net/view.php?id=1542</link>
61 <description>This is a very minor issue:<br />
62 In some examples in the standard, the following text is used<br />
63 "There are %d minutes to the event.\n"<br />
64 while in other (often otherwise identical) examples, the following text is used<br />
65 "There are still %d minutes to the event.\n"<br />
66 <br />
67 It's easy to grep the standard for these instances, but they appear on:<br />
68 <a href="https://pubs.opengroup.org/onlinepubs/9699919799/functions/puts.html">https://pubs.opengroup.org/onlinepubs/9699919799/functions/puts.html</a> [<a href="https://pubs.opengroup.org/onlinepubs/9699919799/functions/puts.html" target="_blank">^</a>]<br />
69 <a href="https://pubs.opengroup.org/onlinepubs/9699919799/functions/fputs.html">https://pubs.opengroup.org/onlinepubs/9699919799/functions/fputs.html</a> [<a href="https://pubs.opengroup.org/onlinepubs/9699919799/functions/fputs.html" target="_blank">^</a>]<br />
70 <a href="https://pubs.opengroup.org/onlinepubs/9699919799/functions/localtime.html">https://pubs.opengroup.org/onlinepubs/9699919799/functions/localtime.html</a> [<a href="https://pubs.opengroup.org/onlinepubs/9699919799/functions/localtime.html" target="_blank">^</a>]<br />
71 <a href="https://pubs.opengroup.org/onlinepubs/9699919799/functions/localtime_r.html">https://pubs.opengroup.org/onlinepubs/9699919799/functions/localtime_r.html</a> [<a href="https://pubs.opengroup.org/onlinepubs/9699919799/functions/localtime_r.html" target="_blank">^</a>]<br />
72 <a href="https://pubs.opengroup.org/onlinepubs/9699919799/functions/time.html">https://pubs.opengroup.org/onlinepubs/9699919799/functions/time.html</a> [<a href="https://pubs.opengroup.org/onlinepubs/9699919799/functions/time.html" target="_blank">^</a>]<br />
73 <br />
74 They could be changed for consistency.</description>
75 <guid>https://www.austingroupbugs.net/view.php?id=1542</guid>
76 <author>andras_farkas <andras_farkas@example.com></author>
77 <comments>https://www.austingroupbugs.net/view.php?id=1542#bugnotes</comments>
78 </item>
79 <item>
80 <title>0001541: Overabundance of parentheses in atoi() example</title>
81 <link>https://www.austingroupbugs.net/view.php?id=1541</link>
82 <description>On<br />
83 <a href="https://pubs.opengroup.org/onlinepubs/9699919799/functions/atoi.html">https://pubs.opengroup.org/onlinepubs/9699919799/functions/atoi.html</a> [<a href="https://pubs.opengroup.org/onlinepubs/9699919799/functions/atoi.html" target="_blank">^</a>]<br />
84 the example has redundant parentheses.</description>
85 <guid>https://www.austingroupbugs.net/view.php?id=1541</guid>
86 <author>andras_farkas <andras_farkas@example.com></author>
87 <comments>https://www.austingroupbugs.net/view.php?id=1541#bugnotes</comments>
88 </item>
89 <item>
90 <title>0001562: printf utility: clarify what is (byte) string and what is character string</title>
91 <link>https://www.austingroupbugs.net/view.php?id=1562</link>
92 <description>3.375 String, defines:<br />
93 "A contiguous sequence of bytes terminated by and including the first null byte."<br />
94 i.e. a byte string, not - by itself - subject to the locale.<br />
95 <br />
96 <br />
97 The description of the printf utility, uses the phrase "string" in some cases without clearly telling whether character or byte string is meant.<br />
98 <br />
99 <br />
100 Line 104239, OPERANDS:<br />
101 > format<br />
102 > A string describing the format to use to write the remaining operands. See the<br />
103 > EXTENDED DESCRIPTION section.<br />
104 <br />
105 => At least when following line 104273...<br />
106 "The format operand shall be used as the format string described in XBD Chapter 5 (on page 101)", where it clearly says:<br />
107 "The format is a character string that contains three types of objects defined below"<br />
108 <br />
109 ... format should be a character string, and as such, it would be subject to LC_CTYPE?!<br />
110 <br />
111 => OTOH, the APPLICATION USAGE tells that it's modelled after the printf() function, which in turn uses a C-string as format string,... and if that would also apply to the printf utility, it would be a (byte) string and XBD Chapter 5's provisions would fully apply (which should then be explicitly mentioned).<br />
112 <br />
113 <br />
114 (See below, for an analogue example for why I think the difference matters.)<br />
115 <br />
116 <br />
117 > argument<br />
118 > The strings to be written to standard output, under the control of format.<br />
119 > See the EXTENDED DESCRIPTION section.<br />
120 <br />
121 Seems clearly a (byte) string, whose interpretation (byte, character) depends on the respective conversion specifier.<br />
122 <br />
123 <br />
124 <br />
125 It may further make sense, to explicitly clarify in line 104288, that string here is a "byte string", especially because in 104290 and following, it's laid out how characters are part of it.<br />
126 Consider e.g. some weird multibyte locale in which a character named A' is composed of bytes that include the binary representation of e.g. '\' (assuming some ASCII encodings for '\' and 'n')... I take as example for A' = 0xAA 0x5C<br />
127 A string: A'n ... that is 0xAA 0x5C 0x6E should probably be interpreted as: 0xAA \n and thus giving 0xAA 0x0A ... and not the character A' followed by n.<br />
128 <br />
129 <br />
130 Given that the whole section uses quite often the term "format string" (in the sense of character string)... it may make things more clear, to emphasise that this is a byte string.<br />
131 <br />
132 The same probably at line 104321.</description>
133 <guid>https://www.austingroupbugs.net/view.php?id=1562</guid>
134 <author>calestyo <calestyo@example.com></author>
135 <comments>https://www.austingroupbugs.net/view.php?id=1562#bugnotes</comments>
136 </item>
137 <item>
138 <title>0001531: time: follow-up to issue #1440</title>
139 <link>https://www.austingroupbugs.net/view.php?id=1531</link>
140 <description>""</description>
141 <guid>https://www.austingroupbugs.net/view.php?id=1531</guid>
142 <author>steffen <steffen@example.com></author>
143 <comments>https://www.austingroupbugs.net/view.php?id=1531#bugnotes</comments>
144 </item>
145 <item>
146 <title>0001530: nohup: follow-up to issue #1440</title>
147 <link>https://www.austingroupbugs.net/view.php?id=1530</link>
148 <description>""</description>
149 <guid>https://www.austingroupbugs.net/view.php?id=1530</guid>
150 <author>steffen <steffen@example.com></author>
151 <comments>https://www.austingroupbugs.net/view.php?id=1530#bugnotes</comments>
152 </item>
153 <item>
154 <title>0001529: ex: follow-up to issue #1440</title>
155 <link>https://www.austingroupbugs.net/view.php?id=1529</link>
156 <description>""</description>
157 <guid>https://www.austingroupbugs.net/view.php?id=1529</guid>
158 <author>steffen <steffen@example.com></author>
159 <comments>https://www.austingroupbugs.net/view.php?id=1529#bugnotes</comments>
160 </item>
161 <item>
162 <title>0001526: Update fdopen() mode description to match new fopen() terminology</title>
163 <link>https://www.austingroupbugs.net/view.php?id=1526</link>
164 <description>The new 'e' and 'x' mode string characters for fopen() have been accounted for on the fdopen() page using "prefix" terminology, which is being changed for fopen() via bug <a href="https://www.austingroupbugs.net/view.php?id=1302">0001302</a> (C17 alignment). The fdopen() page should change to match the new terminology, but the need for this change does not result (directly) from C17 alignment, so it is being handled by a separate bug instead of in 1302.<br />
165 <br />
166 The current wording is also vague about what happens if there is a mismatch between the O_APPEND flag on the open file description and the use of 'w' versus 'a' in the mode. The only clue is that the rationale states "a good implementation of append (a) mode would cause the O_APPEND flag to be set", which implies that it is intentional that both behaviours are allowed when O_APPEND is clear and 'a' is used. I tried test programs on Linux (glibc), Solaris, MacOS and HP-UX, and the results were:<br />
167 <br />
168 * Only Linux set O_APPEND if it was clear when using "a"<br />
169 * All four left O_APPEND set if it was set when using "w"<br />
170 <br />
171 Therefore the suggested changes make it unspecified for "a" but mandate leaving O_APPEND set for "w".<br />
172 <br />
173 As an editorial matter, the use of the phrase "Open a file" is also problematic since fdopen() does not actually open a file.<br />
174 <br />
175 Finally, the new changes include a typographical conventions change to use quotes around mode characters instead of italicising them. This change is also needed on the popen() page for consistency.</description>
176 <guid>https://www.austingroupbugs.net/view.php?id=1526</guid>
177 <author>geoffclare <geoffclare@example.com></author>
178 <comments>https://www.austingroupbugs.net/view.php?id=1526#bugnotes</comments>
179 </item>
180 <item>
181 <title>0001524: open() flags used by fopen()</title>
182 <link>https://www.austingroupbugs.net/view.php?id=1524</link>
183 <description>The lead-in to the table describing the open() flags used by fopen() on P882, L30005-30013 says:<br />
184 <blockquote>The file descriptor associated with the opened stream shall be allocated and opened as if by a call to <i>open</i>( ) with the following flags:</blockquote><br />
185 but it doesn't say that additional flags can't silently be added to the given sets of flags. In fact, if you use the <tt>'e'</tt> or <tt>'x'</tt> characters in the <i>mode</i> string, other flags are required to be OR'ed into the flags listed in this table.<br />
186 <br />
187 Implementations of <i>fopen</i>( ) should not be allowed to act as though flags such as O_DSYNC, O_NONBLOCK, etc. had been set when a file is opened.</description>
188 <guid>https://www.austingroupbugs.net/view.php?id=1524</guid>
189 <author>Don Cragun <Don Cragun@example.com></author>
190 <comments>https://www.austingroupbugs.net/view.php?id=1524#bugnotes</comments>
191 </item>
192 <item>
193 <title>0001536: Unimplemented requirements in fd duplication</title>
194 <link>https://www.austingroupbugs.net/view.php?id=1536</link>
195 <description>Note: the text in question appears unchanged in Issue 8 Draft 2.1<br />
196 but obviously the page & line numbers differ.<br />
197 <br />
198 In XCU 2.7.5 (the <& redirect operator) it is stated:<br />
199 <br />
200 if the digits in word do not represent a file descriptor already open <br />
201 for input, a redirection error shall result;<br />
202 <br />
203 Similarly in XCU 2.7.6 (the >& redirect operator):<br />
204 <br />
205 if the digits in word do not represent a file descriptor already open <br />
206 for output, a redirection error shall result;<br />
207 <br />
208 I am unable to find a shell which implements that redirection error, where<br />
209 the fd given by word is open, but not for input/output as specified.<br />
210 (as the requirement is "error shall result" that means there are no<br />
211 conforming shells, or none I have available to test ... I find it hard to<br />
212 believe that ksh88 is any different).<br />
213 <br />
214 Try:<br />
215 sh $ exec 4>/tmp/foo; exec 6<&4<br />
216 sh $ exec 4</tmp/foo; exec 6>&4<br />
217 <br />
218 in your favourite shell and see how many redirection errors you experience.<br />
219 (Errors on the redirect of fd 4 do not count for this purpose, it is easy<br />
220 to set up a scenario where that redirect fails -- just make that one work).<br />
221 <br />
222 This is hardly surprising, as all shells simply use either dup2() or<br />
223 fcntl(F_DUPFD) to perform these operations, and as long as the source<br />
224 fd is open, and there are sufficient available fd's for the new one<br />
225 to be opened (and its fd number is within range) those operations succeed.<br />
226 They don't care in the slightest whether the source fd is open for reading,<br />
227 writing, both, or neither.<br />
228 <br />
229 Requiring that an error be generated when the source fd is not open for<br />
230 the direction of I/O implied by the operator in use, would require the<br />
231 shell to first determine how the source fd was opened (fcntl(F_GETFL)<br />
232 and verify it - which no-one does - but we would also need to invent a<br />
233 <>& operator (which no-one has done yet) to make sure to be able to<br />
234 correctly duplicate a file descriptor open for both read and write.<br />
235 <br />
236 It would be tempting to simply delete the words "for input" and "for output"<br />
237 and leave it like that, but that might lead to a suboptimal result, where<br />
238 shells are (effectively) forbidden from generating errors from the example<br />
239 code shown above, and while no-one currently does, making it effectively<br />
240 impossible to ever do, which it would be if applications were told that it<br />
241 makes no difference which fd duplication redirect operator they use, which<br />
242 that change would effectively do, is probably not what we want.<br />
243 <br />
244 Instead, I'd make it unspecified what happens if the input duplication<br />
245 operator is applied to a fd not already open for input, or if the output<br />
246 duplication operator is applied to a fd not already open for output.<br />
247 That leaves current shells compliant, puts the onus on applications to<br />
248 do the right thing (which is their only choice right now -- it doesn't<br />
249 matter which duplication operator is applied, what matters is how the<br />
250 resulting fd is used - it can only be used for operations that were<br />
251 permitted to the source fd ("word"). It leaves it open for a shell to<br />
252 actually verify correct usage, but such a shell would probably need to<br />
253 invent <>& (in that, or some other, syntax) for fd's open read+write.</description>
254 <guid>https://www.austingroupbugs.net/view.php?id=1536</guid>
255 <author>kre <kre@example.com></author>
256 <comments>https://www.austingroupbugs.net/view.php?id=1536#bugnotes</comments>
257 </item>
258 <item>
259 <title>0001535: Poor description of declaration (all really) utility argument processing</title>
260 <link>https://www.austingroupbugs.net/view.php?id=1535</link>
261 <description>XCU 2.9.1.1 (in Issue 8 Draft 2.1) says:<br />
262 <br />
263 When a given simple command is required to be executed [...]<br />
264 the following expansions, assignments, and redirections shall all <br />
265 be performed from the beginning of the command text to the end:<br />
266 <br />
267 All that's relevant about that quote is that it makes the ordering<br />
268 a requirement.<br />
269 <br />
270 1. [not relevant here]<br />
271 2. The words that are not variable assignments or redirections shall be<br />
272 expanded.<br />
273 <br />
274 That's simple enough. The first thing we do (after moving redirects<br />
275 and var-assigns out of the way - that's step 1 - is to expand the remaining<br />
276 words.<br />
277 <br />
278 If any fields remain following their expansion, the first field shall<br />
279 be considered the command name and remaining fields are the arguments<br />
280 for the command.<br />
281 <br />
282 This is where we get the command name, and it comes "following their [the<br />
283 words] expansion" (assuming there are remaining fields, which for this issue<br />
284 we shall do).<br />
285 <br />
286 If the command name is recognized as a declaration utility,<br />
287 <br />
288 At this point we look and see if we have a declaration utility, and if<br />
289 we do...<br />
290 <br />
291 then any remaining words that would be recognized as a variable<br />
292 assignment in isolation shall be expanded as a variable assignment <br />
293 <br />
294 then we must look at the remaining (remember already expanded) words to<br />
295 see if any look like a variable assignment, and if we find any, expand<br />
296 them *again* (as variable assignments - the following (parenthesised)<br />
297 text goes into what that means, but that's not relevant here).<br />
298 <br />
299 while words that would not be a variable assignment<br />
300 in isolation shall be subject to regular expansion.<br />
301 <br />
302 and the other words also get expanded again, the regular way.<br />
303 <br />
304 For all other command names,<br />
305 <br />
306 that is, commands that are not declaration utilities<br />
307 <br />
308 all subsequent words shall be subject to regular expansion<br />
309 <br />
310 all the remaining words just get regular expansion (this is followed by<br />
311 an explanation of what "regular expansion" means - not relevant here -<br />
312 except to note in passing that this is the second occurrence of "regular<br />
313 expansion", the first one didn't get explained - that's probably backwards.<br />
314 <br />
315 That is, the words are expanded, the command name (first non-empty remaining<br />
316 field) gets examined, and depending upon what we see, we expand all the<br />
317 other args again in one way or another.<br />
318 <br />
319 That's not how it is supposed to happen - not in anyone's world view.<br />
320 <br />
321 Further, with this (precisely specified order of processing) this final<br />
322 paragraph of 2.9.1.1...<br />
323 <br />
324 When determining whether a command name is a declaration utility, an<br />
325 implementation may use only lexical analysis.<br />
326 <br />
327 It isn't really clear what that means, but lexical analysis is occurring<br />
328 around the same time as alias expansion, so how the two interact would<br />
329 probably need to be made clear, but this probably doesn't matter anyway<br />
330 (or not with the current wording) as:<br />
331 <br />
332 It is unspecified whether assignment context will be used if the<br />
333 command name would only become recognized as a declaration utility<br />
334 after word expansions.<br />
335 <br />
336 No it isn't, it is very precisely specified, as above - the word expansions<br />
337 happen first, and then the result is examined to determine whether the command<br />
338 name is a declaration utility or not. Nothing even slightly unspecified<br />
339 about that, and adding contradictory text here can only confuse things.</description>
340 <guid>https://www.austingroupbugs.net/view.php?id=1535</guid>
341 <author>kre <kre@example.com></author>
342 <comments>https://www.austingroupbugs.net/view.php?id=1535#bugnotes</comments>
343 </item>
344 <item>
345 <title>0001561: clarify what kind of data shell variables need to be able to hold</title>
346 <link>https://www.austingroupbugs.net/view.php?id=1561</link>
347 <description>In:<br />
348 <a href="https://collaboration.opengroup.org/austin/plato/protected/mailarch.php?soph=N&action=show&archive=austin-group-l&num=33722&limit=100&offset=0&sid=">https://collaboration.opengroup.org/austin/plato/protected/mailarch.php?soph=N&action=show&archive=austin-group-l&num=33722&limit=100&offset=0&sid=</a> [<a href="https://collaboration.opengroup.org/austin/plato/protected/mailarch.php?soph=N&action=show&archive=austin-group-l&num=33722&limit=100&offset=0&sid=" target="_blank">^</a>]<br />
349 <br />
350 I've raised the question of which data shell variables are required to be able to hold.<br />
351 <br />
352 In various replies following it became clear that there is some ambiguity with respect to that question:<br />
353 <br />
354 <br />
355 In:<br />
356 <a href="https://collaboration.opengroup.org/austin/plato/protected/mailarch.php?soph=N&action=show&archive=austin-group-l&num=33723&limit=100&offset=0&sid=">https://collaboration.opengroup.org/austin/plato/protected/mailarch.php?soph=N&action=show&archive=austin-group-l&num=33723&limit=100&offset=0&sid=</a> [<a href="https://collaboration.opengroup.org/austin/plato/protected/mailarch.php?soph=N&action=show&archive=austin-group-l&num=33723&limit=100&offset=0&sid=" target="_blank">^</a>]<br />
357 Geoff Clare brought up that:<br />
358 »but POSIX clearly requires that a variable can be<br />
359 assigned any value obtained from a command substitution that does not<br />
360 include a NUL byte, and specifies utilities that can be used to<br />
361 generate arbitrary byte values, therefore a variable can contain any<br />
362 sequence of bytes that does not include a NUL byte.«<br />
363 <br />
364 Which AFAIU means that shell variables are expected to hold any bytes except NUL, and only the use of these shell variables in certain other constructs (e.g. ${#var}) interprets them as characters according to the current locale.<br />
365 <br />
366 <br />
367 It was brought up, that e.g. yash discards any bytes from shell variables that don't make up a valid encoding:<br />
368 <a href="https://collaboration.opengroup.org/austin/plato/protected/mailarch.php?soph=N&action=show&archive=austin-group-l&num=33724&limit=100&offset=0&sid=">https://collaboration.opengroup.org/austin/plato/protected/mailarch.php?soph=N&action=show&archive=austin-group-l&num=33724&limit=100&offset=0&sid=</a> [<a href="https://collaboration.opengroup.org/austin/plato/protected/mailarch.php?soph=N&action=show&archive=austin-group-l&num=33724&limit=100&offset=0&sid=" target="_blank">^</a>]<br />
369 <br />
370 <br />
371 In:<br />
372 <a href="https://collaboration.opengroup.org/austin/plato/protected/mailarch.php?soph=N&action=show&archive=austin-group-l&num=33725&limit=100&offset=0&sid=">https://collaboration.opengroup.org/austin/plato/protected/mailarch.php?soph=N&action=show&archive=austin-group-l&num=33725&limit=100&offset=0&sid=</a> [<a href="https://collaboration.opengroup.org/austin/plato/protected/mailarch.php?soph=N&action=show&archive=austin-group-l&num=33725&limit=100&offset=0&sid=" target="_blank">^</a>]<br />
373 Chet Ramey brought up, that shell variables are initialised from environment variables, which themselves may contain anything except NUL as value, as long as anything before the "=" is a valid Name (in the sense of POSIX).<br />
374 And in the later:<br />
375 <a href="https://collaboration.opengroup.org/austin/plato/protected/mailarch.php?soph=N&action=show&archive=austin-group-l&num=33731&limit=100&offset=0&sid=">https://collaboration.opengroup.org/austin/plato/protected/mailarch.php?soph=N&action=show&archive=austin-group-l&num=33731&limit=100&offset=0&sid=</a> [<a href="https://collaboration.opengroup.org/austin/plato/protected/mailarch.php?soph=N&action=show&archive=austin-group-l&num=33731&limit=100&offset=0&sid=" target="_blank">^</a>]<br />
376 that:<br />
377 »applications can obviously put whatever they want into the value of an environment variable in envp and call execve.«<br />
378 <br />
379 <br />
380 In:<br />
381 <a href="https://collaboration.opengroup.org/austin/plato/protected/mailarch.php?soph=N&action=show&archive=austin-group-l&num=33730&limit=100&offset=0&sid=">https://collaboration.opengroup.org/austin/plato/protected/mailarch.php?soph=N&action=show&archive=austin-group-l&num=33730&limit=100&offset=0&sid=</a> [<a href="https://collaboration.opengroup.org/austin/plato/protected/mailarch.php?soph=N&action=show&archive=austin-group-l&num=33730&limit=100&offset=0&sid=" target="_blank">^</a>]<br />
382 Harald van Dijk countered, that:<br />
383 »That is not what POSIX says. It says "The value of an environment variable is a string of characters" (8.1 Environment Variable Definition), and "character" is defined as "a sequence of one or more bytes representing a single graphic symbol or control code" (3 Definitions), with a note that says it corresponds to what C calls a multi-byte character. Environment variables are not specified to allow arbitrary bytes.«<br />
384 <br />
385 <br />
386 There was some further discussion on whether the definition of command substitutions implies whether or not any bytes other than NUL need to be able to be stored in shell variables.<br />
387 One argument brought up was, that there the wording "<newline> character" is used - another, that this would clearly refer *only* to the <newline> itself which is per definition the same (byte) in every locale.<br />
388 (for that particular part see also the proposed clarifications in <a href="https://www.austingroupbugs.net/view.php?id=1560">https://www.austingroupbugs.net/view.php?id=1560</a> [<a href="https://www.austingroupbugs.net/view.php?id=1560" target="_blank">^</a>] ).<br />
389 <br />
390 <br />
391 <br />
392 In:<br />
393 <a href="https://collaboration.opengroup.org/austin/plato/protected/mailarch.php?soph=N&action=show&archive=austin-group-l&num=33736&limit=100&offset=0&sid=">https://collaboration.opengroup.org/austin/plato/protected/mailarch.php?soph=N&action=show&archive=austin-group-l&num=33736&limit=100&offset=0&sid=</a> [<a href="https://collaboration.opengroup.org/austin/plato/protected/mailarch.php?soph=N&action=show&archive=austin-group-l&num=33736&limit=100&offset=0&sid=" target="_blank">^</a>]<br />
394 I brought up that in addition to what Harald pointed out earlier, in 8.1 Environment Variables it says:<br />
395 »These strings have the form name=value; names shall not contain the<br />
396 character '='. For values to be portable across systems conforming to<br />
397 POSIX.1-2017, the value shall be composed of characters from the<br />
398 portable character set (except NUL and as indicated below).«<br />
399 <br />
400 but a bit further down it says the contradicting:<br />
401 »The values that the environment variables may be assigned are not<br />
402 restricted except that they are considered to end with a null byte and<br />
403 the total space used to store the environment and the arguments to the<br />
404 process is limited to {ARG_MAX} bytes.«<br />
405 <br />
406 <br />
407 And in:<br />
408 <a href="https://collaboration.opengroup.org/austin/plato/protected/mailarch.php?soph=N&action=show&archive=austin-group-l&num=33737&limit=100&offset=0&sid=">https://collaboration.opengroup.org/austin/plato/protected/mailarch.php?soph=N&action=show&archive=austin-group-l&num=33737&limit=100&offset=0&sid=</a> [<a href="https://collaboration.opengroup.org/austin/plato/protected/mailarch.php?soph=N&action=show&archive=austin-group-l&num=33737&limit=100&offset=0&sid=" target="_blank">^</a>]<br />
409 I brought up:<br />
410 »3.368 Standard Output<br />
411 "An output stream usually intended to be used for primary data output."<br />
412 <br />
413 And:<br />
414 3.370 Stream<br />
415 "Appearing in lowercase, a stream is a file access object that allows access to an ordered sequence of characters, as described by the ISO C standard. Such objects can be created by the fdopen(), fmemopen(), fopen(), open_memstream(), or popen() functions, and are associated with a file descriptor. A stream provides the additional services of user-selectable buffering and formatted input and output; see also STREAM."<br />
416 <br />
417 <br />
418 This however links to Standard I/O Streams ( <a href="file:///usr/share/doc/susv4/susv4-2018/functions/V2_chap02.html#tag_15_05">file:///usr/share/doc/susv4/susv4-2018/functions/V2_chap02.html#tag_15_05</a> [<a href="file:///usr/share/doc/susv4/susv4-2018/functions/V2_chap02.html#tag_15_05" target="_blank">^</a>] )<br />
419 which very well names byte output modes (fputc and so on).«</description>
420 <guid>https://www.austingroupbugs.net/view.php?id=1561</guid>
421 <author>calestyo <calestyo@example.com></author>
422 <comments>https://www.austingroupbugs.net/view.php?id=1561#bugnotes</comments>
423 </item>
424 <item>
425 <title>0001533: struct tm: add tm_gmtoff (and tm_zone) field(s)</title>
426 <link>https://www.austingroupbugs.net/view.php?id=1533</link>
427 <description>Hello.<br />
428 <br />
429 Regarding the MUA i maintain i was pinged by a user who needs to<br />
430 use the timezone Europe/Dublin. He wrote<br />
431 <br />
432 In 2018, the tzdata maintainers (IANA) corrected a historical mistake<br />
433 with the Europe/Dublin timezone. The mistake was rooted in a<br />
434 misunderstanding of whether IST meant "Irish Summer Time" or "Irish<br />
435 Standard Time".<br />
436 <br />
437 The problem was discussed at great length<br />
438 (<a href="http://mm.icann.org/pipermail/tz/2018-January/thread.html">http://mm.icann.org/pipermail/tz/2018-January/thread.html</a> [<a href="http://mm.icann.org/pipermail/tz/2018-January/thread.html" target="_blank">^</a>]) and it was<br />
439 concluded that IST really meant Irish *Standard* Time (in contrast<br />
440 with, say, British *Summer* Time), and that this standard time is<br />
441 defined as UTC+0100.<br />
442 [.]<br />
443 Once the question was settled, the only possible solution for keeping<br />
444 the Irish local time in sync with the rest of the world (for example,<br />
445 Belfast & London) was for IANA to _reverse_ the functioning of the DST<br />
446 flag for Ireland. The result is that in the current IANA timezone<br />
447 database (2021e), Europe/Dublin has DST applied in *winter*, with an<br />
448 adjustment of -1h (that is, negative).<br />
449 [.]<br />
450 It turns out that the introduction of a negative DST adjustment caused<br />
451 all sorts of bugs back in 2018; in the source distribution of IANA's<br />
452 tzdata, one can spot this inside ./europe:<br />
453 <br />
454 # In January 2018 we discovered that the negative SAVE values in the<br />
455 # Eire rules cause problems with tests for ICU [...] and with tests<br />
456 # for OpenJDK [...]<br />
457 # To work around this problem, the build procedure can translate the<br />
458 # following data into two forms, one with negative SAVE values and the<br />
459 # other form with a traditional approximation for Irish timestamps<br />
460 # after 1971-10-31 02:00 UTC; although this approximation has tm_isdst<br />
461 # flags that are reversed, its UTC offsets are correct and this often<br />
462 # suffices. This source file currently uses only nonnegative SAVE<br />
463 # values, but this is intended to change and downstream code should<br />
464 # not rely on it.<br />
465 <br />
466 So, a temporary hack was put in place in order to allow distro<br />
467 maintainers to retain the old broken convention of IST and support<br />
468 buggy software, but it is clear that the current (and technically, and<br />
469 politically, correct) implementation of a negative DST adjustment for<br />
470 Ireland is there to stay.<br />
471 As a matter of fact, the distro maintainer can choose to compile<br />
472 tzdata to keep buggy software happy ("make DATAFORM=rearguard"),<br />
473 which replicates the behaviour of tzdata prior to 2018. Many distros<br />
474 seem to be doing that for one reason or another, while some have passed<br />
475 the upstream change down to their users (probably, without knowing).<br />
476 <br />
477 Anyhow, all the simple minded software, including the MUA<br />
478 i maintain, used to do something like<br />
479 <br />
480 if((t2 = mktime(gmtime(&t))) == (time_t)-1){<br />
481 t = 0;<br />
482 goto jredo;<br />
483 }<br />
484 tzdiff = t - t2;<br />
485 if((tmp = localtime(&t)) == NULL){<br />
486 t = 0;<br />
487 goto jredo;<br />
488 }<br />
489 <br />
490 tzdiff_hour = (int)(tzdiff / 60);<br />
491 tzdiff_min = tzdiff_hour % 60;<br />
492 tzdiff_hour /= 60;<br />
493 if (tmp->tm_isdst > 0)<br />
494 tzdiff_hour++;<br />
495 <br />
496 Note the .tm_isdst plus positive summer time adjustment.<br />
497 This was overly primitive, and i recognize that POSIX supports the<br />
498 %z (and %Z) formats for strftime(3), and in general code as below<br />
499 is used by projects, so doing it right is very expensive but<br />
500 doable with POSIX as of today.<br />
501 <br />
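A minimal sketch of the strftime(3) %z route mentioned above, assuming a POSIX.1-2008 system. The helper name utc_offset_seconds is hypothetical and not taken from the MUA; it derives the offset without reading .tm_isdst or .tm_gmtoff:<br />

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical helper: UTC offset of local time at instant t, in seconds
 * east of UTC, obtained only through localtime_r() and strftime("%z").
 * Returns 0 and stores the offset on success, -1 on failure. */
static int utc_offset_seconds(time_t t, long *off)
{
    struct tm tmv;
    char buf[8];    /* "+hhmm" plus the terminating NUL fits easily */
    char *end;
    long hhmm;

    assert(off != NULL);
    if (localtime_r(&t, &tmv) == NULL)
        return -1;
    /* %z yields +hhmm or -hhmm, or nothing if no zone is determinable. */
    if (strftime(buf, sizeof buf, "%z", &tmv) != 5)
        return -1;
    hhmm = strtol(buf + 1, &end, 10);
    if (*end != '\0')
        return -1;
    *off = (hhmm / 100) * 3600 + (hhmm % 100) * 60;
    if (buf[0] == '-')
        *off = -*off;
    return 0;
}
```

Because the sign comes from the formatted string, a negative DST adjustment such as the one for Europe/Dublin needs no special handling in this sketch.<br />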
502 However, all BSDs and Linux with either of GNU and musl C library<br />
503 support the .tm_gmtoff (and .tm_zone) members of "struct tm", in<br />
504 general all users of the public domain (and standardized) IANA TZ<br />
505 project can bake it in upon their own desire. With .tm_gmtoff<br />
506 being available, code gets as simple as<br />
507 s64<br />
508 time_tzdiff(s64 secsepoch, struct tm const *utcp_or_nil,<br />
509 struct tm const *localp_or_nil){<br />
510 struct tm tmbuf[2], *tmx;<br />
511 time_t t;<br />
512 s64 rv;<br />
513 UNUSED(utcp_or_nil);<br />
514 <br />
515 rv = 0;<br />
516 <br />
517 if(localp_or_nil == NIL){<br />
518 t = S(time_t,secsepoch);<br />
519 while((tmx = localtime(&t)) == NIL){<br />
520 if(t == 0)<br />
521 goto jleave;<br />
522 t = 0;<br />
523 }<br />
524 tmbuf[0] = *tmx;<br />
525 localp_or_nil = &tmbuf[0];<br />
526 }<br />
527 <br />
528 #ifdef HAVE_TM_GMTOFF<br />
529 rv = localp_or_nil->tm_gmtoff;<br />
530 <br />
531 #else<br />
532 if(utcp_or_nil == NIL){<br />
533 t = S(time_t,secsepoch);<br />
534 while((tmx = gmtime(&t)) == NIL){<br />
535 if(t == 0)<br />
536 goto jleave;<br />
537 t = 0;<br />
538 }<br />
539 tmbuf[1] = *tmx;<br />
540 utcp_or_nil = &tmbuf[1];<br />
541 }<br />
542 <br />
543 rv = ((((localp_or_nil->tm_hour - utcp_or_nil->tm_hour) * 60) +<br />
544 (localp_or_nil->tm_min - utcp_or_nil->tm_min)) * 60) +<br />
545 (localp_or_nil->tm_sec - utcp_or_nil->tm_sec);<br />
546 <br />
547 if((t = (localp_or_nil->tm_yday - utcp_or_nil->tm_yday)) != 0){<br />
548 s64 const ds = 24 * 60 * 60;<br />
549 <br />
550 rv += (t == 1) ? ds : -S(s64,ds);<br />
551 }<br />
552 #endif<br />
553 <br />
554 jleave:<br />
555 return rv;<br />
556 }</description>
557 <guid>https://www.austingroupbugs.net/view.php?id=1533</guid>
558 <author>steffen <steffen@example.com></author>
559 <comments>https://www.austingroupbugs.net/view.php?id=1533#bugnotes</comments>
560 </item>
561 <item>
562 <title>0001528: mailx: document "sh(1) -c --" has to be used instead of "sh -c"</title>
563 <link>https://www.austingroupbugs.net/view.php?id=1528</link>
564 <description>mailx(1) specific follow-up to issue #1440.<br />
565 <br />
566 P.S.: page numbers and line numbers from C181. (I think this is not the newest?)</description>
567 <guid>https://www.austingroupbugs.net/view.php?id=1528</guid>
568 <author>steffen <steffen@example.com></author>
569 <comments>https://www.austingroupbugs.net/view.php?id=1528#bugnotes</comments>
570 </item>
571 <item>
572 <title>0001527: cd requires the impossible on standard output</title>
573 <link>https://www.austingroupbugs.net/view.php?id=1527</link>
574 <description>The STDOUT section of XCU 4(cd) says:<br />
575 <br />
576 If a non-empty directory name from CDPATH is used, or if cd - is used,<br />
577 an absolute pathname of the new working directory shall be written to<br />
578 the standard output as follows:<br />
579 <br />
580 Whether used as cd -P, or the ludicrous cd -L, this requires what might<br />
581 be impossible.<br />
582 <br />
583 From XCU 2.5.3 (shell vars):<br />
584 PWD Set by the shell and by the cd utility. [...]<br />
585 if there is insufficient permission on the current working<br />
586 directory, or on any parent of that directory, to determine<br />
587 what that pathname would be, the value of PWD is unspecified<br />
588 <br />
589 So the standard clearly recognises that it is not always possible to<br />
590 determine the current working directory, which results in PWD being unspecified<br />
591 but not in unspecified behaviour from cd or pwd (that's only if the user<br />
592 modifies PWD).<br />
593 <br />
594 Further, while not exactly common, nor is this a very rare event, it happens<br />
595 from time to time, and is usually easily corrected by a simple cd to a fully<br />
596 qualified path (one which works) which will result in establishing a value<br />
597 for PWD, after which all is normal (which would not be able to be trusted if<br />
598 the behaviour of cd was unspecified at that point).<br />
599 <br />
600 In that state, if the user successfully does:<br />
601 <br />
602 mkdir foo foo/bar<br />
603 export CDPATH=foo<br />
604 cd bar<br />
605 <br />
606 the text quoted above requires the cd command to print the full path<br />
607 /impossible/to/discover/foo/bar<br />
608 which, of course, is impossible to do correctly.<br />
609 <br />
610 It is possible to get into this state after the shell has started (and PWD<br />
611 is set) as well, though in that case it is typically only cd -P which has the<br />
612 problem (the shell just "knows" that $PWD is correct, and all it takes to fix<br />
613 it is some string manipulation in the cd -L case ... but not always true,<br />
614 as the filesystem may have altered state after PWD was set.)<br />
615 <br />
616 What shells do in this situation varies. Some print an error or warning<br />
617 which they're not supposed to do, as "The standard error shall be used only<br />
618 for diagnostic messages." and that means (according to XCU 1.4) that the<br />
619 exit status must indicate an error, but for cd (at least without the not<br />
620 yet existing -e option) when cd has changed directory successfully, it must<br />
621 exit 0 (-e doesn't solve the problem for cd -L with PWD unset, and no<br />
622 ability to discover a path of the current working directory anyway, so it<br />
623 is irrelevant here).<br />
624 <br />
625 Aside from the technically incorrect error message that is sometimes printed,<br />
626 most shells print the (adjusted) relative path (which is what it usually, though<br />
627 not always, is in these cases) that was used in the chdir() call. In the<br />
628 example above that would be "foo/bar" - not an absolute path by any means.<br />
629 (zsh almost does that, at least in the copy I have currently, but pretends it<br />
630 is an absolute path by prepending a '/' - /foo/bar - which is nonsense, and<br />
631 most probably simply a bug). Other shells simply print nothing in this case.<br />
632 <br />
633 In the case that PWD is set, and cd -P is used, and the resulting directory's<br />
634 path cannot be determined, yash seems to treat it something like cd -L, treating<br />
635 PWD as if it were the correct physical path, and adjusting it. The result is<br />
636 an absolute path, and in one case I saw it, was actually correct. There was<br />
637 no way yash could have known that however - it was a guess (even if a well<br />
638 educated one).<br />
639 <br />
640 Since the behaviour of the existing shells varies so much, I suspect that all<br />
641 that it is possible to say in this situation is that the results are unspecified. Something similar may need to be said about what goes in PWD<br />
642 (though there the existence of -e in issue8 might make a difference).<br />
643 <br />
644 Whether it is also desirable to explicitly allow shells to issue an error in<br />
645 this situation I will leave for future discussion (my shell does not...).</description>
646 <guid>https://www.austingroupbugs.net/view.php?id=1527</guid>
647 <author>kre <kre@example.com></author>
648 <comments>https://www.austingroupbugs.net/view.php?id=1527#bugnotes</comments>
649 </item>
650 <item>
651 <title>0001525: only the close() of the last fd for a socket should destroy the socket</title>
652 <link>https://www.austingroupbugs.net/view.php?id=1525</link>
653 <description>The description of close() repeatedly says that various actions only take place when the last fd for an object is closed. For example:<br />
654 <br />
655 "When all file descriptors associated with a pipe or FIFO special file are closed..."<br />
656 <br />
657 "When all file descriptors associated with an open file description have been closed, ..."<br />
658 <br />
659 "The last close() for a STREAM..."<br />
660 <br />
661 "If fildes refers to the master side of a pseudo-terminal, and this is the last close, ..."<br />
662 <br />
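The "last close" wording quoted above can be compared with what sockets do in practice. A minimal sketch, assuming a typical implementation with AF_UNIX socketpair() support; the function name is hypothetical. It closes one of two descriptors for the same socket and checks that the socket is still usable through the other:<br />

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical demo: returns 1 if a byte written through a dup()ed socket
 * descriptor still arrives after the original descriptor was closed,
 * 0 if it does not, -1 on setup failure. */
static int socket_survives_first_close(void)
{
    int sv[2], copy, ok;
    char c = 0;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;
    copy = dup(sv[0]);
    if (copy == -1) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    assert(copy != sv[0]);

    close(sv[0]);   /* not the last descriptor for this socket */

    ok = write(copy, "x", 1) == 1 && read(sv[1], &c, 1) == 1 && c == 'x';

    close(copy);    /* now the last close for this end */
    close(sv[1]);
    return ok;
}
```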
663 But it does not say that for sockets: "If fildes refers to a socket, close() shall cause the socket to be destroyed." This suggests behavior that implementations do not (and should not) implement.</description>
664 <guid>https://www.austingroupbugs.net/view.php?id=1525</guid>
665 <author>ben_pfaff <ben_pfaff@example.com></author>
666 <comments>https://www.austingroupbugs.net/view.php?id=1525#bugnotes</comments>
667 </item>
668 <item>
669 <title>0001523: Wrong layout of getopt "-"</title>
670 <link>https://www.austingroupbugs.net/view.php?id=1523</link>
671 <description>> If, when getopt() is called:<br />
672 ><br />
673 > argv[optind] is a null pointer*argv[optind] is not the character -<br />
674 > argv[optind] points to the string "-"<br />
675 <br />
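For orientation, the case named in the last line of the excerpt can be sketched in C. This assumes the three conditions are alternatives, each of which makes getopt() return -1 without changing optind; the helper name is hypothetical:<br />

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <unistd.h>

/* Hypothetical demo: calls getopt() with argv[optind] pointing to the
 * string "-" and reports the return value plus the resulting optind. */
static int getopt_on_lone_dash(int *optind_after)
{
    char a0[] = "prog", a1[] = "-";
    char *argv[] = { a0, a1, NULL };
    int rv;

    assert(optind_after != NULL);
    optind = 1;                     /* start scanning at argv[1] */
    rv = getopt(2, argv, "a");
    *optind_after = optind;
    return rv;
}
```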
676 I didn't understand the above excerpt, even after reading it multiple times.<br />
677 <br />
678 My misunderstanding was due to the missing line break before the first '*' and by the additional space before the last line.</description>
679 <guid>https://www.austingroupbugs.net/view.php?id=1523</guid>
680 <author>rillig <rillig@example.com></author>
681 <comments>https://www.austingroupbugs.net/view.php?id=1523#bugnotes</comments>
682 </item>
683 <item>
684 <title>0001522: mkdir() and S_ISVTX</title>
685 <link>https://www.austingroupbugs.net/view.php?id=1522</link>
686 <description>The mkdir() description states:<blockquote>When bits in <i>mode</i> other than the file permission bits are set, the meaning of these additional bits is implementation-defined.</blockquote>This is old POSIX.1-199x text from before the merge with SUS, after which the S_ISVTX bit has a specified meaning for directories (when the XSI option is supported), as per the <sys/stat.h> description of S_ISVTX: "On directories, restricted deletion flag". This needs to be accounted for in the mkdir() description.<br />
687 <br />
688 The umask() page also has a similar problem.</description>
689 <guid>https://www.austingroupbugs.net/view.php?id=1522</guid>
690 <author>geoffclare <geoffclare@example.com></author>
691 <comments>https://www.austingroupbugs.net/view.php?id=1522#bugnotes</comments>
692 </item>
693 <item>
694 <title>0001505: Make doesn't seem to specify unset macro expansion behaviour</title>
695 <link>https://www.austingroupbugs.net/view.php?id=1505</link>
696 <description>Hello,<br />
697 <br />
698 I was looking for an actual specification of<br />
699 what happens when trying to expand an undefined macro.<br />
700 <br />
701 Intuition would be that it expands to an empty string,<br />
702 but this doesn't seem to be specified at all,<br />
703 or maybe I missed that somewhere else.<br />
704 <br />
705 Could an implementation be allowed<br />
706 to generate an error on such case (or other behaviour)<br />
707 without having this defined?</description>
708 <guid>https://www.austingroupbugs.net/view.php?id=1505</guid>
709 <author>quinq <quinq@example.com></author>
710 <comments>https://www.austingroupbugs.net/view.php?id=1505#bugnotes</comments>
711 </item>
712 <item>
713 <title>0001036: Errors/Omissions in specification of here document redirection</title>
714 <link>https://www.austingroupbugs.net/view.php?id=1036</link>
715 <description>Aside from the question of just which newline is the "next" newline, that<br />
716 has been canvassed (without resolution I can see) elsewhere, there are<br />
717 several problems with the specification of here documents.<br />
718 <br />
719 First, given that the here doc is processed after encountering a newline<br />
720 (which newline is the other issue) they must be largely processed as a side<br />
721 effect of lexical processing (as newlines, other than those that happen to be literal) no longer exist in the scanned form of the shell input, they have<br />
722 served as token delimiters, and are not otherwise relevant. This would suggest<br />
723 that the here document is processed during lexical analysis - and nothing in the<br />
724 specification contradicts that. The spec does say that (given an unquoted)<br />
725 delimiter word, the text is subject to various expansions. It does not say that those expansions should not be performed while reading the here doc text, however I believe that is (or should be) the intent - that is, if the here doc<br />
726 is never used because it is attached to a command that is never executed (on the "wrong" side of an && or similar) then the expansions in the here doc should not be performed. It could be I am missing something, but I cannot see any text that says that the expansions in the here doc should be evaluated in the context of the command that is about to use the data, immediately before it is used (in the appropriate sequence of all applicable redirect operations).<br />
727 <br />
728 Second, the text says ...<br />
729 <br />
730 If any character in word is quoted, the delimiter shall be...<br />
731 <br />
732 and in the following paragraph ...<br />
733 <br />
734 If no characters in word are quoted, all lines of the ...<br />
735 <br />
736 but I do not believe that is what is intended, and is not what is actually<br />
737 implemented in any shell I can find. Consider ...<br />
738 <br />
739 cat << ""EOF<br />
740 lines of text<br />
741 EOF<br />
742 <br />
743 The delimiter there is the string EOF in which none of the characters were quoted. True it was preceded by a quoted null string, but that contains no quoted characters. Hence no characters of the delimiter word were quoted, and according to the spec, "lines of text" should be subject to the various expansions. No-one implements it that way, it is not whether any characters in the word are quoted, but whether any quote characters were encountered while scanning word.<br />
744 <br />
745 Third, in cases where expansions are done, nothing makes it explicit that the end delimiter cannot be found as the result of an expansion. That is, in the<br />
746 following, there is one here document that happens to contain the string<br />
747 echo foo << EOF<br />
748 and not two here documents<br />
749 <br />
750 end=EOF<br />
751 cat <<EOF<br />
752 lines of text<br />
753 $end<br />
754 echo foo <<EOF<br />
755 another line<br />
756 EOF<br />
757 <br />
758 Of course, if the first question above is resolved to make it clear that expansions do not happen when the document is being read, this would be a moot point, as $end expanding to EOF would not be known while the here doc is being read, which I believe is the correct interpretation.<br />
759 <br />
760 Fourth, I am totally confused by the relationship between double quoting and<br />
761 backtick command expansions, section 2.2.3 appears to say that if backticks appear inside double quotes, then the double-quote interpretation continues through the command expansion (if that were not true, it would not be possible for a double quoted string to start before a ` command substitution, and end inside it, as a " inside the `...` would be the start of a new string, not the end of the previous one (the same as it is in $( ) command substitutions).<br />
762 The relevance of this to here documents is illustrated by the following ...<br />
763 <br />
764 echo "` cat << EOF<br />
765 X = $(( 1 + 2 ))<br />
766 EOF<br />
767 `"<br />
768 <br />
769 If things are as I have postulated, then the EOF is quoted (by the double<br />
770 quotes that surround the command substitution) and hence the here document<br />
771 should not be expanded, and echo should (eventually) output<br />
772 <br />
773 X = $(( 1 + 2 ))<br />
774 <br />
775 and not<br />
776 <br />
777 X = 3<br />
778 <br />
779 but again, I do not believe this is in accordance with what any shell does.<br />
780 This again may be an artifact of the 2nd point above, and if the text is changed so that only quote characters encountered while scanning the delimiter word cause the expansion to be suppressed, and not whether "characters are quoted" then this issue will go away.<br />
781 <br />
782 Fifth, and more minor I think, when the delimiter is not quoted,<br />
783 the text states that backslashes work the way they do in double quoted strings, and references section 2.2.3 for the details. There we are informed that inside double quotes, \ is only special (only a quote character) when the following character is one of \ " ` $ and newline (so for example "\n" is a<br />
784 two character string). But then (back to 2.7.4) the text goes on to say that<br />
785 inside the here document " is not special. The problem is that it is not<br />
786 clear whether \ continues to act as a quote character when followed by this non-special " or not (ie: is \" in a here document, with an unquoted delimiter word, one character, or two?) I believe two is correct.<br />
787 <br />
788 Sixth, and perhaps most important of all, there is no discussion of what is expected to happen when the input string ends before the here document delimiter is encountered. Most important, because unlike the previous issues where I believe all shells (all I could find) actually agree on what should be done, and the text just needs to be more clear, for this one, there is a difference of opinion. Some shells treat end of file as equivalent to the<br />
789 delimiter, and go ahead and execute whatever command the here document was attached to with as much input as they managed to gather (one issues a warning when it does this, but does it anyway, most that adopt this behaviour do it silently.) Other shells consider this to be a redirect error, suppress execution of the command, and set $? to indicate failure. Personally I believe that the latter is the best approach, as it avoids situations where the<br />
790 shell eats the entire rest of the script as the here document because of some error or other (the one that happens to me from time to time is that I cut & paste a script, or script segment, and the tabs that had been present get converted to spaces, and then the <<-EOF does not stop on space space..EOF<br />
791 where it would have with tab EOF.)</description>
792 <guid>https://www.austingroupbugs.net/view.php?id=1036</guid>
793 <author>kre <kre@example.com></author>
794 <comments>https://www.austingroupbugs.net/view.php?id=1036#bugnotes</comments>
795 </item>
796 <item>
797 <title>0001532: "stty -g" output should not have to be split</title>
798 <link>https://www.austingroupbugs.net/view.php?id=1532</link>
799 <description>The resolution of <a href="https://www.austingroupbugs.net/view.php?id=1053">0001053</a>, about the specification of "stty -g" changes:<br />
800 <br />
801 -g<br />
802 <br />
803 Write to standard output all the current settings in an<br />
804 unspecified form that can be used as arguments to another<br />
805 invocation of the stty utility on the same system. The form<br />
806 used shall not contain any characters that would require<br />
807 quoting to avoid word expansion by the shell; see Section<br />
808 2.6 (on page 2353).<br />
809 <br />
810 to:<br />
811 <br />
812 -g<br />
813 <br />
814 Write to standard output all the current settings,<br />
815 optionally excluding the terminal window size, in an<br />
816 unspecified form that, when used as arguments to another<br />
817 invocation of the stty utility on the same system, attempts<br />
818 to apply those settings to the terminal. The form used shall<br />
819 not contain any sequence that would form an Informational<br />
820 Query, nor any characters that would require quoting to<br />
821 avoid word expansions, other than field splitting, by the<br />
822 shell; see Section 2.6 (on page 2353).<br />
823 <br />
824 While the 2018 edition had:<br />
825 <br />
826 -g<br />
827 Write to standard output all the current settings in an<br />
828 unspecified form that can be used as arguments to another<br />
829 invocation of the stty utility on the same system. The form<br />
830 used shall not contain any characters that would require<br />
831 quoting to avoid word expansion by the shell; see wordexp.<br />
832 <br />
833 It's one stream of bytes that applications send to standard output.<br />
834 <br />
835 For that stream of bytes to be "used as argument*S* to another<br />
836 invocation of the stty utility", it implies that that output should<br />
837 somehow be split into a list of arguments (and also implies the<br />
838 result fits in the ARG_MAX limit and doesn't contain NUL bytes).<br />
839 <br />
840 The reference to 2.6 suggests maybe "sh" should be involved to<br />
841 perform that splitting. It almost suggests that the output<br />
842 should be appended to "stty ", or possibly "stty -- " and fed to<br />
843 sh as in (printf "stty -- "; stty -g) | sh<br />
844 <br />
845 AFAIK, stty -g was introduced in SysIII circa 1980. It was<br />
846 outputting one word on one line made of a ":"-separated list of alnums<br />
847 from the portable charset and followed by a newline character.<br />
848 After it was specified by POSIX, it was also added to BSDs (some<br />
849 4.3 variant), with "=" in the list of characters that may occur<br />
850 (not in leading position) in the ":"-delimited list.<br />
851 GNU stty has used a format similar to SysIII from at least as<br />
852 far back as 1990.<br />
853 <br />
854 The way to save and restore it was always to store that output<br />
855 without the trailing newline characters, which in a shell can be<br />
856 done with:<br />
857 <br />
858 saveterm=$(stty -g)<br />
859 <br />
860 And to restore it:<br />
861 <br />
862 stty "$saveterm"<br />
863 <br />
864 The "--" being unnecessary because the output of stty -g never<br />
865 starts with "-" and with some (including GNU stty) not allowed.<br />
866 <br />
867 All the documentations of stty I've seen talk of *one* (an)<br />
868 argument to be passed to stty. They don't say the output may<br />
869 (let alone should) be split in any way to be passed as several<br />
870 arguments to stty.<br />
871 <br />
872 In the unlikely event that there are indeed stty implementations<br />
873 that require the output of stty -g to be split into arguments<br />
874 before being fed back to stty, we'd need to specify how those<br />
875 are meant to be split.</description>
876 <guid>https://www.austingroupbugs.net/view.php?id=1532</guid>
877 <author>stephane <stephane@example.com></author>
878 <comments>https://www.austingroupbugs.net/view.php?id=1532#bugnotes</comments>
879 </item>
880 <item>
881 <title>0001557: Better wording to describe FD_CLOEXEC.</title>
882 <link>https://www.austingroupbugs.net/view.php?id=1557</link>
883 <description>The current wording for FD_CLOEXEC in open is:<br />
884 <br />
885 > The FD_CLOEXEC file descriptor flag <br />
886 > associated with the new file descriptor<br />
887 > shall be cleared unless the <br />
888 > O_CLOEXEC flag is set in oflag.<br />
889 <br />
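The presumably intended reading (flag clear without O_CLOEXEC, set with it) can be checked with a minimal sketch. The helper name is hypothetical, and the result only shows what a given implementation does, not what the wording requires:<br />

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: open path read-only with extra oflag bits and report
 * whether FD_CLOEXEC is set on the new descriptor (1), clear (0),
 * or -1 if open() or fcntl() failed. */
static int cloexec_after_open(const char *path, int extra_oflag)
{
    int fd, fdflags;

    assert(path != NULL);
    fd = open(path, O_RDONLY | extra_oflag);
    if (fd == -1)
        return -1;
    fdflags = fcntl(fd, F_GETFD);   /* read the file descriptor flags */
    close(fd);
    if (fdflags == -1)
        return -1;
    return (fdflags & FD_CLOEXEC) != 0;
}
```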
890 It has a grammatical ambiguity as to whether FD_CLOEXEC is set if O_CLOEXEC flag is set.</description>
891 <guid>https://www.austingroupbugs.net/view.php?id=1557</guid>
892 <author>dannyniu <dannyniu@example.com></author>
893 <comments>https://www.austingroupbugs.net/view.php?id=1557#bugnotes</comments>
894 </item>
895 <item>
896 <title>0001551: sed: ambiguities in the how BREs/EREs are parsed/interpreted between delimiters (especially when these are special characters)</title>
897 <link>https://www.austingroupbugs.net/view.php?id=1551</link>
898 <description>Hey.<br />
899 <br />
900 First of all, I've asked/reported all this already at the mailing list:<br />
901 "sed and delimiters that are also special characters to REs"<br />
902 <a href="https://collaboration.opengroup.org/austin/plato/protected/mailarch.php?soph=N&action=show&archive=austin-group-l&num=33587&limit=100&offset=0&sid=">https://collaboration.opengroup.org/austin/plato/protected/mailarch.php?soph=N&action=show&archive=austin-group-l&num=33587&limit=100&offset=0&sid=</a> [<a href="https://collaboration.opengroup.org/austin/plato/protected/mailarch.php?soph=N&action=show&archive=austin-group-l&num=33587&limit=100&offset=0&sid=" target="_blank">^</a>]<br />
903 (unfortunately there seems to be no thread-view)<br />
904 <br />
905 So far, no one could really answer the core questions (or if I just didn't understand, then my apologies for writing again here).<br />
906 <br />
907 <br />
908 I was looking into using BREs/EREs within delimiters, which as far as POSIX is concerned should be only sed, and in:<br />
909 - context addresses (e.g. /RE/ or \xREx with x being another delimiter, of which the 1st needs to be quoted if not / )<br />
910 - s-command<br />
911 <br />
912 <br />
913 (I made another ticket (<a href="https://www.austingroupbugs.net/view.php?id=1550">https://www.austingroupbugs.net/view.php?id=1550</a> [<a href="https://www.austingroupbugs.net/view.php?id=1550" target="_blank">^</a>] ) with respect to clarifications/ambiguities about context addresses and delimiters, which may be a bit related.)<br />
914 <br />
915 <br />
916 This ticket covers presumed ambiguities in:<br />
917 When BREs/EREs are used within delimiters...<br />
918 AND<br />
919 ... the delimiter is a special character (or a character that would be special if quoted with a \).<br />
920 (in the above 3 cases, though in my examples I use only the s-command).<br />
921 <br />
922 <br />
923 <br />
924 As far as I can see, the documentation says with respect to the delimiters and their literal use in REs (or replacements) only:<br />
925 <br />
926 [i] »If the character designated by c appears following a <backslash>, then it shall be considered to be that literal character, which shall not terminate the RE. For example, in the context address "\xabc\xdefx", the second x stands for itself, so that the RE is "abcxdef".«<br />
927 (line 106088 et seq., in the draft)<br />
928 <br />
929 [ii] »Within the RE and the replacement, the RE delimiter itself can be used<br />
930 as a literal character if it is preceded by a <backslash>.«<br />
931 (line 106204 et seq., in the draft)<br />
932 <br />
933 [iii] for the s-command:<br />
934 »Any character other than <backslash> or <newline> can be used instead of a <slash> to delimit the RE and the replacement. Within the RE and the replacement, the RE delimiter itself can be used as a literal character if it is preceded by a <backslash>.«<br />
935 (line 106202 et seq., in the draft)<br />
936 <br />
937 <br />
938 <br />
939 IMO, that leaves open a number of questions and ambiguities:<br />
940 <br />
941 <br />
942 <br />
943 1) How are strings/commands with delimiters actually parsed (or split up)?<br />
944 <br />
945 Consider the following example:<br />
946 s(\\((X(<br />
947 <br />
948 <br />
949 There are IMO at least two ways to parse that:<br />
950 <br />
951 a) two stages<br />
952 - 1st: splitting up into RE and replacement parts first by going through the string and looking for any delimiter which is not immediately preceded by \ which is here the 3rd ( .<br />
953 - 2nd: taking the two parts (RE and replacement) and unquote any quoted delimiter \(<br />
954 RE-part = \\(<br />
955 replacement-part = X<br />
956 unquoted:<br />
957 RE-part = \( (here the \( became a ( with respect to the RE)<br />
958 replacement-part = X<br />
959 <br />
960 now parse the RE \( as usual... assuming a BRE we'd end up with \( as the sequence starting a sub-pattern.<br />
961 <br />
962 So effectively here we'd get:<br />
963 s/\(/X/<br />
964 => would as such be an error, but there could of course have been an 'abc\)' in the RE, making it valid.<br />
965 <br />
966 <br />
967 b) one stage<br />
968 going from left to right applying the varying rules (for REs and delimiters), whichever comes first, resulting in:<br />
969 s( ah, an s command with ( as delimiter<br />
970 \\ parser first sees these, makes them a literal \<br />
971 (( ah, the 2nd and 3rd delimiter<br />
972 X( flags to the s command<br />
973 => would likely be an error, given the unknown flags<br />
974 <br />
975 <br />
976 I couldn't find any place, where it really says clearly (or unclearly) how the parsing is to be done.<br />
977 Just because (b) seems the more logical way to do it, doesn't make it mandatory.<br />
978 <br />
979 In particular, (a) doesn't seem to be ruled out: [i], [ii] and [iii] all effectively say that if the delimiter is preceded by \ it's taken literally. That wording IMO actually points more towards (a), because (b) would require wording that says something like:<br />
980 "if the delimiter is preceded by \ THAT BY ITSELF IS NOT ESCAPED"<br />
981 <br />
982 And both ways (and there might be more crazy ways to do it ;-) )... produce quite different results.<br />
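<br />
To make the difference between (a) and (b) concrete, here is a minimal Python sketch of both readings. It is only an illustration: the function names split_two_stage and split_one_stage are made up for this example, neither is taken from any real sed, and both assume an s command that has its two terminating delimiters.<br />

```python
def split_two_stage(cmd):
    """Reading (a): split first, unquote afterwards (hypothetical helper)."""
    delim, body = cmd[1], cmd[2:]
    parts, start = [], 0
    for i, ch in enumerate(body):
        if len(parts) == 2:
            break
        # A delimiter only terminates if the character right before it is not a backslash.
        quoted = i != 0 and body[i - 1] == "\\"
        if ch == delim and not quoted:
            parts.append(body[start:i])
            start = i + 1
    if len(parts) != 2:
        raise ValueError("unterminated s command")
    # Second stage: turn every quoted delimiter into the bare delimiter.
    re_part, repl = (p.replace("\\" + delim, delim) for p in parts)
    return re_part, repl, body[start:]


def split_one_stage(cmd):
    """Reading (b): one left-to-right pass (hypothetical helper)."""
    delim, body = cmd[1], cmd[2:]
    parts, cur, i = [], "", 0
    while i != len(body) and len(parts) != 2:
        ch = body[i]
        if ch == "\\" and i + 1 != len(body):
            nxt = body[i + 1]
            # The backslash consumes the next character, whatever it is.
            # Assumption: a quoted delimiter is passed on without its backslash.
            cur += nxt if nxt == delim else ch + nxt
            i += 2
            continue
        if ch == delim:
            parts.append(cur)
            cur = ""
        else:
            cur += ch
        i += 1
    if len(parts) != 2:
        raise ValueError("unterminated s command")
    return parts[0], parts[1], body[i:]


example = r"s(\\((X("
print(split_two_stage(example))
print(split_one_stage(example))
```

With the example from above, reading (a) ends up with the RE \( and the replacement X, while reading (b) ends up with the RE \\, an empty replacement and the flags X( .<br />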
983 <br />
984 <br />
985 <br />
986 <br />
987 2) What if the delimiter is a special character (assuming BREs here).<br />
988 <br />
989 [i], [ii], [iii] all effectively say, that if the delimiter is preceded by \ it's taken literally.<br />
990 <br />
991 <br />
992 a) What does »literally« mean here?<br />
993 - It's not taken as a delimiter, but "directly" used as RE?<br />
994 So if one has:<br />
995 s.\..X.<br />
996 it would be used as:<br />
997 s/./X/<br />
998 (btw, this is what GNU sed does:<br />
999 $ printf '%s\n' '.' | sed 's.\..X.'<br />
1000 X<br />
1001 $ printf '%s\n' 'v' | sed 's.\..X.'<br />
1002 X<br />
1003 )<br />
1004 <br />
1005 or:<br />
1006 <br />
1007 - It's not taken as a delimiter AND in RE context it would also be taken literally, even if it would normally be a special character:<br />
1008 So if one has:<br />
1009 s.\..X.<br />
1010 it would be used as:<br />
1011 s/\./X/<br />
1012 (btw, this is what BusyBox sed does:<br />
1013 $ printf '%s\n' '.' | busybox sed 's.\..X.'<br />
1014 X<br />
1015 $ printf '%s\n' 'v' | busybox sed 's.\..X.'<br />
1016 v<br />
1017 )<br />
1018 <br />
1019 And again, as above, the standard says "if the delimiter is preceded by \" ... it does not say, that the \ is by itself NOT escaped, which links to (1), that is: how the parsing is actually done?!<br />
1020 <br />
1021 => The standard should clarify this ambiguity; the fact that two widely used implementations (GNU vs. BusyBox) already behave differently shows that there is something fishy.<br />
1022 And if it's undefined, the standard should also mention that (and probably describe it with an example) and warn against using any special characters as delimiters.<br />
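<br />
The two readings of »literally« can be modelled in a few lines of Python. This is a sketch under assumptions, not a description of how either sed is implemented: the helper names are invented, Python's re syntax stands in for a BRE (the two agree for . and \. ), and the naive string replace ignores the case of a backslash that is itself escaped.<br />

```python
import re


def quoted_delim_keeps_re_meaning(re_part, delim):
    # First reading (what GNU sed was observed to do above):
    # \DELIM becomes DELIM, which is then interpreted by the RE as usual.
    return re_part.replace("\\" + delim, delim)


def quoted_delim_is_literal(re_part, delim):
    # Second reading (what BusyBox sed was observed to do above):
    # \DELIM becomes a DELIM that matches only itself.
    return re_part.replace("\\" + delim, re.escape(delim))


def substitute(pattern, line):
    # Replace only the first match, like s without the g flag.
    return re.sub(pattern, "X", line, count=1)


re_part = r"\."  # the RE part of s.\..X.
for line in (".", "v"):
    first = substitute(quoted_delim_keeps_re_meaning(re_part, "."), line)
    second = substitute(quoted_delim_is_literal(re_part, "."), line)
    print(line, first, second)
```

Under the first reading both input lines turn into X; under the second reading only the line containing the . does, which mirrors the GNU and BusyBox transcripts above.<br />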
1023 <br />
1024 <br />
1025 b) Depending on the answer to (2a), there is no mention of whether one can get back the special meaning or the literal character, respectively.<br />
1026 <br />
1027 In both cases, the question would arise:<br />
1028 Other than using a more sane delimiter ;-) ...<br />
1029 <br />
1030 - »literally« means, it's no longer a delimiter, but other than that goes directly into the RE and may be special there:<br />
1031 then: s.\..X. would be effectively s/./X/<br />
1032 ... can I get the literal . here and if so how?<br />
1033 <br />
1034 - »literally« means, literal even with respect to the RE:<br />
1035 then: s.\..X. would be effectively s/\./X/<br />
1036 ... can I get the special meaning . here and if so how?<br />
1037 <br />
1038 => even if it's simply not possible to get the other meaning (whichever is actually the "right" one), the standard should explicitly mention that.<br />
1039 <br />
1040 <br />
1041 c) (2a) and (2b) also affect characters that get their special meaning (in the RE) only when preceded by \ .<br />
1042 <br />
1043 Consider:<br />
1044 s(\((X(<br />
1045 <br />
1046 Unlike in (1) above (where s(\\((X( was used), there is no parsing ambiguity here, and the command should effectively be the same as:<br />
1047 s/<something>/X/<br />
1048 <br />
1049 Again, for <something> the question from (2a) comes up:<br />
1050 What does the RE see?<br />
1051 - ( (the literal ( )<br />
1052 (btw, this is what GNU sed does:<br />
1053 $ printf '%s\n' '(' | sed 's(\((X('<br />
1054 X<br />
1055 $ printf '%s\n' 'v' | sed 's(\((X('<br />
1056 v<br />
1057 )<br />
1058 <br />
1059 or:<br />
1060 - \( (the sequence \( which starts a subpattern)<br />
1061 (I know the resulting RE would lack a closing '\)' and something within the<br />
1062 subexpression ... but that could be easily added.)<br />
1063 (btw, this is what BusyBox sed does:<br />
1064 $ printf '%s\n' 'anything' | busybox sed 's(\((X('<br />
1065 sed: bad regex '\(': Unmatched ( or \(<br />
1066 )<br />
1067 <br />
1068 <br />
1069 So as one can see, the same questions as in (2a) and (2b) pop up for such characters that get their special meaning only when preceded by \ .<br />
1070 <br />
1071 <br />
1072 <br />
1073 d) I found that e.g. GNU sed (which, per (2a), uses a quoted delimiter that is a special character AS a special character in the RE) allows the following workaround (to get the literal character):<br />
1074 s.[.].X.<br />
1075 which seems then to be used (by GNU sed) as:<br />
1076 s/[.]/X/<br />
1077 <br />
1078 But again, at least to me the standard seems to be ambiguous with<br />
1079 respect to how the original form should be parsed (see point (1) above).<br />
1080 <br />
1081 While the bracket expression itself is defined to take the . inside<br />
1082 literally, POSIX nowhere seems to say that this is even to be seen as a<br />
1083 '.' for the RE and not as the 2nd delimiter.<br />
1084 <br />
1085 Instead, [i], [ii] and [iii] rather seem to imply that because the (2nd) . is not preceded by \ it *IS* taken as a delimiter.<br />
1086 <br />
1087 (GNU sed:<br />
1088 $ printf '%s\n' '.' | sed 's.[.].X.'<br />
1089 X<br />
1090 $ printf '%s\n' 'v' | sed 's.[.].X.'<br />
1091 v<br />
1092 )<br />
1093 <br />
1094 (BusyBox sed also works like that:<br />
1095 $ printf '%s\n' '.' | busybox sed 's.[.].X.'<br />
1096 X<br />
1097 $ printf '%s\n' 'v' | busybox sed 's.[.].X.'<br />
1098 v<br />
1099 but since BusyBox sed seems to treat the quoted delimiter \. as a literal . in the RE anyway (unlike GNU sed):<br />
1100 $ printf '%s\n' '.' | busybox sed 's.\..X.'<br />
1101 X<br />
1102 $ printf '%s\n' 'v' | busybox sed 's.\..X.'<br />
1103 v<br />
1104 the "trick" is not really a workaround to get the "other meaning" (which would be the special meaning of . here).
1105 )<br />
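<br />
One way an implementation might arrive at the behaviour both seds show for s.[.].X. is to skip over bracket expressions while looking for the delimiter that ends the RE. The following Python sketch shows that idea; find_re_end is a made-up name, the handling of bracket expressions is deliberately simplified (no [:class:], [=equiv=] or [.coll.] elements), and nothing here is claimed to be what GNU or BusyBox sed actually do internally.<br />

```python
def find_re_end(body, delim):
    """Index of the delimiter that ends the RE, or -1 (hypothetical helper).

    body is the text after the opening delimiter, e.g. "[.].X." for s.[.].X.
    """
    i = 0
    while len(body) > i:
        ch = body[i]
        if ch == "\\":
            i += 2  # skip the backslash and the character it quotes
        elif ch == "[":
            j = i + 1
            if body[j:j + 1] == "^":
                j += 1  # negated bracket expression
            if body[j:j + 1] == "]":
                j += 1  # a leading ] is an ordinary member of the list
            close = body.find("]", j)
            if close == -1:
                return -1  # unterminated bracket expression
            i = close + 1  # continue after the closing ]
        elif ch == delim:
            return i
        else:
            i += 1
    return -1


body = "[.].X."
end = find_re_end(body, ".")
print(body[:end])            # RE found by the bracket-aware scan
print(body[:body.find(".")])  # RE found by stopping at the first unquoted delimiter
```

The bracket-aware scan yields the RE [.] for the example, whereas stopping at the first . that is not preceded by \ (which is how [i], [ii] and [iii] can be read) would yield just [ .<br />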
1106 <br />
1107 <br />
1108 <br />
1109 <br />
1110 3) Probably just a bug:<br />
1111 <br />
1112 Not really an issue with POSIX, but just as an example of how confusing things apparently are:<br />
1113 <br />
1114 <br />
1115 (GNU sed:<br />
1116 $ printf '%s\n' '9+' | sed 's+9\++X+'<br />
1117 X<br />
1118 $ printf '%s\n' '99+' | sed 's+9\++X+'<br />
1119 9X<br />
1120 $ printf '%s\n' '999+' | sed 's+9\++X+'<br />
1121 99X<br />
1122 )<br />
1123 These results are IMO fine, regardless of my other questions above.<br />
1124 <br />
1125 In BREs, + alone is never special, and whether one parses all at once<br />
1126 from left to right (as in (1b))... or first looks for unquoted delimiter characters<br />
1127 and splits the command there (as in (1a))...<br />
1128 ... the RE should always be effectively the string 9+ ... which is (in<br />
1129 BREs) the literal 9 followed by the literal + .<br />
1130 <br />
1131 <br />
1132 However...<br />
1133 <br />
1134 <br />
1135 (BusyBox sed:<br />
1136 $ printf '%s\n' '9+' | busybox sed 's+9\++X+'<br />
1137 X+<br />
1138 $ printf '%s\n' '99+' | busybox sed 's+9\++X+'<br />
1139 X+<br />
1140 $ printf '%s\n' '999+' | busybox sed 's+9\++X+'<br />
1141 X+<br />
1142 )<br />
1143 somehow does both:<br />
1144 - make out of the \+ a non-delimiter<br />
1145 - transforms the (wrt BRE) non-special character + into a special one.<br />
1146 <br />
1147 Which I think is generally (regardless of the interpretation or any<br />
1148 ambiguities in POSIX) a bug (I'll report it there).<br />
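<br />
The two sets of results can be reproduced with a small Python model. The explanation it encodes is an assumption on my side and not something confirmed for BusyBox: the backslash is kept, so the RE becomes 9\+ , and the regex engine then applies the common extension in which \+ in a BRE means "one or more". Note that Python's re syntax is the other way round from a BRE here, so the patterns below are spelled accordingly.<br />

```python
import re

# BRE "9+" (literal 9, literal +) is spelled 9\+ in Python's syntax.
literal_plus = re.compile(r"9\+")
# BRE "9\+" with the \+ extension (one or more 9) is spelled 9+ in Python's syntax.
repeat_plus = re.compile(r"9+")

for line in ("9+", "99+", "999+"):
    as_literal = literal_plus.sub("X", line, count=1)
    as_repeat = repeat_plus.sub("X", line, count=1)
    print(line, as_literal, as_repeat)
```

The first pattern gives the same results as the GNU sed transcript (X, 9X, 99X), the second gives X+ for all three lines, as in the BusyBox transcript.<br />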
1149 <br />
1150 <br />
1151 All the above should also at least partially apply to context addresses.<br />
1152 <br />
1153 I'd guess none of it applies to the y-command, though, but I haven't really looked at it.</description>
1154 <guid>https://www.austingroupbugs.net/view.php?id=1551</guid>
1155 <author>calestyo <calestyo@example.com></author>
1156 <comments>https://www.austingroupbugs.net/view.php?id=1551#bugnotes</comments>
1157 </item>
1158 <item>
1159 <title>0001560: clarify wording of command substitution</title>
1160 <link>https://www.austingroupbugs.net/view.php?id=1560</link>
1161 <description>In:<br />
1162 <a href="https://collaboration.opengroup.org/austin/plato/protected/mailarch.php?soph=N&action=show&archive=austin-group-l&num=33716&limit=100&offset=0&sid=">https://collaboration.opengroup.org/austin/plato/protected/mailarch.php?soph=N&action=show&archive=austin-group-l&num=33716&limit=100&offset=0&sid=</a> [<a href="https://collaboration.opengroup.org/austin/plato/protected/mailarch.php?soph=N&action=show&archive=austin-group-l&num=33716&limit=100&offset=0&sid=" target="_blank">^</a>]<br />
1163 <br />
1164 I had asked whether POSIX requires any conforming shell to consider a last "line" without trailing newline in a command substitution for that purpose, or whether a shell would in principle be allowed to ignore such a line if it had no trailing newline.<br />
1165 <br />
1166 The answer was that a shell MUST in fact consider such lines.</description>
1167 <guid>https://www.austingroupbugs.net/view.php?id=1560</guid>
1168 <author>calestyo <calestyo@example.com></author>
1169 <comments>https://www.austingroupbugs.net/view.php?id=1560#bugnotes</comments>
1170 </item>
1171 </channel>
1172 </rss>