Many of these are really Unix gotchas that predate the implementation of GNU coreutils.
Not sure if this is the kind of thing you're talking about, but there are multiple regex languages for sed, grep, grep -E, and probably awk/gawk. Another issue is the treatment of Unicode; it would be nice if all the utilities handled UTF-8, but many don't.
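To make the regex point concrete, here is a minimal sketch (sample strings are made up for illustration) showing the same pattern meaning different things to plain grep, which uses basic regular expressions, and grep -E, which uses extended ones:

```shell
# BRE (plain grep): '+' is an ordinary character, so only the literal text "a+b" matches.
bre_match=$(printf 'a+b\naab\n' | grep 'a+b')

# ERE (grep -E): '+' means "one or more", so "aab" matches and the literal "a+b" does not.
ere_match=$(printf 'a+b\naab\n' | grep -E 'a+b')

echo "BRE matched: $bre_match"
echo "ERE matched: $ere_match"
```

The same split applies to sed versus sed -E, so a pattern copied between tools can silently change meaning.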
The "date" command pads output with zeros by default. If "date" output is used in shell arithmetic, any value with a leading zero is interpreted as an octal number. This leads to subtle bugs where the results look plausible but are incorrect, e.g.:
$ echo $(($(date -d 3/6/2021 '+%j')+10))
Ten days after March 6th becomes March 4th! (March 6th is day 065, which the shell reads as octal 53, so the sum is 63.) This is easy to fix by putting a minus after the percent symbol ('+%-j'), but it took some head-scratching to figure out what was going on. I suggest a warning in the "date" man page about the possible misinterpretation of leading zeros.
@Bruce, that's kinda a shell gotcha.
Note also you can use relative dates in date itself, like:
date -d '3/6/2021 + 10 days' '+%j'
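For anyone wanting to see the octal issue and both workarounds side by side, here is a small sketch. It assumes GNU date (for -d and the '-' padding flag); the variable names are just for illustration:

```shell
# Zero-padded day of year for March 6th 2021: 065
padded=$(date -d 3/6/2021 '+%j')
# The shell reads 065 as octal (53 decimal), so this yields 63, not 75
wrong=$((padded + 10))

# GNU date: a '-' after the '%' disables padding, giving 65
unpadded=$(date -d 3/6/2021 '+%-j')
right=$((unpadded + 10))

# Or let date do the arithmetic itself, avoiding the shell entirely
relative=$(date -d '3/6/2021 + 10 days' '+%j')

echo "padded=$padded wrong=$wrong right=$right relative=$relative"
```

Letting date do the date arithmetic is probably the more robust choice, since it also copes with month and year boundaries.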
`sort -u -nr` is not the same as `sort -nr|uniq`
See [this Reddit post](https://www.reddit.com/r/openSUSE/comments/tg70de/something_weird_about_sort_u/) for an example.
I suppose that is a bit of a gotcha.
I.e. -u considers only the part of the line being compared (the sort key), rather than the whole line.
I'll consider adding it.
Note the full docs already mention this with:
https://github.com/coreutils/coreutils/commit/55fe28e3e
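A minimal sketch of the difference, using made-up input where two lines share the same leading number. With -n only the numeric prefix is compared, so -u treats "1 a" and "1 b" as duplicates, whereas uniq compares whole lines:

```shell
input='1 a
1 b
2 c'

# -u drops lines whose *numeric key* is equal, so one of the "1 ..." lines is lost
with_u=$(printf '%s\n' "$input" | LC_ALL=C sort -u -nr | wc -l)

# uniq compares entire lines, so all three distinct lines survive
with_uniq=$(printf '%s\n' "$input" | LC_ALL=C sort -nr | uniq | wc -l)

echo "sort -u -nr:     $with_u lines"
echo "sort -nr | uniq: $with_uniq lines"
```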
Multicolumn pr, that is pr -COLUMN where COLUMN>=2, implicitly turns on options -e (expand input tab characters to spaces) and -i (greedily convert runs of output space characters to tabs). Output tabs may appear where no input tabs existed; further processing of the output may be fraught. This pipeline will eliminate all output tabs: pr -COLUMN | pr -e -t.
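One way to check this is to count tab characters in the output. The sketch below assumes GNU pr with its default page width; tab_count is a hypothetical helper defined here, not a standard command:

```shell
# Hypothetical helper: count the tab characters arriving on stdin
tab_count() { tr -cd '\t' | wc -c; }

# The input has no tabs, yet two-column output pads between columns using tabs
multi=$(printf '1\n2\n3\n4\n' | pr -2 -t | tab_count)

# Passing the result through pr -e -t expands those tabs back to spaces
clean=$(printf '1\n2\n3\n4\n' | pr -2 -t | pr -e -t | tab_count)

echo "tabs after pr -2 -t:            $multi"
echo "tabs after pr -2 -t | pr -e -t: $clean"
```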
@Doug McIlroy, good call. I've added your comment to the page directly at:
https://www.pixelbeat.org/docs/coreutils-gotchas.html#pr
Thanks!
For me (coreutils 9.5) 'dd status=none count=1 if=/dev/random bs=512 | wc -c' gives the expected result: 512, i.e. no gotcha
@Paul, that's because /dev/random has since been changed to keep generating data rather than returning a smaller amount than requested.
I've updated the example to be a little more involved, but it should now demonstrate the issue on any system.
thanks
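For readers who want to try the short-read behaviour without relying on /dev/random, here is a rough sketch (not necessarily the example now on the page). It assumes GNU dd for status=none and iflag=fullblock, and the first result depends on timing, so treat the "2" as likely rather than guaranteed:

```shell
# The writer sends 2 bytes, pauses, then sends 2 more. dd's single read()
# usually returns just the first 2 bytes, and count=1 treats that as a whole block.
partial=$( (printf 'ab'; sleep 1; printf 'cd') | dd bs=4 count=1 status=none | wc -c )

# iflag=fullblock makes dd keep reading until the 4-byte block is full
full=$( (printf 'ab'; sleep 1; printf 'cd') | dd bs=4 count=1 iflag=fullblock status=none | wc -c )

echo "without fullblock: $partial bytes (typically 2)"
echo "with fullblock:    $full bytes"
```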