...making Linux just a little more fun!
Rick Moen [rick at linuxmafia.com]
Tue, 7 Nov 2006 15:48:08 -0800
Thread quoted below could be grist for the TAG mill, or the makings of a 2 cent tip, or something else.
Date: Tue, 7 Nov 2006 12:16:25 -0800
To: conspire@linuxmafia.com
X-Mas: Bah humbug.
User-Agent: Mutt/1.5.11+cvs20060403
From: Rick Moen <rick@linuxmafia.com>
Subject: [conspire] Puzzle: How do you sort IP address lists?

There's a maintenance task I have to do occasionally, that is very much The Wrong Thing over the long term, but necessary in the short term: I keep a blocklist of IP addresses that my SMTP server shouldn't accept mail from. SVLUG's server, on which I'm interim sysadmin, has a list just like it. Since I maintain both lists, it's logical to combine them, run them through 'uniq' (to eliminate duplicates), and sort the result -- to benefit both sites.
That's where the 'puzzle' bit comes in. But first, why it's The Wrong Thing:
Security author Marcus J. Ranum has a dictum that 'enumerating badness' is dumb (http://www.ranum.com/security/computer_security/editorials/dumb/):
Back in the early days of computer security, there were only a relatively small number of well-known security holes. That had a lot to do with the widespread adoption of "Default Permit" because, when there were only 15 well-known ways to hack into a network, it was possible to individually examine and think about those 15 attack vectors and block them. So security practitioners got into the habit of "Enumerating Badness" - listing all the bad things that we know about. Once you list all the badness, then you can put things in place to detect it, or block it.

Why is "Enumerating Badness" a dumb idea? It's a dumb idea because sometime around 1992 the amount of Badness in the Internet began to vastly outweigh the amount of Goodness. For every harmless, legitimate, application, there are dozens or hundreds of pieces of malware, worm tests, exploits, or viral code. Examine a typical antivirus package and you'll see it knows about 75,000+ viruses that might infect your machine. Compare that to the legitimate 30 or so apps that I've installed on my machine, and you can see it's rather dumb to try to track 75,000 pieces of Badness when even a simpleton could track 30 pieces of Goodness. [...]

So, in keeping blocklists of IP addresses that have been zombified and used for mass-mailed spam, 419-scammail, etc., I'm aware of doing something a bit dumb. It's a losing strategy. I'm doing it on linuxmafia.com because the site is badly short on RAM and disk space in the short term (still need to migrate to that VA Linux 2230), and so software upgrades are deferred. Similarly, the SVLUG host has a scarily broken package system, and is therefore to be migrated rather than worked on in place, as well. So, we limp by on both machines with some long-term losing anti-spam methods because they're short-term palliatives.
Getting back to the puzzle, you'd think that GNU sort would be easily adaptable to a list like this, right? Consider this 11-address chunk of linuxmafia.com's blocklist:
    4.3.76.194
    8.10.33.176
    10.123.189.105
    12.30.72.162
    12.149.177.21
    12.154.4.213
    12.159.232.66
    12.205.7.190
    12.206.142.76
    12.214.50.126
    12.221.163.162

Just 'sort' as a filter with no options does this:

    10.123.189.105
    12.149.177.21
    12.154.4.213
    12.159.232.66
    12.205.7.190
    12.206.142.76
    12.214.50.126
    12.221.163.162
    12.30.72.162
    4.3.76.194
    8.10.33.176

Hmm, fine up until the last three lines, but then it becomes apparent that 'sort' is using strict ASCII order. So, you hit the manpage. '-n' for 'compare according to string numerical value' seems promising, as does '-g' for 'compare according to general numerical value'. Those get you:

    4.3.76.194
    8.10.33.176
    10.123.189.105
    12.149.177.21
    12.154.4.213
    12.159.232.66
    12.205.7.190
    12.206.142.76
    12.214.50.126
    12.221.163.162
    12.30.72.162

and

    4.3.76.194
    8.10.33.176
    10.123.189.105
    12.149.177.21
    12.154.4.213
    12.159.232.66
    12.205.7.190
    12.206.142.76
    12.214.50.126
    12.221.163.162
    12.30.72.162

No cigar.
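A rough Python sketch of why both attempts miss, using the same eleven addresses. The `leading_decimal` helper is an assumption about how '-n' reads each line (the first two octets taken as a single decimal number, so "12.30" becomes 12.3); it is a model that reproduces the output above, not a description of GNU sort's internals.

```python
# The eleven sample addresses from the message above.
ips = """4.3.76.194 8.10.33.176 10.123.189.105 12.30.72.162 12.149.177.21
12.154.4.213 12.159.232.66 12.205.7.190 12.206.142.76 12.214.50.126
12.221.163.162""".split()

# Plain 'sort': character-by-character comparison, so "12..." < "4..." < "8...".
lexical = sorted(ips)


def leading_decimal(line):
    """Assumed model of 'sort -n': the first two octets read as one decimal."""
    first, second = line.split(".")[:2]
    return float(first + "." + second)


# "12.30" is read as 12.3, which is larger than 12.221, so it lands last.
numeric = sorted(ips, key=leading_decimal)

print(lexical[-3:])
print(numeric[-1])
```

Under that model, 12.30.72.162 sorts after 12.221.163.162 simply because 12.3 is greater than 12.221.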
Personally, I played with these things for a while, gave up and switched to awk, and had the problem mostly solved with a rather ghastly script when I thought 'Wait a second! That's absurd. We should be able to do this using just GNU sort. If it can't sort IP addresses, what the hell good is it?'
So, I went back and eventually figured it out -- and I'm wondering if any other subscriber has either already solved this problem or cares to take a crack at it.
(I'll also really admire someone's elegant solution in, e.g., Python, Perl, or Ruby -- but I'm just boggling at how non-obvious my 'sort' solution seems, and want to compare notes.)
-- Cheers, Rick Moen Ita erat quando hic adveni. rick@linuxmafia.com
Date: Tue, 7 Nov 2006 12:33:23 -0800 (PST)
From: Tom Macke <macke@scripps.edu>
To: Rick Moen <rick@linuxmafia.com>
Cc: conspire@linuxmafia.com
Subject: [conspire] sort -t.

Use -t. to break the lines into fields on ., then sort 4 ints from left to right:

    sort -t. +0n -1 +1n -2 +2n -3 +3n -4 <ip.list > ip.list.sort

Input:

    10.123.189.105
    12.149.177.21
    12.154.4.213
    12.159.232.66
    12.205.7.190
    12.206.142.76
    12.214.50.126
    12.221.163.162
    12.30.72.162
    4.3.76.194
    8.10.33.176

Output:

    4.3.76.194
    8.10.33.176
    10.123.189.105
    12.30.72.162
    12.149.177.21
    12.154.4.213
    12.159.232.66
    12.205.7.190
    12.206.142.76
    12.214.50.126
    12.221.163.162

cheers, tom
Date: Tue, 7 Nov 2006 20:37:29 +0000
From: Nick Moffitt <nick@zork.net>
To: conspire@linuxmafia.com
Subject: Re: [conspire] Puzzle: How do you sort IP address lists?

Rick Moen:
> Personally, I played with these things for a while, gave up and
> switched to awk, and had the problem mostly solved with a rather
> ghastly script when I thought 'Wait a second! That's absurd. We
> should be able to do this using just GNU sort. If it can't sort IP
> addresses, what the hell good is it?'
>
> So, I went back and eventually figured it out -- and I'm wondering if
> any other subscriber has either already solved this problem or cares
> to take a crack at it.
I have run into this in the past, actually, only with timestamps. I ended up doing a first pass sort, then breaking it up and sorting within using -t and -k to set the field separator and sort starting field, respectively. I ended up doing three successive runs, in reverse order, and naming the files with the prefixes. I then did a final sort to get the filenames in the order I wanted and catted them together. It was a one or two-liner that has since expired from my bash_history, but I was cursing and spitting the whole time.
But as I look now, it seems that you can specify multiple -k entries, and force it to sort on the column alone by specifying an end to the sort criterion as well:
    sort -n -t . -k 1,1 -k 2,2 -k 3,3 -k 4,4

I'm kind of alarmed that GNU sort hasn't picked up more sort contexts. I found myself in dire need of a hex sort a while ago, and ended up resorting to python.
On a vaguely related tangent, Ryan Finnie has packaged cidrgrep in sid, and it should be in testing by now. Hooray for grep using CIDR ranges instead of regexes! Why doesn't grep have this already? It makes me wonder if there are more pattern specifications we use that would be useful as options to common tools.
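For the curious, the idea of matching by CIDR range rather than by regex can be sketched in a few lines of Python with the standard `ipaddress` module. This is only an illustration of the concept; `cidr_filter` is a made-up helper name and says nothing about how cidrgrep itself is implemented.

```python
import ipaddress


def cidr_filter(lines, cidr):
    """Return the lines whose first whitespace-separated token lies in cidr."""
    network = ipaddress.ip_network(cidr)
    hits = []
    for line in lines:
        tokens = line.split()
        if not tokens:
            continue  # blank line
        try:
            address = ipaddress.ip_address(tokens[0])
        except ValueError:
            continue  # first token is not an IP address; skip the line
        if address in network:
            hits.append(line)
    return hits


sample = ["4.3.76.194", "12.30.72.162", "12.221.163.162", "not-an-address"]
matches = cidr_filter(sample, "12.0.0.0/8")
print(matches)
```

A regex for "anything in 12.0.0.0/8" is easy enough, but ranges that do not fall on octet boundaries get ugly fast, which is presumably the appeal.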
-- "N'aimez pas votre voiture? Nick Moffitt Alor, l'heure est arrive pour la brulé" nick@teh.entar.net -- Mark Jaroski
Date: Tue, 7 Nov 2006 12:57:08 -0800
From: Rick Moen <rick@linuxmafia.com>
To: conspire@linuxmafia.com
X-Mas: Bah humbug.
User-Agent: Mutt/1.5.11+cvs20060403
Subject: Re: [conspire] sort -t.

Quoting Tom Macke (macke@scripps.edu):
> Use -t. to break the lines into fields on ., then sort 4 ints from
> left to right:
>
> sort -t. +0n -1 +1n -2 +2n -3 +3n -4 <ip.list > ip.list.sort
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
Huh! I did figure out that I needed '-t.' (or, equivalently, '--field-separator=.') -- but I can't find your field-specification strings described in the manpage or texinfo docs. When I dug into the latter, what I found instead was a suggestion to use the '-k' (key) option. Looking more closely, I now find this reference in the info docs:
    On older systems, `sort' supports an obsolete origin-zero syntax
    `+POS1 [-POS2]' for specifying sort keys. POSIX 1003.1-2001 (*note
    Standards conformance::) does not allow this; use `-k' instead.

Your Unix-greybeard credentials are showing, Tom. ;->
My solution, using '-k', did end up being uglier than yours by a fair measure. The info docs say:
    `-k POS1[,POS2]'
    `--key=POS1[,POS2]'
         Specify a sort field that consists of the part of the line between POS1 and POS2 (or the end of the line, if POS2 is omitted), inclusive. Fields and character positions are numbered starting with 1. So to sort on the second field, you'd use `--key=2,2' (`-k 2,2'). See below for more examples.

I thus came up with:

    $ sort -u -n -t. -k 1,1 -k 2,2 -k 3,3 -k 4,4 ip > ip.sorted

The '-u' is for uniq-ing on the fly. '-n' is a numeric-value sort appropriate for most types of numbers (that don't use leading plus characters or exponential notation, blessedly unlikely in IP addresses). '-t.' specifies that period is the applicable field separator (rather than whitespace).
Which leaves the '-k' string (equivalent to your origin-zero specifiers): It says, 'Hey, stupid sort program! Now that I've handed you a clue about where the fields begin and end, and told you to sort by numeric value, please also be aware that I'd like you to find a number, then a second number, then a third number, then a fourth number. Kindly use all four _as numbers_ when you sort this puppy.'
What a hassle. First, you have to say 'Use numbers', then you have to add '...and I mean, specifically, four of them.'
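The same recipe (split on the dot, compare four fields as numbers, drop duplicates) can be written out as a minimal Python sketch. The helper names `octets` and `sort_unique` are invented for illustration, and `set()` only approximates '-u': it drops exact duplicate strings, whereas 'sort -u' compares lines on the chosen keys.

```python
def octets(address):
    """Split on '.' (like -t.) and treat each field as a number (like -n)."""
    return tuple(int(field) for field in address.split("."))


def sort_unique(addresses):
    # set() removes exact duplicates; the tuple key orders by first octet,
    # then second, third and fourth, much as -k 1,1 -k 2,2 -k 3,3 -k 4,4 does.
    return sorted(set(addresses), key=octets)


blocklist = [
    "12.30.72.162",
    "4.3.76.194",
    "12.149.177.21",
    "8.10.33.176",
    "4.3.76.194",  # deliberate duplicate
    "10.123.189.105",
]
result = sort_unique(blocklist)
print("\n".join(result))
```

Note that Python needs the same two pieces of information as sort does: where the fields are, and that every one of them is to be compared as a number.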
Date: Tue, 7 Nov 2006 13:16:52 -0800
From: Don Marti <dmarti@zgp.org>
To: Nick Moffitt <nick@zork.net>, conspire@linuxmafia.com
User-Agent: Mutt/1.5.9i
Subject: Re: [conspire] Puzzle: How do you sort IP address lists?

Alternate approach:

    tr '.' ' ' < address_list | xargs printf '%03d.%03d.%03d.%03d\n' \
      | sort -u | sed -re 's/\b0+//g' ) < address_list

Another way would be to multiply each address out into an int, sort, and re-format.
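The "multiply out into an int" idea might look something like this in Python. It is a sketch under the assumption that every line is a well-formed dotted quad; `ip_to_int` and `int_to_ip` are illustrative names, and no validation is attempted.

```python
def ip_to_int(address):
    """Pack a dotted quad into one 32-bit integer (a*2^24 + b*2^16 + c*2^8 + d)."""
    a, b, c, d = (int(octet) for octet in address.split("."))
    return (a << 24) | (b << 16) | (c << 8) | d


def int_to_ip(value):
    """Inverse of ip_to_int: peel the four bytes back off, high byte first."""
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))


addresses = ["12.30.72.162", "4.3.76.194", "12.149.177.21", "10.0.0.1"]

# Convert, sort the plain integers, convert back.
as_sorted = [int_to_ip(n) for n in sorted(ip_to_int(a) for a in addresses)]
print(as_sorted)
```

One small advantage of going through integers is that an all-zero octet, as in 10.0.0.1, comes back unchanged; as far as I can tell, stripping leading zeros with a pattern like `\b0+` would delete a padded "000" octet entirely.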
-- Don Marti http://zgp.org/~dmarti/ dmarti@zgp.org
Date: Tue, 7 Nov 2006 14:39:02 -0800
To: conspire@linuxmafia.com
X-Mas: Bah humbug.
User-Agent: Mutt/1.5.11+cvs20060403
From: Rick Moen <rick@linuxmafia.com>
Subject: Re: [conspire] Puzzle: How do you sort IP address lists?

Quoting Don Marti (dmarti@zgp.org):
> Alternate approach:
>
> tr '.' ' ' < address_list | xargs printf '%03d.%03d.%03d.%03d\n' \
>   | sort -u | sed -re 's/\b0+//g' ) < address_list
                                    ^
Works, after you lose the errant parenthesis. I like it; it's a little messy but logical.
Date: Tue, 7 Nov 2006 14:44:22 -0800
To: Rick Moen <rick@linuxmafia.com>
User-Agent: Mutt/1.5.9i
From: Tim Utschig <tim@tetro.net>
Cc: conspire@linuxmafia.com
Subject: Re: [conspire] Puzzle: How do you sort IP address lists?

On Tue, Nov 07, 2006 at 12:16:25PM -0800, Rick Moen wrote:
>
> (I'll also really admire someone's elegant solution in, e.g., Python,
> Perl, or Ruby -- but I'm just boggling at how non-obvious my 'sort'
> solution seems, and want to compare notes.)
>
I wouldn't call mine elegant, but the last time I tried to figure out how to do it using sort I gave up and used Perl...
    :r!grep ipsort ~/.bashrc
    alias ipsort='perl -MSocket -lne '\''$ips{inet_aton($_)}++; END { for (sort keys %ips) { while($ips{$_}--) { print inet_ntoa($_); } } }'\'
    alias ipsortu='perl -MSocket -lne '\''$ips{inet_aton($_)} = 1; END { print inet_ntoa($_) for sort keys %ips }'\'
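A rough Python counterpart of the two aliases, for comparison. It leans on the same trick: `socket.inet_aton` turns each address into four raw bytes, and byte strings sort in exactly the order wanted. The function names simply mirror the alias names; input is assumed to be one valid address per line, and anything else (including a blank line) will raise an error.

```python
import socket
from collections import Counter


def ipsort(lines):
    """Sort addresses, keeping duplicates (in the spirit of the 'ipsort' alias)."""
    counts = Counter(socket.inet_aton(line.strip()) for line in lines)
    result = []
    for packed in sorted(counts):  # 4-byte strings compare byte by byte
        result.extend([socket.inet_ntoa(packed)] * counts[packed])
    return result


def ipsortu(lines):
    """Sort addresses and drop duplicates (in the spirit of 'ipsortu')."""
    unique = {socket.inet_aton(line.strip()) for line in lines}
    return [socket.inet_ntoa(packed) for packed in sorted(unique)]


sample = ["12.30.72.162\n", "4.3.76.194\n", "12.149.177.21\n", "4.3.76.194\n"]
print(ipsort(sample))
print(ipsortu(sample))
```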
-- - Tim Utschig <tim@tetro.net>
Benjamin A. Okopnik [ben at linuxgazette.net]
Wed, 8 Nov 2006 00:00:11 -0500
On Tue, Nov 07, 2006 at 03:48:08PM -0800, Rick Moen wrote:
> Thread quoted below could be grist for the TAG mill, or the makings of a
> 2 cent tip, or something else.

Mmmm... 2-Cent Tip, I think. It's a common-enough problem that we should have a good answer for our readers.
Sorting IPs is a classic problem for budding Perl hackers to sharpen their brains on. One of the better solutions (highly efficient and relatively short) is a modified Schwartzian Transform:
    ben@Fenrir:~$ cat iplist
    12.154.4.213
    12.159.232.66
    12.205.7.190
    12.206.142.76
    12.214.50.126
    12.221.163.162
    4.3.76.194
    8.10.33.176
    10.123.189.105
    12.30.72.162
    12.149.177.21
    ben@Fenrir:~$ perl -we'print map substr($_,4),sort map pack('C4',split/\./).$_,<>' iplist
    4.3.76.194
    8.10.33.176
    10.123.189.105
    12.30.72.162
    12.149.177.21
    12.154.4.213
    12.159.232.66
    12.205.7.190
    12.206.142.76
    12.214.50.126
    12.221.163.162

For those interested in the details - the IP is parsed into the numerical fields by 'split'; the result is converted to a 4-byte char string which is prepended to the line. This is now sorted using the default lexical sort (much like the one in the shell) - which will now actually work due to the prepended string - and displayed after clipping the prefix. /Voila/! ... I only wish that I'd thought of it.
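The three steps just described (prepend a 4-byte key, sort lexically, clip the key) can be spelled out in Python as a sketch. `packed_sort` is an illustrative name, and the input is assumed to be plain ASCII dotted quads with no trailing newlines.

```python
def packed_sort(lines):
    # Decorate: four raw bytes, one per octet, glued to the front of the line.
    decorated = [
        bytes(int(octet) for octet in line.split(".")) + line.encode("ascii")
        for line in lines
    ]
    # Sort: an ordinary byte-wise comparison now does the right thing,
    # because the first four bytes carry the numeric value of each octet.
    decorated.sort()
    # Undecorate: clip the 4-byte prefix and decode back to text.
    return [entry[4:].decode("ascii") for entry in decorated]


iplist = ["12.30.72.162", "4.3.76.194", "12.149.177.21",
          "10.123.189.105", "8.10.33.176"]
print("\n".join(packed_sort(iplist)))
```

The attraction of this shape is that the sort itself needs no custom comparison at all; all of the cleverness lives in building the key.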
-- * Ben Okopnik * Editor-in-Chief, Linux Gazette * http://LinuxGazette.NET *
Paul Sephton [paul at inet.co.za]
Wed, 08 Nov 2006 15:18:40 +0200
That's a bit confusing. Why go perl & regex if there's a perfectly good

    cat iplist | sort -k1.1n,2.1n,3.1n,4.1n -t'.'

which gives you exactly what you want anyway - unless you absolutely have to use perl, of course
btw: anyone know of people who use perl as their default shell? <grin>
Paul
Thomas Adam [thomas.adam22 at gmail.com]
Wed, 8 Nov 2006 20:25:31 +0000
Hi --
On 08/11/06, Paul Sephton <paul@inet.co.za> wrote:
> That's a bit confusing. Why go perl & regex if there's a perfectly good
>
> cat iplist | sort -k1.1n,2.1n,3.1n,4.1n -t'.'
The above is inaccurate (don't top post). What you probably meant (and was already mentioned in the thread Rick bounced to TAG) was:
    [n6tadam@workstation ~]% sort -n -t . -k 1,1 -k 2,2 -k 3,3 -k 4,4 < ./test
    4.3.76.194
    8.10.33.176
    10.123.189.105
    12.30.72.162
    12.149.177.21
    12.154.4.213
    12.159.232.66
    12.205.7.190
    12.206.142.76
    12.214.50.126
    12.221.163.162

The use of cat in the above example (don't top post) was also OTT. I suppose you win a UUoC award (don't top post).
I'd admit the perl solution (don't top post) is way OTT, but YMMV, TMTOWTDI, etc.
-- Thomas Adam
Benjamin A. Okopnik [ben at linuxgazette.net]
Wed, 8 Nov 2006 15:31:32 -0500
[ Hi, Paul - please don't top-post; this severely decreases people's ability to read things in order. Also, please clip content that you're not replying to; see "Asking Questions of The Answer Gang" at http://linuxgazette.net/tag/ask-the-gang.html for details. I've restored the correct sequence, re-added correct attribution, and clipped extraneous material. ]
On Wed, Nov 08, 2006 at 03:18:40PM +0200, Paul Sephton wrote:
> On Wed, 2006-11-08 at 07:00, Benjamin A. Okopnik wrote:
> > On Tue, Nov 07, 2006 at 03:48:08PM -0800, Rick Moen wrote:
> >
> > > Thread quoted below could be grist for the TAG mill, or the makings of a
> > > 2 cent tip, or something else.
> >
> > Mmmm... 2-Cent Tip, I think. It's a common-enough problem that we should
> > have a good answer for our readers.
> >
> > Sorting IPs is a classic problem for budding Perl hackers to sharpen
> > their brains on. One of the better solutions (highly efficient and
> > relatively short) is a modified Schwartzian Transform:
>
> [ snip ]
>
> > ben@Fenrir:~$ perl -we'print map substr($_,4),sort map pack('C4',split/\./).$_,<>'
>
> That's a bit confusing.
To me, it's perfectly readable and illustrates a powerful sorting algorithm that's worth knowing.
> Why go perl & regex if there's a perfectly good
> cat iplist | sort -k1.1n,2.1n,3.1n,4.1n -t'.'
> which gives you exactly what you want anyway- unless you absolutely have to
> use perl, of course
In that case, Paul, why go 'cat' when there's a perfectly good filespec option to 'sort'?
    sort -k1.1n,2.1n,3.1n,4.1n -t'.' iplist

Worse than that, your solution apparently fails:

    ben@Fenrir:/tmp$ cat iplist | sort -k1.1n,2.1n,3.1n,4.1n -t'.'
    sort: stray character in field spec: invalid field specification `1.1n,2.1n,3.1n,4.1n'

I would imagine that there's some simple solution to the above, but I'll let you troubleshoot it.
The answer in general, however, is the Perl motto: TMTOWTDI (There's more than one way to do it.) Your way is not better than mine or vice versa; if it works, and it's what you prefer, Linux - and Unix in general - provides you with options in how you choose to do it. We don't need to compete for whose way is better (although healthy comparisons are useful; I'm always willing to steal^Wadapt someone else's method if it's a significant improvement on what I'm doing.)
> btw: anyone know of people who use perl as their default shell? <grin>
However low that number may be, there are fewer using 'sort' as one.
http://sourceforge.net/projects/psh/
-- * Ben Okopnik * Editor-in-Chief, Linux Gazette * http://LinuxGazette.NET *
Faber J. Fedor [faber at linuxnj.com]
Wed, 8 Nov 2006 15:43:27 -0500
On 08/11/06 15:31 -0500, Benjamin A. Okopnik wrote:
> On Wed, Nov 08, 2006 at 03:18:40PM +0200, Paul Sephton wrote:
> > btw: anyone know of people who use perl as their default shell? <grin>
>
> However low that number may be, there are fewer using 'sort' as one.
>
> http://sourceforge.net/projects/psh/
And all this time I had you pegged as a Futurama fan, Ben.
http://zoidberg.student.utwente.nl/
-- Regards, Faber Fedor President Linux New Jersey, Inc. 908-320-0357 800-706-0701
Paul Sephton [paul at inet.co.za]
Thu, 09 Nov 2006 00:08:24 +0200
Hi;
I am sorry if I offended anyone in my previous post; I was honestly not trying to be contrary. I personally use perl, and quite like the language. What I was trying to show, is a command which works perfectly well for GNU sort v5.0, but as you point out fails for GNU sort v5.2.1 (as evidenced by my testing on two machines).
The documentation for 'sort' shows the following:
    -k, --key=POS1[,POS2]
        start a key at POS1, end it at POS 2 (origin 1)

and

    -t, --field-separator=SEP
        use SEP instead of non-blank to blank transition

also, further down,

    POS is F[.C][OPTS], where F is the field number and C the character position in the field. OPTS is one or more single-letter ordering options, which override global ordering options for that key. If no key is given, use the entire line as the key.

So my command line is perfectly valid according to the documentation. As tested earlier on my older machine, it also provides a very valid result for GNU sort v5.0. The command line -k1.1 -k2.2 -k3.3 etc. is invalid, as the [.C] is an offset into the field for the first character from which to commence the ordering. By going -k3.3, you drop the first two characters of the field.
I can only conclude that there is a programming error (read BUG) in the later sort v5.2.1 as that version does not function according to documentation as embedded (sort --help) or the man pages.
Apologies again for not testing on a later machine prior to posting. On the other hand, I find myself pulling my hair out sometimes at the way some utilities (for example nslookup) which I have used for many years are deprecated seemingly at someone's whim, or others (such as ps) have their arguments changed (again at someone's whim) breaking all sorts of scripts. Upgrading production machines is a nightmare, and I cannot bring myself to believe that there is a valid reason behind this practice. Standards seem to be things that apply only to those without a sense for adventure, such as SCO Unix.
Perhaps there is no "standard" answer to the query as to how lists of IPs may be sorted. Clearly, any production system which used my method would have broken the moment the binutils and textutils were updated. Who knows; Perl keeps morphing as well - I can hardly recognise the original language anymore - although backward compatibility seems to have been retained, however astounding that might seem.
Regarding my previous comments, which were intended to be humorous (as indicated by the smileys): I do realise that those comments were severely lacking in substance, and likely to be misconstrued. Clearly, no-one would use "sort" as a shell, although it is perfectly valid to use Perl, Tcl or even Python as shells. A previous acquaintance of mine actually lived in Tcl, eschewing Bash as a creation from Hell.
Regards, Paul
Rick Moen [rick at linuxmafia.com]
Wed, 8 Nov 2006 15:03:39 -0800
Quoting Paul Sephton (paul@inet.co.za):
> I am sorry if I offended anyone in my previous post; I was honestly > not trying to be contrary.
If you don't even try to be contrary, how on earth will you fit in? ;->
Seriously, nobody's offended, and please accept our cheery welcome. (We just tend to have to frequently remind people to not accidentally drop the mailing list, on follow-ups.)
> I personally use perl, and quite like the language. What I was trying
> to show, is a command which works perfectly well for GNU sort v5.0,
> but as you point out fails for GNU sort v5.2.1 (as evidenced by my
> testing on two machines).
>
> The documentation for 'sort' shows the following:
>
> -k, --key=POS1[,POS2]
> start a key at POS1, end it at POS 2 (origin 1)
Near as I can tell, the bracket syntax ("zero or more of these") is a little misleading, as it appears that you need no more than a pair of POSn numbers, and then need to use additional "-k" / "--key=" options for the second and following keys. So, yes, the sort(1) v.5.2.1 manpage is buggy, but only to the extent of not making that clear.
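To make that reading concrete, here is a small Python sketch that checks a key argument against the documented grammar taken literally: POS is F[.C][OPTS], and a key is POS1 optionally followed by a comma and POS2, nothing more. The regular expression is a simplification written for illustration; it is not GNU sort's actual parser and makes no claim about why v5.0 behaved differently.

```python
import re

# POS is F[.C][OPTS]: a field number, an optional ".character offset",
# then zero or more single-letter ordering options.
POS = r"\d+(?:\.\d+)?[A-Za-z]*"

# A key is POS1[,POS2]: one POS, optionally followed by exactly one more.
KEY = re.compile(rf"{POS}(?:,{POS})?")


def is_valid_key(spec):
    """True if spec fits the literal POS1[,POS2] grammar (simplified)."""
    return KEY.fullmatch(spec) is not None


for spec in ("1,1", "1.1n,2.1n", "1.1n,2.1n,3.1n,4.1n"):
    print(spec, "->", "ok" if is_valid_key(spec) else "rejected")
```

Read this way, a single key takes at most two positions, so four keys need four separate "-k" options.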
Also, your syntax omitted "-u" (uniq) and the numeric-sort option. In short, you probably meant something more like my GNU sort example:
    sort -u -n -t. -k 1,1 -k 2,2 -k 3,3 -k 4,4 iplist

...but somehow it came out as this (rearranged per Ben's suggestion to eliminate "cat"), which doesn't quite work:

    sort -k1.1n,2.1n,3.1n,4.1n -t'.' iplist
> I find myself pulling my hair out sometimes at the way some utilities
> (for example nslookup) which I have used for many years are deprecated
> seemingly at someone's whim, or others (such as ps) have their
> arguments changed (again at someone's whim) breaking all sorts of
> scripts.
In the case of nslookup(1), its eclipse by dig(1) turns out to have ample justification: nslookup relies on some BIND8-specific implementation features (though that may have been fixed in recent cleanup), carries out some unintended network lookups, conceals critical data from its output results, tends to issue non-helpful error messages, and in general is just buggy and ready for the scrap heap.
The change to "ps" options owes, if I remember correctly, to some infamous BSD / SysV trainwreck, such that it provides for both syntaxes while making nobody particularly happy.
--
Cheers,              Higgeldy Piggeldy          "Phooey on Freud and his
Rick Moen            Hamlet of Elsinore         Psychoanalysis --
rick@linuxmafia.com  Ruffled the critics by     Oedipus, Schmoedipus,
                     Dropping this bomb:        I just loved Mom."
Paul Sephton [paul at inet.co.za]
Thu, 09 Nov 2006 08:33:11 +0200
On Wed, 2006-11-08 at 15:03 -0800, Rick Moen wrote:
> Quoting Paul Sephton (paul@inet.co.za):
>
> > I am sorry if I offended anyone in my previous post; I was honestly
> > not trying to be contrary.
>
> If you don't even try to be contrary, how on earth will you fit in? ;->
>
> Seriously, nobody's offended, and please accept our cheery welcome.
> (We just tend to have to frequently remind people to not accidentally
> drop the mailing list, on follow-ups.)
>
> > I personally use perl, and quite like the language. What I was trying
> > to show, is a command which works perfectly well for GNU sort v5.0,
> > but as you point out fails for GNU sort v5.2.1 (as evidenced by my
> > testing on two machines).
> >
> > The documentation for 'sort' shows the following:
> >
> > -k, --key=POS1[,POS2]
> > start a key at POS1, end it at POS 2 (origin 1)
>
> Near as I can tell, the bracket syntax ("zero or more of these") is a
> little misleading, as it appears that you need no more than a pair
> of POSn numbers, and then need to use additional "-k" / "--key=" options
> for the second and following keys. So, yes, the sort(1) v.5.2.1 manpage
> is buggy, but only to the extent of not making that clear.
>
The man page is unaltered between v5.0 and v5.2.1 of sort. The difference is in operation.
Interpreting the syntax, the [] brackets simply means "optional" ( refer BNF ). Therefore, --key=POS1[,POS2] simply means "one or more POS separated by comma". Looking at the definition for POS, we see POS=F[.C][OPTS] where F is the field number. Optionally, the field may be specified as 'F', or as 'F.C' or as 'F.COPTS' where F is a field number, C is an offset into that specific field from whence the sort starts, and OPTS are field specific options (in my case 1.1n means field 1, offset 1, numeric sort for field 1).
> Also, your syntax omitted "-u" (uniq) and the numeric-sort option. In
> short, you probably meant something more like my GNU sort example:
>
> sort -u -n -t. -k 1,1 -k 2,2 -k 3,3 -k 4,4 iplist
>
> ...but somehow it came out as this (rearranged per Ben's suggestion to
> eliminate "cat"), which doesn't quite work:
>
> sort -k1.1n,2.1n,3.1n,4.1n -t'.' iplist
>
Um, no. I did not mean that.
I am sorry to say that I did not refer to your example before replying to the thread. As I said, the command line which I provided does indeed work with GNU sort v5.0. The syntax breaks with GNU sort v5.2.1, but I believe that is the fault of incorrect implementation. Incorrect interpretation of the documentation for sort on the part of the developer, not of mine.
A very highly recommended book, Unix Power Tools http://www.oreilly.com/catalog/upt3/ describes the use of sort in great detail. GNU sort is well specified and not a candidate playing ground for someone's innovation.
> > I find myself pulling my hair out sometimes at the way some utilities
> > (for example nslookup) which I have used for many years are deprecated
> > seemingly at someone's whim, or others (such as ps) have their
> > arguments changed (again at someone's whim) breaking all sorts of
> > scripts.
>
> In the case of nslookup(1), its eclipse by dig(1) turns out to have ample
> justification: nslookup relies on some BIND8-specific implementation
> features (though that may have been fixed in recent cleanup), carries
> out some unintended network lookups, conceals critical data from its
> output results, tends to issue non-helpful error messages, and in
> general is just buggy and ready for the scrap heap.
>
> The change to "ps" options owes, if I remember correctly to some
> infamous BSD / SysV trainwreck, such that it provides for both syntaxes
> while making nobody particularly happy.
Indeed, nslookup does rely on BIND8 features. Again, a change to those features led to the demise of nslookup, which had existed (in its pristine form) for some 15 years prior to its demise.
Understand me well; I am not against change. As a CTO, Architect and skilled programmer, I am in fact concerned with introducing and managing change every day of my working life. What frustrates me is that some people don't seem to have a handle on when to change something and when to leave well alone. Adoption of Linux (the OS) is very much influenced by the stability and associated adoption of core binary tools.
It is not good enough to say that the "open source choice will end up in the right thing being adopted" when a core tool is in effect superseded and replaced, or deprecated through introducing a new tool - possibly with the same name as the old.
Changing the binary interface to the OS as presented to the user through the set of core GNU tools in a way which is not backward-compatible should be taboo. Whereas change is not always bad, it is not always good either.
Regards, Paul
Jason Creighton [jcreigh at gmail.com]
Wed, 8 Nov 2006 23:44:16 -0700
On Wed, Nov 08, 2006 at 12:00:11AM -0500, Benjamin A. Okopnik wrote:
> On Tue, Nov 07, 2006 at 03:48:08PM -0800, Rick Moen wrote:
> > Thread quoted below could be grist for the TAG mill, or the makings of a
> > 2 cent tip, or something else.
>
> Mmmm... 2-Cent Tip, I think. It's a common-enough problem that we should
> have a good answer for our readers.
>
> Sorting IPs is a classic problem for budding Perl hackers to sharpen
> their brains on. One of the better solutions (highly efficient and
> relatively short) is a modified Schwartzian Transform:
>
> ``
> ben@Fenrir:~$ cat iplist
> 12.154.4.213
> 12.159.232.66
> 12.205.7.190
> 12.206.142.76
> 12.214.50.126
> 12.221.163.162
> 4.3.76.194
> 8.10.33.176
> 10.123.189.105
> 12.30.72.162
> 12.149.177.21
> ben@Fenrir:~$ perl -we'print map substr($_,4),sort map pack('C4',split/\./).$_,<>' iplist
                                                              ^^^^
Hmm...I don't really understand how that works. By the time Perl sees that, it's not quoted anymore:

    ~/tmp$ ruby -e 'p ARGV' perl -we'print map substr($_,4),sort map pack('C4',split/\./).$_,<>' iplist
    ["perl", "-weprint map substr($_,4),sort map pack(C4,split/\\./).$_,<>", "iplist"]

Which Perl seems to happily accept as a bareword. I had thought that -w and/or "use strict" caused Perl to say "YOU FOOL! NO BAREWORDS ALLOWED!", but just playing around with a test script, I can't get a warning to fire with either -w or "use strict". (Perl 5.8.8, Debian etch). But it's been a long time since I've actually tried to code anything in Perl, so I'm probably mistaken.

> 4.3.76.194
> 8.10.33.176
> 10.123.189.105
> 12.30.72.162
> 12.149.177.21
> 12.154.4.213
> 12.159.232.66
> 12.205.7.190
> 12.206.142.76
> 12.214.50.126
> 12.221.163.162
> ''
"Me too!" Ruby implementation: (same input file):
    ~/tmp$ ruby -e "puts readlines().sort_by { |ip| ip.split('.').map { |d| d.to_i } }" iplist
    4.3.76.194
    8.10.33.176
    10.123.189.105
    12.30.72.162
    12.149.177.21
    12.154.4.213
    12.159.232.66
    12.205.7.190
    12.206.142.76
    12.214.50.126
    12.221.163.162
    ~/tmp$

sort_by does a Schwartzian transform for you, so just map the ip to an array of integers ("4.3.76.194" -> [4, 3, 76, 194]) which will then sort correctly.
Jason Creighton
Benjamin A. Okopnik [ben at linuxgazette.net]
Thu, 9 Nov 2006 07:49:07 -0500
On Wed, Nov 08, 2006 at 11:44:16PM -0700, Jason Creighton wrote:
> On Wed, Nov 08, 2006 at 12:00:11AM -0500, Benjamin A. Okopnik wrote:
> >
> > ``
> > ben@Fenrir:~$ cat iplist
> > 12.154.4.213
> > 12.159.232.66
> > 12.205.7.190
> > 12.206.142.76
> > 12.214.50.126
> > 12.221.163.162
> > 4.3.76.194
> > 8.10.33.176
> > 10.123.189.105
> > 12.30.72.162
> > 12.149.177.21
> > ben@Fenrir:~$ perl -we'print map substr($_,4),sort map pack('C4',split/\./).$_,<>' iplist
>                                                               ^^^^
> Hmm...I don't really understand how that works. By the time Perl sees
> that, it's not quoted anymore:
Oh, right. It seems that 'pack' will accept a template argument without it being quoted. I didn't know that. Doing this with, say, 'N*' would create a bit of a problem, though.
The reason it works, of course - despite my senior moment at the keyboard - is that the shell evaluates 'C4' as a string and returns it literally. So, on the one hand, it is double-plus-ungood that I managed to type the wrong quotes - but on the other hand, I've just learned a cute trick that I could use (at least under some shells) to do more low, nasty, mean things with Perl golf.
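The quote-stripping can also be reproduced without a shell at all. Python's standard `shlex` module splits a string by POSIX shell quoting rules, so it makes a handy way to see what argument list a command line turns into. This is a sketch of the quoting behaviour only; it does not run Perl.

```python
import shlex

# The command exactly as typed, inner single quotes and all.
command = r"""perl -we'print map substr($_,4),sort map pack('C4',split/\./).$_,<>' iplist"""

# shlex.split applies POSIX-style quote removal: the first quoted chunk ends
# just before C4, C4 itself is unquoted, and a second quoted chunk follows.
# The three pieces are concatenated into a single argument.
argv = shlex.split(command)

for argument in argv:
    print(argument)
```

The second argument comes out with a bare C4 inside pack(), which is what Perl actually gets to see.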
> ``
> ~/tmp$ ruby -e 'p ARGV' perl -we'print map substr($_,4),sort map pack('C4',split/\./).$_,<>' iplist

[glower] Young man, if you're going to act smarter than me regularly, we're going to Have A Talk. A simple capo does not do that to Il Padrino, capish?
Nicely done.
In Perl, of course, that would be
    perl -we'print "@ARGV"' !!

or, better yet - with clearer formatting -

    ben@Fenrir:/tmp$ perl -wle'print for @ARGV' !!

The latter gives you each argument on a line by itself.
> ``
> ~/tmp$ ruby -e "puts readlines().sort_by { |ip| ip.split('.').map { |d| d.to_i } }" iplist
> 4.3.76.194
> 8.10.33.176
> 10.123.189.105
> 12.30.72.162
> 12.149.177.21
> 12.154.4.213
> 12.159.232.66
> 12.205.7.190
> 12.206.142.76
> 12.214.50.126
> 12.221.163.162
> ~/tmp$
> ''
>
> sort_by does a Schwartzian transform for you, so just map the ip to an
> array of integers ("4.3.76.194" -> [4, 3, 76, 194]) which will then sort
> correctly.
Sweet! It's nice that somebody has implemented it as a fixed routine. Does Ruby do GRTs (Guttman-Rosler Transforms) as well?
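For comparison, a hedged sketch of the same decorate-sort-undecorate idea in Python. The sample list and the helper name `octets` are illustrative only; the point is that the explicit three-step version and a key function give the same order.

```python
ips = ["12.154.4.213", "4.3.76.194", "10.123.189.105", "8.10.33.176", "12.30.72.162"]

def octets(ip):
    """Turn '4.3.76.194' into [4, 3, 76, 194] so it compares numerically."""
    return [int(part) for part in ip.split(".")]

# Schwartzian transform spelled out: decorate, sort, undecorate.
decorated = [(octets(ip), ip) for ip in ips]
decorated.sort()
by_hand = [ip for _, ip in decorated]

# sorted() with a key function computes each key once and sorts on it,
# which is the same idea packaged as a built-in.
by_key = sorted(ips, key=octets)

print("\n".join(by_key))
```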
Incidentally, I've been occasionally glancing at "Why's (Poignant) Guide to Ruby" (http://poignantguide.net/). That is one seriously bent individual. I like him. And it's a fairly nice language from what I can see so far; if I'm going to add another scripting language to my kit, that's a good candidate. Python just leaves me cold and slightly queasy - unsurprising, perhaps, considering its poikilothermic and venomous nature... [1]
[1] Why, yes, this is intended to poke Mike Orr. Why do you ask?
-- * Ben Okopnik * Editor-in-Chief, Linux Gazette * http://LinuxGazette.NET *
Neil Youngman [ny at youngman.org.uk]
Thu, 9 Nov 2006 13:19:46 +0000
On or around Thursday 09 November 2006 12:49, Benjamin A. Okopnik reorganised a bunch of electrons to form the message: <SNIP>
> Python just leaves me cold and slightly > queasy - unsurprising, perhaps, considering its poikilothermic and > venomous nature... [1]
I thought pythons were constrictors and constrictors generally ain't venomous.
Neil
Benjamin A. Okopnik [ben at linuxgazette.net]
Thu, 9 Nov 2006 08:34:39 -0500
On Thu, Nov 09, 2006 at 01:19:46PM +0000, Neil Youngman wrote:
> On or around Thursday 09 November 2006 12:49, Benjamin A. Okopnik reorganised > a bunch of electrons to form the message: > <SNIP> > > > Python just leaves me cold and slightly > > queasy - unsurprising, perhaps, considering its poikilothermic and > > venomous nature... [1] > > I thought pythons were constrictors and constrictors generally ain't venomous.
Yes, but we're talking about the language. ;)
-- * Ben Okopnik * Editor-in-Chief, Linux Gazette * http://LinuxGazette.NET *
Thomas Adam [thomas.adam22 at gmail.com]
Thu, 9 Nov 2006 13:42:13 +0000
On Thu, 9 Nov 2006 07:49:07 -0500 "Benjamin A. Okopnik" <ben@linuxgazette.net> wrote:
> Sweet! It's nice that somebody has implemented it as a fixed routine. > Does Ruby do GRTs (Gutman-Rossler Transforms) as well?
Not by default, no.
> Incidentally, I've been occasionally glancing at "Why's (Poignant) > Guide to Ruby" (http://poignantguide.net/). That is one seriously > bent individual. I like him. And it's a fairly nice language > from what I can see so far; if I'm going to add another scripting > language to my kit, that's a good candidate. Python just leaves me > cold and slightly queasy - unsurprising, perhaps, considering its > poikilothermic and venomous nature... [1]
It is a somewhat seminal piece, although it's not in my style of writing such that I can read it for long without being frustrated. But _why is cool -- he wrote the YAML bindings for Ruby.
-- Thomas Adam
Jason Creighton [jcreigh at gmail.com]
Thu, 9 Nov 2006 23:59:54 -0700
On Thu, Nov 09, 2006 at 07:49:07AM -0500, Benjamin A. Okopnik wrote:
> On Wed, Nov 08, 2006 at 11:44:16PM -0700, Jason Creighton wrote:
> > ``
> > ~/tmp$ ruby -e 'p ARGV' perl -we'print map substr($_,4),sort map pack('C4',split/\./).$_,<>' iplist
>
> [glower] Young man, if you're going to act smarter than me regularly,
> we're going to Have A Talk. A simple /capo/ does not does not do that to
> Il Padrino, capish?
>
> Nicely done.
>
> In Perl, of course, that would be
>
> ``
> perl -we'print "@ARGV"' !!
> ''
>
> or, better yet - with clearer formatting -
>
> ``
> ben@Fenrir:/tmp$ perl -wle'print for @ARGV' !!
> ''
>
> The latter gives you each argument on a line by itself.
One thing I forgot to mention is that I often use that trick to figure out what the heck the shell is doing. For example, I have this in my .bashrc:
alias putargs='ruby -e "p ARGV" --'
Or the equivalent Perl, of course.
Anyway, with that in place, you can play around with how the shell interprets command lines:
~/tmp$ ls
another_file filename with spaces some_file
~/tmp$ putargs *
["another_file", "filename with spaces", "some_file"]
~/tmp$ var='hello *'
~/tmp$ putargs $var
["hello", "another_file", "filename with spaces", "some_file"]
~/tmp$ putargs "$var"
["hello *"]
~/tmp$ putargs '$var'
["$var"]
~/tmp$ putargs `/bin/ls`
["another_file", "filename", "with", "spaces", "some_file"]
~/tmp$ putargs "`/bin/ls`"
["another_file\nfilename with spaces\nsome_file"]
~/tmp$ putargs '`/bin/ls`'
["`/bin/ls`"]
> > ``
> > ~/tmp$ ruby -e "puts readlines().sort_by { |ip| ip.split('.').map { |d| d.to_i } }" iplist
> > 4.3.76.194
> > 8.10.33.176
> > 10.123.189.105
> > 12.30.72.162
> > 12.149.177.21
> > 12.154.4.213
> > 12.159.232.66
> > 12.205.7.190
> > 12.206.142.76
> > 12.214.50.126
> > 12.221.163.162
> > ~/tmp$
> > ''
> >
> > sort_by does a Schwartzian transform for you, so just map the ip to an
> > array of integers ("4.3.76.194" -> [4, 3, 76, 194]) which will then sort
> > correctly.
>
> Sweet! It's nice that somebody has implemented it as a fixed routine.
> Does Ruby do GRTs (Gutman-Rossler Transforms) as well?
What's the Guttman-Rosler transform? Google is unusually unenlightening.
> Incidentally, I've been occasionally glancing at "Why's (Poignant) Guide > to Ruby" (http://poignantguide.net/). That is one seriously bent > individual. I like him. And it's a fairly nice language from what I > can see so far; if I'm going to add another scripting language to my > kit, that's a good candidate. Python just leaves me cold and slightly > queasy - unsurprising, perhaps, considering its poikilothermic and > venomous nature... [1]
As Thomas mentioned, _why is the author of Syck, a C YAML parser with bindings in Ruby and a couple other languages. And Hpricot, a nice HTML parser for Ruby. And RedCloth, an implementation of the Textile markup language. And a handful of other libraries. And, of course, the aforementioned "(Poignant) Guide". If life were Slashdot, _why would be +5 Productive.
Jason Creighton
Paul Sephton [paul at inet.co.za]
Fri, 10 Nov 2006 10:03:25 +0200
On Thu, 2006-11-09 at 23:59 -0700, Jason Creighton wrote:
> On Thu, Nov 09, 2006 at 07:49:07AM -0500, Benjamin A. Okopnik wrote:
> > On Wed, Nov 08, 2006 at 11:44:16PM -0700, Jason Creighton wrote:
> > > ``
> > > ~/tmp$ ruby -e 'p ARGV' perl -we'print map substr($_,4),sort map pack('C4',split/\./).$_,<>' iplist
> >
> > [glower] Young man, if you're going to act smarter than me regularly,
> > we're going to Have A Talk. A simple /capo/ does not does not do that to
> > Il Padrino, capish?
> >
> > Nicely done.
> >
> > In Perl, of course, that would be
> >
> > ``
> > perl -we'print "@ARGV"' !!
> > ''
> >
> > or, better yet - with clearer formatting -
> >
> > ``
> > ben@Fenrir:/tmp$ perl -wle'print for @ARGV' !!
> > ''
> >
> > The latter gives you each argument on a line by itself.
>
> One thing I forgot to mention is that I often use that trick to figure
> out what the heck the shell is doing. For example, I have this in my
> .bashrc:
>
> ``
> alias putargs='ruby -e "p ARGV" --'
> ''
>
> Or the equivalent Perl, of course.
>
> Anyway, with that in place, you can play around with how the shell
> interprets command lines:
>
> ``
> ~/tmp$ ls
> another_file filename with spaces some_file
> ~/tmp$ putargs *
> ["another_file", "filename with spaces", "some_file"]
> ~/tmp$ var='hello *'
> ~/tmp$ putargs $var
> ["hello", "another_file", "filename with spaces", "some_file"]
> ~/tmp$ putargs "$var"
> ["hello *"]
> ~/tmp$ putargs '$var'
> ["$var"]
> ~/tmp$ putargs `/bin/ls`
> ["another_file", "filename", "with", "spaces", "some_file"]
> ~/tmp$ putargs "`/bin/ls`"
> ["another_file\nfilename with spaces\nsome_file"]
> ~/tmp$ putargs '`/bin/ls`'
> ["`/bin/ls`"]
> ''
>
That's a really cool trick for getting a list from args. I think I could use that.
> > > ``
> > > ~/tmp$ ruby -e "puts readlines().sort_by { |ip| ip.split('.').map { |d| d.to_i } }" iplist
> > > 4.3.76.194
> > > 8.10.33.176
> > > 10.123.189.105
> > > 12.30.72.162
> > > 12.149.177.21
> > > 12.154.4.213
> > > 12.159.232.66
> > > 12.205.7.190
> > > 12.206.142.76
> > > 12.214.50.126
> > > 12.221.163.162
> > > ~/tmp$
> > > ''
> > >
Like a bulldog that can't let go of a blanket, I just had to see if there was another more perverse approach to this. I came up with the idea of turning the IP address into a number, sorting and then displaying the result.
Python has some built-in methods [socket.inet_aton(ip_string) and socket.inet_ntoa(packed_ip)] that could do this, sort the list of numbers and unpack; perhaps someone could do that as an exercise.
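One way that exercise might look, as a sketch rather than a definitive answer. It assumes one dotted-quad address per input line; the function name `sort_ips` and the sample data are made up for illustration. `socket.inet_aton` returns the address as four packed bytes, and Python compares bytes objects byte by byte, so the packed form can serve directly as the sort key without converting back through `inet_ntoa`.

```python
import socket

def sort_ips(lines):
    """Return the unique addresses from lines, in numeric order."""
    # A set removes duplicates, much like sort -u or uniq would.
    unique = {line.strip() for line in lines if line.strip()}
    # The 4-byte packed form sorts bytewise, which matches numeric order.
    return sorted(unique, key=socket.inet_aton)

sample = ["12.154.4.213\n", "4.3.76.194\n", "10.123.189.105\n",
          "4.3.76.194\n", "12.30.72.162\n"]

for ip in sort_ips(sample):
    print(ip)
```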
However, where Python would indubitably be more readable, just using bash, and standard tools, we could do:
paul@wart:~$ ((IFS=`echo -e"\n."`; \
while read a b c d; do echo $[((a*256 +b)*256+c)*256+d]; done) | \
sort -n -u | \
while read ip; do \
echo $[ip/0x1000000].$[ip%0x1000000/0x10000].\
$[ip%0x10000/0x100].$[ip%0x100]; done) < iplist
4.3.76.194
8.10.33.176
10.123.189.105
12.30.72.162
12.149.177.21
12.154.4.213
12.159.232.66
12.205.7.190
12.206.142.76
12.214.50.126
12.221.163.162
paul@wart:~$
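The arithmetic in that pipeline can be sketched in Python as well, to make the two conversions easier to follow. The function names `ip_to_int` and `int_to_ip` and the sample list are illustrative; the formulas mirror the shell expressions above.

```python
def ip_to_int(ip):
    """'a.b.c.d' -> ((a*256 + b)*256 + c)*256 + d, as in the first loop."""
    a, b, c, d = (int(part) for part in ip.split("."))
    return ((a * 256 + b) * 256 + c) * 256 + d

def int_to_ip(n):
    """Peel the four octets back off, most significant first."""
    return ".".join(str(n // 256 ** shift % 256) for shift in (3, 2, 1, 0))

ips = ["12.154.4.213", "4.3.76.194", "10.123.189.105", "4.3.76.194"]

# set() plus sorted() plays the role of sort -n -u.
for number in sorted(set(ip_to_int(ip) for ip in ips)):
    print(int_to_ip(number))
```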
> > > sort_by does a Schwartzian transform for you, so just map the ip to an > > > array of integers ("4.3.76.194" -> [4, 3, 76, 194]) which will then sort > > > correctly. > > > > Sweet! It's nice that somebody has implemented it as a fixed routine. > > Does Ruby do GRTs (Gutman-Rossler Transforms) as well? > > What's the Guttman-Rossler transform? Google is unusually > unenlightening. >
I would also like to know, please?
Paul Sephton
Benjamin A. Okopnik [ben at linuxgazette.net]
Fri, 10 Nov 2006 08:09:21 -0500
On Thu, Nov 09, 2006 at 11:59:54PM -0700, Jason Creighton wrote:
> On Thu, Nov 09, 2006 at 07:49:07AM -0500, Benjamin A. Okopnik wrote:
> >
> > ``
> > ben@Fenrir:/tmp$ perl -wle'print for @ARGV' !!
> > ''
> >
> > The latter gives you each argument on a line by itself.
>
> One thing I forgot to mention is that I often use that trick to figure
> out what the heck the shell is doing. For example, I have this in my
> .bashrc:
>
> ``
> alias putargs='ruby -e "p ARGV" --'
> ''
>
> Or the equivalent Perl, of course.
[laugh] Or you could use Bash. The usage and the output would vary slightly, of course:
ben@Fenrir:/tmp/foo$ touch another_file "filename with spaces" some_file
ben@Fenrir:/tmp/foo$ function putargs() { IFS="|"; echo "$*"; }
ben@Fenrir:/tmp/foo$ putargs *
another_file|filename with spaces|some_file
etc.
> What's the Guttman-Rossler transform? Google is unusually > unenlightening.
Excellent paper by Uri Guttman and Larry Rosler, "A Fresh Look at Efficient Perl Sorting" that covers the ST, the GRT, the Orcish Maneuver and more:
http://www.sysarch.com/Perl/sort_paper.html
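As a rough illustration of the GRT idea, and not a rendering of the paper's own code: instead of sorting with a key function or a comparison routine, the sort key is packed into a fixed-width binary prefix glued onto each record, the resulting strings are sorted with the default comparison, and the prefix is stripped afterwards. This is the same shape as the pack('C4') one-liner earlier in the thread. The function name `grt_sort` is hypothetical.

```python
import struct

def grt_sort(ips):
    """Sort dotted-quad strings by packing the key in front of each one."""
    # Prefix each address with its four octets as raw bytes.
    packed = [
        struct.pack("4B", *(int(part) for part in ip.split("."))) + ip.encode("ascii")
        for ip in ips
    ]
    # Plain bytes comparison; no key function or comparator involved.
    packed.sort()
    # Drop the 4-byte prefix to recover the original text.
    return [item[4:].decode("ascii") for item in packed]

print(grt_sort(["12.154.4.213", "4.3.76.194", "10.123.189.105", "8.10.33.176"]))
```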
> > Incidentally, I've been occasionally glancing at "Why's (Poignant) Guide > > to Ruby" (http://poignantguide.net/). That is one seriously bent > > individual. I like him. And it's a fairly nice language from what I > > can see so far; if I'm going to add another scripting language to my > > kit, that's a good candidate. Python just leaves me cold and slightly > > queasy - unsurprising, perhaps, considering its poikilothermic and > > venomous nature... [1] > > As Thomas mentioned, _why is the author of Syck, a C YAML parser with > bindings in Ruby and a couple other languages. And Hpricot, a nice HTML > parser for Ruby. And RedCloth, an implementation of the Textile markdown > language. And a handful of other libraries. And, of course, the > aforementioned "(Poignant) Guide". If life were Slashdot, _why would be > +5 Productive.
Wow. Another Fabrice Bellard... if such a thing is possible. All kudos.
-- * Ben Okopnik * Editor-in-Chief, Linux Gazette * http://LinuxGazette.NET *
Benjamin A. Okopnik [ben at linuxgazette.net]
Fri, 10 Nov 2006 09:51:47 -0500
On Fri, Nov 10, 2006 at 10:03:25AM +0200, Paul Sephton wrote:
> > Like a bulldog that can't let go of a blanket, I just had to see if > there was another more perverse approach to this. I came up with the > idea of turning the IP address into a number, sorting and then > displaying the result. > > Python has some built-in methods [socket.inet_aton(ip_string) and > socket.inet_ntoa(packed_ip)] that could do this, sort the list of > numbers and unpack; perhaps someone could do that as an exercise. > > However, where Python would indubitably be more readable,
...that being my reason for demonstrating the algorithm in Perl...
> just using
> bash, and standard tools, we could do:
>
> ``
> paul@wart:~$ ((IFS=`echo -e"\n."`; \
> while read a b c d; do echo $[((a*256 +b)*256+c)*256+d]; done) | \
> sort -n -u | \
> while read ip; do \
> echo $[ip/0x1000000].$[ip%0x1000000/0x10000].\
> $[ip%0x10000/0x100].$[ip%0x100]; done) < iplist
> 4.3.76.194
> 8.10.33.176
> 10.123.189.105
> 12.30.72.162
> 12.149.177.21
> 12.154.4.213
> 12.159.232.66
> 12.205.7.190
> 12.206.142.76
> 12.214.50.126
> 12.221.163.162
> paul@wart:~$
> ''
Nice, Paul! The double conversion strikes me as a little unnecessary, but - TMTOWTDI, as I'd mentioned before. Speaking of which:
IFS=`echo -e"\n."`
is a bit unnecessary (especially since 'echo' is horribly broken in a number of shells, and the above will fail in many situations); you can just do
IFS='
.'
and accomplish the same thing.
-- * Ben Okopnik * Editor-in-Chief, Linux Gazette * http://LinuxGazette.NET *
Thomas Adam [thomas.adam22 at gmail.com]
Fri, 10 Nov 2006 19:21:48 +0000
On 10/11/06, Benjamin A. Okopnik <ben@linuxgazette.net> wrote:
> ``
> IFS=`echo -e"\n."`
> ''
>
> is a bit unnecessary (especially since 'echo' is horribly broken in a
> number of shells, and the above will fail in many situations); you can
> just do
>
> ``
> IFS='
> .'
> ''
>
> and accomplish the same thing.
As does:
IFS=$'\n'
-- Thomas Adam
Benjamin A. Okopnik [ben at linuxgazette.net]
Fri, 10 Nov 2006 15:13:04 -0500
On Fri, Nov 10, 2006 at 07:21:48PM +0000, Thomas Adam wrote:
> On 10/11/06, Benjamin A. Okopnik <ben@linuxgazette.net> wrote:
> > ``
> > IFS=`echo -e"\n."`
> > ''
> >
> > is a bit unnecessary (especially since 'echo' is horribly broken in a
> > number of shells, and the above will fail in many situations); you can
> > just do
> >
> > ``
> > IFS='
> > .'
> > ''
> >
> > and accomplish the same thing.
>
> As does:
>
> ``
> IFS=$'\n'
> ''
Not exactly, although the error is understandable: what's needed is a newline followed by a period. However, the above also fails in other shells:
ben@Fenrir:/tmp/foo$ ls -1 # Bash
another_file
filename with spaces
some_file
ben@Fenrir:/tmp/foo$ ksh # KSH
$ for n in `ls`; do echo $n; done
another_file
filename
with
spaces
some_file
$ IFS=$'\n'
$ for n in `ls`; do echo $n; done
a
other_file
file
ame with spaces
some_file
$
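One plausible reading of that ksh output, offered as an assumption rather than something checked against that particular ksh build: a shell that does not implement the $'...' form sees a literal `$` followed by the single-quoted string `\n`, so IFS ends up holding the three characters `$`, `\` and `n`, and newline is no longer a separator at all. The Python sketch below only simulates splitting on those three characters; it is not a model of the shell's full word-splitting rules.

```python
import re

# What `ls` hands back to the for loop: three names separated by newlines.
ls_output = "another_file\nfilename with spaces\nsome_file"

# Assumed IFS contents in a shell without $'...' support: '$', '\' and 'n'.
ifs = "$\\n"

# Split on any single IFS character, dropping empty fields.
pattern = "[" + re.escape(ifs) + "]"
fields = [field for field in re.split(pattern, ls_output) if field]

# echo prints each field as is, embedded newlines included.
for field in fields:
    print(field)
```

Under that assumption the split falls on every letter "n", which would account for "a" / "other_file" and "file" / "ame with spaces" in the listing.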
-- * Ben Okopnik * Editor-in-Chief, Linux Gazette * http://LinuxGazette.NET *
Benjamin A. Okopnik [ben at linuxgazette.net]
Fri, 10 Nov 2006 15:51:09 -0500
On Wed, Nov 08, 2006 at 03:43:27PM -0500, Faber Fedor wrote:
> On 08/11/06 15:31 -0500, Benjamin A. Okopnik wrote: > > On Wed, Nov 08, 2006 at 03:18:40PM +0200, Paul Sephton wrote: > > > btw: anyone know of people who use perl as their default shell? <grin> > > > > However low that number may be, there are fewer using 'sort' as one. > > > > http://sourceforge.net/projects/psh/ > > And all this time I had you pegged as a Futurama fan, Ben. > > http://zoidberg.student.utwente.nl/
Hadn't even heard of that one, believe it or not (or have forgotten about it if I did.) Zoinks and zounds! It looks very nice, and quite mature. Although I think I'll stick with Bash - I'd hate to get out of the habit.
-- * Ben Okopnik * Editor-in-Chief, Linux Gazette * http://LinuxGazette.NET *
Rick Moen [rick at linuxmafia.com]
Fri, 10 Nov 2006 13:16:33 -0800
Quoting Paul Sephton (paul@inet.co.za):
> Interpreting the syntax, the [] brackets simply means "optional" ( refer > BNF ). Therefore, --key=POS1[,POS2] simply means "one or more POS > separated by comma".
I've actually been minutely aware of exactly how Backus-Naur Form works since Algol days -- and had, thank you, already read the sort(1) documentation (such as it is), in some detail. What I was suggesting is that the documentation is inaccurate and misleading. That's just a surmise based on experimentation, however.
I can't speak to how ancient versions such as v5.0 and v5.2.1 work; modern versions such as v5.94 do function as I described.
> A very highly recommended book, Unix Power Tools > http://www.oreilly.com/catalog/upt3/ describes the use of sort in > great detail.
Yes, I of course have had a copy since ancient days -- but invariably not near me when I'm dealing with e-mail.
> Indeed, nslookup does rely on BIND8 features. Again, a change to those > features led to the demise of nslookup, which had existed (in it's > pristine form) for some 15 years prior to it's demise.
It was not a change to BIND8's features, exactly, but rather the (richly deserved) demise of BIND8 itself -- not to mention a large number of other, serious implementation errors in nslookup that have jointly necessitated switching to a better tool.
Paul Sephton [paul at inet.co.za]
Fri, 10 Nov 2006 23:43:30 +0200
On Fri, 2006-11-10 at 13:16 -0800, Rick Moen wrote:
> Quoting Paul Sephton (paul@inet.co.za): > > documentation (such as it is), in some detail. What I was suggesting is > that the documentation is inaccurate and misleading. That's just a > surmise based on experimentation, however. > > I can't speak to how ancient versions such as v.5.0 and v5.2.1 work; > modern versions such as v5.94 do function as I described.
Ok, I think there is room enough here for both of our beliefs to be at least partially accurate. Certainly, the documentation does not reflect the behaviour of GNU sort. What I pointed out is that it once did, as of GNU sort v5.0. Somewhere either with or prior to v5.2.1, behaviour changed, and the new behaviour apparently persists up to v5.94?
Yes, documentation is inaccurate in that it does not describe behaviour, and yes, GNU sort is broken when measured against the documentation (man page, embedded documentation and formal).
I think this situation is unacceptable.
-- Paul Sephton