awk Row Manipulation: Modifying rows on-the-fly (e.g. to fix Windows line-endings)


Example

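If a file may contain Windows or Unix-like line endings (or even a mixture of both), field comparisons and text replacements may not work as expected. By default, awk splits records at \n only, so the \r of a Windows line ending stays attached to the last field of the record.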

Sample:

$ echo -e 'Entry 1\nEntry 2.1\tEntry 2.2\r\nEntry 3\r\n\r\n' \
> | awk -F'\t' '$1 != "" { print $1 }' \
> | hexdump -c
0000000   E   n   t   r   y       1  \n   E   n   t   r   y       2   .
0000010   1  \n   E   n   t   r   y       3  \r  \n  \r  \n            
000001d

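In this output, the \r after "Entry 3" is still present, and the line that contains only \r\n was not treated as empty. This can be fixed by inserting an additional rule at the beginning of the awk script that strips a trailing \r from the record: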

/\r$/ { $0 = substr($0, 1, length($0) - 1) }

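Because the action does not end with next, the following rules are still applied, now to the modified record. Assigning to $0 also makes awk re-split the record into fields, so the last field no longer carries the \r.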
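The effect on later rules can be sketched with a comparison on the last field. This is only an illustration: the function name last_field_matches and the sample input key/value are made up for this sketch and are not part of the original example.

```shell
# Hypothetical helper: reads tab-separated lines from stdin.
last_field_matches() {
  awk -F'\t' '
    # Strip a trailing CR; assigning to $0 re-splits the fields.
    /\r$/ { $0 = substr($0, 1, length($0) - 1) }
    # Without the rule above, $2 would be "value\r" and this would not match.
    $2 == "value" { print "match" }
  '
}

# Made-up input line with a Windows line ending.
printf 'key\tvalue\r\n' | last_field_matches
```

With the first rule removed, the comparison would fail for this input because the last field would still end in \r.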

Sample (with fix of line-endings):

$ echo -e 'Entry 1\nEntry 2.1\tEntry 2.2\r\nEntry 3\r\n\r\n' \
> | awk -F'\t' '/\r$/ { $0 = substr($0, 1, length($0) - 1) } $1 != "" { print $1 }' \
> | hexdump -c
0000000   E   n   t   r   y       1  \n   E   n   t   r   y       2   .
0000010   1  \n   E   n   t   r   y       3  \n                        
000001a
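As an alternative sketch, the same clean-up can be written with sub(), which operates on $0 when no target is given and also causes the fields to be re-split. The wrapper function first_fields is a hypothetical name used only here, and printf replaces echo -e for portability.

```shell
# Hypothetical helper: prints the first tab-separated field of each non-empty line.
first_fields() {
  awk -F'\t' '
    # Remove one trailing CR if present; does nothing on Unix-style lines.
    { sub(/\r$/, "") }
    $1 != "" { print $1 }
  '
}

# Same mixed line endings as in the sample above.
printf 'Entry 1\nEntry 2.1\tEntry 2.2\r\nEntry 3\r\n\r\n' | first_fields
```

Since sub() simply finds nothing to replace on lines without a trailing \r, the rule needs no pattern and can run on every record.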