awk Two-file processing Check matching fields in two files


Example

Given these two CSV files:

$ cat file1
1,line1
2,line2
3,line3
4,line4
$ cat file2
1,line3
2,line4
3,line5
4,line6

To print the lines of file2 whose second field also occurs as the second field of some line in file1, we can say:

$ awk -F, 'FNR==NR {lines[$2]; next} $2 in lines' file1 file2
1,line3
2,line4

Here, lines is an array whose keys are the second field of each line of file1. It is populated only while file1 is being read: FNR==NR holds only for the first file, and next skips the remaining rule for those lines.
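The FNR==NR test works because FNR restarts at 1 for each input file while NR keeps counting across all of them, so the two are equal only while the first file is being read. A minimal sketch to see this, assuming the sample files are recreated in a scratch directory (the dir and counters names are only illustrative):

```shell
# Recreate the two sample files in a scratch directory (illustrative location).
dir=$(mktemp -d)
printf '1,line1\n2,line2\n3,line3\n4,line4\n' > "$dir/file1"
printf '1,line3\n2,line4\n3,line5\n4,line6\n' > "$dir/file2"

# Print the per-file record number (FNR) and the overall record number (NR).
counters=$(awk '{print FNR, NR}' "$dir/file1" "$dir/file2")
printf '%s\n' "$counters"
# 1 1
# 2 2
# 3 3
# 4 4
# 1 5
# 2 6
# 3 7
# 4 8

rm -rf "$dir"
```

One consequence: if file1 is empty, FNR==NR is also true while file2 is read, so the idiom assumes a non-empty first file.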

Then, for every line of file2, the condition $2 in lines checks whether the second field exists as a key in the array. If it does, the condition is true and awk performs its default action, which is to print the whole line.
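The same membership test can be negated to get the opposite selection. As a sketch, assuming the sample files above are recreated in a scratch directory (dir and missing are illustrative names), this prints the lines of file2 whose second field does not occur in file1:

```shell
# Recreate the two sample files in a scratch directory (illustrative location).
dir=$(mktemp -d)
printf '1,line1\n2,line2\n3,line3\n4,line4\n' > "$dir/file1"
printf '1,line3\n2,line4\n3,line5\n4,line6\n' > "$dir/file2"

# First rule: collect the second field of every file1 line as an array key.
# Second rule: for file2, print lines whose second field is NOT a key.
missing=$(awk -F, 'FNR==NR {lines[$2]; next} !($2 in lines)' "$dir/file1" "$dir/file2")
printf '%s\n' "$missing"
# 3,line5
# 4,line6

rm -rf "$dir"
```

The parentheses matter for readability: !($2 in lines) negates the result of the membership test as a whole.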

If only one field needs to be printed, the expression could be:

$ awk -F, 'FNR==NR {lines[$2]; next} $2 in lines {print $1}' file1 file2
1
2
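If the output should combine fields from both files, one possible variation is to store a value in the array instead of only creating the key. The sketch below assumes the same sample files, recreated in a scratch directory; the names dir, first and pairs are illustrative. For each matching line of file2 it prints that line's first field followed by the first field of the file1 line with the same second field:

```shell
# Recreate the two sample files in a scratch directory (illustrative location).
dir=$(mktemp -d)
printf '1,line1\n2,line2\n3,line3\n4,line4\n' > "$dir/file1"
printf '1,line3\n2,line4\n3,line5\n4,line6\n' > "$dir/file2"

# While reading file1, remember its first field under the second field.
# While reading file2, print both first fields for matching second fields.
# OFS is set to a comma so the output stays comma-separated.
pairs=$(awk -F, -v OFS=, 'FNR==NR {first[$2]=$1; next} $2 in first {print $1, first[$2]}' "$dir/file1" "$dir/file2")
printf '%s\n' "$pairs"
# 1,3
# 2,4

rm -rf "$dir"
```

If a second-field value repeats within file1, this keeps only the first field of the last such line, since later assignments overwrite earlier ones.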