Given these two CSV files:
$ cat file1
1,line1
2,line2
3,line3
4,line4
$ cat file2
1,line3
2,line4
3,line5
4,line6
To print the lines in file2 whose second column also occurs in file1, we can say:
$ awk -F, 'FNR==NR {lines[$2]; next} $2 in lines' file1 file2
1,line3
2,line4
Here, lines[] is an array that gets populated while reading file1, using the second field of each line as a key. The FNR==NR condition is only true while the first file is being read, and next skips the rest of the script for those lines.
Then, the condition $2 in lines checks, for every line in file2, whether its 2nd field exists as a key in the array. If so, the condition is true and awk performs its default action, which is to print the full line.
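As a rough illustration of how FNR==NR tells the two files apart, the sketch below prints both counters for every line. It is not part of the command above: the scratch directory and the trace variable exist only for this demo, and the words "store" and "test" are just labels for which branch would run.

```shell
# Recreate the two sample files in a scratch directory (demo only).
tmp=$(mktemp -d)
printf '1,line1\n2,line2\n3,line3\n4,line4\n' > "$tmp/file1"
printf '1,line3\n2,line4\n3,line5\n4,line6\n' > "$tmp/file2"

# FNR restarts at 1 for each input file, while NR keeps counting across files,
# so FNR==NR can only hold while the first file is being read.
trace=$(cd "$tmp" && awk -F, '{
    print FILENAME, "FNR=" FNR, "NR=" NR, (FNR==NR ? "store " : "test ") $2
}' file1 file2)
printf '%s\n' "$trace"

rm -rf "$tmp"
```

Every file1 line is reported with equal counters and the label "store", and every file2 line with a larger NR and the label "test". Note that this relies on file1 being non-empty: with an empty first file, FNR==NR would also hold while reading file2.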
If only one field needs to be printed, the command could be:
$ awk -F, 'FNR==NR {lines[$2]; next} $2 in lines {print $1}' file1 file2
1
2