awk multiple lines with conditions

Posted on 2023-02-08 Edited on 2025-07-26 In Notes

The log file consists of records that look like this:

xxxxxxxxxxxxxxxxxxxxxxxxx..xxxx
keyword1=aaaaa
keyword2=bbbbb
xxxxxxx
xxxxxxx

xxxxxxxxxxxxxxxxxxxxxxxxx..xxxx
xxxxxxxxxxxxxx...xxxxxx
keyword1=ccccc
keyword2=bbbbb
xxxxxxx
xxxxx
xxxxxxx

xxxxxxxxxxxxxxxxxxxxxxxxx..xxxx
keyword1=aaaaa
keyword2=ddddd
xxxxxxx
xxxxxxx

Records are separated by an empty line. The goal is to extract the records that contain both keyword1=aaaaa and keyword2=bbbbb.

One way to do this is to cat the file, pipe it through sed to replace each empty line with a form feed, and then pipe the result to awk to filter.

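cat records.log | sed $'s|^\s*$|\f|' | awk '
BEGIN { RS = "\f" }           # records end at a form feed, not a newline
!/keyword1=aaaaa/ { next }    # skip records without keyword1=aaaaa
/keyword2=bbbbb/ { print }    # print the rest if they contain keyword2=bbbbb
'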

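sed replaces every line that is empty or contains only whitespace (^\s*$) with a form feed \f. The $'...' quoting (supported by bash, among other shells) turns \f into a real form feed character before sed sees it, and \s is a GNU sed extension. Form feed rarely appears in log text, so it works well here as an unambiguous record separator for awk.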
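To see what the sed step produces, here is a minimal sketch on made-up sample data. It uses a more portable spelling than the command above, which is my own assumption rather than part of the original pipeline: printf generates the form feed and the POSIX class [[:space:]] stands in for \s. The variable names sample, ff and marked are only for illustration.

```shell
# Hypothetical sample: two records separated by a whitespace-only line.
sample=$(printf 'keyword1=aaaaa\nkeyword2=bbbbb\n  \nkeyword1=ccccc\n')

# Portable spelling: printf produces the form feed, [[:space:]] replaces \s.
ff=$(printf '\f')
marked=$(printf '%s\n' "$sample" | sed "s|^[[:space:]]*\$|$ff|")

# od -c makes the control character visible as \f in its output.
printf '%s\n' "$marked" | od -c
```

The separator line is now a single form feed, while the lines inside each record are left unchanged.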

At the beginning of the awk program, we set RS to form feed \f, replacing the default record separator, a newline. Now, instead of reading line by line, awk reads until it sees \f and treats everything before it as one record. If the record does not contain keyword1=aaaaa, we skip it by calling next. Otherwise we check for keyword2=bbbbb and print the record if it matches.
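As a possible alternative, awk has a built-in paragraph mode: setting RS to the empty string makes runs of empty lines act as record separators, so the sed step is not needed. This is a different technique from the one above, and the sketch assumes the separator lines are truly empty (a line holding only spaces would not split records). The sample data and the names log and matches are made up for illustration.

```shell
# Hypothetical sample data mirroring the three records in the post.
log=$(mktemp)
cat > "$log" <<'EOF'
header one
keyword1=aaaaa
keyword2=bbbbb
tail

header two
keyword1=ccccc
keyword2=bbbbb
tail

header three
keyword1=aaaaa
keyword2=ddddd
tail
EOF

# RS="" switches awk to paragraph mode: empty lines end a record.
# A pattern with no action prints the matching record.
matches=$(awk 'BEGIN { RS = "" } /keyword1=aaaaa/ && /keyword2=bbbbb/' "$log")
printf '%s\n' "$matches"
rm -f "$log"
```

Only the first record satisfies both conditions, so it is the only one printed. Combining the two patterns with && is equivalent to the next-then-print logic, just written as a single rule.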