Global Regular Expression Print, or Grep, is a tool that searches text files for occurrences of a specified regular expression and writes every line containing a match to standard output. Grep uses regular expressions, or Regex, as its matching algorithm. Regex is a symbolic notation used to identify patterns in text and is widely used to process text. It’s time to learn some ‘grep’!
Grep on Linux and Mac
When you use the terminal, chances are that you use both Linux and Mac and switch back and forth between them. Let’s start by explaining that the Grep command is different on Linux than on the Mac. The Mac comes with the BSD version of Grep, which can be checked by typing:
$ grep -V
grep (BSD grep) 2.5.1-FreeBSD
Linux uses the GNU version of Grep, which can be checked by typing:
$ docker run -it ubuntu
root@d5b72371f815:/# grep -V
grep (GNU grep) 3.1
Copyright (C) 2017 Free Software Foundation, Inc.
GNU Grep on Mac
GNU grep can be installed on the Mac by means of Homebrew. After brew has installed grep, you have two versions of the command. The BSD version is called grep and the GNU version is called ggrep. The examples that follow use the GNU version of grep, so Mac users should use the ggrep command.
$ brew install grep
$ ggrep -V
ggrep (GNU grep) 3.1
Packaged by Homebrew
Copyright (C) 2017 Free Software Foundation, Inc.
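A script that has to run on both systems can pick the GNU binary when it is present. The following is a minimal sketch, assuming Homebrew's ggrep name on the Mac; the variable name GREP is only illustrative and not something grep itself defines.

```shell
# Prefer Homebrew's GNU grep (ggrep) when it exists, otherwise fall back to grep.
# GREP is an illustrative variable name chosen for this example.
if command -v ggrep >/dev/null 2>&1; then
  GREP=ggrep
else
  GREP=grep
fi

# Use the selected command exactly like the plain grep command.
picked=$(printf 'alpha\nbeta\n' | "$GREP" "beta")
echo "$picked"
```

On Linux, where ggrep normally does not exist, the sketch simply falls back to grep.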
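POSIX and PCRE Regex
The difference between the BSD and GNU version of Grep that matters here is the set of Regex engines they offer. Both versions use POSIX basic regular expressions by default and switch to POSIX extended regular expressions with the -E option. The GNU version additionally supports Perl Compatible Regular Expressions (PCRE) through the -P option, provided it was built with PCRE support, while the BSD version shipped with the Mac does not offer -P. The short explanation of the difference is that PCRE patterns are often much easier to write. For an overview of the differences see this cheat sheet.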
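To make the two notations concrete, here is a small sketch on made-up input. It assumes GNU grep built with PCRE support (ggrep on the Mac); the sample lines and variable names are invented for illustration. The -E form writes a digit as a POSIX bracket expression, and the -P form uses the Perl shorthand \d.

```shell
# Made-up sample input: one line with digits, one without.
sample=$(printf 'order 42\nno digits here\n')

# POSIX extended regex (-E): digits are written as a bracket expression.
posix_hits=$(printf '%s\n' "$sample" | grep -E '[0-9]+')

# Perl-compatible regex (-P, GNU grep only): \d is shorthand for a digit.
pcre_hits=$(printf '%s\n' "$sample" | grep -P '\d+')

echo "$posix_hits"
echo "$pcre_hits"
```

Both commands select the same line here; the difference is only in how the pattern is written.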
Using Grep
Most often, we pipe the output of another command into grep. That way, grep acts as a filter, e.g.:
# take the first 3 lines from the history, filtered by grep
$ history | grep "git" | head -3
157 git status
158 git status
159 git add .
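Because history only has useful content in an interactive shell, the same filter pattern can be tried on fixed input. This is a minimal sketch; the command list is invented and stands in for the output of history.

```shell
# Invented command list standing in for the output of `history`.
commands=$(printf 'git status\nls -la\ngit add .\ngit commit\ngit push\n')

# Keep only the lines containing "git", then take the first 3 of those.
filtered=$(printf '%s\n' "$commands" | grep "git" | head -3)
echo "$filtered"
```

The order of the pipeline matters: grep filters first, and head then limits the already filtered lines.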
By default, grep matches are case sensitive:
$ history | grep "TEST" | head -3
399 touch TEST_1.txt
400 touch TEST_2.txt
401 touch TEST_3.txt
By adding the -i option, searches become case insensitive:
$ history | grep -i "TEST" | head -3
302 find . -type f -name test*
303 find . -type f -name test*.*
304 find . -type f -name test1.txt
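The effect of -i is easy to compare side by side. The sketch below uses invented input lines that resemble the history entries above.

```shell
# Invented input: one upper-case match, one lower-case match, one non-match.
lines=$(printf 'touch TEST_1.txt\nfind . -name test1.txt\necho done\n')

# Default: case sensitive, so only the upper-case line matches.
sensitive=$(printf '%s\n' "$lines" | grep "TEST")

# With -i: case is ignored, so both the TEST and the test line match.
insensitive=$(printf '%s\n' "$lines" | grep -i "TEST")

echo "$sensitive"
echo "$insensitive"
```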
We can also match whole words only with the -w option:
$ history | grep -w "git status" | head -3
157 git status
158 git status
279 git status
We can combine options, so -iw searches for a whole word, case insensitive:
$ history | grep -iw "TEST_1" | head -3
399 touch TEST_1.txt
545 history | grep -iw "TEST_1" | head -3
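What counts as a word is worth seeing in a small example. With -w, the match must be bounded by non-word characters, where word characters are letters, digits and the underscore. The input lines in this sketch are invented.

```shell
# Invented input lines.
lines=$(printf 'run test now\nrun testing now\ntouch TEST_1.txt\n')

# Without -w, "test" also matches inside "testing".
plain=$(printf '%s\n' "$lines" | grep "test")

# With -w, "test" must stand alone as a word, so "testing" is excluded.
whole=$(printf '%s\n' "$lines" | grep -w "test")

# -iw combines both options: "TEST_1" is followed by ".", a non-word
# character, so it matches as a whole word regardless of case.
both=$(printf '%s\n' "$lines" | grep -iw "test_1")

echo "$plain"
echo "$whole"
echo "$both"
```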
Searching within a file
I have prepared an example project that you can use to learn grep. Grep can be used to search through files for content. For example, to search for the text ‘boto3’ in the file Pipfile, type:
$ grep "boto3" Pipfile
"boto3" = "*"
The option -n shows the line number of the match:
$ grep -n "boto3" Pipfile
10:"boto3" = "*"
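Without the example project at hand, the same behaviour can be reproduced with a throwaway file. This is a minimal sketch; the file contents are invented and only loosely resemble a Pipfile.

```shell
# Create a throwaway file; its contents are invented for this example.
tmpfile=$(mktemp)
printf '[packages]\nrequests = "*"\nboto3 = "*"\n' > "$tmpfile"

# Without -n: only the matching line is printed.
plain=$(grep "boto3" "$tmpfile")

# With -n: the line number, a colon, then the matching line.
numbered=$(grep -n "boto3" "$tmpfile")

echo "$plain"
echo "$numbered"
rm -f "$tmpfile"
```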
Searching through files
Grep can also be used to search through multiple files by typing:
$ grep -n "boto3" ./*
./Pipfile:10:"boto3" = "*"
grep: ./config: Is a directory
grep: ./lambdas: Is a directory
grep: ./templates: Is a directory
grep: ./tests: Is a directory
Grep operates on files, not directories, which is why we see the message ‘Is a directory’. To search through all files in all folders we need to use the recursive -r option:
$ grep -r "handler" .
./lambdas/log_lambda.py:def handler(event, ctx):
./lambdas/cloudwatch_subscription_lambda.py:def handler(event, ctx) -> None:
./templates/cloudwatch.yaml: Handler: index.handler
./templates/cloudwatch.yaml: def handler(event, ctx):
./templates/cloudwatch.yaml: Handler: index.handler
./templates/cloudwatch.yaml: def handler(event, ctx) -> None:
Grep doesn’t have to show the matching lines; it can also report only which files contain a match with the -l option:
$ grep -rl "handler" .
./lambdas/log_lambda.py
./lambdas/cloudwatch_subscription_lambda.py
./templates/cloudwatch.yaml
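The difference between -r and -rl can be tried on a small temporary directory tree. In this sketch the file names are borrowed from the output above, but the file contents are invented; sort is added only to make the output order predictable.

```shell
# Build a small throwaway tree; file contents are invented for this example.
workdir=$(mktemp -d)
mkdir -p "$workdir/lambdas" "$workdir/templates"
printf 'def handler(event, ctx):\n    print(event)\n' > "$workdir/lambdas/log_lambda.py"
printf 'Handler: index.handler\n' > "$workdir/templates/cloudwatch.yaml"
printf 'nothing to see here\n' > "$workdir/README"

# -r: search all files below the current directory, printing file:line.
matches=$(cd "$workdir" && grep -r "handler" . | sort)

# -rl: print only the names of the files that contain a match.
files=$(cd "$workdir" && grep -rl "handler" . | sort)

echo "$matches"
echo "$files"
rm -rf "$workdir"
```

The README file contains no match, so it appears in neither result.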
Grep can also show context around the match. Use the -A option to see lines after the match, -B to see lines before the match, and -C to see lines both before and after the match:
$ grep -r -C 2 "handler" .
./lambdas/log_lambda.py:def handler(event, ctx):
./lambdas/log_lambda.py- print(event)
--
--
./lambdas/cloudwatch_subscription_lambda.py- return decode_record(event['awslogs'])
./lambdas/cloudwatch_subscription_lambda.py-
./lambdas/cloudwatch_subscription_lambda.py:def handler(event, ctx) -> None:
./lambdas/cloudwatch_subscription_lambda.py- print(json.dumps(decode_event(event)))
--
--
./templates/cloudwatch.yaml- Type: AWS::Lambda::Function
./templates/cloudwatch.yaml- Properties:
./templates/cloudwatch.yaml: Handler: index.handler
./templates/cloudwatch.yaml- Runtime: python3.6
./templates/cloudwatch.yaml- Role: !GetAtt 'LambdaBasicExecutionRole.Arn'
--
--
./templates/cloudwatch.yaml- Code:
./templates/cloudwatch.yaml- ZipFile: |-
./templates/cloudwatch.yaml: def handler(event, ctx):
./templates/cloudwatch.yaml- print(event)
./templates/cloudwatch.yaml-
--
--
./templates/cloudwatch.yaml- Type: AWS::Lambda::Function
./templates/cloudwatch.yaml- Properties:
./templates/cloudwatch.yaml: Handler: index.handler
./templates/cloudwatch.yaml- Runtime: python3.6
./templates/cloudwatch.yaml- Role: !GetAtt 'LambdaBasicExecutionRole.Arn'
--
--
./templates/cloudwatch.yaml- return decode_record(event['awslogs'])
./templates/cloudwatch.yaml-
./templates/cloudwatch.yaml: def handler(event, ctx) -> None:
./templates/cloudwatch.yaml- print(json.dumps(decode_event(event)))
./templates/cloudwatch.yaml- CloudWatchSubscriptionLambdaLogGroup:
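The three context options are easier to tell apart on a tiny file. This is a minimal sketch with invented content. Because only a single file is searched, there is no file name prefix; in the recursive output above, matching lines have a colon after the file name and context lines have a hyphen.

```shell
# Throwaway file with five invented lines.
tmpfile=$(mktemp)
printf 'one\ntwo\nthree\nfour\nfive\n' > "$tmpfile"

# -A 1: the match plus one line after it.
after=$(grep -A 1 "three" "$tmpfile")

# -B 1: the match plus one line before it.
before=$(grep -B 1 "three" "$tmpfile")

# -C 1: the match plus one line before and one line after it.
around=$(grep -C 1 "three" "$tmpfile")

echo "$around"
rm -f "$tmpfile"
```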
Regex search
Grep, being the Global Regular Expression Print, can search based on a Regex. Note that Mac users must use ggrep, and that the -P option is needed to search with a PCRE Regex:
$ grep -rl -P "decode_event([.]*)" .
./lambdas/cloudwatch_subscription_lambda.py
./tests/.pytest_cache/v/cache/nodeids
./tests/test_cloudwatch_subscription_lambda.py
./templates/cloudwatch.yaml
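One detail of this pattern is worth spelling out. In PCRE, unescaped parentheses form a group and [.]* matches zero or more literal dots, so the pattern matches any line that contains decode_event, which is consistent with the test cache file showing up in the results. The sketch below, on invented input and assuming GNU grep with PCRE support, compares it with a stricter, hypothetical variant that escapes the parentheses to match an actual call.

```shell
# Invented input: a call, a test id, and a similarly named variable.
lines=$(printf 'print(decode_event(event))\ntests/test_lambda.py::test_decode_event\ndecode_events = []\n')

# Pattern from the text: ( ) is a group and [.]* may match nothing,
# so every line containing "decode_event" is selected.
loose=$(printf '%s\n' "$lines" | grep -P "decode_event([.]*)")

# Hypothetical stricter variant: \( and \) match literal parentheses,
# so only the line with an actual call is selected.
strict=$(printf '%s\n' "$lines" | grep -P 'decode_event\(.*\)')

echo "$loose"
echo "$strict"
```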
Conclusion
Grep, or Global Regular Expression Print, can search file content based on a word or a Regex. Grep can also be used to filter the results of a command by means of a pipe. Grep is an indispensable tool in your toolbox!