In this post we will look at how to use the Linux grep command with regular expressions.

Almost every Linux user or admin uses the grep command on a daily basis. It searches for and filters patterns in a file or in whatever input is provided to it. Most of us only use grep in a rudimentary way, but it can do much more: it supports a wide range of regex (regular expressions), which makes it powerful and useful for daily work. In this post we will try to understand how to use them efficiently. Grep stands for “global regular expression print”.

Setup

For this setup I am using an Ubuntu 18.04 machine, as of Aug 2018.

root@jarvis:/usr/share/doc/grep# lsb_release -d
Description:	Ubuntu 18.04.1 LTS

root@jarvis:/usr/share/doc/grep# grep -V
grep (GNU grep) 3.1
Copyright (C) 2017 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later .
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Written by Mike Haertel and others, see .

First we need a document to search with grep. Let’s start with something that is readily available to anyone who has grep installed. When the grep package is installed on Linux/Unix, it usually places its own documentation under /usr/share/doc, like below.

root@jarvis:~# cd /usr/share/doc/grep/
root@jarvis:/usr/share/doc/grep# ls

AUTHORS  changelog.Debian.gz  copyright  NEWS.gz  README  THANKS.gz  TODO.gz

Let’s work with the copyright file first to try out the grep command.

Printing lines that match a pattern

Suppose we want to find a word in a file, for example “Franklin” in the copyright file, like below.

root@jarvis:/usr/share/doc/grep# grep Franklin  copyright 
 Foundation, Inc., 51 Franklin Street - Fifth Floor, Boston, MA

So our syntax was:

#grep PATTERN FILE-NAME

With this, any line containing PATTERN is printed on screen. This helps to search for a pattern, an exact word or any string in a file. We can do the same with command output as well.

root@jarvis:~# ethtool eno1| grep Link
	Link detected: yes

In this command we searched for PATTERN in the output of a command. These small things make a huge difference when searching through large outputs or big files. They also help a lot when writing scripts that automate most of the work.
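The two ways of calling grep shown above can be sketched as below. This is only an illustration: the sample file is a temporary file with made-up contents modelled on the copyright file, and the piped text stands in for the ethtool output, so it runs on a machine without either.

```shell
# Throwaway sample file (contents are illustrative, not the real copyright file).
sample=$(mktemp)
printf '%s\n' \
  'License: GPL-3+' \
  ' Foundation, Inc., 51 Franklin Street - Fifth Floor, Boston, MA' \
  'Source: https://savannah.gnu.org/projects/grep' > "$sample"

# Search a file: every line that contains the pattern is printed.
from_file=$(grep 'Franklin' "$sample")

# Search command output: with no file name, grep reads standard input.
from_pipe=$(printf 'Speed: 1000Mb/s\nLink detected: yes\n' | grep 'Link')

printf '%s\n' "$from_file" "$from_pipe"
rm -f "$sample"
```

Keeping the result in a variable, as done here, is the usual starting point when grep is used inside a script.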

The grep command supports many kinds of search patterns, which helps us in many ways. I suggest you learn more about grep through its man page.

man grep

In this post we will focus on the various regex that grep can use to search for these patterns.

Searching lines starting with

In regex (regular expression) the ^ (caret) matches the start of a line. So if we want to find lines that start with some pattern, we can search with the help of ^ (caret) like below.

root@jarvis:/usr/share/doc/grep# grep ^License copyright 
License: GPL-3+
License: GPL-3+
License: GPL-3+

Now we know three lines start with License. But what if there is some other text between two patterns, like below?

root@jarvis:/usr/share/doc/grep# grep ^Copyright.*Inc copyright 
Copyright: 1992, 1997-2002, 2004-2012 Free Software Foundation, Inc.

Description:-
Here we searched for lines that start with Copyright, followed by any other text, and then the word Inc. For the text in between we use .* where . matches any single character and * repeats it zero or more times.
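To see what the caret and .* actually do, here is a small sketch on a made-up sample file (the lines are illustrative, only loosely based on the copyright file). It counts matches with and without the ^ anchor, and includes a line where nothing at all sits between the two words.

```shell
sample=$(mktemp)
printf '%s\n' \
  'Copyright: 1992 Free Software Foundation, Inc.' \
  'CopyrightInc' \
  ' Copyright (C) 1992, Inc.' \
  'License: GPL-3+' > "$sample"

# With ^ the line must begin with Copyright; .* may match an empty string,
# so 'CopyrightInc' is counted too, but the line with a leading space is not.
anchored=$(grep -c '^Copyright.*Inc' "$sample")

# Without ^ the pattern may start anywhere in the line.
anywhere=$(grep -c 'Copyright.*Inc' "$sample")

echo "anchored: $anchored, anywhere: $anywhere"
rm -f "$sample"
```

On this sample the anchored pattern matches two lines and the unanchored one matches three.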

Searching lines ending with

Sometimes we need to find lines that end with some pattern. In regex (regular expression) $ matches the end of a line.

Here we search for lines that end with License, so we put $ right after the pattern.

root@jarvis:/usr/share/doc/grep# grep License$ copyright 
 You should have received a copy of the GNU General Public License

Now we can combine these two into a single regex that starts with one pattern and ends with another, like below.

root@jarvis:/usr/share/doc/grep# grep ^Source.*grep$ copyright
Source: https://savannah.gnu.org/projects/grep

Description:-
Here we printed lines that start with Source and end with grep.
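A short sketch of both anchors together, again on a made-up sample file. The patterns are wrapped in single quotes here, which is not required for the commands above but keeps the shell from interpreting characters such as $, * or [ before grep sees them.

```shell
sample=$(mktemp)
printf '%s\n' \
  'Source: https://savannah.gnu.org/projects/grep' \
  'Source: https://savannah.gnu.org/projects/grep/files' \
  'Upstream-Name: grep' > "$sample"

# $ anchors the match at the end of the line.
ends_with=$(grep -c 'grep$' "$sample")

# ^ and $ in one pattern: start with Source AND end with grep.
both=$(grep '^Source.*grep$' "$sample")

echo "lines ending with grep: $ends_with"
echo "start and end match:    $both"
rm -f "$sample"
```

Only the first sample line satisfies both anchors; the second starts with Source but ends with files, and the third ends with grep but starts with something else.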

Number/Characters/Alphabet range patterns

We can print lines containing numbers, lines starting or ending with numbers, or lines with some specific pattern of numbers, like below.

This prints lines that end with a digit. A digit is written as [0-9]; the hyphen (-) specifies a range, and it works for both digits and letters.

root@jarvis:/usr/share/doc/grep# grep [0-9]$ copyright 
 2003, Clint Adams   Mon, 10 Mar 2003 02:10:32 -0500

Lines that start with a space followed by a digit, like below.

root@jarvis:/usr/share/doc/grep# grep ^[[:space:]][0-9] copyright 
 2004, Stepan Kasal  
 2007, Tony Abou-Assaleh 
 2009-2012, Jim Meyering  and Paolo Bonzini 
 2003-2004 Ryan M. Golbeck 
 2003, Jeff Bailey 
 2003, Clint Adams   Mon, 10 Mar 2003 02:10:32 -0500
 2001 Robert van der Meulen 
 1996-2000 Wichert Akkerman 
 2005, 2006, 2007, 2008, 2009 Free Software Foundation, Inc.
 02110-1301, USA.  

Lines that start with a space followed by a capital letter, and end with a digit and a comma (,), like below. Above we used only a number range; now we also use an alphabet range.

root@jarvis:/usr/share/doc/grep# grep ^[[:space:]][A-Z].*[0-9],$ copyright 
 Copyright (C) 1992, 1997, 1998, 1999, 2000, 2001, 2002, 2004,

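Similar to the above, but the second character can be either a capital or a small letter, and since there is no ,$ at the end, a digit only has to appear somewhere later in the line.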

root@jarvis:/usr/share/doc/grep# grep ^[[:space:]][a-zA-Z].*[0-9] copyright 
 Copyright (C) 1992, 1997, 1998, 1999, 2000, 2001, 2002, 2004,
 the Free Software Foundation; either version 3, or (at your option)
 Foundation, Inc., 51 Franklin Street - Fifth Floor, Boston, MA
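The range patterns from this section can be tried side by side with the sketch below. The sample lines are made up for illustration. LC_ALL=C is set for each grep call as a precaution, on the assumption that your locale may sort letters differently: ranges like [A-Z] are only guaranteed to mean plain ASCII capitals in the C locale.

```shell
sample=$(mktemp)
printf '%s\n' \
  ' 2003, Clint Adams' \
  ' Copyright (C) 1992, 2004,' \
  ' the Free Software Foundation; either version 3, or' \
  'Format: 1.0' > "$sample"

# Line ends with a digit.
ends_digit=$(LC_ALL=C grep -c '[0-9]$' "$sample")

# Line starts with one whitespace character, then a digit.
space_digit=$(LC_ALL=C grep -c '^[[:space:]][0-9]' "$sample")

# Whitespace, a capital letter, anything, then a digit and a comma at the end.
cap_digit_comma=$(LC_ALL=C grep -c '^[[:space:]][A-Z].*[0-9],$' "$sample")

# Whitespace, any ASCII letter, then a digit anywhere later in the line.
letter_digit=$(LC_ALL=C grep -c '^[[:space:]][a-zA-Z].*[0-9]' "$sample")

echo "$ends_digit $space_digit $cap_digit_comma $letter_digit"
rm -f "$sample"
```

On this sample the first three patterns match one line each, and the last one matches two.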

Character Classes and Bracket Expressions

I hope it is clear by now how to match the start of a line, the end of a line, numbers and letters. Some sets of characters are so common that they have predefined names, which I have tried to cover in the list below.
These are called Character Classes and Bracket Expressions, and you will most likely use them in grep as shown below. A character class name itself is enclosed in a single pair of square brackets, but it only takes effect inside a bracket expression, so we enclose it in another pair of square brackets.

[[:alnum:]]
	---Alphanumeric characters: ‘[:alpha:]’ and ‘[:digit:]’; in the ‘C’ locale and ASCII character encoding, this is the same as ‘[0-9A-Za-z]’.

[[:alpha:]]
	---Alphabetic characters: ‘[:lower:]’ and ‘[:upper:]’; in the ‘C’ locale and ASCII character encoding, this is the same as ‘[A-Za-z]’.

[[:blank:]]
	---Blank characters: space and tab.

[[:cntrl:]]
	---Control characters. In ASCII, these characters have octal codes 000 through 037, and 177 (DEL). In other character sets, these are the equivalent characters, if any.

[[:digit:]]
	---Digits: 0 1 2 3 4 5 6 7 8 9.

[[:graph:]]
	---Graphical characters: ‘[:alnum:]’ and ‘[:punct:]’.

[[:lower:]]
	---Lower-case letters; in the ‘C’ locale and ASCII character encoding, this is a b c d e f g h i j k l m n o p q r s t u v w x y z.

[[:print:]]
	---Printable characters: ‘[:alnum:]’, ‘[:punct:]’, and space.

[[:punct:]]
	---Punctuation characters; in the ‘C’ locale and ASCII character encoding, this is ! " # $ % & ' ( ) * + , - . / : ; < = > ? @ [ \ ] ^ _ ` { | } ~.

[[:space:]]
	---Space characters: in the ‘C’ locale, this is tab, newline, vertical tab, form feed, carriage return, and space. See the Usage section of the grep manual for more discussion of matching newlines.

[[:upper:]]
	---Upper-case letters: in the ‘C’ locale and ASCII character encoding, this is A B C D E F G H I J K L M N O P Q R S T U V W X Y Z.

[[:xdigit:]]
	---Hexadecimal digits: 0 1 2 3 4 5 6 7 8 9 A B C D E F a b c d e f.

We can use any of them when searching with the grep command. These classes are not specific to grep; other Linux commands that accept regex use them too. We will use them in more detail later with other regex features available on Linux.
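As a rough illustration of the classes listed above, the sketch below pulls individual characters out of a made-up string with grep -o (one match per line) and joins them with tr. The last pattern shows that several class names can share one bracket expression.

```shell
text='Ubuntu 18.04 LTS, grep 3.1!'

# Each grep -o prints one matching character per line; tr joins them.
digits=$(printf '%s\n' "$text" | grep -o '[[:digit:]]' | tr -d '\n')
uppers=$(printf '%s\n' "$text" | grep -o '[[:upper:]]' | tr -d '\n')
puncts=$(printf '%s\n' "$text" | grep -o '[[:punct:]]' | tr -d '\n')

# Two classes inside one bracket expression: upper-case letters OR digits.
mixed=$(printf '%s\n' "$text" | grep -o '[[:upper:][:digit:]]' | tr -d '\n')

printf '%s\n' "$digits" "$uppers" "$puncts" "$mixed"
```

Writing [:digit:] with only one pair of brackets would be read as an ordinary bracket expression containing the characters :, d, i, g and t, which is why the double brackets matter.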

Grep repeating numbers and characters

There are other regex that help us search for repeated numbers and characters, like below.

With this regex we are trying to find year ranges.

root@jarvis:/usr/share/doc/grep# grep "^[[:space:]][1-2][0-9]\{3\}-[0-9]\{2,4\}" copyright 
 2009-2012, Jim Meyering  and Paolo Bonzini 
 2003-2004 Ryan M. Golbeck 
 1996-2000 Wichert Akkerman 

Description:-
[1-2] : one digit, either 1 or 2
[0-9]\{3\} : a digit from 0 to 9, repeated exactly 3 times, so together with the part above it can be 1983 or 2018
Between these two parts there is a hyphen (-). It is not a special expression here, just a literal character to match as it is.
[0-9]\{2,4\} : a digit from 0 to 9, repeated at least 2 times and at most 4 times.

That is how we arrive at the grep output above.
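The repetition counts can be checked with a small sketch on made-up sample lines. It also shows the same pattern written for grep -E (extended regex), where the braces are not escaped; both forms should give the same result.

```shell
sample=$(mktemp)
printf '%s\n' \
  ' 2009-2012, Jim Meyering' \
  ' 2003, Jeff Bailey' \
  ' 1996-2000 Wichert Akkerman' \
  ' 02110-1301, USA.' > "$sample"

# Basic regex: \{3\} means exactly three, \{2,4\} means two to four.
basic=$(grep -c '^[[:space:]][1-2][0-9]\{3\}-[0-9]\{2,4\}' "$sample")

# Extended regex (-E): same meaning, braces written without backslashes.
extended=$(grep -E -c '^[[:space:]][1-2][0-9]{3}-[0-9]{2,4}' "$sample")

echo "basic: $basic, extended: $extended"
rm -f "$sample"
```

Only the two year ranges match. The line with 2003 has no hyphen after the year, and the postal code starts with 0, which [1-2] rejects.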

Below we search for an http URL. Here [[:punct:]]\{3\} matches the three punctuation characters :// that follow http.

root@jarvis:/usr/share/doc/grep# grep "http[[:punct:]]\{3\}" copyright 
Format: http://www.debian.org/doc/packaging-manuals/copyright-format/1.0/

This way we know that numbers and punctuation characters can also be searched with repeating patterns. We can do the same with letters, like below.

root@jarvis:/usr/share/doc/grep# grep "[[:space:]]a[a-z]\{2\}[[:space:]]" copyright 
 2009-2012, Jim Meyering  and Paolo Bonzini 
Copyright: 2005-2013 Anibal Monsalve Salazar  and Santiago Ruano Rincón 
 any later version.

Description:-
Above we searched for three-letter words that start with the letter a and have whitespace on both sides.

Show only matches, not the complete line

We can also print only the part that matched the regex instead of the complete line.

root@jarvis:/usr/share/doc/grep# grep -o "[[:space:]]a[a-z]\{2\}[[:space:]]" copyright 
 and 
 and 
 any 

This is an easy way to search and save the output in a variable for bash scripts. Note that every match is printed on a new line, so two matches in one line are printed on two separate lines.
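A minimal sketch of that idea, using a made-up input line: the matches from grep -o are stored in a variable, counted, and then looped over one by one. The variable names are only illustrative.

```shell
line=' 2009-2012, Jim Meyering and Paolo Bonzini, 1996-2000'

# Each four-digit match ends up on its own line inside $years.
years=$(printf '%s\n' "$line" | grep -o '[0-9]\{4\}')

# Count the matches and pick the first one.
count=$(printf '%s\n' "$years" | grep -c '')
first=$(printf '%s\n' "$years" | head -n 1)

# Unquoted $years is split on whitespace, giving one loop pass per match.
joined=''
for y in $years; do
  joined="$joined[$y]"
done

echo "count=$count first=$first joined=$joined"
```

Because the matches are separated by newlines, the usual line-oriented tools (head, tail, sort, wc) can be applied to the variable directly.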

I think this could be a never-ending topic, and I will try to keep updating this discussion. But that is it for today.