What I Learned From GNU Grep

While going through my Hacker News feed today, I came across this: Why GNU Grep is Fast. It’s an archived email from Mike Haertel, the original author of GNU Grep, to Gabor Kovesdan, explaining why GNU Grep is so fast.

The email is packed with insight, presented as easy-to-understand “tricks”.

Mike explains that two of these tricks do most of the work of speeding up Grep:

  • Grep avoids looking at every input byte
  • Grep executes very few instructions for each byte it does look at
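
As a rough illustration of the first trick, here is a minimal Python sketch of a skip-based search. It uses the Horspool simplification of Boyer-Moore, not GNU Grep's actual code, and the function name `horspool_find` and the `inspected` counter are my own additions for demonstration. The idea is to compare the pattern from its last byte backwards, and on a mismatch to jump ahead by an amount looked up from the byte under the end of the window.

```python
def horspool_find(haystack: bytes, needle: bytes) -> tuple[int, int]:
    """Return (index of the first match or -1, number of haystack bytes compared)."""
    m, n = len(needle), len(haystack)
    if m == 0:
        return 0, 0

    # Shift table: how far the window can jump, keyed by the haystack byte
    # aligned with the last position of the needle. Bytes that never occur
    # in the needle (excluding its last position) allow a full jump of m.
    shift = [m] * 256
    for i in range(m - 1):
        shift[needle[i]] = m - 1 - i

    inspected = 0  # hypothetical counter, only here to show the skipping
    pos = 0
    while pos + m <= n:
        # Compare right to left, starting at the last byte of the window.
        j = m - 1
        while j >= 0:
            inspected += 1
            if haystack[pos + j] != needle[j]:
                break
            j -= 1
        if j < 0:
            return pos, inspected
        # Skip ahead without touching the bytes in between.
        pos += shift[haystack[pos + m - 1]]
    return -1, inspected


if __name__ == "__main__":
    text = b"x" * 1000 + b"needle_in_haystack"
    index, looked_at = horspool_find(text, b"needle_in_haystack")
    print(index, looked_at, len(text))
```

With a longer pattern the jumps get longer, so the search looks at a smaller fraction of the input. Being Python, this says nothing about the second trick; the real inner loop is tight C.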

Other great lessons I took away:

  • Use the Boyer-Moore algorithm
  • Use unbuffered input through raw system calls, and avoid copying the input before searching it
  • Don’t look for newlines until after a match has been found
  • You need to get down to the kernel’s level to make it really fast
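
Two of these lessons can be sketched together in Python. This is only an approximation under my own assumptions: `read_raw` and `matching_lines` are hypothetical names, `os.read` is a thin wrapper over the `read` system call on POSIX systems, and Python still copies each chunk into a bytes object, so it does not reproduce the zero-copy behaviour the email describes. The point is the shape of the approach: read big raw chunks, search the whole buffer for the pattern, and only hunt for line boundaries once there is a hit.

```python
import os


def read_raw(path: str, chunk_size: int = 1 << 16) -> bytes:
    """Read a file with os.open/os.read, bypassing Python's buffered file objects."""
    fd = os.open(path, os.O_RDONLY)
    try:
        chunks = []
        while True:
            chunk = os.read(fd, chunk_size)
            if not chunk:  # an empty read means end of file
                break
            chunks.append(chunk)
        return b"".join(chunks)
    finally:
        os.close(fd)


def matching_lines(data: bytes, needle: bytes):
    """Yield each line containing needle, locating newlines only around matches."""
    if not needle:
        raise ValueError("needle must not be empty")
    pos = 0
    while True:
        hit = data.find(needle, pos)
        if hit == -1:
            return
        # Only now look for the enclosing line: backwards for its start
        # (rfind returns -1 when there is none, so +1 gives 0) ...
        start = data.rfind(b"\n", 0, hit) + 1
        # ... and forwards for its end.
        end = data.find(b"\n", hit)
        if end == -1:
            end = len(data)
        yield data[start:end]
        # Resume after this line so each matching line is reported once.
        pos = end + 1
```

Lines that contain no match are never split out at all, which is where the saving comes from compared to a read-a-line, search-a-line loop.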

And the greatest lesson:

The key to making programs fast is to make them do practically nothing.