Search and replace is such a common task that every command line script author should have a tool for it in their toolbox.
There are probably endless solutions to the problem.
I’ve put together my standard methods for tackling the problem.
I’ll also show similar Perl versions, mainly for comparisons.
use models
In most of the following discussion, I’m just replacing ‘foo’ with ‘bar’.
However ‘foo’ can be ANY regular expression.
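To make that concrete, here is a small sketch of the difference a real pattern makes. The sample string is made up for illustration; the only library call used is re.sub from the standard library.

```python
import re

# A plain 'foo' pattern also matches inside longer words.
literal = re.sub(r'foo', 'bar', 'foo food foo')

# \b anchors the match at word boundaries, so 'food' is left alone.
whole_word = re.sub(r'\bfoo\b', 'bar', 'foo food foo')

print(literal)     # bar bard bar
print(whole_word)  # bar food bar
```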
The dummy situation I’m going to set up is this.
I’ve got a file named ‘example.txt’.
It looks like this:
foo foo fiddle foo
and baz too, as well as foo
I realize that I want to replace ‘foo’ with ‘bar’.
I’d like a script that I can run either by piping the contents of the file into it or by giving it a file name, like this:
cat example.txt | python search_replace.py
python search_replace.py example.txt
Now, I may or may not want to have the script modify the file in place.
If not, then the second example above would just print the modified contents.
I also may want to make a backup of example.txt first.
We can do all of those things, as I’ll show below.
using Python
Search and replace as a filter
I think the most basic form of a search/replace script in python is something like this:
import fileinput
import re
for line in fileinput.input():
    line = re.sub('foo','bar', line.rstrip())
    print(line)
The fileinput module takes care of the stream versus filename input handling.
The re (regular expression) module provides sub, which handles the search/replace.
The ‘rstrip’ is called on the input just to strip the line of its newline (it also removes any other trailing whitespace), since the next ‘print’ statement is going to add a newline by default.
This handles the case where we don’t want to modify the file.
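If trailing spaces or tabs matter in your files, one possible variation is to strip only the newline. The sketch below assumes that goal; the names PATTERN and replace_lines are mine, not part of any library, and the sample lines are the contents of example.txt.

```python
import re

# Compile once; 'foo' is the placeholder pattern used throughout.
PATTERN = re.compile(r'foo')

def replace_lines(lines, repl='bar'):
    """Yield each line with matches replaced, minus its trailing newline only."""
    for line in lines:
        # rstrip('\n') keeps trailing spaces and tabs, unlike a bare rstrip().
        yield PATTERN.sub(repl, line.rstrip('\n'))

sample = ["foo foo fiddle foo\n", "and baz too, as well as foo\n"]
result = list(replace_lines(sample))
for line in result:
    print(line)
# bar bar fiddle bar
# and baz too, as well as bar
```

In a real script you would pass fileinput.input() to replace_lines instead of the sample list.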
In place modification of files
A slightly modified script will allow you to modify files, and optionally copy the original file to a backup.
import fileinput
import re
for line in fileinput.input(inplace=True, backup='.bak'):
    line = re.sub('foo','bar', line.rstrip())
    print(line)
If you leave the ‘backup’ flag out, no backup will be made.
As I’ve shown it with backup='.bak', an example.txt.bak file will be created.
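To see the in-place behavior without touching a real file, here is a rough, self-contained sketch that works on a throwaway copy of example.txt in a temporary directory. The helper name replace_in_place is hypothetical; the fileinput arguments (files, inplace, backup) are the module's own.

```python
import fileinput
import os
import re
import tempfile

def replace_in_place(path, pattern, repl, backup='.bak'):
    """Rewrite path in place; the original is kept as path + backup."""
    with fileinput.input(files=[path], inplace=True, backup=backup) as f:
        for line in f:
            # While inplace is active, print() writes into the file.
            print(re.sub(pattern, repl, line.rstrip('\n')))

# Throwaway copy of the example file.
original = "foo foo fiddle foo\nand baz too, as well as foo\n"
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, 'example.txt')
with open(path, 'w') as fh:
    fh.write(original)

replace_in_place(path, r'foo', 'bar')

with open(path) as fh:
    new_text = fh.read()   # modified contents
with open(path + '.bak') as fh:
    old_text = fh.read()   # untouched backup
```

After this runs, new_text holds the replaced lines and old_text still holds the original ones.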
with a more complex regular expression
Of course, foo and bar don’t have to be so simple.
For example, the markdown issue I have with converting a table of contents to a series of h2 tags can be solved with the following script.
import fileinput
import re
for line in fileinput.input():
    line = re.sub(r'\* \[(.*)\]\(#(.*)\)', r'<h2 id="\2">\1</h2>', line.rstrip())
    print(line)
Here, I’m looking for things that look like
* [the label](#the_anchor)
and replacing them with
<h2 id="the_anchor">the label</h2>
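A quick way to check a pattern like this is to run it on one matching line and one non-matching line. This is only a sketch: TOC_ITEM and toc_to_h2 are names I made up, and the non-matching input is invented for the example.

```python
import re

# Same pattern as in the script: group 1 is the label, group 2 the anchor.
TOC_ITEM = re.compile(r'\* \[(.*)\]\(#(.*)\)')

def toc_to_h2(line):
    """Turn a '* [label](#anchor)' item into an h2 tag; other lines pass through."""
    return TOC_ITEM.sub(r'<h2 id="\2">\1</h2>', line)

converted = toc_to_h2('* [the label](#the_anchor)')
untouched = toc_to_h2('just a paragraph')

print(converted)  # <h2 id="the_anchor">the label</h2>
print(untouched)  # just a paragraph
```

Because both groups use a greedy .*, a line holding more than one link would be matched as a single item, so this works best with one entry per line.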
using Perl
As I’ve mentioned before, I still use Perl sometimes, even though I like to use Python for most of my scripting needs.
Therefore, I’m presenting Perl versions that I’ve used before, mostly as a comparison.
There are lots of ways to implement this in Perl.
all on the command line
The following are all roughly equivalent:
cat example.txt | perl -pe 's/foo/bar/g;'
cat example.txt | perl -ne 's/foo/bar/g;print'
cat example.txt | perl -e 'while (<>){s/foo/bar/g;print;}'
cat example.txt | perl -e 'while (<>){$_ =~ s/foo/bar/g;print;}'
perl -pe 's/foo/bar/g;' example.txt
perl -ne 's/foo/bar/g;print' example.txt
perl -e 'while (<>){s/foo/bar/g;print;}' example.txt
perl -e 'while (<>){$_ =~ s/foo/bar/g;print $_;}' example.txt
flags used
- -e : execute the code on the command line
- -n : wrap everything in
while (<>){ [code here] }
- -p : wrap everything in
while (<>){ [code here]; print; }
As a script
And of course, you don’t have to use the ‘-e’ flag.
You can stick the script in a file.
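The following scripts are equivalent to one another.
Unlike the one-liners above, they omit the /g modifier, so only the first ‘foo’ on each line is replaced.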
#!/usr/bin/perl
while (<>) {
    $_ =~ s/foo/bar/;
    print $_;
}

#!/usr/bin/perl
while (<>) {
    s/foo/bar/;
    print;
}

#!/usr/bin/perl -n
s/foo/bar/;
print;

#!/usr/bin/perl -p
s/foo/bar/;
Changing files in place
Perl also allows changing the files in place, with the addition of the ‘-i’ flag, usually used as ‘-i.bak’ to make backup copies of the original:
perl -pi.bak -e 's/foo/bar/g;' example.txt