sed stands for stream editor
Think of streams as another way to talk about the STDIN, STDOUT that we’ve covered in depth. Essentially, sed
will read the contents of a file, perform a modification on the input, and send the modified text to Standard Output by default. Keep in mind that by default, we are not changing the file’s contents.
Both sed
and awk
(which we are covering next lecture) are complicated tools. Books are written about how to use them. From that book’s introduction:
Initially, using sed and awk will seem like the long way to accomplish a task. After several attempts you may conclude that the task would have been easier to do manually. Be patient. You not only have to learn how to use sed and awk but you also need to learn to recognize situations where using them pays off. As you become more proficient, you will solve problems more quickly and solve a broader range of problems.
Keep in mind that we can only scrape the surface of what sed
can actually do.
Another anecdote: At one of my placements, we needed to create an OTA (over the air) update script. My supervisor Sent over a sample script that would create a hash of itself while excluding the line that contained any previous hash, and then compare that result with that previous hash. Essentially the script was checking to see if it had been modified since its last execution. This function was contained in one line, using sed
. “Oh, just ignore that sed
command, it works but don’t ask how”, he told me. So there you go, if you are able to invest the time into these tools, it gives you the power to perform black magic..!
All sed
commands use the syntax: sed '<address> <instructions>'
followed by an optional argument, which is a filename. Recall the cars
file:
cat -n cars
1 plym fury 77 73 2500
2 chevy nova 79 60 3000
3 ford mustang 65 45 17000
4 volvo gl 78 102 9850
5 ford ltd 83 15 10500
6 Chevy nova 80 50 3500
7 fiat 600 65 115 450
8 honda accord 81 30 6000
9 ford thundbd 84 10 17000
10 toyota tercel 82 180 750
11 chevy impala 65 85 1550
12 ford bronco 83 25 9525
Now let’s introduce our first command:
sed '3,6 p' cars
plym fury 77 73 2500
chevy nova 79 60 3000
ford mustang 65 45 17000
ford mustang 65 45 17000
volvo gl 78 102 9850
volvo gl 78 102 9850
ford ltd 83 15 10500
ford ltd 83 15 10500
Chevy nova 80 50 3500
Chevy nova 80 50 3500
fiat 600 65 115 450
honda accord 81 30 6000
ford thundbd 84 10 17000
toyota tercel 82 180 750
chevy impala 65 85 1550
ford bronco 83 25 9525
Notice which lines get repeated. sed
by default will print all lines, so here you see the combination of the entire cars
file with the output from the 3,6 p
instruction.
-n
will turn off the printing of all lines.
sed -n '3,6 p' cars
Again, please note that although we are ‘deleting’ line 5, we are only doing this to the output on our display.
sed '5 d' cars
plym fury 77 73 2500
chevy nova 79 60 3000
ford mustang 65 45 17000
volvo gl 78 102 9850
Chevy nova 80 50 3500
fiat 600 65 115 450
honda accord 81 30 6000
ford thundbd 84 10 17000
toyota tercel 82 180 750
chevy impala 65 85 1550
ford bronco 83 25 9525
sed '5,8 d' cars
plym fury 77 73 2500
chevy nova 79 60 3000
ford mustang 65 45 17000
volvo gl 78 102 9850
ford thundbd 84 10 17000
toyota tercel 82 180 750
chevy impala 65 85 1550
ford bronco 83 25 9525
The first example goes like this: “for each line, print but delete line 5.” The second does the same but for a range.
sed '5 q' cars
plym fury 77 73 2500
chevy nova 79 60 3000
ford mustang 65 45 17000
volvo gl 78 102 9850
ford ltd 83 15 10500
Once we reach line 5, we ‘quit’ sed
and nothing more is printed to our screen. In other words, this does the same thing as head -5 cars
.
You knew weren’t done with these, didn’t you?
sed -n '/chevy/ p' cars
This seems to do the same thing as grep
. Here’s an example where we delete the pattern:
sed '/chevy/ d' cars
plym fury 77 73 2500
ford mustang 65 45 17000
volvo gl 78 102 9850
ford ltd 83 15 10500
Chevy nova 80 50 3500
fiat 600 65 115 450
honda accord 81 30 6000
ford thundbd 84 10 17000
toyota tercel 82 180 750
ford bronco 83 25 9525
Notice that this didn’t ignore case. You could modify it with /chevy/I
.
Now consider this example. What does this do?
sed '/chevy/ q' cars
The real power of sed
is in being able to search and replace strings. How does this look?
sed 's/[0-9]/*/' cars
plym fury *7 73 2500
chevy nova *9 60 3000
ford mustang *5 45 17000
volvo gl *8 102 9850
ford ltd *3 15 10500
Chevy nova *0 50 3500
fiat *00 65 115 450
honda accord *1 30 6000
ford thundbd *4 10 17000
toyota tercel *2 180 750
chevy impala *5 85 1550
ford bronco *3 25 9525
Only the first number in each line is being replaced by an asterisk. Using g
for a ‘global’ replace will replace all of the numbers:
sed 's/[0-9]/*/g' cars
plym fury ** ** ****
chevy nova ** ** ****
ford mustang ** ** *****
volvo gl ** *** ****
ford ltd ** ** *****
Chevy nova ** ** ****
fiat *** ** *** ***
honda accord ** ** ****
ford thundbd ** ** *****
toyota tercel ** *** ***
chevy impala ** ** ****
ford bronco ** ** ****
Use cat
to check that the cars
is still intact. Remember: by default this doesn’t affect the original file. If we wanted to, we could redirect this out to a new file, but also using -i
will allow us to ‘save’ our changes.
This is essentially doing the same thing that you can do in any text editor, and using syntax that is much more complicated. Why would we ever use this?
Answer: Imagine now that instead of working with just one file named ‘cars’, you are working with hundreds of files called ‘cars*’. Even the most efficient office worker will need to open each file in notepad.exe and run their ‘ctrl f’. sed has the ability to save thousands of hours, as long as you understand the syntax. Always do a dry run to check for errors before replacing important data. “With great power comes great responsibility.” We can teach you the power, but maybe not the responsibility..!
sed 'address instruction' filename
Now that we’ve talked about substutions using sed
, we can use this to perform replacements in Vim as well. Replacements are done in Command Mode so press : to enter that mode. Once in command mode, the syntax looks like this:
s/old/new/g
This will replace old
with new
, but only on the current line. There’s two small differences to keep in mind. Use % to change all instances of old
, and add c to ask for confirmation.
%s/old/new/gc
There is one more thing to touch on here. Consider the following:
We know that we want to get rid of <blink>
and </blink>
but keep everything in between. Consider this:
%s/<blink>.*</blink>//gc
If you try that, it’s going to fail. We need to escape the / inside the closing blink tag.
%s/<blink>.*<\/blink>//gc
That’s going to successfully delete our entire line, including the part we want to keep. :
The way we preserve something is using backreferences. Create a backreference with quotes ( ).
%s/<blink>(.*)<\/blink>//gc
Anything inside the backreference can be ‘pasted’ into the new field with \1. So now we have this:
%s/<blink>(.*)<\/blink>/\1/gc
If you try this, it won’t work… again. What’s wrong? We have to escape both ( and ). So finally:
%s/<blink>\(.*\)<\/blink>/\1/gc
And there we go.