Sed - An Introduction and Tutorial by Bruce Barnett
====================================================

* The Awful Truth about sed
* The essential command: s for substitution
* The slash as a delimiter
* Using & as the matched string
* Using \1 to keep part of the pattern
* Extended Regular Expressions
* Sed Pattern Flags
  + /g - Global replacement
  + Is sed recursive?
  + /1, /2, etc. Specifying which occurrence
  + /p - print
  + Write to a file with /w filename
  + /I - Ignore Case
  + Combining substitution flags
* Arguments and invocation of sed
  + Multiple commands with -e command
  + Filenames on the command line
  + sed -n: no printing
  + Using 'sed /pattern/'
    o Using 'sed -n /pattern/p' to duplicate the function of grep
  + sed -f scriptname
  + sed in shell scripts
    o Quoting multiple sed lines in the C shell
    o Quoting multiple sed lines in the Bourne shell
  + sed -V
  + sed -h
  + A sed interpreter script
  + Sed Comments
  + Passing arguments into a sed script
  + Using sed in a shell here-is document
  + Multiple commands and order of execution
* Addresses and Ranges of Text
  + Restricting to a line number
  + Patterns
  + Ranges by line number
  + Ranges by patterns
* Delete with d
* Printing with p
* Reversing the restriction with !
* Relationships between d, p, and !
* The q or quit command
* Grouping with { and }
* Operating in a pattern range except for the patterns
* Writing a file with the 'w' command
* Reading in a file with the 'r' command
* The # Comment Command
* Adding, Changing, Inserting new lines
  + Append a line with 'a'
  + Insert a line with 'i'
  + Change a line with 'c'
  + Leading tabs and spaces in a sed script
  + Adding more than one line
  + Adding lines and the pattern space
  + Address ranges and the above commands
* Multi-Line Patterns
* Print line number with =
* Transform with y
* Displaying control characters with a l
* Working with Multiple Lines
  + Matching three lines with sed
  + Matching patterns that span multiple lines
  + Using newlines in sed scripts
  + The Hold Buffer
  + Exchange with x
  + Example of Context Grep
  + Hold with h or H
  + Keeping more than one line in the hold buffer
  + Get with g or G
* Branch (Flow Control)
* Testing with t
* Debugging with l
* An alternate way of adding comments
* The poorly documented ;
* Passing regular expressions as arguments
* Inserting binary characters
* GNU sed Command Line arguments
  + The -posix argument
  + The --version argument
  + The -h Help argument
  + The -l Line Length Argument
  + The -s Separate argument
  + The -i in-place argument
  + The --follow-symlinks argument
  + The -b Binary argument
  + The -r Extended Regular Expression argument
  + The -u Unbuffered argument
  + The -z Null Data argument
* FreeBSD Extensions
  + -a or delayed open
  + The -I in-place argument
  + -E or Extended Regular Expressions
* Using word boundaries
* Command Summary
* In Conclusion
* More References

Introduction to Sed
-------------------

How to use sed, a special editor for modifying files automatically. If you want to write a program to make changes in a file, sed is the tool to use.

There are a few programs that are the real workhorses in the UNIX toolbox. These programs are simple to use for simple applications, yet have a rich set of commands for performing complex actions.
Don't let the complex potential of a program keep you from making use of the simpler aspects. I'll start with the simple concepts and introduce the advanced topics later on.

When I first wrote this (in 1994), most versions of sed did not allow you to place comments inside the script. Lines starting with the '#' character are comments. Newer versions of sed may support comments at the end of the line as well. One way to think of this is that the old, "classic" version was the basis of the GNU, FreeBSD and Solaris versions of sed. To understand what I had to work with, see the sed(1) manual page from Sun/Oracle.

The Awful Truth about sed
-------------------------

Sed is the ultimate stream editor. If that sounds strange, picture a stream flowing through a pipe. Okay, you can't see a stream if it's inside a pipe. That's what I get for attempting a flowing analogy. You want literature, read James Joyce.

Anyhow, sed is a marvelous utility. Unfortunately, most people never learn its real power. The language is very simple, but the documentation is terrible. The Solaris on-line manual pages for sed are five pages long, and two of those pages describe the 34 different errors you can get. A program that spends as much space documenting the errors as it does documenting the language has a serious learning curve.

Do not fret! It is not your fault you don't understand sed. I will cover sed completely. But I will describe the features in the order that I learned them. I didn't learn everything at once. You don't need to either.

The essential command: s for substitution
-----------------------------------------

Sed has several commands, but most people only learn the substitute command: s. The substitute command changes all occurrences of the regular expression into a new value.
A simple example is changing "day" in the "old" file to "night" in the "new" file:

    sed s/day/night/ <old >new

Or another way (for UNIX beginners),

    sed s/day/night/ old >new

and for those who want to test this:

    echo day | sed s/day/night/

This will output "night".

I didn't put quotes around the argument because this example didn't need them. If you read my earlier tutorial on quotes, you would understand why it doesn't need quotes. However, I recommend you do use quotes. If you have meta-characters in the command, quotes are necessary. And if you aren't sure, it's a good habit, and I will henceforth quote future examples to emphasize the "best practice." Using the strong (single quote) character, that would be:

    sed 's/day/night/' <old >new

I must emphasize that the sed editor changes exactly what you tell it to. So if you executed

    echo Sunday | sed 's/day/night/'

This would output the word "Sunnight" because sed found the string "day" in the input.

Another important concept is that sed is line oriented. Suppose you have the input file:

    one two three, one two three
    four three two one
    one hundred

and you used the command

    sed 's/one/ONE/' <old

The output would be

    ONE two three, one two three
    four three two ONE
    ONE hundred

Note that this changed "one" to "ONE" only once on each line.

The slash as a delimiter
------------------------

The character after the "s" is the delimiter. It is conventionally a slash. If you want to change a pathname that contains a slash, say /usr/local/bin to /common/bin, you could use the backslash to quote the slash:

    sed 's/\/usr\/local\/bin/\/common\/bin/' <old >new

Gulp. Some call this a 'Picket Fence' and it's ugly. It is easier to read if you use an underline instead of a slash as a delimiter:

    sed 's_/usr/local/bin_/common/bin_' <old >new

Some people use colons:

    sed 's:/usr/local/bin:/common/bin:' <old >new

Others use the "|" character.

    sed 's|/usr/local/bin|/common/bin|' <old >new

Pick one you like. As long as it's not in the string you are looking for, anything goes. And remember that you need three delimiters. If you get an "unterminated `s' command" error, it's because you are missing one of them.

Using & as the matched string
-----------------------------

Sometimes you want to search for a pattern and add some characters, like parentheses, around or near the pattern you found. It is easy to do this if you are looking for a particular string:

    sed 's/abc/(abc)/' <old >new

This won't work if you don't know exactly what you will find.
How can you put the string you found in the replacement string if you don't know what it is? The solution requires the special character "&". It corresponds to the pattern found.

    sed 's/[a-z]*/(&)/' <old >new

You can have any number of "&" in the replacement string. You could also double a pattern, e.g. the first number of a line:

    % echo "123 abc" | sed 's/[0-9]*/& &/'
    123 123 abc

Let me slightly amend this example. Sed will match the first string, and make it as greedy as possible. I'll cover that later. If you don't want it to be so greedy (i.e. limit the matching), you need to put restrictions on the match.

The first match for '[0-9]*' is the first character on the line, as this matches zero or more numbers. So if the input was "abc 123" the output would be unchanged (well, except for a space before the letters). A better way to duplicate the number is to make sure it matches a number:

    % echo "123 abc" | sed 's/[0-9][0-9]*/& &/'
    123 123 abc

The string "abc" is unchanged, because it was not matched by the regular expression. If you wanted to eliminate "abc" from the output, you must expand the regular expression to match the rest of the line and explicitly exclude part of the expression using "\(", "\)" and "\1", which is the next topic.

Extended Regular Expressions
----------------------------

Let me add a quick comment here because there is another way to write the above script. "[0-9]*" matches zero or more numbers. "[0-9][0-9]*" matches one or more numbers. Another way to do this is to use the "+" meta-character and use the pattern "[0-9]+", as the "+" is a special character when using "extended regular expressions." Extended regular expressions have more power, but sed scripts that treated "+" as a normal character would break. Therefore you must explicitly enable this extension with a command line option.

GNU sed turns this feature on if you use the "-r" command line option.
So the above could also be written using

    % echo "123 abc" | sed -r 's/[0-9]+/& &/'
    123 123 abc

Mac OS X and FreeBSD use -E instead of -r. For more information on extended regular expressions, see Regular Expressions and the description of the -r command line argument.

Using \1 to keep part of the pattern
------------------------------------

I have already described the use of "\(", "\)" and "\1" in my tutorial on regular expressions. To review, the escaped parentheses (that is, parentheses with backslashes before them) remember a substring of the characters matched by the regular expression. You can use this to exclude part of the characters matched by the regular expression. The "\1" is the first remembered pattern, and the "\2" is the second remembered pattern. Sed has up to nine remembered patterns.

If you wanted to keep the first word of a line, and delete the rest of the line, mark the important part with the parentheses:

    sed 's/\([a-z]*\).*/\1/'

I should elaborate on this. Regular expressions are greedy, and try to match as much as possible. "[a-z]*" matches zero or more lower case letters, and tries to match as many characters as possible. The ".*" matches zero or more characters after the first match. Since the first one grabs all of the contiguous lower case letters, the second matches anything else. Therefore if you type

    echo abcd123 | sed 's/\([a-z]*\).*/\1/'

This will output "abcd" and delete the numbers.

If you want to switch two words around, you can remember two patterns and change the order around:

    sed 's/\([a-z]*\) \([a-z]*\)/\2 \1/'

Note the space between the two remembered patterns. This is used to make sure two words are found. However, this will do nothing if a single word is found, or any lines with no letters.
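Here is a small sketch of that swap in action, assuming a POSIX shell. The variable names are mine, chosen only for illustration. The third case shows why "[a-z]*" can surprise you: it happily matches an empty "word" in front of a leading space.

```shell
#!/bin/sh
# Swap the first two lowercase words on a line using \1 and \2.
swap='s/\([a-z]*\) \([a-z]*\)/\2 \1/'

two_words=$(echo "hello world" | sed "$swap")
one_word=$(echo "hello" | sed "$swap")
leading_space=$(echo " abc" | sed "$swap")

echo "[$two_words]"      # [world hello]
echo "[$one_word]"       # [hello]  no space on the line, so nothing matches
echo "[$leading_space]"  # [abc ]   the first "word" matched was empty
```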
You may want to insist that words have at least one letter by using

    sed 's/\([a-z][a-z]*\) \([a-z][a-z]*\)/\2 \1/'

or by using extended regular expressions (note that '(' and ')' no longer need to have a backslash):

    sed -r 's/([a-z]+) ([a-z]+)/\2 \1/' # Using GNU sed
    sed -E 's/([a-z]+) ([a-z]+)/\2 \1/' # Using Apple Mac OS X

The "\1" doesn't have to be in the replacement string (in the right hand side). It can be in the pattern you are searching for (in the left hand side). If you want to eliminate duplicated words, you can try:

    sed 's/\([a-z]*\) \1/\1/'

If you want to detect duplicated words, you can use

    sed -n '/\([a-z][a-z]*\) \1/p'

or with extended regular expressions

    sed -rn '/([a-z]+) \1/p' # GNU sed
    sed -En '/([a-z]+) \1/p' # Mac OS X

This, when used as a filter, will print lines with duplicated words.

The numeric value can have up to nine values: "\1" through "\9". If you wanted to reverse the first three characters on a line, you can use

    sed 's/^\(.\)\(.\)\(.\)/\3\2\1/'

Sed Pattern Flags
-----------------

You can add additional flags after the last delimiter. You might have noticed I used a 'p' at the end of the previous substitute command. I also added the '-n' option. Let me first cover the 'p' and other pattern flags. These flags can specify what happens when a match is found. Let me describe them.

/g - Global replacement
-----------------------

Most UNIX utilities work on files, reading a line at a time. Sed, by default, is the same way. If you tell it to change a word, it will only change the first occurrence of the word on a line. You may want to make the change on every word on the line instead of the first. For an example, let's place parentheses around words on a line. Instead of using a pattern like "[A-Za-z]*" which won't match words like "won't," we will use a pattern, "[^ ]*", that matches everything except a space. Well, this will also match anything because "*" means zero or more.
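To see what "matches anything" means in practice, here is a minimal sketch, assuming a POSIX shell. The variable names are only for illustration. When the line starts with a space, "[^ ]*" matches the empty string at the very start, so the parentheses wrap nothing.

```shell
#!/bin/sh
# The line starts with a space, so "[^ ]*" matches the empty string at column 0.
empty_match=$(echo " abc def" | sed 's/[^ ]*/(&)/')

# Requiring at least one non-space character skips ahead to the first real word.
word_match=$(echo " abc def" | sed 's/[^ ][^ ]*/(&)/')

echo "[$empty_match]"   # [() abc def]
echo "[$word_match]"    # [ (abc) def]
```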
The current version of Solaris's sed (as I wrote this) can get unhappy with patterns like this, and generate errors like "Output line too long" or even run forever. I consider this a bug, and have reported this to Sun. As a work-around, you must avoid matching the null string when using the "g" flag to sed. A work-around example is: "[^ ][^ ]*". The following will put parentheses around the first word:

    sed 's/[^ ]*/(&)/' <old >new

If you want it to make changes for every word, add a "g" after the last delimiter and use the work-around:

    sed 's/[^ ][^ ]*/(&)/g' <old >new

Is sed recursive?
-----------------

Sed only operates on patterns found in the in-coming data. That is, the input line is read, and when a pattern is matched, the modified output is generated, and the rest of the input line is scanned. The "s" command will not scan the newly created output. That is, you don't have to worry about expressions like:

    sed 's/loop/loop the loop/g' <old >new

This will not cause an infinite loop. If a second "s" command is executed, it could modify the results of a previous command. I will show you how to execute multiple commands later.

/1, /2, etc. Specifying which occurrence
----------------------------------------

With no flags, the first matched substitution is changed. With the "g" option, all matches are changed. If you want to modify a particular pattern that is not the first one on the line, you could use "\(" and "\)" to mark each pattern, and use "\1" to put the first pattern back unchanged. This next example keeps the first word on the line but deletes the second:

    sed 's/\([a-zA-Z]*\) \([a-zA-Z]*\) /\1 /' <old >new

Yuck. There is an easier way to do this. You can add a number after the substitution command to indicate you only want to match that particular pattern. Example:

    sed 's/[a-zA-Z]* //2' <old >new

You can combine a number with the g (global) flag. For instance, if you want to leave the first word alone, but change the second, third, etc.
to be DELETED instead, use /2g:

    sed 's/[a-zA-Z]* /DELETED /2g' <old >new

I've heard that combining the number with the g flag does not work on macOS, and perhaps the FreeBSD version of sed as well.

Don't get /2 and \2 confused. The /2 is used at the end. The \2 is used inside the replacement field.

Note the space after the "*" character. Without the space, sed will run a long, long time. (Note: this bug is probably fixed by now.) This is because the number flag and the "g" flag have the same bug. You should also be able to use the pattern

    sed 's/[^ ]*//2' <old >new

but this also eats CPU. If this works on your computer, and it does on some UNIX systems, you could remove the encrypted password from the password file:

    sed 's/[^:]*//2' </etc/passwd >/etc/password.new

But this didn't work for me the time I wrote this. Using "[^:][^:]*" as a work-around doesn't help because it won't match a non-existent password, and will instead delete the third field, which is the user ID! Instead you have to use the ugly parentheses:

    sed 's/^\([^:]*\):[^:]*:/\1::/' </etc/passwd >/etc/password.new

You could also add a character to the first pattern so that it no longer matches the null pattern:

    sed 's/[^:]*:/:/2' </etc/passwd >/etc/password.new

The number flag is not restricted to a single digit. It can be any number from 1 to 512. If you wanted to add a colon after the 80th character in each line, you could type:

    sed 's/./&:/80' <old >new

You can also do it the hard way by using 80 dots:

    sed 's/^................................................................................/&:/' <old >new

/p - print
----------

By default, sed prints every line. If it makes a substitution, the new text is printed instead of the old one. If you use an optional argument to sed, "sed -n", it will not, by default, print any new lines. I'll cover this and other options later. When the "-n" option is used, the "p" flag will cause the modified line to be printed.
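A quick sketch of how "-n" and the "p" flag interact, assuming a POSIX shell. The sample text and variable names are made up for this illustration.

```shell
#!/bin/sh
input='one day
two nights
three days'

# Default: every line is printed once, changed or not (3 lines).
default_out=$(printf '%s\n' "$input" | sed 's/day/DAY/')

# "p" without -n: changed lines are printed twice (5 lines).
p_only=$(printf '%s\n' "$input" | sed 's/day/DAY/p')

# -n with "p": only the changed lines are printed (2 lines).
n_and_p=$(printf '%s\n' "$input" | sed -n 's/day/DAY/p')

printf '%s\n' "$n_and_p"
```

With this input the last command prints "one DAY" and "three DAYs", and nothing else.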
Here is one way to duplicate the function of grep with sed:

    sed -n 's/pattern/&/p' <file

Write to a file with /w filename
--------------------------------

The "w" flag, followed by a filename, writes each line on which a substitution was made to that file.

/I - Ignore Case
----------------

GNU sed's "I" flag makes the pattern match without regard to case. For example, this prints every line containing "abc" in any mix of upper and lower case:

    sed -n '/abc/I p' <old >new

Note that a space after the '/I' and the 'p' (print) command emphasizes that the 'p' is not a modifier of the pattern matching process, but a command to execute after the pattern matching.

Combining substitution flags
----------------------------

You can combine flags when it makes sense. Please note that the "w" has to be the last flag. For example the following command works:

    sed -n 's/a/A/2pw /tmp/file' <old >new

Next I will discuss the options to sed, and different ways to invoke sed.

Arguments and invocation of sed
-------------------------------

Previously, I have only used one substitute command. If you need to make two changes, and you didn't want to read the manual, you could pipe together multiple sed commands:

    sed 's/BEGIN/begin/' <old | sed 's/END/end/' >new

This used two processes instead of one. A sed guru never uses two processes when one can do.

Multiple commands with -e command
---------------------------------

One method of combining multiple commands is to use a -e before each command:

    sed -e 's/a/A/' -e 's/b/B/' <old >new

A "-e" isn't needed in the earlier examples because sed knows that there must always be one command. If you give sed one argument, it must be a command, and sed will edit the data read from standard input.

The long argument version is

    sed --expression='s/a/A/' --expression='s/b/B/' <old >new

Filenames on the command line
-----------------------------

You can specify files on the command line if you wish. If there is more than one argument to sed that does not start with an option, it must be a filename. This next example will count the number of lines in three files that don't begin with a "#":

    sed 's/^#.*//' f1 f2 f3 | grep -v '^$' | wc -l

Let's break this down into pieces. The sed substitute command changes every line that starts with a "#" into a blank line. Grep was used to filter out (delete) empty lines. Wc counts the number of lines left. Sed has more commands that make grep unnecessary. And grep -c can replace wc -l. I'll discuss how you can duplicate some of grep's functionality later.
Of course you could write the last example using the "-e" option:

    sed -e 's/^#.*//' f1 f2 f3 | grep -v '^$' | wc -l

There are two other options to sed.

sed -n: no printing
-------------------

The "-n" option will not print anything unless an explicit request to print is found. I mentioned the "/p" flag to the substitute command as one way to turn printing back on. Let me clarify this. The command

    sed 's/PATTERN/&/p' file

acts like the cat program if PATTERN is not in the file: e.g. nothing is changed. If PATTERN is in the file, then each line that has this is printed twice. Add the "-n" option and the example acts like grep:

    sed -n 's/PATTERN/&/p' file

Nothing is printed, except those lines with PATTERN included.

The long argument of the -n command is either

    sed --quiet 's/PATTERN/&/p' file

or

    sed --silent 's/PATTERN/&/p' file

Using 'sed /pattern/'
---------------------

Sed has the ability to specify which lines are to be examined and/or modified, by specifying addresses before the command. I will just describe the simplest version for now - the /PATTERN/ address. When used, only lines that match the pattern are given the command after the address. Briefly, when used with the /p flag, matching lines are printed twice:

    sed '/PATTERN/p' file

And of course PATTERN is any regular expression.

Please note that if you do not include a command, such as the "p" for print, you will get an error. When I type

    echo abc | sed '/a/'

I get the error

    sed: -e expression #1, char 3: missing command

Also, you don't need to, but I recommend that you place a space between the pattern and the command. This will help you distinguish between flags that modify the pattern matching, and commands to execute after the pattern is matched.
Therefore I recommend this style:

    sed '/PATTERN/ p' file

Using 'sed -n /pattern/p' to duplicate the function of grep
-----------------------------------------------------------

If you want to duplicate the functionality of grep, combine the -n (noprint) option with the /p print flag:

    sed -n '/PATTERN/p' file

sed -f scriptname
-----------------

If you have a large number of sed commands, you can put them into a file and use

    sed -f sedscript <old >new

where sedscript could look like this:

    # sed comment - This script changes lower case vowels to upper case
    s/a/A/g
    s/e/E/g
    s/i/I/g
    s/o/O/g
    s/u/U/g

When there are several commands in one file, each command must be on a separate line.

The long argument version is

    sed --file=sedscript <old >new

sed in shell scripts
--------------------

If you have many commands and they won't fit neatly on one line, you can break up the line using a backslash:

    sed -e 's/a/A/g' \
        -e 's/e/E/g' \
        -e 's/i/I/g' \
        -e 's/o/O/g' \
        -e 's/u/U/g' <old >new

Quoting multiple sed lines in the C shell
-----------------------------------------

You can have a large, multi-line sed script in the C shell, but you must tell the C shell that the quote is continued across several lines. This is done by placing a backslash at the end of each line:

    #!/bin/csh -f
    sed 's/a/A/g \
    s/e/E/g \
    s/i/I/g \
    s/o/O/g \
    s/u/U/g' <old >new

Quoting multiple sed lines in the Bourne shell
----------------------------------------------

The Bourne shell makes this easier as a quote can cover several lines:

    #!/bin/sh
    sed '
    s/a/A/g
    s/e/E/g
    s/i/I/g
    s/o/O/g
    s/u/U/g' <old >new

sed -V
------

The -V option will print the version of sed you are using. The long argument of the command is

    sed --version

sed -h
------

The -h option will print a summary of the sed commands. The long argument of the command is

    sed --help

A sed interpreter script
------------------------

Another way of executing sed is to use an interpreter script. Create a file that contains:

    #!/bin/sed -f
    s/a/A/g
    s/e/E/g
    s/i/I/g
    s/o/O/g
    s/u/U/g

If this script was stored in a file with the name "CapVowel" and was executable, you could use it with the simple command:

    CapVowel <old >new

Sed Comments
------------

Sed comments are lines where the first non-white character is a "#".
On many systems, sed can have only one comment, and it must be the first line of the script. On the Sun (1988 when I wrote this), you can have several comment lines anywhere in the script. Modern versions of sed support this. If the first line contains exactly "#n" then this does the same thing as the "-n" option: turning off printing by default. This could not be done with a sed interpreter script, because the first line must start with "#!/bin/sed -f", as I think "#!/bin/sed -nf" generated an error. It worked when I first wrote this (2008). Note that "#!/bin/sed -fn" does not work because sed thinks the filename of the script is "n". However, "#!/bin/sed -nf" does work.

Passing arguments into a sed script
-----------------------------------

Passing a word into a shell script that calls sed is easy if you remembered my tutorial on the UNIX quoting mechanism. To review, you use the single quotes to turn quoting on and off. A simple shell script that uses sed to emulate grep is:

    #!/bin/sh
    sed -n 's/'$1'/&/p'

However - there is a subtle problem with this script. If you have a space as an argument, the script would cause a syntax error, such as

    sed: -e expression #1, char 4: unterminated `s' command

A better version would protect from this happening:

    #!/bin/sh
    sed -n 's/'"$1"'/&/p'

If this was stored in a file called sedgrep, you could type

    sedgrep '[A-Z][A-Z]' <file

Patterns
--------

Many UNIX utilities like vi and more use a slash to search for a regular expression. Sed uses the same convention, provided you terminate the expression with a slash. To delete the first number on all lines that start with a "#", use:

    sed '/^#/ s/[0-9][0-9]*//'

I placed a space after the "/expression/" so it is easier to read. It isn't necessary, but without it the command is harder to fathom. Sed does provide a few extra options when specifying regular expressions. But I'll discuss those later. If the expression starts with a backslash, the next character is the delimiter.
To use a comma instead of a slash, use:

    sed '\,^#, s/[0-9][0-9]*//'

The main advantage of this feature is searching for slashes. Suppose you wanted to search for the string "/usr/local/bin" and you wanted to change it for "/common/all/bin". You could use the backslash to escape the slash:

    sed '/\/usr\/local\/bin/ s/\/usr\/local/\/common\/all/'

It would be easier to follow if you used an underline instead of a slash as a search. This example uses the underline in both the search command and the substitute command:

    sed '\_/usr/local/bin_ s_/usr/local_/common/all_'

This illustrates why sed scripts get the reputation for obscurity. I could be perverse and show you the example that will search for all lines that start with a "g", and change each "g" on that line to an "s":

    sed '/^g/s/g/s/g'

Adding a space and using an underscore after the substitute command makes this much easier to read:

    sed '/^g/ s_g_s_g'

Er, I take that back. It's hopeless. There is a lesson here: Use comments liberally in a sed script. You may have to remove the comments to run the script under a different (older) operating system, but you now know how to write a sed script to do that very easily! Comments are a Good Thing. You may have understood the script perfectly when you wrote it. But six months from now it could look like modem noise. And if you don't understand that reference, imagine an 8-month-old child typing on a computer.

Ranges by line number
---------------------

You can specify a range on line numbers by inserting a comma between the numbers. To restrict a substitution to the first 100 lines, you can use:

    sed '1,100 s/A/a/'

If you know exactly how many lines are in a file, you can explicitly state that number to perform the substitution on the rest of the file. In this case, assume you used wc to find out there are 532 lines in the file:

    sed '101,532 s/A/a/'

An easier way is to use the special character "$", which means the last line in the file.
    sed '101,$ s/A/a/'

The "$" is one of those conventions that mean "last" in utilities like cat -e, vi, and ed. Line numbers are cumulative if several files are edited. That is,

    sed '200,300 s/A/a/' f1 f2 f3 >new

is the same as

    cat f1 f2 f3 | sed '200,300 s/A/a/' >new

Ranges by patterns
------------------

You can specify two regular expressions as the range. Assuming a "#" starts a comment, you can search for a keyword, remove all comments until you see the second keyword. In this case the two keywords are "start" and "stop":

    sed '/start/,/stop/ s/#.*//'

The first pattern turns on a flag that tells sed to perform the substitute command on every line. The second pattern turns off the flag. If the "start" and "stop" pattern occurs twice, the substitution is done both times. If the "stop" pattern is missing, the flag is never turned off, and the substitution will be performed on every line until the end of the file.

You should know that if the "start" pattern is found, the substitution occurs on the same line that contains "start." This turns on a switch, which is line oriented. That is, the next line is read and the substitute command is checked. If it contains "stop" the switch is turned off. Switches are line oriented, and not word oriented.

You can combine line numbers and regular expressions. This example will remove comments from the beginning of the file until it finds the keyword "start":

    sed -e '1,/start/ s/#.*//'

This example will remove comments everywhere except the lines between the two keywords:

    sed -e '1,/start/ s/#.*//' -e '/stop/,$ s/#.*//'

The last example has a range that overlaps the "/start/,/stop/" range, as both ranges operate on the lines that contain the keywords. I will show you later how to restrict a command up to, but not including the line containing the specified pattern. It is in "Operating in a pattern range except for the patterns." But I have to cover some more basic principles.
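To make the pattern ranges concrete, here is a small sketch, assuming a POSIX shell. The five-line sample and the variable names are invented for this illustration. It runs both forms shown above on the same input so you can compare which comments survive.

```shell
#!/bin/sh
input='# top comment
start
a = 1 # first
stop
b = 2 # kept'

# Strip comments only between "start" and "stop" (lines 2 to 4).
inside=$(printf '%s\n' "$input" | sed '/start/,/stop/ s/#.*//')

# Strip comments everywhere else (lines 1 to 2 and 4 to 5).
outside=$(printf '%s\n' "$input" | sed -e '1,/start/ s/#.*//' -e '/stop/,$ s/#.*//')

printf '%s\n' "--- inside ---" "$inside" "--- outside ---" "$outside"
```

In the first result only line 3 loses its comment. In the second, line 3 is the only one that keeps it.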
Before I start discussing the various commands, I should explain that some commands cannot operate on a range of lines. I will let you know when I mention the commands. In this next section I will describe three commands, one of which cannot operate on a range.

Delete with d
-------------

Using ranges can be confusing, so you should expect to do some experimentation when you are trying out a new script. A useful command deletes every line that matches the restriction: "d". If you want to look at the first 10 lines of a file, you can use:

    sed '11,$ d' <file

Reading in a file with the 'r' command
--------------------------------------

The command

    sed '$r end' <in >out

will append the file "end" at the end of the file (address "$"). The following will insert a file after the line with the word "INCLUDE":

    sed '/INCLUDE/ r file' <in >out

You can use the curly braces to delete the line having the "INCLUDE" command on it:

    #!/bin/sh
    sed '/INCLUDE/ {
        r file
        d
    }'

The order of the delete command "d" and the read file command "r" is important. Change the order and it will not work. There are two subtle actions that prevent this from working. The first is the "r" command writes the file to the output stream. The file is not inserted into the pattern space, and therefore cannot be modified by any command. Therefore the delete command does not affect the data read from the file.

The other subtlety is the "d" command deletes the current data in the pattern space. Once all of the data is deleted, it does make sense that no other action will be attempted. Therefore a "d" command executed in a curly brace also aborts all further actions. As an example, the substitute command below is never executed:

    #!/bin/sh
    # this example is WRONG
    sed -e '1 {
        d
        s/.*//
    }'

The earlier example is a crude version of the C preprocessor program. The file that is included has a predetermined name. It would be nice if sed allowed a variable (e.g. "\1") instead of a fixed file name. Alas, sed doesn't have this ability.
You could work around this limitation by creating sed commands on the fly, or by using shell quotes to pass variables into the sed script. Suppose you wanted to create a command that would include a file like cpp, but the filename is an argument to the script. An example of this script is:

    % include 'sys/param.h' <file.c >file.c.new

A shell script to do this would be:

    #!/bin/sh
    # watch out for a '/' in the parameter
    # use alternate search delimiter
    sed -e '\_#INCLUDE <'"$1"'>_{
        r '"$1"'
        d
    }'

Let me elaborate. If you had a file that contains

    Test first file
    #INCLUDE <file1>
    Test second file
    #INCLUDE <file2>

you could use the command

    sed_include1.sh file1 <input | sed_include1.sh file2

to include the specified files.

The # Comment Command
---------------------

With comments added, the last example could be written as:

    #!/bin/sh
    # watch out for a '/' in the parameter
    # use alternate search delimiter
    sed -e '\_#INCLUDE <'"$1"'>_{
        # read the file
        r '"$1"'
        # delete any characters in the pattern space
        # and read the next line in
        d
    }'

Adding, Changing, Inserting new lines
-------------------------------------

Sed has three commands used to add new lines to the output stream. Because an entire line is added, the new line is on a line by itself to emphasize this. There is no option, an entire line is used, and it must be on its own line. If you are familiar with many UNIX utilities, you would expect sed to use a similar convention: lines are continued by ending the previous line with a "\". The syntax to these commands is finicky, like the "r" and "w" commands.

Append a line with 'a'
----------------------

The "a" command appends a line after the range or pattern. This example will add a line after every line with "WORD":

    #!/bin/sh
    sed '
    /WORD/ a\
    Add this line after every line with WORD
    '

You could eliminate two lines in the shell script if you wish:

    #!/bin/sh
    sed '/WORD/ a\
    Add this line after every line with WORD'

I prefer the first form because it's easier to add a new command by adding a new line and because the intent is clearer. There must not be a space after the "\".

Insert a line with 'i'
----------------------

You can insert a new line before the pattern with the "i" command:

    #!/bin/sh
    sed '
    /WORD/ i\
    Add this line before every line with WORD
    '

Change a line with 'c'
----------------------

You can change the current line with a new line.
    #!/bin/sh
    sed '
    /WORD/ c\
    Replace the current line with the line
    '

A "d" command followed by an "a" command won't work, as I discussed earlier. The "d" command would terminate the current actions. You can combine all three actions using curly braces:

    #!/bin/sh
    sed '
    /WORD/ {
    i\
    Add this line before
    a\
    Add this line after
    c\
    Change the line to this one
    }'

Leading tabs and spaces in a sed script
---------------------------------------

Sed ignores leading tabs and spaces in all commands. However these white space characters may or may not be ignored if they start the text following an "a", "c" or "i" command. In SunOS, both "features" are available. The Berkeley (and Linux) style sed is in /usr/bin, and the AT&T version (System V) is in /usr/5bin/.

To elaborate, the /usr/bin/sed command retains white space, while the /usr/5bin/sed strips off leading spaces. If you want to keep leading spaces, and not care about which version of sed you are using, put a "\" as the first character of the line:

    #!/bin/sh
    sed '
    a\
    \	This line starts with a tab
    '

Adding more than one line
-------------------------

All three commands will allow you to add more than one line. Just end each line with a "\":

    #!/bin/sh
    sed '
    /WORD/ a\
    Add this line\
    This line\
    And this line
    '

Adding lines and the pattern space
----------------------------------

I have mentioned the pattern space before. Most commands operate on the pattern space, and subsequent commands may act on the results of the last modification. The three previous commands, like the read file command, add the new lines to the output stream, bypassing the pattern space.

Address ranges and the above commands
-------------------------------------

You may remember that earlier I warned you that some commands can take a range of lines, and others cannot. To be precise, the commands "a", "i", "r", and "q" will not take a range like "1,100" or "/begin/,/end/". The documentation states that the read command can take a range, but I got an error when I tried this.
The "c" or change command allows this, and it will let you change several lines into one:

    #!/bin/sh
    sed '
    /begin/,/end/ c\
    ***DELETED***
    '

If you need to apply one of the other commands to a range of lines, you can use the curly braces, as that will let you perform the operation on every line:

    #!/bin/sh
    # add a blank line after every line
    sed '1,$ {
        a\

    }'

Multi-Line Patterns

Most UNIX utilities are line oriented. Regular expressions are line oriented. Searching for patterns that cover more than one line is not an easy task. (Hint: It will be very shortly.)

Sed reads in a line of text, performs commands which may modify the line, and outputs the modified line if desired. The main loop of a sed script looks like this:

1. The next line is read from the input file and placed in the pattern space. If the end of file is found, and if there are additional files to read, the current file is closed, the next file is opened, and the first line of the new file is placed into the pattern space.
2. The line count is incremented by one. Opening a new file does not reset this number.
3. Each sed command is examined. If there is a restriction placed on the command, and the current line in the pattern space meets that restriction, the command is executed. Some commands, like "n" or "d" cause sed to go to the top of the loop. The "q" command causes sed to stop. Otherwise the next command is examined.
4. After all of the commands are examined, the pattern space is output unless sed has the optional "-n" argument.

The restriction before the command determines if the command is executed.
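The steps of the loop can be checked with a small experiment. The following is only a sketch: the function name demo_loop and the two scratch files are assumptions made for this illustration, and mktemp -d is assumed to be available (it is common, but not required by POSIX). It shows that the line count carries over from one file to the next, and that "q" stops the loop after the pattern space is printed.

```shell
#!/bin/sh
# demo_loop is a hypothetical helper used only for this illustration.
demo_loop() {
    dir=$(mktemp -d) || return 1
    printf 'a\nb\n' > "$dir/one"
    printf 'c\nd\n' > "$dir/two"

    # Step 2: the line count keeps increasing across files,
    # so address 3 selects the first line of the second file.
    sed -n '3p' "$dir/one" "$dir/two"

    # Step 3: "q" stops sed. Without -n the pattern space is
    # still printed, but the remaining lines are never read.
    sed '2q' "$dir/one" "$dir/two"

    rm -rf "$dir"
}

demo_loop
```

The first sed prints "c" and the second prints "a" and "b". Keeping the files in a scratch directory is just a convenience so the sketch cleans up after itself.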
If the restriction is a pattern, and the operation is the delete command, then the following will delete all lines that have the pattern:

    /PATTERN/ d

If the restriction is a pair of numbers, then the deletion will happen if the line number is equal to the first number or greater than the first number and less than or equal to the last number:

    10,20 d

If the restriction is a pair of patterns, there is a variable that is kept for each of these pairs. If the variable is false and the first pattern is found, the variable is made true. If the variable is true, the command is executed. If the variable is true, and the last pattern is on the line, after the command is executed the variable is turned off:

    /begin/,/end/ d

Whew! That was a mouthful. If you have read carefully up to here, you should have breezed through this. You may want to refer back, because I covered several subtle points. My choice of words was deliberate. It covers some unusual cases, like:

    # what happens if the second number
    # is less than the first number?
    sed -n '20,1 p' file

and

    # generate a 10 line file with line numbers
    # and see what happens when two patterns overlap
    yes | head -10 | cat -n | \
    sed -n -e '/1/,/7/ p' -e '/5/,/9/ p'

Enough mental punishment. Here is another review, this time in a table format. Assume the input file contains the following lines:

    AB
    CD
    EF
    GH
    IJ

When sed starts up, the first line is placed in the pattern space. The next line is "CD." The operations of the "n," "d," and "p" commands can be summarized as:

    Pattern Space  Next Input  Command  Output  New Pattern Space  New Text Input
    AB             CD          n        *       CD                 EF
    AB             CD          d        -       CD                 EF
    AB             CD          p        AB      CD                 EF

The "n" command (marked with * above) may or may not generate output depending upon the existence of the "-n" flag.

That review is a little easier to follow, isn't it? Before I jump into multi-line patterns, I wanted to cover three more commands:

Print line number with =

The "=" command prints the current line number to standard output.
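A quick way to see the behavior is to run "=" with no address at all. This is a minimal sketch; the function name number_lines is an assumption chosen for the example and is not part of sed.

```shell
#!/bin/sh
# number_lines is a hypothetical name chosen for this sketch.
number_lines() {
    # "=" writes the line number on a line of its own, before the
    # pattern space is printed at the end of the cycle.
    sed '='
}

printf 'alpha\nbeta\n' | number_lines
```

Each input line is preceded by a separate line holding its number, which is why the number and the text never share a line.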
One way to find out the line numbers that contain a pattern is to use:

    # add line numbers first,
    # then use grep,
    # then just print the number
    cat -n file | grep 'PATTERN' | awk '{print $1}'

The sed solution is:

    sed -n '/PATTERN/ =' file

Earlier I used the following to find the number of lines in a file:

    #!/bin/sh
    lines=$(wc -l file | awk '{print $1}' )

Using the "=" command can simplify this:

    #!/bin/sh
    lines=$(sed -n '$=' file )

The "=" command only accepts one address, so if you want to print the number for a range of lines, you must use the curly braces:

    #!/bin/sh
    # Just print the line numbers
    sed -n '/begin/,/end/ {
    =
    d
    }' file

Since the "=" command only prints to standard output, you cannot print the line number on the same line as the pattern. You need to use multi-line editing to do this.

Transform with y

If you wanted to change a word from lower case to upper case, you could write 26 character substitutions, converting "a" to "A," etc. Sed has a command that operates like the tr program. It is called the "y" command. For instance, to change the letters "a" through "f" into their upper case form, use:

    sed 'y/abcdef/ABCDEF/' file

Here's a sed example that converts all uppercase letters to lowercase letters, like the tr command:

    sed 'y/ABCDEFGHIJKLMNOPQRSTUVWXYZ/abcdefghijklmnopqrstuvwxyz/' <uppercase >lowercase

If you wanted to convert a line that contained a hexadecimal number (e.g. 0x1aff) to upper case (0x1AFF), you could use:

    sed '/0x[0-9a-zA-Z]*/ y/abcdef/ABCDEF/' file

This works fine if there are only numbers in the file, because "y" transforms every matching letter on the line, not just the ones in the number. If you wanted to change the second word in a line to upper case, and you are using classic sed, you are out of luck - unless you use multi-line editing. (Hey - I think there is some sort of theme here!) However, GNU sed has an uppercase and lowercase extension.

Displaying control characters with a l

The "l" command prints the current pattern space. It is therefore useful in debugging sed scripts.
It also converts unprintable characters into printing characters by outputting the value in octal preceded by a "\" character. I found it useful to print out the current pattern space, while probing the subtleties of sed.

Working with Multiple Lines

There are three new commands used in multiple-line patterns: "N," "D," and "P." I will explain their relation to the matching "n," "d," and "p" single-line commands.

The "n" command will print out the current pattern space (unless the "-n" flag is used), empty the current pattern space, and read in the next line of input. The "N" command does not print out the current pattern space and does not empty the pattern space. It reads in the next line, but appends a new line character along with the input line itself to the pattern space.

The "d" command deletes the current pattern space, reads in the next line, puts the new line into the pattern space, aborts the current command, and starts execution at the first sed command. This is called starting a new "cycle." The "D" command deletes the first portion of the pattern space, up to the new line character, leaving the rest of the pattern alone. Like "d," it stops the current command and starts the command cycle over again. However, it will not print the current pattern space. You must print it yourself, a step earlier. If the "D" command is executed with a group of other commands in a curly brace, commands after the "D" command are ignored. The next group of sed commands is executed, unless the pattern space is emptied. If this happens, the cycle is started from the top and a new line is read.

The "p" command prints the entire pattern space. The "P" command only prints the first part of the pattern space, up to the NEWLINE character. Neither the "p" nor the "P" command changes the pattern space.

Some examples might help demonstrate this. "N" by itself isn't very useful. The filter

    sed -e 'N'

doesn't modify the input stream.
Instead, it combines the first and second line, then prints them, combines the third and fourth line, and prints them, etc. It does allow you to use a new "anchor" character: "\n." This matches the new line character that separates multiple lines in the pattern space. If you wanted to search for a line that ended with the character "#," and append the next line to it, you could use #!/bin/sh sed ' # look for a "#" at the end of the line /#$/ { # Found one - now read in the next line N # delete the "#" and the new line character, s/#\n// }' file You could search for two lines containing "ONE" and "TWO" and only print out the two consecutive lines: #!/bin/sh sed -n ' /ONE/ { # found "ONE" - read in next line N # look for "TWO" on the second line # and print if there. /\n.*TWO/ p }' file The next example would delete everything between "ONE" and "TWO:" #!/bin/sh sed ' /ONE/ { # append a line N # search for TWO on the second line /\n.*TWO/ { # found it - now edit making one line s/ONE.*\n.*TWO/ONE TWO/ } }' file Matching three lines with sed You can match multiple lines in searches. Here is a way to look for the string "skip3", and if found, delete that line and the next two lines. #!/bin/sh sed '/skip3/ { N N s/skip3\n.*\n.*/# 3 lines deleted/ }' Note that it doesn't matter what the next two lines are. If you wanted to match 3 particular lines, it's a little more work. This script looks for three lines, where the first line contains "one", the second contained "two" and the third contains "three", and if found, replace them with the string "1+2+3": #!/bin/sh sed ' /one/ { N /two/ { N /three/ { N s/one\ntwo\nthree/1+2+3/ } } } ' Matching patterns that span multiple lines You can either search for a particular pattern on two consecutive lines, or you can search for two consecutive words that may be split on a line boundary. 
The next example will look for two words which are either on the same line or one is on the end of a line and the second is on the beginning of the next line. If found, the first word is deleted: #!/bin/sh sed ' /ONE/ { # append a line N # "ONE TWO" on same line s/ONE TWO/TWO/ # "ONE # TWO" on two consecutive lines s/ONE\nTWO/TWO/ }' file Let's use the "D" command, and if we find a line containing "TWO" immediately after a line containing "ONE," then delete the first line: #!/bin/sh sed ' /ONE/ { # append a line N # if TWO found, delete the first line /\n.*TWO/ D }' file If we wanted to print the first line instead of deleting it, and not print every other line, change the "D" to a "P" and add a "-n" as an argument to sed: #!/bin/sh sed -n ' # by default - do not print anything /ONE/ { # append a line N # if TWO found, print the first line /\n.*TWO/ P }' file It is very common to combine all three multi-line commands. The typical order is "N," "P" and lastly "D." This one will delete everything between "ONE" and "TWO" if they are on one or two consecutive lines: #!/bin/sh sed ' /ONE/ { # append the next line N # look for "ONE" followed by "TWO" /ONE.*TWO/ { # delete everything between s/ONE.*TWO/ONE TWO/ # print P # then delete the first line D } }' file Earlier I talked about the "=" command, and using it to add line numbers to a file. You can use two invocations of sed to do this (although it is possible to do it with one, but that must wait until next section). The first sed command will output a line number on one line, and then print the line on the next line. The second invocation of sed will merge the two lines together: #!/bin/sh sed '=' file | \ sed '{ N s/\n/ / }' If you find it necessary, you can break one line into two lines, edit them, and merge them together again. 
As an example, if you had a file that had a hexadecimal number followed by a word, and you wanted to convert the first word to all upper case, you can use the "y" command, but you must first split the line into two lines, change one of the two, and merge them together. That is, a line containing

    0x1fff table2

will be changed into two lines:

    0x1fff
    table2

and the first line will be converted into upper case. I will use tr to convert the space into a new line, and then use sed to do the rest. The command would be

    ./sed_split

    Pattern Space  Next Input  Command  Output  New Pattern Space  New Text Input
    AB             CD          n        *       CD                 EF
    AB             CD          N        -       AB\nCD             EF
    AB             CD          d        -       -                  EF
    AB             CD          D        -       -                  EF
    AB             CD          p        AB      AB                 CD
    AB             CD          P        AB      AB                 CD
    AB\nCD         EF          n        *       EF                 GH
    AB\nCD         EF          N        -       AB\nCD\nEF         GH
    AB\nCD         EF          d        -       EF                 GH
    AB\nCD         EF          D        -       CD                 EF
    AB\nCD         EF          p        AB\nCD  AB\nCD             EF
    AB\nCD         EF          P        AB      AB\nCD             EF

    * The output of the "n" command depends on the "-n" option.

Using newlines in sed scripts

Occasionally one wishes to use a new line character in a sed script. Well, this has some subtle issues. If one wants to search for a new line, one has to use "\n." Here is an example where you search for a phrase, and delete the new line character after that phrase - joining two lines together.

    (echo a;echo x;echo y) | sed '/x$/ {
    N
    s:x\n:x:
    }'

which generates

    a
    xy

However, if you are inserting a new line, don't use "\n" - instead insert a literal new line character:

    (echo a;echo x;echo y) | sed 's:x:X\
    :'

generates

    a
    X

    y

The Hold Buffer

So far we have talked about three concepts of sed: (1) The input stream or data before it is modified, (2) the output stream or data after it has been modified, and (3) the pattern space, or buffer containing characters that can be modified and sent to the output stream.

There is one more "location" to be covered: the hold buffer or hold space. Think of it as a spare pattern buffer. It can be used to "copy" or "remember" the data in the pattern space for later. There are five commands that use the hold buffer.

Exchange with x

The "x" command eXchanges the pattern space with the hold buffer. By itself, the command isn't useful.
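A small experiment shows what the exchange does when it is the only command. This is an illustrative sketch, and the function name swap_demo is an assumption made for the example.

```shell
#!/bin/sh
# swap_demo is a hypothetical helper name for this illustration.
swap_demo() {
    # Each cycle swaps the current line with the hold buffer,
    # so every line is printed one cycle late.
    sed 'x'
}

printf 'one\ntwo\nthree\n' | swap_demo
```

With this three-line input the output is an empty line, then "one", then "two". The line "three" is still sitting in the hold buffer when sed exits.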
Executing the sed command sed 'x' as a filter adds a blank line in the front, and deletes the last line. It looks like it didn't change the input stream significantly, but the sed command is modifying every line. The hold buffer starts out containing a blank line. When the "x" command modifies the first line, line 1 is saved in the hold buffer, and the blank line takes the place of the first line. The second "x" command exchanges the second line with the hold buffer, which contains the first line. Each subsequent line is exchanged with the preceding line. The last line is placed in the hold buffer, and is not exchanged a second time, so it remains in the hold buffer when the program terminates, and never gets printed. This illustrates that care must be taken when storing data in the hold buffer, because it won't be output unless you explicitly request it. Example of Context Grep One use of the hold buffer is to remember previous lines. An example of this is a utility that acts like grep as it shows you the lines that match a pattern. In addition, it shows you the line before and after the pattern. That is, if line 8 contains the pattern, this utility would print lines 7, 8 and 9. One way to do this is to see if the line has the pattern. If it does not have the pattern, put the current line in the hold buffer. If it does, print the line in the hold buffer, then the current line, and then the next line. After each set, three dashes are printed. The script checks for the existence of an argument, and if missing, prints an error. 
Passing the argument into the sed script is done by turning off the single quote mechanism, inserting the "$1" into the script, and starting up the single quote again: #!/bin/sh # grep3 - prints out three lines around pattern # if there is only one argument, exit case $# in 1);; *) echo "Usage: $0 pattern";exit;; esac; # I hope the argument doesn't contain a / # if it does, sed will complain # use sed -n to disable printing # unless we ask for it sed -n ' '/"$1"/' !{ #no match - put the current line in the hold buffer x # delete the old one, which is # now in the pattern buffer d } '/"$1"/' { # a match - get last line x # print it p # get the original line back x # print it p # get the next line n # print it p # now add three dashes as a marker a\ --- # now put this line into the hold buffer x }' You could use this to show the three lines around a keyword, i.e.: grep3 vt100 out sed ' s/ /\ /' | \ sed ' { y/abcdef/ABCDEF/ N s/\n/ / }' Here is a solution that does not require two invocations of sed because it uses the "h" and "G" command: #!/bin/sh # convert2uc version b # change the first hex number to upper case format, leave the rest of the line alone # uses sed once # used as a filter # convert2uc out sed ' { # remember the line h #change the current line to upper case y/abcdef/ABCDEF/ # add the old line back G # Keep the first word of the first line, # and second word of the second line # with one humongous regular expression s/^\([^ ]*\) .*\n[^ ]* \(.*\)/\1 \2/ }' Carl Henrik Lunde suggested a way to make this simpler, but not as general purpose. The two previous verion converted works with lines with multiple words. This one only converts two words - the first and the last. However, it deletes any word in-between. 
#!/bin/sh # convert2uc version b # change the first hex number to upper case format, and keeps the last word # Note that it deletes the words in-between # uses sed once # used as a filter # convert2uc out sed ' { # remember the line h #change the current line to upper case y/abcdef/ABCDEF/ # add the old line back G # Keep the first word of the first line, # and last word of the second line # with one humongous regular expression s/ .* / / # delete all but the first and last word }' This example only converts the letters "a" through "f" to upper case. This was chosen to make the script easier to print in these narrow columns. You can easily modify the script to convert all letters to uppercase, or to change the first letter, second word, etc. Branch (Flow Control) As you learn about sed you realize that it has its own programming language. It is true that it's a very specialized and simple language. What language would be complete without a method of changing the flow control? There are three commands sed uses for this. You can specify a label with a text string preceded by a colon. The "b" command branches to the label. The label follows the command. If no label is there, branch to the end of the script. The "t" command is used to test conditions. Before I discuss the "t" command, I will show you an example using the "b" command. This example remembers paragraphs, and if it contains the pattern (specified by an argument), the script prints out the entire paragraph. #!/bin/sh sed -n ' # if an empty line, check the paragraph /^$/ b para # else add it to the hold buffer H # at end of file, check paragraph $ b para # now branch to end of script b # this is where a paragraph is checked for the pattern :para # return the entire paragraph # into the pattern space x # look for the pattern, if there - print /'"$1"'/ p ' Testing with t You can execute a branch if a pattern is found. You may want to execute a branch only if a substitution is made. 
The command "t label" will branch to the label if the last substitute command modified the pattern space. One use for this is recursive patterns. Suppose you wanted to remove white space inside parentheses. These parentheses might be nested. That is, you would want to delete a string that looked like "( ( ( ())) )." In the examples below, "^I" stands for a literal tab character. The sed expression

    sed 's/([ ^I]*)//g'

would only remove the innermost set. You would have to pipe the data through the script four times to remove each set of parentheses. You could use the regular expression

    sed 's/([ ^I()]*)//g'

but that would delete non-matching sets of parentheses. The "t" command would solve this:

    #!/bin/sh
    sed '
    :again
    s/([ ^I]*)//
    t again
    '

An earlier version had a 'g' after the 's' expression. This is not needed.

Debugging with l

The 'l' command will print the pattern space in an unambiguous form. Non-printing characters are printed in a C-style escaped format. This can be useful when debugging a complex multi-line sed script.

An alternate way of adding comments

There is one way to add comments in a sed script if you don't have a version that supports it. Use the "i" command with the line number of zero:

    #!/bin/sh
    sed '
    /begin/ {
    0i\
    This is a comment\
    It can cover several lines\
    It will work with any version of sed
    }'

The poorly documented ;

There is one more sed command that isn't well documented. It is the ";" command. This can be used to combine several sed commands on one line. Here is the grep4 script I described earlier, but without the comments or error checking and with semicolons between commands:

    #!/bin/sh
    sed -n '
    '/"$1"/' !{;H;x;s/^.*\n\(.*\n.*\)$/\1/;x;}
    '/"$1"/' {;H;n;H;x;p;a\
    ---
    }'

Yessireebob! Definitely character building. I think I have made my point. As far as I am concerned, the only time the semicolon is useful is when you want to type the sed script on the command line. If you are going to place it in a script, format it so it is readable.
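As a small illustration of the command-line use just mentioned, here is a sketch of a one-line script that uses semicolons inside a group. The helper name first_match is an assumption made for this example, and like the other scripts that splice an argument into a pattern, it assumes the argument contains no "/" or other special characters.

```shell
#!/bin/sh
# first_match is a hypothetical helper: print the first line that
# contains the pattern given in $1, then stop reading input.
first_match() {
    # "p" prints the matching line, "q" quits right after it.
    # The ";" before "}" keeps the group portable.
    sed -n "/$1/{p;q;}"
}

printf 'red\ngreen\nblue\ngrey\n' | first_match 'gr'
```

Only "green" is printed; "grey" is never reached because "q" ends the run at the first match.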
I have mentioned earlier that many versions of sed do not support comments except on the first line. You may want to write your scripts with comments in them, and install them in "binary" form without comments. This should not be difficult. After all, you have become a sed guru by now. I won't even tell you how to write a script to strip out comments. That would be insulting your intelligence. Also - some operating systems do NOT let you use semicolons. So if you see a script with semicolons, and it does not work on a non-Linux system, replace the semicolon with a new line character. (As long as you are not using csh/tcsh, but that's another topic.)

Passing regular expressions as arguments

In the earlier scripts, I mentioned that you would have problems if you passed an argument to the script that had a slash in it. In fact, any character that is special in a regular expression might cause you problems. A script like the following is asking to be broken some day:

    #!/bin/sh
    sed 's/'"$1"'//g'

If the argument contains any of these characters in it, you may get a broken script: "/\.*[]^$" For instance, if someone types a "/" then the substitute command will see four delimiters instead of three. You will also get syntax errors if you provide a "[" without a "]". One solution is to have the user put a backslash before any of these characters when they pass it as an argument. However, the user has to know which characters are special. Here's another solution - add a backslash before each of those special characters in the script.

    #!/bin/sh
    # put a backslash before each of these characters: ][^$.*/
    # Note that the first ']' doesn't need a backslash
    arg=$(echo "$1" | sed 's:[]\[\^\$\.\*\/]:\\&:g')
    # One backslash pair is enough here: inside $( ) the single-quoted
    # sed script reaches sed unchanged, and "\\&" inserts a single
    # backslash before the matched character
    sed 's/'"$arg"'//g'

If you were searching for the pattern "^../," the script would convert this into "\^\.\.\/" before passing it to sed.
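The escaping step can be tried on its own before it is wired into a larger script. The following is a sketch under stated assumptions: the function name escape_bre is hypothetical, the bracket expression is a slightly rearranged version of the one above, and the sample input is made up for the demonstration.

```shell
#!/bin/sh
# escape_bre is a hypothetical helper name, not part of sed.
# It puts one backslash before each of: ] [ \ / . ^ $ *
escape_bre() {
    # "]" comes first in the bracket expression so it is taken
    # literally; ":" is the delimiter so "/" needs no escaping here.
    printf '%s\n' "$1" | sed 's:[][\\/.^$*]:\\&:g'
}

pattern=$(escape_bre '^../')

# Show the escaped form, then use it to delete the literal text.
printf '%s\n' "$pattern"
printf 'a^../b\n' | sed 's/'"$pattern"'//g'
```

The first line of output is the escaped pattern, \^\.\.\/, and the second is "ab", which shows that the argument was matched as plain text rather than as a regular expression. Using printf instead of echo is a design choice: some versions of echo interpret backslashes in their argument.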
Inserting binary characters

Dealing with binary characters can be tricky, especially when writing scripts for people to read. I can insert a binary character using an editor like EMACS but if I show the binary character, the terminal may change it to show it to you. The easiest way I have found to do this in a script in a portable fashion is to use the tr(1) command. It understands octal notations, and it can be output into a variable which can be used. Here's a script that will replace the string "ding" with the ASCII bell character:

    #!/bin/sh
    BELL=$(echo x | tr 'x' '\007')
    sed "s/ding/$BELL/"

Please note that I used double quotes. Since special characters are interpreted, you have to be careful when you use this mechanism.

GNU sed Command Line arguments

One of the conventions UNIX systems have is to use single letters as command line arguments. This makes typing faster, and shorter, which is an advantage if you are in a contest. Normal people often find sed's terseness cryptic. You can improve the readability of sed scripts by using the long word equivalent options. That is, instead of typing

    sed -n 20p

You can type the long word version of the -n argument

    sed --quiet 20p

Or

    sed --silent 20p

The long forms of sed's command line arguments always have two hyphens before their names. GNU sed has the following long-form command line arguments:

    GNU Command Line Arguments

    Short Form      Long Form
    -n              --quiet
                    --silent
    -e script       --expression=SCRIPT
    -f SCRIPTFILE   --file=SCRIPTFILE
    -i[SUFFIX]      --in-place[=SUFFIX]
    -l N            --line-length=N
                    --posix
    -b              --binary
                    --follow-symlinks
    -r              --regexp-extended
    -s              --separate
    -u              --unbuffered
                    --help
                    --version

Let's define each of these.

The -posix argument

The GNU version of sed has many features that are not available in other versions. When portability is important, test your script with the --posix option.
If you had an example that used a feature of GNU sed, such as the 'v' command to test the version number, such as #this is a sed command file v 4.0.1 # print the number of lines $= And you executed it with the command sed -nf sedfile --posix . General help using GNU software: . E-mail bug reports to: . Be sure to include the word ``sed'' somewhere in the ``Subject:'' field. The -h Help argument The -h option will print a summary of the sed commands. The long argument of the command is sed --help It provides a nice summary of the command line arguments. The -l Line Length Argument I've already described the 'l' command. The default line width for the 'l' command is 70 characters. This default value can be changed by adding the '-l N' option and specifying the maximum line length as the number after the '-l'. sed -n -l 80 'l' tmp/b.txt If you executed the above command to do in place editing, there will be a new file called "b.txt" in the current directory, and "tmp/b.txt" will be unchanged. Now you have two versions of the file, one is changed (in the current directory), and one is not (in the "tmp" directory). And where you had a symbolic link, it has been replaced with a modified version of the original file. If you want to edit the real file, and keep the symbolic link in place, use the "--follow-symlinks" command line option: sed -i --follow-symlinks 's/^/\t/' *.txt This follows the symlink to the original location, and modifies the file in the "tmp" directory, If you specify an extension, the original file will be found with that extension in the same directory as the real source. Without the --follow-symlinks command line option, the "backup" file "b.tmp" will be in the same directory that held the symbolic link, and will still be a symbolic link - just renamed to give it a new extension. The -b Binary argument Unix and Linux systems consider the new line character "\n" to be the end of the line. 
However, MS-DOS, Windows, and Cygwin systems end each line with "\r\n" - Carriage return and line-feed. If you are using any of these operating systems, the "-b" or "--binary" command line option will treat the carriage return/new line combination as the end of the line. Otherwise the carriage return is treated as an unprintable character immediately before the end-of-line. I think. (Note to self - verify this).

The -r Extended Regular Expression argument

When I mention patterns, such as "s/pattern/", the pattern is a regular expression. There are two common classes of regular expressions, the original "basic" expressions, and the "extended" regular expressions. For more on the differences see my tutorial on regular expressions and the section on extended regular expressions. Because the meaning of certain characters is different between the regular and extended expressions, you need a command line argument to enable sed to use the extension. To enable this extension, use the "-r" option, as mentioned in the example on finding duplicated words on a line

    sed -r -n '/([a-z]+) \1/p'

or

    sed --regexp-extended --quiet '/([a-z]+) \1/p'

I already mentioned that Mac OS X and FreeBSD use -E instead of -r.

The -u Unbuffered argument

Normally - Unix and Linux systems apply some intelligence to handling standard output. It's assumed that if you are sending results to a terminal, you want the output as soon as it becomes available. However, if you are sending the output to a file, then it's assumed you want better performance, so it buffers the output until the buffer is full, and then the contents of the buffer are written to the file. Let me elaborate on this. Let's assume for this example you have a very large file, and you are using sed to search for a string, and to print it when it is found: sed -n '/MATCH/p' ’ as anchors that indicated a word boundary.
So you could use s@\@@ However, the GNU version of sed says the usage of these special characters is undefined. According to the manual page:

    Regex syntax clashes (problems with backslashes)
    `sed' uses the POSIX basic regular expression syntax. According to
    the standard, the meaning of some escape sequences is undefined in
    this syntax; notable in the case of `sed' are `\|', `\+', `\?',
    `\`', `\'', `\<', `\>', `\b', `\B', `\w', and `\W'.

    As in all GNU programs that use POSIX basic regular expressions,
    `sed' interprets these escape sequences as special characters. So,
    `x\+' matches one or more occurrences of `x'. `abc\|def' matches
    either `abc' or `def'.

When in doubt, experiment.

Command Summary

As I promised earlier, here is a table that summarizes the different commands. The second column specifies if the command can have a range or pair of addresses or a single address or pattern. The next four columns specify which of the four buffers or streams are modified by the command. Some commands only affect the output stream, others only affect the hold buffer. If you remember that the pattern space is output (unless a "-n" was given to sed), this table should help you keep track of the various commands.

    Command  Address or Range  Input Stream  Output Stream  Pattern Space  Hold Buffer
    =        -                 -             Y              -              -
    a        Address           -             Y              -              -
    b        Range             -             -              -              -
    c        Range             -             Y              -              -
    d        Range             Y             -              Y              -
    D        Range             Y             -              Y              -
    g        Range             -             -              Y              -
    G        Range             -             -              Y              -
    h        Range             -             -              -              Y
    H        Range             -             -              -              Y
    i        Address           -             Y              -              -
    l        Address           -             Y              -              -
    n        Range             Y             *              -              -
    N        Range             Y             -              Y              -
    p        Range             -             Y              -              -
    P        Range             -             Y              -              -
    q        Address           -             -              -              -
    r        Address           -             Y              -              -
    s        Range             -             -              Y              -
    t        Range             -             -              -              -
    w        Range             -             Y              -              -
    x        Range             -             -              Y              Y
    y        Range             -             -              Y              -

A "Y" means the command modifies that stream or buffer. The "n" command (marked with *) may or may not generate output, depending on the "-n" option. The "r" command can only have one address, despite the documentation.

Check out my Sed Reference Chart

In Conclusion

This concludes my tutorial on sed.
It is possible to find shorter forms of some of my scripts. However, I chose these examples to illustrate some basic constructs. I wanted clarity, not obscurity. I hope you enjoyed it.