https://learnbyexample.github.io/learn_perl_oneliners/one-liner-introduction.html

Perl One-Liners Guide

   

One-liner introduction

This chapter will give an overview of Perl syntax for command line
usage. You'll see examples to understand what kind of problems are
typically suited for one-liners.

Why use Perl for one-liners?

I assume that you are already familiar with use cases where the
command line is more productive than a GUI. See also this series
of articles titled Unix as IDE.

A shell like Bash provides built-in commands and scripting
features to easily solve and automate various tasks. External
commands like grep, sed, awk, sort, find, parallel, etc. help to solve
a wide variety of text processing tasks. These tools are often
combined with shell features like pipelines, wildcards and loops.
You can use Perl as an alternative to such external tools and also
to complement them for some use cases.

Here are some sample text processing tasks that you can solve using
Perl one-liners. Options and related details will be explained later.

# change ; to #
# but don't change ; within single or double quotes
perl -pe 's/(?:\x27;\x27|";")(*SKIP)(*F)|;/#/g'

# retain only the first copy of duplicated lines
# uses the built-in module List::Util
perl -MList::Util=uniq -e 'print uniq <>'

# extract only IPv4 addresses
# uses a third-party module Regexp::Common
perl -MRegexp::Common=net -nE 'say $& while /$RE{net}{IPv4}/g'

Here are some stackoverflow questions that I've answered with a simpler
Perl solution compared to other CLI tools:

  * replace string with incrementing value
  * sort rows in csv file without header & first column
  * reverse matched pattern
  * append zeros to list
  * arithmetic replacement in a text file
  * reverse complement DNA sequence for a specific field

Perl's selling points over tools like grep, sed and awk include a
feature-rich regular expression engine and standard/third-party
modules. Another advantage is that Perl is more portable, given the
many differences between GNU, BSD and other such implementations. The
main disadvantage is that Perl is likely to be more verbose and slower
for features that are supported out of the box by those tools.


    info See also unix.stackexchange: when to use grep, sed, awk,
    perl, etc.

Installation and Documentation

If you are on a Unix-like system, you are most likely to already have
some version of Perl installed. See cpan: Perl Source for
instructions to install the latest Perl version from source. perl
v5.38.0 is used for all the examples shown in this book.

You can use the perldoc command to access documentation from the
command line. You can visit https://perldoc.perl.org/ if you wish to
read it online, which also has a handy search feature. Here are some
useful links to get started:

  * perldoc: overview
  * perldoc: perlintro
  * perldoc: faqs

Command line options

perl -h gives the list of all command line options, along with a
brief description. See perldoc: perlrun for documentation on these
command switches.

     Option                          Description
-0[octal]        specify record separator (\0, if no argument)
-a               autosplit mode with -n or -p (splits $_ into @F)
-C[number/list]  enables the listed Unicode features
-c               check syntax only (runs BEGIN and CHECK blocks)
-d[t][:MOD]      run program under debugger or module Devel::MOD
-D[number/letters]
                 set debugging flags (argument is a bit mask or
                 alphabets)
-e commandline   one line of program (several -e's allowed, omit
                 programfile)
-E commandline   like -e, but enables all optional features
-f               don't do $sitelib/sitecustomize.pl at startup
-F/pattern/      split() pattern for -a switch (//'s are optional)
-g               read all input in one go (slurp), rather than
                 line-by-line
                 (alias for -0777)
-i[extension]    edit <> files in place (makes backup if extension
                 supplied)
-Idirectory      specify @INC/#include directory (several -I's
                 allowed)
-l[octnum]       enable line ending processing, specifies line
                 terminator
-[mM][-]module   execute use/no module... before executing program
-n               assume while (<>) { ... } loop around program
-p               assume loop like -n but print line also, like sed
-s               enable rudimentary parsing for switches after
                 programfile
-S               look for programfile using PATH environment variable
-t               enable tainting warnings
-T               enable tainting checks
-u               dump core after parsing program
-U               allow unsafe operations
-v               print version, patchlevel and license
-V[:variable]    print configuration summary (or a single Config.pm
                 variable)
-w               enable many useful warnings
-W               enable all warnings
-x[directory]    ignore text before #!perl line (optionally cd to
                 directory)
-X               disable all warnings

This chapter will show examples with the -e, -E, -l, -n, -p and -a
options. Some more options will be covered in later chapters, but not
all of them are discussed in this book.

Executing Perl code

If you want to execute a Perl program file, one way is to pass the
filename as an argument to the perl command.

$ echo 'print "Hello Perl\n"' > hello.pl
$ perl hello.pl
Hello Perl

For short programs, you can also directly pass the code as an
argument to the -e and -E options. See perldoc: feature for details
about the features enabled by the -E option.

$ perl -e 'print "Hello Perl\n"'
Hello Perl

# multiple statements can be issued separated by ;
# -l option will be covered in detail later, appends \n to 'print' here
$ perl -le '$x=25; $y=12; print $x**$y'
59604644775390625
# or, use -E and 'say' instead of -l and 'print'
$ perl -E '$x=25; $y=12; say $x**$y'
59604644775390625

Filtering

Perl one-liners can be used for filtering lines matched by a regular
expression (regexp), similar to the grep, sed and awk commands. And
similar to many command line utilities, Perl can accept input from
both stdin and file arguments.

# sample stdin data
$ printf 'gate\napple\nwhat\nkite\n'
gate
apple
what
kite

# print lines containing 'at'
# same as: grep 'at' and sed -n '/at/p' and awk '/at/'
$ printf 'gate\napple\nwhat\nkite\n' | perl -ne 'print if /at/'
gate
what

# print lines NOT containing 'e'
# same as: grep -v 'e' and sed -n '/e/!p' and awk '!/e/'
$ printf 'gate\napple\nwhat\nkite\n' | perl -ne 'print if !/e/'
what

By default, grep, sed and awk automatically loop over the input
content line by line (with newline character as the default line
separator). To do so with Perl, you can use the -n and -p options.
The O module section shows the code Perl runs with these options.

As seen before, the -e option accepts code as a command line
argument. Many shortcuts are available to reduce the amount of typing
needed. In the above examples, a regular expression (defined by the
pattern between a pair of forward slashes) has been used to filter
the input. When the input string isn't specified, the test is
performed against the special variable $_ which has the contents of
the current input line (the correct term would be input record, as
discussed in the Record separators chapter). $_ is also the default
argument for many functions like print and say. To summarize:

  * /REGEXP/FLAGS is a shortcut for $_ =~ m/REGEXP/FLAGS
  * !/REGEXP/FLAGS is a shortcut for $_ !~ m/REGEXP/FLAGS

    info See perldoc: match for help on the m operator.

Here's an example with file input instead of stdin.

$ cat table.txt
brown bread mat hair 42
blue cake mug shirt -7
yellow banana window shoes 3.14

# digits at the end of lines that are not preceded by -
$ perl -nE 'say $& if /(?<!-)\d+$/' table.txt
42
14
# if the condition isn't required, capture groups can be used
$ perl -nE 'say /(\d+)$/' table.txt
42
7
14

    info The example_files directory has all the files used in the
    examples (like table.txt in the above illustration).

Substitution

Use the s operator for search and replace requirements. By default,
this operates on $_ when the input string isn't provided. For these
examples, the -p option is used instead of -n, so that the value of
$_ is automatically printed after processing each input line. See
perldoc: search and replace for documentation and examples.

# for each input line, change only the first ':' to '-'
# same as: sed 's/:/-/' and awk '{sub(/:/, "-")} 1'
$ printf '1:2:3:4\na:b:c:d\n' | perl -pe 's/:/-/'
1-2:3:4
a-b:c:d

# for each input line, change all ':' to '-'
# same as: sed 's/:/-/g' and awk '{gsub(/:/, "-")} 1'
$ printf '1:2:3:4\na:b:c:d\n' | perl -pe 's/:/-/g'
1-2-3-4
a-b-c-d

    info The s operator modifies the input string it is acting upon
    if the pattern matches. In addition, it will return the number of
    substitutions made if successful, otherwise returns a falsy value
    (empty string or 0). You can use the r flag to return the string
    after substitution instead of in-place modification. As mentioned
    before, this book assumes you are already familiar with Perl
    regular expressions. If not, see perldoc: perlretut to get
    started.

Special variables

Brief descriptions for some of the special variables are given below:

  * $_ contains the input record content
  * @F array containing fields (with the -a and -F options)
      + $F[0] first field
      + $F[1] second field and so on
      + $F[-1] last field
      + $F[-2] second last field and so on
      + $#F index of the last field
  * $. current record number (i.e. line number)
  * $1 backreference to the first capture group
  * $2 backreference to the second capture group and so on
  * $& backreference to the entire matched portion
  * %ENV hash containing environment variables

See perldoc: special variables for documentation.

Field processing

Consider the sample input file shown below with fields separated by a
single space character.

$ cat table.txt
brown bread mat hair 42
blue cake mug shirt -7
yellow banana window shoes 3.14

Here are some examples that are based on specific fields rather than
the entire line. The -a option will cause the input line to be split
on whitespace, and the field contents can be accessed using the
@F special array variable. Leading and trailing whitespace will be
suppressed, so there's no possibility of empty fields. More details
are discussed in the Default field separation section.

# print the second field of each input line
# same as: awk '{print $2}' table.txt
$ perl -lane 'print $F[1]' table.txt
bread
cake
banana

# print lines only if the last field is a negative number
# same as: awk '$NF<0' table.txt
$ perl -lane 'print if $F[-1] < 0' table.txt
blue cake mug shirt -7

# change 'b' to 'B' only for the first field
# same as: awk '{gsub(/b/, "B", $1)} 1' table.txt
$ perl -lane '$F[0] =~ s/b/B/g; print "@F"' table.txt
Brown bread mat hair 42
Blue cake mug shirt -7
yellow banana window shoes 3.14

See the Output field separator section for details on using array
variables inside double quotes.

BEGIN and END

You can use a BEGIN{} block when you need to execute something before
the input is read and an END{} block to execute something after all
of the input has been processed.

# same as: awk 'BEGIN{print "---"} 1; END{print "%%%"}'
$ seq 4 | perl -pE 'BEGIN{say "---"} END{say "%%%"}'
---
1
2
3
4
%%%

ENV hash

When it comes to automation and scripting, you'd often need to
construct commands that can accept input from users, use data from
files and the output of a shell command and so on. As mentioned
before, this book assumes bash as the shell being used. To access
environment variables of the shell, you can use the special hash
variable %ENV with the name of the environment variable as a string
key.


    info Quotes won't be used around hash keys in this book. See
    stackoverflow: are quotes around hash keys a good practice in
    Perl? on possible issues if you don't quote the hash keys.

# existing environment variables
# output shown here is for my machine, would differ for you
$ perl -E 'say $ENV{HOME}'
/home/learnbyexample
$ perl -E 'say $ENV{SHELL}'
/bin/bash

# defined along with the command
# note that the variable definition is placed before the command
$ word='hello' perl -E 'say $ENV{word}'
hello
# the characters are preserved as is
$ ip='hi\nbye' perl -E 'say $ENV{ip}'
hi\nbye

Here's another example where a regexp is passed as the content of an
environment variable.

$ cat anchors.txt
sub par
spar
apparent effort
two spare computers
cart part tart mart

# assume 'r' is a shell variable containing user provided regexp
$ r='\Bpar\B'
$ rgx="$r" perl -ne 'print if /$ENV{rgx}/' anchors.txt
apparent effort
two spare computers

You can also make use of the -s option to assign a Perl variable.

$ r='\Bpar\B'
$ perl -sne 'print if /$rgx/' -- -rgx="$r" anchors.txt
apparent effort
two spare computers

    info As an example, see my repo ch: command help for a practical
    shell script, where commands are constructed dynamically.

Executing external commands

You can execute external commands using the system function. See
perldoc: system for documentation and details like how string/list
arguments are processed before execution.

$ perl -e 'system("echo Hello World")'
Hello World

$ perl -e 'system("wc -w <anchors.txt")'
12

$ perl -e 'system("seq -s, 10 > out.txt")'
$ cat out.txt
1,2,3,4,5,6,7,8,9,10

The return value of system or the special variable $? can be used to
act upon the exit status of the command being executed. As per the
documentation:


    info The return value is the exit status of the program as
    returned by the wait call. To get the actual exit value, shift
    right by eight

$ perl -E '$es=system("ls anchors.txt"); say $es'
anchors.txt
0
$ perl -E 'system("ls anchors.txt"); say $?'
anchors.txt
0

$ perl -E 'system("ls xyz.txt"); say $?'
ls: cannot access 'xyz.txt': No such file or directory
512

To save the result of an external command, use backticks or the qx
operator. See perldoc: qx for documentation and details like
separating out STDOUT and STDERR.

$ perl -e '$words = `wc -w <anchors.txt`; print $words'
12

$ perl -e '$nums = qx/seq 3/; print $nums'
1
2
3

    info See also stackoverflow: difference between backticks,
    system, and exec.

Summary

This chapter introduced some of the common options for Perl CLI
usage, along with some of the typical text processing examples. While
specific purpose CLI tools like grep, sed and awk are usually faster,
Perl has a much more extensive standard library and ecosystem. And
you do not have to learn a lot if you are already comfortable with
Perl but not familiar with those CLI tools. The next section has a
few exercises for you to practice the CLI options and text processing
use cases.

Exercises

    info All the exercises are also collated together in one place at
    Exercises.md. For solutions, see Exercise_solutions.md.

    info The exercises directory has all the files used in this
    section.

1) For the input file ip.txt, display all lines containing is.

$ cat ip.txt
Hello World
How are you
This game is good
Today is sunny
12345
You are funny

##### add your solution here
This game is good
Today is sunny

2) For the input file ip.txt, display the first field of lines not
containing y. Consider space as the field separator for this file.

##### add your solution here
Hello
This
12345

3) For the input file ip.txt, display all lines containing no more
than 2 fields.

##### add your solution here
Hello World
12345

4) For the input file ip.txt, display all lines containing is in the
second field.

##### add your solution here
Today is sunny

5) For each line of the input file ip.txt, replace the first
occurrence of o with 0.

##### add your solution here
Hell0 World
H0w are you
This game is g0od
T0day is sunny
12345
Y0u are funny

6) For the input file table.txt, calculate and display the product of
numbers in the last field of each line. Consider space as the field
separator for this file.

$ cat table.txt
brown bread mat hair 42
blue cake mug shirt -7
yellow banana window shoes 3.14

##### add your solution here
-923.16

7) Append . to all the input lines for the given stdin data.

$ printf 'last\nappend\nstop\ntail\n' | ##### add your solution here
last.
append.
stop.
tail.

8) Use the contents of the s variable to display all matching lines
from the input file ip.txt. Assume that s doesn't have any regexp
metacharacters. Construct the solution such that there's at least one
word character immediately preceding the contents of the s variable.

$ s='is'

##### add your solution here
This game is good

9) Use system to display the contents of the filename present in the
second field of the given input line. Consider space as the field
separator.

$ s='report.log ip.txt sorted.txt'
$ echo "$s" | ##### add your solution here
Hello World
How are you
This game is good
Today is sunny
12345
You are funny

$ s='power.txt table.txt'
$ echo "$s" | ##### add your solution here
brown bread mat hair 42
blue cake mug shirt -7
yellow banana window shoes 3.14