Episode 9 Redirection and substitution

We explore some more shell basics including redirection to and from files, and command substitution. This feature allows the output of one command to be used as parameters of another command.

22 September 2015

[Rhythmic, dark electronic intro music]

League

Welcome back to Command Line TV, this is episode 9.

Today we’re going to talk about redirection and command substitution.

Do you have any follow-up from last time?

Lopes

Last time we did installing programs and package management.

My one question is: we used apt-get upgrade and apt-get update –

can we run apt-get upgrade without running update, or will that cause any conflicts?

League

You can run upgrade by itself – normally you would run update and then upgrade together.

It’s fine to run upgrade by itself but what it will do is only upgrade

packages that it already knows about. What’s happening is the ‘apt’ subsystem keeps a package cache.

So it knows the last time it did an update, what were the versions of all

the packages that were available. And upgrade will look at that package

cache and upgrade anything that can be upgraded. But unless you update,

you’re not getting the freshest stuff, you’re just getting the stuff from the previous update.

So it’s fine to do that, it doesn’t hurt anything – you just might not be getting the very latest.

Lopes

I guess the last question we have would be in terms of uninstalling a package?

What’s the proper way of doing that so there are no conflicts?

League

So when you want to uninstall something, there are a couple of options.

The simplest is, if you do apt-get and then the command remove.

And then you put the name of a package, or multiple packages –

apt-get remove imagemagick

so maybe we want to remove imagemagick. This will remove that package and

delete the files from the system. However, it does leave behind the

configuration files for imagemagick. So the thinking is that –

if you install it, and maybe you customized the installation of this package in some way,

by editing some of the configuration files – it doesn’t want to remove your customization,

because if you install it again later, then maybe you want to keep the way it was customized.

So remove removes most of the stuff but it leaves behind a little bit.

An alternative to that is purge. This will remove the package as well as

all the configuration files, even if you made changes to them.

apt-get purge imagemagick

So if you really want to be sure that everything is gone, then purge works fine.

And then we saw there are cases of dependencies, right.

Sometimes when you install a package, it brings along some other packages

that it needs as dependencies. So just uninstalling imagemagick doesn’t

necessarily remove all those dependencies. And there’s another command you

can run called autoremove. You don’t need to put a package name here,

apt-get autoremove

it’s just that whenever you run autoremove, it’ll look for any packages

that are no longer required by other packages that you’ve installed.

So if imagemagick needed some library that nobody else needs, it can remove that.

If some other package does still need that library, then it will keep it.

So it’s very smart about managing these kinds of shared dependencies and conflicts and so on.

One thing to be careful of – since you mentioned conflicts –

is that the upgrade and install commands support an option called -f or --force.

This is usually a bad idea. What it means is that if there’s –

it can help in a couple of ways, but it can also be harmful.

If the package requires dependencies that are not installed properly,

you can use --force to try and install it anyway, and maybe it’ll work or maybe it’ll be terrible.

If a package is going to overwrite files that another package has already installed,

that’s a bad thing that normally ‘apt’ will try to prevent. But --force will allow that to happen.

So once in a while, --force might be the thing you need to solve some problem,

but usually it’s just going to cause problems, so try to avoid it if possible.

Lopes

So today we’re going to learn about redirection, which utilizes the >,

the < and the >> symbols. What are the purposes of these?

League

Redirection is about controlling where the input and output of different commands come from.

Every command that you run has basically three streams that are associated with it –

three streams of data. They are standard input, the standard output, and the standard error.

And so error is also an output stream, but it’s meant to be used for things

that are not part of the normal output, but things that are error messages

or warnings or stuff like that.

The way it works is that when you put together a pipeline –

so if I do ls --help – whoops, can’t have a space there –

and then pipe that into less – what I’m doing with this pipe is

ls --help | less

connecting the output of one command to the input of another command.

And then where standard error comes into play is, if this ls command has

some warnings or error messages it wants to put out,

they do not get piped into less or into whatever the rest of that pipeline is.

So we could try out some examples of that. One command that generates both

standard output and standard error very easily is grep.

So we did grep to search through files before. I’m going to search for

copyright in all the files in the current directory.

grep copyright *

We saw this before, that it will generate some lines where this is the filename,

and then the content of that file which contains our keyword out here.

But it also has these error messages that come from grep,

so whenever it hits a directory – I didn’t tell it to go into that

directory or to ignore directories, so it’s giving me a little error message there.

So that’s the standard error that we’re seeing. Normally when you run grep,

the standard output and the standard error are both just dumping content onto my terminal.

But I can redirect that in different ways. So let’s try redirect to a file.

I’m going to put a greater-than (>) and then a filename.

So we’ll – let’s call this copymatches.txt.

grep copyright * >copymatches.txt

What this will do is run that command, but the standard output –

instead of being attached to another command like less,

it will take that standard output and write it to this filename that I’ve given here, after the >.

And you can have a space here, or not – that doesn’t matter.

So if I do that, what happens is all of the legitimate output of that

grep got directed to that file, so I don’t see it.

But I still see all of the error messages, because the error messages were

going out on a separate output stream called standard error.

And then if I want to redirect both output and error to the same place, there’s a way to do that.

It is using the ampersand (&) – there’s one way to do it with this

(&>) but I think it’s a little more complicated. If you put the ampersand

after the greater-than (>&) – and in that case, you don’t want a space there.

This will write both of those – let’s say copyboth.txt for that file name.

grep copyright * >& copyboth.txt

So now I don’t see any output from that command because both the standard

output and standard error went to the same file.

So let’s take a look at those files. The copymatches.txt contains just the matches that we saw,

less copymatches.txt

and no error messages. And then copyboth.txt contains the matches and also –

less copyboth.txt

I think I saw one up here – the error messages where I have directories.

So that’s redirecting the output.
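
As a minimal sketch of the two redirections above: the helper function emit is made up and stands in for grep so the result is predictable, and the scratch directory is only for illustration. The second redirection is written as `> file 2>&1`, the portable spelling that has the same effect in bash as the >& form used above.

```shell
# Scratch directory so no real files are touched.
demo=$(mktemp -d)

# Hypothetical stand-in for grep: one line on stdout, one on stderr.
emit() {
    echo "README:1: copyright notice"
    echo "emit: stuff: Is a directory" >&2
}

# Standard output only: the error line still shows up on the terminal.
emit > "$demo/copymatches.txt"

# Standard output and standard error into the same file.
emit > "$demo/copyboth.txt" 2>&1

cat "$demo/copymatches.txt"
cat "$demo/copyboth.txt"
```

The first file should hold only the match line, and the second should hold the match line followed by the error line.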

Lopes

So besides the grep command, can redirection be used with things such as cat?

League

Yeah, cat is often used to just directly put some data into a file.

You can use a text editor for that, but this is a really simple thing and we

can use it to illustrate some of the other output redirection operators.

So if I run cat – cat all by itself, all it does is it copies its

cat

standard input to its standard output. So if I type hello, it says hello back, and so forth.

And then when I’m done, whenever you’ve got a command that’s waiting for

you to type something for its standard input, when you’re all done you can

type ‘control-D’ to say that’s the end of the input.

So I’m going to hit ‘control-D’ and it takes me back and I’m done running cat.

So what if I did cat but I redirect the output somewhere? We could call it output1.txt.

cat >output1.txt

And this time I’ll type line1 test, this is line 2, control-D.

And now we didn’t see it echo those lines back to me because instead it

echoed them to the file that I specified. And then if I look at the content of that file,

also with cat, then it shows me those lines that got saved there.

cat output1.txt

And then I could open that with a text editor or something and edit it further if it needs it.

Let’s try that again – if I cat to the same output file –

cat >output1.txt

and we’ll do line 3 hello, testing line 4, control-D to stop.

cat output1.txt

And now my output file just contains the latest stuff. It actually overwrote the previous content.

So you have to be really careful with this redirection operator.

It will – if that file already exists – it will overwrite what’s there,

and so what’s there gets lost.

An alternative to that is if I use the double greater-than (>>).

This means to take the existing content of that file and append to it – add to the end.

cat >>output1.txt

So I was up to line 4 I think, or?

Lopes

Line 4.

League

Line 4, alright. So this is line 5, and now line 6, and control-D.

And when I output the file now, it has 3 and 4 from the previous run but I

cat output1.txt

appended 5 and 6 to the end. So that’s one way that we can distinguish

between the single greater-than (>) to possibly overwrite a file and start from the beginning,

or the double greater-than (>>) which appends onto the end of the file.
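
The overwrite-versus-append behavior can be replayed without typing at a prompt. This is only a sketch: printf stands in for the lines typed into cat, and the scratch path is an assumption for illustration.

```shell
# Scratch location so no real file is overwritten.
demo=$(mktemp -d)
out="$demo/output1.txt"

printf 'line 1\nline 2\n' > "$out"    # creates the file
printf 'line 3\nline 4\n' > "$out"    # single > replaces the old content
printf 'line 5\nline 6\n' >> "$out"   # double >> adds to the end

cat "$out"
```

After these three commands the file should contain lines 3 through 6; lines 1 and 2 are gone.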

Lopes

So the double greater-than and the single greater-than can both create new

files if they don’t exist. What if you just want to add a no-clobber option to it?

League

If you want to make sure it doesn’t overwrite some file that already exists,

we can set an option in the shell to make it a little safer –

just like we made those aliases for mv and cp and so on,

to make sure they don’t overwrite files. The way to do that is a shell option,

which you set with set -o and the name of that option is noclobber.

set -o noclobber

And so normally I would want to do that in my .bashrc or somewhere like that,

so it can be saved and every time I start a new shell I’ll have that option set.

The way I did it right now, it’ll only take place –

it’ll only take effect for this particular session.

But now if I do cat into my output1.txt, it will actually prevent me

cat >output1.txt

from overwriting a file that already exists. So that’s kind of nice.

But if I do double greater-than (>>), that would not overwrite a file

cat >>output1.txt

because that’s going to append and that’s okay. So the noclobber doesn’t prevent that.

So that’s a simple option you might want to put in your .bashrc,

to make sure that this overwrite doesn’t happen.
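
A small sketch of noclobber in a script, assuming a scratch file rather than anything real. The overwrite attempt is wrapped in a subshell here purely so that the refused redirection cannot stop the rest of the demo.

```shell
# Scratch file with some existing content.
demo=$(mktemp -d)
out="$demo/output1.txt"
echo "original" > "$out"

set -o noclobber

# With noclobber set, a single > onto an existing file is refused.
( echo "replacement" > "$out" ) 2> /dev/null || echo "overwrite refused"

# Appending with >> is still allowed.
echo "appended" >> "$out"

set +o noclobber    # back to the default behavior
cat "$out"
```

The file should end up with the original line plus the appended one, and never the replacement.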

Alright, so the final type of redirection you might want to do is directing

standard input from a file. So you’ve got a file and you want it to become

the input of another command. So for example I just created this output1.txt,

cat output1.txt

and if I want that to become the input of, let’s say – grep.

So if I grep for line, okay that’s the keyword I’m grepping for.

Now, grep and a lot of commands can work in multiple ways –

so I can just put the filename on my grep command line and it will do that grep,

grep line output1.txt

so it matched four of those lines but not the last one, so that one got omitted.

But I can also specify it using a less-than (<) operator which is redirecting from that file.

grep line <output1.txt

And it does the same thing. So I can think of doing that with grep or head –

let’s say I want to see only the first two lines of this file.

head -2 <output1.txt

And in both of these cases, I can either use the redirection operator or not,

head -2 output1.txt

it doesn’t really make a big difference. But there are some commands where

you would have to specify it as a redirection instead of –

because it just doesn’t support reading its files from the command line.
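
One command of this kind is tr, which takes no filename arguments and reads only its standard input. A small sketch, with made-up file contents:

```shell
# Scratch file with one sample line.
demo=$(mktemp -d)
printf 'line 3 hello\n' > "$demo/output1.txt"

# tr has no file operand, so the file has to arrive by redirection (or a pipe).
tr '[:lower:]' '[:upper:]' < "$demo/output1.txt"
```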

Now I can mix input and output redirection on the same command line as well.

So if I’ve got a file output1.txt and I want to, let’s say grep for

append which I know matches one of those lines. And then I can also

output the result of that into output2.txt. So no output appears on the terminal,

grep append <output1.txt >output2.txt

but my grep ran – it took its input from this file and it dumped its output to that file.

So the output only contains that one line that matched the keyword that I searched for.
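
A self-contained sketch of mixing both redirections. The file contents are invented for the demo and are not the exact lines typed in the episode.

```shell
# Scratch directory and a sample input file (contents are made up).
demo=$(mktemp -d)
printf 'line 3 hello\ntesting line 4\nthis is line 5\nappended at the end\n' > "$demo/output1.txt"

# Input comes from one file, output goes to another; nothing reaches the terminal.
grep append < "$demo/output1.txt" > "$demo/output2.txt"

cat "$demo/output2.txt"
```

The two redirections can be written in either order on the command line; the result is the same.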

Lopes

So in one of our previous episodes, we ran cat and then piped it into less.

A few commands ago now, you just ran less and then the text file.

What’s the difference between those two?

League

Yeah, we normally were using less like I did ls --help pipe less or something like that.

ls --help | less

And so what less is doing is giving me a page at a time,

but it’s getting its content from the pipe. So it’s taking its standard

input and showing it to me a page at a time. But just like I did up here with head,

head can take its standard input from a redirection or it can use a file.

So I can do the same with less. If I want to – there must be a README here, yeah.

So if I want to see the contents of this a page at a time,

I can specify it as a file like that, less README, and it shows me a page at a time.

less README

Or I can run less and redirect the contents of README into it – same thing.

less <README

Or – [laughs] this gets even crazier – you can run cat –

cat of course will just dump out the content, but then you can pipe it into less – same thing.

cat README |less

Or you can redirect from the README file into cat and then pipe it into less.

cat <README |less

And these become useless at some point, you’re just adding some small

layers of complexity to it when really all you want to do is less README.

But it just gives you a sense of the flexibility. Sometimes these –

the way these commands work comes in handy when you’re building long pipelines.

And we’ve seen cases of doing like grep then cut and sort then grep and stuff like that.

The very nature of these commands that allows them to be plugged together

to build those big useful pipelines also allows them to be plugged together in fairly useless ways.
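
Since less is interactive, here is a sketch of the same four forms using head instead, so the results can be compared in a script. The README contents are made up, and head -n 2 is the standard spelling of the head -2 shorthand.

```shell
# Scratch directory with a small sample README.
demo=$(mktemp -d)
printf 'one\ntwo\nthree\n' > "$demo/README"

head -n 2 "$demo/README"          > "$demo/a.txt"   # filename argument
head -n 2 < "$demo/README"        > "$demo/b.txt"   # input redirection
cat "$demo/README" | head -n 2    > "$demo/c.txt"   # cat piped in
cat < "$demo/README" | head -n 2  > "$demo/d.txt"   # redirect into cat, then pipe

# cmp is silent and succeeds when two files are identical.
cmp "$demo/a.txt" "$demo/b.txt" &&
cmp "$demo/a.txt" "$demo/c.txt" &&
cmp "$demo/a.txt" "$demo/d.txt" &&
echo "all four agree"
```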

Lopes

So now we’ll be moving on to command substitution. The two main symbols it

uses are the back-tick and the dollar parentheses symbol.

But first, what would command substitution be used for?

League

It’s a way of combining two different commands. And we’ve seen how to

combine two commands using a pipe, so when you do ls|head or something,

ls|head

you’re taking the output of one command and making it the input of another command.

What the back-quotes do – or command substitution – is,

let’s say I do ls – this doesn’t make any sense, but – let’s say I do that.

ls `head`

So I put one command in back-quotes and this is the –

it’s not the normal apostrophe character, it’s the back-quote (`) that

appears usually underneath the tilde. So tilde is the shifted one,

and without shift it’s usually that.

So this takes the command in back-quotes and executes it. It has some output.

And then it takes that output and it pastes it in that position on the command line.

So it will then run ls where what appears out here is the output of the previous command.

So let’s try to come up with a better example of that. I’ll start with a grep command.

So when I do something like grep copyright *, we know what that does –

grep copyright *

it just finds matching lines. But there’s an option to grep -l which

means only show the filenames that match. So if I do grep -l copyright I

grep -l copyright *

still get these error messages about directories but other than that it’s just showing me filenames.

It doesn’t show the text of that file where it matched, but just the filename itself.

So then what I might want to do is take those filenames and pass them to

another command to do something different with them.

For example, I might want to delete them, or move them all to a different

folder or something like that.

Let’s make a new folder, call it stuff. And what I want to do is move –

mkdir stuff

what are the files you want to move? Well, I want to move all the files

that match copyright out of * which is all the files in the current directory.

And then where do you want to move them to? The new directory I created, called stuff.

mv `grep -l copyright *` stuff

So there are two commands here: mv and grep. The grep will run first,

and the output of grep – the standard output of grep is then pasted

into the command line at this point where the quotes are now.

So that will get replaced by the output of grep and then I can move all those files into stuff.

What I’m seeing here is just the standard error of the grep, right.

So it’s only the standard output that gets pasted in,

and then the standard error still comes out on the terminal.

But what it did do is it moved all of those files into stuff.

So if I look in that stuff directory I’ve got a bunch of files in there –

ls stuff

every single one of those files is one that had copyright appear within it.

And now the files that remain in the current directory do not have copyright in them.

So if I do grep copyright * in the current directory, I only get the error messages.

grep copyright *

All the files that matched have been moved aside.

So that’s a pretty good example of a command – whenever you’ve got a

command that might return a list of files, then you can use that as –

with the back quotes, with the command substitution –

as parameters for things like mv and rm and cp and other commands

that expect filenames on their command line.
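
The move can be tried safely in a scratch directory. This is a sketch with made-up files; it uses *.txt rather than * so grep does not complain about the stuff directory. Because the shell splits the substituted output on whitespace, it also assumes filenames without spaces.

```shell
# Scratch directory with three sample files, two of which mention copyright.
demo=$(mktemp -d)
cd "$demo" || exit 1
echo "copyright 2015" > a.txt
echo "nothing here" > b.txt
echo "all copyright reserved" > c.txt
mkdir stuff

# grep -l runs first; its output (a.txt c.txt) is pasted into the mv command.
mv `grep -l copyright *.txt` stuff

ls stuff
```

Afterwards a.txt and c.txt should be inside stuff, and b.txt should still be in the scratch directory.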

Lopes

Now does command substitution only work on text files, or can we use it for images as well?

League

Oh we can use it for lots of things. So a pretty neat way that we could use it for images is like –

to select some image files according to some criteria,

and then apply maybe a mogrify to crop or shrink those images or something like that.

So let’s go look at the – I think that was in Downloads/pics –

cd ~/Downloads/pics/

these are the images I was working with in our episode on ImageMagick.

And so if I do identify on all of these images, I get a line for each

identify *

image that tells me the resolution and so forth. And to keep this relatively simple,

let’s just say that I grep and I’m going to look for images that have a width of 3264,

identify * | grep 3264x

so it’ll have like the 3264 and then an x. That’s a subset of the images, right?

Most of them are by 1952, but this one actually is a little different.

But now I’ve been able to use identify and grep to select a particular set of image files.

And then I want to strip this down so that it’s just the filename that I’m seeing,

not the rest of this information. So I’m going to use cut and what I’m

going to pretend is the delimiter here is this left bracket symbol ([), right?

So if that’s the delimiter then the first field would be the filename and

the second field would be all the rest of this. So I can say the delimiter is left bracket,

and I want field 1. And now I just have a list of filenames.

identify * | grep 3264x | cut -d'[' -f1

So that’s a perfect type of command that I could put in my substitution.

And when commands get a little more complex like this,

and especially when you use nested substitution, these back-quotes don’t always behave very well.

They can’t be nested within each other the way parentheses can.

So what I’m going to do is use this alternative syntax for command substitution,

which is dollar and then parentheses $() around the part that gets substituted.

Reminder that this outputs a list of filenames. So if I put it in

back-quotes it’s going to expand to that list of filenames.

And then I can use that within another command like mogrify.

And let’s say I want to shrink these by – I don’t know, 10% or something like that.

mogrify -geometry 10% $(identify * | grep 3264x | cut -d'[' -f1)

So first we’re going to use this pipeline to select the files,

and then we’re going to apply mogrify to all of them.

That can take a moment because I’m shrinking a bunch of files, but it came out pretty fast.

If I list these files in reverse order by modification time,

ls -tr

then I see that all of the ones that appeared in this list are at the end.

So starting with 201, 202, 207 – these files were the most recently modified.

And if I look at some of them – let’s just take 201, 2, and 7 .jpg

eog IMAG020[127].jpg

do you remember this bracket wildcard? So this will substitute –

another way to do this is a question mark. The question mark just substitutes a single character,

so that’s going to match all the images that have 020-something.

But I want to only select ones that end with 1,2,7,9 because they were in my initial list.

So I’m going to look at those files that I just shrunk – at least the first four of them.

And you can see that they’re very tiny images now – only 10% of what they were.

And the way I selected those was using the $() and this sub-command.
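
A sketch of the same selection without needing ImageMagick installed. The listing below is invented sample text that only imitates the shape of the identify output described above. It also shows one $() nested inside another, which back-quotes do not handle cleanly.

```shell
# Invented stand-in for the output of identify (not real tool output).
listing=$(mktemp)
cat > "$listing" <<'EOF'
IMAG0201.jpg[0] JPEG 3264x1952
IMAG0203.jpg[1] JPEG 1952x3264
IMAG0207.jpg[2] JPEG 3264x1840
EOF

# A whole pipeline inside $(): keep width 3264, then cut at the left bracket.
selected=$(grep 3264x < "$listing" | cut -d'[' -f1)
echo "$selected"

# Nested substitution: the inner $() picks the first name, basename strips .jpg.
first=$(basename "$(grep 3264x < "$listing" | head -n 1 | cut -d'[' -f1)" .jpg)
echo "$first"
```

The variable selected could then be handed to a command such as mogrify in place of the inline $() shown in the episode.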

Lopes

Well today we covered the basics on redirection using the >, <, and >> symbols.

We also touched on command substitution using the back-tick as well as the dollar-parentheses sign.

League

So thanks for joining us today and we’ll see you next time.

[Dark electronic beat]

[Captions by Christopher League]

[End]