Episode 12 Find and locate
We use find and locate to dig up lists of files on our system that match certain criteria. We also look at xargs for executing commands on a selected set of files.
20 October 2015
•
[Rhythmic, dark electronic intro music] | |
League |
•
Welcome back to Command Line TV. Today we’re going to talk about finding files using a command called find. And do we have any follow-up from last time?
Lopes |
•
Last time at the end of the episode, we learned about formatting and modifying SD cards or, sorry, external drives. How can we load a drive so it’s read-only?
League |
•
Sure, so if you want to make sure that programs can’t access – or can’t write to the drive, there is an option for that when you mount it. If I type mount, then the device, and then the path where it would be mounted, the directory where it would mount, you can specify some other options here using -o. A common option is just saying ro, for read-only:
    mount -o ro /dev/sdb1 /mnt
Then the disk will be mounted read-only, and it means you can still do things like read its files, but if you tried to actually edit a file or copy a file to it or something like that, it would stop you right away and say “read-only filesystem.” So that prevents it from being written.
Lopes |
•
Since we’ll be using the find command, it’s as simple as the command name sounds. We just use it to find files and other things on our filesystem, correct?
League |
•
Yeah, you use it to find things. What’s interesting about it is it’s got this syntax that’s available as options for specifying a query. It’s really like querying a database, but about files. So you can find them by name, but you can also find them by modification times or permissions, or combine all these things together into a big query. I’m going to start with the simplest case, which is finding them by name.
The first thing that I give is the directory to start looking in, and then it will look in any sub-directory of that too. If I want to look across this entire system I could put /, that would be the top level directory. Or . for the current directory, or ~ for my home directory. But you could do any of those as your starting point for the find. Then we put the query as options. So -name, and this takes wildcards, so I could say something like *.png, but there is a little bit of a catch there. When you use a wildcard like this, the shell expands it before it actually gives it to the find command. So it would expand to the matching files in the current directory, and that’s not what we want. We want that star to reach find itself. So I don’t want my shell to expand the wildcard, I want find to match the wildcard with the files that it comes across. So I have to quote it, just like when you’ve got spaces in a filename, or any special characters.
    find ~ -name '*.png'
You put quotes around it and then the shell won’t expand it, but find will. So there’s a simple example of a find. It’s going to just dump out a list of all of these PNG files that exist in my home directory. So I’m going to pipe that into less.
    find ~ -name '*.png' | less
You see some of them are in these cache folders, which hold little thumbnails of images that I didn’t even know about, but I can find them with find. One program has put some of its icons into that cache, and so on. There are lots of PNGs here that you might not have even thought of before.
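A small sketch of the quoting catch described above, run in a scratch directory so nothing real is touched. The file names are hypothetical, and the start path is spelled out as . for portability.

```shell
# Scratch directory with hypothetical file names (not from the episode).
demo=$(mktemp -d)
mkdir -p "$demo/sub"
touch "$demo/top.png" "$demo/notes.txt" "$demo/sub/deep.png"
cd "$demo" || exit 1

# Quoted: find itself receives the pattern *.png and matches it at every level.
quoted=$(find . -name '*.png' | LC_ALL=C sort)
echo "quoted:"
echo "$quoted"

# Unquoted: the shell first expands *.png to top.png (the only match in the
# current directory), so find looks for that one literal name and never
# reports sub/deep.png.
unquoted=$(find . -name *.png | LC_ALL=C sort)
echo "unquoted:"
echo "$unquoted"

# Clean up the scratch directory.
cd / && rm -rf "$demo"
```

If the current directory held several PNG files, the unquoted form would hand find several names after -name and it would most likely fail with an error instead.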
Lopes |
•
So when we ran the find command, it matched the wildcard but then it also descended into subdirectories. What if, for example, you wanted to locate some of the playing cards that we worked with, but the originals, not the ones that we changed the geometry on?
League |
•
Yep, so down here in the Downloads directory
    cd Downloads/Playing\ Cards
    cd PNG-cards-1.3
    ls
I had these playing cards, and last time we created the subdirectories. So when I do a find – by the way, if the current directory is where you want to start from then you don’t actually need to specify the dot – I get all of the filenames that have hearts in them.
    find . -name '*hearts*'
But that is getting me the ones in subdirectories as well. If I want to limit it to either the current directory or maybe I just don’t want to search too deep, there’s an option called -maxdepth. I put -maxdepth 1
    find -maxdepth 1 -name '*hearts*'
and now I’m only seeing the files at depth one, so basically in the current directory. And if I went to a depth of 2, it searches one level of subdirectories as well.
    find -maxdepth 2 -name '*hearts*'
Now there are some other queries that I can add to this. When you have multiple queries on a find, they are combined using a Boolean AND operator. So in other words all of them have to be true in order for the file to match. One that I like to use sometimes is, if you want to find files that have been modified since a certain time, an option called -newer. So I want to show files that are newer than some other file. Let’s pick one of those hearts. First let’s get back all of the hearts in the current directory, and then I’m going to just do the ones that are newer than the 7 of hearts.
    find -maxdepth 1 -name '*hearts*' -newer 7_of_hearts.png
If I were to look at these by modification time, so like
    ls -ltr
with the most recent ones at the bottom – you’re going to see queen, king, 2, 6, 4, 5, jack, 3 as being newer than the 7. So let’s see – oh, but I’m seeing stuff that isn’t hearts, so let’s do it this way.
    ls -ltr *hearts*
King, queen, jack, 2, 3, 4, 6 – I believe that’s what we had before. So these files below here are newer than the 7 of hearts. And the order those come in – so it’s showing me the modification times here – all say 2011 because that is the time-stamp that was in the zip file. But they could have been zipped in a particular order, and there are seconds and milliseconds there that it’s not showing me because the date is so far in the past. But it is actually more detailed than what it’s showing.
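The -maxdepth and -newer behaviour above can be tried out with a sketch like the following. The card names, the "small" subdirectory, and the timestamps are made up for illustration; touch -t is used only to give the files predictable modification times.

```shell
demo=$(mktemp -d)
mkdir -p "$demo/small"
cd "$demo" || exit 1

# touch -t sets an explicit modification time (YYYYMMDDhhmm),
# so the result of -newer does not depend on when this runs.
touch -t 201101010000 3_of_hearts.png
touch -t 201101020000 7_of_hearts.png
touch -t 201101030000 queen_of_hearts.png queen_of_spades.png
touch -t 201501010000 small/queen_of_hearts.png

# Depth 1 only: entries directly inside the starting directory.
shallow=$(find . -maxdepth 1 -name '*hearts*' | LC_ALL=C sort)

# Two tests are ANDed: the name matches AND the file is newer than the 7.
newer=$(find . -maxdepth 1 -name '*hearts*' -newer 7_of_hearts.png | LC_ALL=C sort)

# Without -maxdepth the subdirectory is searched as well.
newer_all=$(find . -name '*hearts*' -newer 7_of_hearts.png | LC_ALL=C sort)

printf 'shallow:\n%s\nnewer:\n%s\nnewer_all:\n%s\n' "$shallow" "$newer" "$newer_all"

cd / && rm -rf "$demo"
```

Piping through sort is only there to make the listing order stable; find itself reports files in whatever order the directory returns them.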
Lopes |
•
Since these files were modified elsewhere, I guess a way to represent this or show a better explanation of it would be to look at files that we modified ourselves, like the ones in the subdirectories.
League |
•
Yeah. So if I did that -newer command, but let’s get rid of the -maxdepth, right, so newer than the 7 of hearts:
    find -name '*hearts*' -newer 7_of_hearts.png
You’re going to get everything in those subdirectories because those were modified much more recently. These also don’t seem to be coming out in any particular order. If you wanted these to appear in some more significant order you could sort them, right? So pipe it into sort
    find -name '*hearts*' -newer 7_of_hearts.png |sort
and now they’re a little more nicely organized.
So there are a few other queries we could use. One thing that’s useful besides -name is -type. So if I do -type, I can ask for regular files, or directories, or other types of files like device files and so forth, which we haven’t really learned much about. So if I wanted to find everything that’s a directory that contains the name hearts
    find -type d -name '*hearts*'
there’s nothing that matches. So there were lots of things that have the name hearts, but when I also require a directory then my result set becomes empty. If I just did
    find -type d
I get a list of all the directories that I’ve got. Another one that I think is useful is whether the file has zero bytes – it’s a completely empty file.
    find -empty
And sometimes there are a surprising number of empty files on your filesystem. Some of them are there for good reason even though they’re completely empty. So these are just some of the queries you can use. Do you want to guess how we could find out about more queries that are available with find?
Lopes |
•
We could do find --help.
League |
•
    find -h
    find --help
So pipe that into less
    find --help | less
and it lists the queries that you can do. But then for more detail there’s this ‘manual’ command, so
    man find
So that’s the way to learn about the capabilities of find.
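The -type and -empty tests from the earlier part of this discussion can be checked with a sketch like this one. The file and directory names are hypothetical. One detail worth knowing: -empty also matches empty directories, so the sketch adds -type f to keep only zero-byte regular files.

```shell
demo=$(mktemp -d)
cd "$demo" || exit 1

# One directory and two files whose names contain "hearts", plus empty files.
mkdir hearts_dir
touch hearts_dir/placeholder empty.txt
printf 'data\n' > full_hearts.txt

# Directories whose name contains "hearts".
dirs=$(find . -type d -name '*hearts*')

# Regular files whose name contains "hearts".
files=$(find . -type f -name '*hearts*')

# Zero-byte regular files (without -type f, empty directories match too).
empties=$(find . -type f -empty | LC_ALL=C sort)

printf 'dirs: %s\nfiles: %s\nempties:\n%s\n' "$dirs" "$files" "$empties"

cd / && rm -rf "$demo"
```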
Lopes |
•
So when we pulled up the help for find, there were, I guess, options called “actions”. What can we do with those?
League |
•
Right, so it kind of carves up these options into these three categories, and “actions” are something you would put at the end of your query. The default action if you don’t specify one is just to print filenames, but there are lots of ways to specify how it prints – that’s what these formats are about. I’m not going to get into those. But it can also execute arbitrary other commands. And it’s got a built-in one here called -delete, which deletes a bunch of files according to your query.
Let’s try some of those. I’m going to do a find for the jacks, and whenever I’ve got an action besides -print, I try it out first by doing just a print, right? So I want to see all of the files that it is producing.
    find -name '*jack*' |less
Maybe I will simplify it a little – it’s finding these with a dot-underscore, so if I just take filenames that start with jack
    find -name 'jack*' |less
there ought to be fewer of them. Okay, so those are all of the Jack cards. And then on my find I add a -delete
    find -name 'jack*' -delete
but now if I do
    find -name 'jack*'
they’re gone. So that’s something obviously you want to use with great care.
There are ways to specify other arbitrary commands you could do as well. So let’s say I am looking at the queens. Here are all of my queens.
    find -name 'queen*'
And I want to change permissions on those, so let’s go down into the cards directory.
    cd Downloads/Playing\ Cards/PNG-cards-1.3/
    ls -l
So if I look at these cards here, they all have read permission for everyone. Let’s say that my queens are private and I want to turn off the read permission for anyone but the user. We’re going to do a find with -exec, and run chmod with go-r: so for group and others let’s turn off read permission. The user can keep read permission but turn it off for the others. And then you would put the filename normally – well, the filename is going to come from the find. So I’ve got a very special way to plug in the filename at this point in my command. And that is I put quote and curly braces. That’s the signal to find to plug in the filename that it finds. Finally I have to say when I’m done with the -exec command, so you do backslash semi-colon.
    find -name 'queen*' -exec chmod go-r '{}' \;
There are a lot of quotes and backslashes and stuff with find because you need to pass these wildcards in explicitly. And normally the curly braces would be a wildcard that the shell interprets, so you put quotes around that. Semi-colon means something in the shell so you quote that with a backslash, so that it reaches find. Alright, so I’m going to run that. It was very fast, and what we will find in the current directory is that all of our queens
    ls -l
now do not have read permission for group and others, but all the other cards do. So that hints at the power of this find command, of doing very complex queries and then allowing us to hook that in to some other command like chmod in order to execute a command on lots of different files.
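A sketch of the preview-then-act habit and of -exec with chmod, confined to a scratch directory so that -delete cannot touch real files. The card names are hypothetical, and the starting permissions are set to 644 explicitly so the result does not depend on the local umask.

```shell
demo=$(mktemp -d)
cd "$demo" || exit 1
touch queen_of_hearts.png queen_of_spades.png king_of_hearts.png jack_of_clubs.png
chmod 644 ./*.png

# Preview with the default print action first, then repeat with -delete.
find . -name 'jack*' -print
find . -name 'jack*' -delete
jacks_left=$(find . -name 'jack*')

# Run chmod once per matching file; '{}' is replaced by the file name and
# \; marks the end of the command handed to -exec.
find . -name 'queen*' -exec chmod go-r '{}' \;

# First ten characters of ls -l are the type and permission bits.
queen_mode=$(ls -l queen_of_hearts.png | cut -c1-10)
king_mode=$(ls -l king_of_hearts.png | cut -c1-10)
echo "queen: $queen_mode  king: $king_mode"

cd / && rm -rf "$demo"
```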
Lopes |
•
Can we use the octal values with chmod there as well?
League |
•
Yeah, anything that chmod accepts. So one of those octal numerical values was 662 or something – just make up a weird one. So if I do that on all of the queens
    find -name 'queen*' -exec chmod 662 '{}' \;
then we see here, this is the result of 662. Yeah, so any command can be put in there. It could even be something like some script that you wrote, any command you could normally execute and put a filename into.
    find -name 'queen*' -exec ./resize '{}' \;
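To illustrate running your own script from -exec, here is a sketch with a stand-in script. The episode’s ./resize script is not shown, so ./record below is a hypothetical replacement that only logs the file name it was given; the card names are made up as well.

```shell
demo=$(mktemp -d)
cd "$demo" || exit 1
touch queen_of_hearts.png queen_of_spades.png king_of_hearts.png

# Hypothetical stand-in for "some script that you wrote".
cat > record <<'EOF'
#!/bin/sh
# Append one line per invocation, naming the file we were given.
echo "called with: $1" >> calls.log
EOF
chmod +x record

# find runs ./record once for each matching file.
find . -name 'queen*' -exec ./record '{}' \;

calls=$(wc -l < calls.log | tr -d ' ')
log=$(LC_ALL=C sort calls.log)
echo "script ran $calls times:"
echo "$log"

cd / && rm -rf "$demo"
```

The log shows one line per queen, which makes it visible that the \; form starts the command separately for every file.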
Lopes |
•
So when we combined those, it seemed sort of like when we would use a pipe.
League |
•
Yes, there are lots of ways to combine commands together – pipe, and we also did the command substitution with those back-ticks – and when I do
    find -name 'queen*' -exec chmod go-r '{}' \;
it plugs each filename into chmod, so you might imagine another way to think of that. Let’s say I do a find – I did queens and jacks, let’s do a king – and I’m going to do -maxdepth 1 to make this a little bit fewer, right. So there are my kings,
    find -maxdepth 1 -name 'king*'
and if I wanted to run chmod on them, what I could do actually is put chmod first, so maybe I want 722 – twos don’t make a lot of sense, like giving other write permission, four is read permission. Let’s say I want 744, and then normally you would put a filename here, but you can put multiple filenames on chmod, either with the back-quotes or with dollar-parentheses:
    chmod 744 $(find -maxdepth 1 -name 'king*')
This will run the find, and its results get plugged into the chmod command line, so that’s pretty much the same thing. So you see my kings turned green because I made them executable. So that is very similar to using -exec. But there are some subtle differences. First of all, there is often a limit on how big a command line can get. So if this find matched a huge number of files then I might exhaust the limitation on the size of the command line. So this form with the command substitution has that limitation. Whereas if I do -exec, it runs chmod once for each file rather than build up an enormous single command line. So that is one difference in the limitations, even though it looks like it does pretty much the same thing.
But another one – you said it’s like a pipe, and there is another way to use a pipe – which is a command called xargs, which sits somewhere between piping and command substitution. So let’s bring back the command substitution form.
    chmod 744 $(find -maxdepth 1 -name 'king*')
What is more or less equivalent to this is – let’s do the find, so I’ll copy that out and paste it out there, so that’s going to generate those
    find -maxdepth 1 -name 'king*'
and then I pipe it into xargs chmod. Well, let’s do something different so I recognize the change. Now normally chmod does not read filenames from its input. But what xargs does is take the result of that, all of the files from there, and put them on the chmod command line.
    find -maxdepth 1 -name 'king*' | xargs chmod 644
So xargs allows me to turn what was a command substitution into a pipe. And that works fine. Now all of the kings have 644 as their permissions.
    ls -l
So I said these are three different ways to do the same thing, right? We did the -exec, the command substitution where chmod is on the outside, and xargs.
They’re all more or less the same, but the caveats and limitations are where things get a little weird. And one of those limitations is when filenames have spaces in them. I might have said before that you should be very careful about naming things with spaces in them. And this is one of the reasons – it makes it very hard for commands to distinguish between files. Let’s take a little example here. I know up here, outside of the cards directory, I’ve got one.
    cd ~
So this directory is called “Command Line TV”.
    find -maxdepth 1 -type d
I get all of the directories in the current directory. So that includes that. But now if I pipe that into xargs
    find -maxdepth 1 -type d | xargs chmod +x
so I want to make all of those directories executable, which is a reasonable thing to do. [Sigh] What happened? That “Command Line TV” was one line of my output, but when I pipe it into xargs, it gets split at the spaces. And so my chmod sees three separate names. So that’s a risk with filenames with spaces in them. It’s also a risk with how command substitution works. There is a solution to it though, and it’s a way in which find and xargs cooperate.
    find --help
I said that -print is the default action. What -print0 does is print the filenames
    find -maxdepth 1 -type d -print0
but instead of separating them with newlines or spaces, it separates them with the ‘zero’ character, or the null character. So the character with the value zero. And when I see them printed here on the command line it looks like they’re all bunched together – that’s because that null character doesn’t show up. But if I take that and then pipe it into xargs, I have to tell it that it should look for the zero character to split them up – so I’ve got to – let me verify with
    xargs --help
So we’re going to do the -0.
    find -maxdepth 1 -type d -print0 | xargs -0 chmod +x
And now it works again. So it’s able to keep the filename with spaces in it together because it knows that spaces or newlines are not what splits up the multiple directories. It’s actually this null character. So if you do that on both sides – the -print0 on find and the -0 on xargs – then they cooperate and this problem goes away.
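The splitting problem can be made visible by counting how many arguments xargs hands to the command. This is a sketch in a scratch directory; “Command Line TV” mirrors the directory name from the episode, while Downloads and the small sh -c argument counter are only illustrative.

```shell
demo=$(mktemp -d)
cd "$demo" || exit 1
mkdir "Command Line TV" Downloads

# find prints three entries: ".", "./Command Line TV" and "./Downloads".
# sh -c 'echo $#' sh prints how many arguments it received from xargs.

# Plain pipe: xargs splits on whitespace, so the spaced name becomes
# three separate arguments.
plain=$(find . -maxdepth 1 -type d | xargs sh -c 'echo $#' sh)

# Null-delimited on both sides: each directory stays a single argument.
safe=$(find . -maxdepth 1 -type d -print0 | xargs -0 sh -c 'echo $#' sh)

echo "plain pipe: $plain arguments, null-delimited: $safe arguments"

# With the safe form, chmod sees the real directory names and succeeds.
find . -maxdepth 1 -type d -print0 | xargs -0 chmod +x
chmod_status=$?

cd / && rm -rf "$demo"
```

The -exec form never has this problem, because find passes each name to the command directly rather than through text that has to be split again.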
Lopes |
•
So just to backtrack, would the null character be one of the characters we see in that output?
League |
•
No, you literally can’t see it here, it just doesn’t print out. So what we’re seeing is one result, and then the next result right after it. The null character just doesn’t appear in printed output. But it will appear when you pipe it into something that is expecting it. Another way that we could see it show up actually – this introduces another command, but one that’s pretty easy – there’s a command called “octal dump” (od). If I pipe into that
    find -maxdepth 1 -type d | od
it just shows it to you as a series of octal numbers. And you can specify that they should be like one byte big.
    find -maxdepth 1 -type d | od -t o1
I don’t like octal so I’m going to do hexadecimal – that’s better.
    find -maxdepth 1 -type d | od -t x1
What we’re seeing here is basically dot-slash-dot something, newline, okay. So it’s separating them with newlines. But if I do that same thing with -print0
    find -maxdepth 1 -type d -print0 | od -t x1
those newline bytes become zero bytes. So I don’t see that when it’s just output onto the terminal, but it is sent on to the next command in the pipe.
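The separator bytes can also be counted rather than read off the dump by eye. A sketch, assuming two hypothetical directories a and b in a scratch directory; od -An suppresses the address column so only the byte values remain.

```shell
demo=$(mktemp -d)
cd "$demo" || exit 1
mkdir a b

# Hex dump of the default output: names separated by newline bytes (0a).
find . -maxdepth 1 -type d | od -An -t x1

# Hex dump with -print0: the separators are zero bytes (00) instead.
find . -maxdepth 1 -type d -print0 | od -An -t x1

# Count separators by putting each byte on its own line.
# Three entries are printed: ".", "./a" and "./b".
nl_count=$(find . -maxdepth 1 -type d | od -An -t x1 | tr -s ' ' '\n' | grep -c '^0a$')
nul_count=$(find . -maxdepth 1 -type d -print0 | od -An -t x1 | tr -s ' ' '\n' | grep -c '^00$')
echo "newline separators: $nl_count, null separators: $nul_count"

cd / && rm -rf "$demo"
```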
Lopes |
•
So I would say that find works especially well when you’re working with a confined or constrained search area. Is that command the best to use when you’re searching your entire system?
League |
•
Yeah, so when you specify the starting directory, you can put a slash
    find /
here to say search the entire system. And sometimes you might need to do that. But it’s very time-consuming, if it has to do that. If you’re not the administrative user, you’re a regular user, it’s going to encounter lots of directories you’re not even allowed to read. So it will give you some error messages about stuff like that. There is a better command for searching the entire disk for a pattern. Well, there are a couple of trade-offs. One trade-off is that it doesn’t support as many queries as find. The command that I’m going to introduce now is locate. It can do pattern matching on filenames, but that’s pretty much it. So let’s say I want anything that has to do with the password file, right? So this will respond pretty quickly.
    locate passwd
And the reason it can respond pretty quickly and still find stuff all over your disk is that it uses a database. There’s a database that only gets updated periodically, that basically indexes all of the files on your system. And then locate just searches that database. So we can use it like that. Or let’s say I want anything that has to do with ImageMagick.
    locate Magick
So it gives me anything across the whole system, and it’s pretty fast. If I wanted to find stuff that’s very recent, very new, that’s more of a problem. So let’s go into my Downloads.
    cd Downloads
So suppose I want to locate files which have this pattern.
    locate weblog-2015
And I believe locate matches the pattern anywhere in the name, as if it had stars on both sides. So I don’t really need to do that, although if you wanted to put stars in there somewhere, you do need to quote them again, for the same reason as with find. So we can search for that. And these are all of the files that say weblog-2015. But it’s able to find those because it’s got this database, so if I add a new file right now – let’s create
    touch weblog-2015-09-17.txt
    ls
So now that file exists, and it wasn’t there before. When I do
    locate weblog-2015
it doesn’t show up. That’s because the database is now out of date. If I wanted to update the database manually, I can. Normally what happens is it’s scheduled as like a periodic job – like once a day or every couple of hours or something – it will run a command. The command is called updatedb. When I do this, it’s going to reindex the entire disk, so it can take a little while. But then it updates the database and then we’ll be able to see the result.
    sudo updatedb
Actually that was pretty fast – I’m not sure it did what I thought it was going to do! But we’ll try it – let’s try
    locate weblog-2015
Lopes |
•
So like most commands on the terminal that we’ve been running, a lot of it is case-sensitive. What would we do, or what option would we have to set, to turn off case-sensitivity?
League |
•
Yeah, for both find and locate, it’s assuming a case-sensitive match. So if I did a locate and falsely remembered that my weblog files were capitalized like that, it’s not going to find them.
    locate WEBLOG-2015
But there is an option, -i:
    locate -i WEBLOG-2015
And then it will ignore the difference in case between your pattern and the filename itself. That helps me find more things.
With find, first of all let’s verify that it still finds these files; -name needs the wildcard here.
    find -name weblog-2015
    find -name 'weblog-2015*'
But same thing –
    find -name 'WEBLOG-2015*'
it’s going to look for capitals. So the fix there is that there’s just a separate query operator called -iname.
    find -iname 'WEBLOG-2015*'
And then it’ll find those files.
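The difference between -name and -iname can be checked with a sketch like this. The first file name comes from the episode; the second one is hypothetical.

```shell
demo=$(mktemp -d)
cd "$demo" || exit 1
touch weblog-2015-09-17.txt weblog-2015-10-20.txt

# -name has to match the whole file name, so the bare prefix finds nothing.
exact=$(find . -name 'weblog-2015')

# With a quoted wildcard the prefix matches both files.
glob=$(find . -name 'weblog-2015*' | LC_ALL=C sort)

# -name is case-sensitive: the capitalized pattern finds nothing.
upper=$(find . -name 'WEBLOG-2015*')

# -iname ignores case, so the capitalized pattern matches again.
ci=$(find . -iname 'WEBLOG-2015*' | LC_ALL=C sort)

printf 'glob:\n%s\niname:\n%s\n' "$glob" "$ci"

cd / && rm -rf "$demo"
```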
Lopes |
•
So in today’s episode we touched base on locating files using three commands: find, locate, and xargs. That also wraps up Season 1 of Command Line TV.
League |
•
Yes, we hope you found this useful. We covered lots of things since we started this: navigating through files, searching for stuff, image processing, package management, redirection, shell scripts – so we did lots of things. And if you found this useful, I hope you’ll get in touch with us.
Lopes |
•
You can reach us at |
League |
•
And if we have good feedback from you and you found this useful then we’ll try to do more! |
•
[Dark electronic beat] •[Captions by Christopher League] •[End] |