This is another one of those notes to myself because I look this stuff up every six months.
Basics of find. The following is a complete rip off of the content at: http://content.hccfl.edu/pollock/unix/findcmd.htm
But I wanted to keep a copy safe here. Thanks Wayne Pollock on 12/30/2009 10:27:30.
FIND
The -print action lists the names of files separated by a newline. But it is common to pipe the output of find into xargs, which splits its input on whitespace (spaces as well as newlines). This can lead to a problem if any found files contain spaces in their names, as the output doesn’t use any quoting. In such cases, when the output of find contains a file name such as "foo bar" and is piped into another command, that command sees two file names, not one file name containing a space. Even without using xargs you could have a problem if the file name contains a newline character.
In such cases you can specify the action -print0 instead. This lists the found files separated not with a newline but with a null (or NUL) character, which is not a legal character in Unix or Linux file names. Of course the command that reads the output of find must be able to handle such a list of file names. Many commands commonly used with find (such as tar or cpio) have special options to read in file names separated with NULs instead of spaces.
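Here is a minimal sketch of the difference, assuming a find and xargs that support -print0 and -0 (the GNU and BSD versions do). The scratch directory and the file names are made up for illustration.

```shell
# Scratch directory holding one awkward file name (hypothetical names).
dir=$(mktemp -d)
touch "$dir/foo bar" "$dir/plain"

# Whitespace splitting: "foo bar" is broken into two arguments, so
# "xargs -n 1 echo" prints three lines for two files.
split_count=$(find "$dir" -type f | xargs -n 1 echo | wc -l | tr -d ' ')

# NUL splitting: each file name arrives as exactly one argument.
safe_count=$(find "$dir" -type f -print0 | xargs -0 -n 1 echo | wc -l | tr -d ' ')

echo "whitespace-split arguments: $split_count, NUL-split arguments: $safe_count"
rm -rf "$dir"
```

The -n 1 option is only there to make each argument visible on its own line; in real use you would drop it so xargs can batch the arguments.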
You can use shell-style wildcards in the -name search argument:
find . -name foo\*bar
This will search from the current directory down for foo*bar (that is, any filename that begins with foo and ends with bar). Note that wildcards in the name argument must be quoted so the shell doesn’t expand them before passing them to find. Also, unlike regular shell wildcards, these will match leading periods in filenames. (For example, find . -name \*.txt also matches a file named .txt.)
You can search for other criteria besides the name. You can also list multiple search criteria. When you have multiple criteria, any found files must match all listed criteria. That is, there is an implied Boolean AND operator between the listed search criteria. find also allows OR and NOT Boolean operators, as well as grouping, to combine search criteria in powerful ways.
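As a small sketch of those operators (the scratch directory and file names are hypothetical), the following combines grouping with OR, and then uses NOT:

```shell
dir=$(mktemp -d)
touch "$dir/a.txt" "$dir/b.log" "$dir/c.tmp"
mkdir "$dir/d.txt"            # a directory, to show -type f still applies

# Grouping with \( \) and OR (-o): regular files ending in .txt or .log.
# The parentheses are escaped so the shell passes them through to find.
either=$(find "$dir" -type f \( -name '*.txt' -o -name '*.log' \) | sort)

# NOT (!): regular files whose name does not end in .tmp.
not_tmp=$(find "$dir" -type f ! -name '*.tmp' | sort)

printf '%s\n' "$either"
rm -rf "$dir"
```

Without the parentheses, the implied AND binds tighter than -o, so -type f would apply only to the first -name test.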
Here’s an example using two search criteria:
find / -type f -mtime -7 | xargs tar -rf weekly_incremental.tar
gzip weekly_incremental.tar
This will find any regular files (i.e., not directories or other special files) with the criteria -type f, and only those modified less than seven days ago (-mtime -7). Note the use of xargs, a handy utility that converts a stream of input (in this case the output of find) into command line arguments for the supplied command (in this case tar, used to create a backup archive).
Using the tar option -c is dangerous here; xargs may invoke tar several times if there are many files found, and each -c will cause tar to overwrite the archive written by the previous invocation. The -r option appends files to an archive. Other options, such as those that would permit filenames containing spaces, would be useful in a production-quality backup script.
xargs is illustrated below. This command will efficiently remove all files named core from your system (provided you run the command as root of course):
find / -name core | xargs /bin/rm -f
find / -name core -exec /bin/rm -f '{}' \; # same thing
find / -name core -delete # same if using Gnu find
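(The second form runs the rm command once per file, and is not as efficient as the first form. The third form uses the built-in -delete action of GNU find, which removes each file without running rm at all.)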
One of my favorite find criteria is to locate files modified less than 10 minutes ago. I use this right after using some system administration tool, to learn which files got changed by that tool:
find / -mmin -10
(This search is also useful when I’ve downloaded some file but can’t locate it.)
Another common use is to locate all files owned by a given user (-user username). This is useful when deleting user accounts.
You can also find files with various permissions set. -perm /permissions means to find files with any of the specified permissions on, -perm -permissions means to find files with all of the specified permissions on, and -perm permissions means to find files with exactly the specified permissions. Permissions can be specified either symbolically (preferred) or with an octal number. The following will locate files that are writeable by others (including symlinks, which should be writeable by all):
find . -perm -o=w
(Using -perm is more complex than this example shows. You should check both the POSIX documentation for find (which explains how the symbolic modes work) and the GNU find man page (which describes the GNU extensions).)
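A quick sketch of the three -perm forms, run against scratch files with known modes. The file names are hypothetical, and the "/" form assumes GNU find.

```shell
dir=$(mktemp -d)
touch "$dir/private" "$dir/shared" "$dir/open"
chmod 600 "$dir/private"
chmod 664 "$dir/shared"
chmod 666 "$dir/open"

# -perm -MODE: ALL listed bits must be set (group-writable and other-writable).
all_bits=$(find "$dir" -type f -perm -g=w,o=w | sort)

# -perm /MODE: ANY listed bit may be set (GNU extension).
any_bits=$(find "$dir" -type f -perm /g=w,o=w | sort)

# -perm MODE: the permission bits must match exactly.
exact=$(find "$dir" -type f -perm 600)

printf 'all: %s\nany: %s\nexact: %s\n' "$all_bits" "$any_bits" "$exact"
```

Only the 666 file has both write bits, so it alone satisfies the "-" form, while the "/" form also picks up the 664 file.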
When using find to locate files for backups, it often pays to use the -depth option (really a criterion that is always true), which forces the output to be depth-first; that is, files first and then the directories containing them. This helps when the directories have restrictive permissions, and restoring the directory first could prevent the files from restoring at all (and would change the time stamp on the directory in any case). Normally, find returns the directory first, before any of the files in that directory. This default order is useful when using the -prune action to prevent find from examining any files you want to ignore:
find / -path /dev -prune -o -print | xargs tar ...
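The same pruning pattern on a scratch tree, as a sketch (directory and file names are made up). Note that -name only ever sees the last component of a path, so a pattern containing a slash needs -path instead.

```shell
dir=$(mktemp -d)
mkdir "$dir/skip" "$dir/keep"
touch "$dir/skip/hidden.txt" "$dir/keep/seen.txt"

# -path matches the whole pathname; -prune stops find from descending into
# the match; "-o -type f -print" prints the regular files found elsewhere.
listed=$(find "$dir" -path "$dir/skip" -prune -o -type f -print)

echo "$listed"
```

The explicit -print matters: without it, the pruned directory itself would be printed too.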
When specifying time with find options such as -mmin (minutes) or -mtime (24 hour periods, starting from now), you can specify a number n to mean exactly n, -n to mean less than n, and +n to mean more than n.
Fractional 24-hour periods are truncated! That means that find -mtime +1 says to match files modified two or more days ago.
For example:
find . -mtime 0 # find files modified between now and 1 day ago
# (i.e., within the past 24 hours)
find . -mtime -1 # find files modified less than 1 day ago
# (i.e., within the past 24 hours, as before)
find . -mtime 1 # find files modified between 24 and 48 hours ago
find . -mtime +1 # find files modified more than 48 hours ago
find . -mmin +5 -mmin -10 # find files modified between
# 6 and 9 minutes ago
Using the -printf action instead of the default -print is useful to control the output format better than you can with ls or dir. You can use find with -printf to produce output that can easily be parsed by other utilities or imported into spreadsheets or databases. See the man page for the dozens of possibilities with the -printf action. (In fact find with -printf is more versatile than ls and is the preferred tool for forensic examiners even on Windows systems, to list file information.) For example the following displays non-hidden (no leading dot) files in the current directory only (no subdirectories), with a custom output format:
find . -maxdepth 1 -name '[!.]*' -printf 'Name: %16f Size: %6s\n'
-maxdepth is a GNU extension. On a modern, POSIX version of find you could use this:
find . -path './*' -prune ...
On any version of find you can use this more complex (but portable) code:
find . ! -name . -prune ...
which says to prune (don’t descend into) any directories except the starting directory ".".
Note that -maxdepth 1 will include "." unless you also specify -mindepth 1. A portable way to include "." is:
find . \( -name . -o -prune \) ...
[This information posted by Stephane Chazelas, on 3/10/09 in newsgroup comp.unix.shell.]
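To convince myself the portable form really lists the same things, here is a sketch comparing it with the GNU options on a scratch tree (the names are hypothetical, and the first command assumes GNU find):

```shell
dir=$(mktemp -d)
mkdir -p "$dir/sub/deeper"
touch "$dir/top.txt" "$dir/sub/inner.txt"

# GNU form: entries exactly one level below the starting directory.
gnu=$(cd "$dir" && find . -mindepth 1 -maxdepth 1 | sort)

# Portable form: everything that is not "." gets pruned (and printed),
# so find never descends below the first level.
portable=$(cd "$dir" && find . ! -name . -prune | sort)

printf 'GNU:\n%s\nportable:\n%s\n' "$gnu" "$portable"
rm -rf "$dir"
```

Both commands run inside a subshell so the cd does not change the directory of the calling shell.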
As a system administrator you can use find to locate suspicious files (e.g., world writable files, files with no valid owner and/or group, SetUID files, files with unusual permissions, sizes, names, or dates). Here’s a final more complex example (which I saved as a shell script):
find / -noleaf -wholename '/proc' -prune \
-o -wholename '/sys' -prune \
-o -wholename '/dev' -prune \
-o -wholename '/windows-C-Drive' -prune \
-o -perm -2 ! -type l ! -type s \
! \( -type d -perm -1000 \) -print
This says to search the whole system, skipping the directories /proc, /sys, /dev, and /windows-C-Drive (presumably a Windows partition on a dual-booted computer). The GNU -noleaf option tells find not to assume all remaining mounted filesystems are Unix file systems (you might have a mounted CD for instance). The -o is the Boolean OR operator, and ! is the Boolean NOT operator (applies to the following criteria).
So these criteria say to locate files that are world writable (-perm -2, same as -o=w) and NOT symlinks (! -type l) and NOT sockets (! -type s) and NOT directories with the sticky (or text) bit set (! \( -type d -perm -1000 \)). (Symlinks, sockets and directories with the sticky bit set are often world-writable and generally not suspicious.)
A common request is a way to find all the hard links to some file. Using ls -li file will tell you how many hard links the file has, and the inode number. You can locate all pathnames to this file with:
find mount-point -xdev -inum inode-number
Since hard links are restricted to a single filesystem, you need to search that whole filesystem, so you start the search at the filesystem’s mount point. (This is likely to be either /home or / for files in your home directory.) The -xdev option tells find to not search any other filesystems.
(While most Unix and all Linux systems have a find command that supports the -inum criterion, this isn’t POSIX standard. Older Unix systems provided the ncheck utility instead, which could be used for this.)
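A small sketch of the hard-link search. It starts from a scratch directory rather than a real mount point so it is safe to run anywhere; the file names are made up.

```shell
dir=$(mktemp -d)
echo data > "$dir/original"
ln "$dir/original" "$dir/alias"    # second hard link to the same inode
touch "$dir/unrelated"

# ls -i prints the inode number as the first field.
inode=$(ls -i "$dir/original" | awk '{print $1}')

# Every pathname under $dir that refers to that inode.
links=$(find "$dir" -xdev -inum "$inode" | sort)

printf '%s\n' "$links"
rm -rf "$dir"
```

On a real system you would replace the scratch directory with the mount point of the filesystem holding the file.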
The -exec option to find is great, but since it runs the command listed for every found file it isn’t very efficient. On a large system this makes a difference! One solution is to combine find with xargs as discussed above:
find whatever... | xargs command
However this approach has two limitations. Firstly, not all commands accept the list of files at the end of the command. A good example is cp:
find . -name \*.txt | xargs cp /tmp # This won't work!
(Note the GNU version of cp has a non-POSIX option -t for this, and xargs has options to handle this too.)
Secondly, filenames may contain spaces or newlines, which would confuse the command used with xargs. (Again GNU tools have options for that: find ... -print0 | xargs -0 ...)
There are POSIX (but non-obvious) solutions to both problems. An alternate form of -exec ends with a plus-sign, not a semi-colon. This form collects the filenames into groups or sets, and runs the command once per set. (This is exactly what xargs does, to prevent argument lists from becoming too long for the system to handle.) In this form the {} argument expands to the set of filenames. For example:
find / -name core -exec /bin/rm -f '{}' +
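To see the batching for yourself, here is a harmless sketch that swaps rm for echo and counts output lines (scratch directory and file names are hypothetical):

```shell
dir=$(mktemp -d)
touch "$dir/a" "$dir/b" "$dir/c"

# With \; echo runs once per file, so it prints one line per file.
per_file=$(find "$dir" -type f -exec echo {} \; | wc -l | tr -d ' ')

# With + the names are batched, so echo runs once and prints a single line.
batched=$(find "$dir" -type f -exec echo {} + | wc -l | tr -d ' ')

echo "lines with \\;: $per_file, lines with +: $batched"
rm -rf "$dir"
```

With only three short names everything fits in one batch; on a huge tree the + form may still run the command more than once, just far fewer times than once per file.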
This form of -exec can be combined with a shell feature to solve the other problem (names with spaces). The POSIX shell allows us to use:
sh -c 'command-line' [ command-name [ args... ] ]
(We don’t usually care about the command-name, so X, dummy, or "inline cmd" is often used.) Here’s an example of efficiently copying found files to /tmp, in a POSIX-compliant way (posted on the comp.unix.shell netnews newsgroup on Oct. 28 2007 by Stephane CHAZELAS):
find . -name '*.txt' -type f \
-exec sh -c 'exec cp -f "$@" /tmp' find-copy {} +
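Here is a variation of that idiom I would try on scratch directories first. It is only a sketch: the directories and file names are made up, and passing the destination as the first argument (then shifting it away) is my own tweak so the target is not hard-coded.

```shell
src=$(mktemp -d)
dest=$(mktemp -d)
touch "$src/plain.txt" "$src/two words.txt" "$src/skip.log"

# find-copy fills $0 (used only in error messages). The destination is
# passed as $1 and shifted off, leaving "$@" as the batch of file names,
# each one a separate word, so names with spaces survive intact.
find "$src" -name '*.txt' -type f \
    -exec sh -c 'dest=$1; shift; exec cp -f "$@" "$dest"' find-copy "$dest" {} +

copied=$(ls "$dest" | wc -l | tr -d ' ')
echo "copied $copied files"
rm -rf "$src" "$dest"
```

The {} must come immediately before the +, which is why the destination goes in front of it rather than at the end.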
If the given expression to find does not contain any of the action primaries -exec, -ok, or -print, the given expression is effectively replaced by:
find \( expression \) -print
The implied parentheses can cause unexpected results. For example, consider these two similar commands:
$ find -name tmp -prune -o -name \*.txt
./bin/data/secret.txt
./tmp
./missingEOL.txt
./public_html/graphics/README.txt
./datafile2.txt
./datafile.txt
$ find -name tmp -prune -o -name \*.txt -print
./bin/data/secret.txt
./missingEOL.txt
./public_html/graphics/README.txt
./datafile2.txt
./datafile.txt
The lack of an action in the first command means it is equivalent to:
find . \( -name tmp -prune -o -name \*.txt \) -print
This causes tmp to be included in the output. However for the second find command the normal rules of Boolean operator precedence apply, so the pruned directory does not appear in the output.
The find command can be amazingly useful. See the man page to learn all the criteria and actions you can use.
See Also Stat
$ stat --help
Usage: stat [OPTION] FILE...
Display file or file system status.
-L, --dereference follow links
-f, --file-system display file system status instead of file status
-c --format=FORMAT use the specified FORMAT instead of the default;
output a newline after each use of FORMAT
--printf=FORMAT like --format, but interpret backslash escapes,
and do not output a mandatory trailing newline.
If you want a newline, include \n in FORMAT.
-t, --terse print the information in terse form
--append-exe append .exe if cygwin magic was needed
--help display this help and exit
--version output version information and exit
The valid format sequences for files (without --file-system):
%a Access rights in octal
%A Access rights in human readable form
%b Number of blocks allocated (see %B)
%B The size in bytes of each block reported by %b
%C SELinux security context string
%d Device number in decimal
%D Device number in hex
%f Raw mode in hex
%F File type
%g Group ID of owner
%G Group name of owner
%h Number of hard links
%i Inode number
%n File name
%N Quoted file name with dereference if symbolic link
%o I/O block size
%s Total size, in bytes
%t Major device type in hex
%T Minor device type in hex
%u User ID of owner
%U User name of owner
%x Time of last access
%X Time of last access as seconds since Epoch
%y Time of last modification
%Y Time of last modification as seconds since Epoch
%z Time of last change
%Z Time of last change as seconds since Epoch
Valid format sequences for file systems:
%a Free blocks available to non-superuser
%b Total data blocks in file system
%c Total file nodes in file system
%d Free file nodes in file system
%f Free blocks in file system
%C SELinux security context string
%i File System ID in hex
%l Maximum length of filenames
%n File name
%s Block size (for faster transfers)
%S Fundamental block size (for block counts)
%t Type in hex
%T Type in human readable form
This might be handy with xargs from find. Here’s an example from cygwin that outputs a formatted display of filename, size in bytes, and date time when the file was last modified.
$ stat -c "file: %N | bytes: %s | modtime: %y" *.ini
file: `XrxWm.ini' | bytes: 1064 | modtime: 2009-09-14 12:15:19.531250000 -0400
file: `ntuser.ini' | bytes: 178 | modtime: 2009-12-29 09:19:04.843750000 -0500
This little command works nicely on Windows (to cope with the spaces in file names).
$ find . -name '*.enl' -type f -print0 | xargs -0 stat -c "file: %N | bytes: %s | modtime: %y" >> find-output.txt
To produce
file: `./Sample library.enl' | bytes: 13918 | modtime: 2009-12-18 15:02:41.671875000 -0500