Let us see how to use the find command to gather useful information about users and their files.

Find Syntax

find <path to search in> <options> <expression to search for>

Finding files by name

The command below searches the /etc directory for any file whose name contains the word "fstab". The pattern is quoted so that the shell passes it to find unchanged instead of expanding it first.

find /etc -name "*fstab*"
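Quoting the pattern matters because the shell expands an unquoted glob before find ever sees it. The following is a minimal sketch of the difference, run in a scratch directory. The file names are made up for the illustration.

```shell
# Scratch directory with two hypothetical files whose names contain "fstab".
demo=$(mktemp -d)
mkdir "$demo/sub"
touch "$demo/fstab.bak" "$demo/sub/fstab"
cd "$demo" || exit 1

# Quoted: find receives the literal pattern and matches at every depth.
quoted=$(find . -name "*fstab*" | sort)

# Unquoted: the shell first expands *fstab* against the current directory,
# so find is really asked for -name fstab.bak and misses sub/fstab.
unquoted=$(find . -name *fstab* | sort)

printf 'quoted:\n%s\nunquoted:\n%s\n' "$quoted" "$unquoted"
```

If the current directory held several matching names, the unquoted form would not even run, because find would receive extra arguments it cannot parse.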

Finding all large directories

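To find all directories larger than 50k (kilobytes): "-type d" restricts the search to directories, and "-size +50k" specifies that you are looking for something bigger than 50k. Note that for a directory, -size tests the directory entry itself, which grows with the number of names it holds, not the combined size of the files below it. The command therefore finds directories that contain a very large number of entries.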

# find / -type d -size +50k

Output:

/var/lib/dpkg/info
/var/log/ksymoops
/usr/share/doc/HOWTO/en-html
/usr/share/man/man3
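If what you want is the total disk space used by everything below a directory, du is the usual tool rather than find -size. This is a minimal sketch under that assumption. The directory names big and small are hypothetical and live in a scratch area.

```shell
demo=$(mktemp -d)
mkdir "$demo/big" "$demo/small"
# 200 KiB of data in big/, 1 KiB in small/
dd if=/dev/zero of="$demo/big/data" bs=1024 count=200 2>/dev/null
dd if=/dev/zero of="$demo/small/data" bs=1024 count=1 2>/dev/null

# du -sk prints the total usage in kilobytes for each directory;
# sort -rn puts the largest first and cut keeps only the path column.
largest=$(du -sk "$demo"/*/ | sort -rn | head -n 1 | cut -f 2)
echo "largest directory: $largest"
```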

Finding all large files on a Linux / UNIX system

"-type f" searches for regular files. "-size +20000k" specifies that you are looking for something bigger than 20000k.

# find / -type f -size +20000k

Output:

/var/log/kern.log
/sys/devices/pci0000:00/0000:00:02.0/resource0
/sys/devices/pci0000:00/0000:00:00.0/resource0
/opt/03Jun05/firefox-1.0.4-source.tar.bz2

However, my favorite hack on the above command is as follows:

# find / -type f -size +20000k -exec ls -lh {} \; | awk '{ print $NF ": " $5 }'

/var/log/kern.log: 22M
/sys/devices/pci0000:00/0000:00:02.0/resource0: 128M
/sys/devices/pci0000:00/0000:00:00.0/resource0: 256M
/opt/03Jun05/firefox-1.0.4-source.tar.bz2: 32M

The above command finds all files larger than 20000k and prints each filename followed by its size, which is more informative than the plain find output. awk's $NF is the last field of each ls line, which is the filename as long as the name contains no spaces. A fixed field number such as $8 or $9 works only for one particular ls date format.
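Picking the filename out of ls output with awk breaks on names that contain spaces. The sketch below is one alternative: find hands the names straight to a small inline shell script, which prints each name with its size in bytes. The 100k threshold and the file names are chosen only to keep the demonstration small.

```shell
demo=$(mktemp -d)
dd if=/dev/zero of="$demo/big file.bin" bs=1024 count=300 2>/dev/null
dd if=/dev/zero of="$demo/small.bin" bs=1024 count=10 2>/dev/null

# For each match, print "name: size bytes". The names arrive as arguments
# of the inline script, so spaces inside them are harmless.
report=$(find "$demo" -type f -size +100k -exec sh -c '
    for f in "$@"; do
        printf "%s: %s bytes\n" "$f" "$(wc -c < "$f" | tr -d " ")"
    done' sh {} +)
echo "$report"
```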

How to find files owned by a user

If you know the username, this is the command you might use to locate all the files which belong to it:

tuxtrain$ find /home -user bob

In this case, we're looking for all the files and directories owned by bob under /home.

Find files owned by a Unix group

If you're doing a broader search, you may be interested in identifying all the files owned by a Unix group. Here's how you can do it:

tuxtrain$ find /usr -group staff

In this example, we’re using find to locate all the files in /usr owned by the staff group.

Locate files by UID or GID

If you’re more comfortable dealing with Unix user IDs (UIDs) and group IDs (GIDs), you can use them with the find command as well. In this example, I’m looking for the temporary files that I created (my UID on that system is 1000):

tuxtrain$ find /var/tmp -uid 1000

You can alternatively use -gid if you know the GID to look for.
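If you do not know your numeric UID offhand, id -u prints it. The sketch below uses it with -uid in a scratch directory. The file names are hypothetical, and the ! operator negates the test that follows it.

```shell
demo=$(mktemp -d)
touch "$demo/a.tmp" "$demo/b.tmp"

# id -u prints the numeric UID of the current user.
me=$(id -u)

# Files in the scratch directory owned by that UID ...
mine=$(find "$demo" -type f -uid "$me" | wc -l)
# ... and files owned by anyone else.
others=$(find "$demo" -type f ! -uid "$me" | wc -l)

echo "owned by UID $me: $mine, owned by others: $others"
```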

Understand time options

find can select files based on the Unix mtime, ctime, and atime attributes. The standard unit is a 24-hour period (a day). GNU find also accepts minutes (-amin, -cmin, -mmin). For example, to find files modified less than 10 minutes ago:

find / -mmin -10

The standard predicates that test the age of a file are -atime, -ctime and -mtime, each followed by [+|-]n, and they count in 24-hour periods (days). Each compares the given value with one of the file's Unix timestamps: the access time, the change time (more exactly, the inode change time), and the content modification time.

atime is the simplest and least controversial timestamp: it stands for access time, the time the file was last read.

ctime is the inode change time. The inode changes when you update the file, of course, but also when you change its ownership or access permissions without touching its contents. It would be better to call this attribute "change time", since it records the last time the file's metadata (inode) was changed. As the man page for stat explains: “The field st_ctime is changed by writing or by setting inode information (i.e., owner, group, link count, mode, etc.).”

mtime is the content modification time: if you change the contents of the file, this timestamp is updated. Changes of name, ownership and permissions do not affect it.

n is the time interval: an integer with an optional sign. It is measured in 24-hour periods (days), or in minutes for the -mmin family, counted back from the current moment.

* n: with no sign, the file's age, rounded down to whole 24-hour periods (days), must be exactly n. 0 means less than 24 hours ago.
* +n: with a plus sign, it means more than n 24-hour periods (days) ago, that is, older than n. Because the age is rounded down first, +1 matches only files that are at least two days old.
* -n: with a minus sign, it means less than n 24-hour periods (days) ago, that is, younger than n. It follows that -1 and 0 are the same, and both mean "within the last 24 hours".
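The rounding is easiest to see with files of known age. The sketch below assumes GNU touch, whose -d option accepts relative dates such as '36 hours ago'. The file names are made up for the illustration.

```shell
demo=$(mktemp -d)
# touch -d with a relative date is a GNU extension (assumed here).
touch -d '12 hours ago' "$demo/half-day"
touch -d '36 hours ago' "$demo/day-and-a-half"
touch -d '3 days ago'   "$demo/three-days"

cd "$demo" || exit 1
newer=$(find . -type f -mtime -1)   # age rounds down to 0 days
exact=$(find . -type f -mtime 1)    # age rounds down to exactly 1 day
older=$(find . -type f -mtime +1)   # age rounds down to 2 days or more

echo "-mtime -1: $newer"
echo "-mtime  1: $exact"
echo "-mtime +1: $older"
```

Each file lands in exactly one of the three groups: the 36-hour-old file is matched by 1 but not by +1, because its age counts as one whole day.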

Below are some examples using mtime to show you how the time intervals work:

Find everything in your home directory modified in the last 24 hours:

find $HOME -mtime -1

Find everything in your home directory modified in the last seven 24-hour periods (days):

find $HOME -mtime -7

Find everything in your home directory that has NOT been modified in the last year:

find $HOME -mtime +365

To find html files that have been modified in the last seven 24-hour periods (days), I can use -mtime with the argument -7 (include the hyphen):

find . -mtime -7 -name "*.html" -print

Note: If you use the number 7 (without a hyphen), find will match only html files that were modified between seven and eight 24-hour periods (days) ago:

find . -mtime 7 -name "*.html" -print

To find those html files that have not been touched for more than seven whole 24-hour periods (days), that is, for eight days or longer, use +7:

find . -mtime +7 -name "*.html" -print

Executing commands with find

find can perform various actions on the files or directories it finds.

The -exec action executes the specified command. It is best suited to relatively simple commands. For more complex tasks, post-processing the output of find is the safer option, because you have additional context for each decision.

find can execute one or more commands for each file it has found with the -exec option. Unfortunately, one cannot simply enter the command. You need to remember two syntactic tricks:

1. The command that you want to execute needs to contain the special macro argument {}, which is replaced by the matched filename on each invocation of the -exec predicate.
2. You need to specify \; (or ';') at the end of the command. (If the \ is left out, the shell interprets the ; as the end of the find command.)

If the {} macro parameter is the last item in the command, there must be a space between the {} and the \;. For example:

find . -type d -exec ls -ld {} \;
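The two ways of protecting the semicolon from the shell are interchangeable. The sketch below runs the same command both ways in a scratch directory and compares the results. The directory names are hypothetical.

```shell
demo=$(mktemp -d)
mkdir "$demo/one" "$demo/two"

# Same command, once with a backslash and once with quotes around the ;
with_backslash=$(find "$demo" -type d -exec basename {} \; | sort)
with_quotes=$(find "$demo" -type d -exec basename {} ';' | sort)

# Both lists contain the scratch directory itself plus "one" and "two".
echo "$with_backslash"
```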

Here are several “global” chmod tricks based on find -exec capabilities:

find . -type f -exec chmod 500 {} ';'

This command will search the current directory and all subdirectories and change the permissions of each regular file to 500 (read and execute for the owner only).

find . -name "*rc.conf" -exec chmod o+r '{}' \;

find . -name "*rc.conf" -exec chmod o+r '{}' ';'

These two equivalent commands search the current directory and all subdirectories. Every file whose name matches *rc.conf is processed by the chmod o+r command. The argument '{}' is a macro that expands to each found file. The \; (or ';') argument marks the end of the -exec command.

The end result is that all *rc.conf files have read access granted to others (provided the operator is the owner of the files).
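The effect can be checked in a scratch directory before running such a command on real files. In the sketch below the file names are hypothetical, and -perm -004 matches files whose read bit for others is set.

```shell
demo=$(mktemp -d)
touch "$demo/app-rc.conf" "$demo/notes.txt"
chmod 600 "$demo/app-rc.conf" "$demo/notes.txt"

# Grant read permission to "others" on the matching files only.
find "$demo" -name "*rc.conf" -exec chmod o+r '{}' \;

# List the files that are now readable by others.
readable=$(find "$demo" -type f -perm -004 -exec basename {} \;)
echo "readable by others now: $readable"
```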

The find command is commonly used to remove core files that are more than a few 24-hour periods (days) old. These core files are copies of the actual memory image of a running program when the program dies unexpectedly. They can be huge, so occasionally trimming them is wise:

find . -name core -ctime +4 -exec /bin/rm -f {} \;
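Before letting find delete anything, it is worth running the same tests with -print first. The sketch below shows that two-step habit in a scratch directory. It uses -mtime instead of -ctime only because touch can set a file's mtime for the demonstration but not its ctime, and it assumes GNU touch for the relative date.

```shell
demo=$(mktemp -d)
mkdir "$demo/proj"
touch "$demo/proj/core"                 # fresh file, should survive
touch -d '10 days ago' "$demo/core"     # old file (GNU touch assumed)

# Step 1: dry run, only list what would be removed.
find "$demo" -type f -name core -mtime +4 -print

# Step 2: the same tests, now with the removal.
find "$demo" -type f -name core -mtime +4 -exec rm -f {} \;

remaining=$(find "$demo" -type f -name core)
echo "left in place: $remaining"
```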

Feeding find output to pipes with xargs

One of the biggest limitations of the -exec option terminated with \; (or predicate with a side effect, to be more correct) is that it runs the specified command on only one file at a time. The xargs command solves this problem by enabling users to run a single command on many files at one time. In general, it is much faster to run one command on many files, because this cuts down on the number of invocations of the particular command or utility.

For example, one often needs to find files containing a specific pattern in multiple directories. One can use the -exec option of find for this (note the -l flag, which makes grep print the names of the matching files):

find . -type f -exec grep -li '/bin/ksh' {} \;

But there is a more elegant and more Unix-like way of accomplishing the same task, using xargs and pipes. You can use xargs to read the output of find and build a pipeline that invokes grep. This way, grep is called only a handful of times even though it might check through 200 or 300 files. By default, xargs appends the list of filenames to the end of the specified command, so using it is as easy as can be:

find . -type f -print | xargs grep -li '/bin/ksh'

This gives the same output, but a lot faster. When grep is given several filenames, it prefixes each matching line with the name of the file it came from, so you can drop -l if you want to see the matching lines as well as the filenames:

find . -type f -print | xargs grep -i '/bin/ksh'

When used in combination, find, grep, and xargs are a potent team for finding files lost or misplaced anywhere in the UNIX file system. I encourage you to experiment further with these important commands to find ways they can help you work with UNIX. You can use time to measure the difference in speed between the -exec option and xargs in the following way:

time find /usr/src -name "*.html" -exec grep -l foo '{}' ';' | wc -l

time find /usr/src -name "*.html" | xargs grep -l foo | wc -l

xargs works considerably faster. The difference becomes even greater when more complex commands are run and the list of files is longer.
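The speed difference comes from the number of times the command is started. The sketch below counts invocations in a scratch directory with five hypothetical files. It also includes the -exec ... {} + form, which, like xargs, passes many filenames to one invocation.

```shell
demo=$(mktemp -d)
touch "$demo/f1" "$demo/f2" "$demo/f3" "$demo/f4" "$demo/f5"

# Each run of the inline script prints one line, so wc -l counts invocations.
per_file=$(find "$demo" -type f -exec sh -c 'echo run' sh {} \; | wc -l)
batched=$(find "$demo" -type f -exec sh -c 'echo run' sh {} + | wc -l)
via_xargs=$(find "$demo" -type f -print | xargs sh -c 'echo run' sh | wc -l)

printf '%s\n' "per file: $per_file, batched with +: $batched, via xargs: $via_xargs"
```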

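xargs is just as convenient for removing the files that find locates:

find /mnt/zip -name "*prefs copy" -print | xargs rm

This is actually dangerous if a filename contains spaces, because xargs splits its input on whitespace and rm receives the pieces as separate names. If you use -print0 in find together with the -0 option of xargs, the names are separated by NUL characters instead, and you avoid this danger:

find /mnt/zip -name "*prefs copy" -print0 | xargs -0 rm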
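Note that -print0 only helps when xargs is told to expect NUL-separated input with its -0 option. The sketch below shows the difference on two hypothetical files in a scratch directory. Both options are assumed to be available, as they are in the GNU and BSD versions of the tools.

```shell
demo=$(mktemp -d)
touch "$demo/old prefs copy" "$demo/new prefs copy"

# Newline-separated output: xargs splits on every blank, so the two
# filenames turn into six separate arguments.
split_args=$(find "$demo" -name "*prefs copy" -print | xargs -n 1 echo | wc -l)

# NUL-separated output read with -0: each filename stays in one piece.
find "$demo" -name "*prefs copy" -print0 | xargs -0 rm

left=$(find "$demo" -type f | wc -l)
echo "arguments seen without -0: $split_args, files left after rm: $left"
```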

Two other useful options for xargs are the -p option, which makes xargs interactive, and the -n args option, which makes xargs run the specified command with at most args arguments at a time.

Some people wonder why there is a -p option. xargs feeds the filenames from its standard input to the specified command, and that standard input is the pipe rather than the terminal, so interactive commands such as cp -i, mv -i, and rm -i don’t work right.

The -p option solves that problem: xargs shows each command line it is about to run and asks for confirmation. In the preceding example, the -p option would have made the command safe, because I could answer yes or no before anything was removed. Thus, the command I typed was the following:

find /mnt/zip -name "*prefs copy" -print0 | xargs -0 -p rm
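The -n option mentioned above is easy to try without touching any files. In this small sketch, five arguments are handed to echo at most two at a time, so echo runs three times.

```shell
# Five arguments, at most two per echo invocation: three output lines.
grouped=$(printf '%s\n' a b c d e | xargs -n 2 echo)
echo "$grouped"
```

Combining -n 1 with -p makes xargs ask for confirmation once per file.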

Many users frequently ask why xargs should be used when shell command substitution achieves the same results. Take a look at this example:

grep -l foo `find /usr/src/linux -name "*.html"`

