Unix & Linux: How to efficiently use find command ? please lend me a hand?
[-2] [2] andrew_ysk
[2021-03-06 20:31:45]
[ linux find ]
[ https://unix.stackexchange.com/questions/637927/how-to-efficiently-use-find-command-please-lend-me-a-hand ]
find ~ -name delete_me -type f -print0 | xargs -0 /bin/rm -f

Q1. What does -print0 mean, and why is it required?

Q2. Why /bin/rm? Why not just rm -f?

This command only deletes matching files directly in the ~ directory; it won't delete matching files in subdirectories or sub-subdirectories.

Q3. How can I modify it so that it finds and deletes all matching files in subdirectories as well?

Q4. How can I modify it so that it does not descend into a certain directory?

Thanks

(3) is this a school question? - jsotola
(2) All Unix commands have man pages that explain the meaning of a command and its options; have a look at linux.die.net/man/1/find and then come back with additional questions if needed. - Fredrik
I just came from man find. I got this example from man find, but I can't understand it, because the man page does not explain it. - andrew_ysk
[+2] [2021-03-06 20:59:36] Kusalananda
  1. The non-standard -print0 predicate means "print the pathname of the found file with a terminating nul character at the end instead of a newline character". This is to be able to handle filenames that contain newline characters (which is permitted on Unix systems; nul characters can however not be part of a Unix filename). The command reading the output of the find command that uses -print0 needs to be aware that the entries are nul-terminated. It's the -0 option to xargs that lets it read nul-terminated entries.

    It is not required. Neither of

    find "$HOME" -name delete_me -type f -delete
    

    nor the more standard

    find "$HOME" -name delete_me -type f -exec rm {} +
    

    uses -print0. Both commands do the same thing as your command, and should be preferred to using xargs. The xargs utility could possibly be useful when reading arguments from a file or some other command. When using find, it is safer to use its built-in -exec facility. There are a few things that some implementations of xargs can do that -exec can't, like starting concurrent batches of utilities. This is however not used in this exercise.

    It is also preferable to use "$HOME" in place of ~ in shell scripts, as $HOME behaves like a variable, and ~ does not.

    Also note that the -f option to rm probably isn't needed. It is used to avoid having the rm utility fail (with a diagnostic message and a non-zero exit status) if the given pathnames are missing, or if rm is called with no pathnames to remove. It also overrides any earlier -i (interactive mode) options. But there is no -i option on the rm command line here, and the given pathnames are guaranteed to exist as find is being used to find them (barring race conditions where the files are actually deleted during the execution of find, by some other process, between being found by find and being deleted by the rm executed from find).

  2. There is no reason to use rm with its full path, unless the PATH variable's value cannot be guaranteed to contain a sane set of utility paths. It is unusual to have a broken PATH, and scripts that use absolute paths for utilities are prone to break when they are moved to other systems.

  3. It does already recurse down into all subdirectories of your home directory.

  4. Prune the directory that you want to avoid from the search tree. For example, to avoid all directories called avoid:

    find "$HOME" -name avoid -prune -o -name delete_me -type f -exec rm {} +
    

    Note that we can't use -delete here as this implies -depth, which makes find process a directory's contents before the directory itself. With that traversal order, we'd be visiting all the subdirectories and files inside avoid before noticing that the avoid directory should be pruned from the search tree.

    Another example: Don't enter into the specific directory $HOME/dir/save_me:

    find "$HOME" -path "$HOME/dir/save_me" -prune -o -name delete_me -type f -exec rm {} +
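The -path variant above can be tried safely in a scratch directory before pointing it at "$HOME". This is only a minimal sketch: the directory layout is made up for the demonstration, and mktemp -d is assumed to be available (it is on most systems, but it is not part of POSIX).

```shell
# Scratch tree so that nothing real is touched (the layout is hypothetical).
tmp=$(mktemp -d)
mkdir -p "$tmp/dir/save_me"
touch "$tmp/delete_me" "$tmp/dir/delete_me" "$tmp/dir/save_me/delete_me"

# Same shape as the command above, with the scratch directory instead of "$HOME".
find "$tmp" -path "$tmp/dir/save_me" -prune -o -name delete_me -type f -exec rm {} +

# List the survivors, relative to the scratch directory.
survivors=$(cd "$tmp" && find . -name delete_me -type f | sort)
printf '%s\n' "$survivors"    # prints: ./dir/save_me/delete_me

rm -rf "$tmp"
```

Only the file inside the pruned directory is left; the other two are removed.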
    

Thank you for the great explanation. I need to try it out some more, because when I tried previously it did not recurse. - andrew_ysk
BTW, what does -o do? Is it equivalent to the "or" operator? - andrew_ysk
@andrew_ysk Yes, -o means "or". When there's nothing between two tests/predicates, it means "and". Tests are evaluated left-to-right. The command that you show will most definitely recurse, or you are running some other command that you do not show (possibly just subtly different). - Kusalananda
By the way, this find command drives me crazy! Is there any reason for me to still use it now that I've realized how easy the fdfind command is? - andrew_ysk
$ fd find1 find2 ./folder Apparently this fd command does not work the way I'd like. How can I make it find both find1 and find2, or find either find1 or find2? - andrew_ysk
To exclude find1 or find2, I can use !find1 - andrew_ysk
@andrew_ysk You question here is about find. If you want to ask about fdfind, ask a new question. - Kusalananda
@andrew_ysk find examines one object at a time, beginning at the starting point. -name or -path tests the pattern against that particular object only, not against the list in its entirety. You cannot match two or more distinct file objects in the same iteration. E.g., if you have a path './1/a/b', the first object is './1', the second './1/a', and the last './1/a/b'. This is why find -name 'a' -name 'b' will never work. You can, of course, combine tests: find -path './1/*' -name 'b' will match the third. - user380915
@andrew_ysk Debian (only ref for fdfind that I can see) may be in some disarray. At manpages.debian.org, man fdfind resolves to fd (in the unstable group), and then fd resolves to fdsh. Care is advised. - Paul_Pedant
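The points made in the comments above about -o ("or"), the implicit "and", and why two -name tests in a row never match can be checked with plain find in a scratch directory. A minimal sketch; the file names find1, find2 and other are placeholders, and mktemp -d is assumed to be available.

```shell
tmp=$(mktemp -d)
touch "$tmp/find1" "$tmp/find2" "$tmp/other"

# "Either name": group the alternatives with \( \), then apply the action.
either=$(cd "$tmp" && find . -type f \( -name find1 -o -name find2 \) -print | sort)

# Without the grouping, "and" binds tighter than -o, so this reads as
#   (-type f -name find1)  -o  (-name find2 -print)
# and only find2 is printed.
ungrouped=$(cd "$tmp" && find . -type f -name find1 -o -name find2 -print | sort)

# "Both names": a single file cannot be called find1 and find2 at once,
# so nothing matches.
both=$(cd "$tmp" && find . -name find1 -name find2 -print)

printf 'either:\n%s\n' "$either"
printf 'ungrouped:\n%s\n' "$ungrouped"
printf 'both: [%s]\n' "$both"

rm -rf "$tmp"
```

The grouped form lists both files, the ungrouped form lists only ./find2, and the last one prints an empty pair of brackets.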
[+1] [2021-03-06 21:04:44] Philip Couling [ACCEPTED]

Q1. What does -print0 mean, and why is it required?

A good place to start is by looking in the manual. At the terminal you can type man find and get a good description. Typically you can also just google for "man find" and get the same or similar manual pages on the web.

The manual for find says about -print0:

print the full file name on the standard output, followed by a null character (instead of the newline character that -print uses). This allows file names that contain newlines or other types of white space to be correctly interpreted by programs that process the find output.

So this writes every file name followed by a null character, and xargs with -0 will read them, expecting each item to end with a null character. This allows file names to contain a newline character.
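To see the difference between the two terminators, here is a small sketch that runs in a throwaway directory. The directory names are invented for the demonstration, one of them deliberately contains a newline, and mktemp -d is assumed to be available.

```shell
tmp=$(mktemp -d)
nl='
'
# One ordinary directory and one whose name contains a newline.
mkdir "$tmp/plain" "$tmp/odd${nl}name"
touch "$tmp/plain/delete_me" "$tmp/odd${nl}name/delete_me"

# Newline-terminated output: the odd pathname is spread over two lines,
# so two files look like three entries.
lines=$(( $(find "$tmp" -name delete_me -type f -print | wc -l) ))

# Null-terminated output: exactly one null character per file.
nuls=$(( $(find "$tmp" -name delete_me -type f -print0 | tr -cd '\0' | wc -c) ))

echo "lines=$lines nuls=$nuls"    # prints: lines=3 nuls=2

# xargs -0 splits on null characters only, so both files are removed.
find "$tmp" -name delete_me -type f -print0 | xargs -0 rm -f
left=$(( $(find "$tmp" -name delete_me -type f -print0 | tr -cd '\0' | wc -c) ))
echo "left=$left"                 # prints: left=0

rm -rf "$tmp"
```

A reader that splits on newlines would see three bogus pathnames here, while one that splits on null characters sees the two real ones.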

Q2. Why /bin/rm? Why not just rm -f?

It's not obvious why this was done from your question.

It looks like there's really no good reason for this. One reason for doing this sometimes is to let you run a command when the PATH [1] environment variable is not set correctly. This can be common when running commands in cron.

However, in this case it isn't helping anything, since the other commands (find and xargs) are not fully qualified. To be consistent, it would have to be something like this (the exact paths vary by system):

 /usr/bin/find ~ -name delete_me -type f -print0 | /usr/bin/xargs -0 /bin/rm -f

The only other (highly unlikely) reason would be that your system somehow has two different rm commands and you need to specify which one. But that really is very unlikely.

Q3. How can I modify it so that it finds and deletes all matching files in subdirectories as well?

It already does this.
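A quick way to convince yourself is to build a small nested tree and count what find reports. This is a sketch in a scratch directory with made-up names; mktemp -d is assumed to be available, and -maxdepth is a non-standard but widely supported option (GNU and BSD find both have it), used here only for contrast.

```shell
tmp=$(mktemp -d)
mkdir -p "$tmp/sub/subsub"
touch "$tmp/delete_me" "$tmp/sub/delete_me" "$tmp/sub/subsub/delete_me"

# Default behaviour: find descends through every level below the starting point.
found=$(cd "$tmp" && find . -name delete_me -type f | sort)
count=$(( $(printf '%s\n' "$found" | wc -l) ))
printf '%s\n' "$found"
echo "count=$count"    # prints: count=3

# Staying at the top level is what takes extra work.
top=$(cd "$tmp" && find . -maxdepth 1 -name delete_me -type f)
echo "top=$top"        # prints: top=./delete_me

rm -rf "$tmp"
```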

Q4. How can I modify it so that it does not descend into a certain directory?

You can use the -prune option to skip that directory (and its children).

# Just editing your version...
find "$HOME" -name exclude_this_dir -prune -o -name delete_me -type f -print0 | xargs -0 /bin/rm -f

# More succinctly
find "$HOME" -name exclude_this_dir -prune -o -name delete_me -type f -exec rm -f {} +
[1] https://en.wikipedia.org/wiki/PATH_(variable)

That man find is very hard to understand. After the explanation from @Kusalananda, I can kind of make sense of the statement in man find about the newline character and the null character. I believe \n is the newline character, but I still don't know what the null character is. Is there a way to write it out so it is easier to understand? - andrew_ysk
Does that mean the find command by default lists each found filename and ends it with \n, but it is not displayed on screen? - andrew_ysk
In Q2 you said there is no good reason... I got that example from man find. Take a look below. - andrew_ysk
Safer find -print0 | xargs -0 approach • Find files named core in or below the directory /tmp and delete them, processing filenames in such a way that file or directory names containing single or double quotes, spaces or newlines are correctly handled. $ find /tmp -name core -type f -print0 | xargs -0 /bin/rm -f . I copied this from man find. It is supposed to be the safer way, safer than the -delete method. - andrew_ysk
Regarding your Q4 answer: can I specify multiple locations in -prune to skip them? How? Thx - andrew_ysk
@andrew_ysk All characters are represented as numbers: a is 97, b is 98, ... newline is number 10. There is a special character called the "null" character which is 0. It has special meaning and usually cannot be included inside other text. This makes it a safe thing to use to divide up different bits of text, because it will only be the divider, never inside the text itself. In this case, there's an assumption that file names can contain a newline (10) character but cannot contain a null character (0). - Philip Couling
@andrew_ysk yes, by default find lists each file on its own line. More accurately find writes the number 10 after every file name and 10 is read as a new line. I can't say why the author of find's manual wrote it the way they did. I only know it's not necessary. I've explained the reasons why you might do something like that in the answer. But try it yourself, it works just fine without /bin. - Philip Couling
@andrew_ysk the syntax I've shown says "match exclude_this_dir: prune, or match delete_me: print0". If you want to add more, you can add more "or" clauses with -o. So just add more -name directory_name -prune -o to the front of the arguments. - Philip Couling
@andrew_ysk man ascii will list the whole character set with char, octal, decimal and hex values, and the common (shell and C and other languages) escape sequences for them. E.g 012 10 0A LF '\n' (new line). - Paul_Pedant
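The suggestion in the comments above, adding one more -name directory_name -prune -o clause per directory to skip, might look like this. It is a sketch in a scratch directory; skip_a, skip_b and keep are hypothetical names, and mktemp -d is assumed to be available.

```shell
tmp=$(mktemp -d)
mkdir "$tmp/skip_a" "$tmp/skip_b" "$tmp/keep"
touch "$tmp/skip_a/delete_me" "$tmp/skip_b/delete_me" "$tmp/keep/delete_me"

# One "-name ... -prune -o" clause per directory to skip, then the real test.
find "$tmp" -name skip_a -prune -o -name skip_b -prune -o \
     -name delete_me -type f -exec rm -f {} +

# Only the files inside the two pruned directories are left.
remaining=$(cd "$tmp" && find . -name delete_me -type f | sort)
printf '%s\n' "$remaining"

rm -rf "$tmp"
```

The two prune clauses can equally be written as a single group, \( -name skip_a -o -name skip_b \) -prune -o ..., which is easier to read once the list grows.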