
bash script for duplicate files


All times are UTC - 6 hours


Posted: Tue Feb 13, 2007 1:30 pm

Joined: Tue Feb 13, 2007 1:24 pm
Posts: 2
Location: las vegas, nv
Hi all,

I need a nifty bash script for finding and removing duplicate mp3, ogg, and wma files in a directory each time I connect my iPod.

Thanks in advance for your help!

tsairox


Posted: Tue Feb 13, 2007 9:41 pm
Site Admin

Joined: Sun May 15, 2005 9:36 pm
Posts: 673
Location: Des Moines, Iowa
tsairox wrote:
Hi all,

I need a nifty bash script for finding and removing duplicate mp3, ogg, and wma files in a directory each time I connect my iPod.

Thanks in advance for your help!

tsairox


Ummm, huh?

You can't HAVE duplicate files in the same directory anyway, at least not with the same filename. Or are we comparing two directories? Or are we comparing filename.mp3 to see if we have filename.ogg? I think you'll need to be a bit more specific.


Posted: Wed Feb 14, 2007 6:14 am

Joined: Tue Feb 13, 2007 1:24 pm
Posts: 2
Location: las vegas, nv
What I mean is that when I connect my iPod, my IAudio, or another external hard drive, I can see several directories. In some directories (or folders, if that helps), I have mistakenly placed more than one copy of the same song by the same artist. This has happened quite a bit. So instead of manually deleting each dupe, I need a script that will do it for me. I even thought of maybe using wget; I know it has an option that lets you fetch new mp3s from blogs without downloading the ones you already have.

Say that one of your directories is named Music. You go into that directory, and there is another directory named "Pop." Then say you have three songs by Prince called PurpleRain.mp3 that are exactly the same (which I do). Is there a bash script that will look through all directories and remove the dupes? There surely must be a bash script that removes dupe files by comparing the name, size, type, etc.

thanks again,

tsairox


Posted: Wed Feb 14, 2007 9:05 am
Site Admin

Joined: Tue May 17, 2005 7:31 pm
Posts: 251
Location: Georgia
So you mean remove duplicate files, not necessarily in the same directory, right? Because, as crouse said, you can't have two files with the same name in the same directory. If you do, duplicate songs are the least of your worries :wink:


Posted: Wed Feb 14, 2007 1:23 pm
Site Admin

Joined: Sun May 15, 2005 9:36 pm
Posts: 673
Location: Des Moines, Iowa
So, you would need to compare two different directories and remove duplicates from one of them, correct? (I think I'm getting the big picture now ;) )

Are the directories the same all the time? If so, could you list the full paths to the ones you want to compare, and tell us from WHICH one you want the duplicates removed and WHICH one should stay the same?
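
The two-directory comparison asked about above could be sketched roughly as follows. This is only an illustration: `list_dupes_in_second` is a made-up helper name, the directories are whatever you pass in, and it assumes GNU `md5sum` plus file names without embedded newlines. It only lists the duplicates found in the second tree, so nothing is deleted until you decide to.

```shell
#!/bin/bash
# Sketch: list files under PRUNE_DIR whose content also exists under KEEP_DIR.
# list_dupes_in_second is a hypothetical helper name, not from the thread.
list_dupes_in_second() {
    local keep_dir=$1 prune_dir=$2 sums sum path
    sums=$(mktemp) || return 1

    # Collect the checksums of everything in the tree that stays untouched.
    find "$keep_dir" -type f -exec md5sum {} + | cut -c1-32 | sort -u > "$sums"

    # Print each file in the second tree whose checksum is already known.
    find "$prune_dir" -type f -exec md5sum {} + |
    while read -r sum path
    do
        if grep -qx "$sum" "$sums"; then
            printf '%s\n' "$path"
        fi
    done

    rm -f "$sums"
}
```

Called as `list_dupes_in_second ~/Music /media/iPOD` (both paths are placeholders), it prints one duplicate path per line, which you can review before removing anything.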


Posted: Mon Dec 10, 2007 9:12 pm

Joined: Sun Apr 22, 2007 5:16 am
Posts: 7
I would be interested in a script like this.
Rather than deleting the duplicate entries, have it write the result, with duplicates omitted, to a file using >filename.txt.
My interest would be in finding duplicate bookmarks from several browsers.
An example would be:
in /mnt/hdb7/backups/bookmarks I may have entries like:
firefox9272007.html, firefox10272007.html, opera10272007.html, opera11272007.html.

A script that would compare the addresses from all the files (the Opera and Firefox HTML backup files), which are located in the same directory (bookmarks), for duplicate addresses (URLs),
then
create a text file (>bookmarks.txt).
Or use >bookmarks.html? That would be better yet, so it could be imported back into a browser.

I know Midnight Commander has a feature to find duplicate entries, but a bash script would be better. Makes one look and feel cool. Ha! :D
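
A rough sketch of the bookmark idea, assuming the backup files store each bookmark as an `href="..."` attribute (the usual shape of browser HTML exports, but check your own files). `unique_bookmarks` is a hypothetical helper name, and it produces a plain list of addresses rather than an importable bookmarks.html.

```shell
#!/bin/bash
# Sketch: print every distinct bookmark address found in the given HTML files.
# unique_bookmarks is a hypothetical helper name, not from the thread.
unique_bookmarks() {
    # -o prints only the matching part, -h drops file names, -i ignores case.
    grep -ohiE 'href="[^"]+"' "$@" |
    # Strip the leading href=" and the trailing quote, leaving the address.
    sed -E 's/^[^"]*"//; s/"$//' |
    # sort -u keeps a single copy of each address.
    sort -u
}
```

For the directory from the post, that would be something like `unique_bookmarks /mnt/hdb7/backups/bookmarks/*.html > bookmarks.txt`.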


Posted: Sun Apr 20, 2008 2:21 am

Joined: Sun Apr 13, 2008 4:05 am
Posts: 37
Location: /dev/random
Code:
#!/bin/bash
# dupfind.sh
# author: myownshadow
# Lists the files under a directory whose contents share an MD5 checksum.
# TODO: need a faster checksum, md5sum can be very slow with big files

dir=${1:-.}
md5list=$(mktemp) || exit 1
trap 'rm -f "$md5list"' EXIT

# One "checksum  path" line per file; -print0/-0 keeps names with spaces intact.
find "$dir" -type f -print0 | xargs -0 -r md5sum > "$md5list" 2>/dev/null

# Checksums that occur more than once identify the duplicates.
cut -c1-32 "$md5list" | sort | uniq -d |
while read -r sum
do
    # Print the path of every file with this checksum (the path starts at column 35).
    grep "^$sum" "$md5list" | cut -c35-
done


Whether or not the duplicate songs have the same name, you can use it like this:
$ ./dupfind.sh /media/iPOD
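
The script above only lists duplicates. If you also want them removed, one possible approach is sketched below. `prune_dupes` and its `--delete` switch are invented for this example and are not part of dupfind.sh. It keeps whichever copy sorts first for each checksum, and by default it only reports what it would remove. It assumes GNU `md5sum` and file names without newlines.

```shell
#!/bin/bash
# Sketch: keep the first file seen for each checksum, act on the rest.
# prune_dupes and its --delete switch are hypothetical, not part of dupfind.sh.
prune_dupes() {
    local dir=$1 mode=${2:-} sum path prev=""

    # Sorting the "checksum  path" lines puts identical files on adjacent lines.
    find "$dir" -type f -exec md5sum {} + | sort |
    while read -r sum path
    do
        if [ "$sum" = "$prev" ]; then
            # Same checksum as the previous line: this is an extra copy.
            if [ "$mode" = "--delete" ]; then
                rm -- "$path"
            else
                printf 'would remove: %s\n' "$path"
            fi
        fi
        prev=$sum
    done
}
```

Run `prune_dupes /media/iPOD` first and read the output; only add `--delete` once the list looks right.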


Posted: Mon Apr 21, 2008 2:19 pm

Joined: Tue Apr 01, 2008 10:19 am
Posts: 49
There is a utility called fdupes for this. If you use Arch Linux, it's packaged (in the AUR, I think); if not, check your package manager. Otherwise, google for it.


Posted: Wed Apr 23, 2008 11:44 pm

Joined: Sun Apr 13, 2008 4:05 am
Posts: 37
Location: /dev/random
Yeah, I know that. But as you can see at http://premium.caribe.net/~adrian2/prog ... .40.tar.gz, it's C code. We do bash-fu here :)
fdupes also uses md5 hashes, so you can't expect better performance.
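
On the speed question, one cheap trick that can be done in bash is to skip hashing for any file whose size is unique, since two files of different sizes cannot be identical. A minimal sketch, assuming GNU find (for `-printf`) and file names without newlines; `same_size_candidates` is a made-up name:

```shell
#!/bin/bash
# Sketch: print only the files whose byte size is shared with another file.
# same_size_candidates is a hypothetical helper; -printf requires GNU find.
same_size_candidates() {
    local dir=$1 sizes
    sizes=$(mktemp) || return 1

    # One "size<TAB>path" line per file.
    find "$dir" -type f -printf '%s\t%p\n' > "$sizes"

    # First pass counts each size; second pass prints paths whose size repeats.
    awk -F '\t' '
        NR == FNR    { count[$1]++; next }
        count[$1] > 1 { sub(/^[^\t]*\t/, ""); print }
    ' "$sizes" "$sizes"

    rm -f "$sizes"
}
```

Only the paths it prints need to go through md5sum, for example `same_size_candidates /media/iPOD | tr '\n' '\0' | xargs -0 -r md5sum`, which can save a lot of reading when most files have distinct sizes.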


© 2003 - 2011 USA LINUX USERS GROUP