Great, another program to un-duplicate my files

Unduplicator

August 2010. My program collection is getting big. Cataloging it causes some programs to get lost in the folder structure. A newer version comes out. I must have it. Or do I already have it? Ah, let it be, just download it and put it into the appropriate folder. Or folders? And soon it is time to put the program collection onto my pendrive. Estimated size… gulp! 4.3 GB. Ow. Must be because of the three NetBeans SDKs, or the movies. Anyway, I need a program that finds duplicated programs in my collection, and none of the existing ones on Linux seems to do it my way. Because of that, Unduplicator was born.

Unduplicator is a program that searches the given folders for duplicated files and folders. The user can choose to search for duplicates by file name, size, or content. Unduplicator can be installed on most versions of Linux that have PyGTK. It requires Python and GTK. In other words, if you are using GNOME or default Ubuntu, then you can run it without a problem. No compiling needed.
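To give a rough idea of what searching by name, size, or content can look like, here is a minimal Python sketch. It is not Unduplicator's actual code: the function names (`file_hash`, `file_key`, `find_duplicate_files`) and the choice of SHA-1 are assumptions made for illustration, and it groups files by a key in a dictionary, which may well differ from how the program itself compares files.

```python
import hashlib
import os
from collections import defaultdict


def file_hash(path, chunk_size=65536):
    """Return the SHA-1 hex digest of a file, read in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def file_key(path, by_name, by_size, by_content):
    """Build the comparison key for one file from the selected criteria."""
    key = []
    if by_name:
        key.append(os.path.basename(path))
    if by_size:
        key.append(os.path.getsize(path))
    if by_content:
        key.append(file_hash(path))
    return tuple(key)


def find_duplicate_files(folders, by_name=False, by_size=False, by_content=False):
    """Group regular files under the given folders that share the same key."""
    if not (by_name or by_size or by_content):
        raise ValueError("select at least one of name, size, or content")
    groups = defaultdict(list)
    for folder in folders:
        for root, _dirs, names in os.walk(folder):
            for name in names:
                path = os.path.join(root, name)
                # Hypothetical policy: ignore symlinks and special files.
                if os.path.isfile(path) and not os.path.islink(path):
                    key = file_key(path, by_name, by_size, by_content)
                    groups[key].append(path)
    # Only keys seen more than once describe duplicates.
    return [sorted(paths) for paths in groups.values() if len(paths) > 1]
```

Calling `find_duplicate_files(["/some/folder"], by_size=True, by_content=True)` would return lists of paths whose size and content hash both match. Turning on several criteria at once corresponds to the "all of them" option.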

Features:

· Search for duplicated files and folders.

· Check for same name, size or content (using hash), or all of them.

· Multiple folder-to-search option.

· Unix-like operating system support (and requirement).

· Option to delete or move files with an autogenerated restore script, or to delete some files while moving others.

· No compiling needed (it is written in Python).

· Not-so-bad gui.

· Free
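The "move files with autogenerated restore script" feature can be pictured with the sketch below. It is only an illustration under my own assumptions, not the program's real code: the function name `move_with_restore_script`, the numbered file names in the backup folder, and the layout of the script are all hypothetical. It covers the move case only, since a deleted file has nothing left to restore.

```python
import os
import shlex
import shutil


def move_with_restore_script(paths, backup_dir, script_path):
    """Move files into backup_dir and write a shell script that moves them back."""
    backup_dir = os.path.abspath(backup_dir)
    os.makedirs(backup_dir, exist_ok=True)
    lines = ["#!/bin/sh", "# Restore script: run it to move every file back."]
    for index, path in enumerate(paths):
        source = os.path.abspath(path)
        # Prefix with a counter so files with the same name do not collide.
        target = os.path.join(backup_dir, "%d_%s" % (index, os.path.basename(source)))
        shutil.move(source, target)
        # Quote both paths so spaces and special characters survive the shell.
        lines.append("mv -- %s %s" % (shlex.quote(target), shlex.quote(source)))
    with open(script_path, "w") as handle:
        handle.write("\n".join(lines) + "\n")
    os.chmod(script_path, 0o755)  # Make the script executable.
    return script_path
```

Running the generated script with `sh restore.sh` would put each moved file back where it came from, as long as the original folders still exist.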

Requirements

· Python

· PyGTK

· Unix-like file system

Installation

· Download the tar file.

· Extract it to a folder

· Open the terminal, cd to the folder

· Run "sudo python setup.py install"

Usage

· Run the command "unduplicator" (The menu is still on the way)

· Check the folder that you wish to search.

· Click the start button.

· Use your head. (It's pretty simple actually)

FAQ

Q: Rationale?

A: One day, I found that my program collection was growing too big and that some of it was duplicated. So, as any normal person would, I searched the internet for a Linux-util-to-search-for-duplicate-files. Unfortunately, I could not find a program that suited my need (searching for duplicated FOLDERS). So, I created one. How hard can it be to make a Python program, right?

Q: Why should I use it?

A: No one told you to use it. But if you like it, have a go. Besides, it:

-Is written in Python.

-Needs no compiling.

-Should be able to run on all Unix-like operating systems.

-Searches for duplicated folders.
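Since duplicated folders are the main selling point, here is one possible way to decide that two folders hold the same content. This is a sketch under assumptions, not a description of Unduplicator's internals: `folder_fingerprint` is a made-up name, SHA-1 is an arbitrary choice, and the sketch ignores the folder's own name and any empty subfolders.

```python
import hashlib
import os


def file_hash(path, chunk_size=65536):
    """Return the SHA-1 hex digest of a file, read in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def folder_fingerprint(folder):
    """Return a digest of the relative paths and file contents under folder."""
    digest = hashlib.sha1()
    for root, dirs, names in os.walk(folder):
        dirs.sort()  # Visit subfolders in a fixed order.
        for name in sorted(names):
            path = os.path.join(root, name)
            if not os.path.isfile(path):
                continue  # Skip broken links and special files.
            relative = os.path.relpath(path, folder)
            # One entry per file: relative path, NUL, content hash, newline.
            entry = "%s\0%s\n" % (relative, file_hash(path))
            digest.update(entry.encode("utf-8", "surrogateescape"))
    return digest.hexdigest()
```

Two folders with the same files in the same relative places get the same fingerprint, so folders can then be grouped by fingerprint just like files are grouped by hash.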

Q: Is there a Windows version?

A: No, I do not think so. It is written in Python, and most Windows installations do not have Python. In order to run it, the user would also have to download Python, add it to the command-line path, install GTK, install PyGTK, and so on. So I think, let's keep it on Linux. Besides, it relies on the rooted file system, aka the Unix-like folder structure in which the topmost directory is always "/". Changing that is like changing 50% of the program. In theory, it may be able to run on Mac OS X. But you still need Python and PyGTK.

Q: Is there any plan to port it to Windows?

A: I don't think so. Instead, I've been thinking about creating a new unduplicator program in C++.

Q: How much speed should I expect?

A: The time taken is roughly proportional to (number of files × number of files) / 2, divided further by your CPU speed. The more duplicates there are, the faster it goes.
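To see where the "divided by 2" and the "more duplicates, faster" parts can come from, here is a small counting sketch. It assumes (my guess, not something confirmed about the program) that each file is compared against one representative of every group found so far and stops at the first match. The name `count_comparisons` is made up for illustration.

```python
def count_comparisons(keys):
    """Count comparisons when each item is checked against one representative per group."""
    representatives = []
    comparisons = 0
    for key in keys:
        for representative in representatives:
            comparisons += 1
            if key == representative:
                break  # Found its group; no need to compare further.
        else:
            # No match in any known group: this item starts a new group.
            representatives.append(key)
    return comparisons


# 100 files that are all different: every pair gets compared once.
print(count_comparisons(range(100)))   # 4950, i.e. 100 * 99 / 2

# 100 files that are all the same: one comparison each after the first.
print(count_comparisons([0] * 100))    # 99
```

With no duplicates the count is N × (N − 1) / 2, which is close to N × N / 2 for large N. With many duplicates there are fewer groups to check against, so the count drops.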

Q: Every time I scan some folders, it only uses one core of my dual-core CPU. Why is that?

A: Because Python (the standard interpreter, with its global interpreter lock) naturally runs Python code on only one core at a time, even when multithreaded. I'm still thinking of ways to change that.
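One commonly used workaround for the one-core limit is to spread the work over several processes with the standard `multiprocessing` module. The sketch below is not part of Unduplicator and not a plan the program has committed to. It only shows the general idea, with hypothetical function names, applied to hashing files. It uses the "fork" start method, which exists only on Unix-like systems.

```python
import hashlib
import multiprocessing


def hash_file(path):
    """Return (path, SHA-1 hex digest) so each result can be matched to its file."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return path, digest.hexdigest()


def hash_files_in_parallel(paths, workers=None):
    """Hash files in separate processes; workers=None means one per CPU core."""
    # Assumption: a Unix-like system, where the "fork" start method is available.
    context = multiprocessing.get_context("fork")
    with context.Pool(processes=workers) as pool:
        return dict(pool.map(hash_file, paths))
```

Each worker process has its own interpreter, so two workers can keep both cores of a dual-core CPU busy. Whether that actually speeds things up depends on whether the scan is limited by the CPU or by the disk.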

Q: Is there a Debian / RPM package?

A: No. Do you know how to make one? Help is always needed.