Clean Up Your HDD (Start)

Last Update: 05.12.2006. By kerim in hddcleaner | python

As mentioned here, I will implement a small program that sorts the files in any given list of directories by size, presents the result visually, and offers you a button to delete selected files.

Why?

Because I need it myself anyway, and because it’s a nice exercise to see how easy Python can be.

We take my last entry as a start. I will now add some more methods/functions and as time goes by we will assemble a complete program.

Since I don’t have a Mac and only occasionally use Linux, I will mostly program on my Windows system. If something doesn’t work on your non-Windows system, tell me.

If you think the effort is worth it and want to offer some support, or you are looking for a really interesting feature I should implement (which means it should be something I need too), you are more than welcome to leave a comment. If you are too lazy to do it on your own, want something I don’t, and have no argument to convince me, you can of course simply donate, lol.

OK ...
We have seen how to obtain a list of drives on Windows systems. Pretty simple, huh?
Let’s use a combined function so we are on the safe side. I don’t trust the ord('a') stuff to work everywhere since, I think, you can use more than one character to define a drive letter in Windows:

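import os
import sys

def listDrives():
    """Return the root paths of all available drives (Windows only)."""
    if sys.platform != 'win32':
        # Other platforms have no drive letters to enumerate.
        return []
    try:
        # Preferred: ask Windows directly (requires the pywin32 package).
        import win32api
        drives = win32api.GetLogicalDriveStrings()
        # The result is NUL-separated and NUL-terminated; drop empty entries.
        return [d for d in drives.split('\000') if d]
    except Exception:
        # pywin32 is missing or the call failed: probe a:\ through z:\.
        drives = []
        for i in range(ord('a'), ord('z') + 1):
            drive = chr(i) + ':\\'
            if os.path.exists(drive):
                drives.append(drive)
        return drives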
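To see why the empty entries need filtering, here is a minimal sketch of just the splitting step. It assumes the raw value is a NUL-separated, NUL-terminated string, which is the shape GetLogicalDriveStrings returns; `split_drive_strings` and the sample value are made up for illustration and the sketch runs on any platform.

```python
def split_drive_strings(raw):
    """Split a NUL-separated drive string into a list of drive roots."""
    # The raw string ends with a NUL, so a plain split() leaves an empty
    # trailing entry; the condition filters it out.
    return [drive for drive in raw.split("\x00") if drive]

# Hypothetical sample value, shaped like the Windows API result.
sample = "A:\\\x00C:\\\x00D:\\\x00"
print(split_drive_strings(sample))
```

Without the filter, the last element of the list would be an empty string, which is easy to trip over later when scanning.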

OK

Now we need something to go through a drive/directory and assemble a list of files. The list should contain the path of the file and the size.

Let’s look at what Python has to offer:

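def scan(directories):
    listOfFiles = {}
    for directory in directories:
        sys.stdout.write('Scanning directory ' + directory + '...\n')
        directory = os.path.abspath(directory)
        # os.path.walk (Python 2 only) calls
        # listFilesByName(listOfFiles, dirname, names) for each directory.
        os.path.walk(directory, listFilesByName, listOfFiles)
    return listOfFiles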

Nice ....
Here we have something I totally like in Python … using functions as parameters in calls.
I don’t want to explain much. If you really don’t understand that short piece of code … use the comment section.
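For readers on Python 3, where os.path.walk no longer exists, the callback idea can be sketched on top of os.walk. This is only an illustration under that assumption, not part of the program: `walk_with_callback` and `count_entries` are hypothetical names, and the callback receives the same three arguments (the shared object, the directory name, the entry names) that os.path.walk passes.

```python
import os
import tempfile

def walk_with_callback(top, visit, arg):
    """Call visit(arg, dirname, names) once for every directory below top."""
    for dirname, subdirs, files in os.walk(top):
        # os.path.walk handed over files and subdirectories in one list.
        visit(arg, dirname, subdirs + files)

def count_entries(counts, dirname, names):
    # Hypothetical callback: record how many entries each directory holds.
    counts[dirname] = len(names)

# Demo on a throwaway directory containing one file and one subdirectory.
with tempfile.TemporaryDirectory() as top:
    os.mkdir(os.path.join(top, "sub"))
    open(os.path.join(top, "a.txt"), "w").close()
    counts = {}
    walk_with_callback(top, count_entries, counts)
    print(sorted(counts.values()))
```

The function object is passed without parentheses, and the third argument is simply handed through to every call, which is how the callback can fill one shared dictionary.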

So all we need to do now is to implement “listFilesByName” to actually compose the list:

from stat import ST_SIZE

def listFilesByName(listOfFiles, directory, files):
    # Callback for os.path.walk: record path -> size for every regular file.
    for fileName in files:
        filepath = os.path.join(directory, fileName)
        if os.path.isfile(filepath):
            filesize = os.stat(filepath)[ST_SIZE]
            listOfFiles[filepath] = filesize

You know … almost too easy!
What we get is a dictionary that maps every file path to its size in bytes.

Here is some output:

('c:\\temp\\ks069.tmp', 0L)
('c:\\temp\\jusched.log', 173L)
('c:\\temp\\jsproxy.wmp', 12288L)
('c:\\temp\\java_install_reg.log', 3952L)
('c:\\temp\\iwf67.tmp', 0L)
('c:\\temp\\iju65.tmp', 0L)
('c:\\temp\\iedkcs32.wmp', 294912L)
('c:\\temp\\hv159.tmp', 0L)

Nice !

Wait! How did we get that sorted? At the end of the scan method I simply added:

sortedList = sorted(listOfFiles.items(),reverse=True)

Because the items are (path, size) pairs, this orders the list by path in reverse alphabetical order, not by size.
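If the goal is to see the biggest files first while keeping the path-to-size dictionary, sorted() can be given a key function that picks the size out of each pair. A minimal sketch with made-up sample data; `largest_first` and the paths are hypothetical.

```python
# Hypothetical sample data: path -> size in bytes.
sizes = {
    "c:\\temp\\a.tmp": 0,
    "c:\\temp\\b.log": 173,
    "c:\\temp\\c.wmp": 12288,
}

def largest_first(path_sizes):
    """Return (path, size) pairs ordered from largest to smallest file."""
    # item[1] is the size; without a key, sorted() compares the paths first.
    return sorted(path_sizes.items(), key=lambda item: item[1], reverse=True)

for path, size in largest_first(sizes):
    print(size, path)
```

This keeps one entry per file, so files of equal size still show up as separate rows.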

But listFilesByName has one disadvantage …
If the primary point of interest is the size, we should perhaps build a dictionary with the sizes as keys. That would also let us list all files of the same size together instantly.

So a perhaps better function would be:

from stat import ST_SIZE

def listFilesBySize(listOfFiles, directory, files):
    # Callback for os.path.walk: record size -> [paths] for every regular file.
    for fileName in files:
        filepath = os.path.join(directory, fileName)
        if os.path.isfile(filepath):
            filesize = os.stat(filepath)[ST_SIZE]
            # Start a new list the first time a size is seen, then append.
            listOfFiles.setdefault(filesize, []).append(filepath)

Here again is some sample output:

(0L, ['c:\\temp\\2p55F.tmp', 'c:\\temp\\amv16F.tmp', 'c:\\temp\\CDATEN.TXT', 'c:\\temp\\fla79.tmp', 'c:\\temp\\fla7B.tmp', 'c:\\temp\\fla7D.tmp', 'c:\\temp\\fla7F.tmp', 'c:\\temp\\fla81.tmp', 'c:\\temp\\fla83.tmp', 'c:\\temp\\fla85.tmp', 'c:\\temp\\fla87.tmp', 'c:\\temp\\fla89.tmp', 'c:\\temp\\fla8B.tmp', 'c:\\temp\\fla8D.tmp', 'c:\\temp\\h93162.tmp', 'c:\\temp\\hmz15D.tmp', 'c:\\temp\\hv159.tmp', 'c:\\temp\\iju65.tmp', 'c:\\temp\\iwf67.tmp', 'c:\\temp\\ks069.tmp', 'c:\\temp\\siw65.tmp', 'c:\\temp\\ta5174.tmp', 'c:\\temp\\y6d16A.tmp'])
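For comparison, here is a rough Python 3 sketch of the same size-keyed scan. It is an assumption-laden rewrite, not the code used above: it swaps os.path.walk for os.walk, uses collections.defaultdict instead of checking for the key, and the names `group_files_by_size` and `report` are invented for this example.

```python
import os
from collections import defaultdict

def group_files_by_size(directories):
    """Map each file size in bytes to the list of paths with that size."""
    by_size = defaultdict(list)
    for directory in directories:
        for dirname, _subdirs, files in os.walk(os.path.abspath(directory)):
            for name in files:
                path = os.path.join(dirname, name)
                try:
                    by_size[os.path.getsize(path)].append(path)
                except OSError:
                    # Skip files that vanish or cannot be read mid-scan.
                    continue
    return by_size

def report(by_size):
    """Return (size, paths) pairs, largest size first."""
    # Sizes are unique dictionary keys, so the path lists are never compared.
    return sorted(by_size.items(), reverse=True)
```

Calling `report(group_files_by_size(['c:\\temp']))` would give pairs shaped like the sample output above, with the biggest sizes at the front.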

More next time ....