Clean Up Your HDD (Start)

05.12.2006 by kerim in hddcleaner | python

As mentioned here, I will implement a small program that sorts the files in any given list of directories by size, presents the result visually and offers a button to delete selected files.

Why?

Because I need it myself anyway and because it's a nice exercise to see how easy Python can be.

We take my last entry as a start. I will now add some more functions, and as time goes by we will assemble a complete program.

Since I don't have a Mac and only occasionally use Linux, I will mostly program on my Windows system. If something doesn't work on your non-Windows system, tell me.

If you think the effort is worth it and want to provide some support, or if you are looking for a really interesting feature I should implement (which means it should be something I need too), you are more than welcome to leave a comment. If you are too lazy to do it on your own, want something I don't, and have no argument to convince me, you can of course simply donate, lol.

OK ...
We have seen how to obtain a list of drives on Windows systems. Pretty simple, huh?
Let's use a combined function to be on the safe side (I don't trust the ord('a') approach to work everywhere, since (I think) a drive can be identified by more than one character in Windows):

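import os, sys

def listDrives():
    if sys.platform == 'win32':
        try:
            import win32api
            drives = win32api.GetLogicalDriveStrings()
            # One string with a NUL after every drive; drop empty pieces
            return [d for d in drives.split('\000') if d]
        except Exception:
            # pywin32 is missing or the call failed: probe a: to z: instead
            drives = []
            for i in range(ord('a'), ord('z')+1):
                drive = chr(i)
                if os.path.exists(drive + ":\\"):
                    drives.append(drive + ":\\")
            return drives
    # Assumption: systems without drive letters are scanned from the root
    return [os.sep]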
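
The splitting step can be tried on its own, even without pywin32 installed. The following is only a minimal sketch, not part of the program itself: split_drive_strings is a made-up helper name, and the sample string is hand-written under the assumption that the API hands back one string with a NUL character after every drive.

```python
def split_drive_strings(raw):
    """Split a NUL-separated drive string into a list of drive roots.

    Empty pieces (such as the one left by a trailing NUL) are dropped.
    """
    return [part for part in raw.split('\000') if part]


# Hand-written sample input, not real output from win32api.
sample = 'C:\\\000D:\\\000'
print(split_drive_strings(sample))
```

Filtering out the empty pieces matters because a plain split on a string that ends with the separator leaves an empty string at the end of the list.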

OK

Now we need something that goes through a drive or directory and assembles a list of files. The list should contain the path and the size of each file.

Let's look at what Python has to offer:

def scan(directories):
    listOfFiles = {}
    for directory in directories:
        sys.stdout.write('Scanning directory '+directory+'...\n')
        directory = os.path.abspath(directory)
        # Python 2 only: calls listFilesByName(listOfFiles, dirname, names) for every directory
        os.path.walk(directory,listFilesByName,listOfFiles)
    return listOfFiles

Nice ...
Here we have something I really like in Python: passing functions as parameters in calls.
I don't want to explain much. If you really don't understand that short piece of code, use the comment section.
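
For readers on Python 3, where os.path.walk no longer exists, the same idea of handing a function to the directory walker can be sketched with os.walk. This is only a sketch under that assumption, not the code used in this series; visit and count_files are made-up names for illustration.

```python
import os


def scan(directories, visit):
    """Walk each directory tree and hand every folder to `visit`."""
    found = {}
    for directory in directories:
        directory = os.path.abspath(directory)
        # os.walk yields (folder, subfolder names, file names) per folder.
        for folder, _subdirs, files in os.walk(directory):
            visit(found, folder, files)
    return found


def count_files(found, folder, files):
    """Example callback: remember how many files each folder holds."""
    found[folder] = len(files)


if __name__ == '__main__':
    import tempfile
    # Demo on a throwaway directory containing a single empty file.
    with tempfile.TemporaryDirectory() as root:
        open(os.path.join(root, 'a.txt'), 'w').close()
        print(scan([root], count_files))
```

Because scan only knows that visit takes (found, folder, files), a different callback can be swapped in without touching the walking code.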

So all we need to do now is to implement "listFilesByName" to actually compose the list:

from stat import ST_SIZE

def listFilesByName(listOfFiles,directory,files):
    # Called by os.path.walk once per directory; files holds the names found in it
    for fileName in files:
        filepath = os.path.join(directory,fileName)
        if os.path.isfile(filepath):
            filesize = os.stat(filepath)[ST_SIZE]
            listOfFiles[filepath] = filesize

You know ... almost too easy!
What we get is a dictionary that maps every file path to its size.

Here is some output:

('c:\\temp\\ks069.tmp', 0L)
('c:\\temp\\jusched.log', 173L)
('c:\\temp\\jsproxy.wmp', 12288L)
('c:\\temp\\java_install_reg.log', 3952L)
('c:\\temp\\iwf67.tmp', 0L)
('c:\\temp\\iju65.tmp', 0L)
('c:\\temp\\iedkcs32.wmp', 294912L)
('c:\\temp\\hv159.tmp', 0L)

Nice!

Wait! How did we get that sorted? At the end of the scan function I simply added this line, which orders the (path, size) pairs by path in descending order:

sortedList = sorted(listOfFiles.items(),reverse=True)
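
Since sorted compares the (path, size) pairs starting with the path, the list above is ordered by name and not by size. If the biggest files should come first, a key function can pick out the size. This is a minimal sketch, assuming a dictionary shaped like the one built above; by_size_descending is a made-up name, and the entries are copied from the sample output.

```python
def by_size_descending(files):
    """Return (path, size) pairs, largest file first."""
    return sorted(files.items(), key=lambda item: item[1], reverse=True)


# Paths and sizes copied from the sample output above.
files = {
    'c:\\temp\\jusched.log': 173,
    'c:\\temp\\jsproxy.wmp': 12288,
    'c:\\temp\\iedkcs32.wmp': 294912,
    'c:\\temp\\hv159.tmp': 0,
}
for path, size in by_size_descending(files):
    print(size, path)
```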

But listFilesByName has one disadvantage ...
If the primary point of interest is the size, we should perhaps build a dictionary with the sizes as keys. That would also let us list all files of the same size together at once.

So a perhaps better function would be:

def listFilesBySize(listOfFiles,directory,files):
    for fileName in files:
        filepath = os.path.join(directory,fileName)
        if os.path.isfile(filepath):
            filesize = os.stat(filepath)[ST_SIZE]
            # Start a new list the first time a size shows up, then append
            listOfFiles.setdefault(filesize, []).append(filepath)

Here again is some sample output:

(0L, ['c:\\temp\\2p55F.tmp', 'c:\\temp\\amv16F.tmp', 'c:\\temp\\CDATEN.TXT', 'c:\\temp\\fla79.tmp', 'c:\\temp\\fla7B.tmp', 'c:\\temp\\fla7D.tmp', 'c:\\temp\\fla7F.tmp', 'c:\\temp\\fla81.tmp', 'c:\\temp\\fla83.tmp', 'c:\\temp\\fla85.tmp', 'c:\\temp\\fla87.tmp', 'c:\\temp\\fla89.tmp', 'c:\\temp\\fla8B.tmp', 'c:\\temp\\fla8D.tmp', 'c:\\temp\\h93162.tmp', 'c:\\temp\\hmz15D.tmp', 'c:\\temp\\hv159.tmp', 'c:\\temp\\iju65.tmp', 'c:\\temp\\iwf67.tmp', 'c:\\temp\\ks069.tmp', 'c:\\temp\\siw65.tmp', 'c:\\temp\\ta5174.tmp', 'c:\\temp\\y6d16A.tmp'])
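
On Python 3 the same grouping could look roughly like the sketch below. It assumes os.walk and collections.defaultdict in place of os.path.walk and the explicit key check, and group_by_size and largest_first are made-up names, so treat it as an illustration and not as the program's actual code.

```python
import os
from collections import defaultdict


def group_by_size(directory):
    """Map each file size in bytes to the list of paths with that size."""
    groups = defaultdict(list)
    for folder, _subdirs, files in os.walk(os.path.abspath(directory)):
        for name in files:
            path = os.path.join(folder, name)
            if os.path.isfile(path):
                groups[os.path.getsize(path)].append(path)
    return groups


def largest_first(groups):
    """Return (size, paths) pairs, biggest size first."""
    return sorted(groups.items(), key=lambda item: item[0], reverse=True)
```

Calling largest_first(group_by_size('c:\\temp')) would then put the biggest files at the top, with all files of one size listed together.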

More next time ....

