We have some large GeoWebCache caches of Ordnance Survey map tiles in .png8 format. Unfortunately a few million .png files have crept into the caches too (accidental bad config at some point), duplicating the .png8 tiles and using up a lot more space. I wanted to delete these.
At first I thought "simple: do a search in Windows Explorer for .png and delete". Ah no, that selects all the .png8 files as well, plus I had far too many folders and files for that approach.
Then I thought...
del /S *.png
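But no, this deletes the .png8 files as well, grrr. This is probably because Windows wildcards also match each file's 8.3 short name, in which the .png8 extension is truncated to .PNG.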
So I wrote a simple Python script, which I will lodge here for future reference. It's Python 2.7, but I imagine it would work in 3.* with little or no change. The final print statement probably slows things down, but I wanted to be able to see what was going on.
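import os

indir = 'M:\\MyVeryLargeCachePath'
for root, dirs, filenames in os.walk(indir):
    for f in filenames:
        # splitext returns (name, extension); compare only the extension
        if os.path.splitext(f)[1] == '.png':
            os.remove(os.path.join(root, f))
            print('deleted ' + os.path.join(root, f))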
Assuming you have Python installed, simply save this as a file with a .py extension, modify the path and the file extensions to suit, then run it.
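Before deleting anything, it can be reassuring to see how many files of each extension the cache actually holds. The following is only a minimal sketch along those lines, not part of the original script; the function name count_extensions is made up for illustration, and it does nothing but read the directory tree.

```python
import os
from collections import Counter


def count_extensions(indir):
    """Return a Counter mapping each file extension under indir to its file count."""
    counts = Counter()
    for root, dirs, filenames in os.walk(indir):
        for f in filenames:
            # splitext gives (name, ext); ext includes the leading dot
            counts[os.path.splitext(f)[1]] += 1
    return counts


# Example call, using the same placeholder path as the script:
# for ext, n in sorted(count_extensions('M:\\MyVeryLargeCachePath').items()):
#     print(ext, n)
```

If the counts for '.png' and '.png8' look plausible, the delete script can then be run with a bit more confidence.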
This same technique would work for other combinations of file extensions, e.g. to remove *.doc but not *.docx, or to remove *.xls while retaining *.xlsx.
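One possible way to reuse the technique for other extensions is to wrap it in a function. This is a rough sketch rather than a tested tool: the name delete_by_extension and the dry_run flag are illustrative additions, not part of the script above. With dry_run left at True it only reports what would be deleted.

```python
import os


def delete_by_extension(indir, ext, dry_run=True):
    """Find files under indir whose extension is exactly ext (e.g. '.doc').

    Returns the list of matching paths. Files are only deleted
    when dry_run is False.
    """
    matched = []
    for root, dirs, filenames in os.walk(indir):
        for f in filenames:
            # Exact comparison, so '.doc' does not match '.docx'
            if os.path.splitext(f)[1] == ext:
                path = os.path.join(root, f)
                if not dry_run:
                    os.remove(path)
                matched.append(path)
    return matched
```

Calling delete_by_extension(somepath, '.doc') first and checking the returned list, then calling it again with dry_run=False, keeps the destructive step deliberate. Note that the comparison is case-sensitive, so '.DOC' files would be left alone.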