I’ve been spending a good deal of time the past couple of days processing large JSON files to try and fix some corrupted data (long story, short version: my fault). While JSON is a fast file format to work with, processing > 50 MB of any data format takes some time.
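    import sys

    class JsonProgress(object):
        """Callable that prints a running count each time it is called."""

        def __init__(self):
            self.count = 0

        def __call__(self, obj):
            self.count += 1
            # \r moves back to the start of the line so the count updates in place
            sys.stdout.write("\r%8d" % self.count)
            sys.stdout.flush()  # no newline is written, so flush to show the count now
            return obj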
And then use it as the object_hook when loading JSON:
    import json

    with open('foo.json') as f:
        foo = json.load(f, object_hook=JsonProgress())
    # \r returns to the start of the line; padding "Done" to 8 columns overwrites the counter
    print("\r%-8s" % "Done")
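It helps to know what the counter is actually counting. The `json` module calls `object_hook` once for every JSON object (dict) it decodes, and not for arrays, strings, or numbers. Here is a minimal sketch of that behaviour; `CountingHook` and the sample document are made up for illustration and are not part of the code above:

```python
import json


class CountingHook(object):
    """Hypothetical stand-in for JsonProgress that only counts calls."""

    def __init__(self):
        self.count = 0

    def __call__(self, obj):
        # json passes each decoded JSON object (a dict) to the hook
        self.count += 1
        return obj


hook = CountingHook()
doc = '{"a": {"b": 1}, "c": [{"d": 2}, 3, "x"]}'
data = json.loads(doc, object_hook=hook)

# Three objects are decoded: {"b": 1}, {"d": 2}, and the outer one.
# The array, the number 3, and the string "x" never reach the hook.
print(hook.count)
```

So the number on screen is a count of decoded objects, not of bytes read or top-level records.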
JsonProgress is a poor name, though, since it's also useful in ordinary list comprehensions:
    progress = JsonProgress()
    foo = [progress(x) for x in bar]
    # \n moves to a new line, so the final count stays on screen
    print("\nDone")
Obviously this is a performance hit, but still quite handy for personal use when you just want to know that something is happening.
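If the slowdown gets annoying, one possible tweak is to write to the terminal only every so many calls. The sketch below is just one way to do that; the `ThrottledProgress` name and the `every` and `out` parameters are my own inventions here, and I haven't measured how much time it saves:

```python
import sys


class ThrottledProgress(object):
    """Hypothetical variant that prints the count only every `every` calls."""

    def __init__(self, every=1000, out=sys.stdout):
        self.count = 0
        self.every = every  # assumed update interval; tune to taste
        self.out = out

    def __call__(self, obj):
        self.count += 1
        if self.count % self.every == 0:
            # Same in-place update as before, just far less often
            self.out.write("\r%8d" % self.count)
            self.out.flush()
        return obj


progress = ThrottledProgress(every=1000)
items = [progress(x) for x in range(5000)]
sys.stdout.write("\n")  # keep the last printed count on screen
```

The object still gets called for every item, so the counting cost remains; only the terminal writes are reduced.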