Handy Python Progress for JSON module

I’ve spent a good deal of time over the past couple of days processing large JSON files to try to fix some corrupted data (long story, short version: my fault). While JSON is a fast file format to work with, processing more than 50 MB of any data format takes some time.

So to give myself some idea of what was going on, I whipped up a small progress counter for Python 2.6's json module (it also works with simplejson if you’re still using 2.4/2.5):

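import sys

class JsonProgress(object):
    """Callable that counts its calls and prints the running total."""

    def __init__(self):
        self.count = 0

    def __call__(self, obj):
        self.count += 1
        # \r returns to the start of the line so each count overwrites the last
        sys.stdout.write("\r%8d" % self.count)
        sys.stdout.flush()  # show the count now rather than when the buffer fills
        return obj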

And then use it as the object_hook when loading JSON:

import json
f = open('foo.json')
foo = json.load(f, object_hook=JsonProgress())
f.close()
print "\rDone    " # \r returns to the start of the line; the padding covers the 8-character counter
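
One thing worth knowing when reading the counter: object_hook is called once for each JSON object (dict) that gets decoded, not for arrays, strings, or numbers, so the number counts objects rather than bytes or lines. Here is a minimal sketch of that, written for Python 3 rather than the 2.x code above. The `stream` parameter and the sample document are assumptions added for illustration so the output can be captured; they are not part of the original class.

```python
import io
import json
import sys


class JsonProgress(object):
    """Counts calls and writes the running total to a stream."""

    def __init__(self, stream=None):
        # `stream` is a hypothetical parameter added for this sketch;
        # it defaults to sys.stdout like the original class.
        self.count = 0
        self.stream = stream if stream is not None else sys.stdout

    def __call__(self, obj):
        self.count += 1
        self.stream.write("\r%8d" % self.count)
        return obj


# Made-up sample document: one outer object holding a list of three objects.
text = '{"items": [{"id": 1}, {"id": 2}, {"id": 3}], "total": 3}'

progress = JsonProgress(stream=io.StringIO())
data = json.loads(text, object_hook=progress)
```

Here progress.count ends at 4: the three inner objects plus the outer one. The list and the integers never reach the hook, so a file that is mostly one giant array of numbers would show very little progress.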

JsonProgress is a poor name, though, since it's also useful in ordinary list comprehensions:

progress = JsonProgress()
foo = [progress(x) for x in bar]
print "\nDone" # \n prints a newline so the progress output is kept

Obviously this is a performance hit, but still quite handy for personal use when you just want to know that something is happening.
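
If you want a rough idea of how big the hit is on your own machine, something like the following would do. It is only a sketch under assumptions: it targets Python 3, the 20,000-object input is synthetic, `timed_load` is a hypothetical helper, and the counter writes to an in-memory stream, so a real terminal will likely be slower.

```python
import io
import json
import time


class JsonProgress(object):
    """Same idea as above, but writing to a caller-supplied stream."""

    def __init__(self, stream):
        self.count = 0
        self.stream = stream

    def __call__(self, obj):
        self.count += 1
        self.stream.write("\r%8d" % self.count)
        return obj


def timed_load(text, hook=None):
    """Hypothetical helper: return (seconds, parsed) for one json.loads call."""
    start = time.perf_counter()
    parsed = json.loads(text, object_hook=hook)
    return time.perf_counter() - start, parsed


# Synthetic input, made up for this sketch: a list of 20,000 small objects.
text = json.dumps([{"id": i, "name": "item %d" % i} for i in range(20000)])

plain_seconds, plain = timed_load(text)
hook = JsonProgress(io.StringIO())
hooked_seconds, hooked = timed_load(text, hook)

print("plain: %.4fs, with hook: %.4fs" % (plain_seconds, hooked_seconds))
```

A single run like this is noisy, so treat the numbers as a ballpark rather than a benchmark.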


2 Responses to Handy Python Progress for JSON module

  1. Paddy3118 says:

    In progress prints, I find it handy to print only every nth time through the loop. Something like:

        def __call__(self, obj):
            self.count += 1
            if (self.count % 100) == 0:
                sys.stdout.write("\r%8d" % self.count)
            return obj

    - Paddy.

  2. @Paddy3118: Good call! (no pun intended) I went ahead and corrected the formatting, because WordPress kills formatting in comments but evidently lets admins fix it.
