That Pesky Error in PyLint's Results

How to avoid PyLint's "Module 'hashlib' has no 'md5' member" error when using hashlib's md5() function in your code, and the consequences of doing so.

I do not use md5 much these days, as it is known to be vulnerable to collision attacks; having said that, md5 still has its uses when it comes to comparing file likeness in non-critical applications. It just so happens that PyLint does not seem to like hashlib's named constructors, reports an error for them, and marks down the code rating as a result. See the example below:

>>> import hashlib
>>> HASH = hashlib.md5()
>>> HASH.update('Test my hash')
>>> HASH.hexdigest()
'f94b71b31c1d09d352db8b59d4f98892'
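For the file-comparison use mentioned above, here is a minimal sketch. The helper names file_md5 and same_content and the chunk size are my own assumptions, chosen for illustration only. It is written for Python 3, where files are read as bytes:

```python
import hashlib


def file_md5(path, chunk_size=65536):
    """Return the hex MD5 digest of a file (hypothetical helper)."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(path_a, path_b):
    """True when both files hash to the same MD5 digest (non-critical use only)."""
    return file_md5(path_a) == file_md5(path_b)
```

Matching digests only suggest that the files are alike; this is fine for spotting duplicates, but not where someone might craft a collision on purpose.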

When I check the source code of a file that invokes md5 this way, PyLint invariably gives me an error message similar to the following:

raz@foobox:~/Python/soap$ pylint soap_client.py
No config file found, using default configuration
************* Module somemodule
E:231:SomeClass._some_method: Module 'hashlib' has no 'md5' member

[...]

Messages
--------

+-----------+------------+
|message id |occurrences |
+===========+============+
|E1101      |1           |
+-----------+------------+



Global evaluation
-----------------
Your code has been rated at 9.68/10

Hmm, that is not good at all. I want a better PyLint score, something like 10/10 would do just fine, thanks.

What to do Slightly Differently

Seeing as I would like my code to score as highly as possible in PyLint tests, what to do? The answer is to use md5 in a subtly different way to get rid of that error...

Instead of invoking a named constructor such as hashlib.md5(), use the generic constructor, hashlib.new('md5'):

>>> import hashlib
>>> HASH = hashlib.new('md5')
>>> HASH.update('Test my hash')
>>> HASH.hexdigest()
'f94b71b31c1d09d352db8b59d4f98892'
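As a quick sanity check, the two constructors should give the same digest for the same input. A small sketch, assuming Python 3 (where the data must be bytes) and passing the data straight to the constructor rather than calling update():

```python
import hashlib

DATA = b"Test my hash"  # bytes are required on Python 3

# Named constructor versus generic constructor, same input.
named_digest = hashlib.md5(DATA).hexdigest()
generic_digest = hashlib.new("md5", DATA).hexdigest()

print(named_digest == generic_digest)
```

Both calls select the same algorithm, so switching to new() changes how the hash object is obtained but not the result.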

The interesting bit of test output now shows:

Global evaluation
-----------------
Your code has been rated at 10.00/10 (previous run: 9.68/10)

Hooray, full marks at last! But, read on...!

A Minor Cautionary Note

This does come with a minor health warning: as briefly mentioned in the hashlib documentation, the new() constructor runs slower than the named constructors. Let me put this to a very basic test, using Python's timeit module.

Firstly with the generic constructor:

>>> import timeit
>>> s = """\
... import hashlib
... HASH = hashlib.new('md5')
... HASH.update('Test my hash')
... HASH.hexdigest()
... """
>>> t = timeit.Timer(stmt=s)
>>> print "%.2f usec/pass" % (1000000 * t.timeit(number=100000)/100000)
3.02 usec/pass

Now, with the named constructor:

>>> import timeit
>>> s = """\
... import hashlib
... HASH = hashlib.md5()
... HASH.update('Test my hash')
... HASH.hexdigest()
... """
>>> t = timeit.Timer(stmt=s)
>>> print "%.2f usec/pass" % (1000000 * t.timeit(number=100000)/100000)
2.41 usec/pass

So we have knocked off approximately 20% of execution time in this code by using the named constructor at the expense of a drop in the overall PyLint score.
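If you want to repeat the comparison yourself, here is a rough sketch for Python 3. The function names with_new, with_named and usec_per_pass are mine, not part of hashlib or timeit. Unlike the snippets above, it keeps the import outside the timed code, so only the constructor, update() and hexdigest() calls are measured. Expect different figures on your machine and Python version:

```python
import hashlib
import timeit

DATA = b"Test my hash"  # bytes are required on Python 3


def with_new():
    """Hash DATA using the generic constructor."""
    digest = hashlib.new("md5")
    digest.update(DATA)
    return digest.hexdigest()


def with_named():
    """Hash DATA using the named constructor."""
    digest = hashlib.md5()
    digest.update(DATA)
    return digest.hexdigest()


def usec_per_pass(func, number=100000):
    """Average time of one call to func, in microseconds."""
    total_seconds = timeit.timeit(func, number=number)
    return 1000000 * total_seconds / number


if __name__ == "__main__":
    print("hashlib.new('md5'): %.2f usec/pass" % usec_per_pass(with_new))
    print("hashlib.md5():      %.2f usec/pass" % usec_per_pass(with_named))
```

Passing a function to timeit.timeit() avoids building the statement as a string, and running each variant a few times gives a better feel for how much the numbers wobble between runs.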

Finally

As with most things, this is not a black and white issue. If performance is an important consideration in your code, you may just have to put up with lower PyLint code ratings!

