Python is annoying

Our code works a lot with binary data (hashes/digests) and hexstring representations of such data. It was written in Python 2, when everything was a string, but some strings were "beef" and some were "'\xbe\xef'".

Then we converted to Python 3, which introduced the 'bytes' type for binary data and made strings Unicode everywhere. That led to some type problems I thought I had figured out, but a recent debugging session revealed I had to think about it some more. Basically we can now have three things: a hexstring "beef"; the bytes object b'\xbe\xef' described by that hexstring; and the bytes b"beef", which is the UTF-8 encoding of the string.
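To keep the three straight, here is a minimal sketch that builds each one from the same starting string and checks how they relate. The variable names are mine, just for illustration.

```python
# Three different objects that all "look like" beef in Python 3.
hex_text = "beef"                    # str: the hexstring
raw = bytes.fromhex(hex_text)        # bytes: the 2 bytes the hexstring describes
ascii_bytes = hex_text.encode()      # bytes: the 4 characters, UTF-8 encoded

assert raw == b"\xbe\xef"
assert ascii_bytes == b"beef"

# Different lengths: 2 raw bytes vs. 4 encoded characters.
assert len(raw) == 2
assert len(ascii_bytes) == 4

# Each bytes object gets back to the hexstring by a different route.
assert raw.hex() == hex_text
assert ascii_bytes.decode() == hex_text
```

Only raw is the actual binary data; the other two are both text descriptions of it, one as str and one as bytes.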

In particular, the function binascii.hexlify (aka binascii.b2a_hex), which we used a lot, changed what it returns: a str in Python 2, a bytes object in Python 3.

Python 2:
>>> import binascii
>>> binascii.a2b_hex("beef")
'\xbe\xef'
>>> binascii.hexlify(_)
'beef'

Python 3:
>>> binascii.a2b_hex("beef")
b'\xbe\xef'
>>> binascii.hexlify(_)
b'beef'

vs. bytes.hex() (Python 3.5+):
>>> binascii.a2b_hex("beef")
b'\xbe\xef'
>>> _.hex()
'beef'


I found it easy to assume that if one of our functions returned b"beef" and another returned "beef", they were on the same page, when really they were not: the first is bytes, the second is str, and the two never compare equal in Python 3.
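One way to stop guessing is to push every conversion through a pair of small helpers that are explicit about which side is raw bytes and which side is text. This is only a sketch: digest_to_hex and hex_to_digest are hypothetical names, not functions from our code base or from the standard library.

```python
import binascii

def digest_to_hex(raw):
    """Raw bytes -> hexstring as str. Hypothetical helper."""
    if not isinstance(raw, (bytes, bytearray)):
        # Refuse str here, so a hexstring cannot sneak in as "binary" data.
        raise TypeError("expected raw bytes, got %s" % type(raw).__name__)
    # hexlify returns bytes in Python 3, so decode to get a str back.
    return binascii.hexlify(raw).decode("ascii")

def hex_to_digest(hex_text):
    """Hexstring (str, or its ASCII-encoded bytes) -> raw bytes. Hypothetical helper."""
    if isinstance(hex_text, (bytes, bytearray)):
        # b"beef" is the encoded hexstring, so turn it back into "beef" first.
        hex_text = bytes(hex_text).decode("ascii")
    # Raises ValueError if the text is not valid hex.
    return bytes.fromhex(hex_text)

print(digest_to_hex(b"\xbe\xef"))   # beef
print(hex_to_digest("beef"))        # b'\xbe\xef'
print(hex_to_digest(b"beef"))       # b'\xbe\xef'
```

Note that hex_to_digest treats any bytes argument as an encoded hexstring, so passing it data that is already raw fails or gives the wrong answer. The helper cannot tell the two apart on its own; the caller still has to know which one it holds.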

Bunch of examples in the cut.



>>> binascii.a2b_hex("beef")
b'\xbe\xef'
>>> binascii.hexlify(_)
b'beef'
>>> _.decode()
'beef'
>>> binascii.a2b_hex("beef")
b'\xbe\xef'
>>> _.decode()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xbe in position 0: invalid start byte

>>> b"beef".hex()
'62656566'

>>> import hashlib
>>> sha = hashlib.sha256(b'fee')
>>> sha.digest()
b'\xb8\x0c\xda\xae\x9b2+\xba*8\xfd9b\x99x*L\xc6\xb4\xa0\xcc\xf6\x7f\xcc\xbb\xcca|\x94\xa4`&'
>>> sha.digest().hex()
'b80cdaae9b322bba2a38fd396299782a4cc6b4a0ccf67fccbbcc617c94a46026'
>>> sha.hexdigest()
'b80cdaae9b322bba2a38fd396299782a4cc6b4a0ccf67fccbbcc617c94a46026'

>>> bytes.fromhex("cow")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: non-hexadecimal number found in fromhex() arg at position 1
>>> "cow".encode()
b'cow'
>>> "beef".encode()
b'beef'

>>> binascii.b2a_hex(b"beef")
b'62656566'
>>> binascii.b2a_hex(bytes.fromhex("beef"))
b'beef'
>>> bytes.fromhex("beef").hex()
'beef'
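For fixed-size hashes, type plus length gives a rough sanity check on which representation you are holding: a SHA-256 digest is 32 raw bytes, so its hexstring is 64 characters. The sketch below is a heuristic under that assumption, and describe_sha256_value is a made-up name; 64 bytes that happen to be hex digits could in principle be something else.

```python
import hashlib
import string

SHA256_RAW_LEN = hashlib.sha256().digest_size   # 32 bytes
SHA256_HEX_LEN = 2 * SHA256_RAW_LEN             # 64 hex characters
HEX_DIGITS = set(string.hexdigits)

def describe_sha256_value(value):
    """Guess which representation of a SHA-256 hash this is (heuristic)."""
    if isinstance(value, str):
        if len(value) == SHA256_HEX_LEN and set(value) <= HEX_DIGITS:
            return "hexstring"
        return "unknown"
    if isinstance(value, bytes):
        if len(value) == SHA256_RAW_LEN:
            return "raw digest"
        # Non-ASCII bytes decode to U+FFFD, which is not a hex digit.
        text = value.decode("ascii", "replace")
        if len(value) == SHA256_HEX_LEN and set(text) <= HEX_DIGITS:
            return "ASCII-encoded hexstring"
        return "unknown"
    return "unknown"

sha = hashlib.sha256(b"fee")
print(describe_sha256_value(sha.digest()))              # raw digest
print(describe_sha256_value(sha.hexdigest()))           # hexstring
print(describe_sha256_value(sha.hexdigest().encode()))  # ASCII-encoded hexstring
```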




See the comments at https://mindstalk.dreamwidth.org/513680.html#comments

mindstalk
Damien Sullivan