Byte Strings Are Decoded To ASCII In IO By Default In Python

Pavol Kutaj
2 min read · Jan 13, 2022

The aim of this page 📝 is to explain why Python seemingly prints characters even for byte strings. I ran into this while writing scripts that request values from the Consul KV store with the requests module: the values arrived (logically) as byte strings, so they showed up with the b' prefix when I printed them to the console for the users (my teammates).

1. encode and decode: the terminology

  • string = <class 'str'> = decoded byte string → the form that is read by the brain
  • byte string = <class 'bytes'> = encoded string → the form that is stored and sent around by the machine
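  • by default, when a byte string is printed, Python shows every byte in the printable ASCII range as its ASCII character and every other byte as a \x escape; the object itself stays <class 'bytes'>, which is confusing unless known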
  • to remove the b' prefix, you need to call the decode() method on the byte string, which returns a regular string
  • but if there are non-ASCII characters, you will notice/feel the pain immediately, even when just accessing/printing the variable
>>> a = 'š'.encode('utf-8')
>>> a
b'\xc5\xa1'
>>> a.decode('ascii')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc5 in position 0: ordinal not in range(128)
>>> a.decode('utf-8')
'š'
>>> a
b'\xc5\xa1'
>>> print(a)
b'\xc5\xa1'
>>> a = 'š'.encode('ascii')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character '\u0161' in position 0: ordinal not in range(128)
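
The session above can be condensed into a small script. This is only a minimal sketch: to_text() is a hypothetical helper name (not part of any library), and UTF-8 is an assumed default encoding for the incoming values, which may not match what your service actually returns.

```python
def to_text(value, encoding="utf-8"):
    """Hypothetical helper: return a str for either bytes or str input.

    The utf-8 default is an assumption; pass the encoding your data really uses.
    """
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


ascii_only = b"hello"
non_ascii = "š".encode("utf-8")

# Printing bytes shows their repr: printable ASCII bytes appear as characters,
# everything else appears as \x escapes. Nothing is decoded at this point.
print(repr(ascii_only))  # b'hello'
print(repr(non_ascii))   # b'\xc5\xa1'

# Decoding is what actually turns bytes into a str and drops the b' prefix.
print(to_text(ascii_only))  # hello

decoded = to_text(non_ascii)
# Two encoded bytes become one character after decoding.
print(decoded == "š", len(non_ascii), len(decoded))  # True 2 1
```

The isinstance() check lets the same call handle values that are already strings, so the printing code does not need to know which type arrived.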
