Byte Strings Are Decoded To ASCII In IO By Default In Python

1. encode and decode: the terminology

  • string = <class 'str'> = decoded byte string → text that is read by the brain
  • byte string = <class 'bytes'> = encoded string → raw bytes that are stored and transmitted by the machine
  • when a byte string is printed or echoed in the REPL, python shows its repr: bytes in the printable ASCII range appear as characters, all other bytes as \x escapes; nothing is actually decoded, which is confusing unless known
  • to get rid of the b' prefix, call the decode() method on the byte string, which returns a str
  • but if there are non-ASCII characters, you will notice the pain immediately: the variable prints as \x escapes, and decoding or encoding with the ascii codec raises an error
>>> a = 'š'.encode('utf-8')
>>> a
b'\xc5\xa1'
>>> a.decode('ascii')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc5 in position 0: ordinal not in range(128)
>>> a.decode('utf-8')
'š'
>>> a
b'\xc5\xa1'
>>> print(a)
b'\xc5\xa1'
>>> a = 'š'.encode('ascii')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character '\u0161' in position 0: ordinal not in range(128)
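
The session above can also be condensed into a small script. This is only a sketch: the variable names are made up for illustration, and the errors= handlers are one possible way of coping with a wrong codec, not something the notes above prescribe.

```python
# Sketch: how a byte string is displayed versus how it is decoded.
# All variable names are illustrative only.

ascii_only = "abc".encode("utf-8")
mixed = "aš".encode("utf-8")

# repr() is what the REPL and print() show for bytes: printable ASCII
# bytes appear as characters, every other byte as a \x escape.
ascii_repr = repr(ascii_only)
mixed_repr = repr(mixed)

# decode() returns a str, which is what makes the b'' prefix go away.
decoded = mixed.decode("utf-8")

# With the wrong codec, decode() raises unless an error handler is given.
try:
    mixed.decode("ascii")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# errors="replace" swaps each undecodable byte for U+FFFD,
# errors="ignore" silently drops it.
replaced = mixed.decode("ascii", errors="replace")
ignored = mixed.decode("ascii", errors="ignore")

print(ascii_repr)
print(mixed_repr)
print(ascii(decoded), strict_failed, ascii(replaced), ignored)
```

Both handlers lose information, so decoding with the codec that was used for encoding (utf-8 here) remains the safer choice whenever it is known.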

5. links

Infrastructure Support Engineer/Technical Writer (Snowplow Analytics) with a passion for Python/writing documentation. More about me: https://pavol.kutaj.com

Pavol Kutaj
