Python & standard output redirection on ms-windows
Redirecting the standard output of a Python script on ms-windows can cause strange crashes because of encoding differences.
The problem
Recently, I saw the following traceback when running the lamprop console application on ms-windows.
U:\>lamprop.py foo.lam >foo.txt
Traceback (most recent call last):
  File "C:\_LocalData\Python3\lib\runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\_LocalData\Python3\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "__main__.py", line 97, in <module>
  File "__main__.py", line 93, in main
  File "C:\_LocalData\Python3\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u03bd' in position 0: character maps to <undefined>
(The character \u03bd is the Greek letter ν.)
Since Python 3.6, utf-8 is used as the encoding for sys.stdout on ms-windows, as long as it is not redirected.
From the documentation:
On Windows, UTF-8 is used for the console device. Non-character devices such as disk files and pipes use the system locale encoding.
Suppose you have the following program:
import sys
print(sys.stdout.encoding)
Normally, when you call this program from cmd.exe, it will report utf-8.
But if you redirect its output to a file, that file will contain the system locale encoding, in this case cp1252!
So if you try to print a character that cannot be encoded in that
codepage, the above-mentioned error occurs.
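The failure can be reproduced without any redirection by encoding the character directly. The following is only a minimal sketch; the helper name can_encode is made up for this illustration and is not part of lamprop.

```python
def can_encode(text, encoding):
    """Return True if text can be encoded with the given codec."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


nu = "\u03bd"  # Greek small letter nu

# cp1252 has no slot for this character; utf-8 can encode any code point.
print(can_encode(nu, "cp1252"))
print(can_encode(nu, "utf-8"))
print(nu.encode("utf-8"))
```

The same check could be used to test other characters against the codepage of a particular machine.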
The solution
In the console version of lamprop, I added the following code to the program initialization.
import os
import sys
if os.name == "nt":  # only ms-windows needs this
    sys.stdout.reconfigure(encoding="utf-8")
This forces the output to utf-8 even when it is redirected.
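To see the effect of reconfigure without touching the real sys.stdout, one can try it on an in-memory stream. This is a sketch under assumptions: the io.BytesIO buffer stands in for the redirected file, and cp1252 is set by hand to mimic what ms-windows would choose. Note that reconfigure requires Python 3.7 or later.

```python
import io

# Stand-in for a redirected stdout: a text layer over a byte buffer,
# opened with the cp1252 encoding that would be used for a file.
raw = io.BytesIO()
stream = io.TextIOWrapper(raw, encoding="cp1252")

# Switch the encoding before writing; without this line the write
# below would raise UnicodeEncodeError.
stream.reconfigure(encoding="utf-8")
stream.write("\u03bd")
stream.flush()

written = raw.getvalue()
print(written)
```

The buffer then holds the utf-8 bytes for ν instead of the program crashing.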
For comments, please send me an e-mail.