2to3

Moving from Python 2 to Python 3

Python 2 has a limited lifetime, and by 2020, there will no longer be any active development on Python 2.

http://legacy.python.org/dev/peps/pep-0373/

Why? Apparently it was easier to make a shiny new Python by breaking backwards compatibility. The good news is that it's relatively painless to switch small projects over to Python 3, and most major Python packages already support Python 3 (including most of the scientific stack: numpy, scipy, pandas, astropy).

In [1]:
import sys
print(sys.version)
3.5.3 (default, Mar 21 2017, 17:21:33) 
[GCC 6.3.1 20161221 (Red Hat 6.3.1-1)]
In [3]:
2 / 3
Out[3]:
0.6666666666666666

A (non-exhaustive) list of differences between Python 2 and Python 3

  • print is now a function, no longer a keyword
  • exec is now a function, no longer a keyword
  • division, /, no longer truncates! (no more 2/3 == 0)
  • all strings are unicode (this is... controversial)
  • the functions range(), zip(), map(), filter(), dict.keys(), dict.items(), dict.values() all return lazy objects (iterators, views, or a range object) instead of lists
  • exceptions are handled using a slightly different syntax
  • strict comparisons, so 'a' < 1 will fail with an error
  • from the standard library, urllib is reorganized
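
A few of the differences listed above can be checked directly in a Python 3 session. This is only a small sketch; the variable names (namespace, squares, message, comparison_failed) are made up for illustration.

```python
# exec (like print) is an ordinary function in Python 3
namespace = {}
exec("answer = 6 * 7", namespace)

# / is true division; use // when you want the floor
true_div = 2 / 3
floor_div = 2 // 3

# map() is lazy: wrap it in list() to get an actual list
squares = list(map(lambda x: x ** 2, range(5)))

# exceptions are caught with "except SomeError as name"
try:
    int("not a number")
except ValueError as err:
    message = str(err)

# ordering comparisons between unrelated types raise TypeError
try:
    'a' < 1
    comparison_failed = False
except TypeError:
    comparison_failed = True

print(namespace["answer"], true_div, floor_div, squares, comparison_failed)
```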

For a more complete list, see

http://ptgmedia.pearsoncmg.com/imprint_downloads/informit/promotions/python/python2python3.pdf

Cool things in Python 3

Some of these have been back-ported to Python 2.7

In [9]:
'1' < 2
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-9-b67fbc3e6cdc> in <module>()
----> 1 '1' < 2

TypeError: unorderable types: str() < int()
In [6]:
range(5)[0]
Out[6]:
0
In [10]:
# python2 has list comprehensions
[x ** 2 for x in range(5)]
Out[10]:
[0, 1, 4, 9, 16]
In [11]:
# python3 has dict comprehensions!
{str(x): x ** 2 for x in range(5)}
Out[11]:
{'0': 0, '1': 1, '2': 4, '3': 9, '4': 16}
In [12]:
# and set comprehensions
{x ** 2 for x in range(5)}
Out[12]:
{0, 1, 4, 9, 16}
In [29]:
# magic dictionary concatenation
some_kwargs = {'do': 'this', 
               'not': 'that'}
other_kwargs = {'use': 'something', 
                'when': 'sometime',
                'do': "other"}
{**some_kwargs, **other_kwargs}
Out[29]:
{'do': 'other', 'not': 'that', 'use': 'something', 'when': 'sometime'}
In [22]:
# unpacking magic
a, *stuff, b, c = range(8)
print(a)
print(stuff)
print(b, c)
0
[1, 2, 3, 4, 5]
6 7
In [17]:
print(a)
0
In [18]:
print(stuff)
[1, 2, 3, 4, 5]
In [19]:
print(*stuff)
1 2 3 4 5
In [20]:
print(1, 2, 3, 4, 5)
1 2 3 4 5
In [24]:
# native support for unicode
s = 'Το Ζεν του Πύθωνα ηορρυ'
print(s)
Το Ζεν του Πύθωνα ηορρυ
In [25]:
# unicode variable names!
import numpy as np
π = np.pi
np.cos(2 * π)
Out[25]:
1.0
In [26]:
# infix matrix multiplication
A = np.random.choice(list(range(-9, 10)), size=(3, 3))
B = np.random.choice(list(range(-9, 10)), size=(3, 3))
print("A = \n", A)
print("B = \n", B)

print("A B = \n", A @ B)
print("A B = \n", np.dot(A, B))
A = 
 [[-9 -1 -2]
 [ 1 -5 -5]
 [-2  0 -4]]
B = 
 [[ 1  7 -3]
 [ 2 -4  3]
 [ 5 -6  3]]
A B = 
 [[-21 -47  18]
 [-34  57 -33]
 [-22  10  -6]]
A B = 
 [[-21 -47  18]
 [-34  57 -33]
 [-22  10  -6]]

New string formatting

The old string formatting (with %) still works, but str.format() is generally preferred for new code. A good comparison of the two can be found here:

https://pyformat.info/
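
As a quick side-by-side, here is the same string built both ways. The values (name, value) are placeholders chosen for illustration.

```python
name, value = "pi", 3.14159

# old style: % operator with a tuple of arguments
old_style = "%s is about %.2f" % (name, value)

# new style: str.format() with positional and keyword fields
new_style = "{} is about {:.2f}".format(name, value)
keyword_style = "{n} is about {v:.2f}".format(n=name, v=value)

print(old_style)
print(new_style)
print(keyword_style)
```

All three lines print the same text; the keyword form tends to be easier to read once there are more than two or three fields.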

Unicode

Dealing with unicode can be a pain when *nix doesn't give or expect unicode. Sometimes importing data in Python 3 will give you strings with a weird b in front. These are bytestrings (bytes objects), and they can usually be converted to unicode strings with bytestring.decode('utf-8'), provided the data really is UTF-8 encoded.

In [14]:
s = 'asdf'
b = s.encode('utf-8')
b
Out[14]:
b'asdf'
In [15]:
b.decode('utf-8')
Out[15]:
'asdf'
In [16]:
# this will be problematic if other encodings are used...
s = 'asdf'
b = s.encode('utf-32')
b
Out[16]:
b'\xff\xfe\x00\x00a\x00\x00\x00s\x00\x00\x00d\x00\x00\x00f\x00\x00\x00'
In [17]:
b.decode('utf-8')
---------------------------------------------------------------------------
UnicodeDecodeError                        Traceback (most recent call last)
<ipython-input-17-dbeeccecf491> in <module>()
----> 1 b.decode('utf-8')

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte
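
If the encoding is known, decoding with the matching codec avoids the error above. If it isn't known, the errors argument of bytes.decode() lets you trade the exception for replacement characters. The sketch below reuses the 'asdf' example; the names round_trip and lossy are just for illustration.

```python
s = 'asdf'
b = s.encode('utf-32')

# decoding with the codec that was used for encoding recovers the text
round_trip = b.decode('utf-32')

# decoding with the wrong codec but errors='replace' does not raise;
# undecodable bytes become U+FFFD, so the result is lossy
lossy = b.decode('utf-8', errors='replace')

print(round_trip)
print(repr(lossy))
```

The lossy version is mainly useful for inspecting data of unknown origin; it does not give back the original string.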

Writing code for both Python 2 and Python 3

Ever wonder what those from __future__ import foo statements were doing? They switch on Python 3 behavior (such as the print function or true division) in Python 2.

http://python-future.org/quickstart.html

future

Using the future package (a third-party package, separate from the standard-library __future__ module), you can write code that works for either Python 2 or Python 3. You'll still have to avoid using some Python 3-specific syntax.

In [27]:
# shouldn't change anything in python3
from __future__ import print_function, division

print('non-truncated division in a print function: 2/3 =', 2/3)
non-truncated division in a print function: 2/3 = 0.6666666666666666
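
Beyond __future__ imports, a common pattern for code that must run on both versions is to try the Python 3 import location first and fall back to the Python 2 one. The following is a minimal sketch using only the standard library (it does not use the future package); PY3, text_type and query are illustrative names, not part of any API.

```python
import sys

try:
    from urllib.parse import urlencode   # Python 3 location
except ImportError:
    from urllib import urlencode         # Python 2 location

PY3 = sys.version_info[0] >= 3

# the unicode text type is str on Python 3 and unicode on Python 2;
# the name unicode is only looked up when running on Python 2
text_type = str if PY3 else unicode  # noqa: F821

query = urlencode({'q': 'python'})
print(query)
```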

Automagically converting Python 2 to Python 3

2to3 will convert Python 2 code to Python 3. It may come with your Python installation, or you may have to install it separately (in the Fedora package repository it is found under python-tools).

Simply run 2to3 myscript.py to see the diff of changes, then run 2to3 -w myscript.py to write the changes. The old file is saved as myscript.py.bak. You can also run it on an entire directory to convert a whole package.

Note that there are some edge cases to deal with. For instance, it can't tell where you wanted truncated division versus true division, so any / that relied on integer truncation has to be changed to // by hand. It also leaves old-style % formatting untouched (though this still works).
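
The division edge case typically shows up when a quotient is used as an index or a count. Here is a small sketch of the kind of manual fix that is needed after conversion; midpoint_index and values are hypothetical names used only for this example.

```python
def midpoint_index(items):
    # In Python 2, len(items) / 2 was an int. In Python 3 it is a float,
    # which cannot be used as a list index, so use // explicitly.
    return len(items) // 2

values = [10, 20, 30, 40, 50]
middle = values[midpoint_index(values)]

# the unconverted expression fails in Python 3
try:
    values[len(values) / 2]
    float_index_ok = True
except TypeError:
    float_index_ok = False

print(middle, float_index_ok)
```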

To convert Jupyter notebooks, you can install the jupytercontrib package (via pip as pip install jupytercontrib), then run

jupyter nbconvert --to 2to3 mynotebook.ipynb

It will create a new notebook called mynotebook.nbconvert.ipynb, which will have the relevant cells converted to Python 3 syntax and the default kernel set to python3.
