====================================
Differences between PyPy and CPython
====================================

This page documents the few differences and incompatibilities between
the PyPy Python interpreter and CPython. Some of these differences
are "by design", since we think that there are cases in which the
behaviour of CPython is buggy, and we do not want to copy bugs.

Differences that are not listed here should be considered bugs of
PyPy.

Extension modules
-----------------

List of extension modules that we support:

* Supported as built-in modules (in `pypy/module/`_):

    __builtin__
    `__pypy__`_
    _ast
    _bisect
    _codecs
    _lsprof
    `_minimal_curses`_
    _random
    `_rawffi`_
    _ssl
    _socket
    _sre
    _weakref
    array
    bz2
    cStringIO
    `cpyext`_
    crypt
    errno
    exceptions
    fcntl
    gc
    itertools
    marshal
    math
    md5
    mmap
    operator
    parser
    posix
    pyexpat
    select
    sha
    signal
    struct
    symbol
    sys
    termios
    thread
    time
    token
    unicodedata
    zipimport
    zlib

  When translated to Java or .NET, the list is smaller; see
  `pypy/config/pypyoption.py`_ for details.

  When translated on Windows, a few Unix-only modules are skipped,
  and the following module is built instead:

    _winreg

  Extra module with Stackless_ only:

    _stackless

* Supported by being rewritten in pure Python (possibly using ``ctypes``):
  see the `lib_pypy/`_ directory. Examples of modules that we
  support this way: ``ctypes``, ``cPickle``,
  ``cStringIO``, ``cmath``, ``dbm`` (?), ``datetime``, ``binascii``...
  Note that some modules are both in there and in the list above;
  by default, the built-in module is used (but can be disabled
  at translation time).

The extension modules (i.e. modules written in C, in the standard CPython)
that are neither mentioned above nor in `lib_pypy/`_ are not available in PyPy.
(You may have a chance to use them anyway with `cpyext`_.)
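
To see how a given interpreter provides a module, a small probe along the
following lines may help. This is only a sketch: ``module_status`` is a
hypothetical helper name, and it relies solely on the standard
``sys.builtin_module_names`` tuple and a trial import.

```python
import sys


def module_status(name):
    """Classify a module as 'built-in', 'importable' or 'missing'."""
    # Modules compiled into the interpreter are listed in this tuple.
    if name in sys.builtin_module_names:
        return "built-in"
    # Otherwise try a normal import (pure Python or a separate extension).
    try:
        __import__(name)
    except ImportError:
        return "missing"
    return "importable"


sys_status = module_status("sys")
missing_status = module_status("no_such_module_for_this_example")
```

The exact answer for most module names will differ between CPython and
PyPy builds, which is the point of running the probe on both.
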
.. the nonstandard modules are listed below...
.. _`__pypy__`: __pypy__-module.html
.. _`_rawffi`: ctypes-implementation.html
.. _`_minimal_curses`: config/objspace.usemodules._minimal_curses.html
.. _`cpyext`: http://morepypy.blogspot.com/2010/04/using-cpython-extension-modules-with.html
.. _Stackless: stackless.html

Differences related to garbage collection strategies
----------------------------------------------------

Most of the garbage collectors used or implemented by PyPy are not based on
reference counting, so the objects are not freed instantly when they are no
longer reachable. The most obvious effect of this is that files are not
promptly closed when they go out of scope. For files that are opened for
writing, data can be left sitting in their output buffers for a while, making
the on-disk file appear empty or truncated.

Fixing this is essentially not possible without forcing a
reference-counting approach to garbage collection. The effect that you
get in CPython has clearly been described as a side-effect of the
implementation and not a language design decision: programs relying on
this are basically bogus. It would anyway be insane to try to enforce
CPython's behavior in a language spec, given that it has no chance to be
adopted by Jython or IronPython (or any other port of Python to Java or
.NET, like PyPy itself).
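
Code that must behave the same on both interpreters can close files
explicitly instead of relying on the garbage collector. A minimal sketch,
assuming a hypothetical ``write_report`` helper and a temporary file
created only for illustration:

```python
import os
import tempfile


def write_report(path, lines):
    # The with-statement closes (and therefore flushes) the file when the
    # block exits, without waiting for any garbage collection.
    with open(path, "w") as f:
        for line in lines:
            f.write(line + "\n")


path = os.path.join(tempfile.mkdtemp(), "report.txt")
write_report(path, ["first", "second"])

# The data is on disk as soon as write_report() returns.
with open(path) as f:
    content = f.read()
```

Calling ``f.close()`` in a ``try``/``finally`` block has the same effect
where a ``with`` statement is not convenient.
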

This affects the precise time at which ``__del__`` methods are called, which
is not reliable in PyPy (nor Jython nor IronPython). It also means that
weak references may stay alive for a bit longer than expected. This
makes "weak proxies" (as returned by ``weakref.proxy()``) somewhat less
useful: they will appear to stay alive for a bit longer in PyPy, and
suddenly they will really be dead, raising a ``ReferenceError`` on the
next access. Any code that uses weak proxies must carefully catch such
``ReferenceError`` at any place that uses them.
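
One way to follow this advice is to route every proxy access through a
small guard. The sketch below is illustrative only: ``Node`` and
``read_value`` are hypothetical names, and the explicit ``gc.collect()``
is there so that the referent is reclaimed at a known point on
interpreters without reference counting.

```python
import gc
import weakref


class Node(object):
    def __init__(self, value):
        self.value = value


def read_value(proxy, default=None):
    # Every access through a weak proxy may raise ReferenceError once the
    # referent has been collected, so guard the access itself.
    try:
        return proxy.value
    except ReferenceError:
        return default


node = Node(7)
proxy = weakref.proxy(node)
alive = read_value(proxy)

del node
gc.collect()  # force a collection instead of waiting for one
dead = read_value(proxy, default="gone")
```
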

There are a few extra implications for the difference in the GC. Most
notably, if an object has a ``__del__``, the ``__del__`` is never called more
than once in PyPy; but CPython will call the same ``__del__`` several times
if the object is resurrected and dies again. The ``__del__`` methods are
called in "the right" order if they are on objects pointing to each
other, as in CPython, but unlike CPython, if there is a dead cycle of
objects referencing each other, their ``__del__`` methods are called anyway;
CPython would instead put them into the list ``garbage`` of the ``gc``
module. More information is available on the blog `[1]`__ `[2]`__.

.. __: http://morepypy.blogspot.com/2008/02/python-finalizers-semantics-part-1.html
.. __: http://morepypy.blogspot.com/2008/02/python-finalizers-semantics-part-2.html

Using the default GC called ``minimark``, the built-in function ``id()``
works like it does in CPython. With other GCs it returns numbers that
are not real addresses (because an object can move around several times)
and calling it a lot can lead to performance problems.

Note that if you have a long chain of objects, each with a reference to
the next one, and each with a ``__del__``, PyPy's GC will perform badly. On
the bright side, in most other cases, benchmarks have shown that PyPy's
GCs perform much better than CPython's.

Another difference is that if you add a ``__del__`` to an existing class it will
not be called::

    >>>> class A(object):
    ....     pass
    ....
    >>>> A.__del__ = lambda self: None
    __main__:1: RuntimeWarning: a __del__ method added to an existing type will not be called
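
A possible way around this, sketched here under the assumption that the
class can be edited, is to define ``__del__`` in the class body and have
it delegate to an ordinary method that may be replaced later. The
``on_delete`` hook name is hypothetical and not part of any API.

```python
import gc

calls = []


class A(object):
    def on_delete(self):
        # Default hook: do nothing.
        pass

    def __del__(self):
        # __del__ exists from class creation, so the interpreter knows
        # that instances need finalization; the behaviour lives in the hook.
        self.on_delete()


# Replacing the hook later is an ordinary attribute assignment and does
# not add a __del__ to an existing type.
A.on_delete = lambda self: calls.append("finalized")

a = A()
del a
gc.collect()  # make the finalizer run at a known point
```
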

Subclasses of built-in types
----------------------------

Officially, CPython has no rule at all for when exactly
overridden methods of subclasses of built-in types get
implicitly called or not. As an approximation, these methods
are never called by other built-in methods of the same object.
For example, an overridden ``__getitem__()`` in a subclass of
``dict`` will not be called by e.g. the built-in ``get()``
method.

The above is true both in CPython and in PyPy. Differences
can occur about whether a built-in function or method will
call an overridden method of *another* object than ``self``.
In PyPy, they are generally always called, whereas in CPython
they are not. For example, in PyPy, ``dict1.update(dict2)``
considers that ``dict2`` is just a general mapping object, and
will thus call overridden ``keys()`` and ``__getitem__()``
methods on it. So the following code prints ``42`` on PyPy
but ``foo`` on CPython::

    >>>> class D(dict):
    ....     def __getitem__(self, key):
    ....         return 42
    ....
    >>>>
    >>>> d1 = {}
    >>>> d2 = D(a='foo')
    >>>> d1.update(d2)
    >>>> print d1['a']
    42
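
When the overridden methods must be honoured on every interpreter, the
copy can be spelled out through the public mapping protocol rather than
left to ``update()``. A minimal sketch, where ``copy_items`` is a
hypothetical helper:

```python
class D(dict):
    def __getitem__(self, key):
        return 42


def copy_items(target, source):
    # Explicit calls to keys() and __getitem__() go through normal method
    # lookup, so overrides in a subclass are always used.
    for key in source.keys():
        target[key] = source[key]


d1 = {}
d2 = D(a='foo')
copy_items(d1, d2)
```

Conversely, passing ``dict.items(d2)`` to ``update()`` would bypass the
override explicitly, if the raw stored values are what is wanted.
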

Ignored exceptions
------------------

In many corner cases, CPython can silently swallow exceptions.
The precise list of when this occurs is rather long, even
though most cases are very uncommon. The most well-known
places are custom rich comparison methods (like ``__eq__``);
dictionary lookup; calls to some built-in functions like
``isinstance()``.

Unless this behavior is clearly present by design and
documented as such (as e.g. for ``hasattr()``), in most cases PyPy
lets the exception propagate instead.

Miscellaneous
-------------

* ``sys.setrecursionlimit()`` is ignored (and not needed) on
  PyPy. On CPython it would set the maximum number of nested
  calls that can occur before a ``RuntimeError`` is raised; on PyPy
  overflowing the stack also causes a ``RuntimeError``, but the limit
  is checked at a lower level. (The limit is currently hard-coded
  at 768 KB, corresponding to roughly 1480 Python calls on
  Linux.)
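
Since both interpreters report runaway recursion as a ``RuntimeError``
(or a subclass of it), code can catch that exception rather than tune
the limit. The following sketch, with the hypothetical helper
``measure_depth``, counts how many nested calls succeed before the error
is raised; the number it returns depends on the interpreter and platform.

```python
def measure_depth():
    depth = [0]  # a mutable cell, so the nested function can update it

    def recurse():
        depth[0] += 1
        recurse()

    try:
        recurse()
    except RuntimeError:
        # Raised on CPython when the recursion limit is hit and on PyPy
        # when the low-level stack check fails.
        pass
    return depth[0]


max_depth = measure_depth()
```
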
.. include:: _ref.txt