Coding in Cython¶
This chapter discusses Cython, which is a compiled language based on Python. The major advantage it has over Python is that code can be much faster (sometimes orders of magnitude) and can directly call C and C++ code. As Cython is essentially a superset of the Python language, one often doesn’t make a distinction between Cython and Python code in Sage (e.g. one talks of the “Sage Python Library” and “Python Coding Conventions”).
Python is an interpreted language and has no declared data types for variables. These features make it easy to write and debug, but Python code can sometimes be slow. Cython code can look a lot like Python, but it gets translated into C code (often very efficient C code) and then compiled. Thus it offers a language which is familiar to Python developers, but with the potential for much greater speed. Cython also allows Sage developers to interface with C and C++ much easier than using the Python C API directly.
Cython is a compiled version of Python. It was originally based on Pyrex but has changed based on what Sage’s developers needed; Cython has been developed in concert with Sage. However, it is an independent project now, which is used beyond the scope of Sage. As such, it is a young, but developing language, with young, but developing documentation. See its web page, http://www.cython.org/, for the most up-to-date information or check out the Language Basics to get started immediately.
Writing Cython Code in Sage¶
There are several ways to create and build Cython code in Sage.
In the Sage Notebook, begin any cell with
%cython
. When you evaluate that cell,It is saved to a file.
Cython is run on it with all the standard Sage libraries automatically linked if necessary.
The resulting shared library file (
.so
/.dll
/.dylib
) is then loaded into your running instance of Sage.The functionality defined in that cell is now available for you to use in the notebook. Also, the output cell has a link to the C program that was compiled to create the
.so
file.A
cpdef
ordef
function, saytestfunction
, defined in a%cython
cell in a worksheet can be imported and made available in a different%cython
cell within the same worksheet by importing it as shown below:%cython from __main__ import testfunction
Create an
.spyx
file and attach or load it from the command line. This is similar to creating a%cython
cell in the notebook but works completely from the command line (and not from the notebook).Create a
.pyx
file and add it to the Sage library.First, add a listing for the Cython extension to the variable
ext_modules
in the fileSAGE_ROOT/src/module_list.py
. See thedistutils.extension.Extension
class for more information on creating a new Cython extension.Run
sage -b
to rebuild Sage.
For example, in order to compile
SAGE_ROOT/src/sage/graphs/chrompoly.pyx
, we see the following lines inmodule_list.py
:Extension('sage.graphs.chrompoly', sources = ['sage/graphs/chrompoly.pyx'], libraries = ['gmp']),
Attaching or Loading .spyx Files¶
The easiest way to try out Cython without having to learn anything
about distutils, etc., is to create a file with the extension
spyx
, which stands for “Sage Pyrex”:
Create a file
power2.spyx
.Put the following in it:
def is2pow(n): while n != 0 and n%2 == 0: n = n >> 1 return n == 1
Start the Sage command line interpreter and load the
spyx
file (this will fail if you do not have a C compiler installed).sage: load("power2.spyx") Compiling power2.spyx... sage: is2pow(12) False
Note that you can change power2.spyx
, then load it again and it
will be recompiled on the fly. You can also attach power2.spyx
so
it is reloaded whenever you make changes:
sage: attach("power2.spyx")
Cython is used for its speed. Here is a timed test on a 2.6 GHz Opteron:
sage: %time [n for n in range(10^5) if is2pow(n)]
[1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024, 2048, 4096, 8192, 16384, 32768, 65536]
CPU times: user 0.60 s, sys: 0.00 s, total: 0.60 s
Wall time: 0.60 s
Now, the code in the file power2.spyx
is valid Python, and if we
copy this to a file powerslow.py
and load that, we get the
following:
sage: load("powerslow.py")
sage: %time [n for n in range(10^5) if is2pow(n)]
[1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024, 2048, 4096, 8192, 16384, 32768, 65536]
CPU times: user 1.01 s, sys: 0.04 s, total: 1.05 s
Wall time: 1.05 s
By the way, we could gain even a little more speed with the Cython
version with a type declaration, by changing def is2pow(n):
to
def is2pow(unsigned int n):
.
Interrupt and Signal Handling¶
When writing Cython code for Sage, special care must be taken to ensure
that the code can be interrupted with CTRL-C
.
Sage uses the cysignals package for this, see the cysignals documentation for more information.
Unpickling Cython Code¶
Pickling for Python classes and extension classes, such as Cython, is different.
This is discussed in the Python pickling documentation. For the unpickling of
extension classes you need to write a __reduce__()
method which typically
returns a tuple (f, args, ...)
such that f(*args)
returns (a copy of) the
original object. As an example, the following code snippet is the
__reduce__()
method from
sage.rings.integer.Integer
:
def __reduce__(self):
'''
This is used when pickling integers.
EXAMPLES::
sage: n = 5
sage: t = n.__reduce__(); t
(<cyfunction make_integer at ...>, ('5',))
sage: t[0](*t[1])
5
sage: loads(dumps(n)) == n
True
'''
# This single line below took me HOURS to figure out.
# It is the *trick* needed to pickle Cython extension types.
# The trick is that you must put a pure Python function
# as the first argument, and that function must return
# the result of unpickling with the argument in the second
# tuple as input. All kinds of problems happen
# if we don't do this.
return sage.rings.integer.make_integer, (self.str(32),)