Announcement: PyCuda 0.91: Windows and OS X support, better arrays, a compiler cache, oh my!

Hi all,

I’m happy to announce the availability of PyCuda 0.91. There is full, up-to-date documentation available.

The following exciting stuff is in PyCuda 0.91:

    [*] Support for Windows and MacOS X, in addition to Linux. (Gert Wohlgemuth, Cosmin Stejerean, Znah on the Nvidia forums, and David Gadling)

    [*] Support more arithmetic operators on pycuda.gpuarray.GPUArray. (Gert Wohlgemuth)

    [*] Add pycuda.gpuarray.arange(). (Gert Wohlgemuth)

    [*] Add pycuda.curandom. (Gert Wohlgemuth)

    [*] Add pycuda.cumath. (Gert Wohlgemuth)

    [*] Add pycuda.autoinit.

    [*] Add pycuda.tools.

    [*] Add pycuda.tools.DeviceData and pycuda.tools.OccupancyRecord.

    [*] pycuda.gpuarray.GPUArray parallelizes properly on GTX200-generation devices.

    [*] Add support for compiling on CUDA 1.1. Add version query pycuda.driver.get_version(). Update documentation to show 2.0-only functionality.

    [*] Make pycuda.driver.Function resource usage available to the program. (See, e.g., pycuda.driver.Function.registers.)

    [*] Cache kernels compiled by pycuda.driver.SourceModule.

    [*] Allow for faster, prepared kernel invocation. See pycuda.driver.Function.prepare().

    [*] Add memory pools, at pycuda.tools.DeviceMemoryPool, as experimental, undocumented functionality. For some workloads, this can cure the slowness of pycuda.driver.mem_alloc().

    [*] Fix the memset family of functions.

    [*] Improve error reporting.

Check the docs change list for a fully hyperlinked version of the above.
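
To give a feel for how several of the items above fit together, here is a minimal sketch, not an official example. It assumes PyCuda, numpy, and a CUDA-capable GPU are installed. The kernel double_in_place and the helpers launch_config and run_demo are hypothetical names made up for illustration. The prepare()/prepared_call() argument layout and the pycuda.compiler import follow later PyCuda releases and may differ in 0.91, so check the documentation for your version.

```python
# CUDA C source for a hypothetical kernel that doubles each element in place.
KERNEL_SOURCE = """
__global__ void double_in_place(float *a, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        a[i] *= 2.0f;
}
"""


def launch_config(n, block_size=256):
    """Return (grid, block) tuples that cover n elements with 1-D blocks."""
    num_blocks = (n + block_size - 1) // block_size  # round up
    return (num_blocks, 1), (block_size, 1, 1)


def run_demo(n=1000):
    """Exercise a few of the new features; needs PyCuda and a GPU."""
    import numpy as np
    import pycuda.autoinit  # noqa: F401  (sets up a context on import)
    import pycuda.driver as drv
    import pycuda.gpuarray as gpuarray
    import pycuda.cumath as cumath
    import pycuda.curandom as curandom
    try:
        # Assumption: later releases keep SourceModule here.
        from pycuda.compiler import SourceModule
    except ImportError:
        # Location named in this announcement.
        from pycuda.driver import SourceModule

    print("CUDA version:", drv.get_version())

    # arange(), curandom, cumath and arithmetic operators, all on the GPU.
    x = gpuarray.arange(n, dtype=np.float32)
    noise = curandom.rand((n,), dtype=np.float32)
    y = cumath.sin(x) + 0.5 * noise

    # Compile the kernel (cached by PyCuda) and look up the function.
    mod = SourceModule(KERNEL_SOURCE)
    func = mod.get_function("double_in_place")

    # Prepared invocation: declare the argument layout once ("P" is a
    # pointer, "i" an int), then launch without per-call type inspection.
    # Assumption: this call signature matches later PyCuda releases.
    grid, block = launch_config(n)
    func.prepare("Pi")
    func.prepared_call(grid, block, y.gpudata, np.int32(n))

    return y.get()  # copy the result back to a numpy array
```

Calling run_demo() on a machine with a working GPU should return a numpy array of length n. The PyCuda imports live inside run_demo so that the module itself can be imported on a machine without CUDA.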

Have fun,

Andreas