13. simple_cl Module

This module provides a convenient way to work with OpenCL via pyopencl: it simplifies access to devices, allows the floating-point precision to be chosen on the fly, and provides common complex-variable functions.

It also simplifies function calls by assuming that the desired number of work-groups and work-items is the number of compute units and the max workgroup size, respectively. This means the looping needs to be handled in the kernel, as in the example below.
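To see why the kernel has to loop, the indexing pattern can be simulated in plain Python. This is only a sketch: `work_item_indices` and `covered_indices` are hypothetical helpers, not part of simple_cl. They mimic a kernel loop of the form `for (i = get_global_id(0); i < np; i += get_global_size(0))`, where a fixed number of work-items strides over an array of any length.

```python
def work_item_indices(global_id, global_size, n):
    """Indices handled by one work-item in a strided loop (hypothetical helper)."""
    return list(range(global_id, n, global_size))


def covered_indices(global_size, n):
    """Collect the indices visited by all work-items together."""
    seen = []
    for gid in range(global_size):
        seen.extend(work_item_indices(gid, global_size, n))
    return sorted(seen)


# Work-item 1 of 4, on an array of 10 elements.
print(work_item_indices(1, 4, 10))
# Together the 4 work-items visit every element exactly once.
print(covered_indices(4, 10) == list(range(10)))
```

The same loop stays correct when there are more work-items than elements: the extra work-items simply do nothing.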

13.1. Overview

CLSession([device, use_doubles, group_size, …])

Create an OpenCL session on the specified device, while defining float precision and some utility functions.

acquire_opencl_devices(devname)

Find OpenCL devices with matching descriptions.

is_job_done(event)

Check if an OpenCL task is done.

get_device_info(device, field)

Get information about an OpenCL device.

13.2. Example Usage

from ilpm import simple_cl
import numpy as np
import time

NP = 10**6
REPEATS = 10

#Here we are creating an OpenCL context on the CPU.  Why bother?  This
#  causes the code execution to be multithreaded, which should speed it up!
ctx = simple_cl.CLSession(device='cpu', use_doubles=False)

#OpenCL code
ctx.compile('''
    __kernel void dot_product(__global REAL* a, __global REAL* b, int np, __global REAL* result)
    {
        for(int i = get_global_id(0); i < np; i += get_global_size(0)) {
            result[i] = dot(get_3D(a, i), get_3D(b, i));
        }
    }''')

#Initial data
a = np.random.rand(NP, 3)
b = np.random.rand(NP, 3)

#Create version of data on device
a_cl = ctx.to_device(a)
b_cl = ctx.to_device(b)
result_cl = ctx.empty(NP)

#Run in OpenCL
start = time.time()
for n in range(REPEATS):
    event = ctx.dot_product(a_cl, b_cl, NP, result_cl)
event.wait()
print('OpenCL: %5.1f ms' % ((time.time() - start)*1E3))

result = result_cl.get()


#Run in numpy
start = time.time()
for n in range(REPEATS):
    result_n = (a*b).sum(-1)
print(' Numpy: %5.1f ms' % ((time.time() - start)*1E3))


#Find error; note that the OpenCL version is single precision, as we specified
#  use_doubles=False on the context creation.  This causes an error equal to
#  the single precision size.
print('Dot product error:', np.abs(result_n - result).sum() / NP)
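The size of that error can be estimated without OpenCL. The sketch below repeats the dot product in numpy with the inputs rounded to float32, as an approximation of what a single-precision kernel computes. The device may round intermediate results differently, so the number is only indicative.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.random((1000, 3))
b = rng.random((1000, 3))

# Reference result in double precision.
exact = (a * b).sum(-1)

# Same computation with the inputs rounded to single precision first.
single = (a.astype(np.float32) * b.astype(np.float32)).sum(-1)

mean_error = np.abs(exact - single).sum() / len(exact)
print('Mean error:      %g' % mean_error)
print('float32 epsilon: %g' % np.finfo(np.float32).eps)
```

The mean error should come out within a small factor of the float32 machine epsilon, which is what "an error equal to the single precision size" refers to.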

Say you want to run your code on the computer’s GPU rather than its CPU. This option is available on all recent high-end Macs (maybe not the MacBook Airs?). To find the name of your GPU for use in your code, list the available OpenCL devices; this can be done with pyopencl directly, without using simple_cl.
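A minimal sketch of such a listing, using only pyopencl calls (`get_platforms`, `Platform.get_devices`, `Device.name`). The helper name `list_opencl_device_names` is made up for this example, and it returns an empty list when pyopencl or an OpenCL platform is unavailable.

```python
def list_opencl_device_names():
    """Return the names of all OpenCL devices visible to pyopencl.

    Hypothetical helper: returns [] if pyopencl is not installed or if
    querying the platforms fails.
    """
    try:
        import pyopencl as cl
    except ImportError:
        return []
    names = []
    try:
        for platform in cl.get_platforms():
            for device in platform.get_devices():
                names.append(device.name)
    except cl.Error:
        return []
    return names


for name in list_opencl_device_names():
    print(name)
```

Any of the printed names (or a distinctive part of one) can then be passed as the `device` argument of `CLSession`.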

We can choose from the list. For instance, my MBP has 3 devices: the CPU, an Iris graphics card, and an AMD graphics card. So I could say:

ctx = simple_cl.CLSession(device='AMD Radeon R9 M370X Compute Engine', use_doubles=True)

13.3. Classes and Functions

exception ilpm.simple_cl.SimpleCLWarning

Bases: Warning

ilpm.simple_cl.acquire_opencl_devices(devname)

Find OpenCL devices with matching descriptions.

Parameters
devname : string

A partial name for an OpenCL device, or “cpu” or “gpu”.

Returns
devices : list of OpenCL devices

All matching devices. May be empty.

class ilpm.simple_cl.CLSession(device=['cpu', 'gpu'], use_doubles=False, group_size=None, show_warnings=False)

Bases: object

Create an OpenCL session on the specified device, while defining float precision and some utility functions.

The actual creation of the OpenCL context is delayed until necessary. This allows the context parameters to be modified before it is used, which is useful for external modules where the user may wish to change the device used for calculations.

The following data types are defined as members of the class:

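========  ===========================  ============
Name      OpenCL type                  Size (bytes)
========  ===========================  ============
real      REAL (float or double)       4 or 8
complex   COMPLEX (float2 or double2)  8 or 16
char      char                         1
uchar     uchar                        1
short     short                        2
ushort    ushort                       2
int       int                          4
uint      uint                         4
long      long                         8
ulong     ulong                        8
========  ===========================  ============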
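For orientation, the sizes in this table match the corresponding numpy types. The mapping below is an assumption made for illustration (the function `session_dtypes` is hypothetical and not part of simple_cl); it only shows which numpy dtype has the same size as each listed OpenCL type.

```python
import numpy as np


def session_dtypes(use_doubles=False):
    """Hypothetical mapping from the names in the table to numpy dtypes."""
    return {
        'real': np.dtype(np.float64 if use_doubles else np.float32),
        'complex': np.dtype(np.complex128 if use_doubles else np.complex64),
        'char': np.dtype(np.int8),
        'uchar': np.dtype(np.uint8),
        'short': np.dtype(np.int16),
        'ushort': np.dtype(np.uint16),
        'int': np.dtype(np.int32),
        'uint': np.dtype(np.uint32),
        'long': np.dtype(np.int64),
        'ulong': np.dtype(np.uint64),
    }


# Print name, numpy dtype and size in bytes for single precision.
for name, dtype in session_dtypes(use_doubles=False).items():
    print('%-8s %-10s %d bytes' % (name, dtype.name, dtype.itemsize))
```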

The following functions and constants will also be made available to any kernel compiled via the session:

=======  ==============  ====================  ============================
Type     Name            Parameters            Description
=======  ==============  ====================  ============================
COMPLEX  c_mul           COMPLEX x, COMPLEX y  \(x * y\)
COMPLEX  c_div           COMPLEX x, COMPLEX y  \(x / y\)
COMPLEX  conj            COMPLEX x             \(x^*\)
COMPLEX  c_exp           COMPLEX x             \(\exp(x)\)
REAL     c_angle         COMPLEX x             \(\mathrm{arg}(x)\)
REAL     c_abs           COMPLEX x             \(|x|\)
REAL     c_abs_sq        COMPLEX x             \(|x|^2\)
COMPLEX  c_exp_i         REAL x                \(\exp(i x)\)
COMPLEX  native_exp_i    REAL x                (uses native precision math)
COMPLEX  c_exp_i_T       REAL x                \(\exp(-i x)\)
COMPLEX  native_exp_i_T  REAL x                (uses native precision math)
REAL     PI              constant              \(\pi\)
REAL     TWO_PI          constant              \(2 \pi\)
REAL     EULER_CONSTANT  constant              \(\gamma = 0.57721...\)
=======  ==============  ====================  ============================
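Since COMPLEX is a two-component vector (real part, imaginary part), these functions amount to ordinary complex arithmetic written out on pairs. The Python sketch below restates the formulas for a few of them on `(re, im)` tuples and checks them against Python's built-in complex type. It illustrates the mathematics only and is not the OpenCL source used by the module.

```python
import cmath
import math


def c_mul(x, y):
    """(a + ib)(c + id) on (re, im) pairs."""
    return (x[0] * y[0] - x[1] * y[1], x[0] * y[1] + x[1] * y[0])


def c_div(x, y):
    """x / y, computed as x * conj(y) / |y|^2."""
    d = y[0] * y[0] + y[1] * y[1]
    return ((x[0] * y[0] + x[1] * y[1]) / d, (x[1] * y[0] - x[0] * y[1]) / d)


def c_abs_sq(x):
    """|x|^2 without taking a square root."""
    return x[0] * x[0] + x[1] * x[1]


def c_exp_i(t):
    """exp(i t) = (cos t, sin t) for real t."""
    return (math.cos(t), math.sin(t))


x, y = (1.0, 2.0), (3.0, -1.0)
print(c_mul(x, y), complex(*x) * complex(*y))
print(c_div(x, y), complex(*x) / complex(*y))
print(c_exp_i(0.5), cmath.exp(0.5j))
```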

Parameters
device : OpenCL device, description string or list of strings

The OpenCL device to use. If a string or list is passed, acquire_opencl_devices() is used to find devices. The search stops at the first matching device, so a list of descriptions acts as an ordered set of fallbacks.

use_doubles : bool (default: False)

Specify if REAL and COMPLEX types are double precision.

group_size : int (default: max for device)

The default group_size in calls. If not specified, determined by the max_workgroup_size for the device (recommended). If the device type is CPU, this will be set to 1 (CPUs sometimes erroneously report more than 1 for this value).

show_warnings : bool, optional (default: False)

If True, warnings are displayed in build error messages.
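The fallback behaviour of the `device` parameter can be pictured with a small stand-alone sketch. Everything here is an assumption made for illustration: devices are represented as `(name, kind)` tuples, the device names are examples, matching is taken to be a case-insensitive substring test, and `find_devices` / `pick_device` are hypothetical helpers rather than the module's own code.

```python
def find_devices(devices, devname):
    """Return devices matching a partial name, or the type 'cpu' / 'gpu'."""
    key = devname.lower()
    if key in ('cpu', 'gpu'):
        return [d for d in devices if d[1] == key]
    return [d for d in devices if key in d[0].lower()]


def pick_device(devices, requests):
    """Try each request in order and return the first matching device."""
    if isinstance(requests, str):
        requests = [requests]
    for request in requests:
        matches = find_devices(devices, request)
        if matches:
            return matches[0]
    return None


# Example device list (names are illustrative only).
devices = [
    ('Example CPU', 'cpu'),
    ('Iris Pro', 'gpu'),
    ('AMD Radeon R9 M370X Compute Engine', 'gpu'),
]

print(pick_device(devices, 'AMD'))
print(pick_device(devices, ['nvidia', 'gpu']))
print(pick_device(devices, ['nvidia']))
```

With a list such as `['nvidia', 'gpu']`, the first entry finds nothing, so the second entry is tried and the first GPU is returned.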

Methods

compile(self, code)

Compile OpenCL kernel code (as a string) on the present device.

compile_file(self, fn)

Compile OpenCL kernels from file.

empty(self, shape[, dtype])

Create an empty array on the OpenCL device.

empty_like(self, arr)

Create an empty array with copied shape/dtype.

enqueue_copy(self, dst, src)

Call pyopencl.enqueue_copy with the session queue.

fft(self, arr[, sign, inplace, swap, …])

Perform a multidimensional FFT on a pyopencl array.

get_device_info(self, field)

Get info about the OpenCL device for the session.

initialize(self[, context])

Initialize the OpenCL context, if not already done.

local_memory(self, size)

Create a local memory object on the OpenCL device.

ones(self, shape[, dtype])

Create a one filled array on the OpenCL device.

ones_like(self, arr)

Create a one filled array with copied shape/dtype.

to_device(self, X)

Make a copy of a local array on the OpenCL device.

zeros(self, shape[, dtype])

Create a zeroed array on the OpenCL device.

zeros_like(self, arr)

Create a zeroed array with copied shape/dtype.

device_info

initialize(self, context=None)

Initialize the OpenCL context, if not already done.

Parameters
context : None or pyopencl Context

The OpenCL context to use. If None, one is created using the preset “device” attribute.

to_device(self, X)

Make a copy of a local array on the OpenCL device.

Parameters
X : numpy array
Returns
X_cl : pyopencl array

A version of the array stored on the OpenCL device. If the data type is float or double, it will be converted to the precision of the device context.
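The precision conversion described here can be sketched on the host side with numpy alone. The helper `match_precision` is hypothetical; it only illustrates the rule that floating-point input is cast to the session precision while other data types are left unchanged, and it does not copy anything to a device.

```python
import numpy as np


def match_precision(X, use_doubles=False):
    """Cast float arrays to the session precision; leave other dtypes alone."""
    X = np.asarray(X)
    if X.dtype.kind == 'f':
        return X.astype(np.float64 if use_doubles else np.float32)
    return X


a = np.random.rand(4, 3)             # float64 by default
print(match_precision(a).dtype)      # cast for a single-precision session
print(match_precision(a, use_doubles=True).dtype)
print(match_precision(np.arange(4, dtype=np.int32)).dtype)
```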

compile_file(self, fn)

Compile OpenCL kernels from file. See compile() for details.

compile(self, code)

Compile OpenCL kernel code (as a string) on the present device.

Compilation is delayed if this session is not initialized. If you require access to the Program object (which is usually not important), call initialize() first.

Returns
program : pyopencl Program or None

The compiled program. May be used to make function calls, although the contained kernels will automatically become callable as methods of the session. If the context is not initialized yet, returns None.
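The delayed compilation can be pictured with a toy class. `LazySession` below is a stand-in written for this explanation and does not reproduce the internals of CLSession; in particular, queueing the source strings until `initialize()` is called is an assumption about how such a delay could be implemented.

```python
class LazySession:
    """Toy stand-in for a session that defers compilation (not the real CLSession)."""

    def __init__(self):
        self.initialized = False
        self.pending = []    # source strings waiting for a context
        self.programs = []   # stand-ins for compiled programs

    def compile(self, code):
        """Build now if initialized; otherwise queue the source and return None."""
        if not self.initialized:
            self.pending.append(code)
            return None
        return self._build(code)

    def initialize(self):
        """Create the (pretend) context once and build anything queued."""
        if self.initialized:
            return
        self.initialized = True
        for code in self.pending:
            self._build(code)
        self.pending = []

    def _build(self, code):
        program = ('compiled', code)  # placeholder for a real build step
        self.programs.append(program)
        return program


session = LazySession()
print(session.compile('kernel A'))   # queued, so no program is returned yet
session.initialize()
print(session.compile('kernel B'))   # built immediately
print(len(session.programs))
```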

empty(self, shape, dtype=None)

Create an empty array on the OpenCL device.

Parameters
shape : tuple
dtype : numpy data type
Returns
arr : pyopencl array

An empty array of the specified type.

empty_like(self, arr)

Create an empty array with copied shape/dtype.

zeros(self, shape, dtype=None)

Create a zeroed array on the OpenCL device.

Parameters
shape : tuple
dtype : numpy data type
Returns
arr : pyopencl array

A zeroed array of the specified type.

ones_like(self, arr)

Create a one filled array with copied shape/dtype.

ones(self, shape, dtype=None)

Create a one filled array on the OpenCL device.

Parameters
shape : tuple
dtype : numpy data type
Returns
arr : pyopencl array

A one filled array of the specified type.

zeros_like(self, arr)

Create a zeroed array with copied shape/dtype.

local_memory(self, size)

Create a local memory object on the OpenCL device.

Parameters
size : int

The size (in bytes) of the local memory required.

Returns
mem : cl.LocalMemory

A local memory object for an OpenCL kernel.

enqueue_copy(self, dst, src)

Call pyopencl.enqueue_copy with the session queue.

fft(self, arr, sign=1, inplace=True, swap=None, max_threads=None)

Perform a multidimensional FFT on a pyopencl array. Note: array dimension sizes must be multiples of 2, 3, 5, 7.

Parameters
arr : pyopencl array, or an object castable to one (e.g. numpy array)
sign : integer (±1)

Forward or backward transform.

inplace : bool

In place transform? If not, a copy is made first. Note: if a numpy array is passed it will never be in place!

swap : pyopencl array (default: None)

The swap buffer used in the calculation. If not specified, one will be created. Should have the same dimensions/type as arr.

max_threads : int (default: self.max_work_group_size)

The max_thread argument used for the fft code generation. Note that if a given size FFT has already been generated, this will be ignored. (Generally, this should be left unset, unless there isn’t enough local memory to transform multiple blocks.)

Returns
arr : pyopencl array

The output array; will match the input array if inplace.
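The size restriction can be checked before calling fft. The sketch below assumes that "multiples of 2, 3, 5, 7" means sizes whose only prime factors are 2, 3, 5 and 7; that reading is an interpretation, not something the documentation states explicitly, and both helper functions are hypothetical.

```python
def has_only_small_factors(n, primes=(2, 3, 5, 7)):
    """True if n is a product of powers of the given primes (assumed rule)."""
    if n < 1:
        return False
    for p in primes:
        while n % p == 0:
            n //= p
    return n == 1


def next_valid_size(n):
    """Smallest size >= n that passes the check, e.g. for zero padding."""
    while not has_only_small_factors(n):
        n += 1
    return n


for size in (64, 1000, 11, 22):
    print(size, has_only_small_factors(size), next_valid_size(size))
```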

get_device_info(self, field)

Get info about the OpenCL device for the session. See ilpm.simple_cl.get_device_info().

ilpm.simple_cl.get_device_info(device, field)

Get information about an OpenCL device.

Parameters
field : a valid field from pyopencl.device_info or str

The field to obtain. If specified as a string, the name should match a member of pyopencl.device_info, ignoring case. Examples include: "global_mem_size" and "max_compute_units".

Returns
value : varies (usually int)

The value returned by calling device.get_info(…)
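The case-insensitive lookup of a field name can be sketched without an OpenCL installation. `FakeDeviceInfo` is a stand-in for pyopencl.device_info with placeholder numeric codes, and `resolve_field` is a hypothetical helper; the sketch relies on the members of pyopencl.device_info having upper-case names.

```python
class FakeDeviceInfo:
    """Stand-in for pyopencl.device_info; the numeric codes are placeholders."""
    GLOBAL_MEM_SIZE = 1
    MAX_COMPUTE_UNITS = 2


def resolve_field(info_enum, field):
    """Turn a field name (any case) into the matching enum member."""
    if isinstance(field, str):
        return getattr(info_enum, field.upper())
    return field  # already an enum value, pass it through


print(resolve_field(FakeDeviceInfo, 'global_mem_size'))
print(resolve_field(FakeDeviceInfo, 'Max_Compute_Units'))
print(resolve_field(FakeDeviceInfo, FakeDeviceInfo.MAX_COMPUTE_UNITS))
```

With pyopencl available, the resolved member would then be passed to `device.get_info(...)`.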

ilpm.simple_cl.is_job_done(event)

Check if an OpenCL task is done.

Parameters
event : pyopencl Event
Returns
done : bool

class ilpm.simple_cl.TestCases(methodName='runTest')

Bases: unittest.case.TestCase

Methods

__call__(self, \*args, \*\*kwds)

Call self as a function.

addCleanup(self, function, \*args, \*\*kwargs)

Add a function, with arguments, to be called when the test is completed.

addTypeEqualityFunc(self, typeobj, function)

Add a type specific assertEqual style function to compare a type.

assertAlmostEqual(self, first, second[, …])

Fail if the two objects are unequal as determined by their difference rounded to the given number of decimal places (default 7) and comparing to zero, or by comparing that the difference between the two objects is more than the given delta.

assertCountEqual(self, first, second[, msg])

An unordered sequence comparison asserting that the same elements, regardless of order.

assertDictContainsSubset(self, subset, …)

Checks whether dictionary is a superset of subset.

assertEqual(self, first, second[, msg])

Fail if the two objects are unequal as determined by the ‘==’ operator.

assertFalse(self, expr[, msg])

Check that the expression is false.

assertGreater(self, a, b[, msg])

Just like self.assertTrue(a > b), but with a nicer default message.

assertGreaterEqual(self, a, b[, msg])

Just like self.assertTrue(a >= b), but with a nicer default message.

assertIn(self, member, container[, msg])

Just like self.assertTrue(a in b), but with a nicer default message.

assertIs(self, expr1, expr2[, msg])

Just like self.assertTrue(a is b), but with a nicer default message.

assertIsInstance(self, obj, cls[, msg])

Same as self.assertTrue(isinstance(obj, cls)), with a nicer default message.

assertIsNone(self, obj[, msg])

Same as self.assertTrue(obj is None), with a nicer default message.

assertIsNot(self, expr1, expr2[, msg])

Just like self.assertTrue(a is not b), but with a nicer default message.

assertIsNotNone(self, obj[, msg])

Included for symmetry with assertIsNone.

assertLess(self, a, b[, msg])

Just like self.assertTrue(a < b), but with a nicer default message.

assertLessEqual(self, a, b[, msg])

Just like self.assertTrue(a <= b), but with a nicer default message.

assertListEqual(self, list1, list2[, msg])

A list-specific equality assertion.

assertLogs(self[, logger, level])

Fail unless a log message of level level or higher is emitted on logger_name or its children.

assertMultiLineEqual(self, first, second[, msg])

Assert that two multi-line strings are equal.

assertNotAlmostEqual(self, first, second[, …])

Fail if the two objects are equal as determined by their difference rounded to the given number of decimal places (default 7) and comparing to zero, or by comparing that the difference between the two objects is less than the given delta.

assertNotEqual(self, first, second[, msg])

Fail if the two objects are equal as determined by the ‘!=’ operator.

assertNotIn(self, member, container[, msg])

Just like self.assertTrue(a not in b), but with a nicer default message.

assertNotIsInstance(self, obj, cls[, msg])

Included for symmetry with assertIsInstance.

assertNotRegex(self, text, unexpected_regex)

Fail the test if the text matches the regular expression.

assertRaises(self, expected_exception, …)

Fail unless an exception of class expected_exception is raised by the callable when invoked with specified positional and keyword arguments.

assertRaisesRegex(self, expected_exception, …)

Asserts that the message in a raised exception matches a regex.

assertRegex(self, text, expected_regex[, msg])

Fail the test unless the text matches the regular expression.

assertSequenceEqual(self, seq1, seq2[, msg, …])

An equality assertion for ordered sequences (like lists and tuples).

assertSetEqual(self, set1, set2[, msg])

A set-specific equality assertion.

assertTrue(self, expr[, msg])

Check that the expression is true.

assertTupleEqual(self, tuple1, tuple2[, msg])

A tuple-specific equality assertion.

assertWarns(self, expected_warning, \*args, …)

Fail unless a warning of class warnClass is triggered by the callable when invoked with specified positional and keyword arguments.

assertWarnsRegex(self, expected_warning, …)

Asserts that the message in a triggered warning matches a regexp.

debug(self)

Run the test without collecting errors in a TestResult

doCleanups(self)

Execute all cleanup functions.

fail(self[, msg])

Fail immediately, with the given message.

failureException

alias of builtins.AssertionError

setUp(self)

Hook method for setting up the test fixture before exercising it.

setUpClass()

Hook method for setting up class fixture before running tests in the class.

shortDescription(self)

Returns a one-line description of the test, or None if no description has been provided.

skipTest(self, reason)

Skip this test.

subTest(self[, msg])

Return a context manager that will return the enclosed block of code in a subtest identified by the optional message and keyword parameters.

tearDown(self)

Hook method for deconstructing the test fixture after testing it.

tearDownClass()

Hook method for deconstructing the class fixture after running all tests in the class.

assertAlmostEquals

assertDictEqual

assertEquals

assertNotAlmostEquals

assertNotEquals

assertNotRegexpMatches

assertRaisesRegexp

assertRegexpMatches

assert_

countTestCases

defaultTestResult

failIf

failIfAlmostEqual

failIfEqual

failUnless

failUnlessAlmostEqual

failUnlessEqual

failUnlessRaises

id

run

test_dot_product_double

test_dot_product_gpu

test_dot_product_single