13. simple_cl Module
This module provides a convenient way to work with OpenCL via pyopencl: it simplifies access to devices, allows the floating-point precision to be chosen on the fly, and provides common complex-variable functions.
It also simplifies function calls by assuming that the desired number of work-groups and work-items is the number of compute units and the maximum work-group size, respectively. This means the looping needs to be handled in the kernel, as in the example below.
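The in-kernel looping this convention requires can be illustrated without OpenCL at all. The sketch below is plain Python rather than simple_cl code; the helper name `strided_indices` and the launch size (4 compute units with 8 work-items each) are hypothetical. It shows that if each work-item starts at its own global ID and steps by the total number of work-items, every element is visited exactly once, however large the array is.

```python
import numpy as np

def strided_indices(global_id, global_size, n):
    """Indices visited by one work-item in a kernel loop of the form
    for (i = get_global_id(0); i < n; i += get_global_size(0))."""
    return range(global_id, n, global_size)

# Hypothetical launch: 4 compute units x 8 work-items per group.
global_size = 4 * 8
n = 1000

# Count how many times each element is touched across all work-items.
visits = np.zeros(n, dtype=int)
for gid in range(global_size):
    for i in strided_indices(gid, global_size, n):
        visits[i] += 1

# Every element is handled exactly once, even though n > global_size.
print(visits.min(), visits.max())  # prints: 1 1
```

The same loop shape appears in the `dot_product` kernel of the usage example.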
13.1. Overview
Name | Description
--- | ---
CLSession | Create an OpenCL session on the specified device, while defining float precision and some utility functions.
acquire_opencl_devices | Find OpenCL devices with matching descriptions.
is_job_done | Check if an OpenCL task is done.
get_device_info | Get information about an OpenCL device.
13.2. Example Usage
from ilpm import simple_cl
import numpy as np
import time

NP = 10**6
REPEATS = 10

# Here we create an OpenCL context on the CPU. Why bother? This causes the
# code execution to be multithreaded, which should speed it up!
ctx = simple_cl.CLSession(device='cpu', use_doubles=False)

# OpenCL code
ctx.compile('''
__kernel void dot_product(__global REAL* a, __global REAL* b, int np, __global REAL* result)
{
    for(int i = get_global_id(0); i < np; i += get_global_size(0)) {
        result[i] = dot(get_3D(a, i), get_3D(b, i));
    }
}''')

# Initial data
a = np.random.rand(NP, 3)
b = np.random.rand(NP, 3)

# Create a version of the data on the device
a_cl = ctx.to_device(a)
b_cl = ctx.to_device(b)
result_cl = ctx.empty(NP)

# Run in OpenCL
start = time.time()
for n in range(REPEATS):
    event = ctx.dot_product(a_cl, b_cl, NP, result_cl)
event.wait()
print('OpenCL: %5.1f ms' % ((time.time() - start)*1E3))
result = result_cl.get()

# Run in numpy
start = time.time()
for n in range(REPEATS):
    result_n = (a*b).sum(-1)
print(' Numpy: %5.1f ms' % ((time.time() - start)*1E3))

# Find the error; note that the OpenCL version is single precision, as we
# specified use_doubles=False on the context creation. This causes an error
# comparable to single-precision rounding error.
print('Dot product error:', np.abs(result_n - result).sum() / NP)
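The size of that single-precision error can be estimated in NumPy alone. This is a minimal sketch, not simple_cl code: it assumes that sending data to a `use_doubles=False` session amounts to rounding the inputs to `float32`, and it uses a smaller array and a fixed seed purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.random((1000, 3))
b = rng.random((1000, 3))

# Reference result in double precision.
exact = (a * b).sum(-1)

# Same computation after rounding the inputs to single precision.
a32 = a.astype(np.float32)
b32 = b.astype(np.float32)
approx = (a32 * b32).sum(-1)

mean_error = np.abs(exact - approx).mean()
print('Mean error:      %.2e' % mean_error)
print('float32 epsilon: %.2e' % np.finfo(np.float32).eps)
```

The mean error should come out at a similar order of magnitude to the `float32` machine epsilon, which is what the timing example reports as its "Dot product error".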
Say you want to run your code on the computer’s GPU, not its CPU. This option is available on all recent high-end Macs (maybe not the MacBook Airs?). To use the GPU in your code you first need its name, which can be found by listing the available OpenCL devices directly, without using simple_cl.
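A minimal sketch of such a listing, using pyopencl directly. The helper name `list_opencl_device_names` is hypothetical; it returns an empty list when pyopencl is not installed or no OpenCL platform is found, so it is safe to run on any machine.

```python
def list_opencl_device_names():
    """Return the names of all OpenCL devices visible to pyopencl.

    Returns an empty list if pyopencl is not installed or no OpenCL
    platform is available on this machine.
    """
    try:
        import pyopencl as cl
    except ImportError:
        return []

    names = []
    try:
        for platform in cl.get_platforms():
            for device in platform.get_devices():
                names.append(device.name)
    except cl.Error:
        # Raised, for example, when no OpenCL platform is installed.
        pass
    return names

for name in list_opencl_device_names():
    print(name)
```

Any of the printed names (or a distinctive part of one) can then be passed as the `device` argument of `CLSession`.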
We can then choose from the list. For instance, my MacBook Pro has three devices: the CPU, an Iris graphics card, and an AMD graphics card. So I could say
ctx = simple_cl.CLSession(device='AMD Radeon R9 M370X Compute Engine', use_doubles=True)
13.3. Classes and Functions
- ilpm.simple_cl.acquire_opencl_devices(devname)
  Find OpenCL devices with matching descriptions.
  - Parameters
    - devname : string
      A partial name for an OpenCL device, or “cpu” or “gpu”.
  - Returns
    - devices : list of OpenCL devices
      All matching devices. May be empty.
- class ilpm.simple_cl.CLSession(device=['cpu', 'gpu'], use_doubles=False, group_size=None, show_warnings=False)
  Bases: object
Create an OpenCL session on the specified device, while defining float precision and some utility functions.
The actual creation of the OpenCL context is delayed until necessary. This allows the context parameters to be modified before it is used, which is useful for external modules where the user may wish to change the device used for calculations.
The following data types are defined as members of the class:
Name | OpenCL type | Size (bytes)
--- | --- | ---
real | REAL (float or double) | 4 or 8
complex | COMPLEX (float2 or double2) | 8 or 16
char | char | 1
uchar | uchar | 1
short | short | 2
ushort | ushort | 2
int | int | 4
uint | uint | 4
long | long | 8
ulong | ulong | 8
The following functions and constants will also be made available to any kernel compiled via the session:
Type | Name | Parameters | Description
--- | --- | --- | ---
COMPLEX | c_mul | COMPLEX x, COMPLEX y | \(x * y\)
COMPLEX | c_div | COMPLEX x, COMPLEX y | \(x / y\)
COMPLEX | conj | COMPLEX x | \(x^*\)
COMPLEX | c_exp | COMPLEX x | \(\exp(x)\)
REAL | c_angle | COMPLEX x | \(\mathrm{arg}(x)\)
REAL | c_abs | COMPLEX x | \(\lvert x \rvert\)
REAL | c_abs_sq | COMPLEX x | \(\lvert x \rvert^2\)
COMPLEX | c_exp_i | REAL x | \(\exp(i x)\)
COMPLEX | native_exp_i | REAL x | (uses native precision math)
COMPLEX | c_exp_i_T | REAL x | \(\exp(-i x)\)
COMPLEX | native_exp_i_T | REAL x | (uses native precision math)
REAL | PI | constant | \(\pi\)
REAL | TWO_PI | constant | \(2 \pi\)
REAL | EULER_CONSTANT | constant | \(\gamma = 0.57721...\)
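The arithmetic behind a few of these helpers can be sketched in NumPy. The functions below are Python stand-ins that reuse the kernel-side names for readability; they are not the OpenCL source of the module. They assume a COMPLEX value stores its real part in the first component and its imaginary part in the second, which is the usual convention for a `float2` or `double2` pair.

```python
import numpy as np

def c_mul(x, y):
    """(a + ib)(c + id) = (ac - bd) + i(ad + bc), on (re, im) pairs."""
    return np.array([x[0]*y[0] - x[1]*y[1], x[0]*y[1] + x[1]*y[0]])

def c_div(x, y):
    """x / y = x * conj(y) / |y|^2, on (re, im) pairs."""
    denom = y[0]**2 + y[1]**2
    return np.array([x[0]*y[0] + x[1]*y[1], x[1]*y[0] - x[0]*y[1]]) / denom

def c_exp_i(theta):
    """exp(i theta) = cos(theta) + i sin(theta)."""
    return np.array([np.cos(theta), np.sin(theta)])

x = np.array([1.0, 2.0])   # 1 + 2i
y = np.array([3.0, -1.0])  # 3 - i

print(c_mul(x, y))         # (1 + 2i)(3 - i) = 5 + 5i
print(c_div(x, y))         # (1 + 2i)/(3 - i) = 0.1 + 0.7i
print(c_exp_i(np.pi / 2))  # approximately 0 + 1i
```

Comparing against Python's built-in `complex` type is a quick way to check any further helpers written in this style.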
- Parameters
  - device : OpenCL device, description string or list of strings
    The OpenCL device to use. If a string or list is passed, acquire_opencl_devices() will be used to find devices. The search stops as soon as a matching device is found, so a list of descriptions acts as an ordered set of fallbacks.
  - use_doubles : bool (default: False)
    Specify if REAL and COMPLEX types are double precision.
  - group_size : int (default: max for device)
    The default group_size in calls. If not specified, determined by the max_workgroup_size for the device (recommended). If the device type is CPU, this will be set to 1 (CPUs sometimes erroneously report more than 1 for this value).
  - show_warnings : bool, optional (default: False)
    If True, warnings are displayed in build error messages.
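The fallback behaviour of the `device` argument can be pictured as a first-match search over an ordered list. The sketch below is an illustration only, not the module's implementation: the function name `pick_first_match` is hypothetical, matching is assumed to be a case-insensitive substring test on the device name, the special “cpu” and “gpu” descriptions are not modelled, and the first two device names are made up.

```python
def pick_first_match(descriptions, available):
    """Return the first available name matching any description, in order.

    `descriptions` is tried left to right; matching here is a
    case-insensitive substring test, which is an assumption.
    """
    if isinstance(descriptions, str):
        descriptions = [descriptions]
    for desc in descriptions:
        for name in available:
            if desc.lower() in name.lower():
                return name
    return None

# Hypothetical device names, loosely modelled on the laptop described above.
available = [
    'Example CPU',
    'Iris Pro',
    'AMD Radeon R9 M370X Compute Engine',
]

# 'nvidia' matches nothing, so the search falls back to 'radeon'.
print(pick_first_match(['nvidia', 'radeon'], available))
print(pick_first_match('iris', available))
```

Putting the preferred device first and a more generic description last gives code that still runs on machines where the preferred device is absent.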
Methods
Method | Description
--- | ---
compile(self, code) | Compile OpenCL kernel code (as a string) on the present device.
compile_file(self, fn) | Compile OpenCL kernels from file.
empty(self, shape[, dtype]) | Create an empty array on the OpenCL device.
empty_like(self, arr) | Create an empty array with copied shape/dtype.
enqueue_copy(self, dst, src) | Call pyopencl.enqueue_copy with the session queue.
fft(self, arr[, sign, inplace, swap, …]) | Perform a multidimensional FFT on a pyopencl array.
get_device_info(self, field) | Get info about the OpenCL device for the session.
initialize(self[, context]) | Initialize the OpenCL context, if not already done.
local_memory(self, size) | Create a local memory object on the OpenCL device.
ones(self, shape[, dtype]) | Create a one-filled array on the OpenCL device.
ones_like(self, arr) | Create a one-filled array with copied shape/dtype.
to_device(self, X) | Make a copy of a local array on the OpenCL device.
zeros(self, shape[, dtype]) | Create a zeroed array on the OpenCL device.
zeros_like(self, arr) | Create a zeroed array with copied shape/dtype.
device_info
- initialize(self, context=None)
  Initialize the OpenCL context, if not already done.
  - Parameters
    - context : None or pyopencl Context
      The OpenCL context to use. If None, one is created using the preset “device” attribute.
- to_device(self, X)
  Make a copy of a local array on the OpenCL device.
  - Parameters
    - X : numpy array
  - Returns
    - X_cl : pyopencl array
      A version of the array stored on the OpenCL device. If the data type is float or double, it will be converted to the precision of the device context.
- compile(self, code)
  Compile OpenCL kernel code (as a string) on the present device.
  Compilation is delayed if this session is not initialized. If you require access to the Program object (which is usually not important), call initialize() first.
  - Returns
    - program : pyopencl Program or None
      The compiled program. May be used to make function calls, although the contained kernels will automatically become functions of the base class. If the context is not initialized yet, returns None.
- empty(self, shape, dtype=None)
  Create an empty array on the OpenCL device.
  - Parameters
    - shape : tuple
    - dtype : numpy data type
  - Returns
    - arr : pyopencl array
      An empty array of the specified type.
- zeros(self, shape, dtype=None)
  Create a zeroed array on the OpenCL device.
  - Parameters
    - shape : tuple
    - dtype : numpy data type
  - Returns
    - arr : pyopencl array
      A zeroed array of the specified type.
- ones(self, shape, dtype=None)
  Create a one-filled array on the OpenCL device.
  - Parameters
    - shape : tuple
    - dtype : numpy data type
  - Returns
    - arr : pyopencl array
      A one-filled array of the specified type.
- local_memory(self, size)
  Create a local memory object on the OpenCL device.
  - Parameters
    - size : int
      The size (in bytes) of the local memory required.
  - Returns
    - mem : cl.LocalMemory
      A local memory object for an OpenCL kernel.
- fft(self, arr, sign=1, inplace=True, swap=None, max_threads=None)
  Perform a multidimensional FFT on a pyopencl array. Note: array dimension sizes must be multiples of 2, 3, 5, 7.
  - Parameters
    - arr : pyopencl array, or an object castable to one (e.g. numpy array)
    - sign : integer (+-1)
      Forward or backward transform.
    - inplace : bool
      In-place transform? If not, a copy is made first. Note: if a numpy array is passed, the transform will never be in place!
    - swap : pyopencl array (default: None)
      The swap buffer used in the calculation. If not specified, one will be created. Should have the same dimensions/type as arr.
    - max_threads : int (default: self.max_work_group_size)
      The max_thread argument used for the FFT code generation. Note that if a given size FFT has already been generated, this will be ignored. (Generally, this should be left unset, unless there isn’t enough local memory to transform multiple blocks.)
  - Returns
    - arr : pyopencl array
      The output array; will match the input array if inplace.
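The size restriction on fft can be checked before allocating arrays. The sketch below rests on an assumption: it reads the note as meaning that each dimension must factor entirely into the primes 2, 3, 5 and 7, which is the usual constraint for mixed-radix FFT generators but is not spelled out here. The helper names `is_fft_friendly` and `next_fft_friendly` are hypothetical and not part of simple_cl.

```python
def is_fft_friendly(n, primes=(2, 3, 5, 7)):
    """True if n > 0 factors entirely into the given small primes."""
    if n < 1:
        return False
    for p in primes:
        while n % p == 0:
            n //= p
    return n == 1

def next_fft_friendly(n):
    """Smallest size >= n that passes is_fft_friendly."""
    while not is_fft_friendly(n):
        n += 1
    return n

# 121 = 11 * 11 and 1001 = 7 * 11 * 13 contain other prime factors.
print([n for n in (64, 100, 210, 121, 1001) if is_fft_friendly(n)])  # [64, 100, 210]

# Padding target for an awkward size.
print(next_fft_friendly(1001))  # 1008
```

Under that reading, padding each dimension up to the value returned by `next_fft_friendly` is one way to make an arbitrary array acceptable.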
- get_device_info(self, field)
  Get info about the OpenCL device for the session. See ilpm.simple_cl.get_device_info().
- ilpm.simple_cl.get_device_info(device, field)
  Get information about an OpenCL device.
  - Parameters
    - field : a valid field from pyopencl.device_info, or str
      The field to obtain. If specified as a string, the name should match a member of pyopencl.device_info, ignoring case. Examples include "global_mem_size" and "max_compute_units".
  - Returns
    - value : varies (usually int)
      The value returned by calling device.get_info(…)
- ilpm.simple_cl.is_job_done(event)
  Check if an OpenCL task is done.
  - Parameters
    - event : pyopencl Event
  - Returns
    - done : bool
- class ilpm.simple_cl.TestCases(methodName='runTest')
  Bases: unittest.case.TestCase
Methods
Method | Description
--- | ---
__call__(self, \*args, \*\*kwds) | Call self as a function.
addCleanup(self, function, \*args, \*\*kwargs) | Add a function, with arguments, to be called when the test is completed.
addTypeEqualityFunc(self, typeobj, function) | Add a type specific assertEqual style function to compare a type.
assertAlmostEqual(self, first, second[, …]) | Fail if the two objects are unequal as determined by their difference rounded to the given number of decimal places (default 7) and comparing to zero, or by comparing that the difference between the two objects is more than the given delta.
assertCountEqual(self, first, second[, msg]) | An unordered sequence comparison asserting that the same elements, regardless of order.
assertDictContainsSubset(self, subset, …) | Checks whether dictionary is a superset of subset.
assertEqual(self, first, second[, msg]) | Fail if the two objects are unequal as determined by the ‘==’ operator.
assertFalse(self, expr[, msg]) | Check that the expression is false.
assertGreater(self, a, b[, msg]) | Just like self.assertTrue(a > b), but with a nicer default message.
assertGreaterEqual(self, a, b[, msg]) | Just like self.assertTrue(a >= b), but with a nicer default message.
assertIn(self, member, container[, msg]) | Just like self.assertTrue(a in b), but with a nicer default message.
assertIs(self, expr1, expr2[, msg]) | Just like self.assertTrue(a is b), but with a nicer default message.
assertIsInstance(self, obj, cls[, msg]) | Same as self.assertTrue(isinstance(obj, cls)), with a nicer default message.
assertIsNone(self, obj[, msg]) | Same as self.assertTrue(obj is None), with a nicer default message.
assertIsNot(self, expr1, expr2[, msg]) | Just like self.assertTrue(a is not b), but with a nicer default message.
assertIsNotNone(self, obj[, msg]) | Included for symmetry with assertIsNone.
assertLess(self, a, b[, msg]) | Just like self.assertTrue(a < b), but with a nicer default message.
assertLessEqual(self, a, b[, msg]) | Just like self.assertTrue(a <= b), but with a nicer default message.
assertListEqual(self, list1, list2[, msg]) | A list-specific equality assertion.
assertLogs(self[, logger, level]) | Fail unless a log message of level level or higher is emitted on logger_name or its children.
assertMultiLineEqual(self, first, second[, msg]) | Assert that two multi-line strings are equal.
assertNotAlmostEqual(self, first, second[, …]) | Fail if the two objects are equal as determined by their difference rounded to the given number of decimal places (default 7) and comparing to zero, or by comparing that the difference between the two objects is less than the given delta.
assertNotEqual(self, first, second[, msg]) | Fail if the two objects are equal as determined by the ‘!=’ operator.
assertNotIn(self, member, container[, msg]) | Just like self.assertTrue(a not in b), but with a nicer default message.
assertNotIsInstance(self, obj, cls[, msg]) | Included for symmetry with assertIsInstance.
assertNotRegex(self, text, unexpected_regex) | Fail the test if the text matches the regular expression.
assertRaises(self, expected_exception, …) | Fail unless an exception of class expected_exception is raised by the callable when invoked with specified positional and keyword arguments.
assertRaisesRegex(self, expected_exception, …) | Asserts that the message in a raised exception matches a regex.
assertRegex(self, text, expected_regex[, msg]) | Fail the test unless the text matches the regular expression.
assertSequenceEqual(self, seq1, seq2[, msg, …]) | An equality assertion for ordered sequences (like lists and tuples).
assertSetEqual(self, set1, set2[, msg]) | A set-specific equality assertion.
assertTrue(self, expr[, msg]) | Check that the expression is true.
assertTupleEqual(self, tuple1, tuple2[, msg]) | A tuple-specific equality assertion.
assertWarns(self, expected_warning, \*args, …) | Fail unless a warning of class warnClass is triggered by the callable when invoked with specified positional and keyword arguments.
assertWarnsRegex(self, expected_warning, …) | Asserts that the message in a triggered warning matches a regexp.
debug(self) | Run the test without collecting errors in a TestResult
doCleanups(self) | Execute all cleanup functions.
fail(self[, msg]) | Fail immediately, with the given message.
failureException | alias of builtins.AssertionError
setUp(self) | Hook method for setting up the test fixture before exercising it.
setUpClass() | Hook method for setting up class fixture before running tests in the class.
shortDescription(self) | Returns a one-line description of the test, or None if no description has been provided.
skipTest(self, reason) | Skip this test.
subTest(self[, msg]) | Return a context manager that will return the enclosed block of code in a subtest identified by the optional message and keyword parameters.
tearDown(self) | Hook method for deconstructing the test fixture after testing it.
tearDownClass() | Hook method for deconstructing the class fixture after running all tests in the class.
assertAlmostEquals
assertDictEqual
assertEquals
assertNotAlmostEquals
assertNotEquals
assertNotRegexpMatches
assertRaisesRegexp
assertRegexpMatches
assert_
countTestCases
defaultTestResult
failIf
failIfAlmostEqual
failIfEqual
failUnless
failUnlessAlmostEqual
failUnlessEqual
failUnlessRaises
id
run
test_dot_product_double
test_dot_product_gpu
test_dot_product_single