The simulation of nanomotors is an integral part of my research. For this blog article, I have chosen to present nano-dimer, a simulation software written by my friend Peter Colberg. nano-dimer generates OpenCL code on the fly to perform Molecular Dynamics coupled to a solvent algorithm called MPCD (multiparticle collision dynamics), in order to study chemically powered dimer nanomotors.
Overview
I explain how the program nano-dimer by Peter Colberg is structured and how it benefits from a mixed Lua-OpenCL programming approach. More precisely, I take a look at the Lua code and explain what the variables represent and how the OpenCL code is called. nano-dimer is MIT licensed.
I am not reviewing the science behind the simulation. The homepage of the software has good references for the methods.
A bit of motivation though: Lua is a "powerful, fast, lightweight, embeddable scripting language" (from the homepage of Lua). It has simple yet powerful data structures and interfaces very well with C. The just-in-time implementation LuaJIT provides fast execution of loops, which is useful for scientific codes.
More motivation: nano-dimer is the only open-source and documented implementation of chemically powered motors with an MPCD solvent. My own code is open but not documented, and is currently undergoing a large refactoring and rewriting.
Installation and program structure
The installation page of nano-dimer contains all the needed information. To make it easier to follow, here are a few guidelines:
- Use a package manager to install: a recent version of GCC, LuaJIT, git, OpenCL, HDF5.
- Install LuaRocks, as advised in the documentation of nano-dimer. You must install luarocks yourself to use luajit and cannot rely a priori on the system-provided version. If you lack administrator access to your machine, luarocks accepts the `--local` option to install in your home directory.
- Use luarocks to install: opencl, hdf5, templet, ljsyscall, lua-cjson.
- You are ready to download nano-dimer! (see the installation page). When writing this article the current git version was `1.0.0-57-g65987aa`.
In the nano-dimer directory, type `make test` to check that everything is running. If that is ok, you may proceed to the next paragraph.
It is good to know that `make` is not used to compile anything! There is nothing to compile prior to execution, a real convenience for development.
The code comes with examples, which you may run as directed in the README:
source examples/env.sh
cd examples/single_dimer/equilibration/
luajit single_dimer.lua config.lua
All the code that is common to dimer nanomotors is found in the `nanomotor` directory. This library is then used in several applications corresponding to different physical setups.
What is going on?
What happens when the code is run? Since most of the code is organised in the library `nanomotor`, the file `single_dimer.lua` is really short. It starts by loading a number of Lua modules and reads the file `config.lua`.
Then, "physical" components are initialized. The object `box` contains the size of the box and the minimum image function that is used for the periodic boundary conditions. Inspection of the file `box.lua` reveals a pattern that will be used repeatedly. The file defines a function that returns an object embedding data (e.g. `L` for the box size) and functions (e.g. `mindist`).
When the code `box = nm.box(args)` is executed, the main function defined in `box.lua` is called with the arguments `args`, and what is returned is the Lua table named `self` in `box.lua`.
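To illustrate this pattern, here is a minimal sketch of a constructor function that returns a table holding both data and functions. It is not the code of `box.lua`: the argument layout, the body of `mindist` and its signature are my own assumptions, chosen only to show the idea.

```lua
-- Hypothetical constructor in the style described above.
-- This is NOT the actual content of box.lua.
local function box(args)
  -- The returned object: data first (box edge lengths, one per dimension).
  local self = { L = args.L }

  -- Minimum-image separation between points a and b in a periodic box.
  function self.mindist(a, b)
    local d = {}
    for k = 1, #self.L do
      local dk = a[k] - b[k]
      -- Shift by a whole number of box lengths to get the nearest image.
      d[k] = dk - self.L[k] * math.floor(dk / self.L[k] + 0.5)
    end
    return d
  end

  return self
end

local b = box({ L = { 10, 10, 10 } })
local d = b.mindist({ 9, 1, 5 }, { 1, 9, 5 })
print(d[1], d[2], d[3])
```

The caller only sees the returned table, so data such as `b.L` and functions such as `b.mindist` travel together without any class machinery.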
The object `dom` contains the data structure for the particles: the positions, velocities, etc. As for the file `box.lua`, `domain.lua` defines a single function that returns data and functions in a single object.
The extreme flexibility of Lua is used to define the data structure and there is very little syntactical noise. One data structure of interest is the table: a simple declaration allows you to collect objects into a containing table.
A table can be initialized as
mytable = { key1 = 3, key2 = 'a', [5] = 17 }
where `key1 = 3` sets a string key, whose value can be retrieved via a dot-based syntax, and `[5] = 17` is the general form, in which the table's key is `5`. Any Lua value except nil (and NaN) can be used as a key. You can test this in a luajit console, and retrieve the content via
print(mytable.key1);
print(mytable['key1']);
print(mytable[5]);
Lua regards the keys 1, 2, 3, etc. as special. They form a sequence whose length is returned by `#mytable`. Accessing an undefined table key returns `nil`. Only string keys that are valid identifiers can be accessed via the dot syntax (i.e. `mytable.5` does not work).
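A short sketch of these rules, using a made-up table that mixes a sequence with a string key:

```lua
-- Illustrative table: three sequence entries plus one string key.
local t = { 10, 20, 30, name = "demo" }

print(#t)         -- length of the sequence part only
print(t[2])       -- integer keys use the bracket syntax
print(t.name)     -- string keys can use the dot syntax
print(t.missing)  -- undefined keys give nil
```

The string key `name` does not count towards `#t`, which only measures the sequence stored under the keys 1, 2 and 3.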
Given the layered architecture of nano-dimer, it is good to know where the data is actually stored. Take, for instance, the solvent position. It is stored in `dom.rs` as a chunk of memory of type `cl_double3`. The memory is managed by the FFI library of LuaJIT. A corresponding buffer `dom.d_rs` is created for the OpenCL device. Data is moved between the two locations only if necessary.
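To give an idea of what such host-side storage looks like, here is a small LuaJIT sketch that allocates an array of three-component vectors with the FFI. It requires LuaJIT and does not use the OpenCL bindings: the struct is my own stand-in for `cl_double3` (which OpenCL lays out with the size of four doubles), and the names `double3`, `N` and `rs` are chosen for illustration.

```lua
local ffi = require("ffi")

-- Assumption: a stand-in for cl_double3. OpenCL pads 3-vectors to the
-- size of four doubles, hence the unused fourth component w.
local double3 = ffi.typeof("struct { double x, y, z, w; }")
local double3_array = ffi.typeof("$[?]", double3)

local N = 4
local rs = double3_array(N)  -- contiguous, zero-initialized C memory

-- FFI arrays are indexed from 0, like in C.
for i = 0, N - 1 do
  rs[i].x, rs[i].y, rs[i].z = i, 2 * i, 3 * i
end

print(ffi.sizeof(rs))  -- total size in bytes of the allocation
```

Because the memory has a plain C layout, it can be handed to a C library such as an OpenCL runtime without conversion, which is what makes the host/device copies cheap to organise.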
The dimer and the solvent are then placed at random in the simulation domain. As the code is compiled just in time (JIT), this Lua code runs very fast!
Next, the `integrate` object is created. Again, this is a well-organized collection of data and functions. Jumping a few lines, we get to the `observables` table. This is a list of coroutines that are evaluated during the integration of the system.
The observables are run at regular intervals and this logic lies in the file `observe.lua`, which is documented in detail. The function `observe` will integrate the system and let the observables be computed, with their inner state preserved between evaluations thanks to the coroutine machinery.
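The following sketch shows how a coroutine keeps its inner state between evaluations. It is not the code of `observe.lua`: the observable `running_mean`, the sampling interval of 10 steps and the loop standing in for the integration are all hypothetical.

```lua
-- Hypothetical observable: a running mean whose sum and count live
-- inside the coroutine and survive between evaluations.
local function running_mean(get_value)
  return coroutine.wrap(function()
    local sum, count = 0, 0
    while true do
      sum = sum + get_value()
      count = count + 1
      coroutine.yield(sum / count)  -- hand control back to the main loop
    end
  end)
end

local step = 0
local observables = { running_mean(function() return step end) }

local last
for s = 1, 100 do
  step = s                 -- stand-in for one integration step
  if s % 10 == 0 then      -- evaluate the observables every 10 steps
    for _, obs in ipairs(observables) do
      last = obs()
    end
  end
end
print(last)
```

Each call to `obs()` resumes the coroutine where it last yielded, so no external bookkeeping is needed for `sum` and `count`.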
So, this runs with OpenCL?
So far, we have only touched on the Lua part of the code. But this code is advertised for OpenCL (thus targeting multicore CPUs, GPUs and other accelerator devices), so how does that work?
To understand this part, I have chosen a short part of the code: `species.lua` and `species.cl`.
`species.lua` is structured similarly to other Lua files in the program: it defines a function that returns structured data and functions. One of the functions contains an OpenCL program (see the bit below).
local program = compute.program(context, "nanomotor/species.cl", {
dom = dom,
species = species,
})
-- ...
local kernel = program:create_kernel("species_sum")
-- ...
kernel:set_arg(0, dom.d_sps)
What this call does is load the OpenCL code in `species.cl` and run it through a templating engine (lua-templet, also by Peter Colberg) to fill in some variables, those that start with a dollar sign. The templating engine allows the execution of any Lua code within a template using a pipe at the beginning of a line. The code below, from `species.cl`, executes a for loop in Lua to generate several OpenCL statements, one per species.
|for i = 0, #species-1 do
if (sp == ${i}) count.s${i}++;
|end
This feature is used for the conditional generation of optional algorithms, so that only the code that is relevant for the parameter set is compiled to OpenCL. In `lj.cl`, for instance, in the absence of the parameter `wall`, the code for the wall potential is not even sent to the OpenCL compiler. More complex examples are found in `random.cl` and `hilbert.cl`.
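To make the effect of the template loop and of the conditional generation concrete, here is a plain Lua sketch that builds the same kind of OpenCL source by hand. It does not use lua-templet; the function `generate`, the species list, the `wall` flag and the placeholder comment standing in for the wall code are made up for illustration.

```lua
-- Plain Lua stand-in for what a template expansion produces.
-- The species names and the `wall` parameter are hypothetical.
local function generate(species, params)
  local lines = {}
  -- One OpenCL statement per species, as in the loop from species.cl.
  for i = 0, #species - 1 do
    lines[#lines + 1] = ("if (sp == %d) count.s%d++;"):format(i, i)
  end
  -- Optional code is emitted only when the parameter is present.
  if params.wall then
    lines[#lines + 1] = "/* wall potential code would go here */"
  end
  return table.concat(lines, "\n")
end

local source = generate({ "A", "B" }, {})
local source_wall = generate({ "A", "B" }, { wall = true })
print(source)
```

With two species and no `wall` parameter, the generated source contains exactly two `if` statements and nothing related to the wall, so the OpenCL compiler never sees the unused code path.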
The code is packaged into a kernel for which the arguments must be listed explicitly (here, for instance, `dom.d_sps`, the array containing the species of the solvent particles, is given as argument number 0). This kernel can then be enqueued for execution on the OpenCL device, a GPU for instance.
This is where the magic happens: Lua is used to manage the data and organization of the code and to prepare OpenCL kernels with only the minimum amount of code necessary.
Final comments
nano-dimer combines a lot of interesting programming achievements. I must add that the other Lua modules by Peter Colberg represent a lot of work and a credible alternative for OpenCL programming, with specific additional features that are of general interest (the HDF5 wrapper library lua-hdf5, for instance).
This article focuses on the programming stack and leaves other aspects untouched, notably the optimization of the program and design choices for the parallel implementation.
Many thanks to Peter Colberg for his comments on this article and for putting together this advanced software stack.