Hull Science Festival – Parallel Rendering with 4 Raspberry Pis
For the 2017 Hull Science Festival we have produced a demonstration of parallel computing: 4 Raspberry Pi 3 computers render a ray traced image together, and we show the difference by running the same render on a single Raspberry Pi.
We have the first Raspberry Pi (piNode01) which will render the image on its own, so that we have a reference point for how long a single Pi takes to complete.
We then have the 4 other Raspberry Pis working in parallel and displaying the completed render on all 4 monitors.
The 4 machines working together complete the render roughly 2 to 3 times faster than the machine working on its own.
How is it done?
The demonstration is based on two bash shell scripts written by Dr Christian Baun, called task-distributor. Source: https://code.google.com/archive/p/task-distributor/
To render the images we are using a program called Pov-Ray (Persistence of Vision Raytracer), which produces ray traced images. One of the benefits of Pov-Ray is that it can be instructed to produce a subset of an overall image. This allows us to split an image into any number of sub-parts, which can then be worked on in parallel.
The single and parallel renders are started from piNode02, so we need to be able to connect to all of the other Pis without being asked to log in, otherwise processing would stop and wait for a password. We achieve this with passwordless SSH logins, using a key pair generated with ssh-keygen.
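One way to confirm that the passwordless login described above really works, before starting a render, is to ask ssh to fail rather than prompt. The following is a minimal sketch, not part of task-distributor: it assumes the OpenSSH client is installed on piNode02, and the function names are made up for illustration. BatchMode=yes and ConnectTimeout are standard OpenSSH options.

```python
import subprocess


def ssh_batch_command(host, remote_command="true"):
    """Build an ssh command line that fails instead of asking for a password."""
    return [
        "ssh",
        "-o", "BatchMode=yes",      # never prompt; fail if key login is not possible
        "-o", "ConnectTimeout=5",   # give up quickly if the node is unreachable
        host,
        remote_command,
    ]


def can_login_without_password(host):
    """Return True if a key-based login to host succeeds without any prompt."""
    try:
        result = subprocess.run(
            ssh_batch_command(host), capture_output=True, timeout=15
        )
    except (OSError, subprocess.TimeoutExpired):
        # ssh is not installed, or the node did not answer in time
        return False
    return result.returncode == 0
```

Calling can_login_without_password("piNode03") from piNode02 should return True once the public key has been copied to that node, and False if ssh would otherwise have stopped to ask for a password.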
The next thing we needed to do was set up a file system that all the Pis (including the Pi running on its own) can see. We’re using an NFS file system that is shared from piNode02 to all the other Pis. This is done for three reasons:
Firstly, once all of the parallel parts of the image have been produced, we need a single place where the pieces can be ‘stitched’ together to create the final image.
Secondly, we need a place to store the state of each of the Pis, i.e. we need to know when all of the Pis have finished their individual part.
Thirdly, we need to be able to determine when the single Pi has finished its rendering, so we can display the single render and the multiple render together for a certain amount of time before starting the next image.
Rendering in parallel
Each node takes a part of the scene to render. The rows of the image are split equally amongst the nodes that the demo is being run on. A loop executes a remote script on each node, and the script takes parameters that indicate which part of the image that node is to process.
Complete image of pawn chess pieces:
So, if we wanted to render the above image using 4 nodes (piNode02, piNode03, piNode04 and piNode05) and the height of the image is 800 rows, each node will render the following:
piNode02 renders rows 1 to 200:
piNode03 renders rows 201 to 400:
piNode04 renders rows 401 to 600:
piNode05 renders rows 601 to 800:
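The split shown above can be sketched in a few lines of Python. This is an illustration of the arithmetic, not the task-distributor script itself, which is written in bash. The flags used (+SR and +ER for start and end row, +W and +H for image size, +I and +O for input and output files, +FN for PNG output, -D to turn off the preview display) are standard Pov-Ray command-line options. The scene file name, the 1280 pixel width and the output path are assumptions made for the example.

```python
def row_ranges(height, nodes):
    """Split rows 1..height into one contiguous band per node.

    If the height does not divide evenly, the first few nodes
    get one extra row each.
    """
    base, extra = divmod(height, len(nodes))
    ranges = []
    start = 1
    for index, node in enumerate(nodes):
        size = base + (1 if index < extra else 0)
        end = start + size - 1
        ranges.append((node, start, end))
        start = end + 1
    return ranges


def povray_command(scene, width, height, start_row, end_row, output):
    """Build a Pov-Ray command line that renders only the given rows."""
    return [
        "povray",
        f"+I{scene}",        # input scene file
        f"+W{width}",        # full image width
        f"+H{height}",       # full image height
        f"+SR{start_row}",   # first row to render
        f"+ER{end_row}",     # last row to render
        "+FN",               # write PNG output
        "-D",                # no preview window
        f"+O{output}",       # output file
    ]


nodes = ["piNode02", "piNode03", "piNode04", "piNode05"]
for node, start, end in row_ranges(800, nodes):
    # "pawns.pov" and the width of 1280 are hypothetical values
    command = povray_command("pawns.pov", 1280, 800, start, end, f"/tmp/{node}.png")
    print(node, " ".join(command))
```

For an 800 row image and 4 nodes this gives the same bands as listed above: 1 to 200, 201 to 400, 401 to 600 and 601 to 800.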
On each node, the output from Pov-Ray is stored in the local /tmp directory whilst the render is taking place. This prevents unnecessary network overhead. Once each node has finished rendering, its output is placed in the shared NFS directory as pinode<n>.png
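The "render locally, then publish" step might look like the sketch below. It is only an illustration: the function name is hypothetical, and naming the file after the lower-cased hostname is an assumption based on the pinode<n>.png pattern mentioned above.

```python
import shutil
import socket
from pathlib import Path


def publish_result(local_png, shared_dir, hostname=None):
    """Move a finished render from local disk into the shared directory.

    The file is only moved once rendering is complete, so the network
    is touched once per node rather than throughout the render.
    """
    hostname = hostname or socket.gethostname()
    # Assumed naming scheme: lower-cased hostname plus .png
    target = Path(shared_dir) / f"{hostname.lower()}.png"
    shutil.move(str(local_png), str(target))
    return target
```

On piNode03, for example, publish_result("/tmp/render.png", "/path/to/shared") would leave pinode03.png in the shared directory, where the path to the shared directory is whatever the NFS mount point is on that node.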
As each node finishes its part of the render, it places its hostname in a text file. Once all nodes have placed an entry in this file, the master (piNode02) can piece together the individual images ready for displaying. The images are pieced together using ‘convert’, which is part of ImageMagick.
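The bookkeeping described above can be sketched as follows. This is a simplified illustration with hypothetical function and file names, not the original bash code. The -append operator is a standard ImageMagick option that stacks images top to bottom; the sketch assumes each partial image contains only its own band of rows and that the parts are passed in top-to-bottom order.

```python
from pathlib import Path


def mark_done(done_file, hostname):
    """Append this node's hostname to the shared completion file."""
    with open(done_file, "a", encoding="utf-8") as handle:
        handle.write(hostname + "\n")


def all_done(done_file, expected_hosts):
    """Return True once every expected host has an entry in the file."""
    path = Path(done_file)
    if not path.exists():
        return False
    lines = path.read_text(encoding="utf-8").splitlines()
    finished = {line.strip() for line in lines if line.strip()}
    return set(expected_hosts) <= finished


def stitch_command(parts, output):
    """Build an ImageMagick command that stacks the parts vertically."""
    return ["convert", *[str(part) for part in parts], "-append", str(output)]
```

The master would poll all_done(...) with the list of node names and, once it returns True, run the command returned by stitch_command(...) to produce the final image.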
Syncing the display
Syncing of the single Pi and the parallel Pis is achieved by using inotifywait. The script that controls the demonstration runs in an infinite loop, waiting for the single Pi image to be written and closed. Once this happens, we check that both the single Pi image and the parallel images are present. When they are, we kill the processes that are displaying the images and, at the same time, start rendering the next image in the demonstration.
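A rough Python equivalent of this wait-and-check loop is sketched below, assuming the inotify-tools package (which provides inotifywait) is installed. The file names and function names are hypothetical. The options used (-q for quiet, -e close_write to wait for a file being written and closed, --format %f to print just the file name) are standard inotifywait options.

```python
import subprocess
from pathlib import Path


def inotifywait_command(directory):
    """Build a command that blocks until a file in directory is written and closed."""
    return [
        "inotifywait", "-q",
        "-e", "close_write",
        "--format", "%f",   # print only the name of the file that triggered the event
        str(directory),
    ]


def both_renders_present(shared_dir, single_name, parallel_name):
    """Return True if both the single and the parallel render exist."""
    shared = Path(shared_dir)
    return (shared / single_name).is_file() and (shared / parallel_name).is_file()


def wait_for_single_render(shared_dir, single_name):
    """Block until the single Pi's image has been written and closed."""
    while True:
        result = subprocess.run(
            inotifywait_command(shared_dir),
            capture_output=True, text=True, check=True,
        )
        if result.stdout.strip() == single_name:
            return
```

Because this sketch starts a new inotifywait for every event, a file closed in the gap between two calls could be missed. Running inotifywait once in monitor mode (-m) and reading its output line by line would avoid that gap.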