Boost.MPI and serialization

Since my project is going to involve MPI parallelization and there is a Boost.MPI library, I’ll of course have a look at it and most likely also use it (I really like Boost). The main advantages of Boost.MPI over MPI’s C interface are that it is type-safe, more expressive and generally nicer to use. Instead of packing data into buffers by hand before sending it, we can let Boost.Serialization handle that task for us.

Let’s say we have the following structs that represent one-dimensional and multidimensional ranges:

template<class T>
struct Range {
   T begin, end;
};
template<class T, int Dim>
struct NRange {
   Range<T>& operator[](int i)
   { return ranges[i]; }
   const Range<T>& operator[](int i) const
   { return ranges[i]; }
private:
   Range<T> ranges[Dim];
};

To make them serializable we could put a serialize function, as required by Boost.Serialization, into their definitions. But since they already live in their own headers, which don’t depend on Boost.Serialization, we might want to make them serializable in a non-intrusive way, which is luckily possible:

#include "Range.h"

namespace boost {
namespace serialization {

template<class Archive, class T>
void serialize(Archive & ar, Range<T> & r, const unsigned int version)
{
   ar & r.begin;
   ar & r.end;
}

template<class Archive, class T, int Dim>
void serialize(Archive & ar, NRange<T,Dim> & r, const unsigned int version)
{
   for(int i = 0;i<Dim;++i)
      ar & r[i];
}

} // namespace serialization
} // namespace boost

//...
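
To illustrate the mechanism, here is a small sketch that runs the same kind of serialize function through a hand-written archive. ToyWriter, ToyReader and roundtrip are hypothetical names made up for this example and are not part of Boost; a real Boost archive does considerably more (versioning, tracking, portable formats). The only thing the sketch assumes about an archive is that it offers operator&.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>
#include <vector>

template<class T>
struct Range {
   T begin, end;
};

// Same shape as the non-intrusive serialize function for Range; it only
// needs an archive type that provides operator&.
template<class Archive, class T>
void serialize(Archive & ar, Range<T> & r, const unsigned int /*version*/)
{
   ar & r.begin;
   ar & r.end;
}

// Hypothetical stand-in for an output archive: appends raw bytes to a buffer.
class ToyWriter {
public:
   template<class T>
   ToyWriter & operator&(T & value)
   {
      save(value, std::is_arithmetic<T>());
      return *this;
   }
   const std::vector<char> & bytes() const { return buffer_; }
private:
   template<class T>
   void save(T & value, std::true_type)   // arithmetic types: copy the bytes
   {
      const char * p = reinterpret_cast<const char *>(&value);
      buffer_.insert(buffer_.end(), p, p + sizeof(T));
   }
   template<class T>
   void save(T & value, std::false_type)  // everything else: call serialize
   {
      serialize(*this, value, 0u);
   }
   std::vector<char> buffer_;
};

// Hypothetical stand-in for an input archive: reads the bytes back in order.
class ToyReader {
public:
   explicit ToyReader(const std::vector<char> & bytes) : buffer_(bytes), pos_(0) {}
   template<class T>
   ToyReader & operator&(T & value)
   {
      load(value, std::is_arithmetic<T>());
      return *this;
   }
private:
   template<class T>
   void load(T & value, std::true_type)
   {
      assert(pos_ + sizeof(T) <= buffer_.size());
      std::memcpy(&value, buffer_.data() + pos_, sizeof(T));
      pos_ += sizeof(T);
   }
   template<class T>
   void load(T & value, std::false_type)
   {
      serialize(*this, value, 0u);
   }
   std::vector<char> buffer_;
   std::size_t pos_;
};

// Write a value with ToyWriter and read it back with ToyReader.
template<class T>
T roundtrip(T value)
{
   ToyWriter out;
   out & value;
   ToyReader in(out.bytes());
   T copy = T();
   in & copy;
   return copy;
}
```

Calling roundtrip on a Range<int> holding 23 and 42 gives back a Range<int> with the same two values. The single serialize function serves both directions because writer and reader overload the same operator, which is also how one function can cover saving and loading in Boost.Serialization.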

Notice how the non-intrusive serialize functions don’t require any Boost headers: they only refer to the archive through a template parameter, so we don’t introduce a Boost dependency here. Now that the structs are serializable we can just send them off with MPI:

//...

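#include <iostream>
#include <boost/mpi.hpp>

namespace mpi = boost::mpi;

// Needs at least two processes (ranks 0 and 1).
int main(int argc, char* argv[])
{
   mpi::environment env(argc, argv);
   mpi::communicator world;

   if (world.rank() == 0)
   {
      // Value-initialize so all three ranges hold defined values before sending.
      NRange<int,3> range = NRange<int,3>();
      range[0].begin = 23;
      range[0].end = 42;
      world.send(1, 0, range);   // destination rank 1, message tag 0
   }
   else if (world.rank() == 1)   // any further ranks would otherwise wait forever
   {
      NRange<int,3> range;
      world.recv(0, 0, range);   // source rank 0, message tag 0
      std::cout << range[0].begin << ' ' << range[0].end << std::endl;
   }
   return 0;
}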
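
For comparison, here is a rough sketch of the manual packaging that the send/recv pair above spares us. The helpers pack and unpack are hypothetical names invented for this example; they flatten an NRange into a plain buffer of begin/end pairs, which is the kind of array one would hand to a raw MPI send call.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

template<class T>
struct Range {
   T begin, end;
};

template<class T, int Dim>
struct NRange {
   Range<T>& operator[](int i) { return ranges[i]; }
   const Range<T>& operator[](int i) const { return ranges[i]; }
private:
   Range<T> ranges[Dim];
};

// Hypothetical helper: flatten the range into begin/end pairs.
template<class T, int Dim>
std::vector<T> pack(const NRange<T, Dim> & r)
{
   std::vector<T> buffer;
   buffer.reserve(2 * Dim);
   for (int i = 0; i < Dim; ++i) {
      buffer.push_back(r[i].begin);
      buffer.push_back(r[i].end);
   }
   return buffer;
}

// Hypothetical helper: the receiver has to repeat the exact same layout.
template<class T, int Dim>
NRange<T, Dim> unpack(const std::vector<T> & buffer)
{
   assert(buffer.size() == static_cast<std::size_t>(2 * Dim));
   NRange<T, Dim> r;
   for (int i = 0; i < Dim; ++i) {
      r[i].begin = buffer[2 * i];
      r[i].end = buffer[2 * i + 1];
   }
   return r;
}
```

Every type that travels over the wire would need such a pair of functions, and both sides have to agree on the layout by convention. With the serialize function, that layout is written down once.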

One of the core features of the framework I’m working on is distributed grids. So naturally we want a way to send chunks of the grid to other nodes via MPI. A very simple implementation of a chunk class could look like this:

#include <vector>
#include <boost/serialization/vector.hpp>

//...

template<class T>
class Chunk {
public:
    Chunk() { }
    
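    // Assumes NRange provides a volume() member (not shown in the listing
    // above) that returns the number of cells covered by the bounds.
    Chunk(const NRange<size_t, 3> &b)
    : bounds_(b), data_(b.volume())
    {
    }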

    // subscript operators etc...

private:
    friend class boost::serialization::access;

    template<class Archive>
    void serialize(Archive & ar, const unsigned int version)
    {
        ar & bounds_;
        ar & data_;
    }

    NRange<size_t, 3> bounds_;
    std::vector<T> data_;
};

Here we see the intrusive version, which puts the serialize function right into the class itself. The <boost/serialization/vector.hpp> header that we included along with <vector> provides serialization for std::vector, so we don’t have to do that ourselves. There are a few additional traits we can define, which for example allow bitwise copying in some cases or turn off the versioning of the serialization library, but essentially we can now send Chunks around without worrying about how they are packaged.
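
The Chunk constructor calls b.volume(), which the NRange listing further up does not show. A minimal sketch of what it might compute follows, written as a free function. Both the name and the half-open [begin, end) convention are assumptions made for this example, not necessarily what the framework does.

```cpp
#include <cassert>
#include <cstddef>

template<class T>
struct Range {
   T begin, end;
};

template<class T, int Dim>
struct NRange {
   Range<T>& operator[](int i) { return ranges[i]; }
   const Range<T>& operator[](int i) const { return ranges[i]; }
private:
   Range<T> ranges[Dim];
};

// Hypothetical helper: number of cells covered by the range, treating each
// dimension as the half-open interval [begin, end).
template<class T, int Dim>
T volume(const NRange<T, Dim> & r)
{
   T result = 1;
   for (int i = 0; i < Dim; ++i) {
      assert(r[i].begin <= r[i].end);   // reject inverted ranges
      result *= r[i].end - r[i].begin;  // extent along dimension i
   }
   return result;
}
```

With bounds of [0, 4), [0, 3) and [0, 2) this yields 24, which would be the size of data_. As a member function the body would stay the same, only looping over ranges directly.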
