Genetic Programming-based Self-reconfiguration Planning for Metamorphic Robot
International Journal of Automation and Computing, 2018, Vol. 15, Issue 4: 431-442
Tarek Ababsa1, Noureddine Djedi1, Yves Duthen2     
1 LESIA Laboratory, University of Biskra, PO Box 145 RP, Biskra 07000, Algeria;
2 IRIT Laboratory, University of Toulouse 1 Capitole, 21 Allée de Brienne, Toulouse 31032, France
Abstract: This paper presents a genetic programming based reconfiguration planner for metamorphic modular robots. Initially used for evolving computer programs that can solve simple problems, genetic programming (GP) has recently been used to handle various kinds of problems in the area of complex systems. This paper details how genetic programming can be used as an automatic programming tool for handling the reconfiguration-planning problem. To do so, the GP evolves sequences of basic operations which are required for transforming the robot's geometric structure from its initial configuration into the target one, while the total number of modules and their connectedness are preserved. The proposed planner is intended for both Crystalline and TeleCube modules, which are realized as cubical compressible units. The target pattern of the modular robot is expressed in quantitative terms of morphogens diffused in the environment. Our work presents a solution for the self-reconfiguration problem with both restricted and unrestricted free space available to the robot during reconfiguration. The planner outputs a near-optimal explicit sequence of low-level actions that allows modules to move relative to each other in order to form the desired shape.
Key words: Modular robots     unit-compressible modules     self-reconfiguration     genetic programming     reconfiguration planning    
1 Introduction

Metamorphic modular robots are versatile systems that consist of a large number of autonomous units called "atoms". These units are typically able to connect to and disconnect from each other to meet different needs, or even exchange information and energy with connected neighboring modules. Compared with fixed-body robots, self-reconfigurable robots are believed to be more robust and more adaptive in dynamic environments.

In the robotics community, there are several types of modules that have been designed and prototyped differing in shape and in the operations they perform. In this paper, we focus on lattice-based modular robots in which modules are arranged on a regular grid.

Performing motion primitives in a coordinated way is the subject of the self-reconfiguration problem. The question is how to change connectivity among modules to transform the robot from the current configuration into the desired configuration within the restrictions of the physical implementation [1-4]. One of the major challenges in such a problem is the number of possible configurations, which increases exponentially with the number of modules and the number of their degrees of freedom. This challenge was first studied by Chirikjian et al. [5] and Pamecha et al. [6, 7], who proved that the reconfiguration of modular robots is an NP-complete problem. Thus, determining efficient sequences of module operations that transform a modular robot from one configuration into another is essential for an efficient reconfiguration process.

According to the way in which units are reconfigured, there are two main types of reconfiguration [2]:

1) Deterministic reconfiguration

Here, the planner manipulates the modules into their target locations during reconfiguration. The exact location of each module can be determined at all times, and reconfiguration takes place in determined time steps.

2) Stochastic reconfiguration

Here, the planner is based on statistical processes. During reconfiguration, each module may take unknown paths to move toward target locations. Reconfiguration times can only be estimated statistically.

Considering the amount of free space used during the reconfiguration, we can also classify the reconfiguration algorithms into the following two classes:

1) In-place reconfiguration

Algorithms performing in-place reconfiguration assume that the space available in which modules' movement takes place is constrained to the union of the bounding boxes of the source and target robots.

2) Out-of-place reconfiguration

Algorithms performing out-of-place reconfiguration assume that the quantity of space available in which modules' movement takes place is unbounded. This technique is useful for planning fast reconfigurations.

Most available lattice-based modular robotic systems have only basic locomotion controllers to reshape the robot structure to a few predefined patterns by following predefined sequences and rules. These sequences have usually been hand-coded (by human operators), combined, generalized and optimized. However, in most cases the predefined sequences cannot predict all the possible situations that modular robots may face under dynamic environments. Further, since many atoms may perform moves simultaneously, it is useful to increase the number of parallel steps to reduce the reconfiguration time.

In this work we present an experimental study of self-reconfiguration planning using an automatic programming engine. The planner evolves and optimizes sequences of low-level actions which are required to transform the robot's geometric structure from the initial configuration to the target one. We assume that during reconfiguration the total number of modules and their connectedness must be preserved. This aspect ensures that the overall structure does not fragment during actuation due to module disconnections. The proposed planner is intended for both Crystalline and TeleCube modules, which are designed and prototyped in hardware as cubical compressible units.

It is important to note that the mechanical actions (attach, detach, expand, contract) performed by the units are the slowest part of the system. Thus, we are also interested in reducing the total number of low-level operations required to achieve the desired reconfiguration.

2 Related work

In related work, several lattice-based self-reconfiguring modular robots have been proposed with different designs. There have been a wide variety of actuation modes that require different planning algorithms to manipulate each design with respect to its physical constraints.

Rus and Vona [4] used meta-modules composed of 4$\times$4 individual units to guarantee completeness of reconfiguration spaces. The concept of meta-modules transforms the self-reconfiguration problem from finding sequences of module-level actions into finding a sequence of high-level meta-module actions. Rus and Vona also proposed the "Melt-Grow" algorithm as one of the first algorithms designed for reconfiguration of expanding-cube style modular robots. However, this algorithm is not distributed and it does not allow for parallel actuation. As for the complexity, it requires O$(n^2)$ steps, where $n$ is the number of modules in the robot. Besides, this algorithm does not allow in-place reconfiguration since it transforms the starting configuration into an intermediate structure in the first stage of the run. More recently, the same idea has been improved and used again by Aloupis et al. [8]. Although the reconfiguration involves a total of O$(n)$ atomic operations, it still occurs out of place.

Pamecha et al. [6, 7] used simulated annealing to solve the self-reconfiguration problem using hexagonal deformable modules. In this method, the energy difference E(C, T) between an arbitrary configuration C and the target configuration T is computed to determine the set of neighboring configurations from which the next configuration will be selected. The proposed method is only computationally feasible for one module moving at a time. The restriction of single-module motion may render this solution too slow for systems consisting of large numbers of modules. More recently, Larkworthy and Ramamoorthy [9] proposed an O$(n)$ centralized algorithm for single-move reconfiguration planning for the same hexagonal metamorphic robots. They also showed empirically how multi-move plans scale linearly with the number of units. This improvement has enabled efficient planning over a large spanning subset of the Chirikjian reconfiguration state space. However, this contribution requires the surface of the configuration to be kept unconstrained for the moving units.

The PacMan algorithm introduced by Butler and Rus [10] seems to be the first distributed reconfiguration algorithm for expanding-cube style modular robots. This algorithm runs in a distributed fashion by the individual modules in two phases. The first phase requires path planning for the modules that will move to fill target positions, while the second phase requires the modules to cooperate in order to actuate all planned paths. This algorithm relies on the fact that in a homogeneous system the final shape of a goal configuration is important, not the exact location of specific modules. Thus, adapting this algorithm to heterogeneous systems remains an open issue. Moreover, the algorithm is not designed to be optimal, and it assumes that the actuators are strong enough to push and pull any number of units during reconfiguration.

Emergent structures exhibit another interesting approach for planning reconfiguration. Rather than explicitly specifying final configurations, the desired shapes emerge as the result of applying local control rules. Kubica et al. [11] used agent-based control programs for a modular robot made of TeleCube units. By using local control rules, the proposed approach allows units to self-reconfigure in response to weight for internal object manipulation.

Kala [12] used genetic programming (GP) for motion planning for multiple mobile robots. The mission was to move $n$ robots located in a maze-like map to their destinations. Even though the designed approach is able to guide every single robot to its destination, it lacks some essential components, which prevents it from being used for reconfiguring modular robots. First, the author assumed that after reaching the goal, a robot disappears from the goal in order not to obstruct other robots. Second, the approach is not designed to support a large number of robots. Wang et al. [13] transformed the reconfiguration planning for a group of e-puck mobile robots into an optimal formation problem; shape theory is used to describe the desired formation and the optimization problem is solved by a recurrent neural network. Due to its parallel computation nature, the model can solve large-scale optimal formation problems effectively. However, due to the involvement of the Euclidean distance, the formation is discussed only in free space (the robots do not negotiate obstacles while performing the task).

Previous work in self-reconfiguring robotics has proposed several approaches for controlling these dynamic systems. However, almost none of these approaches is constrained by implicit interaction (attachment and detachment) with walls, floors, or any other obstacles. Furthermore, it is important to note that planning in the native kinematic space of the module (module-level actions) opens the opportunity of optimizing reconfiguration steps for various quantities of interest.

3 Overview of the compressible units and the simulator

Fig. 1 shows some 2D compressible units in different configurations. The units are cubic in shape (squares in 2D) with an extensible arm on each of their six faces. Each unit is completely autonomous, with computation, power and communication onboard. Besides, each unit can expand each of its six arms independently up to a factor of two times its fully contracted configuration. A set range for contracting and expanding then allows two neighboring modules to contract to half the breadth of a fully expanded module and fill a single grid space in the lattice. A single grid gap is then created for other modules to be pushed into. Each face of the unit (called a connection plate) is equipped with an electromechanical mechanism allowing the unit to clamp onto the neighboring unit's connection plate. The mechanical design of compressible modular robots is particularly advantageous for planning since modules can move through the interior of a robot structure rather than only along its exterior surface. The reader is referred to [14] for more hardware details.

Fig. 1. Compressible units in different configurations

3.1 Unit-compressible motion

The unit-compressible motion is controlled by attaching the unit to a neighboring one and actuating the expansion or contraction mechanism. First, the unit disconnects from all neighbors in directions perpendicular to the direction of motion. Then the actual move is performed by expanding and contracting the back and the front arms. The unit slides one step in the desired direction with respect to some substrate modules. A single module is unable to move on its own without interacting with neighboring modules; however, it can carry other attached modules as long as the target positions for the carried modules are unoccupied. When the units combine these operations in a coordinated way, they move relative to one another, resulting in a reconfiguration of the robot (as shown in Fig. 1).

Crystalline and TeleCube modular robots have units that are arranged and connected in a regular three-dimensional pattern (a cubic grid). As modules actuate the expansion or contraction mechanism, each module performs linear motions, which can be described as a set of functions $f_i(x)$ where

$ \displaystyle{ f_i(x) = x + \frac{1}{2} \mu_i \textrm{, }} \left\{ \begin{array}{ll} \displaystyle{x = \frac{1}{2}k \textrm{ , } k\in {\bf Z}^3 }\\ \mu_i \in \{(1, 0, 0), (0, 1, 0), (0, 0, 1)\}. \\ \end{array} \right. $
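As an illustrative reading of this formula (an example added here, not taken from the original derivation), consider a module point at $x = \frac{1}{2}(2, 0, 0) = (1, 0, 0)$. Expanding along $\mu_i = (1, 0, 0)$ gives

$ f_i\big((1, 0, 0)\big) = (1, 0, 0) + \frac{1}{2}(1, 0, 0) = \big(\frac{3}{2}, 0, 0\big) = \frac{1}{2}(3, 0, 0) $

which is again of the form $\frac{1}{2}k$ with $k = (3, 0, 0) \in {\bf Z}^3$, so every expansion or contraction keeps module positions on the half-integer lattice.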
3.2 Vocabulary of module actions

In what follows, the unit we refer to as parent is the unit that executes a given action, while the dangling modules include all the units that are attached to the parent. During the robot's reconfiguration, each module can perform the following low-level primitives:

1) ExpandArm (direction)

Before expanding an arm in direction, the following conditions must be guaranteed: a) either the parent or the dangling modules must be labeled as a moving block, meaning that one of them is not attached to a fixed block of units; otherwise, the sum of all forces is equal to zero and the motion fails; b) no collision may occur when performing the expansion. This primitive returns true if the expansion succeeds, and false otherwise.

2) ContractArm (direction)

Before contracting an arm in direction, the following conditions must be guaranteed: a) either the parent unit or the dangling modules must be labeled as a moving block; b) no collision may occur when performing the contraction. This primitive returns true if the contraction succeeds, and false otherwise.

3) Connect (direction)

If there is an immediate neighboring module in direction, activate the appropriate connection plate in order to bond with that module. Then update the neighboring list of the new connected modules.

4) Disconnect (direction)

If there is a module linked in direction, break the link with it only if the system remains connected after disconnection. Then update the neighboring list of the new disconnected modules.

Combining these primitives, we can build more complicated actions for high-level manipulation of the robot. We can also build a complete program that reconfigures a modular robot to take several shapes.
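As a rough illustration of such a composition, the following sketch (hypothetical names, not code from our planner) chains the four primitives into a single-step slide toward the east; each primitive is assumed to return false when its preconditions fail, and the ordering of the steps is only one possible choice:

# Sketch of a higher-level "slide one step east" action built from the
# four low-level primitives of Section 3.2.  `robot`, its methods and the
# direction constants are hypothetical abstractions.
EAST, NORTH, WEST, SOUTH = 0, 1, 2, 3

def slide_east(robot, unit):
    # Release the links perpendicular to the direction of motion.
    for d in (NORTH, SOUTH):
        robot.disconnect(unit, d)
    # While still anchored to a substrate module on the west face,
    # expanding the west (back) arm pushes the body toward the east.
    if not robot.expand_arm(unit, WEST):
        return False
    # Grab the structure ahead, release the rear anchor, then retract
    # the back arm to complete the step.
    robot.connect(unit, EAST)
    robot.disconnect(unit, WEST)
    return robot.contract_arm(unit, WEST)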

3.3 Compressible units simulator

For our experiments, we implemented a simulator for two types of compressible modular robots based on Crystalline and TeleCube units. The units are square in shape and all have the same size; they are located on a regular grid and act in a two-dimensional discrete environment. We suppose that next to each connector there is a transceiver and a set of appropriate sensors, which allow modules to communicate with connected neighboring modules and to sense the state of the nearby grid. A single square module is embedded in a discrete coordinate system, and it fills one grid location in the world when its four arms are contracted back to the body. In addition, it occupies an extra $\frac{1}{2}$ of a grid cell for each expanded arm. The four facing directions (east, north, west, south) of each module are encoded as integer values in [0, 3], while the states of the four actuators are encoded as a Boolean array $S_{i \in [0, 3]}$, where $S_i=$ true if $arm_i$ is expanded and $S_i=$ false otherwise.
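A minimal sketch of this per-module state (assumed names, not the actual data structures of our simulator) is given below; it only records the grid cell of the body and the Boolean arm states described above:

from dataclasses import dataclass, field
from typing import List, Tuple

EAST, NORTH, WEST, SOUTH = 0, 1, 2, 3
DIR_VECTORS = {EAST: (1, 0), NORTH: (0, 1), WEST: (-1, 0), SOUTH: (0, -1)}

@dataclass
class Module:
    ident: int                    # module identifier
    cell: Tuple[int, int]         # grid location of the contracted body
    arms: List[bool] = field(default_factory=lambda: [False] * 4)  # True = expanded

    def footprint(self):
        # One grid cell for the body plus half a cell per expanded arm.
        return 1.0 + 0.5 * sum(self.arms)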

Basically, the simulator executes a given sequence of native actions and rewards each action with true if it is successfully performed and with false otherwise. For simplicity, no complex physical constraints except collision detection and Newton's three laws of motion are taken into account. When expanding and contracting arms, a unit exerts force (push or pull) upon connected neighboring units. By using Newton's laws of motion, it is possible to determine which part of the system is pushed or pulled against which other and in what direction. In this paper, we suppose that the module's actuators (mechanisms that control arms) have constant strength (i.e., each unit can push and pull a constant number of other units when expanding and contracting). We also suppose that module arms can only occupy two states, fully expanded and fully contracted. When a module sticks somewhere to the floor, the simulator assumes infinite strength between the module and the floor to avoid slipping (at least one module must be connected to the floor or to a stationary obstacle when performing motion).

4 GP-based reconfiguration planning

Genetic programming is one of a number of evolutionary algorithms that solve problems without being explicitly programmed [15, 16]. It achieves this goal of automatic programming by genetically breeding a population of computer programs using biologically inspired operations (stochastic selection, crossover and mutation). In each generation, the fitness of each individual in the population is evaluated. Once the termination criterion for the run is satisfied, the single best individual obtained during the run is considered the result of the run. Such a process stochastically transforms a population of random computer programs into a new, more optimized generation of programs. Unlike genetic algorithm (GA) individuals, which are typically encoded as linear bit-strings, GP usually uses tree-based expressions as nonlinear structures for hierarchically structuring and manipulating individuals. The shape and the contents of these individuals can dynamically change during the process.

In the area of complex systems, GP has demonstrated its potential in evolving programs for a wide range of applications including classification and pattern recognition, symbolic regression and circuit design [15, 17]. In this work, we use GP as an automatic programming engine to solve the reconfiguration planning problem. To do so, the GP computes a feasible plan that compiles an abstract move down into a sequence of valid native moves which, given a source robot S and a target robot T, reconfigures S into T (Fig. 2). Both S and T are robots composed of n modules arranged in a 2D grid. For the purpose of reliability, we require that the computed plan maintain system connectivity at every time step of the reconfiguration process.

Fig. 2. General scheme of the planner. The planner takes initial and target configurations as inputs, then it gives us a sequence of primitives required to transform the initial configuration into the target one.

We used the tree-based GP engine that was introduced by Koza [15]. In his model, Koza mentioned five major steps in preparing to use the genetic programming paradigm to solve a problem. These are choosing: 1) the set of terminals, 2) the set of functions, 3) the fitness measure, 4) the values of the numerical parameters and qualitative variables for controlling the run, and 5) the criterion for designating a result and terminating a run.

4.1 Target shape description

Performing a successful reconfiguration requires the units to move toward target positions without being stuck behind each other or behind obstacles that exist in the environment. The goal of the units is to find the target location and move together as close to it as possible.

Defining the fitness function as the sum of each module's distance to the target location is a particularly attractive choice in a number of situations; however, it is of little use when the environment contains concave obstacles [18]. Thus, we use the concept of morphogens for representing the target configuration.

Morphogens are long-range signaling molecules that act as graded positional cues to control cell differentiation in a concentration-dependent manner. This concept has brought to light the potential for signaling gradients to provide positional information for many developing multicellular organisms [19]. In previous work [18], we used morphogens to guide a metamorphic modular robot toward target objects without having any explicit information about their locations. In the present work we use the same strategy to provide positional information for the reconfiguring modular robot.

$ \begin{equation} C_s(a)=C_0 {\rm e}^{-\beta L_{(s, a)}}. \end{equation} $ (1)

We assume that the environment contains different morphogens that gradually spread over the grid by means of the decay function indicated in (1), where $C_0$ is the morphogen concentration at the location of a producing cell $s$, $\beta$ is a positive decay rate and $L_{(s, a)}$ is the full length of the shortest path between the producing cell $s$ and a given position $a$. Morphogens diffuse only through empty grid cells, respecting the obstacles that exist in the environment. During their spreading, the substances follow the easiest path, moving from highly concentrated grid cells to less concentrated ones.
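The following sketch (assumed grid encoding and names, not the authors' implementation) computes the field of (1) with a breadth-first search, so that the path length $L_{(s, a)}$ is measured only through free cells and around obstacles:

import math
from collections import deque

def morphogen_field(grid, source, c0=1.0, beta=0.3):
    # grid[y][x] is True for an obstacle cell; `source` is the producing cell (x, y).
    h, w = len(grid), len(grid[0])
    dist = {source: 0}
    queue = deque([source])
    while queue:
        x, y = queue.popleft()
        for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if 0 <= nx < w and 0 <= ny < h and not grid[ny][nx] and (nx, ny) not in dist:
                dist[(nx, ny)] = dist[(x, y)] + 1   # L(s, a): shortest free path
                queue.append((nx, ny))
    # C_s(a) = C_0 * exp(-beta * L(s, a)); unreachable cells receive no morphogen.
    return {cell: c0 * math.exp(-beta * d) for cell, d in dist.items()}

Negative morphogens, released outside the target pattern, can be obtained from the same routine by using a negative value of c0.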

The target pattern of the modular robot is defined by the morphogen value of each grid cell. The morphogen is released from producing cells, which occupy the target grid cells. The morphogen value can be either positive or negative. A positive value means that the grid cell should be occupied by a module, while a negative gradient suggests that the module in the cell, if any, should be removed. A higher morphogen value indicates a higher priority for the cell to be filled by a module.

The distribution of morphogen gradients creates a virtual landscape that helps the modular robot to decide which direction to move in while flowing toward the target locations, just as gravity does for fluids traversing a terrain toward a sink.

4.2 Representation of GP individuals

The set of all possible individuals that GP can generate includes all the trees that can be recursively created by compositions of a set of function symbols $F=\{f_1, f_2, \cdots, f_n\}$ and a set of terminal symbols $T=\{t_1, t_2, \cdots, t_m\}$. Each function in the function set $F$ takes a specified number of arguments. Functions are derived directly from the description of the hardware capabilities; they include the four mechanical primitives that can be performed by each single module (attach, detach, expand, contract), each of which requires both the identity of the module by which it will be performed and the module's actuator (east, north, west, south) that will be used to perform this function.

The configuration space generated by a system of n lattice-based modules is $ C = {\bf Z}^{2n}$. It includes all configurations that may be achieved by the collection of these modules. Any configuration ($c$) is a subset of the configuration space ($c \subset {\bf Z}^{2n}$) that refers to a particular arrangement of connectivity between modules.

When a given function $f_i$ is applied to an arbitrary configuration $c$, the configuration either remains intact or turns into a new configuration $\acute{c}$, as expressed in (2).

$ \begin{equation} f_i(t_1, t_2, c) = \left\{ \begin{array}{ll} c,&\textrm{ if the function $f_i$ fails } \\ \acute{c},&\textrm{ if the function $f_i$ succeeds. } \\ \end{array} \right. \end{equation} $ (2)

A function $f_i(t_1, t_2, c)$ fails if one of the following occurs:

1) Collision is detected: Due to module expansion and contraction by means of arguments $(t_1, t_2)$, collision can occur either between modules or between any module and the obstacles that exist in the environment.

2) The overall structure is fragmented: The robot structure may be divided into several parts due to module disconnections performed by means of $(t_1, t_2)$.

Whether a function $f_i \in {F}$ succeeds or fails, it always results in a configuration $c \in {C}$, so that the next function can be applied to this result. The closure property is respected since each of the functions in the function set is able to accept, as one of its arguments, any value and data type that may possibly be returned by any function in the function set (Fig. 3). A sequence of consecutive native actions applied to a given configuration can be viewed as a composition of functions, implemented as shown in Fig. 4. The result returned by this composition of functions is the final configuration obtained when the entire sequence of functions is executed while traversing the program tree by a depth-first recursive process (a sketch of this evaluation is given after the definitions of $T$ and $F$ below).

Fig. 3. Hierarchical representation of a composition of functions

Fig. 4. Genotype implementation, each node encodes a single primitive "$C_{op}$" and its arguments ("$id_m$": for the module identifier and "$dir$": for the actuator)

The set of terminal symbols $T$ and the set of function symbols $ F$ are respectively represented as follows: $ T=\{t_1, t_2, c\}$

1) The first terminal symbol $t_1 \in \{0, 1, \cdots, n - 1\}$ is used to identify the module $M_{t_1}$ that will perform a given action $f_i \in F$.

2) The second terminal symbol $t_2$ takes value in $\{0, 1, 2, 3\}$. It is used to specify which one of the module′s actuators (east, north, west, south) will be used to perform the given action.

3) The third terminal symbol $c \in C$ represents the configuration on which the selected action $f_i$ will be applied using arguments $(t_1, t_2)$.

$ F=\{f_0, f_1, f_2, f_3\} \left\{ \begin{array}{ll} f_0(t_1, t_2, c)=attach(t_1, t_2, c) \\ f_1(t_1, t_2, c)=detach(t_1, t_2, c) \\ f_2(t_1, t_2, c)=expand(t_1, t_2, c) \\ f_3(t_1, t_2, c)=contract(t_1, t_2, c). \\ \end{array} \right. $
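A sketch of how an individual built from $F$ and $T$ can be evaluated (assumed data structures, not our GP engine) follows: each node carries one primitive with its two terminal arguments, the subtree below it produces the input configuration, and a failed primitive returns its configuration unchanged, as in (2):

from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    op: str                   # "attach" | "detach" | "expand" | "contract"
    module_id: int            # terminal t1
    direction: int            # terminal t2 (0..3)
    child: Optional["Node"]   # sub-expression producing the input configuration

def evaluate(node, initial_config, simulator):
    # Depth-first: evaluate the sub-expression first, then apply this node.
    if node is None:
        return initial_config
    config = evaluate(node.child, initial_config, simulator)
    new_config, ok = simulator.apply(node.op, node.module_id, node.direction, config)
    return new_config if ok else config   # equation (2)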
4.3 Fitness evaluation

Fitness is the driving force of natural selection in both conventional genetic algorithms and genetic programming. It represents the degree to which a given individual solves the problem of interest. This measure is used to control the genetic operations (selection, crossover, mutation) in the artificial population and to determine the probability that the individual survives up to the next generation.

Since the fitness function has a great impact in guiding GP to obtain the best solutions within a very large search space, it is important to design it as well as possible in order to help the GP explore the search space more effectively and efficiently. Our experience in applying GP to solve the reconfiguration problem suggests that the fitness function should take into account the following three metrics:

1) The sum of all distances between the locations of the units and the target locations where these units are supposed to be. However, instead of directly measuring these distances, we use the concept of positional information, which is most commonly thought to be implemented by morphogen gradients. In general terms, assuming that morphogens are produced at each of the target locations, the perceived morphogen $\delta_M$ will increase as the modular robot gets closer to the target configuration. On the other hand, responding in a concentration-dependent manner alone might drive the reconfiguration process to output an improperly shaped robot that does not match the desired configuration. This situation occurs when the most concentrated cells do not bound the target configuration space. It is thus useful to consider other metrics such as the bounding volumes.

2) Another such metric is the overlap metric $\delta^1_V$, which gives the volume remaining outside the target configuration, $\delta^1_V = V - V_T$, where $V$ is the space occupied by the achieved configuration and $V_T$ is the space occupied by the target configuration. In turn, this metric is expressed in terms of the perceived negative morphogens, which are equally spread outside the target configuration.

3) The third metric, $\delta^2_V = V_T - V$, is the difference between the bounding boxes of the achieved configuration and the desired configuration.

Let $x=\delta_M$ and $y = \delta^1_V + \delta^2_V = V + V_T - 2(V \cap V_T)$. The fitness function $f(x, y)$ we propose is calculated by means of $\delta_M$, $\delta^1_V$ and $\delta^2_V$. It takes $x$ and $y$ as arguments and returns the degree to which the modular robot matches the desired shape.

$ \begin{equation} (\frac{\partial f}{\partial x} > 0) \wedge (\frac{\partial f}{\partial y} < 0). \end{equation} $ (3)

In general, we assume that the fitness of an individual increases with the perceived morphogens ($\frac{\partial f}{\partial x} > 0$) and decreases with both the volume remaining outside the target configuration and the difference between the bounding boxes of the final configuration and the desired configuration ($\frac{\partial f}{\partial y} < 0$).

$ \begin{eqnarray} f(x, y) = \frac{1}{2^y}(1+\frac{x}{x+1}) \end{eqnarray} $ (4)
$ \begin{eqnarray} Fitness = \displaystyle\frac{1}{2^{(\delta^1_V + \delta^2_V)}}(1+\displaystyle\frac{\delta_M}{\delta_M+1}). \end{eqnarray} $ (5)

There are many possible functions satisfying the conditions given in (3). The fitness function that we designed is shown in (4) and (5). This function increases with the value of $x$ and quickly loses value as $y$ gets bigger (placing modules in a proper location is more important than getting closer to the target configuration). Furthermore, this function is injective and the gradient $\bigtriangledown f$ always has a non-zero value ($\bigtriangledown f(x, y) \neq (0, 0)$ $\forall (x, y) \in \bf{R}^2$), which significantly helps the evolutionary algorithm to explore the search space without getting stuck at critical points.
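Expressed directly in code, (5) amounts to the following (a straightforward transcription; the three metrics are assumed to be computed elsewhere):

def fitness(delta_m, delta_v1, delta_v2):
    # y: volume mismatch with the target, x: perceived positive morphogen.
    y = delta_v1 + delta_v2
    x = delta_m
    return (1.0 / 2 ** y) * (1.0 + x / (x + 1.0))

The value is bounded by 2 when the mismatch $y$ vanishes and decays geometrically as $y$ grows, which reflects the priority given to placing modules inside the target volume.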

The use of arbitrary-length representations in genetic programming often presents the bloat challenge to the search process. This challenge occurs due to uncontrolled and unbounded code growth without a corresponding improvement in fitness. Bloat slows the evolutionary search process and significantly consumes memory. The fitness function expressed in (5) allows individuals to constantly grow during the evolution. Hence, the search process has the potential to become inefficient due to the evaluation of big programs that usually suffer from the proliferation of introns (parts of the program that do not contribute to the problem resolution). To overcome this problem, one simple solution consists of punishing individuals in some way for being large. However, constantly reducing the size of the individuals carries the risk that the number of possible nodes in a tree of maximum depth might not be enough to represent a solution for the given problem. Instead, a Gaussian-weighted penalty is applied to the fitness to control the influence of the individual's size on its performance in solving the problem at hand. When a change in the fitness occurs, as measured in (8), the weighted sum expressed in (6) gives more or less influence to the individual's size.

We used the fitness function expressed in (6) to avoid setting hard limits on program sizes. The $z$ argument represents the individual's size in terms of the total number of nodes (the number of encoded primitives).

$ \begin{eqnarray} f_t(x, y, z) = \frac{1}{2^y}(1+\frac{x}{x+1}) - (1 - \frac{\delta_t}{\delta_0}) \sqrt{z} \end{eqnarray} $ (6)
$ \begin{equation} \delta_t = \frac{1}{\sigma \sqrt{2\pi}}{\rm e}^\frac{-t^2}{2\sigma^2}\textrm{, }~~ t = \left\{ \begin{array}{ll} \displaystyle \frac{t}{2},&{\rm if}(\Delta f_t \neq 0)\\ t + 1,&{\rm if}(\Delta f_t = 0) \\ \end{array} \right. \end{equation} $ (7)
$ \begin{equation} \Delta f_t = f_t(x, y, z) - f_{t - 1}(x, y, z). \end{equation} $ (8)

As the fitness function increases (meaning that the current population evolves), the parameter $t$ is halved and keeps decreasing by a factor of $\frac{1}{2}$ at every time step $s$ (i.e., by a factor of $\frac{1}{2^s}$ after $s$ steps) while $\Delta f_t \neq 0$. At this stage, it is useful to lower the penalty for individuals to grow in order to allow them to acquire more nodes (genetic material) while there is a corresponding improvement in their fitness. The properties of this specification can be formulated as follows:

$ \begin{eqnarray} \lim\limits_{s\to\infty} \delta_t = \delta_0 \Rightarrow \lim\limits_{s\to\infty} f_t(x, y, z) = \frac{1}{2^y}(1+\frac{x}{x+1}). \end{eqnarray} $ (9)

The evolution should benefit from the creation of new genetic material, which provides the necessary amount of diversity for GP to evolve individuals.

In contrast, when $\Delta f_t = 0$ (no evolution occurs in the considered population), the parameter $t$ increases and keeps increasing at every time step $s$ while $\Delta f_t = 0$. After a while ($\delta_t \simeq 0$), we consider that the evolution has stagnated; thus it is useful to start optimizing the population (smaller individuals replace larger ones having the same performance on the problem at hand). This specification can be formulated as follows:

$ \begin{eqnarray} \lim\limits_{s\to\infty} \delta_t = 0 \Rightarrow \lim\limits_{s\to\infty} f_t(x, y, z) = \frac{1}{2^y}(1+\frac{x}{x+1}) - \sqrt{z}. \end{eqnarray} $ (10)

At this stage, individuals that exceed a certain size threshold "Thr" are allowed to lose some genetic material in the hope of reducing the amount of physical memory used as well as the processing time required to run and evaluate each individual.
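The complete schedule of (6)-(8) can be sketched as follows (the names, the initial value of $t$ and the value of $\sigma$ are assumptions):

import math

def gaussian_weight(t, sigma=1.0):
    # delta_t of equation (7).
    return math.exp(-t * t / (2 * sigma * sigma)) / (sigma * math.sqrt(2 * math.pi))

def penalized_fitness(x, y, size, t, sigma=1.0):
    # Equation (6): base fitness minus a Gaussian-weighted size penalty.
    delta_t = gaussian_weight(t, sigma)
    delta_0 = gaussian_weight(0.0, sigma)
    base = (1.0 / 2 ** y) * (1.0 + x / (x + 1.0))
    return base - (1.0 - delta_t / delta_0) * math.sqrt(size)

def update_t(t, delta_f):
    # Equation (7): fitness improved -> relax the penalty (t halved);
    # stagnation -> tighten it (t incremented).
    return t / 2.0 if delta_f != 0 else t + 1.0

As $t \to 0$ the penalty vanishes, recovering (9), and as $t$ grows the penalty tends to $\sqrt{z}$, recovering (10).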

5 Experimental results

In our experiments, we used the standard GP mutation and recombination operators for trees. The mutation operator replaces a subtree with a randomly created subtree, and the crossover operator swaps subtrees rooted at two randomly chosen crossover points to generate two new offspring trees. The execution parameters for the GP are shown in Table 1. We used roulette-wheel selection to select individuals for performing genetic operations. The population was initialized with random trees (programs) with a maximum tree depth of 6. A population size of 500 individuals was used to achieve the results depicted in Fig. 5. The simulator is synchronous, which means that each module performs its action in turn. This assumption is unrealistic; we intend to relax it in future work.
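For concreteness, the outer loop of these experiments can be sketched as follows; the helpers random_tree, evaluate_individual, subtree_crossover and subtree_mutation, as well as the crossover rate and number of generations, are assumptions used only for illustration (the actual parameter values are those of Table 1):

import random

def roulette_select(population, fitnesses):
    # Fitness-proportionate selection (assumes non-negative fitness values).
    total = sum(fitnesses)
    r, acc = random.uniform(0.0, total), 0.0
    for individual, fit in zip(population, fitnesses):
        acc += fit
        if acc >= r:
            return individual
    return population[-1]

def evolve(pop_size=500, max_depth=6, generations=1500, p_crossover=0.9):
    population = [random_tree(max_depth) for _ in range(pop_size)]
    best = None
    for _ in range(generations):
        fitnesses = [evaluate_individual(ind) for ind in population]
        best = population[max(range(pop_size), key=lambda i: fitnesses[i])]
        offspring = []
        while len(offspring) < pop_size:
            parent = roulette_select(population, fitnesses)
            if random.random() < p_crossover:
                mate = roulette_select(population, fitnesses)
                offspring.append(subtree_crossover(parent, mate))  # keep one offspring
            else:
                offspring.append(subtree_mutation(parent, max_depth))
        population = offspring
    return best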

Table 1. Set of all parameters used for the experiments

Fig. 5. Left-hand panels (a, c, e, g, i): evolution over time of the size $(z)$ of the best individuals. Right-hand panels (b, d, f, h, j): the fitness function against generations

During the evolution, each individual must be run separately to measure how well it performs the task. The result of the composition of functions encoded in a given individual is the configuration achieved when the entire sequence of these functions is carried out on the initial state of the system. The fittest individual is the program that transforms the initial configuration into the configuration which corresponds most closely to the goal.

We empirically tested the proposed method on five different reconfigurations: in-place tunneling, out-of-place tunneling, sliding, circle-shaping and internal-move. The reconfigurations are depicted in Fig. 6. Some of them require multiple waves of actuation; others require modules to perform corner turns or internal movements. The results depicted in Fig. 5 are based on 24 runs. They show the progression over time of some system parameters while performing each task. They also show how the fitness function forces the GP to select genomes that best guide the modules to move toward positive morphogen gradients and away from negative morphogens.

Fig. 6. Five different reconfigurations considered for testing the planner. The in-place-tunneling reconfiguration is constrained by obstacles ("$X$" symbols)

The first observation we can make about these results is the capacity of genetic programming to generate near-optimal sequences of primitives for compressible-unit modular robots to perform self-reconfiguration. In Fig. 5 (c), we notice that the initial solution is achieved at generation 300 with a size of 604 total operations. This solution contains a feasible sequence of functions that reshapes the modular robot into the desired pattern. However, the encoded expression is complicated by unnecessary duplication of functions ("introns"). For this reason, the GP starts to recombine the genetic material of each individual in order to pick up the finest short-length expressions that solve the desired reconfiguration. By executing only effective sequences when testing the fitness function, evaluation can be accelerated significantly. In Fig. 5 (c), a near-optimal expression is achieved at generation 1 080 with a size of 15 total operations.

We also noticed that GP performs well for reconfigurations that require moving modules toward target locations which are aligned with the direction of the move (Fig. 5 (c)). This kind of reconfiguration may be achieved by involving only a limited number of modules which execute some periodic sequences of actions. Here, it is important to note that some modules (dangling modules) will not contribute to the reconfiguration. In contrast, for reconfigurations that require orthogonal turns (Figs. 6 (a) and 6 (b)) or those that require maneuvering modules on interior lines (Fig. 6 (d)), the problem seems more complex for GP to solve, which is why it takes more generations before the first feasible sequence is achieved. In these cases, the difficulty stems from the physical complexity of turning corners under unit-compressible actuation, because the disconnections required to turn a corner (as well as to maneuver modules on interior lines) need a minimum amount of surrounding structure to be present (some snapshots of the interior-move reconfiguration are depicted in Fig. 7).

Fig. 7. Snapshots of internal move reconfiguration

Fig. 8 shows how different fitness functions affect the optimization process. When the fitness function that associates each genome to one solution is not injective (function A in Fig. 8), different genomes can encode the same solution and the representation is said to be degenerate. A slight degeneracy is not intrinsically a serious problem. However, this phenomenon can sometimes badly affect the search process, mostly because several genomes representing the same phenotype can introduce confusion into the search.

Fig. 8. Evolution of the perceived positive and negative morphogens when using injective and non-injective fitness function

Table 2 shows the ratio between the average size of the first achieved sequences "$F$-$Seq$" and the size of the near-optimal sequences "$Opt$-$Seq$" for each reconfiguration. The optimized sequences are able to solve the reconfiguration in fewer steps, where each sequence is executed in a centralized fashion.

Table 2. Summary of experimental results

Although it has been executed in a centralized fashion, the evolved program could also be executed in a decentralized fashion, as explained in the following steps:

1) Propagate the program to each unit in the system by a message-passing algorithm.

2) Each unit keeps both the code segments in which it is involved and the corresponding predecessor for which it should wait before starting the execution of the next primitive.

3) The first unit is the unit which is involved in the first line of the program. This unit starts the execution and propagates a termination signal at the end of each single execution.

4) When a unit receives a termination signal from the appropriate predecessor, it starts the execution and propagates a termination signal at the end of each single execution.

Here, the termination signal is used for synchronizing the execution of primitives over multiple modules. It contains the identifier of the module which has just finished executing the current sequential primitive. When a unit receives this signal from a neighbor, it compares the signal identifier with that of the module it is waiting for (the module which executes the predecessor of its current primitive). If both identifiers are identical, the unit starts executing its current primitive; otherwise, it forwards the received signal toward the neighboring units.
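A sketch of this handshake from the point of view of a single unit is given below; the unit object, its fields and the message type are assumed abstractions rather than a real API:

from dataclasses import dataclass

@dataclass
class TerminationSignal:
    sender_id: int   # module that has just finished its primitive

def on_termination_signal(unit, signal):
    if signal.sender_id == unit.awaited_predecessor:
        # Our predecessor has finished: execute our current primitive ...
        unit.execute_current_primitive()
        # ... announce completion so that the successor can proceed ...
        unit.send_to_neighbors(TerminationSignal(sender_id=unit.ident))
        # ... and move on to the next primitive we are involved in.
        unit.advance_to_next_primitive()
    else:
        # Not the module we are waiting for: forward the signal onward.
        unit.forward_to_neighbors(signal)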

6 Conclusions and future work

In this paper, we have used genetic programming to solve the self-reconfiguration problem for modular robots which are based on compressible units. We see self-reconfiguration as a set of functions recursively applied to a given configuration. These functions are designed according to the mechanical actuators (motion primitives) of each unit in the system.

Here, genetic programming has been used as an automatic programming tool to find a near-optimal sequence of module motion primitives that transforms the start configuration into the goal. Our approach does not require any extra effort to break down the problem into subproblems, as must be done when using a dynamic programming method. In this work, the evolution is divided into two phases. First, a population of random programs is evolved toward better programs (sequences of motion primitives that make the initial configuration more like the target configuration). Once a feasible sequence is achieved, the next phase of evolution focuses on minimizing the total number of primitives performed over all units, in the hope of reducing the total power consumption.

We have tested this technique on both in-place and out-of-place reconfigurations. As was noted in [8], in-place reconfigurations are harder to achieve than out-of-place reconfigurations. In the former, the modules tend to suffer from overcrowding due to the restricted available space, while in the latter the modules are freer to actuate the reconfiguration. The experimental results also show that the system works well for generating primitive sequences for both slide and tunnel reconfigurations; however, it does not converge very well for planning reconfigurations that need interior moves of some modules.

For future work, we are continuing to write algorithms for automatically identifying and encapsulating potentially useful subtrees that serve as high-level primitives for different levels of granularity.

References
[1]
J. Kubica, A. Casal, T. Hogg. Complex behaviors from local rules in modular self-reconfigurable robots. In Proceedings of IEEE International Conference on Robotics and Automation, IEEE, Seoul, South Korea, pp. 360-367, 2001. http://ieeexplore.ieee.org/xpls/icp.jsp?arnumber=932578
[2]
M. Yim, W. M. Shen, B. Salemi, D. Rus, M. Moll, H. Lipson, E. Klavins, G. S. Chirikjian. Modular self-reconfigurable robot systems. IEEE Robotics and Automation Magazine, vol.14, no.1, pp.43-52, 2007. DOI:10.1109/MRA.2007.339623
[3]
B. Salemi, M. Moll, W. M. Shen. SUPERBOT: A deployable, multi-functional, and modular self-reconfigurable robotic system. In Proceedings of 2006 IEEE/RSJ International Conference on Intelligent Robots and Systems, IEEE, Beijing, China, pp. 3636-3641, 2006. http://ieeexplore.ieee.org/document/4058969/
[4]
D. Rus, M. Vona. Crystalline robots: Self-reconfiguration with compressible unit modules. Autonomous Robots, vol.10, no.1, pp.107-124, 2001. DOI:10.1023/A:1026504804984
[5]
G. S. Chirikjian, A. Pamecha, I. Ebert-Uphoff. Evaluating efficiency of self-reconfiguration in a class of modular robots. Journal of Robotic Systems, vol.13, no.5, pp.317-338, 1996. DOI:10.1002/(SICI)1097-4563(199605)13:5<>1.0.CO;2-R
[6]
A. Pamecha, G. Chirikjian. A useful metric for modular robot motion planning. In Proceedings of IEEE International Conference on Robotics and Automation, IEEE, Minneapolis, USA, pp. 442-447, 1996. http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=503816
[7]
A. Pamecha, I. Ebert-Uphoff, G. S. Chirikjian. Useful metrics for modular robot motion planning. IEEE Transactions on Robotics and Automation, vol.13, no.4, pp.531-545, 1997. DOI:10.1109/70.611311
[8]
G. Aloupis, S. Collette, M. Damian, E. D. Demaine, R. Flatland, S. Langerman, J. O'Rourke, S. Ramaswami, V. Sacristan, S. Wuhrer. Linear reconfiguration of cube-style modular robots. In Proceedings of the 18th International Symposium, Lecture Notes in Computer Science, Springer, Sendai, Japan, vol. 4835, pp. 208-219, 2007.
[9]
T. Larkworthy, S. Ramamoorthy. An efficient algorithm for self-reconfiguration planning in a modular robot. In Proceedings of IEEE International Conference on Robotics and Automation, IEEE, Anchorage, USA, pp. 5139-5146, 2010. http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=5509482
[10]
Z. Butler, D. Rus. Distributed planning and control for modular robots with unit-compressible modules. International Journal of Robotics Research, vol.22, no.9, pp.699-715, 2003. DOI:10.1177/02783649030229002
[11]
J. Kubica, A. Casal, T. Hogg. Agent-based control for object manipulation with modular self-reconfigurable robots. In Proceedings of the 17th International Joint Conference on Artificial Intelligence, Morgan Kaufmann Publishers Inc. , San Francisco, USA, pp. 1344-1349, 2001.
[12]
R. Kala. Multi-robot path planning using co-evolutionary genetic programming. Expert Systems with Applications, vol.39, no.3, pp.3817-3831, 2012. DOI:10.1016/j.eswa.2011.09.090
[13]
Y. P. Wang, L. Cheng, Z. G. Hou, J. Z. Yu, M. Tan. Optimal formation of multirobot systems based on a recurrent neural network. IEEE Transactions on Neural Networks and Learning Systems, vol.27, no.2, pp.322-333, 2016. DOI:10.1109/TNNLS.2015.2464314
[14]
D. Rus, M. Vona. A physical implementation of the self-reconfiguring crystalline robot. In Proceedings of IEEE International Conference on Robotics and Automation, IEEE, San Francisco, USA, pp. 1726-1733, 2000. http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=844845
[15]
J. R. Koza. Genetic Programming: On the Programming of Computers by Means of Natural Selection, Cambridge, USA: MIT Press, 1992.
[16]
J. R. Koza. Genetic Programming II: Automatic Discovery of Reusable Programs, Cambridge, USA: MIT Press, 1994.
[17]
T. K. Paul, H. Iba. Genetic programming for classifying cancer data and controlling humanoid robots. Genetic Programming Theory and Practice IV, R. Riolo, T. Soule, B. Worzel, Eds., New York, USA: Springer, pp. 41-59, 2007.
[18]
T. Ababsa, Y. Duthen. Splittable metamorphic carrier robots. Artificial Life 14, H. Sayama, J. Rieffel, S. Risi, R. Doursat, H. Lipson, Eds., New York, USA: MIT Press, pp. 801-808, 2014.
[19]
T. Steiner, J. Trommler, M. Brenn, Y. C. Jin, B. Sendhoff. Global shape with morphogen gradients and motile polarized cells. In Proceedings of IEEE Congress on Evolutionary Computation, IEEE, Trondheim, Norway, pp. 2225-2232, 2009. http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=4983217