Draft Programme EuroGP 2012

Please note that the final order of presentations may change.

Wednesday 11 April

New representations and operators (14:30-16:10)

Evolving High-Level Imperative Program Trees with Strongly Formed Genetic Programming

Tom Castle, Colin G. Johnson
We present a set of extensions to Montana’s popular Strongly Typed Genetic Programming system that introduce constraints on the structure of program trees. It is demonstrated that these constraints can be used to evolve programs with a naturally imperative structure, using common high-level imperative language constructs such as loops. A set of three problems including factorial and the general even-n-parity problem are used to test the system. Experimental results are presented which show success rates and required computational effort that compare favourably against other systems on these problems, while providing support for this imperative structure.

A New, Node-Focused Model for Genetic Programming

David Jackson
We introduce Single Node Genetic Programming (SNGP), a new graph-based model for genetic programming in which every individual in the population consists of a single program node. Function operands are other individuals, meaning that the graph structure is imposed externally on the population as a whole, rather than existing within its members. Evolution is via a hill-climbing mechanism using a single reversible operator. Experimental results indicate substantial improvements over conventional GP in terms of solution rates, efficiency and program sizes.

Medial Crossovers for Genetic Programming

Krzysztof Krawiec
We propose a class of crossover operators for genetic programming that aim at making offspring programs semantically intermediate (medial) with respect to parent programs by modifying short fragments of code (subprograms). The approach is applicable to problems that define fitness as a distance between program output and the desired output. Based on that metric, we define two measures of semantic ‘mediality’, which we employ to design two crossover operators: one aimed at making the semantics of offspring geometric with respect to the semantics of parents, and the other aimed at making them equidistant to parents’ semantics. The operators act only on randomly selected fragments of parents’ code, which makes them computationally efficient. When compared experimentally with four other crossover operators, both operators lead to a success ratio at least as good as that of the non-semantic crossovers, and the operator based on equidistance proves superior to all others.

Coevolution in Cartesian Genetic Programming

Michaela Sikulova, Lukas Sekanina
Cartesian genetic programming (CGP) is a branch of genetic programming which has been utilized in various applications. This paper proposes to introduce coevolution to CGP in order to accelerate the task of symbolic regression. In particular, fitness predictors, which are small subsets of the training set, are coevolved with CGP programs. It is shown using five symbolic regression problems that the (median) execution time can be reduced by a factor of 2 to 5 in comparison with standard CGP.
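
The idea of a fitness predictor as a small subset of the training set can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the mean-absolute-error fitness, the representation of a predictor as a list of indices, and the names `mean_abs_error` and `predicted_fitness` are all hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

Case = Tuple[float, float]  # (input, desired output)


def mean_abs_error(program: Callable[[float], float], cases: Sequence[Case]) -> float:
    """Exact fitness (assumed here to be mean absolute error; lower is better)."""
    return sum(abs(program(x) - y) for x, y in cases) / len(cases)


def predicted_fitness(program: Callable[[float], float],
                      training_set: Sequence[Case],
                      predictor: List[int]) -> float:
    """Cheap fitness estimate: evaluate only the cases the predictor selects.

    `predictor` is assumed to be a list of indices into the training set.
    """
    subset = [training_set[i] for i in predictor]
    return mean_abs_error(program, subset)


# Ten training cases for the target x^2, and a predictor that uses three of them.
training_set = [(float(x), float(x * x)) for x in range(10)]
predictor = [0, 3, 7]

exact_program = lambda x: x * x        # matches the target everywhere
biased_program = lambda x: x * x + 1   # off by one everywhere

print(predicted_fitness(exact_program, training_set, predictor))
print(predicted_fitness(biased_program, training_set, predictor))
```

The saving comes from evaluating each candidate on three cases instead of ten; in a coevolutionary setting the predictor indices would themselves be evolved so that the estimate tracks the exact fitness.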

Wednesday 11 April

Applications I – Design (16:30-18:10)

Evolving Interpolating Models of Net Ecosystem CO2 Exchange Using Grammatical Evolution

Miguel Nicolau, Matthew Saunders, Michael O’Neill, Bruce Osborne, Anthony Brabazon
Accurate measurements of Net Ecosystem Exchange of CO2 between atmosphere and biosphere are required in order to estimate annual carbon budgets. These are typically obtained with Eddy Covariance techniques. Unfortunately, the resulting data are often both noisy and incomplete, due to data loss through equipment failure and routine maintenance, and gap-filling techniques are required in order to provide accurate annual budgets. In this study, a grammar-based version of Genetic Programming is employed to generate interpolating models for flux data. The evolved models are robust, and their symbolic nature provides further understanding of the environmental variables involved.

Genetic Programming for Generalised Helicopter Hovering Control

Dimitris C. Dracopoulos, Dimitrios Effraimidis
We show how genetic programming can be applied to helicopter hovering control, a nonlinear, high-dimensional control problem which has previously been included in the literature in the set of benchmarks for the derivation of new intelligent controllers. The evolved controllers are compared with a neuroevolutionary approach which won first place in the 2008 helicopter hovering reinforcement learning competition. GP performs similarly to (and in some cases better than) the winner of the competition, even in the case where unknown wind is added to the dynamic system and control is based on structures evolved previously, i.e. the evolved controllers have good generalisation capability.

Cartesian Genetic Programming for Memristive Logic Circuits

Gerard Howard, Larry Bull, Andrew Adamatzky
In this paper, memristive logic circuits are evolved using Cartesian Genetic Programming. Graphs composed of implication logic (IMP) nodes are compared to more ubiquitous NAND circuitry on a number of logic circuit problems and a robotic control task. Self-adaptive search parameters are used to provide each graph with autonomy with respect to its relative mutation rates. Results demonstrate that, although NAND-logic graphs are easier to evolve, IMP graphs carry benefits in terms of (i) the number of memristors required and (ii) the time required to process the graphs.

Evolutionary Design of Message Efficient Secrecy Amplification Protocols

Tobias Smolka, Petr Svenda, Lukas Sekanina, Vashek Matyas
Secrecy amplification protocols are mechanisms that can significantly improve the security of partially compromised wireless sensor networks (e.g., turning a half-compromised network into a 95% secure one). The main disadvantage of existing protocols is a high communication overhead that increases exponentially with network density. We devise a novel family of these protocols exhibiting only a linear increase of the communication overhead. The protocols are automatically generated by linear genetic programming (LGP) connected to a network simulator. After a deep analysis of various characteristics of this new family of protocols, with a special focus on the tuning of LGP parameters, new and better group-oriented protocols are discovered by LGP. A multi-criteria optimization is then utilized to further reduce the communication overhead to half of the original amount while maintaining the original fraction of secure links.

Thursday 12 April

Posters (9:30-11:00)

Random Sampling Technique for Overfitting Control in Genetic Programming

Ivo Goncalves, Sara Silva, Joana B. Melo, Joao M. B. Carreiras
One of the areas of Genetic Programming (GP) that, in comparison to other Machine Learning methods, has seen fewer research efforts is that of generalization. Generalization is the ability of a solution to perform well on unseen cases. It is one of the most important goals of any Machine Learning method, although in GP only recently has this issue started to receive more attention. In this work we perform a comparative analysis of a particularly interesting configuration of the Random Sampling Technique (RST) against the Standard GP approach. Experiments are conducted on three multidimensional real-world symbolic regression datasets, the first two in the pharmacokinetics domain and the third in the forestry domain. The results show that the RST decreases overfitting on all datasets. This technique also improves testing fitness on two of the three datasets. Furthermore, it does so while producing considerably smaller and less complex solutions. We discuss the possible reasons for the good performance of the RST, as well as its possible limitations.

Evolutionary Operator Self-Adaptation with Diverse Operators

Min Hyeok Kim, R I McKay, Dong-Kyun Kim, Nguyen Xuan Hoai
Operator adaptation in evolutionary computation has previously been applied to either small numbers of operators, or larger numbers of fairly similar ones. This paper focuses on adaptation in algorithms offering a diverse range of operators. We compare a number of previously-developed adaptation strategies, together with two that have been specifically designed for this situation. Probability Matching and Adaptive Pursuit methods performed reasonably well in this scenario, but a strategy combining aspects of both performed better. Multi-Arm Bandit techniques performed well when parameter settings were suitably tailored to the problem, but this tailoring was difficult, and performance was very brittle when the parameter settings were varied.

The Effect of Bloat on the Efficiency of Incremental Evolution of Simulated Snake-like Robot

Ivan Tanev, Tuze Kuyucu, Katsunori Shimohara
We present the effect of bloat on the efficiency of incremental evolution of locomotion of a simulated snake-like robot (Snakebot) situated in a challenging environment. In the proposed incremental genetic programming (IGP), the task of coevolving the locomotion gaits and sensing of the bot in a challenging environment is decomposed into two subtasks, implemented as two consecutive evolutionary stages. In the first stage we use genetic programming (GP) to evolve a pool of morphologically simple, sensorless Snakebots that move fast in a smooth, open terrain. Then, during the second stage, we use this pool to seed the initial population of Snakebots that are further subjected to coevolution of their locomotion control and sensing morphology in a challenging environment. The empirical results suggest that bloat has no immediate effect on the efficiency of the first stage of IGP. However, the bloated seed contributes to a much faster second stage of evolution. On average, the second stage with a bloated seed reaches the best fitness values of the parsimony seeds about five times faster. We assume that this speedup is attributable to the neutral code that is used by IGP as an evolutionary playground to experiment with developing novel sensory abilities, without damaging the already evolved, fast locomotion of the bot.

Bayesian Network Structure Learning from Limited Datasets through Graph Evolution

Alberto Paolo Tonda, Evelyne Lutton, Romain Reuillon, Giovanni Squillero, Pierre-Henri Wuillemin
Bayesian networks are stochastic models, widely adopted to encode knowledge in several fields. One of the most interesting features of a Bayesian network is the possibility of learning its structure from a set of data, and subsequently using the resulting model to perform new predictions. Structure learning for such models is an NP-hard problem, for which the scientific community has developed two main approaches: score-and-search metaheuristics, often evolutionary-based, and dependency-analysis deterministic algorithms, based on stochastic tests. State-of-the-art solutions have been presented in both domains, but all methodologies start from the assumption that large sets of learning data are available, often numbering thousands of samples. This is not the case for many real-world applications, especially in the food processing and research industry. This paper proposes an evolutionary approach to the Bayesian structure learning problem, specifically tailored for learning sets of limited size. Falling in the category of score-and-search techniques, the methodology exploits an evolutionary algorithm able to work directly on graph structures, previously used for assembly language generation, and a scoring function based on the Akaike Information Criterion, a well-studied metric of stochastic model performance. Experimental results show that the approach is able to outperform a state-of-the-art dependency-analysis algorithm, providing better models for small datasets.

Efficient Phenotype Evaluation in Cartesian Genetic Programming

Zdenek Vasicek, Karel Slany
This paper describes an efficient acceleration technique designed to speed up the evaluation of candidate solutions in Cartesian Genetic Programming (CGP). The method is based on translation of the CGP phenotype to binary machine code that is subsequently executed. The key feature of the presented approach is that introducing the translation mechanism into a common fitness evaluation procedure requires only marginal knowledge of the target CPU instruction set. The proposed acceleration technique is evaluated using a symbolic regression problem in the floating-point domain. It is shown that, at the cost of small changes to a common CGP implementation, a significant speedup can be obtained even on a common desktop CPU. The accelerated version of the CGP implementation, accompanied by a performance analysis, is available for free download from http://www.fit.vutbr.cz/~vasicek/cgp.

Thursday 12 April

Analysis (11:20-13:00)

Grammar Bias and Initialisation in Grammar Based Genetic Programming

Eoin Murphy, Erik Hemberg, Miguel Nicolau, Michael O’Neill, Anthony Brabazon
Preferential language biases which are introduced when using Tree-Adjoining Grammars in Grammatical Evolution affect the distribution of generated derivation structures, and as such, present difficulties when designing initialisation methods. Similar initial populations allow for a fairer comparison between different GP methods. This work proposes methods for dealing with these biases and examines their effect on performance over four well known benchmark problems. In addition, a comparison is performed with a previous study that did not employ similar phenotype distributions in their initial populations. It is found that the use of this form of initialisation has a positive effect on performance.

An Investigation of Fitness Sharing with Semantic and Syntactic Distance Metrics

Nguyen Quang Uy, Nguyen Xuan Hoai, Michael O’Neill, Alexandros Agapitos
This paper investigates the efficiency of using semantic and syntactic distance metrics in fitness sharing with Genetic Programming (GP). We modify the implementation of fitness sharing to speed up its execution, and use two distance metrics in calculating the distance between individuals in fitness sharing: semantic distance and syntactic distance. We apply fitness sharing with these two distance metrics to a class of real-valued symbolic regression problems. Experimental results show that using semantic distance in fitness sharing helps to significantly improve the performance of GP more frequently, and results in faster execution times, than using syntactic distance. Moreover, we analyse the impact of the fitness sharing parameters on GP performance, helping to indicate appropriate values for fitness sharing using a semantic distance metric.
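
For readers unfamiliar with fitness sharing, a minimal sketch of the textbook scheme with a semantic distance is given below. It is not the modified implementation described in the abstract. It assumes that raw fitness is to be maximised, that a program's semantics is its vector of outputs on a shared set of fitness cases, and that semantic distance is the mean absolute difference of those vectors; the names `sharing`, `semantic_distance` and `shared_fitness` and the parameters `sigma` and `alpha` are illustrative.

```python
from typing import List, Sequence


def sharing(d: float, sigma: float, alpha: float = 1.0) -> float:
    """Sharing function: 1 at distance 0, decreasing to 0 at the niche radius sigma."""
    return 1.0 - (d / sigma) ** alpha if d < sigma else 0.0


def semantic_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Assumed semantic distance: mean absolute difference of program outputs."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def shared_fitness(raw: Sequence[float],
                   semantics: Sequence[Sequence[float]],
                   sigma: float,
                   alpha: float = 1.0) -> List[float]:
    """Divide each raw fitness by its niche count (which includes the individual itself)."""
    result = []
    for i, f in enumerate(raw):
        niche = sum(sharing(semantic_distance(semantics[i], s), sigma, alpha)
                    for s in semantics)
        result.append(f / niche)
    return result


# Two semantically identical programs share a niche; the third is alone in its niche.
raw = [1.0, 1.0, 1.0]
semantics = [[0.0, 0.0], [0.0, 0.0], [10.0, 10.0]]
print(shared_fitness(raw, semantics, sigma=1.0))
```

Because the niche count always includes the individual itself, it is at least 1, so the division is safe. Swapping `semantic_distance` for a tree-based measure would give the syntactic variant.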

Matrix Analysis of Genetic Programming Mutation

Andrew J. Parkes, Ender Ozcan, Matthew R. Hyde
Heuristic policies for combinatorial optimisation problems can be found by using Genetic Programming (GP) to evolve a mathematical function over variables given by the current state of the problem, and whose value is used to determine action choices (such as preferred assignments or branches). If all variables have finite discrete domains, then the expressions can be converted to an equivalent lookup table or ‘decision matrix’. Spaces of such matrices often have natural distance metrics (after conversion to a standard form). As a case study, and to support the understanding of GP as a meta-heuristic, we extend previous bin-packing work and compare the distances between matrices from before and after a GP-driven mutation. We find that GP mutations often correspond to large moves within the space of decision matrices. This strengthens evidence that the role of mutations within GP might be somewhat different from their role within Genetic Algorithms.

Thursday 12 April

Applications II – Classification/Prediction (14:30-16:10)

Improving Face Detection

Penousal Machado, Joao Correia, Juan Romero
A novel Genetic Programming approach for the improvement of the performance of classifier systems through the synthesis of new training instances is presented. The approach relies on the ability of the Genetic Programming engine to identify and exploit shortcomings of classifier systems, and generate instances that are misclassified by them. The addition of these instances to the training set has the potential to improve the classifier’s performance. The experimental results attained with face detection classifiers are presented and discussed. Overall, they indicate the success of the approach.

Improving Relevance Measures Using Genetic Programming

Kourosh Neshatian, Mengjie Zhang
Relevance is a central concept in many feature selection algorithms. Given a relevance measure, a feature selection algorithm searches for a subset of features that maximises the relevance between the subset and target concepts. This paper first shows how relevance measures that rely on posterior estimation, such as information theory measures, may fail to quantify the actual utility of subsets of features in certain situations. The paper then proposes a solution based on Genetic Programming which can improve the usability of these measures. The paper is focused on classification problems with numeric features.

Android Genetic Programming Framework

Alban Cotillon, Philip Valencia, Raja Jurdak
Personalisation in smart phones requires adaptability to dynamic context based on application usage and sensor inputs. Current personalisation approaches do not provide sufficient adaptability to dynamic and unexpected context. This paper introduces the Android Genetic Programming Framework (AGP) as a personalisation method for smart phones. AGP considers the specific design challenges of smart phones, such as resource limitation and constrained programming environments. We demonstrate AGP’s utility through empirical experiments on two applications: a news reader application and an energy efficient localisation application. Results show that AGP successfully adapts application behaviour to user context.

Multi-objective Ant Programming for Mining Classification Rules

Juan Luis Olmo, Jose Raul Romero, Sebastian Ventura
Ant programming (AP) is a kind of automatic programming that generates computer programs by using the ant colony optimization metaheuristic. It has recently demonstrated a good generalization ability when extracting classification rules. We extend the investigation on the application of AP to classification, developing an algorithm that addresses rules’ evaluation using a novel multi-objective approach specially devised for the classification task. The proposed algorithm also incorporates an evolutionary computing niching procedure to increase the diversity of the population of programs found so far. Results obtained by this algorithm are compared with three other genetic programming algorithms and other industry standard algorithms from different areas, showing that multi-objective AP is a good technique for tackling classification problems.

Friday 13 April

Best paper nominees (09:30-11:10)

Evolving Reusable Operation-Based Due-Date Assignment Models for Job Shop Scheduling with Genetic Programming

Su Nguyen, Mengjie Zhang, Mark Johnston, Kay Chen Tan
Due-date assignment plays an important role in scheduling systems and strongly influences the delivery performance of job shops. Because of the stochastic and dynamic features of job shops, the development of general due-date assignment models (DDAMs) is complicated. In this study, two genetic programming (GP) methods are proposed to evolve DDAMs for job shop environments. The experimental results show that the evolved DDAMs can make more accurate estimates than other existing dynamic DDAMs with promising reusability. In addition, the evolved operation-based DDAMs show better performance than the evolved DDAMs employing aggregate information of jobs and machines.

An Ecological Approach to Measuring Locality in Linear Genotype to Phenotype Maps

Tom Seaton, Julian F. Miller, Tim Clarke
Recent research has considered the role of locality in GP representations. We use a modified statistical technique drawn from numerical ecology, the Mantel test, to measure the locality of integer-encoded GP. Weak locality is identified in a case study on Cartesian Genetic Programming (CGP), a directed acyclic graph representation. A method of varying syntactic program locality continuously through the application of a biased mutation operator is demonstrated. The impact of varying locality under the new measure is assessed over a randomly generated set of polynomial symbolic regression problems. We observe that enforcing higher levels of locality in CGP is associated with poorer performance on the problem set and discuss implications in the context of existing models of GP genotype-phenotype maps.

Automatic Design of Ant Algorithms with Grammatical Evolution

Jorge Tavares, Francisco B. Pereira
We propose a Grammatical Evolution approach to the automatic design of Ant Colony Optimization algorithms. The grammar adopted by this framework has the ability to guide the learning of novel architectures, by rearranging components regularly found in human-designed variants. Results obtained with several TSP instances show that the evolved algorithmic strategies are effective, exhibit a good generalization capability and are competitive with human-designed variants.