Talks

Wednesday 27 April, room 1: Darwin, Theory

Chair: Bill Langdon

11:30-13:00 Paper #13 – Robustness, Evolvability, and Accessibility in Linear Genetic Programming (Best paper nomination)
Ting Hu, Joshua Payne, Wolfgang Banzhaf, Jason Moore
Whether neutrality has positive or negative effects on evolutionary search is a contentious topic, with reported experimental results supporting both sides of the debate. Most existing studies use performance statistics, success rate or search efficiency, to investigate if neutrality, either embedded or artificially added, can benefit an evolutionary algorithm. Here, we argue that understanding the influence of neutrality on evolutionary optimization requires an understanding of the interplay between robustness and evolvability at the genotypic and phenotypic scales. As a concrete example, we consider a simple linear genetic programming system that is amenable to exhaustive enumeration, and allows for the full characterization of these properties. We adopt statistical measurements from RNA systems to quantify robustness and evolvability at both genotypic and phenotypic levels. Using an ensemble of random walks, we demonstrate that the benefit of neutrality crucially depends upon its phenotypic distribution.

11:30-13:00 Paper #54 – How Far Is It From Here to There? A Distance that is Coherent with GP Operators
James McDermott, Una-May O’Reilly, Kalyan Veeramachaneni, Leonardo Vanneschi

The distance between pairs of individuals is a useful concept in the study of evolutionary algorithms. It is particularly useful to define a distance which is coherent with, i.e. related to, the action of a particular operator. We present the first formal, general definition of this operator-distance coherence. We also propose a new distance function, based on the multi-step transition probability (MSTP), that is coherent with any GP operator for which the one-step transition probability (1STP) between individuals can be defined. We give an algorithm for 1STP in the case of subtree mutation. Because MSTP is useful in GP investigations, but impractical to compute, we evaluate a variety of means to approximate it. We show that some syntactic distance measures give good approximations, and attempt to combine them to improve the approximation using a GP symbolic regression method. We conclude that 1STP itself is a sufficient indicator of MSTP for subtree mutation.

11:30-13:00 Paper #48 – Statistical Distribution of Generation-to-Success in GP: Application to Model Accumulated Success Probability
David Fernández, Bonifacio Castaño, María R-Moreno, David Camacho

Many different metrics have been defined in Genetic Programming. Depending on the experiment requirements and objectives, a collection of measures are selected in order to achieve an understanding of the algorithm behaviour. One of the most common metrics is the accumulated success probability, which evaluates the probability of an algorithm to achieve a solution in a certain generation. We propose a model of accumulated success probability composed by two parts, a binomial distribution that models the total number of success, and a lognormal approximation to the generation-to-success, that models the variation of the success probability with the generation.

Wednesday 27 April, room 1: Darwin, Representation

Chair: Wolfgang Banzhaf
14:30-16:20 Paper #42 – Examining Mutation Landscapes In Grammar Based Genetic Programming (Best paper nomination)
Eoin Murphy, Michael O’Neill, Anthony Brabazon

Representation is a very important component of any evolutionary algorithm. Changing the representation can cause an algorithm to perform very differently. Such a change can have an effect that is difficult to understand. This paper examines what happens to the grammatical evolution algorithm when replacing the commonly used context-free grammar representation with a tree-adjunct grammar representation. We model the landscapes produced when using integer flip mutation with both representations and compare these landscapes using visualisation methods little used in the field of genetic programming.

14:30-16:20 Paper #47 – ReNCoDe: Regulatory Network Computational Device (Best paper nomination)
Rui Lopes, Ernesto Costa

In recent years, our biologic understanding was increased with the comprehension of the multitude of regulatory mechanisms that are fundamental in both processes of inheritance and of development, and some researchers advocate the need to explore computationally this new understanding. One of the outcomes was the Artificial Gene Regulatory (ARN) model, first proposed by Wolfgang Banzhaf. In this paper, we present a modification of this model, aimed at solving some of its limitations, and show experimentally that it is effective in solving a set of benchmark problems.

14:30-16:20 Paper #50 – Learnable Embeddings of Program Spaces
Krzysztof Krawiec

We consider a class of adaptive, globally-operating, semantic-based embeddings of programs into discrete multidimensional spaces termed prespaces. In the proposed formulation, the original space of programs and its prespace are bound with a learnable mapping, where the process of learning is aimed at improving the overall locality of the new representation with respect to program semantics. To learn the mapping, which is formally a permutation of program locations in the prespace, we propose two algorithms: simple greedy heuristics and an evolutionary algorithm. To guide the learning process, we use a new definition of semantic locality. In an experimental illustration concerning four symbolic regression domains, we demonstrate that an evolutionary algorithm is able to improve the embedding designed by means of greedy search, and that the learned prespaces usually offer better search performance than the original program space.

Wednesday 27 April, room 1: Darwin, Applications

Chair: Leonardo Vanneschi
16:20-17:50 Paper #58 – Evolution of a Brain-Computer Interface Mouse via Genetic Programming (Best paper nomination)
Riccardo Poli, Mathew Salvaris, Caterina Cinel

We propose the use of genetic programming as a means to evolve brain-computer interfaces for mouse control. Our objective is to synthesise complete systems, which analyse electroencephalographic signals and directly transform them into pointer movements, almost from scratch, the only input provided by us in the process being the set of visual stimuli to be used to generate recognisable brain activity. Experimental results with our GP approach are very promising and compare favourably with those produced by support vector machines.

16:20-17:50 Paper #25 – GP-based Electricity Price Forecasting
Alberto Bartoli, Giorgio Davanzo, Andrea De Lorenzo, Eric Medvet

The electric power market is increasingly relying on competitive mechanisms taking the form of day-ahead auctions, in which buyers and sellers submit their bids in terms of prices and quantities for each hour of the next day. Methods for electricity price forecasting suitable for these contexts are crucial to the success of any bidding strategy. Such methods have thus become very important in practice, due to the economic relevance of electric power auctions. In this work we propose a novel forecasting method based on Genetic Programming. Key feature of our proposal is the handling of outliers, i.e., regions of the input space rarely seen during the learning. Since a predictor generated with Genetic Programming can hardly provide acceptable performance in these regions, we use a classifier that attempts to determine whether the system is shifting toward a difficult-to-learn region. In those cases, we replace the prediction made by Genetic Programming by a constant value determined during learning and tailored to the specific subregion expected. We evaluate the performance of our proposal against a challenging baseline representative of the state-of-the-art. The baseline analyzes a real-world dataset by means of a number of different methods, each calibrated separately for each hour of the day and recalibrated every day on a progressively growing learning set. Our proposal exhibits smaller prediction error, even though we construct one single model, valid for each hour of the day and used unmodified across the entire testing set. We believe that our results are highly promising and may open a broad range of novel solutions.

16:20-17:50 Paper #30 – Evolving Cell Array Configurations Using CGP
Paul Bremner, Anthony G. Pipe, Mohammad Samie, Gabriel Dragffy, Yang Liu

A cell array is a proposed type of custom FPGA, where digital circuits can be formed from interconnected configurable cells. In this paper we have presented a means by which CGP might be adapted to evolve configurations of a proposed cell array. As part of doing so, we have suggested an additional genetic operator that exploits modularity by copying sections of the genome within a solution, and investigated its efficacy. Additionally, we have investigated applying selection pressure for parsimony during functional evolution, rather than in a subsequent stage as proposed in other work. Our results show that solutions to benchmark problems can be evolved with a good degree of efficiency, and that compact solutions can be found with no significant impact on the required number of circuit evaluations.

Thursday 28 April, room 1: Darwin, Techniques 1

James Foster

9:30-11:00 Paper #29 – Maximum Margin Decision Surfaces for Increased Generalisation in Evolutionary Decision Tree Learning
Alexandros Agapitos, Michael O’Neill, Anthony Brabazon, Theodoros Theodoridis

Decision tree learning is one of the most widely used and practical methods for inductive inference. We present a novel method that increases the generalisation of genetically-induced classification trees, which employ linear discriminants as the partitioning function at each internal node. Genetic Programming is employed to search the space of oblique decision trees. At the end of the evolutionary run, a (1+1) Evolution Strategy is used to geometrically optimise the boundaries in the decision space, which are represented by the linear discriminant functions. The evolutionary optimisation concerns maximising the decision-surface margin that is defined to be the smallest distance between the decision-surface and any of the samples. Initial empirical results of the application of our method to a series of datasets from the UCI repository suggest that model generalisation benefits from the margin maximisation, and that the new method is a very competent approach to pattern classification as compared to other learning algorithms.

9:30-11:00 Paper #35 – A Peer-to-Peer Approach to Genetic Programming
Juan Luis Jiménez Laredo, Daniel Lombraña González, Francisco Fernández de Vega, Maribel García Arenas, Juan Julián Merelo Guervós

This paper proposes a fine-grained parallelization of the Genetic Programming paradigm (GP) using the Evolvable Agent model (EvAg). The algorithm is decentralized in order to take full-advantage of a massively parallel Peer-to-Peer infrastructure. In this context, GP is particularly demanding due to its high requirements of computational power. To assess the viability of the approach, the EvAg model has been empirically analyzed in a simulated Peer-to-Peer environment where experiments were conducted on two well-known GP problems. Results show that the spatially structured nature of the algorithm is able to yield a good quality in the solutions. Additionally, parallelization improves times to solution by several orders of magnitude.

9:30-11:00 Paper #10 – A Sniffer Technique for an Efficient Deduction of Model Dynamical Equations using Genetic Programming
Dilip Ahalpara, Abhijit Sen
A novel heuristic technique that enhances the search facility of the standard genetic programming (GP) algorithm is presented. The method provides a dynamic sniffing facility to optimize the local search in the vicinity of the current best chromosomes that emerge during GP iterations. Such a hybrid approach, that combines the GP method with the sniffer technique, is found to be very effective in the solution of inverse problems where one is trying to construct model dynamical equations from either finite time series data or knowledge of an analytic solution function. As illustrative examples, some special function ordinary differential equations (ODEs) and integrable nonlinear partial differential equations (PDEs) are shown to be efficiently and exactly recovered from known solution data. The method can also be used effectively for solution of model equations (the direct problem) and as a tool for generating multiple dynamical systems that share the same solution space.

Thursday 28 April, room 1: Darwin, Techniques 2

Ernesto Costa

11:30-13:00 Paper #37 – Performance Models for Evolutionary Program Induction Algorithms based on Problem Difficulty Indicators
Mario Graff, Riccardo Poli

Most theoretical models of evolutionary algorithms are difficult to apply to realistic situations. In this paper, two models of evolutionary program-induction algorithms (EPAs) are proposed which overcome this limitation. We test our approach with two important classes of problems — symbolic regression and Boolean function induction — and a variety of EPAs including: different versions of genetic programming, gene expression programing, stochastic iterated hill climbing in program space and one version of cartesian genetic programming. We compare the proposed models against a practical model of EPAs we previously developed and find that in most cases the new models are simpler and produce better predictions. A great deal can also be learnt about an EPA via a simple inspection of our new models. E.g., it is possible to infer which characteristics make a problem difficult or easy for the EPA.

11:30-13:00 Paper #22 – A Quantitative Study of Learning and Generalization in Genetic Programming
Mauro Castelli, Luca Manzoni, Sara Silva, Leonardo Vanneschi

The relationship between generalization and solutions functional complexity in genetic programming (GP) has been recently investigated. Three main contributions are contained in this paper: (1) a new measure of functional complexity for GP solutions, called Graph Based Complexity (GBC) is defined and we show that it has a higher correlation with GP performance on out-of-sample data than another complexity measure introduced in a recent publication. (2) A new measure is presented, called Graph Based Learning Ability (GBLA). It is inspired by the GBC and its goal is to quantify the ability of GP to learn “difficult” training points; we show that GBLA is negatively correlated with the performance of GP on out-of-sample data. (3) Finally, we use the ideas that have inspired the definition of GBC and GBLA to define a new fitness function, whose suitability is empirically demonstrated. The experimental results reported in this paper have been obtained using three real-life multidimensional regression problems.

Thursday 28 April, room 1: Darwin, Techniques 3

Riccardo Poli
14:30-16:00 Paper #52 – Parallel Linear Genetic Programming (Best Paper Nomination)
Carlton Downey, Mengjie Zhang

Motivated by biological inspiration and the issue of code disruption, we develop a new form of Linear Genetic Programming (LGP) called Parallel Linear Genetic Programming (PLGP). PLGP programs consists of n lists of instructions. These lists are executed in parallel, after which the resulting vectors are combined to produce program output. PGLP limits the disruptive effects of crossover and mutation, which allows PLGP to significantly outperform regular LGP.

14:30-16:00 Paper #31 – Designing Pheromone Update Strategies with Strongly Typed Genetic Programming(Best Paper Nomination)
Jorge Tavares, Francisco Pereira

Ant Colony algorithms are population-based methods widely used in combinatorial optimization problems. We propose a strongly typed genetic programming approach to automatically evolve the communication mechanism that allows ants to cooperatively solve a given problem. Results obtained with several TSP instances show that the evolved pheromone update strategies are effective, exhibit a good generalization capability and are competitive with human designed variants.

14:30-16:00 Paper #27 – Novel Loop Structures and the Evolution of Mathematical Algorithms(Best Paper Nomination)
Mingxu Wan, Thomas Weise, Ke Tang

In this paper, we analyze the capability of Genetic Programming (GP) to synthesize non-trivial, non-approximative, and deterministic mathematical algorithms with integer-valued results. Such algorithms usually involve loop structures and we raise the question which representation for loops would be most efficient. We define five tree-based program representations which realize the concept of loops in different ways, including two novel methods which use the convergence of variable values as implicit stopping criteria. Based on experiments on four problems and statistical comparisons with random walks, we conclude that the problem type under investigation is hard for GP. Still, GP can significantly outperform random walks. Furthermore, we found that none of the program representations could consistently outperform the others, but the two novel methods with indirect stopping criteria are utilized to a much higher degree than the other three loop representations.

Friday 29 April, room 1: Darwin, Self-Organisation

Chair: Sara Silva

9:30-11:00 Paper #62 – Evolving Fitness Functions for Mating Selection
Penousal Machado, António Leitão

The tailoring of an evolutionary algorithm to a specific problem is typically a time-consuming and complex process. Over the years, several approaches have been proposed for the automatic adaptation of parameters and components of evolutionary algorithms. We focus on the evolution of mating selection fitness functions and use as case study the Circle Packing in Squares problem. Each individual encodes a potential solution for the circle packing problem and a fitness function, which is used to assess the suitability of its potential mating partners. The experimental results show that by evolving mating selection functions it is possible to surpass the results attained with hardcoded fitness functions. Moreover, they also indicate that genetic programming was able to discover mating selection functions that: use the information regarding potential mates in novel and unforeseen ways; outperform the class of mating functions considered by the authors.

9:30-11:00 Paper #61 – Operator Self-Adaptation in Genetic Programming
MinHyeok Kim, Bob (RI) McKay, Nguyen Xuan Hoai, Kangil Kim

We investigate the application of adaptive operator selection rates to Genetic Programming. Results confirm those from other areas of evolutionary algorithms: adaptive rate selection out-performs non-adaptive methods, and among adaptive methods, adaptive pursuit out-performs probability matching. Adaptive pursuit combined with a reward policy that rewards the overall fitness change in the elite worked best of the strategies tested, though not uniformly on all problems.

9:30-11:00 Paper #32 – Random Lines: A Novel Population Set-based Evolutionary Global Optimization Algorithm
Ismet Sahin

In this paper, we present a new population set-based evolutionary optimization algorithm which aims to find global minima of cost functions. This algorithm creates random lines passing through pairs of points (vectors) in population, fits a quadratic function based on three points on each line, and then applies the crossover operation to extrema of these quadratic functions, and lastly performs the selection operation. We refer to the points determining random lines as parent points and the extremum of a quadratic model as the descendant or mutated point under some conditions. In the crossover operation, some entries of a descendant vector are randomly replaced with the corresponding entries of one parent vector and some other entries of the descendant vector are replaced with the corresponding entries of the other parent vector based on the crossover constant. The above crossover and mutation operations make this algorithm robust and fast converging. One important property of this algorithm is that its robustness in general increases with increasing population size which may become useful when more processing units are available. This algorithm achieves comparable results with the well-known Differential Evolution (DE) algorithm over a wide range of cost functions.

Posters

Wednesday 27 April in the Atrium, 18:15-19:00

Paper #16 – Experiments on Islands
Michael Harwerth

The use of segmented populations (Islands) has proven to be advantageous for Genetic Programming (GP). This paper discusses the application of segmentation and migration strategies to a system for Linear Genetic Programming (LGP). Besides revisiting migration topologies, a modification for migration strategies is proposed — migration delay. It is found that highly connected topologies yield better results than those with segments coupled more loosely, and that migration delays can further improve the effect of migration.

Paper #21 – A New Approach to Solving 0-1 Multiconstraint Knapsack Problems Using Attribute Grammar with Lookahead
Muhammad Rezaul Karim, Conor Ryan

In this paper, we introduce a new approach to genotype-phenotype mapping for Grammatical Evolution (GE) using an attribute grammar (AG) to solve 0-1 multiconstraint knapsack problems. Previous work on AGs dealt with constraint violations through repeated remapping of non-terminals, which generated many introns, thus decreasing the power of the evolutionary search. Our approach incorporates a form of lookahead into the mapping process using AG to focus only on feasible solutions and so avoid repeated remapping and introns. The results presented in this paper show that the proposed approach is capable of obtaining high quality solutions for the tested problem instances using fewer evaluations than existing methods.

Paper #23 – An empirical study of functional complexity as an indicator of overfitting in Genetic Programming
Leonardo Trujillo, Sara Silva, Pierrick Legrand, Leonardo Vanneschi

Recently, it has been stated that the complexity of a solution is a good indicator of the amount of overfitting it incurs. However, measuring the complexity of a program, in Genetic Programming, is not a trivial task. In this paper, we study the functional complexity and how it relates with overfitting on symbolic regression problems.We consider two measures of complexity, Slope-based Functional Complexity, inspired by the concept of curvature, and Regularity-based Functional Complexity based on the concept of Hölderian regularity. In general, both complexity measures appear to be poor indicators of program overfitting. However, results suggest that Regularity-based Functional Complexity could provide a good indication of overfitting in extreme cases.

Paper #24 – Estimating classifier performance with Genetic Programming
Leonardo Trujillo, Yuliana Martínez, Patricia Melin

A fundamental task that must be addressed before classifying a set of data, is that of choosing the proper classification method. In other words, a researcher must infer which classifier will achieve the best performance on the classification problem in order to make a reasoned choice. This task is not trivial, and it is mostly resolved based on personal experience and individual preferences. This paper presents a methodological approach to produce estimators of classifier performance, based on descriptive measures of the problem data. The proposal is to use Genetic Programming (GP) to evolve mathematical operators that take as input descriptors of the problem data, and output the expected error that a particular classifier might achieve if it is used to classify the data. Experimental tests show that GP can produce accurate estimators of classifier performance, by evaluating our approach on a large set of 500 two-class problems of multimodal data, using a neural network for classification. The results suggest that the GP approach could provide a tool that helps researchers make a reasoned decision regarding the applicability of a classifier to a particular problem.

Paper #28 – Investigation of the Performance of Different Mapping Orders for GE on the Max Problem
David Fagan, Miguel Nicolau, Erik Hemberg, Michael O’Neill, Anthony Brabazon, Sean McGarraghy

We present an analysis of how the genotype-phenotype map in Grammatical Evolution (GE) can effect performance on the Max Problem. Earlier studies have demonstrated a performance decrease for Position Independent Grammatical Evolution piGE in this problem domain. In piGE the genotype-phenotype map is changed so that the evolutionary algorithm controls not only what the next expansion will be but also the choice of what position in the derivation tree is expanded next. In this study we extend previous work and investigate whether the ability to change the order of expansion is responsible for the performance decrease or if the problem is simply that a certain order of expansion in the genotype-phenotype map is responsible. We conclude that the reduction of performance in the Max problem domain by piGE is rooted in the way the genotype-phenotype map and the genetic operators used with this mapping interact.

Paper #39 – A Self-Scaling Instruction Generator using Cartesian Genetic Programming
Yang Liu, Gianluca Tempesti, James Walker, Jon Timmis, Andy Tyrrell, Paul Bremner

In the past decades, a number of genetic programming techniques have been developed to evolve machine instructions. However, these approaches typically suffer from a lack of scalability that seriously impairs their applicability to real-world scenarios. In this paper, a novel self-scaling instruction generation method is introduced, which tries to overcome the scalability issue by using Cartesian Genetic Programming. In the proposed method, a dual-layer network architecture is created: one layer is used to evolve a series of instructions while the other is dedicated to the generation of loop control parameters.

Paper #41 – Exploring Grammatical Modiﬁcation with Modules in Grammatical Evolution
John Mark Swafford, Michael O’Neill, Miguel Nicolau, Anthony Brabazon

There have been many approaches to modularity in the field of evolutionary computation, each tailored to function with a particular representation. This research examines one approach to modularity and grammar modification with a grammar-based approach to genetic programming, grammatical evolution (GE). Here, GE’s grammar was modified over the course of an evolutionary run with modules in order to facilitate their appearance in the population. This is the first step in what will be a series of analysis on methods of modifying GE’s grammar to enhance evolutionary performance. The results show that identifying modules and using them to modify GE’s grammar can have a negative effect on search performance when done improperly. But, if undertaken thoughtfully, there are possible benefits to dynamically enhancing the grammar with modules identified during evolution.

Paper #45 – Multi-Objective Genetic Programming for Visual Analytics
Ilknur Icke, Andrew Rosenberg

Visual analytics is a human-machine collaboration to data modeling where extraction of the most informative features plays an important role. Although feature extraction is a multi-objective task, the traditional algorithms either only consider one objective or aggregate the objectives into one scalar criterion to optimize. In this paper, we propose a Pareto-based multi-objective approach to feature extraction for visual analytics applied to data classification problems. We identify classifiability, visual interpretability and semantic interpretability as the three equally important objectives for feature extraction in classification problems and define various measures to quantify these objectives. Our results on a number of benchmark datasets show consistent improvement compared to three standard dimensionality reduction techniques. We also argue that exploration of the multiple Pareto-optimal models provide more insight about the classification problem as opposed to a single optimal solution.

Paper #57 – A Continuous Approach to Genetic Programming
Cyril Fonlupt, Denis Robilliard

Differential Evolution (DE) is an evolutionary heuristic for continuous optimization problems. In DE, solutions are coded as vectors of floats that evolve by crossover with a combination of best and random individuals from the current generation. Experiments to apply DE to automatic programming were made recently by Veenhuis, coding full program trees as vectors of floats (Tree Based Differential Evolution or TreeDE). In this paper, we use DE to evolve linear sequences of imperative instructions, which we call Linear Differential Evolutionary Programming (LDEP). Unlike TreeDE, our heuristic provides constant management for regression problems and lessens the tree-depth constraint on the architecture of solutions. Comparisons with TreeDE and GP show that LDEP is appropriate to automatic programming.

eurogp