Image approximation with geometrical shapes

This notebook shows how grammar-guided genetic programming (G3P) can be used for a task invented by Roger Johansson: search for a configuration of semi-transparent geoemetrical shapes, so that the resulting image closely resembles a given target image.

References

In [1]:
import time

import alogos as al
import IPython
from PIL import Image, ImageDraw, ImageChops, ImageStat

Preparation

1) Functions for handling images

In [2]:
def load_image_from_file(filepath):
    return Image.open(filepath).convert('RGB')

def store_image_to_file(img, filepath):
    img.save(filepath)

def store_string_to_file(img, filepath):
    with open(filepath, 'w') as f:
        f.write(string)

def resize_image(img, width):
    w, h = img.size
    height = int(width * h / w)
    return img.resize((width, height), Image.ANTIALIAS)

def display_image(img):
    IPython.display.display(img)

2) Functions for drawing shapes

In [3]:
def create_image(width, height, bg_color='black'):
    return Image.new('RGB', (width, height), bg_color)

def create_drawer(img):
    return ImageDraw.Draw(img, 'RGBA')

def circle(d, x, y, r, c):
    d.ellipse(((x-r, y-r), (x+r, y+r)), c)

def ellipse(d, p, c):
    d.ellipse(p, c)

def triangle(d, p, c):
    d.polygon(p, c)

def quadrilateral(d, p, c):
    d.polygon(p, c)

def pentagon(d, p, c):
    d.polygon(p, c)

def hexagon(d, p, c):
    d.polygon(p, c)

def heptagon(d, p, c):
    d.polygon(p, c)

def octagon(d, p, c):
    d.polygon(p, c)

def nonagon(d, p, c):
    d.polygon(p, c)
    
def polygon(d, p, c):
    d.polygon(p, c)

3) Target image

In [4]:
img_name = 'mona_lisa'
img_ori = load_image_from_file(img_name + '.png')

# smaller image size leads to much faster objective function evaluation
size = 70
img_target = resize_image(img_ori, size)
In [5]:
display_image(img_target)

Definition of search space and goal

1) Grammar

This grammar defines the search space: a Python program that creates an empty image and draws a chosen number and type of geometrical shapes on it, with locations and colors being variable.

In [6]:
ebnf_template = """
PROGRAM = IMAGE NL DRAWER NL SHAPES

IMAGE = "im = create_image(w, h)"
DRAWER = "d = create_drawer(im)"
SHAPES = {chosen_number}

SHAPE = {chosen_shapes}
CIRC = "circle(d, " X "," Y "," R "," C ")"
ELLI = "ellipse(d,(" PT "," PT ")," C ")"
TRI = "triangle(d,(" PT "," PT "," PT ")," C ")"
QUAD = "quadrilateral(d,(" PT "," PT "," PT "," PT ")," C ")"
PENT = "pentagon(d,(" PT "," PT "," PT "," PT "," PT ")," C ")"
HEX = "hexagon(d,(" PT "," PT "," PT "," PT "," PT "," PT ")," C ")"
HEPT = "heptagon(d,(" PT "," PT "," PT "," PT "," PT "," PT "," PT ")," C ")"
OCT = "octagon(d,(" PT "," PT "," PT "," PT "," PT "," PT "," PT "," PT ")," C ")"
NONA = "nonagon(d,(" PT "," PT "," PT "," PT "," PT "," PT "," PT "," PT "," PT ")," C ")"
POLY = "polygon(d,(" POINTS ")," C ")"

POINTS = PT "," PT | POINTS "," PT
PT = X "," Y
X = FRAC "*w"
Y = FRAC "*h"
R = FRAC "*w/2"
C = "(" INT "," INT "," INT ", {chosen_alpha})"

FRAC = "0." DIG DIG
INT = "int(" FRAC "*255)"
DIG = "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9"
NL = "\n"
"""


def generate_grammar(num_shapes=150, shapes='CIRC | ELLI | TRI | QUAD | PENT | POLY', alpha=50):
    """Parameterized generation of the grammar: Choose number and type of drawn shapes."""
    ebnf_text = ebnf_template.format(
        chosen_number='SHAPE NL '*num_shapes, chosen_shapes=shapes, chosen_alpha=alpha)
    grammar = al.Grammar(ebnf_text=ebnf_text)
    return grammar


grammar = generate_grammar(50, 'NONA')

2) Objective function

The objective function gets a candidate solution (=a string of the grammar's language) and returns a fitness value for it. This is done by 1) executing the string as a Python program, so that it draws a candidate image, and then 2) pixel-wise comparing the candidate image with the target image: the smaller the difference, the better is the candidate.

In [7]:
w, h = img_target.size

var = dict(
    w=w,
    h=h,
    create_image=create_image,
    create_drawer=create_drawer,
    circle=circle,
    ellipse=ellipse,
    triangle=triangle,
    quadrilateral=quadrilateral,
    pentagon=pentagon,
    hexagon=hexagon,
    heptagon=heptagon,
    octagon=octagon,
    nonagon=nonagon,
    polygon=polygon,
)

var_large = var.copy()
var_large['w'] = 500
var_large['h'] = 500
In [8]:
def run_python_code(string, var=var):
    exec(string, var)
    return var['im']
In [9]:
def px_diff(im1, im2):
    return sum(ImageStat.Stat(ImageChops.difference(im1, im2)).sum)
In [10]:
def objective_function(string):
    img = run_python_code(string)
    return px_diff(img, img_target)

Generation of a random solution

Check if grammar and objective function work as intended.

In [11]:
random_string = grammar.generate_string()
print('A random string:')
print(random_string[:500], '.........')

print()
print('Visualized:')
img = run_python_code(random_string)
display_image(img)

print()
print('Fitness value assigned by the objective function:', objective_function(random_string))
A random string:
im = create_image(w, h)
d = create_drawer(im)
nonagon(d,(0.45*w,0.03*h,0.92*w,0.44*h,0.86*w,0.84*h,0.16*w,0.24*h,0.97*w,0.37*h,0.74*w,0.79*h,0.33*w,0.91*h,0.05*w,0.68*h,0.89*w,0.99*h),(int(0.61*255),int(0.68*255),int(0.94*255), 50))
nonagon(d,(0.92*w,0.67*h,0.11*w,0.00*h,0.90*w,0.35*h,0.46*w,0.92*h,0.36*w,0.08*h,0.48*w,0.62*h,0.36*w,0.30*h,0.51*w,0.57*h,0.42*w,0.18*h),(int(0.49*255),int(0.52*255),int(0.34*255), 50))
nonagon(d,(0.21*w,0.77*h,0.84*w,0.69*h,0.01*w,0.51*h,0.63*w,0.65*h,0.66*w,0.08*h .........

Visualized:
Fitness value assigned by the objective function: 1133255.0

Search for an optimal solution

Evolutionary optimization with random variation and non-random selection is used to find increasingly better candidate solutions.

1) Parameterization

In [12]:
report_time = True
display_results = True
store_results = False

num_ind = 2
num_gen = 20_000
In [13]:
ea = al.EvolutionaryAlgorithm(
    grammar, objective_function, 'min', population_size=num_ind, offspring_size=num_ind)

2) Run

The search is performed one generation after another and some intermediate results are reported to see how the solutions improve gradually.

In [14]:
start = time.time()
for i in range(1, num_gen+1):
    best_ind = ea.step()

    if i % 1000 == 0:
        string = best_ind.phenotype

        if report_time:
            dt = time.time() - start
            print('{}: {:.0f}min {:.0f}s'.format(i, dt // 60, dt % 60))

        if display_results:
            img = run_python_code(string)
            display_image(img)

        if store_results:
            img = run_python_code(string, var_large)
            store_image_to_file(img, 'results/{}_n{}_s{}_g{}.png'.format(img_name, args[0], args[1], i))
            store_string_to_file(string, 'results/{}_n{}_s{}_g{}.txt'.format(img_name, args[0], args[1], i))
1000: 0min 16s
2000: 0min 34s
3000: 0min 50s
4000: 1min 8s
5000: 1min 23s
6000: 1min 38s
7000: 1min 53s
8000: 2min 8s
9000: 2min 24s
10000: 2min 39s
11000: 2min 54s
12000: 3min 7s
13000: 3min 20s
14000: 3min 35s
15000: 3min 50s
16000: 4min 4s
17000: 4min 18s
18000: 4min 33s
19000: 4min 47s
20000: 5min 1s

3) Result

Show the phenotype of the best individual found so far. If more computing time is provided, increasingly better solutions can be found.

In [15]:
print('Preliminary result after {} generations'.format(ea.state.generation))

string = best_ind.phenotype
img = run_python_code(string, var_large)
display_image(img)
Preliminary result after 20000 generations