Numeric literals in Python 3¶
[1]:
import alogos as al
1) Create the grammar¶
Grammar for Python 3 integer literals¶
Ellipses in the original text need to be replaced by the productions they stand for.
Example: "1"..."9"
becomes "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9"
[2]:
ebnf = """
integer ::= decinteger | bininteger | octinteger | hexinteger
decinteger ::= nonzerodigit (["_"] digit)* | "0"+ (["_"] "0")*
bininteger ::= "0" ("b" | "B") (["_"] bindigit)+
octinteger ::= "0" ("o" | "O") (["_"] octdigit)+
hexinteger ::= "0" ("x" | "X") (["_"] hexdigit)+
nonzerodigit ::= "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9"
digit ::= "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9"
bindigit ::= "0" | "1"
octdigit ::= "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7"
hexdigit ::= digit | "a" | "b" | "c" | "d" | "e" | "f" | "A" | "B" | "C" | "D" | "E" | "F"
"""
grammar_int = al.Grammar(ebnf_text=ebnf, defining_symbol='::=')
Grammar for Python 3 floating point literals¶
The rule for the digit
nonterminal needs to be copied from the definition of integer literals
[3]:
ebnf = """
floatnumber ::= pointfloat | exponentfloat
pointfloat ::= [digitpart] fraction | digitpart "."
exponentfloat ::= (digitpart | pointfloat) exponent
digitpart ::= digit (["_"] digit)*
fraction ::= "." digitpart
exponent ::= ("e" | "E") ["+" | "-"] digitpart
digit ::= "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9"
"""
grammar_float = al.Grammar(ebnf_text=ebnf, defining_symbol='::=')
2) Use the grammar to generate strings¶
a) Some random strings¶
[4]:
n = 15
left = 'Examples of integer literals'
right = 'Number when evaluated (with this Python implementation)'
print('{:30} {}'.format(left, right))
print('{:30} {}'.format('-'*len(left), '-'*len(right)))
for i in range(n):
string = grammar_int.generate_string()
number = eval(string)
print('{:30} {}'.format(string, number))
Examples of integer literals Number when evaluated (with this Python implementation)
---------------------------- -------------------------------------------------------
0B1_1_1_0 14
0o_5_3 43
0B1_0 2
0B_0 0
0b_1_1 3
00_0 0
0 0
0x_c_bcf_B 834811
0B_0 0
0_00 0
0O3_6_1 241
0 0
0b1_00 4
0x_a_E 174
0o_4_4 36
[5]:
left = 'Examples of float literals'
right = 'Number when evaluated (with this Python implementation)'
print('{:30} {}'.format(left, right))
print('{:30} {}'.format('-'*len(left), '-'*len(right)))
for i in range(n):
string = grammar_float.generate_string()
number = eval(string)
print('{:30} {}'.format(string, number))
Examples of float literals Number when evaluated (with this Python implementation)
-------------------------- -------------------------------------------------------
7. 7.0
9. 9.0
5.8e+7 58000000.0
6_8.3e1 683.0
9e1 90.0
3_7. 37.0
10597.e8 1059700000000.0
6. 6.0
9. 9.0
0E+6 0.0
6e-6 6e-06
51_7.E9 517000000000.0
1_5e-3_9 1.5e-38
3.2_6e-7 3.26e-07
5. 5.0
b) A subset of the grammar’s infinite language¶
For a finite language it is possible to generate all strings.
For an infinite language, as it is the case here, the construction process needs to be limited with the parameter
max_steps
to only get simple strings that can be generated with a few derivation steps from the start symbol.
[6]:
language = grammar_int.generate_language(max_steps=6, sort_order='shortlex') #, verbose=True)
print('{} strings were generated, a subset of the inifinite language of integer literals.'.format(len(language)))
print()
print('Here are some of these strings:')
for i in [0, 1, 10, 11, 100, 101, 1000, 1001, 4000, 4001, 5000, 5001, 6000, 6001, -2, -1]:
string = language[i]
print(string)
6592 strings were generated, a subset of the inifinite language of integer literals.
Here are some of these strings:
0
1
00
10
99
000
863
864
0X_Bd
0X_Be
0xf_2
0xf_3
0X_A_8
0X_A_9
000_0_0
0000_0_0
[7]:
language = grammar_float.generate_language(max_steps=6, sort_order='shortlex')
print('{} strings were generated, a subset of the inifinite language of float literals.'.format(len(language)))
print()
print('Here are some of these strings:')
for i in [0, 1, 10, 11, 100, 101, 1000, 1001, 1201, 1202, -2, -1]:
string = language[i]
print(string)
1520 strings were generated, a subset of the inifinite language of float literals.
Here are some of these strings:
.0
.1
0.
1.
2.0
2.1
8E+0
8E+1
2.E+1
2.E+2
9.e-8
9.e-9
3) Use the grammar to parse strings¶
a) Recognize if a string belongs to the language defined by a grammar¶
[8]:
# Examples from website
examples_int = [
'7', '2147483647', '0o177', '0b100110111', '3', '79228162514264337593543950336',
'0o377', '0xdeadbeef', '100_000_000_000', '0b_1110_0101']
# Counterexamples by myself
counterexamples_int = ['007', '1__0', '0o8', '0xEFG']
[9]:
left = 'Candidate string'
right = 'Is it recognized as valid Python integer literal?'
print('{:30} {}'.format(left, right))
print('{:30} {}'.format('-'*len(left), '-'*len(right)))
for string in examples_int + counterexamples_int:
is_recognized = grammar_int.recognize_string(string)
print('{:30} {}'.format(string, is_recognized))
Candidate string Is it recognized as valid Python integer literal?
---------------- -------------------------------------------------
7 True
2147483647 True
0o177 True
0b100110111 True
3 True
79228162514264337593543950336 True
0o377 True
0xdeadbeef True
100_000_000_000 True
0b_1110_0101 True
007 False
1__0 False
0o8 False
0xEFG False
[10]:
# Examples from website
examples_float = ['3.14', '10.', '.001', '1e100', '3.14e-10', '0e0', '3.14_15_93']
# Counterexamples by myself
counterexamples_float = ['3.1.4', '10.0_', '1e_100']
[11]:
left = 'Candidate string'
right = 'Is it recognized as valid Python floating-point literal?'
print('{:30} {}'.format(left, right))
print('{:30} {}'.format('-'*len(left), '-'*len(right)))
for string in examples_float + counterexamples_float:
is_recognized = grammar_float.recognize_string(string)
print('{:30} {}'.format(string, is_recognized))
Candidate string Is it recognized as valid Python floating-point literal?
---------------- --------------------------------------------------------
3.14 True
10. True
.001 True
1e100 True
3.14e-10 True
0e0 True
3.14_15_93 True
3.1.4 False
10.0_ False
1e_100 False
b) Analyze the syntactic structure of a string with its parse tree¶
[12]:
parse_tree = grammar_int.parse_string('0xC3')
parse_tree
[12]:
[13]:
parse_tree = grammar_float.parse_string('3.14')
parse_tree.plot(shape_nt='hexagon', fillcolor_nt='lightblue', shape_t='circle', fillcolor_t='black')
[13]:
Optimization¶
Grammar-guided genetic programming allows to search for optimal strings within a finite or infinite language. An objective function defines what is optimal. It takes a string as input (a member of the language) and returns a number as output (the objective value or fitness value of that string).
Aim here: Try to find a float literal of length 4 that evaluates to the highest possible number.
[14]:
def objective_function(string):
if len(string) != 4:
raise ValueError('Undesired string, wrong length.')
number = eval(string)
return number
[15]:
# Test whether the function creates the expected numerical output
print(objective_function('1234'))
print(objective_function('1_01'))
print(objective_function('3.e7'))
1234
101
30000000.0
[16]:
ea = al.EvolutionaryAlgorithm(grammar_float, objective_function, 'max', max_generations=50)
best_individual = ea.run()
best_individual.phenotype
[16]:
'9E99'