Semantic versioning

This notebook examines the grammar and the regular expression that Semantic Versioning 2.0.0 provides to define the exact format of valid version numbers.

[1]:
import re

import alogos as al

Define the grammar

[2]:
bnf = """
<valid semver> ::= <version core>
                 | <version core> "-" <pre-release>
                 | <version core> "+" <build>
                 | <version core> "-" <pre-release> "+" <build>

<version core> ::= <major> "." <minor> "." <patch>

<major> ::= <numeric identifier>

<minor> ::= <numeric identifier>

<patch> ::= <numeric identifier>

<pre-release> ::= <dot-separated pre-release identifiers>

<dot-separated pre-release identifiers> ::= <pre-release identifier>
                                          | <pre-release identifier> "." <dot-separated pre-release identifiers>

<build> ::= <dot-separated build identifiers>

<dot-separated build identifiers> ::= <build identifier>
                                    | <build identifier> "." <dot-separated build identifiers>

<pre-release identifier> ::= <alphanumeric identifier>
                           | <numeric identifier>

<build identifier> ::= <alphanumeric identifier>
                     | <digits>

<alphanumeric identifier> ::= <non-digit>
                            | <non-digit> <identifier characters>
                            | <identifier characters> <non-digit>
                            | <identifier characters> <non-digit> <identifier characters>

<numeric identifier> ::= "0"
                       | <positive digit>
                       | <positive digit> <digits>

<identifier characters> ::= <identifier character>
                          | <identifier character> <identifier characters>

<identifier character> ::= <digit>
                         | <non-digit>

<non-digit> ::= <letter>
              | "-"

<digits> ::= <digit>
           | <digit> <digits>

<digit> ::= "0"
          | <positive digit>

<positive digit> ::= "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9"

<letter> ::= "A" | "B" | "C" | "D" | "E" | "F" | "G" | "H" | "I" | "J"
           | "K" | "L" | "M" | "N" | "O" | "P" | "Q" | "R" | "S" | "T"
           | "U" | "V" | "W" | "X" | "Y" | "Z" | "a" | "b" | "c" | "d"
           | "e" | "f" | "g" | "h" | "i" | "j" | "k" | "l" | "m" | "n"
           | "o" | "p" | "q" | "r" | "s" | "t" | "u" | "v" | "w" | "x"
           | "y" | "z"
"""

grammar = al.Grammar(bnf_text=bnf, start_terminal_symbol='"', end_terminal_symbol='"')
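
One detail of the grammar is easy to miss: a pre-release identifier that consists only of digits must be a `<numeric identifier>` (no leading zeros), whereas a build identifier may be any `<digits>` sequence. The following plain-Python sketch restates those rules as predicates. The function names are hypothetical helpers introduced for illustration; they are not part of alogos or of the specification.

```python
import string

# Character classes taken from the <digit>, <letter> and <non-digit> rules.
DIGITS = set(string.digits)
NON_DIGITS = set(string.ascii_letters + "-")


def is_digits(text):
    """<digits>: one or more digits, leading zeros allowed."""
    return bool(text) and set(text) <= DIGITS


def is_numeric_identifier(text):
    """<numeric identifier>: "0", or digits that do not start with "0"."""
    return is_digits(text) and (text == "0" or text[0] != "0")


def is_alphanumeric_identifier(text):
    """<alphanumeric identifier>: identifier characters with at least one non-digit."""
    return (
        bool(text)
        and set(text) <= (DIGITS | NON_DIGITS)
        and any(char in NON_DIGITS for char in text)
    )


def is_pre_release_identifier(text):
    """<pre-release identifier>: alphanumeric identifier or numeric identifier."""
    return is_alphanumeric_identifier(text) or is_numeric_identifier(text)


def is_build_identifier(text):
    """<build identifier>: alphanumeric identifier or digits."""
    return is_alphanumeric_identifier(text) or is_digits(text)
```

Under these rules `006` is accepted as a build identifier but rejected as a pre-release identifier, while `0-6` is accepted as both because the hyphen makes it alphanumeric.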

Generate random strings

[3]:
for _ in range(20):
    print(grammar.generate_string())
2.4.3-100+-1.0
1.40.3-917.30.8300.f80n0.5.-q+0.090
7.3.90064
8.0.649
50.0.0+p.t0C-.0
0.0.0
6.22.2+8.9.079
2.23.0-0+300.h80
7.0.0-6.-m7.9530+B-0.-.00
0.0.3-3+006
4.6.0
5.255.5+0
0.18.5-9.-W6.8.2
0.0.3-2+La-0.0980.0-0.1.0.Pl.00
0.0.2+16605
0.4.88-256+0C
5.76080.0-jcB.0
992.9.6
0.8.688+0-p.07
52.450.9-T-A0

Check if strings generated with the grammar are recognized as valid by the regular expression

[4]:
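# Raw strings keep backslash sequences such as \d and \. intact for the regex engine
# and avoid Python's invalid-escape-sequence warnings.
regex_pattern = (
    r'^(?P<major>0|[1-9]\d*)\.'
    r'(?P<minor>0|[1-9]\d*)\.'
    r'(?P<patch>0|[1-9]\d*)'
    r'(?:-(?P<prerelease>(?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*)(?:\.(?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*))*))?'
    r'(?:\+(?P<buildmetadata>[0-9a-zA-Z-]+(?:\.[0-9a-zA-Z-]+)*))?$'
)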

n = 2000
for _ in range(n):
    random_string = grammar.generate_string()
    # The pattern is anchored with ^ and $, so a match covers the whole string.
    if not re.match(regex_pattern, random_string):
        raise ValueError(f'String "{random_string}" was not recognized by the regular expression.')
# Reached only if no string raised an error.
print(f'All {n} strings that were randomly generated with the grammar were '
      'recognized by the regular expression.')
All 2000 strings that were randomly generated with the grammar were recognized by the regular expression.
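
Beyond a yes/no answer, the named groups in the regular expression split a version string into its components. The sketch below uses only the standard library; `SEMVER_REGEX` and `split_semver` are hypothetical names chosen for illustration, and the pattern is the same one used in the check above.

```python
import re

# Same pattern as above, compiled once for reuse.
SEMVER_REGEX = re.compile(
    r'^(?P<major>0|[1-9]\d*)\.'
    r'(?P<minor>0|[1-9]\d*)\.'
    r'(?P<patch>0|[1-9]\d*)'
    r'(?:-(?P<prerelease>(?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*)(?:\.(?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*))*))?'
    r'(?:\+(?P<buildmetadata>[0-9a-zA-Z-]+(?:\.[0-9a-zA-Z-]+)*))?$'
)


def split_semver(text):
    """Return the named groups as a dict, or None if the string does not match."""
    match = SEMVER_REGEX.match(text)
    return match.groupdict() if match else None


for candidate in ['1.0.0', '11.3.17-rc0+nightly', '0.0.3-3+006', '01.0.0', '1.0.0-01']:
    print(candidate, '->', split_semver(candidate))
```

Optional parts that are absent come back as `None`. The last two candidates are rejected because a leading zero is not allowed in the version core or in a purely numeric pre-release identifier.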

Parse given strings

[5]:
grammar.parse_string('1.0.0')
[5]:
[Figure: parse tree for '1.0.0'. The root <valid semver> has the single child <version core>, which expands to <major> "." <minor> "." <patch>; each of the three expands through <numeric identifier> to a single digit.]
[6]:
grammar.parse_string('11.3.17-rc0+nightly')
[6]:
[Figure: parse tree for '11.3.17-rc0+nightly'. The root <valid semver> expands to <version core> "-" <pre-release> "+" <build>; the pre-release subtree derives the alphanumeric identifier rc0 and the build subtree derives the alphanumeric identifier nightly.]