

How can I run my own instance of this

102 2020-03-01 11:19

However, I promised the cold, hard truth and this was only the cold part, so here is the rest. The disregard for "leftmost, longest" in sre->procedure is not limited to (or) and (**); it is present in all branches that can take multiple paths. This is meant to be salvaged by the external caller of sre->procedure via the 'fail' lambda that is returned as part of the 'matches' object in the named let lp. On a successful match, 'fail' is actually the retry continuation. This is how irregex-match works. Its driver for sre->procedure is the 'else' branch of irregex-match/chunked:

(define (irregex-match/chunked irx cnk src)
  (let* ((irx (irregex irx))
         (matches (irregex-new-matches irx)))
    (irregex-match-chunker-set! matches cnk)
    (cond
     ((irregex-dfa irx)
      ...)                              ; DFA path, elided
     (else
      (let* ((matcher (irregex-nfa irx))
             (str ((chunker-get-str cnk) src))
             (i ((chunker-get-start cnk) src))
             (end ((chunker-get-end cnk) src))
             (init (cons src i)))
        (let lp ((m (matcher cnk init src str i end matches (lambda () #f))))
          (and m
               (cond
                ;; the match ends at the end of the last chunk: accept it
                ((and (not ((chunker-get-next cnk)
                            (%irregex-match-end-chunk m 0)))
                      (= ((chunker-get-end cnk)
                          (%irregex-match-end-chunk m 0))
                         (%irregex-match-end-index m 0)))
                 (%irregex-match-fail-set! m #f)
                 m)
                ;; otherwise call the retry continuation, if there is one
                ((%irregex-match-fail m)
                 (lp ((%irregex-match-fail m))))
                (else
                 #f)))))))))

Whenever there is a match that does not exhaust the input, and a retry continuation exists, the retry is called by the "(lp ((%irregex-match-fail m)))" branch. This means that if a full match is possible, it will be found. Here is the above (**) example with irregex-match and one more debug print:

scheme@(guile-user)> (define (imsim re str) (irregex-match-substring (irregex-match re str)))
scheme@(guile-user)> (imsim '(** 3 4 (or "a" "ab")) "abababab")
trying a at 0
trying a at 1
trying ab at 1
trying ab at 0
trying a at 2
trying a at 3
trying ab at 3
trying ab at 2
trying a at 4
trying a at 5
trying ab at 5
retry by irregex-match/chunked
trying ab at 4
trying a at 6
retry by irregex-match/chunked
trying ab at 6
$1 = "abababab"

Here irregex-match/chunked has to reject sre->procedure's result twice, calling the retry continuation each time, before it gets the full match.
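
For anyone who wants to play with the retry mechanism outside of irregex, here is a minimal sketch of the same idea. It is not irregex code: lit, alt, rep, first-match and match-full are hypothetical names made up for this post, and the matchers are a toy continuation-passing design that only assumes what was described above, namely that a successful match hands back a 'fail' thunk that resumes the search, and that the driver keeps calling it until the match ends at the end of the input.

```scheme
;; A matcher is (lambda (str i next fail) ...):
;;   next is called as (next j fail) when the match ends at index j,
;;   fail is a thunk that resumes the search at the last choice point.

(define (lit s)                         ; match the literal string s at i
  (let ((n (string-length s)))
    (lambda (str i next fail)
      (let ((j (+ i n)))
        (if (and (<= j (string-length str))
                 (string=? s (substring str i j)))
            (next j fail)
            (fail))))))

(define (alt a b)                       ; ordered choice: a first, b on retry
  (lambda (str i next fail)
    (a str i next (lambda () (b str i next fail)))))

(define (rep lo hi m)                   ; between lo and hi repetitions of m
  (lambda (str i next fail)
    (let loop ((n 0) (i i) (fail fail))
      (cond
       ((= n hi) (next i fail))
       ((>= n lo)
        ;; try one more repetition; if that fails, settle for n
        (m str i
           (lambda (j fail2) (loop (+ n 1) j fail2))
           (lambda () (next i fail))))
       (else
        (m str i
           (lambda (j fail2) (loop (+ n 1) j fail2))
           fail))))))

;; First result only: a pair (end-index . retry-thunk), or #f.
(define (first-match m str)
  (m str 0 (lambda (j fail) (cons j fail)) (lambda () #f)))

;; Driver: retry until the match exhausts the input.
;; Returns (matched-string . number-of-retries), or #f.
(define (match-full m str)
  (let lp ((r (first-match m str)) (retries 0))
    (cond
     ((not r) #f)
     ((= (car r) (string-length str))
      (cons (substring str 0 (car r)) retries))
     (else (lp ((cdr r)) (+ retries 1))))))

(define re (rep 3 4 (alt (lit "a") (lit "ab"))))
(car (first-match re "abababab"))       ; => 5
(match-full re "abababab")              ; => ("abababab" . 2)
```

With this toy the bare matcher stops at index 5, and the driver needs two retries to reach the end of the string, which lines up with the two "retry by irregex-match/chunked" lines in the trace above.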


