How can I run my own instance of this

101 2020-03-01 11:18

[2/5]
As for (**), the reason it breaks is only marginally more complicated. To illustrate the explanation with an example, we begin by replacing the string atom matcher with a thin proxy that reports the progress of the computation:

$ TZ=GMT diff -u schemebbs/deps/irregex.scm edit/deps/irregex.scm 
--- schemebbs/deps/irregex.scm	2020-02-17 21:53:31.563445679 +0000
+++ edit/deps/irregex.scm	2020-02-29 02:01:57.844878000 +0000
@@ -3508,7 +3508,10 @@
                     (fail))))
           ))
      ((string? sre)
-      (rec (sre-sequence (string->list sre)))
+      (let ((sub (rec (sre-sequence (string->list sre)))))
+        (lambda (cnk init src str i end matches fail)
+          (simple-format #t "trying ~A at ~A\n" sre i)
+          (sub cnk init src str i end matches fail)))
 ;; XXXX reintroduce faster string matching on chunks
 ;;       (if (flag-set? flags ~case-insensitive?)
 ;;           (rec (sre-sequence (string->list sre)))

The (**) branch is:

((**)
 (cond
  ((or (and (number? (cadr sre))
            (number? (caddr sre))
            (> (cadr sre) (caddr sre)))
       (and (not (cadr sre)) (caddr sre)))
   (lambda (cnk init src str i end matches fail) (fail)))
  (else
   (letrec
       ((from (cadr sre))
        (to (caddr sre))
        (body-contents (sre-sequence (cdddr sre)))
        (body
         (lambda (count)
           (lp body-contents
               n
               flags
               (lambda (cnk init src str i end matches fail)
                 (if (and to (= count to))
                     (next cnk init src str i end matches fail)
                     ((body (+ 1 count))
                      cnk init src str i end matches
                      (lambda ()
                        (if (>= count from)
                            (next cnk init src str i end matches fail)
                            (fail))))))))))
     (if (and (zero? from) to (zero? to))
         next
         (lambda (cnk init src str i end matches fail)
           ((body 1) cnk init src str i end matches
            (lambda ()
              (if (zero? from)
                  (next cnk init src str i end matches fail)
                  (fail))))))))))

Here is a talkative run on the last example of >>95:

scheme@(guile-user)> (imsis '(** 3 4 (or "a" "ab")) "abababab")
trying a at 0
trying a at 1
trying ab at 1
trying ab at 0
trying a at 2
trying a at 3
trying ab at 3
trying ab at 2
trying a at 4
trying a at 5
trying ab at 5
$1 = "ababa"
scheme@(guile-user)> 

The "a at 0" and "a at 2" matches are both retried with "ab" because the overall repeat counts they yielded were 1 and 2, under the lower limit of 3. These retries happen on the alternate of the innermost conditional of 'body', that is, the (fail) taken when (>= count from) is false. But after "a at 4", which makes both attempts at 5 fail, there is no retry with "ab at 4". This is because the produced repeat count is now 3, and the same conditional declares it a success because extending to 4 repeats failed. There is no attempt to extend with further retries, even though it would obviously work since the target string is composed of 4 "ab"s.

The (**) branch is not interested in "leftmost, longest": it stops at the first repeat count extension failure within the allowed range. The only way it can produce "leftmost, longest" on its own is to have repeat count extension failures only under the lower limit, and smooth sailing from the lower limit to the upper limit. Since the (**) might be the entire regex, this is also enough to make the sre->procedure call break the same semantics.
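For anyone who wants to poke at this without patching irregex, here is a rough toy model of the same control flow. It is a sketch under assumptions, not irregex's code: matchers take only (str i fail) instead of the full (cnk init src str i end matches fail), 'from' and 'to' are assumed numeric with to >= 1, and make-lit, make-alt, make-rep, first-match and all-ends are names made up for this post. make-rep copies the shape of the (else ...) arm of the (**) branch. first-match returns at the first success like the real matcher, while all-ends forces a failure after every success to list every end position the backtracking could reach.

```scheme
;; Toy model of the continuation-passing matchers; NOT irregex's actual code.
;; A matcher and a continuation both look like (lambda (str i fail) ...).
;; All procedure names below are hypothetical, chosen for this illustration.

;; Match the literal string s at position i.
(define (make-lit s)
  (lambda (next)
    (lambda (str i fail)
      (let ((j (+ i (string-length s))))
        (if (and (<= j (string-length str))
                 (string=? s (substring str i j)))
            (next str j fail)
            (fail))))))

;; Ordered alternation: try a, and on failure try b at the same position.
(define (make-alt a b)
  (lambda (next)
    (let ((ma (a next)) (mb (b next)))
      (lambda (str i fail)
        (ma str i (lambda () (mb str i fail)))))))

;; Same shape as the (else ...) arm of the (**) branch.
;; Assumes numeric from and to, with to >= 1.
(define (make-rep from to body-contents)
  (lambda (next)
    (letrec ((body
              (lambda (count)
                (body-contents
                 (lambda (str i fail)
                   (if (= count to)
                       (next str i fail)
                       ((body (+ 1 count))
                        str i
                        (lambda ()
                          ;; Extension failed: accept if within range,
                          ;; otherwise backtrack into the body.
                          (if (>= count from)
                              (next str i fail)
                              (fail))))))))))
      (lambda (str i fail)
        ((body 1) str i
         (lambda ()
           (if (zero? from) (next str i fail) (fail))))))))

;; Stop at the first success, like the real matcher does.
(define (first-match rx str)
  ((rx (lambda (str i fail) (substring str 0 i)))
   str 0 (lambda () #f)))

;; Record every successful end position by failing after each success.
(define (all-ends rx str)
  (let ((ends '()))
    ((rx (lambda (str i fail)
           (set! ends (cons i ends))
           (fail)))
     str 0 (lambda () #f))
    (reverse ends)))

(define rx (make-rep 3 4 (make-alt (make-lit "a") (make-lit "ab"))))

(display (first-match rx "abababab")) (newline)  ; ababa
(display (all-ends rx "abababab")) (newline)     ; (5 7 8 6)
```

The first success ends at 5, which is the "ababa" from the talkative run. The end at 8, the full "abababab", only shows up when the final continuation refuses to succeed and forces the retry with "ab at 4". Brute-forcing every end like this is only a way to check the claim in the toy model, not a suggestion for how a "leftmost, longest" matcher should be built.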
