[ prog / sol / mona ]

prog


[challenge] Try to rewrite SICP [urgent]

1 2022-08-15 22:14

The wizard book has been stolen (see: http://textboard.org/prog/22/165 )

Your task is to rewrite it one character at a time.

Prof. Abelson will help you recover the sacred text. The Sussman has shown no interest in doing so. Abelson cannot remember the whole book but you can call him and ask if a particular character was at a given position. For some reason that's something he remembers.

The SICP text is here: http://0x0.st/o20u.md (of course you're not allowed to access it from your program or include any part of it in your code but feel free to generate all the stats you deem necessary). Its length is 1223834. That means you have to call Abelson 1223834 times.

Here's Abelson (Abelson is actually a poorly written Python script)

#!/usr/bin/python
from pathlib import Path
import sys
sicp = Path('sicp.md').read_text()
if sys.argv[2] == sicp[int(sys.argv[1])]:
        print("yes yes!")
else:
        print("no, afraid not")

Save it as "say-abelson-was-the-char-at-position" and call it like this:

$ ./say-abelson-was-the-char-at-position 0 a
no, afraid not
$ ./say-abelson-was-the-char-at-position 1223833 "
> "
yes yes!

In other words, your program should take two arguments:
- n, the position in the text
- your guess for the nth character of SICP.

You have to call the Abelson script within your program, read Abelson's response and count the number of times you had a character right, which will be your score. The highest score wins this programming contest. You are naturally allowed to keep track of Abelson's previous answers and to use them however you like in order to guess the next character. Bonus points will be awarded for concision, esoterism, elegance, cleverness, whatever. (execution speed is irrelevant in this challenge)

Deadline is 2022-09-01 00:00 UTC. Good luck, anon, the fate of humanity is in your hands.

2 2022-08-15 22:22

Also note that the book's length is 1223834 unicode characters not 1223834 bytes. There are a couple of weird letters and symbols in the note section, that's why.

3 2022-08-15 22:56

(calling Abelson a million times will take forever so you should also write your own Abelson as a function)

4 2022-08-15 23:06

A program that assumes SICP was only a sequence of closing parentheses:

import os
from pathlib import Path
sicp = Path('sicp.md').read_text()
count = 0
for i in range(1223834):
        if (sicp[i] == ")"):
                count = count + 1
print("score: " + str(count))
$ python dumbshit.py
score: 14799

Mind you, I get the same result with an opening parenthesis.

5 2022-08-16 03:26

Shit, I forgot something extremely important. Since Abelson knows which character was in SICP at a given position, he will of course be nice enough to tell you what the character actually was if you guessed wrong. That's how you can try to predict what the next character is. That's the whole point, sorry. So here's an updated Abelson:

#!/usr/bin/python
from pathlib import Path
import sys
sicp = Path('sicp.md').read_text()
actual_char = sicp[int(sys.argv[1])]
if sys.argv[2] == actual_char:
        print("yes yes!")
else:
        print("nah, it was " + actual_char)
6 2022-08-16 06:13

>>1,4,5
Wow, it's the snake language with 8-space indentation running on a Unix-like operating system! How cute!
I'm in love! Does Eva Lu Ator understand the snake language?

7 2022-08-16 07:12

>>5
Scheme driver for this:

(define (count guess)
  (with-input-from-file "sicp.md"
    (lambda ()
      (let loop ((prev #f)
                 (score 0))
        (let ((current (read-char)))
          (cond
           ((eof-object? current) score)
           (else (loop current (if (eq? (guess prev) current)
                                   (+ score 1)
                                   score)))))))))

Some values:

(count (lambda (x) #\)))
;; 14799
(count (lambda (x) #\())
;; 14799

;; Try matching open parentheses
(define (matcher)
  (let ((depth 0))
    (lambda (prev)
      (when (eq? prev #\()
        (set! depth (+ depth 1)))
      (when (eq? prev #\))
        (set! depth (- depth 1)))
      (cond
       ((> depth 0) #\))
       (else #\()))))
;; 19535

(count (lambda (x) #\space))
;; 215505

;; Conclusion: SICP is mostly empty.
8 2022-08-16 10:58 *

https://mitp-content-server.mit.edu/books/content/sectbyfn/books_pres_0/6515/sicp.zip/full-text/book/book.html

9 2022-08-16 15:37

>>8
Please stop digging up the dark past. It's time to look forward to a bright future dominated by JavaScript and friends. There is no need for us to rewrite SICP. It has already been done for us in the form of JSICP. SchemeBBS should be upgraded to use a JavaScript backend.

10 2022-08-16 15:51

>>7
Probably a bit over-engineered: https://paste.textboard.org/ecf44ff5

11 2022-08-16 20:54

Sliding window predictor, alpha version.

import collections as co
import sys


class Oracle:
   def __init__ (self, source):
      with open (source, "r") as f:
         text = f.read ()

      self.text = text

   def length (self):
      return len (self.text)

   def guess (self, index, char):
      c = self.text [index]
      return True if c == char else c


class Acolyte:
   def __init__ (self, oracle):
      self.oracle = oracle
      self.index  = 0
      self.score  = 0

   def length (self):
      return self.oracle.length ()

   def guess (self, char):
      index    = self.index
      prophecy = self.oracle.guess (index, char)

      if prophecy is True:
         self.score += 1

      self.index += 1
      return prophecy

   def status (self):
      length = self.oracle.length ()
      index  = self.index
      score  = self.score

      return "{}={} {}={} {}={:.6f} {}={} {}={:.6f}".format (
         "length", length,
         "index",  index,
         "cover",  index / length,
         "score",  score,
         "ratio",  score / index if index else 0)


class Temple:
   def __init__ (self, argv):
      pass

   def preach (self, acolyte):
      pass


class TempleOfSpace (Temple):
   def preach (self, acolyte):
      guess = acolyte.guess

      for k in range (acolyte.length ()):
         guess (' ')


class TempleOfSlidingWindow (Temple):
   def __init__ (self, argv):
      self.window = int (argv [0])

   def preach (self, acolyte):
      window  = self.window
      length  = acolyte.length ()
      if length < window: return
      guess   = acolyte.guess
      predict = {}

      start = ' '
      key   = ''.join (
         map (lambda charguess: charguess [0] if charguess [1] is True else charguess [1],
         map (lambda k: (start, guess (start)),
         range (window))))

      for k in range (window, length):
         counts = predict.get (key)

         if counts is None:
            gu  = guess (start)
            ch  = start if gu is True else gu
            predict [key] = co.Counter ([ch])
            key = key [1:] + ch
         else:
            pred = counts.most_common (1) [0] [0]
            gu   = guess (pred)
            ch   = pred if gu is True else gu
            counts [ch] += 1
            key = key [1:] + ch

      print ("{}={}".format (
         "predict", len (predict)))


def work_temple (src, rest):
   temple  = TempleOfSlidingWindow (rest)
#  temple  = TempleOfSpace (rest)
   oracle  = Oracle (src)
   acolyte = Acolyte (oracle)
   temple.preach (acolyte)
   print (acolyte.status ())

def main_temple (argv):
   src  = argv [0]
   rest = argv [1:]
   work_temple (src, rest)


def main ():
   globals () ["main_" + sys.argv [1]] (sys.argv [2:])

if __name__ == "__main__": main ()

Predictor runs for various window sizes:

$ python3 -m flake.predictext temple o20u.md 2
predict=3096
length=1223834 index=1223834 cover=1.000000 score=546526 ratio=0.446569
$ python3 -m flake.predictext temple o20u.md 3
predict=17927
length=1223834 index=1223834 cover=1.000000 score=712032 ratio=0.581804
$ python3 -m flake.predictext temple o20u.md 4
predict=54483
length=1223834 index=1223834 cover=1.000000 score=778871 ratio=0.636419
$ python3 -m flake.predictext temple o20u.md 5
predict=115721
length=1223834 index=1223834 cover=1.000000 score=784534 ratio=0.641046
$ python3 -m flake.predictext temple o20u.md 6
predict=195485
length=1223834 index=1223834 cover=1.000000 score=761032 ratio=0.621843
$ python3 -m flake.predictext temple o20u.md 7
predict=286261
length=1223834 index=1223834 cover=1.000000 score=723739 ratio=0.591370

best window size: 5
score=784534 ratio=0.641046

12 2022-08-19 03:39

Upgrade of >>11 to adaptive sliding window predictor, still alpha. The adapt parameter controls how many window sizes are considered, starting from the full window.

import collections as co
import sys


class Oracle:
   def __init__ (self, source):
      with open (source, "r") as f:
         text = f.read ()

      self.text = text

   def length (self):
      return len (self.text)

   def guess (self, index, char):
      c = self.text [index]
      return True if c == char else c


class Acolyte:
   def __init__ (self, oracle):
      self.oracle = oracle
      self.index  = 0
      self.score  = 0

   def length (self):
      return self.oracle.length ()

   def guess (self, char):
      index    = self.index
      prophecy = self.oracle.guess (index, char)

      if prophecy is True:
         self.score += 1

      self.index = index + 1
      return prophecy

   def status (self):
      length = self.oracle.length ()
      index  = self.index
      score  = self.score

      return "{}={} {}={} {}={:.6f} {}={} {}={:.6f}".format (
         "length", length,
         "index",  index,
         "cover",  index / length,
         "score",  score,
         "ratio",  score / index if index else 0)


class Temple:
   def __init__ (self, argv):
      pass

   def preach (self, acolyte):
      pass


class TempleOfSpace (Temple):
   def preach (self, acolyte):
      guess = acolyte.guess

      for k in range (acolyte.length ()):
         guess (' ')


class TempleOfSlidingWindow (Temple):
   def __init__ (self, argv):
      self.window = int (argv [0])

   def preach (self, acolyte):
      window  = self.window
      length  = acolyte.length ()
      if length < window: return
      guess   = acolyte.guess
      predict = {}

      start = ' '
      key   = ''.join (
         map (lambda charguess: charguess [0] if charguess [1] is True else charguess [1],
         map (lambda k: (start, guess (start)),
         range (window))))

      for k in range (window, length):
         counts = predict.get (key)

         if counts is None:
            gu  = guess (start)
            ch  = start if gu is True else gu
            predict [key] = co.Counter ([ch])
            key = key [1:] + ch
         else:
            pred = counts.most_common (1) [0] [0]
            gu   = guess (pred)
            ch   = pred if gu is True else gu
            counts [ch] += 1
            key = key [1:] + ch

      print ("{}={}".format (
         "predict", len (predict)))


class TempleOfAdaptiveSW (Temple):
   def __init__ (self, argv):
      self.window = int (argv [0])
      self.adapt  = int (argv [1])

   def preach (self, acolyte):
      window  = self.window
      adapt   = self.adapt
      length  = acolyte.length ()
      if length < window: return
      guess   = acolyte.guess
      ranada  = range (adapt)
      predict = [{} for k in ranada]

      start  = ' '
      key    = ''.join (
         map (lambda charguess: charguess [0] if charguess [1] is True else charguess [1],
         map (lambda k: (start, guess (start)),
         range (window))))
      keys   = [key [k:] for k in ranada]
      counts = [None] * adapt

      for j in range (window, length):
         pred = start
         most = 0

         for k in ranada:
            countsk    = predict [k].get (keys [k])
            counts [k] = countsk
            if countsk is not None:
               chark, mostk = countsk.most_common (1) [0]
               if mostk > most:
                  most = mostk
                  pred = chark

         gu   = guess (pred)
         char = pred if gu is True else gu

         for k in ranada:
            keysk   = keys   [k]
            countsk = counts [k]

            if countsk is None:
               predict [k] [keysk] = co.Counter ([char])
            else:
               countsk [char] += 1

            keys [k] = keysk [1:] + char

      print ("{}={}={}".format (
         "predict",
         sum (len (p) for p in predict),
         [len (p) for p in predict]))


def work_temple (src, templar, rest):
   temple  = globals () ["TempleOf" + templar] (rest)
   oracle  = Oracle (src)
   acolyte = Acolyte (oracle)
   temple.preach (acolyte)
   print (acolyte.status ())

def main_temple (argv):
   src    = argv [0]
   temple = argv [1]
   rest   = argv [2:]
   work_temple (src, temple, rest)


def main ():
   globals () ["main_" + sys.argv [1]] (sys.argv [2:])

if __name__ == "__main__": main ()
13 2022-08-19 03:41

A list of predictor runs with adapt going from 2 to 10 and window sizes going down to 4, 5 and 6. The maximum of the score curve for a constant adapt value appears to always be at a minimum window size of 5.

$ python3 -m flake.predictext temple o20u.md AdaptiveSW 5 2
predict=170204=[115721, 54483]
length=1223834 index=1223834 cover=1.000000 score=780691 ratio=0.637906
$ python3 -m flake.predictext temple o20u.md AdaptiveSW 6 2
predict=311206=[195485, 115721]
length=1223834 index=1223834 cover=1.000000 score=786525 ratio=0.642673
$ python3 -m flake.predictext temple o20u.md AdaptiveSW 7 2
predict=481745=[286261, 195484]
length=1223834 index=1223834 cover=1.000000 score=762855 ratio=0.623332

$ python3 -m flake.predictext temple o20u.md AdaptiveSW 6 3
predict=365689=[195485, 115721, 54483]
length=1223834 index=1223834 cover=1.000000 score=781178 ratio=0.638304
$ python3 -m flake.predictext temple o20u.md AdaptiveSW 7 3
predict=597466=[286261, 195484, 115721]
length=1223834 index=1223834 cover=1.000000 score=787194 ratio=0.643220
$ python3 -m flake.predictext temple o20u.md AdaptiveSW 8 3
predict=864302=[382559, 286260, 195483]
length=1223834 index=1223834 cover=1.000000 score=763585 ratio=0.623929

$ python3 -m flake.predictext temple o20u.md AdaptiveSW 7 4
predict=651949=[286261, 195484, 115721, 54483]
length=1223834 index=1223834 cover=1.000000 score=781411 ratio=0.638494
$ python3 -m flake.predictext temple o20u.md AdaptiveSW 8 4
predict=980022=[382559, 286260, 195483, 115720]
length=1223834 index=1223834 cover=1.000000 score=787575 ratio=0.643531
$ python3 -m flake.predictext temple o20u.md AdaptiveSW 9 4
predict=1342766=[478467, 382558, 286259, 195482]
length=1223834 index=1223834 cover=1.000000 score=763970 ratio=0.624243

$ python3 -m flake.predictext temple o20u.md AdaptiveSW 8 5
predict=1034505=[382559, 286260, 195483, 115720, 54483]
length=1223834 index=1223834 cover=1.000000 score=781551 ratio=0.638609
$ python3 -m flake.predictext temple o20u.md AdaptiveSW 9 5
predict=1458486=[478467, 382558, 286259, 195482, 115720]
length=1223834 index=1223834 cover=1.000000 score=787784 ratio=0.643702
$ python3 -m flake.predictext temple o20u.md AdaptiveSW 10 5
predict=1911272=[568509, 478466, 382557, 286258, 195482]
length=1223834 index=1223834 cover=1.000000 score=764277 ratio=0.624494

$ python3 -m flake.predictext temple o20u.md AdaptiveSW 9 6
predict=1512969=[478467, 382558, 286259, 195482, 115720, 54483]
length=1223834 index=1223834 cover=1.000000 score=781632 ratio=0.638675
$ python3 -m flake.predictext temple o20u.md AdaptiveSW 10 6
predict=2026992=[568509, 478466, 382557, 286258, 195482, 115720]
length=1223834 index=1223834 cover=1.000000 score=787920 ratio=0.643813
$ python3 -m flake.predictext temple o20u.md AdaptiveSW 11 6
predict=2562376=[651107, 568508, 478465, 382556, 286258, 195482]
length=1223834 index=1223834 cover=1.000000 score=764455 ratio=0.624639

$ python3 -m flake.predictext temple o20u.md AdaptiveSW 10 7
predict=2081475=[568509, 478466, 382557, 286258, 195482, 115720, 54483]
length=1223834 index=1223834 cover=1.000000 score=781672 ratio=0.638708
$ python3 -m flake.predictext temple o20u.md AdaptiveSW 11 7
predict=2678096=[651107, 568508, 478465, 382556, 286258, 195482, 115720]
length=1223834 index=1223834 cover=1.000000 score=788015 ratio=0.643890
$ python3 -m flake.predictext temple o20u.md AdaptiveSW 12 7
predict=3287473=[725100, 651106, 568507, 478464, 382556, 286258, 195482]
length=1223834 index=1223834 cover=1.000000 score=764582 ratio=0.624743

$ python3 -m flake.predictext temple o20u.md AdaptiveSW 11 8
predict=2732579=[651107, 568508, 478465, 382556, 286258, 195482, 115720, 54483]
length=1223834 index=1223834 cover=1.000000 score=781708 ratio=0.638737
$ python3 -m flake.predictext temple o20u.md AdaptiveSW 12 8
predict=3403193=[725100, 651106, 568507, 478464, 382556, 286258, 195482, 115720]
length=1223834 index=1223834 cover=1.000000 score=788078 ratio=0.643942
$ python3 -m flake.predictext temple o20u.md AdaptiveSW 13 8
predict=4077835=[790368, 725099, 651105, 568506, 478463, 382555, 286257, 195482]
length=1223834 index=1223834 cover=1.000000 score=764644 ratio=0.624794

$ python3 -m flake.predictext temple o20u.md AdaptiveSW 12 9
predict=3457676=[725100, 651106, 568507, 478464, 382556, 286258, 195482, 115720, 54483]
length=1223834 index=1223834 cover=1.000000 score=781736 ratio=0.638760
$ python3 -m flake.predictext temple o20u.md AdaptiveSW 13 9
predict=4193555=[790368, 725099, 651105, 568506, 478463, 382555, 286257, 195482, 115720]
length=1223834 index=1223834 cover=1.000000 score=788116 ratio=0.643973
$ python3 -m flake.predictext temple o20u.md AdaptiveSW 14 9
predict=4925639=[847812, 790367, 725098, 651104, 568505, 478462, 382554, 286256, 195481]
length=1223834 index=1223834 cover=1.000000 score=764718 ratio=0.624854

$ python3 -m flake.predictext temple o20u.md AdaptiveSW 13 10
predict=4248038=[790368, 725099, 651105, 568506, 478463, 382555, 286257, 195482, 115720, 54483]
length=1223834 index=1223834 cover=1.000000 score=781762 ratio=0.638781
$ python3 -m flake.predictext temple o20u.md AdaptiveSW 14 10
predict=5041358=[847812, 790367, 725098, 651104, 568505, 478462, 382554, 286256, 195481, 115719]
length=1223834 index=1223834 cover=1.000000 score=788169 ratio=0.644016
$ python3 -m flake.predictext temple o20u.md AdaptiveSW 15 10
predict=5823422=[897792, 847811, 790366, 725097, 651103, 568504, 478461, 382553, 286255, 195480]
length=1223834 index=1223834 cover=1.000000 score=764750 ratio=0.624880

The adaptive window wins over the fixed window, as expected, but the margin is tiny. The best run with adapt<=10 is:
window=14 adapt=10
score=788169 ratio=0.644016

14


VIP:

do not edit these