Towards Inform 7 in WASI

I have been a fan of Inform 7 for a long time. It was first released in binary form only, with the promise to eventually open source it. The timeline for this, for someone so curious to see how it worked inside, seemed to drag on indefinitely, and I did not hear much for years. But in 2019, progress was clearly starting to show on the outside, and on April 28, 2022, Inform was made open-source.

Inform is a wonderful program, but it requires you to be able to run a program on your computer. That may not sound like much, but you don’t always have that capability. Sometimes when I am writing I may not want to turn on my computer and would rather write on my iPad, for example. But there is no Inform for iPad, and even if there were, there’s no Inform for Chromebook, and while Inform for Android seems to have existed once, it no longer does.

This is a perfect spot for the Web. All of these devices have perfectly capable web browsers. With Emscripten blazing the path, first with asm.js and later with WebAssembly, and then WASI providing another system interface for WebAssembly, and with the Inform source code finally available, this goal should be within reach – an Inform for Web that would let anyone use Inform, as long as you’ve got a web browser.

This isn’t what I have today, and I likely don’t have the energy to deliver it. But I did want to see how far I could push this with a little bit of effort. Let’s try to get Inform running in WASI.

First, we’ll use WASI SDK. Grab the most recent release. Next, we need some way to run WASI executables to test the result. wasm3 is not the fastest option, but it is easy to install, and will do in a pinch. Note that while configuring wasm3 with -DBUILD_WASI=simple simplifies the build, that setting is missing implementations of parts of the WASI interface needed by Inform, like fd_readdir; the default UVWASI setting pulls in another dependency and requires more time to build, but fixes this issue.

Double-check your installation works fine by compiling a hello-world C or C++ program with the WASI SDK and running it with wasm3:

cat <<EOF > hello.cpp
#include <cstdio>

int main() {
    std::puts("Hello WASI");
}
EOF
wasi-sdk/bin/wasm32-wasip1-clang++ -o hello.wasm hello.cpp
wasm3/build/wasm3 hello.wasm

Next, get a build environment for Inform up and running. Per the instructions, you’ll need a directory into which you clone inweb, intest, and inform. Then:

bash inweb/scripts/first.sh <platform>  # check inweb README for options
bash intest/scripts/first.sh
cd inform
bash scripts/first.sh

Note that all of these scripts are sensitive to the working directory: inweb and intest want the working directory to be the parent of the repository root, while inform wants the working directory to be its repository root and not the parent.

You can check that the local build is working by creating a small input. Normally a project directory would be used, but it’s easiest to start with a single file:

cat <<EOF > test.ni
"Example" by Fractolog

The Blog is a room.  There are words in the blog.  The words are edible.
EOF
inform/inform7/Tangled/inform7 -at inform/inform7 test.ni
inform/inform6/Tangled/inform6 -G test.i6
inform/inform6/Tests/Assistants/dumb-glulx/glulxe/glulxe test.ulx

The inweb process used to build Inform makes the build process simple, at least as far as the C toolchain is concerned. The entire inform7 executable is created from a single large C file, inform7/Tangled/inform7.c. You can feed this into WASI SDK, but there are some problems off the bat:

  • It complains about pthread.
  • It complains about setjmp.

Let’s proceed by hacking our way inelegantly to get something working. It can be made pretty later, perhaps.

The setjmp comes from inform/inter/final-module/Chapter 5/Final C.w. Remove the #include <setjmp.h>. This will break C output from Inform, but this is not widely used right now – Z-machine and Glulx are almost certainly what you’re using. Hackety hack.

Pthread is used for the threading primitives in inweb/foundation-module/Chapter 1/POSIX Platforms.w. These threading primitives are only used in intest, not in the inform7 compiler proper. Remove the pthread include and replace all usages with stubs. Hackety hack.
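What such stubs might look like is sketched below. This is only an illustration of the idea, not the actual patch: every name here (demo_thread, demo_create_thread, and so on) is made up for the example and is not taken from inweb. The assumption is that there is only ever one thread, so "creating" a thread can simply run the function on the spot, and mutexes can do nothing.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a thread handle: it only remembers the result.
   None of these names come from inweb; they are illustrative only. */
typedef struct {
    void *result;
} demo_thread;

/* "Create" a thread by running fn(arg) immediately on the calling thread. */
int demo_create_thread(demo_thread *t, void *(*fn)(void *), void *arg) {
    assert(t != NULL && fn != NULL);
    t->result = fn(arg);
    return 0; /* 0 means success, following the pthread_create convention */
}

/* "Join" by handing back the result stored at creation time. */
int demo_join_thread(demo_thread *t, void **rv) {
    assert(t != NULL);
    if (rv != NULL) *rv = t->result;
    return 0;
}

/* With a single thread there is nothing to guard, so locking is a no-op. */
typedef int demo_mutex;
void demo_lock(demo_mutex *m)   { (void) m; }
void demo_unlock(demo_mutex *m) { (void) m; }

/* Tiny worker used to exercise the stubs: increments the int it is given. */
void *demo_worker(void *arg) {
    int *n = arg;
    (*n)++;
    return n;
}
```

Since the compiler proper never reaches the threading code, even cruder stubs that just fail would probably do; running the function synchronously merely keeps the behaviour plausible if something does call it.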

If you missed a -std directive on the command line, you’ll get an error about asm used as a variable name; add -std=c11 to avoid this. You’ll get an error about PLATFORM_STRING not being defined; add -DPLATFORM_STRING=\"wasi\" to avoid this.

Next there are linker errors for Platform__where_am_i, clock, and system. Replace all calls to clock with 0 (mostly in inweb/foundation-module/Chapter 3/Time.w, but some in inform/inbuild/supervisor-module/Chapter 3/Inter Skill.w); edit Platform::system in inweb/foundation-module/Chapter 1/POSIX Platforms.w to return 1 instead of calling system; and comment out the Platform::where_am_i call in inweb/foundation-module/Chapter 3/Pathnames.w.
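A slightly less destructive variant of the same edits would be to guard them behind a preprocessor check instead of replacing the calls outright. The sketch below assumes Clang's predefined __wasi__ macro for WASI targets; DEMO_NO_PROCESSES, demo_clock and demo_system are hypothetical names invented for this example, not anything Inform or inweb defines.

```c
#include <assert.h>
#include <stdlib.h>
#include <time.h>

/* __wasi__ is predefined by Clang when targeting WASI. DEMO_NO_PROCESSES is
   a hypothetical switch name used only in this sketch. */
#if defined(__wasi__)
#define DEMO_NO_PROCESSES 1
#else
#define DEMO_NO_PROCESSES 0
#endif

/* Processor time used so far, or 0 where no process clock is available. */
long demo_clock(void) {
#if DEMO_NO_PROCESSES
    return 0;
#else
    return (long) clock();
#endif
}

/* Run a shell command, or report failure (1) where no shell exists. */
int demo_system(const char *command) {
    assert(command != NULL);
#if DEMO_NO_PROCESSES
    return 1;
#else
    return system(command);
#endif
}
```

Built natively, these behave like the originals; built with the WASI SDK, they collapse to the same constants as the hand edits described above.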

After doing all of these, I get a huge inform7.wasm file (10 MB) once compilation finishes (which takes nearly a whole minute for me). Note that in order to trigger recompilation, inweb needs to re-run so that it regenerates inform7.c. To do this, you can use make -C .. -f inweb/inweb.mk && INWEB_PATH=$PWD/../inweb make (after which you can run your wasm32-wasip1-clang). INWEB_PATH is needed because commenting out Platform::where_am_i broke inweb’s ability to find itself, so we need to help it along.

See if you can pull a version:

../wasm3/build/wasm3 inform7/Tangled/inform7.wasm --version
Error: compiling function overran its stack height limit

Aww, man! But wasm3’s docs/Troubleshooting.md covers this. Update m3_config.h’s d_m3MaxFunctionStackHeight from its default of 2,000 up to 30,000. (I tried 20,000, but it wasn’t enough – we’re getting scarily close to the maximum wasm3 supports of 32768!). Rebuild wasm3 and retry:

../wasm3/build/wasm3 inform7/Tangled/inform7.wasm --version
Error: [trap] stack overflow

Aww, man! The troubleshooting document covers this also.

../wasm3/build/wasm3 --stack-size 1000000 inform7/Tangled/inform7.wasm --version
inform7 version 10.2.0-beta+6X64 'Krypton' (21 May 2024)

Woohoo! Run it on the test story:

wasm3/build/wasm3 --stack-size 1000000 inform/inform7/Tangled/inform7.wasm -at inform/inform7 test.ni
Inform 7 v10.2.0 has started.
Error: [trap] indirect call type mismatch

Aww, man! After adding some debugging code to wasm3 to see where we’re going wrong, the trap turns out to come from BuildSteps::execute making this call:

VOID_METHOD_CALL(S->what_to_do, BUILD_SKILL_COMMAND_MTID, S, command, BM, search_list);

Into a method declared this way:

VOID_METHOD_TYPE(BUILD_SKILL_COMMAND_MTID,
    build_skill *S, build_step *BS, text_stream *command, build_methodology *meth,
    linked_list *search)

Whose implementation is declared this way:

int Inform7Skill__inform7_via_shell(build_skill *skill, build_step *S,
    text_stream *command, build_methodology *BM, linked_list *search_list) {

Is that an int return type I spy? On my void method type? WebAssembly is strict about this – we can’t be loose about it. Near as I can tell, this inconsistency has existed since this code was first introduced in 695721dcee. It seems every single skill’s “via shell” implementation ends with return TRUE though and has no paths that can return any other value, so it’s safe to turn all the implementations into void-returning functions. Next!
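To see why the return type matters: a call through a function pointer compiles to WebAssembly's call_indirect, which checks the callee's full signature (parameters and results) at run time and traps on any mismatch. A native ABI would usually let a stray int return value slide. The sketch below is a made-up miniature of the fixed shape, with the implementation's type matching the pointer type exactly; demo_skill, demo_void_method, demo_add and demo_run are hypothetical names, not Inform's.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature of a "void method" slot: a pointer to a function
   returning void. The names here are illustrative, not Inform's. */
typedef void (*demo_void_method)(int *counter, int amount);

typedef struct {
    demo_void_method what_to_do;
} demo_skill;

/* The broken shape would be
       int demo_add(int *counter, int amount) { ...; return 1; }
   stored in the slot via a cast. Calling that through demo_void_method is
   undefined behaviour in C and a type-mismatch trap in WebAssembly.
   The fixed shape returns void, matching the slot's type exactly. */
void demo_add(int *counter, int amount) {
    *counter += amount;
}

/* Dispatch through the slot, as a method-call macro would. */
int demo_run(const demo_skill *S, int amount) {
    int counter = 0;
    assert(S != NULL && S->what_to_do != NULL);
    S->what_to_do(&counter, amount);
    return counter;
}
```

With matching types no cast is needed when filling in the slot, so the compiler can catch this class of mistake instead of the runtime.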

wasm3/build/wasm3 --stack-size 1000000 ../inform/inform7/Tangled/inform7.wasm -at ../inform/inform7 test.ni
Inform 7 v10.2.0 has started.
  >--> The project Example seems to need me to know about a non-English
    language, 'English'. I can't find any definition for this language.
    Because of this problem, the source could not be translated into a working
    game. (Correct the source text to remove the difficulty and click on Go
    once again.)
Inform 7 has finished.

Aww, man! This is a uvwasi sandboxing problem. Don’t try to be clever and use .. anywhere in your path names. Next!
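Since the resulting error message says nothing about paths, a wrapper or front end could check for this up front. Here is a minimal sketch of such a check; demo_path_has_dotdot is a hypothetical helper invented for this example, and it only looks for a literal .. component between slashes, nothing cleverer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if any '/'-separated component of path is exactly "..",
   and 0 otherwise. Hypothetical helper, not part of Inform or wasm3. */
int demo_path_has_dotdot(const char *path) {
    assert(path != NULL);
    const char *p = path;
    while (*p != '\0') {
        /* Length of the current component, up to the next '/' or the end. */
        size_t len = strcspn(p, "/");
        if (len == 2 && p[0] == '.' && p[1] == '.') return 1;
        p += len;
        if (*p == '/') p++; /* step over the separator */
    }
    return 0;
}
```

Run against each path argument before launching the compiler, this would flag ../inform/inform7 but accept inform/inform7.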

wasm3/build/wasm3 --stack-size 1000000 inform/inform7/Tangled/inform7.wasm -at inform/inform7 test.ni
  >--> An internal error has occurred: empty intermediate pathname. The error
    was detected at line of
    "inweb/foundation-module/Chapter 3/Filenames.w". This should never happen, and I am now halting in abject
    failure.
    What has happened here is that one of the checks Inform carries out
    internally, to see if it is working properly, has failed. There must be a
    bug in this copy of Inform. It may be worth checking whether you have the
    current, up-to-date version. If so, please report this problem via
    www.inform7.com/bugs. As for fixing your source text to avoid this bug,
    the last thing you changed is probably the cause, if there is a simple
    cause. Your source text might in fact be wrong, and the problem might be
    occurring because Inform has failed to find a good way to say so. But even
    if your source text looks correct, there are probably rephrasings which
    would achieve the same effect.

Aww, man! This one took several hours of debugging: the cause is subtle and not at all implied by the error message, but simple once found. The stack (the portion that resides in linear memory, not the WebAssembly stack, and hence not controlled by wasm3’s --stack-size – confusing, I know) is too small, and memory corruption is occurring. So we have to tell the linker to pick a bigger stack size with -Wl,-z,stack-size=$((2*1024*1024)) while linking, to give it a nice big 2 MB stack. Next!

wasm3/build/wasm3 --stack-size 10000000 inform/inform7/Tangled/inform7.wasm -at inform/inform7 test.ni
Inform 7 v10.2.0 has started.
  >--> The project Example seems to need me to work with a non-English
    language, but 'this project asks to be 'written in' a language which does
    not support that'.
  >--> The project Example seems to need me to work with a non-English
    language, but 'this project asks to be 'played in' a language which does
    not support that'.
  >--> The project Example seems to need me to know about a non-English
    language, 'English'. I can't find any definition for this language.
  >--> The kit Basicinformkit, which your source text makes use of, seems to
    have metadata problems. Specifically: the metadata contains a syntax error:
    'inform/inform7/Internal/Inter/BasicInformKit/kit_metadata.json: '.
  >--> The kit Architecture32kit, which your source text makes use of, seems to
    have metadata problems. Specifically: the metadata contains a syntax error:
    'inform/inform7/Internal/Inter/Architecture32Kit/kit_metadata.json: '.
  >--> The kit Commandparserkit, which your source text makes use of, seems to
    have metadata problems. Specifically: the metadata did not validate:
    '.is.title: expected  but found string'.
  >--> The kit Commandparserkit, which your source text makes use of, seems to
    have metadata problems. Specifically: the metadata did not validate:
    '.is.version: expected  but found string'.
  >--> The kit Commandparserkit, which your source text makes use of, seems to
    have metadata problems. Specifically: the metadata did not validate:
    '.needs[0].need.title: expected  but found string'.
  >--> The kit Commandparserkit, which your source text makes use of, seems to
    have metadata problems. Specifically: the metadata did not validate:
    '.needs[1].need.title: expected  but found string'.
  >--> The kit Commandparserkit, which your source text makes use of, seems to
    have metadata problems. Specifically: the metadata did not validate:
    '.needs[1].need.author: expected  but found string'.
  >--> The kit Commandparserkit, which your source text makes use of, seems to
    have metadata problems. Specifically: the metadata did not validate:
    '.activates[0]: expected  but found string'.
  >--> The kit Commandparserkit, which your source text makes use of, seems to
    have metadata problems. Specifically: the metadata did not validate:
    '.kit-details.has-priority: expected  but found number'.
  >--> The kit Commandparserkit, which your source text makes use of, seems to
    have metadata problems. Specifically: the metadata did not validate:
    '.kit-details.provides-kinds[0]: expected  but found string'.
  >--> The kit Commandparserkit, which your source text makes use of, seems to
    have metadata problems. Specifically: the metadata did not validate:
    '.kit-details.configuration-flags[0]: expected  but found string'.
  >--> The kit Commandparserkit, which your source text makes use of, seems to
    have metadata problems. Specifically: the metadata did not validate:
    '.kit-details.configuration-flags[1]: expected  but found string'.
  >--> The kit Commandparserkit, which your source text makes use of, seems to
    have metadata problems. Specifically: the metadata did not validate:
    '.kit-details.configuration-flags[2]: expected  but found string'.
  >--> The kit Commandparserkit, which your source text makes use of, seems to
    have metadata problems. Specifically: the metadata did not validate:
    '.kit-details.configuration-flags[3]: expected  but found string'.
    Problems occurring in translation prevented the game from being properly
    created. (Correct the source text to remove these problems and click on Go
    once again.)
Inform 7 has finished.

Aww, man! This problem was even more perplexing. register_tangled_text_literals compiles down to WebAssembly like this:

00b3dd func[62] <register_tangled_text_literals>:
 00b3de: a9 d7 01 7f                | local[0..27560] type=i32
 00b3e2: 41 c8 cb a3 80 00          | i32.const 583112
 00b3e8: 21 00                      | local.set 0
 00b3ea: 20 00                      | local.get 0
 00b3ec: 10 d3 80 80 80 00          | call 83 <Str__literal>
 00b3f2: 21 01                      | local.set 1
 00b3f4: 41 00                      | i32.const 0
 00b3f6: 21 02                      | local.set 2
 00b3f8: 20 02                      | local.get 2
 00b3fa: 20 01                      | local.get 1
 00b3fc: 36 02 b8 b3 be 80 00       | i32.store 2 1022392
 00b403: 41 f4 cb a3 80 00          | i32.const 583156
 00b409: 21 03                      | local.set 3
 00b40b: 20 03                      | local.get 3
 00b40d: 10 d3 80 80 80 00          | call 83 <Str__literal>
 00b413: 21 04                      | local.set 4
 00b415: 41 00                      | i32.const 0
 00b417: 21 05                      | local.set 5
 00b419: 20 05                      | local.get 5
 00b41b: 20 04                      | local.get 4
 00b41d: 36 02 bc b3 be 80 00       | i32.store 2 1022396
...

Extremely verbose, with an absurd number of locals (27561). It could be rewritten with no locals at all instead:

func[62] <register_tangled_text_literals>:
i32.const 0
i32.const 583112
call 83 <Str__literal>
i32.store 2 1022392
i32.const 0
i32.const 583156
call 83 <Str__literal>
i32.store 2 1022396
...

It turns out that having so many local variables causes wasm3 to execute code incorrectly. That seems like a bug – WebAssembly allows implementation limits, and says:

If the limits of an implementation are exceeded for a given module, then the implementation may reject the validation, compilation, or instantiation of that module with an embedder-specific error.

It doesn’t say “must”, which is odd; it’s unclear to me if this then allows implementations to produce incorrect behavior without diagnosing a limit being exceeded. In any case, this is undesirable. Fortunately, it ought to be possible to make Clang generate the code with fewer locals “just” by tacking on -O1 to the command line. Less fortunately, this not only causes the compilation time to balloon; Clang also requires more than 16 GB of RAM to complete the compilation. Another solution is needed. Patching register_tangled_text_literals might solve this problem, but register_tangled_text_literals is far from the only function with a ton of locals; Tokenisation::go has a whopping 103703, and that one is not nearly as easy to patch up on an ad-hoc basis.

Binaryen to the rescue: It can do the optimizations we need without taking too much time or memory. It is run as a separate pass after Clang runs. It takes about 8 seconds on my computer. register_tangled_text_literals is indeed brought down to 0 locals, and Tokenisation::go is brought down from 103703 to 81! Hierarchy::establish tops the list now at 4201 locals, but this “worst offender” is almost two orders of magnitude better than the old “worst offender”, so this should hopefully be OK. The command line I used is:

wasm-opt -O -g -o inform/inform7/Tangled/inform7.opt.wasm inform/inform7/Tangled/inform7.wasm

Next!

wasm3/build/wasm3 inform/inform7/Tangled/inform7.opt.wasm -at inform/inform7 test.ni
Inform 7 v10.2.0 has started.
I've now read your source text, which is 18 words long.
I've also read version 2 of Basic Inform by Graham Nelson, which is 8488 words long.
I've also read version 10.2.0 of English Language by Graham Nelson, which is 2330 words long.
I've also read version 7 of Standard Rules by Graham Nelson, which is 35330 words long.

  The 18-word source text has successfully been translated. There were 1 room
    and 2 things.
Inform 7 has finished.

It works! test.i6 has been created. wasm-opt also removed enough superfluous locals that we were able to drop the --stack-size, and we’re also able to drop d_m3MaxFunctionStackHeight back to the default of 2000! It takes about 22 seconds to translate, which isn’t great, but this is an interpreter and we’d be faster in a browser that actually JITs. The size of inform7.opt.wasm, after stripping, is also a (still rather sizable, but) much more reasonable 4.7 MB.

Not a full game yet: the I6 needs to be translated to Z-machine or Glulx. We’ll use Glulx. Fortunately, in comparison to all the trouble we had with Inform 7, Inform 6 compiles to WebAssembly without a hitch (albeit using all the lessons we’ve learned so far, plus one I used while debugging that I haven’t talked about in this post):

wasm32-wasip1-clang -o inform/inform6/Tangled/inform6.wasm -std=c11 -Wno-everything -g inform/inform6/Inform6/*.c -Wl,-z,stack-size=$((2*1024*1024)) -Wl,--Map=inform/inform6/Tangled/inform6.wasm.map && wasm-opt -O -o inform/inform6/Tangled/inform6.opt.wasm inform/inform6/Tangled/inform6.wasm

And then translating:

wasm3 inform/inform6/Tangled/inform6.opt.wasm -G test.i6

After 2 seconds, we get a test.ulx out!

From here, we could try to compile glulxe, as is normally done for Inform 7 (in inform/inform6/Tests/Assistants/dumb-glulx/glulxe/glulxe)… but there are far better options. The Web ecosystem already has an excellent Glulx and Z-machine runtime in the form of Parchment (source repository).

To me, this demonstrates that all the necessary pieces to make Inform 7 work in a browser are there. We’ve shown the Inform 7 and Inform 6 compilers running under WebAssembly. We still rely on a filesystem – to run in a browser, we’d need to virtualize this (which has precedent, but still needs to be done), and provide initial contents of the virtual filesystem from another file downloaded from the server. Likely, you’d want to load extensions on-demand, rather than downloading the world ahead of time – this could be done using inbuild to discover dependencies and downloading them. To get a real first-class Inform user experience, lots of work would need to go into building a UI with all the expected features (text editing and syntax highlighting, viewing the index, using the skein, problem viewing, documentation access, etc.). For release, Inblorb and other tools may also need to be translated.

To get here we also needed to hack at the Inform 7 source a bit. The modifications, all things considered, were fairly minimal, but doing them in a less hacky way (with a real PLATFORM_WASI define, for one) could make this a bit more palatable for upstreaming. The changes for WASI are completely unsuitable for intest, however (as well as some of inbuild’s non-“info” functionality); it’s not clear to me how to adapt intest for WASI compatibility.

I’m not planning to touch this again any time soon, but if anyone else feels like pursuing this, this might help kick off their efforts. I am happy to provide limited advice if so.
