expat binding for LambdaMOO

Overview

In 2000 I wrote ext-xml, a binding of expat to LambdaMOO (that link is to the relatively modern version of expat, not the version that existed in 2000). It provides two new functions, each of which accepts an XML document in a string, parses it using expat, and returns a LIST data structure representing the parsed XML.

Where is it?

The 1.0 distribution that was here is ancient and is a patch against a version of the server that is also ancient. You may have landed on this page because you followed a link directly to a file that is no longer here.

Fortunately, you can still get this binding in its more recent form. wp-lambdamoo is the GitHub project which contains the patched version of the server that Waterpoint runs. The differences between that branch and the trunk LambdaMOO source are:

  • this extension (ext-xml) and its subsequent bug fixes (ext-xml.c, plus the modifications to Makefile.in and extensions.c that build and register the extension)
  • WAIF and WAIF_DICT have been applied (these are unrelated to the XML parsing code; if you don’t want them, don’t simply build the server as it stands there).

There’s not a great deal of demand for an XML parsing patch for a creaky old textual MUD server, so I have not invested the effort to package and release the current version of the code. It seemed wise to take down the old, known-to-be-broken version, though, because one of the bugs since fixed could have led to UTF-8 data in the database.

You might be interested in codepoint, a version of LambdaMOO with Unicode support. The unicode-xml branch there has this extension integrated with the Unicode support.

Use

Once this patch is installed, the MOO server has two new builtins:

  • LIST xml_parse_tree(STR string)
  • LIST xml_parse_document(STR string)

Both return a list of the form: {STR tag, LIST attributes alist, STR text, LIST children}

  • STR tag ~ tag name
  • LIST attributes ~ alist of attributes {{STR key, STR value}, …}
  • STR text ~ text between the tags
  • LIST children ~ children of this node
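
As a minimal sketch of working with the attribute alist, the verb code below looks up one attribute by name. The verb name (say, $xml_utils:attr) and the choice of returning E_RANGE for a missing attribute are assumptions made for illustration and are not part of ext-xml; node is assumed to be a single {tag, attributes, text, children} list.

```moo
"Hypothetical helper, e.g. $xml_utils:attr(node, name).";
"Returns the value of the named attribute, or E_RANGE if it is absent.";
{node, name} = args;
for pair in (node[2])
  "equal() compares case-sensitively; == would treat upper and lower case as the same.";
  if (equal(pair[1], name))
    return pair[2];
  endif
endfor
return E_RANGE;
```

equal() is used rather than == because XML attribute names are case-sensitive, while MOO's == ignores case in strings.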

The difference between xml_parse_tree and xml_parse_document lies in where text between tags ends up. xml_parse_tree puts it all in the “text” element of the node. xml_parse_document puts it in the “children” element, as strings interleaved with the child nodes. This may be clearer with examples. The indentation of the return values is for illustrative purposes only.

;xml_parse_tree("<XML>abc</XML>") 
=> {{"XML",        -- tag name
     {},           -- attribute alist 
     "abc",        -- text
     {}}}          -- children

;xml_parse_tree("<list><item>foo</item><item>bar</item></list>") 
=> {{"list",       -- tag name of root
     {},           -- attributes
     "",           -- text
     {{"item",     -- tag name of first child
       {},         -- attributes of first child 
       "foo",      -- text of first child
       {}          -- children of first child 
      },         
      {"item",     -- tag name of second child
       {},         -- attributes of second child
       "bar",      -- text of second child
       {}          -- children of second child
      }           
     }             -- end of children of root
    }
   }
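
Picking out child nodes by tag name is a common next step. This is a sketch only: the verb name (say, $xml_utils:children_named) is hypothetical, and node is assumed to be a single {tag, attributes, text, children} list. The typeof() check lets the same code skip the bare strings that xml_parse_document places among the children.

```moo
"Hypothetical helper, e.g. $xml_utils:children_named(node, tag).";
"Returns the list of direct child nodes whose tag name is exactly `tag'.";
{node, tag} = args;
found = {};
for child in (node[4])
  "Skip text children (STR); only LIST children are element nodes.";
  if (typeof(child) == LIST && equal(child[1], tag))
    found = {@found, child};
  endif
endfor
return found;
```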

;xml_parse_document("<document coolness=\"yes\">The <EM>brown cow</EM> " +
                    "leapt over " +
                    "the <EM>blue moon</EM></document>") 
=> {"document",
    {{"coolness", "yes"}},
    "",
    {"The ",
     {"EM", {}, "", {"brown cow"}},
     " leapt over the ",
     {"EM", {}, "", {"blue moon"}}
    }}
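
Because xml_parse_document mixes strings and nodes in the children list, code that consumes its result has to check the type of each child. The following is a sketch under assumptions: the verb name (say, $xml_utils:text_of) is hypothetical, and every child is assumed to be either a STR (text) or a LIST (child node), as in the example above.

```moo
"Hypothetical helper, e.g. $xml_utils:text_of(node).";
"Concatenates all text found under a node, in document order.";
{node} = args;
result = node[3];
for child in (node[4])
  if (typeof(child) == STR)
    "Text placed directly in the children list by xml_parse_document.";
    result = result + child;
  else
    "A child node: recurse to collect its text.";
    result = result + this:text_of(child);
  endif
endfor
return result;
```

Starting from node[3] means the same verb also collects text from xml_parse_tree results, where text lives in the “text” element, although the position of that text relative to the child nodes is not preserved there.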

License

Copyright 2000 by Ken Fox.

Permission to use, copy, modify, and distribute this software and its documentation for any purpose and without fee is hereby granted, provided that the above copyright notice appear in all copies and that both that copyright notice and this permission notice appear in supporting documentation.

KEN FOX DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE, INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS, IN NO EVENT SHALL KEN FOX BE LIABLE FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.