Sandbox: Embedding PHP as a Scripting Language *in* PHP

While working on Griddle some more, I decided that to make it a viable platform for building a virtual world (in my case, a multiplayer RPG), I would need some way to script objects to do things.

Doing this with PHP objects that are ultimately serialized and stored in a database is potentially hazardous, as you’ll need to:

  1. Write a new class for each object type (each NPC, rock, tree, etc.), which is time-consuming and costly to process.
  2. Write some very robust serialization functions to ensure everything updates properly as you alter the classes involved.
  3. Deal with PHP errors related to improper unserialization.

For those of you who do not know what “serialization” means in this context, it is simply the process of taking a digital object and reducing it to something that can be stored in a database (usually a long string of characters) and later turned back into the object itself in the computer’s memory when needed.
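In PHP terms, that round trip might look something like the sketch below. The `Rock` class and its properties are made up for illustration (Griddle's real classes aren't shown here); `serialize()` and `unserialize()` are the standard PHP functions.

```php
<?php
// Hypothetical stand-in for a world object; not Griddle's actual class.
class Rock
{
    public $name = 'Mossy Rock';
    public $x = 3;
    public $y = 7;
}

$rock = new Rock();

// "Flat-pack" the object into a plain string that fits in a database column.
$stored = serialize($rock);

// Later: rebuild the object in memory from the stored string.
// This only works cleanly if a compatible Rock class is still defined.
$restored = unserialize($stored);

echo $restored->name, ' at ', $restored->x, ',', $restored->y, PHP_EOL;
```

If the `Rock` class is renamed or restructured between the two calls, the stored string no longer matches the class, which is exactly the kind of breakage described in the list above.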

Think of when you get a desk from Ikea or the like and it comes in a nice flat-pack container, perfect for transport or storage. The desk in that form is “serialized” and when you put it together you’ve “unserialized” it; however, since it’s Ikea you can always break it down and “serialize” it again. 🙂

Extending this metaphor further, having to keep a different class for each different object would be like trying to assemble a bunch of different pieces of furniture from different sets of instructions, and if the models of furniture change and the instructions on how to put them together aren’t properly updated, it quickly becomes a mess.

What would be easiest is if you have *one* model with *one* set of instructions that you can, with a little practice, put together and take apart easily.

Scripting Languages

One of the ways to make this happen is to devise a way to extend the functionality of an object without altering its underlying structure in the computer, and the easiest way to do that is via a scripting language of some sort (i.e. a set of editable instructions that the object can store in a regular manner).

Nearly all well-developed virtual world systems that I’ve worked with have a scripting language. Second Life has LSL (Linden Scripting Language), whose scripts can be compiled to run on Mono (an open-source implementation of the ECMA-standardized .NET runtime), and Metaplace (now defunct) and Blue Mars chose variants of Lua (a lightweight embeddable scripting language).

I’ve come to like Mono and Lua a lot in my dabbling with them; however, neither one of them snugly fits what I’ve wanted to accomplish with Griddle. I want to stick to JavaScript/jQuery on the client side and PHP/MySQL on the server side so that this will be able to run virtually *anywhere* without any fancy configuration. With shared hosting providers like GoDaddy, installing the software needed to make Mono or Lua useful on the server is next to impossible.

The only other real option I could think of would be to write my own, but doing so in PHP would be slow as molasses and not very robust.

It was then that I had a crazy idea: If PHP is the limiting factor, why not let PHP be the scripting language?

I Dub Thee “Sandbox” – PHP Embedded in PHP

In order to make a scripting language *work* it needs to be:

  1. Quick, as slow scripts frustrate users.
  2. Easy to learn, so people will use it.
  3. Locked down and isolated so that a malicious hacker can’t gain control of your server.

PHP fits the bill on 1 and 2, but 3 is tricky. The only well-known way to lock down PHP is to edit its initialization file (php.ini), which lets you turn functions and features on and off. However, those settings apply to all code that’s executed. If I wanted to, say, block a user from accessing the database this way, I would also end up blocking my own programs from accessing the database (which would be of absolutely no help).

What I needed was that extra layer of control where I could “reach into” their code, but they couldn’t “reach into” mine.

It was then that I learned about the PHP Tokenizer functions. Tokenization is a process by which a computer takes programming language code and breaks it up into its constituent parts for interpretation. Since the tokenizer functions in PHP are actually part of the PHP engine, they’re lightning fast.
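To get a feel for what the tokenizer hands back, here is a small sketch using the standard `token_get_all()` and `token_name()` functions. The one-line `$code` string is just an invented example of user code.

```php
<?php
// A tiny piece of user code, held as a string rather than executed.
$code = '<?php $hp = max(1, $hp - 5);';

foreach (token_get_all($code) as $token) {
    if (is_array($token)) {
        // Array tokens are [token id, source text, line number].
        if ($token[0] === T_WHITESPACE) {
            continue; // Skip whitespace to keep the listing readable.
        }
        echo str_pad(token_name($token[0]), 16), trim($token[1]), PHP_EOL;
    } else {
        // Single-character tokens such as "=", "(" or ";" come back as plain strings.
        echo str_pad('(char)', 16), $token, PHP_EOL;
    }
}
```

The interesting part for a sandbox is that the function name `max` shows up as its own `T_STRING` token and `$hp` as a `T_VARIABLE` token, so each name can be inspected before anything runs.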

My idea was to build a system that could accept raw PHP code, tokenize it, and then compare the tokens to a set of parameters to determine what was allowable and what was not. If it found anything “bad” it would reject the code. Otherwise it would allow it to be executed.
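A minimal sketch of that accept-or-reject pass might look like the following. The function name `sandbox_check()` and the particular list of banned constructs are my own assumptions for illustration, not a finished rule set.

```php
<?php
// Hypothetical rule set: language constructs the checker refuses outright.
const SANDBOX_BANNED_TOKENS = [
    T_EVAL, T_INCLUDE, T_INCLUDE_ONCE, T_REQUIRE, T_REQUIRE_ONCE,
    T_EXIT, T_HALT_COMPILER, T_CLOSE_TAG, T_INLINE_HTML,
];

// Returns a list of problems; an empty array means nothing "bad" was found.
function sandbox_check(string $userCode): array
{
    $problems = [];
    // Prepend the open tag so the tokenizer treats the input as PHP code.
    foreach (token_get_all('<?php ' . $userCode) as $token) {
        if (is_array($token) && in_array($token[0], SANDBOX_BANNED_TOKENS, true)) {
            $problems[] = sprintf('%s on line %d', token_name($token[0]), $token[2]);
        } elseif ($token === '`') {
            // Backticks run shell commands, so they are rejected too.
            $problems[] = 'backtick operator';
        }
    }
    return $problems;
}

$problems = sandbox_check('$x = 1 + 2; eval("echo 1;");');
echo $problems ? 'Rejected: ' . implode(', ', $problems) : 'Accepted', PHP_EOL;
```

A check like this only sees what is literally written in the source. Calls made through a variable (for example `$f = 'system'; $f();`) would need additional rules.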

My next question, though, was “How the heck am I going to lock everything down?” Different PHP installs have different variables and modules, and since Griddle is a distributed system that may have different modules running on different machines, how am I going to nail every possibility? It was then that I remembered four of PHP’s built-in functions:

  1. get_defined_constants()
  2. get_defined_functions()
  3. get_defined_vars()
  4. get_declared_classes()

Each of these returns a list of what it describes, so by using them I could determine exactly what a user *would* have access to, and block it *all* out, making PHP a blank slate. The checker would only give its thumbs-up to user-defined code that did not use *any* of the PHP libraries or built-in variables and functions. A step in the right direction!
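Roughly, the "blank slate" could be built like this. The helper names `sandbox_build_denylist()` and `sandbox_find_denied()` are hypothetical. One detail worth knowing: `get_defined_vars()` only reports the scope it is called from, so this sketch calls it at the top level and passes the result in.

```php
<?php
// Collect every name this PHP install exposes so that nothing is reachable by default.
function sandbox_build_denylist(array $definedVars): array
{
    $functions = get_defined_functions(); // ['internal' => [...], 'user' => [...]]
    $names = array_merge(
        $functions['internal'],
        $functions['user'],
        get_declared_classes(),
        array_keys(get_defined_constants())
    );
    // Lowercasing everything is stricter than PHP itself (constants are
    // case-sensitive), which is acceptable for a deny list.
    $deny = array_fill_keys(array_map('strtolower', $names), true);
    foreach (array_keys($definedVars) as $var) {
        $deny['$' . $var] = true; // Variable names stay case-sensitive.
    }
    return $deny;
}

// Report every denied name that appears in the user's code.
function sandbox_find_denied(string $userCode, array $deny): array
{
    $hits = [];
    foreach (token_get_all('<?php ' . $userCode) as $token) {
        if (!is_array($token)) {
            continue;
        }
        [$id, $text] = $token;
        if ($id === T_STRING && isset($deny[strtolower($text)])) {
            $hits[] = $text;
        } elseif ($id === T_VARIABLE && isset($deny[$text])) {
            $hits[] = $text;
        }
    }
    return array_values(array_unique($hits));
}

$deny = sandbox_build_denylist(get_defined_vars());
// strlen and fopen are both built-ins, so both are reported here.
print_r(sandbox_find_denied('$n = strlen("abc"); $f = fopen($path, "r");', $deny));
```

Because the list is generated on the machine the code runs on, it automatically reflects whatever modules that particular install has loaded.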

Add onto that a “white list” of allowed parameters, and I now have complete control over the environment in which the code executes: A Sandbox. 🙂
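Carving a white list back out of that blank slate could be as simple as the sketch below. The entries in `$allowed` are only an example of what one might permit, not a recommendation.

```php
<?php
// Hypothetical white list: the only built-in names scripts may use.
$allowed = ['true', 'false', 'null', 'count', 'strlen', 'abs', 'min', 'max'];

// Start from "everything this install defines" ...
$functions = get_defined_functions();
$everything = array_map('strtolower', array_merge(
    $functions['internal'],
    get_declared_classes(),
    array_keys(get_defined_constants())
));

// ... and remove the white-listed names. Whatever remains is forbidden.
$forbidden = array_fill_keys(array_diff($everything, $allowed), true);

var_dump(isset($forbidden['strlen'])); // bool(false): white-listed
var_dump(isset($forbidden['fopen']));  // bool(true): built-in but not white-listed
```

Keeping the white list short means new PHP modules installed on a host stay blocked until someone deliberately allows them.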

Control Structures and Choices

From here I’ll need to figure out how I want to give the user access to World data, as well as under what conditions the script is triggered.

A PHP script simply runs from top to bottom (each line executed in sequence, one at a time), quite in contrast to the event-driven LSL and the flexibility of Lua, so I would need to figure out some method of passing events to the script and letting it store its own “session variables” for later retrieval.

What I’m thinking about is a series of callback functions (such as “onClick,” “onBump,” etc.) that will be triggered at the appropriate times (much like how LSL is structured). These would then get access to a few “wrapper classes” that give the user some access to the object’s data and the means to reach other nearby objects. For example, a code snippet that causes the object to chat “Hello there, [name]! I’ve greeted someone like that [x] times.” every time someone clicks on it could be:

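// Script fragment: assumed to be wrapped in a class body by the sandbox,
// so "var" and "$this" are valid here.
var $times = 0;

function onClick($clicker) {
    $this->times++;
    // Double quotes keep the apostrophe in "I've" from ending the string early.
    $this->chat("Hello there, " . $clicker["name"] . "! I've greeted someone like that " . $this->times . " times.");
}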

Or an object script designed to move within 2 squares of another object and stop:

var $targetId = "[target's id]";

// Tick handler: step toward the target while it is more than 2 squares away.
function onTick() {
    global $World;
    $targetPosition = $World->findObject($this->targetId);
    if ( distance_between($this->getPosition(), $targetPosition) > 2 ) {
        $this->moveTowards($targetPosition);
        $this->resetTick();
    }
}
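How a script like the first one might actually get hooked up to events is still open, but one possible sketch is below. Everything in it is an assumption on my part: the `ScriptedObject` base class, the `fire_event()` helper, and the choice to wrap the vetted script in a class via `eval()` are illustrative only.

```php
<?php
// Hypothetical base class supplying the wrapper methods scripts call via $this.
abstract class ScriptedObject
{
    public $log = [];

    public function chat(string $message): void
    {
        $this->log[] = $message; // Stand-in for sending chat to nearby players.
    }
}

// Script source as it might be stored with the object (already vetted by the checker).
$script = <<<'PHP'
var $times = 0;

function onClick($clicker) {
    $this->times++;
    $this->chat('Hello there, ' . $clicker["name"] . '! Greeting number ' . $this->times . '.');
}
PHP;

// Wrap the script in a uniquely named class so "var" and "$this" become valid PHP.
$className = 'Script_' . md5($script);
if (!class_exists($className, false)) {
    eval("class {$className} extends ScriptedObject {\n{$script}\n}");
}
$object = new $className();

// Fire an event only if the script defined a handler for it.
function fire_event(ScriptedObject $object, string $event, ...$args): void
{
    if (method_exists($object, $event)) {
        $object->$event(...$args);
    }
}

fire_event($object, 'onClick', ['name' => 'Steve']);
fire_event($object, 'onClick', ['name' => 'Steve']);
fire_event($object, 'onBump'); // No handler defined, so nothing happens.
print_r($object->log);
```

Since `$times` ends up as an ordinary property of the object, it would ride along with the rest of the object when it is serialized, which is one way the “session variables” could persist between events.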

Anyways, you probably get the idea: PHP embedded (and locked down in a sandbox) in PHP. 🙂

More later. Time for sleep!

-Steve
