In Pursuit of Perfect Code: Multiplayer Script Debugging

January 14, 2013

by John Shedletsky


Remote Error Monitoring System

Now you’ve done it. You got interested in this “Lua scripting” thing you keep hearing about on our forums and reading about on our blog. You tinkered with it for a while. You even got to the point where you were able to understand enough to make your own online game. You’re like Neo, from the Matrix, crafting virtual realities from little glowy bits of green code. You are invincible. Your game gets super popular. It’s great.

Then the bug reports start to flow in. You get hundreds of private messages a day that all have the same general subject line: “dude your game is broken fixfixfixfix”. You know a little bit about debugging, so you start your level in ROBLOX Studio and run it locally using Test –> Start Server & Test –> Start Player. You have the Output window open, so you can see script errors as they occur. You know that once you see what the error message is, it is typically easy to fix. Reproducing and isolating the bug is most of the work. This time, however, you’re stuck. The bug only happens in multiplayer mode, on the production site, ROBLOX.com. What do you do?

Remote Debugging to the Rescue

Console

Tyler Mullen, of the ROBLOX Content Team, has developed a new tool that collects information from your running games. It can save and track:

  • Server uptime
  • Player visit stats: joins, leaves, average playtime, and number of unique visitors
  • Lua statistics: number of errors, warnings, and number of scripts running
  • Lua stack traces for recorded errors (this is the exciting part)

The best part of this tool is that it will record data even when you are not in your game. If you have a bug in your map that only happens every 10 hours, you can let your game run all day, then take a peek and see what’s going on.

How to Install the Tool

First, grab a free copy of this model. Open ROBLOX Studio and insert it into the Workspace of the place you want to track. Then expand the model in the explorer and find the AdminList StringValue. Add your username to the list. You can have as many users as you want – these are players that will see the Place Console when they visit the tracked place (careful: these users can shut your place down). Publish your map to ROBLOX. The next time you join the map, you’ll get the Place Console.

Caveats:

  • There is currently no way to minimize the Place Console, though you can drag it out of the way. This is a very early version of the tool; eventually, its functionality will be built into ROBLOX.
  • The scripts used here are digitally signed by ROBLOX. If you edit any of them, the tool will stop working, because it depends on several high-privilege functions.

How to Use the Remote Error Logging System to Debug Scripts

This is the cool part. The tool you just installed will capture script errors, along with a stack trace for each one. How is this useful to you? When a place is “broken” online, 95% of the time it’s broken because one of the game scripts in it has stopped running. This happens when a script encounters an error, unless the erroneous block of code is wrapped in a pcall, which would be an interesting topic for another article altogether. The stack trace can show you exactly what your scripts were doing before they errored out.
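To make the pcall idea concrete, here is a minimal sketch in plain Lua. The handler name onPlayerDied and the table standing in for a character are made up for illustration; they are not taken from any real game script. The point is only that an error raised inside pcall comes back as a return value instead of stopping the script.

```lua
-- Hypothetical handler, standing in for a real game script.
local function onPlayerDied(character)
  -- Raises an error if character.Torso is nil, because nil cannot be indexed.
  return character.Torso.Position
end

-- Called directly, the error below would stop this script. Wrapped in pcall,
-- it is returned as (false, message) and the script keeps running.
local ok, result = pcall(onPlayerDied, {}) -- {} has no Torso
if not ok then
  print("handler failed: " .. tostring(result))
end

-- When nothing goes wrong, pcall returns true followed by the normal results.
local ok2, position = pcall(onPlayerDied, { Torso = { Position = "0, 10, 0" } })
print(ok2, position)
```

The trade-off is that a pcall hides the error from the Output window unless you print or log the message yourself, so it is worth recording it somewhere.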

ErrorConsole

As an experiment, I added the remote error logging system to Sword Fight on the Heights IV. I thought that the code in this level was relatively simple, and for the most part the level runs without problems, so I didn’t expect to see many errors. I had quite a few, it turns out.

An error in Sword Fight on the Heights IV breaks the part of the code that causes players to drop all the items that they are carrying when they get bloxxed. Here is the stack trace:

Workspace.LeaderboardV3, line 78 – global dropWeapons

Workspace.LeaderboardV3, line 125 – global onHumanoidDied

Workspace.LeaderboardV3, line 136.

In the tool this is all compressed into one line. The topmost entry (or the first one, if you’re reading left to right in the tool) is the script and line number where the error happened. The remaining entries are the callers: the chain of function calls that led to that line. This can be extremely helpful in diagnosing complex situations, as the error may have happened in shared code that has numerous callers. The actual error, in this case, was:

Torso is not a valid member of Model
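If you want to see how a trace like this gets built, the sketch below reproduces the same shape of call chain in plain Lua and captures the trace with xpcall and debug.traceback, both standard Lua functions. The function bodies are simplified stand-ins that borrow the names from the trace above; this is not the actual LeaderboardV3 code, and it is not how the tool itself collects traces.

```lua
-- Simplified stand-ins named after the functions in the trace above.
local function dropWeapons(character)
  -- Raises an error when character.Torso is nil.
  local position = character.Torso.Position
  print("dropping weapons at", position)
end

local function onHumanoidDied(character)
  -- Called as a plain statement (not "return dropWeapons(...)"), so this
  -- frame is still on the stack when the error happens.
  dropWeapons(character)
end

-- xpcall runs the function and, on error, passes the message to the handler.
-- debug.traceback appends the chain of active calls to that message.
local ok, trace = xpcall(function()
  onHumanoidDied({}) -- {} has no Torso
end, debug.traceback)

print(ok)
print(trace)
```

Reading the printed trace from top to bottom gives the same story as the tool: the line that failed first, then each caller in turn.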

If you see an error like this, get psyched–these types of errors are easiest to fix because they are commonly caused by code assuming that an object will exist (and when it doesn’t, it causes the error). I opened my script in ROBLOX Studio, navigated to line 78 and noticed a very unsafe-looking reference to humanoid.Parent.Torso…

ErrorLine

This code is unsafe because it assumes that a part called “Torso” will be found in the humanoid’s parent. If this part is missing, this line of code will error out.

In my case, I suspect the condition is triggered when the character falls off the edge of the world. By the time my code gets called, the character’s Torso has already been removed from the Workspace. If this happens, it doesn’t matter whether he drops his weapons, since no one is going to be able to pick them up. So the fix is very easy: I simply test that the Torso exists before I bother trying to drop his weapons.

ErrorFix
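For readers who can’t see the screenshots, here is a rough sketch of the same kind of fix. It is not the actual LeaderboardV3 code. To keep it runnable outside ROBLOX, newModel is a made-up mock that imitates a Model: indexing a missing child raises an error, while FindFirstChild (a real ROBLOX method, imitated here) returns nil. The two dropWeapons functions are hypothetical, simplified versions.

```lua
-- Mock of a ROBLOX Model, for illustration only.
local function newModel(children)
  local model = {}
  function model:FindFirstChild(name)
    return children[name] -- nil when the child does not exist
  end
  return setmetatable(model, {
    __index = function(_, name)
      local child = children[name]
      if child == nil then
        error(name .. " is not a valid member of Model", 2)
      end
      return child
    end,
  })
end

local dropped = {} -- positions where weapons were dropped

-- Unsafe: assumes Torso exists, and errors when it does not.
local function dropWeaponsUnsafe(character)
  local torso = character.Torso
  dropped[#dropped + 1] = torso.Position
end

-- Safe: checks for Torso first and quietly gives up if it is gone.
local function dropWeaponsSafe(character)
  local torso = character:FindFirstChild("Torso")
  if torso == nil then
    return false
  end
  dropped[#dropped + 1] = torso.Position
  return true
end

local intact = newModel({ Torso = { Position = "0, 10, 0" } })
local fallen = newModel({}) -- Torso already removed

print(dropWeaponsSafe(intact), dropWeaponsSafe(fallen))
print(pcall(dropWeaponsUnsafe, fallen))
```

In a real script the check would be made against the humanoid’s parent rather than a mock, but the pattern is the same: look the object up, test the result, and only then use it.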

What Next?

Writing perfect code is hard. Actually, there is strong evidence that it might be theoretically impossible, so don’t feel bad if your script has bugs in it. I got lucky with this particular bug. Most bugs you find will probably be bad references, similar to the one I had, but it’s possible to have bugs that don’t crash scripts, which makes them much harder to find. You can get logic bugs (where the code isn’t doing what you think it is), deadlocks (where two scripts are halted waiting for each other to finish), or race conditions (where two scripts interfere with each other because they are both manipulating the same objects). In these cases, it’s best to remove variables and simplify your program until you can diagnose the problem accurately. Fixing the common bugs in your levels first also makes the rare ones easier to spot.

For instance, I know that the teleport gates in Sword Fight on the Heights occasionally break. It doesn’t happen very often, and I didn’t see it happen in any of my instrumented levels. However, with this tool, I’m going to go back and fix all my trivial bugs, even if they hardly ever happen. Then when I load up Sword Fight on the Heights and see a stack trace, I’ll know I’ve struck gold.