Lua is a powerful, lightweight scripting language that is often used in video games because it executes quickly and is easy to learn. World of Warcraft, Angry Birds and Mafia II all use Lua scripting. Today, Erik Cassel, Chief Scientist at ROBLOX, discusses a way to optimize Lua.
A Short Lesson
Any programming language has its little tricks for squeezing out better performance. Today, we will show you a way to make some Lua code faster. This guide is intended for more advanced scripters. I’ll also discuss the pros and cons of code optimization.
Syntactic sugar makes code easier to read, but it can hide some of what happens under the hood. Take a member function call such as terrain:SetCell(...). Pulling out the sugar gives terrain.SetCell(terrain, ...), which shows that the script actually does two things. First, it looks up a function named ‘SetCell’ in terrain’s internal list of members. Then, the script calls that function with terrain as the first argument.
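The two forms can be compared side by side in plain Lua. This is only a sketch: the terrain table below is a hypothetical stand-in, not the real ROBLOX object, and the arguments and cell values are made up for illustration.

```lua
-- Hypothetical stand-in for the terrain object; not the real ROBLOX API.
local terrain = { cells = {} }

function terrain.SetCell(self, x, y, z, value)
  self.cells[x .. "," .. y .. "," .. z] = value
end

-- With sugar: the colon passes terrain as the first argument implicitly.
terrain:SetCell(1, 2, 3, "grass")

-- Without sugar: look up "SetCell" in terrain, then call the result
-- with terrain as the explicit first argument.
terrain.SetCell(terrain, 4, 5, 6, "rock")

print(terrain.cells["1,2,3"], terrain.cells["4,5,6"])
```

Both calls reach the same function and store their value the same way; only the spelling differs.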
What It Means
The member lookup isn’t cheap. It takes more time than a function lookup in many other languages like C++, Java or C#. I ran some benchmarks and found that it can take up 33% of the script’s execution time.
Let’s make the code more interesting:
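A minimal sketch of such a loop, assuming a 100 by 100 grid. The terrain object here is a hypothetical stand-in built from a metatable so that every member lookup can be counted. The real ROBLOX object and the arguments to SetCell will differ.

```lua
-- Hypothetical stand-in: an empty proxy table whose member lookups all
-- go through __index, so they can be counted.
local lookups, calls = 0, 0
local methods = {
  SetCell = function(self, x, y, z, value) calls = calls + 1 end,
}
local terrain = setmetatable({}, {
  __index = function(_, key)
    lookups = lookups + 1
    return methods[key]
  end,
})

for x = 1, 100 do
  for z = 1, 100 do
    -- Each iteration looks up "SetCell" in terrain, then calls it.
    terrain:SetCell(x, 1, z, "grass")
  end
end

print(lookups, calls)  -- both counters end at 10000
```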
This code calls SetCell 10,000 times, which means it does 10,000 lookups of the SetCell member function. There must be something we can do to make this more efficient.
Think back to the “syntactic sugar” above. We can safely assume that the definition of the SetCell function is the same each time, right? All we want to do is call the same function 10,000 times with a different set of arguments. Here is how we do this:
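A sketch of the idea, using a hypothetical stand-in for terrain that counts member lookups (it is not the real ROBLOX object, and the SetCell arguments are made up):

```lua
-- Hypothetical stand-in: every member lookup goes through __index.
local lookups, calls = 0, 0
local methods = {
  SetCell = function(self, x, y, z, value) calls = calls + 1 end,
}
local terrain = setmetatable({}, {
  __index = function(_, key)
    lookups = lookups + 1
    return methods[key]
  end,
})

-- Optimization: look up SetCell once instead of on every iteration.
-- Invariant: terrain.SetCell refers to the same function for the
-- whole duration of the loop.
local setCell = terrain.SetCell

for x = 1, 100 do
  for z = 1, 100 do
    -- No colon sugar here, so terrain must be passed explicitly.
    setCell(terrain, x, 1, z, "grass")
  end
end

print(lookups, calls)  -- 1 lookup, 10000 calls
```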
Now we do the ‘SetCell’ lookup only once, and call SetCell 10,000 times. That’s a lot less work to do!
Notice how I put some comments in the code, including an invariant, which states an assumption I’m making about terrain.SetCell being immutable. Whenever you replace a simple, elegant expression with something clever and more complicated, be sure to explain why.
Another thing to note: I very consciously declared the “setCell” variable as a local variable. There are two good reasons to do so:
- Local variables don’t clobber global variables. What if some other code uses a global “setCell” variable? Our definition will overwrite it! When in doubt, use the “local” keyword.
- Local variables are usually faster. My benchmark revealed that using a global variable cut the speed boost from 33% to 20%, giving back more than a third of the gain made by our optimization. This is because a global variable is itself syntactic sugar for a table lookup, _G['setCell']. Here is what happens under the hood if you don’t declare the variable as local:
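The global case can be sketched as follows, assuming standard Lua with its default global environment. The terrain table is again a hypothetical stand-in, and the global is cleaned up at the end so the example does not clobber anything.

```lua
local calls = 0
local terrain = {}  -- hypothetical stand-in, not the real ROBLOX object
function terrain.SetCell(self, x, y, z, value) calls = calls + 1 end

-- Without "local", this assignment creates a global variable.
setCell = terrain.SetCell

for i = 1, 100 do
  -- What you write:
  setCell(terrain, i, 1, 1, "grass")
  -- Roughly what runs: a lookup by name in the globals table each time.
  _G["setCell"](terrain, i, 1, 2, "grass")
end

local sameFunction = (_G["setCell"] == terrain.SetCell)
print(calls, sameFunction)

setCell = nil  -- remove the global again
```

The member lookup on terrain is gone, but a lookup by name in the globals table has taken its place on every iteration.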
Here Be Dragons!
OK, now you’re ready to modify all your code and make it faster, right? Not so fast!
In the vast majority of cases, this optimization won’t improve performance at all. In fact, it might hurt performance. It only helps in large, tight loops. Also, keep in mind that the function you are calling might be expensive. If the function you are calling takes a lot more time than the lookup, then you’ve made your code more complicated without optimizing anything.
Here is an example:
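A sketch of what such a misguided optimization might look like. The model table and its Clone function are hypothetical stand-ins for an object that is expensive to copy. They are not the real ROBLOX API.

```lua
-- Hypothetical stand-in for an object that is expensive to copy.
local model = { parts = {} }
for i = 1, 1000 do
  model.parts[i] = { id = i }
end

function model.Clone(self)
  local copy = { parts = {}, Clone = self.Clone }
  for i, part in ipairs(self.parts) do
    copy.parts[i] = { id = part.id }
  end
  return copy
end

-- "Optimized" version: the lookup is cached, but copying 1000 parts
-- per call dominates the cost, so little or nothing is gained.
local clone = model.Clone
local copies = {}
for i = 1, 10 do
  copies[i] = clone(model)
end

-- The plain form is easier to read:
--   copies[i] = model:Clone()
print(#copies, #copies[1].parts)
```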
All you’ve done is make your code less readable. The Clone function is rather complicated and takes a long time. The time saved avoiding one ‘Clone’ lookup pales in comparison.
Just because something can be optimized doesn’t mean you should do so:
“Premature optimization is the root of all evil (or at least most of it) in programming.” (Donald Knuth)
(I could list a lot of roots of evil in programming, but they will have to wait for another time.) Before optimizing code, look for a problem worth optimizing.
Some time ago I created a Conway’s Game of Life place. I wrote it specifically as a stress test. It makes a lot of terrain changes and runs a lot of Lua code:
Notice the “Script Performance” panel. You can find it under the Tools menu in ROBLOX Studio. My script is taking 35% of the CPU. I decided to try our optimization on this place.
To confirm that the optimization was worth it, you must run before-and-after tests. If the optimization attempt did not help, then rip it out. Cleaner code is more important, and will make it easier for you to consider alternate optimizations later. Here is the result after optimization:
The CPU load went from 35% down to 27%, a drop of 8 percentage points, or a little over a fifth of the script’s original CPU usage.
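Outside ROBLOX Studio, a rough before-and-after comparison can be sketched with os.clock from the standard Lua library. The terrain stand-in, the measure helper and the iteration count are all assumptions made for illustration. Real numbers depend on the object being called and on the Lua version, so no timings are claimed here.

```lua
-- Hypothetical stand-in whose member lookups go through __index.
local methods = { SetCell = function(self, x, y, z, value) end }
local terrain = setmetatable({}, {
  __index = function(_, key) return methods[key] end,
})

-- Hypothetical helper: CPU seconds spent running fn once.
local function measure(fn)
  local start = os.clock()
  fn()
  return os.clock() - start
end

local N = 1000000

local before = measure(function()
  for i = 1, N do
    terrain:SetCell(i, 1, 1, "grass")
  end
end)

local after = measure(function()
  local setCell = terrain.SetCell  -- invariant: SetCell does not change
  for i = 1, N do
    setCell(terrain, i, 1, 1, "grass")
  end
end)

print(string.format("before: %.3fs  after: %.3fs", before, after))
```

Run it several times before drawing conclusions, and if the "after" figure is not clearly better, keep the simpler code.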
To get the best performance out of any language, you need to know some things about how it works and when to apply optimizations. Because Lua is an interpreted language, it frequently does string lookups at run-time, and avoiding these lookups may improve performance in some cases. Be sure to document your optimizations, because they are rarely intuitive and often make the code more complicated. Don’t substitute one inefficiency for another, as in the local/global discussion above. Finally, make sure your change truly makes a difference. If it doesn’t, look elsewhere for optimizations. We hope this helps!