Engineering ROBLOX for the iPad, Part 2 (Memory Optimization)
by Andrew Haak
One of the most important parts of developing a high-quality ROBLOX experience for the iPad is ensuring smooth, stable game-play performance. iPads are less powerful than almost any modern desktop or laptop computer, which means our developers have to dig deep into ROBLOX’s code, uncover problem areas and tune them to run more efficiently, while keeping game-play quality in mind. The end goal is to have quality and performance exist in harmony; the challenge is pushing performance optimization to its limit without noticeably degrading the experience.
For the past month, the Client Team has been neck-deep in ROBLOX’s source code, identifying inefficiencies and re-engineering them for measurable performance gains. One of the best benchmarks for illustrating their collective progress is Crossroads, a classic level the team has been using as an iPad testing ground since September. When we first launched the ROBLOX code stack on an iPad, Crossroads with eight players ran at an unplayable five frames per second (FPS). Today, it runs at a cool 30+ FPS.
Part of achieving 30 FPS was reducing ROBLOX’s memory footprint from roughly 500 megabytes to 256 megabytes – the upper limit for stability on iPad (and half of the total memory on an iPad 2, one quarter of the total memory on most computers from the last 10 years). In this article, you’ll see some of the ways we reduced the weight of ROBLOX.
Texture manipulation and sound compression
Everything you see and hear in ROBLOX eats up system memory (RAM). Software Engineer Daniel Ignatoff focused on optimizing and reducing the size of textures and sounds to be iPad-optimal, resulting in significant file-size and load-time improvements:
- Texture loading: 100-200 milliseconds to 20-25 milliseconds
- OGG to MP3 conversion: 5 megabytes of RAM saved
- WAV to MP3 conversion: File sizes reduced to roughly 1 KB
The iPad rendering system can only handle textures whose dimensions are each a power of two (e.g., 2×2, 16×16, 512×512). This causes inefficiency in some cases; ROBLOX’s shirt-and-pants texture, for example, is slightly larger than 512×512 (it’s 585×559), so the iPad scales the texture up to the next power-of-two dimensions: 1024×1024. Rather than roughly 327,000 pixels, the scaled-up texture uses roughly 1.049 million pixels, and those pixels eat precious RAM.
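The arithmetic behind those pixel counts can be reproduced with a few lines of Python. This is only an illustrative sketch, not ROBLOX code; the helper name `next_power_of_two` is made up for this example.

```python
def next_power_of_two(n: int) -> int:
    """Return the smallest power of two that is >= n (for n >= 1)."""
    if n < 1:
        raise ValueError("dimension must be positive")
    # (n - 1).bit_length() is the number of bits needed to represent n - 1,
    # so shifting 1 by that amount gives the next power of two at or above n.
    return 1 << (n - 1).bit_length()


# The shirt-and-pants texture from the article.
width, height = 585, 559
padded_width = next_power_of_two(width)    # 1024
padded_height = next_power_of_two(height)  # 1024

original_pixels = width * height               # 327,015
padded_pixels = padded_width * padded_height   # 1,048,576

print(f"{width}x{height} -> {padded_width}x{padded_height}")
print(f"{original_pixels:,} pixels -> {padded_pixels:,} pixels")
```

A dimension that is already a power of two (512, for instance) is returned unchanged, which is why only textures slightly over a boundary pay this penalty.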
Because many image formats cannot be used by graphics cards directly, a copy of the texture is made in the RAM and/or the video card’s memory in a format that is easier to render on screen. In general, these texture copies consume memory in proportion to the number of pixels, so a 1024×1024 texture takes four times as much memory as a 512×512 texture.
To avoid consuming excessive memory with textures, we implemented two changes. First, we capped textures on the iPad at 512×512 pixels. Second, we converted our PNG texture images to a different file format: DDS (DirectDraw Surface). DDS files are easier for the iPad’s video chip to use (texture load time drops from 100-200 milliseconds to 20-25 milliseconds). Although the DDS files are not compressed, and are therefore larger than the PNGs, they don’t affect the app’s file size because the entire ROBLOX package is ultimately compressed.
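To get a feel for what the 512×512 cap buys, here is a rough sketch. It assumes uncompressed textures at 4 bytes per pixel (8-bit RGBA) and assumes an oversized texture is scaled down with its aspect ratio preserved; the article states neither detail, and the function names are hypothetical.

```python
MAX_DIMENSION = 512  # texture cap on the iPad, from the article


def capped_size(width: int, height: int, max_dim: int = MAX_DIMENSION) -> tuple:
    """Scale (width, height) down so the longest side is at most max_dim.

    Assumption: the aspect ratio is preserved. Integer math avoids
    floating-point rounding surprises.
    """
    longest = max(width, height)
    if longest <= max_dim:
        return width, height
    return (max(1, width * max_dim // longest),
            max(1, height * max_dim // longest))


def uncompressed_bytes(width: int, height: int, bytes_per_pixel: int = 4) -> int:
    """Memory for an uncompressed texture (assumes 8-bit RGBA by default)."""
    return width * height * bytes_per_pixel


# The 585x559 shirt-and-pants texture fits inside 512x512 once capped.
print(capped_size(585, 559))  # (512, 489)

before = uncompressed_bytes(1024, 1024)  # padded up to a power of two
after = uncompressed_bytes(512, 512)     # capped at 512x512
print(before // after)  # 4
```

Under these assumptions the capped texture occupies a 512×512 surface (1 MiB) instead of a 1024×1024 one (4 MiB).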
Converting audio files from OGG to MP3 was similar to converting PNG to DDS. Although the conversion produced larger individual files, using MP3s saves around 5 megabytes of RAM because the FMOD OGG Vorbis codec, which MP3 playback does not need, no longer has to be loaded.
Smart character textures
Every ROBLOX character’s look translates to a texture in game. For instance, whether you have a simple ROBLOX 1.0 character with a shirt and pants or an elaborate character with advanced meshes (e.g., detailed torso shapes), you have a single (composite) in-game texture. This has been happening on ROBLOX for a while. This texture is typically created using the video card (GPU); however, on the iPad, the process occasionally dips into the CPU and slows the game, according to Senior Rendering Engineer Arseny Kapulkin.
We’ve developed a dynamic method for rendering character textures on the iPad. Rather than show every character texture at high-resolution, even at long distances, we allocate a memory budget to character textures and adjust texture quality as you play.
Here’s how it works: on the iPad, ROBLOX has a memory budget of about 10 megabytes for character textures. Because we’ve reduced the size of an average, high-resolution texture from 2 megabytes to 1 megabyte, the memory budget can support roughly 10 characters with high-resolution textures. When additional players (or non-playable humanoids) arrive, however, the iPad’s memory budget needs to adjust and make room for more textures. Rather than increase the budget and risk crashing the app, we begin rendering high-resolution textures only on characters that are within close range of you.
Because the iPad’s screen resolution is smaller than that of most monitors, less detail on far-away characters is okay. Plus, the trade-off is worthwhile: a low-resolution texture uses about one quarter of the memory of a high-resolution one, meaning we can trade one high-res character for four low-res characters.
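The budgeting idea can be sketched in a few lines. This is a simplified illustration built only on the figures above (a 10 MB budget, about 1 MB per high-res texture, about a quarter of that per low-res texture), not the engine's actual algorithm; the `Character` class and `assign_quality` function are hypothetical names.

```python
from dataclasses import dataclass

BUDGET_MB = 10.0    # character-texture budget from the article
HIGH_RES_MB = 1.0   # approximate size of a high-res character texture
LOW_RES_MB = 0.25   # roughly one quarter of a high-res texture


@dataclass
class Character:
    name: str
    distance: float  # distance from the player's camera (arbitrary units)


def assign_quality(characters, budget_mb=BUDGET_MB):
    """Give every character a low-res texture, then upgrade the nearest
    ones to high-res while the budget allows.

    Returns a dict mapping character name to "high" or "low".
    Note: this sketch does not handle the case where even low-res
    textures for everyone exceed the budget.
    """
    ordered = sorted(characters, key=lambda c: c.distance)
    remaining = budget_mb - LOW_RES_MB * len(ordered)
    upgrade_cost = HIGH_RES_MB - LOW_RES_MB
    quality = {}
    for character in ordered:
        if remaining >= upgrade_cost:
            quality[character.name] = "high"
            remaining -= upgrade_cost
        else:
            quality[character.name] = "low"
    return quality


# Ten characters fit entirely at high resolution...
ten = [Character(f"p{i}", distance=float(i)) for i in range(10)]
print(sum(q == "high" for q in assign_quality(ten).values()))  # 10

# ...but with sixteen, only the eight nearest stay high-res.
sixteen = [Character(f"p{i}", distance=float(i)) for i in range(16)]
result = assign_quality(sixteen)
print(sum(q == "high" for q in result.values()))  # 8
```

In the sixteen-character case the total is 8 × 1 MB + 8 × 0.25 MB = 10 MB, so the budget is exactly used. A real implementation would also need to limit how often textures are upgraded and downgraded as characters move, since each change has a CPU and GPU cost.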
“We try to balance between keeping the texture quality high so you don’t notice the low resolution of the texture,” Arseny says. “Since the compositing step of upgrading and downgrading quality takes some time both from the CPU and GPU, we tried to balance it so it doesn’t hurt your performance too much, even if you are out of memory and have to upgrade and downgrade textures constantly.”
This update has the side benefit of getting ROBLOX closer to one of its ancillary goals: games on all platforms with more than 30 players.
Drawing UIs efficiently
User interfaces sometimes look like still images; something that shouldn’t be resource intensive. For most desktops and laptops, they’re not. But on the iPad, they need to be leaner. Senior Client Engineer Fred Kimberly made inroads in improving the efficiency of all the graphical user interfaces (GUIs) you see in ROBLOX, both in terms of performance and memory use:
- Batch draw calls have been reduced by up to 90%
- Video memory representing text has been reduced by 12-13 MB
ROBLOX GUIs consist of up to 10 layers (generally some go unused). With each change to a GUI in a given frame, the graphics card memory is flushed and a “draw call” occurs. Even on very good graphics cards, draw calls are incredibly expensive. Modern cards have gotten great at drawing several hundred million polygons per second, but can still only manage several thousand draw calls per second.
Let’s say a ROBLOX game has a GUI with three buttons, each consisting of an outline, a background texture, an icon and text. Rendering these buttons using the old method, each element is drawn individually in the order the GUI was built. That’s 12 draw calls (three buttons, each with four elements). Fred optimized the way these draw calls happen. Using the new method, we group the backgrounds, the outlines and the icons, and draw each group all at once, for a total of three draw calls. Add in each button’s unique text, and you have six draw calls. As a result, Wordfall, a GUI-heavy game, went from roughly 400 draw calls to about 40 draw calls. Crossroads saw its draw calls drop by more than 50% (205 to 92, with the Backpack open).
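The three-button example can be counted out in code. This is a toy model of the counting only, under the assumption that all elements of the same kind can share one draw call and that each button's text needs its own; the names below are invented for illustration and do not come from the ROBLOX renderer.

```python
# Element kinds assumed to be batchable across buttons (one call per kind).
BATCHABLE_KINDS = {"outline", "background", "icon"}


def unbatched_draw_calls(buttons):
    """Old method: every element of every button is its own draw call."""
    return sum(len(elements) for elements in buttons)


def batched_draw_calls(buttons):
    """New method: one draw call per batchable kind, plus one per text."""
    batches = set()
    text_calls = 0
    for elements in buttons:
        for kind in elements:
            if kind in BATCHABLE_KINDS:
                batches.add(kind)
            else:
                text_calls += 1  # unique text is drawn per button
    return len(batches) + text_calls


# Three buttons, each with an outline, background, icon and text.
buttons = [["outline", "background", "icon", "text"] for _ in range(3)]

print(unbatched_draw_calls(buttons))  # 12
print(batched_draw_calls(buttons))    # 6
```

In this model the batched count grows only with the number of text elements, so the saving becomes larger as a GUI gains more buttons.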
We also stopped splitting individual GUI buttons into nine pieces for the sake of easy stretching (e.g., to fit chat text), which means each button requires one draw call rather than nine. We’re now using only texture coordinates, which can change the shape of a GUI element without creating a new draw call. Finally, we now store GUI text not as four color channels (RGBA, with three channels of white and one channel of blending data), but in a single channel. Text data is white, so there’s no need for four channels. This alone saved 12 to 13 megabytes of memory.
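The single-channel idea can be shown with raw bytes. The sketch below assumes tightly packed pixels at 8 bits per channel, which the article does not specify, and the function names are hypothetical.

```python
def rgba_to_alpha(rgba: bytes) -> bytes:
    """Keep only the alpha (blending) byte of each RGBA pixel.

    Assumes tightly packed 8-bit RGBA data whose colour channels are
    all white, as described for GUI text.
    """
    if len(rgba) % 4 != 0:
        raise ValueError("RGBA data length must be a multiple of 4")
    return rgba[3::4]  # every fourth byte, starting at the alpha channel


def alpha_to_rgba(alpha: bytes) -> bytes:
    """Rebuild white RGBA pixels from single-channel data."""
    return b"".join(bytes((255, 255, 255, a)) for a in alpha)


# Three white pixels: fully transparent, half transparent, fully opaque.
pixels = bytes([255, 255, 255, 0,
                255, 255, 255, 128,
                255, 255, 255, 255])

single_channel = rgba_to_alpha(pixels)
print(len(pixels), "bytes ->", len(single_channel), "bytes")  # 12 bytes -> 3 bytes

# Nothing is lost: the original pixels can be reconstructed exactly.
assert alpha_to_rgba(single_channel) == pixels
```

Because the colour channels carry no information for white text, the single-channel form needs a quarter of the memory under these assumptions.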
What we’re making isn’t “ROBLOX Lite.” iPad players will be playing on the same servers as PC and Mac players, which is why it’s so important to optimize ROBLOX’s existing code and maintain its look and feel. While many of the changes mentioned in this article are iPad-specific, some – particularly those involving GUIs – will improve ROBLOX’s performance on desktop computers and laptops.
In our next Engineering ROBLOX for the iPad article, we’ll go in-depth on pure performance optimizations, including how we’re ensuring a stable frame rate, profiling our code for inefficiencies, and using a technique called debris culling to show high-importance objects in your world.