Profiling
To run the profiler, make sure you have an optimized build of the interpreter (otherwise profiling results are going to be very skewed) and run it with the --profile argument (script.lua below stands for the program you want to profile):
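$ luau --profile script.lua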
The resulting profile.out file can be converted to an SVG file by running a script that is part of the Luau repository:
$ python tools/perfgraph.py profile.out >profile.svg
This produces an SVG file that can be opened in a browser.
In a flame graph visualization, each bar represents a function call, the width of a bar corresponds to the share of total program runtime spent in that function, and the nesting matches the call stacks encountered during program execution. This is a fantastic visualization technique that allows you to home in on the specific bottlenecks affecting your program's performance, optimize those exact bottlenecks, and then regenerate the profile and the visualization and look for the next set of true bottlenecks (if any).
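As a hypothetical sketch of how code maps onto the graph, profiling the program below would show a bar for main spanning nearly the full width, with helperA and helperB nested underneath it; helperA's bar would be roughly nine times wider than helperB's because it does nine times as much work:

local x = 0

local function helperA()
    for i = 1, 9000000 do x = x + i end -- ~90% of the runtime
end

local function helperB()
    for i = 1, 1000000 do x = x + i end -- ~10% of the runtime
end

local function main()
    helperA()
    helperB()
end

main()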
Hovering your mouse cursor over individual sections will display detailed function information in the status bar and in a tooltip. To search for a specific function by name, use the Search field in the upper right corner, or press Ctrl+F.
Notice that some of the bars in the screenshot don't have any text. In some cases, the bar is simply too narrow to fit the function name; you can hover your mouse over those bars to see the name and source location of the function in the tooltip, or double-click to zoom in on that part of the flame graph. In other cases, the function doesn't have a name the profiler can use, for example because it's anonymous; declaring it as a named function fixes this:

local myFunc = function() --[[ work ]] end
-> local function myFunc() --[[ work ]] end

Even without these changes, you can hover over a given bar with no visible name and see its source location.
Like any sampling profiler, this profiler relies on gathering enough samples for the resulting output to be statistically meaningful; it may miss short functions if they aren't called often enough. By default the profiler samples at 10 kHz; this can be customized by passing a different frequency as a parameter to --profile. Note that higher frequencies result in higher profiling overhead and longer program execution, potentially skewing the results.
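For example, to sample at 100 kHz (assuming the frequency is passed as --profile=N; consult the CLI help for the exact syntax):

$ luau --profile=100000 script.lua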
This profiler tracks time consumed by Luau thread stacks; when a thread calls another thread via coroutine.resume, the time spent is not attributed to the parent thread that's waiting for the resume results. This limitation will be removed in the future.
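For example, in the following sketch (function names are hypothetical), samples taken while heavyWork runs are attributed to the coroutine's own stack rather than to parent, even though parent is blocked inside coroutine.resume for the entire duration:

local function heavyWork()
    local sum = 0
    for i = 1, 10000000 do
        sum = sum + i
    end
    return sum
end

local function parent()
    local co = coroutine.create(heavyWork)
    -- parent waits here; profile samples taken during heavyWork land on
    -- the coroutine's stack, not on parent's
    local ok, result = coroutine.resume(co)
    return result
end

parent()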