Debugging Performance Issues

From Sirikata Wiki
Revision as of 04:10, 3 February 2012 by Ewencp (talk | contribs) (→‎Course Grained Profiling: Fill in details and example output)
Jump to navigation Jump to search

Sirikata is large, complicated, and uses multiple languages. Debugging performance issues can be tricky. The following tries to describe the available tools, how to use them, and which is best for different situations.

External Tools

OProfile

If you're looking for hotspots in C++ code, this is the suggested tool. It works by having the kernel periodically trigger sampling, collecting stacktraces from Sirikata (and other applications). You run opcontrol to enable sampling, run the application, and then turn off opcontrol. A separate program, opreport, will report the collected data. See the OProfile documentation for details.

A simple wrapper script for whichever binary you're targeting can be very handy:

   #!/bin/bash
   
   sudo opcontrol --no-vmlinux --start
   sudo opcontrol --reset
   sudo opcontrol --callgraph=6
   
   ./cppoh_d "$@"
   
   sudo opcontrol --stop
   sudo opcontrol --shutdown

When you run opreport, you'll need to filter out non-Sirikata modules (because OProfile examines the entire system). Make sure you include Sirikata's libraries and plugins as well as the binary you're examining!

Google Perftools

Google Perftools is already included as we also use it for Debugging Memory Issues. To use it, set the CMake variable GOOGLE_PERFTOOLS_SETTING=CPU and recompile. When you execute the program you want to examine, set the environment variable CPUPROFILE to a location to save profiling data to, e.g.,

   > env CPUPROFILE=/tmp/cppoh.prof /home/me/sirikata/build/cmake/cppoh_d

Then run pprof to analyze the output.

See the Google Perftools page for more details, and specifically the CPU profiling instructions.

Emerson

Theoretically V8's internal profiler can be used, which should give information about both Javascript and V8 CPU usage. You can use the --v8-flags setting or change the value directly in JSObjectScriptManager.cpp. However, so far we've only been able to get information about C++.

GNU Profiler

You can build Sirikata with GNU Profiler settings. Set CMAKE_BUILD_TYPE=Profile. This is an instrumentation-based statistical profiler. It uses the same basic idea as OProfile and will report information in a similar manner. However, it's more intrusive and seems to have a higher impact on performance. Therefore, you should generally use it when you can't get the information you need from OProfile, or to double check OProfile's results.

See the GNU Profiler guide for more details.

Sirikata Tools

Course Grained Profiling

For a few parts of code where performance issues might be expected, especially if they could result in long-running event handlers, TimeProfiler's are manually added around event handlers. This is much less heavyweight than a profiler like GProf and minimally invasive, so it is useful to ensure you're getting results that aren't skewed by the profiler.

To enable profilers, add --profile=true to the command line (this should work for all binaries). The statistics will be reported at exit:

   [PROFILER] INFO: Profiler report: Object Host
   [PROFILER] INFO: Stage: Session Manager -- Avg: 2us Min: 2us Max:7us Sum: 38us  Its: 13
   [PROFILER] INFO: Stage: Handle Read Network -- Avg: 10us Min: 5us Max:1.045ms Sum: 125.492ms  Its: 12018
   [PROFILER] INFO: Stage: Handle Server Message -- Avg: 402us Min: 23us Max:4.346ms Sum: 1.8317s  Its: 4550
   [PROFILER] INFO: Stage: TimeSync -- Avg: 81us Min: 31us Max:173us Sum: 326us  Its: 4
   [PROFILER] INFO: Stage: OrphanLocUpdateManager -- Avg: 8us Min: 8us Max:9us Sum: 17us  Its: 2

This data covers the entire run, so be wary of variation or spikes.

You can add your own 'Stage' by using TimeProfiler::addStage. All Contexts have a TimeProfiler in Context::profiler. A few classes (e.g. PollingService) will automatically add a Stage if you provide a name for it in their constructor. In fact, PollingService is a good example of how to add your own TimeProfiler Stage.

Fine-grained Event Loop Profiling

Quick start

Details