hprof

hprof

hprof
Process Heap Profiling Tool
Module hprof was introduced in OTP @OTP-18756@.

hprof provides convenience helpers for Erlang process heap profiling. Underlying mechanism is the Erlang trace BIFs.

Heap profiling can be done ad-hoc, to understand heap allocations done by your program, or run in a server-aided mode for deeper introspection of the code running in production.

Warning

Avoid hot code reload for modules that are participating in the memory tracing. Reloading a module turns tracing off, discarding accumulated statistics. hprof may not report correct amounts when code reload happened during profiling session.

Heap allocations happen for all Erlang terms that do not fit a single machine word. For example, a function returning tuple of 2 elements needs to allocate this tuple on the process heap. Actual consumption is more than 2 machine words, because Erlang runtime needs to store tuple size and other internal information.

Note

When profiling is enabled, expect a slowdown in the program execution.

For profiling convenience, heap allocations are accumulated for functions that are not enabled in trace pattern. Consider this call stack example:

      top_traced_function(...)
      not_traced_function()
      bottom_traced_function()

Allocations that happened within not_traced_function will be accounted into top_traced_function. However allocations happened within bottom_traced_function are not included in the top_traced_function. If you want to account only own allocations, you need to trace all functions.

Basic profiling providing accumulated memory allocation data. You can choose to print per-process statistics, total statistics, or omit printing and extract machine-readable data that you can later sort/print:

      1> hprof:profile(lists, seq, [1, 16]).

      ****** Process <0.179.0>    -- 100.00 % of total allocations ***
      MODUL FUN/ARITY   CALLS  WORDS  PER CALL  [     %]
      lists seq_loop/3      5     32         6  [100.00]
      32            [ 100.0]
      ok

By default tracing is enabled for all functions of all modules. When functions are created in the interactive shell, parts of shell code are also traced. It is however possible to limit the trace to specific functions or modules:

      1> hprof:profile(fun() -> lists:seq(1, 16) end).

      ****** Process <0.224.0>    -- 100.00 % of total allocations ***
      MODULE   FUN/ARITY         CALLS  WORDS  PER CALL  [    %]
      erl_eval match_list/6          1      3         3  [ 3.19]
      erl_eval do_apply/7            1      3         3  [ 3.19]
      lists    reverse/1             1      4         4  [ 4.26]
      erl_eval add_bindings/2        1      5         5  [ 5.32]
      erl_eval expr_list/7           3      7         2  [ 7.45]
      erl_eval ret_expr/3            4     16         4  [17.02]
      erl_eval merge_bindings/4      3     24         8  [25.53]
      lists    seq_loop/3            5     32         6  [34.04]

      2> hprof:profile(fun() -> lists:seq(1, 16) end, #{pattern => [{lists, seq_loop, '_'}]}).
      ****** Process <0.247.0>    -- 100.00 % of total allocations ***
      MODUL FUN/ARITY   CALLS  WORDS  PER CALL  [     %]
      lists seq_loop/3      5     32         6  [100.00]

Ad-hoc profiling results may be printed in a few different ways. Following examples are using test module defined like this:

      -module(test).
      -export([test_spawn/0]).
      test_spawn() ->
          {Pid, MRef} = spawn_monitor(fun () -> lists:seq(1, 32) end),
          receive
              {'DOWN', MRef, process, Pid, normal} ->
                  done
          end.

Default format prints per-process statistics.

        2> hprof:profile(test, test_spawn, []).

        ****** Process <0.176.0>    -- 23.66 % of total allocations ***
        MODULE FUN/ARITY        CALLS  WORDS  PER CALL  [    %]
        erlang spawn_monitor/1      1      2         2  [ 9.09]
        erlang spawn_opt/4          1      6         6  [27.27]
        test   test_spawn/0         1     14        14  [63.64]
                                          22            [100.0]

        ****** Process <0.177.0>    -- 76.34 % of total allocations ***
        MODULE FUN/ARITY   CALLS  WORDS  PER CALL  [    %]
        erlang apply/2         1      7         7  [ 9.86]
        lists  seq_loop/3      9     64         7  [90.14]
                                     71            [100.0]

This example prints the combined memory allocation of all processes, sorted by total allocated words in the descending order

        5> hprof:profile(test, test_spawn, [], #{report => {total, {words, descending}}}).

        MODULE FUN/ARITY        CALLS  WORDS  PER CALL  [    %]
        lists  seq_loop/3           9     64         7  [68.82]
        test   test_spawn/0         1     14        14  [15.05]
        erlang apply/2              1      7         7  [ 7.53]
        erlang spawn_opt/4          1      6         6  [ 6.45]
        erlang spawn_monitor/1      1      2         2  [ 2.15]
                                          93            [100.0]

You can also collect the profile for further inspection.

      6> {done, ProfileData} = hprof:profile(fun test:test_spawn/0, #{report => return}).
      <...>
      7> hprof:format(hprof:inspect(ProfileData, process, {percent, descending})).

      ****** Process <0.223.0>    -- 23.66 % of total allocations ***
      MODULE FUN/ARITY        CALLS  WORDS  PER CALL  [    %]
      test   test_spawn/0         1     14        14  [63.64]
      erlang spawn_opt/4          1      6         6  [27.27]
      erlang spawn_monitor/1      1      2         2  [ 9.09]
      22            [100.0]

      ****** Process <0.224.0>    -- 76.34 % of total allocations ***
      MODULE FUN/ARITY   CALLS  WORDS  PER CALL  [    %]
      lists  seq_loop/3      9     64         7  [90.14]
      erlang apply/2         1      7         7  [ 9.86]
      71            [100.0]

By default, basic profiling takes into account all processes spawned from the user-provided function (using set_on_spawn argument for trace/3 BIF). You can limit the trace to a single process:

      2> hprof:profile(test, test_spawn, [], #{set_on_spawn => false}).

      ****** Process <0.183.0>    -- 100.00 % of total allocations ***
      MODULE FUN/ARITY        CALLS  WORDS  PER CALL  [    %]
      erlang spawn_monitor/1      1      2         2  [ 9.09]
      erlang spawn_opt/4          1      6         6  [27.27]
      test   test_spawn/0         1     14        14  [63.64]

Erlang programs may perform memory-intensive operations in processes that are different from the original one. You can include multiple, new or even all processes in the trace.

      7> pg:start_link().
      {ok,<0.252.0>}
      8> hprof:profile(fun () -> pg:join(group, self()) end, #{rootset => [pg]}).
      ****** Process <0.252.0>    -- 52.86 % of total allocations ***
      MODULE     FUN/ARITY                 CALLS  WORDS  PER CALL  [    %]
      pg         leave_local_update_ets/5      1      2         2  [ 1.80]
      gen        reply/2                       1      3         3  [ 2.70]
      erlang     monitor/2                     1      3         3  [ 2.70]
      gen_server try_handle_call/4             1      3         3  [ 2.70]
      gen_server try_dispatch/4                1      3         3  [ 2.70]
      maps       iterator/1                    2      4         2  [ 3.60]
      maps       take/2                        1      6         6  [ 5.41]
      pg         join_local_update_ets/5       1      8         8  [ 7.21]
      pg         handle_info/2                 1      8         8  [ 7.21]
      pg         handle_call/3                 1      9         9  [ 8.11]
      gen_server loop/7                        2      9         4  [ 8.11]
      ets        lookup/2                      2     10         5  [ 9.01]
      pg         join_local/3                  1     11        11  [ 9.91]
      pg         notify_group/5                2     16         8  [14.41]
      erlang     setelement/3                  2     16         8  [14.41]
      111            [100.0]

      ****** Process <0.255.0>    -- 47.14 % of total allocations ***
      MODULE   FUN/ARITY         CALLS  WORDS  PER CALL  [    %]
      erl_eval match_list/6          1      3         3  [ 3.03]
      erlang   monitor/2             1      3         3  [ 3.03]
      lists    reverse/1             2      4         2  [ 4.04]
      pg       join/3                1      4         4  [ 4.04]
      erl_eval add_bindings/2        1      5         5  [ 5.05]
      erl_eval do_apply/7            2      6         3  [ 6.06]
      gen      call/4                1      8         8  [ 8.08]
      erl_eval expr_list/7           4     10         2  [10.10]
      gen      do_call/4             1     16        16  [16.16]
      erl_eval ret_expr/3            4     16         4  [16.16]
      erl_eval merge_bindings/4      3     24         8  [24.24]
      99            [100.0]

There is no default limit on the profiling time. It is possible to define such limit for ad-hoc profile. If function being profiled does not return in a specified amount of time, process is terminated with kill reason. Any unlinked children started by the user-supplied function are kept, it is developer's responsibility to ensure cleanup.

      9> hprof:profile(timer, sleep, [100000], #{timeout => 1000}).

By default, only one ad-hoc or server-aided profiling session is allowed at any point in time. It is possible to force multiple ad-hoc sessions concurrently, but it is developer responsibility to ensure non-overlapping trace patterns.

      1> hprof:profile(fun() -> lists:seq(1, 32) end,
          #{registered => false, pattern => [{lists, '_', '_'}]}).

Memory profiling can be done when your system is up and running. You can start the hprof server, add trace patterns and processes to trace while your system handles actual traffic. You can extract the data any time, inspect, and print. The example below traces activity of all processes supervised by kernel:

      1> hprof:start().
      {ok,<0.200.0>}
      2> hprof:enable_trace({all_children, kernel_sup}).
      34
      3> hprof:set_pattern('_', '_' , '_').
      16728
      4> Sample = hprof:collect().
      [{gen_server,try_dispatch,4,[{<0.154.0>,2,6}]},
      {erlang,iolist_to_iovec,1,[{<0.161.0>,1,8}]},
      <...>
      5 > hprof:format(hprof:inspect(Sample)).

      ****** Process <0.154.0>    -- 14.21 % of total allocations ***
      MODULE     FUN/ARITY       CALLS  WORDS  PER CALL  [    %]
      maps       iterator/1          2      4         2  [15.38]
      gen_server try_dispatch/4      2      6         3  [23.08]
      net_kernel handle_info/2       2     16         8  [61.54]
                                           26            [100.0]

      ****** Process <0.161.0>    -- 85.79 % of total allocations ***
      MODULE     FUN/ARITY            CALLS  WORDS  PER CALL  [    %]
      disk_log   handle/2                 2      2         1  [ 1.27]
      disk_log_1 maybe_start_timer/1      1      3         3  [ 1.91]
      disk_log_1 mf_write_cache/1         1      3         3  [ 1.91]
      <...>

It is possible to profile the entire running system, and then examine individual processes:

      1> hprof:start(), hprof:enable_trace(processes), hprof:set_pattern('_', '_' , '_').
      9041
      2> timer:sleep(10000), hprof:disable_trace(processes), Sample = hprof:collect().
      [{user_drv,server,3,[{<0.64.0>,12,136}]},
      {user_drv,contains_ctrl_g_or_ctrl_c,1,[{<0.64.0>,80,10}]},
      <...>
      3> Inspected = hprof:inspect(Sample, process, words), Shell = maps:get(self(), Inspected).
      {2743,
      [{shell,{enc,0},1,2,2,0.07291286912139992},
      <...>
      4> hprof:format(Shell).

      MODULE                 FUN/ARITY                             CALLS  WORDS  PER CALL  [    %]
      <...>
      erl_lint               start/2                                   2    300       150  [10.94]
      shell                  used_records/1                          114    342         3  [12.47]

Column to sort by inspect/3, or profile.

Module name.
Function name.
Number of calls to the function.
Total number of words allocated throughout all calls to the function.
Number of words allocated on average per function call.
Percentage of words to a total amount allocated during the entire profile collection.

Types

Pid = pid()
Reason = {already_started, Pid}

Starts the server, not supervised. Profiling server stores current trace patterns and ensures a single instance of heap profiler is running.

Types

Pid = pid()
Reason = {already_started, Pid}

Starts the server, supervised by the calling process.

Stops the hprof, disabling memory tracing that has been enabled.

Turns tracing on for the supplied pattern. Requires running hprof. Patterns are additive, following the same rules as erlang:trace_pattern/3. Returns number of functions matching the supplied pattern.

1> hprof:set_pattern(lists, seq, '_').
2
2> hprof:set_pattern(lists, keyfind, 3).
1
3> hprof:get_trace_map().
#{lists => [{keyfind,3},{seq,2},{seq,3}]}

If there are no functions matching the pattern, error is returned

> hprof:set_pattern(no_module, func, '_').
{error,{trace_pattern,no_module,func,'_'}}

Turns tracing off for the supplied pattern.

1> hprof:set_pattern(lists, seq, '_').
2
2> hprof:clear_pattern(lists, seq, 3).
1
3> hprof:get_trace_map().
#{lists => [{seq,2}]}

Returns a map of module names to functions with their arities.

Similar to erlang:trace/3, but supports a few more options for heap tracing convenience.

Spec is either a process identifier (pid) for a local process, one of the following atoms, or a list of local process identifiers or their registered names:

All currently existing processes and all that will be created in the future.
All currently existing processes.
All processes that will be created in the future.
All currently running processes that were directly spawned by the specified process. This mode is helpful for tracing workers of a single supervisor.
All currently running processes that were spawned by the specified process, or any recursive descendant of it. This mode is designed to facilitate tracing of supervision trees.
Note

Heap profiling server does not keep track of processes that were added to the tracing set. It is permitted to stop the profiling server (wiping out any accumulated data), restart the server, set entirely different tracing pattern keeping the list of traced processes for future use. Use disable_trace(processes) to clear the list of traced processes.

Specify Options to modify heap tracing behaviour:

Automatically start heap tracing for processes spawned by the traced process. On by default.

Stops accumulating heap traces for specified processes. See enable_trace/2 for options description.

Profile accumulated before process is removed from the traced list is retained. This allows to enable tracing for many or even all processes in the system, sleep for a short period of time, then disable tracing for all processes, avoiding system overload, but keeping profile data.

Pauses trace collection for all currently traced functions, keeping all traces intact. Use continue/0 to resume trace collection.

Resumes previously paused heap profiling.

Clears accumulated profiles. If profiling was paused prior to calling restart, it gets continued.

Produces ad-hoc heap profile for function Fun or Module:Function call. By default, result is formatted to the output device, use report option to change this behaviour.

Ad-hoc profiling starts a new instance of hprof server, runs the profiling routine, extracts results and shuts the server down. If hprof is already running (for server-aided profiling), default ad-hoc profiler options block this call to avoid mixing results from several independent instances. Use registered => false option to override this behaviour.

Ad-hoc profiler supports followingOptions:

Specifies I/O devices to print the profile to. Useful to redirect text output to console or standard_error.
Specifies trace pattern, or a list of trace patterns to enable. By default, all functions ({'_', '_', '_'}) are traced.
Specifies hprof registered process name. Use false to leave the process unregistered, or {local, myname} to register the process under a different name.
Controls output format. The default is process, printing per-process heap profiling data sorted by percentage of the total allocation. Specify report => return to suppress printing and get the raw data for further evaluation with inspect/3 and formatting with format/2.
Includes extra processes in the trace list. Useful for profiling allocations for gen_server, calls, or other allocations caused by inter-process communications. See example.
Automatically start heap tracing for processes spawned by the traced process. On by default.
Terminate profiling after the specified amount of time (milliseconds).

The same as inspect (Profile, process, percent). Transforms raw profile into a map of process identifiers to a tuple containing total count of words allocated, and a list of all traced functions sorted in the ascending order by the allocation percentage.

Transforms raw data returned by tracing BIFs into a form convenient for subsequent analysis and formatting. Returns a map of process identifiers with corresponding profile data sorted by the selected column.

Inspected profile can be leveraged to print individual process allocations.

Combines raw profile from multiple processes into a single summary sorted by the selected column.

A single profiling session may contain data from thousands or even millions processes. This inspection mode allows to quickly glance through the allocation summary, discarding process identifiers and keeping only totals.