Perf

From 탱이의 잡동사니
Revision as of 09:02, 4 May 2016 by Pchero

Overview

Notes on perf, the Linux performance analysis tool.

perf stat

perf stat: obtain event counts

Options

$ perf stat -h

 usage: perf stat [<options>] [<command>]

    -T, --transaction     hardware transaction statistics
    -e, --event <event>   event selector. use 'perf list' to list available events
        --filter <filter>
                          event filter
    -i, --no-inherit      child tasks do not inherit counters
    -p, --pid <pid>       stat events on existing process id
    -t, --tid <tid>       stat events on existing thread id
    -a, --all-cpus        system-wide collection from all CPUs
    -g, --group           put the counters into a counter group
    -c, --scale           scale/normalize counters
    -v, --verbose         be more verbose (show counter open errors, etc)
    -r, --repeat <n>      repeat command and print average + stddev (max: 100, forever: 0)
    -n, --null            null run - dont start any counters
    -d, --detailed        detailed run - start a lot of events
    -S, --sync            call sync() before starting a run
    -B, --big-num         print large numbers with thousands' separators
    -C, --cpu <cpu>       list of cpus to monitor in system-wide
    -A, --no-aggr         disable CPU count aggregation
    -x, --field-separator <separator>
                          print counts with custom separator
    -G, --cgroup <name>   monitor event in cgroup name only
    -o, --output <file>   output file name
        --append          append to the output file
        --log-fd <n>      log output to fd, instead of stderr
        --pre <command>   command to run prior to the measured command
        --post <command>  command to run after to the measured command
    -I, --interval-print <n>
                          print counts at regular interval in ms (>= 100)
        --per-socket      aggregate counts per processor socket
        --per-core        aggregate counts per physical processor core
    -D, --delay <n>       ms to wait before starting measurement after program start

Example

<source lang=bash>
$ perf stat --repeat 10 -e cycles:u -e instructions:u -e cache-references:u \
    -e cache-misses:u -e stalled-cycles-frontend:u -e stalled-cycles-backend:u \
    -e ref-cycles:u -e branch-instructions:u -e branch-misses:u ./main
</source>
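
Counts like these are often turned into a derived metric such as instructions per cycle (IPC). One way is to ask for machine-readable output with `-x` and post-process it. The following is a minimal sketch, not a definitive recipe: `ipc_from_csv` is a hypothetical helper name, it assumes the raw count is the first CSV field and the event name appears in some later field (the exact column layout differs between perf versions), and the sample lines fed to it are made up for illustration, not real measurements.

```shell
#!/bin/sh
# ipc_from_csv: hypothetical helper. Reads `perf stat -x,` lines on stdin
# and prints instructions per cycle with two decimals.
# Assumption: field 1 is the raw count; a later field holds the event name.
ipc_from_csv() {
    awk -F, '
        {
            for (i = 2; i <= NF; i++) {
                # Anchored match, so ref-cycles and stalled-cycles-* are ignored
                if ($i ~ /^cycles/)       cycles = $1
                if ($i ~ /^instructions/) instructions = $1
            }
        }
        END {
            # "+ 0" forces a numeric comparison even for "<not counted>"
            if (cycles + 0 > 0) printf "%.2f\n", instructions / cycles
            else                print "n/a"
        }
    '
}

# Made-up sample input for illustration only (not real measurements).
printf '%s\n' '2000000,,cycles:u,1000000,100.00' \
              '3000000,,instructions:u,1000000,100.00' | ipc_from_csv
# prints 1.50
```

With a real run, the counts can be written to a file first and then fed in, for example `perf stat -x, -o stat.csv -e cycles:u -e instructions:u ./main` followed by `ipc_from_csv < stat.csv`.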

perf record

perf record: record events for later reporting

Options

$ perf record -h

 usage: perf record [<options>] [<command>]
    or: perf record [<options>] -- <command> [<options>]

    -e, --event <event>   event selector. use 'perf list' to list available events
        --filter <filter>
                          event filter
    -p, --pid <pid>       record events on existing process id
    -t, --tid <tid>       record events on existing thread id
    -r, --realtime <n>    collect data with this RT SCHED_FIFO priority
    -D, --no-delay        collect data without buffering
    -R, --raw-samples     collect raw sample records from all opened counters
    -a, --all-cpus        system-wide collection from all CPUs
    -C, --cpu <cpu>       list of cpus to monitor
    -c, --count <n>       event period to sample
    -o, --output <file>   output file name
    -i, --no-inherit      child tasks do not inherit counters
    -F, --freq <n>        profile at this frequency
    -m, --mmap-pages <pages>
                          number of mmap data pages
        --group           put the counters into a counter group
    -g                    enables call-graph recording
        --call-graph <mode[,dump_size]>
                          setup and enables call-graph (stack chain/backtrace) recording: fp dwarf
    -v, --verbose         be more verbose (show counter open errors, etc)
    -q, --quiet           don't print any message
    -s, --stat            per thread counts
    -d, --data            Sample addresses
    -T, --timestamp       Sample timestamps
    -P, --period          Sample period
    -n, --no-samples      don't sample
    -N, --no-buildid-cache
                          do not update the buildid cache
    -B, --no-buildid      do not collect buildids in perf.data
    -G, --cgroup <name>   monitor event in cgroup name only
    -u, --uid <user>      user to profile
    -b, --branch-any      sample any taken branches
    -j, --branch-filter <branch filter mask>
                          branch stack filter modes
    -W, --weight          sample by weight (on special events only)
        --transaction     sample transaction flags (special events only)
        --force-per-cpu   force the use of per-cpu mmaps
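
A typical use of the options above is sampling a command at a fixed frequency with call graphs. The sketch below only combines `-F`, `-g` and `-o` from the help text; `record_profile` and the `PERF` override variable are hypothetical names introduced here, and 99 Hz is just an illustrative frequency.

```shell
#!/bin/sh
# PERF can be overridden (e.g. PERF=echo) to preview the command line
# without actually running perf. This variable is an assumption of the sketch.
PERF=${PERF:-perf}

# record_profile: hypothetical helper.
#   $1 = sampling frequency in Hz (-F)
#   $2 = output file (-o)
#   remaining arguments = command to profile
record_profile() {
    freq=$1; out=$2; shift 2
    # -g enables call-graph recording; "--" separates perf options from the command
    "$PERF" record -F "$freq" -g -o "$out" -- "$@"
}

# Example (requires perf and a ./main binary):
#   record_profile 99 perf.data ./main
```

The resulting data file is what `perf report` and `perf annotate` read afterwards.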

perf report

perf report: break down events by process, function, etc.
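
When the breakdown is long, the text output of `perf report --stdio` can be filtered for the hottest entries. This is a rough sketch under stated assumptions: `hot_entries` is a hypothetical helper, it assumes the overhead percentage is the first column of each data line (the default layout without call graphs), and the sample lines are made up for illustration.

```shell
#!/bin/sh
# hot_entries: hypothetical helper. Keeps report lines whose first column
# (overhead, e.g. "62.50%") is at least the given percentage.
#   $1 = minimum overhead in percent
hot_entries() {
    min=$1
    awk -v min="$min" '
        /^#/ || NF == 0 { next }          # skip header comments and blank lines
        ($1 + 0) >= (min + 0) { print }   # "62.50%" + 0 evaluates to 62.5
    '
}

# Made-up sample input for illustration only (not real measurements).
printf '%s\n' \
    '# Overhead  Command  Shared Object  Symbol' \
    '    62.50%  main     main           [.] compute' \
    '     1.25%  main     libc.so.6      [.] memcpy' | hot_entries 5
```

With real data this would be used as `perf report -i perf.data --stdio | hot_entries 5`.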

perf annotate

perf annotate: annotate assembly or source code with event counts
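
To look at a single function, the annotation can be limited to one symbol and printed as plain text. A minimal sketch, assuming a data file from an earlier `perf record` run; `annotate_symbol` and the `PERF` override variable are hypothetical names introduced for illustration.

```shell
#!/bin/sh
# PERF can be overridden (e.g. PERF=echo) to preview the command line
# without actually running perf. This variable is an assumption of the sketch.
PERF=${PERF:-perf}

# annotate_symbol: hypothetical helper.
#   $1 = data file written by `perf record -o`
#   $2 = symbol (function) name to annotate
annotate_symbol() {
    data=$1; symbol=$2
    # --stdio prints plain text instead of opening the interactive TUI
    "$PERF" annotate -i "$data" --stdio -s "$symbol"
}

# Example (requires a perf.data file from an earlier recording):
#   annotate_symbol perf.data main
```

Source lines are interleaved with the assembly only when the binary carries debug information (for example, built with `-g`); otherwise only the disassembly is shown.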

perf top

perf top: see live event counts

perf bench

perf bench: run different kernel microbenchmarks
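
Benchmarks are selected as `perf bench <subsystem> <benchmark>`. The sketch below runs a few commonly available ones in sequence; which benchmarks exist depends on how perf was built, so the list is an assumption, and `run_benches` and the `PERF` override variable are hypothetical names.

```shell
#!/bin/sh
# PERF can be overridden (e.g. PERF=echo) to preview the command lines
# without actually running perf. This variable is an assumption of the sketch.
PERF=${PERF:-perf}

# run_benches: hypothetical helper that runs a fixed list of microbenchmarks.
run_benches() {
    # Each entry is "<subsystem> <benchmark>"; availability may vary by build
    for bench in "sched messaging" "sched pipe" "mem memcpy"; do
        echo "== perf bench $bench =="
        # Unquoted on purpose: split the entry into subsystem and benchmark
        "$PERF" bench $bench || return 1
    done
}

# Example (requires perf):
#   run_benches
```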

See also