Commit f0bb4c0

Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf updates from Ingo Molnar:
 "Kernel improvements:

   - watchdog driver improvements by Li Zefan
   - Power7 CPI stack events related improvements by Sukadev Bhattiprolu
   - event multiplexing via hrtimers and other improvements by
     Stephane Eranian
   - kernel stack use optimization by Andrew Hunter
   - AMD IOMMU uncore PMU support by Suravee Suthikulpanit
   - NMI handling rate-limits by Dave Hansen
   - various hw_breakpoint fixes by Oleg Nesterov
   - hw_breakpoint overflow period sampling and related signal
     handling fixes by Jiri Olsa
   - Intel Haswell PMU support by Andi Kleen

  Tooling improvements:

   - Reset SIGTERM handler in workload child process, fix from David
     Ahern.
   - Makefile reorganization, prep work for Kconfig patches, from Jiri
     Olsa.
   - Add automated make test suite, from Jiri Olsa.
   - Add --percent-limit option to 'top' and 'report', from Namhyung
     Kim.
   - Sorting improvements, from Namhyung Kim.
   - Expand definition of sysfs format attribute, from Michael
     Ellerman.

  Tooling fixes:

   - 'perf tests' fixes from Jiri Olsa.
   - Make Power7 CPI stack events available in sysfs, from Sukadev
     Bhattiprolu.
   - Handle death by SIGTERM in 'perf record', fix from David Ahern.
   - Fix printing of perf_event_paranoid message, from David Ahern.
   - Handle realloc failures in 'perf kvm', from David Ahern.
   - Fix divide by 0 in variance, from David Ahern.
   - Save parent pid in thread struct, from David Ahern.
   - Handle JITed code in shared memory, from Andi Kleen.
   - Fixes for 'perf diff', from Jiri Olsa.
   - Remove some unused struct members, from Jiri Olsa.
   - Add missing liblk.a dependency for python/perf.so, fix from Jiri
     Olsa.
   - Respect CROSS_COMPILE in liblk.a, from Rabin Vincent.
   - No need to do locking when adding hists in perf report, only
     'top' needs that, from Namhyung Kim.
   - Fix alignment of symbol column in the hists browser (top,
     report) when -v is given, from Namhyung Kim.
   - Fix 'perf top' -E option behavior, from Namhyung Kim.
   - Fix bug in isupper() and islower(), from Sukadev Bhattiprolu.
   - Fix compile errors in bp_signal 'perf test', from Sukadev
     Bhattiprolu.

  ... and more things"

* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (102 commits)
  perf/x86: Disable PEBS-LL in intel_pmu_pebs_disable()
  perf/x86: Fix shared register mutual exclusion enforcement
  perf/x86/intel: Support full width counting
  x86: Add NMI duration tracepoints
  perf: Drop sample rate when sampling is too slow
  x86: Warn when NMI handlers take large amounts of time
  hw_breakpoint: Introduce "struct bp_cpuinfo"
  hw_breakpoint: Simplify *register_wide_hw_breakpoint()
  hw_breakpoint: Introduce cpumask_of_bp()
  hw_breakpoint: Simplify the "weight" usage in toggle_bp_slot() paths
  hw_breakpoint: Simplify list/idx mess in toggle_bp_slot() paths
  perf/x86/intel: Add mem-loads/stores support for Haswell
  perf/x86/intel: Support Haswell/v4 LBR format
  perf/x86/intel: Move NMI clearing to end of PMI handler
  perf/x86/intel: Add Haswell PEBS support
  perf/x86/intel: Add simple Haswell PMU support
  perf/x86/intel: Add Haswell PEBS record support
  perf/x86/intel: Fix sparse warning
  perf/x86/amd: AMD IOMMU Performance Counter PERF uncore PMU implementation
  perf/x86/amd: Add IOMMU Performance Counter resource management
  ...

2 parents a4883ef + 983433b commit f0bb4c0

71 files changed

Lines changed: 2947 additions & 1053 deletions

Documentation/ABI/testing/sysfs-bus-event_source-devices-events

Lines changed: 27 additions & 5 deletions
@@ -27,14 +27,36 @@ Description: Generic performance monitoring events
                 "basename".
 
 
-What:           /sys/devices/cpu/events/PM_LD_MISS_L1
-                /sys/devices/cpu/events/PM_LD_REF_L1
-                /sys/devices/cpu/events/PM_CYC
+What:           /sys/devices/cpu/events/PM_1PLUS_PPC_CMPL
                 /sys/devices/cpu/events/PM_BRU_FIN
-                /sys/devices/cpu/events/PM_GCT_NOSLOT_CYC
                 /sys/devices/cpu/events/PM_BRU_MPRED
-                /sys/devices/cpu/events/PM_INST_CMPL
                 /sys/devices/cpu/events/PM_CMPLU_STALL
+                /sys/devices/cpu/events/PM_CMPLU_STALL_BRU
+                /sys/devices/cpu/events/PM_CMPLU_STALL_DCACHE_MISS
+                /sys/devices/cpu/events/PM_CMPLU_STALL_DFU
+                /sys/devices/cpu/events/PM_CMPLU_STALL_DIV
+                /sys/devices/cpu/events/PM_CMPLU_STALL_ERAT_MISS
+                /sys/devices/cpu/events/PM_CMPLU_STALL_FXU
+                /sys/devices/cpu/events/PM_CMPLU_STALL_IFU
+                /sys/devices/cpu/events/PM_CMPLU_STALL_LSU
+                /sys/devices/cpu/events/PM_CMPLU_STALL_REJECT
+                /sys/devices/cpu/events/PM_CMPLU_STALL_SCALAR
+                /sys/devices/cpu/events/PM_CMPLU_STALL_SCALAR_LONG
+                /sys/devices/cpu/events/PM_CMPLU_STALL_STORE
+                /sys/devices/cpu/events/PM_CMPLU_STALL_THRD
+                /sys/devices/cpu/events/PM_CMPLU_STALL_VECTOR
+                /sys/devices/cpu/events/PM_CMPLU_STALL_VECTOR_LONG
+                /sys/devices/cpu/events/PM_CYC
+                /sys/devices/cpu/events/PM_GCT_NOSLOT_BR_MPRED
+                /sys/devices/cpu/events/PM_GCT_NOSLOT_BR_MPRED_IC_MISS
+                /sys/devices/cpu/events/PM_GCT_NOSLOT_CYC
+                /sys/devices/cpu/events/PM_GCT_NOSLOT_IC_MISS
+                /sys/devices/cpu/events/PM_GRP_CMPL
+                /sys/devices/cpu/events/PM_INST_CMPL
+                /sys/devices/cpu/events/PM_LD_MISS_L1
+                /sys/devices/cpu/events/PM_LD_REF_L1
+                /sys/devices/cpu/events/PM_RUN_CYC
+                /sys/devices/cpu/events/PM_RUN_INST_CMPL
 
 Date:           2013/01/08

Documentation/ABI/testing/sysfs-bus-event_source-devices-format

Lines changed: 6 additions & 0 deletions
@@ -9,6 +9,12 @@ Description:
                 we want to export, so that userspace can deal with sane
                 name/value pairs.
 
+                Userspace must be prepared for the possibility that attributes
+                define overlapping bit ranges. For example:
+                        attr1 = 'config:0-23'
+                        attr2 = 'config:0-7'
+                        attr3 = 'config:12-35'
+
                 Example: 'config1:1,6-10,44'
                 Defines contents of attribute that occupies bits 1,6-10,44 of
                 perf_event_attr::config1.
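
As an illustration of the format description above, the following is a minimal userspace sketch, not code from this commit. It assumes a format string of the shape 'config1:1,6-10,44' and scatters the low-order bits of a value into the listed bit positions, in the order they are listed. The helper name format_scatter() is hypothetical. Because attributes may define overlapping bit ranges, the sketch only ORs bits in and leaves it to the caller to decide how conflicting attributes should be combined.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: scatter the low-order bits of `value` into the bit
 * positions named by a sysfs format string such as "config1:1,6-10,44".
 * Bits are only ORed into *config; nothing is cleared.
 * Returns 0 on success, -1 if the string is malformed.
 */
static int format_scatter(const char *fmt, uint64_t value, uint64_t *config)
{
	const char *p = strchr(fmt, ':');
	unsigned int next = 0;	/* index of the next bit taken from value */

	if (!p)
		return -1;
	p++;

	/* sysfs files usually end in a newline, so treat it as end of input */
	while (*p && *p != '\n') {
		char *end;
		unsigned long lo, hi;

		lo = strtoul(p, &end, 10);
		if (end == p)
			return -1;
		hi = lo;
		p = end;

		if (*p == '-') {	/* a range such as 6-10 */
			p++;
			hi = strtoul(p, &end, 10);
			if (end == p)
				return -1;
			p = end;
		}
		if (lo > hi || hi > 63)
			return -1;

		for (unsigned long bit = lo; bit <= hi; bit++, next++) {
			if (next < 64 && ((value >> next) & 1))
				*config |= 1ULL << bit;
		}

		if (*p == ',')
			p++;
		else if (*p != '\0' && *p != '\n')
			return -1;
	}
	return 0;
}
```

With the example string from the documentation, a value of 0x7f fills all seven described bits (1, 6 through 10, and 44) of the target word.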

Documentation/sysctl/kernel.txt

Lines changed: 40 additions & 10 deletions
@@ -70,12 +70,12 @@ show up in /proc/sys/kernel:
 - shmall
 - shmmax [ sysv ipc ]
 - shmmni
-- softlockup_thresh
 - stop-a [ SPARC only ]
 - sysrq ==> Documentation/sysrq.txt
 - tainted
 - threads-max
 - unknown_nmi_panic
+- watchdog_thresh
 - version
 
 ==============================================================
@@ -427,6 +427,32 @@ This file shows up if CONFIG_DEBUG_STACKOVERFLOW is enabled.
 
 ==============================================================
 
+perf_cpu_time_max_percent:
+
+Hints to the kernel how much CPU time it should be allowed to
+use to handle perf sampling events. If the perf subsystem
+is informed that its samples are exceeding this limit, it
+will drop its sampling frequency to attempt to reduce its CPU
+usage.
+
+Some perf sampling happens in NMIs. If these samples
+unexpectedly take too long to execute, the NMIs can become
+stacked up next to each other so much that nothing else is
+allowed to execute.
+
+0: disable the mechanism. Do not monitor or correct perf's
+   sampling rate no matter how CPU time it takes.
+
+1-100: attempt to throttle perf's sample rate to this
+   percentage of CPU. Note: the kernel calculates an
+   "expected" length of each sample event. 100 here means
+   100% of that expected length. Even if this is set to
+   100, you may still see sample throttling if this
+   length is exceeded. Set to 0 if you truly do not care
+   how much CPU is consumed.
+
+==============================================================
+
 
 pid_max:
 
@@ -604,15 +630,6 @@ without users and with a dead originative process will be destroyed.
 
 ==============================================================
 
-softlockup_thresh:
-
-This value can be used to lower the softlockup tolerance threshold. The
-default threshold is 60 seconds. If a cpu is locked up for 60 seconds,
-the kernel complains. Valid values are 1-60 seconds. Setting this
-tunable to zero will disable the softlockup detection altogether.
-
-==============================================================
-
 tainted:
 
 Non-zero if the kernel has been tainted. Numeric values, which
@@ -648,3 +665,16 @@ that time, kernel debugging information is displayed on console.
 
 NMI switch that most IA32 servers have fires unknown NMI up, for
 example. If a system hangs up, try pressing the NMI switch.
+
+==============================================================
+
+watchdog_thresh:
+
+This value can be used to control the frequency of hrtimer and NMI
+events and the soft and hard lockup thresholds. The default threshold
+is 10 seconds.
+
+The softlockup threshold is (2 * watchdog_thresh). Setting this
+tunable to zero will disable lockup detection altogether.
+
+==============================================================
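
The arithmetic behind the two sysctls documented in this file can be sketched in a few lines of C. This is a hedged illustration rather than the kernel's code. The watchdog relation (soft lockup threshold = 2 * watchdog_thresh, 0 disables detection) comes straight from the text above. For perf_cpu_time_max_percent, the sketch assumes that the "expected" length of a sample is the sampling period, one second divided by a maximum sample rate; the text does not spell this out, and the function names are hypothetical.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/*
 * Assumption: the "expected" length of one sample is the sampling period,
 * NSEC_PER_SEC / max_sample_rate_hz. The budget is max_percent of that
 * period. A return value of 0 means the mechanism is disabled (or the
 * rate is invalid).
 */
static uint64_t perf_allowed_ns(unsigned int max_sample_rate_hz,
				unsigned int max_percent)
{
	if (max_sample_rate_hz == 0 || max_percent == 0)
		return 0;
	return (NSEC_PER_SEC / max_sample_rate_hz) * max_percent / 100;
}

/*
 * Soft lockup threshold in seconds, as documented: 2 * watchdog_thresh.
 * A result of 0 corresponds to lockup detection being disabled.
 */
static unsigned int softlockup_thresh_secs(unsigned int watchdog_thresh)
{
	return 2 * watchdog_thresh;
}
```

Under these assumptions, a maximum rate of 100000 samples per second with a 25 percent setting gives each sample a budget of 2500 ns, and the documented default watchdog_thresh of 10 gives a soft lockup threshold of 20 seconds. The input values for the perf case are illustrative only.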

Documentation/trace/events-nmi.txt

Lines changed: 43 additions & 0 deletions
@@ -0,0 +1,43 @@
+NMI Trace Events
+
+These events normally show up here:
+
+	/sys/kernel/debug/tracing/events/nmi
+
+--
+
+nmi_handler:
+
+You might want to use this tracepoint if you suspect that your
+NMI handlers are hogging large amounts of CPU time. The kernel
+will warn if it sees long-running handlers:
+
+	INFO: NMI handler took too long to run: 9.207 msecs
+
+and this tracepoint will allow you to drill down and get some
+more details.
+
+Let's say you suspect that perf_event_nmi_handler() is causing
+you some problems and you only want to trace that handler
+specifically. You need to find its address:
+
+	$ grep perf_event_nmi_handler /proc/kallsyms
+	ffffffff81625600 t perf_event_nmi_handler
+
+Let's also say you are only interested in when that function is
+really hogging a lot of CPU time, like a millisecond at a time.
+Note that the kernel's output is in milliseconds, but the input
+to the filter is in nanoseconds! You can filter on 'delta_ns':
+
+	cd /sys/kernel/debug/tracing/events/nmi/nmi_handler
+	echo 'handler==0xffffffff81625600 && delta_ns>1000000' > filter
+	echo 1 > enable
+
+Your output would then look like:
+
+	$ cat /sys/kernel/debug/tracing/trace_pipe
+	<idle>-0 [000] d.h3 505.397558: nmi_handler: perf_event_nmi_handler() delta_ns: 3236765 handled: 1
+	<idle>-0 [000] d.h3 505.805893: nmi_handler: perf_event_nmi_handler() delta_ns: 3174234 handled: 1
+	<idle>-0 [000] d.h3 506.158206: nmi_handler: perf_event_nmi_handler() delta_ns: 3084642 handled: 1
+	<idle>-0 [000] d.h3 506.334346: nmi_handler: perf_event_nmi_handler() delta_ns: 3080351 handled: 1
+
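
The unit mismatch called out in the new documentation (warnings in milliseconds, filter input in nanoseconds) is easy to get wrong by hand. The sketch below is an illustration only: a small userspace helper, with the hypothetical name nmi_filter_expr(), that builds the filter expression from a handler address (as read from /proc/kallsyms) and a threshold given in milliseconds. It only formats the string; writing it to the 'filter' file is left to the caller.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: build an nmi_handler filter expression of the form
 *   handler==0x<addr> && delta_ns><threshold in ns>
 * from a handler address and a threshold in milliseconds.
 * Returns 0 on success, -1 if the buffer is too small.
 */
static int nmi_filter_expr(char *buf, size_t len, uint64_t handler_addr,
			   unsigned int min_msecs)
{
	/* the filter compares against delta_ns, so convert ms to ns */
	unsigned long long min_ns = (unsigned long long)min_msecs * 1000000ULL;
	int n = snprintf(buf, len, "handler==0x%llx && delta_ns>%llu",
			 (unsigned long long)handler_addr, min_ns);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Called with the address 0xffffffff81625600 and a threshold of 1 ms, the helper produces the same expression that the documentation writes to the filter file.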

arch/metag/kernel/perf/perf_event.c

Lines changed: 1 addition & 1 deletion
@@ -882,7 +882,7 @@ static int __init init_hw_perf_events(void)
 	}
 
 	register_cpu_notifier(&metag_pmu_notifier);
-	ret = perf_pmu_register(&pmu, (char *)metag_pmu->name, PERF_TYPE_RAW);
+	ret = perf_pmu_register(&pmu, metag_pmu->name, PERF_TYPE_RAW);
 out:
 	return ret;
 }

arch/powerpc/perf/power7-pmu.c

Lines changed: 73 additions & 0 deletions
@@ -62,6 +62,29 @@
 #define PME_PM_BRU_FIN 0x10068
 #define PME_PM_BRU_MPRED 0x400f6
 
+#define PME_PM_CMPLU_STALL_FXU 0x20014
+#define PME_PM_CMPLU_STALL_DIV 0x40014
+#define PME_PM_CMPLU_STALL_SCALAR 0x40012
+#define PME_PM_CMPLU_STALL_SCALAR_LONG 0x20018
+#define PME_PM_CMPLU_STALL_VECTOR 0x2001c
+#define PME_PM_CMPLU_STALL_VECTOR_LONG 0x4004a
+#define PME_PM_CMPLU_STALL_LSU 0x20012
+#define PME_PM_CMPLU_STALL_REJECT 0x40016
+#define PME_PM_CMPLU_STALL_ERAT_MISS 0x40018
+#define PME_PM_CMPLU_STALL_DCACHE_MISS 0x20016
+#define PME_PM_CMPLU_STALL_STORE 0x2004a
+#define PME_PM_CMPLU_STALL_THRD 0x1001c
+#define PME_PM_CMPLU_STALL_IFU 0x4004c
+#define PME_PM_CMPLU_STALL_BRU 0x4004e
+#define PME_PM_GCT_NOSLOT_IC_MISS 0x2001a
+#define PME_PM_GCT_NOSLOT_BR_MPRED 0x4001a
+#define PME_PM_GCT_NOSLOT_BR_MPRED_IC_MISS 0x4001c
+#define PME_PM_GRP_CMPL 0x30004
+#define PME_PM_1PLUS_PPC_CMPL 0x100f2
+#define PME_PM_CMPLU_STALL_DFU 0x2003c
+#define PME_PM_RUN_CYC 0x200f4
+#define PME_PM_RUN_INST_CMPL 0x400fa
+
 /*
  * Layout of constraint bits:
  * 6666555555555544444444443333333333222222222211111111110000000000
@@ -393,6 +416,31 @@ POWER_EVENT_ATTR(LD_MISS_L1, LD_MISS_L1);
 POWER_EVENT_ATTR(BRU_FIN, BRU_FIN)
 POWER_EVENT_ATTR(BRU_MPRED, BRU_MPRED);
 
+POWER_EVENT_ATTR(CMPLU_STALL_FXU, CMPLU_STALL_FXU);
+POWER_EVENT_ATTR(CMPLU_STALL_DIV, CMPLU_STALL_DIV);
+POWER_EVENT_ATTR(CMPLU_STALL_SCALAR, CMPLU_STALL_SCALAR);
+POWER_EVENT_ATTR(CMPLU_STALL_SCALAR_LONG, CMPLU_STALL_SCALAR_LONG);
+POWER_EVENT_ATTR(CMPLU_STALL_VECTOR, CMPLU_STALL_VECTOR);
+POWER_EVENT_ATTR(CMPLU_STALL_VECTOR_LONG, CMPLU_STALL_VECTOR_LONG);
+POWER_EVENT_ATTR(CMPLU_STALL_LSU, CMPLU_STALL_LSU);
+POWER_EVENT_ATTR(CMPLU_STALL_REJECT, CMPLU_STALL_REJECT);
+
+POWER_EVENT_ATTR(CMPLU_STALL_ERAT_MISS, CMPLU_STALL_ERAT_MISS);
+POWER_EVENT_ATTR(CMPLU_STALL_DCACHE_MISS, CMPLU_STALL_DCACHE_MISS);
+POWER_EVENT_ATTR(CMPLU_STALL_STORE, CMPLU_STALL_STORE);
+POWER_EVENT_ATTR(CMPLU_STALL_THRD, CMPLU_STALL_THRD);
+POWER_EVENT_ATTR(CMPLU_STALL_IFU, CMPLU_STALL_IFU);
+POWER_EVENT_ATTR(CMPLU_STALL_BRU, CMPLU_STALL_BRU);
+POWER_EVENT_ATTR(GCT_NOSLOT_IC_MISS, GCT_NOSLOT_IC_MISS);
+
+POWER_EVENT_ATTR(GCT_NOSLOT_BR_MPRED, GCT_NOSLOT_BR_MPRED);
+POWER_EVENT_ATTR(GCT_NOSLOT_BR_MPRED_IC_MISS, GCT_NOSLOT_BR_MPRED_IC_MISS);
+POWER_EVENT_ATTR(GRP_CMPL, GRP_CMPL);
+POWER_EVENT_ATTR(1PLUS_PPC_CMPL, 1PLUS_PPC_CMPL);
+POWER_EVENT_ATTR(CMPLU_STALL_DFU, CMPLU_STALL_DFU);
+POWER_EVENT_ATTR(RUN_CYC, RUN_CYC);
+POWER_EVENT_ATTR(RUN_INST_CMPL, RUN_INST_CMPL);
+
 static struct attribute *power7_events_attr[] = {
 	GENERIC_EVENT_PTR(CYC),
 	GENERIC_EVENT_PTR(GCT_NOSLOT_CYC),
@@ -411,6 +459,31 @@ static struct attribute *power7_events_attr[] = {
 	POWER_EVENT_PTR(LD_MISS_L1),
 	POWER_EVENT_PTR(BRU_FIN),
 	POWER_EVENT_PTR(BRU_MPRED),
+
+	POWER_EVENT_PTR(CMPLU_STALL_FXU),
+	POWER_EVENT_PTR(CMPLU_STALL_DIV),
+	POWER_EVENT_PTR(CMPLU_STALL_SCALAR),
+	POWER_EVENT_PTR(CMPLU_STALL_SCALAR_LONG),
+	POWER_EVENT_PTR(CMPLU_STALL_VECTOR),
+	POWER_EVENT_PTR(CMPLU_STALL_VECTOR_LONG),
+	POWER_EVENT_PTR(CMPLU_STALL_LSU),
+	POWER_EVENT_PTR(CMPLU_STALL_REJECT),
+
+	POWER_EVENT_PTR(CMPLU_STALL_ERAT_MISS),
+	POWER_EVENT_PTR(CMPLU_STALL_DCACHE_MISS),
+	POWER_EVENT_PTR(CMPLU_STALL_STORE),
+	POWER_EVENT_PTR(CMPLU_STALL_THRD),
+	POWER_EVENT_PTR(CMPLU_STALL_IFU),
+	POWER_EVENT_PTR(CMPLU_STALL_BRU),
+	POWER_EVENT_PTR(GCT_NOSLOT_IC_MISS),
+	POWER_EVENT_PTR(GCT_NOSLOT_BR_MPRED),
+
+	POWER_EVENT_PTR(GCT_NOSLOT_BR_MPRED_IC_MISS),
+	POWER_EVENT_PTR(GRP_CMPL),
+	POWER_EVENT_PTR(1PLUS_PPC_CMPL),
+	POWER_EVENT_PTR(CMPLU_STALL_DFU),
+	POWER_EVENT_PTR(RUN_CYC),
+	POWER_EVENT_PTR(RUN_INST_CMPL),
 	NULL
 };

arch/x86/ia32/ia32_signal.c

Lines changed: 0 additions & 2 deletions
@@ -34,8 +34,6 @@
 #include <asm/sys_ia32.h>
 #include <asm/smap.h>
 
-#define FIX_EFLAGS __FIX_EFLAGS
-
 int copy_siginfo_to_user32(compat_siginfo_t __user *to, siginfo_t *from)
 {
 	int err = 0;

arch/x86/include/asm/perf_event.h

Lines changed: 3 additions & 0 deletions
@@ -29,6 +29,9 @@
 #define ARCH_PERFMON_EVENTSEL_INV (1ULL << 23)
 #define ARCH_PERFMON_EVENTSEL_CMASK 0xFF000000ULL
 
+#define HSW_IN_TX (1ULL << 32)
+#define HSW_IN_TX_CHECKPOINTED (1ULL << 33)
+
 #define AMD64_EVENTSEL_INT_CORE_ENABLE (1ULL << 36)
 #define AMD64_EVENTSEL_GUESTONLY (1ULL << 40)
 #define AMD64_EVENTSEL_HOSTONLY (1ULL << 41)

arch/x86/include/asm/sighandling.h

Lines changed: 2 additions & 2 deletions
@@ -7,10 +7,10 @@
 
 #include <asm/processor-flags.h>
 
-#define __FIX_EFLAGS	(X86_EFLAGS_AC | X86_EFLAGS_OF | \
+#define FIX_EFLAGS	(X86_EFLAGS_AC | X86_EFLAGS_OF | \
 			 X86_EFLAGS_DF | X86_EFLAGS_TF | X86_EFLAGS_SF | \
 			 X86_EFLAGS_ZF | X86_EFLAGS_AF | X86_EFLAGS_PF | \
-			 X86_EFLAGS_CF)
+			 X86_EFLAGS_CF | X86_EFLAGS_RF)
 
 void signal_fault(struct pt_regs *regs, void __user *frame, char *where);
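
To make the effect of adding X86_EFLAGS_RF to the mask concrete, here is a hedged userspace sketch. It assumes that a mask like FIX_EFLAGS selects which flag bits may be taken from a saved, user-supplied flags value while every other bit keeps its current state. The diff itself only shows the mask definition, so this usage is an assumption. The bit values are the architectural x86 EFLAGS positions, and merge_user_eflags() is a hypothetical name.

```c
/* Architectural x86 EFLAGS bit values, redefined here for a userspace sketch. */
#define X86_EFLAGS_CF 0x00000001UL	/* carry */
#define X86_EFLAGS_PF 0x00000004UL	/* parity */
#define X86_EFLAGS_AF 0x00000010UL	/* auxiliary carry */
#define X86_EFLAGS_ZF 0x00000040UL	/* zero */
#define X86_EFLAGS_SF 0x00000080UL	/* sign */
#define X86_EFLAGS_TF 0x00000100UL	/* trap */
#define X86_EFLAGS_DF 0x00000400UL	/* direction */
#define X86_EFLAGS_OF 0x00000800UL	/* overflow */
#define X86_EFLAGS_RF 0x00010000UL	/* resume */
#define X86_EFLAGS_AC 0x00040000UL	/* alignment check */

/* Same set of flags as the new definition in the diff above. */
#define FIX_EFLAGS	(X86_EFLAGS_AC | X86_EFLAGS_OF | \
			 X86_EFLAGS_DF | X86_EFLAGS_TF | X86_EFLAGS_SF | \
			 X86_EFLAGS_ZF | X86_EFLAGS_AF | X86_EFLAGS_PF | \
			 X86_EFLAGS_CF | X86_EFLAGS_RF)

/*
 * Hypothetical helper: take only the FIX_EFLAGS bits from `saved` and keep
 * every other bit of `current` unchanged.
 */
static unsigned long merge_user_eflags(unsigned long current, unsigned long saved)
{
	return (current & ~FIX_EFLAGS) | (saved & FIX_EFLAGS);
}
```

With RF in the mask, a saved value that has RF set carries that bit through the merge, while privileged bits outside the mask, such as IOPL, are ignored.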

arch/x86/include/uapi/asm/msr-index.h

Lines changed: 3 additions & 0 deletions
@@ -170,6 +170,9 @@
 #define MSR_KNC_EVNTSEL0 0x00000028
 #define MSR_KNC_EVNTSEL1 0x00000029
 
+/* Alternative perfctr range with full access. */
+#define MSR_IA32_PMC0 0x000004c1
+
 /* AMD64 MSRs. Not complete. See the architecture manual for a more
    complete list. */
