build(deps): bump setuptools from 68.0.0 to 70.0.0 in /drivers/gpu/drm/ci/xfails #18
Open
dependabot wants to merge 1 commit into master from dependabot/pip/drivers/gpu/drm/ci/xfails/setuptools-70.0.0
+1
−1
Bumps [setuptools](https://github.com/pypa/setuptools) from 68.0.0 to 70.0.0.
- [Release notes](https://github.com/pypa/setuptools/releases)
- [Changelog](https://github.com/pypa/setuptools/blob/main/NEWS.rst)
- [Commits](pypa/setuptools@v68.0.0...v70.0.0)

---
updated-dependencies:
- dependency-name: setuptools
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>
dependabot (bot) added the dependencies label (Pull requests that update a dependency file) on Jul 15, 2024
Thor-x86 pushed a commit to Thor-x86/linux that referenced this pull request on Jul 16, 2024
The code in ocfs2_dio_end_io_write() estimates number of necessary transaction credits using ocfs2_calc_extend_credits(). This however does not take into account that the IO could be arbitrarily large and can contain arbitrary number of extents.

Extent tree manipulations do often extend the current transaction but not in all of the cases. For example if we have only single block extents in the tree, ocfs2_mark_extent_written() will end up calling ocfs2_replace_extent_rec() all the time and we will never extend the current transaction and eventually exhaust all the transaction credits if the IO contains many single block extents. Once that happens a WARN_ON(jbd2_handle_buffer_credits(handle) <= 0) is triggered in jbd2_journal_dirty_metadata() and subsequently OCFS2 aborts in response to this error. This was actually triggered by one of our customers on a heavily fragmented OCFS2 filesystem.

To fix the issue make sure the transaction always has enough credits for one extent insert before each call of ocfs2_mark_extent_written().

Heming Zhao said:

------
PANIC: "Kernel panic - not syncing: OCFS2: (device dm-1): panic forced after error"

PID: xxx  TASK: xxxx  CPU: 5  COMMAND: "SubmitThread-CA"
  #0 machine_kexec at ffffffff8c069932
  #1 __crash_kexec at ffffffff8c1338fa
  #2 panic at ffffffff8c1d69b9
  #3 ocfs2_handle_error at ffffffffc0c86c0c [ocfs2]
  #4 __ocfs2_abort at ffffffffc0c88387 [ocfs2]
  #5 ocfs2_journal_dirty at ffffffffc0c51e98 [ocfs2]
  #6 ocfs2_split_extent at ffffffffc0c27ea3 [ocfs2]
  #7 ocfs2_change_extent_flag at ffffffffc0c28053 [ocfs2]
  #8 ocfs2_mark_extent_written at ffffffffc0c28347 [ocfs2]
  #9 ocfs2_dio_end_io_write at ffffffffc0c2bef9 [ocfs2]
 #10 ocfs2_dio_end_io at ffffffffc0c2c0f5 [ocfs2]
 #11 dio_complete at ffffffff8c2b9fa7
 #12 do_blockdev_direct_IO at ffffffff8c2bc09f
 #13 ocfs2_direct_IO at ffffffffc0c2b653 [ocfs2]
 #14 generic_file_direct_write at ffffffff8c1dcf14
 #15 __generic_file_write_iter at ffffffff8c1dd07b
 #16 ocfs2_file_write_iter at ffffffffc0c49f1f [ocfs2]
 #17 aio_write at ffffffff8c2cc72e
 #18 kmem_cache_alloc at ffffffff8c248dde
 #19 do_io_submit at ffffffff8c2ccada
 #20 do_syscall_64 at ffffffff8c004984
 #21 entry_SYSCALL_64_after_hwframe at ffffffff8c8000ba

Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Fixes: c15471f ("ocfs2: fix sparse file & data ordering issue in direct io")
Signed-off-by: Jan Kara <[email protected]>
Reviewed-by: Joseph Qi <[email protected]>
Reviewed-by: Heming Zhao <[email protected]>
Cc: Mark Fasheh <[email protected]>
Cc: Joel Becker <[email protected]>
Cc: Junxiao Bi <[email protected]>
Cc: Changwei Ge <[email protected]>
Cc: Gang He <[email protected]>
Cc: Jun Piao <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
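The credit-exhaustion scenario in the ocfs2 message above can be illustrated with a toy model. This is only a sketch in Python, not kernel code: `Handle`, `ensure_credits`, `mark_extents_written` and `CreditsExhausted` are hypothetical names invented for illustration, and the per-extent cost of one credit is an assumption.

```python
from dataclasses import dataclass


class CreditsExhausted(Exception):
    """Stands in for the WARN_ON and subsequent abort described in the message."""


@dataclass
class Handle:
    """Toy journal handle that only tracks remaining buffer credits."""
    credits: int

    def dirty_metadata(self, cost: int = 1) -> None:
        # Dirtying metadata with no credits left is treated as an error,
        # loosely mirroring the check quoted in the commit message.
        if self.credits <= 0:
            raise CreditsExhausted("no buffer credits left")
        self.credits -= cost

    def ensure_credits(self, needed: int) -> None:
        # Hypothetical helper: top the handle up so at least `needed` remain.
        if self.credits < needed:
            self.credits = needed


def mark_extents_written(handle: Handle, n_extents: int,
                         per_extent_cost: int = 1, ensure: bool = False) -> int:
    """Process n_extents single-block extents; return the credits left over."""
    for _ in range(n_extents):
        if ensure:
            # Sketch of the fix: guarantee credits before every extent.
            handle.ensure_credits(per_extent_cost)
        handle.dirty_metadata(per_extent_cost)
    return handle.credits
```

With an up-front estimate of 4 credits and 10 single-block extents, the unguarded loop runs out on the fifth extent, while the guarded loop completes because it re-checks the budget before each step.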
Thor-x86 pushed a commit to Thor-x86/linux that referenced this pull request on Jul 16, 2024
When running BPF selftests (./test_progs -t sockmap_basic) on a Loongarch platform, the following kernel panic occurs:

  [...]
  Oops[#1]:
  CPU: 22 PID: 2824 Comm: test_progs Tainted: G OE 6.10.0-rc2+ #18
  Hardware name: LOONGSON Dabieshan/Loongson-TC542F0, BIOS Loongson-UDK2018
  ... ...
  ra: 90000000048bf6c0 sk_msg_recvmsg+0x120/0x560
  ERA: 9000000004162774 copy_page_to_iter+0x74/0x1c0
  CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
  PRMD: 0000000c (PPLV0 +PIE +PWE)
  EUEN: 00000007 (+FPE +SXE +ASXE -BTE)
  ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
  ESTAT: 00010000 [PIL] (IS= ECode=1 EsubCode=0)
  BADV: 0000000000000040
  PRID: 0014c011 (Loongson-64bit, Loongson-3C5000)
  Modules linked in: bpf_testmod(OE) xt_CHECKSUM xt_MASQUERADE xt_conntrack
  Process test_progs (pid: 2824, threadinfo=0000000000863a31, task=...)
  Stack : ...
  Call Trace:
  [<9000000004162774>] copy_page_to_iter+0x74/0x1c0
  [<90000000048bf6c0>] sk_msg_recvmsg+0x120/0x560
  [<90000000049f2b90>] tcp_bpf_recvmsg_parser+0x170/0x4e0
  [<90000000049aae34>] inet_recvmsg+0x54/0x100
  [<900000000481ad5c>] sock_recvmsg+0x7c/0xe0
  [<900000000481e1a8>] __sys_recvfrom+0x108/0x1c0
  [<900000000481e27c>] sys_recvfrom+0x1c/0x40
  [<9000000004c076ec>] do_syscall+0x8c/0xc0
  [<9000000003731da4>] handle_syscall+0xc4/0x160
  Code: ...
  ---[ end trace 0000000000000000 ]---
  Kernel panic - not syncing: Fatal exception
  Kernel relocated by 0x3510000
  .text @ 0x9000000003710000
  .data @ 0x9000000004d70000
  .bss @ 0x9000000006469400
  ---[ end Kernel panic - not syncing: Fatal exception ]---
  [...]

This crash happens every time when running sockmap_skb_verdict_shutdown subtest in sockmap_basic.

This crash is because a NULL pointer is passed to page_address() in the sk_msg_recvmsg(). Due to the different implementations depending on the architecture, page_address(NULL) will trigger a panic on Loongarch platform but not on x86 platform. So this bug was hidden on x86 platform for a while, but now it is exposed on Loongarch platform.

The root cause is that a zero length skb (skb->len == 0) was put on the queue. This zero length skb is a TCP FIN packet, which was sent by shutdown(), invoked in test_sockmap_skb_verdict_shutdown():

  shutdown(p1, SHUT_WR);

In this case, in sk_psock_skb_ingress_enqueue(), num_sge is zero, and no page is put to this sge (see sg_set_page in sg_set_page), but this empty sge is queued into ingress_msg list.

And in sk_msg_recvmsg(), this empty sge is used, and a NULL page is got by sg_page(sge). Pass this NULL page to copy_page_to_iter(), which passes it to kmap_local_page() and to page_address(), then kernel panics.

To solve this, we should skip this zero length skb. So in sk_msg_recvmsg(), if copy is zero, that means it's a zero length skb, skip invoking copy_page_to_iter(). We are using the EFAULT return triggered by copy_page_to_iter to check for is_fin in tcp_bpf.c.

Fixes: 604326b ("bpf, sockmap: convert to generic sk_msg interface")
Suggested-by: John Fastabend <[email protected]>
Signed-off-by: Geliang Tang <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Reviewed-by: John Fastabend <[email protected]>
Link: https://lore.kernel.org/bpf/e3a16eacdc6740658ee02a33489b1b9d4912f378.1719992715.git.tanggeliang@kylinos.cn
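The zero-length case described in the sockmap message above can be sketched with a small Python model. It is a simplification under stated assumptions, not the kernel implementation: the `recvmsg` and `copy_page_to_iter` functions here are toy stand-ins, the `(page, length)` tuple representation is invented for illustration, and the EFAULT/is_fin handling the message mentions is not modeled. The toy version simply moves on to the next entry when there is nothing to copy.

```python
from typing import List, Optional, Tuple

# A scatter-gather entry modeled as (page, length); page is None when no
# page was ever attached, as for the zero-length FIN entry in the report.
Sge = Tuple[Optional[bytes], int]


def copy_page_to_iter(page: Optional[bytes], length: int, out: bytearray) -> int:
    # Toy stand-in: touching a missing page represents the reported crash.
    if page is None:
        raise RuntimeError("NULL page passed to copy_page_to_iter")
    out.extend(page[:length])
    return length


def recvmsg(queue: List[Sge], want: int, skip_zero_length: bool = True) -> bytes:
    """Copy up to `want` bytes out of the queued entries."""
    out = bytearray()
    for page, length in queue:
        copy = min(length, want - len(out))
        if copy == 0 and skip_zero_length:
            # Sketch of the fix: nothing to copy, so never touch the page.
            continue
        copy_page_to_iter(page, copy, out)
        if len(out) >= want:
            break
    return bytes(out)
```

With the guard enabled, a queue containing an empty entry between two data entries is read without ever passing the missing page to the copy routine; with the guard disabled, the toy copy routine raises.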
piso77 pushed a commit to piso77/linux that referenced this pull request on Jul 17, 2024
mwifiex_get_priv_by_id() returns the priv pointer corresponding to the bss_num and bss_type, but without checking if the priv is actually currently in use. Unused priv pointers do not have a wiphy attached to them which can lead to NULL pointer dereferences further down the callstack. Fix this by returning only used priv pointers which have priv->bss_mode set to something else than NL80211_IFTYPE_UNSPECIFIED.

Said NULL pointer dereference happened when an Accesspoint was started with wpa_supplicant -i mlan0 with this config:

network={
        ssid="somessid"
        mode=2
        frequency=2412
        key_mgmt=WPA-PSK WPA-PSK-SHA256
        proto=RSN
        group=CCMP
        pairwise=CCMP
        psk="12345678"
}

When waiting for the AP to be established, interrupting wpa_supplicant with <ctrl-c> and starting it again this happens:

| Unable to handle kernel NULL pointer dereference at virtual address 0000000000000140
| Mem abort info:
|   ESR = 0x0000000096000004
|   EC = 0x25: DABT (current EL), IL = 32 bits
|   SET = 0, FnV = 0
|   EA = 0, S1PTW = 0
|   FSC = 0x04: level 0 translation fault
| Data abort info:
|   ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
|   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
|   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
| user pgtable: 4k pages, 48-bit VAs, pgdp=0000000046d96000
| [0000000000000140] pgd=0000000000000000, p4d=0000000000000000
| Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP
| Modules linked in: caam_jr caamhash_desc spidev caamalg_desc crypto_engine authenc libdes mwifiex_sdio +mwifiex crct10dif_ce cdc_acm onboard_usb_hub fsl_imx8_ddr_perf imx8m_ddrc rtc_ds1307 lm75 rtc_snvs +imx_sdma caam imx8mm_thermal spi_imx error imx_cpufreq_dt fuse ip_tables x_tables ipv6
| CPU: 0 PID: 8 Comm: kworker/0:1 Not tainted 6.9.0-00007-g937242013fce-dirty #18
| Hardware name: somemachine (DT)
| Workqueue: events sdio_irq_work
| pstate: 00000005 (nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
| pc : mwifiex_get_cfp+0xd8/0x15c [mwifiex]
| lr : mwifiex_get_cfp+0x34/0x15c [mwifiex]
| sp : ffff8000818b3a70
| x29: ffff8000818b3a70 x28: ffff000006bfd8a5 x27: 0000000000000004
| x26: 000000000000002c x25: 0000000000001511 x24: 0000000002e86bc9
| x23: ffff000006bfd996 x22: 0000000000000004 x21: ffff000007bec000
| x20: 000000000000002c x19: 0000000000000000 x18: 0000000000000000
| x17: 000000040044ffff x16: 00500072b5503510 x15: ccc283740681e517
| x14: 0201000101006d15 x13: 0000000002e8ff43 x12: 002c01000000ffb1
| x11: 0100000000000000 x10: 02e8ff43002c0100 x9 : 0000ffb100100157
| x8 : ffff000003d20000 x7 : 00000000000002f1 x6 : 00000000ffffe124
| x5 : 0000000000000001 x4 : 0000000000000003 x3 : 0000000000000000
| x2 : 0000000000000000 x1 : 0001000000011001 x0 : 0000000000000000
| Call trace:
|  mwifiex_get_cfp+0xd8/0x15c [mwifiex]
|  mwifiex_parse_single_response_buf+0x1d0/0x504 [mwifiex]
|  mwifiex_handle_event_ext_scan_report+0x19c/0x2f8 [mwifiex]
|  mwifiex_process_sta_event+0x298/0xf0c [mwifiex]
|  mwifiex_process_event+0x110/0x238 [mwifiex]
|  mwifiex_main_process+0x428/0xa44 [mwifiex]
|  mwifiex_sdio_interrupt+0x64/0x12c [mwifiex_sdio]
|  process_sdio_pending_irqs+0x64/0x1b8
|  sdio_irq_work+0x4c/0x7c
|  process_one_work+0x148/0x2a0
|  worker_thread+0x2fc/0x40c
|  kthread+0x110/0x114
|  ret_from_fork+0x10/0x20
| Code: a94153f3 a8c37bfd d50323bf d65f03c0 (f940a000)
| ---[ end trace 0000000000000000 ]---

Signed-off-by: Sascha Hauer <[email protected]>
Acked-by: Brian Norris <[email protected]>
Reviewed-by: Francesco Dolcini <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
Link: https://patch.msgid.link/[email protected]
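The lookup rule described in the mwifiex message above (return only priv entries whose bss_mode is set) can be sketched as follows. This is a toy Python model, not the driver code: the `Priv` dataclass, the `IfType` enum and the list-based table are assumptions made for illustration, with `IfType.UNSPECIFIED` standing in for NL80211_IFTYPE_UNSPECIFIED.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class IfType(Enum):
    UNSPECIFIED = 0  # stands in for NL80211_IFTYPE_UNSPECIFIED
    STATION = 1
    AP = 2


@dataclass
class Priv:
    bss_num: int
    bss_type: int
    bss_mode: IfType = IfType.UNSPECIFIED  # unused slots keep the default


def get_priv_by_id(privs: List[Priv], bss_num: int, bss_type: int) -> Optional[Priv]:
    """Return the matching priv only if it is currently in use."""
    for priv in privs:
        if priv.bss_mode is IfType.UNSPECIFIED:
            # Unused slot: skip it so callers never see a priv without state.
            continue
        if priv.bss_num == bss_num and priv.bss_type == bss_type:
            return priv
    return None
```

Callers in this model get `None` for a slot that matches on bss_num and bss_type but was never brought up, so they have to handle the missing case explicitly instead of dereferencing a half-initialized entry.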
gregkh pushed a commit that referenced this pull request on Jul 18, 2024
[ Upstream commit f0c1802 ]

When running BPF selftests (./test_progs -t sockmap_basic) on a Loongarch platform, the following kernel panic occurs:

  [...]
  Oops[#1]:
  CPU: 22 PID: 2824 Comm: test_progs Tainted: G OE 6.10.0-rc2+ #18
  Hardware name: LOONGSON Dabieshan/Loongson-TC542F0, BIOS Loongson-UDK2018
  ... ...
  ra: 90000000048bf6c0 sk_msg_recvmsg+0x120/0x560
  ERA: 9000000004162774 copy_page_to_iter+0x74/0x1c0
  CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE)
  PRMD: 0000000c (PPLV0 +PIE +PWE)
  EUEN: 00000007 (+FPE +SXE +ASXE -BTE)
  ECFG: 00071c1d (LIE=0,2-4,10-12 VS=7)
  ESTAT: 00010000 [PIL] (IS= ECode=1 EsubCode=0)
  BADV: 0000000000000040
  PRID: 0014c011 (Loongson-64bit, Loongson-3C5000)
  Modules linked in: bpf_testmod(OE) xt_CHECKSUM xt_MASQUERADE xt_conntrack
  Process test_progs (pid: 2824, threadinfo=0000000000863a31, task=...)
  Stack : ...
  Call Trace:
  [<9000000004162774>] copy_page_to_iter+0x74/0x1c0
  [<90000000048bf6c0>] sk_msg_recvmsg+0x120/0x560
  [<90000000049f2b90>] tcp_bpf_recvmsg_parser+0x170/0x4e0
  [<90000000049aae34>] inet_recvmsg+0x54/0x100
  [<900000000481ad5c>] sock_recvmsg+0x7c/0xe0
  [<900000000481e1a8>] __sys_recvfrom+0x108/0x1c0
  [<900000000481e27c>] sys_recvfrom+0x1c/0x40
  [<9000000004c076ec>] do_syscall+0x8c/0xc0
  [<9000000003731da4>] handle_syscall+0xc4/0x160
  Code: ...
  ---[ end trace 0000000000000000 ]---
  Kernel panic - not syncing: Fatal exception
  Kernel relocated by 0x3510000
  .text @ 0x9000000003710000
  .data @ 0x9000000004d70000
  .bss @ 0x9000000006469400
  ---[ end Kernel panic - not syncing: Fatal exception ]---
  [...]

This crash happens every time when running sockmap_skb_verdict_shutdown subtest in sockmap_basic.

This crash is because a NULL pointer is passed to page_address() in the sk_msg_recvmsg(). Due to the different implementations depending on the architecture, page_address(NULL) will trigger a panic on Loongarch platform but not on x86 platform. So this bug was hidden on x86 platform for a while, but now it is exposed on Loongarch platform.

The root cause is that a zero length skb (skb->len == 0) was put on the queue. This zero length skb is a TCP FIN packet, which was sent by shutdown(), invoked in test_sockmap_skb_verdict_shutdown():

  shutdown(p1, SHUT_WR);

In this case, in sk_psock_skb_ingress_enqueue(), num_sge is zero, and no page is put to this sge (see sg_set_page in sg_set_page), but this empty sge is queued into ingress_msg list.

And in sk_msg_recvmsg(), this empty sge is used, and a NULL page is got by sg_page(sge). Pass this NULL page to copy_page_to_iter(), which passes it to kmap_local_page() and to page_address(), then kernel panics.

To solve this, we should skip this zero length skb. So in sk_msg_recvmsg(), if copy is zero, that means it's a zero length skb, skip invoking copy_page_to_iter(). We are using the EFAULT return triggered by copy_page_to_iter to check for is_fin in tcp_bpf.c.

Fixes: 604326b ("bpf, sockmap: convert to generic sk_msg interface")
Suggested-by: John Fastabend <[email protected]>
Signed-off-by: Geliang Tang <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Reviewed-by: John Fastabend <[email protected]>
Link: https://lore.kernel.org/bpf/e3a16eacdc6740658ee02a33489b1b9d4912f378.1719992715.git.tanggeliang@kylinos.cn
Signed-off-by: Sasha Levin <[email protected]>
user-why-red pushed a commit to user-why-red/linux_stable that referenced this pull request on Jul 30, 2024
commit be346c1 upstream.

The code in ocfs2_dio_end_io_write() estimates number of necessary transaction credits using ocfs2_calc_extend_credits(). This however does not take into account that the IO could be arbitrarily large and can contain arbitrary number of extents.

Extent tree manipulations do often extend the current transaction but not in all of the cases. For example if we have only single block extents in the tree, ocfs2_mark_extent_written() will end up calling ocfs2_replace_extent_rec() all the time and we will never extend the current transaction and eventually exhaust all the transaction credits if the IO contains many single block extents. Once that happens a WARN_ON(jbd2_handle_buffer_credits(handle) <= 0) is triggered in jbd2_journal_dirty_metadata() and subsequently OCFS2 aborts in response to this error. This was actually triggered by one of our customers on a heavily fragmented OCFS2 filesystem.

To fix the issue make sure the transaction always has enough credits for one extent insert before each call of ocfs2_mark_extent_written().

Heming Zhao said:

------
PANIC: "Kernel panic - not syncing: OCFS2: (device dm-1): panic forced after error"

PID: xxx  TASK: xxxx  CPU: 5  COMMAND: "SubmitThread-CA"
  #0 machine_kexec at ffffffff8c069932
  #1 __crash_kexec at ffffffff8c1338fa
  #2 panic at ffffffff8c1d69b9
  #3 ocfs2_handle_error at ffffffffc0c86c0c [ocfs2]
  #4 __ocfs2_abort at ffffffffc0c88387 [ocfs2]
  #5 ocfs2_journal_dirty at ffffffffc0c51e98 [ocfs2]
  #6 ocfs2_split_extent at ffffffffc0c27ea3 [ocfs2]
  #7 ocfs2_change_extent_flag at ffffffffc0c28053 [ocfs2]
  #8 ocfs2_mark_extent_written at ffffffffc0c28347 [ocfs2]
  #9 ocfs2_dio_end_io_write at ffffffffc0c2bef9 [ocfs2]
 #10 ocfs2_dio_end_io at ffffffffc0c2c0f5 [ocfs2]
 #11 dio_complete at ffffffff8c2b9fa7
 #12 do_blockdev_direct_IO at ffffffff8c2bc09f
 #13 ocfs2_direct_IO at ffffffffc0c2b653 [ocfs2]
 #14 generic_file_direct_write at ffffffff8c1dcf14
 #15 __generic_file_write_iter at ffffffff8c1dd07b
 #16 ocfs2_file_write_iter at ffffffffc0c49f1f [ocfs2]
 #17 aio_write at ffffffff8c2cc72e
 #18 kmem_cache_alloc at ffffffff8c248dde
 #19 do_io_submit at ffffffff8c2ccada
 #20 do_syscall_64 at ffffffff8c004984
 #21 entry_SYSCALL_64_after_hwframe at ffffffff8c8000ba

Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Fixes: c15471f ("ocfs2: fix sparse file & data ordering issue in direct io")
Signed-off-by: Jan Kara <[email protected]>
Reviewed-by: Joseph Qi <[email protected]>
Reviewed-by: Heming Zhao <[email protected]>
Cc: Mark Fasheh <[email protected]>
Cc: Joel Becker <[email protected]>
Cc: Junxiao Bi <[email protected]>
Cc: Changwei Ge <[email protected]>
Cc: Gang He <[email protected]>
Cc: Jun Piao <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
user-why-red
pushed a commit
to user-why-red/linux_stable
that referenced
this pull request
Jul 30, 2024
commit be346c1 upstream. The code in ocfs2_dio_end_io_write() estimates number of necessary transaction credits using ocfs2_calc_extend_credits(). This however does not take into account that the IO could be arbitrarily large and can contain arbitrary number of extents. Extent tree manipulations do often extend the current transaction but not in all of the cases. For example if we have only single block extents in the tree, ocfs2_mark_extent_written() will end up calling ocfs2_replace_extent_rec() all the time and we will never extend the current transaction and eventually exhaust all the transaction credits if the IO contains many single block extents. Once that happens a WARN_ON(jbd2_handle_buffer_credits(handle) <= 0) is triggered in jbd2_journal_dirty_metadata() and subsequently OCFS2 aborts in response to this error. This was actually triggered by one of our customers on a heavily fragmented OCFS2 filesystem. To fix the issue make sure the transaction always has enough credits for one extent insert before each call of ocfs2_mark_extent_written(). 
Heming Zhao said: ------ PANIC: "Kernel panic - not syncing: OCFS2: (device dm-1): panic forced after error" PID: xxx TASK: xxxx CPU: 5 COMMAND: "SubmitThread-CA" #0 machine_kexec at ffffffff8c069932 gregkh#1 __crash_kexec at ffffffff8c1338fa gregkh#2 panic at ffffffff8c1d69b9 gregkh#3 ocfs2_handle_error at ffffffffc0c86c0c [ocfs2] gregkh#4 __ocfs2_abort at ffffffffc0c88387 [ocfs2] gregkh#5 ocfs2_journal_dirty at ffffffffc0c51e98 [ocfs2] gregkh#6 ocfs2_split_extent at ffffffffc0c27ea3 [ocfs2] gregkh#7 ocfs2_change_extent_flag at ffffffffc0c28053 [ocfs2] gregkh#8 ocfs2_mark_extent_written at ffffffffc0c28347 [ocfs2] gregkh#9 ocfs2_dio_end_io_write at ffffffffc0c2bef9 [ocfs2] gregkh#10 ocfs2_dio_end_io at ffffffffc0c2c0f5 [ocfs2] gregkh#11 dio_complete at ffffffff8c2b9fa7 gregkh#12 do_blockdev_direct_IO at ffffffff8c2bc09f gregkh#13 ocfs2_direct_IO at ffffffffc0c2b653 [ocfs2] gregkh#14 generic_file_direct_write at ffffffff8c1dcf14 gregkh#15 __generic_file_write_iter at ffffffff8c1dd07b gregkh#16 ocfs2_file_write_iter at ffffffffc0c49f1f [ocfs2] gregkh#17 aio_write at ffffffff8c2cc72e gregkh#18 kmem_cache_alloc at ffffffff8c248dde #19 do_io_submit at ffffffff8c2ccada #20 do_syscall_64 at ffffffff8c004984 #21 entry_SYSCALL_64_after_hwframe at ffffffff8c8000ba Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Fixes: c15471f ("ocfs2: fix sparse file & data ordering issue in direct io") Signed-off-by: Jan Kara <[email protected]> Reviewed-by: Joseph Qi <[email protected]> Reviewed-by: Heming Zhao <[email protected]> Cc: Mark Fasheh <[email protected]> Cc: Joel Becker <[email protected]> Cc: Junxiao Bi <[email protected]> Cc: Changwei Ge <[email protected]> Cc: Gang He <[email protected]> Cc: Jun Piao <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
user-why-red
pushed a commit
to user-why-red/linux_stable
that referenced
this pull request
Jul 30, 2024
commit be346c1 upstream. The code in ocfs2_dio_end_io_write() estimates number of necessary transaction credits using ocfs2_calc_extend_credits(). This however does not take into account that the IO could be arbitrarily large and can contain arbitrary number of extents. Extent tree manipulations do often extend the current transaction but not in all of the cases. For example if we have only single block extents in the tree, ocfs2_mark_extent_written() will end up calling ocfs2_replace_extent_rec() all the time and we will never extend the current transaction and eventually exhaust all the transaction credits if the IO contains many single block extents. Once that happens a WARN_ON(jbd2_handle_buffer_credits(handle) <= 0) is triggered in jbd2_journal_dirty_metadata() and subsequently OCFS2 aborts in response to this error. This was actually triggered by one of our customers on a heavily fragmented OCFS2 filesystem. To fix the issue make sure the transaction always has enough credits for one extent insert before each call of ocfs2_mark_extent_written(). 
Heming Zhao said:
------
PANIC: "Kernel panic - not syncing: OCFS2: (device dm-1): panic forced after error"
PID: xxx TASK: xxxx CPU: 5 COMMAND: "SubmitThread-CA"
 #0 machine_kexec at ffffffff8c069932
 #1 __crash_kexec at ffffffff8c1338fa
 #2 panic at ffffffff8c1d69b9
 #3 ocfs2_handle_error at ffffffffc0c86c0c [ocfs2]
 #4 __ocfs2_abort at ffffffffc0c88387 [ocfs2]
 #5 ocfs2_journal_dirty at ffffffffc0c51e98 [ocfs2]
 #6 ocfs2_split_extent at ffffffffc0c27ea3 [ocfs2]
 #7 ocfs2_change_extent_flag at ffffffffc0c28053 [ocfs2]
 #8 ocfs2_mark_extent_written at ffffffffc0c28347 [ocfs2]
 #9 ocfs2_dio_end_io_write at ffffffffc0c2bef9 [ocfs2]
#10 ocfs2_dio_end_io at ffffffffc0c2c0f5 [ocfs2]
#11 dio_complete at ffffffff8c2b9fa7
#12 do_blockdev_direct_IO at ffffffff8c2bc09f
#13 ocfs2_direct_IO at ffffffffc0c2b653 [ocfs2]
#14 generic_file_direct_write at ffffffff8c1dcf14
#15 __generic_file_write_iter at ffffffff8c1dd07b
#16 ocfs2_file_write_iter at ffffffffc0c49f1f [ocfs2]
#17 aio_write at ffffffff8c2cc72e
#18 kmem_cache_alloc at ffffffff8c248dde
#19 do_io_submit at ffffffff8c2ccada
#20 do_syscall_64 at ffffffff8c004984
#21 entry_SYSCALL_64_after_hwframe at ffffffff8c8000ba

Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Fixes: c15471f ("ocfs2: fix sparse file & data ordering issue in direct io")
Signed-off-by: Jan Kara <[email protected]>
Reviewed-by: Joseph Qi <[email protected]>
Reviewed-by: Heming Zhao <[email protected]>
Cc: Mark Fasheh <[email protected]>
Cc: Joel Becker <[email protected]>
Cc: Junxiao Bi <[email protected]>
Cc: Changwei Ge <[email protected]>
Cc: Gang He <[email protected]>
Cc: Jun Piao <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
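The credit exhaustion described in this commit message can be illustrated with a small userspace toy model. This is only a sketch under stated assumptions: `Handle`, `ensure_credits` and `mark_extents_written` are hypothetical names invented for illustration, and none of this is the kernel's ocfs2 or jbd2 code. It shows why a fixed up-front estimate runs out when the IO spans many single-block extents, and how topping up before each extent avoids that.

```python
class Handle:
    """Toy stand-in for a journal handle with a credit budget (hypothetical)."""

    def __init__(self, credits):
        self.credits = credits

    def dirty_metadata(self, cost=1):
        # Models the situation in which the real code hits its WARN_ON:
        # metadata is dirtied while no credits are left.
        if self.credits < cost:
            raise RuntimeError("out of transaction credits")
        self.credits -= cost

    def ensure_credits(self, needed):
        # Models "make sure the transaction always has enough credits":
        # extend the budget only when it has dropped below what is needed.
        if self.credits < needed:
            self.credits = needed


def mark_extents_written(handle, n_extents, per_extent_cost, top_up):
    """Walk n_extents extents, spending per_extent_cost credits on each."""
    for _ in range(n_extents):
        if top_up:
            handle.ensure_credits(per_extent_cost)
        handle.dirty_metadata(per_extent_cost)
    return handle.credits
```

With a budget of 10 credits and 20 single-block extents, the call without `top_up` raises once the budget is spent, while the call with `top_up=True` completes.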
paniakin-aws
pushed a commit
to amazonlinux/linux
that referenced
this pull request
Aug 16, 2024
[ Upstream commit f8bbc07 ]

vhost_worker will call tun callbacks to receive packets. If too many illegal packets arrive, tun_do_read will keep dumping packet contents. When the console is enabled, dumping the packets costs much more CPU time and a soft lockup will be detected.

The net_ratelimit mechanism can be used to limit the dumping rate.

PID: 33036 TASK: ffff949da6f20000 CPU: 23 COMMAND: "vhost-32980"
 #0 [fffffe00003fce50] crash_nmi_callback at ffffffff89249253
 #1 [fffffe00003fce58] nmi_handle at ffffffff89225fa3
 #2 [fffffe00003fceb0] default_do_nmi at ffffffff8922642e
 #3 [fffffe00003fced0] do_nmi at ffffffff8922660d
 #4 [fffffe00003fcef0] end_repeat_nmi at ffffffff89c01663
    [exception RIP: io_serial_in+20]
    RIP: ffffffff89792594 RSP: ffffa655314979e8 RFLAGS: 00000002
    RAX: ffffffff89792500 RBX: ffffffff8af428a0 RCX: 0000000000000000
    RDX: 00000000000003fd RSI: 0000000000000005 RDI: ffffffff8af428a0
    RBP: 0000000000002710 R8: 0000000000000004 R9: 000000000000000f
    R10: 0000000000000000 R11: ffffffff8acbf64f R12: 0000000000000020
    R13: ffffffff8acbf698 R14: 0000000000000058 R15: 0000000000000000
    ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
 #5 [ffffa655314979e8] io_serial_in at ffffffff89792594
 #6 [ffffa655314979e8] wait_for_xmitr at ffffffff89793470
 #7 [ffffa65531497a08] serial8250_console_putchar at ffffffff897934f6
 #8 [ffffa65531497a20] uart_console_write at ffffffff8978b605
 #9 [ffffa65531497a48] serial8250_console_write at ffffffff89796558
#10 [ffffa65531497ac8] console_unlock at ffffffff89316124
#11 [ffffa65531497b10] vprintk_emit at ffffffff89317c07
#12 [ffffa65531497b68] printk at ffffffff89318306
#13 [ffffa65531497bc8] print_hex_dump at ffffffff89650765
#14 [ffffa65531497ca8] tun_do_read at ffffffffc0b06c27 [tun]
#15 [ffffa65531497d38] tun_recvmsg at ffffffffc0b06e34 [tun]
#16 [ffffa65531497d68] handle_rx at ffffffffc0c5d682 [vhost_net]
#17 [ffffa65531497ed0] vhost_worker at ffffffffc0c644dc [vhost]
#18 [ffffa65531497f10] kthread at ffffffff892d2e72
#19 [ffffa65531497f50] ret_from_fork at ffffffff89c0022f

Fixes: ef3db4a ("tun: avoid BUG, dump packet on GSO errors")
Signed-off-by: Lei Chen <[email protected]>
Reviewed-by: Willem de Bruijn <[email protected]>
Acked-by: Jason Wang <[email protected]>
Reviewed-by: Eric Dumazet <[email protected]>
Acked-by: Michael S. Tsirkin <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
(cherry picked from commit 68459b8)
Signed-off-by: Vegard Nossum <[email protected]>
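The idea of rate-limiting the packet dump can be sketched with a windowed limiter in userspace. This is a toy model only: `RateLimit`, `dump_bad_packets` and the interval and burst values are assumptions chosen for illustration, and the code does not reproduce the kernel's net_ratelimit implementation.

```python
class RateLimit:
    """Allow at most `burst` events per `interval` seconds (toy model)."""

    def __init__(self, interval, burst):
        self.interval = interval
        self.burst = burst
        self.window_start = None
        self.printed = 0
        self.suppressed = 0

    def allow(self, now):
        # Open a new window on first use or once the interval has elapsed.
        if self.window_start is None or now - self.window_start >= self.interval:
            self.window_start = now
            self.printed = 0
        if self.printed < self.burst:
            self.printed += 1
            return True
        self.suppressed += 1
        return False


def dump_bad_packets(timestamps, limiter):
    """Return the arrival times of the bad packets that would be dumped."""
    return [t for t in timestamps if limiter.allow(t)]
```

With `RateLimit(interval=5.0, burst=2)` and bad packets arriving at times 0, 1, 2, 3, 6, 7 and 8, only the packets at 0, 1, 6 and 7 are dumped and three are suppressed, so the expensive console output is bounded regardless of how many illegal packets arrive.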
RadxaNaoki
pushed a commit
to RadxaNaoki/linux
that referenced
this pull request
Aug 28, 2024
The code in ocfs2_dio_end_io_write() estimates number of necessary transaction credits using ocfs2_calc_extend_credits(). This however does not take into account that the IO could be arbitrarily large and can contain arbitrary number of extents. Extent tree manipulations do often extend the current transaction but not in all of the cases. For example if we have only single block extents in the tree, ocfs2_mark_extent_written() will end up calling ocfs2_replace_extent_rec() all the time and we will never extend the current transaction and eventually exhaust all the transaction credits if the IO contains many single block extents. Once that happens a WARN_ON(jbd2_handle_buffer_credits(handle) <= 0) is triggered in jbd2_journal_dirty_metadata() and subsequently OCFS2 aborts in response to this error. This was actually triggered by one of our customers on a heavily fragmented OCFS2 filesystem. To fix the issue make sure the transaction always has enough credits for one extent insert before each call of ocfs2_mark_extent_written(). 
Heming Zhao said:
------
PANIC: "Kernel panic - not syncing: OCFS2: (device dm-1): panic forced after error"
PID: xxx TASK: xxxx CPU: 5 COMMAND: "SubmitThread-CA"
 #0 machine_kexec at ffffffff8c069932
 #1 __crash_kexec at ffffffff8c1338fa
 #2 panic at ffffffff8c1d69b9
 #3 ocfs2_handle_error at ffffffffc0c86c0c [ocfs2]
 #4 __ocfs2_abort at ffffffffc0c88387 [ocfs2]
 #5 ocfs2_journal_dirty at ffffffffc0c51e98 [ocfs2]
 #6 ocfs2_split_extent at ffffffffc0c27ea3 [ocfs2]
 #7 ocfs2_change_extent_flag at ffffffffc0c28053 [ocfs2]
 #8 ocfs2_mark_extent_written at ffffffffc0c28347 [ocfs2]
 #9 ocfs2_dio_end_io_write at ffffffffc0c2bef9 [ocfs2]
#10 ocfs2_dio_end_io at ffffffffc0c2c0f5 [ocfs2]
#11 dio_complete at ffffffff8c2b9fa7
#12 do_blockdev_direct_IO at ffffffff8c2bc09f
#13 ocfs2_direct_IO at ffffffffc0c2b653 [ocfs2]
#14 generic_file_direct_write at ffffffff8c1dcf14
#15 __generic_file_write_iter at ffffffff8c1dd07b
#16 ocfs2_file_write_iter at ffffffffc0c49f1f [ocfs2]
#17 aio_write at ffffffff8c2cc72e
#18 kmem_cache_alloc at ffffffff8c248dde
#19 do_io_submit at ffffffff8c2ccada
#20 do_syscall_64 at ffffffff8c004984
#21 entry_SYSCALL_64_after_hwframe at ffffffff8c8000ba

Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Fixes: c15471f ("ocfs2: fix sparse file & data ordering issue in direct io")
Signed-off-by: Jan Kara <[email protected]>
Reviewed-by: Joseph Qi <[email protected]>
Reviewed-by: Heming Zhao <[email protected]>
Cc: Mark Fasheh <[email protected]>
Cc: Joel Becker <[email protected]>
Cc: Junxiao Bi <[email protected]>
Cc: Changwei Ge <[email protected]>
Cc: Gang He <[email protected]>
Cc: Jun Piao <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
piso77
pushed a commit
to piso77/linux
that referenced
this pull request
Sep 4, 2024
When PG_hwpoison pages are freed they are treated differently in free_pages_prepare() and instead of being released they are isolated. Page allocation tag counters are decremented at this point since the page is considered not in use. Later on when such pages are released by unpoison_memory(), the allocation tag counters will be decremented again and the following warning gets reported:

[ 113.930443][ T3282] ------------[ cut here ]------------
[ 113.931105][ T3282] alloc_tag was not set
[ 113.931576][ T3282] WARNING: CPU: 2 PID: 3282 at ./include/linux/alloc_tag.h:130 pgalloc_tag_sub.part.66+0x154/0x164
[ 113.932866][ T3282] Modules linked in: hwpoison_inject fuse ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_conntrack ebtable_nat ebtable_broute ip6table_nat ip6table_man4
[ 113.941638][ T3282] CPU: 2 UID: 0 PID: 3282 Comm: madvise11 Kdump: loaded Tainted: G W 6.11.0-rc4-dirty #18
[ 113.943003][ T3282] Tainted: [W]=WARN
[ 113.943453][ T3282] Hardware name: QEMU KVM Virtual Machine, BIOS unknown 2/2/2022
[ 113.944378][ T3282] pstate: 40400005 (nZcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 113.945319][ T3282] pc : pgalloc_tag_sub.part.66+0x154/0x164
[ 113.946016][ T3282] lr : pgalloc_tag_sub.part.66+0x154/0x164
[ 113.946706][ T3282] sp : ffff800087093a10
[ 113.947197][ T3282] x29: ffff800087093a10 x28: ffff0000d7a9d400 x27: ffff80008249f0a0
[ 113.948165][ T3282] x26: 0000000000000000 x25: ffff80008249f2b0 x24: 0000000000000000
[ 113.949134][ T3282] x23: 0000000000000001 x22: 0000000000000001 x21: 0000000000000000
[ 113.950597][ T3282] x20: ffff0000c08fcad8 x19: ffff80008251e000 x18: ffffffffffffffff
[ 113.952207][ T3282] x17: 0000000000000000 x16: 0000000000000000 x15: ffff800081746210
[ 113.953161][ T3282] x14: 0000000000000000 x13: 205d323832335420 x12: 5b5d353031313339
[ 113.954120][ T3282] x11: ffff800087093500 x10: 000000000000005d x9 : 00000000ffffffd0
[ 113.955078][ T3282] x8 : 7f7f7f7f7f7f7f7f x7 : ffff80008236ba90 x6 : c0000000ffff7fff
[ 113.956036][ T3282] x5 : ffff000b34bf4dc8 x4 : ffff8000820aba90 x3 : 0000000000000001
[ 113.956994][ T3282] x2 : ffff800ab320f000 x1 : 841d1e35ac932e00 x0 : 0000000000000000
[ 113.957962][ T3282] Call trace:
[ 113.958350][ T3282] pgalloc_tag_sub.part.66+0x154/0x164
[ 113.959000][ T3282] pgalloc_tag_sub+0x14/0x1c
[ 113.959539][ T3282] free_unref_page+0xf4/0x4b8
[ 113.960096][ T3282] __folio_put+0xd4/0x120
[ 113.960614][ T3282] folio_put+0x24/0x50
[ 113.961103][ T3282] unpoison_memory+0x4f0/0x5b0
[ 113.961678][ T3282] hwpoison_unpoison+0x30/0x48 [hwpoison_inject]
[ 113.962436][ T3282] simple_attr_write_xsigned.isra.34+0xec/0x1cc
[ 113.963183][ T3282] simple_attr_write+0x38/0x48
[ 113.963750][ T3282] debugfs_attr_write+0x54/0x80
[ 113.964330][ T3282] full_proxy_write+0x68/0x98
[ 113.964880][ T3282] vfs_write+0xdc/0x4d0
[ 113.965372][ T3282] ksys_write+0x78/0x100
[ 113.965875][ T3282] __arm64_sys_write+0x24/0x30
[ 113.966440][ T3282] invoke_syscall+0x7c/0x104
[ 113.966984][ T3282] el0_svc_common.constprop.1+0x88/0x104
[ 113.967652][ T3282] do_el0_svc+0x2c/0x38
[ 113.968893][ T3282] el0_svc+0x3c/0x1b8
[ 113.969379][ T3282] el0t_64_sync_handler+0x98/0xbc
[ 113.969980][ T3282] el0t_64_sync+0x19c/0x1a0
[ 113.970511][ T3282] ---[ end trace 0000000000000000 ]---

To fix this, clear the page tag reference after the page got isolated and accounted for.

Link: https://lkml.kernel.org/r/[email protected]
Fixes: d224eb0 ("codetag: debug: mark codetags for reserved pages as empty")
Signed-off-by: Hao Ge <[email protected]>
Reviewed-by: Miaohe Lin <[email protected]>
Acked-by: Suren Baghdasaryan <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Hao Ge <[email protected]>
Cc: Kent Overstreet <[email protected]>
Cc: Naoya Horiguchi <[email protected]>
Cc: Pasha Tatashin <[email protected]>
Cc: <[email protected]> [6.10+]
Signed-off-by: Andrew Morton <[email protected]>
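The double decrement described in this commit message can be modelled in a few lines of userspace code. This is a toy sketch: `Page`, `tag_sub`, `isolate_poisoned` and `release` are hypothetical names, and the dictionary of counters does not reflect the kernel's actual alloc_tag data structures. It only shows why clearing the tag reference at isolation time keeps the later release from being accounted a second time.

```python
class Page:
    """Toy page that remembers which allocation tag it was charged to."""

    def __init__(self, tag):
        self.tag = tag  # tag name, or None once the reference is cleared


def tag_sub(counters, page):
    """Decrement the counter for the page's tag, if it still has one."""
    if page.tag is None:
        return False
    counters[page.tag] -= 1
    return True


def isolate_poisoned(counters, page, clear_ref):
    # The page is considered not in use from here on, so it is accounted now.
    tag_sub(counters, page)
    if clear_ref:
        # Models the fix: drop the reference after accounting.
        page.tag = None


def release(counters, page):
    # Models the later release of the isolated page.
    tag_sub(counters, page)
```

Starting from a counter of 1 for a tag, isolating and then releasing a page without clearing the reference leaves the counter at -1, while clearing it leaves the counter at 0.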
gregkh
pushed a commit
that referenced
this pull request
Sep 12, 2024
[ Upstream commit c145eea ]

mwifiex_get_priv_by_id() returns the priv pointer corresponding to the bss_num and bss_type, but without checking if the priv is actually currently in use. Unused priv pointers do not have a wiphy attached to them which can lead to NULL pointer dereferences further down the callstack. Fix this by returning only used priv pointers which have priv->bss_mode set to something else than NL80211_IFTYPE_UNSPECIFIED.

Said NULL pointer dereference happened when an Accesspoint was started with wpa_supplicant -i mlan0 with this config:

network={
        ssid="somessid"
        mode=2
        frequency=2412
        key_mgmt=WPA-PSK WPA-PSK-SHA256
        proto=RSN
        group=CCMP
        pairwise=CCMP
        psk="12345678"
}

When waiting for the AP to be established, interrupting wpa_supplicant with <ctrl-c> and starting it again this happens:

| Unable to handle kernel NULL pointer dereference at virtual address 0000000000000140
| Mem abort info:
|   ESR = 0x0000000096000004
|   EC = 0x25: DABT (current EL), IL = 32 bits
|   SET = 0, FnV = 0
|   EA = 0, S1PTW = 0
|   FSC = 0x04: level 0 translation fault
| Data abort info:
|   ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
|   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
|   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
| user pgtable: 4k pages, 48-bit VAs, pgdp=0000000046d96000
| [0000000000000140] pgd=0000000000000000, p4d=0000000000000000
| Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP
| Modules linked in: caam_jr caamhash_desc spidev caamalg_desc crypto_engine authenc libdes mwifiex_sdio
+mwifiex crct10dif_ce cdc_acm onboard_usb_hub fsl_imx8_ddr_perf imx8m_ddrc rtc_ds1307 lm75 rtc_snvs
+imx_sdma caam imx8mm_thermal spi_imx error imx_cpufreq_dt fuse ip_tables x_tables ipv6
| CPU: 0 PID: 8 Comm: kworker/0:1 Not tainted 6.9.0-00007-g937242013fce-dirty #18
| Hardware name: somemachine (DT)
| Workqueue: events sdio_irq_work
| pstate: 00000005 (nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
| pc : mwifiex_get_cfp+0xd8/0x15c [mwifiex]
| lr : mwifiex_get_cfp+0x34/0x15c [mwifiex]
| sp : ffff8000818b3a70
| x29: ffff8000818b3a70 x28: ffff000006bfd8a5 x27: 0000000000000004
| x26: 000000000000002c x25: 0000000000001511 x24: 0000000002e86bc9
| x23: ffff000006bfd996 x22: 0000000000000004 x21: ffff000007bec000
| x20: 000000000000002c x19: 0000000000000000 x18: 0000000000000000
| x17: 000000040044ffff x16: 00500072b5503510 x15: ccc283740681e517
| x14: 0201000101006d15 x13: 0000000002e8ff43 x12: 002c01000000ffb1
| x11: 0100000000000000 x10: 02e8ff43002c0100 x9 : 0000ffb100100157
| x8 : ffff000003d20000 x7 : 00000000000002f1 x6 : 00000000ffffe124
| x5 : 0000000000000001 x4 : 0000000000000003 x3 : 0000000000000000
| x2 : 0000000000000000 x1 : 0001000000011001 x0 : 0000000000000000
| Call trace:
|  mwifiex_get_cfp+0xd8/0x15c [mwifiex]
|  mwifiex_parse_single_response_buf+0x1d0/0x504 [mwifiex]
|  mwifiex_handle_event_ext_scan_report+0x19c/0x2f8 [mwifiex]
|  mwifiex_process_sta_event+0x298/0xf0c [mwifiex]
|  mwifiex_process_event+0x110/0x238 [mwifiex]
|  mwifiex_main_process+0x428/0xa44 [mwifiex]
|  mwifiex_sdio_interrupt+0x64/0x12c [mwifiex_sdio]
|  process_sdio_pending_irqs+0x64/0x1b8
|  sdio_irq_work+0x4c/0x7c
|  process_one_work+0x148/0x2a0
|  worker_thread+0x2fc/0x40c
|  kthread+0x110/0x114
|  ret_from_fork+0x10/0x20
| Code: a94153f3 a8c37bfd d50323bf d65f03c0 (f940a000)
| ---[ end trace 0000000000000000 ]---

Signed-off-by: Sascha Hauer <[email protected]>
Acked-by: Brian Norris <[email protected]>
Reviewed-by: Francesco Dolcini <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Sasha Levin <[email protected]>
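The lookup change this commit message describes can be sketched as follows. This is a toy userspace model, not the driver's code: the `Priv` dataclass, the `UNSPECIFIED` sentinel (standing in for NL80211_IFTYPE_UNSPECIFIED) and this version of `get_priv_by_id` are assumptions made for illustration. The point is that an entry matching bss_num and bss_type is returned only if it is actually in use.

```python
from dataclasses import dataclass
from typing import List, Optional

# Stand-in for NL80211_IFTYPE_UNSPECIFIED (illustrative sentinel only).
UNSPECIFIED = "unspecified"


@dataclass
class Priv:
    bss_num: int
    bss_type: int
    bss_mode: str


def get_priv_by_id(privs: List[Priv], bss_num: int, bss_type: int) -> Optional[Priv]:
    """Return the in-use priv matching bss_num and bss_type, else None."""
    for priv in privs:
        # Skip entries that are allocated but not currently in use.
        if priv.bss_mode == UNSPECIFIED:
            continue
        if priv.bss_num == bss_num and priv.bss_type == bss_type:
            return priv
    return None
```

A caller then has to handle a `None` result, which is the price of never handing out an entry that has no wiphy attached.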
gregkh
pushed a commit
that referenced
this pull request
Sep 12, 2024
commit 5e9784e upstream. When PG_hwpoison pages are freed they are treated differently in free_pages_prepare() and instead of being released they are isolated. Page allocation tag counters are decremented at this point since the page is considered not in use. Later on when such pages are released by unpoison_memory(), the allocation tag counters will be decremented again and the following warning gets reported: [ 113.930443][ T3282] ------------[ cut here ]------------ [ 113.931105][ T3282] alloc_tag was not set [ 113.931576][ T3282] WARNING: CPU: 2 PID: 3282 at ./include/linux/alloc_tag.h:130 pgalloc_tag_sub.part.66+0x154/0x164 [ 113.932866][ T3282] Modules linked in: hwpoison_inject fuse ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_conntrack ebtable_nat ebtable_broute ip6table_nat ip6table_man4 [ 113.941638][ T3282] CPU: 2 UID: 0 PID: 3282 Comm: madvise11 Kdump: loaded Tainted: G W 6.11.0-rc4-dirty #18 [ 113.943003][ T3282] Tainted: [W]=WARN [ 113.943453][ T3282] Hardware name: QEMU KVM Virtual Machine, BIOS unknown 2/2/2022 [ 113.944378][ T3282] pstate: 40400005 (nZcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 113.945319][ T3282] pc : pgalloc_tag_sub.part.66+0x154/0x164 [ 113.946016][ T3282] lr : pgalloc_tag_sub.part.66+0x154/0x164 [ 113.946706][ T3282] sp : ffff800087093a10 [ 113.947197][ T3282] x29: ffff800087093a10 x28: ffff0000d7a9d400 x27: ffff80008249f0a0 [ 113.948165][ T3282] x26: 0000000000000000 x25: ffff80008249f2b0 x24: 0000000000000000 [ 113.949134][ T3282] x23: 0000000000000001 x22: 0000000000000001 x21: 0000000000000000 [ 113.950597][ T3282] x20: ffff0000c08fcad8 x19: ffff80008251e000 x18: ffffffffffffffff [ 113.952207][ T3282] x17: 0000000000000000 x16: 0000000000000000 x15: ffff800081746210 [ 113.953161][ T3282] x14: 0000000000000000 x13: 205d323832335420 x12: 5b5d353031313339 [ 113.954120][ T3282] x11: ffff800087093500 x10: 000000000000005d x9 : 00000000ffffffd0 [ 113.955078][ T3282] x8 : 7f7f7f7f7f7f7f7f x7 : 
ffff80008236ba90 x6 : c0000000ffff7fff [ 113.956036][ T3282] x5 : ffff000b34bf4dc8 x4 : ffff8000820aba90 x3 : 0000000000000001 [ 113.956994][ T3282] x2 : ffff800ab320f000 x1 : 841d1e35ac932e00 x0 : 0000000000000000 [ 113.957962][ T3282] Call trace: [ 113.958350][ T3282] pgalloc_tag_sub.part.66+0x154/0x164 [ 113.959000][ T3282] pgalloc_tag_sub+0x14/0x1c [ 113.959539][ T3282] free_unref_page+0xf4/0x4b8 [ 113.960096][ T3282] __folio_put+0xd4/0x120 [ 113.960614][ T3282] folio_put+0x24/0x50 [ 113.961103][ T3282] unpoison_memory+0x4f0/0x5b0 [ 113.961678][ T3282] hwpoison_unpoison+0x30/0x48 [hwpoison_inject] [ 113.962436][ T3282] simple_attr_write_xsigned.isra.34+0xec/0x1cc [ 113.963183][ T3282] simple_attr_write+0x38/0x48 [ 113.963750][ T3282] debugfs_attr_write+0x54/0x80 [ 113.964330][ T3282] full_proxy_write+0x68/0x98 [ 113.964880][ T3282] vfs_write+0xdc/0x4d0 [ 113.965372][ T3282] ksys_write+0x78/0x100 [ 113.965875][ T3282] __arm64_sys_write+0x24/0x30 [ 113.966440][ T3282] invoke_syscall+0x7c/0x104 [ 113.966984][ T3282] el0_svc_common.constprop.1+0x88/0x104 [ 113.967652][ T3282] do_el0_svc+0x2c/0x38 [ 113.968893][ T3282] el0_svc+0x3c/0x1b8 [ 113.969379][ T3282] el0t_64_sync_handler+0x98/0xbc [ 113.969980][ T3282] el0t_64_sync+0x19c/0x1a0 [ 113.970511][ T3282] ---[ end trace 0000000000000000 ]--- To fix this, clear the page tag reference after the page got isolated and accounted for. Link: https://lkml.kernel.org/r/[email protected] Fixes: d224eb0 ("codetag: debug: mark codetags for reserved pages as empty") Signed-off-by: Hao Ge <[email protected]> Reviewed-by: Miaohe Lin <[email protected]> Acked-by: Suren Baghdasaryan <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Hao Ge <[email protected]> Cc: Kent Overstreet <[email protected]> Cc: Naoya Horiguchi <[email protected]> Cc: Pasha Tatashin <[email protected]> Cc: <[email protected]> [6.10+] Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
gregkh
pushed a commit
that referenced
this pull request
Sep 12, 2024
[ Upstream commit c145eea ] mwifiex_get_priv_by_id() returns the priv pointer corresponding to the bss_num and bss_type, but without checking if the priv is actually currently in use. Unused priv pointers do not have a wiphy attached to them which can lead to NULL pointer dereferences further down the callstack. Fix this by returning only used priv pointers which have priv->bss_mode set to something else than NL80211_IFTYPE_UNSPECIFIED. Said NULL pointer dereference happened when an Accesspoint was started with wpa_supplicant -i mlan0 with this config: network={ ssid="somessid" mode=2 frequency=2412 key_mgmt=WPA-PSK WPA-PSK-SHA256 proto=RSN group=CCMP pairwise=CCMP psk="12345678" } When waiting for the AP to be established, interrupting wpa_supplicant with <ctrl-c> and starting it again this happens: | Unable to handle kernel NULL pointer dereference at virtual address 0000000000000140 | Mem abort info: | ESR = 0x0000000096000004 | EC = 0x25: DABT (current EL), IL = 32 bits | SET = 0, FnV = 0 | EA = 0, S1PTW = 0 | FSC = 0x04: level 0 translation fault | Data abort info: | ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000 | CM = 0, WnR = 0, TnD = 0, TagAccess = 0 | GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 | user pgtable: 4k pages, 48-bit VAs, pgdp=0000000046d96000 | [0000000000000140] pgd=0000000000000000, p4d=0000000000000000 | Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP | Modules linked in: caam_jr caamhash_desc spidev caamalg_desc crypto_engine authenc libdes mwifiex_sdio +mwifiex crct10dif_ce cdc_acm onboard_usb_hub fsl_imx8_ddr_perf imx8m_ddrc rtc_ds1307 lm75 rtc_snvs +imx_sdma caam imx8mm_thermal spi_imx error imx_cpufreq_dt fuse ip_tables x_tables ipv6 | CPU: 0 PID: 8 Comm: kworker/0:1 Not tainted 6.9.0-00007-g937242013fce-dirty #18 | Hardware name: somemachine (DT) | Workqueue: events sdio_irq_work | pstate: 00000005 (nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) | pc : mwifiex_get_cfp+0xd8/0x15c [mwifiex] | lr : mwifiex_get_cfp+0x34/0x15c [mwifiex] 
| sp : ffff8000818b3a70 | x29: ffff8000818b3a70 x28: ffff000006bfd8a5 x27: 0000000000000004 | x26: 000000000000002c x25: 0000000000001511 x24: 0000000002e86bc9 | x23: ffff000006bfd996 x22: 0000000000000004 x21: ffff000007bec000 | x20: 000000000000002c x19: 0000000000000000 x18: 0000000000000000 | x17: 000000040044ffff x16: 00500072b5503510 x15: ccc283740681e517 | x14: 0201000101006d15 x13: 0000000002e8ff43 x12: 002c01000000ffb1 | x11: 0100000000000000 x10: 02e8ff43002c0100 x9 : 0000ffb100100157 | x8 : ffff000003d20000 x7 : 00000000000002f1 x6 : 00000000ffffe124 | x5 : 0000000000000001 x4 : 0000000000000003 x3 : 0000000000000000 | x2 : 0000000000000000 x1 : 0001000000011001 x0 : 0000000000000000 | Call trace: | mwifiex_get_cfp+0xd8/0x15c [mwifiex] | mwifiex_parse_single_response_buf+0x1d0/0x504 [mwifiex] | mwifiex_handle_event_ext_scan_report+0x19c/0x2f8 [mwifiex] | mwifiex_process_sta_event+0x298/0xf0c [mwifiex] | mwifiex_process_event+0x110/0x238 [mwifiex] | mwifiex_main_process+0x428/0xa44 [mwifiex] | mwifiex_sdio_interrupt+0x64/0x12c [mwifiex_sdio] | process_sdio_pending_irqs+0x64/0x1b8 | sdio_irq_work+0x4c/0x7c | process_one_work+0x148/0x2a0 | worker_thread+0x2fc/0x40c | kthread+0x110/0x114 | ret_from_fork+0x10/0x20 | Code: a94153f3 a8c37bfd d50323bf d65f03c0 (f940a000) | ---[ end trace 0000000000000000 ]--- Signed-off-by: Sascha Hauer <[email protected]> Acked-by: Brian Norris <[email protected]> Reviewed-by: Francesco Dolcini <[email protected]> Signed-off-by: Kalle Valo <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Sasha Levin <[email protected]>
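The lookup change described in this commit message can be illustrated with a small user-space sketch. It is not the mwifiex code: `struct priv`, `struct adapter`, `enum iftype` and `get_priv_by_id()` are hypothetical stand-ins that model only the idea of skipping slots whose `bss_mode` is still unspecified.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver's types; not the real mwifiex definitions. */
enum iftype { IFTYPE_UNSPECIFIED = 0, IFTYPE_STATION, IFTYPE_AP };

struct priv {
    int bss_num;
    int bss_type;
    enum iftype bss_mode;   /* IFTYPE_UNSPECIFIED marks an unused slot */
};

struct adapter {
    struct priv *priv[4];
    size_t priv_num;
};

/* Return the matching priv only if it is in use; otherwise NULL. */
static struct priv *get_priv_by_id(struct adapter *a, int bss_num, int bss_type)
{
    assert(a != NULL);

    for (size_t i = 0; i < a->priv_num; i++) {
        struct priv *p = a->priv[i];

        if (p == NULL)
            continue;
        /* The added check: unused slots are never handed out. */
        if (p->bss_mode == IFTYPE_UNSPECIFIED)
            continue;
        if (p->bss_num == bss_num && p->bss_type == bss_type)
            return p;
    }
    return NULL;
}
```

With this shape, callers already have to cope with a NULL return, so an unused slot is treated the same way as a missing one.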
rsalvaterra
pushed a commit
to rsalvaterra/linux
that referenced
this pull request
Sep 21, 2024
[ Upstream commit f8bbc07 ]

vhost_worker calls the tun callbacks to receive packets. If too many illegal packets arrive, tun_do_read keeps dumping packet contents. When the console is enabled, dumping the packets costs much more CPU time and a soft lockup is detected. The net_ratelimit mechanism can be used to limit the dumping rate.

PID: 33036 TASK: ffff949da6f20000 CPU: 23 COMMAND: "vhost-32980"
 #0 [fffffe00003fce50] crash_nmi_callback at ffffffff89249253
 #1 [fffffe00003fce58] nmi_handle at ffffffff89225fa3
 #2 [fffffe00003fceb0] default_do_nmi at ffffffff8922642e
 #3 [fffffe00003fced0] do_nmi at ffffffff8922660d
 #4 [fffffe00003fcef0] end_repeat_nmi at ffffffff89c01663
    [exception RIP: io_serial_in+20]
    RIP: ffffffff89792594 RSP: ffffa655314979e8 RFLAGS: 00000002
    RAX: ffffffff89792500 RBX: ffffffff8af428a0 RCX: 0000000000000000
    RDX: 00000000000003fd RSI: 0000000000000005 RDI: ffffffff8af428a0
    RBP: 0000000000002710 R8: 0000000000000004 R9: 000000000000000f
    R10: 0000000000000000 R11: ffffffff8acbf64f R12: 0000000000000020
    R13: ffffffff8acbf698 R14: 0000000000000058 R15: 0000000000000000
    ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
 #5 [ffffa655314979e8] io_serial_in at ffffffff89792594
 #6 [ffffa655314979e8] wait_for_xmitr at ffffffff89793470
 #7 [ffffa65531497a08] serial8250_console_putchar at ffffffff897934f6
 #8 [ffffa65531497a20] uart_console_write at ffffffff8978b605
 #9 [ffffa65531497a48] serial8250_console_write at ffffffff89796558
 #10 [ffffa65531497ac8] console_unlock at ffffffff89316124
 #11 [ffffa65531497b10] vprintk_emit at ffffffff89317c07
 #12 [ffffa65531497b68] printk at ffffffff89318306
 #13 [ffffa65531497bc8] print_hex_dump at ffffffff89650765
 #14 [ffffa65531497ca8] tun_do_read at ffffffffc0b06c27 [tun]
 #15 [ffffa65531497d38] tun_recvmsg at ffffffffc0b06e34 [tun]
 #16 [ffffa65531497d68] handle_rx at ffffffffc0c5d682 [vhost_net]
 #17 [ffffa65531497ed0] vhost_worker at ffffffffc0c644dc [vhost]
 #18 [ffffa65531497f10] kthread at ffffffff892d2e72
 #19 [ffffa65531497f50] ret_from_fork at ffffffff89c0022f

Fixes: ef3db4a ("tun: avoid BUG, dump packet on GSO errors")
Signed-off-by: Lei Chen <[email protected]>
Reviewed-by: Willem de Bruijn <[email protected]>
Acked-by: Jason Wang <[email protected]>
Reviewed-by: Eric Dumazet <[email protected]>
Acked-by: Michael S. Tsirkin <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
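The rate-limiting idea this message relies on can be sketched in user space as follows. This is only an illustration of a window-plus-burst limiter, not the kernel's net_ratelimit implementation; `struct ratelimit`, `ratelimit_ok()` and the tick units are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical limiter state; field names are illustrative only. */
struct ratelimit {
    long interval;  /* window length, in caller-defined ticks */
    int burst;      /* messages allowed per window */
    long begin;     /* start tick of the current window */
    int printed;    /* messages emitted in the current window */
    int missed;     /* messages suppressed so far */
};

/* Return true if the caller may emit one more message at time `now`. */
static bool ratelimit_ok(struct ratelimit *rl, long now)
{
    assert(rl != NULL && rl->interval > 0);

    /* A new window starts once the interval has elapsed. */
    if (now - rl->begin >= rl->interval) {
        rl->begin = now;
        rl->printed = 0;
    }
    if (rl->printed < rl->burst) {
        rl->printed++;
        return true;
    }
    rl->missed++;
    return false;
}
```

A caller would guard the expensive dump with `if (ratelimit_ok(&rl, now))`, so a flood of bad packets produces at most `burst` dumps per window rather than one dump per packet.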
github-actions bot
pushed a commit
to sirdarckcat/linux-1
that referenced
this pull request
Sep 22, 2024
iter_finish_branch_entry() doesn't put the branch_info from/to map elements, creating memory leaks. This can be seen with:

```
$ perf record -e cycles -b perf test -w noploop
$ perf report -D
...
Direct leak of 984344 byte(s) in 123043 object(s) allocated from:
    #0 0x7fb2654f3bd7 in malloc libsanitizer/asan/asan_malloc_linux.cpp:69
    #1 0x564d3400d10b in map__get util/map.h:186
    #2 0x564d3400d10b in ip__resolve_ams util/machine.c:1981
    #3 0x564d34014d81 in sample__resolve_bstack util/machine.c:2151
    #4 0x564d34094790 in iter_prepare_branch_entry util/hist.c:898
    #5 0x564d34098fa4 in hist_entry_iter__add util/hist.c:1238
    #6 0x564d33d1f0c7 in process_sample_event tools/perf/builtin-report.c:334
    #7 0x564d34031eb7 in perf_session__deliver_event util/session.c:1655
    #8 0x564d3403ba52 in do_flush util/ordered-events.c:245
    #9 0x564d3403ba52 in __ordered_events__flush util/ordered-events.c:324
    #10 0x564d3402d32e in perf_session__process_user_event util/session.c:1708
    #11 0x564d34032480 in perf_session__process_event util/session.c:1877
    #12 0x564d340336ad in reader__read_event util/session.c:2399
    #13 0x564d34033fdc in reader__process_events util/session.c:2448
    #14 0x564d34033fdc in __perf_session__process_events util/session.c:2495
    #15 0x564d34033fdc in perf_session__process_events util/session.c:2661
    #16 0x564d33d27113 in __cmd_report tools/perf/builtin-report.c:1065
    #17 0x564d33d27113 in cmd_report tools/perf/builtin-report.c:1805
    #18 0x564d33e0ccb7 in run_builtin tools/perf/perf.c:350
    #19 0x564d33e0d45e in handle_internal_command tools/perf/perf.c:403
    #20 0x564d33cdd827 in run_argv tools/perf/perf.c:447
    #21 0x564d33cdd827 in main tools/perf/perf.c:561
...
```

Clearing up the map_symbols properly creates maps reference count issues so resolve those. Resolving this issue doesn't improve peak heap consumption for the test above.

Committer testing:

$ sudo dnf install libasan
$ make -k CORESIGHT=1 EXTRA_CFLAGS="-fsanitize=address" CC=clang O=/tmp/build/$(basename $PWD)/ -C tools/perf install-bin

Reviewed-by: Kan Liang <[email protected]>
Signed-off-by: Ian Rogers <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Sun Haiyong <[email protected]>
Cc: Yanteng Si <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
github-actions bot
pushed a commit
to sirdarckcat/linux-1
that referenced
this pull request
Sep 26, 2024
…the Crashkernel Scenario

The issue is that before entering the crash kernel, the DWC USB controller did not perform operations such as resetting the interrupt mask bits. After entering the crash kernel, and before the USB interrupt handler registration was completed while loading the DWC USB driver, a GINTSTS_SOF interrupt was received. This triggered the misrouted_irq process within the GIC handling framework, ultimately leading to the misrouting of the interrupt, causing it to be handled by the wrong interrupt handler and resulting in the issue.

Summary: In a scenario where the kernel triggers a panic and enters the crash kernel, it is necessary to ensure that the interrupt mask bit is not enabled before the interrupt registration is complete. If an interrupt reaches the CPU at this moment, it will certainly not be handled correctly, especially in cases where this interrupt is reported frequently.

Please refer to the Crashkernel dmesg information as follows (the message on line 3 was added before devm_request_irq is called by the dwc2_driver_probe function):

[ 5.866837][ T1] dwc2 JMIC0010:01: supply vusb_d not found, using dummy regulator
[ 5.874588][ T1] dwc2 JMIC0010:01: supply vusb_a not found, using dummy regulator
[ 5.882335][ T1] dwc2 JMIC0010:01: before devm_request_irq irq: [71], gintmsk[0xf300080e], gintsts[0x04200009]
[ 5.892686][ C0] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.10.0-jmnd1.2_RC #18
[ 5.900327][ C0] Hardware name: CMSS HyperCard4-25G/HyperCard4-25G, BIOS 1.6.4 Jul 8 2024
[ 5.908836][ C0] Call trace:
[ 5.911965][ C0] dump_backtrace+0x0/0x1f0
[ 5.916308][ C0] show_stack+0x20/0x30
[ 5.920304][ C0] dump_stack+0xd8/0x140
[ 5.924387][ C0] pcie_xxx_handler+0x3c/0x1d8
[ 5.930121][ C0] __handle_irq_event_percpu+0x64/0x1e0
[ 5.935506][ C0] handle_irq_event+0x80/0x1d0
[ 5.940109][ C0] try_one_irq+0x138/0x174
[ 5.944365][ C0] misrouted_irq+0x134/0x140
[ 5.948795][ C0] note_interrupt+0x1d0/0x30c
[ 5.953311][ C0] handle_irq_event+0x13c/0x1d0
[ 5.958001][ C0] handle_fasteoi_irq+0xd4/0x260
[ 5.962779][ C0] __handle_domain_irq+0x88/0xf0
[ 5.967555][ C0] gic_handle_irq+0x9c/0x2f0
[ 5.971985][ C0] el1_irq+0xb8/0x140
[ 5.975807][ C0] __setup_irq+0x3dc/0x7cc
[ 5.980064][ C0] request_threaded_irq+0xf4/0x1b4
[ 5.985015][ C0] devm_request_threaded_irq+0x80/0x100
[ 5.990400][ C0] dwc2_driver_probe+0x1b8/0x6b0
[ 5.995178][ C0] platform_drv_probe+0x5c/0xb0
[ 5.999868][ C0] really_probe+0xf8/0x51c
[ 6.004125][ C0] driver_probe_device+0xfc/0x170
[ 6.008989][ C0] device_driver_attach+0xc8/0xd0
[ 6.013853][ C0] __driver_attach+0xe8/0x1b0
[ 6.018369][ C0] bus_for_each_dev+0x7c/0xdc
[ 6.022886][ C0] driver_attach+0x2c/0x3c
[ 6.027143][ C0] bus_add_driver+0xdc/0x240
[ 6.031573][ C0] driver_register+0x80/0x13c
[ 6.036090][ C0] __platform_driver_register+0x50/0x5c
[ 6.041476][ C0] dwc2_platform_driver_init+0x24/0x30
[ 6.046774][ C0] do_one_initcall+0x50/0x25c
[ 6.051291][ C0] do_initcall_level+0xe4/0xfc
[ 6.055894][ C0] do_initcalls+0x80/0xa4
[ 6.060064][ C0] kernel_init_freeable+0x198/0x240
[ 6.065102][ C0] kernel_init+0x1c/0x12c

Signed-off-by: Shawn Shao <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
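The ordering argued for in this message (mask first, register the handler, then unmask) can be modeled with a toy controller in user space. Everything here is an assumption for illustration: `struct fake_ctrl`, the helper names, and the `SOF_BIT` position are hypothetical and do not come from the dwc2 driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SOF_BIT (1u << 3) /* assumed bit position, for illustration only */

typedef void (*irq_handler_t)(void);

/* Toy model of a controller with an interrupt mask register. */
struct fake_ctrl {
    uint32_t gintmsk;       /* enabled interrupt sources */
    irq_handler_t handler;  /* NULL until the driver registers one */
    unsigned handled;       /* interrupts delivered to our handler */
    unsigned misrouted;     /* interrupts delivered with no handler */
};

static void noop_handler(void) { }

/* Deliver an interrupt only if its source is unmasked. */
static void raise_irq(struct fake_ctrl *c, uint32_t bit)
{
    assert(c != NULL);

    if (!(c->gintmsk & bit))
        return;
    if (c->handler) {
        c->handler();
        c->handled++;
    } else {
        c->misrouted++;
    }
}

/* Step 1: drop whatever mask state the previous kernel left behind. */
static void mask_all(struct fake_ctrl *c) { c->gintmsk = 0; }

/* Step 2: register the handler while nothing can be delivered. */
static void register_handler(struct fake_ctrl *c, irq_handler_t h) { c->handler = h; }

/* Step 3: enable only the sources the driver is ready to handle. */
static void unmask(struct fake_ctrl *c, uint32_t bits) { c->gintmsk |= bits; }
```

Starting the model with `gintmsk` set to the stale value from the log and raising `SOF_BIT` before `register_handler()` increments `misrouted`; calling `mask_all()` first closes that window.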
github-actions bot
pushed a commit
to sirdarckcat/linux-1
that referenced
this pull request
Oct 17, 2024
…the Crashkernel Scenario

[ Upstream commit 4058c39 ]

The issue is that before entering the crash kernel, the DWC USB controller did not perform operations such as resetting the interrupt mask bits. After entering the crash kernel, and before the USB interrupt handler registration was completed while loading the DWC USB driver, a GINTSTS_SOF interrupt was received. This triggered the misrouted_irq process within the GIC handling framework, ultimately leading to the misrouting of the interrupt, causing it to be handled by the wrong interrupt handler and resulting in the issue.

Summary: In a scenario where the kernel triggers a panic and enters the crash kernel, it is necessary to ensure that the interrupt mask bit is not enabled before the interrupt registration is complete. If an interrupt reaches the CPU at this moment, it will certainly not be handled correctly, especially in cases where this interrupt is reported frequently.

Please refer to the Crashkernel dmesg information as follows (the message on line 3 was added before devm_request_irq is called by the dwc2_driver_probe function):

[ 5.866837][ T1] dwc2 JMIC0010:01: supply vusb_d not found, using dummy regulator
[ 5.874588][ T1] dwc2 JMIC0010:01: supply vusb_a not found, using dummy regulator
[ 5.882335][ T1] dwc2 JMIC0010:01: before devm_request_irq irq: [71], gintmsk[0xf300080e], gintsts[0x04200009]
[ 5.892686][ C0] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.10.0-jmnd1.2_RC #18
[ 5.900327][ C0] Hardware name: CMSS HyperCard4-25G/HyperCard4-25G, BIOS 1.6.4 Jul 8 2024
[ 5.908836][ C0] Call trace:
[ 5.911965][ C0] dump_backtrace+0x0/0x1f0
[ 5.916308][ C0] show_stack+0x20/0x30
[ 5.920304][ C0] dump_stack+0xd8/0x140
[ 5.924387][ C0] pcie_xxx_handler+0x3c/0x1d8
[ 5.930121][ C0] __handle_irq_event_percpu+0x64/0x1e0
[ 5.935506][ C0] handle_irq_event+0x80/0x1d0
[ 5.940109][ C0] try_one_irq+0x138/0x174
[ 5.944365][ C0] misrouted_irq+0x134/0x140
[ 5.948795][ C0] note_interrupt+0x1d0/0x30c
[ 5.953311][ C0] handle_irq_event+0x13c/0x1d0
[ 5.958001][ C0] handle_fasteoi_irq+0xd4/0x260
[ 5.962779][ C0] __handle_domain_irq+0x88/0xf0
[ 5.967555][ C0] gic_handle_irq+0x9c/0x2f0
[ 5.971985][ C0] el1_irq+0xb8/0x140
[ 5.975807][ C0] __setup_irq+0x3dc/0x7cc
[ 5.980064][ C0] request_threaded_irq+0xf4/0x1b4
[ 5.985015][ C0] devm_request_threaded_irq+0x80/0x100
[ 5.990400][ C0] dwc2_driver_probe+0x1b8/0x6b0
[ 5.995178][ C0] platform_drv_probe+0x5c/0xb0
[ 5.999868][ C0] really_probe+0xf8/0x51c
[ 6.004125][ C0] driver_probe_device+0xfc/0x170
[ 6.008989][ C0] device_driver_attach+0xc8/0xd0
[ 6.013853][ C0] __driver_attach+0xe8/0x1b0
[ 6.018369][ C0] bus_for_each_dev+0x7c/0xdc
[ 6.022886][ C0] driver_attach+0x2c/0x3c
[ 6.027143][ C0] bus_add_driver+0xdc/0x240
[ 6.031573][ C0] driver_register+0x80/0x13c
[ 6.036090][ C0] __platform_driver_register+0x50/0x5c
[ 6.041476][ C0] dwc2_platform_driver_init+0x24/0x30
[ 6.046774][ C0] do_one_initcall+0x50/0x25c
[ 6.051291][ C0] do_initcall_level+0xe4/0xfc
[ 6.055894][ C0] do_initcalls+0x80/0xa4
[ 6.060064][ C0] kernel_init_freeable+0x198/0x240
[ 6.065102][ C0] kernel_init+0x1c/0x12c

Signed-off-by: Shawn Shao <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
github-actions bot
pushed a commit
to sirdarckcat/linux-1
that referenced
this pull request
Oct 17, 2024
…the Crashkernel Scenario [ Upstream commit 4058c39 ] The issue is that before entering the crash kernel, the DWC USB controller did not perform operations such as resetting the interrupt mask bits. After entering the crash kernel,before the USB interrupt handler registration was completed while loading the DWC USB driver,an GINTSTS_SOF interrupt was received.This triggered the misroute_irq process within the GIC handling framework,ultimately leading to the misrouting of the interrupt,causing it to be handled by the wrong interrupt handler and resulting in the issue. Summary:In a scenario where the kernel triggers a panic and enters the crash kernel,it is necessary to ensure that the interrupt mask bit is not enabled before the interrupt registration is complete. If an interrupt reaches the CPU at this moment,it will certainly not be handled correctly,especially in cases where this interrupt is reported frequently. Please refer to the Crashkernel dmesg information as follows (the message on line 3 was added before devm_request_irq is called by the dwc2_driver_probe function): [ 5.866837][ T1] dwc2 JMIC0010:01: supply vusb_d not found, using dummy regulator [ 5.874588][ T1] dwc2 JMIC0010:01: supply vusb_a not found, using dummy regulator [ 5.882335][ T1] dwc2 JMIC0010:01: before devm_request_irq irq: [71], gintmsk[0xf300080e], gintsts[0x04200009] [ 5.892686][ C0] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.10.0-jmnd1.2_RC gregkh#18 [ 5.900327][ C0] Hardware name: CMSS HyperCard4-25G/HyperCard4-25G, BIOS 1.6.4 Jul 8 2024 [ 5.908836][ C0] Call trace: [ 5.911965][ C0] dump_backtrace+0x0/0x1f0 [ 5.916308][ C0] show_stack+0x20/0x30 [ 5.920304][ C0] dump_stack+0xd8/0x140 [ 5.924387][ C0] pcie_xxx_handler+0x3c/0x1d8 [ 5.930121][ C0] __handle_irq_event_percpu+0x64/0x1e0 [ 5.935506][ C0] handle_irq_event+0x80/0x1d0 [ 5.940109][ C0] try_one_irq+0x138/0x174 [ 5.944365][ C0] misrouted_irq+0x134/0x140 [ 5.948795][ C0] note_interrupt+0x1d0/0x30c [ 5.953311][ C0] 
handle_irq_event+0x13c/0x1d0 [ 5.958001][ C0] handle_fasteoi_irq+0xd4/0x260 [ 5.962779][ C0] __handle_domain_irq+0x88/0xf0 [ 5.967555][ C0] gic_handle_irq+0x9c/0x2f0 [ 5.971985][ C0] el1_irq+0xb8/0x140 [ 5.975807][ C0] __setup_irq+0x3dc/0x7cc [ 5.980064][ C0] request_threaded_irq+0xf4/0x1b4 [ 5.985015][ C0] devm_request_threaded_irq+0x80/0x100 [ 5.990400][ C0] dwc2_driver_probe+0x1b8/0x6b0 [ 5.995178][ C0] platform_drv_probe+0x5c/0xb0 [ 5.999868][ C0] really_probe+0xf8/0x51c [ 6.004125][ C0] driver_probe_device+0xfc/0x170 [ 6.008989][ C0] device_driver_attach+0xc8/0xd0 [ 6.013853][ C0] __driver_attach+0xe8/0x1b0 [ 6.018369][ C0] bus_for_each_dev+0x7c/0xdc [ 6.022886][ C0] driver_attach+0x2c/0x3c [ 6.027143][ C0] bus_add_driver+0xdc/0x240 [ 6.031573][ C0] driver_register+0x80/0x13c [ 6.036090][ C0] __platform_driver_register+0x50/0x5c [ 6.041476][ C0] dwc2_platform_driver_init+0x24/0x30 [ 6.046774][ C0] do_one_initcall+0x50/0x25c [ 6.051291][ C0] do_initcall_level+0xe4/0xfc [ 6.055894][ C0] do_initcalls+0x80/0xa4 [ 6.060064][ C0] kernel_init_freeable+0x198/0x240 [ 6.065102][ C0] kernel_init+0x1c/0x12c Signed-off-by: Shawn Shao <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
github-actions bot
pushed a commit
to sirdarckcat/linux-1
that referenced
this pull request
Oct 17, 2024
…the Crashkernel Scenario [ Upstream commit 4058c39 ] The issue is that before entering the crash kernel, the DWC USB controller did not perform operations such as resetting the interrupt mask bits. After entering the crash kernel,before the USB interrupt handler registration was completed while loading the DWC USB driver,an GINTSTS_SOF interrupt was received.This triggered the misroute_irq process within the GIC handling framework,ultimately leading to the misrouting of the interrupt,causing it to be handled by the wrong interrupt handler and resulting in the issue. Summary:In a scenario where the kernel triggers a panic and enters the crash kernel,it is necessary to ensure that the interrupt mask bit is not enabled before the interrupt registration is complete. If an interrupt reaches the CPU at this moment,it will certainly not be handled correctly,especially in cases where this interrupt is reported frequently. Please refer to the Crashkernel dmesg information as follows (the message on line 3 was added before devm_request_irq is called by the dwc2_driver_probe function): [ 5.866837][ T1] dwc2 JMIC0010:01: supply vusb_d not found, using dummy regulator [ 5.874588][ T1] dwc2 JMIC0010:01: supply vusb_a not found, using dummy regulator [ 5.882335][ T1] dwc2 JMIC0010:01: before devm_request_irq irq: [71], gintmsk[0xf300080e], gintsts[0x04200009] [ 5.892686][ C0] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.10.0-jmnd1.2_RC gregkh#18 [ 5.900327][ C0] Hardware name: CMSS HyperCard4-25G/HyperCard4-25G, BIOS 1.6.4 Jul 8 2024 [ 5.908836][ C0] Call trace: [ 5.911965][ C0] dump_backtrace+0x0/0x1f0 [ 5.916308][ C0] show_stack+0x20/0x30 [ 5.920304][ C0] dump_stack+0xd8/0x140 [ 5.924387][ C0] pcie_xxx_handler+0x3c/0x1d8 [ 5.930121][ C0] __handle_irq_event_percpu+0x64/0x1e0 [ 5.935506][ C0] handle_irq_event+0x80/0x1d0 [ 5.940109][ C0] try_one_irq+0x138/0x174 [ 5.944365][ C0] misrouted_irq+0x134/0x140 [ 5.948795][ C0] note_interrupt+0x1d0/0x30c [ 5.953311][ C0] 
handle_irq_event+0x13c/0x1d0 [ 5.958001][ C0] handle_fasteoi_irq+0xd4/0x260 [ 5.962779][ C0] __handle_domain_irq+0x88/0xf0 [ 5.967555][ C0] gic_handle_irq+0x9c/0x2f0 [ 5.971985][ C0] el1_irq+0xb8/0x140 [ 5.975807][ C0] __setup_irq+0x3dc/0x7cc [ 5.980064][ C0] request_threaded_irq+0xf4/0x1b4 [ 5.985015][ C0] devm_request_threaded_irq+0x80/0x100 [ 5.990400][ C0] dwc2_driver_probe+0x1b8/0x6b0 [ 5.995178][ C0] platform_drv_probe+0x5c/0xb0 [ 5.999868][ C0] really_probe+0xf8/0x51c [ 6.004125][ C0] driver_probe_device+0xfc/0x170 [ 6.008989][ C0] device_driver_attach+0xc8/0xd0 [ 6.013853][ C0] __driver_attach+0xe8/0x1b0 [ 6.018369][ C0] bus_for_each_dev+0x7c/0xdc [ 6.022886][ C0] driver_attach+0x2c/0x3c [ 6.027143][ C0] bus_add_driver+0xdc/0x240 [ 6.031573][ C0] driver_register+0x80/0x13c [ 6.036090][ C0] __platform_driver_register+0x50/0x5c [ 6.041476][ C0] dwc2_platform_driver_init+0x24/0x30 [ 6.046774][ C0] do_one_initcall+0x50/0x25c [ 6.051291][ C0] do_initcall_level+0xe4/0xfc [ 6.055894][ C0] do_initcalls+0x80/0xa4 [ 6.060064][ C0] kernel_init_freeable+0x198/0x240 [ 6.065102][ C0] kernel_init+0x1c/0x12c Signed-off-by: Shawn Shao <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
github-actions bot
pushed a commit
to sirdarckcat/linux-1
that referenced
this pull request
Oct 17, 2024
…the Crashkernel Scenario [ Upstream commit 4058c39 ] The issue is that before entering the crash kernel, the DWC USB controller did not perform operations such as resetting the interrupt mask bits. After entering the crash kernel,before the USB interrupt handler registration was completed while loading the DWC USB driver,an GINTSTS_SOF interrupt was received.This triggered the misroute_irq process within the GIC handling framework,ultimately leading to the misrouting of the interrupt,causing it to be handled by the wrong interrupt handler and resulting in the issue. Summary:In a scenario where the kernel triggers a panic and enters the crash kernel,it is necessary to ensure that the interrupt mask bit is not enabled before the interrupt registration is complete. If an interrupt reaches the CPU at this moment,it will certainly not be handled correctly,especially in cases where this interrupt is reported frequently. Please refer to the Crashkernel dmesg information as follows (the message on line 3 was added before devm_request_irq is called by the dwc2_driver_probe function): [ 5.866837][ T1] dwc2 JMIC0010:01: supply vusb_d not found, using dummy regulator [ 5.874588][ T1] dwc2 JMIC0010:01: supply vusb_a not found, using dummy regulator [ 5.882335][ T1] dwc2 JMIC0010:01: before devm_request_irq irq: [71], gintmsk[0xf300080e], gintsts[0x04200009] [ 5.892686][ C0] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.10.0-jmnd1.2_RC gregkh#18 [ 5.900327][ C0] Hardware name: CMSS HyperCard4-25G/HyperCard4-25G, BIOS 1.6.4 Jul 8 2024 [ 5.908836][ C0] Call trace: [ 5.911965][ C0] dump_backtrace+0x0/0x1f0 [ 5.916308][ C0] show_stack+0x20/0x30 [ 5.920304][ C0] dump_stack+0xd8/0x140 [ 5.924387][ C0] pcie_xxx_handler+0x3c/0x1d8 [ 5.930121][ C0] __handle_irq_event_percpu+0x64/0x1e0 [ 5.935506][ C0] handle_irq_event+0x80/0x1d0 [ 5.940109][ C0] try_one_irq+0x138/0x174 [ 5.944365][ C0] misrouted_irq+0x134/0x140 [ 5.948795][ C0] note_interrupt+0x1d0/0x30c [ 5.953311][ C0] 
handle_irq_event+0x13c/0x1d0 [ 5.958001][ C0] handle_fasteoi_irq+0xd4/0x260 [ 5.962779][ C0] __handle_domain_irq+0x88/0xf0 [ 5.967555][ C0] gic_handle_irq+0x9c/0x2f0 [ 5.971985][ C0] el1_irq+0xb8/0x140 [ 5.975807][ C0] __setup_irq+0x3dc/0x7cc [ 5.980064][ C0] request_threaded_irq+0xf4/0x1b4 [ 5.985015][ C0] devm_request_threaded_irq+0x80/0x100 [ 5.990400][ C0] dwc2_driver_probe+0x1b8/0x6b0 [ 5.995178][ C0] platform_drv_probe+0x5c/0xb0 [ 5.999868][ C0] really_probe+0xf8/0x51c [ 6.004125][ C0] driver_probe_device+0xfc/0x170 [ 6.008989][ C0] device_driver_attach+0xc8/0xd0 [ 6.013853][ C0] __driver_attach+0xe8/0x1b0 [ 6.018369][ C0] bus_for_each_dev+0x7c/0xdc [ 6.022886][ C0] driver_attach+0x2c/0x3c [ 6.027143][ C0] bus_add_driver+0xdc/0x240 [ 6.031573][ C0] driver_register+0x80/0x13c [ 6.036090][ C0] __platform_driver_register+0x50/0x5c [ 6.041476][ C0] dwc2_platform_driver_init+0x24/0x30 [ 6.046774][ C0] do_one_initcall+0x50/0x25c [ 6.051291][ C0] do_initcall_level+0xe4/0xfc [ 6.055894][ C0] do_initcalls+0x80/0xa4 [ 6.060064][ C0] kernel_init_freeable+0x198/0x240 [ 6.065102][ C0] kernel_init+0x1c/0x12c Signed-off-by: Shawn Shao <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
github-actions bot
pushed a commit
to sirdarckcat/linux-1
that referenced
this pull request
Oct 17, 2024
…the Crashkernel Scenario

[ Upstream commit 4058c39 ]

The issue is that before entering the crash kernel, the DWC USB controller did not perform operations such as resetting the interrupt mask bits. After entering the crash kernel, before the USB interrupt handler registration was completed while loading the DWC USB driver, a GINTSTS_SOF interrupt was received. This triggered the misroute_irq process within the GIC handling framework, ultimately leading to the misrouting of the interrupt, causing it to be handled by the wrong interrupt handler and resulting in the issue.

Summary: In a scenario where the kernel triggers a panic and enters the crash kernel, it is necessary to ensure that the interrupt mask bit is not enabled before the interrupt registration is complete. If an interrupt reaches the CPU at this moment, it will certainly not be handled correctly, especially in cases where this interrupt is reported frequently.

Please refer to the Crashkernel dmesg information as follows (the message on line 3 was added before devm_request_irq is called by the dwc2_driver_probe function):

[ 5.866837][ T1] dwc2 JMIC0010:01: supply vusb_d not found, using dummy regulator
[ 5.874588][ T1] dwc2 JMIC0010:01: supply vusb_a not found, using dummy regulator
[ 5.882335][ T1] dwc2 JMIC0010:01: before devm_request_irq irq: [71], gintmsk[0xf300080e], gintsts[0x04200009]
[ 5.892686][ C0] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.10.0-jmnd1.2_RC #18
[ 5.900327][ C0] Hardware name: CMSS HyperCard4-25G/HyperCard4-25G, BIOS 1.6.4 Jul 8 2024
[ 5.908836][ C0] Call trace:
[ 5.911965][ C0] dump_backtrace+0x0/0x1f0
[ 5.916308][ C0] show_stack+0x20/0x30
[ 5.920304][ C0] dump_stack+0xd8/0x140
[ 5.924387][ C0] pcie_xxx_handler+0x3c/0x1d8
[ 5.930121][ C0] __handle_irq_event_percpu+0x64/0x1e0
[ 5.935506][ C0] handle_irq_event+0x80/0x1d0
[ 5.940109][ C0] try_one_irq+0x138/0x174
[ 5.944365][ C0] misrouted_irq+0x134/0x140
[ 5.948795][ C0] note_interrupt+0x1d0/0x30c
[ 5.953311][ C0] handle_irq_event+0x13c/0x1d0
[ 5.958001][ C0] handle_fasteoi_irq+0xd4/0x260
[ 5.962779][ C0] __handle_domain_irq+0x88/0xf0
[ 5.967555][ C0] gic_handle_irq+0x9c/0x2f0
[ 5.971985][ C0] el1_irq+0xb8/0x140
[ 5.975807][ C0] __setup_irq+0x3dc/0x7cc
[ 5.980064][ C0] request_threaded_irq+0xf4/0x1b4
[ 5.985015][ C0] devm_request_threaded_irq+0x80/0x100
[ 5.990400][ C0] dwc2_driver_probe+0x1b8/0x6b0
[ 5.995178][ C0] platform_drv_probe+0x5c/0xb0
[ 5.999868][ C0] really_probe+0xf8/0x51c
[ 6.004125][ C0] driver_probe_device+0xfc/0x170
[ 6.008989][ C0] device_driver_attach+0xc8/0xd0
[ 6.013853][ C0] __driver_attach+0xe8/0x1b0
[ 6.018369][ C0] bus_for_each_dev+0x7c/0xdc
[ 6.022886][ C0] driver_attach+0x2c/0x3c
[ 6.027143][ C0] bus_add_driver+0xdc/0x240
[ 6.031573][ C0] driver_register+0x80/0x13c
[ 6.036090][ C0] __platform_driver_register+0x50/0x5c
[ 6.041476][ C0] dwc2_platform_driver_init+0x24/0x30
[ 6.046774][ C0] do_one_initcall+0x50/0x25c
[ 6.051291][ C0] do_initcall_level+0xe4/0xfc
[ 6.055894][ C0] do_initcalls+0x80/0xa4
[ 6.060064][ C0] kernel_init_freeable+0x198/0x240
[ 6.065102][ C0] kernel_init+0x1c/0x12c

Signed-off-by: Shawn Shao <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
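The two register values printed just before devm_request_irq can be checked by hand. The sketch below is only an illustration, not driver code: it assumes the SOF interrupt is bit 3 in both GINTSTS and GINTMSK (as in the dwc2 register definitions), and the helper names are made up for this example.

```python
# Assumption: SOF is bit 3 of both GINTSTS (status) and GINTMSK (mask).
GINTSTS_SOF = 1 << 3


def unmasked_pending(gintsts: int, gintmsk: int) -> int:
    """Return the interrupt bits that are both pending and enabled."""
    return gintsts & gintmsk


def sof_would_fire(gintsts: int, gintmsk: int) -> bool:
    """True if a SOF interrupt is pending and not masked."""
    return bool(unmasked_pending(gintsts, gintmsk) & GINTSTS_SOF)


# Values taken from the dmesg line printed before devm_request_irq.
gintmsk = 0xF300080E
gintsts = 0x04200009

print(hex(unmasked_pending(gintsts, gintmsk)))  # 0x8
print(sof_would_fire(gintsts, gintmsk))         # True
```

Under that assumption, the only bit that is both pending and unmasked is SOF, which is consistent with the commit's description of a GINTSTS_SOF interrupt arriving before the handler is registered.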
piso77
pushed a commit
to piso77/linux
that referenced
this pull request
Oct 21, 2024
Uprobe needs to fetch args into a percpu buffer and then copy them to the ring buffer to avoid a non-atomic context problem. User-space strings and arrays can be very large, but the percpu buffer is only one page in size. store_trace_args() does not check whether the data exceeds a single page, which causes an out-of-bounds memory access.

It can be reproduced with the following steps:

1. Build the kernel with CONFIG_KASAN enabled.

2. Save the following program as test.c:

```
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

// If string length large than MAX_STRING_SIZE, the fetch_store_strlen()
// will return 0, cause __get_data_size() return shorter size, and
// store_trace_args() will not trigger out-of-bounds access.
// So make string length less than 4096.
#define STRLEN 4093

void generate_string(char *str, int n)
{
	int i;
	for (i = 0; i < n; ++i)
	{
		char c = i % 26 + 'a';
		str[i] = c;
	}
	str[n-1] = '\0';
}

void print_string(char *str)
{
	printf("%s\n", str);
}

int main()
{
	char tmp[STRLEN];

	generate_string(tmp, STRLEN);
	print_string(tmp);

	return 0;
}
```

3. Compile the program: `gcc -o test test.c`

4. Get the offset of `print_string()`:

```
objdump -t test | grep -w print_string
0000000000401199 g F .text 000000000000001b print_string
```

5. Configure the uprobe with offset 0x1199:

```
off=0x1199
cd /sys/kernel/debug/tracing/
echo "p /root/test:${off} arg1=+0(%di):ustring arg2=\$comm arg3=+0(%di):ustring" > uprobe_events
echo 1 > events/uprobes/enable
echo 1 > tracing_on
```

6. Run `test`, and KASAN will report an error:

==================================================================
BUG: KASAN: use-after-free in strncpy_from_user+0x1d6/0x1f0
Write of size 8 at addr ffff88812311c004 by task test/499
CPU: 0 UID: 0 PID: 499 Comm: test Not tainted 6.12.0-rc3+ #18
Hardware name: Red Hat KVM, BIOS 1.16.0-4.al8 04/01/2014
Call Trace:
<TASK>
dump_stack_lvl+0x55/0x70
print_address_description.constprop.0+0x27/0x310
kasan_report+0x10f/0x120
? strncpy_from_user+0x1d6/0x1f0
strncpy_from_user+0x1d6/0x1f0
? rmqueue.constprop.0+0x70d/0x2ad0
process_fetch_insn+0xb26/0x1470
? __pfx_process_fetch_insn+0x10/0x10
? _raw_spin_lock+0x85/0xe0
? __pfx__raw_spin_lock+0x10/0x10
? __pte_offset_map+0x1f/0x2d0
? unwind_next_frame+0xc5f/0x1f80
? arch_stack_walk+0x68/0xf0
? is_bpf_text_address+0x23/0x30
? kernel_text_address.part.0+0xbb/0xd0
? __kernel_text_address+0x66/0xb0
? unwind_get_return_address+0x5e/0xa0
? __pfx_stack_trace_consume_entry+0x10/0x10
? arch_stack_walk+0xa2/0xf0
? _raw_spin_lock_irqsave+0x8b/0xf0
? __pfx__raw_spin_lock_irqsave+0x10/0x10
? depot_alloc_stack+0x4c/0x1f0
? _raw_spin_unlock_irqrestore+0xe/0x30
? stack_depot_save_flags+0x35d/0x4f0
? kasan_save_stack+0x34/0x50
? kasan_save_stack+0x24/0x50
? mutex_lock+0x91/0xe0
? __pfx_mutex_lock+0x10/0x10
prepare_uprobe_buffer.part.0+0x2cd/0x500
uprobe_dispatcher+0x2c3/0x6a0
? __pfx_uprobe_dispatcher+0x10/0x10
? __kasan_slab_alloc+0x4d/0x90
handler_chain+0xdd/0x3e0
handle_swbp+0x26e/0x3d0
? __pfx_handle_swbp+0x10/0x10
? uprobe_pre_sstep_notifier+0x151/0x1b0
irqentry_exit_to_user_mode+0xe2/0x1b0
asm_exc_int3+0x39/0x40
RIP: 0033:0x401199
Code: 01 c2 0f b6 45 fb 88 02 83 45 fc 01 8b 45 fc 3b 45 e4 7c b7 8b 45 e4 48 98 48 8d 50 ff 48 8b 45 e8 48 01 d0 ce
RSP: 002b:00007ffdf00576a8 EFLAGS: 00000206
RAX: 00007ffdf00576b0 RBX: 0000000000000000 RCX: 0000000000000ff2
RDX: 0000000000000ffc RSI: 0000000000000ffd RDI: 00007ffdf00576b0
RBP: 00007ffdf00586b0 R08: 00007feb2f9c0d20 R09: 00007feb2f9c0d20
R10: 0000000000000001 R11: 0000000000000202 R12: 0000000000401040
R13: 00007ffdf0058780 R14: 0000000000000000 R15: 0000000000000000
</TASK>

This commit enforces that the buffer's maxlen is less than a page size, to avoid the out-of-bounds access in store_trace_args().

Link: https://lore.kernel.org/all/[email protected]/
Fixes: dcad1a2 ("tracing/uprobes: Fetch args before reserving a ring buffer")
Signed-off-by: Qiao Ma <[email protected]>
Signed-off-by: Masami Hiramatsu (Google) <[email protected]>
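To see why the reproducer overruns the buffer, it may help to add up the sizes involved. The following is a rough model only, not the kernel's actual accounting: it assumes a 4096-byte page, counts just the two `ustring` arguments (ignoring `$comm` and any fixed-size header), and `clamp_to_buffer` is a hypothetical helper that merely illustrates the idea of capping the length at one page.

```python
PAGE_SIZE = 4096  # assumed page size; the percpu buffer is described as one page
STRLEN = 4093     # string size used in test.c, including the trailing NUL


def dynamic_bytes(string_sizes):
    """Total bytes needed to store all fetched strings back to back."""
    return sum(string_sizes)


def clamp_to_buffer(requested, limit=PAGE_SIZE):
    """Hypothetical cap: never ask for more than the buffer can hold."""
    return min(requested, limit)


# arg1 and arg3 in the uprobe definition both fetch the same string.
requested = dynamic_bytes([STRLEN, STRLEN])

print(requested)                   # 8186
print(requested > PAGE_SIZE)       # True
print(clamp_to_buffer(requested))  # 4096
```

Each string alone fits in a page, but two of them together need roughly twice the buffer, which matches the commit's point that the total length has to be bounded.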
piso77
pushed a commit
to piso77/linux
that referenced
this pull request
Oct 25, 2024
During the migration of Soundwire runtime stream allocation from the Qualcomm Soundwire controller to SoC's soundcard drivers the sdm845 soundcard was forgotten.

At this point any playback attempt or audio daemon startup, for instance on sdm845-db845c (Qualcomm RB3 board), will result in stream pointer NULL dereference:

Unable to handle kernel NULL pointer dereference at virtual address 0000000000000020
Mem abort info:
  ESR = 0x0000000096000004
  EC = 0x25: DABT (current EL), IL = 32 bits
  SET = 0, FnV = 0
  EA = 0, S1PTW = 0
  FSC = 0x04: level 0 translation fault
Data abort info:
  ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
  CM = 0, WnR = 0, TnD = 0, TagAccess = 0
  GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
user pgtable: 4k pages, 48-bit VAs, pgdp=0000000101ecf000
[0000000000000020] pgd=0000000000000000, p4d=0000000000000000
Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP
Modules linked in: ...
CPU: 5 UID: 0 PID: 1198 Comm: aplay Not tainted 6.12.0-rc2-qcomlt-arm64-00059-g9d78f315a362-dirty #18
Hardware name: Thundercomm Dragonboard 845c (DT)
pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : sdw_stream_add_slave+0x44/0x380 [soundwire_bus]
lr : sdw_stream_add_slave+0x44/0x380 [soundwire_bus]
sp : ffff80008a2035c0
x29: ffff80008a2035c0 x28: ffff80008a203978 x27: 0000000000000000
x26: 00000000000000c0 x25: 0000000000000000 x24: ffff1676025f4800
x23: ffff167600ff1cb8 x22: ffff167600ff1c98 x21: 0000000000000003
x20: ffff167607316000 x19: ffff167604e64e80 x18: 0000000000000000
x17: 0000000000000000 x16: ffffcec265074160 x15: 0000000000000000
x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000
x11: 0000000000000000 x10: 0000000000000000 x9 : 0000000000000000
x8 : 0000000000000000 x7 : 0000000000000000 x6 : ffff167600ff1cec
x5 : ffffcec22cfa2010 x4 : 0000000000000000 x3 : 0000000000000003
x2 : ffff167613f836c0 x1 : 0000000000000000 x0 : ffff16761feb60b8
Call trace:
 sdw_stream_add_slave+0x44/0x380 [soundwire_bus]
 wsa881x_hw_params+0x68/0x80 [snd_soc_wsa881x]
 snd_soc_dai_hw_params+0x3c/0xa4
 __soc_pcm_hw_params+0x230/0x660
 dpcm_be_dai_hw_params+0x1d0/0x3f8
 dpcm_fe_dai_hw_params+0x98/0x268
 snd_pcm_hw_params+0x124/0x460
 snd_pcm_common_ioctl+0x998/0x16e8
 snd_pcm_ioctl+0x34/0x58
 __arm64_sys_ioctl+0xac/0xf8
 invoke_syscall+0x48/0x104
 el0_svc_common.constprop.0+0x40/0xe0
 do_el0_svc+0x1c/0x28
 el0_svc+0x34/0xe0
 el0t_64_sync_handler+0x120/0x12c
 el0t_64_sync+0x190/0x194
Code: aa0403fb f9418400 9100e000 9400102f (f8420f22)
---[ end trace 0000000000000000 ]---

0000000000006108 <sdw_stream_add_slave>:
    6108: d503233f paciasp
    610c: a9b97bfd stp x29, x30, [sp, #-112]!
    6110: 910003fd mov x29, sp
    6114: a90153f3 stp x19, x20, [sp, #16]
    6118: a9025bf5 stp x21, x22, [sp, #32]
    611c: aa0103f6 mov x22, x1
    6120: 2a0303f5 mov w21, w3
    6124: a90363f7 stp x23, x24, [sp, #48]
    6128: aa0003f8 mov x24, x0
    612c: aa0203f7 mov x23, x2
    6130: a9046bf9 stp x25, x26, [sp, #64]
    6134: aa0403f9 mov x25, x4        <-- x4 copied to x25
    6138: a90573fb stp x27, x28, [sp, #80]
    613c: aa0403fb mov x27, x4
    6140: f9418400 ldr x0, [x0, #776]
    6144: 9100e000 add x0, x0, #0x38
    6148: 94000000 bl 0 <mutex_lock>
    614c: f8420f22 ldr x2, [x25, #32]!  <-- offset 0x44
    ^^^
This is 0x6108 + offset 0x44 from the beginning of sdw_stream_add_slave() where data abort happens.

wsa881x_hw_params() is called with stream = NULL and passes it further in register x4 (5th argument) to sdw_stream_add_slave() without any checks. Value from x4 is copied to x25 and finally it aborts on trying to load a value from address in x25 plus offset 32 (in dec) which corresponds to master_list member in struct sdw_stream_runtime:

struct sdw_stream_runtime {
        const char * name;                    /*  0  8 */
        struct sdw_stream_params params;      /*  8 12 */
        enum sdw_stream_state state;          /* 20  4 */
        enum sdw_stream_type type;            /* 24  4 */
        /* XXX 4 bytes hole, try to pack */
 here-> struct list_head master_list;         /* 32 16 */
        int m_rt_count;                       /* 48  4 */
        /* size: 56, cachelines: 1, members: 6 */
        /* sum members: 48, holes: 1, sum holes: 4 */
        /* padding: 4 */
        /* last cacheline: 56 bytes */

Fix this by adding required calls to qcom_snd_sdw_startup() and sdw_release_stream() to startup and shutdown routines which restores the previous correct behaviour when ->set_stream() method is called to set a valid stream runtime pointer on playback startup.

Reproduced and then fix was tested on db845c RB3 board.

Reported-by: Dmitry Baryshkov <[email protected]>
Cc: [email protected]
Fixes: 15c7fab ("ASoC: qcom: Move Soundwire runtime stream alloc to soundcards")
Cc: Srinivas Kandagatla <[email protected]>
Cc: Dmitry Baryshkov <[email protected]>
Cc: Krzysztof Kozlowski <[email protected]>
Cc: Pierre-Louis Bossart <[email protected]>
Signed-off-by: Alexey Klimov <[email protected]>
Tested-by: Steev Klimaszewski <[email protected]> # Lenovo Yoga C630
Reviewed-by: Krzysztof Kozlowski <[email protected]>
Reviewed-by: Srinivas Kandagatla <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Mark Brown <[email protected]>
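The offsets quoted in this message can be cross-checked with a small model of the structure. This is a sketch under stated assumptions, not the kernel definition: pointers are modelled as 64-bit integers, `params` is a 12-byte stand-in with 4-byte alignment, the enums are taken to be 4 bytes, and the Python class names are invented for the example. It assumes a 64-bit host where 64-bit integers are 8-byte aligned.

```python
import ctypes


class ListHead(ctypes.Structure):
    # struct list_head: two pointers (next, prev), modelled as 64-bit integers.
    _fields_ = [("next", ctypes.c_uint64), ("prev", ctypes.c_uint64)]


class SdwStreamRuntime(ctypes.Structure):
    # Field sizes follow the pahole-style listing in the commit message.
    _fields_ = [
        ("name", ctypes.c_uint64),        # const char * on a 64-bit target
        ("params", ctypes.c_uint32 * 3),  # 12-byte stand-in for sdw_stream_params
        ("state", ctypes.c_int32),        # enum, assumed 4 bytes
        ("type", ctypes.c_int32),         # enum, assumed 4 bytes
        ("master_list", ListHead),        # 8-byte aligned, so a 4-byte hole precedes it
        ("m_rt_count", ctypes.c_int32),
    ]


def fault_address(stream_ptr: int) -> int:
    """Address touched when loading master_list through a given stream pointer."""
    return stream_ptr + SdwStreamRuntime.master_list.offset


# Address of the faulting instruction: function start plus the pc offset.
faulting_pc = 0x6108 + 0x44

print(SdwStreamRuntime.master_list.offset)  # 32
print(ctypes.sizeof(SdwStreamRuntime))      # 56
print(hex(faulting_pc))                     # 0x614c
print(hex(fault_address(0)))                # 0x20
```

With a NULL stream pointer the load lands at address 0x20, which agrees with the "virtual address 0000000000000020" reported in the oops, and 0x614c is the `ldr x2, [x25, #32]!` instruction in the disassembly.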
paniakin-aws
pushed a commit
to amazonlinux/linux
that referenced
this pull request
Oct 30, 2024
[ Upstream commit c145eea ]

mwifiex_get_priv_by_id() returns the priv pointer corresponding to the bss_num and bss_type, but without checking if the priv is actually currently in use. Unused priv pointers do not have a wiphy attached to them, which can lead to NULL pointer dereferences further down the callstack. Fix this by returning only used priv pointers which have priv->bss_mode set to something other than NL80211_IFTYPE_UNSPECIFIED.

Said NULL pointer dereference happened when an access point was started with wpa_supplicant -i mlan0 with this config:

network={
        ssid="somessid"
        mode=2
        frequency=2412
        key_mgmt=WPA-PSK WPA-PSK-SHA256
        proto=RSN
        group=CCMP
        pairwise=CCMP
        psk="12345678"
}

When waiting for the AP to be established, interrupting wpa_supplicant with <ctrl-c> and starting it again, this happens:

| Unable to handle kernel NULL pointer dereference at virtual address 0000000000000140
| Mem abort info:
|   ESR = 0x0000000096000004
|   EC = 0x25: DABT (current EL), IL = 32 bits
|   SET = 0, FnV = 0
|   EA = 0, S1PTW = 0
|   FSC = 0x04: level 0 translation fault
| Data abort info:
|   ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
|   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
|   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
| user pgtable: 4k pages, 48-bit VAs, pgdp=0000000046d96000
| [0000000000000140] pgd=0000000000000000, p4d=0000000000000000
| Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP
| Modules linked in: caam_jr caamhash_desc spidev caamalg_desc crypto_engine authenc libdes mwifiex_sdio
+mwifiex crct10dif_ce cdc_acm onboard_usb_hub fsl_imx8_ddr_perf imx8m_ddrc rtc_ds1307 lm75 rtc_snvs
+imx_sdma caam imx8mm_thermal spi_imx error imx_cpufreq_dt fuse ip_tables x_tables ipv6
| CPU: 0 PID: 8 Comm: kworker/0:1 Not tainted 6.9.0-00007-g937242013fce-dirty #18
| Hardware name: somemachine (DT)
| Workqueue: events sdio_irq_work
| pstate: 00000005 (nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
| pc : mwifiex_get_cfp+0xd8/0x15c [mwifiex]
| lr : mwifiex_get_cfp+0x34/0x15c [mwifiex]
| sp : ffff8000818b3a70
| x29: ffff8000818b3a70 x28: ffff000006bfd8a5 x27: 0000000000000004
| x26: 000000000000002c x25: 0000000000001511 x24: 0000000002e86bc9
| x23: ffff000006bfd996 x22: 0000000000000004 x21: ffff000007bec000
| x20: 000000000000002c x19: 0000000000000000 x18: 0000000000000000
| x17: 000000040044ffff x16: 00500072b5503510 x15: ccc283740681e517
| x14: 0201000101006d15 x13: 0000000002e8ff43 x12: 002c01000000ffb1
| x11: 0100000000000000 x10: 02e8ff43002c0100 x9 : 0000ffb100100157
| x8 : ffff000003d20000 x7 : 00000000000002f1 x6 : 00000000ffffe124
| x5 : 0000000000000001 x4 : 0000000000000003 x3 : 0000000000000000
| x2 : 0000000000000000 x1 : 0001000000011001 x0 : 0000000000000000
| Call trace:
|  mwifiex_get_cfp+0xd8/0x15c [mwifiex]
|  mwifiex_parse_single_response_buf+0x1d0/0x504 [mwifiex]
|  mwifiex_handle_event_ext_scan_report+0x19c/0x2f8 [mwifiex]
|  mwifiex_process_sta_event+0x298/0xf0c [mwifiex]
|  mwifiex_process_event+0x110/0x238 [mwifiex]
|  mwifiex_main_process+0x428/0xa44 [mwifiex]
|  mwifiex_sdio_interrupt+0x64/0x12c [mwifiex_sdio]
|  process_sdio_pending_irqs+0x64/0x1b8
|  sdio_irq_work+0x4c/0x7c
|  process_one_work+0x148/0x2a0
|  worker_thread+0x2fc/0x40c
|  kthread+0x110/0x114
|  ret_from_fork+0x10/0x20
| Code: a94153f3 a8c37bfd d50323bf d65f03c0 (f940a000)
| ---[ end trace 0000000000000000 ]---

Signed-off-by: Sascha Hauer <[email protected]>
Acked-by: Brian Norris <[email protected]>
Reviewed-by: Francesco Dolcini <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Sasha Levin <[email protected]>
(cherry picked from commit a12cf97)
Signed-off-by: Harshit Mogalapalli <[email protected]>
Signed-off-by: Vegard Nossum <[email protected]>
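The "Mem abort info" fields in the oops are derived from the raw ESR value. The helper below is an illustrative decoder, not kernel code; it assumes the AArch64 ESR_ELx layout (EC in bits 31:26, IL in bit 25, ISS in bits 24:0) and, for data aborts, WnR in bit 6 and the fault status code in bits 5:0. The function name is made up for this example.

```python
def decode_esr(esr: int) -> dict:
    """Split an AArch64 ESR value into the fields printed in the oops."""
    return {
        "EC": (esr >> 26) & 0x3F,   # exception class
        "IL": (esr >> 25) & 0x1,    # 1 means a 32-bit instruction
        "ISS": esr & 0x1FFFFFF,     # instruction specific syndrome
        "WnR": (esr >> 6) & 0x1,    # data aborts only: 1 = write, 0 = read
        "FSC": esr & 0x3F,          # data aborts only: fault status code
    }


fields = decode_esr(0x0000000096000004)

print(hex(fields["EC"]))   # 0x25
print(fields["IL"])        # 1
print(hex(fields["ISS"]))  # 0x4
print(fields["WnR"])       # 0
print(hex(fields["FSC"]))  # 0x4
```

These values line up with the log: EC = 0x25 (data abort at the current EL), a 32-bit instruction, WnR = 0 (a read), and FSC = 0x04 (level 0 translation fault), which is what a load through a near-NULL pointer would be expected to produce.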
github-actions bot
pushed a commit
to sirdarckcat/linux-1
that referenced
this pull request
Nov 1, 2024
[ Upstream commit 373b933 ] Uprobe needs to fetch args into a percpu buffer, and then copy to ring buffer to avoid non-atomic context problem. Sometimes user-space strings, arrays can be very large, but the size of percpu buffer is only page size. And store_trace_args() won't check whether these data exceeds a single page or not, caused out-of-bounds memory access. It could be reproduced by following steps: 1. build kernel with CONFIG_KASAN enabled 2. save follow program as test.c ``` \#include <stdio.h> \#include <stdlib.h> \#include <string.h> // If string length large than MAX_STRING_SIZE, the fetch_store_strlen() // will return 0, cause __get_data_size() return shorter size, and // store_trace_args() will not trigger out-of-bounds access. // So make string length less than 4096. \#define STRLEN 4093 void generate_string(char *str, int n) { int i; for (i = 0; i < n; ++i) { char c = i % 26 + 'a'; str[i] = c; } str[n-1] = '\0'; } void print_string(char *str) { printf("%s\n", str); } int main() { char tmp[STRLEN]; generate_string(tmp, STRLEN); print_string(tmp); return 0; } ``` 3. compile program `gcc -o test test.c` 4. get the offset of `print_string()` ``` objdump -t test | grep -w print_string 0000000000401199 g F .text 000000000000001b print_string ``` 5. configure uprobe with offset 0x1199 ``` off=0x1199 cd /sys/kernel/debug/tracing/ echo "p /root/test:${off} arg1=+0(%di):ustring arg2=\$comm arg3=+0(%di):ustring" > uprobe_events echo 1 > events/uprobes/enable echo 1 > tracing_on ``` 6. run `test`, and kasan will report error. ================================================================== BUG: KASAN: use-after-free in strncpy_from_user+0x1d6/0x1f0 Write of size 8 at addr ffff88812311c004 by task test/499CPU: 0 UID: 0 PID: 499 Comm: test Not tainted 6.12.0-rc3+ gregkh#18 Hardware name: Red Hat KVM, BIOS 1.16.0-4.al8 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0x55/0x70 print_address_description.constprop.0+0x27/0x310 kasan_report+0x10f/0x120 ? 
strncpy_from_user+0x1d6/0x1f0 strncpy_from_user+0x1d6/0x1f0 ? rmqueue.constprop.0+0x70d/0x2ad0 process_fetch_insn+0xb26/0x1470 ? __pfx_process_fetch_insn+0x10/0x10 ? _raw_spin_lock+0x85/0xe0 ? __pfx__raw_spin_lock+0x10/0x10 ? __pte_offset_map+0x1f/0x2d0 ? unwind_next_frame+0xc5f/0x1f80 ? arch_stack_walk+0x68/0xf0 ? is_bpf_text_address+0x23/0x30 ? kernel_text_address.part.0+0xbb/0xd0 ? __kernel_text_address+0x66/0xb0 ? unwind_get_return_address+0x5e/0xa0 ? __pfx_stack_trace_consume_entry+0x10/0x10 ? arch_stack_walk+0xa2/0xf0 ? _raw_spin_lock_irqsave+0x8b/0xf0 ? __pfx__raw_spin_lock_irqsave+0x10/0x10 ? depot_alloc_stack+0x4c/0x1f0 ? _raw_spin_unlock_irqrestore+0xe/0x30 ? stack_depot_save_flags+0x35d/0x4f0 ? kasan_save_stack+0x34/0x50 ? kasan_save_stack+0x24/0x50 ? mutex_lock+0x91/0xe0 ? __pfx_mutex_lock+0x10/0x10 prepare_uprobe_buffer.part.0+0x2cd/0x500 uprobe_dispatcher+0x2c3/0x6a0 ? __pfx_uprobe_dispatcher+0x10/0x10 ? __kasan_slab_alloc+0x4d/0x90 handler_chain+0xdd/0x3e0 handle_swbp+0x26e/0x3d0 ? __pfx_handle_swbp+0x10/0x10 ? uprobe_pre_sstep_notifier+0x151/0x1b0 irqentry_exit_to_user_mode+0xe2/0x1b0 asm_exc_int3+0x39/0x40 RIP: 0033:0x401199 Code: 01 c2 0f b6 45 fb 88 02 83 45 fc 01 8b 45 fc 3b 45 e4 7c b7 8b 45 e4 48 98 48 8d 50 ff 48 8b 45 e8 48 01 d0 ce RSP: 002b:00007ffdf00576a8 EFLAGS: 00000206 RAX: 00007ffdf00576b0 RBX: 0000000000000000 RCX: 0000000000000ff2 RDX: 0000000000000ffc RSI: 0000000000000ffd RDI: 00007ffdf00576b0 RBP: 00007ffdf00586b0 R08: 00007feb2f9c0d20 R09: 00007feb2f9c0d20 R10: 0000000000000001 R11: 0000000000000202 R12: 0000000000401040 R13: 00007ffdf0058780 R14: 0000000000000000 R15: 0000000000000000 </TASK> This commit enforces the buffer's maxlen less than a page-size to avoid store_trace_args() out-of-memory access. 
Link: https://lore.kernel.org/all/[email protected]/ Fixes: dcad1a2 ("tracing/uprobes: Fetch args before reserving a ring buffer") Signed-off-by: Qiao Ma <[email protected]> Signed-off-by: Masami Hiramatsu (Google) <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
github-actions bot
pushed a commit
to sirdarckcat/linux-1
that referenced
this pull request
Nov 1, 2024
commit d0e806b upstream. During the migration of Soundwire runtime stream allocation from the Qualcomm Soundwire controller to SoC's soundcard drivers the sdm845 soundcard was forgotten. At this point any playback attempt or audio daemon startup, for instance on sdm845-db845c (Qualcomm RB3 board), will result in stream pointer NULL dereference: Unable to handle kernel NULL pointer dereference at virtual address 0000000000000020 Mem abort info: ESR = 0x0000000096000004 EC = 0x25: DABT (current EL), IL = 32 bits SET = 0, FnV = 0 EA = 0, S1PTW = 0 FSC = 0x04: level 0 translation fault Data abort info: ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000 CM = 0, WnR = 0, TnD = 0, TagAccess = 0 GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 user pgtable: 4k pages, 48-bit VAs, pgdp=0000000101ecf000 [0000000000000020] pgd=0000000000000000, p4d=0000000000000000 Internal error: Oops: 0000000096000004 [gregkh#1] PREEMPT SMP Modules linked in: ... CPU: 5 UID: 0 PID: 1198 Comm: aplay Not tainted 6.12.0-rc2-qcomlt-arm64-00059-g9d78f315a362-dirty gregkh#18 Hardware name: Thundercomm Dragonboard 845c (DT) pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : sdw_stream_add_slave+0x44/0x380 [soundwire_bus] lr : sdw_stream_add_slave+0x44/0x380 [soundwire_bus] sp : ffff80008a2035c0 x29: ffff80008a2035c0 x28: ffff80008a203978 x27: 0000000000000000 x26: 00000000000000c0 x25: 0000000000000000 x24: ffff1676025f4800 x23: ffff167600ff1cb8 x22: ffff167600ff1c98 x21: 0000000000000003 x20: ffff167607316000 x19: ffff167604e64e80 x18: 0000000000000000 x17: 0000000000000000 x16: ffffcec265074160 x15: 0000000000000000 x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000 x11: 0000000000000000 x10: 0000000000000000 x9 : 0000000000000000 x8 : 0000000000000000 x7 : 0000000000000000 x6 : ffff167600ff1cec x5 : ffffcec22cfa2010 x4 : 0000000000000000 x3 : 0000000000000003 x2 : ffff167613f836c0 x1 : 0000000000000000 x0 : ffff16761feb60b8 Call trace: sdw_stream_add_slave+0x44/0x380 
[soundwire_bus] wsa881x_hw_params+0x68/0x80 [snd_soc_wsa881x] snd_soc_dai_hw_params+0x3c/0xa4 __soc_pcm_hw_params+0x230/0x660 dpcm_be_dai_hw_params+0x1d0/0x3f8 dpcm_fe_dai_hw_params+0x98/0x268 snd_pcm_hw_params+0x124/0x460 snd_pcm_common_ioctl+0x998/0x16e8 snd_pcm_ioctl+0x34/0x58 __arm64_sys_ioctl+0xac/0xf8 invoke_syscall+0x48/0x104 el0_svc_common.constprop.0+0x40/0xe0 do_el0_svc+0x1c/0x28 el0_svc+0x34/0xe0 el0t_64_sync_handler+0x120/0x12c el0t_64_sync+0x190/0x194 Code: aa0403fb f9418400 9100e000 9400102f (f8420f22) ---[ end trace 0000000000000000 ]--- 0000000000006108 <sdw_stream_add_slave>: 6108: d503233f paciasp 610c: a9b97bfd stp x29, x30, [sp, #-112]! 6110: 910003fd mov x29, sp 6114: a90153f3 stp x19, x20, [sp, gregkh#16] 6118: a9025bf5 stp x21, x22, [sp, #32] 611c: aa0103f6 mov x22, x1 6120: 2a0303f5 mov w21, w3 6124: a90363f7 stp x23, x24, [sp, #48] 6128: aa0003f8 mov x24, x0 612c: aa0203f7 mov x23, x2 6130: a9046bf9 stp x25, x26, [sp, #64] 6134: aa0403f9 mov x25, x4 <-- x4 copied to x25 6138: a90573fb stp x27, x28, [sp, #80] 613c: aa0403fb mov x27, x4 6140: f9418400 ldr x0, [x0, #776] 6144: 9100e000 add x0, x0, #0x38 6148: 94000000 bl 0 <mutex_lock> 614c: f8420f22 ldr x2, [x25, #32]! <-- offset 0x44 ^^^ This is 0x6108 + offset 0x44 from the beginning of sdw_stream_add_slave() where data abort happens. wsa881x_hw_params() is called with stream = NULL and passes it further in register x4 (5th argument) to sdw_stream_add_slave() without any checks. 
Value from x4 is copied to x25 and finally it aborts on trying to load a value from address in x25 plus offset 32 (in dec) which corresponds to master_list member in struct sdw_stream_runtime: struct sdw_stream_runtime { const char * name; /* 0 8 */ struct sdw_stream_params params; /* 8 12 */ enum sdw_stream_state state; /* 20 4 */ enum sdw_stream_type type; /* 24 4 */ /* XXX 4 bytes hole, try to pack */ here-> struct list_head master_list; /* 32 16 */ int m_rt_count; /* 48 4 */ /* size: 56, cachelines: 1, members: 6 */ /* sum members: 48, holes: 1, sum holes: 4 */ /* padding: 4 */ /* last cacheline: 56 bytes */ Fix this by adding required calls to qcom_snd_sdw_startup() and sdw_release_stream() to startup and shutdown routines which restores the previous correct behaviour when ->set_stream() method is called to set a valid stream runtime pointer on playback startup. Reproduced and then fix was tested on db845c RB3 board. Reported-by: Dmitry Baryshkov <[email protected]> Cc: [email protected] Fixes: 15c7fab ("ASoC: qcom: Move Soundwire runtime stream alloc to soundcards") Cc: Srinivas Kandagatla <[email protected]> Cc: Dmitry Baryshkov <[email protected]> Cc: Krzysztof Kozlowski <[email protected]> Cc: Pierre-Louis Bossart <[email protected]> Signed-off-by: Alexey Klimov <[email protected]> Tested-by: Steev Klimaszewski <[email protected]> # Lenovo Yoga C630 Reviewed-by: Krzysztof Kozlowski <[email protected]> Reviewed-by: Srinivas Kandagatla <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Mark Brown <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
gregkh
pushed a commit
that referenced
this pull request
Nov 8, 2024
…the Crashkernel Scenario [ Upstream commit 4058c39 ] The issue is that before entering the crash kernel, the DWC USB controller did not perform operations such as resetting the interrupt mask bits. After entering the crash kernel,before the USB interrupt handler registration was completed while loading the DWC USB driver,an GINTSTS_SOF interrupt was received.This triggered the misroute_irq process within the GIC handling framework,ultimately leading to the misrouting of the interrupt,causing it to be handled by the wrong interrupt handler and resulting in the issue. Summary:In a scenario where the kernel triggers a panic and enters the crash kernel,it is necessary to ensure that the interrupt mask bit is not enabled before the interrupt registration is complete. If an interrupt reaches the CPU at this moment,it will certainly not be handled correctly,especially in cases where this interrupt is reported frequently. Please refer to the Crashkernel dmesg information as follows (the message on line 3 was added before devm_request_irq is called by the dwc2_driver_probe function): [ 5.866837][ T1] dwc2 JMIC0010:01: supply vusb_d not found, using dummy regulator [ 5.874588][ T1] dwc2 JMIC0010:01: supply vusb_a not found, using dummy regulator [ 5.882335][ T1] dwc2 JMIC0010:01: before devm_request_irq irq: [71], gintmsk[0xf300080e], gintsts[0x04200009] [ 5.892686][ C0] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.10.0-jmnd1.2_RC #18 [ 5.900327][ C0] Hardware name: CMSS HyperCard4-25G/HyperCard4-25G, BIOS 1.6.4 Jul 8 2024 [ 5.908836][ C0] Call trace: [ 5.911965][ C0] dump_backtrace+0x0/0x1f0 [ 5.916308][ C0] show_stack+0x20/0x30 [ 5.920304][ C0] dump_stack+0xd8/0x140 [ 5.924387][ C0] pcie_xxx_handler+0x3c/0x1d8 [ 5.930121][ C0] __handle_irq_event_percpu+0x64/0x1e0 [ 5.935506][ C0] handle_irq_event+0x80/0x1d0 [ 5.940109][ C0] try_one_irq+0x138/0x174 [ 5.944365][ C0] misrouted_irq+0x134/0x140 [ 5.948795][ C0] note_interrupt+0x1d0/0x30c [ 5.953311][ C0] 
handle_irq_event+0x13c/0x1d0 [ 5.958001][ C0] handle_fasteoi_irq+0xd4/0x260 [ 5.962779][ C0] __handle_domain_irq+0x88/0xf0 [ 5.967555][ C0] gic_handle_irq+0x9c/0x2f0 [ 5.971985][ C0] el1_irq+0xb8/0x140 [ 5.975807][ C0] __setup_irq+0x3dc/0x7cc [ 5.980064][ C0] request_threaded_irq+0xf4/0x1b4 [ 5.985015][ C0] devm_request_threaded_irq+0x80/0x100 [ 5.990400][ C0] dwc2_driver_probe+0x1b8/0x6b0 [ 5.995178][ C0] platform_drv_probe+0x5c/0xb0 [ 5.999868][ C0] really_probe+0xf8/0x51c [ 6.004125][ C0] driver_probe_device+0xfc/0x170 [ 6.008989][ C0] device_driver_attach+0xc8/0xd0 [ 6.013853][ C0] __driver_attach+0xe8/0x1b0 [ 6.018369][ C0] bus_for_each_dev+0x7c/0xdc [ 6.022886][ C0] driver_attach+0x2c/0x3c [ 6.027143][ C0] bus_add_driver+0xdc/0x240 [ 6.031573][ C0] driver_register+0x80/0x13c [ 6.036090][ C0] __platform_driver_register+0x50/0x5c [ 6.041476][ C0] dwc2_platform_driver_init+0x24/0x30 [ 6.046774][ C0] do_one_initcall+0x50/0x25c [ 6.051291][ C0] do_initcall_level+0xe4/0xfc [ 6.055894][ C0] do_initcalls+0x80/0xa4 [ 6.060064][ C0] kernel_init_freeable+0x198/0x240 [ 6.065102][ C0] kernel_init+0x1c/0x12c Signed-off-by: Shawn Shao <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
gregkh
pushed a commit
that referenced
this pull request
Nov 17, 2024
commit 373b933 upstream.

Uprobe needs to fetch args into a percpu buffer, and then copy them to the ring buffer to avoid a non-atomic context problem.

Sometimes user-space strings and arrays can be very large, but the percpu buffer is only one page in size. store_trace_args() does not check whether this data exceeds a single page, which causes an out-of-bounds memory access.

It can be reproduced with the following steps:

1. build the kernel with CONFIG_KASAN enabled
2. save the following program as test.c
```
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

// If the string length is larger than MAX_STRING_SIZE, fetch_store_strlen()
// returns 0, which makes __get_data_size() return a shorter size, and
// store_trace_args() will not trigger the out-of-bounds access.
// So keep the string length below 4096.
#define STRLEN 4093

void generate_string(char *str, int n)
{
    int i;
    for (i = 0; i < n; ++i)
    {
        char c = i % 26 + 'a';
        str[i] = c;
    }
    str[n-1] = '\0';
}

void print_string(char *str)
{
    printf("%s\n", str);
}

int main()
{
    char tmp[STRLEN];

    generate_string(tmp, STRLEN);
    print_string(tmp);

    return 0;
}
```
3. compile the program
`gcc -o test test.c`
4. get the offset of `print_string()`
```
objdump -t test | grep -w print_string
0000000000401199 g F .text 000000000000001b print_string
```
5. configure the uprobe with offset 0x1199
```
off=0x1199
cd /sys/kernel/debug/tracing/
echo "p /root/test:${off} arg1=+0(%di):ustring arg2=\$comm arg3=+0(%di):ustring" > uprobe_events
echo 1 > events/uprobes/enable
echo 1 > tracing_on
```
6. run `test`, and KASAN will report an error.

==================================================================
BUG: KASAN: use-after-free in strncpy_from_user+0x1d6/0x1f0
Write of size 8 at addr ffff88812311c004 by task test/499

CPU: 0 UID: 0 PID: 499 Comm: test Not tainted 6.12.0-rc3+ #18
Hardware name: Red Hat KVM, BIOS 1.16.0-4.al8 04/01/2014
Call Trace:
 <TASK>
 dump_stack_lvl+0x55/0x70
 print_address_description.constprop.0+0x27/0x310
 kasan_report+0x10f/0x120
 ? strncpy_from_user+0x1d6/0x1f0
 strncpy_from_user+0x1d6/0x1f0
 ? rmqueue.constprop.0+0x70d/0x2ad0
 process_fetch_insn+0xb26/0x1470
 ? __pfx_process_fetch_insn+0x10/0x10
 ? _raw_spin_lock+0x85/0xe0
 ? __pfx__raw_spin_lock+0x10/0x10
 ? __pte_offset_map+0x1f/0x2d0
 ? unwind_next_frame+0xc5f/0x1f80
 ? arch_stack_walk+0x68/0xf0
 ? is_bpf_text_address+0x23/0x30
 ? kernel_text_address.part.0+0xbb/0xd0
 ? __kernel_text_address+0x66/0xb0
 ? unwind_get_return_address+0x5e/0xa0
 ? __pfx_stack_trace_consume_entry+0x10/0x10
 ? arch_stack_walk+0xa2/0xf0
 ? _raw_spin_lock_irqsave+0x8b/0xf0
 ? __pfx__raw_spin_lock_irqsave+0x10/0x10
 ? depot_alloc_stack+0x4c/0x1f0
 ? _raw_spin_unlock_irqrestore+0xe/0x30
 ? stack_depot_save_flags+0x35d/0x4f0
 ? kasan_save_stack+0x34/0x50
 ? kasan_save_stack+0x24/0x50
 ? mutex_lock+0x91/0xe0
 ? __pfx_mutex_lock+0x10/0x10
 prepare_uprobe_buffer.part.0+0x2cd/0x500
 uprobe_dispatcher+0x2c3/0x6a0
 ? __pfx_uprobe_dispatcher+0x10/0x10
 ? __kasan_slab_alloc+0x4d/0x90
 handler_chain+0xdd/0x3e0
 handle_swbp+0x26e/0x3d0
 ? __pfx_handle_swbp+0x10/0x10
 ? uprobe_pre_sstep_notifier+0x151/0x1b0
 irqentry_exit_to_user_mode+0xe2/0x1b0
 asm_exc_int3+0x39/0x40
RIP: 0033:0x401199
Code: 01 c2 0f b6 45 fb 88 02 83 45 fc 01 8b 45 fc 3b 45 e4 7c b7 8b 45 e4 48 98 48 8d 50 ff 48 8b 45 e8 48 01 d0 ce
RSP: 002b:00007ffdf00576a8 EFLAGS: 00000206
RAX: 00007ffdf00576b0 RBX: 0000000000000000 RCX: 0000000000000ff2
RDX: 0000000000000ffc RSI: 0000000000000ffd RDI: 00007ffdf00576b0
RBP: 00007ffdf00586b0 R08: 00007feb2f9c0d20 R09: 00007feb2f9c0d20
R10: 0000000000000001 R11: 0000000000000202 R12: 0000000000401040
R13: 00007ffdf0058780 R14: 0000000000000000 R15: 0000000000000000
 </TASK>

This commit keeps the buffer's maxlen within a page size to avoid an out-of-bounds memory access in store_trace_args().

Link: https://lore.kernel.org/all/[email protected]/
Fixes: dcad1a2 ("tracing/uprobes: Fetch args before reserving a ring buffer")
Signed-off-by: Qiao Ma <[email protected]>
Signed-off-by: Masami Hiramatsu (Google) <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
Signed-off-by: Vamsi Krishna Brahmajosyula <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
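As a rough illustration of the bound described in the commit message above, the following user-space sketch clamps the size of the variable-length data so that it always fits in a page-sized scratch buffer. It is not the kernel patch: `SCRATCH_BUF_SIZE` and `clamp_dynamic_size()` are hypothetical names, and the 4096-byte page size is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: a 4096-byte scratch buffer standing in for the per-CPU page. */
#define SCRATCH_BUF_SIZE ((size_t)4096)

/*
 * Hypothetical helper, for illustration only.
 * fixed_size: bytes already reserved for the fixed part of the record.
 * requested:  bytes the variable-length data (strings, arrays) would need.
 * Returns the number of variable-length bytes that may be stored so that
 * fixed_size + result never exceeds SCRATCH_BUF_SIZE.
 */
size_t clamp_dynamic_size(size_t fixed_size, size_t requested)
{
    /* The fixed part itself must fit in the buffer. */
    assert(fixed_size <= SCRATCH_BUF_SIZE);

    size_t room = SCRATCH_BUF_SIZE - fixed_size;

    return requested > room ? room : requested;
}
```

In the reproducer above, two `ustring` arguments of 4093 bytes each ask for roughly 8 KiB, so a clamp of this kind would cut the request down to whatever room is left in the page.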
piso77
pushed a commit
to piso77/linux
that referenced
this pull request
Nov 29, 2024
In binder_add_freeze_work() we iterate over the proc->nodes with the proc->inner_lock held. However, this lock is temporarily dropped to acquire the node->lock first (lock nesting order). This can race with binder_deferred_release() which removes the nodes from the proc->nodes rbtree and adds them into binder_dead_nodes list. This leads to a broken iteration in binder_add_freeze_work() as rb_next() will use data from binder_dead_nodes, triggering an out-of-bounds access:

==================================================================
BUG: KASAN: global-out-of-bounds in rb_next+0xfc/0x124
Read of size 8 at addr ffffcb84285f7170 by task freeze/660

CPU: 8 UID: 0 PID: 660 Comm: freeze Not tainted 6.11.0-07343-ga727812a8d45 #18
Hardware name: linux,dummy-virt (DT)
Call trace:
 rb_next+0xfc/0x124
 binder_add_freeze_work+0x344/0x534
 binder_ioctl+0x1e70/0x25ac
 __arm64_sys_ioctl+0x124/0x190

The buggy address belongs to the variable:
 binder_dead_nodes+0x10/0x40
[...]
==================================================================

This is possible because proc->nodes (rbtree) and binder_dead_nodes (list) share entries in binder_node through a union:

    struct binder_node {
    [...]
            union {
                    struct rb_node rb_node;
                    struct hlist_node dead_node;
            };

Fix the race by checking that the proc is still alive. If not, simply break out of the iteration.

Fixes: d579b04 ("binder: frozen notification")
Cc: [email protected]
Reviewed-by: Alice Ryhl <[email protected]>
Acked-by: Todd Kjos <[email protected]>
Signed-off-by: Carlos Llamas <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
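The aliasing described in the commit message above can be shown with a small user-space sketch. The struct layouts below are simplified stand-ins (assumptions, not the kernel's definitions of `rb_node` and `hlist_node`), and every `*_sketch` name is hypothetical. The sketch only demonstrates that once the list member of the union has been written, code that still reads the tree member sees list pointers.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for a red-black tree node (layout is an assumption). */
struct rb_node_sketch {
    struct rb_node_sketch *parent;
    struct rb_node_sketch *right;
    struct rb_node_sketch *left;
};

/* Simplified stand-in for a hash-list node (layout is an assumption). */
struct hlist_node_sketch {
    struct hlist_node_sketch *next;
    struct hlist_node_sketch **pprev;
};

/* A node that lives either in a tree or on a "dead" list, never both. */
struct node_sketch {
    int id;
    union {
        struct rb_node_sketch rb_node;
        struct hlist_node_sketch dead_node;
    };
};

/*
 * Returns 1 if, after the node is linked onto the dead list, the tree
 * member's "right" pointer reads back as the address of the list head.
 */
int rb_right_aliases_list_head(void)
{
    struct hlist_node_sketch *dead_list_head = NULL;
    struct node_sketch node = { .id = 1 };

    /* Both union members start at the same offset inside the node. */
    assert(offsetof(struct node_sketch, rb_node) ==
           offsetof(struct node_sketch, dead_node));

    /* The node starts out as a leaf in the tree. */
    node.rb_node.parent = NULL;
    node.rb_node.right = NULL;
    node.rb_node.left = NULL;

    /* Moving it to the dead list overwrites the same storage. */
    node.dead_node.next = NULL;
    node.dead_node.pprev = &dead_list_head;

    /* A tree walker that still follows "right" now lands on list data. */
    return (void *)node.rb_node.right == (void *)&dead_list_head;
}
```

This is why an iterator that dropped the lock has to re-check that the process is still alive before it follows tree pointers again.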
gregkh
pushed a commit
that referenced
this pull request
Dec 9, 2024
commit 011e69a upstream.

In binder_add_freeze_work() we iterate over the proc->nodes with the proc->inner_lock held. However, this lock is temporarily dropped to acquire the node->lock first (lock nesting order). This can race with binder_deferred_release() which removes the nodes from the proc->nodes rbtree and adds them into binder_dead_nodes list. This leads to a broken iteration in binder_add_freeze_work() as rb_next() will use data from binder_dead_nodes, triggering an out-of-bounds access:

==================================================================
BUG: KASAN: global-out-of-bounds in rb_next+0xfc/0x124
Read of size 8 at addr ffffcb84285f7170 by task freeze/660

CPU: 8 UID: 0 PID: 660 Comm: freeze Not tainted 6.11.0-07343-ga727812a8d45 #18
Hardware name: linux,dummy-virt (DT)
Call trace:
 rb_next+0xfc/0x124
 binder_add_freeze_work+0x344/0x534
 binder_ioctl+0x1e70/0x25ac
 __arm64_sys_ioctl+0x124/0x190

The buggy address belongs to the variable:
 binder_dead_nodes+0x10/0x40
[...]
==================================================================

This is possible because proc->nodes (rbtree) and binder_dead_nodes (list) share entries in binder_node through a union:

    struct binder_node {
    [...]
            union {
                    struct rb_node rb_node;
                    struct hlist_node dead_node;
            };

Fix the race by checking that the proc is still alive. If not, simply break out of the iteration.

Fixes: d579b04 ("binder: frozen notification")
Cc: [email protected]
Reviewed-by: Alice Ryhl <[email protected]>
Acked-by: Todd Kjos <[email protected]>
Signed-off-by: Carlos Llamas <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
heynemax
pushed a commit
to amazonlinux/linux
that referenced
this pull request
Jan 21, 2025
commit 373b933 upstream.

Uprobe needs to fetch args into a percpu buffer, and then copy them to the ring buffer to avoid a non-atomic context problem.

Sometimes user-space strings and arrays can be very large, but the percpu buffer is only one page in size. store_trace_args() does not check whether this data exceeds a single page, which causes an out-of-bounds memory access.

It can be reproduced with the following steps:

1. build the kernel with CONFIG_KASAN enabled
2. save the following program as test.c
```
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

// If the string length is larger than MAX_STRING_SIZE, fetch_store_strlen()
// returns 0, which makes __get_data_size() return a shorter size, and
// store_trace_args() will not trigger the out-of-bounds access.
// So keep the string length below 4096.
#define STRLEN 4093

void generate_string(char *str, int n)
{
    int i;
    for (i = 0; i < n; ++i)
    {
        char c = i % 26 + 'a';
        str[i] = c;
    }
    str[n-1] = '\0';
}

void print_string(char *str)
{
    printf("%s\n", str);
}

int main()
{
    char tmp[STRLEN];

    generate_string(tmp, STRLEN);
    print_string(tmp);

    return 0;
}
```
3. compile the program
`gcc -o test test.c`
4. get the offset of `print_string()`
```
objdump -t test | grep -w print_string
0000000000401199 g F .text 000000000000001b print_string
```
5. configure the uprobe with offset 0x1199
```
off=0x1199
cd /sys/kernel/debug/tracing/
echo "p /root/test:${off} arg1=+0(%di):ustring arg2=\$comm arg3=+0(%di):ustring" > uprobe_events
echo 1 > events/uprobes/enable
echo 1 > tracing_on
```
6. run `test`, and KASAN will report an error.

==================================================================
BUG: KASAN: use-after-free in strncpy_from_user+0x1d6/0x1f0
Write of size 8 at addr ffff88812311c004 by task test/499

CPU: 0 UID: 0 PID: 499 Comm: test Not tainted 6.12.0-rc3+ #18
Hardware name: Red Hat KVM, BIOS 1.16.0-4.al8 04/01/2014
Call Trace:
 <TASK>
 dump_stack_lvl+0x55/0x70
 print_address_description.constprop.0+0x27/0x310
 kasan_report+0x10f/0x120
 ? strncpy_from_user+0x1d6/0x1f0
 strncpy_from_user+0x1d6/0x1f0
 ? rmqueue.constprop.0+0x70d/0x2ad0
 process_fetch_insn+0xb26/0x1470
 ? __pfx_process_fetch_insn+0x10/0x10
 ? _raw_spin_lock+0x85/0xe0
 ? __pfx__raw_spin_lock+0x10/0x10
 ? __pte_offset_map+0x1f/0x2d0
 ? unwind_next_frame+0xc5f/0x1f80
 ? arch_stack_walk+0x68/0xf0
 ? is_bpf_text_address+0x23/0x30
 ? kernel_text_address.part.0+0xbb/0xd0
 ? __kernel_text_address+0x66/0xb0
 ? unwind_get_return_address+0x5e/0xa0
 ? __pfx_stack_trace_consume_entry+0x10/0x10
 ? arch_stack_walk+0xa2/0xf0
 ? _raw_spin_lock_irqsave+0x8b/0xf0
 ? __pfx__raw_spin_lock_irqsave+0x10/0x10
 ? depot_alloc_stack+0x4c/0x1f0
 ? _raw_spin_unlock_irqrestore+0xe/0x30
 ? stack_depot_save_flags+0x35d/0x4f0
 ? kasan_save_stack+0x34/0x50
 ? kasan_save_stack+0x24/0x50
 ? mutex_lock+0x91/0xe0
 ? __pfx_mutex_lock+0x10/0x10
 prepare_uprobe_buffer.part.0+0x2cd/0x500
 uprobe_dispatcher+0x2c3/0x6a0
 ? __pfx_uprobe_dispatcher+0x10/0x10
 ? __kasan_slab_alloc+0x4d/0x90
 handler_chain+0xdd/0x3e0
 handle_swbp+0x26e/0x3d0
 ? __pfx_handle_swbp+0x10/0x10
 ? uprobe_pre_sstep_notifier+0x151/0x1b0
 irqentry_exit_to_user_mode+0xe2/0x1b0
 asm_exc_int3+0x39/0x40
RIP: 0033:0x401199
Code: 01 c2 0f b6 45 fb 88 02 83 45 fc 01 8b 45 fc 3b 45 e4 7c b7 8b 45 e4 48 98 48 8d 50 ff 48 8b 45 e8 48 01 d0 ce
RSP: 002b:00007ffdf00576a8 EFLAGS: 00000206
RAX: 00007ffdf00576b0 RBX: 0000000000000000 RCX: 0000000000000ff2
RDX: 0000000000000ffc RSI: 0000000000000ffd RDI: 00007ffdf00576b0
RBP: 00007ffdf00586b0 R08: 00007feb2f9c0d20 R09: 00007feb2f9c0d20
R10: 0000000000000001 R11: 0000000000000202 R12: 0000000000401040
R13: 00007ffdf0058780 R14: 0000000000000000 R15: 0000000000000000
 </TASK>

This commit keeps the buffer's maxlen within a page size to avoid an out-of-bounds memory access in store_trace_args().

Link: https://lore.kernel.org/all/[email protected]/
Fixes: dcad1a2 ("tracing/uprobes: Fetch args before reserving a ring buffer")
Signed-off-by: Qiao Ma <[email protected]>
Signed-off-by: Masami Hiramatsu (Google) <[email protected]>
[mheyne: Adjusted context to account for missing commit 3eaea21 ("uprobes: encapsulate preparation of uprobe args buffer")]
Signed-off-by: Maximilian Heyne <[email protected]>
Bumps setuptools from 68.0.0 to 70.0.0.
Changelog
Sourced from setuptools's changelog.
... (truncated)
Commits
- 5cbf12a Workaround for release error in v70
- 9c1bcc3 Bump version: 69.5.1 → 70.0.0
- 4dc0c31 Remove deprecated setuptools.dep_util (#4360)
- 6c1ef57 Remove xfail now that test passes. Ref #4371.
- d14fa01 Add all site-packages dirs when creating simulated environment for test_edita...
- 6b7f7a1 Prevent bin folders to be taken as extern packages when vendoring (#4370)
- 69141f6 Add doctest for vendorised bin folder
- 2a53cc1 Prevent 'bin' folders to be taken as extern packages
- 7208628 Replace call to deprecated validate_pyproject command (#4363)
- 96d681a Remove call to deprecated validate_pyproject command

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.

Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:
- @dependabot rebase will rebase this PR
- @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
- @dependabot merge will merge this PR after your CI passes on it
- @dependabot squash and merge will squash and merge this PR after your CI passes on it
- @dependabot cancel merge will cancel a previously requested merge and block automerging
- @dependabot reopen will reopen this PR if it is closed
- @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
- @dependabot show <dependency name> ignore conditions will show all of the ignore conditions of the specified dependency
- @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
- @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
- @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

You can disable automated security fix PRs for this repo from the Security Alerts page.