Changelog for kernel 6.1.54

af_unix: Fix data race around sk->sk_err. [+ + +]

Author: Kuniyuki Iwashima <kuniyu@amazon.com>
Date:   Fri Sep 1 17:27:08 2023 -0700

    af_unix: Fix data race around sk->sk_err.
    
    [ Upstream commit b192812905e4b134f7b7994b079eb647e9d2d37e ]
    
    As with sk->sk_shutdown shown in the previous patch, sk->sk_err can be
    read locklessly by unix_dgram_sendmsg().
    
    Let's use READ_ONCE() for sk_err as well.
    
    Note that the writer side is marked by commit cc04410af7de ("af_unix:
    annotate lockless accesses to sk->sk_err").
    
    Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
    Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
    Reviewed-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
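
The lockless-read annotation used by this commit and the other af_unix fixes in this release can be sketched in userspace as follows. This is a simplified illustration, not the kernel's implementation: the two macros are reduced stand-ins for the kernel's READ_ONCE()/WRITE_ONCE() (they rely on the GCC/Clang `__typeof__` extension), and `struct demo_sock`, `demo_set_err()` and `demo_check_err()` are hypothetical names that do not exist in the kernel.

```c
/*
 * Simplified userspace stand-ins for READ_ONCE()/WRITE_ONCE().
 * The real kernel macros are more elaborate; these only force a
 * single volatile access so the compiler cannot tear, fuse, or
 * repeat the load or store.
 */
#define READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

/* Hypothetical miniature of a socket: only the error field. */
struct demo_sock {
    int sk_err;
};

/* Writer side: in the kernel this would run with a lock held. */
static void demo_set_err(struct demo_sock *sk, int err)
{
    WRITE_ONCE(sk->sk_err, err);
}

/* Lockless reader: load the field once and act on that snapshot. */
static int demo_check_err(struct demo_sock *sk)
{
    int err = READ_ONCE(sk->sk_err);

    return err ? -err : 0;
}
```

The point of the pairing is that the reader works on one consistent snapshot of the field even though it does not take the lock that protects the writer.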

af_unix: Fix data-race around unix_tot_inflight. [+ + +]

Author: Kuniyuki Iwashima <kuniyu@amazon.com>
Date:   Fri Sep 1 17:27:06 2023 -0700

    af_unix: Fix data-race around unix_tot_inflight.
    
    [ Upstream commit ade32bd8a738d7497ffe9743c46728db26740f78 ]
    
    unix_tot_inflight is changed under spin_lock(unix_gc_lock), but
    unix_release_sock() reads it locklessly.
    
    Let's use READ_ONCE() for unix_tot_inflight.
    
    Note that the writer side was marked by commit 9d6d7f1cb67c ("af_unix:
    annote lockless accesses to unix_tot_inflight & gc_in_progress")
    
    BUG: KCSAN: data-race in unix_inflight / unix_release_sock
    
    write (marked) to 0xffffffff871852b8 of 4 bytes by task 123 on cpu 1:
     unix_inflight+0x130/0x180 net/unix/scm.c:64
     unix_attach_fds+0x137/0x1b0 net/unix/scm.c:123
     unix_scm_to_skb net/unix/af_unix.c:1832 [inline]
     unix_dgram_sendmsg+0x46a/0x14f0 net/unix/af_unix.c:1955
     sock_sendmsg_nosec net/socket.c:724 [inline]
     sock_sendmsg+0x148/0x160 net/socket.c:747
     ____sys_sendmsg+0x4e4/0x610 net/socket.c:2493
     ___sys_sendmsg+0xc6/0x140 net/socket.c:2547
     __sys_sendmsg+0x94/0x140 net/socket.c:2576
     __do_sys_sendmsg net/socket.c:2585 [inline]
     __se_sys_sendmsg net/socket.c:2583 [inline]
     __x64_sys_sendmsg+0x45/0x50 net/socket.c:2583
     do_syscall_x64 arch/x86/entry/common.c:50 [inline]
     do_syscall_64+0x3b/0x90 arch/x86/entry/common.c:80
     entry_SYSCALL_64_after_hwframe+0x72/0xdc
    
    read to 0xffffffff871852b8 of 4 bytes by task 4891 on cpu 0:
     unix_release_sock+0x608/0x910 net/unix/af_unix.c:671
     unix_release+0x59/0x80 net/unix/af_unix.c:1058
     __sock_release+0x7d/0x170 net/socket.c:653
     sock_close+0x19/0x30 net/socket.c:1385
     __fput+0x179/0x5e0 fs/file_table.c:321
     ____fput+0x15/0x20 fs/file_table.c:349
     task_work_run+0x116/0x1a0 kernel/task_work.c:179
     resume_user_mode_work include/linux/resume_user_mode.h:49 [inline]
     exit_to_user_mode_loop kernel/entry/common.c:171 [inline]
     exit_to_user_mode_prepare+0x174/0x180 kernel/entry/common.c:204
     __syscall_exit_to_user_mode_work kernel/entry/common.c:286 [inline]
     syscall_exit_to_user_mode+0x1a/0x30 kernel/entry/common.c:297
     do_syscall_64+0x4b/0x90 arch/x86/entry/common.c:86
     entry_SYSCALL_64_after_hwframe+0x72/0xdc
    
    value changed: 0x00000000 -> 0x00000001
    
    Reported by Kernel Concurrency Sanitizer on:
    CPU: 0 PID: 4891 Comm: systemd-coredum Not tainted 6.4.0-rc5-01219-gfa0e21fa4443 #5
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
    
    Fixes: 9305cfa4443d ("[AF_UNIX]: Make unix_tot_inflight counter non-atomic")
    Reported-by: syzkaller <syzkaller@googlegroups.com>
    Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
    Reviewed-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

af_unix: Fix data-races around sk->sk_shutdown. [+ + +]

Author: Kuniyuki Iwashima <kuniyu@amazon.com>
Date:   Fri Sep 1 17:27:07 2023 -0700

    af_unix: Fix data-races around sk->sk_shutdown.
    
    [ Upstream commit afe8764f76346ba838d4f162883e23d2fcfaa90e ]
    
    sk->sk_shutdown is changed under unix_state_lock(sk), but
    unix_dgram_sendmsg() calls two functions to read sk_shutdown locklessly.
    
      sock_alloc_send_pskb
      `- sock_wait_for_wmem
    
    Let's use READ_ONCE() there.
    
    Note that the writer side was marked by commit e1d09c2c2f57 ("af_unix:
    Fix data races around sk->sk_shutdown.").
    
    BUG: KCSAN: data-race in sock_alloc_send_pskb / unix_release_sock
    
    write (marked) to 0xffff8880069af12c of 1 bytes by task 1 on cpu 1:
     unix_release_sock+0x75c/0x910 net/unix/af_unix.c:631
     unix_release+0x59/0x80 net/unix/af_unix.c:1053
     __sock_release+0x7d/0x170 net/socket.c:654
     sock_close+0x19/0x30 net/socket.c:1386
     __fput+0x2a3/0x680 fs/file_table.c:384
     ____fput+0x15/0x20 fs/file_table.c:412
     task_work_run+0x116/0x1a0 kernel/task_work.c:179
     resume_user_mode_work include/linux/resume_user_mode.h:49 [inline]
     exit_to_user_mode_loop kernel/entry/common.c:171 [inline]
     exit_to_user_mode_prepare+0x174/0x180 kernel/entry/common.c:204
     __syscall_exit_to_user_mode_work kernel/entry/common.c:286 [inline]
     syscall_exit_to_user_mode+0x1a/0x30 kernel/entry/common.c:297
     do_syscall_64+0x4b/0x90 arch/x86/entry/common.c:86
     entry_SYSCALL_64_after_hwframe+0x6e/0xd8
    
    read to 0xffff8880069af12c of 1 bytes by task 28650 on cpu 0:
     sock_alloc_send_pskb+0xd2/0x620 net/core/sock.c:2767
     unix_dgram_sendmsg+0x2f8/0x14f0 net/unix/af_unix.c:1944
     unix_seqpacket_sendmsg net/unix/af_unix.c:2308 [inline]
     unix_seqpacket_sendmsg+0xba/0x130 net/unix/af_unix.c:2292
     sock_sendmsg_nosec net/socket.c:725 [inline]
     sock_sendmsg+0x148/0x160 net/socket.c:748
     ____sys_sendmsg+0x4e4/0x610 net/socket.c:2494
     ___sys_sendmsg+0xc6/0x140 net/socket.c:2548
     __sys_sendmsg+0x94/0x140 net/socket.c:2577
     __do_sys_sendmsg net/socket.c:2586 [inline]
     __se_sys_sendmsg net/socket.c:2584 [inline]
     __x64_sys_sendmsg+0x45/0x50 net/socket.c:2584
     do_syscall_x64 arch/x86/entry/common.c:50 [inline]
     do_syscall_64+0x3b/0x90 arch/x86/entry/common.c:80
     entry_SYSCALL_64_after_hwframe+0x6e/0xd8
    
    value changed: 0x00 -> 0x03
    
    Reported by Kernel Concurrency Sanitizer on:
    CPU: 0 PID: 28650 Comm: systemd-coredum Not tainted 6.4.0-11989-g6843306689af #6
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
    
    Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
    Reported-by: syzkaller <syzkaller@googlegroups.com>
    Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
    Reviewed-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

af_unix: Fix data-races around user->unix_inflight. [+ + +]

Author: Kuniyuki Iwashima <kuniyu@amazon.com>
Date:   Fri Sep 1 17:27:05 2023 -0700

    af_unix: Fix data-races around user->unix_inflight.
    
    [ Upstream commit 0bc36c0650b21df36fbec8136add83936eaf0607 ]
    
    user->unix_inflight is changed under spin_lock(unix_gc_lock),
    but too_many_unix_fds() reads it locklessly.
    
    Let's annotate the write/read accesses to user->unix_inflight.
    
    BUG: KCSAN: data-race in unix_attach_fds / unix_inflight
    
    write to 0xffffffff8546f2d0 of 8 bytes by task 44798 on cpu 1:
     unix_inflight+0x157/0x180 net/unix/scm.c:66
     unix_attach_fds+0x147/0x1e0 net/unix/scm.c:123
     unix_scm_to_skb net/unix/af_unix.c:1827 [inline]
     unix_dgram_sendmsg+0x46a/0x14f0 net/unix/af_unix.c:1950
     unix_seqpacket_sendmsg net/unix/af_unix.c:2308 [inline]
     unix_seqpacket_sendmsg+0xba/0x130 net/unix/af_unix.c:2292
     sock_sendmsg_nosec net/socket.c:725 [inline]
     sock_sendmsg+0x148/0x160 net/socket.c:748
     ____sys_sendmsg+0x4e4/0x610 net/socket.c:2494
     ___sys_sendmsg+0xc6/0x140 net/socket.c:2548
     __sys_sendmsg+0x94/0x140 net/socket.c:2577
     __do_sys_sendmsg net/socket.c:2586 [inline]
     __se_sys_sendmsg net/socket.c:2584 [inline]
     __x64_sys_sendmsg+0x45/0x50 net/socket.c:2584
     do_syscall_x64 arch/x86/entry/common.c:50 [inline]
     do_syscall_64+0x3b/0x90 arch/x86/entry/common.c:80
     entry_SYSCALL_64_after_hwframe+0x6e/0xd8
    
    read to 0xffffffff8546f2d0 of 8 bytes by task 44814 on cpu 0:
     too_many_unix_fds net/unix/scm.c:101 [inline]
     unix_attach_fds+0x54/0x1e0 net/unix/scm.c:110
     unix_scm_to_skb net/unix/af_unix.c:1827 [inline]
     unix_dgram_sendmsg+0x46a/0x14f0 net/unix/af_unix.c:1950
     unix_seqpacket_sendmsg net/unix/af_unix.c:2308 [inline]
     unix_seqpacket_sendmsg+0xba/0x130 net/unix/af_unix.c:2292
     sock_sendmsg_nosec net/socket.c:725 [inline]
     sock_sendmsg+0x148/0x160 net/socket.c:748
     ____sys_sendmsg+0x4e4/0x610 net/socket.c:2494
     ___sys_sendmsg+0xc6/0x140 net/socket.c:2548
     __sys_sendmsg+0x94/0x140 net/socket.c:2577
     __do_sys_sendmsg net/socket.c:2586 [inline]
     __se_sys_sendmsg net/socket.c:2584 [inline]
     __x64_sys_sendmsg+0x45/0x50 net/socket.c:2584
     do_syscall_x64 arch/x86/entry/common.c:50 [inline]
     do_syscall_64+0x3b/0x90 arch/x86/entry/common.c:80
     entry_SYSCALL_64_after_hwframe+0x6e/0xd8
    
    value changed: 0x000000000000000c -> 0x000000000000000d
    
    Reported by Kernel Concurrency Sanitizer on:
    CPU: 0 PID: 44814 Comm: systemd-coredum Not tainted 6.4.0-11989-g6843306689af #6
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
    
    Fixes: 712f4aad406b ("unix: properly account for FDs passed over unix sockets")
    Reported-by: syzkaller <syzkaller@googlegroups.com>
    Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
    Acked-by: Willy Tarreau <w@1wt.eu>
    Reviewed-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

ARC: atomics: Add compiler barrier to atomic operations... [+ + +]

Author: Pavel Kozlov <pavel.kozlov@synopsys.com>
Date:   Tue Aug 15 19:11:36 2023 +0400

    ARC: atomics: Add compiler barrier to atomic operations...
    
    commit 42f51fb24fd39cc547c086ab3d8a314cc603a91c upstream.
    
    ... to avoid unwanted gcc optimizations
    
    SMP kernels fail to boot with commit 596ff4a09b89
    ("cpumask: re-introduce constant-sized cpumask optimizations").
    
    |
    | percpu: BUG: failure at mm/percpu.c:2981/pcpu_build_alloc_info()!
    |
    
    The write operation performed by the SCOND instruction in the atomic
    inline asm code is not properly passed to the compiler. The compiler
    cannot correctly optimize a nested loop that runs through the cpumask
    in the pcpu_build_alloc_info() function.
    
    Fix this by adding a compiler barrier (memory clobber in inline asm).
    
    Apparently atomic ops used to have memory clobber implicitly via
    surrounding smp_mb(). However commit b64be6836993c431e
    ("ARC: atomics: implement relaxed variants") removed the smp_mb() for
    the relaxed variants, but failed to add the explicit compiler barrier.
    
    Link: https://github.com/foss-for-synopsys-dwc-arc-processors/linux/issues/135
    Cc: <stable@vger.kernel.org> # v6.3+
    Fixes: b64be6836993c43 ("ARC: atomics: implement relaxed variants")
    Signed-off-by: Pavel Kozlov <pavel.kozlov@synopsys.com>
    Signed-off-by: Vineet Gupta <vgupta@kernel.org>
    [vgupta: tweaked the changelog and added Fixes tag]
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
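
The effect of a "memory" clobber described in the commit above can be sketched portably with GCC/Clang inline asm. This is not the ARC code: the plain C store in `demo_clear_bit()` merely stands in for a write performed by an instruction inside the asm (SCOND in the commit), and both function names are hypothetical.

```c
/*
 * Clear bit 'nr' in *mask. The plain store stands in for a write done
 * by an instruction inside the asm statement.
 */
static inline void demo_clear_bit(unsigned int nr, unsigned long *mask)
{
    *mask &= ~(1UL << nr);
    /*
     * "memory" clobber: the compiler must assume memory changed here,
     * so it may not keep *mask cached in a register across this point.
     */
    __asm__ __volatile__("" : : "r"(mask) : "memory");
}

/* Walk a mask, clearing each set bit as it is visited. */
static int demo_drain_mask(unsigned long *mask)
{
    int visited = 0;
    unsigned int bit;

    for (bit = 0; bit < 8 * sizeof(*mask); bit++) {
        if (*mask & (1UL << bit)) {
            demo_clear_bit(bit, mask);
            visited++;
        }
    }
    return visited;
}
```

Without the clobber, a compiler that cannot see the write inside the asm would be free to reuse a stale copy of the mask in the loop, which is the class of miscompilation the commit describes.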

arm64: dts: renesas: rzg2l: Fix txdv-skew-psec typos [+ + +]

Author: Chris Paterson <chris.paterson2@renesas.com>
Date:   Fri Jun 9 23:11:36 2023 +0100

    arm64: dts: renesas: rzg2l: Fix txdv-skew-psec typos
    
    commit db67345716a52abb750ec8f76d6a5675218715f9 upstream.
    
    It looks like txdv-skew-psec is a typo from a copy+paste. txdv-skew-psec
    is not present in the PHY bindings nor is it in the driver.
    
    Correct to txen-skew-psec which is clearly what it was meant to be.
    
    Given that the default for txen-skew-psec is 0, and the device tree is
    only trying to set it to 0 anyway, there should not be any functional
    change from this fix.
    
    Fixes: 361b0dcbd7f9 ("arm64: dts: renesas: rzg2l-smarc-som: Enable Ethernet")
    Fixes: 6494e4f90503 ("arm64: dts: renesas: rzg2ul-smarc-som: Enable Ethernet on SMARC platform")
    Fixes: ce0c63b6a5ef ("arm64: dts: renesas: Add initial device tree for RZ/G2LC SMARC EVK")
    Cc: stable@vger.kernel.org # 6.1.y
    Reported-by: Tomohiro Komagata <tomohiro.komagata.aj@renesas.com>
    Signed-off-by: Chris Paterson <chris.paterson2@renesas.com>
    Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
    Link: https://lore.kernel.org/r/20230609221136.7431-1-chris.paterson2@renesas.com
    Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ASoC: tegra: Fix SFC conversion for few rates [+ + +]

Author: Sheetal <sheetal@nvidia.com>
Date:   Thu Jun 22 17:04:09 2023 +0530

    ASoC: tegra: Fix SFC conversion for few rates
    
    commit d900d9a435ca95a386f49424f3689cd17ec201da upstream.
    
    Sample rate conversions for rates greater than 48kHz are found to be
    failing. It means x->y conversions fail when either x or y is greater
    than 48kHz.
    
    This happens because tegra210_sfc_rate_to_idx() returns an incorrect
    index for rates greater than 48kHz. This actually depends on the
    tegra210_sfc_rates[] array and it is not in sync with the frequency
    values of the SFC TX/RX register. To be precise, the 64kHz entry is
    missing from the array defined in the driver. As a result, a wrong index
    is returned, which leads to incorrect programming of coefficients.
    
    To fix this, align the tegra210_sfc_rates[] array with SFC register
    specification and thus add 64kHz entry to it. Also, the coefficient
    table is updated to reflect that none of the conversions are supported
    for 64kHz.
    
    Fixes: b2f74ec53a6c ("ASoC: tegra: Add Tegra210 based SFC driver")
    Cc: stable@vger.kernel.org
    Signed-off-by: Sheetal <sheetal@nvidia.com>
    Reviewed-by: Mohan Kumar D <mkumard@nvidia.com>
    Reviewed-by: Sameer Pujar <spujar@nvidia.com>
    Link: https://lore.kernel.org/r/Message-Id: <1687433656-7892-2-git-send-email-spujar@nvidia.com>
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
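
Why a single missing table entry shifts every later index can be seen in a small sketch. The table contents below are an assumption made for illustration and are not copied from the driver; `demo_sfc_rates[]` and `demo_rate_to_idx()` are hypothetical names, and -1 is used here in place of whatever error code the driver returns.

```c
#include <stddef.h>

/*
 * Illustrative rate table, assumed to be ordered like the register
 * encoding. The 64000 entry is the kind of entry the commit adds;
 * without it, every rate above 48000 would map to an index one lower
 * than the hardware expects.
 */
static const int demo_sfc_rates[] = {
    8000, 11025, 16000, 22050, 24000, 32000, 44100, 48000,
    64000, 88200, 96000, 176400, 192000,
};

/* Return the table index for 'rate', or -1 if the rate is not listed. */
static int demo_rate_to_idx(int rate)
{
    size_t i;

    for (i = 0; i < sizeof(demo_sfc_rates) / sizeof(demo_sfc_rates[0]); i++) {
        if (demo_sfc_rates[i] == rate)
            return (int)i;
    }
    return -1;
}
```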

ata: ahci: Add Elkhart Lake AHCI controller [+ + +]

Author: Werner Fischer <devlists@wefi.net>
Date:   Tue Aug 29 13:33:58 2023 +0200

    ata: ahci: Add Elkhart Lake AHCI controller
    
    commit 2a2df98ec592667927b5c1351afa6493ea125c9f upstream.
    
    Elkhart Lake is the successor of Apollo Lake and Gemini Lake. These
    CPUs and their PCHs are used in mobile and embedded environments.
    
    With this patch I suggest that Elkhart Lake SATA controllers [1] should
    use the default LPM policy for mobile chipsets.
    The disadvantage of missing hot-plug support with this setting should
    not be an issue, as those CPUs are used in embedded environments and
    not in servers with hot-plug backplanes.
    
    We discovered that the Elkhart Lake SATA controllers have been missing
    in ahci.c after a customer reported the throttling of his SATA SSD
    after a short period of higher I/O. We determined the high temperature
    of the SSD controller in idle mode as the root cause for that.
    
    Depending on the used SSD, we have seen up to 1.8 Watt lower system
    idle power usage and up to 30°C lower SSD controller temperatures in
    our tests, when we set med_power_with_dipm manually. I have provided a
    table showing seven different SATA SSDs from ATP, Intel/Solidigm and
    Samsung [2].
    
    Intel lists a total of 3 SATA controller IDs (4B60, 4B62, 4B63) in [1]
    for those mobile PCHs.
    This commit just adds 0x4b63 as I do not have test systems with 0x4b60
    and 0x4b62 SATA controllers.
    I have tested this patch with a system which uses 0x4b63 as SATA
    controller.
    
    [1] https://sata-io.org/product/8803
    [2] https://www.thomas-krenn.com/en/wiki/SATA_Link_Power_Management#Example_LES_v4
    
    Signed-off-by: Werner Fischer <devlists@wefi.net>
    Cc: stable@vger.kernel.org
    Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ata: pata_falcon: fix IO base selection for Q40 [+ + +]

Author: Michael Schmitz <schmitzmic@gmail.com>
Date:   Sun Aug 27 16:13:47 2023 +1200

    ata: pata_falcon: fix IO base selection for Q40
    
    commit 8a1f00b753ecfdb117dc1a07e68c46d80e7923ea upstream.
    
    With commit 44b1fbc0f5f3 ("m68k/q40: Replace q40ide driver
    with pata_falcon and falconide"), the Q40 IDE driver was
    replaced by pata_falcon.c.
    
    Both IO and memory resources were defined for the Q40 IDE
    platform device, but definition of the IDE register addresses
    was modeled after the Falcon case, both in use of the memory
    resources and in including register shift and byte vs. word
    offset in the address.
    
    This was correct for the Falcon case, which does not apply
    any address translation to the register addresses. In the
    Q40 case, all of device base address, byte access offset
    and register shift is included in the platform specific
    ISA access translation (in asm/mm_io.h).
    
    As a consequence, such address translation gets applied
    twice, and register addresses are mangled.
    
    Use the device base address from the platform IO resource
    for Q40 (the IO address translation will then add the correct
    ISA window base address and byte access offset), with register
    shift 1. Use MMIO base address and register shift 2 as before
    for Falcon.
    
    Encode PIO_OFFSET into IO port addresses for all registers
    for Q40 except the data transfer register. Encode the MMIO
    offset there (pata_falcon_data_xfer() directly uses raw IO
    with no address translation).
    
    Reported-by: William R Sowerbutts <will@sowerbutts.com>
    Closes: https://lore.kernel.org/r/CAMuHMdUU62jjunJh9cqSqHT87B0H0A4udOOPs=WN7WZKpcagVA@mail.gmail.com
    Link: https://lore.kernel.org/r/CAMuHMdUU62jjunJh9cqSqHT87B0H0A4udOOPs=WN7WZKpcagVA@mail.gmail.com
    Fixes: 44b1fbc0f5f3 ("m68k/q40: Replace q40ide driver with pata_falcon and falconide")
    Cc: stable@vger.kernel.org
    Cc: Finn Thain <fthain@linux-m68k.org>
    Cc: Geert Uytterhoeven <geert@linux-m68k.org>
    Tested-by: William R Sowerbutts <will@sowerbutts.com>
    Signed-off-by: Michael Schmitz <schmitzmic@gmail.com>
    Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
    Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>
    Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ata: pata_ftide010: Add missing MODULE_DESCRIPTION [+ + +]

Author: Damien Le Moal <dlemoal@kernel.org>
Date:   Thu Aug 24 07:41:59 2023 +0900

    ata: pata_ftide010: Add missing MODULE_DESCRIPTION
    
    commit 7274eef5729037300f29d14edeb334a47a098f65 upstream.
    
    Add the missing MODULE_DESCRIPTION() to avoid warnings such as:
    
    WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/ata/pata_ftide010.o
    
    when compiling with W=1.
    
    Fixes: be4e456ed3a5 ("ata: Add driver for Faraday Technology FTIDE010")
    Cc: stable@vger.kernel.org
    Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
    Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ata: sata_gemini: Add missing MODULE_DESCRIPTION [+ + +]

Author: Damien Le Moal <dlemoal@kernel.org>
Date:   Thu Aug 24 07:43:18 2023 +0900

    ata: sata_gemini: Add missing MODULE_DESCRIPTION
    
    commit 8566572bf3b4d6e416a4bf2110dbb4817d11ba59 upstream.
    
    Add the missing MODULE_DESCRIPTION() to avoid warnings such as:
    
    WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/ata/sata_gemini.o
    
    when compiling with W=1.
    
    Fixes: be4e456ed3a5 ("ata: Add driver for Faraday Technology FTIDE010")
    Cc: stable@vger.kernel.org
    Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
    Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

backlight: gpio_backlight: Drop output GPIO direction check for initial power state [+ + +]

Author: Ying Liu <victor.liu@nxp.com>
Date:   Fri Jul 21 09:29:03 2023 +0000

    backlight: gpio_backlight: Drop output GPIO direction check for initial power state
    
    [ Upstream commit fe1328b5b2a087221e31da77e617f4c2b70f3b7f ]
    
    So, let's drop output GPIO direction check and only check GPIO value to set
    the initial power state.
    
    Fixes: 706dc68102bc ("backlight: gpio: Explicitly set the direction of the GPIO")
    Signed-off-by: Liu Ying <victor.liu@nxp.com>
    Reviewed-by: Andy Shevchenko <andy@kernel.org>
    Acked-by: Linus Walleij <linus.walleij@linaro.org>
    Acked-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
    Link: https://lore.kernel.org/r/20230721093342.1532531-1-victor.liu@nxp.com
    Signed-off-by: Lee Jones <lee@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

blk-throttle: consider 'carryover_ios/bytes' in throtl_trim_slice() [+ + +]

Author: Yu Kuai <yukuai3@huawei.com>
Date:   Wed Aug 16 09:27:08 2023 +0800

    blk-throttle: consider 'carryover_ios/bytes' in throtl_trim_slice()
    
    [ Upstream commit eead0056648cef49d7b15c07ae612fa217083165 ]
    
    Currently, 'carryover_ios/bytes' is not handled in throtl_trim_slice().
    As a consequence, 'carryover_ios/bytes' will be used to throttle bio
    multiple times, for example:
    
    1) set iops limit to 100, and slice start is 0, slice end is 100ms;
    2) current time is 0, and 10 ios are dispatched, those io won't be
       throttled and io_disp is 10;
    3) still at current time 0, update iops limit to 1000, carryover_ios is
       updated to (0 - 10) = -10;
    4) in this slice(0 - 100ms), io_allowed = 100 + (-10) = 90, which means
       only 90 ios can be dispatched without waiting;
    5) assume that io is throttled in slice(0 - 100ms), and
       throtl_trim_slice() updates the slice to (100ms - 200ms). In this case,
       'carryover_ios/bytes' is not cleared and still only 90 ios can be
       dispatched between 100ms - 200ms.
    
    Fix this problem by updating 'carryover_ios/bytes' in
    throtl_trim_slice().
    
    Fixes: a880ae93e5b5 ("blk-throttle: fix io hung due to configuration updates")
    Reported-by: zhuxiaohui <zhuxiaohui.400@bytedance.com>
    Link: https://lore.kernel.org/all/20230812072116.42321-1-zhuxiaohui.400@bytedance.com/
    Signed-off-by: Yu Kuai <yukuai3@huawei.com>
    Acked-by: Tejun Heo <tj@kernel.org>
    Link: https://lore.kernel.org/r/20230816012708.1193747-5-yukuai1@huaweicloud.com
    Signed-off-by: Jens Axboe <axboe@kernel.dk>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
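
The arithmetic in the numbered example above can be reproduced with a small model. This is only a sketch under stated assumptions: `struct demo_slice`, `demo_io_allowed()` and `demo_trim_slice()` are hypothetical, and resetting the carryover when the slice is trimmed is one plausible reading of "updating 'carryover_ios/bytes' in throtl_trim_slice()", not a copy of the actual change.

```c
/* Hypothetical model of one throttle slice. */
struct demo_slice {
    long iops_limit;     /* IOs allowed per second */
    long slice_ms;       /* slice length in milliseconds */
    long carryover_ios;  /* signed adjustment left by a limit update */
};

/* IOs that may be dispatched in this slice without waiting. */
static long demo_io_allowed(const struct demo_slice *s)
{
    return s->iops_limit * s->slice_ms / 1000 + s->carryover_ios;
}

/*
 * Move to the next slice. The carryover has already been charged to
 * the slice that just ended, so it is consumed here (assumption) and
 * must not reduce the budget of the following slice again.
 */
static void demo_trim_slice(struct demo_slice *s)
{
    s->carryover_ios = 0;
}
```

With a limit of 1000 IOPS, a 100ms slice and a carryover of -10, the first slice allows 90 IOs; after the trim the next slice allows the full 100 instead of 90 again.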

blk-throttle: use calculate_io/bytes_allowed() for throtl_trim_slice() [+ + +]

Author: Yu Kuai <yukuai3@huawei.com>
Date:   Wed Aug 16 09:27:07 2023 +0800

    blk-throttle: use calculate_io/bytes_allowed() for throtl_trim_slice()
    
    [ Upstream commit e8368b57c006dc0e02dcd8a9dc9f2060ff5476fe ]
    
    There are no functional changes, just make the code cleaner.
    
    Signed-off-by: Yu Kuai <yukuai3@huawei.com>
    Acked-by: Tejun Heo <tj@kernel.org>
    Link: https://lore.kernel.org/r/20230816012708.1193747-4-yukuai1@huaweicloud.com
    Signed-off-by: Jens Axboe <axboe@kernel.dk>
    Stable-dep-of: eead0056648c ("blk-throttle: consider 'carryover_ios/bytes' in throtl_trim_slice()")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

bpf, sockmap: Fix skb refcnt race after locking changes [+ + +]

Author: John Fastabend <john.fastabend@gmail.com>
Date:   Fri Sep 1 13:21:37 2023 -0700

    bpf, sockmap: Fix skb refcnt race after locking changes
    
    [ Upstream commit a454d84ee20baf7bd7be90721b9821f73c7d23d9 ]
    
    There is a race where skb's from the sk_psock_backlog can be referenced
    after userspace side has already skb_consumed() the sk_buff and its refcnt
    dropped to zero causing use after free.
    
    The flow is the following:
    
      while ((skb = skb_peek(&psock->ingress_skb))
        sk_psock_handle_skb(psock, skb, ..., ingress)
        if (!ingress) ...
        sk_psock_skb_ingress
           sk_psock_skb_ingress_enqueue(skb)
              msg->skb = skb
              sk_psock_queue_msg(psock, msg)
        skb_dequeue(&psock->ingress_skb)
    
    The sk_psock_queue_msg() puts the msg on the ingress_msg queue. This is
    what the application reads when recvmsg() is called. An application can
    read this anytime after the msg is placed on the queue. The recvmsg hook
    will also read msg->skb and then after user space reads the msg will call
    consume_skb(skb) on it effectively free'ing it.
    
    But, the race is in above where backlog queue still has a reference to
    the skb and calls skb_dequeue(). If the skb_dequeue happens after the
    user reads and free's the skb we have a use after free.
    
    The !ingress case does not suffer from this problem because it uses
    sendmsg_*(sk, msg) which does not pass the sk_buff further down the
    stack.
    
    The following splat was observed with 'test_progs -t sockmap_listen':
    
      [ 1022.710250][ T2556] general protection fault, ...
      [...]
      [ 1022.712830][ T2556] Workqueue: events sk_psock_backlog
      [ 1022.713262][ T2556] RIP: 0010:skb_dequeue+0x4c/0x80
      [ 1022.713653][ T2556] Code: ...
      [...]
      [ 1022.720699][ T2556] Call Trace:
      [ 1022.720984][ T2556]  <TASK>
      [ 1022.721254][ T2556]  ? die_addr+0x32/0x80
      [ 1022.721589][ T2556]  ? exc_general_protection+0x25a/0x4b0
      [ 1022.722026][ T2556]  ? asm_exc_general_protection+0x22/0x30
      [ 1022.722489][ T2556]  ? skb_dequeue+0x4c/0x80
      [ 1022.722854][ T2556]  sk_psock_backlog+0x27a/0x300
      [ 1022.723243][ T2556]  process_one_work+0x2a7/0x5b0
      [ 1022.723633][ T2556]  worker_thread+0x4f/0x3a0
      [ 1022.723998][ T2556]  ? __pfx_worker_thread+0x10/0x10
      [ 1022.724386][ T2556]  kthread+0xfd/0x130
      [ 1022.724709][ T2556]  ? __pfx_kthread+0x10/0x10
      [ 1022.725066][ T2556]  ret_from_fork+0x2d/0x50
      [ 1022.725409][ T2556]  ? __pfx_kthread+0x10/0x10
      [ 1022.725799][ T2556]  ret_from_fork_asm+0x1b/0x30
      [ 1022.726201][ T2556]  </TASK>
    
    To fix we add an skb_get() before passing the skb to be enqueued in the
    ingress queue. This bumps the skb->users refcnt so that consume_skb()
    and kfree_skb will not immediately free the sk_buff. With this we can
    be sure the skb is still around when we do the dequeue. Then we just
    need to decrement the refcnt or free the skb in the backlog case which
    we do by calling kfree_skb() on the ingress case as well as the sendmsg
    case.
    
    Before locking change from fixes tag we had the sock locked so we
    couldn't race with user and there was no issue here.
    
    Fixes: 799aa7f98d53e ("skmsg: Avoid lock_sock() in sk_psock_backlog()")
    Reported-by: Jiri Olsa  <jolsa@kernel.org>
    Signed-off-by: John Fastabend <john.fastabend@gmail.com>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Tested-by: Xu Kuohai <xukuohai@huawei.com>
    Tested-by: Jiri Olsa <jolsa@kernel.org>
    Link: https://lore.kernel.org/bpf/20230901202137.214666-1-john.fastabend@gmail.com
    Signed-off-by: Sasha Levin <sashal@kernel.org>
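
The extra reference described in the commit above can be modelled with a toy reference count. This is an illustration only: `struct demo_skb`, `demo_skb_get()` and `demo_skb_put()` are hypothetical, the counter is a plain int rather than the kernel's refcount type, and "freed" is just a flag instead of an actual free.

```c
/* Toy buffer with a reference count; 'freed' records the final put. */
struct demo_skb {
    int users;
    int freed;
};

/* Take an extra reference, as the backlog does before enqueueing. */
static void demo_skb_get(struct demo_skb *skb)
{
    skb->users++;
}

/* Drop a reference; the buffer is released only when the last one goes. */
static void demo_skb_put(struct demo_skb *skb)
{
    if (--skb->users == 0)
        skb->freed = 1;
}
```

In this model the backlog calls demo_skb_get() before handing the buffer to the reader, so the reader's demo_skb_put() leaves the buffer alive until the backlog has dequeued it and dropped its own reference.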

bpf: Assign bpf_tramp_run_ctx::saved_run_ctx before recursion check. [+ + +]

Author: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Date:   Wed Aug 30 10:04:05 2023 +0200

    bpf: Assign bpf_tramp_run_ctx::saved_run_ctx before recursion check.
    
    [ Upstream commit 6764e767f4af1e35f87f3497e1182d945de37f93 ]
    
    __bpf_prog_enter_recur() assigns bpf_tramp_run_ctx::saved_run_ctx before
    performing the recursion check which means in case of a recursion
    __bpf_prog_exit_recur() uses the previously set bpf_tramp_run_ctx::saved_run_ctx
    value.
    
    __bpf_prog_enter_sleepable_recur() assigns bpf_tramp_run_ctx::saved_run_ctx
    after the recursion check which means in case of a recursion
    __bpf_prog_exit_sleepable_recur() uses an uninitialized value. This does not
    look right. If I read the entry trampoline code right, then bpf_tramp_run_ctx
    isn't initialized upfront.
    
    Align __bpf_prog_enter_sleepable_recur() with __bpf_prog_enter_recur() and
    set bpf_tramp_run_ctx::saved_run_ctx before the recursion check is made.
    Remove the assignment of saved_run_ctx in kern_sys_bpf() since it happens
    a few cycles later.
    
    Fixes: e384c7b7b46d0 ("bpf, x86: Create bpf_tramp_run_ctx on the caller thread's stack")
    Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Acked-by: Jiri Olsa <jolsa@kernel.org>
    Link: https://lore.kernel.org/bpf/20230830080405.251926-3-bigeasy@linutronix.de
    Signed-off-by: Sasha Levin <sashal@kernel.org>

bpf: Invoke __bpf_prog_exit_sleepable_recur() on recursion in kern_sys_bpf(). [+ + +]

Author: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Date:   Wed Aug 30 10:04:04 2023 +0200

    bpf: Invoke __bpf_prog_exit_sleepable_recur() on recursion in kern_sys_bpf().
    
    [ Upstream commit 7645629f7dc88cd777f98970134bf1a54c8d77e3 ]
    
    If __bpf_prog_enter_sleepable_recur() detects recursion then it returns
    0 without undoing rcu_read_lock_trace(), migrate_disable() or
    decrementing the recursion counter. This is fine in the JIT case because
    the JIT code will jump in the 0 case to the end and invoke the matching
    exit trampoline (__bpf_prog_exit_sleepable_recur()).
    
    This is not the case in kern_sys_bpf() which returns directly to the
    caller with an error code.
    
    Add __bpf_prog_exit_sleepable_recur() as clean up in the recursion case.
    
    Fixes: b1d18a7574d0d ("bpf: Extend sys_bpf commands for bpf_syscall programs.")
    Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Acked-by: Jiri Olsa <jolsa@kernel.org>
    Link: https://lore.kernel.org/bpf/20230830080405.251926-2-bigeasy@linutronix.de
    Signed-off-by: Sasha Levin <sashal@kernel.org>
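
The enter/exit pairing problem described in the commit above can be sketched with a plain counter. All names here are hypothetical and the model leaves out rcu_read_lock_trace() and migrate_disable(); returning -EBUSY on recursion is an assumption about the error code, which the commit message does not name.

```c
#include <errno.h>

/* Hypothetical program state: recursion counter and miss statistics. */
struct demo_prog {
    int active;
    int misses;
};

/*
 * Returns 0 when recursion is detected. The counter stays incremented
 * in that case, so the caller is still responsible for calling exit.
 */
static int demo_enter_recur(struct demo_prog *p)
{
    if (++p->active != 1) {
        p->misses++;
        return 0;
    }
    return 1;
}

/* Undo what enter did, on both the normal and the recursion path. */
static void demo_exit_recur(struct demo_prog *p)
{
    p->active--;
}

/* Caller that returns an error on recursion but still cleans up. */
static int demo_run(struct demo_prog *p)
{
    if (!demo_enter_recur(p)) {
        demo_exit_recur(p);   /* the cleanup step the commit adds */
        return -EBUSY;        /* assumed error code */
    }
    /* ... program body would run here ... */
    demo_exit_recur(p);
    return 0;
}
```

If the cleanup call in the recursion branch were missing, every detected recursion would leave the counter one higher than before.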

bpf: Remove prog->active check for bpf_lsm and bpf_iter [+ + +]

Author: Martin KaFai Lau <martin.lau@kernel.org>
Date:   Tue Oct 25 11:45:16 2022 -0700

    bpf: Remove prog->active check for bpf_lsm and bpf_iter
    
    [ Upstream commit 271de525e1d7f564e88a9d212c50998b49a54476 ]
    
    The commit 64696c40d03c ("bpf: Add __bpf_prog_{enter,exit}_struct_ops for struct_ops trampoline")
    removed prog->active check for struct_ops prog.  The bpf_lsm
    and bpf_iter are also using the trampoline.  Like struct_ops, the bpf_lsm
    and bpf_iter have fixed hooks for the prog to attach.  The
    kernel does not call the same hook in a recursive way.
    This patch also removes the prog->active check for
    bpf_lsm and bpf_iter.
    
    A later patch has a test to reproduce the recursion issue
    for a sleepable bpf_lsm program.
    
    This patch appends the '_recur' naming to the existing
    enter and exit functions that track the prog->active counter.
    New __bpf_prog_{enter,exit}[_sleepable] functions are
    added to skip the prog->active tracking. The '_struct_ops'
    version is also removed.
    
    It also moves the decision on picking the enter and exit function to
    the new bpf_trampoline_{enter,exit}().  It returns the '_recur' ones
    for all tracing progs to use.  For bpf_lsm, bpf_iter,
    struct_ops (no prog->active tracking after 64696c40d03c), and
    bpf_lsm_cgroup (no prog->active tracking after 69fd337a975c7),
    it will return the functions that don't track the prog->active.
    
    Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
    Link: https://lore.kernel.org/r/20221025184524.3526117-2-martin.lau@linux.dev
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Stable-dep-of: 7645629f7dc8 ("bpf: Invoke __bpf_prog_exit_sleepable_recur() on recursion in kern_sys_bpf().")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

btrfs: don't start transaction when joining with TRANS_JOIN_NOSTART [+ + +]

Author: Filipe Manana <fdmanana@suse.com>
Date:   Wed Jul 26 16:56:57 2023 +0100

    btrfs: don't start transaction when joining with TRANS_JOIN_NOSTART
    
    commit 4490e803e1fe9fab8db5025e44e23b55df54078b upstream.
    
    When joining a transaction with TRANS_JOIN_NOSTART, if we don't find a
    running transaction we end up creating one. This goes against the purpose
    of TRANS_JOIN_NOSTART which is to join a running transaction if its state
    is at or below the state TRANS_STATE_COMMIT_START, otherwise return an
    -ENOENT error and don't start a new transaction. So fix this to not create
    a new transaction if there's no running transaction at or below that
    state.
    
    CC: stable@vger.kernel.org # 4.14+
    Fixes: a6d155d2e363 ("Btrfs: fix deadlock between fiemap and transaction commits")
    Signed-off-by: Filipe Manana <fdmanana@suse.com>
    Signed-off-by: David Sterba <dsterba@suse.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
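
The intended TRANS_JOIN_NOSTART behaviour can be sketched as a small userspace model. The types and names here (`model_fs`, `model_join_transaction`, the `MODEL_*` constants) are hypothetical and much simpler than the real btrfs transaction code. In particular, the -EAGAIN return for a plain join stands in for "wait for the commit", which this model does not implement.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified and ordered transaction states. */
enum model_state { MODEL_RUNNING = 0, MODEL_COMMIT_START = 1, MODEL_COMMIT_DOING = 2 };

enum model_join { MODEL_JOIN, MODEL_JOIN_NOSTART };

struct model_trans { enum model_state state; };

struct model_fs {
    struct model_trans *running;   /* NULL when no transaction is running */
    struct model_trans  storage;   /* backing store for a newly started one */
    int                 started;   /* number of transactions created so far */
};

/* Returns 0 and sets *out on success, a negative errno value otherwise. */
static int model_join_transaction(struct model_fs *fs, enum model_join type,
                                  struct model_trans **out)
{
    struct model_trans *cur;

    assert(fs != NULL && out != NULL);
    cur = fs->running;

    /* A transaction at or below COMMIT_START can always be joined. */
    if (cur != NULL && cur->state <= MODEL_COMMIT_START) {
        *out = cur;
        return 0;
    }

    /* NOSTART must never create a transaction: report -ENOENT instead. */
    if (type == MODEL_JOIN_NOSTART)
        return -ENOENT;

    /* Model only: the real code would wait for the commit to finish. */
    if (cur != NULL)
        return -EAGAIN;

    /* Plain join with nothing running: start a new transaction. */
    fs->storage.state = MODEL_RUNNING;
    fs->running = &fs->storage;
    fs->started++;
    *out = fs->running;
    return 0;
}
```

The key property is that a MODEL_JOIN_NOSTART call never increments `started`, whether or not a transaction is running.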

btrfs: fix start transaction qgroup rsv double free [+ + +]

Author: Boris Burkov <boris@bur.io>
Date:   Fri Jul 21 09:02:07 2023 -0700

    btrfs: fix start transaction qgroup rsv double free
    
    commit a6496849671a5bc9218ecec25a983253b34351b1 upstream.
    
    btrfs_start_transaction reserves metadata space of the PERTRANS type
    before it identifies a transaction to start/join. This allows flushing
    when reserving that space without a deadlock. However, it results in a
    race which temporarily breaks qgroup rsv accounting.
    
    T1                                              T2
    start_transaction
    do_stuff
                                                start_transaction
                                                    qgroup_reserve_meta_pertrans
    commit_transaction
        qgroup_free_meta_all_pertrans
                                                hit an error starting txn
                                                goto reserve_fail
                                                qgroup_free_meta_pertrans (already freed!)
    
    The basic issue is that there is nothing preventing another commit from
    committing before start_transaction finishes (in fact sometimes we
    intentionally wait for it) so any error path that frees the reserve is
    at risk of this race.
    
    While this exact space was getting freed anyway, and it's not a huge
    deal to double free it (just a warning, the free code catches this), it
    can result in incorrectly freeing some other pertrans reservation in
    this same reservation, which could then lead to spuriously granting
    reservations we might not have the space for. Therefore, I do believe it
    is worth fixing.
    
    To fix it, use the existing prealloc->pertrans conversion mechanism.
    When we first reserve the space, we reserve prealloc space and only when
    we are sure we have a transaction do we convert it to pertrans. This way
    any racing commits do not blow away our reservation, but we still get a
    pertrans reservation that is freed when _this_ transaction gets committed.
    
    This issue can be reproduced by running generic/269 with either qgroups
    or squotas enabled via mkfs on the scratch device.
    
    Reviewed-by: Josef Bacik <josef@toxicpanda.com>
    CC: stable@vger.kernel.org # 5.10+
    Signed-off-by: Boris Burkov <boris@bur.io>
    Signed-off-by: David Sterba <dsterba@suse.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
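
The prealloc-to-pertrans idea can be illustrated with a toy accounting model. All names (`model_rsv`, `model_reserve_prealloc`, `model_convert_to_pertrans`, `model_commit`, and so on) are hypothetical, and the two counters only approximate the real qgroup reservation types. The clamping in `model_sub()` stands in for a free path that warns on underflow rather than failing.

```c
#include <assert.h>
#include <stdint.h>

struct model_rsv {
    uint64_t prealloc;  /* reserved, not yet tied to a transaction */
    uint64_t pertrans;  /* dropped wholesale when the transaction commits */
};

/* Subtract with clamping, roughly like a free path that only warns. */
static void model_sub(uint64_t *counter, uint64_t n)
{
    *counter = (*counter >= n) ? *counter - n : 0;
}

static void model_reserve_prealloc(struct model_rsv *r, uint64_t n) { r->prealloc += n; }
static void model_reserve_pertrans(struct model_rsv *r, uint64_t n) { r->pertrans += n; }
static void model_free_prealloc(struct model_rsv *r, uint64_t n) { model_sub(&r->prealloc, n); }
static void model_free_pertrans(struct model_rsv *r, uint64_t n) { model_sub(&r->pertrans, n); }

/* Only called once the caller is sure it holds a transaction. */
static void model_convert_to_pertrans(struct model_rsv *r, uint64_t n)
{
    assert(r->prealloc >= n);
    r->prealloc -= n;
    r->pertrans += n;
}

/* Commit: every pertrans reservation is released at once. */
static void model_commit(struct model_rsv *r)
{
    r->pertrans = 0;
}
```

With the old scheme (reserve straight into `pertrans`), a racing `model_commit()` followed by an unrelated 30-byte pertrans reservation and then the error-path `model_free_pertrans(r, 50)` wipes out the unrelated 30 bytes. With the new scheme the error path calls `model_free_prealloc()` instead, so the unrelated pertrans reservation is left alone.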

btrfs: free qgroup rsv on io failure [+ + +]

Author: Boris Burkov <boris@bur.io>
Date:   Fri Jul 21 09:02:06 2023 -0700

    btrfs: free qgroup rsv on io failure
    
    commit e28b02118b94e42be3355458a2406c6861e2dd32 upstream.
    
    If we do a write whose bio suffers an error, we will never reclaim the
    qgroup reserved space for it. We allocate the space in the write_iter
    codepath, then release the reservation as we allocate the ordered
    extent, but we only create a delayed ref if the ordered extent finishes.
    If it has an error, we simply leak the rsv. This is apparent in running
    any error injecting (dmerror) fstests like btrfs/146 or btrfs/160. Such
    tests fail due to dmesg on umount complaining about the leaked qgroup
    data space.
    
    When we clean up other aspects of space on failed ordered_extents, also
    free the qgroup rsv.
    
    Reviewed-by: Josef Bacik <josef@toxicpanda.com>
    CC: stable@vger.kernel.org # 5.10+
    Signed-off-by: Boris Burkov <boris@bur.io>
    Signed-off-by: David Sterba <dsterba@suse.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

btrfs: set page extent mapped after read_folio in relocate_one_page [+ + +]

Author: Josef Bacik <josef@toxicpanda.com>
Date:   Mon Jul 31 11:13:00 2023 -0400

    btrfs: set page extent mapped after read_folio in relocate_one_page
    
    commit e7f1326cc24e22b38afc3acd328480a1183f9e79 upstream.
    
    One of the CI runs triggered the following panic
    
      assertion failed: PagePrivate(page) && page->private, in fs/btrfs/subpage.c:229
      ------------[ cut here ]------------
      kernel BUG at fs/btrfs/subpage.c:229!
      Internal error: Oops - BUG: 00000000f2000800 [#1] SMP
      CPU: 0 PID: 923660 Comm: btrfs Not tainted 6.5.0-rc3+ #1
      pstate: 61400005 (nZCv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--)
      pc : btrfs_subpage_assert+0xbc/0xf0
      lr : btrfs_subpage_assert+0xbc/0xf0
      sp : ffff800093213720
      x29: ffff800093213720 x28: ffff8000932138b4 x27: 000000000c280000
      x26: 00000001b5d00000 x25: 000000000c281000 x24: 000000000c281fff
      x23: 0000000000001000 x22: 0000000000000000 x21: ffffff42b95bf880
      x20: ffff42b9528e0000 x19: 0000000000001000 x18: ffffffffffffffff
      x17: 667274622f736620 x16: 6e69202c65746176 x15: 0000000000000028
      x14: 0000000000000003 x13: 00000000002672d7 x12: 0000000000000000
      x11: ffffcd3f0ccd9204 x10: ffffcd3f0554ae50 x9 : ffffcd3f0379528c
      x8 : ffff800093213428 x7 : 0000000000000000 x6 : ffffcd3f091771e8
      x5 : ffff42b97f333948 x4 : 0000000000000000 x3 : 0000000000000000
      x2 : 0000000000000000 x1 : ffff42b9556cde80 x0 : 000000000000004f
      Call trace:
       btrfs_subpage_assert+0xbc/0xf0
       btrfs_subpage_set_dirty+0x38/0xa0
       btrfs_page_set_dirty+0x58/0x88
       relocate_one_page+0x204/0x5f0
       relocate_file_extent_cluster+0x11c/0x180
       relocate_data_extent+0xd0/0xf8
       relocate_block_group+0x3d0/0x4e8
       btrfs_relocate_block_group+0x2d8/0x490
       btrfs_relocate_chunk+0x54/0x1a8
       btrfs_balance+0x7f4/0x1150
       btrfs_ioctl+0x10f0/0x20b8
       __arm64_sys_ioctl+0x120/0x11d8
       invoke_syscall.constprop.0+0x80/0xd8
       do_el0_svc+0x6c/0x158
       el0_svc+0x50/0x1b0
       el0t_64_sync_handler+0x120/0x130
       el0t_64_sync+0x194/0x198
      Code: 91098021 b0007fa0 91346000 97e9c6d2 (d4210000)
    
    This is the same problem outlined in 17b17fcd6d44 ("btrfs:
    set_page_extent_mapped after read_folio in btrfs_cont_expand"), and the
    fix is the same.  I originally looked for the same pattern elsewhere in
    our code, but mistakenly skipped over this code because I saw the page
    cache readahead before we set_page_extent_mapped, not realizing that
    this was only in the !page case, that we can still end up with a
    !uptodate page and then do the btrfs_read_folio further down.
    
    The fix here is the same as the above mentioned patch, move the
    set_page_extent_mapped call to after the btrfs_read_folio() block to
    make sure that we have the subpage blocksize stuff setup properly before
    using the page.
    
    CC: stable@vger.kernel.org # 6.1+
    Reviewed-by: Filipe Manana <fdmanana@suse.com>
    Signed-off-by: Josef Bacik <josef@toxicpanda.com>
    Signed-off-by: David Sterba <dsterba@suse.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

btrfs: use the correct superblock to compare fsid in btrfs_validate_super [+ + +]

Author: Anand Jain <anand.jain@oracle.com>
Date:   Mon Jul 31 19:16:34 2023 +0800

    btrfs: use the correct superblock to compare fsid in btrfs_validate_super
    
    commit d167aa76dc0683828588c25767da07fb549e4f48 upstream.
    
    The function btrfs_validate_super() should verify the fsid in the provided
    superblock argument, because all its callers expect it to do that.
    
    For example, in the following call chains:
    
       write_all_supers()
           sb = fs_info->super_for_commit;
           btrfs_validate_write_super(.., sb)
             btrfs_validate_super(.., sb, ..)
    
       scrub_one_super()
            btrfs_validate_super(.., sb, ..)
    
    And
       check_dev_super()
            btrfs_validate_super(.., sb, ..)
    
    However, it currently verifies the fs_info::super_copy::fsid instead,
    which is not correct.  Fix this using the correct fsid in the superblock
    argument.
    
    CC: stable@vger.kernel.org # 5.4+
    Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
    Tested-by: Guilherme G. Piccoli <gpiccoli@igalia.com>
    Signed-off-by: Anand Jain <anand.jain@oracle.com>
    Reviewed-by: David Sterba <dsterba@suse.com>
    Signed-off-by: David Sterba <dsterba@suse.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
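
The difference between checking the in-memory copy and checking the superblock that was passed in can be shown with a small userspace sketch. The structures and functions below (`model_super`, `model_fs_info`, `model_validate_old`, `model_validate_new`) are hypothetical and reduce the validation to a single fsid comparison; -1 is a placeholder error value.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MODEL_FSID_SIZE 16

struct model_super {
    unsigned char fsid[MODEL_FSID_SIZE];
};

struct model_fs_info {
    struct model_super super_copy;                  /* in-memory copy */
    unsigned char expected_fsid[MODEL_FSID_SIZE];   /* fsid of the filesystem */
};

/* Pre-fix behaviour: ignores sb and checks the in-memory copy instead. */
static int model_validate_old(const struct model_fs_info *fs,
                              const struct model_super *sb)
{
    assert(fs != NULL && sb != NULL);
    return memcmp(fs->super_copy.fsid, fs->expected_fsid,
                  MODEL_FSID_SIZE) == 0 ? 0 : -1;
}

/* Post-fix behaviour: checks the superblock that was actually passed in. */
static int model_validate_new(const struct model_fs_info *fs,
                              const struct model_super *sb)
{
    assert(fs != NULL && sb != NULL);
    return memcmp(sb->fsid, fs->expected_fsid,
                  MODEL_FSID_SIZE) == 0 ? 0 : -1;
}
```

Given a superblock whose fsid does not match, the old variant still reports success as long as the in-memory copy is intact, while the new variant rejects it.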

btrfs: zoned: do not zone finish data relocation block group [+ + +]

Author: Naohiro Aota <naohiro.aota@wdc.com>
Date:   Fri Jul 21 16:42:14 2023 +0900

    btrfs: zoned: do not zone finish data relocation block group
    
    commit 332581bde2a419d5f12a93a1cdc2856af649a3cc upstream.
    
    When multiple writes happen at once, we may need to sacrifice a currently
    active block group to be zone finished for a new allocation. We choose a
    block group with the least free space left, and zone finish it.
    
    To do the finishing, we need to send IOs for the already allocated region
    and wait for them and for any ongoing IOs. Otherwise, these IOs fail
    because the zone is already finished by the time they reach the device.
    
    However, if a block group dedicated to data relocation is zone
    finished, there is a chance that it is finished before an ongoing write IO
    reaches the device. That is because there is a timing gap between the
    point where an allocation is done (block_group->reservations == 0, as
    pre-allocation is done) and the point where an ordered extent is created
    when the relocation IO starts. Thus, if we finish the zone between them,
    we can fail the IOs.
    
    We cannot simply use "fs_info->data_reloc_bg == block_group->start" to
    avoid the zone finishing, because data_reloc_bg may already have switched
    to a new block group while there are still ongoing write IOs to the old
    data_reloc_bg.
    
    So, this patch reworks the BLOCK_GROUP_FLAG_ZONED_DATA_RELOC bit to
    indicate there is a data relocation allocation and/or ongoing write to the
    block group. The bit is set on allocation and cleared in the end_io
    function of the last IO for the currently allocated region.
    
    Changing the timing of the bit setting also solves the issue of the bit
    being left set even after there is no IO going on. With the current code,
    if the data_reloc_bg switches after the last IO to the current
    data_reloc_bg, the bit is set at that point and nothing clears it. As a
    result, that block group is kept unallocatable for anything.
    
    Fixes: 343d8a30851c ("btrfs: zoned: prevent allocation from previous data relocation BG")
    Fixes: 74e91b12b115 ("btrfs: zoned: zone finish unused block group")
    CC: stable@vger.kernel.org # 6.1+
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
    Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
    Signed-off-by: David Sterba <dsterba@suse.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

btrfs: zoned: re-enable metadata over-commit for zoned mode [+ + +]

Author: Naohiro Aota <naohiro.aota@wdc.com>
Date:   Tue Aug 8 01:12:40 2023 +0900

    btrfs: zoned: re-enable metadata over-commit for zoned mode
    
    commit 5b135b382a360f4c87cf8896d1465b0b07f10cb0 upstream.
    
    Now we can re-enable metadata over-commit. As we moved the activation
    from the reservation time to the write time, we no longer need to ensure
    that all the reserved bytes are properly activated.
    
    Without metadata over-commit, zoned mode suffers from lower performance
    because it needs to flush the delalloc items more often and allocate more
    block groups. Re-enabling metadata over-commit solves the issue.
    
    Fixes: 79417d040f4f ("btrfs: zoned: disable metadata overcommit for zoned")
    CC: stable@vger.kernel.org # 6.1+
    Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
    Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
    Signed-off-by: David Sterba <dsterba@suse.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

bus: mhi: host: Skip MHI reset if device is in RDDM [+ + +]

Author: Qiang Yu <quic_qianyu@quicinc.com>
Date:   Thu May 18 14:22:39 2023 +0800

    bus: mhi: host: Skip MHI reset if device is in RDDM
    
    commit cabce92dd805945a090dc6fc73b001bb35ed083a upstream.
    
    In RDDM EE, the device cannot process an MHI reset issued by the host. In
    case of MHI power off, the host issues an MHI reset and polls for it to be
    cleared until it times out. Since this timeout cannot be avoided in case
    of RDDM, skip the MHI reset in this scenario.
    
    Cc: <stable@vger.kernel.org>
    Fixes: a6e2e3522f29 ("bus: mhi: core: Add support for PM state transitions")
    Signed-off-by: Qiang Yu <quic_qianyu@quicinc.com>
    Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
    Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>
    Link: https://lore.kernel.org/r/1684390959-17836-1-git-send-email-quic_qianyu@quicinc.com
    Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

cifs: update desired access while requesting for directory lease [+ + +]

Author: Bharath SM <bharathsm@microsoft.com>
Date:   Wed Aug 16 19:38:45 2023 +0000

    cifs: update desired access while requesting for directory lease
    
    commit b6d44d42313baa45a81ce9b299aeee2ccf3d0ee1 upstream.
    
    We read and cache directory contents when we get a directory
    lease, so we should ask for read permission to read the contents
    of the directory.
    
    Signed-off-by: Bharath SM <bharathsm@microsoft.com>
    Reviewed-by: Shyam Prasad N <sprasad@microsoft.com>
    Cc: stable@vger.kernel.org
    Signed-off-by: Steve French <stfrench@microsoft.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

cifs: use fs_context for automounts [+ + +]

Author: Paulo Alcantara <pc@cjr.nz>
Date:   Tue Oct 4 18:41:20 2022 -0300

    cifs: use fs_context for automounts
    
    [ Upstream commit 9fd29a5bae6e8f94b410374099a6fddb253d2d5f ]
    
    Use filesystem context support to handle dfs links.
    
    Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
    Signed-off-by: Steve French <stfrench@microsoft.com>
    Stable-dep-of: efc0b0bcffcb ("smb: propagate error code of extract_sharename()")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

clk: imx: pll14xx: align pdiv with reference manual [+ + +]

Author: Marco Felsch <m.felsch@pengutronix.de>
Date:   Mon Aug 7 10:47:43 2023 +0200

    clk: imx: pll14xx: align pdiv with reference manual
    
    commit 37cfd5e457cbdcd030f378127ff2d62776f641e7 upstream.
    
    The PLL14xx hardware can be found on i.MX8M{M,N,P} SoCs and always comes
    with a 6-bit pre-divider. Neither the reference manuals nor the
    datasheets of these SoCs mention any restrictions. Furthermore, the
    current code doesn't respect the restrictions from the comment either.
    
    Therefore drop the restriction and align the max pre-divider (pdiv)
    value to 63 to get more accurate frequencies.
    
    Fixes: b09c68dc57c9 ("clk: imx: pll14xx: Support dynamic rates")
    Cc: stable@vger.kernel.org
    Signed-off-by: Marco Felsch <m.felsch@pengutronix.de>
    Reviewed-by: Abel Vesa <abel.vesa@linaro.org>
    Reviewed-by: Adam Ford <aford173@gmail.com>
    Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
    Acked-by: Sascha Hauer <s.hauer@pengutronix.de>
    Tested-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
    Link: https://lore.kernel.org/r/20230807084744.1184791-1-m.felsch@pengutronix.de
    Signed-off-by: Abel Vesa <abel.vesa@linaro.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

clk: imx: pll14xx: dynamically configure PLL for 393216000/361267200Hz [+ + +]

Author: Ahmad Fatoum <a.fatoum@pengutronix.de>
Date:   Mon Aug 7 10:47:44 2023 +0200

    clk: imx: pll14xx: dynamically configure PLL for 393216000/361267200Hz
    
    commit 72d00e560d10665e6139c9431956a87ded6e9880 upstream.
    
    Since commit b09c68dc57c9 ("clk: imx: pll14xx: Support dynamic rates"),
    the driver has the ability to dynamically compute PLL parameters to
    approximate the requested rates. This is not always used, because the
    logic is as follows:
    
      - Check if the target rate is hardcoded in the frequency table
      - Check if varying only kdiv is possible, so switch over is glitch free
      - Compute rate dynamically by iterating over pdiv range
    
    If we skip the frequency table for the 1443x PLL, we find that the
    computed values differ from the hardcoded ones. This can be valid if the
    hardcoded values guarantee, for example, an earlier lock-in or if the
    divisors are chosen so that other important rates are more likely to
    be reached glitch-free.
    
    For the rates 393216000 and 361267200, this doesn't seem to be the case:
    They are only approximated by existing parameters (393215995 and
    361267196 Hz, respectively) and they aren't reachable glitch-free from
    other hardcoded frequencies. Dropping them from the table allows us
    to lock-in to these frequencies exactly.
    
    This is immediately noticeable because they are the assigned-clock-rates
    for IMX8MN_AUDIO_PLL1 and IMX8MN_AUDIO_PLL2, respectively and a look
    into clk_summary so far showed that they were a few Hz short of the target:
    
    imx8mn-board:~# grep audio_pll[12]_out /sys/kernel/debug/clk/clk_summary
    audio_pll2_out           0        0        0   361267196 0     0  50000   N
    audio_pll1_out           1        1        0   393215995 0     0  50000   Y
    
    and afterwards:
    
    imx8mn-board:~# grep audio_pll[12]_out /sys/kernel/debug/clk/clk_summary
    audio_pll2_out           0        0        0   361267200 0     0  50000   N
    audio_pll1_out           1        1        0   393216000 0     0  50000   Y
    
    This change is equivalent to adding following hardcoded values:
    
      /*               rate     mdiv  pdiv  sdiv   kdiv */
      PLL_1443X_RATE(393216000, 655,    5,    3,  23593),
      PLL_1443X_RATE(361267200, 497,   33,    0, -16882),
    
    Fixes: 053a4ffe2988 ("clk: imx: imx8mm: fix audio pll setting")
    Cc: stable@vger.kernel.org # v5.18+
    Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
    Signed-off-by: Marco Felsch <m.felsch@pengutronix.de>
    Link: https://lore.kernel.org/r/20230807084744.1184791-2-m.felsch@pengutronix.de
    Signed-off-by: Abel Vesa <abel.vesa@linaro.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
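
The two parameter sets quoted in this commit can be checked with a short calculation. The sketch below rests on two assumptions that are not stated in the commit message: a 24 MHz reference clock (`MODEL_FREF_HZ`) and the fractional PLL formula fout = (mdiv * 65536 + kdiv) * fref / (pdiv * 65536 * 2^sdiv). The function name `model_pll1443x_rate` is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed 24 MHz reference clock; not stated in the commit message. */
#define MODEL_FREF_HZ 24000000ULL

/*
 * Assumed 1443x formula:
 *   fout = (mdiv * 65536 + kdiv) * fref / (pdiv * 65536 * 2^sdiv)
 * The division truncates, as integer division does.
 */
static uint64_t model_pll1443x_rate(uint32_t mdiv, uint32_t pdiv,
                                    uint32_t sdiv, int32_t kdiv)
{
    int64_t frac;
    uint64_t num, den;

    assert(pdiv >= 1 && pdiv <= 63);   /* 6-bit pre-divider, zero excluded */
    assert(sdiv < 32);                 /* keep the shift well defined */

    frac = (int64_t)mdiv * 65536 + kdiv;   /* kdiv may be negative */
    assert(frac > 0);

    num = (uint64_t)frac * MODEL_FREF_HZ;
    den = ((uint64_t)pdiv * 65536) << sdiv;
    return num / den;
}
```

Under these assumptions, `model_pll1443x_rate(655, 5, 3, 23593)` evaluates to 393216000 and `model_pll1443x_rate(497, 33, 0, -16882)` to 361267200, matching the requested rates.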

clk: qcom: camcc-sc7180: fix async resume during probe [+ + +]

Author: Johan Hovold <johan+linaro@kernel.org>
Date:   Tue Jul 18 15:28:55 2023 +0200

    clk: qcom: camcc-sc7180: fix async resume during probe
    
    commit c948ff727e25297f3a703eb5349dd66aabf004e4 upstream.
    
    To make sure that the controller is runtime resumed and its power domain
    is enabled before accessing its registers during probe, the synchronous
    runtime PM interface must be used.
    
    Fixes: 8d4025943e13 ("clk: qcom: camcc-sc7180: Use runtime PM ops instead of clk ones")
    Cc: stable@vger.kernel.org      # 5.11
    Cc: Stephen Boyd <sboyd@kernel.org>
    Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
    Link: https://lore.kernel.org/r/20230718132902.21430-2-johan+linaro@kernel.org
    Signed-off-by: Bjorn Andersson <andersson@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

clk: qcom: dispcc-sm8450: fix runtime PM imbalance on probe errors [+ + +]

Author: Johan Hovold <johan+linaro@kernel.org>
Date:   Tue Jul 18 15:28:56 2023 +0200

    clk: qcom: dispcc-sm8450: fix runtime PM imbalance on probe errors
    
    commit b0f3d01bda6c3f6f811e70f76d2040ae81f64565 upstream.
    
    Make sure to decrement the runtime PM usage count before returning in
    case regmap initialisation fails.
    
    Fixes: 16fb89f92ec4 ("clk: qcom: Add support for Display Clock Controller on SM8450")
    Cc: stable@vger.kernel.org      # 6.1
    Cc: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
    Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
    Link: https://lore.kernel.org/r/20230718132902.21430-3-johan+linaro@kernel.org
    Signed-off-by: Bjorn Andersson <andersson@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

clk: qcom: gcc-mdm9615: use proper parent for pll0_vote clock [+ + +]

Author: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Date:   Sat May 13 00:17:23 2023 +0300

    clk: qcom: gcc-mdm9615: use proper parent for pll0_vote clock
    
    commit 1583694bb4eaf186f17131dbc1b83d6057d2749b upstream.
    
    The pll0_vote clock definitely should have pll0 as a parent (instead of
    pll8).
    
    Fixes: 7792a8d6713c ("clk: mdm9615: Add support for MDM9615 Clock Controllers")
    Cc: stable@kernel.org
    Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org>
    Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
    Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org>
    Link: https://lore.kernel.org/r/20230512211727.3445575-7-dmitry.baryshkov@linaro.org
    Signed-off-by: Bjorn Andersson <andersson@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

clk: qcom: lpasscc-sc7280: fix missing resume during probe [+ + +]

Author: Johan Hovold <johan+linaro@kernel.org>
Date:   Tue Jul 18 15:28:59 2023 +0200

    clk: qcom: lpasscc-sc7280: fix missing resume during probe
    
    commit 66af5339d4f8e20c6d89a490570bd94d40f1a7f6 upstream.
    
    Drivers that enable runtime PM must make sure that the controller is
    runtime resumed before accessing its registers to prevent the power
    domain from being disabled.
    
    Fixes: 4ab43d171181 ("clk: qcom: Add lpass clock controller driver for SC7280")
    Cc: stable@vger.kernel.org      # 5.16
    Cc: Taniya Das <quic_tdas@quicinc.com>
    Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
    Link: https://lore.kernel.org/r/20230718132902.21430-6-johan+linaro@kernel.org
    Signed-off-by: Bjorn Andersson <andersson@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

clk: qcom: mss-sc7180: fix missing resume during probe [+ + +]

Author: Johan Hovold <johan+linaro@kernel.org>
Date:   Tue Jul 18 15:29:01 2023 +0200

    clk: qcom: mss-sc7180: fix missing resume during probe
    
    commit e2349da0fa7ca822cda72f427345b95795358fe7 upstream.
    
    Drivers that enable runtime PM must make sure that the controller is
    runtime resumed before accessing its registers to prevent the power
    domain from being disabled.
    
    Fixes: 8def929c4097 ("clk: qcom: Add modem clock controller driver for SC7180")
    Cc: stable@vger.kernel.org      # 5.7
    Cc: Taniya Das <quic_tdas@quicinc.com>
    Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
    Link: https://lore.kernel.org/r/20230718132902.21430-8-johan+linaro@kernel.org
    Signed-off-by: Bjorn Andersson <andersson@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

clk: qcom: q6sstop-qcs404: fix missing resume during probe [+ + +]

Author: Johan Hovold <johan+linaro@kernel.org>
Date:   Tue Jul 18 15:29:00 2023 +0200

    clk: qcom: q6sstop-qcs404: fix missing resume during probe
    
    commit 97112c83f4671a4a722f99a53be4e91fac4091bc upstream.
    
    Drivers that enable runtime PM must make sure that the controller is
    runtime resumed before accessing its registers to prevent the power
    domain from being disabled.
    
    Fixes: 6cdef2738db0 ("clk: qcom: Add Q6SSTOP clock controller for QCS404")
    Cc: stable@vger.kernel.org      # 5.5
    Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
    Link: https://lore.kernel.org/r/20230718132902.21430-7-johan+linaro@kernel.org
    Signed-off-by: Bjorn Andersson <andersson@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

clk: qcom: turingcc-qcs404: fix missing resume during probe [+ + +]

Author: Johan Hovold <johan+linaro@kernel.org>
Date:   Tue Jul 18 15:29:02 2023 +0200

    clk: qcom: turingcc-qcs404: fix missing resume during probe
    
    commit a9f71a033587c9074059132d34c74eabbe95ef26 upstream.
    
    Drivers that enable runtime PM must make sure that the controller is
    runtime resumed before accessing its registers to prevent the power
    domain from being disabled.
    
    Fixes: 892df0191b29 ("clk: qcom: Add QCS404 TuringCC")
    Cc: stable@vger.kernel.org      # 5.2
    Cc: Bjorn Andersson <andersson@kernel.org>
    Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
    Link: https://lore.kernel.org/r/20230718132902.21430-9-johan+linaro@kernel.org
    Signed-off-by: Bjorn Andersson <andersson@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

clocksource/drivers/arm_arch_timer: Disable timer before programming CVAL [+ + +]

Author: Walter Chang <walter.chang@mediatek.com>
Date:   Mon Jul 17 17:07:34 2023 +0800

    clocksource/drivers/arm_arch_timer: Disable timer before programming CVAL
    
    commit e7d65e40ab5a5940785c5922f317602d0268caaf upstream.
    
    Because the use of `writeq_relaxed()` to program CVAL is not guaranteed
    to be atomic, it is necessary to disable the timer before programming
    CVAL.
    
    However, if the MMIO timer is already enabled and has not yet expired,
    there is a possibility of unexpected behavior occurring: when the CPU
    enters the idle state during this period, and if the CPU's local event
    is earlier than the broadcast event, the following process occurs:
    
    tick_broadcast_enter()
      tick_broadcast_oneshot_control(TICK_BROADCAST_ENTER)
        __tick_broadcast_oneshot_control()
          ___tick_broadcast_oneshot_control()
            tick_broadcast_set_event()
              clockevents_program_event()
                set_next_event_mem()
    
    During this process, the MMIO timer remains enabled while programming
    CVAL. To prevent such behavior, disable timer explicitly prior to
    programming CVAL.
    
    Fixes: 8b82c4f883a7 ("clocksource/drivers/arm_arch_timer: Move MMIO timer programming over to CVAL")
    Cc: stable@vger.kernel.org
    Signed-off-by: Walter Chang <walter.chang@mediatek.com>
    Acked-by: Marc Zyngier <maz@kernel.org>
    Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
    Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
    Link: https://lore.kernel.org/r/20230717090735.19370-1-walter.chang@mediatek.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
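
Why a non-atomic 64-bit write is a problem while the timer is enabled can be seen in a small simulation. Everything here is a hypothetical model: the struct, the function names, and in particular the assumption that the write is split into a low half followed by a high half. The model fires when the timer is enabled and the counter has reached the compare value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct model_timer {
    uint32_t cval_lo;
    uint32_t cval_hi;
    bool     enabled;
    bool     fired;     /* latched once the compare condition was observed */
};

static uint64_t model_cval(const struct model_timer *t)
{
    return ((uint64_t)t->cval_hi << 32) | t->cval_lo;
}

/* "Hardware" view: re-evaluated after every register write. */
static void model_evaluate(struct model_timer *t, uint64_t counter)
{
    if (t->enabled && counter >= model_cval(t))
        t->fired = true;
}

/* 64-bit write split into two 32-bit writes (low half first: an assumption). */
static void model_write_cval(struct model_timer *t, uint64_t v, uint64_t counter)
{
    assert(t != NULL);
    t->cval_lo = (uint32_t)v;
    model_evaluate(t, counter);        /* intermediate value is visible here */
    t->cval_hi = (uint32_t)(v >> 32);
    model_evaluate(t, counter);
}

/* Programs CVAL while the timer stays enabled. */
static void model_program_unsafe(struct model_timer *t, uint64_t v, uint64_t counter)
{
    model_write_cval(t, v, counter);
}

/* Disables the timer first, then programs CVAL, then re-enables it. */
static void model_program_safe(struct model_timer *t, uint64_t v, uint64_t counter)
{
    t->enabled = false;
    model_write_cval(t, v, counter);
    t->enabled = true;
    model_evaluate(t, counter);
}
```

Starting from a compare value of 0x00000000FFFFFFFF with the counter at 0xFFFFFF00, programming 0x0000000100000010 through `model_program_unsafe()` briefly exposes 0x0000000000000010 and fires early in this model; `model_program_safe()` does not.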

dmaengine: sh: rz-dmac: Fix destination and source data size setting [+ + +]

Author: Hien Huynh <hien.huynh.px@renesas.com>
Date:   Thu Jul 6 12:21:50 2023 +0100

    dmaengine: sh: rz-dmac: Fix destination and source data size setting
    
    commit c6ec8c83a29fb3aec3efa6fabbf5344498f57c7f upstream.
    
    Before setting the DDS and SDS values, we need to clear the fields first.
    Otherwise, we get incorrect results when we change/update the DMA bus
    width several times, because the new value is ORed into the old one.
    
    Fixes: 5000d37042a6 ("dmaengine: sh: Add DMAC driver for RZ/G2L SoC")
    Cc: stable@kernel.org
    Signed-off-by: Hien Huynh <hien.huynh.px@renesas.com>
    Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
    Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
    Link: https://lore.kernel.org/r/20230706112150.198941-3-biju.das.jz@bp.renesas.com
    Signed-off-by: Vinod Koul <vkoul@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
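
The effect of ORing a new field value into a register without clearing the field first can be reproduced with plain integer arithmetic. The field position and width below (`MODEL_SIZE_SHIFT`, `MODEL_SIZE_MASK`) are hypothetical and are not taken from the RZ/G2L register layout; the function names are likewise made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 4-bit field at bits 19:16; the real DDS/SDS layout may differ. */
#define MODEL_SIZE_SHIFT 16
#define MODEL_SIZE_MASK  (0xFu << MODEL_SIZE_SHIFT)

/* Old pattern: OR the new value in without clearing the field. */
static uint32_t model_set_size_or_only(uint32_t cfg, uint32_t size_code)
{
    assert(size_code <= 0xFu);
    return cfg | (size_code << MODEL_SIZE_SHIFT);
}

/* Fixed pattern: clear the field, then set the new value. */
static uint32_t model_set_size_clear_first(uint32_t cfg, uint32_t size_code)
{
    assert(size_code <= 0xFu);
    cfg &= ~MODEL_SIZE_MASK;
    return cfg | (size_code << MODEL_SIZE_SHIFT);
}

/* Helper to read the field back. */
static uint32_t model_get_size(uint32_t cfg)
{
    return (cfg & MODEL_SIZE_MASK) >> MODEL_SIZE_SHIFT;
}
```

Writing size code 1 and then size code 2 with the OR-only variant leaves the field at 3, a value nobody asked for. The clear-first variant leaves it at 2 and does not disturb bits outside the field.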

drm/amd/display: always switch off ODM before committing more streams [+ + +]

Author: Wenjing Liu <wenjing.liu@amd.com>
Date:   Tue Aug 15 10:47:52 2023 -0400

    drm/amd/display: always switch off ODM before committing more streams
    
    commit 49a30c3d1a2258fc93cfe6eea8e4951dabadc824 upstream.
    
    ODM power optimization is only supported with a single stream. When ODM
    power optimization is enabled, we might not have enough free pipes for
    enabling another stream. So when we are committing more than 1 stream we
    should first switch off ODM power optimization to make room for the new
    stream and then allocate pipe resources for the new stream.
    
    Cc: stable@vger.kernel.org
    Fixes: 59de751e3845 ("drm/amd/display: add ODM case when looking for first split pipe")
    Reviewed-by: Dillon Varone <dillon.varone@amd.com>
    Acked-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
    Signed-off-by: Wenjing Liu <wenjing.liu@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/amd/display: enable cursor degamma for DCN3+ DRM legacy gamma [+ + +]

Author: Melissa Wen <mwen@igalia.com>
Date:   Thu Aug 31 15:12:28 2023 -0100

    drm/amd/display: enable cursor degamma for DCN3+ DRM legacy gamma
    
    commit 57a943ebfcdb4a97fbb409640234bdb44bfa1953 upstream.
    
    For DRM legacy gamma, AMD display manager applies implicit sRGB degamma
    using a pre-defined sRGB transfer function. It works fine for the DCN2
    family, where degamma ROM and custom curves go to the same color block.
    But on DCN3+, degamma is split into two blocks: degamma ROM for
    pre-defined TFs and `gamma correction` for user/custom curves, and
    degamma ROM settings don't apply to the cursor plane. To get DRM legacy
    gamma working as expected, enable cursor degamma ROM for implicit sRGB
    degamma on HW with this configuration.
    
    Cc: stable@vger.kernel.org
    Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2803
    Fixes: 96b020e2163f ("drm/amd/display: check attr flag before set cursor degamma on DCN3+")
    Signed-off-by: Melissa Wen <mwen@igalia.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/amd/display: Fix a bug when searching for insert_above_mpcc [+ + +]

Author: Wesley Chalmers <wesley.chalmers@amd.com>
Date:   Wed Jun 21 19:13:26 2023 -0400

    drm/amd/display: Fix a bug when searching for insert_above_mpcc
    
    commit 3d028d5d60d516c536de1ddd3ebf3d55f3f8983b upstream.
    
    [WHY]
    Currently, when insert_plane is called with insert_above_mpcc
    parameter that is equal to tree->opp_list, the function returns NULL.
    
    [HOW]
    Instead, the function should insert the plane at the top of the tree.
    
    Cc: Mario Limonciello <mario.limonciello@amd.com>
    Cc: Alex Deucher <alexander.deucher@amd.com>
    Cc: stable@vger.kernel.org
    Reviewed-by: Jun Lei <jun.lei@amd.com>
    Acked-by: Tom Chung <chiahsuan.chung@amd.com>
    Signed-off-by: Wesley Chalmers <wesley.chalmers@amd.com>
    Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
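
A minimal user-space sketch of the case described in the entry above, assuming a simple top-to-bottom linked list. The names `struct node`, `struct tree` and `insert_above_node` are hypothetical and are not the driver's types; `top` only plays the role of `tree->opp_list`.

```c
#include <assert.h>
#include <stddef.h>

struct node {
    struct node *below;   /* next node towards the bottom of the stack */
};

struct tree {
    struct node *top;     /* plays the role of tree->opp_list */
};

/*
 * Insert new_node directly above insert_above. Returns new_node on
 * success, or NULL if insert_above is not part of the tree.
 */
struct node *insert_above_node(struct tree *t, struct node *new_node,
                               struct node *insert_above)
{
    struct node *cur;

    assert(t != NULL && new_node != NULL);

    /* The case from the commit message: the target is the current top. */
    if (insert_above == t->top) {
        new_node->below = t->top;
        t->top = new_node;
        return new_node;
    }

    /* Otherwise find the node that sits directly above the target. */
    for (cur = t->top; cur != NULL; cur = cur->below) {
        if (cur->below == insert_above) {
            new_node->below = insert_above;
            cur->below = new_node;
            return new_node;
        }
    }
    return NULL; /* target not found */
}
```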

drm/amd/display: prevent potential division by zero errors [+ + +]

Author: Hamza Mahfooz <hamza.mahfooz@amd.com>
Date:   Tue Sep 5 13:27:22 2023 -0400

    drm/amd/display: prevent potential division by zero errors
    
    commit 07e388aab042774f284a2ad75a70a194517cdad4 upstream.
    
    There are two places in apply_below_the_range() where it's possible for
    a divide by zero error to occur. So, to fix this make sure the divisor
    is non-zero before attempting the computation in both cases.
    
    Cc: stable@vger.kernel.org
    Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2637
    Fixes: a463b263032f ("drm/amd/display: Fix frames_to_insert math")
    Fixes: ded6119e825a ("drm/amd/display: Reinstate LFC optimization")
    Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
    Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
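
The guard described in the entry above can be sketched as follows. This is an illustration only: `frames_to_insert` and its parameters are hypothetical stand-ins, not the actual computation in apply_below_the_range(), and returning 1 for a zero divisor is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: how many times a frame is shown so that each
 * repetition lasts at most max_duration_us. Falls back to 1 when the
 * divisor would be zero.
 */
uint32_t frames_to_insert(uint32_t frame_duration_us, uint32_t max_duration_us)
{
    uint32_t n;

    if (max_duration_us == 0)
        return 1; /* guard: never divide by zero */

    /* Round up; 64-bit arithmetic avoids wrap-around in the sum. */
    n = (uint32_t)(((uint64_t)frame_duration_us + max_duration_us - 1) /
                   max_duration_us);
    assert(frame_duration_us == 0 || n >= 1);
    return n;
}
```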

drm/amd/display: Remove wait while locked [+ + +]

Author: Gabe Teeger <gabe.teeger@amd.com>
Date:   Mon Aug 14 16:06:18 2023 -0400

    drm/amd/display: Remove wait while locked
    
    commit 5a3ccb1400339268c5e3dc1fa044a7f6c7f59a02 upstream.
    
    [Why]
    We wait for mpc idle while in a locked state, leading to potential
    deadlock.
    
    [What]
    Move the wait_for_idle call to outside of HW lock. This and a
    call to wait_drr_doublebuffer_pending_clear are moved to a new
    static helper function called wait_for_outstanding_hw_updates, to make
    the interface clearer.
    
    Cc: stable@vger.kernel.org
    Fixes: 8f0d304d21b3 ("drm/amd/display: Do not commit pipe when updating DRR")
    Reviewed-by: Jun Lei <jun.lei@amd.com>
    Acked-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
    Signed-off-by: Gabe Teeger <gabe.teeger@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
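
A small sketch of the pattern described in the entry above: the wait runs only while the lock is not held. Everything here is a stand-in (a boolean instead of the real HW lock, a counter instead of real polling), and placing the wait before the lock is taken is an assumption made for the illustration.

```c
#include <assert.h>
#include <stdbool.h>

static bool hw_locked;   /* stand-in for the hardware pipe lock */
static int waits_done;   /* counts completed waits */

static void hw_lock(void)   { hw_locked = true; }
static void hw_unlock(void) { hw_locked = false; }

/* Stand-in for wait_for_idle: must never run with the lock held. */
static void wait_for_idle(void)
{
    assert(!hw_locked);
    waits_done++;
}

/* Groups the waits in one helper, mirroring the name in the commit. */
static void wait_for_outstanding_hw_updates(void)
{
    wait_for_idle();
}

void program_pipes(void)
{
    /* Wait first, outside the locked region. */
    wait_for_outstanding_hw_updates();

    hw_lock();
    /* ... program registers while locked, without waiting ... */
    hw_unlock();
}
```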

drm/amdgpu: register a dirty framebuffer callback for fbcon [+ + +]

Author: Hamza Mahfooz <hamza.mahfooz@amd.com>
Date:   Tue Aug 15 09:13:37 2023 -0400

    drm/amdgpu: register a dirty framebuffer callback for fbcon
    
    commit 0a611560f53bfd489e33f4a718c915f1a6123d03 upstream.
    
    fbcon requires that we implement &drm_framebuffer_funcs.dirty.
    Otherwise, the framebuffer might take a while to flush (which would
    manifest as noticeable lag). However, we can't enable this callback for
    non-fbcon cases since it may cause too many atomic commits to be made at
    once. So, implement amdgpu_dirtyfb() and only enable it for fbcon
    framebuffers (we can use the "struct drm_file file" parameter in the
    callback to check for this since it is only NULL when called by fbcon,
    at least in the mainline kernel) on devices that support atomic KMS.
    
    Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
    Cc: Mario Limonciello <mario.limonciello@amd.com>
    Cc: stable@vger.kernel.org # 6.1+
    Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2519
    Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
    Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
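
The NULL-file check described in the entry above might look roughly like this in isolation. `struct file_handle`, `flush_framebuffer` and `dirtyfb_sketch` are hypothetical stand-ins, and returning -ENOSYS for non-fbcon callers is an assumption, not something the commit message states.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct file_handle { int id; };   /* stand-in for struct drm_file */

static int flushes;               /* counts simulated flushes */

/* Stand-in for the helper that actually flushes the framebuffer. */
static int flush_framebuffer(void)
{
    flushes++;
    return 0;
}

/*
 * Dirty callback sketch: only the fbcon path (file == NULL) flushes;
 * callers that pass a file handle are refused.
 */
int dirtyfb_sketch(const struct file_handle *file)
{
    if (file != NULL)
        return -ENOSYS;
    return flush_framebuffer();
}

/* Usage: fbcon passes NULL, a userspace client passes its file. */
void dirtyfb_example(void)
{
    struct file_handle client = { 1 };

    assert(dirtyfb_sketch(NULL) == 0);
    assert(dirtyfb_sketch(&client) == -ENOSYS);
}
```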

drm/ast: Fix DRAM init on AST2200 [+ + +]

Author: Thomas Zimmermann <tzimmermann@suse.de>
Date:   Wed Jun 21 14:53:35 2023 +0200

    drm/ast: Fix DRAM init on AST2200
    
    commit 4cfe75f0f14f044dae66ad0e6eea812d038465d9 upstream.
    
    Fix the test for the AST2200 in the DRAM initialization. The value
    in ast->chip has to be compared against an enum constant instead of
    a numerical value.
    
    This bug got introduced when the driver was first imported into the
    kernel.
    
    Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
    Fixes: 312fec1405dd ("drm: Initial KMS driver for AST (ASpeed Technologies) 2000 series (v2)")
    Cc: Dave Airlie <airlied@redhat.com>
    Cc: dri-devel@lists.freedesktop.org
    Cc: <stable@vger.kernel.org> # v3.5+
    Reviewed-by: Sui Jingfeng <suijingfeng@loongson.cn>
    Reviewed-by: Jocelyn Falempe <jfalempe@redhat.com>
    Tested-by: Jocelyn Falempe <jfalempe@redhat.com> # AST2600
    Link: https://patchwork.freedesktop.org/patch/msgid/20230621130032.3568-2-tzimmermann@suse.de
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
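
To illustrate why comparing an enum field against a model number fails, here is a minimal sketch. The enumeration and its values are hypothetical; the real driver's `ast->chip` constants may be defined differently.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical chip enumeration; the real driver's values may differ. */
enum chip_id {
    CHIP_AST2000,
    CHIP_AST2100,
    CHIP_AST2200,
};

/* The enumerator is a small index, not the model number. */
static_assert(CHIP_AST2200 != 2200, "enum value is not the model number");

/* Buggy test: compares the enum against the model number itself. */
bool is_ast2200_buggy(enum chip_id chip)
{
    return (int)chip == 2200; /* never true for the enumerators above */
}

/* Fixed test: compares against the enum constant. */
bool is_ast2200_fixed(enum chip_id chip)
{
    return chip == CHIP_AST2200;
}
```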

drm/i915/gvt: Drop unused helper intel_vgpu_reset_gtt() [+ + +]

Author: Sean Christopherson <seanjc@google.com>
Date:   Fri Jul 28 18:35:16 2023 -0700

    drm/i915/gvt: Drop unused helper intel_vgpu_reset_gtt()
    
    [ Upstream commit a90c367e5af63880008e21dd199dac839e0e9e0f ]
    
    Drop intel_vgpu_reset_gtt() as it no longer has any callers.  In addition
    to eliminating dead code, this eliminates the last possible scenario where
    __kvmgt_protect_table_find() can be reached without holding vgpu_lock.
    Requiring vgpu_lock to be held when calling __kvmgt_protect_table_find()
    will allow protecting the gfn hash with vgpu_lock without too much fuss.
    
    No functional change intended.
    
    Fixes: ba25d977571e ("drm/i915/gvt: Do not destroy ppgtt_mm during vGPU D3->D0.")
    Reviewed-by: Yan Zhao <yan.y.zhao@intel.com>
    Tested-by: Yongwei Ma <yongwei.ma@intel.com>
    Reviewed-by: Zhi Wang <zhi.a.wang@intel.com>
    Link: https://lore.kernel.org/r/20230729013535.1070024-11-seanjc@google.com
    Signed-off-by: Sean Christopherson <seanjc@google.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/i915/gvt: Put the page reference obtained by KVM's gfn_to_pfn() [+ + +]

Author: Sean Christopherson <seanjc@google.com>
Date:   Fri Jul 28 18:35:11 2023 -0700

    drm/i915/gvt: Put the page reference obtained by KVM's gfn_to_pfn()
    
    [ Upstream commit 708e49583d7da863898b25dafe4bcd799c414278 ]
    
    Put the struct page reference acquired by gfn_to_pfn(); KVM's API is that
    the caller is ultimately responsible for dropping any reference.
    
    Note, kvm_release_pfn_clean() ensures the pfn is actually a refcounted
    struct page before trying to put any references.
    
    Fixes: b901b252b6cf ("drm/i915/gvt: Add 2M huge gtt support")
    Reviewed-by: Yan Zhao <yan.y.zhao@intel.com>
    Tested-by: Yongwei Ma <yongwei.ma@intel.com>
    Reviewed-by: Zhi Wang <zhi.a.wang@intel.com>
    Link: https://lore.kernel.org/r/20230729013535.1070024-6-seanjc@google.com
    Signed-off-by: Sean Christopherson <seanjc@google.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/i915/gvt: Verify pfn is "valid" before dereferencing "struct page" [+ + +]

Author: Sean Christopherson <seanjc@google.com>
Date:   Fri Jul 28 18:35:07 2023 -0700

    drm/i915/gvt: Verify pfn is "valid" before dereferencing "struct page"
    
    [ Upstream commit f046923af79158361295ed4f0a588c80b9fdcc1d ]
    
    Check that the pfn found by gfn_to_pfn() is actually backed by "struct
    page" memory prior to retrieving and dereferencing the page.  KVM
    supports backing guest memory with VM_PFNMAP, VM_IO, etc., and so
    there is no guarantee the pfn returned by gfn_to_pfn() has an associated
    "struct page".
    
    Fixes: b901b252b6cf ("drm/i915/gvt: Add 2M huge gtt support")
    Reviewed-by: Yan Zhao <yan.y.zhao@intel.com>
    Tested-by: Yongwei Ma <yongwei.ma@intel.com>
    Reviewed-by: Zhi Wang <zhi.a.wang@intel.com>
    Link: https://lore.kernel.org/r/20230729013535.1070024-2-seanjc@google.com
    Signed-off-by: Sean Christopherson <seanjc@google.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/i915: mark requests for GuC virtual engines to avoid use-after-free [+ + +]

Author: Andrzej Hajda <andrzej.hajda@intel.com>
Date:   Mon Aug 21 17:30:35 2023 +0200

    drm/i915: mark requests for GuC virtual engines to avoid use-after-free
    
    [ Upstream commit 5eefc5307c983b59344a4cb89009819f580c84fa ]
    
    References to i915_requests may be trapped by userspace inside a
    sync_file or dmabuf (dma-resv) and held indefinitely across different
    processes. To counteract the memory leaks, we try not to keep
    references from the request past their completion.
    On the other hand, on fence release we need to know if rq->engine
    is valid and points to a hw engine (true for non-virtual requests).
    To make this possible, an extra bit has been added to
    rq->execution_mask for marking virtual engines.
    
    Fixes: bcb9aa45d5a0 ("Revert "drm/i915: Hold reference to intel_context over life of i915_request"")
    Signed-off-by: Chris Wilson <chris.p.wilson@linux.intel.com>
    Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>
    Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
    Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20230821153035.3903006-1-andrzej.hajda@intel.com
    (cherry picked from commit 280410677af763f3871b93e794a199cfcf6fb580)
    Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/mxsfb: Disable overlay plane in mxsfb_plane_overlay_atomic_disable() [+ + +]

Author: Liu Ying <victor.liu@nxp.com>
Date:   Mon Jun 12 17:23:59 2023 +0800

    drm/mxsfb: Disable overlay plane in mxsfb_plane_overlay_atomic_disable()
    
    commit aa656d48e871a1b062e1bbf9474d8b831c35074c upstream.
    
    When disabling the overlay plane in mxsfb_plane_overlay_atomic_update(),
    the overlay plane's framebuffer pointer is NULL.  So, dereferencing it
    would cause a kernel Oops (NULL pointer dereference).  Fix the issue by
    disabling the overlay plane in mxsfb_plane_overlay_atomic_disable() instead.
    
    Fixes: cb285a5348e7 ("drm: mxsfb: Replace mxsfb_get_fb_paddr() with drm_fb_cma_get_gem_addr()")
    Cc: stable@vger.kernel.org # 5.19+
    Signed-off-by: Liu Ying <victor.liu@nxp.com>
    Reviewed-by: Marek Vasut <marex@denx.de>
    Signed-off-by: Marek Vasut <marex@denx.de>
    Link: https://patchwork.freedesktop.org/patch/msgid/20230612092359.784115-1-victor.liu@nxp.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/virtio: Conditionally allocate virtio_gpu_fence [+ + +]

Author: Gurchetan Singh <gurchetansingh@chromium.org>
Date:   Fri Jul 7 14:31:24 2023 -0700

    drm/virtio: Conditionally allocate virtio_gpu_fence
    
    commit 70d1ace56db6c79d39dbe9c0d5244452b67e2fde upstream.
    
    We don't want to create a fence for every command submission.  It's
    only necessary when userspace provides a waitable token for submission.
    This could be:
    
    1) bo_handles, to be used with VIRTGPU_WAIT
    2) out_fence_fd, to be used with dma_fence apis
    3) a ring_idx provided with VIRTGPU_CONTEXT_PARAM_POLL_RINGS_MASK
       + DRM event API
    4) syncobjs in the future
    
    One use case is just submitting a command to the host and expecting
    no response.  For example, gfxstream has GFXSTREAM_CONTEXT_PING that
    just wakes up the host side worker threads.  There's also
    CROSS_DOMAIN_CMD_SEND which just sends data to the Wayland server.
    
    This prevents the need to signal the automatically created
    virtio_gpu_fence.
    
    In addition, VIRTGPU_EXECBUF_RING_IDX is checked when creating a
    DRM event object.  VIRTGPU_CONTEXT_PARAM_POLL_RINGS_MASK is
    already defined in terms of per-context rings.  It was theoretically
    possible to create a DRM event on the global timeline (ring_idx == 0),
    if the context enabled DRM event polling.  However, that wouldn't
    work and userspace (Sommelier).  Explicitly disallow it for
    clarity.
    
    Signed-off-by: Gurchetan Singh <gurchetansingh@chromium.org>
    Reviewed-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
    Tested-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
    Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> # edited coding style
    Link: https://patchwork.freedesktop.org/patch/msgid/20230707213124.494-1-gurchetansingh@chromium.org
    Signed-off-by: Alyssa Ross <hi@alyssa.is>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
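
The decision described in the entry above (allocate a fence only when userspace supplied something it can wait on) can be sketched as a predicate. The structure, field names and the 64-ring limit are hypothetical; they are not the virtio-gpu UAPI, and the future syncobj case is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical summary of one command submission. */
struct submit_args {
    uint32_t num_bo_handles;   /* buffers userspace may later wait on */
    bool     wants_out_fence;  /* an out fence fd was requested */
    bool     has_ring_idx;     /* a ring index was supplied */
    uint32_t ring_idx;
};

/* True when the submission carries a waitable token and needs a fence. */
bool submission_needs_fence(const struct submit_args *a,
                            uint64_t poll_rings_mask)
{
    assert(a != NULL);

    if (a->num_bo_handles > 0)
        return true;
    if (a->wants_out_fence)
        return true;
    /* A ring index only matters if that ring is polled for events. */
    if (a->has_ring_idx && a->ring_idx < 64 &&
        (poll_rings_mask & (UINT64_C(1) << a->ring_idx)))
        return true;
    return false;
}
```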

dt-bindings: clock: xlnx,versal-clk: drop select:false [+ + +]

Author: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Date:   Fri Jul 28 18:59:23 2023 +0200

    dt-bindings: clock: xlnx,versal-clk: drop select:false
    
    commit 172044e30b00977784269e8ab72132a48293c654 upstream.
    
    select:false makes the schema basically ignored and not effective, which
    is clearly not what we want for a device binding.
    
    Fixes: 352546805a44 ("dt-bindings: clock: Add bindings for versal clock driver")
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
    Link: https://lore.kernel.org/r/20230728165923.108589-1-krzysztof.kozlowski@linaro.org
    Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
    Reviewed-by: Shubhrajyoti Datta <shubhrajyoti.datta@amd.com>
    Signed-off-by: Stephen Boyd <sboyd@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ext4: add correct group descriptors and reserved GDT blocks to system zone [+ + +]

Author: Wang Jianjian <wangjianjian0@foxmail.com>
Date:   Thu Aug 3 00:28:39 2023 +0800

    ext4: add correct group descriptors and reserved GDT blocks to system zone
    
    commit 68228da51c9a436872a4ef4b5a7692e29f7e5bc7 upstream.
    
    When setup_system_zone() runs, flex_bg is not yet initialized, so it is always 1.
    Use a new helper function, ext4_num_base_meta_blocks() which does not
    depend on sbi->s_log_groups_per_flex being initialized.
    
    [ Squashed two patches in the Link URLs below together into a single
      commit, which is simpler to review/understand.  Also fix checkpatch
      warnings. --TYT ]
    
    Cc: stable@kernel.org
    Signed-off-by: Wang Jianjian <wangjianjian0@foxmail.com>
    Link: https://lore.kernel.org/r/tencent_21AF0D446A9916ED5C51492CC6C9A0A77B05@qq.com
    Link: https://lore.kernel.org/r/tencent_D744D1450CC169AEA77FCF0A64719909ED05@qq.com
    Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ext4: fix memory leaks in ext4_fname_{setup_filename,prepare_lookup} [+ + +]

Author: Luís Henriques <lhenriques@suse.de>
Date:   Thu Aug 3 10:17:13 2023 +0100

    ext4: fix memory leaks in ext4_fname_{setup_filename,prepare_lookup}
    
    commit 7ca4b085f430f3774c3838b3da569ceccd6a0177 upstream.
    
    If the filename casefolding fails, we'll be leaking memory from the
    fscrypt_name struct, namely from the 'crypto_buf.name' member.
    
    Make sure we free it in the error path on both ext4_fname_setup_filename()
    and ext4_fname_prepare_lookup() functions.
    
    Cc: stable@kernel.org
    Fixes: 1ae98e295fa2 ("ext4: optimize match for casefolded encrypted dirs")
    Signed-off-by: Luís Henriques <lhenriques@suse.de>
    Reviewed-by: Eric Biggers <ebiggers@google.com>
    Link: https://lore.kernel.org/r/20230803091713.13239-1-lhenriques@suse.de
    Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
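
A user-space sketch of the error-path cleanup described in the entry above. All names are hypothetical (`struct name_buf` stands in for the fscrypt_name struct, and `casefold` is a fake step that fails on names containing '?'); only the pattern of freeing the buffer when a later step fails is the point.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct name_buf {
    char *crypto_name;   /* stand-in for the crypto_buf.name member */
};

static int live_allocs;  /* tracks outstanding allocations */

static int setup_crypto_name(struct name_buf *nb, const char *src)
{
    nb->crypto_name = malloc(strlen(src) + 1);
    if (!nb->crypto_name)
        return -1;
    strcpy(nb->crypto_name, src);
    live_allocs++;
    return 0;
}

static void free_name(struct name_buf *nb)
{
    if (nb->crypto_name) {
        free(nb->crypto_name);
        nb->crypto_name = NULL;
        live_allocs--;
    }
}

/* Hypothetical casefold step that fails for names containing '?'. */
static int casefold(const char *src)
{
    return strchr(src, '?') ? -1 : 0;
}

int setup_filename(struct name_buf *nb, const char *src)
{
    int err;

    assert(nb != NULL && src != NULL);

    err = setup_crypto_name(nb, src);
    if (err)
        return err;

    err = casefold(src);
    if (err)
        free_name(nb);   /* the error path must release the buffer */
    return err;
}
```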

f2fs: avoid false alarm of circular locking [+ + +]

Author: Jaegeuk Kim <jaegeuk@kernel.org>
Date:   Fri Aug 18 11:34:32 2023 -0700

    f2fs: avoid false alarm of circular locking
    
    commit 5c13e2388bf3426fd69a89eb46e50469e9624e56 upstream.
    
    ======================================================
    WARNING: possible circular locking dependency detected
    6.5.0-rc5-syzkaller-00353-gae545c3283dc #0 Not tainted
    ------------------------------------------------------
    syz-executor273/5027 is trying to acquire lock:
    ffff888077fe1fb0 (&fi->i_sem){+.+.}-{3:3}, at: f2fs_down_write fs/f2fs/f2fs.h:2133 [inline]
    ffff888077fe1fb0 (&fi->i_sem){+.+.}-{3:3}, at: f2fs_add_inline_entry+0x300/0x6f0 fs/f2fs/inline.c:644
    
    but task is already holding lock:
    ffff888077fe07c8 (&fi->i_xattr_sem){.+.+}-{3:3}, at: f2fs_down_read fs/f2fs/f2fs.h:2108 [inline]
    ffff888077fe07c8 (&fi->i_xattr_sem){.+.+}-{3:3}, at: f2fs_add_dentry+0x92/0x230 fs/f2fs/dir.c:783
    
    which lock already depends on the new lock.
    
    the existing dependency chain (in reverse order) is:
    
    -> #1 (&fi->i_xattr_sem){.+.+}-{3:3}:
           down_read+0x9c/0x470 kernel/locking/rwsem.c:1520
           f2fs_down_read fs/f2fs/f2fs.h:2108 [inline]
           f2fs_getxattr+0xb1e/0x12c0 fs/f2fs/xattr.c:532
           __f2fs_get_acl+0x5a/0x900 fs/f2fs/acl.c:179
           f2fs_acl_create fs/f2fs/acl.c:377 [inline]
           f2fs_init_acl+0x15c/0xb30 fs/f2fs/acl.c:420
           f2fs_init_inode_metadata+0x159/0x1290 fs/f2fs/dir.c:558
           f2fs_add_regular_entry+0x79e/0xb90 fs/f2fs/dir.c:740
           f2fs_add_dentry+0x1de/0x230 fs/f2fs/dir.c:788
           f2fs_do_add_link+0x190/0x280 fs/f2fs/dir.c:827
           f2fs_add_link fs/f2fs/f2fs.h:3554 [inline]
           f2fs_mkdir+0x377/0x620 fs/f2fs/namei.c:781
           vfs_mkdir+0x532/0x7e0 fs/namei.c:4117
           do_mkdirat+0x2a9/0x330 fs/namei.c:4140
           __do_sys_mkdir fs/namei.c:4160 [inline]
           __se_sys_mkdir fs/namei.c:4158 [inline]
           __x64_sys_mkdir+0xf2/0x140 fs/namei.c:4158
           do_syscall_x64 arch/x86/entry/common.c:50 [inline]
           do_syscall_64+0x38/0xb0 arch/x86/entry/common.c:80
           entry_SYSCALL_64_after_hwframe+0x63/0xcd
    
    -> #0 (&fi->i_sem){+.+.}-{3:3}:
           check_prev_add kernel/locking/lockdep.c:3142 [inline]
           check_prevs_add kernel/locking/lockdep.c:3261 [inline]
           validate_chain kernel/locking/lockdep.c:3876 [inline]
           __lock_acquire+0x2e3d/0x5de0 kernel/locking/lockdep.c:5144
           lock_acquire kernel/locking/lockdep.c:5761 [inline]
           lock_acquire+0x1ae/0x510 kernel/locking/lockdep.c:5726
           down_write+0x93/0x200 kernel/locking/rwsem.c:1573
           f2fs_down_write fs/f2fs/f2fs.h:2133 [inline]
           f2fs_add_inline_entry+0x300/0x6f0 fs/f2fs/inline.c:644
           f2fs_add_dentry+0xa6/0x230 fs/f2fs/dir.c:784
           f2fs_do_add_link+0x190/0x280 fs/f2fs/dir.c:827
           f2fs_add_link fs/f2fs/f2fs.h:3554 [inline]
           f2fs_mkdir+0x377/0x620 fs/f2fs/namei.c:781
           vfs_mkdir+0x532/0x7e0 fs/namei.c:4117
           ovl_do_mkdir fs/overlayfs/overlayfs.h:196 [inline]
           ovl_mkdir_real+0xb5/0x370 fs/overlayfs/dir.c:146
           ovl_workdir_create+0x3de/0x820 fs/overlayfs/super.c:309
           ovl_make_workdir fs/overlayfs/super.c:711 [inline]
           ovl_get_workdir fs/overlayfs/super.c:864 [inline]
           ovl_fill_super+0xdab/0x6180 fs/overlayfs/super.c:1400
           vfs_get_super+0xf9/0x290 fs/super.c:1152
           vfs_get_tree+0x88/0x350 fs/super.c:1519
           do_new_mount fs/namespace.c:3335 [inline]
           path_mount+0x1492/0x1ed0 fs/namespace.c:3662
           do_mount fs/namespace.c:3675 [inline]
           __do_sys_mount fs/namespace.c:3884 [inline]
           __se_sys_mount fs/namespace.c:3861 [inline]
           __x64_sys_mount+0x293/0x310 fs/namespace.c:3861
           do_syscall_x64 arch/x86/entry/common.c:50 [inline]
           do_syscall_64+0x38/0xb0 arch/x86/entry/common.c:80
           entry_SYSCALL_64_after_hwframe+0x63/0xcd
    
    other info that might help us debug this:
    
     Possible unsafe locking scenario:
    
           CPU0                    CPU1
           ----                    ----
      rlock(&fi->i_xattr_sem);
                                   lock(&fi->i_sem);
                                   lock(&fi->i_xattr_sem);
      lock(&fi->i_sem);
    
    Cc: <stable@vger.kernel.org>
    Reported-and-tested-by: syzbot+e5600587fa9cbf8e3826@syzkaller.appspotmail.com
    Fixes: 5eda1ad1aaff ("f2fs: fix deadlock in i_xattr_sem and inode page lock")
    Tested-by: Guenter Roeck <linux@roeck-us.net>
    Reviewed-by: Chao Yu <chao@kernel.org>
    Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

f2fs: flush inode if atomic file is aborted [+ + +]

Author: Jaegeuk Kim <jaegeuk@kernel.org>
Date:   Fri Jul 7 07:03:13 2023 -0700

    f2fs: flush inode if atomic file is aborted
    
    commit a3ab55746612247ce3dcaac6de66f5ffc055b9df upstream.
    
    Let's flush the inode whose atomic operation is being aborted, to avoid
    a stale dirty inode during eviction in this call stack:
    
      f2fs_mark_inode_dirty_sync+0x22/0x40 [f2fs]
      f2fs_abort_atomic_write+0xc4/0xf0 [f2fs]
      f2fs_evict_inode+0x3f/0x690 [f2fs]
      ? sugov_start+0x140/0x140
      evict+0xc3/0x1c0
      evict_inodes+0x17b/0x210
      generic_shutdown_super+0x32/0x120
      kill_block_super+0x21/0x50
      deactivate_locked_super+0x31/0x90
      cleanup_mnt+0x100/0x160
      task_work_run+0x59/0x90
      do_exit+0x33b/0xa50
      do_group_exit+0x2d/0x80
      __x64_sys_exit_group+0x14/0x20
      do_syscall_64+0x3b/0x90
      entry_SYSCALL_64_after_hwframe+0x63/0xcd
    
    This triggers f2fs_bug_on() in f2fs_evict_inode:
     f2fs_bug_on(sbi, is_inode_flag_set(inode, FI_DIRTY_INODE));
    
    This fixes the syzbot report:
    
    loop0: detected capacity change from 0 to 131072
    F2FS-fs (loop0): invalid crc value
    F2FS-fs (loop0): Found nat_bits in checkpoint
    F2FS-fs (loop0): Mounted with checkpoint version = 48b305e4
    ------------[ cut here ]------------
    kernel BUG at fs/f2fs/inode.c:869!
    invalid opcode: 0000 [#1] PREEMPT SMP KASAN
    CPU: 0 PID: 5014 Comm: syz-executor220 Not tainted 6.4.0-syzkaller-11479-g6cd06ab12d1a #0
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/27/2023
    RIP: 0010:f2fs_evict_inode+0x172d/0x1e00 fs/f2fs/inode.c:869
    Code: ff df 48 c1 ea 03 80 3c 02 00 0f 85 6a 06 00 00 8b 75 40 ba 01 00 00 00 4c 89 e7 e8 6d ce 06 00 e9 aa fc ff ff e8 63 22 e2 fd <0f> 0b e8 5c 22 e2 fd 48 c7 c0 a8 3a 18 8d 48 ba 00 00 00 00 00 fc
    RSP: 0018:ffffc90003a6fa00 EFLAGS: 00010293
    RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000
    RDX: ffff8880273b8000 RSI: ffffffff83a2bd0d RDI: 0000000000000007
    RBP: ffff888077db91b0 R08: 0000000000000007 R09: 0000000000000000
    R10: 0000000000000001 R11: 0000000000000001 R12: ffff888029a3c000
    R13: ffff888077db9660 R14: ffff888029a3c0b8 R15: ffff888077db9c50
    FS:  0000000000000000(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 00007f1909bb9000 CR3: 00000000276a9000 CR4: 0000000000350ef0
    Call Trace:
     <TASK>
     evict+0x2ed/0x6b0 fs/inode.c:665
     dispose_list+0x117/0x1e0 fs/inode.c:698
     evict_inodes+0x345/0x440 fs/inode.c:748
     generic_shutdown_super+0xaf/0x480 fs/super.c:478
     kill_block_super+0x64/0xb0 fs/super.c:1417
     kill_f2fs_super+0x2af/0x3c0 fs/f2fs/super.c:4704
     deactivate_locked_super+0x98/0x160 fs/super.c:330
     deactivate_super+0xb1/0xd0 fs/super.c:361
     cleanup_mnt+0x2ae/0x3d0 fs/namespace.c:1254
     task_work_run+0x16f/0x270 kernel/task_work.c:179
     exit_task_work include/linux/task_work.h:38 [inline]
     do_exit+0xa9a/0x29a0 kernel/exit.c:874
     do_group_exit+0xd4/0x2a0 kernel/exit.c:1024
     __do_sys_exit_group kernel/exit.c:1035 [inline]
     __se_sys_exit_group kernel/exit.c:1033 [inline]
     __x64_sys_exit_group+0x3e/0x50 kernel/exit.c:1033
     do_syscall_x64 arch/x86/entry/common.c:50 [inline]
     do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80
     entry_SYSCALL_64_after_hwframe+0x63/0xcd
    RIP: 0033:0x7f309be71a09
    Code: Unable to access opcode bytes at 0x7f309be719df.
    RSP: 002b:00007fff171df518 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
    RAX: ffffffffffffffda RBX: 00007f309bef7330 RCX: 00007f309be71a09
    RDX: 000000000000003c RSI: 00000000000000e7 RDI: 0000000000000001
    RBP: 0000000000000001 R08: ffffffffffffffc0 R09: 00007f309bef1e40
    R10: 0000000000010600 R11: 0000000000000246 R12: 00007f309bef7330
    R13: 0000000000000001 R14: 0000000000000000 R15: 0000000000000001
     </TASK>
    Modules linked in:
    ---[ end trace 0000000000000000 ]---
    RIP: 0010:f2fs_evict_inode+0x172d/0x1e00 fs/f2fs/inode.c:869
    Code: ff df 48 c1 ea 03 80 3c 02 00 0f 85 6a 06 00 00 8b 75 40 ba 01 00 00 00 4c 89 e7 e8 6d ce 06 00 e9 aa fc ff ff e8 63 22 e2 fd <0f> 0b e8 5c 22 e2 fd 48 c7 c0 a8 3a 18 8d 48 ba 00 00 00 00 00 fc
    RSP: 0018:ffffc90003a6fa00 EFLAGS: 00010293
    RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000
    RDX: ffff8880273b8000 RSI: ffffffff83a2bd0d RDI: 0000000000000007
    RBP: ffff888077db91b0 R08: 0000000000000007 R09: 0000000000000000
    R10: 0000000000000001 R11: 0000000000000001 R12: ffff888029a3c000
    R13: ffff888077db9660 R14: ffff888029a3c0b8 R15: ffff888077db9c50
    FS:  0000000000000000(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 00007f1909bb9000 CR3: 00000000276a9000 CR4: 0000000000350ef0
    
    Cc: <stable@vger.kernel.org>
    Reported-and-tested-by: syzbot+e1246909d526a9d470fa@syzkaller.appspotmail.com
    Reviewed-by: Chao Yu <chao@kernel.org>
    Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

fbdev/ep93xx-fb: Do not assign to struct fb_info.dev [+ + +]

Author: Thomas Zimmermann <tzimmermann@suse.de>
Date:   Tue Jun 13 13:06:49 2023 +0200

    fbdev/ep93xx-fb: Do not assign to struct fb_info.dev
    
    commit f90a0e5265b60cdd3c77990e8105f79aa2fac994 upstream.
    
    Do not assign the Linux device to struct fb_info.dev. The call to
    register_framebuffer() initializes the field to the fbdev device.
    Drivers should not override its value.
    
    Fixes a bug where the driver incorrectly decreases the hardware
    device's reference counter and leaks the fbdev device.
    
    v2:
            * add Fixes tag (Dan)
    
    Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
    Fixes: 88017bda96a5 ("ep93xx video driver")
    Cc: <stable@vger.kernel.org> # v2.6.32+
    Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
    Reviewed-by: Sam Ravnborg <sam@ravnborg.org>
    Link: https://patchwork.freedesktop.org/patch/msgid/20230613110953.24176-15-tzimmermann@suse.de
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

fuse: nlookup missing decrement in fuse_direntplus_link [+ + +]

Author: ruanmeisi <ruan.meisi@zte.com.cn>
Date:   Tue Apr 25 19:13:54 2023 +0800

    fuse: nlookup missing decrement in fuse_direntplus_link
    
    commit b8bd342d50cbf606666488488f9fea374aceb2d5 upstream.
    
    During our debugging of glusterfs, we found an Assertion failed error:
    inode_lookup >= nlookup, which was caused by the nlookup value in the
    kernel being greater than that in the FUSE file system.
    
    The issue was introduced by fuse_direntplus_link: in that function,
    fuse_iget increments nlookup, and if d_splice_alias returns failure,
    fuse_direntplus_link returns failure without decrementing nlookup.
    See https://github.com/gluster/glusterfs/pull/4081
    
    Signed-off-by: ruanmeisi <ruan.meisi@zte.com.cn>
    Fixes: 0b05b18381ee ("fuse: implement NFS-like readdirplus support")
    Cc: <stable@vger.kernel.org> # v3.9
    Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
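
The counter imbalance described in the entry above, reduced to a user-space sketch. `struct inode_sketch`, `link_entry` and the `splice_fails` flag are hypothetical; the flag simply simulates a failing d_splice_alias.

```c
#include <assert.h>
#include <stddef.h>

struct inode_sketch {
    unsigned long nlookup;   /* lookups the kernel has accounted for */
};

/* Taking a reference to the inode bumps the lookup count. */
static void iget_sketch(struct inode_sketch *inode)
{
    inode->nlookup++;
}

/*
 * Link one directory entry. If the (simulated) splice step fails, the
 * increment done above is undone so both sides keep the same count.
 */
int link_entry(struct inode_sketch *inode, int splice_fails)
{
    assert(inode != NULL);

    iget_sketch(inode);
    if (splice_fails) {
        inode->nlookup--;   /* the decrement that was missing */
        return -1;
    }
    return 0;
}
```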

gfs2: low-memory forced flush fixes [+ + +]

Author: Andreas Gruenbacher <agruenba@redhat.com>
Date:   Thu Aug 10 17:15:46 2023 +0200

    gfs2: low-memory forced flush fixes
    
    [ Upstream commit b74cd55aa9a9d0aca760028a51343ec79812e410 ]
    
    First, function gfs2_ail_flush_reqd checks the SDF_FORCE_AIL_FLUSH flag
    to determine if an AIL flush should be forced in low-memory situations.
    However, it also immediately clears the flag, and when called repeatedly
    as in function gfs2_logd, the flag will be lost.  Fix that by pulling
    the SDF_FORCE_AIL_FLUSH flag check out of gfs2_ail_flush_reqd.
    
    Second, function gfs2_writepages sets the SDF_FORCE_AIL_FLUSH flag
    whether or not enough pages were written.  If enough pages could be
    written, flushing the AIL is unnecessary, though.
    
    Third, gfs2_writepages doesn't wake up logd after setting the
    SDF_FORCE_AIL_FLUSH flag, so it can take a long time for logd to react.
    It would be preferable to wake up logd, but that hurts the performance
    of some workloads and we don't quite understand why so far, so don't
    wake up logd for now.
    
    Fixes: b066a4eebd4f ("gfs2: forcibly flush ail to relieve memory pressure")
    Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
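
The first problem in the entry above (a check that also clears the flag, so a repeated check loses it) can be shown with a few lines of user-space C. `FLAG_FORCE_FLUSH` and the function names are hypothetical stand-ins for SDF_FORCE_AIL_FLUSH and the gfs2 helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define FLAG_FORCE_FLUSH 0x1u

/* Before: the check clears the flag, so a second call no longer sees it. */
bool flush_required_old(unsigned int *flags)
{
    assert(flags != NULL);
    if (*flags & FLAG_FORCE_FLUSH) {
        *flags &= ~FLAG_FORCE_FLUSH;
        return true;
    }
    return false;
}

/* After: the check is read-only and can be repeated safely. */
bool flush_required_new(unsigned int flags)
{
    return (flags & FLAG_FORCE_FLUSH) != 0;
}

/* The flag is cleared once, at the point where the flush is performed. */
void do_flush(unsigned int *flags)
{
    assert(flags != NULL);
    *flags &= ~FLAG_FORCE_FLUSH;
}
```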

gfs2: Switch to wait_event in gfs2_logd [+ + +]

Author: Andreas Gruenbacher <agruenba@redhat.com>
Date:   Thu Aug 17 15:46:16 2023 +0200

    gfs2: Switch to wait_event in gfs2_logd
    
    [ Upstream commit 6df373b09b1dcf2f7d579f515f653f89a896d417 ]
    
    In gfs2_logd(), switch from an open-coded wait loop to
    wait_event_interruptible_timeout().
    
    Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
    Stable-dep-of: b74cd55aa9a9 ("gfs2: low-memory forced flush fixes")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

gve: fix frag_list chaining [+ + +]

Author: Eric Dumazet <edumazet@google.com>
Date:   Thu Aug 31 21:38:12 2023 +0000

    gve: fix frag_list chaining
    
    [ Upstream commit 817c7cd2043a83a3d8147f40eea1505ac7300b62 ]
    
    gve_rx_append_frags() is able to build skbs chained with frag_list,
    like GRO engine.
    
    Problem is that shinfo->frag_list should only be used
    for the head of the chain.
    
    All other links should use skb->next pointer.
    
    Otherwise, built skbs are not valid and can cause crashes.
    
    Equivalent code in GRO (skb_gro_receive()) is:
    
        if (NAPI_GRO_CB(p)->last == p)
            skb_shinfo(p)->frag_list = skb;
        else
            NAPI_GRO_CB(p)->last->next = skb;
        NAPI_GRO_CB(p)->last = skb;
    
    Fixes: 9b8dd5e5ea48 ("gve: DQO: Add RX path")
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Cc: Bailey Forrest <bcf@google.com>
    Cc: Willem de Bruijn <willemb@google.com>
    Cc: Catherine Sullivan <csully@google.com>
    Reviewed-by: David Ahern <dsahern@kernel.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
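
The chaining rule quoted in the entry above can be tried out in user space with plain structs. `struct buf` and `struct chain` are hypothetical stand-ins for sk_buff and the per-packet state; only the head uses frag_list, and every later buffer is linked through next.

```c
#include <assert.h>
#include <stddef.h>

struct buf {
    struct buf *next;        /* link between chained buffers */
    struct buf *frag_list;   /* used on the head of the chain only */
};

struct chain {
    struct buf *head;
    struct buf *last;        /* most recently appended buffer, or head */
};

/* Append b to the chain, following the same rule as the GRO snippet. */
void chain_append(struct chain *c, struct buf *b)
{
    assert(c != NULL && c->head != NULL && b != NULL);

    b->next = NULL;
    if (c->last == c->head)
        c->head->frag_list = b;   /* first extra buffer hangs off the head */
    else
        c->last->next = b;        /* later buffers are linked via next */
    c->last = b;
}
```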

hsr: Fix uninit-value access in fill_frame_info() [+ + +]

Author: Ziyang Xuan <william.xuanziyang@huawei.com>
Date:   Fri Sep 8 18:17:52 2023 +0800

    hsr: Fix uninit-value access in fill_frame_info()
    
    [ Upstream commit 484b4833c604c0adcf19eac1ca14b60b757355b5 ]
    
    Syzbot reports the following uninit-value access problem.
    
    =====================================================
    BUG: KMSAN: uninit-value in fill_frame_info net/hsr/hsr_forward.c:601 [inline]
    BUG: KMSAN: uninit-value in hsr_forward_skb+0x9bd/0x30f0 net/hsr/hsr_forward.c:616
     fill_frame_info net/hsr/hsr_forward.c:601 [inline]
     hsr_forward_skb+0x9bd/0x30f0 net/hsr/hsr_forward.c:616
     hsr_dev_xmit+0x192/0x330 net/hsr/hsr_device.c:223
     __netdev_start_xmit include/linux/netdevice.h:4889 [inline]
     netdev_start_xmit include/linux/netdevice.h:4903 [inline]
     xmit_one net/core/dev.c:3544 [inline]
     dev_hard_start_xmit+0x247/0xa10 net/core/dev.c:3560
     __dev_queue_xmit+0x34d0/0x52a0 net/core/dev.c:4340
     dev_queue_xmit include/linux/netdevice.h:3082 [inline]
     packet_xmit+0x9c/0x6b0 net/packet/af_packet.c:276
     packet_snd net/packet/af_packet.c:3087 [inline]
     packet_sendmsg+0x8b1d/0x9f30 net/packet/af_packet.c:3119
     sock_sendmsg_nosec net/socket.c:730 [inline]
     sock_sendmsg net/socket.c:753 [inline]
     __sys_sendto+0x781/0xa30 net/socket.c:2176
     __do_sys_sendto net/socket.c:2188 [inline]
     __se_sys_sendto net/socket.c:2184 [inline]
     __ia32_sys_sendto+0x11f/0x1c0 net/socket.c:2184
     do_syscall_32_irqs_on arch/x86/entry/common.c:112 [inline]
     __do_fast_syscall_32+0xa2/0x100 arch/x86/entry/common.c:178
     do_fast_syscall_32+0x37/0x80 arch/x86/entry/common.c:203
     do_SYSENTER_32+0x1f/0x30 arch/x86/entry/common.c:246
     entry_SYSENTER_compat_after_hwframe+0x70/0x82
    
    Uninit was created at:
     slab_post_alloc_hook+0x12f/0xb70 mm/slab.h:767
     slab_alloc_node mm/slub.c:3478 [inline]
     kmem_cache_alloc_node+0x577/0xa80 mm/slub.c:3523
     kmalloc_reserve+0x148/0x470 net/core/skbuff.c:559
     __alloc_skb+0x318/0x740 net/core/skbuff.c:644
     alloc_skb include/linux/skbuff.h:1286 [inline]
     alloc_skb_with_frags+0xc8/0xbd0 net/core/skbuff.c:6299
     sock_alloc_send_pskb+0xa80/0xbf0 net/core/sock.c:2794
     packet_alloc_skb net/packet/af_packet.c:2936 [inline]
     packet_snd net/packet/af_packet.c:3030 [inline]
     packet_sendmsg+0x70e8/0x9f30 net/packet/af_packet.c:3119
     sock_sendmsg_nosec net/socket.c:730 [inline]
     sock_sendmsg net/socket.c:753 [inline]
     __sys_sendto+0x781/0xa30 net/socket.c:2176
     __do_sys_sendto net/socket.c:2188 [inline]
     __se_sys_sendto net/socket.c:2184 [inline]
     __ia32_sys_sendto+0x11f/0x1c0 net/socket.c:2184
     do_syscall_32_irqs_on arch/x86/entry/common.c:112 [inline]
     __do_fast_syscall_32+0xa2/0x100 arch/x86/entry/common.c:178
     do_fast_syscall_32+0x37/0x80 arch/x86/entry/common.c:203
     do_SYSENTER_32+0x1f/0x30 arch/x86/entry/common.c:246
     entry_SYSENTER_compat_after_hwframe+0x70/0x82
    
    This happens because VLAN is not yet supported in the hsr driver. To fix
    it, return an error from fill_frame_info() when the protocol is ETH_P_8021Q.
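
A rough user-space sketch of this kind of check follows. check_frame_proto() is a hypothetical helper, not the kernel function, and the choice of -EINVAL as the error code is an assumption. Only the 0x8100 EtherType value is taken from ETH_P_8021Q.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define ETHERTYPE_VLAN_8021Q 0x8100u /* same value as ETH_P_8021Q */
#define ETH_HEADER_LEN       14u     /* dst MAC + src MAC + EtherType */

/*
 * Hypothetical helper: reject frames that are too short to carry an
 * Ethernet header, or whose EtherType announces a VLAN tag.
 */
static int check_frame_proto(const uint8_t *frame, size_t len)
{
    uint16_t proto;

    if (len < ETH_HEADER_LEN)
        return -EINVAL;

    /* The EtherType is stored big-endian at byte offset 12. */
    proto = (uint16_t)((frame[12] << 8) | frame[13]);
    if (proto == ETHERTYPE_VLAN_8021Q)
        return -EINVAL; /* unsupported: do not parse any further */

    return 0;
}
```

Bailing out early keeps later parsing code from reading header bytes that were never initialized for a layout it does not understand.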
    
    Fixes: 451d8123f897 ("net: prp: add packet handling support")
    Reported-by: syzbot+bf7e6250c7ce248f3ec9@syzkaller.appspotmail.com
    Closes: https://syzkaller.appspot.com/bug?extid=bf7e6250c7ce248f3ec9
    Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

hwspinlock: qcom: add missing regmap config for SFPB MMIO implementation [+ + +]

Author: Christian Marangi <ansuelsmth@gmail.com>
Date:   Sun Jul 16 04:28:04 2023 +0200

    hwspinlock: qcom: add missing regmap config for SFPB MMIO implementation
    
    commit 23316be8a9d450f33a21f1efe7d89570becbec58 upstream.
    
    Commit 5d4753f741d8 ("hwspinlock: qcom: add support for MMIO on older
    SoCs") introduced and made regmap_config mandatory in the of_data struct
    but didn't add the regmap_config for sfpb based devices.
    
    SFPB based devices can be probed either the legacy syscon way or the
    new MMIO way. Devices that use the MMIO way are currently broken: they
    lack the definition of the now required regmap_config, so probing always
    returns -EINVAL (and indirectly fails probing of everything that
    depends on it, smem, nandc with smem-parser...)
    
    Fix this by adding the missing regmap_config, restoring hwspinlock
    functionality on SFPB based devices with the MMIO implementation.
    
    Cc: stable@vger.kernel.org
    Fixes: 5d4753f741d8 ("hwspinlock: qcom: add support for MMIO on older SoCs")
    Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
    Link: https://lore.kernel.org/r/20230716022804.21239-1-ansuelsmth@gmail.com
    Signed-off-by: Bjorn Andersson <andersson@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

idr: fix param name in idr_alloc_cyclic() doc [+ + +]

Author: Ariel Marcovitch <arielmarcovitch@gmail.com>
Date:   Sat Aug 26 20:33:17 2023 +0300

    idr: fix param name in idr_alloc_cyclic() doc
    
    [ Upstream commit 2a15de80dd0f7e04a823291aa9eb49c5294f56af ]
    
    The relevant parameter is 'start' and not 'nextid'
    
    Fixes: 460488c58ca8 ("idr: Remove idr_alloc_ext")
    Signed-off-by: Ariel Marcovitch <arielmarcovitch@gmail.com>
    Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

igb: Change IGB_MIN to allow set rx/tx value between 64 and 80 [+ + +]

Author: Olga Zaborska <olga.zaborska@intel.com>
Date:   Tue Jul 25 10:10:58 2023 +0200

    igb: Change IGB_MIN to allow set rx/tx value between 64 and 80
    
    [ Upstream commit 6319685bdc8ad5310890add907b7c42f89302886 ]
    
    Change the minimum value of RX/TX descriptors to 64 to enable setting the rx/tx
    value between 64 and 80. All igb devices can use as low as 64 descriptors.
    This change will unify igb with other drivers.
    Based on commit 7b1be1987c1e ("e1000e: lower ring minimum size to 64")
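
The effect of the lower bound can be sketched as a clamp on the requested ring size. The constants and ring_size_clamp() below are illustrative assumptions and are not copied from the driver headers. Only the new minimum of 64 comes from the text above.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative limits; only the minimum of 64 is taken from the commit. */
#define RING_MIN_DESC   64u
#define RING_MAX_DESC   4096u
#define RING_DESC_ALIGN 8u

/* Clamp a requested descriptor count, then round up to the alignment. */
static uint32_t ring_size_clamp(uint32_t requested)
{
    uint32_t n = requested;

    if (n > RING_MAX_DESC)
        n = RING_MAX_DESC;
    if (n < RING_MIN_DESC)
        n = RING_MIN_DESC;

    return (n + RING_DESC_ALIGN - 1) / RING_DESC_ALIGN * RING_DESC_ALIGN;
}
```

With a minimum of 80, a request for 64 descriptors would be silently raised to 80. With the minimum at 64, the request is honoured as given.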
    
    Fixes: 9d5c824399de ("igb: PCI-Express 82575 Gigabit Ethernet driver")
    Signed-off-by: Olga Zaborska <olga.zaborska@intel.com>
    Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
    Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

igb: disable virtualization features on 82580 [+ + +]

Author: Corinna Vinschen <vinschen@redhat.com>
Date:   Thu Aug 31 14:19:13 2023 +0200

    igb: disable virtualization features on 82580
    
    [ Upstream commit fa09bc40b21a33937872c4c4cf0f266ec9fa4869 ]
    
    Disable virtualization features on 82580 just as on i210/i211.
    This prevents virt functions from being accidentally called on 82580.
    
    Fixes: 55cac248caa4 ("igb: Add full support for 82580 devices")
    Signed-off-by: Corinna Vinschen <vinschen@redhat.com>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

igbvf: Change IGBVF_MIN to allow set rx/tx value between 64 and 80 [+ + +]

Author: Olga Zaborska <olga.zaborska@intel.com>
Date:   Tue Jul 25 10:10:57 2023 +0200

    igbvf: Change IGBVF_MIN to allow set rx/tx value between 64 and 80
    
    [ Upstream commit 8360717524a24a421c36ef8eb512406dbd42160a ]
    
    Change the minimum value of RX/TX descriptors to 64 to enable setting the rx/tx
    value between 64 and 80. All igbvf devices can use as low as 64 descriptors.
    This change will unify igbvf with other drivers.
    Based on commit 7b1be1987c1e ("e1000e: lower ring minimum size to 64")
    
    Fixes: d4e0fe01a38a ("igbvf: add new driver to support 82576 virtual functions")
    Signed-off-by: Olga Zaborska <olga.zaborska@intel.com>
    Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
    Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

igc: Change IGC_MIN to allow set rx/tx value between 64 and 80 [+ + +]

Author: Olga Zaborska <olga.zaborska@intel.com>
Date:   Tue Jul 25 10:10:56 2023 +0200

    igc: Change IGC_MIN to allow set rx/tx value between 64 and 80
    
    [ Upstream commit 5aa48279712e1f134aac908acde4df798955a955 ]
    
    Change the minimum value of RX/TX descriptors to 64 to enable setting the rx/tx
    value between 64 and 80. All igc devices can use as low as 64 descriptors.
    This change will unify igc with other drivers.
    Based on commit 7b1be1987c1e ("e1000e: lower ring minimum size to 64")
    
    Fixes: 0507ef8a0372 ("igc: Add transmit and receive fastpath and interrupt handlers")
    Signed-off-by: Olga Zaborska <olga.zaborska@intel.com>
    Tested-by: Naama Meir <naamax.meir@linux.intel.com>
    Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

Input: iqs7222 - configure power mode before triggering ATI [+ + +]

Author: Jeff LaBundy <jeff@labundy.com>
Date:   Sun Jul 9 12:06:37 2023 -0500

    Input: iqs7222 - configure power mode before triggering ATI
    
    [ Upstream commit 2e00b8bf5624767f6be7427b6eb532524793463e ]
    
    If the device drops into ultra-low-power mode before being placed
    into normal-power mode as part of ATI being triggered, the device
    does not assert any interrupts until the ATI routine is restarted
    two seconds later.
    
    Solve this problem by adopting the vendor's recommendation, which
    calls for the device to be placed into normal-power mode prior to
    being configured and ATI being triggered.
    
    The original implementation followed this sequence, but the order
    was inadvertently changed as part of the resolution of a separate
    erratum.
    
    Fixes: 1e4189d8af27 ("Input: iqs7222 - protect volatile registers")
    Signed-off-by: Jeff LaBundy <jeff@labundy.com>
    Link: https://lore.kernel.org/r/ZKrpHc2Ji9qR25r2@nixie71
    Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

Input: tca6416-keypad - always expect proper IRQ number in i2c client [+ + +]

Author: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Date:   Sun Jul 23 22:30:18 2023 -0700

    Input: tca6416-keypad - always expect proper IRQ number in i2c client
    
    [ Upstream commit 687fe7dfb736b03ab820d172ea5dbfc1ec447135 ]
    
    Remove the option of having the i2c client contain a raw GPIO number
    instead of a proper IRQ number. There are no users of this facility in
    mainline and removing it allows cleaning up the driver code with regard
    to wakeup handling, etc.
    
    Link: https://lore.kernel.org/r/20230724053024.352054-1-dmitry.torokhov@gmail.com
    Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
    Stable-dep-of: cc141c35af87 ("Input: tca6416-keypad - fix interrupt enable disbalance")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

Input: tca6416-keypad - fix interrupt enable disbalance [+ + +]

Author: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Date:   Sun Jul 23 22:30:20 2023 -0700

    Input: tca6416-keypad - fix interrupt enable disbalance
    
    [ Upstream commit cc141c35af873c6796e043adcb820833bd8ef8c5 ]
    
    The driver has been switched to use IRQF_NO_AUTOEN, but in the error
    unwinding and remove paths calls to enable_irq() were left in place, which
    will lead to an incorrect enable counter value.
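
The imbalance can be illustrated with a small counter model. This is a simplified sketch under the assumption that the IRQ core tracks nested disables with a depth counter. struct irq_model and its helpers are hypothetical and do not mirror the kernel's data structures.

```c
#include <assert.h>

/* Hypothetical model of the nested-disable bookkeeping for one IRQ line. */
struct irq_model {
    unsigned int depth;      /* 0 means the line is enabled */
    unsigned int unbalanced; /* enable calls made while already enabled */
};

/* Requesting with a "no auto-enable" flag leaves the line disabled once. */
static void irq_model_request_no_autoen(struct irq_model *m)
{
    m->depth = 1;
    m->unbalanced = 0;
}

static void irq_model_disable(struct irq_model *m)
{
    m->depth++;
}

static void irq_model_enable(struct irq_model *m)
{
    if (m->depth == 0) {
        m->unbalanced++; /* nothing to undo: the counter is now wrong */
        return;
    }
    m->depth--;
}
```

In this model, a leftover enable in an unwind path either re-enables a line that should stay disabled or, if the line is already enabled, is counted as an unbalanced call.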
    
    Fixes: bcd9730a04a1 ("Input: move to use request_irq by IRQF_NO_AUTOEN flag")
    Link: https://lore.kernel.org/r/20230724053024.352054-3-dmitry.torokhov@gmail.com
    Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

io_uring/net: don't overflow multishot accept [+ + +]

Author: Pavel Begunkov <asml.silence@gmail.com>
Date:   Tue Sep 12 14:57:05 2023 +0100

    io_uring/net: don't overflow multishot accept
    
    [ upstream commit 1bfed23349716a7811645336a7ce42c4b8f250bc ]
    
    Don't allow overflowing multishot accept CQEs; we want to limit
    the growth of the overflow list.
    
    Cc: stable@vger.kernel.org
    Fixes: 4e86a2c980137 ("io_uring: implement multishot mode for accept")
    Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
    Link: https://lore.kernel.org/r/7d0d749649244873772623dd7747966f516fe6e2.1691757663.git.asml.silence@gmail.com
    Signed-off-by: Jens Axboe <axboe@kernel.dk>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

io_uring/sqpoll: fix io-wq affinity when IORING_SETUP_SQPOLL is used [+ + +]

Author: Pavel Begunkov <asml.silence@gmail.com>
Date:   Tue Sep 12 14:57:07 2023 +0100

    io_uring/sqpoll: fix io-wq affinity when IORING_SETUP_SQPOLL is used
    
    From: Jens Axboe <axboe@kernel.dk>
    
    [ upstream commit ebdfefc09c6de7897962769bd3e63a2ff443ebf5 ]
    
    If we set up the ring with SQPOLL, then that polling thread has its
    own io-wq setup. This means that if the application uses
    IORING_REGISTER_IOWQ_AFF to set the io-wq affinity, we should not be
    setting it for the invoking task, but rather the sqpoll task.
    
    Add an sqpoll helper that parks the thread and updates the affinity,
    and use that one if we're using SQPOLL.
    
    Fixes: fe76421d1da1 ("io_uring: allow user configurable IO thread CPU affinity")
    Cc: stable@vger.kernel.org # 5.10+
    Link: https://github.com/axboe/liburing/discussions/884
    Signed-off-by: Jens Axboe <axboe@kernel.dk>
    Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

io_uring: always lock in io_apoll_task_func [+ + +]

Author: Pavel Begunkov <asml.silence@gmail.com>
Date:   Tue Sep 12 14:57:03 2023 +0100

    io_uring: always lock in io_apoll_task_func
    
    From: Dylan Yudaken <dylany@meta.com>
    
    [ upstream commit c06c6c5d276707e04cedbcc55625e984922118aa ]
    
    This is required for the failure case (io_req_complete_failed) and is
    missing.
    
    The alternative would be to only lock in the failure path, however all of
    the non-error paths in io_poll_check_events that do not return
    IOU_POLL_NO_ACTION end up locking anyway. The only extraneous lock would
    be for the multishot poll overflowing the CQE ring, however multishot poll
    would probably benefit from being locked as it will allow completions to
    be batched.
    
    So it seems reasonable to lock always.
    
    Signed-off-by: Dylan Yudaken <dylany@meta.com>
    Link: https://lore.kernel.org/r/20221124093559.3780686-3-dylany@meta.com
    Signed-off-by: Jens Axboe <axboe@kernel.dk>
    Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

io_uring: break out of iowq iopoll on teardown [+ + +]

Author: Pavel Begunkov <asml.silence@gmail.com>
Date:   Tue Sep 12 14:57:06 2023 +0100

    io_uring: break out of iowq iopoll on teardown
    
    [ upstream commit 45500dc4e01c167ee063f3dcc22f51ced5b2b1e9 ]
    
    io-wq will retry iopoll even when it failed with -EAGAIN. If that
    races with task exit, which sets TIF_NOTIFY_SIGNAL for all its workers,
    such workers might potentially infinitely spin retrying iopoll again and
    again and each time failing on some allocation / waiting / etc. Don't
    keep spinning if io-wq is dying.
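
The retry pattern and its exit condition can be sketched as follows. This is a user-space model only: struct worker_model, try_issue() and issue_with_retry() are hypothetical names, and the dying flag stands in for whatever state the worker checks during teardown.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical worker state for the model. */
struct worker_model {
    bool dying;            /* set when the owning task is exiting */
    int fail_budget;       /* attempts that still fail; negative = forever */
    unsigned int attempts; /* how many times the operation was tried */
};

/* One attempt: fails with -EAGAIN while the budget is not exhausted. */
static int try_issue(struct worker_model *w)
{
    w->attempts++;
    if (w->fail_budget != 0) {
        if (w->fail_budget > 0)
            w->fail_budget--;
        return -EAGAIN;
    }
    return 0;
}

/* Retry on -EAGAIN, but stop spinning once the worker is being torn down. */
static int issue_with_retry(struct worker_model *w)
{
    int ret;

    for (;;) {
        ret = try_issue(w);
        if (ret != -EAGAIN)
            break;
        if (w->dying)
            break;
    }
    return ret;
}
```

Without the dying check, a worker whose every attempt fails would loop indefinitely. With it, the loop gives up after the first failed attempt during teardown.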
    
    Fixes: 561fb04a6a225 ("io_uring: replace workqueue usage with io-wq")
    Cc: stable@vger.kernel.org
    Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
    Signed-off-by: Jens Axboe <axboe@kernel.dk>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

io_uring: Don't set affinity on a dying sqpoll thread [+ + +]

Author: Pavel Begunkov <asml.silence@gmail.com>
Date:   Tue Sep 12 14:57:08 2023 +0100

    io_uring: Don't set affinity on a dying sqpoll thread
    
    From: Gabriel Krisman Bertazi <krisman@suse.de>
    
    [ upstream commit bd6fc5da4c51107e1e0cec4a3a07963d1dae2c84 ]
    
    Syzbot reported a null-ptr-deref of sqd->thread inside
    io_sqpoll_wq_cpu_affinity.  It turns out the sqd->thread can go away
    from under us during io_uring_register, in case the process gets a
    fatal signal during io_uring_register.
    
    It is not particularly hard to hit the race, and while I am not sure
    this is the exact case hit by syzbot, it solves it.  Finally, checking
    ->thread is enough to close the race because we locked sqd while
    "parking" the thread, thus preventing it from going away.
    
    I reproduced it fairly consistently with a program that does:
    
    int main(void) {
      ...
      io_uring_queue_init(RING_LEN, &ring1, IORING_SETUP_SQPOLL);
      while (1) {
        io_uring_register_iowq_aff(ring, 1, &mask);
      }
    }
    
    Executed in a loop with timeout to trigger SIGTERM:
      while true; do timeout 1 /a.out ; done
    
    This will hit the following BUG() in very few attempts.
    
    BUG: kernel NULL pointer dereference, address: 00000000000007a8
    PGD 800000010e949067 P4D 800000010e949067 PUD 10e46e067 PMD 0
    Oops: 0000 [#1] PREEMPT SMP PTI
    CPU: 0 PID: 15715 Comm: dead-sqpoll Not tainted 6.5.0-rc7-next-20230825-g193296236fa0-dirty #23
    Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
    RIP: 0010:io_sqpoll_wq_cpu_affinity+0x27/0x70
    Code: 90 90 90 0f 1f 44 00 00 55 53 48 8b 9f 98 03 00 00 48 85 db 74 4f
    48 89 df 48 89 f5 e8 e2 f8 ff ff 48 8b 43 38 48 85 c0 74 22 <48> 8b b8
    a8 07 00 00 48 89 ee e8 ba b1 00 00 48 89 df 89 c5 e8 70
    RSP: 0018:ffffb04040ea7e70 EFLAGS: 00010282
    RAX: 0000000000000000 RBX: ffff93c010749e40 RCX: 0000000000000001
    RDX: 0000000000000000 RSI: ffffffffa7653331 RDI: 00000000ffffffff
    RBP: ffffb04040ea7eb8 R08: 0000000000000000 R09: c0000000ffffdfff
    R10: ffff93c01141b600 R11: ffffb04040ea7d18 R12: ffff93c00ea74840
    R13: 0000000000000011 R14: 0000000000000000 R15: ffff93c00ea74800
    FS:  00007fb7c276ab80(0000) GS:ffff93c36f200000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 00000000000007a8 CR3: 0000000111634003 CR4: 0000000000370ef0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    Call Trace:
     <TASK>
     ? __die_body+0x1a/0x60
     ? page_fault_oops+0x154/0x440
     ? do_user_addr_fault+0x174/0x7b0
     ? exc_page_fault+0x63/0x140
     ? asm_exc_page_fault+0x22/0x30
     ? io_sqpoll_wq_cpu_affinity+0x27/0x70
     __io_register_iowq_aff+0x2b/0x60
     __io_uring_register+0x614/0xa70
     __x64_sys_io_uring_register+0xaa/0x1a0
     do_syscall_64+0x3a/0x90
     entry_SYSCALL_64_after_hwframe+0x6e/0xd8
    RIP: 0033:0x7fb7c226fec9
    Code: 2e 00 b8 ca 00 00 00 0f 05 eb a5 66 0f 1f 44 00 00 48 89 f8 48 89
    f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01
    f0 ff ff 73 01 c3 48 8b 0d 97 7f 2d 00 f7 d8 64 89 01 48
    RSP: 002b:00007ffe2c0674f8 EFLAGS: 00000246 ORIG_RAX: 00000000000001ab
    RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fb7c226fec9
    RDX: 00007ffe2c067530 RSI: 0000000000000011 RDI: 0000000000000003
    RBP: 00007ffe2c0675d0 R08: 00007ffe2c067550 R09: 00007ffe2c067550
    R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000000
    R13: 00007ffe2c067750 R14: 0000000000000000 R15: 0000000000000000
     </TASK>
    Modules linked in:
    CR2: 00000000000007a8
    ---[ end trace 0000000000000000 ]---
    
    Reported-by: syzbot+c74fea926a78b8a91042@syzkaller.appspotmail.com
    Fixes: ebdfefc09c6d ("io_uring/sqpoll: fix io-wq affinity when IORING_SETUP_SQPOLL is used")
    Signed-off-by: Gabriel Krisman Bertazi <krisman@suse.de>
    Link: https://lore.kernel.org/r/87v8cybuo6.fsf@suse.de
    Signed-off-by: Jens Axboe <axboe@kernel.dk>
    Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

io_uring: revert "io_uring fix multishot accept ordering" [+ + +]

Author: Pavel Begunkov <asml.silence@gmail.com>
Date:   Tue Sep 12 14:57:04 2023 +0100

    io_uring: revert "io_uring fix multishot accept ordering"
    
    From: Dylan Yudaken <dylany@meta.com>
    
    [ upstream commit 515e26961295bee9da5e26916c27739dca6c10e1 ]
    
    This is no longer needed after commit aa1df3a360a0 ("io_uring: fix CQE
    reordering"), since all reordering is now taken care of.
    
    This reverts commit cbd25748545c ("io_uring: fix multishot accept
    ordering").
    
    Signed-off-by: Dylan Yudaken <dylany@meta.com>
    Link: https://lore.kernel.org/r/20221107125236.260132-2-dylany@meta.com
    Signed-off-by: Jens Axboe <axboe@kernel.dk>
    Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ip_tunnels: use DEV_STATS_INC() [+ + +]

Author: Eric Dumazet <edumazet@google.com>
Date:   Tue Sep 5 13:40:46 2023 +0000

    ip_tunnels: use DEV_STATS_INC()
    
    [ Upstream commit 9b271ebaf9a2c5c566a54bc6cd915962e8241130 ]
    
    syzbot/KCSAN reported data-races in iptunnel_xmit_stats() [1]
    
    This can run from multiple cpus without mutual exclusion.
    
    Adopt SMP safe DEV_STATS_INC() to update dev->stats fields.
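
The idea of an SMP safe counter update can be sketched in user space with C11 atomics. This is an approximation, not the kernel macro: struct tunnel_stats, STATS_INC(), STATS_READ() and account_xmit() are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stats block; the kernel's device stats layout differs. */
struct tunnel_stats {
    atomic_ulong tx_packets;
    atomic_ulong tx_errors;
};

/* Atomic read-modify-write; relaxed ordering is enough for a statistic. */
#define STATS_INC(stats, field) \
    atomic_fetch_add_explicit(&(stats)->field, 1UL, memory_order_relaxed)

#define STATS_READ(stats, field) \
    atomic_load_explicit(&(stats)->field, memory_order_relaxed)

static void tunnel_stats_init(struct tunnel_stats *s)
{
    atomic_init(&s->tx_packets, 0UL);
    atomic_init(&s->tx_errors, 0UL);
}

/* Account one transmit attempt; safe to call from several threads. */
static void account_xmit(struct tunnel_stats *s, int ok)
{
    if (ok)
        STATS_INC(s, tx_packets);
    else
        STATS_INC(s, tx_errors);
}
```

A plain `counter++` is a separate load and store, so two CPUs can read the same old value and one increment is lost. The atomic add performs the update as a single indivisible step.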
    
    [1]
    BUG: KCSAN: data-race in iptunnel_xmit / iptunnel_xmit
    
    read-write to 0xffff8881353df170 of 8 bytes by task 30263 on cpu 1:
    iptunnel_xmit_stats include/net/ip_tunnels.h:493 [inline]
    iptunnel_xmit+0x432/0x4a0 net/ipv4/ip_tunnel_core.c:87
    ip_tunnel_xmit+0x1477/0x1750 net/ipv4/ip_tunnel.c:831
    __gre_xmit net/ipv4/ip_gre.c:469 [inline]
    ipgre_xmit+0x516/0x570 net/ipv4/ip_gre.c:662
    __netdev_start_xmit include/linux/netdevice.h:4889 [inline]
    netdev_start_xmit include/linux/netdevice.h:4903 [inline]
    xmit_one net/core/dev.c:3544 [inline]
    dev_hard_start_xmit+0x11b/0x3f0 net/core/dev.c:3560
    __dev_queue_xmit+0xeee/0x1de0 net/core/dev.c:4340
    dev_queue_xmit include/linux/netdevice.h:3082 [inline]
    __bpf_tx_skb net/core/filter.c:2129 [inline]
    __bpf_redirect_no_mac net/core/filter.c:2159 [inline]
    __bpf_redirect+0x723/0x9c0 net/core/filter.c:2182
    ____bpf_clone_redirect net/core/filter.c:2453 [inline]
    bpf_clone_redirect+0x16c/0x1d0 net/core/filter.c:2425
    ___bpf_prog_run+0xd7d/0x41e0 kernel/bpf/core.c:1954
    __bpf_prog_run512+0x74/0xa0 kernel/bpf/core.c:2195
    bpf_dispatcher_nop_func include/linux/bpf.h:1181 [inline]
    __bpf_prog_run include/linux/filter.h:609 [inline]
    bpf_prog_run include/linux/filter.h:616 [inline]
    bpf_test_run+0x15d/0x3d0 net/bpf/test_run.c:423
    bpf_prog_test_run_skb+0x77b/0xa00 net/bpf/test_run.c:1045
    bpf_prog_test_run+0x265/0x3d0 kernel/bpf/syscall.c:3996
    __sys_bpf+0x3af/0x780 kernel/bpf/syscall.c:5353
    __do_sys_bpf kernel/bpf/syscall.c:5439 [inline]
    __se_sys_bpf kernel/bpf/syscall.c:5437 [inline]
    __x64_sys_bpf+0x43/0x50 kernel/bpf/syscall.c:5437
    do_syscall_x64 arch/x86/entry/common.c:50 [inline]
    do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
    entry_SYSCALL_64_after_hwframe+0x63/0xcd
    
    read-write to 0xffff8881353df170 of 8 bytes by task 30249 on cpu 0:
    iptunnel_xmit_stats include/net/ip_tunnels.h:493 [inline]
    iptunnel_xmit+0x432/0x4a0 net/ipv4/ip_tunnel_core.c:87
    ip_tunnel_xmit+0x1477/0x1750 net/ipv4/ip_tunnel.c:831
    __gre_xmit net/ipv4/ip_gre.c:469 [inline]
    ipgre_xmit+0x516/0x570 net/ipv4/ip_gre.c:662
    __netdev_start_xmit include/linux/netdevice.h:4889 [inline]
    netdev_start_xmit include/linux/netdevice.h:4903 [inline]
    xmit_one net/core/dev.c:3544 [inline]
    dev_hard_start_xmit+0x11b/0x3f0 net/core/dev.c:3560
    __dev_queue_xmit+0xeee/0x1de0 net/core/dev.c:4340
    dev_queue_xmit include/linux/netdevice.h:3082 [inline]
    __bpf_tx_skb net/core/filter.c:2129 [inline]
    __bpf_redirect_no_mac net/core/filter.c:2159 [inline]
    __bpf_redirect+0x723/0x9c0 net/core/filter.c:2182
    ____bpf_clone_redirect net/core/filter.c:2453 [inline]
    bpf_clone_redirect+0x16c/0x1d0 net/core/filter.c:2425
    ___bpf_prog_run+0xd7d/0x41e0 kernel/bpf/core.c:1954
    __bpf_prog_run512+0x74/0xa0 kernel/bpf/core.c:2195
    bpf_dispatcher_nop_func include/linux/bpf.h:1181 [inline]
    __bpf_prog_run include/linux/filter.h:609 [inline]
    bpf_prog_run include/linux/filter.h:616 [inline]
    bpf_test_run+0x15d/0x3d0 net/bpf/test_run.c:423
    bpf_prog_test_run_skb+0x77b/0xa00 net/bpf/test_run.c:1045
    bpf_prog_test_run+0x265/0x3d0 kernel/bpf/syscall.c:3996
    __sys_bpf+0x3af/0x780 kernel/bpf/syscall.c:5353
    __do_sys_bpf kernel/bpf/syscall.c:5439 [inline]
    __se_sys_bpf kernel/bpf/syscall.c:5437 [inline]
    __x64_sys_bpf+0x43/0x50 kernel/bpf/syscall.c:5437
    do_syscall_x64 arch/x86/entry/common.c:50 [inline]
    do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
    entry_SYSCALL_64_after_hwframe+0x63/0xcd
    
    value changed: 0x0000000000018830 -> 0x0000000000018831
    
    Reported by Kernel Concurrency Sanitizer on:
    CPU: 0 PID: 30249 Comm: syz-executor.4 Not tainted 6.5.0-syzkaller-11704-g3f86ed6ec0b3 #0
    
    Fixes: 039f50629b7f ("ip_tunnel: Move stats update to iptunnel_xmit()")
    Reported-by: syzbot <syzkaller@googlegroups.com>
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

ipv4: annotate data-races around fi->fib_dead [+ + +]

Author: Eric Dumazet <edumazet@google.com>
Date:   Wed Aug 30 09:55:20 2023 +0000

    ipv4: annotate data-races around fi->fib_dead
    
    [ Upstream commit fce92af1c29d90184dfec638b5738831097d66e9 ]
    
    syzbot complained about a data-race in fib_table_lookup() [1]
    
    Add appropriate annotations to document it.
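
Such annotations can be approximated in user space with volatile accesses. The macros below are simplified stand-ins for READ_ONCE()/WRITE_ONCE(), whose kernel definitions are more elaborate. struct route_info and its helpers are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified approximations: force exactly one access of a byte. */
#define READ_ONCE_U8(x)     (*(const volatile unsigned char *)&(x))
#define WRITE_ONCE_U8(x, v) (*(volatile unsigned char *)&(x) = (v))

/* Hypothetical route entry with a lockless "dead" marker. */
struct route_info {
    unsigned char dead;
};

/* Writer side: mark the entry dead. */
static void route_mark_dead(struct route_info *ri)
{
    WRITE_ONCE_U8(ri->dead, 1);
}

/* Reader side: lockless check, tolerant of a concurrent writer. */
static bool route_usable(const struct route_info *ri)
{
    return !READ_ONCE_U8(ri->dead);
}
```

The annotation does not add locking. It documents that the race is intentional and keeps the compiler from tearing, merging or repeating the access.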
    
    [1]
    BUG: KCSAN: data-race in fib_release_info / fib_table_lookup
    
    write to 0xffff888150f31744 of 1 bytes by task 1189 on cpu 0:
    fib_release_info+0x3a0/0x460 net/ipv4/fib_semantics.c:281
    fib_table_delete+0x8d2/0x900 net/ipv4/fib_trie.c:1777
    fib_magic+0x1c1/0x1f0 net/ipv4/fib_frontend.c:1106
    fib_del_ifaddr+0x8cf/0xa60 net/ipv4/fib_frontend.c:1317
    fib_inetaddr_event+0x77/0x200 net/ipv4/fib_frontend.c:1448
    notifier_call_chain kernel/notifier.c:93 [inline]
    blocking_notifier_call_chain+0x90/0x200 kernel/notifier.c:388
    __inet_del_ifa+0x4df/0x800 net/ipv4/devinet.c:432
    inet_del_ifa net/ipv4/devinet.c:469 [inline]
    inetdev_destroy net/ipv4/devinet.c:322 [inline]
    inetdev_event+0x553/0xaf0 net/ipv4/devinet.c:1606
    notifier_call_chain kernel/notifier.c:93 [inline]
    raw_notifier_call_chain+0x6b/0x1c0 kernel/notifier.c:461
    call_netdevice_notifiers_info net/core/dev.c:1962 [inline]
    call_netdevice_notifiers_mtu+0xd2/0x130 net/core/dev.c:2037
    dev_set_mtu_ext+0x30b/0x3e0 net/core/dev.c:8673
    do_setlink+0x5be/0x2430 net/core/rtnetlink.c:2837
    rtnl_setlink+0x255/0x300 net/core/rtnetlink.c:3177
    rtnetlink_rcv_msg+0x807/0x8c0 net/core/rtnetlink.c:6445
    netlink_rcv_skb+0x126/0x220 net/netlink/af_netlink.c:2549
    rtnetlink_rcv+0x1c/0x20 net/core/rtnetlink.c:6463
    netlink_unicast_kernel net/netlink/af_netlink.c:1339 [inline]
    netlink_unicast+0x56f/0x640 net/netlink/af_netlink.c:1365
    netlink_sendmsg+0x665/0x770 net/netlink/af_netlink.c:1914
    sock_sendmsg_nosec net/socket.c:725 [inline]
    sock_sendmsg net/socket.c:748 [inline]
    sock_write_iter+0x1aa/0x230 net/socket.c:1129
    do_iter_write+0x4b4/0x7b0 fs/read_write.c:860
    vfs_writev+0x1a8/0x320 fs/read_write.c:933
    do_writev+0xf8/0x220 fs/read_write.c:976
    __do_sys_writev fs/read_write.c:1049 [inline]
    __se_sys_writev fs/read_write.c:1046 [inline]
    __x64_sys_writev+0x45/0x50 fs/read_write.c:1046
    do_syscall_x64 arch/x86/entry/common.c:50 [inline]
    do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
    entry_SYSCALL_64_after_hwframe+0x63/0xcd
    
    read to 0xffff888150f31744 of 1 bytes by task 21839 on cpu 1:
    fib_table_lookup+0x2bf/0xd50 net/ipv4/fib_trie.c:1585
    fib_lookup include/net/ip_fib.h:383 [inline]
    ip_route_output_key_hash_rcu+0x38c/0x12c0 net/ipv4/route.c:2751
    ip_route_output_key_hash net/ipv4/route.c:2641 [inline]
    __ip_route_output_key include/net/route.h:134 [inline]
    ip_route_output_flow+0xa6/0x150 net/ipv4/route.c:2869
    send4+0x1e7/0x500 drivers/net/wireguard/socket.c:61
    wg_socket_send_skb_to_peer+0x94/0x130 drivers/net/wireguard/socket.c:175
    wg_socket_send_buffer_to_peer+0xd6/0x100 drivers/net/wireguard/socket.c:200
    wg_packet_send_handshake_initiation drivers/net/wireguard/send.c:40 [inline]
    wg_packet_handshake_send_worker+0x10c/0x150 drivers/net/wireguard/send.c:51
    process_one_work+0x434/0x860 kernel/workqueue.c:2600
    worker_thread+0x5f2/0xa10 kernel/workqueue.c:2751
    kthread+0x1d7/0x210 kernel/kthread.c:389
    ret_from_fork+0x2e/0x40 arch/x86/kernel/process.c:145
    ret_from_fork_asm+0x11/0x20 arch/x86/entry/entry_64.S:304
    
    value changed: 0x00 -> 0x01
    
    Reported by Kernel Concurrency Sanitizer on:
    CPU: 1 PID: 21839 Comm: kworker/u4:18 Tainted: G W 6.5.0-syzkaller #0
    
    Fixes: dccd9ecc3744 ("ipv4: Do not use dead fib_info entries.")
    Reported-by: syzbot <syzkaller@googlegroups.com>
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Reviewed-by: David Ahern <dsahern@kernel.org>
    Link: https://lore.kernel.org/r/20230830095520.1046984-1-edumazet@google.com
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

ipv4: ignore dst hint for multipath routes [+ + +]

Author: Sriram Yagnaraman <sriram.yagnaraman@est.tech>
Date:   Thu Aug 31 10:03:30 2023 +0200

    ipv4: ignore dst hint for multipath routes
    
    [ Upstream commit 6ac66cb03ae306c2e288a9be18226310529f5b25 ]
    
    Using route hints when the nexthop is part of a multipath group causes packets
    in the same receive batch to be sent to the same nexthop irrespective of
    the multipath hash of the packet. So, do not extract route hint for
    packets whose destination is part of a multipath group.
    
    A new SKB flag IPSKB_MULTIPATH is introduced for this purpose. Set the
    flag when the route is looked up in ip_mkroute_input() and check for
    it in ip_extract_route_hint().
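
The set-then-check pattern can be sketched with a per-packet flag word. The names below are hypothetical and the bit value is illustrative. It is not the kernel's IPSKB_MULTIPATH definition.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative bit; not the value used by the kernel. */
#define PKT_FLAG_MULTIPATH (1u << 0)

/* Hypothetical per-packet metadata. */
struct pkt_meta {
    unsigned int flags;
};

/* Route lookup side: remember whether a multipath group was involved. */
static void pkt_note_route(struct pkt_meta *p, bool is_multipath)
{
    if (is_multipath)
        p->flags |= PKT_FLAG_MULTIPATH;
    else
        p->flags &= ~PKT_FLAG_MULTIPATH;
}

/* Hint side: only reuse this packet's route for the next one if safe. */
static bool pkt_can_seed_hint(const struct pkt_meta *p)
{
    return !(p->flags & PKT_FLAG_MULTIPATH);
}
```

When the flag is set, the next packet in the batch goes through a full lookup, so its own multipath hash decides the nexthop.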
    
    Fixes: 02b24941619f ("ipv4: use dst hint for ipv4 list receive")
    Signed-off-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech>
    Reviewed-by: Ido Schimmel <idosch@nvidia.com>
    Reviewed-by: David Ahern <dsahern@kernel.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

ipv6: fix ip6_sock_set_addr_preferences() typo [+ + +]

Author: Eric Dumazet <edumazet@google.com>
Date:   Mon Sep 11 15:42:13 2023 +0000

    ipv6: fix ip6_sock_set_addr_preferences() typo
    
    [ Upstream commit 8cdd9f1aaedf823006449faa4e540026c692ac43 ]
    
    ip6_sock_set_addr_preferences() second argument should be an integer.
    
    SUNRPC attempts to set IPV6_PREFER_SRC_PUBLIC were
    translated to IPV6_PREFER_SRC_TMP
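
The translation is what C does when an integer flag passes through a boolean parameter. The sketch below assumes the second argument was previously declared as bool. The helper names are hypothetical, and the two constants mirror the IPV6_PREFER_SRC_TMP and IPV6_PREFER_SRC_PUBLIC values from the Linux UAPI header.

```c
#include <assert.h>
#include <stdbool.h>

/* Mirror the UAPI values of IPV6_PREFER_SRC_TMP / IPV6_PREFER_SRC_PUBLIC. */
#define PREFER_SRC_TMP    0x0001
#define PREFER_SRC_PUBLIC 0x0002

/* Buggy shape: any non-zero argument collapses to 1 on conversion. */
static int prefs_via_bool(bool val)
{
    return val;
}

/* Fixed shape: the flag value is passed through unchanged. */
static int prefs_via_int(int val)
{
    return val;
}
```

Passing PREFER_SRC_PUBLIC (2) through the bool version yields 1, which is PREFER_SRC_TMP. The int version preserves the value.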
    
    Fixes: 18d5ad623275 ("ipv6: add ip6_sock_set_addr_preferences")
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Cc: Christoph Hellwig <hch@lst.de>
    Cc: Chuck Lever <chuck.lever@oracle.com>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Link: https://lore.kernel.org/r/20230911154213.713941-1-edumazet@google.com
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

ipv6: ignore dst hint for multipath routes [+ + +]

Author: Sriram Yagnaraman <sriram.yagnaraman@est.tech>
Date:   Thu Aug 31 10:03:31 2023 +0200

    ipv6: ignore dst hint for multipath routes
    
    [ Upstream commit 8423be8926aa82cd2e28bba5cc96ccb72c7ce6be ]
    
    Using route hints when the nexthop is part of a multipath group causes packets
    in the same receive batch to be sent to the same nexthop irrespective of
    the multipath hash of the packet. So, do not extract route hint for
    packets whose destination is part of a multipath group.
    
    A new SKB flag IP6SKB_MULTIPATH is introduced for this purpose. Set the
    flag when the route is looked up in fib6_select_path() and check for
    it in ip6_can_use_hint().
    
    Fixes: 197dbf24e360 ("ipv6: introduce and uses route look hints for list input.")
    Signed-off-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech>
    Reviewed-by: Ido Schimmel <idosch@nvidia.com>
    Reviewed-by: David Ahern <dsahern@kernel.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
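
    A minimal model of the flag handling described in this entry is shown
    below. It is a sketch, not the kernel code: the struct, the demo_*
    functions and the flag value are hypothetical, and the real
    ip6_can_use_hint() may check more than this one flag.

```c
#include <stdbool.h>

/* Hypothetical flag bit; the kernel's actual IP6SKB_MULTIPATH value is not assumed. */
#define DEMO_SKB_MULTIPATH 0x1u

struct demo_pkt {
	unsigned int flags;   /* per-packet control flags */
};

/* Route lookup side: mark the packet when its route came from a multipath group. */
void demo_select_path(struct demo_pkt *pkt, bool route_is_multipath)
{
	if (route_is_multipath)
		pkt->flags |= DEMO_SKB_MULTIPATH;
}

/* Batch side: a cached route hint may be reused only for unmarked packets. */
bool demo_can_use_hint(const struct demo_pkt *pkt)
{
	return !(pkt->flags & DEMO_SKB_MULTIPATH);
}
```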

ipv6: Remove in6addr_any alternatives. [+ + +]

Author: Kuniyuki Iwashima <kuniyu@amazon.com>
Date:   Mon Mar 27 16:54:54 2023 -0700

    ipv6: Remove in6addr_any alternatives.
    
    [ Upstream commit 8cdc3223e78c43e1b60ea1c536a103e32fdca3c5 ]
    
    Some code defines the IPv6 wildcard address as a local variable and
    uses it with memcmp() or ipv6_addr_equal().
    
    Let's use in6addr_any and ipv6_addr_any() instead.
    
    Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
    Reviewed-by: Mark Bloch <mbloch@nvidia.com>
    Reviewed-by: David Ahern <dsahern@kernel.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Stable-dep-of: aa99e5f87bd5 ("tcp: Fix bind() regression for v4-mapped-v6 wildcard address.")
    Signed-off-by: Sasha Levin <sashal@kernel.org>
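
    The same cleanup can be illustrated in user space, where POSIX provides
    in6addr_any and IN6_IS_ADDR_UNSPECIFIED() in <netinet/in.h>. These are
    used here only as analogues of the kernel's in6addr_any and
    ipv6_addr_any(); the demo_* function names are hypothetical.

```c
#include <netinet/in.h>
#include <stdbool.h>
#include <string.h>

/* Pattern being removed: a private all-zero address compared with memcmp(). */
bool demo_is_wildcard_local(const struct in6_addr *addr)
{
	struct in6_addr zero;

	memset(&zero, 0, sizeof(zero));
	return memcmp(addr, &zero, sizeof(zero)) == 0;
}

/* Preferred pattern: rely on the shared definition of the wildcard address. */
bool demo_is_wildcard_shared(const struct in6_addr *addr)
{
	return IN6_IS_ADDR_UNSPECIFIED(addr) != 0;
}
```

    Both helpers agree on every input; the second one simply avoids
    carrying a private copy of the wildcard address.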

ixgbe: fix timestamp configuration code [+ + +]

Author: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Date:   Mon Sep 11 13:28:14 2023 -0700

    ixgbe: fix timestamp configuration code
    
    [ Upstream commit 3c44191dd76cf9c0cc49adaf34384cbd42ef8ad2 ]
    
    The commit in Fixes introduced flags to control the status of hardware
    configuration while processing packets. At the same time, another structure
    is used to provide the timestamper configuration to user-space applications.
    The way it was coded makes these structures go out of sync easily. The
    issue is easy to reproduce on 82599 chips:
    
    [root@hostname ~]# hwstamp_ctl -i eth0 -r 12 -t 1
    current settings:
    tx_type 0
    rx_filter 0
    new settings:
    tx_type 1
    rx_filter 12
    
    The eth0 device is properly configured to timestamp any PTPv2 events.
    
    [root@hostname ~]# hwstamp_ctl -i eth0 -r 1 -t 1
    current settings:
    tx_type 1
    rx_filter 12
    SIOCSHWTSTAMP failed: Numerical result out of range
    The requested time stamping mode is not supported by the hardware.
    
    The error is properly returned because the HW doesn't support timestamping
    all packets. But the timestamp flags in adapter->flags are cleared
    even though no HW configuration was done. From that point on, no RX
    timestamps are received by the user-space application, yet the reported
    configuration still shows good values:
    
    [root@hostname ~]# hwstamp_ctl -i eth0
    current settings:
    tx_type 1
    rx_filter 12
    
    Fix the issue by applying new flags only when the HW was actually
    configured.
    
    Fixes: a9763f3cb54c ("ixgbe: Update PTP to support X550EM_x devices")
    Signed-off-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
    Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
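
    The "apply new flags only when the HW was actually configured" idea can
    be sketched as a compute-then-commit pattern. Everything below is a
    hypothetical model (struct, flag and filter values included), not the
    ixgbe driver code; only the -ERANGE result is taken from the log above.

```c
#include <errno.h>

/* Hypothetical flag and filter values, for illustration only. */
#define DEMO_FLAG_RX_TSTAMP   0x1u
#define DEMO_FILTER_NONE      0
#define DEMO_FILTER_ALL       1    /* assumed unsupported by the modelled hardware */
#define DEMO_FILTER_PTP_EVENT 12

struct demo_adapter {
	unsigned int flags;   /* consulted on the packet-processing path */
	int rx_filter;        /* configuration reported to user space */
};

/* Compute the new flags in a local variable and commit them only after the
 * request has been accepted, so a rejected request leaves both fields intact. */
int demo_set_hwtstamp(struct demo_adapter *ad, int rx_filter)
{
	unsigned int new_flags = ad->flags & ~DEMO_FLAG_RX_TSTAMP;

	switch (rx_filter) {
	case DEMO_FILTER_NONE:
		break;
	case DEMO_FILTER_PTP_EVENT:
		new_flags |= DEMO_FLAG_RX_TSTAMP;
		break;
	default:
		return -ERANGE;   /* nothing was written to *ad */
	}

	ad->flags = new_flags;
	ad->rx_filter = rx_filter;
	return 0;
}
```

    In this model, a rejected request after a successful one leaves the
    earlier flags and filter in place, so the two views cannot diverge.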

jbd2: check 'jh->b_transaction' before removing it from checkpoint [+ + +]

Author: Zhihao Cheng <chengzhihao1@huawei.com>
Date:   Fri Jul 14 10:55:27 2023 +0800

    jbd2: check 'jh->b_transaction' before removing it from checkpoint
    
    commit 590a809ff743e7bd890ba5fb36bc38e20a36de53 upstream.
    
    The following sequence will corrupt an ext4 image:
    Step 1:
    jbd2_journal_commit_transaction
     __jbd2_journal_insert_checkpoint(jh, commit_transaction)
     // Put jh into trans1->t_checkpoint_list
     journal->j_checkpoint_transactions = commit_transaction
     // Put trans1 into journal->j_checkpoint_transactions
    
    Step 2:
    do_get_write_access
     test_clear_buffer_dirty(bh) // clear buffer dirty, set jbd dirty
     __jbd2_journal_file_buffer(jh, transaction) // jh belongs to trans2
    
    Step 3:
    drop_cache
     journal_shrink_one_cp_list
      jbd2_journal_try_remove_checkpoint
       if (!trylock_buffer(bh))  // lock bh, true
       if (buffer_dirty(bh))     // buffer is not dirty
       __jbd2_journal_remove_checkpoint(jh)
       // remove jh from trans1->t_checkpoint_list
    
    Step 4:
    jbd2_log_do_checkpoint
     trans1 = journal->j_checkpoint_transactions
     // jh is not in trans1->t_checkpoint_list
     jbd2_cleanup_journal_tail(journal)  // trans1 is done
    
    Step 5: Power cut, trans2 is not committed, jh is lost on the next mount.
    
    Fix it by checking 'jh->b_transaction' before removing it from checkpoint.
    
    Cc: stable@kernel.org
    Fixes: 46f881b5b175 ("jbd2: fix a race when checking checkpoint buffer busy")
    Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
    Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
    Reviewed-by: Jan Kara <jack@suse.cz>
    Link: https://lore.kernel.org/r/20230714025528.564988-3-yi.zhang@huaweicloud.com
    Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
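
    The guard described in this entry can be modelled as follows. This is a
    reduced sketch under assumptions: the struct keeps only the fields named
    in the steps above plus two booleans, and demo_try_remove_checkpoint()
    is a hypothetical stand-in, not the jbd2 function itself.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct demo_transaction { int tid; };

/* Hypothetical, heavily reduced stand-in for a journal head. */
struct demo_jh {
	struct demo_transaction *b_transaction;    /* running/committing owner, if any */
	struct demo_transaction *b_cp_transaction; /* checkpoint list owner, if any */
	bool locked;
	bool dirty;
};

/* Drop jh from its checkpoint list only when no newer transaction owns it. */
int demo_try_remove_checkpoint(struct demo_jh *jh)
{
	if (jh->b_transaction)
		return -EBUSY;   /* the added check, as modelled in this sketch */
	if (jh->locked || jh->dirty)
		return -EBUSY;
	jh->b_cp_transaction = NULL;
	return 0;
}
```

    With the first check in place, the buffer from Step 2 (owned by trans2,
    not dirty, not locked) stays on trans1's checkpoint list in Step 3.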

jbd2: correct the end of the journal recovery scan range [+ + +]

Author: Zhang Yi <yi.zhang@huawei.com>
Date:   Mon Jun 26 15:33:22 2023 +0800

    jbd2: correct the end of the journal recovery scan range
    
    commit 2dfba3bb40ad8536b9fa802364f2d40da31aa88e upstream.
    
    We hit the filesystem inconsistency below while running the generic/475
    I/O failure pressure test with the fast_commit feature enabled.
    
     Symlink /p3/d3/d1c/d6c/dd6/dce/l101 (inode #132605) is invalid.
    
    If the fast_commit feature is enabled, a special fast_commit journal area
    is appended to the end of the normal journal area. journal->j_last points
    to the first unused block after the normal journal area rather than after
    the whole log area, and journal->j_fc_last points to the first unused
    block after the fast_commit journal area. During journal recovery,
    do_one_pass(PASS_SCAN) should first scan the normal journal area and wrap
    around to the first block once it reaches journal->j_last. However, the
    wrap() macro mistakenly uses journal->j_fc_last, so recovery cannot read
    the next magic block (perhaps a commit block) and ends early, missing tN
    and every transaction after it in the following example. This can lead to
    filesystem inconsistency.
    
     | normal journal area                             | fast commit area |
     +-------------------------------------------------+------------------+
     | tN(rear) | tN+1 |~| tN-x |...| tN-1 | tN(front) |       ....       |
     +-------------------------------------------------+------------------+
                         /                             /                  /
                    start               journal->j_last journal->j_fc_last
    
    This patch fixes it by using the correct end, journal->j_last.
    
    Fixes: 5b849b5f96b4 ("jbd2: fast commit recovery path")
    Cc: stable@kernel.org
    Reported-by: Theodore Ts'o <tytso@mit.edu>
    Link: https://lore.kernel.org/linux-ext4/20230613043120.GB1584772@mit.edu/
    Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
    Reviewed-by: Jan Kara <jack@suse.cz>
    Link: https://lore.kernel.org/r/20230626073322.3956567-1-yi.zhang@huaweicloud.com
    Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
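
    The wrap-around arithmetic described in this entry can be worked through
    with small numbers. The sketch below is a model under assumptions: the
    struct keeps only three fields named after those in the commit message,
    and the demo_* helpers are hypothetical functions, not the wrap() macro.

```c
/* Hypothetical reduced journal geometry; field names follow the commit message. */
struct demo_journal {
	unsigned long j_first;    /* first block of the normal journal area */
	unsigned long j_last;     /* first unused block after the normal area */
	unsigned long j_fc_last;  /* first unused block after the fast-commit area */
};

/* Wrap a block number back towards j_first once it reaches limit. */
static unsigned long demo_wrap_at(const struct demo_journal *j,
				  unsigned long blk, unsigned long limit)
{
	if (blk >= limit)
		blk -= limit - j->j_first;
	return blk;
}

/* Behaviour described as broken: the scan wraps at j_fc_last. */
unsigned long demo_wrap_buggy(const struct demo_journal *j, unsigned long blk)
{
	return demo_wrap_at(j, blk, j->j_fc_last);
}

/* Behaviour described as correct: the scan wraps at j_last. */
unsigned long demo_wrap_fixed(const struct demo_journal *j, unsigned long blk)
{
	return demo_wrap_at(j, blk, j->j_last);
}
```

    With j_first = 1, j_last = 100 and j_fc_last = 120, block 100 wraps to
    block 1 in the fixed variant, while the buggy variant leaves it at 100
    and the scan walks into the fast-commit area.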

jbd2: fix checkpoint cleanup performance regression [+ + +]

Author: Zhang Yi <yi.zhang@huawei.com>
Date:   Fri Jul 14 10:55:26 2023 +0800

    jbd2: fix checkpoint cleanup performance regression
    
    commit 373ac521799d9e97061515aca6ec6621789036bb upstream.
    
    journal_clean_one_cp_list() has been merged into
    journal_shrink_one_cp_list(), but checkpoint buffer cleanup from the
    committing process is only a best effort: it should stop scanning once it
    meets a busy buffer, or else it performs a lot of useless buffer scans
    and checks. We caught a performance regression when running the fs_mark
    test below.
    
    Test cmd:
     ./fs_mark  -d  scratch  -s  1024  -n  10000  -t  1  -D  100  -N  100
    
    Before merging checkpoint buffer cleanup:
     FSUse%        Count         Size    Files/sec     App Overhead
         95        10000         1024       8304.9            49033
    
    After merging checkpoint buffer cleanup:
     FSUse%        Count         Size    Files/sec     App Overhead
         95        10000         1024       7649.0            50012
     FSUse%        Count         Size    Files/sec     App Overhead
         95        10000         1024       2107.1            50871
    
    After merging checkpoint buffer cleanup, the total loop count in
    journal_shrink_one_cp_list() could be up to 6,261,600+ (50,000+ ~
    100,000+ in general), and most of those iterations are useless. This patch
    fixes it by passing 'shrink_type' into journal_shrink_one_cp_list() and
    adding a new 'SHRINK_BUSY_STOP' type to indicate that it should stop once
    it meets a busy buffer. After the fix, the loop count drops back to 10,000+.
    
    After this fix:
     FSUse%        Count         Size    Files/sec     App Overhead
         95        10000         1024       8558.4            49109
    
    Cc: stable@kernel.org
    Fixes: b98dba273a0e ("jbd2: remove journal_clean_one_cp_list()")
    Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
    Reviewed-by: Jan Kara <jack@suse.cz>
    Link: https://lore.kernel.org/r/20230714025528.564988-2-yi.zhang@huaweicloud.com
    Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
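
    The difference between scanning past busy buffers and stopping at the
    first one can be sketched with an array standing in for the checkpoint
    list. The enum and function below are hypothetical and only model the
    loop-count effect; they are not the jbd2 implementation.

```c
#include <stdbool.h>
#include <stddef.h>

enum demo_shrink_type { DEMO_SHRINK_ALL, DEMO_SHRINK_BUSY_STOP };

/* Walk a checkpoint list modelled as an array of "busy" flags. Returns the
 * number of buffers released and stores the number of entries visited. */
size_t demo_shrink_list(const bool *busy, size_t n,
			enum demo_shrink_type type, size_t *visited)
{
	size_t released = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (busy[i]) {
			if (type == DEMO_SHRINK_BUSY_STOP) {
				i++;      /* count the busy entry that was examined */
				break;
			}
			continue;         /* keep scanning past busy buffers */
		}
		released++;
	}
	*visited = i;
	return released;
}
```

    In stop mode the walk ends at the first busy entry, which is the
    best-effort behaviour the commit message asks for on the commit path.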

kbuild: do not run depmod for 'make modules_sign' [+ + +]

Author: Masahiro Yamada <masahiroy@kernel.org>
Date:   Wed Aug 23 20:50:41 2023 +0900

    kbuild: do not run depmod for 'make modules_sign'
    
    [ Upstream commit 2429742e506a2b5939a62c629c4a46d91df0ada8 ]
    
    Commit 961ab4a3cd66 ("kbuild: merge scripts/Makefile.modsign to
    scripts/Makefile.modinst") started to run depmod at the end of
    'make modules_sign'.
    
    Move the depmod rule to scripts/Makefile.modinst and run it only when
    $(modules_sign_only) is empty.
    
    Fixes: 961ab4a3cd66 ("kbuild: merge scripts/Makefile.modsign to scripts/Makefile.modinst")
    Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
    Reviewed-by: Nicolas Schier <nicolas@fjasle.eu>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

kbuild: rpm-pkg: define _arch conditionally [+ + +]

Author: Masahiro Yamada <masahiroy@kernel.org>
Date:   Sat Jul 22 13:47:48 2023 +0900

    kbuild: rpm-pkg: define _arch conditionally
    
    [ Upstream commit 233046a2afd12a4f699305b92ee634eebf1e4f31 ]
    
    Commit 3089b2be0cce ("kbuild: rpm-pkg: fix build error when _arch is
    undefined") does not work as intended; _arch is always defined as
    $UTS_MACHINE.
    
    The intention was to define _arch to $UTS_MACHINE only when it is not
    defined.
    
    Fixes: 3089b2be0cce ("kbuild: rpm-pkg: fix build error when _arch is undefined")
    Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

kcm: Destroy mutex in kcm_exit_net() [+ + +]

Author: Shigeru Yoshida <syoshida@redhat.com>
Date:   Sun Sep 3 02:07:08 2023 +0900

    kcm: Destroy mutex in kcm_exit_net()
    
    [ Upstream commit 6ad40b36cd3b04209e2d6c89d252c873d8082a59 ]
    
    kcm_exit_net() should call mutex_destroy() on knet->mutex. This is especially
    needed if CONFIG_DEBUG_MUTEXES is enabled.
    
    Fixes: ab7ac4eb9832 ("kcm: Kernel Connection Multiplexor module")
    Signed-off-by: Shigeru Yoshida <syoshida@redhat.com>
    Link: https://lore.kernel.org/r/20230902170708.1727999-1-syoshida@redhat.com
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

kcm: Fix error handling for SOCK_DGRAM in kcm_sendmsg(). [+ + +]

Author: Kuniyuki Iwashima <kuniyu@amazon.com>
Date:   Mon Sep 11 19:27:53 2023 -0700

    kcm: Fix error handling for SOCK_DGRAM in kcm_sendmsg().
    
    [ Upstream commit a22730b1b4bf437c6bbfdeff5feddf54be4aeada ]
    
    syzkaller found a memory leak in kcm_sendmsg(), and commit c821a88bd720
    ("kcm: Fix memory leak in error path of kcm_sendmsg()") suppressed it by
    updating kcm_tx_msg(head)->last_skb if partial data is copied so that the
    following sendmsg() will resume from the skb.
    
    However, we cannot know how many bytes were copied when we get the error.
    Thus, we could mess up the MSG_MORE queue.
    
    When kcm_sendmsg() fails for SOCK_DGRAM, we should purge the queue, as we
    do for UDP with udp_flush_pending_frames().
    
    Even without this change, when the error occurred, the following sendmsg()
    resumed from a wrong skb and the queue was messed up.  However, we have
    yet to get such a report, and only syzkaller stumbled on it.  So, this
    can be changed safely.
    
    Note this does not change SOCK_SEQPACKET behaviour.
    
    Fixes: c821a88bd720 ("kcm: Fix memory leak in error path of kcm_sendmsg()")
    Fixes: ab7ac4eb9832 ("kcm: Kernel Connection Multiplexor module")
    Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
    Link: https://lore.kernel.org/r/20230912022753.33327-1-kuniyu@amazon.com
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

kcm: Fix memory leak in error path of kcm_sendmsg() [+ + +]

Author: Shigeru Yoshida <syoshida@redhat.com>
Date:   Sun Sep 10 02:03:10 2023 +0900

    kcm: Fix memory leak in error path of kcm_sendmsg()
    
    [ Upstream commit c821a88bd720b0046433173185fd841a100d44ad ]
    
    syzbot reported a memory leak like below:
    
    BUG: memory leak
    unreferenced object 0xffff88810b088c00 (size 240):
      comm "syz-executor186", pid 5012, jiffies 4294943306 (age 13.680s)
      hex dump (first 32 bytes):
        00 89 08 0b 81 88 ff ff 00 00 00 00 00 00 00 00  ................
        00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
      backtrace:
        [<ffffffff83e5d5ff>] __alloc_skb+0x1ef/0x230 net/core/skbuff.c:634
        [<ffffffff84606e59>] alloc_skb include/linux/skbuff.h:1289 [inline]
        [<ffffffff84606e59>] kcm_sendmsg+0x269/0x1050 net/kcm/kcmsock.c:815
        [<ffffffff83e479c6>] sock_sendmsg_nosec net/socket.c:725 [inline]
        [<ffffffff83e479c6>] sock_sendmsg+0x56/0xb0 net/socket.c:748
        [<ffffffff83e47f55>] ____sys_sendmsg+0x365/0x470 net/socket.c:2494
        [<ffffffff83e4c389>] ___sys_sendmsg+0xc9/0x130 net/socket.c:2548
        [<ffffffff83e4c536>] __sys_sendmsg+0xa6/0x120 net/socket.c:2577
        [<ffffffff84ad7bb8>] do_syscall_x64 arch/x86/entry/common.c:50 [inline]
        [<ffffffff84ad7bb8>] do_syscall_64+0x38/0xb0 arch/x86/entry/common.c:80
        [<ffffffff84c0008b>] entry_SYSCALL_64_after_hwframe+0x63/0xcd
    
    In kcm_sendmsg(), kcm_tx_msg(head)->last_skb is used as a cursor to append
    newly allocated skbs to 'head'. If some bytes have been copied and an error
    then occurs, the code jumps to the out_error label and 'last_skb' is left
    unmodified. A later kcm_sendmsg() will use the stale 'last_skb' reference,
    corrupting the 'head' frag_list and causing the leak.
    
    This patch fixes this issue by properly updating the last allocated skb in
    'last_skb'.
    
    Fixes: ab7ac4eb9832 ("kcm: Kernel Connection Multiplexor module")
    Reported-and-tested-by: syzbot+6f98de741f7dbbfc4ccb@syzkaller.appspotmail.com
    Closes: https://syzkaller.appspot.com/bug?extid=6f98de741f7dbbfc4ccb
    Signed-off-by: Shigeru Yoshida <syoshida@redhat.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

kconfig: fix possible buffer overflow [+ + +]

Author: Konstantin Meskhidze <konstantin.meskhidze@huawei.com>
Date:   Tue Sep 5 17:59:14 2023 +0800

    kconfig: fix possible buffer overflow
    
    [ Upstream commit a3b7039bb2b22fcd2ad20d59c00ed4e606ce3754 ]
    
    Buffer 'new_argv' is accessed without a bound check after being accessed
    with a bound check via the 'new_argc' index.
    
    Fixes: e298f3b49def ("kconfig: add built-in function support")
    Co-developed-by: Ivanov Mikhail <ivanov.mikhail1@huawei-partners.com>
    Signed-off-by: Konstantin Meskhidze <konstantin.meskhidze@huawei.com>
    Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
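
    The class of bug described in this entry, a checked store inside a loop
    followed by an unchecked store after it, can be sketched as below. The
    function, its name and the limit are hypothetical; this is not the
    kconfig preprocessor code.

```c
#include <stddef.h>

#define DEMO_MAX_ARGS 4   /* hypothetical limit for the sketch */

/* Split s at commas, storing the start of each piece in argv. Every store,
 * including the one for the trailing piece, is preceded by a bound check.
 * Returns the number of pieces, or -1 when there are more than DEMO_MAX_ARGS. */
int demo_split_args(const char *s, const char *argv[DEMO_MAX_ARGS])
{
	int argc = 0;
	const char *prev = s;
	const char *p;

	for (p = s; *p; p++) {
		if (*p == ',') {
			if (argc >= DEMO_MAX_ARGS)
				return -1;
			argv[argc++] = prev;
			prev = p + 1;
		}
	}

	/* The trailing piece needs the same check as the pieces inside the loop. */
	if (argc >= DEMO_MAX_ARGS)
		return -1;
	argv[argc++] = prev;
	return argc;
}
```

    Without the second check, an input with exactly DEMO_MAX_ARGS commas
    would pass every in-loop check and then write one slot past the array.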

kselftest/runner.sh: Propagate SIGTERM to runner child [+ + +]

Author: Björn Töpel <bjorn@rivosinc.com>
Date:   Wed Jul 5 13:53:17 2023 +0200

    kselftest/runner.sh: Propagate SIGTERM to runner child
    
    [ Upstream commit 9616cb34b08ec86642b162eae75c5a7ca8debe3c ]
    
    Timeouts in kselftest are done using the "timeout" command with the
    "--foreground" option. Without the "--foreground" option, it is not
    possible for a user to cancel the runner using SIGINT, because the
    signal is not propagated to timeout, which is running in a different
    process group. The "--foreground" option places timeout in the same
    process group as its parent, but only sends the SIGTERM (on timeout)
    signal to the forked process. Unfortunately, this does not play nicely
    with all kselftests, e.g. "net:fcnal-test.sh", where the child
    processes will linger because timeout does not send SIGTERM to the
    group.
    
    Some users have noted these hangs [1].
    
    Fix this by nesting the timeout with an additional timeout without the
    foreground option.
    
    Link: https://lore.kernel.org/all/7650b2eb-0aee-a2b0-2e64-c9bc63210f67@alu.unizg.hr/ # [1]
    Fixes: 651e0d881461 ("kselftest/runner: allow to properly deliver signals to tests")
    Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
    Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

kunit: Fix wild-memory-access bug in kunit_free_suite_set() [+ + +]

Author: Jinjie Ruan <ruanjinjie@huawei.com>
Date:   Sun Sep 3 15:10:25 2023 +0800

    kunit: Fix wild-memory-access bug in kunit_free_suite_set()
    
    [ Upstream commit 2810c1e99867a811e631dd24e63e6c1e3b78a59d ]
    
    When injecting a fault while probing kunit-example-test.ko, if kstrdup()
    fails in mod_sysfs_setup() in load_module(), mod->state switches
    from MODULE_STATE_COMING to MODULE_STATE_GOING instead of
    from MODULE_STATE_LIVE to MODULE_STATE_GOING. As a result, only
    kunit_module_exit() is called, without kunit_module_init(), so
    mod->kunit_suites is not set correctly and the free in
    kunit_free_suite_set() causes the wild-memory-access bug below.
    
    The mod->state state machine when load_module() succeeds:
    
    MODULE_STATE_UNFORMED ---> MODULE_STATE_COMING ---> MODULE_STATE_LIVE
             ^                                              |
             |                                              | delete_module
             +---------------- MODULE_STATE_GOING <---------+
    
    The mod->state state machine when load_module() fails at
    mod_sysfs_setup():
    
    MODULE_STATE_UNFORMED ---> MODULE_STATE_COMING ---> MODULE_STATE_GOING
            ^                                               |
            |                                               |
            +-----------------------------------------------+
    
    Call kunit_module_init() at the MODULE_STATE_COMING state to fix the issue,
    because MODULE_STATE_LIVE is always reached from it.
    
     Unable to handle kernel paging request at virtual address ffffff341e942a88
     KASAN: maybe wild-memory-access in range [0x0003f9a0f4a15440-0x0003f9a0f4a15447]
     Mem abort info:
       ESR = 0x0000000096000004
       EC = 0x25: DABT (current EL), IL = 32 bits
       SET = 0, FnV = 0
       EA = 0, S1PTW = 0
       FSC = 0x04: level 0 translation fault
     Data abort info:
       ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
       CM = 0, WnR = 0, TnD = 0, TagAccess = 0
       GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
     swapper pgtable: 4k pages, 48-bit VAs, pgdp=00000000441ea000
     [ffffff341e942a88] pgd=0000000000000000, p4d=0000000000000000
     Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP
     Modules linked in: kunit_example_test(-) cfg80211 rfkill 8021q garp mrp stp llc ipv6 [last unloaded: kunit_example_test]
     CPU: 3 PID: 2035 Comm: modprobe Tainted: G        W        N 6.5.0-next-20230828+ #136
     Hardware name: linux,dummy-virt (DT)
     pstate: a0000005 (NzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
     pc : kfree+0x2c/0x70
     lr : kunit_free_suite_set+0xcc/0x13c
     sp : ffff8000829b75b0
     x29: ffff8000829b75b0 x28: ffff8000829b7b90 x27: 0000000000000000
     x26: dfff800000000000 x25: ffffcd07c82a7280 x24: ffffcd07a50ab300
     x23: ffffcd07a50ab2e8 x22: 1ffff00010536ec0 x21: dfff800000000000
     x20: ffffcd07a50ab2f0 x19: ffffcd07a50ab2f0 x18: 0000000000000000
     x17: 0000000000000000 x16: 0000000000000000 x15: ffffcd07c24b6764
     x14: ffffcd07c24b63c0 x13: ffffcd07c4cebb94 x12: ffff700010536ec7
     x11: 1ffff00010536ec6 x10: ffff700010536ec6 x9 : dfff800000000000
     x8 : 00008fffefac913a x7 : 0000000041b58ab3 x6 : 0000000000000000
     x5 : 1ffff00010536ec5 x4 : ffff8000829b7628 x3 : dfff800000000000
     x2 : ffffff341e942a80 x1 : ffffcd07a50aa000 x0 : fffffc0000000000
     Call trace:
      kfree+0x2c/0x70
      kunit_free_suite_set+0xcc/0x13c
      kunit_module_notify+0xd8/0x360
      blocking_notifier_call_chain+0xc4/0x128
      load_module+0x382c/0x44a4
      init_module_from_file+0xd4/0x128
      idempotent_init_module+0x2c8/0x524
      __arm64_sys_finit_module+0xac/0x100
      invoke_syscall+0x6c/0x258
      el0_svc_common.constprop.0+0x160/0x22c
      do_el0_svc+0x44/0x5c
      el0_svc+0x38/0x78
      el0t_64_sync_handler+0x13c/0x158
      el0t_64_sync+0x190/0x194
     Code: aa0003e1 b25657e0 d34cfc42 8b021802 (f9400440)
     ---[ end trace 0000000000000000 ]---
     Kernel panic - not syncing: Oops: Fatal exception
     SMP: stopping secondary CPUs
     Kernel Offset: 0x4d0742200000 from 0xffff800080000000
     PHYS_OFFSET: 0xffffee43c0000000
     CPU features: 0x88000203,3c020000,1000421b
     Memory Limit: none
     Rebooting in 1 seconds..
    
    Fixes: 3d6e44623841 ("kunit: unify module and builtin suite definitions")
    Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
    Reviewed-by: Rae Moar <rmoar@google.com>
    Reviewed-by: David Gow <davidgow@google.com>
    Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
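
    The two state machines in this entry can be replayed with a small model
    of a module notifier. Everything below is hypothetical (the enum, the
    struct and the demo_* functions); it only illustrates why initialising
    at the COMING state covers the failed-load path.

```c
#include <stdbool.h>

enum demo_state { DEMO_UNFORMED, DEMO_COMING, DEMO_LIVE, DEMO_GOING };

struct demo_module {
	bool suites_ready;   /* set once the suite set has been initialised */
	int bad_frees;       /* counts teardown of an uninitialised suite set */
};

/* Notifier model. init_on selects the state that triggers initialisation. */
void demo_notify(struct demo_module *m, enum demo_state s, enum demo_state init_on)
{
	if (s == init_on) {
		m->suites_ready = true;
	} else if (s == DEMO_GOING) {
		if (!m->suites_ready)
			m->bad_frees++;   /* stands in for the wild memory access */
		m->suites_ready = false;
	}
}

/* Replay the failing load path: COMING followed directly by GOING. */
int demo_failed_load(enum demo_state init_on)
{
	struct demo_module m = { false, 0 };

	demo_notify(&m, DEMO_COMING, init_on);
	demo_notify(&m, DEMO_GOING, init_on);
	return m.bad_frees;
}
```

    When initialisation is tied to DEMO_LIVE, the failed-load path tears
    down a suite set that was never set up; tying it to DEMO_COMING avoids
    that in this model.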

KVM: nSVM: Check instead of asserting on nested TSC scaling support [+ + +]

Author: Sean Christopherson <seanjc@google.com>
Date:   Fri Jul 28 18:15:48 2023 -0700

    KVM: nSVM: Check instead of asserting on nested TSC scaling support
    
    commit 7cafe9b8e22bb3d77f130c461aedf6868c4aaf58 upstream.
    
    Check for nested TSC scaling support on nested SVM VMRUN instead of
    asserting that TSC scaling is exposed to L1 if L1's MSR_AMD64_TSC_RATIO
    has diverged from KVM's default.  Userspace can trigger the WARN at will
    by writing the MSR and then updating guest CPUID to hide the feature
    (modifying guest CPUID is allowed anytime before KVM_RUN).  E.g. hacking
    KVM's state_test selftest to do
    
                    vcpu_set_msr(vcpu, MSR_AMD64_TSC_RATIO, 0);
                    vcpu_clear_cpuid_feature(vcpu, X86_FEATURE_TSCRATEMSR);
    
    after restoring state in a new VM+vCPU yields an endless supply of:
    
      ------------[ cut here ]------------
      WARNING: CPU: 164 PID: 62565 at arch/x86/kvm/svm/nested.c:699
               nested_vmcb02_prepare_control+0x3d6/0x3f0 [kvm_amd]
      Call Trace:
       <TASK>
       enter_svm_guest_mode+0x114/0x560 [kvm_amd]
       nested_svm_vmrun+0x260/0x330 [kvm_amd]
       vmrun_interception+0x29/0x30 [kvm_amd]
       svm_invoke_exit_handler+0x35/0x100 [kvm_amd]
       svm_handle_exit+0xe7/0x180 [kvm_amd]
       kvm_arch_vcpu_ioctl_run+0x1eab/0x2570 [kvm]
       kvm_vcpu_ioctl+0x4c9/0x5b0 [kvm]
       __se_sys_ioctl+0x7a/0xc0
       __x64_sys_ioctl+0x21/0x30
       do_syscall_64+0x41/0x90
       entry_SYSCALL_64_after_hwframe+0x63/0xcd
      RIP: 0033:0x45ca1b
    
    Note, the nested #VMEXIT path has the same flaw, but needs a different
    fix and will be handled separately.
    
    Fixes: 5228eb96a487 ("KVM: x86: nSVM: implement nested TSC scaling")
    Cc: Maxim Levitsky <mlevitsk@redhat.com>
    Cc: stable@vger.kernel.org
    Link: https://lore.kernel.org/r/20230729011608.1065019-2-seanjc@google.com
    Signed-off-by: Sean Christopherson <seanjc@google.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

KVM: nSVM: Load L1's TSC multiplier based on L1 state, not L2 state [+ + +]

Author: Sean Christopherson <seanjc@google.com>
Date:   Fri Jul 28 18:15:49 2023 -0700

    KVM: nSVM: Load L1's TSC multiplier based on L1 state, not L2 state
    
    commit 0c94e2468491cbf0754f49a5136ab51294a96b69 upstream.
    
    When emulating nested VM-Exit, load L1's TSC multiplier if L1's desired
    ratio doesn't match the current ratio, not if the ratio L1 is using for
    L2 diverges from the default.  Functionally, the end result is the same
    as KVM will run L2 with L1's multiplier if L2's multiplier is the default,
    i.e. checking that L1's multiplier is loaded is equivalent to checking if
    L2 has a non-default multiplier.
    
    However, the assertion that TSC scaling is exposed to L1 is flawed, as
    userspace can trigger the WARN at will by writing the MSR and then
    updating guest CPUID to hide the feature (modifying guest CPUID is
    allowed anytime before KVM_RUN).  E.g. hacking KVM's state_test
    selftest to do
    
                    vcpu_set_msr(vcpu, MSR_AMD64_TSC_RATIO, 0);
                    vcpu_clear_cpuid_feature(vcpu, X86_FEATURE_TSCRATEMSR);
    
    after restoring state in a new VM+vCPU yields an endless supply of:
    
      ------------[ cut here ]------------
      WARNING: CPU: 10 PID: 206939 at arch/x86/kvm/svm/nested.c:1105
               nested_svm_vmexit+0x6af/0x720 [kvm_amd]
      Call Trace:
       nested_svm_exit_handled+0x102/0x1f0 [kvm_amd]
       svm_handle_exit+0xb9/0x180 [kvm_amd]
       kvm_arch_vcpu_ioctl_run+0x1eab/0x2570 [kvm]
       kvm_vcpu_ioctl+0x4c9/0x5b0 [kvm]
       ? trace_hardirqs_off+0x4d/0xa0
       __se_sys_ioctl+0x7a/0xc0
       __x64_sys_ioctl+0x21/0x30
       do_syscall_64+0x41/0x90
       entry_SYSCALL_64_after_hwframe+0x63/0xcd
    
    Unlike the nested VMRUN path, hoisting the svm->tsc_scaling_enabled check
    into the if-statement is wrong as KVM needs to ensure L1's multiplier is
    loaded in the above scenario.   Alternatively, the WARN_ON() could simply
    be deleted, but that would make KVM's behavior even more subtle, e.g. it's
    not immediately obvious why it's safe to write MSR_AMD64_TSC_RATIO when
    checking only tsc_ratio_msr.
    
    Fixes: 5228eb96a487 ("KVM: x86: nSVM: implement nested TSC scaling")
    Cc: Maxim Levitsky <mlevitsk@redhat.com>
    Cc: stable@vger.kernel.org
    Link: https://lore.kernel.org/r/20230729011608.1065019-3-seanjc@google.com
    Signed-off-by: Sean Christopherson <seanjc@google.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

KVM: SVM: Don't inject #UD if KVM attempts to skip SEV guest insn [+ + +]

Author: Sean Christopherson <seanjc@google.com>
Date:   Thu Aug 24 18:36:18 2023 -0700

    KVM: SVM: Don't inject #UD if KVM attempts to skip SEV guest insn
    
    commit cb49631ad111570f1bad37702c11c2ae07fa2e3c upstream.
    
    Don't inject a #UD if KVM attempts to "emulate" to skip an instruction
    for an SEV guest, and instead resume the guest and hope that it can make
    forward progress.  When commit 04c40f344def ("KVM: SVM: Inject #UD on
    attempted emulation for SEV guest w/o insn buffer") added the completely
    arbitrary #UD behavior, there were no known scenarios where a well-behaved
    guest would induce a VM-Exit that triggered emulation, i.e. it was thought
    that injecting #UD would be helpful.
    
    However, now that KVM (correctly) attempts to re-inject INT3/INTO, e.g. if
    a #NPF is encountered when attempting to deliver the INT3/INTO, an SEV
    guest can trigger emulation without a buffer, through no fault of its own.
    Resuming the guest and retrying the INT3/INTO is architecturally wrong,
    e.g. the vCPU will incorrectly re-hit code #DBs, but for SEV guests there
    is literally no other option that has a chance of making forward progress.
    
    Drop the #UD injection for all "skip" emulation, not just those related to
    INT3/INTO, even though that means that the guest will likely end up in an
    infinite loop instead of getting a #UD (the vCPU may also crash, e.g. if
    KVM emulated everything about an instruction except for advancing RIP).
    There's no evidence that suggests that an unexpected #UD is actually
    better than hanging the vCPU, e.g. a soft-hung vCPU can still respond to
    IRQs and NMIs to generate a backtrace.
    
    Reported-by: Wu Zongyo <wuzongyo@mail.ustc.edu.cn>
    Closes: https://lore.kernel.org/all/8eb933fd-2cf3-d7a9-32fe-2a1d82eac42a@mail.ustc.edu.cn
    Fixes: 6ef88d6e36c2 ("KVM: SVM: Re-inject INT3/INTO instead of retrying the instruction")
    Cc: stable@vger.kernel.org
    Cc: Tom Lendacky <thomas.lendacky@amd.com>
    Link: https://lore.kernel.org/r/20230825013621.2845700-2-seanjc@google.com
    Signed-off-by: Sean Christopherson <seanjc@google.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

KVM: SVM: Get source vCPUs from source VM for SEV-ES intrahost migration [+ + +]

Author: Sean Christopherson <seanjc@google.com>
Date:   Thu Aug 24 19:23:56 2023 -0700

    KVM: SVM: Get source vCPUs from source VM for SEV-ES intrahost migration
    
    commit f1187ef24eb8f36e8ad8106d22615ceddeea6097 upstream.
    
    Fix a goof where KVM tries to grab source vCPUs from the destination VM
    when doing intrahost migration.  Grabbing the wrong vCPU not only hoses
    the guest, it also crashes the host due to the VMSA pointer being left
    NULL.
    
      BUG: unable to handle page fault for address: ffffe38687000000
      #PF: supervisor read access in kernel mode
      #PF: error_code(0x0000) - not-present page
      PGD 0 P4D 0
      Oops: 0000 [#1] SMP NOPTI
      CPU: 39 PID: 17143 Comm: sev_migrate_tes Tainted: GO       6.5.0-smp--fff2e47e6c3b-next #151
      Hardware name: Google, Inc. Arcadia_IT_80/Arcadia_IT_80, BIOS 34.28.0 07/10/2023
      RIP: 0010:__free_pages+0x15/0xd0
      RSP: 0018:ffff923fcf6e3c78 EFLAGS: 00010246
      RAX: 0000000000000000 RBX: ffffe38687000000 RCX: 0000000000000100
      RDX: 0000000000000100 RSI: 0000000000000000 RDI: ffffe38687000000
      RBP: ffff923fcf6e3c88 R08: ffff923fcafb0000 R09: 0000000000000000
      R10: 0000000000000000 R11: ffffffff83619b90 R12: ffff923fa9540000
      R13: 0000000000080007 R14: ffff923f6d35d000 R15: 0000000000000000
      FS:  0000000000000000(0000) GS:ffff929d0d7c0000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: ffffe38687000000 CR3: 0000005224c34005 CR4: 0000000000770ee0
      PKRU: 55555554
      Call Trace:
       <TASK>
       sev_free_vcpu+0xcb/0x110 [kvm_amd]
       svm_vcpu_free+0x75/0xf0 [kvm_amd]
       kvm_arch_vcpu_destroy+0x36/0x140 [kvm]
       kvm_destroy_vcpus+0x67/0x100 [kvm]
       kvm_arch_destroy_vm+0x161/0x1d0 [kvm]
       kvm_put_kvm+0x276/0x560 [kvm]
       kvm_vm_release+0x25/0x30 [kvm]
       __fput+0x106/0x280
       ____fput+0x12/0x20
       task_work_run+0x86/0xb0
       do_exit+0x2e3/0x9c0
       do_group_exit+0xb1/0xc0
       __x64_sys_exit_group+0x1b/0x20
       do_syscall_64+0x41/0x90
       entry_SYSCALL_64_after_hwframe+0x63/0xcd
       </TASK>
      CR2: ffffe38687000000
    
    Fixes: 6defa24d3b12 ("KVM: SEV: Init target VMCBs in sev_migrate_from")
    Cc: stable@vger.kernel.org
    Cc: Peter Gonda <pgonda@google.com>
    Reviewed-by: Peter Gonda <pgonda@google.com>
    Reviewed-by: Pankaj Gupta <pankaj.gupta@amd.com>
    Link: https://lore.kernel.org/r/20230825022357.2852133-2-seanjc@google.com
    Signed-off-by: Sean Christopherson <seanjc@google.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

KVM: SVM: Set target pCPU during IRTE update if target vCPU is running [+ + +]

Author: Sean Christopherson <seanjc@google.com>
Date:   Tue Aug 8 16:31:32 2023 -0700

    KVM: SVM: Set target pCPU during IRTE update if target vCPU is running
    
    commit f3cebc75e7425d6949d726bb8e937095b0aef025 upstream.
    
    Update the target pCPU for IOMMU doorbells when updating IRTE routing if
    KVM is actively running the associated vCPU.  KVM currently only updates
    the pCPU when loading the vCPU (via avic_vcpu_load()), and so doorbell
    events will be delayed until the vCPU goes through a put+load cycle (which
    might very well "never" happen for the lifetime of the VM).
    
    To avoid inserting a stale pCPU, e.g. due to racing between updating IRTE
    routing and vCPU load/put, get the pCPU information from the vCPU's
    Physical APIC ID table entry (a.k.a. avic_physical_id_cache in KVM) and
    update the IRTE while holding ir_list_lock.  Add comments with --verbose
    enabled to explain exactly what is and isn't protected by ir_list_lock.
    
    Fixes: 411b44ba80ab ("svm: Implements update_pi_irte hook to setup posted interrupt")
    Reported-by: dengqiao.joey <dengqiao.joey@bytedance.com>
    Cc: stable@vger.kernel.org
    Cc: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
    Cc: Joao Martins <joao.m.martins@oracle.com>
    Cc: Maxim Levitsky <mlevitsk@redhat.com>
    Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
    Tested-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
    Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
    Link: https://lore.kernel.org/r/20230808233132.2499764-3-seanjc@google.com
    Signed-off-by: Sean Christopherson <seanjc@google.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

KVM: SVM: Skip VMSA init in sev_es_init_vmcb() if pointer is NULL [+ + +]

Author: Sean Christopherson <seanjc@google.com>
Date:   Thu Aug 24 19:23:57 2023 -0700

    KVM: SVM: Skip VMSA init in sev_es_init_vmcb() if pointer is NULL
    
    commit 1952e74da96fb3e48b72a2d0ece78c688a5848c1 upstream.
    
    Skip initializing the VMSA physical address in the VMCB if the VMSA is
    NULL, which occurs during intrahost migration as KVM initializes the VMCB
    before copying over state from the source to the destination (including
    the VMSA and its physical address).
    
    In normal builds, __pa() is just math, so the bug isn't fatal, but with
    CONFIG_DEBUG_VIRTUAL=y, the validity of the virtual address is verified
    and passing in NULL will make the kernel unhappy.
    
    Fixes: 6defa24d3b12 ("KVM: SEV: Init target VMCBs in sev_migrate_from")
    Cc: stable@vger.kernel.org
    Cc: Peter Gonda <pgonda@google.com>
    Reviewed-by: Peter Gonda <pgonda@google.com>
    Reviewed-by: Pankaj Gupta <pankaj.gupta@amd.com>
    Link: https://lore.kernel.org/r/20230825022357.2852133-3-seanjc@google.com
    Signed-off-by: Sean Christopherson <seanjc@google.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

KVM: SVM: Take and hold ir_list_lock when updating vCPU's Physical ID entry [+ + +]

Author: Sean Christopherson <seanjc@google.com>
Date:   Tue Aug 8 16:31:31 2023 -0700

    KVM: SVM: Take and hold ir_list_lock when updating vCPU's Physical ID entry
    
    commit 4c08e737f056fec930b416a2bd37ed266d724f95 upstream.
    
    Hoist the acquisition of ir_list_lock from avic_update_iommu_vcpu_affinity()
    to its two callers, avic_vcpu_load() and avic_vcpu_put(), specifically to
    encapsulate the write to the vCPU's entry in the AVIC Physical ID table.
    This will allow a future fix to pull information from the Physical ID entry
    when updating the IRTE, without potentially consuming stale information,
    i.e. without racing with the vCPU being (un)loaded.
    
    Add a comment to call out that ir_list_lock does NOT protect against
    multiple writers, specifically that reading the Physical ID entry in
    avic_vcpu_put() outside of the lock is safe.
    
    To preserve some semblance of independence from ir_list_lock, keep the
    READ_ONCE() in avic_vcpu_load() even though acquiring the spinlock
    effectively ensures the load(s) will be generated after acquiring the
    lock.
    
    Cc: stable@vger.kernel.org
    Tested-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
    Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
    Link: https://lore.kernel.org/r/20230808233132.2499764-2-seanjc@google.com
    Signed-off-by: Sean Christopherson <seanjc@google.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

lib/test_meminit: allocate pages up to order MAX_ORDER [+ + +]

Author: Andrew Donnellan <ajd@linux.ibm.com>
Date:   Fri Jul 14 11:52:38 2023 +1000

    lib/test_meminit: allocate pages up to order MAX_ORDER
    
    commit efb78fa86e95832b78ca0ba60f3706788a818938 upstream.
    
    test_pages() tests the page allocator by calling alloc_pages() with
    different orders up to order 10.
    
    However, different architectures and platforms support different maximum
    contiguous allocation sizes.  The default maximum allocation order
    (MAX_ORDER) is 10, but architectures can use CONFIG_ARCH_FORCE_MAX_ORDER
    to override this.  On platforms where this is less than 10, test_meminit()
    will blow up with a WARN().  This is expected, so let's not do that.
    
    Replace the hardcoded "10" with the MAX_ORDER macro so that we test
    allocations up to the expected platform limit.
    
    Link: https://lkml.kernel.org/r/20230714015238.47931-1-ajd@linux.ibm.com
    Fixes: 5015a300a522 ("lib: introduce test_meminit module")
    Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com>
    Reviewed-by: Alexander Potapenko <glider@google.com>
    Cc: Xiaoke Wang <xkernel.wang@foxmail.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

lib: test_scanf: Add explicit type cast to result initialization in test_number_prefix() [+ + +]

Author: Nathan Chancellor <nathan@kernel.org>
Date:   Mon Aug 7 08:36:28 2023 -0700

    lib: test_scanf: Add explicit type cast to result initialization in test_number_prefix()
    
    commit 92382d744176f230101d54f5c017bccd62770f01 upstream.
    
    A recent change in clang allows it to consider more expressions as
    compile time constants, which causes it to point out an implicit
    conversion in the scanf tests:
    
      lib/test_scanf.c:661:2: warning: implicit conversion from 'int' to 'unsigned char' changes value from -168 to 88 [-Wconstant-conversion]
        661 |         test_number_prefix(unsigned char,       "0xA7", "%2hhx%hhx", 0, 0xa7, 2, check_uchar);
            |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      lib/test_scanf.c:609:29: note: expanded from macro 'test_number_prefix'
        609 |         T result[2] = {~expect[0], ~expect[1]};                                 \
            |                       ~            ^~~~~~~~~~
      1 warning generated.
    
    The result of the bitwise negation is the type of the operand after
    going through the integer promotion rules, so this truncation is
    expected but harmless, as the initial values in the result array get
    overwritten by _test() anyways. Add an explicit cast to the expected
    type in test_number_prefix() to silence the warning. There is no
    functional change, as all the tests still pass with GCC 13.1.0 and clang
    18.0.0.
    
    Cc: stable@vger.kernel.org
    Link: https://github.com/ClangBuiltLinux/linux/issues/1899
    Link: https://github.com/llvm/llvm-project/commit/610ec954e1f81c0e8fcadedcd25afe643f5a094e
    Suggested-by: Nick Desaulniers <ndesaulniers@google.com>
    Signed-off-by: Nathan Chancellor <nathan@kernel.org>
    Reviewed-by: Petr Mladek <pmladek@suse.com>
    Signed-off-by: Petr Mladek <pmladek@suse.com>
    Link: https://lore.kernel.org/r/20230807-test_scanf-wconstant-conversion-v2-1-839ca39083e1@kernel.org
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
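
The integer promotion described in the test_scanf entry above can be reproduced in a few lines of user-space C. This is only a sketch: negate_promoted() and negate_cast() are hypothetical helpers, not code from lib/test_scanf.c, and the values assume a 32-bit two's complement int.

```c
/* Hypothetical helpers for illustration; not part of lib/test_scanf.c. */

/* ~x on an unsigned char operand: x is promoted to int before negation. */
static int negate_promoted(unsigned char x)
{
    return ~x;
}

/* An explicit cast truncates the int result back to 8 bits on purpose. */
static unsigned char negate_cast(unsigned char x)
{
    return (unsigned char)~x;
}
```

With x = 0xa7 the promoted result is -168, and the cast result is 88, which matches the two values quoted in the clang warning.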

Linux: Linux 6.1.54 [+ + +]

Author: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Date:   Tue Sep 19 12:28:10 2023 +0200

    Linux 6.1.54
    
    Link: https://lore.kernel.org/r/20230917191040.964416434@linuxfoundation.org
    Tested-by: SeongJae Park <sj@kernel.org>
    Tested-by: Takeshi Ogasawara <takeshi.ogasawara@futuring-girl.com>
    Tested-by: Bagas Sanjaya <bagasdotme@gmail.com>
    Tested-by: Conor Dooley <conor.dooley@microchip.com>
    Tested-by: Ron Economos <re@w6rz.net>
    Tested-by: Jon Hunter <jonathanh@nvidia.com>
    Tested-by: Salvatore Bonaccorso <carnil@debian.org>
    Tested-by: Florian Fainelli <florian.fainelli@broadcom.com>
    Tested-by: Guenter Roeck <linux@roeck-us.net>
    Tested-by: Linux Kernel Functional Testing <lkft@linaro.org>
    Tested-by: Shuah Khan <skhan@linuxfoundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mailbox: qcom-ipcc: fix incorrect num_chans counting [+ + +]

Author: Jonathan Marek <jonathan@marek.ca>
Date:   Wed Aug 2 09:52:22 2023 -0400

    mailbox: qcom-ipcc: fix incorrect num_chans counting
    
    [ Upstream commit a493208079e299aefdc15169dc80e3da3ebb718a ]
    
    Breaking out early when a match is found leads to an incorrect num_chans
    value when more than one ipcc mailbox channel is used by the same device.
    
    Fixes: e9d50e4b4d04 ("mailbox: qcom-ipcc: Dynamic alloc for channel arrangement")
    Signed-off-by: Jonathan Marek <jonathan@marek.ca>
    Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
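
The counting problem in the qcom-ipcc entry above can be illustrated with a toy model. This is a sketch under assumptions: struct client and count_chans() are hypothetical stand-ins for the driver's walk over client mailbox references, and the break_on_match switch exists only to compare the old and new behaviour.

```c
#include <stddef.h>

/* Hypothetical stand-in for one client device: the list of controller
 * ids that its mailbox references point at. */
struct client {
    const int *ctrl_ids;
    size_t n;
};

/* Count how many references across all clients target my_ctrl.
 * With break_on_match set, the inner loop stops at the first hit,
 * so a client using several channels is counted only once. */
static int count_chans(const struct client *clients, size_t nclients,
                       int my_ctrl, int break_on_match)
{
    int num_chans = 0;

    for (size_t i = 0; i < nclients; i++) {
        for (size_t j = 0; j < clients[i].n; j++) {
            if (clients[i].ctrl_ids[j] != my_ctrl)
                continue;
            num_chans++;
            if (break_on_match)
                break;
        }
    }
    return num_chans;
}
```

For one client with two channels on the controller and a second client with one, the early break yields 2 while counting every match yields 3.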

memcg: drop kmem.limit_in_bytes [+ + +]

Author: Michal Hocko <mhocko@suse.com>
Date:   Tue Jul 4 13:52:40 2023 +0200

    memcg: drop kmem.limit_in_bytes
    
    commit 86327e8eb94c52eca4f93cfece2e29d1bf52acbf upstream.
    
    kmem.limit_in_bytes (v1 way to limit kernel memory usage) has been
    deprecated since 58056f77502f ("memcg, kmem: further deprecate
    kmem.limit_in_bytes") merged in 5.16.  We haven't heard about any serious
    users since then but it seems that the mere presence of the file is
    causing more harm than good.  We (SUSE) have had several bug reports from
    customers where Docker based containers started to fail because a write to
    kmem.limit_in_bytes has failed.
    
    This was unexpected because runc code only expects ENOENT (kmem disabled)
    or EBUSY (tasks already running within cgroup).  So a new error code was
    unexpected and the whole container startup failed.  This has been later
    addressed by
    https://github.com/opencontainers/runc/commit/52390d68040637dfc77f9fda6bbe70952423d380
    so current Docker runtimes do not suffer from the problem anymore.  There
    are still older versions of Docker in use and likely hard to get rid of
    completely.
    
    Address this by wiping out the file completely and effectively get back to
    pre 4.5 era and CONFIG_MEMCG_KMEM=n configuration.
    
    I would recommend backporting to stable trees which have picked up
    58056f77502f ("memcg, kmem: further deprecate kmem.limit_in_bytes").
    
    [mhocko@suse.com: restore _KMEM switch case]
      Link: https://lkml.kernel.org/r/ZKe5wxdbvPi5Cwd7@dhcp22.suse.cz
    Link: https://lkml.kernel.org/r/20230704115240.14672-1-mhocko@kernel.org
    Signed-off-by: Michal Hocko <mhocko@suse.com>
    Acked-by: Shakeel Butt <shakeelb@google.com>
    Acked-by: Johannes Weiner <hannes@cmpxchg.org>
    Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
    Cc: Muchun Song <muchun.song@linux.dev>
    Cc: Tejun Heo <tj@kernel.org>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

MIPS: Fix CONFIG_CPU_DADDI_WORKAROUNDS `modules_install' regression [+ + +]

Author: Maciej W. Rozycki <macro@orcam.me.uk>
Date:   Tue Jul 18 15:37:18 2023 +0100

    MIPS: Fix CONFIG_CPU_DADDI_WORKAROUNDS `modules_install' regression
    
    commit a79a404e6c2241ebc528b9ebf4c0832457b498c3 upstream.
    
    Remove a build-time check for the presence of the GCC `-msym32' option.
    This option has been there since GCC 4.1.0, which is below the minimum
    required as at commit 805b2e1d427a ("kbuild: include Makefile.compiler
    only when compiler is needed"), when an error message:
    
    arch/mips/Makefile:306: *** CONFIG_CPU_DADDI_WORKAROUNDS unsupported without -msym32.  Stop.
    
    started to trigger for the `modules_install' target with configurations
    such as `decstation_64_defconfig' that set CONFIG_CPU_DADDI_WORKAROUNDS,
    because said commit has made `cc-option-yn' an undefined function for
    non-build targets.
    
    Reported-by: Jan-Benedict Glaw <jbglaw@lug-owl.de>
    Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
    Fixes: 805b2e1d427a ("kbuild: include Makefile.compiler only when compiler is needed")
    Cc: stable@vger.kernel.org # v5.13+
    Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
    Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

MIPS: Only fiddle with CHECKFLAGS if `need-compiler' [+ + +]

Author: Maciej W. Rozycki <macro@orcam.me.uk>
Date:   Tue Jul 18 15:37:23 2023 +0100

    MIPS: Only fiddle with CHECKFLAGS if `need-compiler'
    
    commit 4fe4a6374c4db9ae2b849b61e84b58685dca565a upstream.
    
    We have originally guarded fiddling with CHECKFLAGS in our arch Makefile
    by checking for the CONFIG_MIPS variable, not set for targets such as
    `distclean', etc. that neither include `.config' nor use the compiler.
    
    Starting from commit 805b2e1d427a ("kbuild: include Makefile.compiler
    only when compiler is needed") we have had a generic `need-compiler'
    variable explicitly telling us if the compiler will be used and thus its
    capabilities need to be checked and expressed in the form of compilation
    flags.  If this variable is not set, then `make' functions such as
    `cc-option' are undefined, causing all kinds of weirdness to happen if
    we expect specific results to be returned, most recently:
    
    cc1: error: '-mloongson-mmi' must be used with '-mhard-float'
    
    messages with configurations such as `fuloong2e_defconfig' and the
    `modules_install' target, which does include `.config' and yet does not
    use the compiler.
    
    Replace the check for CONFIG_MIPS with one for `need-compiler' instead,
    so as to prevent the compiler from being ever called for CHECKFLAGS when
    not needed.
    
    Reported-by: Guillaume Tucker <guillaume.tucker@collabora.com>
    Closes: https://lore.kernel.org/r/85031c0c-d981-031e-8a50-bc4fad2ddcd8@collabora.com/
    Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
    Fixes: 805b2e1d427a ("kbuild: include Makefile.compiler only when compiler is needed")
    Cc: stable@vger.kernel.org # v5.13+
    Reported-by: "kernelci.org bot" <bot@kernelci.org>
    Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mm: hugetlb_vmemmap: fix a race between vmemmap pmd split [+ + +]

Author: Muchun Song <muchun.song@linux.dev>
Date:   Fri Jul 7 11:38:59 2023 +0800

    mm: hugetlb_vmemmap: fix a race between vmemmap pmd split
    
    commit 3ce2c24cb68f228590a053d6058a5901cd31af61 upstream.
    
    The local variable @page in __split_vmemmap_huge_pmd() is obtained from
    the pmd without holding page_table_lock, so it may possibly point at a
    page table page instead of a huge pmd page.
    
    The effect may show up in set_pte_at(), since we may pass an invalid page
    struct; if set_pte_at() wants to access the page struct (e.g.
    CONFIG_PAGE_TABLE_CHECK is enabled), it may crash the kernel.
    
    So fix it.  And inline __split_vmemmap_huge_pmd() since it only has one
    user.
    
    Link: https://lkml.kernel.org/r/20230707033859.16148-1-songmuchun@bytedance.com
    Fixes: d8d55f5616cf ("mm: sparsemem: use page table lock to protect kernel pmd operations")
    Signed-off-by: Muchun Song <songmuchun@bytedance.com>
    Cc: Mike Kravetz <mike.kravetz@oracle.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mm: multi-gen LRU: rename lrugen->lists[] to lrugen->folios[] [+ + +]

Author: Yu Zhao <yuzhao@google.com>
Date:   Wed Dec 21 21:19:00 2022 -0700

    mm: multi-gen LRU: rename lrugen->lists[] to lrugen->folios[]
    
    commit 6df1b2212950aae2b2188c6645ea18e2a9e3fdd5 upstream.
    
    lru_gen_folio will be chained into per-node lists by the coming
    lrugen->list.
    
    Link: https://lkml.kernel.org/r/20221222041905.2431096-3-yuzhao@google.com
    Signed-off-by: Yu Zhao <yuzhao@google.com>
    Cc: Johannes Weiner <hannes@cmpxchg.org>
    Cc: Jonathan Corbet <corbet@lwn.net>
    Cc: Michael Larabel <Michael@MichaelLarabel.com>
    Cc: Michal Hocko <mhocko@kernel.org>
    Cc: Mike Rapoport <rppt@kernel.org>
    Cc: Roman Gushchin <roman.gushchin@linux.dev>
    Cc: Suren Baghdasaryan <surenb@google.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mptcp: annotate data-races around msk->rmem_fwd_alloc [+ + +]

Author: Eric Dumazet <edumazet@google.com>
Date:   Thu Aug 31 13:52:10 2023 +0000

    mptcp: annotate data-races around msk->rmem_fwd_alloc
    
    [ Upstream commit 9531e4a83febc3fb47ac77e24cfb5ea97e50034d ]
    
    msk->rmem_fwd_alloc can be read locklessly.
    
    Add mptcp_rmem_fwd_alloc_add(), similar to sk_forward_alloc_add(),
    and appropriate READ_ONCE()/WRITE_ONCE() annotations.
    
    Fixes: 6511882cdd82 ("mptcp: allocate fwd memory separately on the rx and tx path")
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Cc: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

mtd: rawnand: brcmnand: Fix crash during the panic_write [+ + +]

Author: William Zhang <william.zhang@broadcom.com>
Date:   Thu Jul 6 11:29:07 2023 -0700

    mtd: rawnand: brcmnand: Fix crash during the panic_write
    
    commit e66dd317194daae0475fe9e5577c80aa97f16cb9 upstream.
    
    When executing a NAND command within the panic write path, wait for any
    pending command instead of calling BUG_ON to avoid crashing while
    already crashing.
    
    Fixes: 27c5b17cd1b1 ("mtd: nand: add NAND driver "library" for Broadcom STB NAND controller")
    Signed-off-by: William Zhang <william.zhang@broadcom.com>
    Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
    Reviewed-by: Kursad Oney <kursad.oney@broadcom.com>
    Reviewed-by: Kamal Dasu <kamal.dasu@broadcom.com>
    Cc: stable@vger.kernel.org
    Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
    Link: https://lore.kernel.org/linux-mtd/20230706182909.79151-4-william.zhang@broadcom.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mtd: rawnand: brcmnand: Fix ECC level field setting for v7.2 controller [+ + +]

Author: William Zhang <william.zhang@broadcom.com>
Date:   Thu Jul 6 11:29:05 2023 -0700

    mtd: rawnand: brcmnand: Fix ECC level field setting for v7.2 controller
    
    commit 2ec2839a9062db8a592525a3fdabd42dcd9a3a9b upstream.
    
    v7.2 controller has different ECC level field size and shift in the acc
    control register than its predecessor and successor controller. It needs
    to be set specifically.
    
    Fixes: decba6d47869 ("mtd: brcmnand: Add v7.2 controller support")
    Signed-off-by: William Zhang <william.zhang@broadcom.com>
    Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
    Cc: stable@vger.kernel.org
    Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
    Link: https://lore.kernel.org/linux-mtd/20230706182909.79151-2-william.zhang@broadcom.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mtd: rawnand: brcmnand: Fix potential false time out warning [+ + +]

Author: William Zhang <william.zhang@broadcom.com>
Date:   Thu Jul 6 11:29:06 2023 -0700

    mtd: rawnand: brcmnand: Fix potential false time out warning
    
    commit 9cc0a598b944816f2968baf2631757f22721b996 upstream.
    
    If the system is busy during the command status polling function, the
    driver may not get a chance to poll the status register before the
    timeout expires, and so returns a premature status.  Do a final check
    after the timeout to ensure the correct status is read.
    
    Fixes: 9d2ee0a60b8b ("mtd: nand: brcmnand: Check flash #WP pin status before nand erase/program")
    Signed-off-by: William Zhang <william.zhang@broadcom.com>
    Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
    Cc: stable@vger.kernel.org
    Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
    Link: https://lore.kernel.org/linux-mtd/20230706182909.79151-3-william.zhang@broadcom.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
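
The false time out in the entry above can be modelled in user space. This is a minimal sketch, not the driver code: struct fake_hw, hw_ready() and poll_status() are hypothetical, the clock is simulated, and a large step stands in for the CPU being busy between polls.

```c
#include <stdbool.h>

/* Hypothetical simulated controller: becomes ready at time ready_at;
 * each poll iteration costs "step" ticks of simulated time. */
struct fake_hw {
    unsigned long now;
    unsigned long ready_at;
    unsigned long step;
};

static bool hw_ready(const struct fake_hw *hw)
{
    return hw->now >= hw->ready_at;
}

/* Poll until ready or until timeout ticks have elapsed.
 * Returns 0 on success, -1 on time out. With final_check set, the
 * status is read one more time after the loop gives up. */
static int poll_status(struct fake_hw *hw, unsigned long timeout,
                       bool final_check)
{
    unsigned long limit = hw->now + timeout;

    do {
        if (hw_ready(hw))
            return 0;
        hw->now += hw->step;    /* time passes between polls */
    } while (hw->now < limit);

    if (final_check && hw_ready(hw))
        return 0;
    return -1;
}
```

If a single stall jumps the clock past both ready_at and the limit, the loop exits without seeing the ready status; only the variant with the final check reports success.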

mtd: rawnand: brcmnand: Fix potential out-of-bounds access in oob write [+ + +]

Author: William Zhang <william.zhang@broadcom.com>
Date:   Thu Jul 6 11:29:08 2023 -0700

    mtd: rawnand: brcmnand: Fix potential out-of-bounds access in oob write
    
    commit 5d53244186c9ac58cb88d76a0958ca55b83a15cd upstream.
    
    When the oob buffer length is not a multiple of the word size, the oob
    write function performs an out-of-bounds read on the oob source buffer
    in the last iteration. Fix that by always checking the length limit on
    the oob buffer read, and fill the oob registers with 0xff once the end
    of the buffer is reached.
    
    Fixes: 27c5b17cd1b1 ("mtd: nand: add NAND driver "library" for Broadcom STB NAND controller")
    Signed-off-by: William Zhang <william.zhang@broadcom.com>
    Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
    Cc: stable@vger.kernel.org
    Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
    Link: https://lore.kernel.org/linux-mtd/20230706182909.79151-5-william.zhang@broadcom.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
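
The bounds-checked packing described in the entry above might look roughly like the following. This is a sketch under assumptions: oob_word() is a hypothetical helper, and both the 4-byte word size and the byte order (first byte in the most significant position) are assumptions for illustration rather than a copy of the driver.

```c
#include <stddef.h>
#include <stdint.h>

/* Build the 32-bit register value for the four oob bytes starting at
 * offset off. Bytes past the end of the buffer are never read; they
 * are replaced with 0xff padding. */
static uint32_t oob_word(const uint8_t *oob, size_t len, size_t off)
{
    uint32_t word = 0;

    for (size_t k = 0; k < 4; k++) {
        uint8_t byte = (off + k < len) ? oob[off + k] : 0xff;

        word = (word << 8) | byte;
    }
    return word;
}
```

For a 5-byte buffer, the second word takes one real byte and three bytes of 0xff padding instead of reading past the buffer.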

mtd: spi-nor: Correct flags for Winbond w25q128 [+ + +]

Author: Linus Walleij <linus.walleij@linaro.org>
Date:   Tue Jul 18 13:56:11 2023 +0200

    mtd: spi-nor: Correct flags for Winbond w25q128
    
    commit 83e824a4a595132f9bd7ac4f5afff857bfc5991e upstream.
    
    The Winbond "w25q128" (actual vendor name W25Q128JV) has
    exactly the same flags as the sibling device "w25q128jv".
    The devices both require unlocking to enable write access.
    
    The actual product naming between devices vs the Linux
    strings in winbond.c:
    
    0xef4018: "w25q128"   W25Q128JV-IN/IQ/JQ
    0xef7018: "w25q128jv" W25Q128JV-IM/JM
    
    The latter device, "w25q128jv" supports features named DTQ
    and QPI, otherwise it is the same.
    
    Not having the right flags has the annoying side effect
    that write access does not work.
    
    After this patch I can write to the flash on the Inteno
    XG6846 router.
    
    The flash memory also supports dual and quad SPI modes.
    This does not currently manifest, but by turning on SFDP
    parsing, the right SPI modes are emitted in
    /sys/kernel/debug/spi-nor/spi1.0/capabilities
    for this chip, so we also turn on this.
    
    Since we now have determined that SFDP parsing works on
    the device, we also detect the geometry using SFDP.
    
    After this dmesg and sysfs says:
    [    1.062401] spi-nor spi1.0: w25q128 (16384 Kbytes)
    cat erasesize
    65536
    (16384*1024)/65536 = 256 sectors
    
    spi-nor sysfs:
    cat jedec_id
    ef4018
    cat manufacturer
    winbond
    cat partname
    w25q128
    hexdump -v -C sfdp
    00000000  53 46 44 50 05 01 00 ff  00 05 01 10 80 00 00 ff
    00000010  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff
    00000020  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff
    00000030  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff
    00000040  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff
    00000050  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff
    00000060  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff
    00000070  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff
    00000080  e5 20 f9 ff ff ff ff 07  44 eb 08 6b 08 3b 42 bb
    00000090  fe ff ff ff ff ff 00 00  ff ff 40 eb 0c 20 0f 52
    000000a0  10 d8 00 00 36 02 a6 00  82 ea 14 c9 e9 63 76 33
    000000b0  7a 75 7a 75 f7 a2 d5 5c  19 f7 4d ff e9 30 f8 80
    
    Cc: stable@vger.kernel.org
    Suggested-by: Michael Walle <michael@walle.cc>
    Reviewed-by: Michael Walle <michael@walle.cc>
    Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
    Link: https://lore.kernel.org/r/20230718-spi-nor-winbond-w25q128-v5-1-a73653ee46c3@linaro.org
    Signed-off-by: Tudor Ambarus <tudor.ambarus@linaro.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Multi-gen LRU: avoid race in inc_min_seq() [+ + +]

Author: Kalesh Singh <kaleshsingh@google.com>
Date:   Tue Aug 1 19:56:03 2023 -0700

    Multi-gen LRU: avoid race in inc_min_seq()
    
    commit bb5e7f234eacf34b65be67ebb3613e3b8cf11b87 upstream.
    
    inc_max_seq() will try to inc_min_seq() if nr_gens == MAX_NR_GENS. This
    is because the generations are reused (the last oldest now empty
    generation will become the next youngest generation).
    
    inc_min_seq() is retried until successful, dropping the lru_lock
    and yielding the CPU on each failure, and retaking the lock before
    trying again:
    
            while (!inc_min_seq(lruvec, type, can_swap)) {
                    spin_unlock_irq(&lruvec->lru_lock);
                    cond_resched();
                    spin_lock_irq(&lruvec->lru_lock);
            }
    
    However, the initial condition that required incrementing the min_seq
    (nr_gens == MAX_NR_GENS) is not retested. This can change by another
    call to inc_max_seq() from run_aging() with force_scan=true from the
    debugfs interface.
    
    Since the eviction stalls when the nr_gens == MIN_NR_GENS, avoid
    unnecessarily incrementing the min_seq by rechecking the number of
    generations before each attempt.
    
    This issue was uncovered in previous discussion on the list by Yu Zhao
    and Aneesh Kumar [1].
    
    [1] https://lore.kernel.org/linux-mm/CAOUHufbO7CaVm=xjEb1avDhHVvnC8pJmGyKcFf2iY_dpf+zR3w@mail.gmail.com/
    
    Link: https://lkml.kernel.org/r/20230802025606.346758-2-kaleshsingh@google.com
    Fixes: d6c3af7d8a2b ("mm: multi-gen LRU: debugfs interface")
    Signed-off-by: Kalesh Singh <kaleshsingh@google.com>
    Tested-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> [mediatek]
    Tested-by: Charan Teja Kalla <quic_charante@quicinc.com>
    Cc: Yu Zhao <yuzhao@google.com>
    Cc: Aneesh Kumar K V <aneesh.kumar@linux.ibm.com>
    Cc: Barry Song <baohua@kernel.org>
    Cc: Brian Geffon <bgeffon@google.com>
    Cc: Jan Alexander Steffens (heftig) <heftig@archlinux.org>
    Cc: Lecopzer Chen <lecopzer.chen@mediatek.com>
    Cc: Matthias Brugger <matthias.bgg@gmail.com>
    Cc: Oleksandr Natalenko <oleksandr@natalenko.name>
    Cc: Qi Zheng <zhengqi.arch@bytedance.com>
    Cc: Steven Barrett <steven@liquorix.net>
    Cc: Suleiman Souhlal <suleiman@google.com>
    Cc: Suren Baghdasaryan <surenb@google.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Multi-gen LRU: fix per-zone reclaim [+ + +]

Author: Kalesh Singh <kaleshsingh@google.com>
Date:   Tue Aug 1 19:56:02 2023 -0700

    Multi-gen LRU: fix per-zone reclaim
    
    commit 669281ee7ef731fb5204df9d948669bf32a5e68d upstream.
    
    MGLRU has a LRU list for each zone for each type (anon/file) in each
    generation:
    
            long nr_pages[MAX_NR_GENS][ANON_AND_FILE][MAX_NR_ZONES];
    
    The min_seq (oldest generation) can progress independently for each
    type but the max_seq (youngest generation) is shared for both anon and
    file. This is to maintain a common frame of reference.
    
    In order for eviction to advance the min_seq of a type, all the per-zone
    lists in the oldest generation of that type must be empty.
    
    The eviction logic only considers pages from eligible zones for
    eviction or promotion.
    
        scan_folios() {
            ...
            for (zone = sc->reclaim_idx; zone >= 0; zone--)  {
                ...
                sort_folio();       // Promote
                ...
                isolate_folio();    // Evict
            }
            ...
        }
    
    Consider the system has the movable zone configured and default 4
    generations. The current state of the system is as shown below
    (only illustrating one type for simplicity):
    
    Type: ANON
    
            Zone    DMA32     Normal    Movable    Device
    
            Gen 0       0          0        4GB         0
    
            Gen 1       0        1GB        1MB         0
    
            Gen 2     1MB        4GB        1MB         0
    
            Gen 3     1MB        1MB        1MB         0
    
    Now consider there is a GFP_KERNEL allocation request (eligible zone
    index <= Normal), evict_folios() will return without doing any work
    since there are no pages to scan in the eligible zones of the oldest
    generation. Reclaim won't make progress until triggered from a ZONE_MOVABLE
    allocation request; which may not happen soon if there is a lot of free
    memory in the movable zone. This can lead to OOM kills, although there
    is 1GB of pages in the Normal zone of Gen 1 that we have not yet tried to
    reclaim.
    
    This issue is not seen in the conventional active/inactive LRU since
    there are no per-zone lists.
    
    If there are no (not enough) folios to scan in the eligible zones, move
    folios from ineligible zone (zone_index > reclaim_index) to the next
    generation. This allows for the progression of min_seq and reclaiming
    from the next generation (Gen 1).
    
    Qualcomm, Mediatek and raspberrypi [1] discovered this issue independently.
    
    [1] https://github.com/raspberrypi/linux/issues/5395
    
    Link: https://lkml.kernel.org/r/20230802025606.346758-1-kaleshsingh@google.com
    Fixes: ac35a4902374 ("mm: multi-gen LRU: minimal implementation")
    Signed-off-by: Kalesh Singh <kaleshsingh@google.com>
    Reported-by: Charan Teja Kalla <quic_charante@quicinc.com>
    Reported-by: Lecopzer Chen <lecopzer.chen@mediatek.com>
    Tested-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> [mediatek]
    Tested-by: Charan Teja Kalla <quic_charante@quicinc.com>
    Cc: Yu Zhao <yuzhao@google.com>
    Cc: Barry Song <baohua@kernel.org>
    Cc: Brian Geffon <bgeffon@google.com>
    Cc: Jan Alexander Steffens (heftig) <heftig@archlinux.org>
    Cc: Matthias Brugger <matthias.bgg@gmail.com>
    Cc: Oleksandr Natalenko <oleksandr@natalenko.name>
    Cc: Qi Zheng <zhengqi.arch@bytedance.com>
    Cc: Steven Barrett <steven@liquorix.net>
    Cc: Suleiman Souhlal <suleiman@google.com>
    Cc: Suren Baghdasaryan <surenb@google.com>
    Cc: Aneesh Kumar K V <aneesh.kumar@linux.ibm.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

net/ipv6: SKB symmetric hash should incorporate transport ports

Author: Quan Tian <qtian@vmware.com>
Date:   Tue Sep 5 10:36:10 2023 +0000

    net/ipv6: SKB symmetric hash should incorporate transport ports
    
    commit a5e2151ff9d5852d0ababbbcaeebd9646af9c8d9 upstream.
    
    __skb_get_hash_symmetric() was added to compute a symmetric hash over
    the protocol, addresses and transport ports, by commit eb70db875671
    ("packet: Use symmetric hash for PACKET_FANOUT_HASH."). It uses
    flow_keys_dissector_symmetric_keys as the flow_dissector to incorporate
    IPv4 addresses, IPv6 addresses and ports. However, it should not specify
    the flag FLOW_DISSECTOR_F_STOP_AT_FLOW_LABEL, which stops further
    dissection when an IPv6 flow label is encountered, so transport ports
    are not incorporated in that case.
    
    As a consequence, the symmetric hash is based on the 5-tuple for IPv4 but
    on the 3-tuple for IPv6 when a flow label is present. This caused a few
    problems, e.g. when nft symhash and openvswitch l4_sym rely on the
    symmetric hash to perform load balancing: different L4 flows between two
    given IPv6 addresses would always get the same symmetric hash, leading
    to uneven traffic distribution.
    
    Removing the use of FLOW_DISSECTOR_F_STOP_AT_FLOW_LABEL makes sure the
    symmetric hash is based on 5-tuple for both IPv4 and IPv6 consistently.
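    
    The effect can be illustrated with a minimal user-space sketch. The
    struct toy_flow type, the toy_mix() mixer and the with_ports switch
    are all hypothetical stand-ins (not the kernel's flow dissector or
    jhash); with_ports = 0 models dissection stopping at the flow label:

```c
#include <stdint.h>

/* Hypothetical, simplified flow key; not the kernel's struct flow_keys. */
struct toy_flow {
	uint32_t src_addr;	/* stand-in for an address (IPv6 would be 128 bits) */
	uint32_t dst_addr;
	uint16_t src_port;
	uint16_t dst_port;
	uint8_t  proto;
};

/* Small integer mixer (toy, bijective in h for a fixed v). */
static uint32_t toy_mix(uint32_t h, uint32_t v)
{
	h ^= v;
	h *= 0x9e3779b1u;
	return h ^ (h >> 15);
}

/*
 * Symmetric hash: endpoints are ordered before mixing, so A->B and B->A
 * produce the same value. with_ports = 0 leaves the ports out, which
 * models the 3-tuple case described in the commit message.
 */
static uint32_t toy_symmetric_hash(const struct toy_flow *f, int with_ports)
{
	uint32_t lo_addr = f->src_addr < f->dst_addr ? f->src_addr : f->dst_addr;
	uint32_t hi_addr = f->src_addr < f->dst_addr ? f->dst_addr : f->src_addr;
	uint16_t lo_port = f->src_port < f->dst_port ? f->src_port : f->dst_port;
	uint16_t hi_port = f->src_port < f->dst_port ? f->dst_port : f->src_port;
	uint32_t h = toy_mix(0, f->proto);

	h = toy_mix(h, lo_addr);
	h = toy_mix(h, hi_addr);
	if (with_ports) {
		h = toy_mix(h, lo_port);
		h = toy_mix(h, hi_port);
	}
	return h;
}
```

    Two flows between the same pair of addresses that differ only in
    their ports collide when with_ports is 0 and are separated when it
    is 1, while each flow still hashes the same in both directions.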
    
    Fixes: eb70db875671 ("packet: Use symmetric hash for PACKET_FANOUT_HASH.")
    Reported-by: Lars Ekman <uablrek@gmail.com>
    Closes: https://github.com/antrea-io/antrea/issues/5457
    Signed-off-by: Quan Tian <qtian@vmware.com>
    Reviewed-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

net/mlx5: Free IRQ rmap and notifier on kernel shutdown

Author: Saeed Mahameed <saeedm@nvidia.com>
Date:   Thu Jun 8 12:00:54 2023 -0700

    net/mlx5: Free IRQ rmap and notifier on kernel shutdown
    
    commit 314ded538e5f22e7610b1bf621402024a180ec80 upstream.
    
    The kernel IRQ system needs the irq affinity notifier to be cleared
    before attempting to free the irq; see the WARN_ON log below.
    
    On a normal driver unload we don't have this issue since we do the
    complete cleanup of the irq resources.
    
    To fix this, move the cleanup of the important resources into a helper
    function and use it in both the normal driver unload and shutdown flows.
    
    [ 4497.498434] ------------[ cut here ]------------
    [ 4497.498726] WARNING: CPU: 0 PID: 9 at kernel/irq/manage.c:2034 free_irq+0x295/0x340
    [ 4497.499193] Modules linked in:
    [ 4497.499386] CPU: 0 PID: 9 Comm: kworker/0:1 Tainted: G        W          6.4.0-rc4+ #10
    [ 4497.499876] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-1.fc38 04/01/2014
    [ 4497.500518] Workqueue: events do_poweroff
    [ 4497.500849] RIP: 0010:free_irq+0x295/0x340
    [ 4497.501132] Code: 85 c0 0f 84 1d ff ff ff 48 89 ef ff d0 0f 1f 00 e9 10 ff ff ff 0f 0b e9 72 ff ff ff 49 8d 7f 28 ff d0 0f 1f 00 e9 df fd ff ff <0f> 0b 48 c7 80 c0 008
    [ 4497.502269] RSP: 0018:ffffc90000053da0 EFLAGS: 00010282
    [ 4497.502589] RAX: ffff888100949600 RBX: ffff88810330b948 RCX: 0000000000000000
    [ 4497.503035] RDX: ffff888100949600 RSI: ffff888100400490 RDI: 0000000000000023
    [ 4497.503472] RBP: ffff88810330c7e0 R08: ffff8881004005d0 R09: ffffffff8273a260
    [ 4497.503923] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8881009ae000
    [ 4497.504359] R13: ffff8881009ae148 R14: 0000000000000000 R15: ffff888100949600
    [ 4497.504804] FS:  0000000000000000(0000) GS:ffff88813bc00000(0000) knlGS:0000000000000000
    [ 4497.505302] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [ 4497.505671] CR2: 00007fce98806298 CR3: 000000000262e005 CR4: 0000000000370ef0
    [ 4497.506104] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    [ 4497.506540] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    [ 4497.507002] Call Trace:
    [ 4497.507158]  <TASK>
    [ 4497.507299]  ? free_irq+0x295/0x340
    [ 4497.507522]  ? __warn+0x7c/0x130
    [ 4497.507740]  ? free_irq+0x295/0x340
    [ 4497.507963]  ? report_bug+0x171/0x1a0
    [ 4497.508197]  ? handle_bug+0x3c/0x70
    [ 4497.508417]  ? exc_invalid_op+0x17/0x70
    [ 4497.508662]  ? asm_exc_invalid_op+0x1a/0x20
    [ 4497.508926]  ? free_irq+0x295/0x340
    [ 4497.509146]  mlx5_irq_pool_free_irqs+0x48/0x90
    [ 4497.509421]  mlx5_irq_table_free_irqs+0x38/0x50
    [ 4497.509714]  mlx5_core_eq_free_irqs+0x27/0x40
    [ 4497.509984]  shutdown+0x7b/0x100
    [ 4497.510184]  pci_device_shutdown+0x30/0x60
    [ 4497.510440]  device_shutdown+0x14d/0x240
    [ 4497.510698]  kernel_power_off+0x30/0x70
    [ 4497.510938]  process_one_work+0x1e6/0x3e0
    [ 4497.511183]  worker_thread+0x49/0x3b0
    [ 4497.511407]  ? __pfx_worker_thread+0x10/0x10
    [ 4497.511679]  kthread+0xe0/0x110
    [ 4497.511879]  ? __pfx_kthread+0x10/0x10
    [ 4497.512114]  ret_from_fork+0x29/0x50
    [ 4497.512342]  </TASK>
    
    Fixes: 9c2d08010963 ("net/mlx5: Free irqs only on shutdown callback")
    Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    Reviewed-by: Shay Drory <shayd@nvidia.com>
    Signed-off-by: Mathieu Tortuyaux <mtortuyaux@microsoft.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

net/sched: fq_pie: avoid stalls in fq_pie_timer()

Author: Eric Dumazet <edumazet@google.com>
Date:   Tue Aug 29 12:35:41 2023 +0000

    net/sched: fq_pie: avoid stalls in fq_pie_timer()
    
    [ Upstream commit 8c21ab1bae945686c602c5bfa4e3f3352c2452c5 ]
    
    When setting a high number of flows (limit being 65536),
    fq_pie_timer() is currently using too much time as syzbot reported.
    
    Add logic to yield the cpu every 2048 flows (less than 150 usec
    on debug kernels).
    It should also help by not blocking qdisc fast paths for too long.
    Worst case (65536 flows) would need 31 jiffies for a complete scan.
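    
    A minimal sketch of the batching idea, assuming a cursor that
    remembers where the previous timer invocation stopped. The names
    toy_scan, toy_scan_batch() and toy_invocations_for_full_scan() are
    hypothetical and only illustrate the arithmetic, not the driver code:

```c
#include <stddef.h>

#define TOY_BATCH 2048	/* flows handled per invocation, per the text above */

/* Hypothetical scan state; field names are illustrative. */
struct toy_scan {
	size_t flows_cnt;	/* total number of flows */
	size_t cursor;		/* next flow to visit */
};

/*
 * Visit at most TOY_BATCH flows starting at the cursor.
 * Returns 1 when the whole table has been scanned (cursor reset to 0),
 * 0 when the caller should re-arm the timer and continue later.
 */
static int toy_scan_batch(struct toy_scan *s, void (*visit)(size_t idx))
{
	size_t left = s->flows_cnt - s->cursor;
	size_t n = left < TOY_BATCH ? left : TOY_BATCH;

	for (size_t i = 0; i < n; i++) {
		if (visit)
			visit(s->cursor);
		s->cursor++;
	}
	if (s->cursor >= s->flows_cnt) {
		s->cursor = 0;
		return 1;
	}
	return 0;
}

/* Count how many invocations one complete scan needs. */
static unsigned int toy_invocations_for_full_scan(size_t flows_cnt)
{
	struct toy_scan s = { flows_cnt, 0 };
	unsigned int calls = 1;

	while (!toy_scan_batch(&s, NULL))
		calls++;
	return calls;
}
```

    With 65536 flows this takes 32 invocations, i.e. 31 re-arms after the
    first one, which would match the 31 jiffies quoted above if each
    re-arm costs about one jiffy (that reading is an assumption).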
    
    Relevant extract from syzbot report:
    
    rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { 0-.... } 2663 jiffies s: 873 root: 0x1/.
    rcu: blocking rcu_node structures (internal RCU debug):
    Sending NMI from CPU 1 to CPUs 0:
    NMI backtrace for cpu 0
    CPU: 0 PID: 5177 Comm: syz-executor273 Not tainted 6.5.0-syzkaller-00453-g727dbda16b83 #0
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 07/26/2023
    RIP: 0010:check_kcov_mode kernel/kcov.c:173 [inline]
    RIP: 0010:write_comp_data+0x21/0x90 kernel/kcov.c:236
    Code: 2e 0f 1f 84 00 00 00 00 00 65 8b 05 01 b2 7d 7e 49 89 f1 89 c6 49 89 d2 81 e6 00 01 00 00 49 89 f8 65 48 8b 14 25 80 b9 03 00 <a9> 00 01 ff 00 74 0e 85 f6 74 59 8b 82 04 16 00 00 85 c0 74 4f 8b
    RSP: 0018:ffffc90000007bb8 EFLAGS: 00000206
    RAX: 0000000000000101 RBX: ffffc9000dc0d140 RCX: ffffffff885893b0
    RDX: ffff88807c075940 RSI: 0000000000000100 RDI: 0000000000000001
    RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000000
    R10: 0000000000000000 R11: 0000000000000000 R12: ffffc9000dc0d178
    R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
    FS:  0000555555d54380(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 00007f6b442f6130 CR3: 000000006fe1c000 CR4: 00000000003506f0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    Call Trace:
     <NMI>
     </NMI>
     <IRQ>
     pie_calculate_probability+0x480/0x850 net/sched/sch_pie.c:415
     fq_pie_timer+0x1da/0x4f0 net/sched/sch_fq_pie.c:387
     call_timer_fn+0x1a0/0x580 kernel/time/timer.c:1700
    
    Fixes: ec97ecf1ebe4 ("net: sched: add Flow Queue PIE packet scheduler")
    Link: https://lore.kernel.org/lkml/00000000000017ad3f06040bf394@google.com/
    Reported-by: syzbot+e46fbd5289363464bc13@syzkaller.appspotmail.com
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Reviewed-by: Michal Kubiak <michal.kubiak@intel.com>
    Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com>
    Link: https://lore.kernel.org/r/20230829123541.3745013-1-edumazet@google.com
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net/smc: use smc_lgr_list.lock to protect smc_lgr_list.list iterate in smcr_port_add

Author: Guangguan Wang <guangguan.wang@linux.alibaba.com>
Date:   Fri Sep 8 11:31:43 2023 +0800

    net/smc: use smc_lgr_list.lock to protect smc_lgr_list.list iterate in smcr_port_add
    
    [ Upstream commit f5146e3ef0a9eea405874b36178c19a4863b8989 ]
    
    While smcr_port_add() is running, link groups may be added to or deleted
    from smc_lgr_list.list at the same time, which may result in a kernel
    crash. So, use smc_lgr_list.lock to protect the iteration over
    smc_lgr_list.list in smcr_port_add().
    
    The crash call trace is shown below:
    BUG: kernel NULL pointer dereference, address: 0000000000000000
    PGD 0 P4D 0
    Oops: 0000 [#1] SMP NOPTI
    CPU: 0 PID: 559726 Comm: kworker/0:92 Kdump: loaded Tainted: G
    Hardware name: Alibaba Cloud Alibaba Cloud ECS, BIOS 449e491 04/01/2014
    Workqueue: events smc_ib_port_event_work [smc]
    RIP: 0010:smcr_port_add+0xa6/0xf0 [smc]
    RSP: 0000:ffffa5a2c8f67de0 EFLAGS: 00010297
    RAX: 0000000000000001 RBX: ffff9935e0650000 RCX: 0000000000000000
    RDX: 0000000000000010 RSI: ffff9935e0654290 RDI: ffff9935c8560000
    RBP: 0000000000000000 R08: 0000000000000000 R09: ffff9934c0401918
    R10: 0000000000000000 R11: ffffffffb4a5c278 R12: ffff99364029aae4
    R13: ffff99364029aa00 R14: 00000000ffffffed R15: ffff99364029ab08
    FS:  0000000000000000(0000) GS:ffff994380600000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 0000000000000000 CR3: 0000000f06a10003 CR4: 0000000002770ef0
    PKRU: 55555554
    Call Trace:
     smc_ib_port_event_work+0x18f/0x380 [smc]
     process_one_work+0x19b/0x340
     worker_thread+0x30/0x370
     ? process_one_work+0x340/0x340
     kthread+0x114/0x130
     ? __kthread_cancel_work+0x50/0x50
     ret_from_fork+0x1f/0x30
    
    Fixes: 1f90a05d9ff9 ("net/smc: add smcr_port_add() and smcr_link_up() processing")
    Signed-off-by: Guangguan Wang <guangguan.wang@linux.alibaba.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net/tls: do not free tls_rec on async operation in bpf_exec_tx_verdict()

Author: Liu Jian <liujian56@huawei.com>
Date:   Sat Sep 9 16:14:34 2023 +0800

    net/tls: do not free tls_rec on async operation in bpf_exec_tx_verdict()
    
    [ Upstream commit cfaa80c91f6f99b9342b6557f0f0e1143e434066 ]
    
    I got the warning below when doing fuzz testing:
    BUG: KASAN: null-ptr-deref in scatterwalk_copychunks+0x320/0x470
    Read of size 4 at addr 0000000000000008 by task kworker/u8:1/9
    
    CPU: 0 PID: 9 Comm: kworker/u8:1 Tainted: G           OE
    Hardware name: linux,dummy-virt (DT)
    Workqueue: pencrypt_parallel padata_parallel_worker
    Call trace:
     dump_backtrace+0x0/0x420
     show_stack+0x34/0x44
     dump_stack+0x1d0/0x248
     __kasan_report+0x138/0x140
     kasan_report+0x44/0x6c
     __asan_load4+0x94/0xd0
     scatterwalk_copychunks+0x320/0x470
     skcipher_next_slow+0x14c/0x290
     skcipher_walk_next+0x2fc/0x480
     skcipher_walk_first+0x9c/0x110
     skcipher_walk_aead_common+0x380/0x440
     skcipher_walk_aead_encrypt+0x54/0x70
     ccm_encrypt+0x13c/0x4d0
     crypto_aead_encrypt+0x7c/0xfc
     pcrypt_aead_enc+0x28/0x84
     padata_parallel_worker+0xd0/0x2dc
     process_one_work+0x49c/0xbdc
     worker_thread+0x124/0x880
     kthread+0x210/0x260
     ret_from_fork+0x10/0x18
    
    This is because the value of rec_seq of tls_crypto_info configured by the
    user program is too large, for example, 0xffffffffffffff. In addition, TLS
    is asynchronously accelerated. When tls_do_encryption() returns
    -EINPROGRESS and sk->sk_err is set to EBADMSG due to rec_seq overflow,
    skmsg is released before the asynchronous encryption process ends. As a
    result, the UAF problem occurs during the asynchronous processing of the
    encryption module.
    
    If the operation is asynchronous and the encryption module returns
    EINPROGRESS, do not free the record information.
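    
    The rule can be sketched as a small predicate. toy_may_free_record()
    is a hypothetical helper, and treating sk_err == EBADMSG as the
    trigger for freeing is an assumption drawn from the description
    above, not a copy of the kernel code:

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical helper: may the caller free the record after an
 * encryption attempt returned `err` while the socket error is `sk_err`?
 */
static bool toy_may_free_record(int err, int sk_err)
{
	if (err == 0)
		return false;	/* success: nothing to unwind */
	if (err == -EINPROGRESS)
		return false;	/* async crypto still uses the record */
	return sk_err == EBADMSG;	/* synchronous failure: safe to free */
}
```

    The middle check is the point of the fix: an in-flight asynchronous
    request still references the record, so it must outlive this call.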
    
    Fixes: 635d93981786 ("net/tls: free record only on encryption error")
    Signed-off-by: Liu Jian <liujian56@huawei.com>
    Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
    Link: https://lore.kernel.org/r/20230909081434.2324940-1-liujian56@huawei.com
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: add SKB_HEAD_ALIGN() helper

Author: Eric Dumazet <edumazet@google.com>
Date:   Fri Sep 15 23:51:02 2023 +0530

    net: add SKB_HEAD_ALIGN() helper
    
    commit 115f1a5c42bdad9a9ea356fc0b4a39ec7537947f upstream.
    
    We have many places using this expression:
    
     SKB_DATA_ALIGN(sizeof(struct skb_shared_info))
    
    Using SKB_HEAD_ALIGN() will allow us to clean them up.
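    
    A user-space sketch of what such a helper could look like. The
    TOY_* names, the 64-byte cache line and the 320-byte stand-in struct
    are assumptions for illustration; the shape of TOY_HEAD_ALIGN()
    (aligned head size plus aligned shared info) is likewise assumed
    rather than quoted from the patch:

```c
#include <stddef.h>

#define TOY_CACHE_BYTES 64	/* assumed cache line size */

/* Round X up to a multiple of the cache line size. */
#define TOY_DATA_ALIGN(X) \
	(((X) + (TOY_CACHE_BYTES - 1)) & ~(size_t)(TOY_CACHE_BYTES - 1))

/* Stand-in for struct skb_shared_info; the real layout differs. */
struct toy_shared_info {
	char pad[320];
};

/* Assumed helper shape: aligned head room plus aligned shared info. */
#define TOY_HEAD_ALIGN(X) \
	(TOY_DATA_ALIGN(X) + TOY_DATA_ALIGN(sizeof(struct toy_shared_info)))
```

    Callers that previously spelled out both aligned terms by hand can
    then use the single macro, for example TOY_HEAD_ALIGN(len).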
    
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
    Acked-by: Paolo Abeni <pabeni@redhat.com>
    Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    [Ajay: Regenerated the patch for v6.1.y]
    Signed-off-by: Ajay Kaher <akaher@vmware.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

net: annotate data-races around sk->sk_forward_alloc

Author: Eric Dumazet <edumazet@google.com>
Date:   Thu Aug 31 13:52:09 2023 +0000

    net: annotate data-races around sk->sk_forward_alloc
    
    [ Upstream commit 5e6300e7b3a4ab5b72a82079753868e91fbf9efc ]
    
    Every time sk->sk_forward_alloc is read locklessly,
    add a READ_ONCE().
    
    Add sk_forward_alloc_add() helper to centralize updates,
    to reduce number of WRITE_ONCE().
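    
    A minimal user-space sketch of the pattern. TOY_READ_ONCE() and
    TOY_WRITE_ONCE() are simplified int-only stand-ins for the kernel
    macros, and struct toy_sock and both helpers are hypothetical:

```c
/* Simplified, int-only stand-ins for READ_ONCE()/WRITE_ONCE(). */
#define TOY_READ_ONCE(x)	(*(const volatile int *)&(x))
#define TOY_WRITE_ONCE(x, v)	(*(volatile int *)&(x) = (v))

/* Stand-in for struct sock with only the field of interest. */
struct toy_sock {
	int sk_forward_alloc;
};

/* Single place where the field is updated, so the write is annotated once. */
static inline void toy_forward_alloc_add(struct toy_sock *sk, int val)
{
	TOY_WRITE_ONCE(sk->sk_forward_alloc, sk->sk_forward_alloc + val);
}

/* Lockless reader: the marked load cannot be torn or re-read by the compiler. */
static inline int toy_forward_alloc_get(const struct toy_sock *sk)
{
	return TOY_READ_ONCE(sk->sk_forward_alloc);
}
```

    Funnelling every update through one helper keeps the annotation in a
    single spot instead of repeating it at each call site.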
    
    Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: deal with integer overflows in kmalloc_reserve()

Author: Eric Dumazet <edumazet@google.com>
Date:   Fri Sep 15 23:51:05 2023 +0530

    net: deal with integer overflows in kmalloc_reserve()
    
    commit 915d975b2ffa58a14bfcf16fafe00c41315949ff upstream.
    
    Blamed commit changed:
        ptr = kmalloc(size);
        if (ptr)
          size = ksize(ptr);
    
    to:
        size = kmalloc_size_roundup(size);
        ptr = kmalloc(size);
    
    This allowed various crashes, as reported by syzbot [1]
    and Kyle Zeng.
    
    Problem is that if @size is bigger than 0x80000001,
    kmalloc_size_roundup(size) returns 2^32.
    
    kmalloc_reserve() uses a 32bit variable (obj_size),
    so 2^32 is truncated to 0.
    
    kmalloc(0) returns ZERO_SIZE_PTR which is not handled by
    skb allocations.
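    
    The truncation can be reproduced in user space. toy_size_roundup() is
    a stand-in that rounds up to the next power of two (an assumption
    about how very large sizes are rounded), and toy_obj_size_checked()
    is one hypothetical way to guard against it, not the actual fix:

```c
#include <stdint.h>

/*
 * Stand-in for the allocator's size round-up on very large requests:
 * next power of two. Inputs above 2^63 are clamped to 2^63 so that the
 * loop always terminates.
 */
static uint64_t toy_size_roundup(uint64_t size)
{
	uint64_t r = 1;

	while (r < size && r < (1ULL << 63))
		r <<= 1;
	return r;
}

/* Buggy pattern: the rounded size is stored in a 32-bit variable. */
static uint32_t toy_obj_size_32bit(uint64_t size)
{
	return (uint32_t)toy_size_roundup(size);	/* 2^32 becomes 0 */
}

/* Hypothetical guard: keep the full width and reject what does not fit. */
static int toy_obj_size_checked(uint64_t size, uint32_t *out)
{
	uint64_t r = toy_size_roundup(size);

	if (r > UINT32_MAX)
		return -1;
	*out = (uint32_t)r;
	return 0;
}
```

    For a request of 0x80000002 bytes the rounded size is 2^32, the
    32-bit copy is 0, and the checked variant reports an error instead.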
    
    The following trace can be triggered if netdev->mtu is set
    close to 0x7fffffff.
    
    We might in the future limit netdev->mtu to a more sensible
    limit (like KMALLOC_MAX_SIZE).
    
    This patch is based on a syzbot report, and also a report
    and tentative fix from Kyle Zeng.
    
    [1]
    BUG: KASAN: user-memory-access in __build_skb_around net/core/skbuff.c:294 [inline]
    BUG: KASAN: user-memory-access in __alloc_skb+0x3c4/0x6e8 net/core/skbuff.c:527
    Write of size 32 at addr 00000000fffffd10 by task syz-executor.4/22554
    
    CPU: 1 PID: 22554 Comm: syz-executor.4 Not tainted 6.1.39-syzkaller #0
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 07/03/2023
    Call trace:
    dump_backtrace+0x1c8/0x1f4 arch/arm64/kernel/stacktrace.c:279
    show_stack+0x2c/0x3c arch/arm64/kernel/stacktrace.c:286
    __dump_stack lib/dump_stack.c:88 [inline]
    dump_stack_lvl+0x120/0x1a0 lib/dump_stack.c:106
    print_report+0xe4/0x4b4 mm/kasan/report.c:398
    kasan_report+0x150/0x1ac mm/kasan/report.c:495
    kasan_check_range+0x264/0x2a4 mm/kasan/generic.c:189
    memset+0x40/0x70 mm/kasan/shadow.c:44
    __build_skb_around net/core/skbuff.c:294 [inline]
    __alloc_skb+0x3c4/0x6e8 net/core/skbuff.c:527
    alloc_skb include/linux/skbuff.h:1316 [inline]
    igmpv3_newpack+0x104/0x1088 net/ipv4/igmp.c:359
    add_grec+0x81c/0x1124 net/ipv4/igmp.c:534
    igmpv3_send_cr net/ipv4/igmp.c:667 [inline]
    igmp_ifc_timer_expire+0x1b0/0x1008 net/ipv4/igmp.c:810
    call_timer_fn+0x1c0/0x9f0 kernel/time/timer.c:1474
    expire_timers kernel/time/timer.c:1519 [inline]
    __run_timers+0x54c/0x710 kernel/time/timer.c:1790
    run_timer_softirq+0x28/0x4c kernel/time/timer.c:1803
    _stext+0x380/0xfbc
    ____do_softirq+0x14/0x20 arch/arm64/kernel/irq.c:79
    call_on_irq_stack+0x24/0x4c arch/arm64/kernel/entry.S:891
    do_softirq_own_stack+0x20/0x2c arch/arm64/kernel/irq.c:84
    invoke_softirq kernel/softirq.c:437 [inline]
    __irq_exit_rcu+0x1c0/0x4cc kernel/softirq.c:683
    irq_exit_rcu+0x14/0x78 kernel/softirq.c:695
    el0_interrupt+0x7c/0x2e0 arch/arm64/kernel/entry-common.c:717
    __el0_irq_handler_common+0x18/0x24 arch/arm64/kernel/entry-common.c:724
    el0t_64_irq_handler+0x10/0x1c arch/arm64/kernel/entry-common.c:729
    el0t_64_irq+0x1a0/0x1a4 arch/arm64/kernel/entry.S:584
    
    Fixes: 12d6c1d3a2ad ("skbuff: Proactively round up to kmalloc bucket size")
    Reported-by: syzbot <syzkaller@googlegroups.com>
    Reported-by: Kyle Zeng <zengyhkyle@gmail.com>
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Cc: Kees Cook <keescook@chromium.org>
    Cc: Vlastimil Babka <vbabka@suse.cz>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    [Ajay: Regenerated the patch for v6.1.y]
    Signed-off-by: Ajay Kaher <akaher@vmware.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

net: dsa: sja1105: block FDB accesses that are concurrent with a switch reset

Author: Vladimir Oltean <vladimir.oltean@nxp.com>
Date:   Fri Sep 8 16:33:52 2023 +0300

    net: dsa: sja1105: block FDB accesses that are concurrent with a switch reset
    
    [ Upstream commit 86899e9e1e29e854b5f6dcc24ba4f75f792c89aa ]
    
    Currently, when we add the first sja1105 port to a bridge with
    vlan_filtering 1, then we sometimes see this output:
    
    sja1105 spi2.2: port 4 failed to read back entry for be:79:b4:9e:9e:96 vid 3088: -ENOENT
    sja1105 spi2.2: Reset switch and programmed static config. Reason: VLAN filtering
    sja1105 spi2.2: port 0 failed to add be:79:b4:9e:9e:96 vid 0 to fdb: -2
    
    It is because sja1105_fdb_add() runs from the dsa_owq which is no longer
    serialized with switch resets since it dropped the rtnl_lock() in the
    blamed commit.
    
    Either performing the FDB accesses before the reset, or after the reset,
    is equally fine, because sja1105_static_fdb_change() backs up those
    changes in the static config, but FDB access during reset isn't ok.
    
    Make sja1105_static_config_reload() take the fdb_lock to fix that.
    
    Fixes: 0faf890fc519 ("net: dsa: drop rtnl_lock from dsa_slave_switchdev_event_work")
    Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: dsa: sja1105: complete tc-cbs offload support on SJA1110

Author: Vladimir Oltean <vladimir.oltean@nxp.com>
Date:   Wed Sep 6 00:53:38 2023 +0300

    net: dsa: sja1105: complete tc-cbs offload support on SJA1110
    
    [ Upstream commit 180a7419fe4adc8d9c8e0ef0fd17bcdd0cf78acd ]
    
    The blamed commit left this delta behind:
    
      struct sja1105_cbs_entry {
     -      u64 port;
     -      u64 prio;
     +      u64 port; /* Not used for SJA1110 */
     +      u64 prio; /* Not used for SJA1110 */
            u64 credit_hi;
            u64 credit_lo;
            u64 send_slope;
            u64 idle_slope;
      };
    
    but did not actually implement tc-cbs offload fully for the new switch.
    The offload is accepted, but it doesn't work.
    
    The difference compared to earlier switch generations is that now, the
    table of CBS shapers is sparse, because there are many more shapers, so
    the mapping between a {port, prio} and a table index is static, rather
    than requiring us to store the port and prio into the sja1105_cbs_entry.
    
    So, the problem is that the code programs the CBS shaper parameters at a
    dynamic table index which is incorrect.
    
    All that needs to be done for SJA1110 CBS shapers to work is to bypass
    the logic which allocates shapers in a dense manner, as for SJA1105, and
    use the fixed mapping instead.
    
    Fixes: 3e77e59bf8cf ("net: dsa: sja1105: add support for the SJA1110 switch family")
    Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: dsa: sja1105: fix -ENOSPC when replacing the same tc-cbs too many times

Author: Vladimir Oltean <vladimir.oltean@nxp.com>
Date:   Wed Sep 6 00:53:37 2023 +0300

    net: dsa: sja1105: fix -ENOSPC when replacing the same tc-cbs too many times
    
    [ Upstream commit 894cafc5c62ccced758077bd4e970dc714c42637 ]
    
    After running command [2] too many times in a row:
    
    [1] $ tc qdisc add dev sw2p0 root handle 1: mqprio num_tc 8 \
            map 0 1 2 3 4 5 6 7 queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 hw 0
    [2] $ tc qdisc replace dev sw2p0 parent 1:1 cbs offload 1 \
            idleslope 120000 sendslope -880000 locredit -1320 hicredit 180
    
    (aka more than priv->info->num_cbs_shapers times)
    
    we start seeing the following error message:
    
    Error: Specified device failed to setup cbs hardware offload.
    
    This comes from the fact that ndo_setup_tc(TC_SETUP_QDISC_CBS) presents
    the same API for the qdisc create and replace cases, and the sja1105
    driver fails to distinguish between the 2. Thus, it always thinks that
    it must allocate the same shaper for a {port, queue} pair, when it may
    instead have to replace an existing one.
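    
    A sketch of the create-versus-replace distinction, assuming a small
    dense shaper table. All toy_* names and the table size of 4 are
    hypothetical; the point is only that an existing {port, prio} entry
    is looked up before a free slot is consumed:

```c
#include <errno.h>
#include <stdbool.h>

#define TOY_NUM_SHAPERS 4	/* stand-in for priv->info->num_cbs_shapers */

struct toy_shaper {
	bool in_use;
	int port;
	int prio;
	int idle_slope;
};

struct toy_switch {
	struct toy_shaper cbs[TOY_NUM_SHAPERS];
};

/* Return the index of the shaper already bound to {port, prio}, or -1. */
static int toy_find_shaper(const struct toy_switch *sw, int port, int prio)
{
	for (int i = 0; i < TOY_NUM_SHAPERS; i++)
		if (sw->cbs[i].in_use && sw->cbs[i].port == port &&
		    sw->cbs[i].prio == prio)
			return i;
	return -1;
}

/* Return the index of the first free shaper, or -1 if the table is full. */
static int toy_find_unused(const struct toy_switch *sw)
{
	for (int i = 0; i < TOY_NUM_SHAPERS; i++)
		if (!sw->cbs[i].in_use)
			return i;
	return -1;
}

/* Program a shaper; returns the table index used or -ENOSPC. */
static int toy_setup_cbs(struct toy_switch *sw, int port, int prio,
			 int idle_slope)
{
	int i = toy_find_shaper(sw, port, prio);	/* replace case */

	if (i < 0)
		i = toy_find_unused(sw);		/* create case */
	if (i < 0)
		return -ENOSPC;

	sw->cbs[i] = (struct toy_shaper){ true, port, prio, idle_slope };
	return i;
}
```

    Replacing the same {port, prio} any number of times keeps reusing
    one slot, so -ENOSPC only appears once the table is genuinely full.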
    
    Fixes: 4d7525085a9b ("net: dsa: sja1105: offload the Credit-Based Shaper qdisc")
    Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: dsa: sja1105: fix bandwidth discrepancy between tc-cbs software and offload

Author: Vladimir Oltean <vladimir.oltean@nxp.com>
Date:   Wed Sep 6 00:53:36 2023 +0300

    net: dsa: sja1105: fix bandwidth discrepancy between tc-cbs software and offload
    
    [ Upstream commit 954ad9bf13c4f95a4958b5f8433301f2ab99e1f5 ]
    
    More careful measurement of the tc-cbs bandwidth shows that as the
    stream bandwidth (effectively idleslope) increases, there is a larger
    and larger discrepancy between the rate limit obtained by the software
    Qdisc and the rate limit obtained by its offloaded counterpart.
    
    The discrepancy becomes so large, that e.g. at an idleslope of 40000
    (40Mbps), the offloaded cbs does not actually rate limit anything, and
    traffic will pass at line rate through a 100 Mbps port.
    
    The reason for the discrepancy is that the hardware documentation I've
    been following is incorrect. UM11040.pdf (for SJA1105P/Q/R/S) states
    about IDLE_SLOPE that it is "the rate (in unit of bytes/sec) at which
    the credit counter is increased".
    
    Cross-checking with UM10944.pdf (for SJA1105E/T) and UM11107.pdf
    (for SJA1110), the wording is different: "This field specifies the
    value, in bytes per second times link speed, by which the credit counter
    is increased".
    
    So there's an extra scaling for link speed that the driver is currently
    not accounting for, and apparently (empirically), that link speed is
    expressed in Kbps.
    
    I've pondered whether to pollute the sja1105_mac_link_up()
    implementation with CBS shaper reprogramming, but I don't think it is
    worth it. IMO, the UAPI exposed by tc-cbs requires user space to
    recalculate the sendslope anyway, since the formula for that depends on
    port_transmit_rate (see man tc-cbs), which is not an invariant from tc's
    perspective.
    
    So we use the offload->sendslope and offload->idleslope to deduce the
    original port_transmit_rate from the CBS formula, and use that value to
    scale the offload->sendslope and offload->idleslope to values that the
    hardware understands.
    
    Some numerical data points:
    
     40Mbps stream, max interfering frame size 1500, port speed 100M
     ---------------------------------------------------------------
    
     tc-cbs parameters:
     idleslope 40000 sendslope -60000 locredit -900 hicredit 600
    
     which result in hardware values:
    
     Before (doesn't work)           After (works)
     credit_hi    600                600
     credit_lo    900                900
     send_slope   7500000            75
     idle_slope   5000000            50
    
     40Mbps stream, max interfering frame size 1500, port speed 1G
     -------------------------------------------------------------
    
     tc-cbs parameters:
     idleslope 40000 sendslope -960000 locredit -1440 hicredit 60
    
     which result in hardware values:
    
     Before (doesn't work)           After (works)
     credit_hi    60                 60
     credit_lo    1440               1440
     send_slope   120000000          120
     idle_slope   5000000            5
    
     5.12Mbps stream, max interfering frame size 1522, port speed 100M
     -----------------------------------------------------------------
    
     tc-cbs parameters:
     idleslope 5120 sendslope -94880 locredit -1444 hicredit 77
    
     which result in hardware values:
    
     Before (doesn't work)           After (works)
     credit_hi    77                 77
     credit_lo    1444               1444
     send_slope   11860000           118
     idle_slope   640000             6
    
    Tested on SJA1105T, SJA1105S and SJA1110A, at 1Gbps and 100Mbps.
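    
    The "After" columns above can be reproduced with a short sketch. It
    assumes the tc-cbs relation sendslope = idleslope - port_transmit_rate
    (slopes in kbit/s), converts kbit/s to bytes/s with a factor of 125,
    and divides by the deduced port rate in Kbps. toy_cbs_scale() and
    struct toy_cbs_hw are hypothetical names:

```c
#include <stdint.h>
#include <stdlib.h>

#define TOY_BYTES_PER_KBIT (1000 / 8)	/* 125 bytes/s per kbit/s */

struct toy_cbs_hw {
	int64_t idle_slope;
	int64_t send_slope;
};

/*
 * idleslope and sendslope are in kbit/s as passed by tc-cbs; sendslope
 * is negative. Returns zeros if the deduced port rate is not positive.
 */
static struct toy_cbs_hw toy_cbs_scale(int64_t idleslope, int64_t sendslope)
{
	/* sendslope = idleslope - port_transmit_rate, solved for the rate */
	int64_t port_rate_kbps = idleslope - sendslope;
	struct toy_cbs_hw hw = { 0, 0 };

	if (port_rate_kbps <= 0)
		return hw;

	/* bytes/s divided by link speed in Kbps, truncating like integer HW fields */
	hw.idle_slope = idleslope * TOY_BYTES_PER_KBIT / port_rate_kbps;
	hw.send_slope = llabs(sendslope * TOY_BYTES_PER_KBIT) / port_rate_kbps;
	return hw;
}
```

    For example, idleslope 40000 and sendslope -60000 give a deduced port
    rate of 100000 Kbps, so 5000000 / 100000 = 50 and 7500000 / 100000 = 75.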
    
    Fixes: 4d7525085a9b ("net: dsa: sja1105: offload the Credit-Based Shaper qdisc")
    Reported-by: Yanan Yang <yanan.yang@nxp.com>
    Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: dsa: sja1105: fix multicast forwarding working only for last added mdb entry

Author: Vladimir Oltean <vladimir.oltean@nxp.com>
Date:   Fri Sep 8 16:33:50 2023 +0300

    net: dsa: sja1105: fix multicast forwarding working only for last added mdb entry
    
    [ Upstream commit 7cef293b9a634a05fcce9e1df4aee3aeed023345 ]
    
    The commit cited in Fixes: did 2 things: it refactored the read-back
    polling from sja1105_dynamic_config_read() into a new function,
    sja1105_dynamic_config_wait_complete(), and it called that from
    sja1105_dynamic_config_write() too.
    
    What is problematic is the refactoring.
    
    The refactored code from sja1105_dynamic_config_poll_valid() works like
    the previous one, but the problem is that it uses another packed_buf[]
    SPI buffer, and there was code at the end of sja1105_dynamic_config_read()
    which was relying on the read-back packed_buf[]:
    
            /* Don't dereference possibly NULL pointer - maybe caller
             * only wanted to see whether the entry existed or not.
             */
            if (entry)
                    ops->entry_packing(packed_buf, entry, UNPACK);
    
    After the change, the packed_buf[] that this code sees is no longer the
    entry read back from hardware, but the original entry that the caller
    passed to the sja1105_dynamic_config_read(), packed into this buffer.
    
    This difference is the most notable with the SJA1105_SEARCH uses from
    sja1105pqrs_fdb_add() - used for both fdb and mdb. There, we have logic
    added by commit 728db843df88 ("net: dsa: sja1105: ignore the FDB entry
    for unknown multicast when adding a new address") to figure out whether
    the address we're trying to add matches on any existing hardware entry,
    with the exception of the catch-all multicast address.
    
    That logic was broken, because with sja1105_dynamic_config_read() not
    working properly, it doesn't return us the entry read back from
    hardware, but the entry that we passed to it. And, since for multicast,
    a match will always exist, it will tell us that any mdb entry already
    exists at index=0 L2 Address Lookup table. It is index=0 because the
    caller doesn't know the index - it wants to find it out, and
    sja1105_dynamic_config_read() does:
    
            if (index < 0) { // SJA1105_SEARCH
                    /* Avoid copying a signed negative number to an u64 */
                    cmd.index = 0; // <- this
                    cmd.search = true;
            } else {
                    cmd.index = index;
                    cmd.search = false;
            }
    
    So, to the caller of sja1105_dynamic_config_read(), the returned info
    looks entirely legit, and it will add all mdb entries to FDB index 0.
    There, they will always overwrite each other (not to mention,
    potentially they can also overwrite a pre-existing bridge fdb entry),
    and the user-visible impact will be that only the last mdb entry will be
    forwarded as it should. The others won't (will be flooded or dropped,
    depending on the egress flood settings).
    
    Fixing is a bit more complicated, and involves either passing the same
    packed_buf[] to sja1105_dynamic_config_wait_complete(), or moving all
    the extra processing on the packed_buf[] to
    sja1105_dynamic_config_wait_complete(). I've opted for the latter,
    because it makes sja1105_dynamic_config_wait_complete() a bit more
    self-contained.
    
    Fixes: df405910ab9f ("net: dsa: sja1105: wait for dynamic config command completion on writes too")
    Reported-by: Yanan Yang <yanan.yang@nxp.com>
    Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: dsa: sja1105: hide all multicast addresses from "bridge fdb show"

Author: Vladimir Oltean <vladimir.oltean@nxp.com>
Date:   Fri Sep 8 16:33:48 2023 +0300

    net: dsa: sja1105: hide all multicast addresses from "bridge fdb show"
    
    [ Upstream commit 02c652f5465011126152bbd93b6a582a1d0c32f1 ]
    
    Commit 4d9423549501 ("net: dsa: sja1105: offload bridge port flags to
    device") has partially hidden some multicast entries from showing up in
    the "bridge fdb show" output, but it wasn't enough. Addresses which are
    added through "bridge mdb add" still show up. Hide them all.
    
    Fixes: 291d1e72b756 ("net: dsa: sja1105: Add support for FDB and MDB management")
    Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: dsa: sja1105: propagate exact error code from sja1105_dynamic_config_poll_valid()

Author: Vladimir Oltean <vladimir.oltean@nxp.com>
Date:   Fri Sep 8 16:33:49 2023 +0300

    net: dsa: sja1105: propagate exact error code from sja1105_dynamic_config_poll_valid()
    
    [ Upstream commit c956798062b5a308db96e75157747291197f0378 ]
    
    Currently, sja1105_dynamic_config_wait_complete() returns either 0 or
    -ETIMEDOUT, because it just looks at the read_poll_timeout() return code.
    
    There will be future changes which move some more checks to
    sja1105_dynamic_config_poll_valid(). It is important that we propagate
    their exact return code (-ENOENT, -EINVAL), because callers of
    sja1105_dynamic_config_read() depend on them.
    
    Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Stable-dep-of: 7cef293b9a63 ("net: dsa: sja1105: fix multicast forwarding working only for last added mdb entry")
    Signed-off-by: Sasha Levin <sashal@kernel.org>
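
    The error propagation described above can be sketched in plain C. This is
    only an illustration under stated assumptions, not the sja1105 driver code:
    poll_fn, wait_complete() and the demo callbacks are hypothetical names, and
    the "negative = error, 0 = not ready, positive = ready" convention is
    assumed for the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical callback convention (an assumption for this sketch):
 *   < 0  hard error,  0  not ready yet,  > 0  ready.
 */
typedef int (*poll_fn)(void *ctx);

/* Poll up to max_tries times; report the callback's own error if it fails. */
static int wait_complete(poll_fn fn, void *ctx, int max_tries)
{
	int i, rc;

	assert(fn != NULL);

	for (i = 0; i < max_tries; i++) {
		rc = fn(ctx);
		if (rc < 0)
			return rc;	/* exact code, e.g. -ENOENT or -EINVAL */
		if (rc > 0)
			return 0;	/* completed */
	}
	return -ETIMEDOUT;		/* only when polling itself ran out */
}

/* Demo callbacks used to exercise the helper. */
static int poll_never_ready(void *ctx)
{
	(void)ctx;
	return 0;
}

static int poll_missing_entry(void *ctx)
{
	(void)ctx;
	return -ENOENT;
}

static int poll_ready_on_third(void *ctx)
{
	int *calls = ctx;

	return ++(*calls) >= 3;
}
```

    With this shape a caller can tell "the entry does not exist" apart from
    "the hardware never answered", which a plain timeout code would hide.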

net: dsa: sja1105: serialize sja1105_port_mcast_flood() with other FDB accesses [+ + +]

Author: Vladimir Oltean <vladimir.oltean@nxp.com>
Date:   Fri Sep 8 16:33:51 2023 +0300

    net: dsa: sja1105: serialize sja1105_port_mcast_flood() with other FDB accesses
    
    [ Upstream commit ea32690daf4fa525dc5a4d164bd00ed8c756e1c6 ]
    
    sja1105_fdb_add() runs from the dsa_owq, and sja1105_port_mcast_flood()
    runs from switchdev_deferred_process_work(). Prior to the blamed commit,
    they used to be indirectly serialized through the rtnl_lock(), which
    no longer holds true because dsa_owq dropped that.
    
    So, it is now possible that we traverse the static config BLK_IDX_L2_LOOKUP
    elements concurrently with changing them in
    sja1105_static_fdb_change(). That is not ideal, since it might result in
    data corruption.
    
    Introduce a mutex which serializes accesses to the hardware FDB and to
    the static config elements for the L2 Address Lookup table.
    
    I can't find a good reason to add locking around sja1105_fdb_dump().
    I'll add it later if needed.
    
    Fixes: 0faf890fc519 ("net: dsa: drop rtnl_lock from dsa_slave_switchdev_event_work")
    Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: ethernet: adi: adin1110: use eth_broadcast_addr() to assign broadcast address [+ + +]

Author: Yang Yingliang <yangyingliang@huawei.com>
Date:   Fri Aug 4 17:35:31 2023 +0800

    net: ethernet: adi: adin1110: use eth_broadcast_addr() to assign broadcast address
    
    [ Upstream commit 54024dbec95585243391caeb9f04a2620e630765 ]
    
    Use eth_broadcast_addr() to assign broadcast address instead
    of memset().
    
    Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Stable-dep-of: 32530dba1bd4 ("net:ethernet:adi:adin1110: Fix forwarding offload")
    Signed-off-by: Sasha Levin <sashal@kernel.org>
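
    For readers unfamiliar with the helper, a userspace sketch of what
    assigning a broadcast address amounts to is shown below. The function name
    broadcast_addr_sketch() is hypothetical; in the kernel the helper is
    eth_broadcast_addr(), which fills the ETH_ALEN bytes of the address with
    0xff.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN 6	/* length of an Ethernet MAC address in bytes */

/* Userspace stand-in for the kernel helper: set addr to ff:ff:ff:ff:ff:ff. */
static void broadcast_addr_sketch(uint8_t *addr)
{
	assert(addr != NULL);
	memset(addr, 0xff, ETH_ALEN);
}
```

    Using the named helper instead of an open-coded memset() documents the
    intent at the call site.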

net: ethernet: mtk_eth_soc: fix possible NULL pointer dereference in mtk_hwlro_get_fdir_all() [+ + +]

Author: Hangyu Hua <hbh25y@gmail.com>
Date:   Fri Sep 8 14:19:50 2023 +0800

    net: ethernet: mtk_eth_soc: fix possible NULL pointer dereference in mtk_hwlro_get_fdir_all()
    
    [ Upstream commit e4c79810755f66c9a933ca810da2724133b1165a ]
    
    rule_locs is allocated in ethtool_get_rxnfc() and its size is determined
    by rule_cnt from user space. So rule_cnt needs to be checked before using
    rule_locs to avoid a NULL pointer dereference.
    
    Fixes: 7aab747e5563 ("net: ethernet: mediatek: add ethtool functions to configure RX flows of HW LRO")
    Signed-off-by: Hangyu Hua <hbh25y@gmail.com>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
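
    The kind of check described above can be sketched as follows. This is a
    minimal illustration, not the mtk_eth_soc code: collect_active() and its
    parameters are hypothetical, and returning -EMSGSIZE when the
    caller-supplied array is full is an assumption of the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Record the index of every active slot in rule_locs[], whose capacity
 * (rule_cnt) was chosen by the caller. Never write beyond that capacity.
 */
static int collect_active(const int *active, size_t n_slots,
			  uint32_t *rule_locs, uint32_t rule_cnt,
			  uint32_t *n_found)
{
	uint32_t cnt = 0;
	size_t i;

	assert(active != NULL && n_found != NULL);

	for (i = 0; i < n_slots; i++) {
		if (!active[i])
			continue;
		/*
		 * Refuse to write past the caller-sized array. This also
		 * covers rule_cnt == 0, where rule_locs may be NULL.
		 */
		if (cnt == rule_cnt)
			return -EMSGSIZE;
		rule_locs[cnt++] = (uint32_t)i;
	}

	*n_found = cnt;
	return 0;
}
```

    The important property is that the bound comes from the same value that
    sized the allocation, so a too-small count from user space produces an
    error instead of an invalid write.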

net: ethernet: mvpp2_main: fix possible OOB write in mvpp2_ethtool_get_rxnfc() [+ + +]

Author: Hangyu Hua <hbh25y@gmail.com>
Date:   Fri Sep 8 14:19:49 2023 +0800

    net: ethernet: mvpp2_main: fix possible OOB write in mvpp2_ethtool_get_rxnfc()
    
    [ Upstream commit 51fe0a470543f345e3c62b6798929de3ddcedc1d ]
    
    rules is allocated in ethtool_get_rxnfc() and its size is determined by
    rule_cnt from user space. So rule_cnt needs to be checked before using
    rules to avoid an OOB write or a NULL pointer dereference.
    
    Fixes: 90b509b39ac9 ("net: mvpp2: cls: Add Classification offload support")
    Signed-off-by: Hangyu Hua <hbh25y@gmail.com>
    Reviewed-by: Marcin Wojtas <mw@semihalf.com>
    Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: factorize code in kmalloc_reserve() [+ + +]

Author: Eric Dumazet <edumazet@google.com>
Date:   Fri Sep 15 23:51:04 2023 +0530

    net: factorize code in kmalloc_reserve()
    
    commit 5c0e820cbbbe2d1c4cea5cd2bfc1302c123436df upstream.
    
    All kmalloc_reserve() callers have to make the same computation;
    we can factorize it, to prepare for the following patch in the series.
    
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
    Acked-by: Paolo Abeni <pabeni@redhat.com>
    Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    [Ajay: Regenerated the patch for v6.1.y]
    Signed-off-by: Ajay Kaher <akaher@vmware.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

net: fib: avoid warn splat in flow dissector [+ + +]

Author: Florian Westphal <fw@strlen.de>
Date:   Wed Aug 30 13:00:37 2023 +0200

    net: fib: avoid warn splat in flow dissector
    
    [ Upstream commit 8aae7625ff3f0bd5484d01f1b8d5af82e44bec2d ]
    
    New skbs allocated via nf_send_reset() have skb->dev == NULL.
    
    fib*_rules_early_flow_dissect helpers already have a 'struct net'
    argument but it's not passed down to the flow dissector core, which
    will then WARN as it can't derive a net namespace to use:
    
     WARNING: CPU: 0 PID: 0 at net/core/flow_dissector.c:1016 __skb_flow_dissect+0xa91/0x1cd0
     [..]
      ip_route_me_harder+0x143/0x330
      nf_send_reset+0x17c/0x2d0 [nf_reject_ipv4]
      nft_reject_inet_eval+0xa9/0xf2 [nft_reject_inet]
      nft_do_chain+0x198/0x5d0 [nf_tables]
      nft_do_chain_inet+0xa4/0x110 [nf_tables]
      nf_hook_slow+0x41/0xc0
      ip_local_deliver+0xce/0x110
      ..
    
    Cc: Stanislav Fomichev <sdf@google.com>
    Cc: David Ahern <dsahern@kernel.org>
    Cc: Ido Schimmel <idosch@nvidia.com>
    Fixes: 812fa71f0d96 ("netfilter: Dissect flow after packet mangling")
    Link: https://bugzilla.kernel.org/show_bug.cgi?id=217826
    Signed-off-by: Florian Westphal <fw@strlen.de>
    Reviewed-by: Ido Schimmel <idosch@nvidia.com>
    Reviewed-by: David Ahern <dsahern@kernel.org>
    Link: https://lore.kernel.org/r/20230830110043.30497-1-fw@strlen.de
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: hns3: fix byte order conversion issue in hclge_dbg_fd_tcam_read() [+ + +]

Author: Hao Chen <chenhao418@huawei.com>
Date:   Wed Sep 6 15:20:14 2023 +0800

    net: hns3: fix byte order conversion issue in hclge_dbg_fd_tcam_read()
    
    [ Upstream commit efccf655e99b6907ca07a466924e91805892e7d3 ]
    
    req1->tcam_data is defined as "u8 tcam_data[8]", and we cast it to
    (u32 *) without considering byte order conversion, which
    may result in printing wrong data for tcam_data.
    
    Convert tcam_data to (__le32 *) first to fix it.
    
    Fixes: b5a0b70d77b9 ("net: hns3: refactor dump fd tcam of debugfs")
    Signed-off-by: Hao Chen <chenhao418@huawei.com>
    Signed-off-by: Jijie Shao <shaojijie@huawei.com>
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
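
    To illustrate why the byte order matters, here is a small userspace sketch
    that decodes little-endian 32-bit words from a byte buffer without
    depending on the host's endianness. le32_at() is a hypothetical helper
    written for this note; in the kernel the equivalent step would be
    le32_to_cpu() applied to a __le32 value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return the word-th little-endian 32-bit value stored in buf.
 * Assembling the value byte by byte gives the same result on
 * little-endian and big-endian hosts, unlike a raw (uint32_t *) cast.
 */
static uint32_t le32_at(const uint8_t *buf, size_t word)
{
	const uint8_t *p;

	assert(buf != NULL);

	p = buf + 4 * word;
	return (uint32_t)p[0] |
	       ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) |
	       ((uint32_t)p[3] << 24);
}
```

    For an 8-byte buffer such as tcam_data, words 0 and 1 cover the whole
    array.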

net: hns3: fix debugfs concurrency issue between kfree buffer and read [+ + +]

Author: Hao Chen <chenhao418@huawei.com>
Date:   Wed Sep 6 15:20:15 2023 +0800

    net: hns3: fix debugfs concurrency issue between kfree buffer and read
    
    [ Upstream commit c295160b1d95e885f1af4586a221cb221d232d10 ]
    
    Currently, in hns3_dbg_uninit(), freeing the buffer with kfree() may race
    with a concurrent debugfs read, which may result in a memory error.
    
    Move debugfs_remove_recursive() in front of the kfree() of the buffer to
    ensure they don't happen at the same time.
    
    Fixes: 5e69ea7ee2a6 ("net: hns3: refactor the debugfs process")
    Signed-off-by: Hao Chen <chenhao418@huawei.com>
    Signed-off-by: Jijie Shao <shaojijie@huawei.com>
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: hns3: fix invalid mutex between tc qdisc and dcb ets command issue [+ + +]

Author: Jijie Shao <shaojijie@huawei.com>
Date:   Wed Sep 6 15:20:16 2023 +0800

    net: hns3: fix invalid mutex between tc qdisc and dcb ets command issue
    
    [ Upstream commit fa5564945f7d15ae2390b00c08b6abaef0165cda ]
    
    We intend that the tc qdisc and dcb ets commands cannot be mixed.
    Before one of the commands is used to configure tc, the configuration
    made by the other command must first be cleared with that other command.
    
    However, when we configure a single tc with tc qdisc,
    we can still configure it with dcb ets,
    because we use mqprio_active as the tag of tc qdisc configuration,
    but with dcb ets, we do not check mqprio_active.
    
    This patch fixes the issue by checking mqprio_active before
    executing the dcb ets command, and adds dcb_ets_active to
    replace HCLGE_FLAG_DCB_ENABLE and HCLGE_FLAG_MQPRIO_ENABLE
    at the hclge layer.
    
    Fixes: cacde272dd00 ("net: hns3: Add hclge_dcb module for the support of DCB feature")
    Signed-off-by: Jijie Shao <shaojijie@huawei.com>
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: hns3: fix the port information display when sfp is absent [+ + +]

Author: Yisen Zhuang <yisen.zhuang@huawei.com>
Date:   Wed Sep 6 15:20:17 2023 +0800

    net: hns3: fix the port information display when sfp is absent
    
    [ Upstream commit 674d9591a32d01df75d6b5fffed4ef942a294376 ]
    
    When sfp is absent or unidentified, the port type should be
    displayed as PORT_OTHERS, rather than PORT_FIBRE.
    
    Fixes: 88d10bd6f730 ("net: hns3: add support for multiple media type")
    Signed-off-by: Yisen Zhuang <yisen.zhuang@huawei.com>
    Signed-off-by: Jijie Shao <shaojijie@huawei.com>
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: hns3: fix tx timeout issue [+ + +]

Author: Jian Shen <shenjian15@huawei.com>
Date:   Wed Sep 6 15:20:12 2023 +0800

    net: hns3: fix tx timeout issue
    
    [ Upstream commit 61a1deacc3d4fd3d57d7fda4d935f7f7503e8440 ]
    
    Currently, the driver rings the ring doorbell before updating
    ring->last_to_use in the tx flow. If the hardware transmits the
    packet and napi poll is scheduled fast enough, the driver's napi
    poll may see the old ring->last_to_use.
    In this case, the driver will think the tx is not completed, and
    return directly without clearing the flag __QUEUE_STATE_STACK_XOFF,
    which may cause a tx timeout.
    
    Fixes: 20d06ca2679c ("net: hns3: optimize the tx clean process")
    Signed-off-by: Jian Shen <shenjian15@huawei.com>
    Signed-off-by: Jijie Shao <shaojijie@huawei.com>
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: hns3: remove GSO partial feature bit [+ + +]

Author: Jie Wang <wangjie125@huawei.com>
Date:   Wed Sep 6 15:20:18 2023 +0800

    net: hns3: remove GSO partial feature bit
    
    [ Upstream commit 60326634f6c54528778de18bfef1e8a7a93b3771 ]
    
    The HNS3 NIC does not support GSO partial packet segmentation. Segment
    offload and checksum offload for tunnel packets, for example NvGRE packets,
    are already supported, so there is no need to keep the GSO partial feature
    bit. This patch removes it.
    
    Fixes: 76ad4f0ee747 ("net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC")
    Signed-off-by: Jie Wang <wangjie125@huawei.com>
    Signed-off-by: Jijie Shao <shaojijie@huawei.com>
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: ipv4: fix one memleak in __inet_del_ifa() [+ + +]

Author: Liu Jian <liujian56@huawei.com>
Date:   Thu Sep 7 10:57:09 2023 +0800

    net: ipv4: fix one memleak in __inet_del_ifa()
    
    [ Upstream commit ac28b1ec6135649b5d78b028e47264cb3ebca5ea ]
    
    I got the below warning when doing a fuzzing test:
    unregister_netdevice: waiting for bond0 to become free. Usage count = 2
    
    It can be reproduced via:
    
    ip link add bond0 type bond
    sysctl -w net.ipv4.conf.bond0.promote_secondaries=1
    ip addr add 4.117.174.103/0 scope 0x40 dev bond0
    ip addr add 192.168.100.111/255.255.255.254 scope 0 dev bond0
    ip addr add 0.0.0.4/0 scope 0x40 secondary dev bond0
    ip addr del 4.117.174.103/0 scope 0x40 dev bond0
    ip link delete bond0 type bond
    
    In this reproduction test case, an incorrect 'last_prim' is found in
    __inet_del_ifa(). As a result, the secondary address (0.0.0.4/0 scope 0x40)
    is lost. The memory of the secondary address is leaked and the references
    to in_device and net_device are leaked.
    
    Fix this problem:
    Look for 'last_prim' starting at the location of the deleted IP and insert
    the promoted IP at the location of 'last_prim'.
    
    Fixes: 0ff60a45678e ("[IPV4]: Fix secondary IP addresses after promotion")
    Signed-off-by: Liu Jian <liujian56@huawei.com>
    Signed-off-by: Julian Anastasov <ja@ssi.bg>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: ipv6/addrconf: avoid integer underflow in ipv6_create_tempaddr [+ + +]

Author: Alex Henrie <alexhenrie24@gmail.com>
Date:   Thu Aug 31 22:41:27 2023 -0600

    net: ipv6/addrconf: avoid integer underflow in ipv6_create_tempaddr
    
    [ Upstream commit f31867d0d9d82af757c1e0178b659438f4c1ea3c ]
    
    The existing code incorrectly cast a negative value (the result of a
    subtraction) to an unsigned value without checking. For example, if
    /proc/sys/net/ipv6/conf/*/temp_prefered_lft was set to 1, the preferred
    lifetime would jump to 4 billion seconds. On my machine and network the
    shortest lifetime that avoided underflow was 3 seconds.
    
    Fixes: 76506a986dc3 ("IPv6: fix DESYNC_FACTOR")
    Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
    Reviewed-by: David Ahern <dsahern@kernel.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
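
    The wrap-around described above is easy to reproduce in isolation. The
    sketch below is not the addrconf code: lifetime_naive() and
    lifetime_clamped() are hypothetical helpers, and clamping at zero is just
    one possible way to avoid the wrap, not necessarily what the upstream
    patch does.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: subtract an offset from a lifetime, naively. */
static uint32_t lifetime_naive(uint32_t lft, uint32_t offset)
{
	return lft - offset;	/* wraps modulo 2^32 when offset > lft */
}

/* Same computation, clamped at zero instead of wrapping. */
static uint32_t lifetime_clamped(uint32_t lft, uint32_t offset)
{
	return lft > offset ? lft - offset : 0;
}

/* Worked example: a 1 second lifetime minus a 3 second offset. */
static void lifetime_examples(void)
{
	/* 1 - 3 wraps to 2^32 - 2, roughly 4 billion seconds. */
	assert(lifetime_naive(1, 3) == UINT32_C(4294967294));
	assert(lifetime_clamped(1, 3) == 0);
	assert(lifetime_clamped(10, 3) == 7);
}
```

    This matches the symptom in the commit message, where a lifetime of 1
    second turned into about 4 billion seconds.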

net: macb: Enable PTP unicast [+ + +]

Author: Harini Katakam <harini.katakam@xilinx.com>
Date:   Tue Apr 11 18:07:11 2023 +0530

    net: macb: Enable PTP unicast
    
    [ Upstream commit ee4e92c26c60b7344b7261035683a37da5a6119b ]
    
    Enable transmission and reception of PTP unicast packets by
    updating PTP unicast config bit and setting current HW mac
    address as allowed address in PTP unicast filter registers.
    
    Signed-off-by: Harini Katakam <harini.katakam@xilinx.com>
    Signed-off-by: Michal Simek <michal.simek@xilinx.com>
    Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com>
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Stable-dep-of: 403f0e771457 ("net: macb: fix sleep inside spinlock")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: macb: fix sleep inside spinlock [+ + +]

Author: Sascha Hauer <s.hauer@pengutronix.de>
Date:   Fri Sep 8 13:29:13 2023 +0200

    net: macb: fix sleep inside spinlock
    
    [ Upstream commit 403f0e771457e2b8811dc280719d11b9bacf10f4 ]
    
    macb_set_tx_clk() is called under a spinlock but itself calls clk_set_rate()
    which can sleep. This results in:
    
    | BUG: sleeping function called from invalid context at kernel/locking/mutex.c:580
    | pps pps1: new PPS source ptp1
    | in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 40, name: kworker/u4:3
    | preempt_count: 1, expected: 0
    | RCU nest depth: 0, expected: 0
    | 4 locks held by kworker/u4:3/40:
    |  #0: ffff000003409148
    | macb ff0c0000.ethernet: gem-ptp-timer ptp clock registered.
    |  ((wq_completion)events_power_efficient){+.+.}-{0:0}, at: process_one_work+0x14c/0x51c
    |  #1: ffff8000833cbdd8 ((work_completion)(&pl->resolve)){+.+.}-{0:0}, at: process_one_work+0x14c/0x51c
    |  #2: ffff000004f01578 (&pl->state_mutex){+.+.}-{4:4}, at: phylink_resolve+0x44/0x4e8
    |  #3: ffff000004f06f50 (&bp->lock){....}-{3:3}, at: macb_mac_link_up+0x40/0x2ac
    | irq event stamp: 113998
    | hardirqs last  enabled at (113997): [<ffff800080e8503c>] _raw_spin_unlock_irq+0x30/0x64
    | hardirqs last disabled at (113998): [<ffff800080e84478>] _raw_spin_lock_irqsave+0xac/0xc8
    | softirqs last  enabled at (113608): [<ffff800080010630>] __do_softirq+0x430/0x4e4
    | softirqs last disabled at (113597): [<ffff80008001614c>] ____do_softirq+0x10/0x1c
    | CPU: 0 PID: 40 Comm: kworker/u4:3 Not tainted 6.5.0-11717-g9355ce8b2f50-dirty #368
    | Hardware name: ... ZynqMP ... (DT)
    | Workqueue: events_power_efficient phylink_resolve
    | Call trace:
    |  dump_backtrace+0x98/0xf0
    |  show_stack+0x18/0x24
    |  dump_stack_lvl+0x60/0xac
    |  dump_stack+0x18/0x24
    |  __might_resched+0x144/0x24c
    |  __might_sleep+0x48/0x98
    |  __mutex_lock+0x58/0x7b0
    |  mutex_lock_nested+0x24/0x30
    |  clk_prepare_lock+0x4c/0xa8
    |  clk_set_rate+0x24/0x8c
    |  macb_mac_link_up+0x25c/0x2ac
    |  phylink_resolve+0x178/0x4e8
    |  process_one_work+0x1ec/0x51c
    |  worker_thread+0x1ec/0x3e4
    |  kthread+0x120/0x124
    |  ret_from_fork+0x10/0x20
    
    The obvious fix is to move the call to macb_set_tx_clk() out of the
    protected area. This seems safe as rx and tx are both disabled anyway at
    this point.
    It is however not entirely clear what the spinlock shall protect. It
    could be the read-modify-write access to the NCFGR register, but this
    is accessed in macb_set_rx_mode() and macb_set_rxcsum_feature() as well
    without holding the spinlock. It could also be the register accesses
    done in mog_init_rings() or macb_init_buffers(), but again these
    functions are called without holding the spinlock in macb_hresp_error_task().
    The locking seems fishy in this driver and it might deserve another look
    before this patch is applied.
    
    Fixes: 633e98a711ac0 ("net: macb: use resolved link config in mac_link_up()")
    Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
    Link: https://lore.kernel.org/r/20230908112913.1701766-1-s.hauer@pengutronix.de
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: phy: micrel: Correct bit assignments for phy_device flags [+ + +]

Author: Oleksij Rempel <linux@rempel-privat.de>
Date:   Fri Sep 1 06:53:23 2023 +0200

    net: phy: micrel: Correct bit assignments for phy_device flags
    
    [ Upstream commit 719c5e37e99d2fd588d1c994284d17650a66354c ]
    
    Previously, the defines for phy_device flags in the Micrel driver were
    ambiguous in their representation. They were intended to be bit masks
    but were mistakenly defined as bit positions. This led to the following
    issues:
    
    - MICREL_KSZ8_P1_ERRATA, designated for KSZ88xx switches, overlapped
      with MICREL_PHY_FXEN and MICREL_PHY_50MHZ_CLK.
    - Due to this overlap, the code path for MICREL_PHY_FXEN, tailored for
      the KSZ8041 PHY, was not executed for KSZ88xx PHYs.
    - Similarly, the code associated with MICREL_PHY_50MHZ_CLK wasn't
      triggered for KSZ88xx.
    
    To rectify this, all three flags have now been explicitly converted to
    use the `BIT()` macro, ensuring they are defined as bit masks and
    preventing potential overlaps in the future.
    
    Fixes: 49011e0c1555 ("net: phy: micrel: ksz886x/ksz8081: add cabletest support")
    Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
    Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
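
    The difference between numbering flags and giving each flag its own bit
    can be shown with a short sketch. The flag names and values below are
    hypothetical stand-ins chosen for illustration, not the definitions from
    the Micrel header; BIT() is redefined locally so the example builds
    outside the kernel.

```c
#include <assert.h>

#define BIT(n) (1UL << (n))

/* Hypothetical flags written as plain numbers 1, 2, 3. */
#define OLD_FLAG_A 0x1UL
#define OLD_FLAG_B 0x2UL
#define OLD_FLAG_C 0x3UL	/* binary 11: shares bits with A and B */

/* The same flags with one dedicated bit each. */
#define NEW_FLAG_A BIT(0)
#define NEW_FLAG_B BIT(1)
#define NEW_FLAG_C BIT(2)

/* Nonzero if any bit of mask is set in flags. */
static int flag_is_set(unsigned long flags, unsigned long mask)
{
	return (flags & mask) != 0;
}

static void flag_examples(void)
{
	/* Setting only C in the old scheme also appears to set A and B. */
	assert(flag_is_set(OLD_FLAG_C, OLD_FLAG_A));
	assert(flag_is_set(OLD_FLAG_C, OLD_FLAG_B));
	/* With one bit per flag, C is independent of A and B. */
	assert(!flag_is_set(NEW_FLAG_C, NEW_FLAG_A));
	assert(!flag_is_set(NEW_FLAG_C, NEW_FLAG_B));
}
```

    Any test of the form (flags & FLAG) behaves unexpectedly once two flags
    share bits, which is why defining every flag through BIT() removes the
    ambiguity.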

net: read sk->sk_family once in sk_mc_loop() [+ + +]

Author: Eric Dumazet <edumazet@google.com>
Date:   Wed Aug 30 10:12:44 2023 +0000

    net: read sk->sk_family once in sk_mc_loop()
    
    [ Upstream commit a3e0fdf71bbe031de845e8e08ed7fba49f9c702c ]
    
    syzbot is playing with IPV6_ADDRFORM quite a lot these days,
    and managed to hit the WARN_ON_ONCE(1) in sk_mc_loop().
    
    We have many more similar issues to fix.
    
    WARNING: CPU: 1 PID: 1593 at net/core/sock.c:782 sk_mc_loop+0x165/0x260
    Modules linked in:
    CPU: 1 PID: 1593 Comm: kworker/1:3 Not tainted 6.1.40-syzkaller #0
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 07/26/2023
    Workqueue: events_power_efficient gc_worker
    RIP: 0010:sk_mc_loop+0x165/0x260 net/core/sock.c:782
    Code: 34 1b fd 49 81 c7 18 05 00 00 4c 89 f8 48 c1 e8 03 42 80 3c 20 00 74 08 4c 89 ff e8 25 36 6d fd 4d 8b 37 eb 13 e8 db 33 1b fd <0f> 0b b3 01 eb 34 e8 d0 33 1b fd 45 31 f6 49 83 c6 38 4c 89 f0 48
    RSP: 0018:ffffc90000388530 EFLAGS: 00010246
    RAX: ffffffff846d9b55 RBX: 0000000000000011 RCX: ffff88814f884980
    RDX: 0000000000000102 RSI: ffffffff87ae5160 RDI: 0000000000000011
    RBP: ffffc90000388550 R08: 0000000000000003 R09: ffffffff846d9a65
    R10: 0000000000000002 R11: ffff88814f884980 R12: dffffc0000000000
    R13: ffff88810dbee000 R14: 0000000000000010 R15: ffff888150084000
    FS: 0000000000000000(0000) GS:ffff8881f6b00000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 0000000020000180 CR3: 000000014ee5b000 CR4: 00000000003506e0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    Call Trace:
    <IRQ>
    [<ffffffff8507734f>] ip6_finish_output2+0x33f/0x1ae0 net/ipv6/ip6_output.c:83
    [<ffffffff85062766>] __ip6_finish_output net/ipv6/ip6_output.c:200 [inline]
    [<ffffffff85062766>] ip6_finish_output+0x6c6/0xb10 net/ipv6/ip6_output.c:211
    [<ffffffff85061f8c>] NF_HOOK_COND include/linux/netfilter.h:298 [inline]
    [<ffffffff85061f8c>] ip6_output+0x2bc/0x3d0 net/ipv6/ip6_output.c:232
    [<ffffffff852071cf>] dst_output include/net/dst.h:444 [inline]
    [<ffffffff852071cf>] ip6_local_out+0x10f/0x140 net/ipv6/output_core.c:161
    [<ffffffff83618fb4>] ipvlan_process_v6_outbound drivers/net/ipvlan/ipvlan_core.c:483 [inline]
    [<ffffffff83618fb4>] ipvlan_process_outbound drivers/net/ipvlan/ipvlan_core.c:529 [inline]
    [<ffffffff83618fb4>] ipvlan_xmit_mode_l3 drivers/net/ipvlan/ipvlan_core.c:602 [inline]
    [<ffffffff83618fb4>] ipvlan_queue_xmit+0x1174/0x1be0 drivers/net/ipvlan/ipvlan_core.c:677
    [<ffffffff8361ddd9>] ipvlan_start_xmit+0x49/0x100 drivers/net/ipvlan/ipvlan_main.c:229
    [<ffffffff84763fc0>] netdev_start_xmit include/linux/netdevice.h:4925 [inline]
    [<ffffffff84763fc0>] xmit_one net/core/dev.c:3644 [inline]
    [<ffffffff84763fc0>] dev_hard_start_xmit+0x320/0x980 net/core/dev.c:3660
    [<ffffffff8494c650>] sch_direct_xmit+0x2a0/0x9c0 net/sched/sch_generic.c:342
    [<ffffffff8494d883>] qdisc_restart net/sched/sch_generic.c:407 [inline]
    [<ffffffff8494d883>] __qdisc_run+0xb13/0x1e70 net/sched/sch_generic.c:415
    [<ffffffff8478c426>] qdisc_run+0xd6/0x260 include/net/pkt_sched.h:125
    [<ffffffff84796eac>] net_tx_action+0x7ac/0x940 net/core/dev.c:5247
    [<ffffffff858002bd>] __do_softirq+0x2bd/0x9bd kernel/softirq.c:599
    [<ffffffff814c3fe8>] invoke_softirq kernel/softirq.c:430 [inline]
    [<ffffffff814c3fe8>] __irq_exit_rcu+0xc8/0x170 kernel/softirq.c:683
    [<ffffffff814c3f09>] irq_exit_rcu+0x9/0x20 kernel/softirq.c:695
    
    Fixes: 7ad6848c7e81 ("ip: fix mc_loop checks for tunnels with multicast outer addresses")
    Reported-by: syzbot <syzkaller@googlegroups.com>
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
    Link: https://lore.kernel.org/r/20230830101244.1146934-1-edumazet@google.com
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
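
    The "read once" idea from the title can be sketched in userspace C. The
    names below (struct fake_sock, READ_ONCE_INT, the FAM_* constants) are
    hypothetical and only mimic the shape of the problem; the volatile cast is
    a simplified stand-in for the kernel's READ_ONCE().

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for READ_ONCE() on an int: force a single load. */
#define READ_ONCE_INT(x) (*(const volatile int *)&(x))

enum { FAM_INET = 2, FAM_INET6 = 10 };	/* illustrative values */

struct fake_sock {
	int family;	/* may be changed concurrently by another thread */
	int mc_loop4;
	int mc_loop6;
};

static int mc_loop_enabled(struct fake_sock *sk)
{
	int family;

	assert(sk != NULL);

	/*
	 * Take one snapshot and decide on that value only, so that every
	 * comparison in the switch sees the same family even if the field
	 * changes while this function runs.
	 */
	family = READ_ONCE_INT(sk->family);

	switch (family) {
	case FAM_INET:
		return sk->mc_loop4;
	case FAM_INET6:
		return sk->mc_loop6;
	default:
		return 1;	/* unknown family: fall back to looping */
	}
}
```

    Without the single snapshot, a compiler is free to load the field more
    than once, and two loads may disagree if the family changes in between.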

net: remove osize variable in __alloc_skb() [+ + +]

Author: Eric Dumazet <edumazet@google.com>
Date:   Fri Sep 15 23:51:03 2023 +0530

    net: remove osize variable in __alloc_skb()
    
    commit 65998d2bf857b9ae5acc1f3b70892bd1b429ccab upstream.
    
    This is a cleanup patch, to prepare for the following change.
    
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
    Acked-by: Paolo Abeni <pabeni@redhat.com>
    Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    [Ajay: Regenerated the patch for v6.1.y]
    Signed-off-by: Ajay Kaher <akaher@vmware.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

net: sched: sch_qfq: Fix UAF in qfq_dequeue() [+ + +]

Author: valis <sec@valis.email>
Date:   Fri Sep 1 12:22:37 2023 -0400

    net: sched: sch_qfq: Fix UAF in qfq_dequeue()
    
    [ Upstream commit 8fc134fee27f2263988ae38920bc03da416b03d8 ]
    
    When the plug qdisc is used as a class of the qfq qdisc it could trigger a
    UAF. This issue can be reproduced with the following commands:
    
      tc qdisc add dev lo root handle 1: qfq
      tc class add dev lo parent 1: classid 1:1 qfq weight 1 maxpkt 512
      tc qdisc add dev lo parent 1:1 handle 2: plug
      tc filter add dev lo parent 1: basic classid 1:1
      ping -c1 127.0.0.1
    
    and boom:
    
    [  285.353793] BUG: KASAN: slab-use-after-free in qfq_dequeue+0xa7/0x7f0
    [  285.354910] Read of size 4 at addr ffff8880bad312a8 by task ping/144
    [  285.355903]
    [  285.356165] CPU: 1 PID: 144 Comm: ping Not tainted 6.5.0-rc3+ #4
    [  285.357112] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
    [  285.358376] Call Trace:
    [  285.358773]  <IRQ>
    [  285.359109]  dump_stack_lvl+0x44/0x60
    [  285.359708]  print_address_description.constprop.0+0x2c/0x3c0
    [  285.360611]  kasan_report+0x10c/0x120
    [  285.361195]  ? qfq_dequeue+0xa7/0x7f0
    [  285.361780]  qfq_dequeue+0xa7/0x7f0
    [  285.362342]  __qdisc_run+0xf1/0x970
    [  285.362903]  net_tx_action+0x28e/0x460
    [  285.363502]  __do_softirq+0x11b/0x3de
    [  285.364097]  do_softirq.part.0+0x72/0x90
    [  285.364721]  </IRQ>
    [  285.365072]  <TASK>
    [  285.365422]  __local_bh_enable_ip+0x77/0x90
    [  285.366079]  __dev_queue_xmit+0x95f/0x1550
    [  285.366732]  ? __pfx_csum_and_copy_from_iter+0x10/0x10
    [  285.367526]  ? __pfx___dev_queue_xmit+0x10/0x10
    [  285.368259]  ? __build_skb_around+0x129/0x190
    [  285.368960]  ? ip_generic_getfrag+0x12c/0x170
    [  285.369653]  ? __pfx_ip_generic_getfrag+0x10/0x10
    [  285.370390]  ? csum_partial+0x8/0x20
    [  285.370961]  ? raw_getfrag+0xe5/0x140
    [  285.371559]  ip_finish_output2+0x539/0xa40
    [  285.372222]  ? __pfx_ip_finish_output2+0x10/0x10
    [  285.372954]  ip_output+0x113/0x1e0
    [  285.373512]  ? __pfx_ip_output+0x10/0x10
    [  285.374130]  ? icmp_out_count+0x49/0x60
    [  285.374739]  ? __pfx_ip_finish_output+0x10/0x10
    [  285.375457]  ip_push_pending_frames+0xf3/0x100
    [  285.376173]  raw_sendmsg+0xef5/0x12d0
    [  285.376760]  ? do_syscall_64+0x40/0x90
    [  285.377359]  ? __static_call_text_end+0x136578/0x136578
    [  285.378173]  ? do_syscall_64+0x40/0x90
    [  285.378772]  ? kasan_enable_current+0x11/0x20
    [  285.379469]  ? __pfx_raw_sendmsg+0x10/0x10
    [  285.380137]  ? __sock_create+0x13e/0x270
    [  285.380673]  ? __sys_socket+0xf3/0x180
    [  285.381174]  ? __x64_sys_socket+0x3d/0x50
    [  285.381725]  ? entry_SYSCALL_64_after_hwframe+0x6e/0xd8
    [  285.382425]  ? __rcu_read_unlock+0x48/0x70
    [  285.382975]  ? ip4_datagram_release_cb+0xd8/0x380
    [  285.383608]  ? __pfx_ip4_datagram_release_cb+0x10/0x10
    [  285.384295]  ? preempt_count_sub+0x14/0xc0
    [  285.384844]  ? __list_del_entry_valid+0x76/0x140
    [  285.385467]  ? _raw_spin_lock_bh+0x87/0xe0
    [  285.386014]  ? __pfx__raw_spin_lock_bh+0x10/0x10
    [  285.386645]  ? release_sock+0xa0/0xd0
    [  285.387148]  ? preempt_count_sub+0x14/0xc0
    [  285.387712]  ? freeze_secondary_cpus+0x348/0x3c0
    [  285.388341]  ? aa_sk_perm+0x177/0x390
    [  285.388856]  ? __pfx_aa_sk_perm+0x10/0x10
    [  285.389441]  ? check_stack_object+0x22/0x70
    [  285.390032]  ? inet_send_prepare+0x2f/0x120
    [  285.390603]  ? __pfx_inet_sendmsg+0x10/0x10
    [  285.391172]  sock_sendmsg+0xcc/0xe0
    [  285.391667]  __sys_sendto+0x190/0x230
    [  285.392168]  ? __pfx___sys_sendto+0x10/0x10
    [  285.392727]  ? kvm_clock_get_cycles+0x14/0x30
    [  285.393328]  ? set_normalized_timespec64+0x57/0x70
    [  285.393980]  ? _raw_spin_unlock_irq+0x1b/0x40
    [  285.394578]  ? __x64_sys_clock_gettime+0x11c/0x160
    [  285.395225]  ? __pfx___x64_sys_clock_gettime+0x10/0x10
    [  285.395908]  ? _copy_to_user+0x3e/0x60
    [  285.396432]  ? exit_to_user_mode_prepare+0x1a/0x120
    [  285.397086]  ? syscall_exit_to_user_mode+0x22/0x50
    [  285.397734]  ? do_syscall_64+0x71/0x90
    [  285.398258]  __x64_sys_sendto+0x74/0x90
    [  285.398786]  do_syscall_64+0x64/0x90
    [  285.399273]  ? exit_to_user_mode_prepare+0x1a/0x120
    [  285.399949]  ? syscall_exit_to_user_mode+0x22/0x50
    [  285.400605]  ? do_syscall_64+0x71/0x90
    [  285.401124]  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
    [  285.401807] RIP: 0033:0x495726
    [  285.402233] Code: ff ff ff f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b8 0f 1f 00 41 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 11 b8 2c 00 00 00 0f 09
    [  285.404683] RSP: 002b:00007ffcc25fb618 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
    [  285.405677] RAX: ffffffffffffffda RBX: 0000000000000040 RCX: 0000000000495726
    [  285.406628] RDX: 0000000000000040 RSI: 0000000002518750 RDI: 0000000000000000
    [  285.407565] RBP: 00000000005205ef R08: 00000000005f8838 R09: 000000000000001c
    [  285.408523] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000002517634
    [  285.409460] R13: 00007ffcc25fb6f0 R14: 0000000000000003 R15: 0000000000000000
    [  285.410403]  </TASK>
    [  285.410704]
    [  285.410929] Allocated by task 144:
    [  285.411402]  kasan_save_stack+0x1e/0x40
    [  285.411926]  kasan_set_track+0x21/0x30
    [  285.412442]  __kasan_slab_alloc+0x55/0x70
    [  285.412973]  kmem_cache_alloc_node+0x187/0x3d0
    [  285.413567]  __alloc_skb+0x1b4/0x230
    [  285.414060]  __ip_append_data+0x17f7/0x1b60
    [  285.414633]  ip_append_data+0x97/0xf0
    [  285.415144]  raw_sendmsg+0x5a8/0x12d0
    [  285.415640]  sock_sendmsg+0xcc/0xe0
    [  285.416117]  __sys_sendto+0x190/0x230
    [  285.416626]  __x64_sys_sendto+0x74/0x90
    [  285.417145]  do_syscall_64+0x64/0x90
    [  285.417624]  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
    [  285.418306]
    [  285.418531] Freed by task 144:
    [  285.418960]  kasan_save_stack+0x1e/0x40
    [  285.419469]  kasan_set_track+0x21/0x30
    [  285.419988]  kasan_save_free_info+0x27/0x40
    [  285.420556]  ____kasan_slab_free+0x109/0x1a0
    [  285.421146]  kmem_cache_free+0x1c2/0x450
    [  285.421680]  __netif_receive_skb_core+0x2ce/0x1870
    [  285.422333]  __netif_receive_skb_one_core+0x97/0x140
    [  285.423003]  process_backlog+0x100/0x2f0
    [  285.423537]  __napi_poll+0x5c/0x2d0
    [  285.424023]  net_rx_action+0x2be/0x560
    [  285.424510]  __do_softirq+0x11b/0x3de
    [  285.425034]
    [  285.425254] The buggy address belongs to the object at ffff8880bad31280
    [  285.425254]  which belongs to the cache skbuff_head_cache of size 224
    [  285.426993] The buggy address is located 40 bytes inside of
    [  285.426993]  freed 224-byte region [ffff8880bad31280, ffff8880bad31360)
    [  285.428572]
    [  285.428798] The buggy address belongs to the physical page:
    [  285.429540] page:00000000f4b77674 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0xbad31
    [  285.430758] flags: 0x100000000000200(slab|node=0|zone=1)
    [  285.431447] page_type: 0xffffffff()
    [  285.431934] raw: 0100000000000200 ffff88810094a8c0 dead000000000122 0000000000000000
    [  285.432757] raw: 0000000000000000 00000000800c000c 00000001ffffffff 0000000000000000
    [  285.433562] page dumped because: kasan: bad access detected
    [  285.434144]
    [  285.434320] Memory state around the buggy address:
    [  285.434828]  ffff8880bad31180: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
    [  285.435580]  ffff8880bad31200: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
    [  285.436264] >ffff8880bad31280: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
    [  285.436777]                                   ^
    [  285.437106]  ffff8880bad31300: fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
    [  285.437616]  ffff8880bad31380: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
    [  285.438126] ==================================================================
    [  285.438662] Disabling lock debugging due to kernel taint
    
    Fix this by:
    1. Changing sch_plug's .peek handler to qdisc_peek_dequeued(), a
    function compatible with non-work-conserving qdiscs
    2. Checking the return value of qdisc_dequeue_peeked() in sch_qfq.
    
    Fixes: 462dbc9101ac ("pkt_sched: QFQ Plus: fair-queueing service at DRR cost")
    Reported-by: valis <sec@valis.email>
    Signed-off-by: valis <sec@valis.email>
    Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
    Link: https://lore.kernel.org/r/20230901162237.11525-1-jhs@mojatatu.com
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: stmmac: fix handling of zero coalescing tx-usecs [+ + +]

Author: Vincent Whitchurch <vincent.whitchurch@axis.com>
Date:   Thu Sep 7 12:46:31 2023 +0200

    net: stmmac: fix handling of zero coalescing tx-usecs
    
    [ Upstream commit fa60b8163816f194786f3ee334c9a458da7699c6 ]
    
    Setting ethtool -C eth0 tx-usecs 0 is supposed to disable the use of the
    coalescing timer but currently it gets programmed with zero delay
    instead.
    
    Disable the use of the coalescing timer if tx-usecs is zero by
    preventing it from being restarted.  Note that to keep things simple we
    don't start/stop the timer when the coalescing settings are changed, but
    just let that happen on the next transmit or timer expiry.
    
    Fixes: 8fce33317023 ("net: stmmac: Rework coalesce timer and fix multi-queue races")
    Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: use sk_forward_alloc_get() in sk_get_meminfo() [+ + +]

Author: Eric Dumazet <edumazet@google.com>
Date:   Thu Aug 31 13:52:08 2023 +0000

    net: use sk_forward_alloc_get() in sk_get_meminfo()
    
    [ Upstream commit 66d58f046c9d3a8f996b7138d02e965fd0617de0 ]
    
    inet_sk_diag_fill() has been changed to use sk_forward_alloc_get(),
    but sk_get_meminfo() was forgotten.
    
    Fixes: 292e6077b040 ("net: introduce sk_forward_alloc_get()")
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net:ethernet:adi:adin1110: Fix forwarding offload [+ + +]

Author: Ciprian Regus <ciprian.regus@analog.com>
Date:   Fri Sep 8 15:58:08 2023 +0300

    net:ethernet:adi:adin1110: Fix forwarding offload
    
    [ Upstream commit 32530dba1bd48da4437d18d9a8dbc9d2826938a6 ]
    
    Currently, when a new fdb entry is added (with both ports of the
    ADIN2111 bridged), the driver configures the MAC filters for the wrong
    port, which results in the forwarding being done by the host, and not
    actually hardware offloaded.
    
    The ADIN2111 offloads the forwarding by setting filters on the
    destination MAC address of incoming frames. Based on these, they may be
    routed to the other port. Thus, if a frame has to be forwarded from port
    1 to port 2, the required configuration for the ADDR_FILT_UPRn register
    should set the APPLY2PORT1 bit (instead of APPLY2PORT2, as is
    currently the case).
    
    Fixes: bc93e19d088b ("net: ethernet: adi: Add ADIN1110 support")
    Signed-off-by: Ciprian Regus <ciprian.regus@analog.com>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

netfilter: nfnetlink_osf: avoid OOB read [+ + +]

Author: Wander Lairson Costa <wander@redhat.com>
Date:   Fri Sep 1 10:50:20 2023 -0300

    netfilter: nfnetlink_osf: avoid OOB read
    
    [ Upstream commit f4f8a7803119005e87b716874bec07c751efafec ]
    
    The opt_num field is controlled by user mode and is not currently
    validated inside the kernel. An attacker can take advantage of this to
    trigger an OOB read and potentially leak information.
    
    BUG: KASAN: slab-out-of-bounds in nf_osf_match_one+0xbed/0xd10 net/netfilter/nfnetlink_osf.c:88
    Read of size 2 at addr ffff88804bc64272 by task poc/6431
    
    CPU: 1 PID: 6431 Comm: poc Not tainted 6.0.0-rc4 #1
    Call Trace:
     nf_osf_match_one+0xbed/0xd10 net/netfilter/nfnetlink_osf.c:88
     nf_osf_find+0x186/0x2f0 net/netfilter/nfnetlink_osf.c:281
     nft_osf_eval+0x37f/0x590 net/netfilter/nft_osf.c:47
     expr_call_ops_eval net/netfilter/nf_tables_core.c:214
     nft_do_chain+0x2b0/0x1490 net/netfilter/nf_tables_core.c:264
     nft_do_chain_ipv4+0x17c/0x1f0 net/netfilter/nft_chain_filter.c:23
     [..]
    
    Also add validation to genre, subtype and version fields.
    
    Fixes: 11eeef41d5f6 ("netfilter: passive OS fingerprint xtables match")
    Reported-by: Lucas Leong <wmliang@infosec.exchange>
    Signed-off-by: Wander Lairson Costa <wander@redhat.com>
    Signed-off-by: Florian Westphal <fw@strlen.de>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
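
The validation described above can be sketched as follows. This is a minimal illustration only: `struct fingerprint`, `MAX_OPTS`, `MAX_GENRE` and `fingerprint_validate()` are hypothetical names, not the kernel's actual structures or the upstream patch.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define MAX_OPTS   40   /* hypothetical capacity of the option array */
#define MAX_GENRE  32   /* hypothetical size of a fixed string buffer */

/* Simplified stand-in for a fingerprint record supplied by user space. */
struct fingerprint {
	unsigned short opt_num;          /* number of valid entries in opt[] */
	unsigned short opt[MAX_OPTS];
	char genre[MAX_GENRE];
};

/* Return 0 if the record is safe to iterate over, -EINVAL otherwise. */
static int fingerprint_validate(const struct fingerprint *f)
{
	/* Reject counts that would index past the end of opt[]. */
	if (f->opt_num > MAX_OPTS)
		return -EINVAL;

	/* A fixed-size string must contain a terminating NUL. */
	if (!memchr(f->genre, '\0', sizeof(f->genre)))
		return -EINVAL;

	return 0;
}
```

The point of the sketch is that any user-controlled count is checked against the array capacity once, when the record is accepted, so later loops over `opt[]` cannot read out of bounds.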

netfilter: nftables: exthdr: fix 4-byte stack OOB write [+ + +]

Author: Florian Westphal <fw@strlen.de>
Date:   Tue Sep 5 23:13:56 2023 +0200

    netfilter: nftables: exthdr: fix 4-byte stack OOB write
    
    [ Upstream commit fd94d9dadee58e09b49075240fe83423eb1dcd36 ]
    
    If priv->len is a multiple of 4, then dst[len / 4] can write past
    the destination array which leads to stack corruption.
    
    This construct is necessary to clean the remainder of the register
    in case ->len is NOT a multiple of the register size, so make it
    conditional just like nft_payload.c does.
    
    The bug was added in 4.1 cycle and then copied/inherited when
    tcp/sctp and ip option support was added.
    
    Bug reported by Zero Day Initiative project (ZDI-CAN-21950,
    ZDI-CAN-21951, ZDI-CAN-21961).
    
    Fixes: 49499c3e6e18 ("netfilter: nf_tables: switch registers to 32 bit addressing")
    Fixes: 935b7f643018 ("netfilter: nft_exthdr: add TCP option matching")
    Fixes: 133dc203d77d ("netfilter: nft_exthdr: Support SCTP chunks")
    Fixes: dbb5281a1f84 ("netfilter: nf_tables: add support for matching IPv4 options")
    Signed-off-by: Florian Westphal <fw@strlen.de>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
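
A minimal sketch of the conditional clearing described above, assuming 32-bit registers. `copy_to_regs()` and `REG32_SIZE` are illustrative names, not the nf_tables code itself.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define REG32_SIZE sizeof(uint32_t)

/*
 * Copy len bytes into an array of 32-bit registers.
 * dst must hold at least (len + 3) / 4 registers.
 */
static void copy_to_regs(uint32_t *dst, const void *src, size_t len)
{
	/*
	 * Only a partially filled last register needs its tail cleared.
	 * When len is a multiple of 4, dst[len / 4] is one register past
	 * the last one used, so writing it unconditionally would overflow.
	 */
	if (len % REG32_SIZE)
		dst[len / REG32_SIZE] = 0;

	memcpy(dst, src, len);
}
```

With `len == 8` the function touches exactly two registers; with `len == 5` it zeroes the second register first and then overwrites its first byte.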

NFS: Fix a potential data corruption [+ + +]

Author: Trond Myklebust <trond.myklebust@hammerspace.com>
Date:   Sat Aug 19 17:22:14 2023 -0400

    NFS: Fix a potential data corruption
    
    commit 88975a55969e11f26fe3846bf4fbf8e7dc8cbbd4 upstream.
    
    We must ensure that the subrequests are joined back into the head before
    we can retransmit a request. If the head was not on the commit lists,
    because the server wrote it synchronously, we still need to add it back
    to the retransmission list.
    Add a call that mirrors the effect of nfs_cancel_remove_inode() for
    O_DIRECT.
    
    Fixes: ed5d588fe47f ("NFS: Try to join page groups before an O_DIRECT retransmission")
    Cc: stable@vger.kernel.org
    Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
    Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

NFSv4/pnfs: minor fix for cleanup path in nfs4_get_device_info [+ + +]

Author: Fedor Pchelkin <pchelkin@ispras.ru>
Date:   Thu Jul 20 18:37:51 2023 +0300

    NFSv4/pnfs: minor fix for cleanup path in nfs4_get_device_info
    
    commit 96562c45af5c31b89a197af28f79bfa838fb8391 upstream.
    
    This is a very unlikely error case, but when the page allocation loop in
    nfs4_get_device_info() fails, we should free only the pages that were
    already allocated, as __free_page() can't deal with NULL arguments.
    
    Found by Linux Verification Center (linuxtesting.org).
    
    Cc: stable@vger.kernel.org
    Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru>
    Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
    Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
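
The cleanup pattern described above can be sketched in user space. Everything here is hypothetical: `flaky_alloc()` simulates an allocator that can fail, and `release_buf()` stands in for a release routine that must not be given NULL.

```c
#include <stddef.h>
#include <stdlib.h>

static int allocs_left;   /* how many allocations may still succeed */
static int released;      /* how many buffers were released */

/* Hypothetical allocator that fails once allocs_left reaches zero. */
static void *flaky_alloc(size_t size)
{
	if (allocs_left <= 0)
		return NULL;
	allocs_left--;
	return malloc(size);
}

/* Stand-in for a release routine that must never see NULL. */
static void release_buf(void *p)
{
	released++;
	free(p);
}

static int alloc_bufs(void **bufs, int n, size_t size)
{
	int i;

	for (i = 0; i < n; i++) {
		bufs[i] = flaky_alloc(size);
		if (!bufs[i])
			goto out_free;
	}
	return 0;

out_free:
	/* Unwind only entries [0, i): slot i and later were never allocated. */
	while (--i >= 0) {
		release_buf(bufs[i]);
		bufs[i] = NULL;
	}
	return -1;
}
```

Walking back from the failing index, rather than over the whole array, is what keeps NULL slots away from the release routine.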

null_blk: fix poll request timeout handling [+ + +]

Author: Chengming Zhou <zhouchengming@bytedance.com>
Date:   Fri Sep 1 20:03:06 2023 +0800

    null_blk: fix poll request timeout handling
    
    commit 5a26e45edb4690d58406178b5a9ea4c6dcf2c105 upstream.
    
    When running an io_uring benchmark on /dev/nullb0, it's easy to crash the
    kernel if a poll request timeout is triggered, as reported by David. [1]
    
    BUG: kernel NULL pointer dereference, address: 0000000000000008
    Workqueue: kblockd blk_mq_timeout_work
    RIP: 0010:null_timeout_rq+0x4e/0x91
    Call Trace:
     ? null_timeout_rq+0x4e/0x91
     blk_mq_handle_expired+0x31/0x4b
     bt_iter+0x68/0x84
     ? bt_tags_iter+0x81/0x81
     __sbitmap_for_each_set.constprop.0+0xb0/0xf2
     ? __blk_mq_complete_request_remote+0xf/0xf
     bt_for_each+0x46/0x64
     ? __blk_mq_complete_request_remote+0xf/0xf
     ? percpu_ref_get_many+0xc/0x2a
     blk_mq_queue_tag_busy_iter+0x14d/0x18e
     blk_mq_timeout_work+0x95/0x127
     process_one_work+0x185/0x263
     worker_thread+0x1b5/0x227
    
    This is indeed a race problem between null_timeout_rq() and null_poll().
    
    null_poll()                             null_timeout_rq()
      spin_lock(&nq->poll_lock)
      list_splice_init(&nq->poll_list, &list)
      spin_unlock(&nq->poll_lock)
    
      while (!list_empty(&list))
        req = list_first_entry()
        list_del_init()
        ...
        blk_mq_add_to_batch()
        // req->rq_next = NULL
                                            spin_lock(&nq->poll_lock)
    
                                            // rq->queuelist->next == NULL
                                            list_del_init(&rq->queuelist)
    
                                            spin_unlock(&nq->poll_lock)
    
    Fix this by setting the request state to MQ_RQ_COMPLETE under
    nq->poll_lock protection, so that null_timeout_rq() can safely detect
    this race and return early.
    
    Note that this patch only fixes the kernel panic when a request timeout happens.
    
    [1] https://lore.kernel.org/all/3893581.1691785261@warthog.procyon.org.uk/
    
    Fixes: 0a593fbbc245 ("null_blk: poll queue support")
    Reported-by: David Howells <dhowells@redhat.com>
    Tested-by: David Howells <dhowells@redhat.com>
    Reviewed-by: Ming Lei <ming.lei@redhat.com>
    Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
    Link: https://lore.kernel.org/r/20230901120306.170520-2-chengming.zhou@linux.dev
    Signed-off-by: Jens Axboe <axboe@kernel.dk>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

octeontx2-af: Fix truncation of smq in CN10K NIX AQ enqueue mbox handler [+ + +]

Author: Geetha sowjanya <gakula@marvell.com>
Date:   Tue Sep 5 12:18:16 2023 +0530

    octeontx2-af: Fix truncation of smq in CN10K NIX AQ enqueue mbox handler
    
    [ Upstream commit 29fe7a1b62717d58f033009874554d99d71f7d37 ]
    
    The smq value used in the CN10K NIX AQ instruction enqueue mailbox
    handler was truncated from a 10-bit value to a 9-bit value because the
    CN10K mbox request structure was typecast to the CN9K structure.
    This hasn't caused any problems when programming the NIX SQ
    context to the HW, because the context structure is the same size.
    However, it does cause a problem when accessing the structure parameters.
    This patch reads the right smq value for each platform.
    
    Fixes: 30077d210c83 ("octeontx2-af: cn10k: Update NIX/NPA context structure")
    Signed-off-by: Geetha sowjanya <gakula@marvell.com>
    Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
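
The effect of reading a 10-bit field through a 9-bit view can be shown with plain masking. This is only an arithmetic illustration: the field position (bit 0) and the helper `read_field()` are assumptions, not the real NIX SQ context layout.

```c
#include <stdint.h>

/* Extract a field of the given width (in bits) starting at bit 'shift'. */
static unsigned int read_field(uint64_t word, unsigned int shift,
			       unsigned int width)
{
	return (unsigned int)((word >> shift) & ((1ull << width) - 1));
}
```

For a stored value of 0x2A5, `read_field(word, 0, 10)` returns 0x2A5, while `read_field(word, 0, 9)` drops the top bit and returns 0x0A5.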

parisc: led: Fix LAN receive and transmit LEDs [+ + +]

Author: Helge Deller <deller@gmx.de>
Date:   Sun Aug 27 13:46:11 2023 +0200

    parisc: led: Fix LAN receive and transmit LEDs
    
    commit 4db89524b084f712a887256391fc19d9f66c8e55 upstream.
    
    Fix the LAN receive and LAN transmit LEDs, which were swapped
    up to now.
    
    Signed-off-by: Helge Deller <deller@gmx.de>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

parisc: led: Reduce CPU overhead for disk & lan LED computation [+ + +]

Author: Helge Deller <deller@gmx.de>
Date:   Fri Aug 25 17:46:39 2023 +0200

    parisc: led: Reduce CPU overhead for disk & lan LED computation
    
    commit 358ad816e52d4253b38c2f312e6b1cbd89e0dbf7 upstream.
    
    Older PA-RISC machines have LEDs which show disk and LAN activity.
    The computation is done in software and takes quite some time, e.g. on a
    J6500 this may take up to 60% of one CPU's time if the machine is loaded
    with network traffic.
    
    Since most people don't care about the LEDs, start with LEDs disabled and
    just show a CPU heartbeat LED. The disk and LAN LEDs can be turned on
    manually via /proc/pdc/led.
    
    Signed-off-by: Helge Deller <deller@gmx.de>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

perf annotate bpf: Don't enclose non-debug code with an assert() [+ + +]

Author: Arnaldo Carvalho de Melo <acme@redhat.com>
Date:   Wed Aug 2 18:22:14 2023 -0300

    perf annotate bpf: Don't enclose non-debug code with an assert()
    
    [ Upstream commit 979e9c9fc9c2a761303585e07fe2699bdd88182f ]
    
    In 616b14b47a86d880 ("perf build: Conditionally define NDEBUG") we
    started using NDEBUG=1 when DEBUG=1 isn't present, so code that is
    enclosed with assert() is not called.
    
    In dd317df072071903 ("perf build: Make binutil libraries opt in") we
    stopped linking against binutils-devel, for licensing reasons.
    
    Recently people asked me why annotation of BPF programs wasn't working,
    i.e. this:
    
      $ perf annotate bpf_prog_5280546344e3f45c_kfree_skb
    
    was returning:
    
      case SYMBOL_ANNOTATE_ERRNO__NO_LIBOPCODES_FOR_BPF:
         scnprintf(buf, buflen, "Please link with binutils's libopcode to enable BPF annotation");
    
    This was on a Fedora rpm, so it's new enough that I had to try to test by
    rebuilding using BUILD_NONDISTRO=1, only to get it segfaulting on me.
    
    This combination caused this libopcode function not to be called:
    
            assert(bfd_check_format(bfdf, bfd_object));
    
    Changing it to:
    
            if (!bfd_check_format(bfdf, bfd_object))
                    abort();
    
    Made it work. Looking at this "check" function made me realize that it
    changes the internal state of 'bfdf', i.e. we had better call it.
    
    So stop using assert() on it, just call it and abort if it fails.
    
    Probably it is better to propagate the error, etc, but it seems it is
    unlikely to fail from the usage done so far and we really need to stop
    using libopcodes, so do the quick fix above and move on.
    
    With it we have BPF annotation back working when built with
    BUILD_NONDISTRO=1:
    
      ⬢[acme@toolbox perf-tools-next]$ perf annotate --stdio2 bpf_prog_5280546344e3f45c_kfree_skb   | head
      No kallsyms or vmlinux with build-id 939bc71a1a51cdc434e60af93c7e734f7d5c0e7e was found
      Samples: 12  of event 'cpu-clock:ppp', 4000 Hz, Event count (approx.): 3000000, [percent: local period]
      bpf_prog_5280546344e3f45c_kfree_skb() bpf_prog_5280546344e3f45c_kfree_skb
      Percent      int kfree_skb(struct trace_event_raw_kfree_skb *args) {
                     nop
       33.33         xchg   %ax,%ax
                     push   %rbp
                     mov    %rsp,%rbp
                     sub    $0x180,%rsp
                     push   %rbx
                     push   %r13
      ⬢[acme@toolbox perf-tools-next]$
    
    Fixes: 6987561c9e86eace ("perf annotate: Enable annotation of BPF programs")
    Cc: Adrian Hunter <adrian.hunter@intel.com>
    Cc: Ian Rogers <irogers@google.com>
    Cc: Jiri Olsa <jolsa@kernel.org>
    Cc: Mohamed Mahmoud <mmahmoud@redhat.com>
    Cc: Namhyung Kim <namhyung@kernel.org>
    Cc: Dave Tucker <datucker@redhat.com>
    Cc: Derek Barbosa <debarbos@redhat.com>
    Cc: Song Liu <songliubraving@fb.com>
    Link: https://lore.kernel.org/lkml/ZMrMzoQBe0yqMek1@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
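
The pitfall described above is easy to reproduce in isolation. In this sketch `check_and_init()` is a hypothetical function that, like the check in the commit, has a side effect the caller depends on. `NDEBUG` is defined in the source only to mimic a non-debug build.

```c
#define NDEBUG            /* mimic a build without DEBUG=1 */
#include <assert.h>
#include <stdlib.h>

static int calls;         /* counts how often the check actually ran */

/* Hypothetical check that also updates state the caller relies on. */
static int check_and_init(void)
{
	calls++;
	return 1;
}

/* Broken: with NDEBUG defined, the whole expression is compiled out. */
static void broken(void)
{
	assert(check_and_init());
}

/* Fixed: the call always happens, and a failure still aborts. */
static void fixed(void)
{
	if (!check_and_init())
		abort();
}
```

After calling `broken()` the counter is still 0, while `fixed()` increments it, which is why code with side effects should not live inside `assert()`.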

perf hists browser: Fix hierarchy mode header [+ + +]

Author: Namhyung Kim <namhyung@kernel.org>
Date:   Mon Jul 31 02:49:32 2023 -0700

    perf hists browser: Fix hierarchy mode header
    
    commit e2cabf2a44791f01c21f8d5189b946926e34142e upstream.
    
    The commit ef9ff6017e3c4593 ("perf ui browser: Move the extra title
    lines from the hists browser") introduced ui_browser__gotorc_title() to
    help move non-title lines easily.  But it failed to update the title
    for the hierarchy mode, so the header line is not printed on the TUI at all.
    
      $ perf report --hierarchy
    
    Fixes: ef9ff6017e3c4593 ("perf ui browser: Move the extra title lines from the hists browser")
    Signed-off-by: Namhyung Kim <namhyung@kernel.org>
    Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    Cc: Adrian Hunter <adrian.hunter@intel.com>
    Cc: Ian Rogers <irogers@google.com>
    Cc: Ingo Molnar <mingo@kernel.org>
    Cc: Jiri Olsa <jolsa@kernel.org>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: stable@vger.kernel.org
    Link: https://lore.kernel.org/r/20230731094934.1616495-1-namhyung@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

perf hists browser: Fix the number of entries for 'e' key [+ + +]

Author: Namhyung Kim <namhyung@kernel.org>
Date:   Mon Jul 31 02:49:33 2023 -0700

    perf hists browser: Fix the number of entries for 'e' key
    
    commit f6b8436bede3e80226e8b2100279c4450c73806a upstream.
    
    The 'e' key toggles expand/collapse for the selected entry only.  But
    the current code has a bug: it increases the number of entries by only
    1 in the hierarchy mode, so users cannot move under the current entry
    after the key stroke.  This is due to a wrong assumption in
    hist_entry__set_folding().
    
    The commit b33f922651011eff ("perf hists browser: Put hist_entry folding
    logic into single function") factored out the code, but actually it
    should be handled separately.  hist_browser__set_folding() updates the
    fold state for each entry, so it needs to traverse all (child)
    entries regardless of the current fold state.  So it increases the
    number of entries by 1.
    
    But hist_entry__set_folding() only cares about the currently selected
    entry and all its children.  So it should count all unfolded child
    entries.  This is already implemented in hist_browser__toggle_fold(),
    so we can just call it.
    
    Fixes: b33f922651011eff ("perf hists browser: Put hist_entry folding logic into single function")
    Signed-off-by: Namhyung Kim <namhyung@kernel.org>
    Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    Cc: Adrian Hunter <adrian.hunter@intel.com>
    Cc: Ian Rogers <irogers@google.com>
    Cc: Ingo Molnar <mingo@kernel.org>
    Cc: Jiri Olsa <jolsa@kernel.org>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: stable@vger.kernel.org
    Link: https://lore.kernel.org/r/20230731094934.1616495-2-namhyung@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

perf test shell stat_bpf_counters: Fix test on Intel [+ + +]

Author: Namhyung Kim <namhyung@kernel.org>
Date:   Fri Aug 25 09:41:51 2023 -0700

    perf test shell stat_bpf_counters: Fix test on Intel
    
    commit 68ca249c964f520af7f8763e22f12bd26b57b870 upstream.
    
    As of now, bpf counters (bperf) don't support event groups.  But the
    default perf stat includes topdown metrics if supported (on recent Intel
    machines), which require groups.  That makes perf stat exit.
    
      $ sudo perf stat --bpf-counter true
      bpf managed perf events do not yet support groups.
    
    Actually the test explicitly uses the cycles event only, but it failed to
    pass the option when checking the availability of the command.
    
    Fixes: 2c0cb9f56020d2ea ("perf test: Add a shell test for 'perf stat --bpf-counters' new option")
    Reviewed-by: Song Liu <song@kernel.org>
    Signed-off-by: Namhyung Kim <namhyung@kernel.org>
    Cc: Adrian Hunter <adrian.hunter@intel.com>
    Cc: Ian Rogers <irogers@google.com>
    Cc: Ingo Molnar <mingo@kernel.org>
    Cc: Jiri Olsa <jolsa@kernel.org>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: bpf@vger.kernel.org
    Cc: stable@vger.kernel.org
    Link: https://lore.kernel.org/r/20230825164152.165610-2-namhyung@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

perf test stat_bpf_counters_cgrp: Enhance perf stat cgroup BPF counter test [+ + +]

Author: Namhyung Kim <namhyung@kernel.org>
Date:   Fri Aug 25 09:41:52 2023 -0700

    perf test stat_bpf_counters_cgrp: Enhance perf stat cgroup BPF counter test
    
    [ Upstream commit a84260e314029e6dc9904fd6eabf8d9fd7965351 ]
    
    It has a system-wide test and a cpu-list test, but the cpu-list test fails
    sometimes.  It runs the sleep command on CPU1 and measures both the
    user.slice and system.slice cgroups by default (on systemd-based systems).
    
    But if the system is idle enough, sometimes system.slice gets no
    count, which makes the test fail.  Maybe that's because it only looks
    at CPU1, so let's add CPU0 to increase the chance that it finds some tasks.
    
    Fixes: 7901086014bbaa3a ("perf test: Add a new test for perf stat cgroup BPF counter")
    Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org>
    Signed-off-by: Namhyung Kim <namhyung@kernel.org>
    Cc: Adrian Hunter <adrian.hunter@intel.com>
    Cc: Ian Rogers <irogers@google.com>
    Cc: Ingo Molnar <mingo@kernel.org>
    Cc: Jiri Olsa <jolsa@kernel.org>
    Cc: Namhyung Kim <namhyung@kernel.org>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: bpf@vger.kernel.org
    Link: https://lore.kernel.org/r/20230825164152.165610-3-namhyung@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

perf test stat_bpf_counters_cgrp: Fix shellcheck issue about logical operators [+ + +]

Author: Kajol Jain <kjain@linux.ibm.com>
Date:   Sun Jul 9 23:57:39 2023 +0530

    perf test stat_bpf_counters_cgrp: Fix shellcheck issue about logical operators
    
    [ Upstream commit 0dd1f815545d7210150642741c364521cc5cf116 ]
    
    Running shellcheck on stat_bpf_counters_cgrp.sh generates the warnings below:
    
    In stat_bpf_counters_cgrp.sh line 28:
            if [ -d /sys/fs/cgroup/system.slice -a -d /sys/fs/cgroup/user.slice ]; then
                                                ^-- SC2166 (warning): Prefer [ p ] && [ q ] as [ p -a q ] is not well defined.
    
    In stat_bpf_counters_cgrp.sh line 34:
            local self_cgrp=$(grep perf_event /proc/self/cgroup | cut -d: -f3)
            ^-------------^ SC3043 (warning): In POSIX sh, 'local' is undefined.
                  ^-------^ SC2155 (warning): Declare and assign separately to avoid masking return values.
                            ^-- SC2046 (warning): Quote this to prevent word splitting.
    
    In stat_bpf_counters_cgrp.sh line 51:
            local output
            ^----------^ SC3043 (warning): In POSIX sh, 'local' is undefined.
    
    In stat_bpf_counters_cgrp.sh line 65:
            local output
            ^----------^ SC3043 (warning): In POSIX sh, 'local' is undefined.
    
    Fix the above warnings by:
    - Changing the expression [p -a q] to [p] && [q].
    - Fixing shellcheck warnings for local usage by prefixing
      the function name to the variable.
    
    Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
    Acked-by: Ian Rogers <irogers@google.com>
    Cc: Disha Goel <disgoel@linux.vnet.ibm.com>
    Cc: Jiri Olsa <jolsa@kernel.org>
    Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
    Cc: Namhyung Kim <namhyung@kernel.org>
    Cc: linuxppc-dev@lists.ozlabs.org
    Link: https://lore.kernel.org/r/20230709182800.53002-6-atrajeev@linux.vnet.ibm.com
    Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    Stable-dep-of: a84260e31402 ("perf test stat_bpf_counters_cgrp: Enhance perf stat cgroup BPF counter test")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

perf tools: Handle old data in PERF_RECORD_ATTR [+ + +]

Author: Namhyung Kim <namhyung@kernel.org>
Date:   Fri Aug 25 08:25:49 2023 -0700

    perf tools: Handle old data in PERF_RECORD_ATTR
    
    commit 9bf63282ea77a531ea58acb42fb3f40d2d1e4497 upstream.
    
    PERF_RECORD_ATTR is used in pipe mode to describe an event with its
    attribute and IDs.  The ID table comes after the attr, and its size is
    calculated from the total record size and the attr size.
    
      n_ids = (total_record_size - end_of_the_attr_field) / sizeof(u64)
    
    This is fine for most use cases, but sometimes the pipe output is saved
    in a file and processed later.  This becomes a problem if the attr size
    changes between the record and the report.
    
      $ perf record -o- > perf-pipe.data  # old version
      $ perf report -i- < perf-pipe.data  # new version
    
    For example, if the attr size is 128 and it has 4 IDs, then it would
    save them in 168 bytes like below:
    
       8 byte: perf event header { .type = PERF_RECORD_ATTR, .size = 168 },
     128 byte: perf event attr { .size = 128, ... },
      32 byte: event IDs [] = { 1234, 1235, 1236, 1237 },
    
    But when reporting later, it thinks the attr size is 136, so it reads only
    the last 3 entries as IDs.
    
       8 byte: perf event header { .type = PERF_RECORD_ATTR, .size = 168 },
     136 byte: perf event attr { .size = 136, ... },
      24 byte: event IDs [] = { 1235, 1236, 1237 },  // 1234 is missing
    
    So it should use the recorded version of the attr.  The attr already has
    a size field, so that size should be honored when reading the data.
    
    Fixes: 2c46dbb517a10b18 ("perf: Convert perf header attrs into attr events")
    Signed-off-by: Namhyung Kim <namhyung@kernel.org>
    Cc: Adrian Hunter <adrian.hunter@intel.com>
    Cc: Ian Rogers <irogers@google.com>
    Cc: Ingo Molnar <mingo@kernel.org>
    Cc: Jiri Olsa <jolsa@kernel.org>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Tom Zanussi <zanussi@kernel.org>
    Cc: stable@vger.kernel.org
    Link: https://lore.kernel.org/r/20230825152552.112913-1-namhyung@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
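
The arithmetic in the example above can be checked with a few lines of C. `n_ids()` and `HEADER_SIZE` are illustrative names; the 8-byte header size is taken from the layout shown in the commit message.

```c
#include <stddef.h>
#include <stdint.h>

#define HEADER_SIZE 8u   /* size of the perf event header in the example */

/* Number of u64 IDs that follow an attr of attr_size bytes in a record. */
static size_t n_ids(size_t record_size, size_t attr_size)
{
	/* Guard against a record too small to hold the header and attr. */
	if (record_size < HEADER_SIZE + attr_size)
		return 0;

	return (record_size - HEADER_SIZE - attr_size) / sizeof(uint64_t);
}
```

With the recorded attr size, `n_ids(168, 128)` gives 4 IDs. Using the reader's own attr size instead, `n_ids(168, 136)` gives only 3, which is the lost ID shown above.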

perf top: Don't pass an ERR_PTR() directly to perf_session__delete() [+ + +]

Author: Arnaldo Carvalho de Melo <acme@redhat.com>
Date:   Thu Aug 17 09:11:21 2023 -0300

    perf top: Don't pass an ERR_PTR() directly to perf_session__delete()
    
    [ Upstream commit ef23cb593304bde0cc046fd4cc83ae7ea2e24f16 ]
    
    While debugging a segfault on 'perf lock contention' without an
    available perf.data file I noticed that it was basically calling:
    
            perf_session__delete(ERR_PTR(-1))
    
    Resulting in:
    
      (gdb) run lock contention
      Starting program: /root/bin/perf lock contention
      [Thread debugging using libthread_db enabled]
      Using host libthread_db library "/lib64/libthread_db.so.1".
      failed to open perf.data: No such file or directory  (try 'perf record' first)
      Initializing perf session failed
    
      Program received signal SIGSEGV, Segmentation fault.
      0x00000000005e7515 in auxtrace__free (session=0xffffffffffffffff) at util/auxtrace.c:2858
      2858          if (!session->auxtrace)
      (gdb) p session
      $1 = (struct perf_session *) 0xffffffffffffffff
      (gdb) bt
      #0  0x00000000005e7515 in auxtrace__free (session=0xffffffffffffffff) at util/auxtrace.c:2858
      #1  0x000000000057bb4d in perf_session__delete (session=0xffffffffffffffff) at util/session.c:300
      #2  0x000000000047c421 in __cmd_contention (argc=0, argv=0x7fffffffe200) at builtin-lock.c:2161
      #3  0x000000000047dc95 in cmd_lock (argc=0, argv=0x7fffffffe200) at builtin-lock.c:2604
      #4  0x0000000000501466 in run_builtin (p=0xe597a8 <commands+552>, argc=2, argv=0x7fffffffe200) at perf.c:322
      #5  0x00000000005016d5 in handle_internal_command (argc=2, argv=0x7fffffffe200) at perf.c:375
      #6  0x0000000000501824 in run_argv (argcp=0x7fffffffe02c, argv=0x7fffffffe020) at perf.c:419
      #7  0x0000000000501b11 in main (argc=2, argv=0x7fffffffe200) at perf.c:535
      (gdb)
    
    So just set it to NULL after using PTR_ERR(session) to decode the error
    as perf_session__delete(NULL) is supported.
    
    The same problem was found in 'perf top' after an audit of all
    perf_session__new() failure handling.
    
    Fixes: 6ef81c55a2b6584c ("perf session: Return error code for perf_session__new() function on failure")
    Cc: Adrian Hunter <adrian.hunter@intel.com>
    Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
    Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
    Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com>
    Cc: Jiri Olsa <jolsa@kernel.org>
    Cc: Kate Stewart <kstewart@linuxfoundation.org>
    Cc: Mamatha Inamdar <mamatha4@linux.vnet.ibm.com>
    Cc: Mukesh Ojha <mojha@codeaurora.org>
    Cc: Nageswara R Sastry <rnsastry@linux.vnet.ibm.com>
    Cc: Namhyung Kim <namhyung@kernel.org>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
    Cc: Shawn Landden <shawn@git.icu>
    Cc: Song Liu <songliubraving@fb.com>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: Tzvetomir Stoyanov <tstoyanov@vmware.com>
    Link: https://lore.kernel.org/lkml/ZN4Q2rxxsL08A8rd@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
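
A user-space sketch of the error-handling pattern described above. The `ERR_PTR()`, `PTR_ERR()` and `IS_ERR()` helpers are simplified re-implementations of the kernel-style helpers, and `struct session`, `session_new()`, `session_delete()` and `run()` are hypothetical stand-ins for the perf session API.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_ERRNO 4095

/* Simplified error-pointer helpers in the style of the kernel's err.h. */
static inline void *ERR_PTR(long err) { return (void *)err; }
static inline long PTR_ERR(const void *p) { return (long)p; }
static inline int IS_ERR(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

struct session { int unused; };

static int deleted;   /* counts real deletions, for demonstration */

/* Accepts NULL, but would dereference an error pointer if given one. */
static void session_delete(struct session *s)
{
	if (!s)
		return;
	deleted++;
	free(s);
}

/* Returns an error pointer on failure instead of NULL. */
static struct session *session_new(int fail)
{
	if (fail)
		return ERR_PTR(-ENOENT);
	return calloc(1, sizeof(struct session));
}

static int run(int fail)
{
	int err = 0;
	struct session *s = session_new(fail);

	if (IS_ERR(s)) {
		err = (int)PTR_ERR(s);  /* decode the error first ... */
		s = NULL;               /* ... then make the pointer safe to delete */
	}

	session_delete(s);
	return err;
}
```

Resetting the pointer to NULL right after decoding the error lets a single cleanup path serve both the success and the failure case.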

perf trace: Really free the evsel->priv area [+ + +]

Author: Arnaldo Carvalho de Melo <acme@redhat.com>
Date:   Wed Jul 19 15:37:14 2023 -0300

    perf trace: Really free the evsel->priv area
    
    [ Upstream commit 7962ef13651a9163f07b530607392ea123482e8a ]
    
    In 3cb4d5e00e037c70 ("perf trace: Free syscall tp fields in
    evsel->priv") it was only freeing if strcmp(evsel->tp_format->system,
    "syscalls") returned zero, while the corresponding initialization of
    evsel->priv was being performed if it was _not_ zero, i.e. if the tp
    system wasn't 'syscalls'.
    
    Just stop looking for that and free it if evsel->priv was set, which
    should be equivalent.
    
    Also use the pre-existing evsel_trace__delete() function.
    
    This resolves these leaks, detected with:
    
      $ make EXTRA_CFLAGS="-fsanitize=address" BUILD_BPF_SKEL=1 CORESIGHT=1 O=/tmp/build/perf-tools-next -C tools/perf install-bin
    
      =================================================================
      ==481565==ERROR: LeakSanitizer: detected memory leaks
    
      Direct leak of 40 byte(s) in 1 object(s) allocated from:
          #0 0x7f7343cba097 in calloc (/lib64/libasan.so.8+0xba097)
          #1 0x987966 in zalloc (/home/acme/bin/perf+0x987966)
          #2 0x52f9b9 in evsel_trace__new /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:307
          #3 0x52f9b9 in evsel__syscall_tp /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:333
          #4 0x52f9b9 in evsel__init_raw_syscall_tp /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:458
          #5 0x52f9b9 in perf_evsel__raw_syscall_newtp /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:480
          #6 0x540e8b in trace__add_syscall_newtp /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:3212
          #7 0x540e8b in trace__run /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:3891
          #8 0x540e8b in cmd_trace /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:5156
          #9 0x5ef262 in run_builtin /home/acme/git/perf-tools-next/tools/perf/perf.c:323
          #10 0x4196da in handle_internal_command /home/acme/git/perf-tools-next/tools/perf/perf.c:377
          #11 0x4196da in run_argv /home/acme/git/perf-tools-next/tools/perf/perf.c:421
          #12 0x4196da in main /home/acme/git/perf-tools-next/tools/perf/perf.c:537
          #13 0x7f7342c4a50f in __libc_start_call_main (/lib64/libc.so.6+0x2750f)
    
      Direct leak of 40 byte(s) in 1 object(s) allocated from:
          #0 0x7f7343cba097 in calloc (/lib64/libasan.so.8+0xba097)
          #1 0x987966 in zalloc (/home/acme/bin/perf+0x987966)
          #2 0x52f9b9 in evsel_trace__new /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:307
          #3 0x52f9b9 in evsel__syscall_tp /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:333
          #4 0x52f9b9 in evsel__init_raw_syscall_tp /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:458
          #5 0x52f9b9 in perf_evsel__raw_syscall_newtp /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:480
          #6 0x540dd1 in trace__add_syscall_newtp /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:3205
          #7 0x540dd1 in trace__run /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:3891
          #8 0x540dd1 in cmd_trace /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:5156
          #9 0x5ef262 in run_builtin /home/acme/git/perf-tools-next/tools/perf/perf.c:323
          #10 0x4196da in handle_internal_command /home/acme/git/perf-tools-next/tools/perf/perf.c:377
          #11 0x4196da in run_argv /home/acme/git/perf-tools-next/tools/perf/perf.c:421
          #12 0x4196da in main /home/acme/git/perf-tools-next/tools/perf/perf.c:537
          #13 0x7f7342c4a50f in __libc_start_call_main (/lib64/libc.so.6+0x2750f)
    
      SUMMARY: AddressSanitizer: 80 byte(s) leaked in 2 allocation(s).
      [root@quaco ~]#
    
    With this we plug all leaks with "perf trace sleep 1".
    
    Fixes: 3cb4d5e00e037c70 ("perf trace: Free syscall tp fields in evsel->priv")
    Acked-by: Ian Rogers <irogers@google.com>
    Cc: Adrian Hunter <adrian.hunter@intel.com>
    Cc: Jiri Olsa <jolsa@kernel.org>
    Cc: Namhyung Kim <namhyung@kernel.org>
    Cc: Riccardo Mancini <rickyman7@gmail.com>
    Link: https://lore.kernel.org/lkml/20230719202951.534582-5-acme@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

perf trace: Use zfree() to reduce chances of use after free [+ + +]

Author: Arnaldo Carvalho de Melo <acme@redhat.com>
Date:   Wed Apr 12 09:50:08 2023 -0300

    perf trace: Use zfree() to reduce chances of use after free
    
    [ Upstream commit 9997d5dd177c52017fa0541bf236a4232c8148e6 ]
    
    Program defensively by using zfree() to reset freed pointers to NULL,
    so that any later use after free results in a NULL pointer dereference
    instead of more subtle behaviour.
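
    A minimal userspace sketch of the free-and-reset idea follows. ZFREE(),
    zfree_ptr(), struct evsel_like and its helpers are hypothetical names used
    for illustration only; they are modelled on the behaviour described above,
    not copied from the perf sources.

```c
#include <stdlib.h>
#include <string.h>

/* Free *ptr and reset it to NULL. */
static void zfree_ptr(void **ptr)
{
	free(*ptr);
	*ptr = NULL;
}
#define ZFREE(ptr) zfree_ptr((void **)(ptr))

struct evsel_like {		/* hypothetical holder of an owned pointer */
	char *priv;
};

/* Give the object a private heap copy of s (priv stays NULL on failure). */
static void evsel_like_set(struct evsel_like *e, const char *s)
{
	size_t len = strlen(s) + 1;

	e->priv = malloc(len);
	if (e->priv)
		memcpy(e->priv, s, len);
}

/* Release the private area; e->priv is NULL afterwards. */
static void evsel_like_release(struct evsel_like *e)
{
	ZFREE(&e->priv);
}
```

    Because the pointer is NULL after release, a second release is a harmless
    free(NULL), and a stray dereference faults immediately instead of reading
    recycled memory.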
    
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    Stable-dep-of: 7962ef13651a ("perf trace: Really free the evsel->priv area")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

perf vendor events: Drop some of the JSON/events for power10 platform [+ + +]

Author: Kajol Jain <kjain@linux.ibm.com>
Date:   Mon Aug 14 16:57:58 2023 +0530

    perf vendor events: Drop some of the JSON/events for power10 platform
    
    [ Upstream commit e104df97b8dcfbab2e42de634b99bf03f0805d85 ]
    
    Drop some of the JSON/events for power10 platform due to counter
    data mismatch.
    
    Fixes: 32daa5d7899e0343 ("perf vendor events: Initial JSON/events list for power10 platform")
    Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
    Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
    Cc: Disha Goel <disgoel@linux.ibm.com>
    Cc: Ian Rogers <irogers@google.com>
    Cc: Kajol Jain <kjain@linux.ibm.com>
    Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
    Cc: Namhyung Kim <namhyung@kernel.org>
    Cc: linuxppc-dev@lists.ozlabs.org
    Link: https://lore.kernel.org/r/20230814112803.1508296-2-kjain@linux.ibm.com
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

perf vendor events: Drop STORES_PER_INST metric event for power10 platform [+ + +]

Author: Kajol Jain <kjain@linux.ibm.com>
Date:   Mon Aug 14 16:57:59 2023 +0530

    perf vendor events: Drop STORES_PER_INST metric event for power10 platform
    
    [ Upstream commit 4836b9a85ef148c7c9779b66fab3f7279e488d90 ]
    
    Drop the STORES_PER_INST metric event for the power10 platform, as its
    metric expression uses the dropped event PM_ST_FIN.
    
    Fixes: 3ca3af7d1f230d1f ("perf vendor events power10: Add metric events JSON file for power10 platform")
    Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
    Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
    Cc: Disha Goel <disgoel@linux.ibm.com>
    Cc: Ian Rogers <irogers@google.com>
    Cc: Kajol Jain <kjain@linux.ibm.com>
    Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
    Cc: Namhyung Kim <namhyung@kernel.org>
    Cc: linuxppc-dev@lists.ozlabs.org
    Link: https://lore.kernel.org/r/20230814112803.1508296-3-kjain@linux.ibm.com
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

perf vendor events: Update the JSON/events descriptions for power10 platform [+ + +]

Author: Kajol Jain <kjain@linux.ibm.com>
Date:   Mon Aug 14 16:57:57 2023 +0530

    perf vendor events: Update the JSON/events descriptions for power10 platform
    
    [ Upstream commit 3286f88f31da060ac2789cee247153961ba57e49 ]
    
    Update the description for some of the JSON/events for power10 platform.
    
    Fixes: 32daa5d7899e0343 ("perf vendor events: Initial JSON/events list for power10 platform")
    Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
    Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
    Cc: Disha Goel <disgoel@linux.ibm.com>
    Cc: Ian Rogers <irogers@google.com>
    Cc: Kajol Jain <kjain@linux.ibm.com>
    Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
    Cc: Namhyung Kim <namhyung@kernel.org>
    Cc: linuxppc-dev@lists.ozlabs.org
    Link: https://lore.kernel.org/r/20230814112803.1508296-1-kjain@linux.ibm.com
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

pinctrl: cherryview: fix address_space_handler() argument [+ + +]

Author: Raag Jadav <raag.jadav@intel.com>
Date:   Tue Aug 22 12:53:40 2023 +0530

    pinctrl: cherryview: fix address_space_handler() argument
    
    commit d5301c90716a8e20bc961a348182daca00c8e8f0 upstream.
    
    First argument of acpi_*_address_space_handler() APIs is acpi_handle of
    the device, which is incorrectly passed in driver ->remove() path here.
    Fix it by passing the appropriate argument and while at it, make both
    API calls consistent using ACPI_HANDLE().
    
    Fixes: a0b028597d59 ("pinctrl: cherryview: Add support for GMMR GPIO opregion")
    Cc: stable@vger.kernel.org
    Signed-off-by: Raag Jadav <raag.jadav@intel.com>
    Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com>
    Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

platform/mellanox: mlxbf-pmc: Fix potential buffer overflows [+ + +]

Author: Shravan Kumar Ramani <shravankr@nvidia.com>
Date:   Tue Sep 5 08:49:32 2023 -0400

    platform/mellanox: mlxbf-pmc: Fix potential buffer overflows
    
    [ Upstream commit 80ccd40568bcd3655b0fd0be1e9b3379fd6e1056 ]
    
    Replace sprintf with sysfs_emit where possible.
    Size check in mlxbf_pmc_event_list_show should account for "\0".
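
    The size-check point can be illustrated with a small userspace sketch.
    BUF_SIZE and append_entry() are hypothetical; BUF_SIZE merely stands in for
    the page-sized sysfs buffer, and snprintf() stands in for the bounded
    kernel helper.

```c
#include <stdio.h>
#include <string.h>

#define BUF_SIZE 32	/* small stand-in for a page-sized output buffer */

/*
 * Append "name\n" to buf, which already holds len bytes of text.
 * Returns the new length, or len unchanged if the entry would not fit
 * together with the terminating '\0'.
 */
static size_t append_entry(char *buf, size_t len, const char *name)
{
	size_t need = strlen(name) + 1;		/* name plus '\n' */

	if (len + need + 1 > BUF_SIZE)		/* +1 for the trailing '\0' */
		return len;
	snprintf(buf + len, BUF_SIZE - len, "%s\n", name);
	return len + need;
}
```

    Leaving out the extra byte in the check would accept an entry whose
    terminator lands one past the end of the buffer.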
    
    Fixes: 1a218d312e65 ("platform/mellanox: mlxbf-pmc: Add Mellanox BlueField PMC driver")
    Signed-off-by: Shravan Kumar Ramani <shravankr@nvidia.com>
    Reviewed-by: Vadim Pasternak <vadimp@nvidia.com>
    Reviewed-by: David Thompson <davthompson@nvidia.com>
    Link: https://lore.kernel.org/r/bef39ef32319a31b32f999065911f61b0d3b17c3.1693917738.git.shravankr@nvidia.com
    Signed-off-by: Hans de Goede <hdegoede@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

platform/mellanox: mlxbf-pmc: Fix reading of unprogrammed events [+ + +]

Author: Shravan Kumar Ramani <shravankr@nvidia.com>
Date:   Tue Sep 5 08:49:33 2023 -0400

    platform/mellanox: mlxbf-pmc: Fix reading of unprogrammed events
    
    [ Upstream commit 0f5969452e162efc50bdc98968fb62b424a9874b ]
    
    This fix involves 2 changes:
     - All event regs have a reset value of 0, which is not a valid
       event_number as per the event_list for most blocks and hence seen
       as an error. Add a "disable" event with event_number 0 for all blocks.
    
     - The enable bit for each counter need not be checked before
       reading the event info, so that check is removed.
    
    Fixes: 1a218d312e65 ("platform/mellanox: mlxbf-pmc: Add Mellanox BlueField PMC driver")
    Signed-off-by: Shravan Kumar Ramani <shravankr@nvidia.com>
    Reviewed-by: Vadim Pasternak <vadimp@nvidia.com>
    Reviewed-by: David Thompson <davthompson@nvidia.com>
    Link: https://lore.kernel.org/r/04d0213932d32681de1c716b54320ed894e52425.1693917738.git.shravankr@nvidia.com
    Signed-off-by: Hans de Goede <hdegoede@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

platform/mellanox: mlxbf-tmfifo: Drop jumbo frames [+ + +]

Author: Liming Sun <limings@nvidia.com>
Date:   Tue Aug 29 13:43:00 2023 -0400

    platform/mellanox: mlxbf-tmfifo: Drop jumbo frames
    
    [ Upstream commit fc4c655821546239abb3cf4274d66b9747aa87dd ]
    
    This commit drops oversized network packets to keep the tmfifo
    queue from getting stuck.
    
    Fixes: 1357dfd7261f ("platform/mellanox: Add TmFifo driver for Mellanox BlueField Soc")
    Signed-off-by: Liming Sun <limings@nvidia.com>
    Reviewed-by: Vadim Pasternak <vadimp@nvidia.com>
    Reviewed-by: David Thompson <davthompson@nvidia.com>
    Link: https://lore.kernel.org/r/9318936c2447f76db475c985ca6d91f057efcd41.1693322547.git.limings@nvidia.com
    Signed-off-by: Hans de Goede <hdegoede@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

platform/mellanox: mlxbf-tmfifo: Drop the Rx packet if no more descriptors [+ + +]

Author: Liming Sun <limings@nvidia.com>
Date:   Tue Aug 29 13:42:59 2023 -0400

    platform/mellanox: mlxbf-tmfifo: Drop the Rx packet if no more descriptors
    
    [ Upstream commit 78034cbece79c2d730ad0770b3b7f23eedbbecf5 ]
    
    This commit fixes a tmfifo console hang that occurs when the virtual
    networking interface is down. In that case the network Rx descriptors
    run out, so the Rx network packet stays at the head of the tmfifo and
    blocks the console packets. The fix is to drop the Rx network packet
    when no more Rx descriptors are available. The function
    mlxbf_tmfifo_release_pending_pkt() is also renamed to
    mlxbf_tmfifo_release_pkt(), which is a more appropriate name.
    
    Fixes: 1357dfd7261f ("platform/mellanox: Add TmFifo driver for Mellanox BlueField Soc")
    Signed-off-by: Liming Sun <limings@nvidia.com>
    Reviewed-by: Vadim Pasternak <vadimp@nvidia.com>
    Reviewed-by: David Thompson <davthompson@nvidia.com>
    Link: https://lore.kernel.org/r/8c0177dc938ae03f52ff7e0b62dbeee74b7bec09.1693322547.git.limings@nvidia.com
    Signed-off-by: Hans de Goede <hdegoede@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

platform/mellanox: NVSW_SN2201 should depend on ACPI [+ + +]

Author: Geert Uytterhoeven <geert+renesas@glider.be>
Date:   Mon Sep 4 14:00:35 2023 +0200

    platform/mellanox: NVSW_SN2201 should depend on ACPI
    
    [ Upstream commit 0a138f1670bd1af13ba6949c48ea86ddd4bf557e ]
    
    The only probing method supported by the Nvidia SN2201 platform driver
    is probing through an ACPI match table.  Hence add a dependency on
    ACPI, to prevent asking the user about this driver when configuring a
    kernel without ACPI support.
    
    Fixes: 662f24826f95 ("platform/mellanox: Add support for new SN2201 system")
    Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
    Acked-by: Vadim Pasternak <vadimp@nvidia.com>
    Acked-by: Andi Shyti <andi.shyti@kernel.org>
    Link: https://lore.kernel.org/r/ec5a4071691ab08d58771b7732a9988e89779268.1693828363.git.geert+renesas@glider.be
    Signed-off-by: Hans de Goede <hdegoede@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

pwm: atmel-tcb: Convert to platform remove callback returning void [+ + +]

Author: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Date:   Fri Mar 3 19:54:17 2023 +0100

    pwm: atmel-tcb: Convert to platform remove callback returning void
    
    [ Upstream commit 9609284a76978daf53a54e05cff36873a75e4d13 ]
    
    The .remove() callback for a platform driver returns an int which makes
    many driver authors wrongly assume it's possible to do error handling by
    returning an error code. However the value returned is (mostly) ignored
    and this typically results in resource leaks. To improve here there is a
    quest to make the remove callback return void. In the first step of this
    quest all drivers are converted to .remove_new() which already returns
    void.
    
    Trivially convert this driver from always returning zero in the remove
    callback to the void returning variant.
    
    Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
    Reviewed-by: Claudiu Beznea <claudiu.beznea@microchip.com>
    Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
    Stable-dep-of: c11622324c02 ("pwm: atmel-tcb: Fix resource freeing in error path and remove")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

pwm: atmel-tcb: Fix resource freeing in error path and remove [+ + +]

Author: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Date:   Wed Jul 19 21:20:10 2023 +0200

    pwm: atmel-tcb: Fix resource freeing in error path and remove
    
    [ Upstream commit c11622324c023415fb69196c5fc3782d2b8cced0 ]
    
    Several resources were not freed in the error path and the remove
    function. Add the forgotten items.
    
    Fixes: 34cbcd72588f ("pwm: atmel-tcb: Add sama5d2 support")
    Fixes: 061f8572a31c ("pwm: atmel-tcb: Switch to new binding")
    Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
    Reviewed-by: Claudiu Beznea <claudiu.beznea@tuxon.dev>
    Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

pwm: atmel-tcb: Harmonize resource allocation order [+ + +]

Author: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Date:   Wed Jul 19 21:20:09 2023 +0200

    pwm: atmel-tcb: Harmonize resource allocation order
    
    [ Upstream commit 0323e8fedd1ef25342cf7abf3a2024f5670362b8 ]
    
    Allocate driver data as first resource in the probe function. This way it
    can be used during allocation of the other resources (instead of assigning
    these to local variables first and updating driver data only when it's
    allocated). Also as driver data is allocated using a devm function this
    should happen first to have the order of freeing resources in the error
    path and the remove function in reverse.
    
    Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
    Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
    Stable-dep-of: c11622324c02 ("pwm: atmel-tcb: Fix resource freeing in error path and remove")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

pwm: lpc32xx: Remove handling of PWM channels [+ + +]

Author: Vladimir Zapolskiy <vz@mleia.com>
Date:   Mon Jul 17 17:52:57 2023 +0200

    pwm: lpc32xx: Remove handling of PWM channels
    
    [ Upstream commit 4aae44f65827f0213a7361cf9c32cfe06114473f ]
    
    Because LPC32xx PWM controllers have only a single output which is
    registered as the only PWM device/channel per controller, it is known in
    advance that pwm->hwpwm value is always 0. On the basis of this fact,
    simplify the code by removing operations with pwm->hwpwm; there are no
    controls that require a channel number as input.
    
    Even though I wasn't aware at the time when I forward ported that patch,
    this fixes a null pointer dereference as lpc32xx->chip.pwms is NULL
    before devm_pwmchip_add() is called.
    
    Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
    Signed-off-by: Vladimir Zapolskiy <vz@mleia.com>
    Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
    Fixes: 3d2813fb17e5 ("pwm: lpc32xx: Don't modify HW state in .probe() after the PWM chip was registered")
    Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

r8152: check budget for r8152_poll() [+ + +]

Author: Hayes Wang <hayeswang@realtek.com>
Date:   Fri Sep 8 15:01:52 2023 +0800

    r8152: check budget for r8152_poll()
    
    [ Upstream commit a7b8d60b37237680009dd0b025fe8c067aba0ee3 ]
    
    According to the NAPI documentation, no rx processing is done when the
    budget is 0. Therefore, r8152_poll() has to return 0 directly when the
    budget is equal to 0.
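
    The budget rule can be sketched in userspace C as follows. struct fake_dev,
    rx_bottom_sketch() and poll_sketch() are hypothetical and only model the
    control flow; they are not the driver's code.

```c
struct fake_dev {		/* hypothetical device state */
	int rx_pending;		/* packets waiting in the rx ring */
};

/* Handle at most budget rx packets; returns how many were handled. */
static int rx_bottom_sketch(struct fake_dev *d, int budget)
{
	int done = 0;

	while (done < budget && d->rx_pending > 0) {
		d->rx_pending--;
		done++;
	}
	return done;
}

/* Poll callback: a budget of 0 means "do no rx work at all". */
static int poll_sketch(struct fake_dev *d, int budget)
{
	if (!budget)
		return 0;

	return rx_bottom_sketch(d, budget);
}
```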
    
    Fixes: d2187f8e4454 ("r8152: divide the tx and rx bottom functions")
    Signed-off-by: Hayes Wang <hayeswang@realtek.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

s390/zcrypt: don't leak memory if dev_set_name() fails [+ + +]

Author: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Date:   Thu Aug 31 13:59:59 2023 +0300

    s390/zcrypt: don't leak memory if dev_set_name() fails
    
    [ Upstream commit 6252f47b78031979ad919f971dc8468b893488bd ]
    
    When dev_set_name() fails, zcdn_create() doesn't free the newly
    allocated resources. Do it.
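
    A minimal userspace sketch of freeing on the naming-failure path is shown
    below. struct node, node_set_name(), node_create() and the live_objects
    counter are hypothetical; node_set_name() merely stands in for a naming
    step that can fail.

```c
#include <errno.h>
#include <stdlib.h>

static int live_objects;	/* bookkeeping so the sketch can be checked */

struct node {
	char name[16];
};

/* Hypothetical naming step that can fail. */
static int node_set_name(struct node *n, const char *name)
{
	size_t i;

	for (i = 0; name[i]; i++) {
		if (i + 1 >= sizeof(n->name))
			return -ENAMETOOLONG;
		n->name[i] = name[i];
	}
	n->name[i] = '\0';
	return 0;
}

/* Create a named node; on any failure nothing stays allocated. */
static int node_create(const char *name, struct node **out)
{
	struct node *n = calloc(1, sizeof(*n));
	int rc;

	if (!n)
		return -ENOMEM;
	live_objects++;

	rc = node_set_name(n, name);
	if (rc) {
		free(n);	/* the step that is easy to forget */
		live_objects--;
		return rc;
	}

	*out = n;
	return 0;
}
```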
    
    Fixes: 00fab2350e6b ("s390/zcrypt: multiple zcrypt device nodes support")
    Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
    Link: https://lore.kernel.org/r/20230831110000.24279-1-andriy.shevchenko@linux.intel.com
    Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
    Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

scsi: qla2xxx: Adjust IOCB resource on qpair create [+ + +]

Author: Quinn Tran <qutran@marvell.com>
Date:   Fri Jul 14 12:30:56 2023 +0530

    scsi: qla2xxx: Adjust IOCB resource on qpair create
    
    commit efa74a62aaa2429c04fe6cb277b3bf6739747d86 upstream.
    
    During NVMe queue creation, a new qpair is created. The FW resource limit
    needs to be re-adjusted to take the new qpair into account. Otherwise, NVMe
    commands cannot go through.  This issue was discovered while
    testing/forcing FW execution to fail at load time.
    
    Add call to readjust IOCB and exchange limit.
    
    In addition, get FW state command and require FW to be running. Otherwise,
    error is generated.
    
    Cc: stable@vger.kernel.org
    Signed-off-by: Quinn Tran <qutran@marvell.com>
    Signed-off-by: Nilesh Javali <njavali@marvell.com>
    Link: https://lore.kernel.org/r/20230714070104.40052-3-njavali@marvell.com
    Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
    Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

scsi: qla2xxx: Error code did not return to upper layer [+ + +]

Author: Quinn Tran <qutran@marvell.com>
Date:   Mon Aug 21 18:30:41 2023 +0530

    scsi: qla2xxx: Error code did not return to upper layer
    
    commit 0ba0b018f94525a6b32f5930f980ce9b62b72e6f upstream.
    
    TMF was returned with an error code. The error code was not preserved to be
    returned to upper layer. Instead, the error code from the Marker was
    returned.
    
    Preserve error code from TMF and return it to upper layer.
    
    Cc: stable@vger.kernel.org
    Fixes: da7c21b72aa8 ("scsi: qla2xxx: Fix command flush during TMF")
    Signed-off-by: Quinn Tran <qutran@marvell.com>
    Signed-off-by: Nilesh Javali <njavali@marvell.com>
    Link: https://lore.kernel.org/r/20230821130045.34850-6-njavali@marvell.com
    Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
    Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

scsi: qla2xxx: Fix command flush during TMF [+ + +]

Author: Quinn Tran <qutran@marvell.com>
Date:   Fri Jul 14 12:30:58 2023 +0530

    scsi: qla2xxx: Fix command flush during TMF
    
    commit da7c21b72aa86e990af5f73bce6590b8d8d148d0 upstream.
    
    For each TMF request, driver iterates through each qpair and flushes
    commands associated to the TMF. At the end of the qpair flush, a Marker is
    used to complete the flush transaction. This process was repeated for each
    qpair. The multiple flush and marker for this TMF request seems to cause
    confusion for FW.
    
    Instead, 1 flush is sent to FW. Driver would wait for FW to go through all
    the I/Os on each qpair to be read then return. Driver then closes out the
    transaction with a Marker.
    
    Cc: stable@vger.kernel.org
    Fixes: d90171dd0da5 ("scsi: qla2xxx: Multi-que support for TMF")
    Signed-off-by: Quinn Tran <qutran@marvell.com>
    Signed-off-by: Nilesh Javali <njavali@marvell.com>
    Link: https://lore.kernel.org/r/20230714070104.40052-5-njavali@marvell.com
    Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
    Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

scsi: qla2xxx: Fix deletion race condition [+ + +]

Author: Quinn Tran <qutran@marvell.com>
Date:   Fri Jul 14 12:30:55 2023 +0530

    scsi: qla2xxx: Fix deletion race condition
    
    commit 6dfe4344c168c6ca20fe7640649aacfcefcccb26 upstream.
    
    The system crashes when using a debug kernel due to linked list corruption.
    The corruption occurs because session deletion was allowed to be queued
    twice.  Here is the internal trace showing that the same port was allowed
    to be queued twice for deletion on different CPUs.
    
    20808683956 015 qla2xxx [0000:13:00.1]-e801:4: Scheduling sess ffff93ebf9306800 for deletion 50:06:0e:80:12:48:ff:50 fc4_type 1
    20808683957 027 qla2xxx [0000:13:00.1]-e801:4: Scheduling sess ffff93ebf9306800 for deletion 50:06:0e:80:12:48:ff:50 fc4_type 1
    
    Move the clearing/setting of the deleted flag under the lock.
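
    The general idea of testing and setting a "deleted" flag inside one
    critical section, so that only one caller can queue the deletion, can be
    sketched with a pthread mutex. struct sess_sketch and its functions are
    hypothetical and do not mirror the driver's data structures.

```c
#include <pthread.h>
#include <stdbool.h>

struct sess_sketch {		/* hypothetical session object */
	pthread_mutex_t lock;
	bool deleted;		/* set once deletion has been scheduled */
	int queued;		/* how many times deletion was queued */
};

static void sess_sketch_init(struct sess_sketch *s)
{
	pthread_mutex_init(&s->lock, NULL);
	s->deleted = false;
	s->queued = 0;
}

/* Returns true only for the single caller that wins the right to queue. */
static bool schedule_for_deletion(struct sess_sketch *s)
{
	bool first;

	pthread_mutex_lock(&s->lock);
	first = !s->deleted;	/* check and set in the same critical section */
	if (first) {
		s->deleted = true;
		s->queued++;
	}
	pthread_mutex_unlock(&s->lock);

	return first;
}
```

    If the flag were checked before taking the lock, two CPUs could both see
    it clear and both queue the same session.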
    
    Cc: stable@vger.kernel.org
    Fixes: 726b85487067 ("qla2xxx: Add framework for async fabric discovery")
    Signed-off-by: Quinn Tran <qutran@marvell.com>
    Signed-off-by: Nilesh Javali <njavali@marvell.com>
    Link: https://lore.kernel.org/r/20230714070104.40052-2-njavali@marvell.com
    Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
    Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

scsi: qla2xxx: Fix erroneous link up failure [+ + +]

Author: Quinn Tran <qutran@marvell.com>
Date:   Fri Jul 14 12:30:59 2023 +0530

    scsi: qla2xxx: Fix erroneous link up failure
    
    commit 5b51f35d127e7bef55fa869d2465e2bca4636454 upstream.
    
    Link up failure occurred where driver failed to see certain events from FW
    indicating link up (AEN 8011) and fabric login completion (AEN 8014).
    Without these 2 events, driver would not proceed forward to scan the
    fabric. This is caused by a delay in receiving the interrupt for
    Mailbox 60, which makes qla set the fw_started flag late.  The late
    setting of this flag causes other interrupts to be dropped.  These dropped
    interrupts happen to be the link up (AEN 8011) and fabric login completion
    (AEN 8014).
    
    Set fw_started flag early to prevent interrupts being dropped.
    
    Cc: stable@vger.kernel.org
    Signed-off-by: Quinn Tran <qutran@marvell.com>
    Signed-off-by: Nilesh Javali <njavali@marvell.com>
    Link: https://lore.kernel.org/r/20230714070104.40052-6-njavali@marvell.com
    Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
    Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

scsi: qla2xxx: Fix firmware resource tracking [+ + +]

Author: Quinn Tran <qutran@marvell.com>
Date:   Mon Aug 21 18:30:39 2023 +0530

    scsi: qla2xxx: Fix firmware resource tracking
    
    commit e370b64c7db96384a0886a09a9d80406e4c663d7 upstream.
    
    The storage was not draining I/Os and the work load was not spread out
    across different CPUs evenly. This led to firmware resource counters
    getting overrun on the busy CPU. This overrun prevented error recovery from
    happening in a timely manner.
    
    Switching the counter to an atomic makes the count a little more
    accurate, which prevents the overrun.
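
    A small C11 sketch of an atomically maintained resource counter with a
    limit check follows. struct fw_res_sketch, its fields and both functions
    are hypothetical names chosen for illustration; the driver's actual
    accounting may differ.

```c
#include <stdatomic.h>
#include <stdbool.h>

struct fw_res_sketch {		/* hypothetical firmware resource pool */
	atomic_int used;	/* resources currently reserved */
	int limit;		/* maximum the firmware can handle */
};

/* Try to reserve n resources; returns false if that would exceed the limit. */
static bool fw_res_get(struct fw_res_sketch *r, int n)
{
	/* fetch_add returns the old value, so add n to get the new total. */
	int total = atomic_fetch_add(&r->used, n) + n;

	if (total > r->limit) {
		atomic_fetch_sub(&r->used, n);	/* roll back the reservation */
		return false;
	}
	return true;
}

/* Give n resources back to the pool. */
static void fw_res_put(struct fw_res_sketch *r, int n)
{
	atomic_fetch_sub(&r->used, n);
}
```

    Each update is a single atomic read-modify-write, so concurrent callers
    on different CPUs cannot lose increments the way plain integer updates can.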
    
    Cc: stable@vger.kernel.org
    Fixes: da7c21b72aa8 ("scsi: qla2xxx: Fix command flush during TMF")
    Signed-off-by: Quinn Tran <qutran@marvell.com>
    Signed-off-by: Nilesh Javali <njavali@marvell.com>
    Link: https://lore.kernel.org/r/20230821130045.34850-4-njavali@marvell.com
    Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
    Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

scsi: qla2xxx: fix inconsistent TMF timeout [+ + +]

Author: Quinn Tran <qutran@marvell.com>
Date:   Fri Jul 14 12:31:03 2023 +0530

    scsi: qla2xxx: fix inconsistent TMF timeout
    
    commit 009e7fe4a1ed52276b332842a6b6e23b07200f2d upstream.
    
    Inconsistent behavior was observed when a TMF timed out: the session was
    torn down in some cases but not in others. When FW detects the timeout,
    the session is torn down. When the driver detects the timeout, the session
    is not torn down.
    
    Allow TMF error to return to upper layer without session tear down.
    
    Cc: stable@vger.kernel.org
    Signed-off-by: Quinn Tran <qutran@marvell.com>
    Signed-off-by: Nilesh Javali <njavali@marvell.com>
    Link: https://lore.kernel.org/r/20230714070104.40052-10-njavali@marvell.com
    Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
    Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

scsi: qla2xxx: Fix session hang in gnl [+ + +]

Author: Quinn Tran <qutran@marvell.com>
Date:   Fri Jul 14 12:31:00 2023 +0530

    scsi: qla2xxx: Fix session hang in gnl
    
    commit 39d22740712c7563a2e18c08f033deeacdaf66e7 upstream.
    
    The connection does not resume after a host reset / chip reset. The
    blockage is caused by the FCF_ASYNC_ACTIVE flag being left on. The gnl
    command was interrupted by the chip reset. On exiting the command, this
    flag should be turned off to allow relogin to reoccur. Clear this flag to
    prevent blockage.
    
    Cc: stable@vger.kernel.org
    Fixes: 17e64648aa47 ("scsi: qla2xxx: Correct fcport flags handling")
    Signed-off-by: Quinn Tran <qutran@marvell.com>
    Signed-off-by: Nilesh Javali <njavali@marvell.com>
    Link: https://lore.kernel.org/r/20230714070104.40052-7-njavali@marvell.com
    Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
    Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

scsi: qla2xxx: Fix smatch warn for qla_init_iocb_limit() [+ + +]

Author: Nilesh Javali <njavali@marvell.com>
Date:   Mon Aug 21 18:30:43 2023 +0530

    scsi: qla2xxx: Fix smatch warn for qla_init_iocb_limit()
    
    commit b496953dd0444001b12f425ea07d78c1f47e3193 upstream.
    
    Fix indentation for warning reported by smatch:
    
    drivers/scsi/qla2xxx/qla_init.c:4199 qla_init_iocb_limit() warn: inconsistent indenting
    
    Fixes: efa74a62aaa2 ("scsi: qla2xxx: Adjust IOCB resource on qpair create")
    Signed-off-by: Nilesh Javali <njavali@marvell.com>
    Link: https://lore.kernel.org/r/20230821130045.34850-8-njavali@marvell.com
    Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
    Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

scsi: qla2xxx: Fix TMF leak through [+ + +]

Author: Quinn Tran <qutran@marvell.com>
Date:   Fri Jul 14 12:31:02 2023 +0530

    scsi: qla2xxx: Fix TMF leak through
    
    commit 5d3148d8e8b05f084e607ac3bd55a4c317a9f934 upstream.
    
    Task management can retry up to 5 times when FW resources become a
    bottleneck. Between the retries, there is a short sleep.  The current code
    assumes that the chip has not been reset and the session has not changed.
    
    Check for chip reset or session change before sending Task management.
    
    Cc: stable@vger.kernel.org
    Fixes: 9803fb5d2759 ("scsi: qla2xxx: Fix task management cmd failure")
    Signed-off-by: Quinn Tran <qutran@marvell.com>
    Signed-off-by: Nilesh Javali <njavali@marvell.com>
    Link: https://lore.kernel.org/r/20230714070104.40052-9-njavali@marvell.com
    Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
    Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

scsi: qla2xxx: Flush mailbox commands on chip reset [+ + +]

Author: Quinn Tran <qutran@marvell.com>
Date:   Mon Aug 21 18:30:38 2023 +0530

    scsi: qla2xxx: Flush mailbox commands on chip reset
    
    commit 6d0b65569c0a10b27c49bacd8d25bcd406003533 upstream.
    
    Fix a race condition between the interrupt thread and the chip reset thread
    when both try to flush the same mailbox. With the race condition,
    "ha->mbx_intr_comp" gets an extra complete() call. The extra complete()
    call creates an erroneous mailbox timeout condition when the next mailbox
    command is sent: the mailbox call does not wait for the interrupt to
    arrive and instead advances without waiting.
    
    Add lock protection around the check for mailbox completion.
    
    Cc: stable@vger.kernel.org
    Fixes: b2000805a975 ("scsi: qla2xxx: Flush mailbox commands on chip reset")
    Signed-off-by: Quinn Tran <quinn.tran@marvell.com>
    Signed-off-by: Nilesh Javali <njavali@marvell.com>
    Link: https://lore.kernel.org/r/20230821130045.34850-3-njavali@marvell.com
    Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
    Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
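
Illustration for the entry above: a minimal sketch of why the check for a pending mailbox completion needs to sit under a lock. This is a toy Python model, not the driver's code; the class and field names (`Mailbox`, `waiting`, `completions`) are hypothetical. Two callers stand in for the interrupt thread and the chip reset thread, and the lock ensures that only one of them delivers the completion.

```python
import threading


class Mailbox:
    """Toy model of a mailbox with one pending command (names are hypothetical)."""

    def __init__(self) -> None:
        self.lock = threading.Lock()
        self.waiting = True      # a command is waiting for its completion
        self.completions = 0     # how many times the completion was signalled

    def flush(self) -> None:
        # Check and clear the flag under the lock so that only one
        # caller can observe "waiting" and signal the completion.
        with self.lock:
            if self.waiting:
                self.waiting = False
                self.completions += 1


mbx = Mailbox()
threads = [threading.Thread(target=mbx.flush) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(mbx.completions)  # 1
```

Without the lock, both callers could pass the check before either clears the flag, which corresponds to the extra complete() call described in the commit message.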

scsi: qla2xxx: Limit TMF to 8 per function [+ + +]

Author: Quinn Tran <qutran@marvell.com>
Date:   Fri Jul 14 12:30:57 2023 +0530

    scsi: qla2xxx: Limit TMF to 8 per function
    
    commit a8ec192427e0516436e61f9ca9eb49c54eadfe0a upstream.
    
    Per FW recommendation, 8 TMF's can be outstanding for each
    function. Previously, it allowed 8 per target.
    
    Limit TMF to 8 per function.
    
    Cc: stable@vger.kernel.org
    Fixes: 6a87679626b5 ("scsi: qla2xxx: Fix task management cmd fail due to unavailable resource")
    Signed-off-by: Quinn Tran <qutran@marvell.com>
    Signed-off-by: Nilesh Javali <njavali@marvell.com>
    Link: https://lore.kernel.org/r/20230714070104.40052-4-njavali@marvell.com
    Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
    Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

scsi: qla2xxx: Remove unsupported ql2xenabledif option [+ + +]

Author: Manish Rangankar <mrangankar@marvell.com>
Date:   Mon Aug 21 18:30:42 2023 +0530

    scsi: qla2xxx: Remove unsupported ql2xenabledif option
    
    commit e9105c4b7a9208a21a9bda133707624f12ddabc2 upstream.
    
    A user accidentally passed the module parameter ql2xenabledif=1, which is
    unsupported. However, the driver still initialized, which led to guard tag
    errors during device discovery.
    
    Remove unsupported ql2xenabledif=1 option and validate the user input.
    
    Cc: stable@vger.kernel.org
    Signed-off-by: Manish Rangankar <mrangankar@marvell.com>
    Signed-off-by: Nilesh Javali <njavali@marvell.com>
    Link: https://lore.kernel.org/r/20230821130045.34850-7-njavali@marvell.com
    Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
    Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

scsi: qla2xxx: Turn off noisy message log [+ + +]

Author: Quinn Tran <qutran@marvell.com>
Date:   Fri Jul 14 12:31:01 2023 +0530

    scsi: qla2xxx: Turn off noisy message log
    
    commit 8ebaa45163a3fedc885c1dc7d43ea987a2f00a06 upstream.
    
    Some consider noisy log as test failure.  Turn off noisy message log.
    
    Cc: stable@vger.kernel.org
    Signed-off-by: Quinn Tran <qutran@marvell.com>
    Signed-off-by: Nilesh Javali <njavali@marvell.com>
    Link: https://lore.kernel.org/r/20230714070104.40052-8-njavali@marvell.com
    Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
    Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

sctp: annotate data-races around sk->sk_wmem_queued [+ + +]

Author: Eric Dumazet <edumazet@google.com>
Date:   Wed Aug 30 09:45:19 2023 +0000

    sctp: annotate data-races around sk->sk_wmem_queued
    
    [ Upstream commit dc9511dd6f37fe803f6b15b61b030728d7057417 ]
    
    sk->sk_wmem_queued can be read locklessly from sctp_poll()
    
    Use sk_wmem_queued_add() when the field is changed,
    and add READ_ONCE() annotations in sctp_writeable()
    and sctp_assocs_seq_show()
    
    syzbot reported:
    
    BUG: KCSAN: data-race in sctp_poll / sctp_wfree
    
    read-write to 0xffff888149d77810 of 4 bytes by interrupt on cpu 0:
    sctp_wfree+0x170/0x4a0 net/sctp/socket.c:9147
    skb_release_head_state+0xb7/0x1a0 net/core/skbuff.c:988
    skb_release_all net/core/skbuff.c:1000 [inline]
    __kfree_skb+0x16/0x140 net/core/skbuff.c:1016
    consume_skb+0x57/0x180 net/core/skbuff.c:1232
    sctp_chunk_destroy net/sctp/sm_make_chunk.c:1503 [inline]
    sctp_chunk_put+0xcd/0x130 net/sctp/sm_make_chunk.c:1530
    sctp_datamsg_put+0x29a/0x300 net/sctp/chunk.c:128
    sctp_chunk_free+0x34/0x50 net/sctp/sm_make_chunk.c:1515
    sctp_outq_sack+0xafa/0xd70 net/sctp/outqueue.c:1381
    sctp_cmd_process_sack net/sctp/sm_sideeffect.c:834 [inline]
    sctp_cmd_interpreter net/sctp/sm_sideeffect.c:1366 [inline]
    sctp_side_effects net/sctp/sm_sideeffect.c:1198 [inline]
    sctp_do_sm+0x12c7/0x31b0 net/sctp/sm_sideeffect.c:1169
    sctp_assoc_bh_rcv+0x2b2/0x430 net/sctp/associola.c:1051
    sctp_inq_push+0x108/0x120 net/sctp/inqueue.c:80
    sctp_rcv+0x116e/0x1340 net/sctp/input.c:243
    sctp6_rcv+0x25/0x40 net/sctp/ipv6.c:1120
    ip6_protocol_deliver_rcu+0x92f/0xf30 net/ipv6/ip6_input.c:437
    ip6_input_finish net/ipv6/ip6_input.c:482 [inline]
    NF_HOOK include/linux/netfilter.h:303 [inline]
    ip6_input+0xbd/0x1b0 net/ipv6/ip6_input.c:491
    dst_input include/net/dst.h:468 [inline]
    ip6_rcv_finish+0x1e2/0x2e0 net/ipv6/ip6_input.c:79
    NF_HOOK include/linux/netfilter.h:303 [inline]
    ipv6_rcv+0x74/0x150 net/ipv6/ip6_input.c:309
    __netif_receive_skb_one_core net/core/dev.c:5452 [inline]
    __netif_receive_skb+0x90/0x1b0 net/core/dev.c:5566
    process_backlog+0x21f/0x380 net/core/dev.c:5894
    __napi_poll+0x60/0x3b0 net/core/dev.c:6460
    napi_poll net/core/dev.c:6527 [inline]
    net_rx_action+0x32b/0x750 net/core/dev.c:6660
    __do_softirq+0xc1/0x265 kernel/softirq.c:553
    run_ksoftirqd+0x17/0x20 kernel/softirq.c:921
    smpboot_thread_fn+0x30a/0x4a0 kernel/smpboot.c:164
    kthread+0x1d7/0x210 kernel/kthread.c:389
    ret_from_fork+0x2e/0x40 arch/x86/kernel/process.c:145
    ret_from_fork_asm+0x11/0x20 arch/x86/entry/entry_64.S:304
    
    read to 0xffff888149d77810 of 4 bytes by task 17828 on cpu 1:
    sctp_writeable net/sctp/socket.c:9304 [inline]
    sctp_poll+0x265/0x410 net/sctp/socket.c:8671
    sock_poll+0x253/0x270 net/socket.c:1374
    vfs_poll include/linux/poll.h:88 [inline]
    do_pollfd fs/select.c:873 [inline]
    do_poll fs/select.c:921 [inline]
    do_sys_poll+0x636/0xc00 fs/select.c:1015
    __do_sys_ppoll fs/select.c:1121 [inline]
    __se_sys_ppoll+0x1af/0x1f0 fs/select.c:1101
    __x64_sys_ppoll+0x67/0x80 fs/select.c:1101
    do_syscall_x64 arch/x86/entry/common.c:50 [inline]
    do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
    entry_SYSCALL_64_after_hwframe+0x63/0xcd
    
    value changed: 0x00019e80 -> 0x0000cc80
    
    Reported by Kernel Concurrency Sanitizer on:
    CPU: 1 PID: 17828 Comm: syz-executor.1 Not tainted 6.5.0-rc7-syzkaller-00185-g28f20a19294d #0
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 07/26/2023
    
    Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
    Reported-by: syzbot <syzkaller@googlegroups.com>
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
    Acked-by: Xin Long <lucien.xin@gmail.com>
    Link: https://lore.kernel.org/r/20230830094519.950007-1-edumazet@google.com
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

selftests: Keep symlinks, when possible [+ + +]

Author: Björn Töpel <bjorn@rivosinc.com>
Date:   Tue Aug 22 15:58:37 2023 +0200

    selftests: Keep symlinks, when possible
    
    [ Upstream commit 3f3f384139ed147c71e1d770accf610133d5309b ]
    
    When kselftest is built/installed with the 'gen_tar' target, rsync is
    used for the installation step to copy files. Extra care is needed for
    tests that have symlinks. Commit ae108c48b5d2 ("selftests: net: Fix
    cross-tree inclusion of scripts") added '-L' (transform symlink into
    referent file/dir) to rsync, to fix dangling links. However, that
    broke some tests where the symlink (being a symlink) is part of the
    test (e.g. exec:execveat).
    
    Use rsync's '--copy-unsafe-links', which does the right thing.
    
    Fixes: ae108c48b5d2 ("selftests: net: Fix cross-tree inclusion of scripts")
    Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
    Reviewed-by: Benjamin Poirier <bpoirier@nvidia.com>
    Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
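
Illustration for the entry above: a rough sketch of the distinction that '--copy-unsafe-links' draws, assuming a link counts as "unsafe" when its target is absolute or resolves outside the copied tree. This approximates rsync's rule and is not rsync's implementation; the function name and the example paths are hypothetical.

```python
import posixpath


def is_unsafe_link(link_path: str, target: str) -> bool:
    """Approximate rsync's notion of an unsafe symlink.

    link_path is the link's location relative to the root of the copied
    tree; target is the text stored in the symlink.
    """
    if posixpath.isabs(target):
        return True
    # Resolve the target relative to the directory that holds the link.
    resolved = posixpath.normpath(
        posixpath.join(posixpath.dirname(link_path), target))
    # A leading ".." means the target escapes the copied tree.
    return resolved == ".." or resolved.startswith("../")


# A link to a sibling file stays a symlink (safe).
print(is_unsafe_link("exec/execveat.symlink", "execveat"))  # False
# A link that climbs out of the tree is replaced by its referent (unsafe).
print(is_unsafe_link("net/lib.sh", "../../lib.sh"))         # True
```

Under this rule, symlinks that are themselves part of a test survive the copy, while links that would dangle after installation are replaced by the files they point to.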

Linux: send channel sequence number in SMB3 requests after reconnects [+ + +]

Author: Steve French <stfrench@microsoft.com>
Date:   Thu Aug 24 23:29:18 2023 -0500

    send channel sequence number in SMB3 requests after reconnects
    
    commit 09ee7a3bf866c0fa5ee1914d2c65958559eb5b4c upstream.
    
    The ChannelSequence field in the SMB3 header is supposed to be
    increased after reconnect to allow the server to distinguish
    requests from before and after the reconnect.  We had always
    been setting it to zero.  There are cases where incrementing
    ChannelSequence on requests after network reconnects can reduce
    the chance of data corruptions.
    
    See MS-SMB2 3.2.4.1 and 3.2.7.1
    
    Signed-off-by: Steve French <stfrench@microsoft.com>
    Cc: stable@vger.kernel.org # 5.16+
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
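
Illustration for the entry above: a minimal sketch of a per-channel counter that advances on every reconnect and is stamped into each outgoing request. It is a toy model rather than the client's code; the `Channel` class and its methods are hypothetical, and the 16-bit wraparound assumes the field width given for ChannelSequence in MS-SMB2.

```python
class Channel:
    """Toy model of a client channel (not the kernel's data structure)."""

    def __init__(self) -> None:
        self.channel_sequence = 0  # assumed to be a 16-bit header field

    def on_reconnect(self) -> None:
        # Advance the sequence so that the server can tell requests sent
        # before the reconnect from those sent after it; wrap at 16 bits.
        self.channel_sequence = (self.channel_sequence + 1) & 0xFFFF

    def build_header(self) -> dict:
        # Every request carries the current sequence value.
        return {"ChannelSequence": self.channel_sequence}


chan = Channel()
before = chan.build_header()
chan.on_reconnect()
after = chan.build_header()
print(before["ChannelSequence"], after["ChannelSequence"])  # 0 1
```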

sh: boards: Fix CEU buffer size passed to dma_declare_coherent_memory() [+ + +]

Author: Petr Tesarik <petr.tesarik.ext@huawei.com>
Date:   Mon Jul 24 14:07:42 2023 +0200

    sh: boards: Fix CEU buffer size passed to dma_declare_coherent_memory()
    
    [ Upstream commit fb60211f377b69acffead3147578f86d0092a7a5 ]
    
    In all these cases, the last argument to dma_declare_coherent_memory() is
    the buffer end address, but the expected value should be the size of the
    reserved region.
    
    Fixes: 39fb993038e1 ("media: arch: sh: ap325rxa: Use new renesas-ceu camera driver")
    Fixes: c2f9b05fd5c1 ("media: arch: sh: ecovec: Use new renesas-ceu camera driver")
    Fixes: f3590dc32974 ("media: arch: sh: kfr2r09: Use new renesas-ceu camera driver")
    Fixes: 186c446f4b84 ("media: arch: sh: migor: Use new renesas-ceu camera driver")
    Fixes: 1a3c230b4151 ("media: arch: sh: ms7724se: Use new renesas-ceu camera driver")
    Signed-off-by: Petr Tesarik <petr.tesarik.ext@huawei.com>
    Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
    Reviewed-by: Jacopo Mondi <jacopo.mondi@ideasonboard.com>
    Reviewed-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
    Reviewed-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
    Link: https://lore.kernel.org/r/20230724120742.2187-1-petrtesarik@huaweicloud.com
    Signed-off-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
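
Illustration for the entry above: the difference between an end address and a size can be checked with simple arithmetic. The addresses below are hypothetical and the end address is assumed to be inclusive; they are not taken from any of the affected board files.

```python
# Hypothetical reserved region; these addresses are illustrative only.
region_start = 0x0C00_0000
region_end = 0x0C3F_FFFF  # inclusive end address (assumption)

# The size of the region is what a "size" argument should carry.
region_size = region_end - region_start + 1

print(hex(region_size))  # 0x400000
```

Passing `region_end` where `region_size` is expected would declare a region far larger than the 4 MiB that was actually reserved.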

smb: propagate error code of extract_sharename() [+ + +]

Author: Katya Orlova <e.orlova@ispras.ru>
Date:   Tue Aug 15 16:38:31 2023 +0300

    smb: propagate error code of extract_sharename()
    
    [ Upstream commit efc0b0bcffcba60d9c6301063d25a22a4744b499 ]
    
    In addition to the EINVAL, there may be an ENOMEM.
    
    Found by Linux Verification Center (linuxtesting.org) with SVACE.
    
    Fixes: 70431bfd825d ("cifs: Support fscache indexing rewrite")
    Signed-off-by: Katya Orlova <e.orlova@ispras.ru>
    Signed-off-by: Steve French <stfrench@microsoft.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

soc: qcom: qmi_encdec: Restrict string length in decode [+ + +]

Author: Chris Lew <quic_clew@quicinc.com>
Date:   Tue Aug 1 12:17:12 2023 +0530

    soc: qcom: qmi_encdec: Restrict string length in decode
    
    commit 8d207400fd6b79c92aeb2f33bb79f62dff904ea2 upstream.
    
    The QMI TLV value for strings in a lot of qmi element info structures
    account for null terminated strings with MAX_LEN + 1. If a string is
    actually MAX_LEN + 1 length, this will cause an out of bounds access
    when the NULL character is appended in decoding.
    
    Fixes: 9b8a11e82615 ("soc: qcom: Introduce QMI encoder/decoder")
    Cc: stable@vger.kernel.org
    Signed-off-by: Chris Lew <quic_clew@quicinc.com>
    Signed-off-by: Praveenkumar I <quic_ipkumar@quicinc.com>
    Link: https://lore.kernel.org/r/20230801064712.3590128-1-quic_ipkumar@quicinc.com
    Signed-off-by: Bjorn Andersson <andersson@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
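
Illustration for the entry above: a minimal sketch of the bound that keeps the terminator inside the buffer. It models a destination buffer of MAX_LEN + 1 bytes in Python; `MAX_LEN`, `decode_string` and the buffer layout are hypothetical and only mirror the situation described in the commit message.

```python
MAX_LEN = 8  # hypothetical maximum string length, for illustration


def decode_string(payload: bytes, buf_len: int = MAX_LEN + 1) -> bytes:
    """Copy payload into a buf_len-byte buffer and append a terminator.

    Payloads that would leave no room for the trailing NUL are rejected.
    """
    if len(payload) > buf_len - 1:
        raise ValueError("string too long for buffer plus terminator")
    buf = bytearray(buf_len)
    buf[:len(payload)] = payload
    buf[len(payload)] = 0  # the terminator always lands inside the buffer
    return bytes(buf[:len(payload)])


print(decode_string(b"a" * MAX_LEN))  # b'aaaaaaaa'
```

A payload of MAX_LEN + 1 bytes fills the whole buffer, so the terminator would be written one byte past its end; the length check rejects that case before any copy takes place.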

tcp: Factorise sk_family-independent comparison in inet_bind2_bucket_match(_addr_any). [+ + +]

Author: Kuniyuki Iwashima <kuniyu@amazon.com>
Date:   Mon Sep 11 11:36:55 2023 -0700

    tcp: Factorise sk_family-independent comparison in inet_bind2_bucket_match(_addr_any).
    
    [ Upstream commit c6d277064b1da7f9015b575a562734de87a7e463 ]
    
    This is a prep patch to make the following patches cleaner that touch
    inet_bind2_bucket_match() and inet_bind2_bucket_match_addr_any().
    
    Both functions have duplicated comparison for netns, port, and l3mdev.
    Let's factorise them.
    
    Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
    Reviewed-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Stable-dep-of: aa99e5f87bd5 ("tcp: Fix bind() regression for v4-mapped-v6 wildcard address.")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

tcp: Fix bind() regression for v4-mapped-v6 non-wildcard address. [+ + +]

Author: Kuniyuki Iwashima <kuniyu@amazon.com>
Date:   Mon Sep 11 11:36:57 2023 -0700

    tcp: Fix bind() regression for v4-mapped-v6 non-wildcard address.
    
    [ Upstream commit c48ef9c4aed3632566b57ba66cec6ec78624d4cb ]
    
    Since bhash2 was introduced, the example below does not work as expected.
    These two bind() should conflict, but the 2nd bind() now succeeds.
    
      from socket import *
    
      s1 = socket(AF_INET6, SOCK_STREAM)
      s1.bind(('::ffff:127.0.0.1', 0))
    
      s2 = socket(AF_INET, SOCK_STREAM)
      s2.bind(('127.0.0.1', s1.getsockname()[1]))
    
    During the 2nd bind() in inet_csk_get_port(), inet_bind2_bucket_find()
    fails to find the 1st socket's tb2, so inet_bind2_bucket_create() allocates
    a new tb2 for the 2nd socket.  Then, we call inet_csk_bind_conflict() that
    checks conflicts in the new tb2 by inet_bhash2_conflict().  However, the
    new tb2 does not include the 1st socket, thus the bind() finally succeeds.
    
    In this case, inet_bind2_bucket_match() must check if AF_INET6 tb2 has
    the conflicting v4-mapped-v6 address so that inet_bind2_bucket_find()
    returns the 1st socket's tb2.
    
    Note that if we bind two sockets to 127.0.0.1 and then ::FFFF:127.0.0.1,
    the 2nd bind() fails properly for the same reason mentioned in the previous
    commit.
    
    Fixes: 28044fc1d495 ("net: Add a bhash2 table hashed by port and address")
    Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
    Reviewed-by: Eric Dumazet <edumazet@google.com>
    Acked-by: Andrei Vagin <avagin@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
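
Illustration for the entry above: a minimal sketch of the address comparison that has to treat a v4-mapped-v6 address and its IPv4 form as the same address. It models only the address part of the match (not netns, port, or l3mdev) using Python's `ipaddress` module; the helper names are hypothetical.

```python
from ipaddress import IPv6Address, ip_address


def as_v4(addr: str):
    """Return the IPv4 form of addr, unwrapping a v4-mapped-v6 address."""
    ip = ip_address(addr)
    if isinstance(ip, IPv6Address):
        return ip.ipv4_mapped  # None when addr is not v4-mapped
    return ip


def addresses_conflict(a: str, b: str) -> bool:
    """True when both addresses name the same IPv4 address."""
    va, vb = as_v4(a), as_v4(b)
    return va is not None and va == vb


print(addresses_conflict("::ffff:127.0.0.1", "127.0.0.1"))  # True
print(addresses_conflict("::1", "127.0.0.1"))               # False
```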

tcp: Fix bind() regression for v4-mapped-v6 wildcard address. [+ + +]

Author: Kuniyuki Iwashima <kuniyu@amazon.com>
Date:   Mon Sep 11 11:36:56 2023 -0700

    tcp: Fix bind() regression for v4-mapped-v6 wildcard address.
    
    [ Upstream commit aa99e5f87bd54db55dd37cb130bd5eb55933027f ]
    
    Andrei Vagin reported bind() regression with strace logs.
    
    If we bind() a TCPv6 socket to ::FFFF:0.0.0.0 and then bind() a TCPv4
    socket to 127.0.0.1, the 2nd bind() should fail but now succeeds.
    
      from socket import *
    
      s1 = socket(AF_INET6, SOCK_STREAM)
      s1.bind(('::ffff:0.0.0.0', 0))
    
      s2 = socket(AF_INET, SOCK_STREAM)
      s2.bind(('127.0.0.1', s1.getsockname()[1]))
    
    During the 2nd bind(), if tb->family is AF_INET6 and sk->sk_family is
    AF_INET in inet_bind2_bucket_match_addr_any(), we still need to check
    if tb has the v4-mapped-v6 wildcard address.
    
    The example above does not work after commit 5456262d2baa ("net: Fix
    incorrect address comparison when searching for a bind2 bucket"), but
    the blamed change is not the commit.
    
    Before the commit, the leading zeros of ::FFFF:0.0.0.0 were treated
    as 0.0.0.0, and the sequence above worked by chance.  Technically, this
    case has been broken since bhash2 was introduced.
    
    Note that if we bind() two sockets to 127.0.0.1 and then ::FFFF:0.0.0.0,
    the 2nd bind() fails properly because we fall back to using bhash to
    detect conflicts for the v4-mapped-v6 address.
    
    Fixes: 28044fc1d495 ("net: Add a bhash2 table hashed by port and address")
    Reported-by: Andrei Vagin <avagin@google.com>
    Closes: https://lore.kernel.org/netdev/ZPuYBOFC8zsK6r9T@google.com/
    Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
    Reviewed-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
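
Illustration for the entry above: a minimal sketch of recognising the v4-mapped-v6 wildcard address, which has to be treated as overlapping every IPv4 address on the same port. It uses Python's `ipaddress` module and models only the address test; the function name is hypothetical.

```python
from ipaddress import IPv6Address


def is_v4mapped_any(addr: str) -> bool:
    """True when addr is the v4-mapped-v6 wildcard, ::ffff:0.0.0.0."""
    mapped = IPv6Address(addr).ipv4_mapped  # None if addr is not v4-mapped
    return mapped is not None and mapped.is_unspecified


print(is_v4mapped_any("::ffff:0.0.0.0"))    # True
print(is_v4mapped_any("::ffff:127.0.0.1"))  # False
print(is_v4mapped_any("::"))                # False
```

The last case shows that the plain IPv6 wildcard is a different address from the v4-mapped one, so the two need separate checks.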

tpm_crb: Fix an error handling path in crb_acpi_add() [+ + +]

Author: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Date:   Sat Feb 25 11:58:48 2023 +0100

    tpm_crb: Fix an error handling path in crb_acpi_add()
    
    [ Upstream commit 9c377852ddfdc557b1370f196b0cfdf28d233460 ]
    
    Some error paths don't call acpi_put_table() before returning.
    Branch to the correct place instead of doing some direct return.
    
    Fixes: 4d2732882703 ("tpm_crb: Add support for CRB devices based on Pluton")
    Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
    Acked-by: Matthew Garrett <mgarrett@aurora.tech>
    Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
    Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

veth: Fixing transmit return status for dropped packets [+ + +]

Author: Liang Chen <liangchen.linux@gmail.com>
Date:   Fri Sep 1 12:09:21 2023 +0800

    veth: Fixing transmit return status for dropped packets
    
    [ Upstream commit 151e887d8ff97e2e42110ffa1fb1e6a2128fb364 ]
    
    The veth_xmit function returns NETDEV_TX_OK even when packets are dropped.
    This behavior leads to incorrect calculations of statistics counts, as
    well as things like txq->trans_start updates.
    
    Fixes: e314dbdc1c0d ("[NET]: Virtual ethernet device driver.")
    Signed-off-by: Liang Chen <liangchen.linux@gmail.com>
    Reviewed-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

watchdog: intel-mid_wdt: add MODULE_ALIAS() to allow auto-load [+ + +]

Author: Raag Jadav <raag.jadav@intel.com>
Date:   Fri Aug 11 17:32:20 2023 +0530

    watchdog: intel-mid_wdt: add MODULE_ALIAS() to allow auto-load
    
    [ Upstream commit cf38e7691c85f1b09973b22a0b89bf1e1228d2f9 ]
    
    When built with CONFIG_INTEL_MID_WATCHDOG=m, currently the driver
    needs to be loaded manually, for the lack of module alias.
    This causes unintended resets in cases where watchdog timer is
    set-up by bootloader and the driver is not explicitly loaded.
    Add MODULE_ALIAS() to load the driver automatically at boot and
    avoid this issue.
    
    Fixes: 87a1ef8058d9 ("watchdog: add Intel MID watchdog driver support")
    Signed-off-by: Raag Jadav <raag.jadav@intel.com>
    Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
    Reviewed-by: Guenter Roeck <linux@roeck-us.net>
    Link: https://lore.kernel.org/r/20230811120220.31578-1-raag.jadav@intel.com
    Signed-off-by: Guenter Roeck <linux@roeck-us.net>
    Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

x86/virt: Drop unnecessary check on extended CPUID level in cpu_has_svm() [+ + +]

Author: Sean Christopherson <seanjc@google.com>
Date:   Fri Jul 21 13:18:52 2023 -0700

    x86/virt: Drop unnecessary check on extended CPUID level in cpu_has_svm()
    
    [ Upstream commit 5df8ecfe3632d5879d1f154f7aa8de441b5d1c89 ]
    
    Drop the explicit check on the extended CPUID level in cpu_has_svm(), the
    kernel's cached CPUID info will leave the entire SVM leaf unset if said
    leaf is not supported by hardware.  Prior to using cached information,
    the check was needed to avoid false positives due to Intel's rather crazy
    CPUID behavior of returning the values of the maximum supported leaf if
    the specified leaf is unsupported.
    
    Fixes: 682a8108872f ("x86/kvm/svm: Simplify cpu_has_svm()")
    Link: https://lore.kernel.org/r/20230721201859.2307736-13-seanjc@google.com
    Signed-off-by: Sean Christopherson <seanjc@google.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

xsk: Fix xsk_diag use-after-free error during socket cleanup [+ + +]

Author: Magnus Karlsson <magnus.karlsson@intel.com>
Date:   Thu Aug 31 12:01:17 2023 +0200

    xsk: Fix xsk_diag use-after-free error during socket cleanup
    
    [ Upstream commit 3e019d8a05a38abb5c85d4f1e85fda964610aa14 ]
    
    Fix a use-after-free error that is possible if the xsk_diag interface
    is used after the socket has been unbound from the device. This can
    happen either due to the socket being closed or the device
    disappearing. In the early days of AF_XDP, the way we tested that a
    socket was not bound to a device was to simply check if the netdevice
    pointer in the xsk socket structure was NULL. Later, a better system
    was introduced by having an explicit state variable in the xsk socket
    struct. For example, the state of a socket that is on the way to being
    closed and has been unbound from the device is XSK_UNBOUND.
    
    The commit in the Fixes tag below deleted the old way of signalling
    that a socket is unbound, setting dev to NULL. This was done in the
    belief that all code using the old way had been exterminated. That was
    unfortunately not true as the xsk diagnostics code was still using the
    old way and thus does not work as intended when a socket is going
    down. Fix this by introducing a test against the state variable. If
    the socket is in the state XSK_UNBOUND, simply abort the diagnostic's
    netlink operation.
    
    Fixes: 18b1ab7aa76b ("xsk: Fix race at socket teardown")
    Reported-by: syzbot+822d1359297e2694f873@syzkaller.appspotmail.com
    Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Tested-by: syzbot+822d1359297e2694f873@syzkaller.appspotmail.com
    Tested-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
    Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
    Link: https://lore.kernel.org/bpf/20230831100119.17408-1-magnus.karlsson@gmail.com
    Signed-off-by: Sasha Levin <sashal@kernel.org>

Changelog for Linux 6.1.54