Changelog for Linux 6.1.57

 
arm64: Add Cortex-A520 CPU part definition [+ + +]
Author: Rob Herring <robh@kernel.org>
Date:   Thu Sep 21 14:41:51 2023 -0500

    arm64: Add Cortex-A520 CPU part definition
    
    commit a654a69b9f9c06b2e56387d0b99f0e3e6b0ff4ef upstream.
    
    Add the CPU Part number for the new Arm design.
    
    Cc: stable@vger.kernel.org
    Signed-off-by: Rob Herring <robh@kernel.org>
    Link: https://lore.kernel.org/r/20230921194156.1050055-1-robh@kernel.org
    Signed-off-by: Will Deacon <will@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

arm64: Avoid repeated AA64MMFR1_EL1 register read on pagefault path [+ + +]
Author: Gabriel Krisman Bertazi <krisman@suse.de>
Date:   Mon Jan 9 12:19:55 2023 -0300

    arm64: Avoid repeated AA64MMFR1_EL1 register read on pagefault path
    
    [ Upstream commit a89c6bcdac22bec1bfbe6e64060b4cf5838d4f47 ]
    
    Accessing AA64MMFR1_EL1 is expensive in KVM guests, since it is emulated
    in the hypervisor.  In fact, ARM documentation mentions some feature
    registers are not supposed to be accessed frequently by the OS, and
    therefore should be emulated for guests [1].
    
    Commit 0388f9c74330 ("arm64: mm: Implement
    arch_wants_old_prefaulted_pte()") introduced a read of this register in
    the page fault path.  But, even when the feature of setting faultaround
    pages with the old flag is disabled for a given cpu, we are still paying
    the cost of checking the register on every pagefault. This results in an
    explosion of vmexit events in KVM guests, which directly impacts the
    performance of virtualized workloads.  For instance, running kernbench
    yields a 15% increase in system time solely due to the increased vmexit
    cycles.
    
    This patch avoids the extra cost by using the sanitized cached value.
    It should be safe to do so, since this register mustn't change for a
    given cpu.
    
    [1] https://developer.arm.com/-/media/Arm%20Developer%20Community/PDF/Learn%20the%20Architecture/Armv8-A%20virtualization.pdf?revision=a765a7df-1a00-434d-b241-357bfda2dd31
    
    Signed-off-by: Gabriel Krisman Bertazi <krisman@suse.de>
    Acked-by: Will Deacon <will@kernel.org>
    Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
    Link: https://lore.kernel.org/r/20230109151955.8292-1-krisman@suse.de
    Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
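
The idea of consulting a cached copy instead of re-reading an expensive register on a hot path can be sketched in user-space C. This is an illustration only, not the kernel patch: the function names, the counter standing in for vmexits, and the register value are all hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical counter: each increment models one trap to the hypervisor. */
static unsigned long trapped_reads;

/* Hypothetical stand-in for an emulated (and therefore slow) ID register read. */
static uint64_t read_id_reg_slow(void)
{
    trapped_reads++;
    return 0x2; /* arbitrary value; assumed never to change for a given CPU */
}

/* Cached copy of the register, filled once. */
static uint64_t cached_id_reg;
static bool cache_valid;

static void init_id_reg_cache(void)
{
    cached_id_reg = read_id_reg_slow();
    cache_valid = true;
}

/* Hot path: test a 4-bit field in the cached copy, not in the register. */
static bool hw_feature_present(void)
{
    if (!cache_valid)
        init_id_reg_cache();
    return (cached_id_reg & 0xf) != 0;
}
```

However often hw_feature_present() is called, the slow read runs only once, which is the property the commit message relies on.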

arm64: cpufeature: Fix CLRBHB and BC detection [+ + +]
Author: Kristina Martsenko <kristina.martsenko@arm.com>
Date:   Tue Sep 12 14:34:29 2023 +0100

    arm64: cpufeature: Fix CLRBHB and BC detection
    
    commit 479965a2b7ec481737df0cadf553331063b9c343 upstream.
    
    ClearBHB support is indicated by the CLRBHB field in ID_AA64ISAR2_EL1.
    Following some refactoring the kernel incorrectly checks the BC field
    instead. Fix the detection to use the right field.
    
    (Note: The original ClearBHB support had it as FTR_HIGHER_SAFE, but this
    patch uses FTR_LOWER_SAFE, which seems more correct.)
    
    Also fix the detection of BC (hinted conditional branches) to use
    FTR_LOWER_SAFE, so that it is not reported on mismatched systems.
    
    Fixes: 356137e68a9f ("arm64/sysreg: Make BHB clear feature defines match the architecture")
    Fixes: 8fcc8285c0e3 ("arm64/sysreg: Convert ID_AA64ISAR2_EL1 to automatic generation")
    Cc: stable@vger.kernel.org
    Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com>
    Reviewed-by: Mark Brown <broonie@kernel.org>
    Link: https://lore.kernel.org/r/20230912133429.2606875-1-kristina.martsenko@arm.com
    Signed-off-by: Will Deacon <will@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
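
A "lower is safe" policy can be pictured as taking the minimum of an ID field over all CPUs, so a feature is only reported when every CPU has it. The sketch below is a simplified model, not the kernel's cpufeature code; the helper names and the bit offset used in the usage note are illustrative assumptions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: extract a 4-bit ID field starting at bit `shift`. */
static unsigned int id_field(uint64_t reg, unsigned int shift)
{
    return (unsigned int)((reg >> shift) & 0xf);
}

/*
 * "Lower is safe": the system-wide value is the smallest value seen on
 * any CPU. Returns 0 (feature absent) when no CPUs are given.
 */
static unsigned int lower_safe_field(const uint64_t *regs, size_t ncpus,
                                     unsigned int shift)
{
    unsigned int safe = 0xf;

    if (ncpus == 0)
        return 0;
    for (size_t i = 0; i < ncpus; i++) {
        unsigned int v = id_field(regs[i], shift);

        if (v < safe)
            safe = v;
    }
    return safe;
}
```

With two CPUs where only one sets the field at an illustrative offset of 28, lower_safe_field() yields 0, so the feature is not advertised on the mismatched system.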

arm64: errata: Add Cortex-A520 speculative unprivileged load workaround [+ + +]
Author: Rob Herring <robh@kernel.org>
Date:   Thu Sep 21 14:41:52 2023 -0500

    arm64: errata: Add Cortex-A520 speculative unprivileged load workaround
    
    commit 471470bc7052d28ce125901877dd10e4c048e513 upstream.
    
    Implement the workaround for ARM Cortex-A520 erratum 2966298. On an
    affected Cortex-A520 core, a speculatively executed unprivileged load
    might leak data from a privileged load via a cache side channel. The
    issue only exists for loads within a translation regime with the same
    translation (e.g. same ASID and VMID). Therefore, the issue only affects
    the return to EL0.
    
    The workaround is to execute a TLBI before returning to EL0 after all
    loads of privileged data. A non-shareable TLBI to any address is
    sufficient.
    
    The workaround isn't necessary if page table isolation (KPTI) is
    enabled, but for simplicity it is applied in that case too. Page table
    isolation should
    normally be disabled for Cortex-A520 as it supports the CSV3 feature
    and the E0PD feature (used when KASLR is enabled).
    
    Cc: stable@vger.kernel.org
    Signed-off-by: Rob Herring <robh@kernel.org>
    Link: https://lore.kernel.org/r/20230921194156.1050055-2-robh@kernel.org
    Signed-off-by: Will Deacon <will@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
ASoC: soc-utils: Export snd_soc_dai_is_dummy() symbol [+ + +]
Author: Sameer Pujar <spujar@nvidia.com>
Date:   Thu Sep 7 20:32:24 2023 +0530

    ASoC: soc-utils: Export snd_soc_dai_is_dummy() symbol
    
    [ Upstream commit f101583fa9f8c3f372d4feb61d67da0ccbf4d9a5 ]
    
    Export symbol snd_soc_dai_is_dummy() for usage outside core driver
    modules. This is required by Tegra ASoC machine driver.
    
    Signed-off-by: Sameer Pujar <spujar@nvidia.com>
    Link: https://lore.kernel.org/r/1694098945-32760-2-git-send-email-spujar@nvidia.com
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

ASoC: tegra: Fix redundant PLLA and PLLA_OUT0 updates [+ + +]
Author: Sameer Pujar <spujar@nvidia.com>
Date:   Thu Sep 7 20:32:25 2023 +0530

    ASoC: tegra: Fix redundant PLLA and PLLA_OUT0 updates
    
    [ Upstream commit e765886249c533e1bb5cbc3cd741bad677417312 ]
    
    Tegra audio graph card has many DAI links which connect internal
    AHUB modules and external audio codecs. Since these are DPCM links,
    hw_params() call in the machine driver happens for each connected
    BE link and PLLA is updated every time. This is not really needed
    for all links as only I/O link DAIs derive respective clocks from
    PLLA_OUT0 and thus from PLLA. Hence add checks to limit the clock
    updates to DAIs over I/O links.
    
    This was found to fix a DMIC clock discrepancy which is suspected
    to happen because of back-to-back quick PLLA and PLLA_OUT0 rate
    updates. This was observed on the Jetson TX2 platform, where the DMIC
    clock ended up with an unexpected value.
    
    Fixes: 202e2f774543 ("ASoC: tegra: Add audio graph based card driver")
    Cc: stable@vger.kernel.org
    Signed-off-by: Sameer Pujar <spujar@nvidia.com>
    Link: https://lore.kernel.org/r/1694098945-32760-3-git-send-email-spujar@nvidia.com
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
ata,scsi: do not issue START STOP UNIT on resume [+ + +]
Author: Damien Le Moal <dlemoal@kernel.org>
Date:   Mon Jul 24 13:23:14 2023 +0900

    ata,scsi: do not issue START STOP UNIT on resume
    
    [ Upstream commit 0a8589055936d8feb56477123a8373ac634018fa ]
    
    During system resume, ata_port_pm_resume() triggers ata EH to
    1) Resume the controller
    2) Reset and rescan the ports
    3) Revalidate devices
    This EH execution is started asynchronously from ata_port_pm_resume(),
    which means that when sd_resume() is executed, none or only part of the
    above processing may have been executed. However, sd_resume() issues a
    START STOP UNIT to wake up the drive from sleep mode. This command is
    translated to ATA with ata_scsi_start_stop_xlat() and issued to the
    device. However, depending on the state of execution of the EH process
    and revalidation triggered by ata_port_pm_resume(), two things may
    happen:
    1) The START STOP UNIT fails if it is received before the controller has
       been reenabled at the beginning of the EH execution. This is visible
       with error messages like:
    
    ata10.00: device reported invalid CHS sector 0
    sd 9:0:0:0: [sdc] Start/Stop Unit failed: Result: hostbyte=DID_OK driverbyte=DRIVER_OK
    sd 9:0:0:0: [sdc] Sense Key : Illegal Request [current]
    sd 9:0:0:0: [sdc] Add. Sense: Unaligned write command
    sd 9:0:0:0: PM: dpm_run_callback(): scsi_bus_resume+0x0/0x90 returns -5
    sd 9:0:0:0: PM: failed to resume async: error -5
    
    2) The START STOP UNIT command is received while the EH process is
       on-going, which means that it is stopped and must wait for its
       completion, at which point the command is rather useless as the drive
       is already fully spun up. This case also results in a
       significant delay in sd_resume() which is observable by users as
       the entire system resume completion is delayed.
    
    Given that ATA devices will be woken up by libata activity on resume,
    sd_resume() has no need to issue a START STOP UNIT command, which solves
    the above mentioned problems. Do not issue this command by introducing
    the new scsi_device flag no_start_on_resume and setting this flag to 1
    in ata_scsi_dev_config(). sd_resume() is modified to issue a START STOP
    UNIT command only if this flag is not set.
    
    Reported-by: Paul Ausbeck <paula@soe.ucsc.edu>
    Closes: https://bugzilla.kernel.org/show_bug.cgi?id=215880
    Fixes: a19a93e4c6a9 ("scsi: core: pm: Rely on the device driver core for async power management")
    Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
    Tested-by: Tanner Watkins <dalzot@gmail.com>
    Tested-by: Paul Ausbeck <paula@soe.ucsc.edu>
    Reviewed-by: Hannes Reinecke <hare@suse.de>
    Reviewed-by: Bart Van Assche <bvanassche@acm.org>
    Stable-dep-of: 99398d2070ab ("scsi: sd: Do not issue commands to suspended disks on shutdown")
    Signed-off-by: Sasha Levin <sashal@kernel.org>
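
The flag-gated resume described above can be modelled in a few lines of C. Only the flag name no_start_on_resume comes from the commit message; the struct, the counter and the function names are hypothetical simplifications, not the real sd or libata code.

```c
#include <stdbool.h>

/* Hypothetical, heavily simplified device. */
struct example_sdev {
    bool no_start_on_resume; /* set for ATA devices */
    int start_stop_issued;   /* counts START STOP UNIT commands sent */
};

/* Models the ATA layer marking its devices at configuration time. */
static void example_ata_dev_config(struct example_sdev *sdev)
{
    sdev->no_start_on_resume = true;
}

/* Models the resume path: issue the command only if the flag is clear. */
static int example_sd_resume(struct example_sdev *sdev)
{
    if (!sdev->no_start_on_resume)
        sdev->start_stop_issued++;
    return 0;
}
```

A device configured through example_ata_dev_config() resumes without any START STOP UNIT command, while other devices keep the previous behaviour.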

 
ata: libata-scsi: Fix delayed scsi_rescan_device() execution [+ + +]
Author: Damien Le Moal <dlemoal@kernel.org>
Date:   Tue Sep 5 09:06:23 2023 +0900

    ata: libata-scsi: Fix delayed scsi_rescan_device() execution
    
    [ Upstream commit 8b4d9469d0b0e553208ee6f62f2807111fde18b9 ]
    
    Commit 6aa0365a3c85 ("ata: libata-scsi: Avoid deadlock on rescan after
    device resume") modified ata_scsi_dev_rescan() to check the scsi device
    "is_suspended" power field to ensure that the scsi device associated
    with an ATA device is fully resumed when scsi_rescan_device() is
    executed. However, this fix is problematic as:
    1) It relies on a PM internal field that should not be used without PM
       device locking protection.
    2) The check for is_suspended and the call to scsi_rescan_device() are
       not atomic and a suspend PM event may be triggered between them,
       causing scsi_rescan_device() to be called on a suspended device and
       in that function blocking while holding the scsi device lock. This
       would deadlock a following resume operation.
    These problems can trigger PM deadlocks on resume, especially with
    resume operations triggered quickly after or during suspend operations.
    E.g., a simple bash script like:
    
    for (( i=0; i<10; i++ )); do
            echo +2 > /sys/class/rtc/rtc0/wakealarm
            echo mem > /sys/power/state
    done
    
    that triggers a resume 2 seconds after starting suspending a system can
    quickly lead to a PM deadlock preventing the system from correctly
    resuming.
    
    Fix this by replacing the check on is_suspended with a check on the
    return value given by scsi_rescan_device() as that function will fail if
    called against a suspended device. Also make sure rescan tasks already
    scheduled are first cancelled before suspending an ata port.
    
    Fixes: 6aa0365a3c85 ("ata: libata-scsi: Avoid deadlock on rescan after device resume")
    Cc: stable@vger.kernel.org
    Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
    Reviewed-by: Hannes Reinecke <hare@suse.de>
    Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com>
    Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
    Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
block: fix use-after-free of q->q_usage_counter [+ + +]
Author: Ming Lei <ming.lei@redhat.com>
Date:   Thu Dec 15 10:16:29 2022 +0800

    block: fix use-after-free of q->q_usage_counter
    
    commit d36a9ea5e7766961e753ee38d4c331bbe6ef659b upstream.
    
    For blk-mq, queue release handler is usually called after
    blk_mq_freeze_queue_wait() returns. However, the
    q_usage_counter->release() handler may not be run yet at that time, so
    this can cause a use-after-free.
    
    Fix the issue by moving percpu_ref_exit() into blk_free_queue_rcu().
    Since ->release() is called with rcu read lock held, it is agreed that
    the race should be covered in caller per discussion from the two links.
    
    Reported-by: Zhang Wensheng <zhangwensheng@huaweicloud.com>
    Reported-by: Zhong Jinghua <zhongjinghua@huawei.com>
    Link: https://lore.kernel.org/linux-block/Y5prfOjyyjQKUrtH@T590/T/#u
    Link: https://lore.kernel.org/lkml/Y4%2FmzMd4evRg9yDi@fedora/
    Cc: Hillf Danton <hdanton@sina.com>
    Cc: Yu Kuai <yukuai3@huawei.com>
    Cc: Dennis Zhou <dennis@kernel.org>
    Fixes: 2b0d3d3e4fcf ("percpu_ref: reduce memory footprint of percpu_ref in fast path")
    Signed-off-by: Ming Lei <ming.lei@redhat.com>
    Link: https://lore.kernel.org/r/20221215021629.74870-1-ming.lei@redhat.com
    Signed-off-by: Jens Axboe <axboe@kernel.dk>
    Signed-off-by: Saranya Muruganandam <saranyamohan@google.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
Bluetooth: Delete unused hci_req_prepare_suspend() declaration [+ + +]
Author: Yao Xiao <xiaoyao@rock-chips.com>
Date:   Sat Aug 26 16:13:13 2023 +0800

    Bluetooth: Delete unused hci_req_prepare_suspend() declaration
    
    [ Upstream commit cbaabbcdcbd355f0a1ccc09a925575c51c270750 ]
    
    hci_req_prepare_suspend() has been deprecated in favor of
    hci_suspend_sync().
    
    Fixes: 182ee45da083 ("Bluetooth: hci_sync: Rework hci_suspend_notifier")
    Signed-off-by: Yao Xiao <xiaoyao@rock-chips.com>
    Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

Bluetooth: hci_codec: Fix leaking content of local_codecs [+ + +]
Author: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Date:   Fri Sep 15 13:24:47 2023 -0700

    Bluetooth: hci_codec: Fix leaking content of local_codecs
    
    commit b938790e70540bf4f2e653dcd74b232494d06c8f upstream.
    
    The following memory leak can be observed when the controller supports
    codecs which are stored in local_codecs list but the elements are never
    freed:
    
    unreferenced object 0xffff88800221d840 (size 32):
      comm "kworker/u3:0", pid 36, jiffies 4294898739 (age 127.060s)
      hex dump (first 32 bytes):
        f8 d3 02 03 80 88 ff ff 80 d8 21 02 80 88 ff ff  ..........!.....
        00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
      backtrace:
        [<ffffffffb324f557>] __kmalloc+0x47/0x120
        [<ffffffffb39ef37d>] hci_codec_list_add.isra.0+0x2d/0x160
        [<ffffffffb39ef643>] hci_read_codec_capabilities+0x183/0x270
        [<ffffffffb39ef9ab>] hci_read_supported_codecs+0x1bb/0x2d0
        [<ffffffffb39f162e>] hci_read_local_codecs_sync+0x3e/0x60
        [<ffffffffb39ff1b3>] hci_dev_open_sync+0x943/0x11e0
        [<ffffffffb396d55d>] hci_power_on+0x10d/0x3f0
        [<ffffffffb30c99b4>] process_one_work+0x404/0x800
        [<ffffffffb30ca134>] worker_thread+0x374/0x670
        [<ffffffffb30d9108>] kthread+0x188/0x1c0
        [<ffffffffb304db6b>] ret_from_fork+0x2b/0x50
        [<ffffffffb300206a>] ret_from_fork_asm+0x1a/0x30
    
    Cc: stable@vger.kernel.org
    Fixes: 8961987f3f5f ("Bluetooth: Enumerate local supported codec and cache details")
    Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
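
The class of leak reported above (list elements allocated but never freed) is avoided by walking the list and freeing each node when the owner goes away. The following is a generic user-space sketch with a hypothetical singly linked list; it does not model the kernel's own list type or the actual hci_codec functions.

```c
#include <stdlib.h>

/* Hypothetical list node for illustration. */
struct example_codec {
    struct example_codec *next;
    int id;
};

/* Prepend a node; returns 0 on success, -1 if allocation fails. */
static int example_codec_add(struct example_codec **head, int id)
{
    struct example_codec *c = malloc(sizeof(*c));

    if (!c)
        return -1;
    c->id = id;
    c->next = *head;
    *head = c;
    return 0;
}

/* Free every node and reset the head; returns how many nodes were freed. */
static int example_codec_clear(struct example_codec **head)
{
    int freed = 0;

    while (*head) {
        struct example_codec *c = *head;

        *head = c->next;
        free(c);
        freed++;
    }
    return freed;
}
```

Calling example_codec_clear() on teardown leaves the head at NULL, so nothing remains for a leak detector to report.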

Bluetooth: hci_sync: Fix handling of HCI_QUIRK_STRICT_DUPLICATE_FILTER [+ + +]
Author: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Date:   Tue Aug 29 13:50:06 2023 -0700

    Bluetooth: hci_sync: Fix handling of HCI_QUIRK_STRICT_DUPLICATE_FILTER
    
    commit 941c998b42f5c90384f49da89a6e11233de567cf upstream.
    
    When HCI_QUIRK_STRICT_DUPLICATE_FILTER is set, LE scanning requires
    periodic restarts of the scanning procedure, as the controller would
    otherwise consider a previously found device a duplicate despite RSSI
    changes. In order to set the scan timeout properly, le_scan_restart
    needs to be synchronous, so it shall not use hci_cmd_sync_queue, which
    defers the command processing to cmd_sync_work.
    
    Cc: stable@vger.kernel.org
    Link: https://lore.kernel.org/linux-bluetooth/578e6d7afd676129decafba846a933f5@agner.ch/#t
    Fixes: 27d54b778ad1 ("Bluetooth: Rework le_scan_restart for hci_sync")
    Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Bluetooth: ISO: Fix handling of listen for unicast [+ + +]
Author: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Date:   Mon Aug 28 13:05:45 2023 -0700

    Bluetooth: ISO: Fix handling of listen for unicast
    
    [ Upstream commit e0275ea52169412b8faccb4e2f4fed8a057844c6 ]
    
    iso_listen_cis shall only return -EADDRINUSE if the listening socket has
    the destination set to BDADDR_ANY; otherwise, if the destination is set
    to a specific address, it is for broadcast, which shall be ignored.
    
    Fixes: f764a6c2c1e4 ("Bluetooth: ISO: Add broadcast support")
    Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
bpf, sockmap: Do not inc copied_seq when PEEK flag set [+ + +]
Author: John Fastabend <john.fastabend@gmail.com>
Date:   Mon Sep 25 20:52:59 2023 -0700

    bpf, sockmap: Do not inc copied_seq when PEEK flag set
    
    [ Upstream commit da9e915eaf5dadb1963b7738cdfa42ed55212445 ]
    
    When data is peek'd off the receive queue we shouldn't consider it
    copied from tcp_sock side. When we increment copied_seq this will confuse
    tcp_data_ready() because copied_seq can be arbitrarily increased. From
    application side it results in poll() operations not waking up when
    expected.
    
    Notice tcp stack without BPF recvmsg programs also does not increment
    copied_seq.
    
    We broke this when we moved copied_seq into recvmsg to only update when
    actual copy was happening. But, it wasn't working correctly either before
    because the tcp_data_ready() tried to use the copied_seq value to see
    if data was read by user yet. See fixes tags.
    
    Fixes: e5c6de5fa0258 ("bpf, sockmap: Incorrectly handling copied_seq")
    Fixes: 04919bed948dc ("tcp: Introduce tcp_read_skb()")
    Signed-off-by: John Fastabend <john.fastabend@gmail.com>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
    Link: https://lore.kernel.org/bpf/20230926035300.135096-3-john.fastabend@gmail.com
    Signed-off-by: Sasha Levin <sashal@kernel.org>
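
The rule that a peek must not advance the consumed-sequence counter can be shown with a toy receive function. This is a sketch under simplifying assumptions: the struct and function are hypothetical, and only the name copied_seq is taken from the commit message.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical socket state: first sequence number not yet consumed. */
struct example_sock {
    uint32_t copied_seq;
};

/*
 * Hand up to `want` of the `avail` queued bytes to the caller.
 * copied_seq only advances on a real read, never on a peek.
 */
static size_t example_recv(struct example_sock *sk, size_t avail,
                           size_t want, bool peek)
{
    size_t n = want < avail ? want : avail;

    if (!peek)
        sk->copied_seq += (uint32_t)n;
    return n;
}
```

After a peek the counter is unchanged, so any logic comparing it against the received sequence still sees the data as unread.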

bpf, sockmap: Reject sk_msg egress redirects to non-TCP sockets [+ + +]
Author: Jakub Sitnicki <jakub@cloudflare.com>
Date:   Wed Sep 20 12:20:55 2023 +0200

    bpf, sockmap: Reject sk_msg egress redirects to non-TCP sockets
    
    [ Upstream commit b80e31baa43614e086a9d29dc1151932b1bd7fc5 ]
    
    With a SOCKMAP/SOCKHASH map and an sk_msg program user can steer messages
    sent from one TCP socket (s1) to actually egress from another TCP
    socket (s2):
    
    tcp_bpf_sendmsg(s1)             // = sk_prot->sendmsg
      tcp_bpf_send_verdict(s1)      // __SK_REDIRECT case
        tcp_bpf_sendmsg_redir(s2)
          tcp_bpf_push_locked(s2)
            tcp_bpf_push(s2)
              tcp_rate_check_app_limited(s2) // expects tcp_sock
              tcp_sendmsg_locked(s2)         // ditto
    
    There is a hard-coded assumption in the call-chain, that the egress
    socket (s2) is a TCP socket.
    
    However in commit 122e6c79efe1 ("sock_map: Update sock type checks for
    UDP") we have enabled redirects to non-TCP sockets. This was done for the
    sake of BPF sk_skb programs. There was no intention to support sk_msg
    send-to-egress use case.
    
    As a result, attempts to send-to-egress through a non-TCP socket lead to a
    crash due to invalid downcast from sock to tcp_sock:
    
     BUG: kernel NULL pointer dereference, address: 000000000000002f
     ...
     Call Trace:
      <TASK>
      ? show_regs+0x60/0x70
      ? __die+0x1f/0x70
      ? page_fault_oops+0x80/0x160
      ? do_user_addr_fault+0x2d7/0x800
      ? rcu_is_watching+0x11/0x50
      ? exc_page_fault+0x70/0x1c0
      ? asm_exc_page_fault+0x27/0x30
      ? tcp_tso_segs+0x14/0xa0
      tcp_write_xmit+0x67/0xce0
      __tcp_push_pending_frames+0x32/0xf0
      tcp_push+0x107/0x140
      tcp_sendmsg_locked+0x99f/0xbb0
      tcp_bpf_push+0x19d/0x3a0
      tcp_bpf_sendmsg_redir+0x55/0xd0
      tcp_bpf_send_verdict+0x407/0x550
      tcp_bpf_sendmsg+0x1a1/0x390
      inet_sendmsg+0x6a/0x70
      sock_sendmsg+0x9d/0xc0
      ? sockfd_lookup_light+0x12/0x80
      __sys_sendto+0x10e/0x160
      ? syscall_enter_from_user_mode+0x20/0x60
      ? __this_cpu_preempt_check+0x13/0x20
      ? lockdep_hardirqs_on+0x82/0x110
      __x64_sys_sendto+0x1f/0x30
      do_syscall_64+0x38/0x90
      entry_SYSCALL_64_after_hwframe+0x63/0xcd
    
    Reject selecting a non-TCP socket as redirect target from a BPF sk_msg
    program to prevent the crash. When attempted, user will receive an EACCES
    error from send/sendto/sendmsg() syscall.
    
    Fixes: 122e6c79efe1 ("sock_map: Update sock type checks for UDP")
    Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Acked-by: John Fastabend <john.fastabend@gmail.com>
    Link: https://lore.kernel.org/bpf/20230920102055.42662-1-jakub@cloudflare.com
    Signed-off-by: Sasha Levin <sashal@kernel.org>
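
The shape of the fix, validating the socket type before any TCP-specific code touches it, can be sketched as follows. The enum, struct and function names are hypothetical; only the EACCES result comes from the commit message.

```c
#include <errno.h>

/* Hypothetical protocol tag for a redirect target. */
enum example_proto {
    EXAMPLE_PROTO_TCP,
    EXAMPLE_PROTO_UDP
};

struct example_target {
    enum example_proto proto;
};

/*
 * Accept the redirect only when the target is a TCP socket, so the
 * later TCP-specific send path never sees another socket type.
 */
static int example_msg_redirect(const struct example_target *t)
{
    if (t->proto != EXAMPLE_PROTO_TCP)
        return -EACCES;
    return 0;
}
```

Rejecting the target up front turns what was an invalid downcast and crash into an ordinary error returned to the caller.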

 
bpf: Add BPF_FIB_LOOKUP_SKIP_NEIGH for bpf_fib_lookup [+ + +]
Author: Martin KaFai Lau <martin.lau@kernel.org>
Date:   Fri Feb 17 12:55:14 2023 -0800

    bpf: Add BPF_FIB_LOOKUP_SKIP_NEIGH for bpf_fib_lookup
    
    [ Upstream commit 31de4105f00d64570139bc5494a201b0bd57349f ]
    
    The bpf_fib_lookup() also looks up the neigh table.
    This was done before bpf_redirect_neigh() was added.
    
    In the use case that does not manage the neigh table
    and requires bpf_fib_lookup() to lookup a fib to
    decide if it needs to redirect or not, the bpf prog can
    depend only on using bpf_redirect_neigh() to lookup the
    neigh. It also keeps the neigh entries fresh and connected.
    
    This patch adds a bpf_fib_lookup flag, SKIP_NEIGH, to avoid
    the double neigh lookup when the bpf prog always calls
    bpf_redirect_neigh() to do the neigh lookup. The params->smac
    output is skipped together when SKIP_NEIGH is set because
    bpf_redirect_neigh() will figure out the smac also.
    
    Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Link: https://lore.kernel.org/bpf/20230217205515.3583372-1-martin.lau@linux.dev
    Stable-dep-of: 5baa0433a15e ("neighbour: fix data-races around n->output")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

bpf: Fix tr dereferencing [+ + +]
Author: Leon Hwang <hffilwlqm@gmail.com>
Date:   Sun Sep 17 23:38:46 2023 +0800

    bpf: Fix tr dereferencing
    
    [ Upstream commit b724a6418f1f853bcb39c8923bf14a50c7bdbd07 ]
    
    Fix 'tr' dereferencing bug when CONFIG_BPF_JIT is turned off.
    
    When CONFIG_BPF_JIT is turned off, 'bpf_trampoline_get()' returns NULL,
    which is same as the cases when CONFIG_BPF_JIT is turned on.
    
    Closes: https://lore.kernel.org/r/202309131936.5Nc8eUD0-lkp@intel.com/
    Fixes: f7b12b6fea00 ("bpf: verifier: refactor check_attach_btf_id()")
    Reported-by: kernel test robot <lkp@intel.com>
    Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
    Signed-off-by: Leon Hwang <hffilwlqm@gmail.com>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Link: https://lore.kernel.org/bpf/20230917153846.88732-1-hffilwlqm@gmail.com
    Signed-off-by: Sasha Levin <sashal@kernel.org>

bpf: tcp_read_skb needs to pop skb regardless of seq [+ + +]
Author: John Fastabend <john.fastabend@gmail.com>
Date:   Mon Sep 25 20:52:58 2023 -0700

    bpf: tcp_read_skb needs to pop skb regardless of seq
    
    [ Upstream commit 9b7177b1df64b8d7f85700027c324aadd6aded00 ]
    
    Before fix e5c6de5fa0258 tcp_read_skb() would increment the tp->copied_seq
    value. This (as described in the commit) would cause an error for apps
    because once that is incremented the application might believe there is no
    data to be read. Then some apps would stall or abort believing no data is
    available.
    
    However, the fix is incomplete because it introduces another issue in
    the skb dequeue. The loop does tcp_recv_skb() in a while loop to consume
    as many skbs as possible. The problem is the call is ...
    
      tcp_recv_skb(sk, seq, &offset)
    
    ... where 'seq' is:
    
      u32 seq = tp->copied_seq;
    
    Now we can hit a case where we've not yet incremented copied_seq from BPF side,
    but then tcp_recv_skb() fails this test ...
    
     if (offset < skb->len || (TCP_SKB_CB(skb)->tcp_flags & TCPHDR_FIN))
    
    ... so that instead of returning the skb we call tcp_eat_recv_skb() which
    frees the skb. This is because the routine believes the SKB has been collapsed
    per comment:
    
     /* This looks weird, but this can happen if TCP collapsing
      * splitted a fat GRO packet, while we released socket lock
      * in skb_splice_bits()
      */
    
    This can't happen here because we've unlinked the full SKB and orphaned it. Anyways
    it would confuse any BPF programs if the data were suddenly moved underneath
    it.
    
    To fix this situation do a simpler operation and just skb_peek() the data
    of the queue followed by the unlink. It shouldn't need to check this
    condition and tcp_read_skb() reads entire skbs so there is no need to
    handle the 'offset!=0' case as we would see in tcp_read_sock().
    
    Fixes: e5c6de5fa0258 ("bpf, sockmap: Incorrectly handling copied_seq")
    Fixes: 04919bed948dc ("tcp: Introduce tcp_read_skb()")
    Signed-off-by: John Fastabend <john.fastabend@gmail.com>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
    Link: https://lore.kernel.org/bpf/20230926035300.135096-2-john.fastabend@gmail.com
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
btrfs: file_remove_privs needs an exclusive lock in direct io write [+ + +]
Author: Bernd Schubert <bschubert@ddn.com>
Date:   Wed Sep 6 17:59:03 2023 +0200

    btrfs: file_remove_privs needs an exclusive lock in direct io write
    
    commit 9af86694fd5d387992699ec99007ed374966ce9a upstream.
    
    Miklos noticed that file_remove_privs() might call into
    notify_change(), which requires holding an exclusive lock. The problem
    exists in FUSE and btrfs. We can fix it without any additional helpers
    from VFS: in case the privileges need to be dropped, change the
    lock type to exclusive and redo the loop.
    
    Fixes: e9adabb9712e ("btrfs: use shared lock for direct writes within EOF")
    CC: Miklos Szeredi <miklos@szeredi.hu>
    CC: stable@vger.kernel.org # 5.15+
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Bernd Schubert <bschubert@ddn.com>
    Reviewed-by: David Sterba <dsterba@suse.com>
    Signed-off-by: David Sterba <dsterba@suse.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
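
The "switch to an exclusive lock and redo the loop" pattern can be sketched without real locking primitives. In this hypothetical model, the lock mode is just a variable and has_privs stands for any state that may only be modified under an exclusive lock; none of these names come from btrfs.

```c
#include <stdbool.h>

enum example_lock {
    EXAMPLE_LOCK_SHARED,
    EXAMPLE_LOCK_EXCLUSIVE
};

/* Hypothetical inode state. */
struct example_inode {
    bool has_privs; /* e.g. a setuid bit that a write must clear */
    int relocks;    /* how often the lock had to be retaken */
};

/* Returns the lock mode the write ends up running under. */
static enum example_lock example_dio_write_lock(struct example_inode *inode)
{
    enum example_lock mode = EXAMPLE_LOCK_SHARED;

    for (;;) {
        /* A real implementation would take the lock in `mode` here. */
        if (inode->has_privs && mode == EXAMPLE_LOCK_SHARED) {
            /* Drop the shared lock, upgrade, and redo the loop. */
            mode = EXAMPLE_LOCK_EXCLUSIVE;
            inode->relocks++;
            continue;
        }
        if (inode->has_privs)
            inode->has_privs = false; /* safe: exclusive lock held */
        return mode;
    }
}
```

Writes that do not need to drop privileges stay on the cheaper shared lock; only the rare privileged case pays for the retry.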

btrfs: fix an error handling path in btrfs_rename() [+ + +]
Author: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Date:   Mon Dec 12 21:14:17 2022 +0100

    btrfs: fix an error handling path in btrfs_rename()
    
    commit abe3bf7425fb695a9b37394af18b9ea58a800802 upstream.
    
    If new_whiteout_inode() fails, some resources need to be freed.
    Add the missing goto to the error handling path.
    
    Fixes: ab3c5c18e8fa ("btrfs: setup qstr from dentrys using fscrypt helper")
    Reviewed-by: Sweet Tea Dorminy <sweettea-kernel@dorminy.me>
    Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
    Reviewed-by: David Sterba <dsterba@suse.com>
    Signed-off-by: David Sterba <dsterba@suse.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
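
A missing goto on an error path is a common source of leaks in C. The sketch below shows the general pattern with hypothetical helpers and a counter that tracks live allocations; it is not the btrfs_rename() code.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical bookkeeping so a leak is observable. */
static int example_live_allocs;

static void *example_alloc(void)
{
    void *p = malloc(16);

    if (p)
        example_live_allocs++;
    return p;
}

static void example_free(void *p)
{
    if (p) {
        free(p);
        example_live_allocs--;
    }
}

/* fail_second_step simulates a later allocation failing. */
static int example_rename(int fail_second_step)
{
    int ret = 0;
    void *name = example_alloc();

    if (!name)
        return -ENOMEM;

    if (fail_second_step) {
        ret = -ENOMEM;
        goto out_free_name; /* a bare return here would leak `name` */
    }

    /* ... remaining work would go here ... */

out_free_name:
    example_free(name);
    return ret;
}
```

Both the success and the failure path funnel through the same label, so the resource is released exactly once in either case.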

btrfs: fix fscrypt name leak after failure to join log transaction [+ + +]
Author: Filipe Manana <fdmanana@suse.com>
Date:   Tue Dec 20 11:13:33 2022 +0000

    btrfs: fix fscrypt name leak after failure to join log transaction
    
    commit fee4c19937439693f2420a916169d08e88576e8e upstream.
    
    When logging a new name, we don't expect to fail joining a log transaction
    since we know at least one of the inodes was logged before in the current
    transaction. However if we fail for some unexpected reason, we end up not
    freeing the fscrypt name we previously allocated. So fix that by freeing
    the name in case we failed to join a log transaction.
    
    Fixes: ab3c5c18e8fa ("btrfs: setup qstr from dentrys using fscrypt helper")
    Reviewed-by: Sweet Tea Dorminy <sweettea-kernel@dorminy.me>
    Signed-off-by: Filipe Manana <fdmanana@suse.com>
    Reviewed-by: David Sterba <dsterba@suse.com>
    Signed-off-by: David Sterba <dsterba@suse.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

btrfs: setup qstr from dentrys using fscrypt helper [+ + +]
Author: Sweet Tea Dorminy <sweettea-kernel@dorminy.me>
Date:   Thu Oct 20 12:58:26 2022 -0400

    btrfs: setup qstr from dentrys using fscrypt helper
    
    [ Upstream commit ab3c5c18e8fa3f8ea116016095d25adab466cd39 ]
    
    Most places where we get a struct qstr, we are doing so from a dentry.
    With fscrypt, the dentry's name may be encrypted on-disk, so fscrypt
    provides a helper to convert a dentry name to the appropriate disk name
    if necessary. Convert each of the dentry name accesses to use
    fscrypt_setup_filename(), then convert the resulting fscrypt_name back
    to an unencrypted qstr. This does not work for nokey names, but the
    specific locations that could spawn nokey names are noted.
    
    At present, since there are no encrypted directories, nothing goes down
    the filename encryption paths.
    
    Signed-off-by: Sweet Tea Dorminy <sweettea-kernel@dorminy.me>
    Reviewed-by: David Sterba <dsterba@suse.com>
    Signed-off-by: David Sterba <dsterba@suse.com>
    Stable-dep-of: 9af86694fd5d ("btrfs: file_remove_privs needs an exclusive lock in direct io write")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

btrfs: use struct fscrypt_str instead of struct qstr [+ + +]
Author: Sweet Tea Dorminy <sweettea-kernel@dorminy.me>
Date:   Thu Oct 20 12:58:27 2022 -0400

    btrfs: use struct fscrypt_str instead of struct qstr
    
    [ Upstream commit 6db75318823a169e836a478ca57d6a7c0a156b77 ]
    
    While struct qstr is more natural without fscrypt, since it's provided
    by dentries, struct fscrypt_str is provided by the fscrypt handlers
    processing dentries, and is thus more natural in the fscrypt world.
    Replace all of the struct qstr uses with struct fscrypt_str.
    
    Signed-off-by: Sweet Tea Dorminy <sweettea-kernel@dorminy.me>
    Reviewed-by: David Sterba <dsterba@suse.com>
    Signed-off-by: David Sterba <dsterba@suse.com>
    Stable-dep-of: 9af86694fd5d ("btrfs: file_remove_privs needs an exclusive lock in direct io write")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

btrfs: use struct qstr instead of name and namelen pairs [+ + +]
Author: Sweet Tea Dorminy <sweettea-kernel@dorminy.me>
Date:   Thu Oct 20 12:58:25 2022 -0400

    btrfs: use struct qstr instead of name and namelen pairs
    
    [ Upstream commit e43eec81c5167b655b72c781b0e75e62a05e415e ]
    
    Many functions throughout btrfs take name buffer and name length
    arguments. Most of these functions at the highest level are usually
    called with these arguments extracted from a supplied dentry's name.
    But the entire name can be passed instead, making each function a little
    more elegant.
    
    Each function whose arguments are currently the name and length
    extracted from a dentry is herein converted to instead take a pointer to
    the name in the dentry. The couple of calls to these functions without a
    struct dentry are converted to create an appropriate qstr to pass in.
    Additionally, every function which is only called with a name/len
    extracted directly from a qstr is also converted.
    
    This change has a positive effect on stack consumption: the frame of
    many functions is reduced, but the space saved will be used in the
    future for fscrypt-related structures.
    
    Signed-off-by: Sweet Tea Dorminy <sweettea-kernel@dorminy.me>
    Reviewed-by: David Sterba <dsterba@suse.com>
    Signed-off-by: David Sterba <dsterba@suse.com>
    Stable-dep-of: 9af86694fd5d ("btrfs: file_remove_privs needs an exclusive lock in direct io write")
    Signed-off-by: Sasha Levin <sashal@kernel.org>
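
The calling-convention change described above can be sketched in userspace C. This is only an illustration: struct name_ref is a hypothetical stand-in for the kernel's struct qstr, and neither function is taken from btrfs.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct qstr: a name buffer plus its length. */
struct name_ref {
    const char *name;
    size_t len;
};

/* Old style: buffer and length travel as two separate arguments. */
int name_matches_pair(const char *name, size_t len,
                      const char *want, size_t want_len)
{
    return len == want_len && memcmp(name, want, len) == 0;
}

/* New style: a single pointer carries both the buffer and the length. */
int name_matches(const struct name_ref *a, const struct name_ref *b)
{
    return a->len == b->len && memcmp(a->name, b->name, a->len) == 0;
}
```

Passing one pointer instead of a pointer/length pair means fewer arguments per call, which is presumably where the reduced stack usage mentioned in the commit comes from.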

 
dm zoned: free dmz->ddev array in dmz_put_zoned_devices [+ + +]
Author: Fedor Pchelkin <pchelkin@ispras.ru>
Date:   Wed Sep 20 13:51:16 2023 +0300

    dm zoned: free dmz->ddev array in dmz_put_zoned_devices
    
    commit 9850ccd5dd88075b2b7fd28d96299d5535f58cc5 upstream.
    
    Commit 4dba12881f88 ("dm zoned: support arbitrary number of devices")
    made the pointers to additional zoned devices to be stored in a
    dynamically allocated dmz->ddev array. However, this array is not freed.
    
    Rename dmz_put_zoned_device to dmz_put_zoned_devices and fix it to
    free the dmz->ddev array when cleaning up zoned device information.
    Remove NULL assignment for all dmz->ddev elements and just free the
    dmz->ddev array instead.
    
    Found by Linux Verification Center (linuxtesting.org).
    
    Fixes: 4dba12881f88 ("dm zoned: support arbitrary number of devices")
    Cc: stable@vger.kernel.org
    Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru>
    Signed-off-by: Mike Snitzer <snitzer@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
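
A minimal userspace sketch of the cleanup pattern the commit describes, assuming simplified stand-in structures (struct zoned_dev and struct zoned_target are hypothetical and are not the dm-zoned types): the per-device resources are released and then the dynamically allocated pointer array itself is freed.

```c
#include <stdlib.h>

/* Hypothetical, simplified stand-ins for the dm-zoned structures. */
struct zoned_dev { int id; };

struct zoned_target {
    struct zoned_dev **ddev;   /* dynamically allocated pointer array */
    unsigned int nr_ddevs;
};

/* Release every device, then the array that held the pointers. */
void target_put_devices(struct zoned_target *t)
{
    for (unsigned int i = 0; i < t->nr_ddevs; i++)
        free(t->ddev[i]);
    free(t->ddev);             /* the step that was missing before the fix */
    t->ddev = NULL;
    t->nr_ddevs = 0;
}

/* Allocate the array and one device per slot; returns 0 on success. */
int target_get_devices(struct zoned_target *t, unsigned int nr)
{
    t->ddev = calloc(nr, sizeof(*t->ddev));
    if (!t->ddev)
        return -1;
    t->nr_ddevs = nr;
    for (unsigned int i = 0; i < nr; i++) {
        t->ddev[i] = calloc(1, sizeof(*t->ddev[i]));
        if (!t->ddev[i]) {
            target_put_devices(t);   /* undo the partial setup */
            return -1;
        }
    }
    return 0;
}
```

Because the array is freed as a whole, there is no need to reset its individual elements to NULL first, which matches the simplification the commit mentions.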

 
drivers/net: process the result of hdlc_open() and add call of hdlc_close() in uhdlc_close() [+ + +]
Author: Alexandra Diupina <adiupina@astralinux.ru>
Date:   Tue Sep 19 17:25:02 2023 +0300

    drivers/net: process the result of hdlc_open() and add call of hdlc_close() in uhdlc_close()
    
    [ Upstream commit a59addacf899b1b21a7b7449a1c52c98704c2472 ]
    
    Process the result of hdlc_open() and call uhdlc_close()
    in case of an error. It is necessary to pass the error
    code up the control flow, similar to a possible
    error in request_irq().
    Also add a hdlc_close() call to uhdlc_close(),
    because the comment on hdlc_close() says it must be called
    by the hardware driver when the HDLC device is being closed.
    
    Found by Linux Verification Center (linuxtesting.org) with SVACE.
    
    Fixes: c19b6d246a35 ("drivers/net: support hdlc function for QE-UCC")
    Signed-off-by: Alexandra Diupina <adiupina@astralinux.ru>
    Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
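
The open/close pairing described above might look roughly like the following userspace sketch. All stub_* functions and struct port are hypothetical placeholders for request_irq(), hdlc_open(), hdlc_close() and the driver state; the real driver code differs.

```c
/* Hypothetical driver state used only for this illustration. */
struct port {
    int irq_held;
    int hdlc_opened;
    int fail_hdlc_open;   /* test knob: make the stubbed open fail */
};

int stub_request_irq(struct port *p) { p->irq_held = 1; return 0; }
void stub_free_irq(struct port *p) { p->irq_held = 0; }

int stub_hdlc_open(struct port *p)
{
    if (p->fail_hdlc_open)
        return -1;
    p->hdlc_opened = 1;
    return 0;
}

void stub_hdlc_close(struct port *p) { p->hdlc_opened = 0; }

/* Close path: releases the IRQ and also closes the HDLC layer. */
void port_close(struct port *p)
{
    stub_free_irq(p);
    stub_hdlc_close(p);
}

/* Open path: the result of the HDLC open is checked and propagated. */
int port_open(struct port *p)
{
    int rc = stub_request_irq(p);

    if (rc)
        return rc;
    rc = stub_hdlc_open(p);
    if (rc) {
        port_close(p);   /* undo what was set up before the failure */
        return rc;
    }
    return 0;
}
```

The point of the sketch is only the control flow: a failed open unwinds through the close path and the error code reaches the caller.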

 
drm/amd/display: Adjust the MST resume flow [+ + +]
Author: Wayne Lin <wayne.lin@amd.com>
Date:   Tue Aug 22 16:03:17 2023 +0800

    drm/amd/display: Adjust the MST resume flow
    
    commit ec5fa9fcdeca69edf7dab5ca3b2e0ceb1c08fe9a upstream.
    
    [Why]
    Today, drm_dp_mst_topology_mgr_resume() resumes the mst branch so
    that it is ready to handle mst mode and then immediately does the
    mst topology probing. This gives the driver a chance to fire a
    hotplug event before the old state is restored. Userspace will then
    react to the hotplug event based on a wrong state.
    
    [How]
    Adjust the mst resume flow as:
    1. set dpcd to resume mst branch status
    2. restore source old state
    3. Do mst resume topology probing
    
    For drm_dp_mst_topology_mgr_resume(), it would be better to pull the
    topology probing work out into a second stage of the mst resume.
    A follow-up patch in drm will do that.
    
    Reviewed-by: Chao-kai Wang <stylon.wang@amd.com>
    Cc: Mario Limonciello <mario.limonciello@amd.com>
    Cc: Alex Deucher <alexander.deucher@amd.com>
    Cc: stable@vger.kernel.org
    Acked-by: Stylon Wang <stylon.wang@amd.com>
    Signed-off-by: Wayne Lin <wayne.lin@amd.com>
    Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    [ Adjust for missing variable rename in
     f0127cb11299 ("drm/amdgpu/display/mst: adjust the naming of mst_port and port of aconnector") ]
    Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
drm/amd: Fix detection of _PR3 on the PCIe root port [+ + +]
Author: Mario Limonciello <mario.limonciello@amd.com>
Date:   Tue Sep 26 17:59:53 2023 -0500

    drm/amd: Fix detection of _PR3 on the PCIe root port
    
    commit 134b8c5d8674e7cde380f82e9aedfd46dcdd16f7 upstream.
    
    On some systems, a Navi3x dGPU will attempt to use BACO for runtime
    PM but fail to resume properly.  This is because on these systems
    the root port goes into D3cold, which is incompatible with BACO.
    
    This happens because in this case the dGPU is connected to a bridge
    between it and the root port, which causes the BOCO detection logic to
    fail.  Fix the intent of the logic by looking at the root port, not the
    immediate upstream bridge, for _PR3.
    
    Cc: stable@vger.kernel.org
    Suggested-by: Jun Ma <Jun.Ma2@amd.com>
    Tested-by: David Perry <David.Perry@amd.com>
    Fixes: b10c1c5b3a4e ("drm/amdgpu: add check for ACPI power resources")
    Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
    Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
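
The difference between checking the immediate upstream bridge and checking the root port can be illustrated with a toy device tree. struct pci_node and both functions are hypothetical; this sketch does not use the kernel's PCI or ACPI helpers.

```c
#include <stddef.h>

/* Hypothetical, simplified PCI tree node; not the kernel's struct pci_dev. */
struct pci_node {
    struct pci_node *parent;  /* upstream device, NULL at the root port */
    int has_pr3;              /* 1 if firmware exposes _PR3 for this node */
};

/* Walk upstream until the node that has no parent, i.e. the root port. */
const struct pci_node *find_root_port(const struct pci_node *dev)
{
    while (dev && dev->parent)
        dev = dev->parent;
    return dev;
}

/* Decide based on the root port rather than the immediate upstream bridge. */
int root_port_has_pr3(const struct pci_node *dev)
{
    const struct pci_node *root = find_root_port(dev);

    return root ? root->has_pr3 : 0;
}
```

With a bridge sitting between the GPU and the root port, looking only at the GPU's parent misses a _PR3 that is declared on the root port, which is the situation the commit describes.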

drm/amd: Fix logic error in sienna_cichlid_update_pcie_parameters() [+ + +]
Author: Mario Limonciello <mario.limonciello@amd.com>
Date:   Tue Sep 26 21:07:43 2023 -0500

    drm/amd: Fix logic error in sienna_cichlid_update_pcie_parameters()
    
    commit 2a1fe39a5be785e962e387146aed34fa9a829f3f upstream.
    
    While aligning SMU11 with SMU13 implementation an assumption was made that
    `dpm_context->dpm_tables.pcie_table` was populated in dpm table initialization
    like in SMU13 but it isn't.
    
    So restore some of the original logic and instead just check for
    amdgpu_device_pcie_dynamic_switching_supported() to decide whether to hardcode
    values; erring on the side of performance.
    
    Cc: stable@vger.kernel.org # 6.1+
    Reported-and-tested-by: Umio Yasuno <coelacanth_dream@protonmail.com>
    Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/1447#note_2101382
    Fixes: e701156ccc6c ("drm/amd: Align SMU11 SMU_MSG_OverridePcieParameters implementation with SMU13")
    Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
    Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
erofs: fix memory leak of LZMA global compressed deduplication [+ + +]
Author: Gao Xiang <xiang@kernel.org>
Date:   Thu Sep 7 13:05:42 2023 +0800

    erofs: fix memory leak of LZMA global compressed deduplication
    
    [ Upstream commit 75a5221630fe5aa3fedba7a06be618db0f79ba1e ]
    
    When stressing microLZMA EROFS images with the new global compressed
    deduplication feature enabled (`-Ededupe`), I found some short-lived
    temporary pages weren't properly released, which could slowly cause
    unexpected OOMs hours later.
    
    Let's fix it now (LZ4 and DEFLATE don't have this issue.)
    
    Fixes: 5c2a64252c5d ("erofs: introduce partial-referenced pclusters")
    Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
    Link: https://lore.kernel.org/r/20230907050542.97152-1-hsiangkao@linux.alibaba.com
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
gpio: aspeed: fix the GPIO number passed to pinctrl_gpio_set_config() [+ + +]
Author: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Date:   Tue Oct 3 09:39:26 2023 +0200

    gpio: aspeed: fix the GPIO number passed to pinctrl_gpio_set_config()
    
    commit f9315f17bf778cb8079a29639419fcc8a41a3c84 upstream.
    
    pinctrl_gpio_set_config() expects the GPIO number from the global GPIO
    numberspace, not the controller-relative offset, which needs to be added
    to the chip base.
    
    Fixes: 5ae4cb94b313 ("gpio: aspeed: Add debounce support")
    Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
    Reviewed-by: Andy Shevchenko <andy@kernel.org>
    Reviewed-by: Andrew Jeffery <andrew@codeconstruct.com.au>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
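
The offset-to-global translation the commit refers to is simple arithmetic. The sketch below assumes a hypothetical struct gpio_ctrl with a base and a line count; it is not the gpiolib data structure.

```c
/* Hypothetical, simplified view of one GPIO controller. */
struct gpio_ctrl {
    int base;    /* first number of this controller in the global numberspace */
    int ngpio;   /* number of lines provided by this controller */
};

/* Convert a controller-relative offset to a global number, -1 if invalid. */
int gpio_global_number(const struct gpio_ctrl *c, int offset)
{
    if (offset < 0 || offset >= c->ngpio)
        return -1;
    return c->base + offset;
}
```

For example, with a base of 512, offset 5 maps to global number 517; passing the bare offset 5 instead would refer to a different line or to none at all.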

gpio: pxa: disable pinctrl calls for MMP_GPIO [+ + +]
Author: Duje Mihanović <duje.mihanovic@skole.hr>
Date:   Fri Sep 29 17:41:57 2023 +0200

    gpio: pxa: disable pinctrl calls for MMP_GPIO
    
    commit f0575116507b981e6a810e78ce3c9040395b958b upstream.
    
    Similarly to PXA3xx and MMP2, pinctrl-single isn't capable of setting
    pin direction on MMP either.
    
    Fixes: a770d946371e ("gpio: pxa: add pin control gpio direction and request")
    Signed-off-by: Duje Mihanović <duje.mihanovic@skole.hr>
    Reviewed-by: Andy Shevchenko <andy@kernel.org>
    Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
HID: intel-ish-hid: ipc: Disable and reenable ACPI GPE bit [+ + +]
Author: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Date:   Tue Oct 3 08:53:32 2023 -0700

    HID: intel-ish-hid: ipc: Disable and reenable ACPI GPE bit
    
    [ Upstream commit 8f02139ad9a7e6e5c05712f8c1501eebed8eacfd ]
    
    The EHL (Elkhart Lake) based platforms provide an OOB (Out of band)
    service, which allows waking up the device when the system is in S5
    (Soft-Off state). This OOB service can be enabled/disabled from BIOS
    settings. When enabled, the ISH device gets PME wake capability. To
    enable PME wakeup, the driver also needs to enable the ACPI GPE bit.
    
    On resume, BIOS will clear the wakeup bit, so the driver needs to
    re-enable it in the resume function to keep the next wakeup capability.
    But this BIOS clearing of the wakeup bit doesn't decrement the internal
    OS GPE reference count, so re-enabling on every resume will cause the
    reference count to overflow.
    
    So first disable the ACPI GPE bit using acpi_disable_gpe() and then
    re-enable it.
    
    Fixes: 2e23a70edabe ("HID: intel-ish-hid: ipc: finish power flow for EHL OOB")
    Reported-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
    Closes: https://lore.kernel.org/lkml/CAAd53p4=oLYiH2YbVSmrPNj1zpMcfp=Wxbasb5vhMXOWCArLCg@mail.gmail.com/T/
    Tested-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
    Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
    Signed-off-by: Jiri Kosina <jkosina@suse.cz>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
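
Why a disable before each re-enable keeps the count bounded can be seen in a toy model. Everything here is an assumption made for illustration (struct gpe_model and its functions are hypothetical and do not reproduce ACPICA's implementation): enabling takes a reference, disabling drops one, and the firmware clears the hardware bit without touching the count.

```c
/* Toy model of a reference-counted enable bit; not ACPICA code. */
struct gpe_model {
    unsigned int refcount;
    int hw_enabled;
};

/* Take a reference; the first reference sets the hardware bit. */
void gpe_enable(struct gpe_model *g)
{
    g->refcount++;
    if (g->refcount == 1)
        g->hw_enabled = 1;
}

/* Drop a reference; the last reference clears the hardware bit. */
void gpe_disable(struct gpe_model *g)
{
    if (g->refcount == 0)
        return;
    g->refcount--;
    if (g->refcount == 0)
        g->hw_enabled = 0;
}

/* Firmware clears the bit behind the OS's back; the count is untouched. */
void firmware_clears_bit(struct gpe_model *g)
{
    g->hw_enabled = 0;
}

/* Resume path in the spirit of the fix: drop the reference, then retake it. */
void resume_rearm(struct gpe_model *g)
{
    gpe_disable(g);
    gpe_enable(g);
}
```

In this model the count stays at 1 no matter how many suspend/resume cycles occur, whereas calling only gpe_enable() on each resume would grow it without bound.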

HID: sony: Fix a potential memory leak in sony_probe() [+ + +]
Author: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Date:   Sun Sep 3 18:04:00 2023 +0200

    HID: sony: Fix a potential memory leak in sony_probe()
    
    [ Upstream commit e1cd4004cde7c9b694bbdd8def0e02288ee58c74 ]
    
    If an error occurs after a successful usb_alloc_urb() call, usb_free_urb()
    should be called.
    
    Fixes: fb1a79a6b6e1 ("HID: sony: fix freeze when inserting ghlive ps3/wii dongles")
    Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
    Signed-off-by: Jiri Kosina <jkosina@suse.cz>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

HID: sony: remove duplicate NULL check before calling usb_free_urb() [+ + +]
Author: Jiri Kosina <jkosina@suse.cz>
Date:   Wed Oct 4 21:10:41 2023 +0200

    HID: sony: remove duplicate NULL check before calling usb_free_urb()
    
    [ Upstream commit b328dd02e19cb9d3b35de4322f5363516a20ac8c ]
    
    usb_free_urb() does the NULL check itself, so there is no need to duplicate
    it prior to calling.
    
    Reported-by: kernel test robot <lkp@intel.com>
    Fixes: e1cd4004cde7c9 ("HID: sony: Fix a potential memory leak in sony_probe()")
    Signed-off-by: Jiri Kosina <jkosina@suse.cz>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
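
The two HID sony commits above amount to a common cleanup idiom, sketched here in userspace C with malloc()/free() standing in for usb_alloc_urb()/usb_free_urb(). struct dev_ctx, ctx_probe() and the fail_late knob are hypothetical.

```c
#include <stdlib.h>

/* Hypothetical device context with two allocations made during probe. */
struct dev_ctx {
    void *urb;
    void *buf;
};

/* Returns 0 on success; on failure nothing allocated here stays behind. */
int ctx_probe(struct dev_ctx *c, int fail_late)
{
    c->urb = NULL;
    c->buf = NULL;

    c->urb = malloc(64);
    if (!c->urb)
        return -1;

    /* fail_late simulates an error after the first allocation succeeded. */
    c->buf = fail_late ? NULL : malloc(16);
    if (!c->buf)
        goto err;

    return 0;

err:
    free(c->urb);   /* no NULL check needed: free(NULL) is a no-op */
    c->urb = NULL;
    return -1;
}
```

As the second commit notes for usb_free_urb(), a release function that already tolerates NULL does not need a guard at the call site.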

 
hwmon: (nzxt-smart2) add another USB ID [+ + +]
Author: Aleksandr Mezin <mezin.alexander@gmail.com>
Date:   Sun Feb 19 12:59:19 2023 +0200

    hwmon: (nzxt-smart2) add another USB ID
    
    commit 4a148e9b1ee04e608263fa9536a96214d5561220 upstream.
    
    This seems to be a new revision of the device. RGB controls have changed,
    but this driver doesn't touch them anyway.
    
    Fan speed control is reported to be working with existing userspace
    (hidraw) software, so I assume it's compatible. Fan channel count is
    the same.
    
    Recently added (0x1e71, 0x2019) seems to be the same device.
    
    Discovered in liquidctl project:
    
    https://github.com/liquidctl/liquidctl/issues/541
    
    Signed-off-by: Aleksandr Mezin <mezin.alexander@gmail.com>
    Link: https://lore.kernel.org/r/20230219105924.333007-1-mezin.alexander@gmail.com
    Signed-off-by: Guenter Roeck <linux@roeck-us.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

hwmon: (nzxt-smart2) Add device id [+ + +]
Author: Herman Fries <baracoder@googlemail.com>
Date:   Wed Dec 14 20:46:28 2022 +0100

    hwmon: (nzxt-smart2) Add device id
    
    commit e247510e1baad04e9b7b8ed7190dbb00989387b9 upstream.
    
    Adding support for new device id
    1e71:2019 NZXT NZXT RGB & Fan Controller
    
    Signed-off-by: Herman Fries <baracoder@googlemail.com>
    Link: https://lore.kernel.org/r/20221214194627.135692-1-baracoder@googlemail.com
    Signed-off-by: Guenter Roeck <linux@roeck-us.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
i40e: fix the wrong PTP frequency calculation [+ + +]
Author: Yajun Deng <yajun.deng@linux.dev>
Date:   Tue Sep 26 15:10:59 2023 +0800

    i40e: fix the wrong PTP frequency calculation
    
    The new adjustment should be based on the base frequency, not the
    I40E_PTP_40GB_INCVAL in i40e_ptp_adjfine().
    
    This issue was introduced in commit 3626a690b717 ("i40e: use
    mul_u64_u64_div_u64 for PTP frequency calculation"). Before that commit,
    frequency was just the base I40E_PTP_40GB_INCVAL. After the commit,
    frequency is I40E_PTP_40GB_INCVAL times the ptp_adj_mult value, but
    the diff is then applied to the wrong value, and no multiplication
    is done afterwards.
    
    It was accidentally fixed in commit 1060707e3809 ("ptp: introduce helpers
    to adjust by scaled parts per million"). That commit uses
    adjust_by_scaled_ppm, which correctly performs the calculation and uses
    the base adjustment, so there's no error there. But it is a new feature
    and doesn't need to be backported to the stable releases.
    
    This issue affects both v6.0 and v6.1, and the v6.1 version is an LTS
    release. Therefore, the patch only needs to be applied to v6.1 stable.
    
    Fixes: 3626a690b717 ("i40e: use mul_u64_u64_div_u64 for PTP frequency calculation")
    Cc: <stable@vger.kernel.org> # 6.1
    Cc: Tony Nguyen <anthony.l.nguyen@intel.com>
    Signed-off-by: Yajun Deng <yajun.deng@linux.dev>
    Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
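
The arithmetic at issue can be sketched in userspace C. This is an illustration under stated assumptions, not the driver code: mul_div_u64() and adjust_increment() are hypothetical names, the 128-bit intermediate relies on the GCC/Clang unsigned __int128 extension, and scaled_ppm is taken to be parts per million with a 16-bit binary fraction.

```c
#include <stdint.h>

/* a * b / c using a 128-bit intermediate (GCC/Clang extension). */
uint64_t mul_div_u64(uint64_t a, uint64_t b, uint64_t c)
{
    return (uint64_t)(((unsigned __int128)a * b) / c);
}

/*
 * Compute the adjusted increment. The diff is derived from, and applied
 * to, the same scaled frequency (base_incval * mult), not the bare base.
 */
uint64_t adjust_increment(uint64_t base_incval, uint64_t mult,
                          int64_t scaled_ppm)
{
    uint64_t freq = base_incval * mult;
    int neg = scaled_ppm < 0;
    /* Magnitude computed in unsigned arithmetic to avoid signed overflow. */
    uint64_t mag = neg ? (uint64_t)0 - (uint64_t)scaled_ppm
                       : (uint64_t)scaled_ppm;
    uint64_t diff = mul_div_u64(freq, mag, 1000000ULL << 16);

    return neg ? freq - diff : freq + diff;
}
```

With a base of 1000000, a multiplier of 2 and scaled_ppm of 65536 (1 ppm), the scaled frequency is 2000000, the diff is 2, and the result is 2000002. Applying that diff to the unscaled base instead would give 1000002, which is the kind of mismatch the commit describes.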

 
IB/mlx4: Fix the size of a buffer in add_port_entries() [+ + +]
Author: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Date:   Sat Sep 23 07:55:56 2023 +0200

    IB/mlx4: Fix the size of a buffer in add_port_entries()
    
    commit d7f393430a17c2bfcdf805462a5aa80be4285b27 upstream.
    
    In order to be sure that 'buff' is never truncated, its size should be
    12, not 11.
    
    When building with W=1, this fixes the following warnings:
    
      drivers/infiniband/hw/mlx4/sysfs.c: In function ‘add_port_entries’:
      drivers/infiniband/hw/mlx4/sysfs.c:268:34: error: ‘sprintf’ may write a terminating nul past the end of the destination [-Werror=format-overflow=]
        268 |                 sprintf(buff, "%d", i);
            |                                  ^
      drivers/infiniband/hw/mlx4/sysfs.c:268:17: note: ‘sprintf’ output between 2 and 12 bytes into a destination of size 11
        268 |                 sprintf(buff, "%d", i);
            |                 ^~~~~~~~~~~~~~~~~~~~~~
      drivers/infiniband/hw/mlx4/sysfs.c:286:34: error: ‘sprintf’ may write a terminating nul past the end of the destination [-Werror=format-overflow=]
        286 |                 sprintf(buff, "%d", i);
            |                                  ^
      drivers/infiniband/hw/mlx4/sysfs.c:286:17: note: ‘sprintf’ output between 2 and 12 bytes into a destination of size 11
        286 |                 sprintf(buff, "%d", i);
            |                 ^~~~~~~~~~~~~~~~~~~~~~
    
    Fixes: c1e7e466120b ("IB/mlx4: Add iov directory in sysfs under the ib device")
    Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
    Link: https://lore.kernel.org/r/0bb1443eb47308bc9be30232cc23004c4d4cf43e.1695448530.git.christophe.jaillet@wanadoo.fr
    Signed-off-by: Leon Romanovsky <leon@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
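
The figure of 12 bytes can be checked with a short userspace sketch, assuming a 32-bit int: the longest "%d" output is "-2147483648" (11 characters), plus the terminating NUL. The function names are hypothetical.

```c
#include <limits.h>
#include <stdio.h>

/* Bytes needed to hold v formatted with "%d", including the NUL. */
int decimal_buf_size(int v)
{
    /* snprintf() with a NULL buffer and size 0 only reports the length. */
    return snprintf(NULL, 0, "%d", v) + 1;
}

/* Worst case over all int values: the most negative number. */
int worst_case_decimal_buf_size(void)
{
    return decimal_buf_size(INT_MIN);
}
```

This matches the compiler note quoted in the commit, which reports output of between 2 and 12 bytes into a destination of size 11.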

 
ibmveth: Remove condition to recompute TCP header checksum. [+ + +]
Author: David Wilder <dwilder@us.ibm.com>
Date:   Tue Sep 26 16:42:51 2023 -0500

    ibmveth: Remove condition to recompute TCP header checksum.
    
    [ Upstream commit 51e7a66666e0ca9642c59464ef8359f0ac604d41 ]
    
    In some OVS environments the TCP pseudo header checksum may need to be
    recomputed. Currently this is only done when the interface instance is
    configured for "Trunk Mode". We found the issue also occurs in some
    Kubernetes environments. These environments do not use "Trunk Mode",
    therefore the condition is removed.
    
    Performance tests with this change show only a fractional decrease in
    throughput (< 0.2%).
    
    Fixes: 7525de2516fb ("ibmveth: Set CHECKSUM_PARTIAL if NULL TCP CSUM.")
    Signed-off-by: David Wilder <dwilder@us.ibm.com>
    Reviewed-by: Nick Child <nnac123@linux.ibm.com>
    Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
ima: Finish deprecation of IMA_TRUSTED_KEYRING Kconfig [+ + +]
Author: Oleksandr Tymoshenko <ovt@google.com>
Date:   Thu Sep 21 06:45:05 2023 +0000

    ima: Finish deprecation of IMA_TRUSTED_KEYRING Kconfig
    
    [ Upstream commit be210c6d3597faf330cb9af33b9f1591d7b2a983 ]
    
    The removal of IMA_TRUSTED_KEYRING made IMA_LOAD_X509
    and IMA_BLACKLIST_KEYRING unavailable because the latter
    two depend on the former. Since IMA_TRUSTED_KEYRING was
    deprecated in favor of INTEGRITY_TRUSTED_KEYRING, use it
    as a dependency for the two Kconfigs affected by the
    deprecation.
    
    Fixes: 5087fd9e80e5 ("ima: Remove deprecated IMA_TRUSTED_KEYRING Kconfig")
    Signed-off-by: Oleksandr Tymoshenko <ovt@google.com>
    Reviewed-by: Nayna Jain <nayna@linux.ibm.com>
    Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

ima: rework CONFIG_IMA dependency block [+ + +]
Author: Arnd Bergmann <arnd@arndb.de>
Date:   Wed Sep 27 09:22:14 2023 +0200

    ima: rework CONFIG_IMA dependency block
    
    [ Upstream commit 91e326563ee34509c35267808a4b1b3ea3db62a8 ]
    
    Changing the direct dependencies of IMA_BLACKLIST_KEYRING and
    IMA_LOAD_X509 caused them to no longer depend on IMA, but a
    configuration without IMA results in link failures:
    
    arm-linux-gnueabi-ld: security/integrity/iint.o: in function `integrity_load_keys':
    iint.c:(.init.text+0xd8): undefined reference to `ima_load_x509'
    
    aarch64-linux-ld: security/integrity/digsig_asymmetric.o: in function `asymmetric_verify':
    digsig_asymmetric.c:(.text+0x104): undefined reference to `ima_blacklist_keyring'
    
    Adding explicit dependencies on IMA would fix this, but a more reliable
    way to do this is to enclose the entire Kconfig file in an 'if IMA' block.
    This also allows removing the existing direct dependencies.
    
    Fixes: be210c6d3597f ("ima: Finish deprecation of IMA_TRUSTED_KEYRING Kconfig")
    Signed-off-by: Arnd Bergmann <arnd@arndb.de>
    Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
intel_idle: add Emerald Rapids Xeon support [+ + +]
Author: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Date:   Fri Jan 20 11:15:28 2023 +0200

    intel_idle: add Emerald Rapids Xeon support
    
    [ Upstream commit 74528edfbc664f9d2c927c4e5a44f1285598ed0f ]
    
    Emerald Rapids (EMR) is the next Intel Xeon processor after Sapphire
    Rapids (SPR).
    
    EMR C-states are the same as SPR C-states, and we expect that EMR
    C-state characteristics (latency and target residency) will be the
    same as in SPR. Therefore, add EMR support by using SPR C-states table.
    
    Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
iommu/arm-smmu-v3: Avoid constructing invalid range commands [+ + +]
Author: Robin Murphy <robin.murphy@arm.com>
Date:   Mon Sep 11 12:57:04 2023 +0100

    iommu/arm-smmu-v3: Avoid constructing invalid range commands
    
    [ Upstream commit eb6c97647be227822c7ce23655482b05e348fba5 ]
    
    Although io-pgtable's non-leaf invalidations are always for full tables,
    I missed that SVA also uses non-leaf invalidations, while being at the
    mercy of whatever range the MMU notifier throws at it. This means it
    definitely wants the previous TTL fix as well, since it also doesn't
    know exactly which leaf level(s) may need invalidating, but it can also
    give us less-aligned ranges wherein certain corners may lead to building
    an invalid command where TTL, Num and Scale are all 0. It should be fine
    to handle this by over-invalidating an extra page, since falling back to
    a non-range command opens up a whole can of errata-flavoured worms.
    
    Fixes: 6833b8f2e199 ("iommu/arm-smmu-v3: Set TTL invalidation hint better")
    Reported-by: Rui Zhu <zhurui3@huawei.com>
    Signed-off-by: Robin Murphy <robin.murphy@arm.com>
    Link: https://lore.kernel.org/r/b99cfe71af2bd93a8a2930f20967fb2a4f7748dd.1694432734.git.robin.murphy@arm.com
    Signed-off-by: Will Deacon <will@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

iommu/arm-smmu-v3: Set TTL invalidation hint better [+ + +]
Author: Robin Murphy <robin.murphy@arm.com>
Date:   Thu Jun 1 17:43:33 2023 +0100

    iommu/arm-smmu-v3: Set TTL invalidation hint better
    
    [ Upstream commit 6833b8f2e19945a41e4d5efd8c6d9f4cae9a5b7d ]
    
    When io-pgtable unmaps a whole table, rather than waste time walking it
    to find the leaf entries to invalidate exactly, it simply expects
    .tlb_flush_walk with nominal last-level granularity to invalidate any
    leaf entries at higher intermediate levels as well. This works fine with
    page-based invalidation, but with range commands we need to be careful
    with the TTL hint - unconditionally setting it based on the given level
    3 granule means that an invalidation for a level 1 table would strictly
    not be required to affect level 2 block entries. It's easy to comply
    with the expected behaviour by simply not setting the TTL hint for
    non-leaf invalidations, so let's do that.
    
    Signed-off-by: Robin Murphy <robin.murphy@arm.com>
    Link: https://lore.kernel.org/r/b409d9a17c52dc0db51faee91d92737bb7975f5b.1685637456.git.robin.murphy@arm.com
    Signed-off-by: Will Deacon <will@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
iommu/mediatek: Fix share pgtable for iova over 4GB [+ + +]
Author: Yong Wu <yong.wu@mediatek.com>
Date:   Sat Aug 19 16:14:43 2023 +0800

    iommu/mediatek: Fix share pgtable for iova over 4GB
    
    [ Upstream commit b07eba71a512eb196cbcc29765c29c8c29b11b59 ]
    
    In mt8192/mt8186, there is only one MM IOMMU that supports 16GB iova
    space, which is shared by display, vcodec and camera. These two SoCs use
    one pgtable but do not have the SHARE_PGTABLE flag; we should also keep
    sharing the pgtable in this case.
    
    In mtk_iommu_domain_finalise, the MM IOMMU always shares the pgtable, so
    remove the SHARE_PGTABLE flag check. The Infra IOMMU always uses an
    independent pgtable.
    
    Fixes: cf69ef46dbd9 ("iommu/mediatek: Fix two IOMMU share pagetable issue")
    Reported-by: Laura Nao <laura.nao@collabora.com>
    Closes: https://lore.kernel.org/linux-iommu/20230818154156.314742-1-laura.nao@collabora.com/
    Signed-off-by: Yong Wu <yong.wu@mediatek.com>
    Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
    Tested-by: Laura Nao <laura.nao@collabora.com>
    Link: https://lore.kernel.org/r/20230819081443.8333-1-yong.wu@mediatek.com
    Signed-off-by: Joerg Roedel <jroedel@suse.de>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
iommu/vt-d: Avoid memory allocation in iommu_suspend() [+ + +]
Author: Zhang Rui <rui.zhang@intel.com>
Date:   Mon Sep 25 20:04:17 2023 +0800

    iommu/vt-d: Avoid memory allocation in iommu_suspend()
    
    commit 59df44bfb0ca4c3ee1f1c3c5d0ee8e314844799e upstream.
    
    The iommu_suspend() syscore suspend callback is invoked with IRQs disabled.
    Allocating memory with the GFP_KERNEL flag may re-enable IRQs during
    the suspend callback, which can cause intermittent suspend/hibernation
    problems with the following kernel traces:
    
    Calling iommu_suspend+0x0/0x1d0
    ------------[ cut here ]------------
    WARNING: CPU: 0 PID: 15 at kernel/time/timekeeping.c:868 ktime_get+0x9b/0xb0
    ...
    CPU: 0 PID: 15 Comm: rcu_preempt Tainted: G     U      E      6.3-intel #r1
    RIP: 0010:ktime_get+0x9b/0xb0
    ...
    Call Trace:
     <IRQ>
     tick_sched_timer+0x22/0x90
     ? __pfx_tick_sched_timer+0x10/0x10
     __hrtimer_run_queues+0x111/0x2b0
     hrtimer_interrupt+0xfa/0x230
     __sysvec_apic_timer_interrupt+0x63/0x140
     sysvec_apic_timer_interrupt+0x7b/0xa0
     </IRQ>
     <TASK>
     asm_sysvec_apic_timer_interrupt+0x1f/0x30
    ...
    ------------[ cut here ]------------
    Interrupts enabled after iommu_suspend+0x0/0x1d0
    WARNING: CPU: 0 PID: 27420 at drivers/base/syscore.c:68 syscore_suspend+0x147/0x270
    CPU: 0 PID: 27420 Comm: rtcwake Tainted: G     U  W   E      6.3-intel #r1
    RIP: 0010:syscore_suspend+0x147/0x270
    ...
    Call Trace:
     <TASK>
     hibernation_snapshot+0x25b/0x670
     hibernate+0xcd/0x390
     state_store+0xcf/0xe0
     kobj_attr_store+0x13/0x30
     sysfs_kf_write+0x3f/0x50
     kernfs_fop_write_iter+0x128/0x200
     vfs_write+0x1fd/0x3c0
     ksys_write+0x6f/0xf0
     __x64_sys_write+0x1d/0x30
     do_syscall_64+0x3b/0x90
     entry_SYSCALL_64_after_hwframe+0x72/0xdc
    
    Given that only 4 words of memory are needed, avoid the memory
    allocation in iommu_suspend().
    
    CC: stable@kernel.org
    Fixes: 33e07157105e ("iommu/vt-d: Avoid GFP_ATOMIC where it is not needed")
    Signed-off-by: Zhang Rui <rui.zhang@intel.com>
    Tested-by: Ooi, Chin Hao <chin.hao.ooi@intel.com>
    Link: https://lore.kernel.org/r/20230921093956.234692-1-rui.zhang@intel.com
    Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
    Link: https://lore.kernel.org/r/20230925120417.55977-2-baolu.lu@linux.intel.com
    Signed-off-by: Joerg Roedel <jroedel@suse.de>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
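
One way to avoid allocating in a suspend callback is to embed the small save area in the per-device structure, as in this userspace sketch. The names (SAVED_REGS, struct unit, unit_suspend(), unit_resume()) are hypothetical and the register array merely stands in for hardware state; only the size of 4 words comes from the commit message.

```c
#include <stdint.h>

#define SAVED_REGS 4   /* hypothetical name; the commit says 4 words suffice */

/* Hypothetical per-device state with the save area embedded in it. */
struct unit {
    uint32_t regs[SAVED_REGS];    /* stand-in for live hardware registers */
    uint32_t saved[SAVED_REGS];   /* fixed storage: nothing to allocate */
};

/* Save the registers; safe to call with interrupts off, no allocator use. */
void unit_suspend(struct unit *u)
{
    for (int i = 0; i < SAVED_REGS; i++)
        u->saved[i] = u->regs[i];
}

/* Restore the registers from the embedded save area. */
void unit_resume(struct unit *u)
{
    for (int i = 0; i < SAVED_REGS; i++)
        u->regs[i] = u->saved[i];
}
```

The trade-off is a few extra words per device for the lifetime of the structure, in exchange for a suspend path that can neither sleep nor fail on allocation.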

 
ipv4, ipv6: Fix handling of transhdrlen in __ip{,6}_append_data() [+ + +]
Author: David Howells <dhowells@redhat.com>
Date:   Thu Sep 21 11:41:19 2023 +0100

    ipv4, ipv6: Fix handling of transhdrlen in __ip{,6}_append_data()
    
    [ Upstream commit 9d4c75800f61e5d75c1659ba201b6c0c7ead3070 ]
    
    Including the transhdrlen in length is a problem when the packet is
    partially filled (e.g. something like send(MSG_MORE) happened previously)
    when appending to an IPv4 or IPv6 packet as we don't want to repeat the
    transport header or account for it twice.  This can happen under some
    circumstances, such as splicing into an L2TP socket.
    
    The symptom observed is a warning in __ip6_append_data():
    
        WARNING: CPU: 1 PID: 5042 at net/ipv6/ip6_output.c:1800 __ip6_append_data.isra.0+0x1be8/0x47f0 net/ipv6/ip6_output.c:1800
    
    that occurs when MSG_SPLICE_PAGES is used to append more data to an already
    partially occupied skbuff.  The warning occurs when 'copy' is larger than
    the amount of data in the message iterator.  This is because the requested
    length includes the transport header length when it shouldn't.  This can be
    triggered by, for example:
    
            sfd = socket(AF_INET6, SOCK_DGRAM, IPPROTO_L2TP);
            bind(sfd, ...); // ::1
            connect(sfd, ...); // ::1 port 7
            send(sfd, buffer, 4100, MSG_MORE);
            sendfile(sfd, dfd, NULL, 1024);
    
    Fix this by only adding transhdrlen into the length if the write queue is
    empty in l2tp_ip6_sendmsg(), analogously to how UDP does things.
    
    l2tp_ip_sendmsg() looks like it won't suffer from this problem as it builds
    the UDP packet itself.
    
    Fixes: a32e0eec7042 ("l2tp: introduce L2TPv3 IP encapsulation support for IPv6")
    Reported-by: syzbot+62cbf263225ae13ff153@syzkaller.appspotmail.com
    Link: https://lore.kernel.org/r/0000000000001c12b30605378ce8@google.com/
    Suggested-by: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
    Signed-off-by: David Howells <dhowells@redhat.com>
    cc: Eric Dumazet <edumazet@google.com>
    cc: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
    cc: "David S. Miller" <davem@davemloft.net>
    cc: David Ahern <dsahern@kernel.org>
    cc: Paolo Abeni <pabeni@redhat.com>
    cc: Jakub Kicinski <kuba@kernel.org>
    cc: netdev@vger.kernel.org
    cc: bpf@vger.kernel.org
    cc: syzkaller-bugs@googlegroups.com
    Reviewed-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
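
The rule stated in the fix, count the transport header only when the write queue is empty, reduces to the following sketch. append_length() and its parameters are hypothetical names; the real logic lives in l2tp_ip6_sendmsg() and is more involved.

```c
#include <stddef.h>

/*
 * Length to request when appending data to a pending packet. The transport
 * header is counted only for the first chunk, i.e. when nothing has been
 * queued yet; later chunks add payload only.
 */
size_t append_length(size_t payload_len, size_t transhdrlen,
                     int write_queue_empty)
{
    return payload_len + (write_queue_empty ? transhdrlen : 0);
}
```

In the reproducer from the commit message, the first send() with MSG_MORE starts the packet and includes the header, while the following sendfile() of 1024 bytes appends to a non-empty queue and should request exactly 1024 bytes.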

 
ipv4: Set offload_failed flag in fibmatch results [+ + +]
Author: Benjamin Poirier <bpoirier@nvidia.com>
Date:   Tue Sep 26 14:27:30 2023 -0400

    ipv4: Set offload_failed flag in fibmatch results
    
    [ Upstream commit 0add5c597f3253a9c6108a0a81d57f44ab0d9d30 ]
    
    Due to a small omission, the offload_failed flag is missing from ipv4
    fibmatch results. Make sure it is set correctly.
    
    The issue can be witnessed using the following commands:
    echo "1 1" > /sys/bus/netdevsim/new_device
    ip link add dummy1 up type dummy
    ip route add 192.0.2.0/24 dev dummy1
    echo 1 > /sys/kernel/debug/netdevsim/netdevsim1/fib/fail_route_offload
    ip route add 198.51.100.0/24 dev dummy1
    ip route
            # 192.0.2.0/24 has rt_trap
            # 198.51.100.0/24 has rt_offload_failed
    ip route get 192.0.2.1 fibmatch
            # Result has rt_trap
    ip route get 198.51.100.1 fibmatch
            # Result differs from the route shown by `ip route`, it is missing
            # rt_offload_failed
    ip link del dev dummy1
    echo 1 > /sys/bus/netdevsim/del_device
    
    Fixes: 36c5100e859d ("IPv4: Add "offload failed" indication to routes")
    Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com>
    Reviewed-by: Ido Schimmel <idosch@nvidia.com>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Reviewed-by: David Ahern <dsahern@kernel.org>
    Link: https://lore.kernel.org/r/20230926182730.231208-1-bpoirier@nvidia.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
ipv6: remove nexthop_fib6_nh_bh() [+ + +]
Author: Eric Dumazet <edumazet@google.com>
Date:   Wed May 10 15:46:46 2023 +0000

    ipv6: remove nexthop_fib6_nh_bh()
    
    commit ef1148d4487438a3408d6face2a8360d91b4af70 upstream.
    
    After the blamed commit, nexthop_fib6_nh_bh() and nexthop_fib6_nh()
    are the same.
    
    Delete nexthop_fib6_nh_bh(), and convert /proc/net/ipv6_route
    to standard rcu to avoid this splat:
    
    [ 5723.180080] WARNING: suspicious RCU usage
    [ 5723.180083] -----------------------------
    [ 5723.180084] include/net/nexthop.h:516 suspicious rcu_dereference_check() usage!
    [ 5723.180086]
    other info that might help us debug this:
    
    [ 5723.180087]
    rcu_scheduler_active = 2, debug_locks = 1
    [ 5723.180089] 2 locks held by cat/55856:
    [ 5723.180091] #0: ffff9440a582afa8 (&p->lock){+.+.}-{3:3}, at: seq_read_iter (fs/seq_file.c:188)
    [ 5723.180100] #1: ffffffffaac07040 (rcu_read_lock_bh){....}-{1:2}, at: rcu_lock_acquire (include/linux/rcupdate.h:326)
    [ 5723.180109]
    stack backtrace:
    [ 5723.180111] CPU: 14 PID: 55856 Comm: cat Tainted: G S        I        6.3.0-dbx-DEV #528
    [ 5723.180115] Call Trace:
    [ 5723.180117]  <TASK>
    [ 5723.180119] dump_stack_lvl (lib/dump_stack.c:107)
    [ 5723.180124] dump_stack (lib/dump_stack.c:114)
    [ 5723.180126] lockdep_rcu_suspicious (include/linux/context_tracking.h:122)
    [ 5723.180132] ipv6_route_seq_show (include/net/nexthop.h:?)
    [ 5723.180135] ? ipv6_route_seq_next (net/ipv6/ip6_fib.c:2605)
    [ 5723.180140] seq_read_iter (fs/seq_file.c:272)
    [ 5723.180145] seq_read (fs/seq_file.c:163)
    [ 5723.180151] proc_reg_read (fs/proc/inode.c:316 fs/proc/inode.c:328)
    [ 5723.180155] vfs_read (fs/read_write.c:468)
    [ 5723.180160] ? up_read (kernel/locking/rwsem.c:1617)
    [ 5723.180164] ksys_read (fs/read_write.c:613)
    [ 5723.180168] __x64_sys_read (fs/read_write.c:621)
    [ 5723.180170] do_syscall_64 (arch/x86/entry/common.c:?)
    [ 5723.180174] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:120)
    [ 5723.180177] RIP: 0033:0x7fa455677d2a
    
    Fixes: 09eed1192cec ("neighbour: switch to standard rcu, instead of rcu_bh")
    Reported-by: syzbot <syzkaller@googlegroups.com>
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Reviewed-by: David Ahern <dsahern@kernel.org>
    Link: https://lore.kernel.org/r/20230510154646.370659-1-edumazet@google.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ipv6: remove one read_lock()/read_unlock() pair in rt6_check_neigh() [+ + +]
Author: Eric Dumazet <edumazet@google.com>
Date:   Mon Mar 13 20:17:32 2023 +0000

    ipv6: remove one read_lock()/read_unlock() pair in rt6_check_neigh()
    
    commit c486640aa710ddd06c13a7f7162126e1552e8842 upstream.
    
    rt6_check_neigh() uses read_lock() to protect n->nud_state reading.
    
    This seems overkill and causes false sharing.
    
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Reviewed-by: David Ahern <dsahern@kernel.org>
    Reviewed-by: Martin KaFai Lau <martin.lau@kernel.org>
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ipv6: tcp: add a missing nf_reset_ct() in 3WHS handling [+ + +]
Author: Ilya Maximets <i.maximets@ovn.org>
Date:   Fri Sep 22 23:04:58 2023 +0200

    ipv6: tcp: add a missing nf_reset_ct() in 3WHS handling
    
    [ Upstream commit 9593c7cb6cf670ef724d17f7f9affd7a8d2ad0c5 ]
    
    Commit b0e214d21203 ("netfilter: keep conntrack reference until
    IPsecv6 policy checks are done") is a direct copy of the old
    commit b59c270104f0 ("[NETFILTER]: Keep conntrack reference until
    IPsec policy checks are done") but for IPv6.  However, it also
    copies a bug that this old commit had.  That is: when the third
    packet of 3WHS connection establishment contains payload, it is
    added into socket receive queue without the XFRM check and the
    drop of connection tracking context.
    
    That leads to nf_conntrack module being impossible to unload as
    it waits for all the conntrack references to be dropped while
    the packet release is deferred in per-cpu cache indefinitely, if
    not consumed by the application.
    
    The issue for IPv4 was fixed in commit 6f0012e35160 ("tcp: add a
    missing nf_reset_ct() in 3WHS handling") by adding a missing XFRM
    check and correctly dropping the conntrack context.  However, the
    issue was introduced to IPv6 code afterwards.  Fixing it the
    same way for IPv6 now.
    
    Fixes: b0e214d21203 ("netfilter: keep conntrack reference until IPsecv6 policy checks are done")
    Link: https://lore.kernel.org/netdev/d589a999-d4dd-2768-b2d5-89dec64a4a42@ovn.org/
    Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
    Acked-by: Florian Westphal <fw@strlen.de>
    Reviewed-by: Eric Dumazet <edumazet@google.com>
    Link: https://lore.kernel.org/r/20230922210530.2045146-1-i.maximets@ovn.org
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
ksmbd: fix race condition between session lookup and expire [+ + +]
Author: Namjae Jeon <linkinjeon@kernel.org>
Date:   Wed Oct 4 18:25:01 2023 +0900

    ksmbd: fix race condition between session lookup and expire
    
    commit 53ff5cf89142b978b1a5ca8dc4d4425e6a09745f upstream.
    
     Thread A                        +  Thread B
     ksmbd_session_lookup            |  smb2_sess_setup
       sess = xa_load                |
                                     |
                                     |    xa_erase(&conn->sessions, sess->id);
                                     |
                                     |    ksmbd_session_destroy(sess) --> kfree(sess)
                                     |
       // UAF!                       |
       sess->last_active = jiffies   |
                                     +
    
    This patch adds an rwsem to fix the race condition between
    ksmbd_session_lookup and ksmbd_expire_session.
    
    Reported-by: luosili <rootlab@huawei.com>
    Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
    Signed-off-by: Steve French <stfrench@microsoft.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ksmbd: fix uaf in smb20_oplock_break_ack [+ + +]
Author: luosili <rootlab@huawei.com>
Date:   Wed Oct 4 18:29:36 2023 +0900

    ksmbd: fix uaf in smb20_oplock_break_ack
    
    commit c69813471a1ec081a0b9bf0c6bd7e8afd818afce upstream.
    
    Drop the reference only after the last use of opinfo.
    
    Signed-off-by: luosili <rootlab@huawei.com>
    Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
    Signed-off-by: Steve French <stfrench@microsoft.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
leds: Drop BUG_ON check for LED_COLOR_ID_MULTI [+ + +]
Author: Marek Behún <kabel@kernel.org>
Date:   Mon Sep 18 16:07:24 2023 +0200

    leds: Drop BUG_ON check for LED_COLOR_ID_MULTI
    
    [ Upstream commit 9dc1664fab2246bc2c3e9bf2cf21518a857f9b5b ]
    
    Commit c3f853184bed ("leds: Fix BUG_ON check for LED_COLOR_ID_MULTI that
    is always false") fixed a no-op BUG_ON. This turned out to cause a
    regression, since some in-tree device-tree files already use
    LED_COLOR_ID_MULTI.
    
    Drop the BUG_ON altogether.
    
    Fixes: c3f853184bed ("leds: Fix BUG_ON check for LED_COLOR_ID_MULTI that is always false")
    Reported-by: Da Xue <da@libre.computer>
    Closes: https://lore.kernel.org/linux-leds/ZQLelWcNjjp2xndY@duo.ucw.cz/T/
    Signed-off-by: Marek Behún <kabel@kernel.org>
    Link: https://lore.kernel.org/r/20230918140724.18634-1-kabel@kernel.org
    Signed-off-by: Lee Jones <lee@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
Linux: Linux 6.1.57 [+ + +]
Author: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Date:   Tue Oct 10 22:00:46 2023 +0200

    Linux 6.1.57
    
    Link: https://lore.kernel.org/r/20231009130122.946357448@linuxfoundation.org
    Tested-by: SeongJae Park <sj@kernel.org>
    Tested-by: Shuah Khan <skhan@linuxfoundation.org>
    Tested-by: Florian Fainelli <florian.fainelli@broadcom.com>
    Tested-by: Bagas Sanjaya <bagasdotme@gmail.com>
    Tested-by: Jon Hunter <jonathanh@nvidia.com>
    Tested-by: Takeshi Ogasawara <takeshi.ogasawara@futuring-girl.com>
    Tested-by: Guenter Roeck <linux@roeck-us.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
md/raid5: release batch_last before waiting for another stripe_head [+ + +]
Author: David Jeffery <djeffery@redhat.com>
Date:   Mon Oct 2 14:32:29 2023 -0400

    md/raid5: release batch_last before waiting for another stripe_head
    
    commit 2fd7b0f6d5ad655b1d947d3acdd82f687c31465e upstream.
    
    When raid5_get_active_stripe is called with a ctx containing a stripe_head in
    its batch_last pointer, it can cause a deadlock if the task sleeps waiting on
    another stripe_head to become available. The stripe_head held by batch_last
    can be blocking the advancement of other stripe_heads, leading to no
    stripe_heads being released so raid5_get_active_stripe waits forever.
    
    Like with the quiesce state handling earlier in the function, batch_last
    needs to be released by raid5_get_active_stripe before it waits for another
    stripe_head.
    
    Fixes: 3312e6c887fe ("md/raid5: Keep a reference to last stripe_head for batch")
    Cc: stable@vger.kernel.org # v6.0+
    Signed-off-by: David Jeffery <djeffery@redhat.com>
    Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
    Signed-off-by: Song Liu <song@kernel.org>
    Link: https://lore.kernel.org/r/20231002183422.13047-1-djeffery@redhat.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
mm/memory: add vm_normal_folio() [+ + +]
Author: Vishal Moola (Oracle) <vishal.moola@gmail.com>
Date:   Wed Dec 21 10:08:45 2022 -0800

    mm/memory: add vm_normal_folio()
    
    [ Upstream commit 318e9342fbbb6888d903d86e83865609901a1c65 ]
    
    Patch series "Convert deactivate_page() to folio_deactivate()", v4.
    
    deactivate_page() has already been converted to use folios.  This patch
    series modifies the callers of deactivate_page() to use folios.  It also
    introduces vm_normal_folio() to assist with folio conversions, and
    converts deactivate_page() to folio_deactivate() which takes in a folio.
    
    This patch (of 4):
    
    Introduce a wrapper function called vm_normal_folio().  This function
    calls vm_normal_page() and returns the folio of the page found, or null if
    no page is found.
    
    This function allows callers to get a folio from a pte, which will
    eventually allow them to completely replace their struct page variables
    with struct folio instead.
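
    A minimal userspace sketch of the wrapper pattern described above.  All
    types and names here (sketch_page, sketch_folio, sketch_normal_page())
    are simplified stand-ins invented for illustration and do not match the
    kernel definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel types; all names are hypothetical. */
struct sketch_folio { int id; };
struct sketch_page  { struct sketch_folio *folio; };

/* Stand-in for vm_normal_page(): yields NULL when no normal page exists. */
struct sketch_page *sketch_normal_page(struct sketch_page *mapped, int special)
{
    return special ? NULL : mapped;
}

/* Wrapper in the spirit of vm_normal_folio(): folio of the page, or NULL. */
struct sketch_folio *sketch_normal_folio(struct sketch_page *mapped, int special)
{
    struct sketch_page *page = sketch_normal_page(mapped, special);

    return page ? page->folio : NULL;
}
```

    Callers then need only a single NULL check on the returned folio and no
    longer keep a struct page variable around.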
    
    Link: https://lkml.kernel.org/r/20221221180848.20774-1-vishal.moola@gmail.com
    Link: https://lkml.kernel.org/r/20221221180848.20774-2-vishal.moola@gmail.com
    Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com>
    Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
    Cc: SeongJae Park <sj@kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Stable-dep-of: 24526268f4e3 ("mm: mempolicy: keep VMA walk if both MPOL_MF_STRICT and MPOL_MF_MOVE are specified")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
mm/mempolicy: convert migrate_page_add() to migrate_folio_add() [+ + +]
Author: Vishal Moola (Oracle) <vishal.moola@gmail.com>
Date:   Mon Jan 30 12:18:33 2023 -0800

    mm/mempolicy: convert migrate_page_add() to migrate_folio_add()
    
    [ Upstream commit 4a64981dfee9119aa2c1f243b48f34cbbd67779c ]
    
    Replace migrate_page_add() with migrate_folio_add().  migrate_folio_add()
    does the same as migrate_page_add() but takes in a folio instead of a page.
    This removes a couple of calls to compound_head().
    
    Link: https://lkml.kernel.org/r/20230130201833.27042-7-vishal.moola@gmail.com
    Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com>
    Reviewed-by: Yin Fengwei <fengwei.yin@intel.com>
    Cc: David Hildenbrand <david@redhat.com>
    Cc: Jane Chu <jane.chu@oracle.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Stable-dep-of: 24526268f4e3 ("mm: mempolicy: keep VMA walk if both MPOL_MF_STRICT and MPOL_MF_MOVE are specified")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

mm/mempolicy: convert queue_pages_pmd() to queue_folios_pmd() [+ + +]
Author: Vishal Moola (Oracle) <vishal.moola@gmail.com>
Date:   Mon Jan 30 12:18:29 2023 -0800

    mm/mempolicy: convert queue_pages_pmd() to queue_folios_pmd()
    
    [ Upstream commit de1f5055523e9a035b38533f25a56df03d45034a ]
    
    The function now operates on a folio instead of the page associated with a
    pmd.
    
    This change is in preparation for the conversion of queue_pages_required()
    to queue_folio_required() and migrate_page_add() to migrate_folio_add().
    
    Link: https://lkml.kernel.org/r/20230130201833.27042-3-vishal.moola@gmail.com
    Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com>
    Cc: David Hildenbrand <david@redhat.com>
    Cc: Jane Chu <jane.chu@oracle.com>
    Cc: "Yin, Fengwei" <fengwei.yin@intel.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Stable-dep-of: 24526268f4e3 ("mm: mempolicy: keep VMA walk if both MPOL_MF_STRICT and MPOL_MF_MOVE are specified")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

mm/mempolicy: convert queue_pages_pte_range() to queue_folios_pte_range() [+ + +]
Author: Vishal Moola (Oracle) <vishal.moola@gmail.com>
Date:   Mon Jan 30 12:18:30 2023 -0800

    mm/mempolicy: convert queue_pages_pte_range() to queue_folios_pte_range()
    
    [ Upstream commit 3dae02bbd07f40e37bbfec2d77119628db461eaa ]
    
    This function now operates on folios associated with ptes instead of
    pages.
    
    This change is in preparation for the conversion of queue_pages_required()
    to queue_folio_required() and migrate_page_add() to migrate_folio_add().
    
    Link: https://lkml.kernel.org/r/20230130201833.27042-4-vishal.moola@gmail.com
    Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com>
    Cc: David Hildenbrand <david@redhat.com>
    Cc: Jane Chu <jane.chu@oracle.com>
    Cc: "Yin, Fengwei" <fengwei.yin@intel.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Stable-dep-of: 24526268f4e3 ("mm: mempolicy: keep VMA walk if both MPOL_MF_STRICT and MPOL_MF_MOVE are specified")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
mm/page_alloc: always remove pages from temporary list [+ + +]
Author: Mel Gorman <mgorman@techsingularity.net>
Date:   Fri Nov 18 10:17:13 2022 +0000

    mm/page_alloc: always remove pages from temporary list
    
    [ Upstream commit c3e58a70425ac6ddaae1529c8146e88b4f7252bb ]
    
    Patch series "Leave IRQs enabled for per-cpu page allocations", v3.
    
    This patch (of 2):
    
    free_unref_page_list() has neglected to remove pages properly from the
    list of pages to free since forever.  It works by coincidence because
    list_add happened to do the right thing adding the pages to just the PCP
    lists.  However, a later patch added pages to either the PCP list or the
    zone list but only properly deleted the page from the list in one path
    leading to list corruption and a subsequent failure.  As a preparation
    patch, always delete the pages from one list properly before adding to
    another.  On its own, this fixes nothing although it adds a fractional
    amount of overhead but is critical to the next patch.
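
    The "delete from one list before adding to another" rule can be sketched
    with a tiny circular doubly linked list.  This is an illustrative
    userspace model with hypothetical sketch_* names, not the kernel's
    list.h or the actual free_unref_page_list() code.

```c
#include <assert.h>

struct sketch_list { struct sketch_list *prev, *next; };

void sketch_list_init(struct sketch_list *h) { h->prev = h->next = h; }

/* Insert n right after head h. */
void sketch_list_add(struct sketch_list *n, struct sketch_list *h)
{
    n->next = h->next;
    n->prev = h;
    h->next->prev = n;
    h->next = n;
}

/* Unlink n from whatever list it is on and make it self-referential. */
void sketch_list_del(struct sketch_list *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
    n->prev = n->next = n;
}

int sketch_list_len(const struct sketch_list *h)
{
    int len = 0;

    for (const struct sketch_list *p = h->next; p != h; p = p->next)
        len++;
    return len;
}

/* Move every entry from the temporary list to dst, unlinking each first. */
void sketch_move_all(struct sketch_list *tmp, struct sketch_list *dst)
{
    while (tmp->next != tmp) {
        struct sketch_list *n = tmp->next;

        sketch_list_del(n);        /* keep tmp consistent ... */
        sketch_list_add(n, dst);   /* ... before linking into dst */
    }
}
```

    Because each entry is unlinked first, the temporary list stays valid no
    matter which destination list an entry ends up on.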
    
    Link: https://lkml.kernel.org/r/20221118101714.19590-1-mgorman@techsingularity.net
    Link: https://lkml.kernel.org/r/20221118101714.19590-2-mgorman@techsingularity.net
    Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
    Reported-by: Hugh Dickins <hughd@google.com>
    Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
    Cc: Marcelo Tosatti <mtosatti@redhat.com>
    Cc: Marek Szyprowski <m.szyprowski@samsung.com>
    Cc: Michal Hocko <mhocko@kernel.org>
    Cc: Yu Zhao <yuzhao@google.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Stable-dep-of: 7b086755fb8c ("mm: page_alloc: fix CMA and HIGHATOMIC landing on the wrong buddy list")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

mm/page_alloc: leave IRQs enabled for per-cpu page allocations [+ + +]
Author: Mel Gorman <mgorman@techsingularity.net>
Date:   Fri Nov 18 10:17:14 2022 +0000

    mm/page_alloc: leave IRQs enabled for per-cpu page allocations
    
    [ Upstream commit 5749077415994eb02d660b2559b9d8278521e73d ]
    
    The pcp_spin_lock_irqsave protecting the PCP lists is IRQ-safe as a task
    allocating from the PCP must not re-enter the allocator from IRQ context.
    In each instance where IRQ-reentrancy is possible, the lock is acquired
    using pcp_spin_trylock_irqsave() even though IRQs are disabled and
    re-entrancy is impossible.
    
    Demoting the lock to pcp_spin_lock avoids an IRQ disable/enable in the
    common case at the cost of some IRQ allocations taking a slower path.  If
    the PCP lists need to be refilled, the zone lock still needs to disable
    IRQs but that will only happen on PCP refill and drain.  If an IRQ is
    raised when a PCP allocation is in progress, the trylock will fail and
    fall back to using the buddy lists directly.  Note that this may not be a
    universal win if an interrupt-intensive workload also allocates heavily
    from interrupt context and contends heavily on the zone->lock as a result.
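
    The trylock-with-fallback idea can be modelled in userspace with a C11
    atomic_flag.  This is only a sketch under stated assumptions: the
    sketch_* names are hypothetical, and the real allocator uses per-cpu
    spinlocks rather than an atomic flag.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

enum sketch_path { SKETCH_PATH_PCP, SKETCH_PATH_BUDDY };

struct sketch_pcp { atomic_flag lock; };

bool sketch_pcp_trylock(struct sketch_pcp *pcp)
{
    /* test_and_set returns the previous value: false means we got it. */
    return !atomic_flag_test_and_set_explicit(&pcp->lock,
                                              memory_order_acquire);
}

void sketch_pcp_unlock(struct sketch_pcp *pcp)
{
    atomic_flag_clear_explicit(&pcp->lock, memory_order_release);
}

/* Allocation attempt: never spins, falls back when the lock is busy. */
enum sketch_path sketch_alloc(struct sketch_pcp *pcp)
{
    if (!sketch_pcp_trylock(pcp))
        return SKETCH_PATH_BUDDY;   /* lock busy: take the slower path */
    /* ... a page would be taken from the per-cpu list here ... */
    sketch_pcp_unlock(pcp);
    return SKETCH_PATH_PCP;
}
```

    An interrupt that arrives while the lock is held sees the trylock fail
    and takes the fallback path instead of deadlocking, which is why the
    lock holder does not need to disable IRQs.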
    
    [mgorman@techsingularity.net: migratetype might be wrong if a PCP was locked]
      Link: https://lkml.kernel.org/r/20221122131229.5263-2-mgorman@techsingularity.net
    [yuzhao@google.com: reported lockdep issue on IO completion from softirq]
    [hughd@google.com: fix list corruption, lock improvements, micro-optimisations]
    Link: https://lkml.kernel.org/r/20221118101714.19590-3-mgorman@techsingularity.net
    Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
    Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
    Cc: Marcelo Tosatti <mtosatti@redhat.com>
    Cc: Marek Szyprowski <m.szyprowski@samsung.com>
    Cc: Michal Hocko <mhocko@kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Stable-dep-of: 7b086755fb8c ("mm: page_alloc: fix CMA and HIGHATOMIC landing on the wrong buddy list")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
mm: mempolicy: keep VMA walk if both MPOL_MF_STRICT and MPOL_MF_MOVE are specified [+ + +]
Author: Yang Shi <yang@os.amperecomputing.com>
Date:   Wed Sep 20 15:32:42 2023 -0700

    mm: mempolicy: keep VMA walk if both MPOL_MF_STRICT and MPOL_MF_MOVE are specified
    
    [ Upstream commit 24526268f4e38c9ec0c4a30de4f37ad2a2a84e47 ]
    
    When calling mbind() with MPOL_MF_{MOVE|MOVEALL} | MPOL_MF_STRICT, the
    kernel should attempt to migrate all existing pages, and return -EIO if
    there is a misplaced or unmovable page.  Then commit 6f4576e3687b
    ("mempolicy: apply page table walker on queue_pages_range()") messed up
    the return value and no longer broke the VMA scan early when
    MPOL_MF_STRICT alone was specified.  The return value problem was fixed
    by commit a7f40cfe3b7a ("mm: mempolicy: make mbind() return -EIO when
    MPOL_MF_STRICT is specified"), but that fix breaks the VMA walk early
    when an unmovable page is met, which may leave some pages unmigrated.
    
    The code should conceptually do:
    
     if (MPOL_MF_MOVE|MOVEALL)
         scan all vmas
         try to migrate the existing pages
         return success
     else if (MPOL_MF_MOVE* | MPOL_MF_STRICT)
         scan all vmas
         try to migrate the existing pages
         return -EIO if unmovable or migration failed
     else /* MPOL_MF_STRICT alone */
         break early if meets unmovable and don't call mbind_range() at all
     else /* none of those flags */
         check the ranges in test_walk, EFAULT without mbind_range() if discontig.
    
    Fixed the behavior.
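
    The conceptual flow above can be restated as a small decision function.
    This is an illustrative userspace sketch, not the mempolicy code: the
    SKETCH_* names and struct sketch_walk_policy are hypothetical.  The
    combined MOVE + STRICT case is folded into the move branch so that it is
    not shadowed by the MOVE-only case, and the -EIO result for
    MPOL_MF_STRICT alone is an assumption based on mbind()'s documented
    behaviour.

```c
#include <assert.h>
#include <stdbool.h>

/* Bit positions mirror the uapi flags; the SKETCH_ names are hypothetical. */
#define SKETCH_MF_STRICT   (1u << 0)
#define SKETCH_MF_MOVE     (1u << 1)
#define SKETCH_MF_MOVE_ALL (1u << 2)

struct sketch_walk_policy {
    bool scan_all_vmas;   /* keep walking past offending pages */
    bool try_migrate;     /* queue existing pages for migration */
    bool fail_with_eio;   /* report -EIO for offending pages */
};

struct sketch_walk_policy sketch_policy(unsigned int flags)
{
    bool move   = (flags & (SKETCH_MF_MOVE | SKETCH_MF_MOVE_ALL)) != 0;
    bool strict = (flags & SKETCH_MF_STRICT) != 0;
    struct sketch_walk_policy p = { false, false, false };

    if (move) {
        /* MOVE*, with or without STRICT: never stop the walk early. */
        p.scan_all_vmas = true;
        p.try_migrate = true;
        p.fail_with_eio = strict;
    } else if (strict) {
        /* STRICT alone: break early, nothing is migrated. */
        p.fail_with_eio = true;
    }
    /* No flags: only the range checks apply, so all fields stay false. */
    return p;
}
```

    The point of the fix is visible in the first branch: adding
    MPOL_MF_STRICT to a MOVE request changes only the return value, not how
    far the walk proceeds.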
    
    Link: https://lkml.kernel.org/r/20230920223242.3425775-1-yang@os.amperecomputing.com
    Fixes: a7f40cfe3b7a ("mm: mempolicy: make mbind() return -EIO when MPOL_MF_STRICT is specified")
    Signed-off-by: Yang Shi <yang@os.amperecomputing.com>
    Cc: Hugh Dickins <hughd@google.com>
    Cc: Suren Baghdasaryan <surenb@google.com>
    Cc: Matthew Wilcox <willy@infradead.org>
    Cc: Michal Hocko <mhocko@suse.com>
    Cc: Vlastimil Babka <vbabka@suse.cz>
    Cc: Oscar Salvador <osalvador@suse.de>
    Cc: Rafael Aquini <aquini@redhat.com>
    Cc: Kirill A. Shutemov <kirill@shutemov.name>
    Cc: David Rientjes <rientjes@google.com>
    Cc: <stable@vger.kernel.org>    [4.9+]
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

mm: page_alloc: fix CMA and HIGHATOMIC landing on the wrong buddy list [+ + +]
Author: Johannes Weiner <hannes@cmpxchg.org>
Date:   Mon Sep 11 14:11:08 2023 -0400

    mm: page_alloc: fix CMA and HIGHATOMIC landing on the wrong buddy list
    
    [ Upstream commit 7b086755fb8cdbb6b3e45a1bbddc00e7f9b1dc03 ]
    
    Commit 4b23a68f9536 ("mm/page_alloc: protect PCP lists with a spinlock")
    bypasses the pcplist on lock contention and returns the page directly to
    the buddy list of the page's migratetype.
    
    For pages that don't have their own pcplist, such as CMA and HIGHATOMIC,
    the migratetype is temporarily updated such that the page can hitch a ride
    on the MOVABLE pcplist.  Their true type is later reassessed when flushing
    in free_pcppages_bulk().  However, when lock contention is detected after
    the type was already overridden, the bypass will then put the page on the
    wrong buddy list.
    
    Once on the MOVABLE buddy list, the page becomes eligible for fallbacks
    and even stealing.  In the case of HIGHATOMIC, otherwise ineligible
    allocations can dip into the highatomic reserves.  In the case of CMA, the
    page can be lost from the CMA region permanently.
    
    Use a separate pcpmigratetype variable for the pcplist override.  Use the
    original migratetype when going directly to the buddy.  This fixes the bug
    and should make the intentions more obvious in the code.
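
    A simplified model of the two-variable approach, for illustration only.
    The enum values, struct sketch_dest and sketch_free_page() are
    hypothetical, and lock contention is reduced to a boolean parameter.

```c
#include <assert.h>
#include <stdbool.h>

enum sketch_mt { SKETCH_MT_MOVABLE, SKETCH_MT_CMA, SKETCH_MT_HIGHATOMIC };

struct sketch_dest {
    bool on_pcplist;           /* false: went straight to the buddy list */
    enum sketch_mt list_type;  /* type of the list the page was filed on */
};

struct sketch_dest sketch_free_page(enum sketch_mt migratetype,
                                    bool pcp_lock_taken)
{
    /* Separate variable: the override never touches the true type. */
    enum sketch_mt pcpmigratetype = migratetype;

    if (migratetype == SKETCH_MT_CMA || migratetype == SKETCH_MT_HIGHATOMIC)
        pcpmigratetype = SKETCH_MT_MOVABLE;  /* hitch a ride on MOVABLE */

    if (pcp_lock_taken)
        return (struct sketch_dest){ true, pcpmigratetype };

    /* Contended: bypass the pcplist and use the original type. */
    return (struct sketch_dest){ false, migratetype };
}
```

    In this model a CMA page that bypasses the pcplist is still filed under
    CMA, whereas reusing a single overwritten variable would have filed it
    under MOVABLE.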
    
    Originally sent here to address the HIGHATOMIC case:
    https://lore.kernel.org/lkml/20230821183733.106619-4-hannes@cmpxchg.org/
    
    Changelog updated in response to the CMA-specific bug report.
    
    [mgorman@techsingularity.net: updated changelog]
    Link: https://lkml.kernel.org/r/20230911181108.GA104295@cmpxchg.org
    Fixes: 4b23a68f9536 ("mm/page_alloc: protect PCP lists with a spinlock")
    Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
    Reported-by: Joe Liu <joe.liu@mediatek.com>
    Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
modpost: add missing else to the "of" check [+ + +]
Author: Mauricio Faria de Oliveira <mfo@canonical.com>
Date:   Thu Sep 28 17:28:07 2023 -0300

    modpost: add missing else to the "of" check
    
    [ Upstream commit cbc3d00cf88fda95dbcafee3b38655b7a8f2650a ]
    
    Without this 'else' statement, a "usb" name goes into two handlers:
    the first/previous 'if' statement _AND_ the for-loop over 'devtable',
    but the latter is useless as it has no 'usb' device_id entry anyway.
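
    The control-flow difference can be shown with a reduced model.  The two
    functions below are hypothetical, count how many handlers a table name
    reaches, and collapse the real chain of checks to three branches.

```c
#include <assert.h>
#include <string.h>

/* Before the fix: the "of" check starts a second, independent chain. */
int sketch_handlers_before(const char *name)
{
    int visited = 0;

    if (strcmp(name, "usb") == 0)
        visited++;              /* dedicated "usb" handler */
    if (strcmp(name, "of") == 0)
        visited++;              /* dedicated "of" handler */
    else
        visited++;              /* generic devtable loop, also for "usb" */
    return visited;
}

/* After the fix: one chain, so exactly one branch runs per name. */
int sketch_handlers_after(const char *name)
{
    int visited = 0;

    if (strcmp(name, "usb") == 0)
        visited++;
    else if (strcmp(name, "of") == 0)
        visited++;
    else
        visited++;
    return visited;
}
```

    In this model "usb" reaches two handlers before the change and one
    after it, while every other name reaches one handler either way.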
    
    Tested with allmodconfig before/after patch; no changes to *.mod.c:
    
        git checkout v6.6-rc3
        make -j$(nproc) allmodconfig
        make -j$(nproc) olddefconfig
    
        make -j$(nproc)
        find . -name '*.mod.c' | cpio -pd /tmp/before
    
        # apply patch
    
        make -j$(nproc)
        find . -name '*.mod.c' | cpio -pd /tmp/after
    
        diff -r /tmp/before/ /tmp/after/
        # no difference
    
    Fixes: acbef7b76629 ("modpost: fix module autoloading for OF devices with generic compatible property")
    Signed-off-by: Mauricio Faria de Oliveira <mfo@canonical.com>
    Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
mptcp: annotate lockless accesses to sk->sk_err [+ + +]
Author: Eric Dumazet <edumazet@google.com>
Date:   Wed Mar 15 20:57:45 2023 +0000

    mptcp: annotate lockless accesses to sk->sk_err
    
    [ Upstream commit 9ae8e5ad99b8ebcd3d3dd46075f3825e6f08f063 ]
    
    mptcp_poll() reads sk->sk_err without socket lock held/owned.
    
    Add READ_ONCE() and WRITE_ONCE() to avoid load/store tearing.
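
    For readers unfamiliar with the annotation, here is a userspace sketch
    of the idea.  The SKETCH_* macros are simplified stand-ins built on a
    volatile cast (using the GCC/Clang __typeof__ extension), and the struct
    and function names are hypothetical; the kernel macros are more
    elaborate.

```c
#include <assert.h>

/* Simplified stand-ins: force exactly one access through a volatile lvalue. */
#define SKETCH_READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define SKETCH_WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

struct sketch_sock { int sk_err; };

/* Writer side: publish the error with a single store. */
void sketch_set_err(struct sketch_sock *sk, int err)
{
    SKETCH_WRITE_ONCE(sk->sk_err, err);
}

/* Lockless reader, as a poll-style path would be: one load, then decide. */
int sketch_poll_has_err(struct sketch_sock *sk)
{
    return SKETCH_READ_ONCE(sk->sk_err) != 0;
}
```

    The volatile access keeps the compiler from splitting, merging or
    repeating the load and store; it does not by itself order them against
    other memory accesses.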
    
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Stable-dep-of: d5fbeff1ab81 ("mptcp: move __mptcp_error_report in protocol.c")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

mptcp: fix dangling connection hang-up [+ + +]
Author: Paolo Abeni <pabeni@redhat.com>
Date:   Sat Sep 16 12:52:49 2023 +0200

    mptcp: fix dangling connection hang-up
    
    [ Upstream commit 27e5ccc2d5a50ed61bb73153edb1066104b108b3 ]
    
    According to RFC 8684 section 3.3:
    
      A connection is not closed unless [...] or an implementation-specific
      connection-level send timeout.
    
    Currently the MPTCP protocol does not implement such a timeout, and
    connections timing out at the TCP level never move to the close state.
    
    Introduce a catch-up condition at subflow close time to move the
    MPTCP socket to close, too.
    
    That additionally allows removing similar existing logic inside the worker.
    
    Finally, allow some additional timeout for plain ESTABLISHED mptcp
    sockets, as the protocol allows creating new subflows even at that
    point and making the connection functional again.
    
    This issue is actually present since the beginning, but it is basically
    impossible to solve without a long chain of functional pre-requisites
    topped by commit bbd49d114d57 ("mptcp: consolidate transition to
    TCP_CLOSE in mptcp_do_fastclose()"). When backporting this
    patch, please backport that other commit as well.
    
    Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/430
    Fixes: e16163b6e2b7 ("mptcp: refactor shutdown and close")
    Cc: stable@vger.kernel.org
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
    Reviewed-by: Mat Martineau <martineau@kernel.org>
    Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

mptcp: move __mptcp_error_report in protocol.c [+ + +]
Author: Paolo Abeni <pabeni@redhat.com>
Date:   Sat Sep 16 12:52:46 2023 +0200

    mptcp: move __mptcp_error_report in protocol.c
    
    [ Upstream commit d5fbeff1ab812b6c473b6924bee8748469462e2c ]
    
    This will simplify the next patch ("mptcp: process pending subflow error
    on close").
    
    No functional change intended.
    
    Cc: stable@vger.kernel.org # v5.12+
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Reviewed-by: Mat Martineau <martineau@kernel.org>
    Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

mptcp: process pending subflow error on close [+ + +]
Author: Paolo Abeni <pabeni@redhat.com>
Date:   Sat Sep 16 12:52:47 2023 +0200

    mptcp: process pending subflow error on close
    
    [ Upstream commit 9f1a98813b4b686482e5ef3c9d998581cace0ba6 ]
    
    On an incoming TCP reset, subflow closing could happen before error
    propagation. That in turn could cause the socket error to be ignored,
    and a socket state transition to be missed, as reported by Daire-Byrne.
    
    Address the issues by explicitly checking for a subflow socket error at
    close time. To avoid code duplication, factor a new helper implementing
    the relevant bits out of __mptcp_error_report().
    
    Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/429
    Fixes: 15cc10453398 ("mptcp: deliver ssk errors to msk")
    Cc: stable@vger.kernel.org
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Reviewed-by: Mat Martineau <martineau@kernel.org>
    Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

mptcp: rename timer related helper to less confusing names [+ + +]
Author: Paolo Abeni <pabeni@redhat.com>
Date:   Sat Sep 16 12:52:48 2023 +0200

    mptcp: rename timer related helper to less confusing names
    
    [ Upstream commit f6909dc1c1f4452879278128012da6c76bc186a5 ]
    
    The msk socket uses two different timeouts to track close-related
    events and retransmissions. The existing helpers do not indicate
    clearly which timer they actually touch, making the related code
    quite confusing.
    
    Change the existing helpers' names to avoid such confusion. No
    functional change intended.
    
    This patch is linked to the next one ("mptcp: fix dangling connection
    hang-up"). The two patches are supposed to be backported together.
    
    Cc: stable@vger.kernel.org # v5.11+
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
    Reviewed-by: Mat Martineau <martineau@kernel.org>
    Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Stable-dep-of: 27e5ccc2d5a5 ("mptcp: fix dangling connection hang-up")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

mptcp: userspace pm allow creating id 0 subflow [+ + +]
Author: Geliang Tang <geliang.tang@suse.com>
Date:   Wed Oct 4 13:38:12 2023 -0700

    mptcp: userspace pm allow creating id 0 subflow
    
    commit e5ed101a602873d65d2d64edaba93e8c73ec1b0f upstream.
    
    This patch drops id 0 limitation in mptcp_nl_cmd_sf_create() to allow
    creating additional subflows with the local addr ID 0.
    
    There is no reason not to allow additional subflows from this local
    address: we should be able to create new subflows from the initial
    endpoint. This limitation was breaking fullmesh support from userspace.
    
    Fixes: 702c2f646d42 ("mptcp: netlink: allow userspace-driven subflow establishment")
    Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/391
    Cc: stable@vger.kernel.org
    Suggested-by: Matthieu Baerts <matthieu.baerts@tessares.net>
    Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
    Signed-off-by: Geliang Tang <geliang.tang@suse.com>
    Signed-off-by: Mat Martineau <martineau@kernel.org>
    Link: https://lore.kernel.org/r/20231004-send-net-20231004-v1-2-28de4ac663ae@kernel.org
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
neighbour: annotate lockless accesses to n->nud_state [+ + +]
Author: Eric Dumazet <edumazet@google.com>
Date:   Mon Mar 13 20:17:31 2023 +0000

    neighbour: annotate lockless accesses to n->nud_state
    
    [ Upstream commit b071af523579df7341cabf0f16fc661125e9a13f ]
    
    We have many lockless accesses to n->nud_state.
    
    Before adding another one in the following patch,
    add annotations to readers and writers.
    
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Reviewed-by: David Ahern <dsahern@kernel.org>
    Reviewed-by: Martin KaFai Lau <martin.lau@kernel.org>
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Stable-dep-of: 5baa0433a15e ("neighbour: fix data-races around n->output")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

neighbour: fix data-races around n->output [+ + +]
Author: Eric Dumazet <edumazet@google.com>
Date:   Thu Sep 21 09:27:13 2023 +0000

    neighbour: fix data-races around n->output
    
    [ Upstream commit 5baa0433a15eadd729625004c37463acb982eca7 ]
    
    n->output field can be read locklessly, while a writer
    might change the pointer concurrently.
    
    Add missing annotations to prevent load-store tearing.
    
    Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Reviewed-by: David Ahern <dsahern@kernel.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

neighbour: switch to standard rcu, instead of rcu_bh [+ + +]
Author: Eric Dumazet <edumazet@google.com>
Date:   Tue Mar 21 04:01:14 2023 +0000

    neighbour: switch to standard rcu, instead of rcu_bh
    
    [ Upstream commit 09eed1192cec1755967f2af8394207acdde579a1 ]
    
    rcu_bh is no longer a win, especially for objects freed
    with standard call_rcu().
    
    Switch neighbour code to no longer disable BH when not necessary.
    
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Stable-dep-of: 5baa0433a15e ("neighbour: fix data-races around n->output")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
net: add sysctl accept_ra_min_rtr_lft [+ + +]
Author: Patrick Rohr <prohr@google.com>
Date:   Wed Jul 19 07:52:13 2023 -0700

    net: add sysctl accept_ra_min_rtr_lft
    
    commit 1671bcfd76fdc0b9e65153cf759153083755fe4c upstream.
    
    This change adds a new sysctl accept_ra_min_rtr_lft to specify the
    minimum acceptable router lifetime in an RA. If the received RA router
    lifetime is less than the configured value (and not 0), the RA is
    ignored.
    This is useful for mobile devices, whose battery life can be impacted
    by networks that configure RAs with a short lifetime. On such networks,
    the device should never gain IPv6 provisioning and should attempt to
    drop RAs via hardware offload, if available.
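    
    The acceptance rule described above can be sketched as a small user-space
    model. This is only an illustration, not the kernel code: the function
    name ra_is_accepted() is hypothetical, and lifetimes are assumed to be
    plain integers in seconds.

```python
def ra_is_accepted(router_lifetime: int, accept_ra_min_rtr_lft: int) -> bool:
    """Return True if the RA would be processed, False if it is ignored."""
    # The commit message exempts a router lifetime of 0 from the check.
    if router_lifetime == 0:
        return True
    # Otherwise the RA is ignored when its router lifetime is below the sysctl.
    return router_lifetime >= accept_ra_min_rtr_lft
```

    With the sysctl set to 180, an RA advertising a 30 second router lifetime
    would be ignored, while lifetimes of 0 or 1800 would still be processed.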
    
    Signed-off-by: Patrick Rohr <prohr@google.com>
    Cc: Maciej Żenczykowski <maze@google.com>
    Cc: Lorenzo Colitti <lorenzo@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

net: change accept_ra_min_rtr_lft to affect all RA lifetimes [+ + +]
Author: Patrick Rohr <prohr@google.com>
Date:   Wed Jul 26 16:07:01 2023 -0700

    net: change accept_ra_min_rtr_lft to affect all RA lifetimes
    
    commit 5027d54a9c30bc7ec808360378e2b4753f053f25 upstream.
    
    accept_ra_min_rtr_lft only considered the lifetime of the default route
    and discarded entire RAs accordingly.
    
    This change renames accept_ra_min_rtr_lft to accept_ra_min_lft, and
    applies the value to individual RA sections; in particular, router
    lifetime, PIO preferred lifetime, and RIO lifetime. If any of those
    lifetimes are lower than the configured value, the specific RA section
    is ignored.
    
    In order for the sysctl to be useful to Android, it should really apply
    to all lifetimes in the RA, since that is what determines the minimum
    frequency at which RAs must be processed by the kernel. Android uses
    hardware offloads to drop RAs for a fraction of the minimum of all
    lifetimes present in the RA (some networks have very frequent RAs (5s)
    with high lifetimes (2h)). Despite this, we have encountered networks
    that set the router lifetime to 30s which results in very frequent CPU
    wakeups. Instead of disabling IPv6 (and dropping IPv6 ethertype in the
    WiFi firmware) entirely on such networks, it seems better to ignore the
    misconfigured routers while still processing RAs from other IPv6 routers
    on the same network (i.e. to support IoT applications).
    
    The previous implementation dropped the entire RA based on router
    lifetime. This turned out to be hard to expand to the other lifetimes
    present in the RA in a consistent manner; dropping the entire RA based
    on RIO/PIO lifetimes would essentially require parsing the whole thing
    twice.
    
    Fixes: 1671bcfd76fd ("net: add sysctl accept_ra_min_rtr_lft")
    Cc: Lorenzo Colitti <lorenzo@google.com>
    Signed-off-by: Patrick Rohr <prohr@google.com>
    Reviewed-by: Maciej Żenczykowski <maze@google.com>
    Reviewed-by: David Ahern <dsahern@kernel.org>
    Link: https://lore.kernel.org/r/20230726230701.919212-1-prohr@google.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

net: dsa: mv88e6xxx: Avoid EEPROM timeout when EEPROM is absent [+ + +]
Author: Fabio Estevam <festevam@denx.de>
Date:   Fri Sep 22 09:47:41 2023 -0300

    net: dsa: mv88e6xxx: Avoid EEPROM timeout when EEPROM is absent
    
    [ Upstream commit 6ccf50d4d4741e064ba35511a95402c63bbe21a8 ]
    
    Since commit 23d775f12dcd ("net: dsa: mv88e6xxx: Wait for EEPROM done
    before HW reset") the following error is seen on an imx8mn board with
    a 88E6320 switch:
    
    mv88e6085 30be0000.ethernet-1:00: Timeout waiting for EEPROM done
    
    This board does not have an EEPROM attached to the switch though.
    
    This problem is well explained by Andrew Lunn:
    
    "If there is an EEPROM, and the EEPROM contains a lot of data, it could
    be that when we perform a hardware reset towards the end of probe, it
    interrupts an I2C bus transaction, leaving the I2C bus in a bad state,
    and future reads of the EEPROM do not work.
    
    The work around for this was to poll the EEInt status and wait for it
    to go true before performing the hardware reset.
    
    However, we have discovered that for some boards which do not have an
    EEPROM, EEInt never indicates complete. As a result,
    mv88e6xxx_g1_wait_eeprom_done() spins for a second and then prints a
    warning.
    
    We probably need a different solution than calling
    mv88e6xxx_g1_wait_eeprom_done(). The datasheet for 6352 documents the
    EEPROM Command register:
    
    bit 15 is:
    
      EEPROM Unit Busy. This bit must be set to a one to start an EEPROM
      operation (see EEOp below). Only one EEPROM operation can be
      executing at one time so this bit must be zero before setting it to
      a one.  When the requested EEPROM operation completes this bit will
      automatically be cleared to a zero. The transition of this bit from
      a one to a zero can be used to generate an interrupt (the EEInt in
      Global 1, offset 0x00).
    
    and more interesting is bit 11:
    
      Register Loader Running. This bit is set to one whenever the
      register loader is busy executing instructions contained in the
      EEPROM."
    
    Change to using mv88e6xxx_g2_eeprom_wait() to fix the timeout error
    when the EEPROM chip is not present.
    
    Fixes: 23d775f12dcd ("net: dsa: mv88e6xxx: Wait for EEPROM done before HW reset")
    Suggested-by: Andrew Lunn <andrew@lunn.ch>
    Signed-off-by: Fabio Estevam <festevam@denx.de>
    Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
    Reviewed-by: Andrew Lunn <andrew@lunn.ch>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: ethernet: mediatek: disable irq before schedule napi [+ + +]
Author: Christian Marangi <ansuelsmth@gmail.com>
Date:   Mon Oct 2 16:08:05 2023 +0200

    net: ethernet: mediatek: disable irq before schedule napi
    
    commit fcdfc462881d8acf9db77f483b2c821e286ca97b upstream.
    
    While searching for a possible refactor of napi_schedule_prep and
    __napi_schedule, it was noticed that the mtk eth driver disables the
    interrupt for rx and tx AFTER napi is scheduled.
    
    While this case is very hard to reproduce, the interrupt can end up
    disabled and never enabled again, because the napi completes and
    re-enables the interrupt before the handler disables it.
    
    This is caused by the fact that a napi driven by interrupt expects
    the following sequence:
    1. interrupt received. napi prepared -> interrupt disabled -> napi
       scheduled
    2. napi triggered. ring cleared -> interrupt enabled -> wait for new
       interrupt
    
    To prevent this case, disable the interrupt BEFORE the napi is
    scheduled.
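    
    The ordering problem can be illustrated with a toy, single-threaded
    model. It is a sketch under stated assumptions, not the driver's code:
    Device, napi_poll() and irq_handler() are hypothetical names, and the
    model assumes the worst-case interleaving in which the poll runs to
    completion as soon as it is scheduled.

```python
class Device:
    def __init__(self):
        self.irq_enabled = True


def napi_poll(dev):
    # Step 2 of the expected sequence: ring cleared, interrupt re-enabled.
    dev.irq_enabled = True


def irq_handler(dev, disable_before_schedule):
    """Run one interrupt; return whether the interrupt is enabled afterwards."""
    if disable_before_schedule:
        dev.irq_enabled = False   # interrupt disabled first
        napi_poll(dev)            # napi scheduled (modelled as running at once)
    else:
        napi_poll(dev)            # napi scheduled and completed first
        dev.irq_enabled = False   # late disable: nothing re-enables it again
    return dev.irq_enabled
```

    In this model the old order leaves the interrupt disabled with no poll
    pending, while the fixed order ends with the interrupt enabled again.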
    
    Fixes: 656e705243fd ("net-next: mediatek: add support for MT7623 ethernet")
    Cc: stable@vger.kernel.org
    Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
    Link: https://lore.kernel.org/r/20231002140805.568-1-ansuelsmth@gmail.com
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

net: ethernet: ti: am65-cpsw: Fix error code in am65_cpsw_nuss_init_tx_chns() [+ + +]
Author: Dan Carpenter <dan.carpenter@linaro.org>
Date:   Tue Sep 26 17:04:43 2023 +0300

    net: ethernet: ti: am65-cpsw: Fix error code in am65_cpsw_nuss_init_tx_chns()
    
    [ Upstream commit 37d4f55567982e445f86dc0ff4ecfa72921abfe8 ]
    
    This accidentally returns success, but it should return a negative error
    code.
    
    Fixes: 93a76530316a ("net: ethernet: ti: introduce am65x/j721e gigabit eth subsystem driver")
    Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
    Reviewed-by: Roger Quadros <rogerq@kernel.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: fix possible store tearing in neigh_periodic_work() [+ + +]
Author: Eric Dumazet <edumazet@google.com>
Date:   Thu Sep 21 08:46:26 2023 +0000

    net: fix possible store tearing in neigh_periodic_work()
    
    [ Upstream commit 25563b581ba3a1f263a00e8c9a97f5e7363be6fd ]
    
    While looking at a related syzbot report involving neigh_periodic_work(),
    I found that I forgot to add an annotation when deleting an
    RCU protected item from a list.
    
    Readers use rcu_dereference(*np), so we need to use either
    rcu_assign_pointer() or WRITE_ONCE() on the writer side
    to prevent store tearing.
    
    I use rcu_assign_pointer() to have lockdep support,
    this was the choice made in neigh_flush_dev().
    
    Fixes: 767e97e1e0db ("neigh: RCU conversion of struct neighbour")
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Reviewed-by: David Ahern <dsahern@kernel.org>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: nfc: llcp: Add lock when modifying device list [+ + +]
Author: Jeremy Cline <jeremy@jcline.org>
Date:   Fri Sep 8 19:58:53 2023 -0400

    net: nfc: llcp: Add lock when modifying device list
    
    [ Upstream commit dfc7f7a988dad34c3bf4c053124fb26aa6c5f916 ]
    
    The device list needs its associated lock held when modifying it, or the
    list could become corrupted, as syzbot discovered.
    
    Reported-and-tested-by: syzbot+c1d0a03d305972dbbe14@syzkaller.appspotmail.com
    Closes: https://syzkaller.appspot.com/bug?extid=c1d0a03d305972dbbe14
    Signed-off-by: Jeremy Cline <jeremy@jcline.org>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Fixes: 6709d4b7bc2e ("net: nfc: Fix use-after-free caused by nfc_llcp_find_local")
    Link: https://lore.kernel.org/r/20230908235853.1319596-1-jeremy@jcline.org
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: prevent rewrite of msg_name in sock_sendmsg() [+ + +]
Author: Jordan Rife <jrife@google.com>
Date:   Thu Sep 21 18:46:41 2023 -0500

    net: prevent rewrite of msg_name in sock_sendmsg()
    
    commit 86a7e0b69bd5b812e48a20c66c2161744f3caa16 upstream.
    
    Callers of sock_sendmsg(), and similarly kernel_sendmsg(), in kernel
    space may observe their value of msg_name change in cases where BPF
    sendmsg hooks rewrite the send address. This has been confirmed to break
    NFS mounts running in UDP mode and has the potential to break other
    systems.
    
    This patch:
    
    1) Creates a new function called __sock_sendmsg() with the same logic as the
       old sock_sendmsg() function.
    2) Replaces calls to sock_sendmsg() made by __sys_sendto() and
       __sys_sendmsg() with __sock_sendmsg() to avoid an unnecessary copy,
       as these system calls are already protected.
    3) Modifies sock_sendmsg() so that it makes a copy of msg_name if
       present before passing it down the stack to insulate callers from
       changes to the send address.
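    
    The copy-and-restore idea in step 3 can be sketched in user space as
    follows. This is a hedged model, not kernel code: Msg, bpf_sendmsg_hook()
    and inner_sock_sendmsg() are hypothetical stand-ins, the rewritten
    address "10.0.0.99" is made up, and a Python list plays the role of the
    mutable sockaddr buffer.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Msg:
    msg_name: Optional[list]  # mutable stand-in for the address buffer


def bpf_sendmsg_hook(msg):
    # Hypothetical hook that rewrites the destination address in place.
    if msg.msg_name is not None:
        msg.msg_name[0] = "10.0.0.99"


def inner_sock_sendmsg(msg):
    """Stand-in for __sock_sendmsg(): runs the hook, returns the address used."""
    bpf_sendmsg_hook(msg)
    return list(msg.msg_name) if msg.msg_name is not None else None


def sock_sendmsg(msg):
    """Stand-in for the patched sock_sendmsg(): hide rewrites from the caller."""
    saved = msg.msg_name
    if saved is not None:
        msg.msg_name = list(saved)  # pass a private copy down the stack
    sent_to = inner_sock_sendmsg(msg)
    msg.msg_name = saved            # the caller keeps its original buffer
    return sent_to
```

    Calling sock_sendmsg() still sends to the rewritten address, but the
    caller's buffer is unchanged afterwards; calling the inner function
    directly exposes the rewrite to the caller.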
    
    Link: https://lore.kernel.org/netdev/20230912013332.2048422-1-jrife@google.com/
    Fixes: 1cedee13d25a ("bpf: Hooks for sys_sendmsg")
    Cc: stable@vger.kernel.org
    Reviewed-by: Willem de Bruijn <willemb@google.com>
    Signed-off-by: Jordan Rife <jrife@google.com>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

net: release reference to inet6_dev pointer [+ + +]
Author: Patrick Rohr <prohr@google.com>
Date:   Fri Aug 18 11:22:49 2023 -0700

    net: release reference to inet6_dev pointer
    
    commit 5cb249686e67dbef3ffe53887fa725eefc5a7144 upstream.
    
    addrconf_prefix_rcv returned early without releasing the inet6_dev
    pointer when the PIO lifetime is less than accept_ra_min_lft.
    
    Fixes: 5027d54a9c30 ("net: change accept_ra_min_rtr_lft to affect all RA lifetimes")
    Cc: Maciej Żenczykowski <maze@google.com>
    Cc: Lorenzo Colitti <lorenzo@google.com>
    Cc: David Ahern <dsahern@kernel.org>
    Cc: Simon Horman <horms@kernel.org>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Reviewed-by: Maciej Żenczykowski <maze@google.com>
    Signed-off-by: Patrick Rohr <prohr@google.com>
    Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

net: replace calls to sock->ops->connect() with kernel_connect() [+ + +]
Author: Jordan Rife <jrife@google.com>
Date:   Thu Sep 21 18:46:40 2023 -0500

    net: replace calls to sock->ops->connect() with kernel_connect()
    
    commit 26297b4ce1ce4ea40bc9a48ec99f45da3f64d2e2 upstream.
    
    commit 0bdf399342c5 ("net: Avoid address overwrite in kernel_connect")
    ensured that kernel_connect() will not overwrite the address parameter
    in cases where BPF connect hooks perform an address rewrite. This change
    replaces direct calls to sock->ops->connect() in net with kernel_connect()
    to make these calls safe.
    
    Link: https://lore.kernel.org/netdev/20230912013332.2048422-1-jrife@google.com/
    Fixes: d74bad4e74ee ("bpf: Hooks for sys_connect")
    Cc: stable@vger.kernel.org
    Reviewed-by: Willem de Bruijn <willemb@google.com>
    Signed-off-by: Jordan Rife <jrife@google.com>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

net: stmmac: dwmac-stm32: fix resume on STM32 MCU [+ + +]
Author: Ben Wolsieffer <ben.wolsieffer@hefring.com>
Date:   Wed Sep 27 13:57:49 2023 -0400

    net: stmmac: dwmac-stm32: fix resume on STM32 MCU
    
    [ Upstream commit 6f195d6b0da3b689922ba9e302af2f49592fa9fc ]
    
    The STM32MP1 keeps clk_rx enabled during suspend, and therefore the
    driver does not enable the clock in stm32_dwmac_init() if the device was
    suspended. The problem is that this same code runs on STM32 MCUs, which
    do disable clk_rx during suspend, causing the clock to never be
    re-enabled on resume.
    
    This patch adds a variant flag to indicate that clk_rx remains enabled
    during suspend, and uses this to decide whether to enable the clock in
    stm32_dwmac_init() if the device was suspended.
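    
    The resulting decision can be written as a one-line predicate. This is a
    sketch only: the function and parameter names are hypothetical, and it
    assumes (as the text implies) that the clock is always enabled when the
    device is not coming back from suspend.

```python
def should_enable_clk_rx(resuming: bool, clk_rx_stays_on_in_suspend: bool) -> bool:
    """Decide whether the init path should enable clk_rx."""
    # Skip the enable only when resuming on a variant that kept clk_rx running.
    return not (resuming and clk_rx_stays_on_in_suspend)
```

    An STM32MP1-like variant (flag set) skips the enable on resume, while an
    MCU-like variant (flag clear) enables the clock again.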
    
    This approach fixes this specific bug with limited opportunity for
    unintended side-effects, but I have a follow up patch that will refactor
    the clock configuration and hopefully make it less error prone.
    
    Fixes: 6528e02cc9ff ("net: ethernet: stmmac: add adaptation for stm32mp157c.")
    Signed-off-by: Ben Wolsieffer <ben.wolsieffer@hefring.com>
    Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
    Link: https://lore.kernel.org/r/20230927175749.1419774-1-ben.wolsieffer@hefring.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: usb: smsc75xx: Fix uninit-value access in __smsc75xx_read_reg [+ + +]
Author: Shigeru Yoshida <syoshida@redhat.com>
Date:   Sun Sep 24 02:35:49 2023 +0900

    net: usb: smsc75xx: Fix uninit-value access in __smsc75xx_read_reg
    
    [ Upstream commit e9c65989920f7c28775ec4e0c11b483910fb67b8 ]
    
    syzbot reported the following uninit-value access issue:
    
    =====================================================
    BUG: KMSAN: uninit-value in smsc75xx_wait_ready drivers/net/usb/smsc75xx.c:975 [inline]
    BUG: KMSAN: uninit-value in smsc75xx_bind+0x5c9/0x11e0 drivers/net/usb/smsc75xx.c:1482
    CPU: 0 PID: 8696 Comm: kworker/0:3 Not tainted 5.8.0-rc5-syzkaller #0
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
    Workqueue: usb_hub_wq hub_event
    Call Trace:
     __dump_stack lib/dump_stack.c:77 [inline]
     dump_stack+0x21c/0x280 lib/dump_stack.c:118
     kmsan_report+0xf7/0x1e0 mm/kmsan/kmsan_report.c:121
     __msan_warning+0x58/0xa0 mm/kmsan/kmsan_instr.c:215
     smsc75xx_wait_ready drivers/net/usb/smsc75xx.c:975 [inline]
     smsc75xx_bind+0x5c9/0x11e0 drivers/net/usb/smsc75xx.c:1482
     usbnet_probe+0x1152/0x3f90 drivers/net/usb/usbnet.c:1737
     usb_probe_interface+0xece/0x1550 drivers/usb/core/driver.c:374
     really_probe+0xf20/0x20b0 drivers/base/dd.c:529
     driver_probe_device+0x293/0x390 drivers/base/dd.c:701
     __device_attach_driver+0x63f/0x830 drivers/base/dd.c:807
     bus_for_each_drv+0x2ca/0x3f0 drivers/base/bus.c:431
     __device_attach+0x4e2/0x7f0 drivers/base/dd.c:873
     device_initial_probe+0x4a/0x60 drivers/base/dd.c:920
     bus_probe_device+0x177/0x3d0 drivers/base/bus.c:491
     device_add+0x3b0e/0x40d0 drivers/base/core.c:2680
     usb_set_configuration+0x380f/0x3f10 drivers/usb/core/message.c:2032
     usb_generic_driver_probe+0x138/0x300 drivers/usb/core/generic.c:241
     usb_probe_device+0x311/0x490 drivers/usb/core/driver.c:272
     really_probe+0xf20/0x20b0 drivers/base/dd.c:529
     driver_probe_device+0x293/0x390 drivers/base/dd.c:701
     __device_attach_driver+0x63f/0x830 drivers/base/dd.c:807
     bus_for_each_drv+0x2ca/0x3f0 drivers/base/bus.c:431
     __device_attach+0x4e2/0x7f0 drivers/base/dd.c:873
     device_initial_probe+0x4a/0x60 drivers/base/dd.c:920
     bus_probe_device+0x177/0x3d0 drivers/base/bus.c:491
     device_add+0x3b0e/0x40d0 drivers/base/core.c:2680
     usb_new_device+0x1bd4/0x2a30 drivers/usb/core/hub.c:2554
     hub_port_connect drivers/usb/core/hub.c:5208 [inline]
     hub_port_connect_change drivers/usb/core/hub.c:5348 [inline]
     port_event drivers/usb/core/hub.c:5494 [inline]
     hub_event+0x5e7b/0x8a70 drivers/usb/core/hub.c:5576
     process_one_work+0x1688/0x2140 kernel/workqueue.c:2269
     worker_thread+0x10bc/0x2730 kernel/workqueue.c:2415
     kthread+0x551/0x590 kernel/kthread.c:292
     ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:293
    
    Local variable ----buf.i87@smsc75xx_bind created at:
     __smsc75xx_read_reg drivers/net/usb/smsc75xx.c:83 [inline]
     smsc75xx_wait_ready drivers/net/usb/smsc75xx.c:968 [inline]
     smsc75xx_bind+0x485/0x11e0 drivers/net/usb/smsc75xx.c:1482
     __smsc75xx_read_reg drivers/net/usb/smsc75xx.c:83 [inline]
     smsc75xx_wait_ready drivers/net/usb/smsc75xx.c:968 [inline]
     smsc75xx_bind+0x485/0x11e0 drivers/net/usb/smsc75xx.c:1482
    
    This issue is caused because usbnet_read_cmd() reads fewer bytes than requested
    (zero bytes in the reproducer). In this case, 'buf' is not properly filled.
    
    This patch fixes the issue by returning -ENODATA if usbnet_read_cmd() reads
    fewer bytes than requested.
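    
    The short-read check can be modelled in user space as below. This is a
    simplified sketch, not the driver's code: read_reg() and the read_cmd
    callback are hypothetical, the callback is assumed to return a bytes
    object that may be shorter than requested, and the register is assumed
    to be a 4-byte little-endian value.

```python
import errno


def read_reg(read_cmd, size=4):
    """Return (status, value); status is 0 or a negative errno."""
    buf = read_cmd(size)             # may return fewer bytes than requested
    if len(buf) < size:
        return -errno.ENODATA, None  # never interpret a partially filled buffer
    return 0, int.from_bytes(buf[:size], "little")
```

    A callback that returns an empty buffer, as in the reproducer, now
    yields an error instead of an uninitialized value.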
    
    Fixes: d0cad871703b ("smsc75xx: SMSC LAN75xx USB gigabit ethernet adapter driver")
    Reported-and-tested-by: syzbot+6966546b78d050bb0b5d@syzkaller.appspotmail.com
    Closes: https://syzkaller.appspot.com/bug?extid=6966546b78d050bb0b5d
    Signed-off-by: Shigeru Yoshida <syoshida@redhat.com>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Link: https://lore.kernel.org/r/20230923173549.3284502-1-syoshida@redhat.com
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
netfilter: handle the connecting collision properly in nf_conntrack_proto_sctp [+ + +]
Author: Xin Long <lucien.xin@gmail.com>
Date:   Tue Oct 3 13:17:53 2023 -0400

    netfilter: handle the connecting collision properly in nf_conntrack_proto_sctp
    
    [ Upstream commit 8e56b063c86569e51eed1c5681ce6361fa97fc7a ]
    
    In Scenario A and B below, as the delayed INIT_ACK always changes the peer
    vtag, SCTP ct with the incorrect vtag may cause packet loss.
    
    Scenario A: INIT_ACK is delayed until the peer receives its own INIT_ACK
    
      192.168.1.2 > 192.168.1.1: [INIT] [init tag: 1328086772]
        192.168.1.1 > 192.168.1.2: [INIT] [init tag: 1414468151]
        192.168.1.2 > 192.168.1.1: [INIT ACK] [init tag: 1328086772]
      192.168.1.1 > 192.168.1.2: [INIT ACK] [init tag: 1650211246] *
      192.168.1.2 > 192.168.1.1: [COOKIE ECHO]
        192.168.1.1 > 192.168.1.2: [COOKIE ECHO]
        192.168.1.2 > 192.168.1.1: [COOKIE ACK]
    
    Scenario B: INIT_ACK is delayed until the peer completes its own handshake
    
      192.168.1.2 > 192.168.1.1: sctp (1) [INIT] [init tag: 3922216408]
        192.168.1.1 > 192.168.1.2: sctp (1) [INIT] [init tag: 144230885]
        192.168.1.2 > 192.168.1.1: sctp (1) [INIT ACK] [init tag: 3922216408]
        192.168.1.1 > 192.168.1.2: sctp (1) [COOKIE ECHO]
        192.168.1.2 > 192.168.1.1: sctp (1) [COOKIE ACK]
      192.168.1.1 > 192.168.1.2: sctp (1) [INIT ACK] [init tag: 3914796021] *
    
    This patch fixes it as below:
    
    In SCTP_CID_INIT processing:
    - clear ct->proto.sctp.init[!dir] if ct->proto.sctp.init[dir] &&
      ct->proto.sctp.init[!dir]. (Scenario E)
    - set ct->proto.sctp.init[dir].
    
    In SCTP_CID_INIT_ACK processing:
    - drop it if !ct->proto.sctp.init[!dir] && ct->proto.sctp.vtag[!dir] &&
      ct->proto.sctp.vtag[!dir] != ih->init_tag. (Scenario B, Scenario C)
    - drop it if ct->proto.sctp.init[dir] && ct->proto.sctp.init[!dir] &&
      ct->proto.sctp.vtag[!dir] != ih->init_tag. (Scenario A)
    
    In SCTP_CID_COOKIE_ACK processing:
    - clear ct->proto.sctp.init[dir] and ct->proto.sctp.init[!dir].
      (Scenario D)
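    
    The rules above can be replayed against Scenarios A and B with a small
    model. This is a sketch under stated assumptions, not the conntrack
    implementation: the class and method names are hypothetical, directions
    are 0 and 1, each handler returns True when the chunk is let through,
    and recording the init tag in vtag[!dir] on INIT and on an accepted
    INIT_ACK is an assumption that the text above does not spell out.

```python
class SctpCt:
    def __init__(self):
        self.init = [False, False]  # INIT seen, per direction
        self.vtag = [0, 0]          # expected verification tag, per direction

    def on_init(self, d, init_tag):
        o = 1 - d
        if self.init[d] and self.init[o]:
            self.init[o] = False    # Scenario E
        self.init[d] = True
        self.vtag[o] = init_tag     # assumption, see lead-in
        return True

    def on_init_ack(self, d, init_tag):
        o = 1 - d
        mismatch = self.vtag[o] != init_tag
        if not self.init[o] and self.vtag[o] and mismatch:
            return False            # Scenario B, Scenario C
        if self.init[d] and self.init[o] and mismatch:
            return False            # Scenario A
        self.vtag[o] = init_tag     # assumption, see lead-in
        return True

    def on_cookie_ack(self, d):
        self.init = [False, False]  # Scenario D
        return True
```

    Feeding the model the tags from the traces above, with direction 0 as
    192.168.1.2 > 192.168.1.1, drops the delayed INIT ACK marked with an
    asterisk in both scenarios and accepts the earlier one.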
    
    Also, it's important to allow the ct state to move forward with cookie_echo
    and cookie_ack from the opposite dir for the collision scenarios.
    
    There are also other Scenarios where it should allow the packet through,
    addressed by the processing above:
    
    Scenario C: new CT is created by INIT_ACK.
    
    Scenario D: start INIT on the existing ESTABLISHED ct.
    
    Scenario E: start INIT after the old collision on the existing ESTABLISHED
    ct.
    
      192.168.1.2 > 192.168.1.1: sctp (1) [INIT] [init tag: 3922216408]
      192.168.1.1 > 192.168.1.2: sctp (1) [INIT] [init tag: 144230885]
      (both sides are stopped, then start a new connection again hours later)
      192.168.1.2 > 192.168.1.1: sctp (1) [INIT] [init tag: 242308742]
    
    Fixes: 9fb9cbb1082d ("[NETFILTER]: Add nf_conntrack subsystem.")
    Signed-off-by: Xin Long <lucien.xin@gmail.com>
    Signed-off-by: Florian Westphal <fw@strlen.de>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

netfilter: nf_tables: Deduplicate nft_register_obj audit logs [+ + +]
Author: Phil Sutter <phil@nwl.cc>
Date:   Sat Sep 23 03:53:50 2023 +0200

    netfilter: nf_tables: Deduplicate nft_register_obj audit logs
    
    [ Upstream commit 0d880dc6f032e0b541520e9926f398a77d3d433c ]
    
    When adding/updating an object, the transaction handler already emits
    suitable audit log entries, so the one in nft_obj_notify() is redundant.
    To fix that (and retain the audit logging from objects' 'update'
    callback), introduce an "audit log free" variant for internal use.
    
    Fixes: c520292f29b8 ("audit: log nftables configuration change events once per table")
    Signed-off-by: Phil Sutter <phil@nwl.cc>
    Reviewed-by: Richard Guy Briggs <rgb@redhat.com>
    Acked-by: Paul Moore <paul@paul-moore.com> (Audit)
    Signed-off-by: Florian Westphal <fw@strlen.de>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

netfilter: nf_tables: nft_set_rbtree: fix spurious insertion failure [+ + +]
Author: Florian Westphal <fw@strlen.de>
Date:   Thu Sep 28 15:12:44 2023 +0200

    netfilter: nf_tables: nft_set_rbtree: fix spurious insertion failure
    
    [ Upstream commit 087388278e0f301f4c61ddffb1911d3a180f84b8 ]
    
    nft_rbtree_gc_elem() walks back and removes the end interval element that
    comes before the expired element.
    
    There is a small chance that we've cached this element as 'rbe_ge'.
    If this happens, we hold and test a pointer that has been queued for
    freeing.
    
    It also causes spurious insertion failures:
    
    $ cat test-testcases-sets-0044interval_overlap_0.1/testout.log
    Error: Could not process rule: File exists
    add element t s {  0 -  2 }
                       ^^^^^^
    Failed to insert  0 -  2 given:
    table ip t {
            set s {
                    type inet_service
                    flags interval,timeout
                    timeout 2s
                    gc-interval 2s
            }
    }
    
    The set (rbtree) is empty. The 'failure' doesn't happen on next attempt.
    
    The reason is that when we try to insert, the tree may hold an expired
    element that collides with the range we're adding.
    While we do evict/erase this element, we can trip over this check:
    
    if (rbe_ge && nft_rbtree_interval_end(rbe_ge) && nft_rbtree_interval_end(new))
          return -ENOTEMPTY;
    
    rbe_ge was erased by the synchronous gc, we should not have done this
    check.  Next attempt won't find it, so retry results in successful
    insertion.
    
    Restart in-kernel to avoid such spurious errors.
    
    Such restarts are rare, unless userspace intentionally adds very large
    numbers of elements with very short timeouts while setting a huge
    gc interval.
    
    Even in this case, this cannot loop forever: on each retry an existing
    element has been removed.
    
    As the caller is holding the transaction mutex, it's impossible
    for a second entity to add more expiring elements to the tree.
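    
    The restart loop and its termination argument can be sketched with a toy
    model. It is an illustration only, not the rbtree code: the tree is a
    plain list of (key, expired) pairs, try_insert() and insert() are
    hypothetical names, and using -EAGAIN as the restart signal is an
    assumption of this sketch.

```python
import errno


def try_insert(tree, new):
    """One insertion pass over a list of (key, expired) pairs."""
    for elem in tree:
        key, expired = elem
        if key == new:
            if expired:
                tree.remove(elem)     # gc erased a colliding expired element
                return -errno.EAGAIN  # cached state may be stale: restart
            return -errno.EEXIST
    tree.append((new, False))
    return 0


def insert(tree, new):
    """Retry until the pass completes; return (status, number of restarts)."""
    retries = 0
    while True:
        err = try_insert(tree, new)
        if err != -errno.EAGAIN:
            return err, retries
        retries += 1
```

    Each restart removes one expired element and nothing else can add new
    ones in the meantime, so the number of restarts is bounded by the number
    of expired elements in the tree.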
    
    After this it also becomes feasible to remove the async gc worker
    and perform all garbage collection from the commit path.
    
    Fixes: c9e6978e2725 ("netfilter: nft_set_rbtree: Switch to node list walk for overlap detection")
    Signed-off-by: Florian Westphal <fw@strlen.de>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
netlink: annotate data-races around sk->sk_err [+ + +]
Author: Eric Dumazet <edumazet@google.com>
Date:   Tue Oct 3 18:34:55 2023 +0000

    netlink: annotate data-races around sk->sk_err
    
    [ Upstream commit d0f95894fda7d4f895b29c1097f92d7fee278cb2 ]
    
    syzbot caught another data-race in netlink when
    setting sk->sk_err.
    
    Annotate all of them for good measure.
    
    BUG: KCSAN: data-race in netlink_recvmsg / netlink_recvmsg
    
    write to 0xffff8881613bb220 of 4 bytes by task 28147 on cpu 0:
    netlink_recvmsg+0x448/0x780 net/netlink/af_netlink.c:1994
    sock_recvmsg_nosec net/socket.c:1027 [inline]
    sock_recvmsg net/socket.c:1049 [inline]
    __sys_recvfrom+0x1f4/0x2e0 net/socket.c:2229
    __do_sys_recvfrom net/socket.c:2247 [inline]
    __se_sys_recvfrom net/socket.c:2243 [inline]
    __x64_sys_recvfrom+0x78/0x90 net/socket.c:2243
    do_syscall_x64 arch/x86/entry/common.c:50 [inline]
    do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
    entry_SYSCALL_64_after_hwframe+0x63/0xcd
    
    write to 0xffff8881613bb220 of 4 bytes by task 28146 on cpu 1:
    netlink_recvmsg+0x448/0x780 net/netlink/af_netlink.c:1994
    sock_recvmsg_nosec net/socket.c:1027 [inline]
    sock_recvmsg net/socket.c:1049 [inline]
    __sys_recvfrom+0x1f4/0x2e0 net/socket.c:2229
    __do_sys_recvfrom net/socket.c:2247 [inline]
    __se_sys_recvfrom net/socket.c:2243 [inline]
    __x64_sys_recvfrom+0x78/0x90 net/socket.c:2243
    do_syscall_x64 arch/x86/entry/common.c:50 [inline]
    do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
    entry_SYSCALL_64_after_hwframe+0x63/0xcd
    
    value changed: 0x00000000 -> 0x00000016
    
    Reported by Kernel Concurrency Sanitizer on:
    CPU: 1 PID: 28146 Comm: syz-executor.0 Not tainted 6.6.0-rc3-syzkaller-00055-g9ed22ae6be81 #0
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/06/2023
    
    Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Reported-by: syzbot <syzkaller@googlegroups.com>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Link: https://lore.kernel.org/r/20231003183455.3410550-1-edumazet@google.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

netlink: Fix potential skb memleak in netlink_ack [+ + +]
Author: Tao Chen <chentao.kernel@linux.alibaba.com>
Date:   Sat Nov 5 17:05:04 2022 +0800

    netlink: Fix potential skb memleak in netlink_ack
    
    [ Upstream commit e69761483361f3df455bc493c99af0ef1744a14f ]
    
    Fix coverity issue 'Resource leak'.
    
    We should clean the skb resource if nlmsg_put/append failed.
    
    Fixes: 738136a0e375 ("netlink: split up copies in the ack construction")
    Signed-off-by: Tao Chen <chentao.kernel@linux.alibaba.com>
    Link: https://lore.kernel.org/r/bff442d62c87de6299817fe1897cc5a5694ba9cc.1667638204.git.chentao.kernel@linux.alibaba.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Stable-dep-of: d0f95894fda7 ("netlink: annotate data-races around sk->sk_err")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

netlink: remove the flex array from struct nlmsghdr [+ + +]
Author: Jakub Kicinski <kuba@kernel.org>
Date:   Thu Nov 17 19:39:03 2022 -0800

    netlink: remove the flex array from struct nlmsghdr
    
    commit c73a72f4cbb47672c8cc7f7d7aba52f1cb15baca upstream.
    
    I've added a flex array to struct nlmsghdr in
    commit 738136a0e375 ("netlink: split up copies in the ack construction")
    to allow accessing the data easily. It leads to warnings with clang,
    if user space wraps this structure into another struct and the flex
    array is not at the end of the container.
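    
    A minimal sketch of the situation, assuming a user-space wrapper of the
    kind described above. The `demo_*` names are hypothetical and are not
    the uapi definitions; the real accessor also accounts for NLMSG_HDRLEN
    alignment. Without a trailing flexible array the header can sit in the
    middle of a container, and the payload is reached by pointer arithmetic:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the header once the flex array is gone. */
struct demo_nlmsghdr {
	uint32_t nlmsg_len;
	uint16_t nlmsg_type;
	uint16_t nlmsg_flags;
	uint32_t nlmsg_seq;
	uint32_t nlmsg_pid;
};

/* User-space style wrapper: the header is followed by more fields. */
struct demo_request {
	struct demo_nlmsghdr hdr;
	uint32_t family;
};

/* Reach the payload by pointer arithmetic, not a trailing array member. */
static inline void *demo_nlmsg_data(struct demo_nlmsghdr *nlh)
{
	return (unsigned char *)nlh + sizeof(*nlh);
}
```

    With this layout, demo_nlmsg_data(&req.hdr) points at req.family.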
    
    Reviewed-by: Kees Cook <keescook@chromium.org>
    Reviewed-by: David Ahern <dsahern@kernel.org>
    Link: https://lore.kernel.org/all/20221114023927.GA685@u2004-local/
    Link: https://lore.kernel.org/r/20221118033903.1651026-1-kuba@kernel.org
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

netlink: split up copies in the ack construction [+ + +]
Author: Jakub Kicinski <kuba@kernel.org>
Date:   Thu Oct 27 14:25:53 2022 -0700

    netlink: split up copies in the ack construction
    
    [ Upstream commit 738136a0e3757a8534df3ad97d6ff6d7f429f6c1 ]
    
    Clean up the use of unsafe_memcpy() by adding a flexible array
    at the end of netlink message header and splitting up the header
    and data copies.
    
    Reviewed-by: Kees Cook <keescook@chromium.org>
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Stable-dep-of: d0f95894fda7 ("netlink: annotate data-races around sk->sk_err")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
NFS: Cleanup unused rpc_clnt variable [+ + +]
Author: Benjamin Coddington <bcodding@redhat.com>
Date:   Thu Apr 20 12:17:35 2023 -0400

    NFS: Cleanup unused rpc_clnt variable
    
    [ Upstream commit e025f0a73f6acb920d86549b2177a5883535421d ]
    
    The root rpc_clnt is not used here, clean it up.
    
    Fixes: 4dc73c679114 ("NFSv4: keep state manager thread active if swap is enabled")
    Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
    Reviewed-by: NeilBrown <neilb@suse.de>
    Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
    Stable-dep-of: 956fd46f97d2 ("NFSv4: Fix a state manager thread deadlock regression")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFS: rename nfs_client_kset to nfs_kset [+ + +]
Author: Benjamin Coddington <bcodding@redhat.com>
Date:   Thu Jun 15 14:07:22 2023 -0400

    NFS: rename nfs_client_kset to nfs_kset
    
    [ Upstream commit 8b18a2edecc0741b0eecf8b18fdb356a0f8682de ]
    
    Be brief and match the subsystem name.  There's no need to distinguish this
    kset variable from the server.
    
    Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
    Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
    Stable-dep-of: 956fd46f97d2 ("NFSv4: Fix a state manager thread deadlock regression")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
NFSv4: Fix a nfs4_state_manager() race [+ + +]
Author: Trond Myklebust <trond.myklebust@hammerspace.com>
Date:   Sun Sep 17 19:05:50 2023 -0400

    NFSv4: Fix a nfs4_state_manager() race
    
    [ Upstream commit ed1cc05aa1f7fe8197d300e914afc28ab9818f89 ]
    
    If the NFS4CLNT_RUN_MANAGER flag got set just before we cleared
    NFS4CLNT_MANAGER_RUNNING, then we might have won the race against
    nfs4_schedule_state_manager(), and are responsible for handling the
    recovery situation.
    
    Fixes: aeabb3c96186 ("NFSv4: Fix a NFSv4 state manager deadlock")
    Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
    Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

NFSv4: Fix a state manager thread deadlock regression [+ + +]
Author: Trond Myklebust <trond.myklebust@hammerspace.com>
Date:   Sun Sep 24 13:14:15 2023 -0400

    NFSv4: Fix a state manager thread deadlock regression
    
    [ Upstream commit 956fd46f97d238032cb5fa4771cdaccc6e760f9a ]
    
    Commit 4dc73c679114 reintroduces the deadlock that was fixed by commit
    aeabb3c96186 ("NFSv4: Fix a NFSv4 state manager deadlock") because it
    prevents the setup of new threads to handle reboot recovery, while the
    older recovery thread is stuck returning delegations.
    
    Fixes: 4dc73c679114 ("NFSv4: keep state manager thread active if swap is enabled")
    Cc: stable@vger.kernel.org
    Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
    Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
of: dynamic: Fix potential memory leak in of_changeset_action() [+ + +]
Author: Dan Carpenter <dan.carpenter@linaro.org>
Date:   Fri Sep 8 10:03:50 2023 +0300

    of: dynamic: Fix potential memory leak in of_changeset_action()
    
    commit 55e95bfccf6db8d26a66c46e1de50d53c59a6774 upstream.
    
    Smatch complains that the error path where "action" is invalid leaks
    the "ce" allocation:
        drivers/of/dynamic.c:935 of_changeset_action()
        warn: possible memory leak of 'ce'
    
    Fix this by doing the validation before the allocation.
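    
    The validate-before-allocate ordering can be sketched as follows. This
    is an illustration only: the `demo_*` names and the action range are
    assumptions, not the code in drivers/of/dynamic.c.

```c
#include <errno.h>
#include <stdlib.h>

enum demo_action { DEMO_ATTACH = 1, DEMO_DETACH, DEMO_ACTION_MAX };

struct demo_entry {
	enum demo_action action;
};

static int demo_changeset_action(struct demo_entry **out, unsigned long action)
{
	struct demo_entry *ce;

	/* Validate first: nothing is allocated yet, so nothing can leak. */
	if (action == 0 || action >= DEMO_ACTION_MAX)
		return -EINVAL;

	ce = malloc(sizeof(*ce));
	if (!ce)
		return -ENOMEM;

	ce->action = (enum demo_action)action;
	*out = ce;
	return 0;
}
```

    With the check placed first, the early return has no cleanup to forget.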
    
    Note that there is not any actual problem with upstream kernels. All
    callers of of_changeset_action() are static inlines with fixed action
    values.
    
    Fixes: 914d9d831e61 ("of: dynamic: Refactor action prints to not use "%pOF" inside devtree_lock")
    Reported-by: kernel test robot <lkp@intel.com>
    Closes: https://lore.kernel.org/r/202309011059.EOdr4im9-lkp@intel.com/
    Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
    Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
    Link: https://lore.kernel.org/r/7dfaf999-30ad-491c-9615-fb1138db121c@moroto.mountain
    Signed-off-by: Rob Herring <robh@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
parisc: Fix crash with nr_cpus=1 option [+ + +]
Author: Helge Deller <deller@gmx.de>
Date:   Tue Sep 19 15:26:35 2023 +0200

    parisc: Fix crash with nr_cpus=1 option
    
    commit d3b3c637e4eb8d3bbe53e5692aee66add72f9851 upstream.
    
    John David Anglin reported that giving "nr_cpus=1" on the command
    line causes a crash, while "maxcpus=1" works.
    
    Reported-by: John David Anglin <dave.anglin@bell.net>
    Signed-off-by: Helge Deller <deller@gmx.de>
    Cc: stable@vger.kernel.org # v5.18+
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

parisc: Restore __ldcw_align for PA-RISC 2.0 processors [+ + +]
Author: John David Anglin <dave@parisc-linux.org>
Date:   Tue Sep 19 17:51:40 2023 +0000

    parisc: Restore __ldcw_align for PA-RISC 2.0 processors
    
    commit 914988e099fc658436fbd7b8f240160c352b6552 upstream.
    
    Back in 2005, Kyle McMartin removed the 16-byte alignment for
    ldcw semaphores on PA 2.0 machines (CONFIG_PA20). This broke
    spinlocks on pre PA8800 processors. The main symptom was random
    faults in mmap'd memory (e.g., gcc compilations, etc).
    
    Unfortunately, the errata for this ldcw change is lost.
    
    The issue is the 16-byte alignment required for ldcw semaphore
    instructions can only be reduced to natural alignment when the
    ldcw operation can be handled coherently in cache. Only PA8800
    and PA8900 processors actually support doing the operation in
    cache.
    
    Aligning the spinlock dynamically adds two integer instructions
    to each spinlock.
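    
    A rough sketch of what dynamic alignment means here, assuming the lock
    storage is a four-word array so that a 16-byte-aligned word always lies
    inside it. The `demo_*` names are hypothetical. The add and the mask are
    the two extra integer operations:

```c
#include <stdint.h>

#define DEMO_LDCW_ALIGNMENT 16u

/* Assumed layout: 16 bytes of storage, only 4-byte aligned. */
struct demo_spinlock {
	volatile unsigned int lock[4];
};

/* Round the lock address up to the next 16-byte boundary. */
static volatile unsigned int *demo_ldcw_align(struct demo_spinlock *s)
{
	uintptr_t addr = (uintptr_t)&s->lock[0];

	addr = (addr + DEMO_LDCW_ALIGNMENT - 1) &
	       ~(uintptr_t)(DEMO_LDCW_ALIGNMENT - 1);
	return (volatile unsigned int *)addr;
}
```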
    
    Tested on rp3440, c8000 and a500.
    
    Signed-off-by: John David Anglin <dave.anglin@bell.net>
    Link: https://lore.kernel.org/linux-parisc/6b332788-2227-127f-ba6d-55e99ecf4ed8@bell.net/T/#t
    Link: https://lore.kernel.org/linux-parisc/20050609050702.GB4641@roadwarrior.mcmartin.ca/
    Cc: stable@vger.kernel.org
    Signed-off-by: Helge Deller <deller@gmx.de>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
PCI: qcom: Fix IPQ8074 enumeration [+ + +]
Author: Sricharan Ramabadhran <quic_srichara@quicinc.com>
Date:   Tue Sep 19 15:59:48 2023 +0530

    PCI: qcom: Fix IPQ8074 enumeration
    
    commit 6a878a54d0053ef21f3b829dc267487c2302b012 upstream.
    
    PARF_SLV_ADDR_SPACE_SIZE_2_3_3 is used by qcom_pcie_post_init_2_3_3().
    This PCIe slave address space size register offset is 0x358 but was
    incorrectly changed to 0x16c by 39171b33f652 ("PCI: qcom: Remove PCIE20_
    prefix from register definitions").
    
    This prevented access to slave address space registers like iATU, etc.,
    so the IPQ8074 PCIe controller was not enumerated.
    
    Revert back to the correct 0x358 offset and remove the unused
    PARF_SLV_ADDR_SPACE_SIZE_2_3_3.
    
    Fixes: 39171b33f652 ("PCI: qcom: Remove PCIE20_ prefix from register definitions")
    Link: https://lore.kernel.org/r/20230919102948.1844909-1-quic_srichara@quicinc.com
    Tested-by: Robert Marko <robimarko@gmail.com>
    Signed-off-by: Sricharan Ramabadhran <quic_srichara@quicinc.com>
    [bhelgaas: commit log]
    Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
    Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>
    Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org>
    Cc: stable@vger.kernel.org      # v6.4+
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
perf/x86/amd/core: Fix overflow reset on hotplug [+ + +]
Author: Sandipan Das <sandipan.das@amd.com>
Date:   Thu Sep 14 19:36:04 2023 +0530

    perf/x86/amd/core: Fix overflow reset on hotplug
    
    [ Upstream commit 23d2626b841c2adccdeb477665313c02dff02dc3 ]
    
    Kernels older than v5.19 do not support PerfMonV2 and the PMI handler
    does not clear the overflow bits of the PerfCntrGlobalStatus register.
    Because of this, loading a recent kernel using kexec from an older
    kernel can result in inconsistent register states on Zen 4 systems.
    
    The PMI handler of the new kernel gets confused and shows a warning when
    an overflow occurs because some of the overflow bits are set even if the
    corresponding counters are inactive. These are remnants from overflows
    that were handled by the older kernel.
    
    During CPU hotplug, the PerfCntrGlobalCtl and PerfCntrGlobalStatus
    registers should always be cleared for PerfMonV2-capable processors.
    However, a condition used for NB event constraints applicable only to
    older processors currently prevents this from happening. Move the reset
    sequence to an appropriate place and also clear the LBR Freeze bit.
    
    Fixes: 21d59e3e2c40 ("perf/x86/amd/core: Detect PerfMonV2 support")
    Signed-off-by: Sandipan Das <sandipan.das@amd.com>
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
    Link: https://lore.kernel.org/r/882a87511af40792ba69bb0e9026f19a2e71e8a3.1694696888.git.sandipan.das@amd.com
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
perf/x86/amd: Do not WARN() on every IRQ [+ + +]
Author: Breno Leitao <leitao@debian.org>
Date:   Thu Sep 14 19:58:40 2023 +0530

    perf/x86/amd: Do not WARN() on every IRQ
    
    [ Upstream commit 599522d9d2e19d6240e4312577f1c5f3ffca22f6 ]
    
    Zen 4 systems running buggy microcode can hit a WARN_ON() in the PMI
    handler, as shown below, several times while perf runs. A simple
    `perf top` run is enough to render the system unusable:
    
      WARNING: CPU: 18 PID: 20608 at arch/x86/events/amd/core.c:944 amd_pmu_v2_handle_irq+0x1be/0x2b0
    
    This happens because the Performance Counter Global Status Register
    (PerfCntGlobalStatus) has one or more bits set which are considered
    reserved according to the "AMD64 Architecture Programmer’s Manual,
    Volume 2: System Programming, 24593":
    
      https://www.amd.com/system/files/TechDocs/24593.pdf
    
    To make this less intrusive, warn just once if any reserved bit is set
    and prompt the user to update the microcode. Also sanitize the value to
    what the code is handling, so that the overflow events continue to be
    handled for the number of counters that are known to be sane.
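    
    The warn-once-and-sanitize idea can be sketched in user space like this.
    It is not the PMI handler: `demo_sanitize_status`, `known_mask` and the
    message text are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

static unsigned int demo_warn_count;

/* Drop bits outside known_mask; complain only the first time any are seen. */
static uint64_t demo_sanitize_status(uint64_t status, uint64_t known_mask)
{
	static bool warned;
	uint64_t reserved = status & ~known_mask;

	if (reserved && !warned) {
		warned = true;
		demo_warn_count++;
		fprintf(stderr,
			"reserved status bits set (%#llx), please update microcode\n",
			(unsigned long long)reserved);
	}

	/* Keep handling overflows for the counters that are known to be sane. */
	return status & known_mask;
}
```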
    
    Going forward, the following microcode patch levels are recommended
    for Zen 4 processors in order to avoid such issues with reserved bits:
    
      Family=0x19 Model=0x11 Stepping=0x01: Patch=0x0a10113e
      Family=0x19 Model=0x11 Stepping=0x02: Patch=0x0a10123e
      Family=0x19 Model=0xa0 Stepping=0x01: Patch=0x0aa00116
      Family=0x19 Model=0xa0 Stepping=0x02: Patch=0x0aa00212
    
    Commit f2eb058afc57 ("linux-firmware: Update AMD cpu microcode") from
    the linux-firmware tree has binaries that meet the minimum required
    patch levels.
    
      [ sandipan: - add message to prompt users to update microcode
                  - rework commit message and call out required microcode levels ]
    
    Fixes: 7685665c390d ("perf/x86/amd/core: Add PerfMonV2 overflow handling")
    Reported-by: Jirka Hladky <jhladky@redhat.com>
    Signed-off-by: Breno Leitao <leitao@debian.org>
    Signed-off-by: Sandipan Das <sandipan.das@amd.com>
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
    Link: https://lore.kernel.org/all/3540f985652f41041e54ee82aa53e7dbd55739ae.1694696888.git.sandipan.das@amd.com/
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
ptp: ocp: Fix error handling in ptp_ocp_device_init [+ + +]
Author: Dinghao Liu <dinghao.liu@zju.edu.cn>
Date:   Fri Sep 22 17:40:44 2023 +0800

    ptp: ocp: Fix error handling in ptp_ocp_device_init
    
    [ Upstream commit caa0578c1d487d39e4bb947a1b4965417053b409 ]
    
    When device_add() fails, ptp_ocp_dev_release() will be called
    after put_device(). Therefore, it seems that the
    ptp_ocp_dev_release() before put_device() is redundant.
    
    Fixes: 773bda964921 ("ptp: ocp: Expose various resources on the timecard.")
    Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn>
    Reviewed-by: Vadim Feodrenko <vadim.fedorenko@linux.dev>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
qed/red_ll2: Fix undefined behavior bug in struct qed_ll2_info [+ + +]
Author: Gustavo A. R. Silva <gustavoars@kernel.org>
Date:   Sat Sep 23 19:15:59 2023 -0600

    qed/red_ll2: Fix undefined behavior bug in struct qed_ll2_info
    
    commit eea03d18af9c44235865a4bc9bec4d780ef6cf21 upstream.
    
    The flexible structure (a structure that contains a flexible-array member
    at the end) `qed_ll2_tx_packet` is nested within the second layer of
    `struct qed_ll2_info`:
    
    struct qed_ll2_tx_packet {
            ...
            /* Flexible Array of bds_set determined by max_bds_per_packet */
            struct {
                    struct core_tx_bd *txq_bd;
                    dma_addr_t tx_frag;
                    u16 frag_len;
            } bds_set[];
    };
    
    struct qed_ll2_tx_queue {
            ...
            struct qed_ll2_tx_packet cur_completing_packet;
    };
    
    struct qed_ll2_info {
            ...
            struct qed_ll2_tx_queue tx_queue;
            struct qed_ll2_cbs cbs;
    };
    
    The problem is that member `cbs` in `struct qed_ll2_info` is placed just
    after an object of type `struct qed_ll2_tx_queue`, which is in itself
    an implicit flexible structure, which by definition ends in a flexible
    array member, in this case `bds_set`. This causes an undefined behavior
    bug at run-time when dynamic memory is allocated for `bds_set`, which
    could lead to a serious issue if `cbs` in `struct qed_ll2_info` is
    overwritten by the contents of `bds_set`. Notice that the type of `cbs`
    is a structure full of function pointers (and a cookie :) ):
    
    include/linux/qed/qed_ll2_if.h:
    typedef
    void (*qed_ll2_complete_rx_packet_cb)(void *cxt,
                                          struct qed_ll2_comp_rx_data *data);
    
    typedef
    void (*qed_ll2_release_rx_packet_cb)(void *cxt,
                                         u8 connection_handle,
                                         void *cookie,
                                         dma_addr_t rx_buf_addr,
                                         bool b_last_packet);
    
    typedef
    void (*qed_ll2_complete_tx_packet_cb)(void *cxt,
                                          u8 connection_handle,
                                          void *cookie,
                                          dma_addr_t first_frag_addr,
                                          bool b_last_fragment,
                                          bool b_last_packet);
    
    typedef
    void (*qed_ll2_release_tx_packet_cb)(void *cxt,
                                         u8 connection_handle,
                                         void *cookie,
                                         dma_addr_t first_frag_addr,
                                         bool b_last_fragment, bool b_last_packet);
    
    typedef
    void (*qed_ll2_slowpath_cb)(void *cxt, u8 connection_handle,
                                u32 opaque_data_0, u32 opaque_data_1);
    
    struct qed_ll2_cbs {
            qed_ll2_complete_rx_packet_cb rx_comp_cb;
            qed_ll2_release_rx_packet_cb rx_release_cb;
            qed_ll2_complete_tx_packet_cb tx_comp_cb;
            qed_ll2_release_tx_packet_cb tx_release_cb;
            qed_ll2_slowpath_cb slowpath_cb;
            void *cookie;
    };
    
    Fix this by moving the declaration of `cbs` to the middle of its
    containing structure `qed_ll2_info`, preventing it from being
    overwritten by the contents of `bds_set` at run-time.
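    
    The overlap can be illustrated with offsets alone. The sketch below uses
    simplified, hypothetical `demo_*` types and a one-element array (the
    2017 form of `bds_set`) so that it compiles as standard C. It only
    computes where element 1 would land and performs no out-of-bounds write:

```c
#include <stddef.h>

struct demo_bd {
	void *txq_bd;
	unsigned long tx_frag;
	unsigned short frag_len;
};

struct demo_tx_packet {
	int bd_used;
	struct demo_bd bds_set[1];	/* treated as variable length at run time */
};

struct demo_cbs {
	void (*tx_comp_cb)(void);
	void *cookie;
};

/* Problematic layout: callbacks sit right behind the growing array. */
struct demo_info_bad {
	struct demo_tx_packet cur_completing_packet;
	struct demo_cbs cbs;
};

/* Safer layout: nothing follows the variable-length tail. */
struct demo_info_good {
	struct demo_cbs cbs;
	struct demo_tx_packet cur_completing_packet;
};

/* Byte offset of bds_set[idx] from the start of the packet. */
static size_t demo_bds_offset(size_t idx)
{
	return offsetof(struct demo_tx_packet, bds_set) +
	       idx * sizeof(struct demo_bd);
}
```

    In the first layout, demo_bds_offset(1) reaches the start of `cbs`, so
    a second descriptor would land on the function pointers.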
    
    This bug was introduced in 2017, when `bds_set` was converted to a
    one-element array, and started to be used as a Variable Length Object
    (VLO) at run-time.
    
    Fixes: f5823fe6897c ("qed: Add ll2 option to limit the number of bds per packet")
    Cc: stable@vger.kernel.org
    Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
    Reviewed-by: Kees Cook <keescook@chromium.org>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Link: https://lore.kernel.org/r/ZQ+Nz8DfPg56pIzr@work
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
rbd: decouple header read-in from updating rbd_dev->header [+ + +]
Author: Ilya Dryomov <idryomov@gmail.com>
Date:   Thu Oct 5 11:59:33 2023 +0200

    rbd: decouple header read-in from updating rbd_dev->header
    
    commit 510a7330c82a7754d5df0117a8589e8a539067c7 upstream.
    
    Make rbd_dev_header_info() populate a passed struct rbd_image_header
    instead of rbd_dev->header and introduce rbd_dev_update_header() for
    updating mutable fields in rbd_dev->header upon refresh.  The initial
    read-in of both mutable and immutable fields in rbd_dev_image_probe()
    passes in rbd_dev->header so no update step is required there.
    
    rbd_init_layout() is now called directly from rbd_dev_image_probe()
    instead of individually in format 1 and format 2 implementations.
    
    Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
    Reviewed-by: Dongsheng Yang <dongsheng.yang@easystack.cn>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

rbd: decouple parent info read-in from updating rbd_dev [+ + +]
Author: Ilya Dryomov <idryomov@gmail.com>
Date:   Thu Oct 5 11:59:34 2023 +0200

    rbd: decouple parent info read-in from updating rbd_dev
    
    commit c10311776f0a8ddea2276df96e255625b07045a8 upstream.
    
    Unlike header read-in, parent info read-in is already decoupled in
    get_parent_info(), but it's buried in rbd_dev_v2_parent_info() along
    with the processing logic.
    
    Separate the initial read-in and update read-in logic into
    rbd_dev_setup_parent() and rbd_dev_update_parent() respectively and
    have rbd_dev_v2_parent_info() just populate struct parent_image_info
    (i.e. what get_parent_info() did).  Some existing QoI issues, like
    flatten of a standalone clone being disregarded on refresh, remain.
    
    Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
    Reviewed-by: Dongsheng Yang <dongsheng.yang@easystack.cn>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

rbd: move rbd_dev_refresh() definition [+ + +]
Author: Ilya Dryomov <idryomov@gmail.com>
Date:   Thu Oct 5 11:59:32 2023 +0200

    rbd: move rbd_dev_refresh() definition
    
    commit 0b035401c57021fc6c300272cbb1c5a889d4fe45 upstream.
    
    Move rbd_dev_refresh() definition further down to avoid having to
    move struct parent_image_info definition in the next commit.  This
    spares some forward declarations too.
    
    Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
    Reviewed-by: Dongsheng Yang <dongsheng.yang@easystack.cn>
    [idryomov@gmail.com: backport to 5.10-6.1: context]
    Signed-off-by: Sasha Levin <sashal@kernel.org>

rbd: take header_rwsem in rbd_dev_refresh() only when updating [+ + +]
Author: Ilya Dryomov <idryomov@gmail.com>
Date:   Thu Oct 5 11:59:35 2023 +0200

    rbd: take header_rwsem in rbd_dev_refresh() only when updating
    
    commit 0b207d02bd9ab8dcc31b262ca9f60dbc1822500d upstream.
    
    rbd_dev_refresh() has been holding header_rwsem across header and
    parent info read-in unnecessarily for ages.  With commit 870611e4877e
    ("rbd: get snapshot context after exclusive lock is ensured to be
    held"), the potential for deadlocks became much more real owning to
    a) header_rwsem now nesting inside lock_rwsem and b) rw_semaphores
    not allowing new readers after a writer is registered.
    
    For example, assuming that I/O request 1, I/O request 2 and header
    read-in request all target the same OSD:
    
    1. I/O request 1 comes in and gets submitted
    2. watch error occurs
    3. rbd_watch_errcb() takes lock_rwsem for write, clears owner_cid and
       releases lock_rwsem
    4. after reestablishing the watch, rbd_reregister_watch() calls
       rbd_dev_refresh() which takes header_rwsem for write and submits
       a header read-in request
    5. I/O request 2 comes in: after taking lock_rwsem for read in
       __rbd_img_handle_request(), it blocks trying to take header_rwsem
       for read in rbd_img_object_requests()
    6. another watch error occurs
    7. rbd_watch_errcb() blocks trying to take lock_rwsem for write
    8. I/O request 1 completion is received by the messenger but can't be
       processed because lock_rwsem won't be granted anymore
    9. header read-in request completion can't be received, let alone
       processed, because the messenger is stranded
    
    Change rbd_dev_refresh() to take header_rwsem only for actually
    updating rbd_dev->header.  Header and parent info read-in don't need
    any locking.
    
    Cc: stable@vger.kernel.org # 0b035401c570: rbd: move rbd_dev_refresh() definition
    Cc: stable@vger.kernel.org # 510a7330c82a: rbd: decouple header read-in from updating rbd_dev->header
    Cc: stable@vger.kernel.org # c10311776f0a: rbd: decouple parent info read-in from updating rbd_dev
    Cc: stable@vger.kernel.org
    Fixes: 870611e4877e ("rbd: get snapshot context after exclusive lock is ensured to be held")
    Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
    Reviewed-by: Dongsheng Yang <dongsheng.yang@easystack.cn>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
RDMA/cma: Fix truncation compilation warning in make_cma_ports [+ + +]
Author: Leon Romanovsky <leon@kernel.org>
Date:   Mon Sep 11 15:18:06 2023 +0300

    RDMA/cma: Fix truncation compilation warning in make_cma_ports
    
    commit 18126c767658ae8a831257c6cb7776c5ba5e7249 upstream.
    
    The following compilation error is a false alarm, as RDMA devices don't
    have enough ports to actually cause format truncation.
    
    drivers/infiniband/core/cma_configfs.c: In function ‘make_cma_ports’:
    drivers/infiniband/core/cma_configfs.c:223:57: error: ‘snprintf’ output may be truncated before the last format character [-Werror=format-truncation=]
      223 |                 snprintf(port_str, sizeof(port_str), "%u", i + 1);
          |                                                         ^
    drivers/infiniband/core/cma_configfs.c:223:17: note: ‘snprintf’ output between 2 and 11 bytes into a destination of size 10
      223 |                 snprintf(port_str, sizeof(port_str), "%u", i + 1);
          |                 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    cc1: all warnings being treated as errors
    make[5]: *** [scripts/Makefile.build:243: drivers/infiniband/core/cma_configfs.o] Error 1
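    
    The arithmetic behind the diagnostic, as a small sketch assuming a
    32-bit unsigned int: "%u" can produce up to 10 digits plus the
    terminating NUL, i.e. 11 bytes, one more than the 10-byte destination.
    The `demo_*` names are hypothetical, and sizing the buffer for the worst
    case is only one way to address such a warning.

```c
#include <stdio.h>

/* 10 digits of 4294967295 plus the terminating NUL. */
#define DEMO_PORT_STR_LEN 11

/* Returns 0 if the whole number fit into buf, -1 if it was truncated. */
static int demo_format_port(char *buf, size_t len, unsigned int port)
{
	int n = snprintf(buf, len, "%u", port);

	/* snprintf() returns the length it wanted to write. */
	return (n >= 0 && (size_t)n < len) ? 0 : -1;
}
```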
    
    Fixes: 045959db65c6 ("IB/cma: Add configfs for rdma_cm")
    Link: https://lore.kernel.org/r/a7e3b347ee134167fa6a3787c56ef231a04bc8c2.1694434639.git.leonro@nvidia.com
    Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

RDMA/cma: Initialize ib_sa_multicast structure to 0 when join [+ + +]
Author: Mark Zhang <markzhang@nvidia.com>
Date:   Wed Sep 27 12:05:11 2023 +0300

    RDMA/cma: Initialize ib_sa_multicast structure to 0 when join
    
    commit e0fe97efdb00f0f32b038a4836406a82886aec9c upstream.
    
    Initialize the structure to 0 so that its fields won't have random
    values. For example, fields like rec.traffic_class (as well as
    rec.flow_label and rec.sl) are used to generate the user AH through:
      cma_iboe_join_multicast
        cma_make_mc_event
          ib_init_ah_from_mcmember
    
    And a random traffic_class causes a random IP DSCP in RoCEv2.
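    
    A user-space sketch of the pattern, with hypothetical `demo_*` types
    standing in for the real structures: zero-initializing the on-stack
    object means fields that are never assigned read back as 0 instead of
    stack garbage.

```c
#include <string.h>

struct demo_mcmember_rec {
	unsigned char traffic_class;
	unsigned int flow_label;
	unsigned char sl;
};

struct demo_multicast {
	struct demo_mcmember_rec rec;
	void *context;
};

static void demo_join(struct demo_multicast *out, void *context)
{
	/* Every member starts at zero; only context is set explicitly. */
	struct demo_multicast ib = { 0 };

	ib.context = context;
	*out = ib;
}
```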
    
    Fixes: b5de0c60cc30 ("RDMA/cma: Fix use after free race in roce multicast join")
    Signed-off-by: Mark Zhang <markzhang@nvidia.com>
    Link: https://lore.kernel.org/r/20230927090511.603595-1-markzhang@nvidia.com
    Signed-off-by: Leon Romanovsky <leon@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
RDMA/core: Require admin capabilities to set system parameters [+ + +]
Author: Leon Romanovsky <leon@kernel.org>
Date:   Wed Oct 4 21:17:49 2023 +0300

    RDMA/core: Require admin capabilities to set system parameters
    
    commit c38d23a54445f9a8aa6831fafc9af0496ba02f9e upstream.
    
    Like any other set command, require admin permissions to do it.
    
    Cc: stable@vger.kernel.org
    Fixes: 2b34c5580226 ("RDMA/core: Add command to set ib_core device net namspace sharing mode")
    Link: https://lore.kernel.org/r/75d329fdd7381b52cbdf87910bef16c9965abb1f.1696443438.git.leon@kernel.org
    Reviewed-by: Parav Pandit <parav@nvidia.com>
    Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
RDMA/mlx5: Fix mutex unlocking on error flow for steering anchor creation [+ + +]
Author: Hamdan Igbaria <hamdani@nvidia.com>
Date:   Wed Sep 20 13:01:55 2023 +0300

    RDMA/mlx5: Fix mutex unlocking on error flow for steering anchor creation
    
    commit 2fad8f06a582cd431d398a0b3f9be21d069603ab upstream.
    
    The mutex was not unlocked on some of the error flows.
    Moved the unlock location to include all the error flow scenarios.
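    
    One common way to make every error flow pass through the unlock is a
    single exit label. A minimal user-space sketch with a POSIX mutex; the
    `demo_*` names and the `fail_step` parameter are illustrative
    assumptions, not the mlx5 code:

```c
#include <errno.h>
#include <pthread.h>

struct demo_steering {
	pthread_mutex_t lock;
	int refcount;
};

/* fail_step simulates which setup step fails (0 means none). */
static int demo_create_anchor(struct demo_steering *st, int fail_step)
{
	int err = 0;

	pthread_mutex_lock(&st->lock);

	if (fail_step == 1) {
		err = -ENOMEM;
		goto unlock;
	}
	if (fail_step == 2) {
		err = -EINVAL;
		goto unlock;
	}

	st->refcount++;

unlock:
	/* Success and every error path leave through here. */
	pthread_mutex_unlock(&st->lock);
	return err;
}
```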
    
    Fixes: e1f4a52ac171 ("RDMA/mlx5: Create an indirect flow table for steering anchor")
    Reviewed-by: Mark Bloch <mbloch@nvidia.com>
    Signed-off-by: Hamdan Igbaria <hamdani@nvidia.com>
    Link: https://lore.kernel.org/r/1244a69d783da997c0af0b827c622eb00495492e.1695203958.git.leonro@nvidia.com
    Signed-off-by: Leon Romanovsky <leon@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

RDMA/mlx5: Fix NULL string error [+ + +]
Author: Shay Drory <shayd@nvidia.com>
Date:   Wed Sep 20 13:01:56 2023 +0300

    RDMA/mlx5: Fix NULL string error
    
    commit dab994bcc609a172bfdab15a0d4cb7e50e8b5458 upstream.
    
    checkpatch is complaining about a NULL string; change it to 'Unknown'.
    
    Fixes: 37aa5c36aa70 ("IB/mlx5: Add UARs write-combining and non-cached mapping")
    Signed-off-by: Shay Drory <shayd@nvidia.com>
    Link: https://lore.kernel.org/r/8638e5c14fadbde5fa9961874feae917073af920.1695203958.git.leonro@nvidia.com
    Signed-off-by: Leon Romanovsky <leon@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
RDMA/siw: Fix connection failure handling [+ + +]
Author: Bernard Metzler <bmt@zurich.ibm.com>
Date:   Tue Sep 5 16:58:22 2023 +0200

    RDMA/siw: Fix connection failure handling
    
    commit 53a3f777049771496f791504e7dc8ef017cba590 upstream.
    
    In case immediate MPA request processing fails, the newly
    created endpoint unlinks the listening endpoint and is
    ready to be dropped. This special case was not handled
    correctly by the code handling the later TCP socket close,
    causing a NULL dereference crash in siw_cm_work_handler()
    when dereferencing a NULL listener. We now also cancel
    the useless MPA timeout, if immediate MPA request
    processing fails.
    
    This patch furthermore simplifies MPA processing in general:
    Scheduling a useless TCP socket read in sk_data_ready() upcall
    is now suppressed, if the socket is already moved out of
    TCP_ESTABLISHED state.
    
    Fixes: 6c52fdc244b5 ("rdma/siw: connection management")
    Signed-off-by: Bernard Metzler <bmt@zurich.ibm.com>
    Link: https://lore.kernel.org/r/20230905145822.446263-1-bmt@zurich.ibm.com
    Signed-off-by: Leon Romanovsky <leon@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
RDMA/srp: Do not call scsi_done() from srp_abort() [+ + +]
Author: Bart Van Assche <bvanassche@acm.org>
Date:   Wed Aug 23 13:57:27 2023 -0700

    RDMA/srp: Do not call scsi_done() from srp_abort()
    
    commit e193b7955dfad68035b983a0011f4ef3590c85eb upstream.
    
    After scmd_eh_abort_handler() has called the SCSI LLD eh_abort_handler
    callback, it performs one of the following actions:
    * Call scsi_queue_insert().
    * Call scsi_finish_command().
    * Call scsi_eh_scmd_add().
    Hence, SCSI abort handlers must not call scsi_done(). Otherwise all
    the above actions would trigger a use-after-free. Hence remove the
    scsi_done() call from srp_abort(). Keep the srp_free_req() call
    before returning SUCCESS because we may not see the command again if
    SUCCESS is returned.
    
    Cc: Bob Pearson <rpearsonhpe@gmail.com>
    Cc: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com>
    Fixes: d8536670916a ("IB/srp: Avoid having aborted requests hang")
    Signed-off-by: Bart Van Assche <bvanassche@acm.org>
    Link: https://lore.kernel.org/r/20230823205727.505681-1-bvanassche@acm.org
    Signed-off-by: Leon Romanovsky <leon@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
RDMA/uverbs: Fix typo of sizeof argument [+ + +]
Author: Konstantin Meskhidze <konstantin.meskhidze@huawei.com>
Date:   Tue Sep 5 18:32:58 2023 +0800

    RDMA/uverbs: Fix typo of sizeof argument
    
    commit c489800e0d48097fc6afebd862c6afa039110a36 upstream.
    
    Since the size of the 'hdr' pointer and of the '*hdr' structure is equal
    on 64-bit machines, the issue probably didn't cause any wrong behavior.
    The typo still needs to be fixed.
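    
    The difference between the two sizeof forms, as a sketch. The structure
    below is a hypothetical 8-byte header chosen to mirror the situation
    described, not the uverbs definition:

```c
#include <stddef.h>
#include <stdint.h>

struct demo_cmd_hdr {
	uint32_t command;
	uint16_t in_words;
	uint16_t out_words;
};

/* Typo form: sizeof(hdr) is the size of a pointer (4 or 8 bytes). */
static int demo_check_typo(const struct demo_cmd_hdr *hdr, size_t count)
{
	return count >= sizeof(hdr);
}

/* Intended form: sizeof(*hdr) is the size of the structure (8 bytes). */
static int demo_check_fixed(const struct demo_cmd_hdr *hdr, size_t count)
{
	return count >= sizeof(*hdr);
}
```

    On a 64-bit build both checks agree; on a 32-bit build the first one
    would accept a 4-byte buffer that cannot hold the header.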
    
    Fixes: da0f60df7bd5 ("RDMA/uverbs: Prohibit write() calls with too small buffers")
    Co-developed-by: Ivanov Mikhail <ivanov.mikhail1@huawei-partners.com>
    Signed-off-by: Ivanov Mikhail <ivanov.mikhail1@huawei-partners.com>
    Signed-off-by: Konstantin Meskhidze <konstantin.meskhidze@huawei.com>
    Link: https://lore.kernel.org/r/20230905103258.1738246-1-konstantin.meskhidze@huawei.com
    Signed-off-by: Leon Romanovsky <leon@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
regmap: rbtree: Fix wrong register marked as in-cache when creating new node [+ + +]
Author: Richard Fitzgerald <rf@opensource.cirrus.com>
Date:   Fri Sep 22 16:37:11 2023 +0100

    regmap: rbtree: Fix wrong register marked as in-cache when creating new node
    
    [ Upstream commit 7a795ac8d49e2433e1b97caf5e99129daf8e1b08 ]
    
    When regcache_rbtree_write() creates a new rbtree_node it was passing the
    wrong bit number to regcache_rbtree_set_register(). The bit number is the
    offset __in number of registers__, but in the case of creating a new block
    regcache_rbtree_write() was not dividing by the address stride to get the
    number of registers.
    
    Fix this by dividing by map->reg_stride.
    Compare with regcache_rbtree_read() where the bit is checked.
    
    This bug meant that the wrong register was marked as present. The register
    that was written to the cache could not be read from the cache because it
    was not marked as cached. But a nearby register could be marked as having
    a cached value even if it was never written to the cache.
    
    Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
    Fixes: 3f4ff561bc88 ("regmap: rbtree: Make cache_present bitmap per node")
    Link: https://lore.kernel.org/r/20230922153711.28103-1-rf@opensource.cirrus.com
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
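
The bit-number mix-up described in this commit can be sketched in a few
lines of Python. The function names, the base address and the stride are
hypothetical; only the division by the register stride comes from the
commit message.

```python
def bit_buggy(reg, base_reg, reg_stride):
    """Pre-fix behaviour: the address offset is used directly as the bit number."""
    return reg - base_reg


def bit_fixed(reg, base_reg, reg_stride):
    """Post-fix behaviour: the offset is converted to a number of registers."""
    return (reg - base_reg) // reg_stride


# Hypothetical node starting at address 0x100 with a 4-byte register stride.
base_reg, reg_stride = 0x100, 4
reg = 0x108  # third register in the node

print(bit_buggy(reg, base_reg, reg_stride))  # 8
print(bit_fixed(reg, base_reg, reg_stride))  # 2
```

With the buggy index, bit 8 is marked present instead of bit 2, so the
register that was written looks uncached while an unrelated one looks
cached.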

 
regulator/core: regulator_register: set device->class earlier [+ + +]
Author: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Date:   Tue Sep 19 00:50:26 2023 +0200

    regulator/core: regulator_register: set device->class earlier
    
    [ Upstream commit 8adb4e647a83cb5928c05dae95b010224aea0705 ]
    
    Commit d3c731564e09 ("regulator: plug of_node leak in
    regulator_register()'s error path") fixed a memory leak by moving the
    device_initialize() call earlier, but did not move the `dev->class`
    initialization.  The bug was spotted and fixed by reverting part of
    the commit (in commit 5f4b204b6b81 "regulator: core: fix kobject
    release warning and memory leak in regulator_register()"), but that
    introduced a different bug: early error paths now use `kfree(dev)`
    instead of `put_device()` for an already initialized `struct device`.
    
    Move the missing assignments to just after `device_initialize()`.
    
    Fixes: d3c731564e09 ("regulator: plug of_node leak in regulator_register()'s error path")
    Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
    Link: https://lore.kernel.org/r/b5b19cb458c40c9d02f3d5a7bd1ba7d97ba17279.1695077303.git.mirq-linux@rere.qmqm.pl
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
regulator: mt6358: Drop *_SSHUB regulators [+ + +]
Author: Chen-Yu Tsai <wenst@chromium.org>
Date:   Fri Jun 9 16:30:01 2023 +0800

    regulator: mt6358: Drop *_SSHUB regulators
    
    [ Upstream commit 04ba665248ed91576d326041108e5fc2ec2254eb ]
    
    The *_SSHUB regulators are actually alternate configuration interfaces
    for their non *_SSHUB counterparts. They are not separate regulator
    outputs. These registers are intended for the companion processor to
    use to configure the power rails while the main processor is sleeping.
    They are not intended for the main operating system to use.
    
    Since they are not real outputs they shouldn't be modeled separately.
    Remove them. Luckily no device tree actually uses them.
    
    Signed-off-by: Chen-Yu Tsai <wenst@chromium.org>
    Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
    Reviewed-by: Matthias Brugger <matthias.bgg@gmail.com>
    Link: https://lore.kernel.org/r/20230609083009.2822259-5-wenst@chromium.org
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Stable-dep-of: 7e37c851374e ("regulator: mt6358: split ops for buck and linear range LDO regulators")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

regulator: mt6358: split ops for buck and linear range LDO regulators [+ + +]
Author: Chen-Yu Tsai <wenst@chromium.org>
Date:   Wed Sep 20 16:53:34 2023 +0800

    regulator: mt6358: split ops for buck and linear range LDO regulators
    
    [ Upstream commit 7e37c851374eca2d1f6128de03195c9f7b4baaf2 ]
    
    The buck and linear range LDO (VSRAM_*) regulators share one set of ops.
    This set includes support for get/set mode. However this only makes
    sense for buck regulators, not LDOs. The callbacks were not checking
    whether the register offset and/or mask for mode setting was valid or
    not. This ends up making the kernel report "normal" mode operation for
    the LDOs.
    
    Create a new set of ops without the get/set mode callbacks for the
    linear range LDO regulators.
    
    Fixes: f67ff1bd58f0 ("regulator: mt6358: Add support for MT6358 regulator")
    Signed-off-by: Chen-Yu Tsai <wenst@chromium.org>
    Link: https://lore.kernel.org/r/20230920085336.136238-1-wenst@chromium.org
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

regulator: mt6358: Use linear voltage helpers for single range regulators [+ + +]
Author: Chen-Yu Tsai <wenst@chromium.org>
Date:   Fri Jun 9 16:30:03 2023 +0800

    regulator: mt6358: Use linear voltage helpers for single range regulators
    
    [ Upstream commit ea861df772fd8cca715d43f62fe13c09c975f7a2 ]
    
    Some of the regulators on the MT6358/MT6366 PMICs have just one linear
    voltage range. These are the buck regulators and VSRAM_* LDOs. Currently
    they are modeled with one linear range, but also have their minimum,
    maximum, and step voltage described.
    
    Convert them to the linear voltage helpers. These helpers are a bit
    simpler, and we can also drop the linear range definitions. Also reflow
    the touched lines now that they are shorter.
    
    Signed-off-by: Chen-Yu Tsai <wenst@chromium.org>
    Link: https://lore.kernel.org/r/20230609083009.2822259-7-wenst@chromium.org
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Stable-dep-of: 7e37c851374e ("regulator: mt6358: split ops for buck and linear range LDO regulators")
    Signed-off-by: Sasha Levin <sashal@kernel.org>
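
As a rough illustration of what a single linear voltage range means, the
mapping from selector to voltage needs only a minimum, a step and a
count. This is a simplified Python sketch, not the kernel helper itself,
and the numbers are hypothetical rather than MT6358 values.

```python
import errno


def list_voltage_linear(selector, min_uV, uV_step, n_voltages):
    """Map a selector to a voltage for a regulator with one linear range."""
    if selector < 0 or selector >= n_voltages:
        return -errno.EINVAL
    return min_uV + uV_step * selector


# Hypothetical regulator: 500 mV minimum, 6.25 mV steps, 128 selectors.
print(list_voltage_linear(0, 500_000, 6_250, 128))    # 500000
print(list_voltage_linear(16, 500_000, 6_250, 128))   # 600000
```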

 
Revert "NFSv4: Retry LOCK on OLD_STATEID during delegation return" [+ + +]
Author: Benjamin Coddington <bcodding@redhat.com>
Date:   Tue Jun 27 14:31:49 2023 -0400

    Revert "NFSv4: Retry LOCK on OLD_STATEID during delegation return"
    
    [ Upstream commit 5b4a82a0724af1dfd1320826e0266117b6a57fbd ]
    
    Olga Kornievskaia reports that this patch breaks NFSv4.0 state recovery.
    It also introduces additional complexity in the error paths for cases not
    related to the original problem.  Let's revert it for now, and address the
    original problem in another manner.
    
    This reverts commit f5ea16137a3fa2858620dc9084466491c128535f.
    
    Fixes: f5ea16137a3f ("NFSv4: Retry LOCK on OLD_STATEID during delegation return")
    Reported-by: Kornievskaia, Olga <Olga.Kornievskaia@netapp.com>
    Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
    Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
ring-buffer: Fix bytes info in per_cpu buffer stats [+ + +]
Author: Zheng Yejian <zhengyejian1@huawei.com>
Date:   Thu Sep 21 20:54:25 2023 +0800

    ring-buffer: Fix bytes info in per_cpu buffer stats
    
    [ Upstream commit 45d99ea451d0c30bfd4864f0fe485d7dac014902 ]
    
    The 'bytes' info in file 'per_cpu/cpu<X>/stats' means the number of
    bytes in cpu buffer that have not been consumed. However, currently
    after consuming data by reading file 'trace_pipe', the 'bytes' info
    was not changed as expected.
    
      # cat per_cpu/cpu0/stats
      entries: 0
      overrun: 0
      commit overrun: 0
      bytes: 568             <--- 'bytes' is wrong: all events were read
      oldest event ts:  8651.371479
      now ts:  8653.912224
      dropped events: 0
      read events: 8
    
    The root cause is incorrect accounting of cpu_buffer->read_bytes. To fix it:
      1. When accounting 'read_bytes', count consumed events in
         rb_advance_reader();
      2. When accounting 'entries_bytes', exclude the discarded padding event
         that is smaller than the minimum size, because it is invisible to the
         reader. Then use rb_page_commit() instead of BUF_PAGE_SIZE where
         page-based read/remove/overrun is accounted.
    
    Also correct the comments of ring_buffer_bytes_cpu() in this patch.
    
    Link: https://lore.kernel.org/linux-trace-kernel/20230921125425.1708423-1-zhengyejian1@huawei.com
    
    Cc: stable@vger.kernel.org
    Fixes: c64e148a3be3 ("trace: Add ring buffer stats to measure rate of events")
    Signed-off-by: Zheng Yejian <zhengyejian1@huawei.com>
    Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
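
The accounting problem in this commit can be pictured with a toy model in
Python. It assumes that the reported 'bytes' value is the difference
between bytes written and bytes read; the class, its methods and the
71-byte event size are hypothetical and chosen only to keep the totals
easy to follow.

```python
class CpuBufferStats:
    """Toy model of the per-CPU counters behind the 'bytes' stat."""

    def __init__(self, account_consumed):
        self.entries_bytes = 0   # bytes written into the buffer
        self.read_bytes = 0      # bytes consumed by readers
        self.account_consumed = account_consumed

    def write_event(self, size):
        self.entries_bytes += size

    def consume_event(self, size):
        # Post-fix behaviour: every consumed event is added to read_bytes.
        if self.account_consumed:
            self.read_bytes += size

    def unconsumed_bytes(self):
        return self.entries_bytes - self.read_bytes


for fixed in (False, True):
    stats = CpuBufferStats(account_consumed=fixed)
    for _ in range(8):
        stats.write_event(71)
    for _ in range(8):
        stats.consume_event(71)
    print("fixed" if fixed else "buggy", stats.unconsumed_bytes())
```

The buggy variant still reports 568 unconsumed bytes after all eight
events were read, while the fixed variant reports 0.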

ring-buffer: remove obsolete comment for free_buffer_page() [+ + +]
Author: Vlastimil Babka <vbabka@suse.cz>
Date:   Wed Mar 15 15:24:46 2023 +0100

    ring-buffer: remove obsolete comment for free_buffer_page()
    
    [ Upstream commit a98151ad53b53f010ee364ec2fd06445b328578b ]
    
    The comment refers to mm/slob.c which is being removed. It comes from
    commit ed56829cb319 ("ring_buffer: reset buffer page when freeing") and
    according to Steven the borrowed code was a page mapcount and mapping
    reset, which was later removed by commit e4c2ce82ca27 ("ring_buffer:
    allocate buffer page pointer"). Thus the comment is not accurate anyway,
    remove it.
    
    Link: https://lore.kernel.org/linux-trace-kernel/20230315142446.27040-1-vbabka@suse.cz
    
    Cc: Masami Hiramatsu <mhiramat@kernel.org>
    Cc: Ingo Molnar <mingo@elte.hu>
    Reported-by: Mike Rapoport <mike.rapoport@gmail.com>
    Suggested-by: Steven Rostedt (Google) <rostedt@goodmis.org>
    Fixes: e4c2ce82ca27 ("ring_buffer: allocate buffer page pointer")
    Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
    Reviewed-by: Mukesh Ojha <quic_mojha@quicinc.com>
    Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
    Stable-dep-of: 45d99ea451d0 ("ring-buffer: Fix bytes info in per_cpu buffer stats")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
scsi: core: Improve type safety of scsi_rescan_device() [+ + +]
Author: Bart Van Assche <bvanassche@acm.org>
Date:   Tue Aug 22 08:30:41 2023 -0700

    scsi: core: Improve type safety of scsi_rescan_device()
    
    [ Upstream commit 79519528a180c64a90863db2ce70887de6c49d16 ]
    
    Most callers of scsi_rescan_device() have the scsi_device pointer readily
    available. Pass a struct scsi_device pointer to scsi_rescan_device()
    instead of a struct device pointer. This change prevents a pointer to
    another struct device from being passed accidentally to
    scsi_rescan_device().
    
    Remove the scsi_rescan_device() declaration from the scsi_priv.h header
    file since it duplicates the declaration in <scsi/scsi_host.h>.
    
    Reviewed-by: Hannes Reinecke <hare@suse.de>
    Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
    Reviewed-by: John Garry <john.g.garry@oracle.com>
    Cc: Mike Christie <michael.christie@oracle.com>
    Cc: Ming Lei <ming.lei@redhat.com>
    Signed-off-by: Bart Van Assche <bvanassche@acm.org>
    Link: https://lore.kernel.org/r/20230822153043.4046244-1-bvanassche@acm.org
    Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    Stable-dep-of: 8b4d9469d0b0 ("ata: libata-scsi: Fix delayed scsi_rescan_device() execution")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

scsi: Do not attempt to rescan suspended devices [+ + +]
Author: Damien Le Moal <dlemoal@kernel.org>
Date:   Fri Sep 15 15:00:13 2023 +0900

    scsi: Do not attempt to rescan suspended devices
    
    [ Upstream commit ff48b37802e5c134e2dfc4d091f10b2eb5065a72 ]
    
    scsi_rescan_device() takes a scsi device lock before executing a device
    handler and device driver rescan methods. Waiting for the completion of
    any command issued to the device by these methods will thus be done with
    the device lock held. As a result, there is a risk of deadlocking within
    the power management code if scsi_rescan_device() is called to handle a
    device resume with the associated scsi device not yet resumed.
    
    Avoid such a situation by checking that the target scsi device is in the
    running state, that is, fully capable of executing commands, before
    proceeding with the rescan, and bail out returning -EWOULDBLOCK otherwise.
    With this error return, the caller can retry rescanning the device after
    a delay.
    
    The state check is done with the device lock held and is thus safe
    against incoming suspend power management operations.
    
    Fixes: 6aa0365a3c85 ("ata: libata-scsi: Avoid deadlock on rescan after device resume")
    Cc: stable@vger.kernel.org
    Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
    Reviewed-by: Hannes Reinecke <hare@suse.de>
    Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com>
    Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
    Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
    Reviewed-by: Bart Van Assche <bvanassche@acm.org>
    Stable-dep-of: 8b4d9469d0b0 ("ata: libata-scsi: Fix delayed scsi_rescan_device() execution")
    Signed-off-by: Sasha Levin <sashal@kernel.org>
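
The check-and-retry contract described in this commit can be sketched as
follows. This is a simplified Python model: the state names, both
function names and the retry limit are hypothetical, and only the
-EWOULDBLOCK return for a device that is not running comes from the
commit message.

```python
import errno

RUNNING, SUSPENDED = "running", "suspended"


def rescan_device(state):
    """Return 0 if the rescan ran, or -EWOULDBLOCK if the device is not running."""
    if state != RUNNING:
        return -errno.EWOULDBLOCK
    return 0


def rescan_with_retry(states, max_attempts=5):
    """Retry the rescan, reading one device state per attempt from `states`."""
    for attempt, state in enumerate(states[:max_attempts], start=1):
        if rescan_device(state) == 0:
            return attempt
        # A real caller would wait for a delay here before retrying.
    return None


print(rescan_with_retry([SUSPENDED, SUSPENDED, RUNNING]))  # 3
print(rescan_with_retry([SUSPENDED, SUSPENDED]))           # None
```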

scsi: sd: Differentiate system and runtime start/stop management [+ + +]
Author: Damien Le Moal <dlemoal@kernel.org>
Date:   Fri Sep 15 10:02:41 2023 +0900

    scsi: sd: Differentiate system and runtime start/stop management
    
    [ Upstream commit 3cc2ffe5c16dc65dfac354bc5b5bc98d3b397567 ]
    
    The underlying device and driver of a SCSI disk may have different
    system and runtime power mode control requirements. This is because
    runtime power management affects only the SCSI disk, while system level
    power management affects all devices, including the controller for the
    SCSI disk.
    
    For instance, issuing a START STOP UNIT command when a SCSI disk is
    runtime suspended and resumed is fine: the command is translated to a
    STANDBY IMMEDIATE command to spin down the ATA disk and to a VERIFY
    command to wake it up. The SCSI disk runtime operations have no effect
    on the ata port device used to connect the ATA disk. However, for
    system suspend/resume operations, the ATA port used to connect the
    device will also be suspended and resumed, with the resume operation
    requiring re-validating the device link and the device itself. In this
    case, issuing a VERIFY command to spinup the disk must be done before
    starting to revalidate the device, when the ata port is being resumed.
    In such case, we must not allow the SCSI disk driver to issue START STOP
    UNIT commands.
    
    Allow a low level driver to refine the SCSI disk start/stop management
    by differentiating system and runtime cases with two new SCSI device
    flags: manage_system_start_stop and manage_runtime_start_stop. These new
    flags replace the current manage_start_stop flag. Drivers setting
    manage_start_stop are modified to set both new flags, thus preserving the
    existing start/stop management behavior. For backward compatibility, the
    old manage_start_stop sysfs device attribute is kept as a read-only
    attribute showing a value of 1 for devices enabling both new flags and 0
    otherwise.
    
    Fixes: 0a8589055936 ("ata,scsi: do not issue START STOP UNIT on resume")
    Cc: stable@vger.kernel.org
    Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
    Reviewed-by: Hannes Reinecke <hare@suse.de>
    Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
    Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
    Stable-dep-of: 99398d2070ab ("scsi: sd: Do not issue commands to suspended disks on shutdown")
    Signed-off-by: Sasha Levin <sashal@kernel.org>
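
The backward-compatibility rule for the old sysfs attribute can be stated
as a one-line function. The Python function name is hypothetical; the
rule itself (1 only when both new flags are enabled) is taken from the
commit message.

```python
def legacy_manage_start_stop(manage_system_start_stop, manage_runtime_start_stop):
    """Value shown by the read-only legacy attribute: 1 only if both flags are set."""
    return 1 if (manage_system_start_stop and manage_runtime_start_stop) else 0


for system in (False, True):
    for runtime in (False, True):
        print(system, runtime, legacy_manage_start_stop(system, runtime))
```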

scsi: sd: Do not issue commands to suspended disks on shutdown [+ + +]
Author: Damien Le Moal <dlemoal@kernel.org>
Date:   Fri Sep 8 17:03:15 2023 +0900

    scsi: sd: Do not issue commands to suspended disks on shutdown
    
    [ Upstream commit 99398d2070ab03d13f90b758ad397e19a65fffb0 ]
    
    If an error occurs when resuming a host adapter before the devices
    attached to the adapter are resumed, the adapter low level driver may
    remove the scsi host, resulting in a call to sd_remove() for the
    disks of the host. This in turn results in a call to sd_shutdown() which
    will issue a synchronize cache command and a start stop unit command to
    spindown the disk. sd_shutdown() issues the commands only if the device
    is not already runtime suspended but does not check the power state for
    system-wide suspend/resume. That is, the commands may be issued with the
    device in a suspended state, which causes PM resume to hang, forcing a
    reset of the machine to recover.
    
    Fix this by tracking the suspended state of a disk by introducing the
    suspended boolean field in the scsi_disk structure. This flag is set to
    true when the disk is suspended in sd_suspend_common() and cleared when the
    disk is resumed with sd_resume(). When suspended is true, sd_shutdown() is
    not executed from sd_remove().
    
    Cc: stable@vger.kernel.org
    Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
    Reviewed-by: Hannes Reinecke <hare@suse.de>
    Reviewed-by: Bart Van Assche <bvanassche@acm.org>
    Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
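
A toy Python model of the guard described in this commit. The `Disk`
class and its method names are hypothetical stand-ins for the scsi_disk
structure and the sd_* functions; the command names are the ones listed
in the commit message.

```python
class Disk:
    """Toy disk that records the commands issued to it."""

    def __init__(self):
        self.suspended = False
        self.commands = []

    def suspend(self):
        self.suspended = True

    def resume(self):
        self.suspended = False

    def shutdown(self):
        self.commands += ["SYNCHRONIZE CACHE", "START STOP UNIT"]

    def remove(self):
        # The fix: skip shutdown for a disk that is still suspended.
        if not self.suspended:
            self.shutdown()


disk = Disk()
disk.suspend()
disk.remove()
print(disk.commands)  # []
```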

scsi: target: core: Fix deadlock due to recursive locking [+ + +]
Author: Junxiao Bi <junxiao.bi@oracle.com>
Date:   Mon Sep 18 15:58:48 2023 -0700

    scsi: target: core: Fix deadlock due to recursive locking
    
    [ Upstream commit a154f5f643c6ecddd44847217a7a3845b4350003 ]
    
    The following call trace shows a deadlock issue due to recursive locking of
    mutex "device_mutex". The first lock acquisition is in
    target_for_each_device() and the second is in target_free_device().
    
     PID: 148266   TASK: ffff8be21ffb5d00  CPU: 10   COMMAND: "iscsi_ttx"
      #0 [ffffa2bfc9ec3b18] __schedule at ffffffffa8060e7f
      #1 [ffffa2bfc9ec3ba0] schedule at ffffffffa8061224
      #2 [ffffa2bfc9ec3bb8] schedule_preempt_disabled at ffffffffa80615ee
      #3 [ffffa2bfc9ec3bc8] __mutex_lock at ffffffffa8062fd7
      #4 [ffffa2bfc9ec3c40] __mutex_lock_slowpath at ffffffffa80631d3
      #5 [ffffa2bfc9ec3c50] mutex_lock at ffffffffa806320c
      #6 [ffffa2bfc9ec3c68] target_free_device at ffffffffc0935998 [target_core_mod]
      #7 [ffffa2bfc9ec3c90] target_core_dev_release at ffffffffc092f975 [target_core_mod]
      #8 [ffffa2bfc9ec3ca0] config_item_put at ffffffffa79d250f
      #9 [ffffa2bfc9ec3cd0] config_item_put at ffffffffa79d2583
     #10 [ffffa2bfc9ec3ce0] target_devices_idr_iter at ffffffffc0933f3a [target_core_mod]
     #11 [ffffa2bfc9ec3d00] idr_for_each at ffffffffa803f6fc
     #12 [ffffa2bfc9ec3d60] target_for_each_device at ffffffffc0935670 [target_core_mod]
     #13 [ffffa2bfc9ec3d98] transport_deregister_session at ffffffffc0946408 [target_core_mod]
     #14 [ffffa2bfc9ec3dc8] iscsit_close_session at ffffffffc09a44a6 [iscsi_target_mod]
     #15 [ffffa2bfc9ec3df0] iscsit_close_connection at ffffffffc09a4a88 [iscsi_target_mod]
     #16 [ffffa2bfc9ec3df8] finish_task_switch at ffffffffa76e5d07
     #17 [ffffa2bfc9ec3e78] iscsit_take_action_for_connection_exit at ffffffffc0991c23 [iscsi_target_mod]
     #18 [ffffa2bfc9ec3ea0] iscsi_target_tx_thread at ffffffffc09a403b [iscsi_target_mod]
     #19 [ffffa2bfc9ec3f08] kthread at ffffffffa76d8080
     #20 [ffffa2bfc9ec3f50] ret_from_fork at ffffffffa8200364
    
    Fixes: 36d4cb460bcb ("scsi: target: Avoid that EXTENDED COPY commands trigger lock inversion")
    Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
    Link: https://lore.kernel.org/r/20230918225848.66463-1-junxiao.bi@oracle.com
    Reviewed-by: Mike Christie <michael.christie@oracle.com>
    Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
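
The hazard in the call trace above is that a non-recursive lock is
requested again by the thread that already holds it. The Python sketch
below shows only that hazard, using a non-blocking acquire so that it
reports the conflict instead of hanging; the function names are
hypothetical and it does not reproduce the actual fix.

```python
import threading

device_mutex = threading.Lock()  # non-recursive, like a kernel mutex


def free_device():
    """Inner function that also wants the lock; report whether it could get it."""
    got_it = device_mutex.acquire(blocking=False)
    if got_it:
        device_mutex.release()
    return got_it


def for_each_device(callback):
    """Outer function that holds the lock while invoking the callback."""
    with device_mutex:
        return callback()


print(for_each_device(free_device))  # False
print(free_device())                 # True
```

With a blocking acquire, the first call would never return, which is the
deadlock seen in the trace.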

scsi: zfcp: Fix a double put in zfcp_port_enqueue() [+ + +]
Author: Dinghao Liu <dinghao.liu@zju.edu.cn>
Date:   Sat Sep 23 18:37:23 2023 +0800

    scsi: zfcp: Fix a double put in zfcp_port_enqueue()
    
    commit b481f644d9174670b385c3a699617052cd2a79d3 upstream.
    
    When device_register() fails, zfcp_port_release() will be called after
    put_device(). As a result, zfcp_ccw_adapter_put() will be called twice: once
    in zfcp_port_release() and once in the error path after device_register().
    So the reference on the adapter object is doubly put, which may lead to a
    premature free. Fix this by adjusting the error tag after
    device_register().
    
    Fixes: f3450c7b9172 ("[SCSI] zfcp: Replace local reference counting with common kref")
    Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn>
    Link: https://lore.kernel.org/r/20230923103723.10320-1-dinghao.liu@zju.edu.cn
    Acked-by: Benjamin Block <bblock@linux.ibm.com>
    Cc: stable@vger.kernel.org # v2.6.33+
    Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
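
The double put in this commit can be followed with a toy reference
counter in Python. The `Kref` class and `enqueue_port()` are hypothetical
models; only the sequence (the release callback drops the adapter
reference, then the old error path drops it again) follows the commit
message.

```python
class Kref:
    """Toy reference counter that flags a put on an already released object."""

    def __init__(self):
        self.count = 1

    def get(self):
        self.count += 1

    def put(self):
        if self.count == 0:
            raise RuntimeError("reference put after release")
        self.count -= 1


def enqueue_port(adapter, buggy):
    """Simulate a failed registration; return the adapter refcount afterwards."""
    adapter.get()          # the port takes a reference on its adapter
    # Registration fails: dropping the port runs its release callback,
    # which already drops the adapter reference.
    adapter.put()
    if buggy:
        adapter.put()      # error path drops the same reference a second time
    return adapter.count


print(enqueue_port(Kref(), buggy=False))  # 1
print(enqueue_port(Kref(), buggy=True))   # 0
```

In the buggy case the count reaches 0 although the original owner still
holds its reference, which is the premature free the commit warns about.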

 
sctp: update hb timer immediately after users change hb_interval [+ + +]
Author: Xin Long <lucien.xin@gmail.com>
Date:   Sun Oct 1 11:04:20 2023 -0400

    sctp: update hb timer immediately after users change hb_interval
    
    [ Upstream commit 1f4e803cd9c9166eb8b6c8b0b8e4124f7499fc07 ]
    
    Currently, when hb_interval is changed by users, it won't take effect
    until the next expiry of hb timer. As the default value is 30s, users
    have to wait up to 30s for their hb_interval update to take effect.
    
    This becomes pretty bad in containers where a much smaller value is
    usually set on hb_interval. This patch improves it by resetting the
    hb timer immediately once the value of hb_interval is updated by users.
    
    Note that we don't address the already existing 'problem' when sending
    a heartbeat 'on demand' if one hb has just been sent(from the timer)
    mentioned in:
    
      https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg590224.html
    
    Signed-off-by: Xin Long <lucien.xin@gmail.com>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
    Link: https://lore.kernel.org/r/75465785f8ee5df2fb3acdca9b8fafdc18984098.1696172660.git.lucien.xin@gmail.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

sctp: update transport state when processing a dupcook packet [+ + +]
Author: Xin Long <lucien.xin@gmail.com>
Date:   Sun Oct 1 10:58:45 2023 -0400

    sctp: update transport state when processing a dupcook packet
    
    [ Upstream commit 2222a78075f0c19ca18db53fd6623afb4aff602d ]
    
    During the 4-way handshake, the transport's state is set to ACTIVE in
    sctp_process_init() when processing INIT_ACK chunk on client or
    COOKIE_ECHO chunk on server.
    
    In the collision scenario below:
    
      192.168.1.2 > 192.168.1.1: sctp (1) [INIT] [init tag: 3922216408]
        192.168.1.1 > 192.168.1.2: sctp (1) [INIT] [init tag: 144230885]
        192.168.1.2 > 192.168.1.1: sctp (1) [INIT ACK] [init tag: 3922216408]
        192.168.1.1 > 192.168.1.2: sctp (1) [COOKIE ECHO]
        192.168.1.2 > 192.168.1.1: sctp (1) [COOKIE ACK]
      192.168.1.1 > 192.168.1.2: sctp (1) [INIT ACK] [init tag: 3914796021]
    
    when processing COOKIE_ECHO on 192.168.1.2, as it's in COOKIE_WAIT state,
    sctp_sf_do_dupcook_b() is called by sctp_sf_do_5_2_4_dupcook() where it
    creates a new association and sets its transport to ACTIVE then updates
    to the old association in sctp_assoc_update().
    
    However, in sctp_assoc_update(), it will skip the transport update if it
    finds a transport with the same ipaddr already existing in the old asoc,
    and this causes the old asoc's transport state not to move to ACTIVE
    after the handshake.
    
    This means if DATA retransmission happens at this moment, it won't be able
    to enter PF state because of the check 'transport->state == SCTP_ACTIVE'
    in sctp_do_8_2_transport_strike().
    
    This patch fixes it by updating the transport in sctp_assoc_update() with
    sctp_assoc_add_peer(), which updates the transport state if a transport
    with the same ipaddr already exists in the old asoc.
    
    Signed-off-by: Xin Long <lucien.xin@gmail.com>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
    Link: https://lore.kernel.org/r/fd17356abe49713ded425250cc1ae51e9f5846c6.1696172325.git.lucien.xin@gmail.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
selftests: netfilter: Extend nft_audit.sh [+ + +]
Author: Phil Sutter <phil@nwl.cc>
Date:   Sat Sep 23 03:53:49 2023 +0200

    selftests: netfilter: Extend nft_audit.sh
    
    [ Upstream commit 203bb9d39866d3c5a8135433ce3742fe4f9d5741 ]
    
    Add tests for sets and elements and deletion of all kinds. Also
    reorder rule reset tests: By moving the bulk rule add command up, the
    two 'reset rules' tests become identical.
    
    While at it, fix the error status of a failing bulk rule add test getting
    lost due to its use in a pipe. Avoid this by using a temporary file.
    
    Headings in diff output for failing tests contain no useful data, strip
    them.
    
    Signed-off-by: Phil Sutter <phil@nwl.cc>
    Signed-off-by: Florian Westphal <fw@strlen.de>
    Stable-dep-of: 0d880dc6f032 ("netfilter: nf_tables: Deduplicate nft_register_obj audit logs")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

selftests: netfilter: Test nf_tables audit logging [+ + +]
Author: Phil Sutter <phil@nwl.cc>
Date:   Wed Sep 13 15:51:37 2023 +0200

    selftests: netfilter: Test nf_tables audit logging
    
    [ Upstream commit e8dbde59ca3fe925d0105bfb380e8429928b16dd ]
    
    Compare NETFILTER_CFG type audit logs emitted from kernel upon ruleset
    modifications against expected output.
    
    Signed-off-by: Phil Sutter <phil@nwl.cc>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    Stable-dep-of: 0d880dc6f032 ("netfilter: nf_tables: Deduplicate nft_register_obj audit logs")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
smb: use kernel_connect() and kernel_bind() [+ + +]
Author: Jordan Rife <jrife@google.com>
Date:   Tue Oct 3 20:13:03 2023 -0500

    smb: use kernel_connect() and kernel_bind()
    
    commit cedc019b9f260facfadd20c6c490e403abf292e3 upstream.
    
    Recent changes to kernel_connect() and kernel_bind() ensure that
    callers are insulated from changes to the address parameter made by BPF
    SOCK_ADDR hooks. This patch wraps direct calls to ops->connect() and
    ops->bind() with kernel_connect() and kernel_bind() to ensure that SMB
    mounts do not see their mount address overwritten in such cases.
    
    Link: https://lore.kernel.org/netdev/9944248dba1bce861375fcce9de663934d933ba9.camel@redhat.com/
    Cc: <stable@vger.kernel.org> # 6.0+
    Signed-off-by: Jordan Rife <jrife@google.com>
    Acked-by: Paulo Alcantara (SUSE) <pc@manguebit.com>
    Signed-off-by: Steve French <stfrench@microsoft.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
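
One way to picture the insulation described in this commit is a wrapper
that hands a private copy of the address to anything that may rewrite
it. The Python sketch below assumes that copying is how the insulation
is achieved; all function names and the example addresses are
hypothetical.

```python
def hook_rewrite(addr):
    """Stand-in for a hook that rewrites the destination address in place."""
    addr["ip"] = "198.51.100.7"


def ops_connect(addr):
    """Direct call: the hook mutates the caller's own address object."""
    hook_rewrite(addr)


def kernel_connect_like(addr):
    """Wrapper: the hook only ever sees a private copy of the address."""
    private = dict(addr)
    hook_rewrite(private)


direct = {"ip": "192.0.2.10", "port": 445}
ops_connect(direct)
print(direct["ip"])    # 198.51.100.7

wrapped = {"ip": "192.0.2.10", "port": 445}
kernel_connect_like(wrapped)
print(wrapped["ip"])   # 192.0.2.10
```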

 
spi: zynqmp-gqspi: fix clock imbalance on probe failure [+ + +]
Author: Johan Hovold <johan+linaro@kernel.org>
Date:   Thu Jun 22 10:24:35 2023 +0200

    spi: zynqmp-gqspi: fix clock imbalance on probe failure
    
    [ Upstream commit 1527b076ae2cb6a9c590a02725ed39399fcad1cf ]
    
    Make sure that the device is not runtime suspended before explicitly
    disabling the clocks on probe failure and on driver unbind to avoid a
    clock enable-count imbalance.
    
    Fixes: 9e3a000362ae ("spi: zynqmp: Add pm runtime support")
    Cc: stable@vger.kernel.org      # 4.19
    Cc: Naga Sureshkumar Relli <naga.sureshkumar.relli@xilinx.com>
    Cc: Shubhrajyoti Datta <shubhrajyoti.datta@xilinx.com>
    Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
    Link: https://lore.kernel.org/r/Message-Id: <20230622082435.7873-1-johan+linaro@kernel.org>
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
 
tcp: fix delayed ACKs for MSS boundary condition [+ + +]
Author: Neal Cardwell <ncardwell@google.com>
Date:   Sun Oct 1 11:12:39 2023 -0400

    tcp: fix delayed ACKs for MSS boundary condition
    
    [ Upstream commit 4720852ed9afb1c5ab84e96135cb5b73d5afde6f ]
    
    This commit fixes poor delayed ACK behavior that can cause poor TCP
    latency in a particular boundary condition: when an application makes
    a TCP socket write that is an exact multiple of the MSS size.
    
    The problem is that there is a painful boundary discontinuity in the
    current delayed ACK behavior. With the current delayed ACK behavior,
    we have:
    
    (1) If an app reads data when > 1*MSS is unacknowledged, then
        tcp_cleanup_rbuf() ACKs immediately because of:
    
         tp->rcv_nxt - tp->rcv_wup > icsk->icsk_ack.rcv_mss ||
    
    (2) If an app reads all received data, and the packets were < 1*MSS,
        and either (a) the app is not ping-pong or (b) we received two
        packets < 1*MSS, then tcp_cleanup_rbuf() ACKs immediately because
        of:
    
         ((icsk->icsk_ack.pending & ICSK_ACK_PUSHED2) ||
          ((icsk->icsk_ack.pending & ICSK_ACK_PUSHED) &&
           !inet_csk_in_pingpong_mode(sk))) &&
    
    (3) *However*: if an app reads exactly 1*MSS of data,
        tcp_cleanup_rbuf() does not send an immediate ACK. This is true
        even if the app is not ping-pong and the 1*MSS of data had the PSH
        bit set, suggesting the sending application completed an
        application write.
    
    Thus if the app is not ping-pong, we have this painful case where
    >1*MSS gets an immediate ACK, and <1*MSS gets an immediate ACK, but a
    write whose last skb is an exact multiple of 1*MSS can get a 40ms
    delayed ACK. This means that any app that transfers data in one
    direction and takes care to align write size or packet size with MSS
    can suffer this problem. With receive zero copy making 4KB MSS values
    more common, it is becoming more common to have application writes
    naturally align with MSS, and more applications are likely to
    encounter this delayed ACK problem.
    
    The fix in this commit is to refine the delayed ACK heuristics with a
    simple check: immediately ACK a received 1*MSS skb with PSH bit set if
    the app reads all data. Why? If an skb has a len of exactly 1*MSS and
    has the PSH bit set then it is likely the end of an application
    write. So more data may not be arriving soon, and yet the data sender
    may be waiting for an ACK if cwnd-bound or using TX zero copy. Thus we
    set ICSK_ACK_PUSHED in this case so that tcp_cleanup_rbuf() will send
    an ACK immediately if the app reads all of the data and is not
    ping-pong. Note that this logic is also executed for the case where
    len > MSS, but in that case this logic does not matter (and does not
    hurt) because tcp_cleanup_rbuf() will always ACK immediately if the
    app reads data and there is more than an MSS of unACKed data.
    
    Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
    Signed-off-by: Neal Cardwell <ncardwell@google.com>
    Reviewed-by: Yuchung Cheng <ycheng@google.com>
    Reviewed-by: Eric Dumazet <edumazet@google.com>
    Cc: Xin Guo <guoxin0309@gmail.com>
    Link: https://lore.kernel.org/r/20231001151239.1866845-2-ncardwell.sw@gmail.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
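
The boundary case in this commit can be modelled with the two conditions
quoted in its message. This Python sketch is heavily simplified: the
flag values, the rule that a sub-MSS packet sets ICSK_ACK_PUSHED, and
the 1460-byte MSS are assumptions for illustration, and the real
tcp_cleanup_rbuf() checks more than is shown here.

```python
ICSK_ACK_PUSHED = 4    # illustrative flag values
ICSK_ACK_PUSHED2 = 8


def pending_after_receive(skb_len, rcv_mss, psh, fixed):
    """Flags set on receipt of one skb (simplified model)."""
    pending = 0
    if skb_len < rcv_mss:
        pending |= ICSK_ACK_PUSHED
    elif fixed and psh:
        # The fix: a full-sized skb carrying PSH is treated as end of a write.
        pending |= ICSK_ACK_PUSHED
    return pending


def acks_immediately(unacked, rcv_mss, pending, pingpong):
    """The two conditions quoted in the commit message."""
    if unacked > rcv_mss:
        return True
    if pending & ICSK_ACK_PUSHED2:
        return True
    return bool(pending & ICSK_ACK_PUSHED) and not pingpong


mss = 1460
for fixed in (False, True):
    pending = pending_after_receive(mss, mss, psh=True, fixed=fixed)
    print(fixed, acks_immediately(mss, mss, pending, pingpong=False))
```

For an skb of exactly one MSS with PSH set and a reader that is not in
ping-pong mode, the model delays the ACK before the fix and sends it
immediately after.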

tcp: fix quick-ack counting to count actual ACKs of new data [+ + +]
Author: Neal Cardwell <ncardwell@google.com>
Date:   Sun Oct 1 11:12:38 2023 -0400

    tcp: fix quick-ack counting to count actual ACKs of new data
    
    [ Upstream commit 059217c18be6757b95bfd77ba53fb50b48b8a816 ]
    
    This commit fixes quick-ack counting so that it only considers that a
    quick-ack has been provided if we are sending an ACK that newly
    acknowledges data.
    
    The code was erroneously using the number of data segments in outgoing
    skbs when deciding how many quick-ack credits to remove. This logic
    does not make sense, and could cause poor performance in
    request-response workloads, like RPC traffic, where requests or
    responses can be multi-segment skbs.
    
    When a TCP connection decides to send N quick-acks, that is to
    accelerate the cwnd growth of the congestion control module
    controlling the remote endpoint of the TCP connection. That quick-ack
    decision is purely about the incoming data and outgoing ACKs. It has
    nothing to do with the outgoing data or the size of outgoing data.
    
    And in particular, an ACK only serves the intended purpose of allowing
    the remote congestion control to grow the congestion window quickly if
    the ACK is ACKing or SACKing new data.
    
    The fix is simple: only count packets as serving the goal of the
    quickack mechanism if they are ACKing/SACKing new data. We can tell
    whether this is the case by checking inet_csk_ack_scheduled(), since
    we schedule an ACK exactly when we are ACKing/SACKing new data.
    
    Fixes: fc6415bcb0f5 ("[TCP]: Fix quick-ack decrementing with TSO.")
    Signed-off-by: Neal Cardwell <ncardwell@google.com>
    Reviewed-by: Yuchung Cheng <ycheng@google.com>
    Reviewed-by: Eric Dumazet <edumazet@google.com>
    Link: https://lore.kernel.org/r/20231001151239.1866845-1-ncardwell.sw@gmail.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
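
    The accounting change described above can be sketched as a toy model.
    This is only an illustration under stated assumptions: struct toy_conn
    and both helper names are hypothetical, and the ack_scheduled flag
    stands in for what the commit checks via inet_csk_ack_scheduled().

```c
#include <stdbool.h>

/* Hypothetical, simplified connection state; not the kernel's struct. */
struct toy_conn {
    unsigned int quickack_credits; /* quick-ACKs still to be sent */
    bool ack_scheduled;            /* an ACK of new data is pending */
};

/* Old accounting: charge one credit per outgoing data segment. */
void toy_dec_quickack_old(struct toy_conn *c, unsigned int segs)
{
    if (segs >= c->quickack_credits)
        c->quickack_credits = 0;
    else
        c->quickack_credits -= segs;
}

/* Fixed accounting: charge one credit only if this packet ACKs new data. */
void toy_dec_quickack_new(struct toy_conn *c)
{
    if (c->ack_scheduled && c->quickack_credits > 0)
        c->quickack_credits--;
}
```

    In this model, a multi-segment response sent while no ACK is scheduled
    drains all credits under the old accounting and none under the new one.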

 
tipc: fix a potential deadlock on &tx->lock [+ + +]
Author: Chengfeng Ye <dg573847474@gmail.com>
Date:   Wed Sep 27 18:14:14 2023 +0000

    tipc: fix a potential deadlock on &tx->lock
    
    [ Upstream commit 08e50cf071847323414df0835109b6f3560d44f5 ]
    
    It seems that tipc_crypto_key_revoke() could be invoked by the
    workqueue tipc_crypto_work_rx() under process context and by the
    timer/rx callback under softirq context, so the lock acquisition
    on &tx->lock should use spin_lock_bh() to prevent a possible
    deadlock.
    
    This flaw was found by an experimental static analysis tool I am
    developing for irq-related deadlock.
    
    tipc_crypto_work_rx() <workqueue>
    --> tipc_crypto_key_distr()
    --> tipc_bcast_xmit()
    --> tipc_bcbase_xmit()
    --> tipc_bearer_bc_xmit()
    --> tipc_crypto_xmit()
    --> tipc_ehdr_build()
    --> tipc_crypto_key_revoke()
    --> spin_lock(&tx->lock)
    <timer interrupt>
       --> tipc_disc_timeout()
       --> tipc_bearer_xmit_skb()
       --> tipc_crypto_xmit()
       --> tipc_ehdr_build()
       --> tipc_crypto_key_revoke()
       --> spin_lock(&tx->lock) <deadlock here>
    
    Signed-off-by: Chengfeng Ye <dg573847474@gmail.com>
    Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
    Acked-by: Jon Maloy <jmaloy@redhat.com>
    Fixes: fc1b6d6de220 ("tipc: introduce TIPC encryption & authentication")
    Link: https://lore.kernel.org/r/20230927181414.59928-1-dg573847474@gmail.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
ubi: Refuse attaching if mtd's erasesize is 0 [+ + +]
Author: Zhihao Cheng <chengzhihao1@huawei.com>
Date:   Sun Apr 23 19:10:41 2023 +0800

    ubi: Refuse attaching if mtd's erasesize is 0
    
    [ Upstream commit 017c73a34a661a861712f7cc1393a123e5b2208c ]
    
    There exist mtd devices with zero erasesize, which will trigger a
    divide-by-zero exception while attaching a ubi device.
    Fix it by refusing to attach if the mtd's erasesize is 0.
    
    Fixes: 801c135ce73d ("UBI: Unsorted Block Images")
    Reported-by: Yu Hao <yhao016@ucr.edu>
    Link: https://lore.kernel.org/lkml/977347543.226888.1682011999468.JavaMail.zimbra@nod.at/T/
    Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
    Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com>
    Signed-off-by: Richard Weinberger <richard@nod.at>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
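
    A minimal userspace sketch of the kind of guard described above. The
    struct and function names are hypothetical stand-ins, not the UBI or
    MTD API; the point is only that the erase size is checked before it
    is used as a divisor.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for the few MTD fields this sketch needs. */
struct toy_mtd {
    uint64_t size;      /* device size in bytes */
    uint32_t erasesize; /* erase block size in bytes; 0 on some devices */
};

/*
 * Returns the number of erase blocks, or -EINVAL if the device reports
 * a zero erase size, which would otherwise be used as a divisor.
 */
long toy_attach_peb_count(const struct toy_mtd *mtd)
{
    if (mtd->erasesize == 0)
        return -EINVAL;
    return (long)(mtd->size / mtd->erasesize);
}
```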

 
vrf: Fix lockdep splat in output path [+ + +]
Author: Ido Schimmel <idosch@nvidia.com>
Date:   Sat Jul 15 18:36:05 2023 +0300

    vrf: Fix lockdep splat in output path
    
    commit 2033ab90380d46e0e9f0520fd6776a73d107fd95 upstream.
    
    Cited commit converted the neighbour code to use the standard RCU
    variant instead of the RCU-bh variant, but the VRF code still uses
    rcu_read_lock_bh() / rcu_read_unlock_bh() around the neighbour lookup
    code in its IPv4 and IPv6 output paths, resulting in lockdep splats
    [1][2]. Can be reproduced using [3].
    
    Fix by switching to rcu_read_lock() / rcu_read_unlock().
    
    [1]
    =============================
    WARNING: suspicious RCU usage
    6.5.0-rc1-custom-g9c099e6dbf98 #403 Not tainted
    -----------------------------
    include/net/neighbour.h:302 suspicious rcu_dereference_check() usage!
    
    other info that might help us debug this:
    
    rcu_scheduler_active = 2, debug_locks = 1
    2 locks held by ping/183:
     #0: ffff888105ea1d80 (sk_lock-AF_INET){+.+.}-{0:0}, at: raw_sendmsg+0xc6c/0x33c0
     #1: ffffffff85b46820 (rcu_read_lock_bh){....}-{1:2}, at: vrf_output+0x2e3/0x2030
    
    stack backtrace:
    CPU: 0 PID: 183 Comm: ping Not tainted 6.5.0-rc1-custom-g9c099e6dbf98 #403
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-1.fc37 04/01/2014
    Call Trace:
     <TASK>
     dump_stack_lvl+0xc1/0xf0
     lockdep_rcu_suspicious+0x211/0x3b0
     vrf_output+0x1380/0x2030
     ip_push_pending_frames+0x125/0x2a0
     raw_sendmsg+0x200d/0x33c0
     inet_sendmsg+0xa2/0xe0
     __sys_sendto+0x2aa/0x420
     __x64_sys_sendto+0xe5/0x1c0
     do_syscall_64+0x38/0x80
     entry_SYSCALL_64_after_hwframe+0x63/0xcd
    
    [2]
    =============================
    WARNING: suspicious RCU usage
    6.5.0-rc1-custom-g9c099e6dbf98 #403 Not tainted
    -----------------------------
    include/net/neighbour.h:302 suspicious rcu_dereference_check() usage!
    
    other info that might help us debug this:
    
    rcu_scheduler_active = 2, debug_locks = 1
    2 locks held by ping6/182:
     #0: ffff888114b63000 (sk_lock-AF_INET6){+.+.}-{0:0}, at: rawv6_sendmsg+0x1602/0x3e50
     #1: ffffffff85b46820 (rcu_read_lock_bh){....}-{1:2}, at: vrf_output6+0xe9/0x1310
    
    stack backtrace:
    CPU: 0 PID: 182 Comm: ping6 Not tainted 6.5.0-rc1-custom-g9c099e6dbf98 #403
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-1.fc37 04/01/2014
    Call Trace:
     <TASK>
     dump_stack_lvl+0xc1/0xf0
     lockdep_rcu_suspicious+0x211/0x3b0
     vrf_output6+0xd32/0x1310
     ip6_local_out+0xb4/0x1a0
     ip6_send_skb+0xbc/0x340
     ip6_push_pending_frames+0xe5/0x110
     rawv6_sendmsg+0x2e6e/0x3e50
     inet_sendmsg+0xa2/0xe0
     __sys_sendto+0x2aa/0x420
     __x64_sys_sendto+0xe5/0x1c0
     do_syscall_64+0x38/0x80
     entry_SYSCALL_64_after_hwframe+0x63/0xcd
    
    [3]
    #!/bin/bash
    
    ip link add name vrf-red up numtxqueues 2 type vrf table 10
    ip link add name swp1 up master vrf-red type dummy
    ip address add 192.0.2.1/24 dev swp1
    ip address add 2001:db8:1::1/64 dev swp1
    ip neigh add 192.0.2.2 lladdr 00:11:22:33:44:55 nud perm dev swp1
    ip neigh add 2001:db8:1::2 lladdr 00:11:22:33:44:55 nud perm dev swp1
    ip vrf exec vrf-red ping 192.0.2.2 -c 1 &> /dev/null
    ip vrf exec vrf-red ping6 2001:db8:1::2 -c 1 &> /dev/null
    
    Fixes: 09eed1192cec ("neighbour: switch to standard rcu, instead of rcu_bh")
    Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
    Link: https://lore.kernel.org/netdev/CA+G9fYtEr-=GbcXNDYo3XOkwR+uYgehVoDjsP0pFLUpZ_AZcyg@mail.gmail.com/
    Signed-off-by: Ido Schimmel <idosch@nvidia.com>
    Reviewed-by: David Ahern <dsahern@kernel.org>
    Reviewed-by: Eric Dumazet <edumazet@google.com>
    Link: https://lore.kernel.org/r/20230715153605.4068066-1-idosch@nvidia.com
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
vringh: don't use vringh_kiov_advance() in vringh_iov_xfer() [+ + +]
Author: Stefano Garzarella <sgarzare@redhat.com>
Date:   Mon Sep 25 12:30:57 2023 +0200

    vringh: don't use vringh_kiov_advance() in vringh_iov_xfer()
    
    commit 7aed44babc7f97e82b38e9a68515e699692cc100 upstream.
    
    In the while loop of vringh_iov_xfer(), `partlen` could be 0 if one of
    the `iov` has 0 length.
    In this case, we should skip the iov and go to the next one.
    But calling vringh_kiov_advance() with 0 length does not cause the
    advancement, since it returns immediately if asked to advance by 0 bytes.
    
    Let's restore the code that was there before commit b8c06ad4d67d
    ("vringh: implement vringh_kiov_advance()"), avoiding using
    vringh_kiov_advance().
    
    Fixes: b8c06ad4d67d ("vringh: implement vringh_kiov_advance()")
    Cc: stable@vger.kernel.org
    Reported-by: Jason Wang <jasowang@redhat.com>
    Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
    Acked-by: Jason Wang <jasowang@redhat.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
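
    The zero-length case can be illustrated with a small userspace loop.
    This is a sketch, not the vringh code: struct toy_seg, struct toy_iov
    and toy_iov_push() are hypothetical names, and the loop advances the
    cursor inline instead of calling an "advance by N bytes" helper.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical minimal iovec cursor; struct vringh_kiov differs. */
struct toy_seg { char *base; size_t len; };
struct toy_iov {
    struct toy_seg *seg; /* segment array */
    size_t used;         /* number of segments */
    size_t i;            /* index of the current segment */
    size_t consumed;     /* bytes already consumed in seg[i] */
};

/* Copy up to len bytes from src into the segments; returns bytes copied. */
size_t toy_iov_push(struct toy_iov *iov, const char *src, size_t len)
{
    size_t done = 0;

    while (len && iov->i < iov->used) {
        struct toy_seg *s = &iov->seg[iov->i];
        size_t room = s->len - iov->consumed;
        size_t part = len < room ? len : room;

        if (part)
            memcpy(s->base + iov->consumed, src, part);
        src += part;
        len -= part;
        done += part;
        iov->consumed += part;

        /*
         * Move on whenever the segment is exhausted. This also holds
         * when part == 0 because the segment has length 0, which is
         * the case an "advance by part bytes" helper would ignore.
         */
        if (iov->consumed == s->len) {
            iov->consumed = 0;
            iov->i++;
        }
    }
    return done;
}
```

    With segments of length 2, 0 and 3, pushing 5 bytes fills the first
    and third segments and steps over the empty one instead of spinning
    on it.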

 
wifi: cfg80211: add a work abstraction with special semantics [+ + +]
Author: Johannes Berg <johannes.berg@intel.com>
Date:   Tue Jun 6 14:49:25 2023 +0200

    wifi: cfg80211: add a work abstraction with special semantics
    
    [ Upstream commit a3ee4dc84c4e9d14cb34dad095fd678127aca5b6 ]
    
    Add a work abstraction at the cfg80211 level that will always
    hold the wiphy_lock() for any work executed and therefore also
    can be canceled safely (without waiting) while holding that.
    This improves on what we do now as with the new wiphy works we
    don't have to worry about locking while cancelling them safely.
    
    Also, don't let such works run while the device is suspended,
    since they'll likely need to interact with the device. Flush
    them before suspend though.
    
    Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    Stable-dep-of: 37c20b2effe9 ("wifi: cfg80211: fix cqm_config access race")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

wifi: cfg80211: add missing kernel-doc for cqm_rssi_work [+ + +]
Author: Johannes Berg <johannes.berg@intel.com>
Date:   Wed Sep 13 09:36:57 2023 +0200

    wifi: cfg80211: add missing kernel-doc for cqm_rssi_work
    
    [ Upstream commit d1383077c225ceb87ac7a3b56b2c505193f77ed7 ]
    
    As reported by Stephen, I neglected to add the kernel-doc
    for the new struct member. Fix that.
    
    Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
    Fixes: 37c20b2effe9 ("wifi: cfg80211: fix cqm_config access race")
    Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

wifi: cfg80211: fix cqm_config access race [+ + +]
Author: Johannes Berg <johannes.berg@intel.com>
Date:   Wed Aug 16 15:38:04 2023 +0200

    wifi: cfg80211: fix cqm_config access race
    
    [ Upstream commit 37c20b2effe987b806c8de6d12978e4ffeff026f ]
    
    Max Schulze reports crashes with brcmfmac. The reason seems
    to be a race between userspace removing the CQM config and
    the driver calling cfg80211_cqm_rssi_notify(), where if the
    data is freed while cfg80211_cqm_rssi_notify() runs it will
    crash since it assumes wdev->cqm_config is set. This can't
    be fixed with a simple non-NULL check since there's nothing
    we can do for locking easily, so use RCU instead to protect
    the pointer, but that requires pulling the updates out into
    an asynchronous worker so they can sleep and call back into
    the driver.
    
    Since we need to change the free anyway, also change it to
    go back to the old settings if changing the settings fails.
    
    Reported-and-tested-by: Max Schulze <max.schulze@online.de>
    Closes: https://lore.kernel.org/r/ac96309a-8d8d-4435-36e6-6d152eb31876@online.de
    Fixes: 4a4b8169501b ("cfg80211: Accept multiple RSSI thresholds for CQM")
    Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

wifi: cfg80211: hold wiphy lock in auto-disconnect [+ + +]
Author: Johannes Berg <johannes.berg@intel.com>
Date:   Tue Jun 6 14:49:20 2023 +0200

    wifi: cfg80211: hold wiphy lock in auto-disconnect
    
    [ Upstream commit e9da6df7492a981b071bafd169fb4c35b45f5ebf ]
    
    Most code paths in cfg80211 already hold the wiphy lock,
    mostly by virtue of being called from nl80211, so make
    the auto-disconnect worker also hold it, aligning the
    locking promises between different parts of cfg80211.
    
    Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    Stable-dep-of: 37c20b2effe9 ("wifi: cfg80211: fix cqm_config access race")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

wifi: cfg80211: move wowlan disable under locks [+ + +]
Author: Johannes Berg <johannes.berg@intel.com>
Date:   Tue Jun 6 14:49:22 2023 +0200

    wifi: cfg80211: move wowlan disable under locks
    
    [ Upstream commit a993df0f9143e63eca38c96a30daf08db99a98a3 ]
    
    This is a driver callback, and the driver should be able
    to assume that it's called with the wiphy lock held. Move
    the call up so that's true; it has no other effect since
    the device is already unregistering and we cannot reach
    this function through other paths.
    
    Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    Stable-dep-of: 37c20b2effe9 ("wifi: cfg80211: fix cqm_config access race")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

wifi: iwlwifi: dbg_ini: fix structure packing [+ + +]
Author: Arnd Bergmann <arnd@arndb.de>
Date:   Fri Jun 16 11:03:34 2023 +0200

    wifi: iwlwifi: dbg_ini: fix structure packing
    
    [ Upstream commit 424c82e8ad56756bb98b08268ffcf68d12d183eb ]
    
    The iwl_fw_ini_error_dump_range structure has conflicting alignment
    requirements for the inner union and the outer struct:
    
    In file included from drivers/net/wireless/intel/iwlwifi/fw/dbg.c:9:
    drivers/net/wireless/intel/iwlwifi/fw/error-dump.h:312:2: error: field  within 'struct iwl_fw_ini_error_dump_range' is less aligned than 'union iwl_fw_ini_error_dump_range::(anonymous at drivers/net/wireless/intel/iwlwifi/fw/error-dump.h:312:2)' and is usually due to 'struct iwl_fw_ini_error_dump_range' being packed, which can lead to unaligned accesses [-Werror,-Wunaligned-access]
            union {
    
    As the original intention was apparently to make the entire structure
    unaligned, mark the innermost members the same way so the union
    becomes packed as well.
    
    Fixes: 973193554cae6 ("iwlwifi: dbg_ini: dump headers cleanup")
    Signed-off-by: Arnd Bergmann <arnd@arndb.de>
    Acked-by: Gregory Greenman <gregory.greenman@intel.com>
    Link: https://lore.kernel.org/r/20230616090343.2454061-1-arnd@kernel.org
    Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
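
    A small sketch of the packing pattern described above, assuming a
    GCC- or Clang-compatible compiler. The struct and member names are
    hypothetical and much smaller than the real
    iwl_fw_ini_error_dump_range; the idea shown is that marking the
    innermost members packed gives the anonymous union an alignment of 1,
    matching the packed outer struct.

```c
#include <stddef.h>
#include <stdint.h>

/* Innermost members are packed, so each has an alignment of 1. */
struct toy_range_inner_a { uint32_t addr; } __attribute__((packed));
struct toy_range_inner_b { uint32_t page; uint32_t rsvd; } __attribute__((packed));

/* Packed outer struct holding an anonymous union of the packed members. */
struct toy_dump_range {
    uint8_t tag;
    union {
        struct toy_range_inner_a a;
        struct toy_range_inner_b b;
    };
    uint32_t data_len;
} __attribute__((packed));
```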

wifi: iwlwifi: mvm: Fix a memory corruption issue [+ + +]
Author: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Date:   Sun Jul 23 22:24:59 2023 +0200

    wifi: iwlwifi: mvm: Fix a memory corruption issue
    
    [ Upstream commit 8ba438ef3cacc4808a63ed0ce24d4f0942cfe55d ]
    
    A few lines above, space is kzalloc()'ed for:
            sizeof(struct iwl_nvm_data) +
            sizeof(struct ieee80211_channel) +
            sizeof(struct ieee80211_rate)
    
    'mvm->nvm_data' is a 'struct iwl_nvm_data', so it is fine.
    
    At the end of this structure, there is the 'channels' flex array.
    Each element is of type 'struct ieee80211_channel'.
    So only 1 element is allocated in this array.
    
    When doing:
      mvm->nvm_data->bands[0].channels = mvm->nvm_data->channels;
    We point at the first element of the 'channels' flex array.
    So this is fine.
    
    However, when doing:
      mvm->nvm_data->bands[0].bitrates =
                            (void *)((u8 *)mvm->nvm_data->channels + 1);
    because of the "(u8 *)" cast, we add only 1 to the address of the beginning
    of the flex array.
    
    It is likely that we want to point at the 'struct ieee80211_rate' allocated
    just after.
    
    Remove the spurious casting so that the pointer arithmetic works as
    expected.
    
    Fixes: 8ca151b568b6 ("iwlwifi: add the MVM driver")
    Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
    Acked-by: Gregory Greenman <gregory.greenman@intel.com>
    Link: https://lore.kernel.org/r/23f0ec986ef1529055f4f93dcb3940a6cf8d9a94.1690143750.git.christophe.jaillet@wanadoo.fr
    Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
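
    The effect of the cast on the pointer arithmetic can be shown in
    isolation. The types below are hypothetical stand-ins for
    struct iwl_nvm_data, struct ieee80211_channel and
    struct ieee80211_rate; only the "+ 1" behaviour is the point.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; the real structs are much larger. */
struct toy_channel { uint32_t freq; uint32_t flags; };
struct toy_rate { uint16_t bitrate; uint16_t hw_value; };
struct toy_nvm {
    int n_channels;
    struct toy_channel channels[]; /* flexible array member */
};

/* Buggy form: the byte-pointer cast makes "+ 1" advance by one byte. */
void *toy_rates_buggy(struct toy_nvm *nvm)
{
    return (void *)((uint8_t *)nvm->channels + 1);
}

/* Fixed form: "+ 1" advances by one whole struct toy_channel. */
void *toy_rates_fixed(struct toy_nvm *nvm)
{
    return (void *)(nvm->channels + 1);
}
```

    In the buggy form the returned pointer lands inside the first channel
    element, so writes through it overlap the channel data.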

wifi: mac80211: fix potential key use-after-free [+ + +]
Author: Johannes Berg <johannes.berg@intel.com>
Date:   Tue Sep 19 08:34:15 2023 +0200

    wifi: mac80211: fix potential key use-after-free
    
    [ Upstream commit 31db78a4923ef5e2008f2eed321811ca79e7f71b ]
    
    When ieee80211_key_link() is called by ieee80211_gtk_rekey_add()
    but returns 0 due to KRACK protection (identical key reinstall),
    ieee80211_gtk_rekey_add() will still return a pointer into the
    key, in a potential use-after-free. This normally doesn't happen
    since it's only called by iwlwifi in case of WoWLAN rekey offload
    which has its own KRACK protection, but it is still better to fix. Do
    that by returning an error code and converting that to success on
    the cfg80211 boundary only, leaving the error for bad callers of
    ieee80211_gtk_rekey_add().
    
    Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
    Fixes: fdf7cb4185b6 ("mac80211: accept key reinstall without changing anything")
    Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

wifi: mt76: mt76x02: fix MT76x0 external LNA gain handling [+ + +]
Author: Felix Fietkau <nbd@nbd.name>
Date:   Tue Sep 19 21:47:47 2023 +0200

    wifi: mt76: mt76x02: fix MT76x0 external LNA gain handling
    
    [ Upstream commit 684e45e120b82deccaf8b85633905304a3bbf56d ]
    
    On MT76x0, LNA gain should be applied for both external and internal LNA.
    On MT76x2, LNA gain should be treated as 0 for external LNA.
    Move the LNA type based logic to mt76x2 in order to fix mt76x0.
    
    Fixes: 2daa67588f34 ("mt76x0: unify lna_gain parsing")
    Reported-by: Shiji Yang <yangshiji66@outlook.com>
    Signed-off-by: Felix Fietkau <nbd@nbd.name>
    Signed-off-by: Kalle Valo <kvalo@kernel.org>
    Link: https://lore.kernel.org/r/20230919194747.31647-1-nbd@nbd.name
    Signed-off-by: Sasha Levin <sashal@kernel.org>

wifi: mwifiex: Fix oob check condition in mwifiex_process_rx_packet [+ + +]
Author: Pin-yen Lin <treapking@chromium.org>
Date:   Fri Sep 8 18:41:12 2023 +0800

    wifi: mwifiex: Fix oob check condition in mwifiex_process_rx_packet
    
    [ Upstream commit aef7a0300047e7b4707ea0411dc9597cba108fc8 ]
    
    Only skip the code path trying to access the rfc1042 headers when the
    buffer is too small, so the driver can still process packets without
    rfc1042 headers.
    
    Fixes: 119585281617 ("wifi: mwifiex: Fix OOB and integer underflow when rx packets")
    Signed-off-by: Pin-yen Lin <treapking@chromium.org>
    Acked-by: Brian Norris <briannorris@chromium.org>
    Reviewed-by: Matthew Wang <matthewmwang@chromium.org>
    Signed-off-by: Kalle Valo <kvalo@kernel.org>
    Link: https://lore.kernel.org/r/20230908104308.1546501-1-treapking@chromium.org
    Signed-off-by: Sasha Levin <sashal@kernel.org>

wifi: mwifiex: Fix tlv_buf_left calculation [+ + +]
Author: Gustavo A. R. Silva <gustavoars@kernel.org>
Date:   Thu Aug 24 21:06:51 2023 -0600

    wifi: mwifiex: Fix tlv_buf_left calculation
    
    commit eec679e4ac5f47507774956fb3479c206e761af7 upstream.
    
    In a TLV encoding scheme, the Length part represents the length after
    the header containing the values for type and length. In this case,
    `tlv_len` should be:
    
    tlv_len == (sizeof(*tlv_rxba) - 1) - sizeof(tlv_rxba->header) + tlv_bitmap_len
    
    Notice that the `- 1` accounts for the one-element array `bitmap`, whose
    1-byte size is already included in `sizeof(*tlv_rxba)`.
    
    So, if the above is correct, there is a double-counting of some members
    in `struct mwifiex_ie_types_rxba_sync`, when `tlv_buf_left` and `tmp`
    are calculated:
    
    968                 tlv_buf_left -= (sizeof(*tlv_rxba) + tlv_len);
    969                 tmp = (u8 *)tlv_rxba + tlv_len + sizeof(*tlv_rxba);
    
    specifically, the members:
    
    drivers/net/wireless/marvell/mwifiex/fw.h:777
     777         u8 mac[ETH_ALEN];
     778         u8 tid;
     779         u8 reserved;
     780         __le16 seq_num;
     781         __le16 bitmap_len;
    
    This is clearly wrong, and affects the subsequent decoding of data in
    `event_buf` through `tlv_rxba`:
    
    970                 tlv_rxba = (struct mwifiex_ie_types_rxba_sync *)tmp;
    
    Fix this by using `sizeof(tlv_rxba->header)` instead of `sizeof(*tlv_rxba)`
    in the calculation of `tlv_buf_left` and `tmp`.
    
    This results in the following binary differences before/after changes:
    
    | drivers/net/wireless/marvell/mwifiex/11n_rxreorder.o
    | @@ -4698,11 +4698,11 @@
    |  drivers/net/wireless/marvell/mwifiex/11n_rxreorder.c:968
    |                 tlv_buf_left -= (sizeof(tlv_rxba->header) + tlv_len);
    | -    1da7:      lea    -0x11(%rbx),%edx
    | +    1da7:      lea    -0x4(%rbx),%edx
    |      1daa:      movzwl %bp,%eax
    |  drivers/net/wireless/marvell/mwifiex/11n_rxreorder.c:969
    |                 tmp = (u8 *)tlv_rxba  + sizeof(tlv_rxba->header) + tlv_len;
    | -    1dad:      lea    0x11(%r15,%rbp,1),%r15
    | +    1dad:      lea    0x4(%r15,%rbp,1),%r15
    
    The above reflects the desired change: avoid counting 13 too many bytes,
    which is the total size of the double-counted members in
    `struct mwifiex_ie_types_rxba_sync`:
    
    $ pahole -C mwifiex_ie_types_rxba_sync drivers/net/wireless/marvell/mwifiex/11n_rxreorder.o
    struct mwifiex_ie_types_rxba_sync {
            struct mwifiex_ie_types_header header;           /*     0     4 */
    
         |-----------------------------------------------------------------------
         |  u8                         mac[6];               /*     4     6 */  |
         |  u8                         tid;                  /*    10     1 */  |
         |  u8                         reserved;             /*    11     1 */  |
         |  __le16                     seq_num;              /*    12     2 */  |
         |  __le16                     bitmap_len;           /*    14     2 */  |
         |  u8                         bitmap[1];            /*    16     1 */  |
         |----------------------------------------------------------------------|
                                                                      | 13 bytes|
                                                                      -----------
    
            /* size: 17, cachelines: 1, members: 7 */
            /* last cacheline: 17 bytes */
    } __attribute__((__packed__));
    
    Fixes: 99ffe72cdae4 ("mwifiex: process rxba_sync event")
    Cc: stable@vger.kernel.org
    Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
    Reviewed-by: Kees Cook <keescook@chromium.org>
    Signed-off-by: Kalle Valo <kvalo@kernel.org>
    Link: https://lore.kernel.org/r/06668edd68e7a26bbfeebd1201ae077a2a7a8bce.1692931954.git.gustavoars@kernel.org
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
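
    The stepping rule described above (advance by the header size plus the
    length field, never by the size of the whole struct) can be sketched
    with a generic TLV walker. The header layout and the names
    struct toy_tlv_hdr and toy_tlv_count() are assumptions made for
    illustration; they are not the mwifiex definitions, and the fields
    are kept in host byte order for simplicity.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 4-byte TLV header: 16-bit type, 16-bit value length. */
struct toy_tlv_hdr { uint16_t type; uint16_t len; };

/*
 * Count the TLVs in buf. The length field covers only the value, so
 * each step advances by sizeof(header) + len.
 */
int toy_tlv_count(const uint8_t *buf, size_t buf_left)
{
    int n = 0;

    while (buf_left >= sizeof(struct toy_tlv_hdr)) {
        struct toy_tlv_hdr hdr;

        memcpy(&hdr, buf, sizeof(hdr)); /* avoids unaligned access */
        if (hdr.len > buf_left - sizeof(hdr))
            break; /* truncated value: stop before reading past the end */

        buf += sizeof(hdr) + hdr.len;
        buf_left -= sizeof(hdr) + hdr.len;
        n++;
    }
    return n;
}
```

    Stepping by the size of a full struct plus the length would skip past
    the start of the next header, which is the double counting the commit
    removes.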

 
x86/sev: Use the GHCB protocol when available for SNP CPUID requests [+ + +]
Author: Tom Lendacky <thomas.lendacky@amd.com>
Date:   Fri Jul 28 16:09:26 2023 -0500

    x86/sev: Use the GHCB protocol when available for SNP CPUID requests
    
    commit 6bc6f7d9d7ac3cdbe9e8b0495538b4a0cc11f032 upstream.
    
    SNP retrieves the majority of CPUID information from the SNP CPUID page.
    But there are times when that information needs to be supplemented by the
    hypervisor, for example, obtaining the initial APIC ID of the vCPU from
    leaf 1.
    
    The current implementation uses the MSR protocol to retrieve the data from
    the hypervisor, even when a GHCB exists. The problem arises when an NMI
    arrives on return from the VMGEXIT. The NMI will be immediately serviced
    and may generate a #VC requiring communication with the hypervisor.
    
    Since a GHCB exists in this case, it will be used. As part of using the
    GHCB, the #VC handler will write the GHCB physical address into the GHCB
    MSR and the #VC will be handled.
    
    When the NMI completes, processing resumes at the site of the VMGEXIT
    which is expecting to read the GHCB MSR and find a CPUID MSR protocol
    response. Since the NMI handling overwrote the GHCB MSR response, the
    guest will see an invalid reply from the hypervisor and self-terminate.
    
    Fix this problem by using the GHCB when it is available. Any NMI
    received is properly handled because the GHCB contents are copied into
    a backup page and restored on NMI exit, thus preserving the active GHCB
    request or result.
    
      [ bp: Touchups. ]
    
    Fixes: ee0bfa08a345 ("x86/compressed/64: Add support for SEV-SNP CPUID table in #VC handlers")
    Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
    Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
    Cc: <stable@kernel.org>
    Link: https://lore.kernel.org/r/a5856fa1ebe3879de91a8f6298b6bbd901c61881.1690578565.git.thomas.lendacky@amd.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
xen/events: replace evtchn_rwlock with RCU [+ + +]
Author: Juergen Gross <jgross@suse.com>
Date:   Mon Aug 28 08:09:47 2023 +0200

    xen/events: replace evtchn_rwlock with RCU
    
    commit 87797fad6cce28ec9be3c13f031776ff4f104cfc upstream.
    
    In unprivileged Xen guests, event handling can cause a deadlock with
    Xen console handling. The evtchn_rwlock and the hvc_lock are taken in
    opposite sequence in __hvc_poll() and in Xen console IRQ handling.
    Normally this is no problem, as the evtchn_rwlock is taken as a reader
    in both paths, but as soon as an event channel is being closed, the
    lock will be taken as a writer, which will cause read_lock() to block:
    
    CPU0                     CPU1                CPU2
    (IRQ handling)           (__hvc_poll())      (closing event channel)
    
    read_lock(evtchn_rwlock)
                             spin_lock(hvc_lock)
                                                 write_lock(evtchn_rwlock)
                                                     [blocks]
    spin_lock(hvc_lock)
        [blocks]
                            read_lock(evtchn_rwlock)
                                [blocks due to writer waiting,
                                 and not in_interrupt()]
    
    This issue can be avoided by replacing evtchn_rwlock with RCU in
    xen_free_irq(). Note that RCU is used only to delay freeing of the
    irq_info memory. There is no RCU based dereferencing or replacement of
    pointers involved.
    
    In order to avoid potential races between removing the irq_info
    reference and handling of interrupts, set the irq_info pointer to NULL
    only when freeing its memory. The IRQ itself must be freed at that
    time, too, as otherwise the same IRQ number could be allocated again
    before handling of the old instance would have been finished.
    
    This is XSA-441 / CVE-2023-34324.
    
    Fixes: 54c9de89895e ("xen/events: add a new "late EOI" evtchn framework")
    Reported-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
    Signed-off-by: Juergen Gross <jgross@suse.com>
    Reviewed-by: Julien Grall <jgrall@amazon.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>