Changelog for Linux 6.1.72

 
ALSA: hda/realtek: Add quirk for Lenovo Yoga Pro 7 [+ + +]
Author: Takashi Iwai <tiwai@suse.de>
Date:   Thu Dec 7 19:20:35 2023 +0100

    ALSA: hda/realtek: Add quirk for Lenovo Yoga Pro 7
    
    [ Upstream commit 634e5e1e06f5cdd614a1bc429ecb243a51cc009d ]
    
    Lenovo Yoga Pro 7 14APH8 (PCI SSID 17aa:3882) seems to require a
    bass speaker workaround similar to the one for the Yoga 9 model.
    
    Cc: <stable@vger.kernel.org>
    Link: https://lore.kernel.org/r/CAGGk=CRRQ1L9p771HsXTN_ebZP41Qj+3gw35Gezurn+nokRewg@mail.gmail.com
    Link: https://lore.kernel.org/r/20231207182035.30248-1-tiwai@suse.de
    Signed-off-by: Takashi Iwai <tiwai@suse.de>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

ALSA: hda/realtek: enable SND_PCI_QUIRK for hp pavilion 14-ec1xxx series [+ + +]
Author: Aabish Malik <aabishmalik3337@gmail.com>
Date:   Fri Dec 29 22:33:54 2023 +0530

    ALSA: hda/realtek: enable SND_PCI_QUIRK for hp pavilion 14-ec1xxx series
    
    commit 13a5b21197587a3d9cac9e1a00de9b91526a55e4 upstream.
    
    The HP Pavilion 14 ec1xxx series uses the HP mainboard 8A0F with the
    ALC287 codec.
    The mute LED can be enabled using the already existing
    ALC287_FIXUP_HP_GPIO_LED quirk.
    Tested on an HP Pavilion ec1003AU.
    
    Signed-off-by: Aabish Malik <aabishmalik3337@gmail.com>
    Cc: <stable@vger.kernel.org>
    Link: https://lore.kernel.org/r/20231229170352.742261-3-aabishmalik3337@gmail.com
    Signed-off-by: Takashi Iwai <tiwai@suse.de>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ALSA: hda/realtek: Fix mute and mic-mute LEDs for HP ProBook 440 G6 [+ + +]
Author: Siddhesh Dharme <siddheshdharme18@gmail.com>
Date:   Thu Jan 4 11:37:36 2024 +0530

    ALSA: hda/realtek: Fix mute and mic-mute LEDs for HP ProBook 440 G6
    
    commit b6ce6e6c79e4ec650887f1fe391a70e54972001a upstream.
    
    LEDs in 'HP ProBook 440 G6' laptop are controlled by ALC236 codec.
    Enable already existing quirk 'ALC236_FIXUP_HP_MUTE_LED_MICMUTE_VREF'
    to fix mute and mic-mute LEDs.
    
    Signed-off-by: Siddhesh Dharme <siddheshdharme18@gmail.com>
    Cc: <stable@vger.kernel.org>
    Link: https://lore.kernel.org/r/20240104060736.5149-1-siddheshdharme18@gmail.com
    Signed-off-by: Takashi Iwai <tiwai@suse.de>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ALSA: hda/realtek: fix mute/micmute LEDs for a HP ZBook [+ + +]
Author: Andy Chi <andy.chi@canonical.com>
Date:   Tue Jan 2 10:49:15 2024 +0800

    ALSA: hda/realtek: fix mute/micmute LEDs for a HP ZBook
    
    commit 18a434f32fa61b3fda8ddcd9a63d5274569c6a41 upstream.
    
    There is an HP ZBook which uses the ALC236 codec and needs the
    ALC236_FIXUP_HP_MUTE_LED_MICMUTE_VREF quirk to make the mute LED
    and micmute LED work.
    
    [ confirmed that the new entries are for new models that have no
      proper name, so the strings are left as "HP" which will be updated
      eventually later -- tiwai ]
    
    Signed-off-by: Andy Chi <andy.chi@canonical.com>
    Cc: <stable@vger.kernel.org>
    Link: https://lore.kernel.org/r/20240102024916.19093-1-andy.chi@canonical.com
    Signed-off-by: Takashi Iwai <tiwai@suse.de>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
arm64: dts: qcom: sdm845: align RPMh regulator nodes with bindings [+ + +]
Author: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Date:   Fri Jan 27 12:43:42 2023 +0100

    arm64: dts: qcom: sdm845: align RPMh regulator nodes with bindings
    
    [ Upstream commit 86dd19bbdea2b7d3feb69c0c39f141de30a18ec9 ]
    
    Device node names should be generic and bindings expect certain pattern
    for RPMh regulator nodes.
    
    Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
    Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org>
    Signed-off-by: Bjorn Andersson <andersson@kernel.org>
    Link: https://lore.kernel.org/r/20230127114347.235963-6-krzysztof.kozlowski@linaro.org
    Stable-dep-of: a5f01673d394 ("arm64: dts: qcom: sdm845: Fix PSCI power domain names")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

arm64: dts: qcom: sdm845: Fix PSCI power domain names [+ + +]
Author: David Heidelberg <david@ixit.cz>
Date:   Tue Sep 12 12:42:03 2023 +0530

    arm64: dts: qcom: sdm845: Fix PSCI power domain names
    
    [ Upstream commit a5f01673d3946e424091e6b8ff274716f9c21454 ]
    
    The original commit hasn't been updated according to
    refactoring done in sdm845.dtsi.
    
    Fixes: a1ade6cac5a2 ("arm64: dts: qcom: sdm845: Switch PSCI cpu idle states from PC to OSI")
    Suggested-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
    Reviewed-by: Douglas Anderson <dianders@chromium.org>
    Signed-off-by: David Heidelberg <david@ixit.cz>
    Reviewed-by: Stephen Boyd <swboyd@chromium.org>
    Reviewed-by: Abel Vesa <abel.vesa@linaro.org>
    Link: https://lore.kernel.org/r/20230912071205.11502-1-david@ixit.cz
    Signed-off-by: Bjorn Andersson <andersson@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
ARM: sun9i: smp: Fix array-index-out-of-bounds read in sunxi_mc_smp_init [+ + +]
Author: Stefan Wahren <wahrenst@gmx.net>
Date:   Thu Dec 28 20:39:02 2023 +0100

    ARM: sun9i: smp: Fix array-index-out-of-bounds read in sunxi_mc_smp_init
    
    [ Upstream commit 72ad3b772b6d393701df58ba1359b0bb346a19ed ]
    
    Running a multi-arch kernel (multi_v7_defconfig) on a Raspberry Pi 3B+
    with enabled CONFIG_UBSAN triggers the following warning:
    
     UBSAN: array-index-out-of-bounds in arch/arm/mach-sunxi/mc_smp.c:810:29
     index 2 is out of range for type 'sunxi_mc_smp_data [2]'
     CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.7.0-rc6-00248-g5254c0cbc92d
     Hardware name: BCM2835
      unwind_backtrace from show_stack+0x10/0x14
      show_stack from dump_stack_lvl+0x40/0x4c
      dump_stack_lvl from ubsan_epilogue+0x8/0x34
      ubsan_epilogue from __ubsan_handle_out_of_bounds+0x78/0x80
      __ubsan_handle_out_of_bounds from sunxi_mc_smp_init+0xe4/0x4cc
      sunxi_mc_smp_init from do_one_initcall+0xa0/0x2fc
      do_one_initcall from kernel_init_freeable+0xf4/0x2f4
      kernel_init_freeable from kernel_init+0x18/0x158
      kernel_init from ret_from_fork+0x14/0x28
    
    Since the enable method might not match any entry in
    sunxi_mc_smp_data, the value of the index must not be used right after
    the loop. So move the access after the check of ret in order to have a
    valid index.
    
    Fixes: 1631090e34f5 ("ARM: sun9i: smp: Add is_a83t field")
    Signed-off-by: Stefan Wahren <wahrenst@gmx.net>
    Link: https://lore.kernel.org/r/20231228193903.9078-1-wahrenst@gmx.net
    Reviewed-by: Chen-Yu Tsai <wens@csie.org>
    Signed-off-by: Arnd Bergmann <arnd@arndb.de>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
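
The pattern behind this fix can be sketched in user-space C. This is a minimal
illustration, not the kernel code: the struct, the table contents and the
function name lookup_is_a83t() are assumptions made for the example. The point
is that the loop index equals the table size when nothing matched, so the
table is only indexed after the match check.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the driver's per-SoC table. */
struct smp_data {
	const char *enable_method;
	int is_a83t;
};

static const struct smp_data smp_table[] = {
	{ "allwinner,sun9i-a80-smp", 0 },	/* illustrative strings */
	{ "allwinner,sun8i-a83t-smp", 1 },
};

#define TABLE_SIZE (sizeof(smp_table) / sizeof(smp_table[0]))

/*
 * Returns 0 and stores the matched entry's flag in *is_a83t, or -1 when
 * no entry matches. The table is indexed only after the match check.
 */
static int lookup_is_a83t(const char *method, int *is_a83t)
{
	size_t i;
	int ret = -1;

	for (i = 0; i < TABLE_SIZE; i++) {
		if (strcmp(method, smp_table[i].enable_method) == 0) {
			ret = 0;
			break;
		}
	}

	if (ret)
		return ret;	/* i == TABLE_SIZE here, so do not index */

	*is_a83t = smp_table[i].is_a83t;
	return 0;
}
```

Calling lookup_is_a83t() with a method that is not in the table returns -1
and leaves the output untouched, which is the case a non-sunxi board hits.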

 
asix: Add check for usbnet_get_endpoints [+ + +]
Author: Chen Ni <nichen@iscas.ac.cn>
Date:   Wed Jan 3 03:35:34 2024 +0000

    asix: Add check for usbnet_get_endpoints
    
    [ Upstream commit eaac6a2d26b65511e164772bec6918fcbc61938e ]
    
    Add check for usbnet_get_endpoints() and return the error if it fails
    in order to transfer the error.
    
    Fixes: 16626b0cc3d5 ("asix: Add a new driver for the AX88172A")
    Signed-off-by: Chen Ni <nichen@iscas.ac.cn>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
ASoC: fsl_rpmsg: Fix error handler with pm_runtime_enable [+ + +]
Author: Chancel Liu <chancel.liu@nxp.com>
Date:   Mon Dec 25 17:06:08 2023 +0900

    ASoC: fsl_rpmsg: Fix error handler with pm_runtime_enable
    
    [ Upstream commit f9d378fc68c43fd41b35133edec9cd902ec334ec ]
    
    An error message is printed when a deferred probe happens:
    
    fsl_rpmsg rpmsg_audio: Unbalanced pm_runtime_enable!
    
    Fix the error handler with pm_runtime_enable.
    
    Fixes: b73d9e6225e8 ("ASoC: fsl_rpmsg: Add CPU DAI driver for audio base on rpmsg")
    Signed-off-by: Chancel Liu <chancel.liu@nxp.com>
    Acked-by: Shengjiu Wang <shengjiu.wang@gmail.com>
    Link: https://lore.kernel.org/r/20231225080608.967953-1-chancel.liu@nxp.com
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

ASoC: mediatek: mt8186: fix AUD_PAD_TOP register and offset [+ + +]
Author: Eugen Hristev <eugen.hristev@collabora.com>
Date:   Fri Dec 29 13:43:42 2023 +0200

    ASoC: mediatek: mt8186: fix AUD_PAD_TOP register and offset
    
    [ Upstream commit 38744c3fa00109c51076121c2deb4f02e2f09194 ]
    
    The correct register for the AUD_PAD_TOP widget is AFE_AUD_PAD_TOP, not
    zero. With zero as the register, `snd_soc_dapm_new_widgets` would try to
    read the register at offset zero when getting the power status of this
    widget, which is incorrect.
    
    Fixes: b65c466220b3 ("ASoC: mediatek: mt8186: support adda in platform driver")
    Signed-off-by: Eugen Hristev <eugen.hristev@collabora.com>
    Link: https://lore.kernel.org/r/20231229114342.195867-1-eugen.hristev@collabora.com
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

ASoC: meson: g12a-toacodec: Fix event generation [+ + +]
Author: Mark Brown <broonie@kernel.org>
Date:   Wed Jan 3 18:34:03 2024 +0000

    ASoC: meson: g12a-toacodec: Fix event generation
    
    [ Upstream commit 172c88244b5f2d3375403ebb504d407be0fded59 ]
    
    When a control changes value the return value from _put() should be 1 so
    we get events generated to userspace notifying applications of the change.
    We are checking if there has been a change and exiting early if not but we
    are not providing the correct return value in the latter case, fix this.
    
    Fixes: af2618a2eee8 ("ASoC: meson: g12a: add internal DAC glue driver")
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Link: https://lore.kernel.org/r/20240103-meson-enum-val-v1-3-424af7a8fb91@kernel.org
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
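
The return-value contract described above can be sketched as follows. This is
a simplified user-space model, assuming a hypothetical struct mux_state and
function mux_put(); it is not the driver's callback, only the 0-versus-1
convention that lets the core decide whether to notify userspace.

```c
/* Hypothetical stand-in for a mux control backed by one register field. */
struct mux_state {
	unsigned int reg;
};

/*
 * Models the contract of a control _put() callback:
 * return 0 when the value is unchanged, 1 when it changed.
 */
static int mux_put(struct mux_state *st, unsigned int new_val)
{
	if (st->reg == new_val)
		return 0;	/* early exit: nothing to report */

	st->reg = new_val;
	return 1;		/* changed: an event should be generated */
}
```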

ASoC: meson: g12a-toacodec: Validate written enum values [+ + +]
Author: Mark Brown <broonie@kernel.org>
Date:   Wed Jan 3 18:34:01 2024 +0000

    ASoC: meson: g12a-toacodec: Validate written enum values
    
    [ Upstream commit 3150b70e944ead909260285dfb5707d0bedcf87b ]
    
    When writing to an enum we need to verify that the value written is valid
    for the enumeration, the helper function snd_soc_item_enum_to_val() doesn't
    do it since it needs to return an unsigned (and in any case we'd need to
    check the return value).
    
    Fixes: af2618a2eee8 ("ASoC: meson: g12a: add internal DAC glue driver")
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Link: https://lore.kernel.org/r/20240103-meson-enum-val-v1-1-424af7a8fb91@kernel.org
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
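
A minimal sketch of the validation step, assuming a hypothetical struct
enum_desc and helper enum_item_to_val() that returns an error code and passes
the mapped value through an output parameter. The real helper returns an
unsigned value, which is why the driver has to check the item index itself
before calling it.

```c
#include <errno.h>

/* Hypothetical description of an enumerated control. */
struct enum_desc {
	unsigned int items;		/* number of valid items */
	const unsigned int *values;	/* register value per item */
};

/*
 * Reject out-of-range items before the table lookup, so an invalid
 * write from userspace never indexes past the end of values[].
 */
static int enum_item_to_val(const struct enum_desc *e, unsigned int item,
			    unsigned int *val)
{
	if (item >= e->items)
		return -EINVAL;

	*val = e->values[item];
	return 0;
}
```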

ASoC: meson: g12a-tohdmitx: Fix event generation for S/PDIF mux [+ + +]
Author: Mark Brown <broonie@kernel.org>
Date:   Wed Jan 3 18:34:04 2024 +0000

    ASoC: meson: g12a-tohdmitx: Fix event generation for S/PDIF mux
    
    [ Upstream commit b036d8ef3120b996751495ce25994eea58032a98 ]
    
    When a control changes value the return value from _put() should be 1 so
    we get events generated to userspace notifying applications of the change.
    While the I2S mux gets this right the S/PDIF mux does not, fix the return
    value.
    
    Fixes: c8609f3870f7 ("ASoC: meson: add g12a tohdmitx control")
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Link: https://lore.kernel.org/r/20240103-meson-enum-val-v1-4-424af7a8fb91@kernel.org
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

ASoC: meson: g12a-tohdmitx: Validate written enum values [+ + +]
Author: Mark Brown <broonie@kernel.org>
Date:   Wed Jan 3 18:34:02 2024 +0000

    ASoC: meson: g12a-tohdmitx: Validate written enum values
    
    [ Upstream commit 1e001206804be3f3d21f4a1cf16e5d059d75643f ]
    
    When writing to an enum we need to verify that the value written is valid
    for the enumeration, the helper function snd_soc_item_enum_to_val() doesn't
    do it since it needs to return an unsigned (and in any case we'd need to
    check the return value).
    
    Fixes: c8609f3870f7 ("ASoC: meson: add g12a tohdmitx control")
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Link: https://lore.kernel.org/r/20240103-meson-enum-val-v1-2-424af7a8fb91@kernel.org
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
blk-mq: make sure active queue usage is held for bio_integrity_prep() [+ + +]
Author: Christoph Hellwig <hch@infradead.org>
Date:   Mon Nov 13 11:52:31 2023 +0800

    blk-mq: make sure active queue usage is held for bio_integrity_prep()
    
    [ Upstream commit b0077e269f6c152e807fdac90b58caf012cdbaab ]
    
    blk_integrity_unregister() can run while the queue usage counter isn't
    held for a bio with integrity prepared, so the request may be completed
    by calling profile->complete_fn, which leads to a kernel panic.
    
    Another constraint is that bio_integrity_prep() needs to be called
    before bio merge.
    
    Fix the issue by:
    
    - call bio_integrity_prep() with one queue usage counter grabbed reliably
    
    - call bio_integrity_prep() before bio merge
    
    Fixes: 900e080752025f00 ("block: move queue enter logic into blk_mq_submit_bio()")
    Reported-by: Yi Zhang <yi.zhang@redhat.com>
    Cc: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Ming Lei <ming.lei@redhat.com>
    Tested-by: Yi Zhang <yi.zhang@redhat.com>
    Link: https://lore.kernel.org/r/20231113035231.2708053-1-ming.lei@redhat.com
    Signed-off-by: Jens Axboe <axboe@kernel.dk>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
block: Don't invalidate pagecache for invalid falloc modes [+ + +]
Author: Sarthak Kukreti <sarthakkukreti@chromium.org>
Date:   Wed Oct 11 13:12:30 2023 -0700

    block: Don't invalidate pagecache for invalid falloc modes
    
    commit 1364a3c391aedfeb32aa025303ead3d7c91cdf9d upstream.
    
    Only call truncate_bdev_range() if the fallocate mode is supported. This
    fixes a bug where data in the pagecache could be invalidated if the
    fallocate() was called on the block device with an invalid mode.
    
    Fixes: 25f4c41415e5 ("block: implement (some of) fallocate for block devices")
    Cc: stable@vger.kernel.org
    Reported-by: "Darrick J. Wong" <djwong@kernel.org>
    Signed-off-by: Sarthak Kukreti <sarthakkukreti@chromium.org>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
    Signed-off-by: Mike Snitzer <snitzer@kernel.org>
    Link: https://lore.kernel.org/r/20231011201230.750105-1-sarthakkukreti@chromium.org
    Signed-off-by: Jens Axboe <axboe@kernel.dk>
    Signed-off-by: Sarthak Kukreti <sarthakkukreti@chromium.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
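
The ordering this fix enforces can be sketched in user-space C. The MODE_*
bits, invalidate_range() and dev_fallocate() are hypothetical names for the
example (the kernel works with the FALLOC_FL_* flags and
truncate_bdev_range()); the sketch only shows that the cache invalidation
happens inside the supported cases, never before the mode is known to be valid.

```c
#include <errno.h>

/* Hypothetical mode bits standing in for the FALLOC_FL_* flags. */
#define MODE_KEEP_SIZE  0x1
#define MODE_PUNCH_HOLE 0x2
#define MODE_ZERO_RANGE 0x4

static int invalidate_calls;	/* counts cache invalidations */

static void invalidate_range(void)
{
	invalidate_calls++;
}

/* Invalidate cached data only for modes that are actually supported. */
static int dev_fallocate(int mode)
{
	switch (mode) {
	case MODE_ZERO_RANGE:
	case MODE_ZERO_RANGE | MODE_KEEP_SIZE:
	case MODE_PUNCH_HOLE | MODE_KEEP_SIZE:
		invalidate_range();
		return 0;
	default:
		/* Unsupported mode: fail without touching the cache. */
		return -EOPNOTSUPP;
	}
}
```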

block: update the stable_writes flag in bdev_add [+ + +]
Author: Christoph Hellwig <hch@lst.de>
Date:   Wed Oct 25 16:10:18 2023 +0200

    block: update the stable_writes flag in bdev_add
    
    [ Upstream commit 1898efcdbed32bb1c67269c985a50bab0dbc9493 ]
    
    Propagate the per-queue stable_write flags into each bdev inode in bdev_add.
    This makes sure devices that require stable writes have it set for I/O
    on the block device node as well.
    
    Note that this doesn't cover the case of a flag changing on a live device
    yet.  We should handle that as well, but I plan to cover it as part of a
    more general rework of how changing runtime parameters on block devices
    works.
    
    Fixes: 1cb039f3dc16 ("bdi: replace BDI_CAP_STABLE_WRITES with a queue and a sb flag")
    Reported-by: Ilya Dryomov <idryomov@gmail.com>
    Signed-off-by: Christoph Hellwig <hch@lst.de>
    Link: https://lore.kernel.org/r/20231025141020.192413-3-hch@lst.de
    Tested-by: Ilya Dryomov <idryomov@gmail.com>
    Reviewed-by: Darrick J. Wong <djwong@kernel.org>
    Signed-off-by: Christian Brauner <brauner@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
bnxt_en: Remove mis-applied code from bnxt_cfg_ntp_filters() [+ + +]
Author: Michael Chan <michael.chan@broadcom.com>
Date:   Wed Jan 3 16:59:24 2024 -0800

    bnxt_en: Remove mis-applied code from bnxt_cfg_ntp_filters()
    
    [ Upstream commit e009b2efb7a8850498796b360043ac25c8d3d28f ]
    
    The 2 lines to check for the BNXT_HWRM_PF_UNLOAD_SP_EVENT bit were
    mis-applied to bnxt_cfg_ntp_filters() and should have been applied to
    bnxt_sp_task().
    
    Fixes: 19241368443f ("bnxt_en: Send PF driver unload notification to all VFs.")
    Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
    Signed-off-by: Michael Chan <michael.chan@broadcom.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
bpf, sockmap: af_unix stream sockets need to hold ref for pair sock [+ + +]
Author: John Fastabend <john.fastabend@gmail.com>
Date:   Tue Nov 28 17:25:56 2023 -0800

    bpf, sockmap: af_unix stream sockets need to hold ref for pair sock
    
    [ Upstream commit 8866730aed5100f06d3d965c22f1c61f74942541 ]
    
    AF_UNIX stream sockets are a paired socket. So sending on one of the pairs
    will lookup the paired socket as part of the send operation. It is possible
    however to put just one of the pairs in a BPF map. This currently increments
    the refcnt on the sock in the sockmap to ensure it is not free'd by the
    stack before sockmap cleans up its state and stops any skbs being sent/recv'd
    to that socket.
    
    But we missed a case. If the peer socket is closed it will be free'd by the
    stack. However, the paired socket can still be referenced from BPF sockmap
    side because we hold a reference there. Then if we are sending traffic through
    BPF sockmap to that socket it will try to dereference the free'd pair in its
    send logic creating a use after free. And following splat:
    
       [59.900375] BUG: KASAN: slab-use-after-free in sk_wake_async+0x31/0x1b0
       [59.901211] Read of size 8 at addr ffff88811acbf060 by task kworker/1:2/954
       [...]
       [59.905468] Call Trace:
       [59.905787]  <TASK>
       [59.906066]  dump_stack_lvl+0x130/0x1d0
       [59.908877]  print_report+0x16f/0x740
       [59.910629]  kasan_report+0x118/0x160
       [59.912576]  sk_wake_async+0x31/0x1b0
       [59.913554]  sock_def_readable+0x156/0x2a0
       [59.914060]  unix_stream_sendmsg+0x3f9/0x12a0
       [59.916398]  sock_sendmsg+0x20e/0x250
       [59.916854]  skb_send_sock+0x236/0xac0
       [59.920527]  sk_psock_backlog+0x287/0xaa0
    
    To fix let BPF sockmap hold a refcnt on both the socket in the sockmap and its
    paired socket. It wasn't obvious how to contain the fix to bpf_unix logic. The
    primary problem with keeping this logic in bpf_unix was: In the sock close()
    we could handle the deref by having a close handler. But, when we are destroying
    the psock through a map delete operation we wouldn't have gotten any signal
    through the proto struct other than it being replaced. If we do the deref from
    the proto replace it's too early because we need to deref the sk_pair after the
    backlog worker has been stopped.
    
    Given all this it seems best to just cache it at the end of the psock and eat 8B
    for the af_unix and vsock users. Notice dgram sockets are OK because they handle
    locking already.
    
    Fixes: 94531cfcbe79 ("af_unix: Add unix_stream_proto for sockmap")
    Signed-off-by: John Fastabend <john.fastabend@gmail.com>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
    Link: https://lore.kernel.org/bpf/20231129012557.95371-2-john.fastabend@gmail.com
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
bpf, x64: Fix tailcall infinite loop [+ + +]
Author: Leon Hwang <hffilwlqm@gmail.com>
Date:   Tue Sep 12 23:04:41 2023 +0800

    bpf, x64: Fix tailcall infinite loop
    
    [ Upstream commit 2b5dcb31a19a2e0acd869b12c9db9b2d696ef544 ]
    
    From commit ebf7d1f508a73871 ("bpf, x64: rework pro/epilogue and tailcall
    handling in JIT"), the tailcall on x64 works better than before.
    
    From commit e411901c0b775a3a ("bpf: allow for tailcalls in BPF subprograms
    for x64 JIT"), tailcall is able to run in BPF subprograms on x64.
    
    From commit 5b92a28aae4dd0f8 ("bpf: Support attaching tracing BPF program
    to other BPF programs"), BPF program is able to trace other BPF programs.
    
    How about combining them all together?
    
    1. FENTRY/FEXIT on a BPF subprogram.
    2. A tailcall runs in the BPF subprogram.
    3. The tailcall calls the subprogram's caller.
    
    As a result, a tailcall infinite loop comes up. And the loop would halt
    the machine.
    
    As we know, in tail call context, tail_call_cnt propagates through the
    stack and the rax register between BPF subprograms. Do the same in
    trampolines.
    
    Fixes: ebf7d1f508a7 ("bpf, x64: rework pro/epilogue and tailcall handling in JIT")
    Fixes: e411901c0b77 ("bpf: allow for tailcalls in BPF subprograms for x64 JIT")
    Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
    Signed-off-by: Leon Hwang <hffilwlqm@gmail.com>
    Link: https://lore.kernel.org/r/20230912150442.2009-3-hffilwlqm@gmail.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
bpf, x86: save/restore regs with BPF_DW size [+ + +]
Author: Menglong Dong <imagedong@tencent.com>
Date:   Thu Jul 13 12:07:36 2023 +0800

    bpf, x86: save/restore regs with BPF_DW size
    
    [ Upstream commit 02a6dfa8ff43efb1c989f87a4d862aedf436088a ]
    
    As we already reserve 8 byte in the stack for each reg, it is ok to
    store/restore the regs in BPF_DW size. This will make the code in
    save_regs()/restore_regs() simpler.
    
    Signed-off-by: Menglong Dong <imagedong@tencent.com>
    Acked-by: Yonghong Song <yhs@fb.com>
    Link: https://lore.kernel.org/r/20230713040738.1789742-2-imagedong@tencent.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Stable-dep-of: 2b5dcb31a19a ("bpf, x64: Fix tailcall infinite loop")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

bpf, x86: Simplify the parsing logic of structure parameters [+ + +]
Author: Pu Lehui <pulehui@huawei.com>
Date:   Thu Jan 5 11:50:26 2023 +0800

    bpf, x86: Simplify the parsing logic of structure parameters
    
    [ Upstream commit 7f7880495770329d095d402c2865bfa7089192f8 ]
    
    Extra_nregs of structure parameters and nr_args can be
    added directly at the beginning, using a flip flag
    to identify structure parameters. Meanwhile, rename
    some variables so that they make more sense.
    
    Signed-off-by: Pu Lehui <pulehui@huawei.com>
    Acked-by: Yonghong Song <yhs@fb.com>
    Link: https://lore.kernel.org/r/20230105035026.3091988-1-pulehui@huaweicloud.com
    Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
    Stable-dep-of: 2b5dcb31a19a ("bpf, x64: Fix tailcall infinite loop")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
bpf: clean up visit_insn()'s instruction processing [+ + +]
Author: Andrii Nakryiko <andrii@kernel.org>
Date:   Thu Mar 2 15:50:04 2023 -0800

    bpf: clean up visit_insn()'s instruction processing
    
    [ Upstream commit 653ae3a874aca6764a4c1f5a8bf1b072ade0d6f4 ]
    
    Instead of referencing processed instruction repeatedly as insns[t]
    throughout entire visit_insn() function, take a local insn pointer and
    work with it in a cleaner way.
    
    It makes enhancing this function further a bit easier as well.
    
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Link: https://lore.kernel.org/r/20230302235015.2044271-7-andrii@kernel.org
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Stable-dep-of: 3feb263bb516 ("bpf: handle ldimm64 properly in check_cfg()")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

bpf: decouple prune and jump points [+ + +]
Author: Andrii Nakryiko <andrii@kernel.org>
Date:   Tue Dec 6 15:33:43 2022 -0800

    bpf: decouple prune and jump points
    
    [ Upstream commit bffdeaa8a5af7200b0e74c9d5a41167f86626a36 ]
    
    BPF verifier marks some instructions as prune points. Currently these
    prune points serve two purposes.
    
    It's a point where verifier tries to find previously verified state and
    check current state's equivalence to short circuit verification for
    current code path.
    
    But also currently it's a point where jump history, used for precision
    backtracking, is updated. This is done so that non-linear flow of
    execution could be properly backtracked.
    
    Such coupling is coincidental and unnecessary. Some prune points are not
    part of some non-linear jump path, so don't need update of jump history.
    On the other hand, not all instructions which have to be recorded in
    jump history necessarily are good prune points.
    
    This patch splits prune and jump points into independent flags.
    Currently all prune points are marked as jump points to minimize amount
    of changes in this patch, but next patch will perform some optimization
    of prune vs jmp point placement.
    
    No functional changes are intended.
    
    Acked-by: John Fastabend <john.fastabend@gmail.com>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Link: https://lore.kernel.org/r/20221206233345.438540-2-andrii@kernel.org
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Stable-dep-of: 3feb263bb516 ("bpf: handle ldimm64 properly in check_cfg()")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

bpf: Fix a verifier bug due to incorrect branch offset comparison with cpu=v4 [+ + +]
Author: Yonghong Song <yonghong.song@linux.dev>
Date:   Thu Nov 30 18:46:40 2023 -0800

    bpf: Fix a verifier bug due to incorrect branch offset comparison with cpu=v4
    
    commit dfce9cb3140592b886838e06f3e0c25fea2a9cae upstream.
    
    Bpf cpu=v4 support is introduced in [1] and Commit 4cd58e9af8b9
    ("bpf: Support new 32bit offset jmp instruction") added support for new
    32bit offset jmp instruction. Unfortunately, in function
    bpf_adj_delta_to_off(), for new branch insn with 32bit offset, the offset
    (plus/minus a small delta) is compared against the 16-bit offset bound
    [S16_MIN, S16_MAX], which caused the following verification failure:
      $ ./test_progs-cpuv4 -t verif_scale_pyperf180
      ...
      insn 10 cannot be patched due to 16-bit range
      ...
      libbpf: failed to load object 'pyperf180.bpf.o'
      scale_test:FAIL:expect_success unexpected error: -12 (errno 12)
      #405     verif_scale_pyperf180:FAIL
    
    Note that due to recent llvm18 development, the patch [2] (already applied
    in bpf-next) needs to be applied to bpf tree for testing purpose.
    
    The fix is rather simple. For 32bit offset branch insn, the adjusted
    offset is compared against [S32_MIN, S32_MAX], and verification succeeds.
    
      [1] https://lore.kernel.org/all/20230728011143.3710005-1-yonghong.song@linux.dev
      [2] https://lore.kernel.org/bpf/20231110193644.3130906-1-yonghong.song@linux.dev
    
    Fixes: 4cd58e9af8b9 ("bpf: Support new 32bit offset jmp instruction")
    Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Link: https://lore.kernel.org/bpf/20231201024640.3417057-1-yonghong.song@linux.dev
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
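
The bound selection can be sketched as follows. This is a simplified model
rather than the kernel's bpf_adj_delta_to_off(): the function name
adjust_branch_off(), the `wide` flag and the -ERANGE return code are
assumptions for the example. It shows only that the limit must follow the
width of the instruction's offset field.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Apply delta to a branch offset and check the result against the
 * range of the offset field: 32-bit for the wide jump encoding,
 * 16-bit for all other branches.
 */
static int adjust_branch_off(int64_t off, int64_t delta, bool wide,
			     int64_t *out)
{
	const int64_t min = wide ? INT32_MIN : INT16_MIN;
	const int64_t max = wide ? INT32_MAX : INT16_MAX;
	const int64_t adjusted = off + delta;

	if (adjusted < min || adjusted > max)
		return -ERANGE;	/* cannot be encoded in this field */

	*out = adjusted;
	return 0;
}
```

With wide set to false, an offset of 32767 plus a delta of 1 is rejected;
with wide set to true the same adjustment is accepted.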

bpf: fix precision backtracking instruction iteration [+ + +]
Author: Andrii Nakryiko <andrii@kernel.org>
Date:   Thu Nov 9 16:26:37 2023 -0800

    bpf: fix precision backtracking instruction iteration
    
    [ Upstream commit 4bb7ea946a370707315ab774432963ce47291946 ]
    
    Fix an edge case in __mark_chain_precision() which prematurely stops
    backtracking instructions in a state if it happens that state's first
    and last instruction indexes are the same. This situation doesn't
    necessarily mean that there were no instructions simulated in a state,
    but rather that we started from the instruction, jumped around a bit,
    and then ended up at the same instruction before checkpointing or
    marking precision.
    
    To distinguish between these two possible situations, we need to consult
    jump history. If it's empty or contains a single record "bridging" parent
    state and first instruction of processed state, then we indeed
    backtracked all instructions in this state. But if history is not empty,
    we are definitely not done yet.
    
    Move this logic inside get_prev_insn_idx() to contain it more nicely.
    Use -ENOENT return code to denote "we are out of instructions"
    situation.
    
    This bug was exposed by verifier_loop1.c's bounded_recursion subtest, once
    the next fix in this patch set is applied.
    
    Acked-by: Eduard Zingerman <eddyz87@gmail.com>
    Fixes: b5dc0163d8fd ("bpf: precise scalar_value tracking")
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Link: https://lore.kernel.org/r/20231110002638.4168352-3-andrii@kernel.org
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

bpf: handle ldimm64 properly in check_cfg() [+ + +]
Author: Andrii Nakryiko <andrii@kernel.org>
Date:   Thu Nov 9 16:26:36 2023 -0800

    bpf: handle ldimm64 properly in check_cfg()
    
    [ Upstream commit 3feb263bb516ee7e1da0acd22b15afbb9a7daa19 ]
    
    ldimm64 instructions are 16-byte long, and so have to be handled
    appropriately in check_cfg(), just like the rest of BPF verifier does.
    
    This has implications in three places:
      - when determining next instruction for non-jump instructions;
      - when determining next instruction for callback address ldimm64
        instructions (in visit_func_call_insn());
      - when checking for unreachable instructions, where second half of
        ldimm64 is expected to be unreachable;
    
    We take this also as an opportunity to report jump into the middle of
    ldimm64, and adjust a few test_verifier tests accordingly.
    
    Acked-by: Eduard Zingerman <eddyz87@gmail.com>
    Reported-by: Hao Sun <sunhao.th@gmail.com>
    Fixes: 475fb78fbf48 ("bpf: verifier (add branch/goto checks)")
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Link: https://lore.kernel.org/r/20231110002638.4168352-2-andrii@kernel.org
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
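
A minimal sketch of the stepping rule for non-jump instructions, assuming a
reduced struct insn that keeps only the opcode byte (the real instruction is
8 bytes wide) and hypothetical helper names. An ldimm64 occupies two
consecutive slots, so the next instruction is two slots ahead and the second
slot is never visited on its own.

```c
#include <stddef.h>
#include <stdint.h>

/* Reduced instruction: only the opcode byte is modelled here. */
struct insn {
	uint8_t code;
};

#define OP_LD_IMM_DW 0x18	/* BPF_LD | BPF_IMM | BPF_DW */

/* Index of the instruction that follows slot i in program order. */
static size_t next_insn(const struct insn *insns, size_t i)
{
	return i + (insns[i].code == OP_LD_IMM_DW ? 2 : 1);
}

/* Count logical instructions; the second half of ldimm64 is skipped. */
static size_t count_insns(const struct insn *insns, size_t len)
{
	size_t i = 0, n = 0;

	while (i < len) {
		i = next_insn(insns, i);
		n++;
	}
	return n;
}
```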

bpf: remove unnecessary prune and jump points [+ + +]
Author: Andrii Nakryiko <andrii@kernel.org>
Date:   Tue Dec 6 15:33:45 2022 -0800

    bpf: remove unnecessary prune and jump points
    
    [ Upstream commit 618945fbed501b6e5865042068a51edfb2dda948 ]
    
    Don't mark some instructions as jump points when there are actually no
    jumps and instructions are just processed sequentially. Such case is
    handled naturally by precision backtracking logic without the need to
    update jump history. See get_prev_insn_idx(). It goes back linearly by
    one instruction, unless current top of jmp_history is pointing to
    current instruction. In such case we use `st->jmp_history[cnt - 1].prev_idx`
    to find instruction from which we jumped to the current instruction
    non-linearly.
    
    Also remove both jump and prune point marking for instruction right
    after unconditional jumps, as program flow can get to the instruction
    right after unconditional jump instruction only if there is a jump to
    that instruction from somewhere else in the program. In such case we'll
    mark such instruction as prune/jump point because it's a destination of
    a jump.
    
    This change does not alter the number of instructions or states
    processed across Cilium and selftests programs.
    
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Acked-by: John Fastabend <john.fastabend@gmail.com>
    Link: https://lore.kernel.org/r/20221206233345.438540-4-andrii@kernel.org
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Stable-dep-of: 3feb263bb516 ("bpf: handle ldimm64 properly in check_cfg()")
    Signed-off-by: Sasha Levin <sashal@kernel.org>
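
The backtracking rule this commit relies on can be sketched as below. This is a minimal Python model, not kernel code: the `JmpEntry` tuple, the list-based history, and the returned pair are assumptions made for illustration. Only the `idx`/`prev_idx` idea follows the commit message.

```python
from typing import List, NamedTuple, Tuple


class JmpEntry(NamedTuple):
    idx: int       # instruction that was reached non-linearly
    prev_idx: int  # instruction the jump came from


def get_prev_insn_idx(jmp_history: List[JmpEntry], i: int, cnt: int) -> Tuple[int, int]:
    """Return (previous instruction index, remaining history count)."""
    if cnt and jmp_history[cnt - 1].idx == i:
        # Top of the history points at the current instruction:
        # follow the recorded jump backwards and pop the entry.
        return jmp_history[cnt - 1].prev_idx, cnt - 1
    # Otherwise the instruction was reached sequentially.
    return i - 1, cnt


# Hypothetical trace: 0, 1, 2, then a jump from 2 to 7, then 8.
history = [JmpEntry(idx=7, prev_idx=2)]
```

Instructions reached sequentially need no history entry at all in this model, which is why marking them as jump points adds nothing.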

bpf: Remove unused insn_cnt argument from visit_[func_call_]insn() [+ + +]
Author: Andrii Nakryiko <andrii@kernel.org>
Date:   Wed Dec 7 11:55:34 2022 -0800

    bpf: Remove unused insn_cnt argument from visit_[func_call_]insn()
    
    [ Upstream commit dcb2288b1fd9a8cdf2f3b8c0c7b3763346ef515f ]
    
    Number of total instructions in BPF program (including subprogs) can and
    is accessed from env->prog->len. visit_func_call_insn() doesn't do any
    checks against insn_cnt anymore, relying on push_insn() to do this check
    internally. So remove unnecessary insn_cnt input argument from
    visit_func_call_insn() and visit_insn() functions.
    
    Suggested-by: Alexei Starovoitov <ast@kernel.org>
    Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Link: https://lore.kernel.org/bpf/20221207195534.2866030-1-andrii@kernel.org
    Stable-dep-of: 3feb263bb516 ("bpf: handle ldimm64 properly in check_cfg()")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

bpf: Support new 32bit offset jmp instruction [+ + +]
Author: Yonghong Song <yonghong.song@linux.dev>
Date:   Thu Jul 27 18:12:31 2023 -0700

    bpf: Support new 32bit offset jmp instruction
    
    [ Upstream commit 4cd58e9af8b9d9fff6b7145e742abbfcda0af4af ]
    
    Add interpreter/jit/verifier support for 32bit offset jmp instruction.
    If a conditional jmp instruction needs more than 16bit offset,
    it can be simulated with a conditional jmp + a 32bit jmp insn.
    
    Acked-by: Eduard Zingerman <eddyz87@gmail.com>
    Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
    Link: https://lore.kernel.org/r/20230728011231.3716103-1-yonghong.song@linux.dev
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Stable-dep-of: 3feb263bb516 ("bpf: handle ldimm64 properly in check_cfg()")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

bpf: syzkaller found null ptr deref in unix_bpf proto add [+ + +]
Author: John Fastabend <john.fastabend@gmail.com>
Date:   Fri Dec 1 10:01:38 2023 -0800

    bpf: syzkaller found null ptr deref in unix_bpf proto add
    
    commit 8d6650646ce49e9a5b8c5c23eb94f74b1749f70f upstream.
    
    I added logic to track the sock pair for stream_unix sockets so that we
    ensure lifetime of the sock matches the time a sockmap could reference
    the sock (see fixes tag). I forgot though that we allow af_unix unconnected
    sockets into a sock{map|hash} map.
    
    This is problematic because the previous fix expected sk_pair() to exist
    and did not NULL check it. Because unconnected sockets have a NULL
    sk_pair this resulted in the NULL ptr dereference found by syzkaller.
    
    BUG: KASAN: null-ptr-deref in unix_stream_bpf_update_proto+0x72/0x430 net/unix/unix_bpf.c:171
    Write of size 4 at addr 0000000000000080 by task syz-executor360/5073
    Call Trace:
     <TASK>
     ...
     sock_hold include/net/sock.h:777 [inline]
     unix_stream_bpf_update_proto+0x72/0x430 net/unix/unix_bpf.c:171
     sock_map_init_proto net/core/sock_map.c:190 [inline]
     sock_map_link+0xb87/0x1100 net/core/sock_map.c:294
     sock_map_update_common+0xf6/0x870 net/core/sock_map.c:483
     sock_map_update_elem_sys+0x5b6/0x640 net/core/sock_map.c:577
     bpf_map_update_value+0x3af/0x820 kernel/bpf/syscall.c:167
    
    We considered just checking for the null ptr and skipping taking a ref
    on the NULL peer sock. But, if the socket is then connected() after
    being added to the sockmap we can cause the original issue again. So
    instead this patch blocks adding af_unix sockets that are not in the
    ESTABLISHED state.
    
    Reported-by: Eric Dumazet <edumazet@google.com>
    Reported-by: syzbot+e8030702aefd3444fb9e@syzkaller.appspotmail.com
    Fixes: 8866730aed51 ("bpf, sockmap: af_unix stream sockets need to hold ref for pair sock")
    Acked-by: Jakub Sitnicki <jakub@cloudflare.com>
    Signed-off-by: John Fastabend <john.fastabend@gmail.com>
    Link: https://lore.kernel.org/r/20231201180139.328529-2-john.fastabend@gmail.com
    Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
btrfs: fix qgroup_free_reserved_data int overflow [+ + +]
Author: Boris Burkov <boris@bur.io>
Date:   Fri Dec 1 13:00:10 2023 -0800

    btrfs: fix qgroup_free_reserved_data int overflow
    
    [ Upstream commit 9e65bfca24cf1d77e4a5c7a170db5867377b3fe7 ]
    
    The reserved data counter and input parameter is a u64, but we
    inadvertently accumulate it in an int. Overflowing that int results in
    freeing the wrong amount of data and breaking reserve accounting.
    
    Unfortunately, this overflow rot spreads from there, as the qgroup
    release/free functions rely on returning an int to take advantage of
    negative values for error codes.
    
    Therefore, the full fix is to return the "released" or "freed" amount by
    a u64 argument and to return 0 or negative error code via the return
    value.
    
    Most of the call sites simply ignore the return value, though some
    of them handle the error and count the returned bytes. Change all of
    them accordingly.
    
    CC: stable@vger.kernel.org # 6.1+
    Reviewed-by: Qu Wenruo <wqu@suse.com>
    Signed-off-by: Boris Burkov <boris@bur.io>
    Reviewed-by: David Sterba <dsterba@suse.com>
    Signed-off-by: David Sterba <dsterba@suse.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
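
The overflow and the reworked calling convention can be illustrated with a small Python model. `ctypes.c_int32` stands in for a C `int` accumulator here. The function names, the `EINVAL` check, and the sample extent sizes are hypothetical and do not come from the btrfs code.

```python
import ctypes
from typing import List, Tuple

EINVAL = 22  # errno value used only for illustration


def free_reserved_buggy(extent_lens: List[int]) -> int:
    """Accumulate 64-bit lengths in a 32-bit signed int, as the old code did."""
    freed = ctypes.c_int32(0)
    for length in extent_lens:
        # c_int32 truncates to 32 bits, mimicking the narrow accumulator.
        freed = ctypes.c_int32(freed.value + length)
    return freed.value


def free_reserved_fixed(extent_lens: List[int]) -> Tuple[int, int]:
    """Return (status, freed): status is 0 or a negative errno, and the
    byte count travels separately as an unsigned 64-bit quantity."""
    freed = 0
    for length in extent_lens:
        if length < 0:
            return -EINVAL, 0
        freed = (freed + length) & 0xFFFFFFFFFFFFFFFF
    return 0, freed


three_gib = [1 << 30] * 3
```

With three 1 GiB extents the narrow accumulator ends up negative, which a caller could mistake for an error code. The split return keeps the amount and the status apart.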

btrfs: mark the len field in struct btrfs_ordered_sum as unsigned [+ + +]
Author: Christoph Hellwig <hch@lst.de>
Date:   Wed May 24 17:03:06 2023 +0200

    btrfs: mark the len field in struct btrfs_ordered_sum as unsigned
    
    [ Upstream commit 6e4b2479ab38b3f949a85964da212295d32102f0 ]
    
    len can't ever be negative, so mark it as a u32 instead of int.
    
    Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
    Signed-off-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: David Sterba <dsterba@suse.com>
    Signed-off-by: David Sterba <dsterba@suse.com>
    Stable-dep-of: 9e65bfca24cf ("btrfs: fix qgroup_free_reserved_data int overflow")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
can: raw: add support for SO_MARK [+ + +]
Author: Marc Kleine-Budde <mkl@pengutronix.de>
Date:   Fri Dec 9 10:10:08 2022 +0100

    can: raw: add support for SO_MARK
    
    [ Upstream commit 0826e82b8a32e646b7b32ba8b68ba30812028e47 ]
    
    Add support for SO_MARK to the CAN_RAW protocol. This makes it
    possible to add traffic control filters based on the fwmark.
    
    Link: https://lore.kernel.org/all/20221210113653.170346-1-mkl@pengutronix.de
    Acked-by: Oliver Hartkopp <socketcan@hartkopp.net>
    Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
    Stable-dep-of: 7f6ca95d16b9 ("net: Implement missing getsockopt(SO_TIMESTAMPING_NEW)")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
cifs: cifs_chan_is_iface_active should be called with chan_lock held [+ + +]
Author: Shyam Prasad N <sprasad@microsoft.com>
Date:   Fri Dec 29 11:16:15 2023 +0000

    cifs: cifs_chan_is_iface_active should be called with chan_lock held
    
    commit 7257bcf3bdc785eabc4eef1f329a59815b032508 upstream.
    
    cifs_chan_is_iface_active checks the channels of a session to see
    if the associated iface is active. This should always happen
    with chan_lock held. However, these two callers of this function
    were missing this locking.
    
    This change makes sure the function calls are protected with
    proper locking.
    
    Fixes: b54034a73baf ("cifs: during reconnect, update interface if necessary")
    Fixes: fa1d0508bdd4 ("cifs: account for primary channel in the interface list")
    Cc: stable@vger.kernel.org
    Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
    Signed-off-by: Steve French <stfrench@microsoft.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

cifs: do not depend on release_iface for maintaining iface_list [+ + +]
Author: Shyam Prasad N <sprasad@microsoft.com>
Date:   Fri Dec 29 11:16:16 2023 +0000

    cifs: do not depend on release_iface for maintaining iface_list
    
    commit 09eeb0723f219fbd96d8865bf9b935e03ee2ec22 upstream.
    
    parse_server_interfaces should be in complete charge of maintaining
    the iface_list linked list. Today, iface entries are removed
    from the list only when the last refcount is dropped.
    i.e. in release_iface. However, this can result in undercounting
    of refcount if the server stops advertising interfaces (which
    Azure SMB server does).
    
    This change puts parse_server_interfaces in full charge of
    maintaining the iface_list. So if an empty list is returned
    by the server, the entries in the list will immediately be
    removed. This way, a following call to the same function will
    not find entries in the list.
    
    Fixes: aa45dadd34e4 ("cifs: change iface_list from array to sorted linked list")
    Cc: stable@vger.kernel.org
    Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
    Signed-off-by: Steve French <stfrench@microsoft.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
cpu/SMT: Create topology_smt_thread_allowed() [+ + +]
Author: Michael Ellerman <mpe@ellerman.id.au>
Date:   Wed Jul 5 16:51:39 2023 +0200

    cpu/SMT: Create topology_smt_thread_allowed()
    
    [ Upstream commit 38253464bc821d6de6bba81bb1412ebb36f6cbd1 ]
    
    Some architectures allow partial SMT states, i.e. when not all SMT threads
    are brought online.
    
    To support that, add an architecture helper which checks whether a given
    CPU is allowed to be brought online depending on how many SMT threads are
    currently enabled. Since this is only applicable to architectures supporting
    partial SMT, only these architectures should select the new configuration
    variable CONFIG_SMT_NUM_THREADS_DYNAMIC. For the other architectures, not
    supporting the partial SMT states, there is no need to define
    topology_smt_thread_allowed(), the generic code assumes that all the threads
    are allowed or only the primary ones.
    
    Call the helper from cpu_smt_enable(), and cpu_smt_allowed() when SMT is
    enabled, to check if the particular thread should be onlined. Notably,
    also call it from cpu_smt_disable() if CPU_SMT_ENABLED, to allow
    offlining some threads to move from a higher to lower number of threads
    online.
    
    [ ldufour: Slightly reword the commit's description ]
    [ ldufour: Introduce CONFIG_SMT_NUM_THREADS_DYNAMIC ]
    
    Suggested-by: Thomas Gleixner <tglx@linutronix.de>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Tested-by: Zhang Rui <rui.zhang@intel.com>
    Link: https://lore.kernel.org/r/20230705145143.40545-7-ldufour@linux.ibm.com
    Stable-dep-of: d91bdd96b55c ("cpu/SMT: Make SMT control more robust against enumeration failures")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

cpu/SMT: Make SMT control more robust against enumeration failures [+ + +]
Author: Thomas Gleixner <tglx@linutronix.de>
Date:   Mon Aug 14 10:18:27 2023 +0200

    cpu/SMT: Make SMT control more robust against enumeration failures
    
    [ Upstream commit d91bdd96b55cc3ce98d883a60f133713821b80a6 ]
    
    The SMT control mechanism got added as speculation attack vector
    mitigation. The implemented logic relies on the primary thread mask to
    be set up properly.
    
    This turns out to be an issue with XEN/PV guests because their CPU hotplug
    mechanics do not enumerate APICs and therefore the mask is never correctly
    populated.
    
    This went unnoticed so far because by chance XEN/PV ends up with
    smp_num_siblings == 2. So smt_hotplug_control stays at its default value
    CPU_SMT_ENABLED and the primary thread mask is never evaluated in the
    context of CPU hotplug.
    
    This stopped "working" with the upcoming overhaul of the topology
    evaluation which legitimately provides a fake topology for XEN/PV. That
    sets smp_num_siblings to 1, which causes the CPU hotplug core to
    refuse to bring up the APs.
    
    This happens because smt_hotplug_control is set to CPU_SMT_NOT_SUPPORTED
    which causes cpu_smt_allowed() to evaluate the unpopulated primary thread
    mask with the conclusion that all non-boot CPUs are not valid to be
    plugged.
    
    Make cpu_smt_allowed() more robust and take CPU_SMT_NOT_SUPPORTED and
    CPU_SMT_NOT_IMPLEMENTED into account. Rename it to cpu_bootable() while at
    it as that makes it more clear what the function is about.
    
    The primary mask issue on x86 XEN/PV needs to be addressed separately as
    there are users outside of the CPU hotplug code too.
    
    Fixes: 05736e4ac13c ("cpu/hotplug: Provide knobs to control SMT")
    Reported-by: Juergen Gross <jgross@suse.com>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Tested-by: Juergen Gross <jgross@suse.com>
    Tested-by: Sohil Mehta <sohil.mehta@intel.com>
    Tested-by: Michael Kelley <mikelley@microsoft.com>
    Tested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Tested-by: Zhang Rui <rui.zhang@intel.com>
    Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Link: https://lore.kernel.org/r/20230814085112.149440843@linutronix.de
    Signed-off-by: Sasha Levin <sashal@kernel.org>
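
The decision described in this commit can be modelled roughly as follows. This is a simplified Python sketch: the enum, the `primary_threads` set, and the function signature are assumptions for illustration. The real kernel helper handles further cases, such as partial SMT, that are left out here.

```python
from enum import Enum, auto
from typing import Set


class SmtControl(Enum):
    ENABLED = auto()
    DISABLED = auto()
    FORCE_DISABLED = auto()
    NOT_SUPPORTED = auto()
    NOT_IMPLEMENTED = auto()


def cpu_bootable(cpu: int, control: SmtControl, primary_threads: Set[int]) -> bool:
    """Simplified model: may this CPU be brought online?"""
    if control is SmtControl.ENABLED:
        return True
    # Without SMT (or without SMT control) every CPU is bootable, so the
    # possibly unpopulated primary thread mask is never consulted.
    if control in (SmtControl.NOT_SUPPORTED, SmtControl.NOT_IMPLEMENTED):
        return True
    # SMT is disabled: only primary threads may come up.
    return cpu in primary_threads


# Hypothetical XEN/PV-like case: the primary thread mask was never populated.
empty_mask: Set[int] = set()
```

In this model an empty mask only blocks secondary CPUs when SMT is actually disabled, which matches the intent stated above.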

 
dpaa2-eth: recycle the RX buffer only after all processing done [+ + +]
Author: Ioana Ciornei <ioana.ciornei@nxp.com>
Date:   Fri Nov 24 12:28:05 2023 +0200

    dpaa2-eth: recycle the RX buffer only after all processing done
    
    [ Upstream commit beb1930f966d1517921488bd5d64147f58f79abf ]
    
    The blamed commit added support for Rx copybreak. This meant that for
    certain frame sizes, a new skb was allocated and the initial data buffer
    was recycled. Instead of waiting to recycle the Rx buffer only after all
    processing was done on it (like accessing the parse results or timestamp
    information), the code path just went ahead and re-used the buffer right
    away.
    
    This sometimes led to corrupted HW and SW annotation areas.
    Fix this by delaying the moment when the buffer is recycled.
    
    Fixes: 50f826999a80 ("dpaa2-eth: add rx copybreak support")
    Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
drm/amd/display: add nv12 bounding box [+ + +]
Author: Alex Deucher <alexander.deucher@amd.com>
Date:   Wed Dec 20 12:33:45 2023 -0500

    drm/amd/display: add nv12 bounding box
    
    commit 7e725c20fea8914ef1829da777f517ce1a93d388 upstream.
    
    This was included in gpu_info firmware, move it into the
    driver for consistency with other nv1x parts.
    
    Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2318
    Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Cc: stable@vger.kernel.org
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
drm/amdgpu: skip gpu_info fw loading on navi12 [+ + +]
Author: Alex Deucher <alexander.deucher@amd.com>
Date:   Wed Dec 20 12:36:08 2023 -0500

    drm/amdgpu: skip gpu_info fw loading on navi12
    
    commit 21f6137c64c65d6808c4a81006956197ca203383 upstream.
    
    It's no longer required.
    
    Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2318
    Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Cc: stable@vger.kernel.org
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
drm/bridge: ti-sn65dsi86: Never store more than msg->size bytes in AUX xfer [+ + +]
Author: Douglas Anderson <dianders@chromium.org>
Date:   Thu Dec 14 12:37:52 2023 -0800

    drm/bridge: ti-sn65dsi86: Never store more than msg->size bytes in AUX xfer
    
    [ Upstream commit aca58eac52b88138ab98c814afb389a381725cd7 ]
    
    For aux reads, the value `msg->size` indicates the size of the buffer
    provided by `msg->buffer`. We should never in any circumstances write
    more bytes to the buffer since it may overflow the buffer.
    
    In the ti-sn65dsi86 driver there is one code path that reads the
    transfer length from hardware. Even though it's never been seen to be
    a problem, we should make extra sure that the hardware isn't
    increasing the length since doing so would cause us to overrun the
    buffer.
    
    Fixes: 982f589bde7a ("drm/bridge: ti-sn65dsi86: Update reply on aux failures")
    Reviewed-by: Stephen Boyd <swboyd@chromium.org>
    Reviewed-by: Guenter Roeck <groeck@chromium.org>
    Signed-off-by: Douglas Anderson <dianders@chromium.org>
    Link: https://patchwork.freedesktop.org/patch/msgid/20231214123752.v3.2.I7b83c0f31aeedc6b1dc98c7c741d3e1f94f040f8@changeid
    Signed-off-by: Sasha Levin <sashal@kernel.org>
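
The defensive pattern amounts to clamping the hardware-reported length to the caller's buffer size before copying. A minimal Python sketch follows. The parameter names are hypothetical, with `len(buffer)` playing the role of `msg->size`.

```python
def copy_aux_reply(hw_len: int, hw_data: bytes, buffer: bytearray) -> int:
    """Copy an AUX read reply without ever exceeding the caller's buffer.

    hw_len is the transfer length reported by the hardware, which is
    not trusted to be within bounds.
    """
    length = min(hw_len, len(buffer), len(hw_data))
    buffer[:length] = hw_data[:length]
    return length


buf = bytearray(4)
copied = copy_aux_reply(hw_len=16, hw_data=bytes(range(16)), buffer=buf)
```

Even when the hardware claims 16 bytes, only the 4 bytes that fit are stored and the buffer keeps its size.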

 
drm/i915/dp: Fix passing the correct DPCD_REV for drm_dp_set_phy_test_pattern [+ + +]
Author: Khaled Almahallawy <khaled.almahallawy@intel.com>
Date:   Wed Dec 13 13:15:42 2023 -0800

    drm/i915/dp: Fix passing the correct DPCD_REV for drm_dp_set_phy_test_pattern
    
    [ Upstream commit 2bd7a06a1208aaacb4e7a2a5436c23bce8d70801 ]
    
    Using link_status to get DPCD_REV fails when disabling/defaulting
    phy pattern. Use intel_dp->dpcd to access DPCD_REV correctly.
    
    Fixes: 8cdf72711928 ("drm/i915/dp: Program vswing, pre-emphasis, test-pattern")
    Cc: Jani Nikula <jani.nikula@intel.com>
    Cc: Imre Deak <imre.deak@intel.com>
    Cc: Lee Shawn C <shawn.c.lee@intel.com>
    Signed-off-by: Khaled Almahallawy <khaled.almahallawy@intel.com>
    Signed-off-by: Jani Nikula <jani.nikula@intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20231213211542.3585105-3-khaled.almahallawy@intel.com
    (cherry picked from commit 3ee302ec22d6e1d7d1e6d381b0d507ee80f2135c)
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
drm/mgag200: Fix gamma lut not initialized for G200ER, G200EV, G200SE [+ + +]
Author: Jocelyn Falempe <jfalempe@redhat.com>
Date:   Thu Dec 14 17:38:06 2023 +0100

    drm/mgag200: Fix gamma lut not initialized for G200ER, G200EV, G200SE
    
    commit 11f9eb899ecc8c02b769cf8d2532ba12786a7af7 upstream.
    
    When mgag200 switched from simple KMS to regular atomic helpers,
    the initialization of the gamma settings was lost.
    This leads to a black screen, if the bios/uefi doesn't use the same
    pixel color depth.
    This has been fixed with commit ad81e23426a6 ("drm/mgag200: Fix gamma
    lut not initialized.") for most G200, but G200ER, G200EV, G200SE use
    their own version of crtc_helper_atomic_enable() and need to be fixed
    too.
    
    Fixes: 1baf9127c482 ("drm/mgag200: Replace simple-KMS with regular atomic helpers")
    Cc: <stable@vger.kernel.org> #v6.1+
    Reported-by: Roger Sewell <roger.sewell@cantab.net>
    Suggested-by: Roger Sewell <roger.sewell@cantab.net>
    Signed-off-by: Jocelyn Falempe <jfalempe@redhat.com>
    Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
    Link: https://patchwork.freedesktop.org/patch/msgid/20231214163849.359691-1-jfalempe@redhat.com
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
ethtool: don't propagate EOPNOTSUPP from dumps [+ + +]
Author: Jakub Kicinski <kuba@kernel.org>
Date:   Sun Nov 26 14:58:06 2023 -0800

    ethtool: don't propagate EOPNOTSUPP from dumps
    
    [ Upstream commit cbeb989e41f4094f54bec2cecce993f26f547bea ]
    
    The default dump handler needs to clear ret before returning.
    Otherwise if the last interface returns an inconsequential
    error this error will propagate to user space.
    
    This may confuse user space (ethtool CLI seems to ignore it,
    but YNL doesn't). It will also terminate the dump early
    for multi-skb dumps, because netlink core treats EOPNOTSUPP
    as a real error.
    
    Fixes: 728480f12442 ("ethtool: default handlers for GET requests")
    Reviewed-by: Simon Horman <horms@kernel.org>
    Link: https://lore.kernel.org/r/20231126225806.2143528-1-kuba@kernel.org
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
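
The effect of clearing the return value can be shown with a toy dump loop. This is an illustrative Python model only: `default_dump`, `fake_fill`, and the device names are hypothetical and do not mirror the actual ethtool netlink code.

```python
import errno
from typing import Callable, Iterable, List, Tuple


def default_dump(devices: Iterable[str],
                 fill: Callable[[str], int]) -> Tuple[int, List[str]]:
    """Dump every device; skip those that report -EOPNOTSUPP."""
    dumped: List[str] = []
    ret = 0
    for dev in devices:
        ret = fill(dev)
        if ret == -errno.EOPNOTSUPP:
            ret = 0  # inconsequential: clear it so it cannot leak out
            continue
        if ret < 0:
            break  # a real error ends the dump
        dumped.append(dev)
    return ret, dumped


def fake_fill(dev: str) -> int:
    # Hypothetical handler: the last device lacks the feature.
    return -errno.EOPNOTSUPP if dev == "eth2" else 0
```

Without the reset, the status of the last device would be what the caller sees, even though the dump itself succeeded.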

 
ext4: convert move_extent_per_page() to use folios [+ + +]
Author: Vishal Moola (Oracle) <vishal.moola@gmail.com>
Date:   Thu Nov 17 23:30:52 2022 -0800

    ext4: convert move_extent_per_page() to use folios
    
    [ Upstream commit 6dd8fe86fa84729538d8bed3149faf9c5886bb5b ]
    
    Patch series "Removing the try_to_release_page() wrapper", v3.
    
    This patchset replaces the remaining calls of try_to_release_page() with
    the folio equivalent: filemap_release_folio().  This allows us to remove
    the wrapper.
    
    This patch (of 4):
    
    Convert move_extent_per_page() to use folios.  This change removes 5 calls
    to compound_head() and is in preparation for the removal of the
    try_to_release_page() wrapper.
    
    Link: https://lkml.kernel.org/r/20221118073055.55694-1-vishal.moola@gmail.com
    Link: https://lkml.kernel.org/r/20221118073055.55694-2-vishal.moola@gmail.com
    Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com>
    Cc: Matthew Wilcox <willy@infradead.org>
    Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
    Cc: Theodore Ts'o <tytso@mit.edu>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Stable-dep-of: 1898efcdbed3 ("block: update the stable_writes flag in bdev_add")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
f2fs: assign default compression level [+ + +]
Author: Jaegeuk Kim <jaegeuk@kernel.org>
Date:   Mon Jun 12 12:58:34 2023 -0700

    f2fs: assign default compression level
    
    [ Upstream commit 00e120b5e4b5638cf19eee96d4332f2d100746ba ]
    
    Let's avoid any confusion from assigning compress_level=0 for LZ4HC and ZSTD.
    
    Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    Stable-dep-of: f5f3bd903a5d ("f2fs: set the default compress_level on ioctl")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

f2fs: clean up i_compress_flag and i_compress_level usage [+ + +]
Author: Chao Yu <chao@kernel.org>
Date:   Sat Jan 28 18:30:11 2023 +0800

    f2fs: clean up i_compress_flag and i_compress_level usage
    
    [ Upstream commit b90e5086df6bf5ba819216d5ecf0667370bd565f ]
    
    .i_compress_level was introduced by commit 3fde13f817e2 ("f2fs: compress:
    support compress level"), but was never used.
    
    This patch updates as below:
    - load high 8-bits of on-disk .i_compress_flag to in-memory .i_compress_level
    - load low 8-bits of on-disk .i_compress_flag to in-memory .i_compress_flag
    - change type of in-memory .i_compress_flag from unsigned short to unsigned
    char.
    
    With the above changes, we can avoid an unneeded bit shift in
    .init_compress_ctx() and shrink the size of struct f2fs_inode_info.
    
    Signed-off-by: Chao Yu <chao@kernel.org>
    Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    Stable-dep-of: f5f3bd903a5d ("f2fs: set the default compress_level on ioctl")
    Signed-off-by: Sasha Levin <sashal@kernel.org>
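
The split of the 16-bit on-disk field described above can be sketched in a few lines of Python. The helper names and the sample value are hypothetical. Only the high-byte/low-byte layout comes from the commit message.

```python
from typing import Tuple


def split_compress_flag(on_disk_flag: int) -> Tuple[int, int]:
    """Split the 16-bit on-disk i_compress_flag into (level, flag)."""
    level = (on_disk_flag >> 8) & 0xFF  # high 8 bits -> i_compress_level
    flag = on_disk_flag & 0xFF          # low 8 bits  -> i_compress_flag
    return level, flag


def join_compress_flag(level: int, flag: int) -> int:
    """Inverse operation, as would be needed when writing back to disk."""
    return ((level & 0xFF) << 8) | (flag & 0xFF)
```

Doing the split once at load time means later users read the level directly instead of shifting on every access.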

f2fs: compress: fix to assign compress_level for lz4 correctly [+ + +]
Author: Chao Yu <chao@kernel.org>
Date:   Mon Aug 21 23:22:25 2023 +0800

    f2fs: compress: fix to assign compress_level for lz4 correctly
    
    commit 091a4dfbb1d32b06c031edbfe2a44af100c4604f upstream.
    
    After remount, F2FS_OPTION().compress_level was incorrectly assigned
    LZ4HC_DEFAULT_CLEVEL, which resulted in lz4hc:9 being enabled. Fix it.
    
    1. mount /dev/vdb
    /dev/vdb on /mnt/f2fs type f2fs (...,compress_algorithm=lz4,compress_log_size=2,...)
    2. mount -t f2fs -o remount,compress_log_size=3 /mnt/f2fs/
    3. mount|grep f2fs
    /dev/vdb on /mnt/f2fs type f2fs (...,compress_algorithm=lz4:9,compress_log_size=3,...)
    
    Fixes: 00e120b5e4b5 ("f2fs: assign default compression level")
    Signed-off-by: Chao Yu <chao@kernel.org>
    Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

f2fs: convert to use bitmap API [+ + +]
Author: Yangtao Li <frank.li@vivo.com>
Date:   Thu Feb 16 21:53:24 2023 +0800

    f2fs: convert to use bitmap API
    
    [ Upstream commit 447286ebadaafa551550704ff0b42eb08b1d1cb2 ]
    
    Let's use BIT() and GENMASK() instead of open-coding them.
    
    Signed-off-by: Yangtao Li <frank.li@vivo.com>
    Reviewed-by: Chao Yu <chao@kernel.org>
    Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    Stable-dep-of: f5f3bd903a5d ("f2fs: set the default compress_level on ioctl")
    Signed-off-by: Sasha Levin <sashal@kernel.org>
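
For readers unfamiliar with the two kernel macros, the following Python functions compute the same values for non-negative bit positions. They are stand-ins written for illustration, not the kernel definitions, and the `ValueError` check is an addition of this sketch.

```python
def BIT(nr: int) -> int:
    """Value with only bit `nr` set."""
    return 1 << nr


def GENMASK(high: int, low: int) -> int:
    """Contiguous mask covering bits `low` through `high`, inclusive."""
    if high < low:
        raise ValueError("high must be >= low")
    return ((1 << (high - low + 1)) - 1) << low
```

For example, `GENMASK(15, 8)` replaces an open-coded `0xFF00`, and `BIT(3)` replaces `1 << 3`.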

f2fs: set the default compress_level on ioctl [+ + +]
Author: Jaegeuk Kim <jaegeuk@kernel.org>
Date:   Fri Sep 8 15:41:42 2023 -0700

    f2fs: set the default compress_level on ioctl
    
    [ Upstream commit f5f3bd903a5d3e3b2ba89f11e0e29db25e60c048 ]
    
    Otherwise, we'll get a broken inode.
    
     # touch $FILE
     # f2fs_io setflags compression $FILE
     # f2fs_io set_coption 2 8 $FILE
    
    [  112.227612] F2FS-fs (dm-51): sanity_check_compress_inode: inode (ino=8d3fe) has unsupported compress level: 0, run fsck to fix
    
    Cc: stable@vger.kernel.org
    Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
fbdev: imsttfb: fix double free in probe() [+ + +]
Author: Dan Carpenter <dan.carpenter@linaro.org>
Date:   Fri Oct 27 15:04:56 2023 +0300

    fbdev: imsttfb: fix double free in probe()
    
    [ Upstream commit e08c30efda21ef4c0ec084a3a9581c220b442ba9 ]
    
    The init_imstt() function calls framebuffer_release() on error and then
    the probe() function calls it again.  It should only be done in probe.
    
    Fixes: 518ecb6a209f ("fbdev: imsttfb: Fix error path of imsttfb_probe()")
    Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
    Signed-off-by: Helge Deller <deller@gmx.de>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

fbdev: imsttfb: Release framebuffer and dealloc cmap on error path [+ + +]
Author: Helge Deller <deller@gmx.de>
Date:   Sat May 27 11:28:36 2023 +0200

    fbdev: imsttfb: Release framebuffer and dealloc cmap on error path
    
    [ Upstream commit 5cf9a090a39c97f4506b7b53739d469b1c05a7e9 ]
    
    Add missing cleanups in error path.
    
    Signed-off-by: Helge Deller <deller@gmx.de>
    Stable-dep-of: e08c30efda21 ("fbdev: imsttfb: fix double free in probe()")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
filemap: add a per-mapping stable writes flag [+ + +]
Author: Christoph Hellwig <hch@lst.de>
Date:   Wed Oct 25 16:10:17 2023 +0200

    filemap: add a per-mapping stable writes flag
    
    [ Upstream commit 762321dab9a72760bf9aec48362f932717c9424d ]
    
    folio_wait_stable waits for writeback to finish before modifying the
    contents of a folio again, e.g. to support check summing of the data
    in the block integrity code.
    
    Currently this behavior is controlled by the SB_I_STABLE_WRITES flag
    on the super_block, which means it is uniform for the entire file system.
    This is wrong for the block device pseudofs which is shared by all
    block devices, or file systems that can use multiple devices like XFS
    with the RT subvolume or btrfs (although btrfs currently reimplements
    folio_wait_stable anyway).
    
    Add a per-address_space AS_STABLE_WRITES flag to control the behavior
    in a more fine grained way.  The existing SB_I_STABLE_WRITES is kept
    to initialize AS_STABLE_WRITES to the existing default which covers
    most cases.
    
    Signed-off-by: Christoph Hellwig <hch@lst.de>
    Link: https://lore.kernel.org/r/20231025141020.192413-2-hch@lst.de
    Tested-by: Ilya Dryomov <idryomov@gmail.com>
    Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
    Reviewed-by: Darrick J. Wong <djwong@kernel.org>
    Signed-off-by: Christian Brauner <brauner@kernel.org>
    Stable-dep-of: 1898efcdbed3 ("block: update the stable_writes flag in bdev_add")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
firewire: ohci: suppress unexpected system reboot in AMD Ryzen machines and ASM108x/VT630x PCIe cards [+ + +]
Author: Takashi Sakamoto <o-takashi@sakamocchi.jp>
Date:   Tue Jan 2 20:01:50 2024 +0900

    firewire: ohci: suppress unexpected system reboot in AMD Ryzen machines and ASM108x/VT630x PCIe cards
    
    commit ac9184fbb8478dab4a0724b279f94956b69be827 upstream.
    
    VIA VT6306/6307/6308 provides a PCI interface compliant with 1394 OHCI. When
    the hardware is combined with an Asmedia ASM1083/1085 PCIe-to-PCI bus bridge,
    it appears that accesses to its 'Isochronous Cycle Timer' register (offset
    0xf0 on PCI memory space) often cause an unexpected system reboot on any
    type of AMD Ryzen machine (both 0x17 and 0x19 families). The issue does not
    appear on other types of machine (AMD pre-Ryzen and Intel machines, at
    least), or with other OHCI 1394 hardware (e.g. Texas Instruments).
    
    The issue became apparent with commit dcadfd7f7c74 ("firewire: core:
    use union for callback of transaction completion"), added in the v6.5
    kernel. It changed the 1394 OHCI driver to access the register every time
    it dispatches a local asynchronous transaction. However, the issue also
    exists in older kernel versions running on AMD Ryzen machines, since
    access to the register is required to maintain bus time. It is not
    hard to imagine that users experience the unexpected system reboot when
    generating a bus reset by plugging in any device, or when time-aware
    application programs read the register, e.g. for audio sample processing.
    
    This commit suppresses the unexpected system reboot for this combination
    of hardware by avoiding the access itself. As a result, the software stack
    can no longer provide the hardware time to unit drivers, userspace
    applications, and nodes on the same IEEE 1394 bus. This is a clear
    disadvantage since time-aware application programs require it, but
    time-unaware applications become usable again; e.g. sbp2.
    
    Cc: stable@vger.kernel.org
    Reported-by: Jiri Slaby <jirislaby@kernel.org>
    Closes: https://bugzilla.suse.com/show_bug.cgi?id=1215436
    Reported-by: Mario Limonciello <mario.limonciello@amd.com>
    Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217994
    Reported-by: Tobias Gruetzmacher <tobias-lists@23.gs>
    Closes: https://sourceforge.net/p/linux1394/mailman/message/58711901/
    Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2240973
    Closes: https://bugs.launchpad.net/linux/+bug/2043905
    Link: https://lore.kernel.org/r/20240102110150.244475-1-o-takashi@sakamocchi.jp
    Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
firmware: arm_scmi: Fix frequency truncation by promoting multiplier type [+ + +]
Author: Sudeep Holla <sudeep.holla@arm.com>
Date:   Thu Nov 30 20:43:42 2023 +0000

    firmware: arm_scmi: Fix frequency truncation by promoting multiplier type
    
    [ Upstream commit 8e3c98d9187e09274fc000a7d1a77b070a42d259 ]
    
    Fix the possible frequency truncation for all values equal to or greater
    than 4GHz on 64bit machines by updating the multiplier 'mult_factor' to
    'unsigned long' type. It is also possible that the multiplier itself can
    be greater than or equal to 2^32. So we need to also fix the equation
    computing the value of the multiplier.
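
    The truncation can be reproduced outside the kernel. The sketch below is
    not the driver code: the helper names freq_narrow() and freq_wide() are
    hypothetical, and fixed-width types stand in for the 32-bit and promoted
    multiplier so the result does not depend on the host's 'unsigned long'.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: the multiplier is 32 bits wide, so the product is
 * computed modulo 2^32 and only then widened. */
static uint64_t freq_narrow(uint32_t level, uint32_t mult_factor)
{
    uint32_t product = level * mult_factor; /* wraps above ~4.29e9 */
    return product;
}

/* Hypothetical helper: the multiplier is promoted to 64 bits first, so the
 * product keeps its full value. */
static uint64_t freq_wide(uint32_t level, uint64_t mult_factor)
{
    return (uint64_t)level * mult_factor;
}
```

    With a level of 5000 and a multiplier of 1000000, freq_wide() yields
    5000000000 while freq_narrow() wraps around to a much smaller value.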
    
    Fixes: a9e3fbfaa0ff ("firmware: arm_scmi: add initial support for performance protocol")
    Reported-by: Sibi Sankar <quic_sibis@quicinc.com>
    Closes: https://lore.kernel.org/all/20231129065748.19871-3-quic_sibis@quicinc.com/
    Cc: Cristian Marussi <cristian.marussi@arm.com>
    Link: https://lore.kernel.org/r/20231130204343.503076-1-sudeep.holla@arm.com
    Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
genirq/affinity: Don't pass irq_affinity_desc array to irq_build_affinity_masks [+ + +]
Author: Ming Lei <ming.lei@redhat.com>
Date:   Tue Dec 27 10:29:02 2022 +0800

    genirq/affinity: Don't pass irq_affinity_desc array to irq_build_affinity_masks
    
    [ Upstream commit e7bdd7f0cbd1c001bb9b4d3313edc5ee094bc3f8 ]
    
    Prepare for abstracting irq_build_affinity_masks() into a public function
    for assigning all CPUs evenly into several groups.
    
    Don't pass irq_affinity_desc array to irq_build_affinity_masks, instead
    return a cpumask array by storing each assigned group into one element of
    the array.
    
    This makes it possible to provide a generic interface for grouping all CPUs evenly
    from a NUMA and CPU locality viewpoint, and the cost is one extra allocation
    in irq_build_affinity_masks(), which should be fine since it is done via
    GFP_KERNEL and irq_build_affinity_masks() is a slow path anyway.
    
    Signed-off-by: Ming Lei <ming.lei@redhat.com>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: John Garry <john.g.garry@oracle.com>
    Reviewed-by: Jens Axboe <axboe@kernel.dk>
    Link: https://lore.kernel.org/r/20221227022905.352674-4-ming.lei@redhat.com
    Stable-dep-of: 0263f92fadbb ("lib/group_cpus.c: avoid acquiring cpu hotplug lock in group_cpus_evenly")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

genirq/affinity: Move group_cpus_evenly() into lib/ [+ + +]
Author: Ming Lei <ming.lei@redhat.com>
Date:   Tue Dec 27 10:29:04 2022 +0800

    genirq/affinity: Move group_cpus_evenly() into lib/
    
    [ Upstream commit f7b3ea8cf72f3d6060fe08e461805181e7450a13 ]
    
    group_cpus_evenly() has become a generic function which can be used for
    other subsystems than the interrupt subsystem, so move it into lib/.
    
    Signed-off-by: Ming Lei <ming.lei@redhat.com>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: Jens Axboe <axboe@kernel.dk>
    Link: https://lore.kernel.org/r/20221227022905.352674-6-ming.lei@redhat.com
    Stable-dep-of: 0263f92fadbb ("lib/group_cpus.c: avoid acquiring cpu hotplug lock in group_cpus_evenly")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

genirq/affinity: Only build SMP-only helper functions on SMP kernels [+ + +]
Author: Ingo Molnar <mingo@kernel.org>
Date:   Wed Jan 18 12:14:01 2023 +0100

    genirq/affinity: Only build SMP-only helper functions on SMP kernels
    
    commit 188a569658584e93930ab60334c5a1079c0330d8 upstream.
    
    allnoconfig grew these new build warnings in lib/group_cpus.c:
    
      lib/group_cpus.c:247:12: warning: ‘__group_cpus_evenly’ defined but not used [-Wunused-function]
      lib/group_cpus.c:75:13: warning: ‘build_node_to_cpumask’ defined but not used [-Wunused-function]
      lib/group_cpus.c:66:13: warning: ‘free_node_to_cpumask’ defined but not used [-Wunused-function]
      lib/group_cpus.c:43:23: warning: ‘alloc_node_to_cpumask’ defined but not used [-Wunused-function]
    
    Widen the #ifdef CONFIG_SMP block to not expose unused helpers on
    non-SMP builds.
    
    Also annotate the preprocessor branches for better readability.
    
    Fixes: f7b3ea8cf72f ("genirq/affinity: Move group_cpus_evenly() into lib/")
    Cc: Ming Lei <ming.lei@redhat.com>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Link: https://lore.kernel.org/r/20221227022905.352674-6-ming.lei@redhat.com
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

genirq/affinity: Pass affinity managed mask array to irq_build_affinity_masks [+ + +]
Author: Ming Lei <ming.lei@redhat.com>
Date:   Tue Dec 27 10:29:01 2022 +0800

    genirq/affinity: Pass affinity managed mask array to irq_build_affinity_masks
    
    [ Upstream commit 1f962d91a15af54301c63febb8ac2ba07aa3654f ]
    
    Pass affinity managed mask array to irq_build_affinity_masks() so that the
    index of the first affinity managed vector is always zero.
    
    This allows to simplify the implementation a bit.
    
    Signed-off-by: Ming Lei <ming.lei@redhat.com>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: John Garry <john.g.garry@oracle.com>
    Reviewed-by: Jens Axboe <axboe@kernel.dk>
    Link: https://lore.kernel.org/r/20221227022905.352674-3-ming.lei@redhat.com
    Stable-dep-of: 0263f92fadbb ("lib/group_cpus.c: avoid acquiring cpu hotplug lock in group_cpus_evenly")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

genirq/affinity: Remove the 'firstvec' parameter from irq_build_affinity_masks [+ + +]
Author: Ming Lei <ming.lei@redhat.com>
Date:   Tue Dec 27 10:29:00 2022 +0800

    genirq/affinity: Remove the 'firstvec' parameter from irq_build_affinity_masks
    
    [ Upstream commit cdf07f0ea48a3b52f924714d477366ac510ee870 ]
    
    The 'firstvec' parameter is always the same as the 'startvec'
    parameter, so use 'startvec' directly inside irq_build_affinity_masks().
    
    Signed-off-by: Ming Lei <ming.lei@redhat.com>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: John Garry <john.g.garry@oracle.com>
    Reviewed-by: Jens Axboe <axboe@kernel.dk>
    Link: https://lore.kernel.org/r/20221227022905.352674-2-ming.lei@redhat.com
    Stable-dep-of: 0263f92fadbb ("lib/group_cpus.c: avoid acquiring cpu hotplug lock in group_cpus_evenly")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

genirq/affinity: Rename irq_build_affinity_masks as group_cpus_evenly [+ + +]
Author: Ming Lei <ming.lei@redhat.com>
Date:   Tue Dec 27 10:29:03 2022 +0800

    genirq/affinity: Rename irq_build_affinity_masks as group_cpus_evenly
    
    [ Upstream commit 523f1ea76aad9025f9bd5258d77f4406fa9dbe5d ]
    
    Map irq vector into group, which allows to abstract the algorithm for
    a generic use case outside of the interrupt core.
    
    Rename irq_build_affinity_masks as group_cpus_evenly, so the API can be
    reused for blk-mq to make default queue mapping even though irq vectors
    aren't involved.
    
    No functional change, just rename vector as group.
    
    Signed-off-by: Ming Lei <ming.lei@redhat.com>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: Jens Axboe <axboe@kernel.dk>
    Link: https://lore.kernel.org/r/20221227022905.352674-5-ming.lei@redhat.com
    Stable-dep-of: 0263f92fadbb ("lib/group_cpus.c: avoid acquiring cpu hotplug lock in group_cpus_evenly")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
i2c: core: Fix atomic xfer check for non-preempt config [+ + +]
Author: Benjamin Bara <benjamin.bara@skidata.com>
Date:   Thu Jan 4 09:17:08 2024 +0100

    i2c: core: Fix atomic xfer check for non-preempt config
    
    commit a3368e1186e3ce8e38f78cbca019622095b1f331 upstream.
    
    Since commit aa49c90894d0 ("i2c: core: Run atomic i2c xfer when
    !preemptible"), the whole reboot/power off sequence on non-preempt kernels
    is using atomic i2c xfer, as !preemptible() always evaluates to 1.
    
    During device_shutdown(), the i2c might be used a lot and not all busses
    have implemented an atomic xfer handler. This results in a lot of
    avoidable noise, like:
    
    [   12.687169] No atomic I2C transfer handler for 'i2c-0'
    [   12.692313] WARNING: CPU: 6 PID: 275 at drivers/i2c/i2c-core.h:40 i2c_smbus_xfer+0x100/0x118
    ...
    
    Fix this by allowing non-atomic xfer when the interrupts are enabled, as
    it was before.
    
    Link: https://lore.kernel.org/r/20231222230106.73f030a5@yea
    Link: https://lore.kernel.org/r/20240102150350.3180741-1-mwalle@kernel.org
    Link: https://lore.kernel.org/linux-i2c/13271b9b-4132-46ef-abf8-2c311967bb46@mailbox.org/
    Fixes: aa49c90894d0 ("i2c: core: Run atomic i2c xfer when !preemptible")
    Cc: stable@vger.kernel.org # v5.2+
    Signed-off-by: Benjamin Bara <benjamin.bara@skidata.com>
    Tested-by: Michael Walle <mwalle@kernel.org>
    Tested-by: Tor Vic <torvic9@mailbox.org>
    [wsa: removed a comment which needs more work, code is ok]
    Signed-off-by: Wolfram Sang <wsa@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
i40e: Fix filter input checks to prevent config with invalid values [+ + +]
Author: Sudheer Mogilappagari <sudheer.mogilappagari@intel.com>
Date:   Wed Nov 29 11:23:11 2023 +0100

    i40e: Fix filter input checks to prevent config with invalid values
    
    [ Upstream commit 3e48041d9820c17e0a51599d12e66c6e12a8d08d ]
    
    Prevent VF from configuring filters with unsupported actions or use
    REDIRECT action with invalid tc number. Current checks could cause
    out of bounds access on PF side.
    
    Fixes: e284fc280473 ("i40e: Add and delete cloud filter")
    Reviewed-by: Andrii Staikov <andrii.staikov@intel.com>
    Signed-off-by: Sudheer Mogilappagari <sudheer.mogilappagari@intel.com>
    Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Tested-by: Bharathi Sreenivas <bharathi.sreenivas@intel.com>
    Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

i40e: fix use-after-free in i40e_aqc_add_filters() [+ + +]
Author: Ke Xiao <xiaoke@sangfor.com.cn>
Date:   Mon Dec 18 15:08:50 2023 +0800

    i40e: fix use-after-free in i40e_aqc_add_filters()
    
    [ Upstream commit 6a15584e99db8918b60e507539c7446375dcf366 ]
    
    Commit 3116f59c12bd ("i40e: fix use-after-free in
    i40e_sync_filters_subtask()") avoided use-after-free issues,
    by increasing the refcount while updating the VSI filter list to
    the HW. However, it missed the unicast situation.
    
    When deleting a unicast FDB entry, the i40e driver will release
    the mac_filter, and i40e_service_task will concurrently request
    firmware to add the mac_filter, which will lead to the following
    use-after-free issue.
    
    Fix again for both netdev->uc and netdev->mc.
    
    BUG: KASAN: use-after-free in i40e_aqc_add_filters+0x55c/0x5b0 [i40e]
    Read of size 2 at addr ffff888eb3452d60 by task kworker/8:7/6379
    
    CPU: 8 PID: 6379 Comm: kworker/8:7 Kdump: loaded Tainted: G
    Workqueue: i40e i40e_service_task [i40e]
    Call Trace:
     dump_stack+0x71/0xab
     print_address_description+0x6b/0x290
     kasan_report+0x14a/0x2b0
     i40e_aqc_add_filters+0x55c/0x5b0 [i40e]
     i40e_sync_vsi_filters+0x1676/0x39c0 [i40e]
     i40e_service_task+0x1397/0x2bb0 [i40e]
     process_one_work+0x56a/0x11f0
     worker_thread+0x8f/0xf40
     kthread+0x2a0/0x390
     ret_from_fork+0x1f/0x40
    
    Allocated by task 21948:
     kasan_kmalloc+0xa6/0xd0
     kmem_cache_alloc_trace+0xdb/0x1c0
     i40e_add_filter+0x11e/0x520 [i40e]
     i40e_addr_sync+0x37/0x60 [i40e]
     __hw_addr_sync_dev+0x1f5/0x2f0
     i40e_set_rx_mode+0x61/0x1e0 [i40e]
     dev_uc_add_excl+0x137/0x190
     i40e_ndo_fdb_add+0x161/0x260 [i40e]
     rtnl_fdb_add+0x567/0x950
     rtnetlink_rcv_msg+0x5db/0x880
     netlink_rcv_skb+0x254/0x380
     netlink_unicast+0x454/0x610
     netlink_sendmsg+0x747/0xb00
     sock_sendmsg+0xe2/0x120
     __sys_sendto+0x1ae/0x290
     __x64_sys_sendto+0xdd/0x1b0
     do_syscall_64+0xa0/0x370
     entry_SYSCALL_64_after_hwframe+0x65/0xca
    
    Freed by task 21948:
     __kasan_slab_free+0x137/0x190
     kfree+0x8b/0x1b0
     __i40e_del_filter+0x116/0x1e0 [i40e]
     i40e_del_mac_filter+0x16c/0x300 [i40e]
     i40e_addr_unsync+0x134/0x1b0 [i40e]
     __hw_addr_sync_dev+0xff/0x2f0
     i40e_set_rx_mode+0x61/0x1e0 [i40e]
     dev_uc_del+0x77/0x90
     rtnl_fdb_del+0x6a5/0x860
     rtnetlink_rcv_msg+0x5db/0x880
     netlink_rcv_skb+0x254/0x380
     netlink_unicast+0x454/0x610
     netlink_sendmsg+0x747/0xb00
     sock_sendmsg+0xe2/0x120
     __sys_sendto+0x1ae/0x290
     __x64_sys_sendto+0xdd/0x1b0
     do_syscall_64+0xa0/0x370
     entry_SYSCALL_64_after_hwframe+0x65/0xca
    
    Fixes: 3116f59c12bd ("i40e: fix use-after-free in i40e_sync_filters_subtask()")
    Fixes: 41c445ff0f48 ("i40e: main driver core")
    Signed-off-by: Ke Xiao <xiaoke@sangfor.com.cn>
    Signed-off-by: Ding Hui <dinghui@sangfor.com.cn>
    Cc: Di Zhu <zhudi2@huawei.com>
    Reviewed-by: Jan Sokolowski <jan.sokolowski@intel.com>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
    Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
    Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

i40e: Restore VF MSI-X state during PCI reset [+ + +]
Author: Andrii Staikov <andrii.staikov@intel.com>
Date:   Thu Dec 21 14:27:35 2023 +0100

    i40e: Restore VF MSI-X state during PCI reset
    
    [ Upstream commit 371e576ff3e8580d91d49026e5d5faebf5565558 ]
    
    During a PCI FLR the MSI-X Enable flag in the VF PCI MSI-X capability
    register will be cleared. This can lead to issues when a VF is
    assigned to a VM because in these cases the VF driver receives no
    indication of the PF PCI error/reset and additionally it is incapable
    of restoring the cleared flag in the hypervisor configuration space
    without fully reinitializing the driver interrupt functionality.
    
    Since the VF driver is unable to easily resolve this condition on its own,
    restore the VF MSI-X flag during the PF PCI reset handling.
    
    Fixes: 19b7960b2da1 ("i40e: implement split PCI error reset handler")
    Co-developed-by: Karen Ostrowska <karen.ostrowska@intel.com>
    Signed-off-by: Karen Ostrowska <karen.ostrowska@intel.com>
    Co-developed-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
    Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
    Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
    Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
    Signed-off-by: Andrii Staikov <andrii.staikov@intel.com>
    Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
    Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
ice: Fix link_down_on_close message [+ + +]
Author: Katarzyna Wieczerzycka <katarzyna.wieczerzycka@intel.com>
Date:   Fri Dec 15 12:01:56 2023 +0100

    ice: Fix link_down_on_close message
    
    [ Upstream commit 6a8d8bb55e7001de2d50920381cc858f3a3e9fb7 ]
    
    The driver should not report an error message when for a medialess port
    the link_down_on_close flag is enabled and the physical link cannot be
    set down.
    
    Fixes: 8ac7132704f3 ("ice: Fix interface being down after reset with link-down-on-close flag on")
    Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
    Signed-off-by: Katarzyna Wieczerzycka <katarzyna.wieczerzycka@intel.com>
    Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com>
    Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
    Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

ice: Shut down VSI with "link-down-on-close" enabled [+ + +]
Author: Ngai-Mint Kwan <ngai-mint.kwan@intel.com>
Date:   Fri Dec 15 12:01:57 2023 +0100

    ice: Shut down VSI with "link-down-on-close" enabled
    
    [ Upstream commit 6d05ff55ef4f4954d28551236239f297bd52ea48 ]
    
    Disabling netdev with ethtool private flag "link-down-on-close" enabled
    can cause NULL pointer dereference bug. Shut down VSI regardless of
    "link-down-on-close" state.
    
    Fixes: 8ac7132704f3 ("ice: Fix interface being down after reset with link-down-on-close flag on")
    Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
    Signed-off-by: Ngai-Mint Kwan <ngai-mint.kwan@intel.com>
    Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com>
    Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
    Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
igc: Check VLAN EtherType mask [+ + +]
Author: Kurt Kanzenbach <kurt@linutronix.de>
Date:   Wed Dec 6 15:07:18 2023 +0100

    igc: Check VLAN EtherType mask
    
    [ Upstream commit 7afd49a38e73afd57ff62c8d1cf5af760c4d49c0 ]
    
    Currently the driver accepts VLAN EtherType steering rules regardless of
    the configured mask. And things might fail silently or with confusing error
    messages to the user. The VLAN EtherType can only be matched by full
    mask. Therefore, add a check for that.
    
    For instance the following rule is invalid, but the driver accepts it and
    ignores the user specified mask:
    |root@host:~# ethtool -N enp3s0 flow-type ether vlan-etype 0x8100 \
    |             m 0x00ff action 0
    |Added rule with ID 63
    |root@host:~# ethtool --show-ntuple enp3s0
    |4 RX rings available
    |Total 1 rules
    |
    |Filter: 63
    |        Flow Type: Raw Ethernet
    |        Src MAC addr: 00:00:00:00:00:00 mask: FF:FF:FF:FF:FF:FF
    |        Dest MAC addr: 00:00:00:00:00:00 mask: FF:FF:FF:FF:FF:FF
    |        Ethertype: 0x0 mask: 0xFFFF
    |        VLAN EtherType: 0x8100 mask: 0x0
    |        VLAN: 0x0 mask: 0xffff
    |        User-defined: 0x0 mask: 0xffffffffffffffff
    |        Action: Direct to queue 0
    
    After:
    |root@host:~# ethtool -N enp3s0 flow-type ether vlan-etype 0x8100 \
    |             m 0x00ff action 0
    |rmgr: Cannot insert RX class rule: Operation not supported
    
    Fixes: 2b477d057e33 ("igc: Integrate flex filter into ethtool ops")
    Suggested-by: Suman Ghosh <sumang@marvell.com>
    Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>
    Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Tested-by: Naama Meir <naamax.meir@linux.intel.com>
    Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

igc: Check VLAN TCI mask [+ + +]
Author: Kurt Kanzenbach <kurt@linutronix.de>
Date:   Fri Dec 1 08:50:43 2023 +0100

    igc: Check VLAN TCI mask
    
    [ Upstream commit b5063cbe148b829e8eb97672c2cbccc058835476 ]
    
    Currently the driver accepts VLAN TCI steering rules regardless of the
    configured mask. And things might fail silently or with confusing error
    messages to the user.
    
    There are two ways to handle the VLAN TCI mask:
    
     1. Match on the PCP field using a VLAN prio filter
     2. Match on complete TCI field using a flex filter
    
    Therefore, add checks and code for that.
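
    The two accepted cases can be sketched as a mask check. This is an
    illustration under assumptions, not the driver's code: the function name
    and macros are hypothetical, the mask is taken in host byte order with
    set bits meaning "compare this bit", and a zero mask is assumed to mean
    that no TCI match was requested.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TCI_PCP_MASK  0xe000u /* top three TCI bits: priority code point */
#define TCI_FULL_MASK 0xffffu /* the complete 16-bit TCI field */

/* Hypothetical check: accept only "no match", "PCP only" or "full TCI". */
static bool vlan_tci_mask_supported(uint16_t mask)
{
    return mask == 0 || mask == TCI_PCP_MASK || mask == TCI_FULL_MASK;
}
```

    Any other mask, for example one covering only the VLAN ID bits, is
    rejected instead of being silently converted into a different rule.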
    
    For instance the following rule is invalid and will be converted into a
    VLAN prio rule which is not correct:
    |root@host:~# ethtool -N enp3s0 flow-type ether vlan 0x0001 m 0xf000 \
    |             action 1
    |Added rule with ID 61
    |root@host:~# ethtool --show-ntuple enp3s0
    |4 RX rings available
    |Total 1 rules
    |
    |Filter: 61
    |        Flow Type: Raw Ethernet
    |        Src MAC addr: 00:00:00:00:00:00 mask: FF:FF:FF:FF:FF:FF
    |        Dest MAC addr: 00:00:00:00:00:00 mask: FF:FF:FF:FF:FF:FF
    |        Ethertype: 0x0 mask: 0xFFFF
    |        VLAN EtherType: 0x0 mask: 0xffff
    |        VLAN: 0x1 mask: 0x1fff
    |        User-defined: 0x0 mask: 0xffffffffffffffff
    |        Action: Direct to queue 1
    
    After:
    |root@host:~# ethtool -N enp3s0 flow-type ether vlan 0x0001 m 0xf000 \
    |             action 1
    |rmgr: Cannot insert RX class rule: Operation not supported
    
    Fixes: 7991487ecb2d ("igc: Allow for Flex Filters to be installed")
    Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>
    Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Tested-by: Naama Meir <naamax.meir@linux.intel.com>
    Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

igc: Fix hicredit calculation [+ + +]
Author: Rodrigo Cataldo <rodrigo.cadore@l-acoustics.com>
Date:   Fri Dec 8 15:58:16 2023 +0100

    igc: Fix hicredit calculation
    
    [ Upstream commit 947dfc8138dfaeb6e966e2d661de89eb203e3064 ]
    
    According to the Intel Software Manual for I225, Section 7.5.2.7,
    hicredit should be multiplied by the constant link-rate value, 0x7736.
    
    Currently, the old constant link-rate value, 0x7735, from the boards
    supported on igb are being used, most likely due to a copy'n'paste, as
    the rest of the logic is the same for both drivers.
    
    Update hicredit accordingly.
    
    Fixes: 1ab011b0bf07 ("igc: Add support for CBS offloading")
    Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de>
    Signed-off-by: Rodrigo Cataldo <rodrigo.cadore@l-acoustics.com>
    Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
    Tested-by: Naama Meir <naamax.meir@linux.intel.com>
    Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

igc: Report VLAN EtherType matching back to user [+ + +]
Author: Kurt Kanzenbach <kurt@linutronix.de>
Date:   Fri Dec 1 08:50:42 2023 +0100

    igc: Report VLAN EtherType matching back to user
    
    [ Upstream commit 088464abd48cf3735aee91f9e211b32da9d81117 ]
    
    Currently the driver allows to configure matching by VLAN EtherType.
    However, the retrieval function does not report it back to the user. Add
    it.
    
    Before:
    |root@host:~# ethtool -N enp3s0 flow-type ether vlan-etype 0x8100 action 0
    |Added rule with ID 63
    |root@host:~# ethtool --show-ntuple enp3s0
    |4 RX rings available
    |Total 1 rules
    |
    |Filter: 63
    |        Flow Type: Raw Ethernet
    |        Src MAC addr: 00:00:00:00:00:00 mask: FF:FF:FF:FF:FF:FF
    |        Dest MAC addr: 00:00:00:00:00:00 mask: FF:FF:FF:FF:FF:FF
    |        Ethertype: 0x0 mask: 0xFFFF
    |        Action: Direct to queue 0
    
    After:
    |root@host:~# ethtool -N enp3s0 flow-type ether vlan-etype 0x8100 action 0
    |Added rule with ID 63
    |root@host:~# ethtool --show-ntuple enp3s0
    |4 RX rings available
    |Total 1 rules
    |
    |Filter: 63
    |        Flow Type: Raw Ethernet
    |        Src MAC addr: 00:00:00:00:00:00 mask: FF:FF:FF:FF:FF:FF
    |        Dest MAC addr: 00:00:00:00:00:00 mask: FF:FF:FF:FF:FF:FF
    |        Ethertype: 0x0 mask: 0xFFFF
    |        VLAN EtherType: 0x8100 mask: 0x0
    |        VLAN: 0x0 mask: 0xffff
    |        User-defined: 0x0 mask: 0xffffffffffffffff
    |        Action: Direct to queue 0
    
    Fixes: 2b477d057e33 ("igc: Integrate flex filter into ethtool ops")
    Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>
    Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Tested-by: Naama Meir <naamax.meir@linux.intel.com>
    Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
ipv4, ipv6: Use splice_eof() to flush [+ + +]
Author: David Howells <dhowells@redhat.com>
Date:   Wed Jun 7 19:19:13 2023 +0100

    ipv4, ipv6: Use splice_eof() to flush
    
    [ Upstream commit 1d7e4538a5463faa0b0e26a7a7b6bd68c7dfdd78 ]
    
    Allow splice to undo the effects of MSG_MORE after prematurely ending a
    splice/sendfile due to getting an EOF condition (->splice_read() returned
    0) after splice had called sendmsg() with MSG_MORE set when the user didn't
    set MSG_MORE.
    
    For UDP, a pending packet will not be emitted if the socket is closed
    before it is flushed; with this change, it will be flushed by ->splice_eof().
    
    For TCP, it's not clear that MSG_MORE is actually effective.
    
    Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
    Link: https://lore.kernel.org/r/CAHk-=wh=V579PDYvkpnTobCLGczbgxpMgGmmhqiTyE34Cpi5Gg@mail.gmail.com/
    Signed-off-by: David Howells <dhowells@redhat.com>
    cc: Kuniyuki Iwashima <kuniyu@amazon.com>
    cc: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
    cc: David Ahern <dsahern@kernel.org>
    cc: Jens Axboe <axboe@kernel.dk>
    cc: Matthew Wilcox <willy@infradead.org>
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Stable-dep-of: a0002127cd74 ("udp: move udp->no_check6_tx to udp->udp_flags")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
keys, dns: Fix missing size check of V1 server-list header [+ + +]
Author: Edward Adam Davis <eadavis@qq.com>
Date:   Sun Dec 24 00:02:49 2023 +0000

    keys, dns: Fix missing size check of V1 server-list header
    
    commit 1997b3cb4217b09e49659b634c94da47f0340409 upstream.
    
    The dns_resolver_preparse() function has a check on the size of the
    payload for the basic header of the binary-style payload, but is missing
    a check for the size of the V1 server-list payload header after
    determining that's what we've been given.
    
    Fix this by getting rid of the pointer to the basic header and just
    assuming that we have a V1 server-list payload and moving the V1 server
    list pointer inside the if-statement.  Dealing with other types and
    versions can be left for when such have been defined.
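
    The missing check amounts to comparing the payload length against the
    larger V1 header before reading any of its fields. The sketch below is
    a userspace illustration only: the struct layouts and the function
    parse_v1_status() are hypothetical stand-ins, not the kernel's
    definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layouts: a short basic header embedded in a longer V1 one. */
struct basic_header {
    uint8_t zero;
    uint8_t content;
    uint8_t version;
};

struct v1_header {
    struct basic_header hdr;
    uint8_t source;
    uint8_t status;
    uint8_t nr_servers;
};

/* Returns 0 and stores the status byte, or -1 if the payload is too short
 * to hold a complete V1 header. */
static int parse_v1_status(const uint8_t *data, size_t len, uint8_t *status)
{
    if (len < sizeof(struct v1_header))
        return -1; /* checking only sizeof(struct basic_header) is not enough */
    *status = data[offsetof(struct v1_header, status)];
    return 0;
}
```

    A four-byte payload passes a check against the basic header alone but
    is rejected here, because the V1 header in this sketch needs six bytes.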
    
    This can be tested by doing the following with KASAN enabled:
    
        echo -n -e '\x0\x0\x1\x2' | keyctl padd dns_resolver foo @p
    
    and produces an oops like the following:
    
        BUG: KASAN: slab-out-of-bounds in dns_resolver_preparse+0xc9f/0xd60 net/dns_resolver/dns_key.c:127
        Read of size 1 at addr ffff888028894084 by task syz-executor265/5069
        ...
        Call Trace:
          dns_resolver_preparse+0xc9f/0xd60 net/dns_resolver/dns_key.c:127
          __key_create_or_update+0x453/0xdf0 security/keys/key.c:842
          key_create_or_update+0x42/0x50 security/keys/key.c:1007
          __do_sys_add_key+0x29c/0x450 security/keys/keyctl.c:134
          do_syscall_x64 arch/x86/entry/common.c:52 [inline]
          do_syscall_64+0x40/0x110 arch/x86/entry/common.c:83
          entry_SYSCALL_64_after_hwframe+0x62/0x6a
    
    This patch was originally by Edward Adam Davis, but was modified by
    Linus.
    
    Fixes: b946001d3bb1 ("keys, dns: Allow key types (eg. DNS) to be reclaimed immediately on expiry")
    Reported-and-tested-by: syzbot+94bbb75204a05da3d89f@syzkaller.appspotmail.com
    Link: https://lore.kernel.org/r/0000000000009b39bc060c73e209@google.com/
    Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Edward Adam Davis <eadavis@qq.com>
    Signed-off-by: David Howells <dhowells@redhat.com>
    Tested-by: David Howells <dhowells@redhat.com>
    Cc: Edward Adam Davis <eadavis@qq.com>
    Cc: Jarkko Sakkinen <jarkko@kernel.org>
    Cc: Jeffrey E Altman <jaltman@auristor.com>
    Cc: Wang Lei <wang840925@gmail.com>
    Cc: Jeff Layton <jlayton@redhat.com>
    Cc: Steve French <sfrench@us.ibm.com>
    Cc: Marc Dionne <marc.dionne@auristor.com>
    Cc: "David S. Miller" <davem@davemloft.net>
    Cc: Eric Dumazet <edumazet@google.com>
    Cc: Jakub Kicinski <kuba@kernel.org>
    Cc: Paolo Abeni <pabeni@redhat.com>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Cc: Jeffrey E Altman <jaltman@auristor.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 
khugepage: replace try_to_release_page() with filemap_release_folio() [+ + +]
Author: Vishal Moola (Oracle) <vishal.moola@gmail.com>
Date:   Thu Nov 17 23:30:53 2022 -0800

    khugepage: replace try_to_release_page() with filemap_release_folio()
    
    [ Upstream commit 64ab3195ea077eaeedc8b382939c3dc5ca56f369 ]
    
    Replace some calls with their folio equivalents.  This change removes 4
    calls to compound_head() and is in preparation for the removal of the
    try_to_release_page() wrapper.
    
    Link: https://lkml.kernel.org/r/20221118073055.55694-3-vishal.moola@gmail.com
    Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com>
    Cc: Matthew Wilcox <willy@infradead.org>
    Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
    Cc: Theodore Ts'o <tytso@mit.edu>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Stable-dep-of: 1898efcdbed3 ("block: update the stable_writes flag in bdev_add")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
KVM: x86/pmu: fix masking logic for MSR_CORE_PERF_GLOBAL_CTRL [+ + +]
Author: Paolo Bonzini <pbonzini@redhat.com>
Date:   Thu Jan 4 16:15:17 2024 +0100

    KVM: x86/pmu: fix masking logic for MSR_CORE_PERF_GLOBAL_CTRL
    
    commit 971079464001c6856186ca137778e534d983174a upstream.
    
    When commit c59a1f106f5c ("KVM: x86/pmu: Add IA32_PEBS_ENABLE
    MSR emulation for extended PEBS") switched the initialization of
    cpuc->guest_switch_msrs to use compound literals, it screwed up
    the boolean logic:
    
    +       u64 pebs_mask = cpuc->pebs_enabled & x86_pmu.pebs_capable;
    ...
    -       arr[0].guest = intel_ctrl & ~cpuc->intel_ctrl_host_mask;
    -       arr[0].guest &= ~(cpuc->pebs_enabled & x86_pmu.pebs_capable);
    +               .guest = intel_ctrl & (~cpuc->intel_ctrl_host_mask | ~pebs_mask),
    
    Before the patch, the value of arr[0].guest would have been intel_ctrl &
    ~cpuc->intel_ctrl_host_mask & ~pebs_mask.  The intent is to always treat
    PEBS events as host-only because, while the guest runs, there is no way
    to tell the processor about the virtual address where to put PEBS records
    intended for the host.
    
    Unfortunately, the new expression can be expanded to
    
            (intel_ctrl & ~cpuc->intel_ctrl_host_mask) | (intel_ctrl & ~pebs_mask)
    
    which makes no sense; it includes any bit that isn't *both* marked as
    exclude_guest and using PEBS.  So, reinstate the old logic.  Another
    way to write it could be "intel_ctrl & ~(cpuc->intel_ctrl_host_mask |
    pebs_mask)", presumably the intention of the author of the faulty commit.
    However, I personally find the repeated application of A AND NOT B to
    be a bit more readable.
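
    The difference between the two expressions is easy to check with small
    bit patterns. The following is a standalone sketch: the function names
    are hypothetical, and only the three boolean expressions are taken from
    the text above.

```c
#include <assert.h>
#include <stdint.h>

/* Faulty form: a bit is cleared only if it is set in BOTH masks. */
static uint64_t guest_ctrl_faulty(uint64_t intel_ctrl, uint64_t host_mask,
                                  uint64_t pebs_mask)
{
    return intel_ctrl & (~host_mask | ~pebs_mask);
}

/* Reinstated form: a bit is cleared if it is set in EITHER mask. */
static uint64_t guest_ctrl_fixed(uint64_t intel_ctrl, uint64_t host_mask,
                                 uint64_t pebs_mask)
{
    return intel_ctrl & ~host_mask & ~pebs_mask;
}

/* Equivalent spelling of the reinstated form via De Morgan's law. */
static uint64_t guest_ctrl_demorgan(uint64_t intel_ctrl, uint64_t host_mask,
                                    uint64_t pebs_mask)
{
    return intel_ctrl & ~(host_mask | pebs_mask);
}
```

    With intel_ctrl = 0xf, host_mask = 0x3 and pebs_mask = 0x6, the fixed
    form leaves only bit 3 set, while the faulty form also keeps bits 0 and
    2, each of which is set in one of the two masks.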
    
    This shows up as guest failures when running concurrent long-running
    perf workloads on the host, and was reported to happen with rcutorture.
    All guests on a given host would die simultaneously with something like an
    instruction fault or a segmentation violation.
    
    Reported-by: Paul E. McKenney <paulmck@kernel.org>
    Analyzed-by: Sean Christopherson <seanjc@google.com>
    Tested-by: Paul E. McKenney <paulmck@kernel.org>
    Cc: stable@vger.kernel.org
    Fixes: c59a1f106f5c ("KVM: x86/pmu: Add IA32_PEBS_ENABLE MSR emulation for extended PEBS")
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
lib/group_cpus.c: avoid acquiring cpu hotplug lock in group_cpus_evenly [+ + +]
Author: Ming Lei <ming.lei@redhat.com>
Date:   Mon Nov 20 16:35:59 2023 +0800

    lib/group_cpus.c: avoid acquiring cpu hotplug lock in group_cpus_evenly
    
    [ Upstream commit 0263f92fadbb9d294d5971ac57743f882c93b2b3 ]
    
    group_cpus_evenly() could be part of storage driver's error handler, such
    as nvme driver, which may happen during CPU hotplug, in which storage queue
    has to drain its pending IOs because all CPUs associated with the queue
    are offline and the queue is becoming inactive.  And handling IO needs
    error handler to provide forward progress.
    
    Then deadlock is caused:
    
    1) inside CPU hotplug handler, CPU hotplug lock is held, and blk-mq's
       handler is waiting for inflight IO
    
    2) error handler is waiting for CPU hotplug lock
    
    3) inflight IO can't be completed in blk-mq's CPU hotplug handler
       because error handling can't provide forward progress.
    
    Solve the deadlock by not holding the CPU hotplug lock in
    group_cpus_evenly(), which spreads in two stages: 1) the 1st stage is over
    all present CPUs; 2) the 2nd stage is over all other CPUs.
    
    It turns out the two-stage spread just needs a consistent
    'cpu_present_mask', so remove the CPU hotplug lock by storing the mask in
    a local cache.  This doesn't change correctness, because all CPUs are
    still covered.
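    
    Why a single snapshot is enough can be sketched with plain bitmasks.
    This is a userspace model under stated assumptions: the struct, the
    function name and the 64-bit mask representation are hypothetical and
    do not mirror the kernel's cpumask API.

```c
#include <stdint.h>

struct spread_stages {
    uint64_t first;   /* CPUs handled in the 1st stage (present) */
    uint64_t second;  /* CPUs handled in the 2nd stage (all others) */
};

/* Both stages are derived from one snapshot of the present mask, so a
 * concurrent change to the live mask cannot make the stages overlap or
 * leave a possible CPU uncovered. */
struct spread_stages split_stages(uint64_t possible, uint64_t present_snapshot)
{
    struct spread_stages s;

    s.first = possible & present_snapshot;
    s.second = possible & ~present_snapshot;
    return s;
}
```

    Whatever value the snapshot holds, the union of the two stages equals
    the possible mask, which is the coverage property the text relies on.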
    
    Link: https://lkml.kernel.org/r/20231120083559.285174-1-ming.lei@redhat.com
    Signed-off-by: Ming Lei <ming.lei@redhat.com>
    Reported-by: Yi Zhang <yi.zhang@redhat.com>
    Reported-by: Guangwu Zhang <guazhang@redhat.com>
    Tested-by: Guangwu Zhang <guazhang@redhat.com>
    Reviewed-by: Chengming Zhou <zhouchengming@bytedance.com>
    Reviewed-by: Jens Axboe <axboe@kernel.dk>
    Cc: Keith Busch <kbusch@kernel.org>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
Linux: Linux 6.1.72 [+ + +]
Author: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Date:   Wed Jan 10 17:10:37 2024 +0100

    Linux 6.1.72
    
    Link: https://lore.kernel.org/r/20240108153511.214254205@linuxfoundation.org
    Tested-by: SeongJae Park <sj@kernel.org>
    Tested-by: Florian Fainelli <florian.fainelli@broadcom.com>
    Tested-by: Allen Pais <apais@linux.microsoft.com>
    Tested-by: Shuah Khan <skhan@linuxfoundation.org>
    Tested-by: Salvatore Bonaccorso <carnil@debian.org>
    Tested-by: Linux Kernel Functional Testing <lkft@linaro.org>
    Tested-by: Conor Dooley <conor.dooley@microchip.com>
    Tested-by: Jon Hunter <jonathanh@nvidia.com>
    Tested-by: Sven Joachim <svenjoac@gmx.de>
    Tested-by: Ron Economos <re@w6rz.net>
    Tested-by: Kelsey Steele <kelseysteele@linux.microsoft.com>
    Tested-by: Pavel Machek (CIP) <pavel@denx.de>
    Tested-by: Yann Sionneau <ysionneau@kalrayinc.com>
    Tested-by: kernelci.org bot <bot@kernelci.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
media: camss: sm8250: Virtual channels for CSID [+ + +]
Author: Milen Mitkov <quic_mmitkov@quicinc.com>
Date:   Fri Dec 9 11:40:34 2022 +0200

    media: camss: sm8250: Virtual channels for CSID
    
    [ Upstream commit 3c4ed72a16bc6733cda9c65048af74a2e8eaa0eb ]
    
    CSID hardware on SM8250 can demux up to 4 simultaneous streams
    based on virtual channel (vc) or datatype (dt).
    The CSID subdevice entity now has 4 source ports that can be
    enabled/disabled and thus can control which virtual channels
    are enabled. Datatype demuxing not tested.
    
    In order to keep a valid internal state of the subdevice,
    implicit format propagation from the sink to the source pads
    has been preserved. However, the format on each source pad
    can be different and in that case it must be configured explicitly.
    
    CSID's s_stream is called when any stream is started or stopped.
    It will call configure_streams() that will rewrite IRQ settings to HW.
    When multiple streams are running simultaneously there is an issue
    when writing IRQ settings for one stream while another is still
    running, thus avoid re-writing settings if they were not changed
    in link setup, or by fully powering off the CSID hardware.
    
    Signed-off-by: Milen Mitkov <quic_mmitkov@quicinc.com>
    Reviewed-by: Robert Foss <robert.foss@linaro.org>
    Tested-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
    Acked-by: Robert Foss <robert.foss@linaro.org>
    Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
    Stable-dep-of: e655d1ae9703 ("media: qcom: camss: Fix set CSI2_RX_CFG1_VC_MODE when VC is greater than 3")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

media: qcom: camss: Comment CSID dt_id field [+ + +]
Author: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
Date:   Thu Sep 28 01:58:25 2023 +0100

    media: qcom: camss: Comment CSID dt_id field
    
    commit f910d3ba78a2677c23508f225eb047d89eb4b2b6 upstream.
    
    Digging into the documentation we find that the DT_ID bitfield is used to
    map the six bit DT to a two bit ID code. This value is concatenated to the
    VC bitfield to create a CID value. DT_ID is the two least significant bits
    of CID and VC the most significant bits.
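    
    The bit layout described above can be sketched as follows.  The helper
    name and the 8-bit return type are assumptions made for illustration;
    only the "VC in the upper bits, two-bit DT_ID in the lower bits" layout
    comes from the text.

```c
#include <stdint.h>

/* Hypothetical helper: concatenate VC (most significant bits) and the
 * two-bit DT_ID (least significant bits) into a CID. */
uint8_t csid_cid(uint8_t vc, uint8_t dt_id)
{
    return (uint8_t)((vc << 2) | (dt_id & 0x3));
}
```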
    
    Originally we set dt_id = vc * 4 in and then subsequently set dt_id = vc.
    
    commit 3c4ed72a16bc ("media: camss: sm8250: Virtual channels for CSID")
    silently fixed the multiplication by four which would give a better
    value for the generated CID without mentioning what was being done or why.
    
    Next up I haplessly changed the value back to "dt_id = vc * 4" since there
    didn't appear to be any logic behind it.
    
    Hans asked what the change was for and I honestly couldn't remember the
    provenance of it, so I dug in.
    
    Link: https://lore.kernel.org/linux-arm-msm/edd4bf9b-0e1b-883c-1a4d-50f4102c3924@xs4all.nl/
    
    Add a comment so the next hapless programmer doesn't make this same
    mistake.
    
    Signed-off-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
    Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

media: qcom: camss: Fix set CSI2_RX_CFG1_VC_MODE when VC is greater than 3 [+ + +]
Author: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
Date:   Wed Aug 30 16:16:14 2023 +0100

    media: qcom: camss: Fix set CSI2_RX_CFG1_VC_MODE when VC is greater than 3
    
    [ Upstream commit e655d1ae9703286cef7fda8675cad62f649dc183 ]
    
    VC_MODE = 0 implies a two bit VC address.
    VC_MODE = 1 is required for VCs with a larger address than two bits.
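    
    As a minimal sketch of that rule (the helper name is hypothetical and
    the register write itself is omitted):

```c
/* Hypothetical helper: pick the VC_MODE value for a virtual channel.
 * 0 covers two-bit VC addresses (0..3); 1 is needed for anything larger. */
unsigned int csid_vc_mode(unsigned int vc)
{
    return vc > 3 ? 1u : 0u;
}
```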
    
    Fixes: eebe6d00e9bf ("media: camss: Add support for CSID hardware version Titan 170")
    Cc: stable@vger.kernel.org
    Signed-off-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
    Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
    Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
memory-failure: convert truncate_error_page() to use folio [+ + +]
Author: Vishal Moola (Oracle) <vishal.moola@gmail.com>
Date:   Thu Nov 17 23:30:54 2022 -0800

    memory-failure: convert truncate_error_page() to use folio
    
    [ Upstream commit ac5efa782041670b63a05c36d92d02a80e50bb63 ]
    
    Replace try_to_release_page() with filemap_release_folio().  This change
    is in preparation for the removal of the try_to_release_page() wrapper.
    
    Link: https://lkml.kernel.org/r/20221118073055.55694-4-vishal.moola@gmail.com
    Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com>
    Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
    Cc: Matthew Wilcox <willy@infradead.org>
    Cc: Theodore Ts'o <tytso@mit.edu>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Stable-dep-of: 1898efcdbed3 ("block: update the stable_writes flag in bdev_add")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
mlxbf_gige: fix receive packet race condition [+ + +]
Author: David Thompson <davthompson@nvidia.com>
Date:   Wed Dec 20 18:47:39 2023 -0500

    mlxbf_gige: fix receive packet race condition
    
    [ Upstream commit dcea1bd45e6d111cc8fc1aaefa7e31694089bda3 ]
    
    Under heavy traffic, the BlueField Gigabit interface can
    become unresponsive. This is due to a possible race condition
    in the mlxbf_gige_rx_packet function, where the function exits
    with producer and consumer indices equal but there are remaining
    packet(s) to be processed. In order to prevent this situation,
    read receive consumer index *before* the HW replenish so that
    the mlxbf_gige_rx_packet function returns an accurate return
    value even if a packet is received into just-replenished buffer
    prior to exiting this routine. If the just-replenished buffer
    is received and occupies the last RX ring entry, the interface
    would not recover and instead would encounter RX packet drops
    related to internal buffer shortages since the driver RX logic
    is not being triggered to drain the RX ring. This patch will
    address and prevent this "ring full" condition.
    
    Fixes: f92e1869d74e ("Add Mellanox BlueField Gigabit Ethernet driver")
    Reviewed-by: Asmaa Mnebhi <asmaa@nvidia.com>
    Signed-off-by: David Thompson <davthompson@nvidia.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
mm, netfs, fscache: stop read optimisation when folio removed from pagecache [+ + +]
Author: David Howells <dhowells@redhat.com>
Date:   Wed Jun 28 11:48:52 2023 +0100

    mm, netfs, fscache: stop read optimisation when folio removed from pagecache
    
    [ Upstream commit b4fa966f03b7401ceacd4ffd7227197afb2b8376 ]
    
    Fscache has an optimisation by which reads from the cache are skipped
    until we know that (a) there's data there to be read and (b) that data
    isn't entirely covered by pages resident in the netfs pagecache.  This is
    done with two flags manipulated by fscache_note_page_release():
    
            if (...
                test_bit(FSCACHE_COOKIE_HAVE_DATA, &cookie->flags) &&
                test_bit(FSCACHE_COOKIE_NO_DATA_TO_READ, &cookie->flags))
                    clear_bit(FSCACHE_COOKIE_NO_DATA_TO_READ, &cookie->flags);
    
    where the NO_DATA_TO_READ flag causes cachefiles_prepare_read() to
    indicate that netfslib should download from the server or clear the page
    instead.
    
    The fscache_note_page_release() function is intended to be called from
    ->releasepage() - but that only gets called if PG_private or PG_private_2
    is set - and currently the former is at the discretion of the network
    filesystem and the latter is only set whilst a page is being written to
    the cache, so sometimes we miss clearing the optimisation.
    
    Fix this by following Willy's suggestion[1] and adding an address_space
    flag, AS_RELEASE_ALWAYS, that causes filemap_release_folio() to always call
    ->release_folio() if it's set, even if PG_private or PG_private_2 aren't
    set.
    
    Note that this would require folio_test_private() and page_has_private() to
    become more complicated.  To avoid that, in the places[*] where these are
    used to conditionalise calls to filemap_release_folio() and
    try_to_release_page(), the tests are removed and those functions are just
    jumped to unconditionally, and the test is performed there.
    
    [*] There are some exceptions in vmscan.c where the check guards more than
    just a call to the releaser.  I've added a function, folio_needs_release()
    to wrap all the checks for that.
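    
    The combined check can be modelled in userspace roughly as below.  This
    is a sketch under stated assumptions: the struct and its fields are
    hypothetical stand-ins for the page flags and the address_space flag,
    and the real folio_needs_release() may perform further checks.

```c
#include <stdbool.h>

/* Hypothetical model of the state the release decision depends on. */
struct folio_model {
    bool has_private;     /* PG_private or PG_private_2 is set */
    bool release_always;  /* mapping carries AS_RELEASE_ALWAYS */
};

/* A release callback is wanted if the folio has private data, or if the
 * mapping asked to be notified unconditionally. */
bool needs_release(const struct folio_model *f)
{
    return f->has_private || f->release_always;
}
```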
    
    AS_RELEASE_ALWAYS should be set if a non-NULL cookie is obtained from
    fscache and cleared in ->evict_inode() before truncate_inode_pages_final()
    is called.
    
    Additionally, the FSCACHE_COOKIE_NO_DATA_TO_READ flag needs to be cleared
    and the optimisation cancelled if a cachefiles object already contains data
    when we open it.
    
    [dwysocha@redhat.com: call folio_mapping() inside folio_needs_release()]
      Link: https://github.com/DaveWysochanskiRH/kernel/commit/902c990e311120179fa5de99d68364b2947b79ec
    Link: https://lkml.kernel.org/r/20230628104852.3391651-3-dhowells@redhat.com
    Fixes: 1f67e6d0b188 ("fscache: Provide a function to note the release of a page")
    Fixes: 047487c947e8 ("cachefiles: Implement the I/O routines")
    Signed-off-by: David Howells <dhowells@redhat.com>
    Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
    Reported-by: Rohith Surabattula <rohiths.msft@gmail.com>
    Suggested-by: Matthew Wilcox <willy@infradead.org>
    Tested-by: SeongJae Park <sj@kernel.org>
    Cc: Daire Byrne <daire.byrne@gmail.com>
    Cc: Matthew Wilcox <willy@infradead.org>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Cc: Steve French <sfrench@samba.org>
    Cc: Shyam Prasad N <nspmangalore@gmail.com>
    Cc: Rohith Surabattula <rohiths.msft@gmail.com>
    Cc: Dave Wysochanski <dwysocha@redhat.com>
    Cc: Dominique Martinet <asmadeus@codewreck.org>
    Cc: Ilya Dryomov <idryomov@gmail.com>
    Cc: Andreas Dilger <adilger.kernel@dilger.ca>
    Cc: Jingbo Xu <jefflexu@linux.alibaba.com>
    Cc: "Theodore Ts'o" <tytso@mit.edu>
    Cc: Xiubo Li <xiubli@redhat.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Stable-dep-of: 1898efcdbed3 ("block: update the stable_writes flag in bdev_add")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
mm/memory_hotplug: add missing mem_hotplug_lock [+ + +]
Author: Sumanth Korikkar <sumanthk@linux.ibm.com>
Date:   Mon Nov 20 15:53:52 2023 +0100

    mm/memory_hotplug: add missing mem_hotplug_lock
    
    [ Upstream commit 001002e73712cdf6b8d9a103648cda3040ad7647 ]
    
    From Documentation/core-api/memory-hotplug.rst:
    When adding/removing/onlining/offlining memory or adding/removing
    heterogeneous/device memory, we should always hold the mem_hotplug_lock
    in write mode to serialise memory hotplug (e.g. access to global/zone
    variables).
    
    mhp_(de)init_memmap_on_memory() functions can change zone stats and
    struct page content, but they are currently called w/o the
    mem_hotplug_lock.
    
    When a memory block is being offlined and kmemleak goes through each
    populated zone, the following theoretical race condition could occur:
    CPU 0:                                       | CPU 1:
    memory_offline()                             |
    -> offline_pages()                           |
            -> mem_hotplug_begin()               |
               ...                               |
            -> mem_hotplug_done()                |
                                                 | kmemleak_scan()
                                                 | -> get_online_mems()
                                                 |    ...
    -> mhp_deinit_memmap_on_memory()             |
      [not protected by mem_hotplug_begin/done()]|
      Marks memory section as offline,           |   Retrieves zone_start_pfn
      poisons vmemmap struct pages and updates   |   and struct page members.
      the zone related data                      |
                                                 |    ...
                                                 | -> put_online_mems()
    
    Fix this by ensuring mem_hotplug_lock is taken before performing
    mhp_init_memmap_on_memory().  Also ensure that
    mhp_deinit_memmap_on_memory() holds the lock.
    
    online/offline_pages() are currently only called from
    memory_block_online/offline(), so it is safe to move the locking there.
    
    Link: https://lkml.kernel.org/r/20231120145354.308999-2-sumanthk@linux.ibm.com
    Fixes: a08a2ae34613 ("mm,memory_hotplug: allocate memmap from the added memory range")
    Signed-off-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
    Reviewed-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
    Acked-by: David Hildenbrand <david@redhat.com>
    Cc: Alexander Gordeev <agordeev@linux.ibm.com>
    Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
    Cc: Anshuman Khandual <anshuman.khandual@arm.com>
    Cc: Heiko Carstens <hca@linux.ibm.com>
    Cc: Michal Hocko <mhocko@suse.com>
    Cc: Oscar Salvador <osalvador@suse.de>
    Cc: Vasily Gorbik <gor@linux.ibm.com>
    Cc: kernel test robot <lkp@intel.com>
    Cc: <stable@vger.kernel.org>    [5.15+]
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

mm/memory_hotplug: fix error handling in add_memory_resource() [+ + +]
Author: Sumanth Korikkar <sumanthk@linux.ibm.com>
Date:   Mon Nov 20 15:53:53 2023 +0100

    mm/memory_hotplug: fix error handling in add_memory_resource()
    
    [ Upstream commit f42ce5f087eb69e47294ababd2e7e6f88a82d308 ]
    
    In add_memory_resource(), creation of memory block devices occurs after
    successful call to arch_add_memory().  However, creation of memory block
    devices could fail.  In that case, arch_remove_memory() is called to
    perform necessary cleanup.
    
    Currently with or without altmap support, arch_remove_memory() is always
    passed with altmap set to NULL during error handling.  This leads to
    freeing of struct pages using free_pages(), even though the allocation
    might have been performed with altmap support via
    altmap_alloc_block_buf().
    
    Fix the error handling by passing altmap in arch_remove_memory(). This
    ensures the following:
    * When altmap is disabled, deallocation of the struct pages array occurs
      via free_pages().
    * When altmap is enabled, deallocation occurs via vmem_altmap_free().
    
    Link: https://lkml.kernel.org/r/20231120145354.308999-3-sumanthk@linux.ibm.com
    Fixes: a08a2ae34613 ("mm,memory_hotplug: allocate memmap from the added memory range")
    Signed-off-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
    Reviewed-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
    Acked-by: David Hildenbrand <david@redhat.com>
    Cc: Alexander Gordeev <agordeev@linux.ibm.com>
    Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
    Cc: Anshuman Khandual <anshuman.khandual@arm.com>
    Cc: Heiko Carstens <hca@linux.ibm.com>
    Cc: kernel test robot <lkp@intel.com>
    Cc: Michal Hocko <mhocko@suse.com>
    Cc: Oscar Salvador <osalvador@suse.de>
    Cc: Vasily Gorbik <gor@linux.ibm.com>
    Cc: <stable@vger.kernel.org>    [5.15+]
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
mm: fix unmap_mapping_range high bits shift bug [+ + +]
Author: Jiajun Xie <jiajun.xie.sh@gmail.com>
Date:   Wed Dec 20 13:28:39 2023 +0800

    mm: fix unmap_mapping_range high bits shift bug
    
    commit 9eab0421fa94a3dde0d1f7e36ab3294fc306c99d upstream.
    
    The bug happens when the highest bit of holebegin is 1.  Suppose holebegin
    is 0x8000000111111000; after the shift, hba would be 0xfff8000000111111,
    and vma_interval_tree_foreach would then fail the lookup or return the
    wrong result.
    
    error call seq e.g.:
    - mmap(..., offset=0x8000000111111000)
      |- syscall(mmap, ... unsigned long, off):
         |- ksys_mmap_pgoff( ... , off >> PAGE_SHIFT);
    
      here pgoff is correctly shifted to 0x8000000111111,
      but pass 0x8000000111111000 as holebegin to unmap
      would then cause terrible result, as shown below:
    
    - unmap_mapping_range(..., loff_t const holebegin)
      |- pgoff_t hba = holebegin >> PAGE_SHIFT;
              /* hba = 0xfff8000000111111 unexpectedly */
    
    The issue happens in heterogeneous computing, where the device (e.g.
    GPU) and the host share the same virtual address space.
    
    A simple workflow pattern which hit the issue is:
            /* host */
        1. userspace first mmap a file backed VA range with specified offset.
                            e.g. (offset=0x800..., mmap return: va_a)
        2. write some data to the corresponding sys page
                             e.g. (va_a = 0xAABB)
            /* device */
        3. gpu workload touches VA, triggers gpu fault and notify the host.
            /* host */
        4. received gpu fault notification, then it will:
                4.1 unmap host pages and also takes care of cpu tlb
                      (use unmap_mapping_range with offset=0x800...)
                4.2 migrate sys page to device
                4.3 setup device page table and resolve device fault.
            /* device */
        5. gpu workload continued, it accessed va_a and got 0xAABB.
        6. gpu workload continued, it wrote 0xBBCC to va_a.
            /* host */
        7. userspace access va_a, as expected, it will:
                7.1 trigger cpu vm fault.
                7.2 driver handling fault to migrate gpu local page to host.
        8. userspace then could correctly get 0xBBCC from va_a
        9. done
    
    But in step 4.1, if we hit the bug this patch addresses, then userspace
    would never trigger a cpu fault and would still get the old value: 0xAABB.
    
    Making holebegin unsigned first fixes the bug.
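    
    The effect of the cast can be reproduced in userspace.  This is a
    sketch, not the kernel change itself: the function name is
    hypothetical and PAGE_SHIFT is assumed to be 12 (4 KiB pages).  Right
    shifting a negative signed value is implementation-defined in ISO C;
    an arithmetic shift copies the sign bit into the top bits, which gives
    the unexpected 0xfff8000000111111 quoted above.

```c
#include <stdint.h>

#define PAGE_SHIFT 12  /* assumption: 4 KiB pages */

/* Convert a byte offset to a page index.  Casting to an unsigned type
 * first makes the shift fill the top bits with zeros. */
uint64_t hole_begin_to_pgoff(int64_t holebegin)
{
    return (uint64_t)holebegin >> PAGE_SHIFT;
}
```

    For the offset 0x8000000111111000 this returns 0x8000000111111, the
    same page offset that the mmap path computed.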
    
    Link: https://lkml.kernel.org/r/20231220052839.26970-1-jiajun.xie.sh@gmail.com
    Signed-off-by: Jiajun Xie <jiajun.xie.sh@gmail.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mm: merge folio_has_private()/filemap_release_folio() call pairs [+ + +]
Author: David Howells <dhowells@redhat.com>
Date:   Wed Jun 28 11:48:51 2023 +0100

    mm: merge folio_has_private()/filemap_release_folio() call pairs
    
    [ Upstream commit 0201ebf274a306a6ebb95e5dc2d6a0a27c737cac ]
    
    Patch series "mm, netfs, fscache: Stop read optimisation when folio
    removed from pagecache", v7.
    
    This fixes an optimisation in fscache whereby we don't read from the cache
    for a particular file until we know that there's data there that we don't
    have in the pagecache.  The problem is that I'm no longer using PG_fscache
    (aka PG_private_2) to indicate that the page is cached and so I don't get
    a notification when a cached page is dropped from the pagecache.
    
    The first patch merges some folio_has_private() and
    filemap_release_folio() pairs and introduces a helper,
    folio_needs_release(), to indicate if a release is required.
    
    The second patch is the actual fix.  Following Willy's suggestions[1], it
    adds an AS_RELEASE_ALWAYS flag to an address_space that will make
    filemap_release_folio() always call ->release_folio(), even if
    PG_private/PG_private_2 aren't set.  folio_needs_release() is altered to
    add a check for this.
    
    This patch (of 2):
    
    Make filemap_release_folio() check folio_has_private().  Then, in most
    cases, where a call to folio_has_private() is immediately followed by a
    call to filemap_release_folio(), we can get rid of the test in the pair.
    
    There are a couple of sites in mm/vmscan.c where this can't so easily be
    done.  In shrink_folio_list(), there are actually three cases (something
    different is done for incompletely invalidated buffers), but
    filemap_release_folio() elides two of them.
    
    In shrink_active_list(), we don't have the folio lock yet, so the
    check allows us to avoid locking the page unnecessarily.
    
    A wrapper function to check if a folio needs release is provided for those
    places that still need to do it in the mm/ directory.  This will acquire
    additional parts to the condition in a future patch.
    
    After this, the only remaining caller of folio_has_private() outside of
    mm/ is a check in fuse.
    
    Link: https://lkml.kernel.org/r/20230628104852.3391651-1-dhowells@redhat.com
    Link: https://lkml.kernel.org/r/20230628104852.3391651-2-dhowells@redhat.com
    Reported-by: Rohith Surabattula <rohiths.msft@gmail.com>
    Suggested-by: Matthew Wilcox <willy@infradead.org>
    Signed-off-by: David Howells <dhowells@redhat.com>
    Cc: Matthew Wilcox <willy@infradead.org>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Cc: Steve French <sfrench@samba.org>
    Cc: Shyam Prasad N <nspmangalore@gmail.com>
    Cc: Rohith Surabattula <rohiths.msft@gmail.com>
    Cc: Dave Wysochanski <dwysocha@redhat.com>
    Cc: Dominique Martinet <asmadeus@codewreck.org>
    Cc: Ilya Dryomov <idryomov@gmail.com>
    Cc: "Theodore Ts'o" <tytso@mit.edu>
    Cc: Andreas Dilger <adilger.kernel@dilger.ca>
    Cc: Xiubo Li <xiubli@redhat.com>
    Cc: Jingbo Xu <jefflexu@linux.alibaba.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Stable-dep-of: 1898efcdbed3 ("block: update the stable_writes flag in bdev_add")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
mmc: core: Cancel delayed work before releasing host [+ + +]
Author: Geert Uytterhoeven <geert+renesas@glider.be>
Date:   Mon Dec 4 12:29:53 2023 +0100

    mmc: core: Cancel delayed work before releasing host
    
    commit 1036f69e251380573e256568cf814506e3fb9988 upstream.
    
    On RZ/Five SMARC EVK, where probing of SDHI is deferred due to probe
    deferral of the vqmmc-supply regulator:
    
        ------------[ cut here ]------------
        WARNING: CPU: 0 PID: 0 at kernel/time/timer.c:1738 __run_timers.part.0+0x1d0/0x1e8
        Modules linked in:
        CPU: 0 PID: 0 Comm: swapper Not tainted 6.7.0-rc4 #101
        Hardware name: Renesas SMARC EVK based on r9a07g043f01 (DT)
        epc : __run_timers.part.0+0x1d0/0x1e8
         ra : __run_timers.part.0+0x134/0x1e8
        epc : ffffffff800771a4 ra : ffffffff80077108 sp : ffffffc800003e60
         gp : ffffffff814f5028 tp : ffffffff8140c5c0 t0 : ffffffc800000000
         t1 : 0000000000000001 t2 : ffffffff81201300 s0 : ffffffc800003f20
         s1 : ffffffd8023bc4a0 a0 : 00000000fffee6b0 a1 : 0004010000400000
         a2 : ffffffffc0000016 a3 : ffffffff81488640 a4 : ffffffc800003e60
         a5 : 0000000000000000 a6 : 0000000004000000 a7 : ffffffc800003e68
         s2 : 0000000000000122 s3 : 0000000000200000 s4 : 0000000000000000
         s5 : ffffffffffffffff s6 : ffffffff81488678 s7 : ffffffff814886c0
         s8 : ffffffff814f49c0 s9 : ffffffff81488640 s10: 0000000000000000
         s11: ffffffc800003e60 t3 : 0000000000000240 t4 : 0000000000000a52
         t5 : ffffffd8024ae018 t6 : ffffffd8024ae038
        status: 0000000200000100 badaddr: 0000000000000000 cause: 0000000000000003
        [<ffffffff800771a4>] __run_timers.part.0+0x1d0/0x1e8
        [<ffffffff800771e0>] run_timer_softirq+0x24/0x4a
        [<ffffffff80809092>] __do_softirq+0xc6/0x1fa
        [<ffffffff80028e4c>] irq_exit_rcu+0x66/0x84
        [<ffffffff80800f7a>] handle_riscv_irq+0x40/0x4e
        [<ffffffff80808f48>] call_on_irq_stack+0x1c/0x28
        ---[ end trace 0000000000000000 ]---
    
    What happens?
    
        renesas_sdhi_probe()
        {
            tmio_mmc_host_alloc()
                mmc_alloc_host()
                    INIT_DELAYED_WORK(&host->detect, mmc_rescan);
    
            devm_request_irq(tmio_mmc_irq);
    
            /*
             * After this, the interrupt handler may be invoked at any time
             *
             *  tmio_mmc_irq()
             *  {
             *      __tmio_mmc_card_detect_irq()
             *          mmc_detect_change()
             *              _mmc_detect_change()
             *                  mmc_schedule_delayed_work(&host->detect, delay);
             *  }
             */
    
            tmio_mmc_host_probe()
                tmio_mmc_init_ocr()
                    -EPROBE_DEFER
    
            tmio_mmc_host_free()
                mmc_free_host()
        }
    
    When expire_timers() runs later, it warns because the MMC host structure
    containing the delayed work was freed, and now contains an invalid work
    function pointer.
    
    Fix this by cancelling any pending delayed work before releasing the
    MMC host structure.
    
    Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
    Tested-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
    Cc: stable@vger.kernel.org
    Link: https://lore.kernel.org/r/205dc4c91b47e31b64392fe2498c7a449e717b4b.1701689330.git.geert+renesas@glider.be
    Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mmc: meson-mx-sdhc: Fix initialization frozen issue [+ + +]
Author: Ziyang Huang <hzyitc@outlook.com>
Date:   Wed Oct 11 00:44:00 2023 +0800

    mmc: meson-mx-sdhc: Fix initialization frozen issue
    
    commit 8c124d998ea0c9022e247b11ac51f86ec8afa0e1 upstream.
    
    Commit 4bc31edebde5 ("mmc: core: Set HS clock speed before sending
    HS CMD13") set HS clock (52MHz) before switching to HS mode. For this
    freq, FCLK_DIV5 will be selected and div value is 10 (reg value is 9).
    Then we set rx_clk_phase to 11 or 15, which is out of range and makes the
    hardware freeze. After we send a command request, no irq will be
    raised and the mmc driver will keep waiting for the request to finish,
    even during rebooting.
    
    So let's set it to Phase 90, which should work in most cases. Then let
    meson_mx_sdhc_execute_tuning() find the accurate value for data
    transfer.
    
    If this doesn't work, we may need to define a factor in dts.
    
    Fixes: e4bf1b0970ef ("mmc: host: meson-mx-sdhc: new driver for the Amlogic Meson SDHC host")
    Signed-off-by: Ziyang Huang <hzyitc@outlook.com>
    Tested-by: Anand Moon <linux.amoon@gmail.com>
    Cc: stable@vger.kernel.org
    Link: https://lore.kernel.org/r/TYZPR01MB5556A3E71554A2EC08597EA4C9CDA@TYZPR01MB5556.apcprd01.prod.exchangelabs.com
    Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mmc: rpmb: fixes pause retune on all RPMB partitions. [+ + +]
Author: Jorge Ramirez-Ortiz <jorge@foundries.io>
Date:   Fri Dec 1 16:31:43 2023 +0100

    mmc: rpmb: fixes pause retune on all RPMB partitions.
    
    commit e7794c14fd73e5eb4a3e0ecaa5334d5a17377c50 upstream.
    
    When RPMB was converted to a character device, it added support for
    multiple RPMB partitions (commit 97548575bef3 ("mmc: block: Convert RPMB to
    a character device")).
    
    One of the changes in this commit was transforming the variable target_part
    defined in __mmc_blk_ioctl_cmd into a bitmask. This inadvertently regressed
    the validation check done in mmc_blk_part_switch_pre() and
    mmc_blk_part_switch_post(), so let's fix it.
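    
    How an equality check can regress once a value becomes a bitmask is
    sketched below.  Everything here is an assumption for illustration:
    the constant, its value and the function names are hypothetical and
    are not claimed to match the driver's definitions or the actual fix.

```c
#include <stdbool.h>

/* Hypothetical RPMB access code; the value is assumed for illustration. */
#define PART_ACC_RPMB 0x3u

/* Exact comparison: stops matching once extra bits are OR-ed into
 * part_type, e.g. to tell several RPMB partitions apart. */
bool is_rpmb_exact(unsigned int part_type)
{
    return part_type == PART_ACC_RPMB;
}

/* Masked comparison: only the access-code bits are examined, so extra
 * bits in part_type do not affect the result. */
bool is_rpmb_masked(unsigned int part_type)
{
    return (part_type & PART_ACC_RPMB) == PART_ACC_RPMB;
}
```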
    
    Fixes: 97548575bef3 ("mmc: block: Convert RPMB to a character device")
    Signed-off-by: Jorge Ramirez-Ortiz <jorge@foundries.io>
    Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
    Cc: <stable@vger.kernel.org>
    Link: https://lore.kernel.org/r/20231201153143.1449753-1-jorge@foundries.io
    Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mmc: sdhci-sprd: Fix eMMC init failure after hw reset [+ + +]
Author: Wenchao Chen <wenchao.chen@unisoc.com>
Date:   Mon Dec 4 14:49:34 2023 +0800

    mmc: sdhci-sprd: Fix eMMC init failure after hw reset
    
    commit 8abf77c88929b6d20fa4f9928b18d6448d64e293 upstream.
    
    Some eMMC devices that do not close the auto clk gate after hw reset will
    cause eMMC initialization to fail. Let's fix this.
    
    Signed-off-by: Wenchao Chen <wenchao.chen@unisoc.com>
    Fixes: ff874dbc4f86 ("mmc: sdhci-sprd: Disable CLK_AUTO when the clock is less than 400K")
    Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
    Cc: stable@vger.kernel.org
    Link: https://lore.kernel.org/r/20231204064934.21236-1-wenchao.chen@unisoc.com
    Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
mptcp: prevent tcp diag from closing listener subflows [+ + +]
Author: Paolo Abeni <pabeni@redhat.com>
Date:   Tue Dec 26 13:10:18 2023 +0100

    mptcp: prevent tcp diag from closing listener subflows
    
    commit 4c0288299fd09ee7c6fbe2f57421f314d8c981db upstream.
    
    The MPTCP protocol does not expect that any other entity could change
    the first subflow status when such socket is listening.
    Unfortunately the TCP diag interface allows aborting any TCP socket,
    including MPTCP listener subflows. As reported by syzbot, that triggers
    a WARN() and could lead to bigger trouble later.
    
    The MPTCP protocol needs to do some MPTCP-level cleanup actions to
    properly shut down the listener. To keep the fix simple, prevent the
    diag interface entirely from stopping such listeners.
    
    We could refine the diag callback in a later, larger patch targeting
    net-next.
    
    Fixes: 57fc0f1ceaa4 ("mptcp: ensure listener is unhashed before updating the sk status")
    Cc: stable@vger.kernel.org
    Reported-by: <syzbot+5a01c3a666e726bc8752@syzkaller.appspotmail.com>
    Closes: https://lore.kernel.org/netdev/0000000000004f4579060c68431b@google.com/
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Reviewed-by: Mat Martineau <martineau@kernel.org>
    Signed-off-by: Matthieu Baerts <matttbe@kernel.org>
    Link: https://lore.kernel.org/r/20231226-upstream-net-20231226-mptcp-prevent-warn-v1-2-1404dcc431ea@kernel.org
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
net-timestamp: extend SOF_TIMESTAMPING_OPT_ID to HW timestamps [+ + +]
Author: Vadim Fedorenko <vadfed@meta.com>
Date:   Mon Mar 6 08:07:38 2023 -0800

    net-timestamp: extend SOF_TIMESTAMPING_OPT_ID to HW timestamps
    
    [ Upstream commit 8ca5a5790b9a1ce147484d2a2c4e66d2553f3d6c ]
    
    When the feature was added it was enabled for SW timestamps only but
    with current hardware the same out-of-order timestamps can be seen.
    Let's expand the area for the feature to all types of timestamps.
    
    Signed-off-by: Vadim Fedorenko <vadfed@meta.com>
    Reviewed-by: Willem de Bruijn <willemb@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Stable-dep-of: 7f6ca95d16b9 ("net: Implement missing getsockopt(SO_TIMESTAMPING_NEW)")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
net/mlx5: Increase size of irq name buffer [+ + +]
Author: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Date:   Tue Nov 14 13:58:43 2023 -0800

    net/mlx5: Increase size of irq name buffer
    
    [ Upstream commit 3338bebfc26a1e2cebbba82a1cf12c0159608e73 ]
    
    Without the increased buffer size, building with W=1 triggers
    -Wformat-truncation for the snprintf operation writing to the buffer.
    
        drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c: In function 'mlx5_irq_alloc':
        drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c:296:7: error: '@pci:' directive output may be truncated writing 5 bytes into a region of size between 1 and 32 [-Werror=format-truncation=]
          296 |    "%s@pci:%s", name, pci_name(dev->pdev));
              |       ^~~~~
        drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c:295:2: note: 'snprintf' output 6 or more bytes (assuming 37) into a destination of size 32
          295 |  snprintf(irq->name, MLX5_MAX_IRQ_NAME,
              |  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          296 |    "%s@pci:%s", name, pci_name(dev->pdev));
              |    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    
    Fixes: ada9f5d00797 ("IB/mlx5: Fix eq names to display nicely in /proc/interrupts")
    Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=6d4ab2e97dcfbcd748ae71761a9d8e5e41cc732c
    Signed-off-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
    Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
    Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    Link: https://lore.kernel.org/r/20231114215846.5902-13-saeed@kernel.org
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
net/qla3xxx: fix potential memleak in ql_alloc_buffer_queues [+ + +]
Author: Dinghao Liu <dinghao.liu@zju.edu.cn>
Date:   Wed Dec 27 15:02:27 2023 +0800

    net/qla3xxx: fix potential memleak in ql_alloc_buffer_queues
    
    [ Upstream commit 89f45c30172c80e55c887f32f1af8e184124577b ]
    
    When dma_alloc_coherent() fails, we should free qdev->lrg_buf
    to prevent potential memleak.
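    
    The error-path pattern can be sketched in userspace as follows. This is
    only a model: struct buf_queues, alloc_buffer_queues() and the fail_second
    switch are hypothetical, and malloc()/calloc() stand in for the driver's
    allocators, including dma_alloc_coherent().

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the driver's queue state. */
struct buf_queues {
    void *lrg_buf;      /* allocated first */
    void *lrg_buf_q;    /* allocated second; this allocation may fail */
};

/*
 * Allocate both buffers or neither. fail_second simulates the second
 * allocation failing, as dma_alloc_coherent() can in the driver.
 */
static int alloc_buffer_queues(struct buf_queues *q, size_t n, int fail_second)
{
    assert(q != NULL && n > 0);

    q->lrg_buf = calloc(n, sizeof(void *));
    if (!q->lrg_buf)
        return -1;

    q->lrg_buf_q = fail_second ? NULL : malloc(n);
    if (!q->lrg_buf_q) {
        /* Undo the first allocation so the error path leaks nothing. */
        free(q->lrg_buf);
        q->lrg_buf = NULL;
        return -1;
    }
    return 0;
}
```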
    
    Fixes: 1357bfcf7106 ("qla3xxx: Dynamically size the rx buffer queue based on the MTU.")
    Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn>
    Link: https://lore.kernel.org/r/20231227070227.10527-1-dinghao.liu@zju.edu.cn
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
net/sched: act_ct: additional checks for outdated flows [+ + +]
Author: Vlad Buslov <vladbu@nvidia.com>
Date:   Tue Oct 24 21:58:57 2023 +0200

    net/sched: act_ct: additional checks for outdated flows
    
    commit a63b6622120cd03a304796dbccb80655b3a21798 upstream.
    
    The current nf_flow_is_outdated() implementation considers any flow table
    flow whose state has diverged from its underlying CT connection status for
    teardown, which can be problematic in the following cases:
    
    - Flow has never been offloaded to hardware in the first place either
    because flow table has hardware offload disabled (flag
    NF_FLOWTABLE_HW_OFFLOAD is not set) or because it is still pending on 'add'
    workqueue to be offloaded for the first time. The former is incorrect, the
    latter generates excessive deletions and additions of flows.
    
    - Flow is already pending to be updated on the workqueue. Tearing down such
    flows will also generate excessive removals from the flow table, especially
    on highly loaded system where the latency to re-offload a flow via 'add'
    workqueue can be quite high.
    
    When considering a flow for teardown as outdated verify that it is both
    offloaded to hardware and doesn't have any pending updates.
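    
    The resulting condition can be modelled as a simple predicate. This is a
    sketch only: struct flow_model and its fields are hypothetical and merely
    mirror the three conditions described above, not the kernel's actual flag
    bits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the flow state referred to in the text. */
struct flow_model {
    bool ct_diverged;   /* flow state differs from the CT connection status */
    bool hw_offloaded;  /* flow has actually been offloaded to hardware */
    bool hw_pending;    /* an update is still queued on the workqueue */
};

/* A flow is torn down as outdated only if all three conditions hold. */
static bool flow_is_outdated(const struct flow_model *flow)
{
    assert(flow != NULL);
    return flow->ct_diverged && flow->hw_offloaded && !flow->hw_pending;
}
```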
    
    Fixes: 41f2c7c342d3 ("net/sched: act_ct: Fix promotion of offloaded unreplied tuple")
    Reviewed-by: Paul Blakey <paulb@nvidia.com>
    Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

net/sched: act_ct: Always fill offloading tuple iifidx [+ + +]
Author: Vlad Buslov <vladbu@nvidia.com>
Date:   Fri Nov 3 16:14:10 2023 +0100

    net/sched: act_ct: Always fill offloading tuple iifidx
    
    commit 9bc64bd0cd765f696fcd40fc98909b1f7c73b2ba upstream.
    
    Referenced commit doesn't always set iifidx when offloading the flow to
    hardware. Fix the following cases:
    
    - nf_conn_act_ct_ext_fill() is called before extension is created with
    nf_conn_act_ct_ext_add() in tcf_ct_act(). This can cause rule offload with
    unspecified iifidx when connection is offloaded after only single
    original-direction packet has been processed by tc data path. Always fill
    the new nf_conn_act_ct_ext instance after creating it in
    nf_conn_act_ct_ext_add().
    
    - Offloading of unidirectional UDP NEW connections is now supported, but ct
    flow iifidx field is not updated when connection is promoted to
    bidirectional, which can result in the reply-direction iifidx being zero
    when refreshing the connection. Fill in the extension and update flow iifidx
    before calling flow_offload_refresh().
    
    Fixes: 9795ded7f924 ("net/sched: act_ct: Fill offloading tuple iifidx")
    Reviewed-by: Paul Blakey <paulb@nvidia.com>
    Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Fixes: 6a9bad0069cf ("net/sched: act_ct: offload UDP NEW connections")
    Link: https://lore.kernel.org/r/20231103151410.764271-1-vladbu@nvidia.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

net/sched: act_ct: Fix promotion of offloaded unreplied tuple [+ + +]
Author: Paul Blakey <paulb@nvidia.com>
Date:   Fri Jun 9 15:22:59 2023 +0300

    net/sched: act_ct: Fix promotion of offloaded unreplied tuple
    
    [ Upstream commit 41f2c7c342d3adb1c4dd5f2e3dd831adff16a669 ]
    
    Currently UNREPLIED and UNASSURED connections are added to the nf flow
    table. This causes the following connection packets to be processed
    by the flow table, which then skips conntrack_in(), and thus such
    connections will remain UNREPLIED and UNASSURED even if reply traffic
    is then seen. Even still, the unoffloaded reply packets are the ones
    triggering hardware update from new to established state, and if
    there aren't any to trigger an update and/or a previous update was
    missed, hardware can get out of sync with sw and still mark
    packets as new.
    
    Fix the above by:
    1) Not skipping conntrack_in() for UNASSURED packets, but still
       refresh for hardware, as before the cited patch.
    2) Try and force a refresh by reply-direction packets that update
       the hardware rules from new to established state.
    3) Remove any bidirectional flows that failed to update in
       hardware for re-insertion as bidirectional once any new packet
       arrives.
    
    Fixes: 6a9bad0069cf ("net/sched: act_ct: offload UDP NEW connections")
    Co-developed-by: Vlad Buslov <vladbu@nvidia.com>
    Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
    Signed-off-by: Paul Blakey <paulb@nvidia.com>
    Reviewed-by: Florian Westphal <fw@strlen.de>
    Link: https://lore.kernel.org/r/1686313379-117663-1-git-send-email-paulb@nvidia.com
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Stable-dep-of: 125f1c7f26ff ("net/sched: act_ct: Take per-cb reference to tcf_ct_flow_table")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net/sched: act_ct: offload UDP NEW connections [+ + +]
Author: Vlad Buslov <vladbu@nvidia.com>
Date:   Wed Feb 1 17:30:59 2023 +0100

    net/sched: act_ct: offload UDP NEW connections
    
    [ Upstream commit 6a9bad0069cf306f3df6ac53cf02438d4e15f296 ]
    
    Modify the offload algorithm of UDP connections to the following:
    
    - Offload NEW connection as unidirectional.
    
    - When connection state changes to ESTABLISHED also update the hardware
    flow. However, in order to prevent act_ct from spamming offload add wq for
    every packet coming in reply direction in this state verify whether
    connection has already been updated to ESTABLISHED in the drivers. If that
    is the case, then skip flow_table and let conntrack handle such packets
    which will also allow conntrack to potentially promote the connection to
    ASSURED.
    
    - When connection state changes to ASSURED set the flow_table flow
    NF_FLOW_HW_BIDIRECTIONAL flag which will cause refresh mechanism to offload
    the reply direction.
    
    All other protocols have their offload algorithm preserved and are always
    offloaded as bidirectional.
    
    Note that this change tries to minimize the load on flow_table add
    workqueue. First, it tracks the last ctinfo that was offloaded by using new
    flow 'NF_FLOW_HW_ESTABLISHED' flag and doesn't schedule the refresh for
    reply direction packets when the offloads have already been updated with
    current ctinfo. Second, when the 'add' task executes on the workqueue it
    always updates the offload with the current flow state (by checking the
    'bidirectional' flow flag and obtaining actual ctinfo/cookie through meta
    action instead of caching any of these from the moment of scheduling the
    'add' work), removing the need to schedule more updates if the state
    changed concurrently while the 'add' work was pending on the workqueue.
    
    Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Stable-dep-of: 125f1c7f26ff ("net/sched: act_ct: Take per-cb reference to tcf_ct_flow_table")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net/sched: act_ct: Take per-cb reference to tcf_ct_flow_table [+ + +]
Author: Vlad Buslov <vladbu@nvidia.com>
Date:   Tue Dec 5 18:25:54 2023 +0100

    net/sched: act_ct: Take per-cb reference to tcf_ct_flow_table
    
    [ Upstream commit 125f1c7f26ffcdbf96177abe75b70c1a6ceb17bc ]
    
    The referenced change added custom cleanup code to act_ct to delete any
    callbacks registered on the parent block when deleting the
    tcf_ct_flow_table instance. However, the underlying issue is that the
    drivers don't obtain the reference to the tcf_ct_flow_table instance when
    registering callbacks which means that not only driver callbacks may still
    be on the table when deleting it but also that the driver can still have
    pointers to its internal nf_flowtable and can use it concurrently which
    results in either a warning in netfilter[0] or a use-after-free.
    
    Fix the issue by taking a reference to the underlying struct
    tcf_ct_flow_table instance when registering the callback and release the
    reference when unregistering. Expose new API required for such reference
    counting by adding two new callbacks to nf_flowtable_type and implementing
    them for act_ct flowtable_ct type. This fixes the issue by extending the
    lifetime of nf_flowtable until all users have unregistered.
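    
    The lifetime rule can be sketched with a plain counter. This is a
    single-threaded model under stated assumptions: struct flow_table and the
    four helpers are hypothetical names, and kernel code would use an atomic
    reference count and free real resources where the model only sets a flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a refcounted flow table; not the kernel structs. */
struct flow_table {
    int refcnt;     /* one reference per owner or registered callback */
    bool freed;     /* set when the last reference is dropped */
};

static void table_get(struct flow_table *ft)
{
    assert(ft != NULL && !ft->freed);
    ft->refcnt++;
}

static void table_put(struct flow_table *ft)
{
    assert(ft != NULL && ft->refcnt > 0);
    if (--ft->refcnt == 0)
        ft->freed = true;   /* real code would release resources here */
}

/* Registering a callback pins the table; unregistering unpins it. */
static void register_cb(struct flow_table *ft)   { table_get(ft); }
static void unregister_cb(struct flow_table *ft) { table_put(ft); }
```

    In this model the table outlives its owner for as long as a callback is
    still registered, which is the property the fix relies on.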
    
    [0]:
    [106170.938634] ------------[ cut here ]------------
    [106170.939111] WARNING: CPU: 21 PID: 3688 at include/net/netfilter/nf_flow_table.h:262 mlx5_tc_ct_del_ft_cb+0x267/0x2b0 [mlx5_core]
    [106170.940108] Modules linked in: act_ct nf_flow_table act_mirred act_skbedit act_tunnel_key vxlan cls_matchall nfnetlink_cttimeout act_gact cls_flower sch_ingress mlx5_vdpa vringh vhost_iotlb vdpa bonding openvswitch nsh rpcrdma rdma_ucm ib_iser libiscsi scsi_transport_iscsi ib_umad rdma_cm ib_ipoib iw_cm ib_cm mlx5_ib ib_uverbs ib_core xt_MASQUERADE nf_conntrack_netlink nfnetlink iptable_nat xt_addrtype xt_conntrack nf_nat br_netfilter rpcsec_gss_krb5 auth_rpcgss oid_registry overlay mlx5_core
    [106170.943496] CPU: 21 PID: 3688 Comm: kworker/u48:0 Not tainted 6.6.0-rc7_for_upstream_min_debug_2023_11_01_13_02 #1
    [106170.944361] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
    [106170.945292] Workqueue: mlx5e mlx5e_rep_neigh_update [mlx5_core]
    [106170.945846] RIP: 0010:mlx5_tc_ct_del_ft_cb+0x267/0x2b0 [mlx5_core]
    [106170.946413] Code: 89 ef 48 83 05 71 a4 14 00 01 e8 f4 06 04 e1 48 83 05 6c a4 14 00 01 48 83 c4 28 5b 5d 41 5c 41 5d c3 48 83 05 d1 8b 14 00 01 <0f> 0b 48 83 05 d7 8b 14 00 01 e9 96 fe ff ff 48 83 05 a2 90 14 00
    [106170.947924] RSP: 0018:ffff88813ff0fcb8 EFLAGS: 00010202
    [106170.948397] RAX: 0000000000000000 RBX: ffff88811eabac40 RCX: ffff88811eabad48
    [106170.949040] RDX: ffff88811eab8000 RSI: ffffffffa02cd560 RDI: 0000000000000000
    [106170.949679] RBP: ffff88811eab8000 R08: 0000000000000001 R09: ffffffffa0229700
    [106170.950317] R10: ffff888103538fc0 R11: 0000000000000001 R12: ffff88811eabad58
    [106170.950969] R13: ffff888110c01c00 R14: ffff888106b40000 R15: 0000000000000000
    [106170.951616] FS:  0000000000000000(0000) GS:ffff88885fd40000(0000) knlGS:0000000000000000
    [106170.952329] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [106170.952834] CR2: 00007f1cefd28cb0 CR3: 000000012181b006 CR4: 0000000000370ea0
    [106170.953482] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    [106170.954121] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    [106170.954766] Call Trace:
    [106170.955057]  <TASK>
    [106170.955315]  ? __warn+0x79/0x120
    [106170.955648]  ? mlx5_tc_ct_del_ft_cb+0x267/0x2b0 [mlx5_core]
    [106170.956172]  ? report_bug+0x17c/0x190
    [106170.956537]  ? handle_bug+0x3c/0x60
    [106170.956891]  ? exc_invalid_op+0x14/0x70
    [106170.957264]  ? asm_exc_invalid_op+0x16/0x20
    [106170.957666]  ? mlx5_del_flow_rules+0x10/0x310 [mlx5_core]
    [106170.958172]  ? mlx5_tc_ct_block_flow_offload_add+0x1240/0x1240 [mlx5_core]
    [106170.958788]  ? mlx5_tc_ct_del_ft_cb+0x267/0x2b0 [mlx5_core]
    [106170.959339]  ? mlx5_tc_ct_del_ft_cb+0xc6/0x2b0 [mlx5_core]
    [106170.959854]  ? mapping_remove+0x154/0x1d0 [mlx5_core]
    [106170.960342]  ? mlx5e_tc_action_miss_mapping_put+0x4f/0x80 [mlx5_core]
    [106170.960927]  mlx5_tc_ct_delete_flow+0x76/0xc0 [mlx5_core]
    [106170.961441]  mlx5_free_flow_attr_actions+0x13b/0x220 [mlx5_core]
    [106170.962001]  mlx5e_tc_del_fdb_flow+0x22c/0x3b0 [mlx5_core]
    [106170.962524]  mlx5e_tc_del_flow+0x95/0x3c0 [mlx5_core]
    [106170.963034]  mlx5e_flow_put+0x73/0xe0 [mlx5_core]
    [106170.963506]  mlx5e_put_flow_list+0x38/0x70 [mlx5_core]
    [106170.964002]  mlx5e_rep_update_flows+0xec/0x290 [mlx5_core]
    [106170.964525]  mlx5e_rep_neigh_update+0x1da/0x310 [mlx5_core]
    [106170.965056]  process_one_work+0x13a/0x2c0
    [106170.965443]  worker_thread+0x2e5/0x3f0
    [106170.965808]  ? rescuer_thread+0x410/0x410
    [106170.966192]  kthread+0xc6/0xf0
    [106170.966515]  ? kthread_complete_and_exit+0x20/0x20
    [106170.966970]  ret_from_fork+0x2d/0x50
    [106170.967332]  ? kthread_complete_and_exit+0x20/0x20
    [106170.967774]  ret_from_fork_asm+0x11/0x20
    [106170.970466]  </TASK>
    [106170.970726] ---[ end trace 0000000000000000 ]---
    
    Fixes: 77ac5e40c44e ("net/sched: act_ct: remove and free nf_table callbacks")
    Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
    Reviewed-by: Paul Blakey <paulb@nvidia.com>
    Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
net/smc: fix invalid link access in dumping SMC-R connections [+ + +]
Author: Wen Gu <guwen@linux.alibaba.com>
Date:   Wed Dec 27 15:40:35 2023 +0800

    net/smc: fix invalid link access in dumping SMC-R connections
    
    [ Upstream commit 9dbe086c69b8902c85cece394760ac212e9e4ccc ]
    
    A crash was found when dumping SMC-R connections. It can be reproduced
    by the following steps:
    
    - environment: two RNICs on both sides.
    - run SMC-R between two sides, now a SMC_LGR_SYMMETRIC type link group
      will be created.
    - set the first RNIC down on either side and link group will turn to
      SMC_LGR_ASYMMETRIC_LOCAL then.
    - run 'smcss -R' and the crash will be triggered.
    
     BUG: kernel NULL pointer dereference, address: 0000000000000010
     #PF: supervisor read access in kernel mode
     #PF: error_code(0x0000) - not-present page
     PGD 8000000101fdd067 P4D 8000000101fdd067 PUD 10ce46067 PMD 0
     Oops: 0000 [#1] PREEMPT SMP PTI
     CPU: 3 PID: 1810 Comm: smcss Kdump: loaded Tainted: G W   E      6.7.0-rc6+ #51
     RIP: 0010:__smc_diag_dump.constprop.0+0x36e/0x620 [smc_diag]
     Call Trace:
      <TASK>
      ? __die+0x24/0x70
      ? page_fault_oops+0x66/0x150
      ? exc_page_fault+0x69/0x140
      ? asm_exc_page_fault+0x26/0x30
      ? __smc_diag_dump.constprop.0+0x36e/0x620 [smc_diag]
      smc_diag_dump_proto+0xd0/0xf0 [smc_diag]
      smc_diag_dump+0x26/0x60 [smc_diag]
      netlink_dump+0x19f/0x320
      __netlink_dump_start+0x1dc/0x300
      smc_diag_handler_dump+0x6a/0x80 [smc_diag]
      ? __pfx_smc_diag_dump+0x10/0x10 [smc_diag]
      sock_diag_rcv_msg+0x121/0x140
      ? __pfx_sock_diag_rcv_msg+0x10/0x10
      netlink_rcv_skb+0x5a/0x110
      sock_diag_rcv+0x28/0x40
      netlink_unicast+0x22a/0x330
      netlink_sendmsg+0x240/0x4a0
      __sock_sendmsg+0xb0/0xc0
      ____sys_sendmsg+0x24e/0x300
      ? copy_msghdr_from_user+0x62/0x80
      ___sys_sendmsg+0x7c/0xd0
      ? __do_fault+0x34/0x1a0
      ? do_read_fault+0x5f/0x100
      ? do_fault+0xb0/0x110
      __sys_sendmsg+0x4d/0x80
      do_syscall_64+0x45/0xf0
      entry_SYSCALL_64_after_hwframe+0x6e/0x76
    
    When the first RNIC is set down, the lgr->lnk[0] will be cleared and an
    asymmetric link will be allocated in lgr->lnk[SMC_LINKS_PER_LGR_MAX - 1]
    by smc_llc_alloc_alt_link(). Then when we try to dump SMC-R connections
    in __smc_diag_dump(), the invalid lgr->lnk[0] will be accessed, resulting
    in this issue. So fix it by accessing the right link.
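    
    One way to avoid relying on slot 0 is to look up a link that is known to
    be usable. The sketch below illustrates that idea only; it is not the
    patch itself, which may select the link differently, and LINKS_PER_GROUP,
    struct link and first_usable_link() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define LINKS_PER_GROUP 3   /* hypothetical; stands in for SMC_LINKS_PER_LGR_MAX */

struct link {
    int usable;     /* non-zero if the slot holds a live link */
    int id;
};

struct link_group {
    struct link lnk[LINKS_PER_GROUP];
};

/* Return the first usable link instead of assuming slot 0 is valid. */
static const struct link *first_usable_link(const struct link_group *lgr)
{
    assert(lgr != NULL);
    for (size_t i = 0; i < LINKS_PER_GROUP; i++)
        if (lgr->lnk[i].usable)
            return &lgr->lnk[i];
    return NULL;    /* caller must handle a group with no live link */
}
```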
    
    Fixes: f16a7dd5cf27 ("smc: netlink interface for SMC sockets")
    Reported-by: henaumars <henaumars@sina.com>
    Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=7616
    Signed-off-by: Wen Gu <guwen@linux.alibaba.com>
    Reviewed-by: Tony Lu <tonylu@linux.alibaba.com>
    Link: https://lore.kernel.org/r/1703662835-53416-1-git-send-email-guwen@linux.alibaba.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
net: annotate data-races around sk->sk_bind_phc [+ + +]
Author: Eric Dumazet <edumazet@google.com>
Date:   Thu Aug 31 13:52:12 2023 +0000

    net: annotate data-races around sk->sk_bind_phc
    
    [ Upstream commit 251cd405a9e6e70b92fe5afbdd17fd5caf9d3266 ]
    
    sk->sk_bind_phc is read locklessly. Add corresponding annotations.
    
    Fixes: d463126e23f1 ("net: sock: extend SO_TIMESTAMPING for PHC binding")
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Cc: Yangbo Lu <yangbo.lu@nxp.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Stable-dep-of: 7f6ca95d16b9 ("net: Implement missing getsockopt(SO_TIMESTAMPING_NEW)")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: annotate data-races around sk->sk_tsflags [+ + +]
Author: Eric Dumazet <edumazet@google.com>
Date:   Thu Aug 31 13:52:11 2023 +0000

    net: annotate data-races around sk->sk_tsflags
    
    [ Upstream commit e3390b30a5dfb112e8e802a59c0f68f947b638b2 ]
    
    sk->sk_tsflags can be read locklessly, add corresponding annotations.
    
    Fixes: b9f40e21ef42 ("net-timestamp: move timestamp flags out of sk_flags")
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Cc: Willem de Bruijn <willemb@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Stable-dep-of: 7f6ca95d16b9 ("net: Implement missing getsockopt(SO_TIMESTAMPING_NEW)")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: bcmgenet: Fix FCS generation for fragmented skbuffs [+ + +]
Author: Adrian Cinal <adriancinal@gmail.com>
Date:   Thu Dec 28 14:56:38 2023 +0100

    net: bcmgenet: Fix FCS generation for fragmented skbuffs
    
    [ Upstream commit e584f2ff1e6cc9b1d99e8a6b0f3415940d1b3eb3 ]
    
    The flag DMA_TX_APPEND_CRC was only written to the first DMA descriptor
    in the TX path, where each descriptor corresponds to a single skbuff
    fragment (or the skbuff head). This led to packets with no FCS appearing
    on the wire if the kernel allocated the packet in fragments, which would
    always happen when using PACKET_MMAP/TPACKET (cf. tpacket_fill_skb() in
    net/af_packet.c).
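    
    The idea of the fix, requesting the CRC on every descriptor rather than
    only the first, can be sketched as below. The DESC_* bit values and
    fill_tx_descs() are hypothetical and do not reflect the real bcmgenet
    register layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DESC_SOP        (1u << 0)   /* hypothetical: start of packet */
#define DESC_EOP        (1u << 1)   /* hypothetical: end of packet */
#define DESC_APPEND_CRC (1u << 2)   /* hypothetical: hardware appends the FCS */

/* Fill one status word per fragment; every descriptor requests the CRC. */
static void fill_tx_descs(uint32_t *status, size_t nfrags)
{
    assert(status != NULL && nfrags > 0);
    for (size_t i = 0; i < nfrags; i++) {
        status[i] = DESC_APPEND_CRC;    /* set on all, not only on i == 0 */
        if (i == 0)
            status[i] |= DESC_SOP;
        if (i == nfrags - 1)
            status[i] |= DESC_EOP;
    }
}
```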
    
    Fixes: 1c1008c793fa ("net: bcmgenet: add main driver file")
    Signed-off-by: Adrian Cinal <adriancinal1@gmail.com>
    Acked-by: Doug Berger <opendmb@gmail.com>
    Acked-by: Florian Fainelli <florian.fainelli@broadcom.com>
    Link: https://lore.kernel.org/r/20231228135638.1339245-1-adriancinal1@gmail.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: Declare MSG_SPLICE_PAGES internal sendmsg() flag [+ + +]
Author: David Howells <dhowells@redhat.com>
Date:   Mon May 22 13:11:10 2023 +0100

    net: Declare MSG_SPLICE_PAGES internal sendmsg() flag
    
    [ Upstream commit b841b901c452d92610f739a36e54978453528876 ]
    
    Declare MSG_SPLICE_PAGES, an internal sendmsg() flag, that hints to a
    network protocol that it should splice pages from the source iterator
    rather than copying the data if it can.  This flag is added to a list that
    is cleared by sendmsg syscalls on entry.
    
    This is intended as a replacement for the ->sendpage() op, allowing a way
    to splice in several multipage folios in one go.
    
    Signed-off-by: David Howells <dhowells@redhat.com>
    Reviewed-by: Willem de Bruijn <willemb@google.com>
    cc: Jens Axboe <axboe@kernel.dk>
    cc: Matthew Wilcox <willy@infradead.org>
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Stable-dep-of: a0002127cd74 ("udp: move udp->no_check6_tx to udp->udp_flags")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: dpaa2-eth: rearrange variable in dpaa2_eth_get_ethtool_stats [+ + +]
Author: Ioana Ciornei <ioana.ciornei@nxp.com>
Date:   Tue Oct 18 17:18:51 2022 +0300

    net: dpaa2-eth: rearrange variable in dpaa2_eth_get_ethtool_stats
    
    [ Upstream commit 3313206827678f6f036eca601a51f6c4524b559a ]
    
    Rearrange the variables in the dpaa2_eth_get_ethtool_stats() function so
    that we adhere to the reverse Christmas tree rule.
    Also, in the next patch we are adding more variables and I didn't know
    where to place them with the current ordering.
    
    Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Stable-dep-of: beb1930f966d ("dpaa2-eth: recycle the RX buffer only after all processing done")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: Implement missing getsockopt(SO_TIMESTAMPING_NEW) [+ + +]
Author: Jörn-Thorben Hinz <jthinz@mailbox.tu-berlin.de>
Date:   Fri Dec 22 00:19:01 2023 +0100

    net: Implement missing getsockopt(SO_TIMESTAMPING_NEW)
    
    [ Upstream commit 7f6ca95d16b96567ce4cf458a2790ff17fa620c3 ]
    
    Commit 9718475e6908 ("socket: Add SO_TIMESTAMPING_NEW") added the new
    socket option SO_TIMESTAMPING_NEW. Setting the option is handled in
    sk_setsockopt(), querying it was not handled in sk_getsockopt(), though.
    
    Following remarks on an earlier submission of this patch, keep the old
    behavior of getsockopt(SO_TIMESTAMPING_OLD) which returns the active
    flags even if they actually have been set through SO_TIMESTAMPING_NEW.
    
    The new getsockopt(SO_TIMESTAMPING_NEW) is stricter, returning flags
    only if they have been set through the same option.
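    
    The two behaviours described above can be modelled in a few lines. This is
    a sketch of the semantics only: enum ts_opt, struct sock_model, ts_set()
    and ts_get() are hypothetical and do not correspond to kernel symbols.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical option codes and socket model, for illustration only. */
enum ts_opt { TS_OPT_OLD, TS_OPT_NEW };

struct sock_model {
    unsigned int tsflags;   /* active timestamping flags */
    bool set_via_new;       /* true if the flags were set through NEW */
};

static void ts_set(struct sock_model *sk, enum ts_opt opt, unsigned int flags)
{
    assert(sk != NULL);
    sk->tsflags = flags;
    sk->set_via_new = (opt == TS_OPT_NEW);
}

/* OLD always reports the active flags; NEW only if they were set via NEW. */
static unsigned int ts_get(const struct sock_model *sk, enum ts_opt opt)
{
    assert(sk != NULL);
    if (opt == TS_OPT_OLD || sk->set_via_new)
        return sk->tsflags;
    return 0;
}
```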
    
    Fixes: 9718475e6908 ("socket: Add SO_TIMESTAMPING_NEW")
    Link: https://lore.kernel.org/lkml/20230703175048.151683-1-jthinz@mailbox.tu-berlin.de/
    Link: https://lore.kernel.org/netdev/0d7cddc9-03fa-43db-a579-14f3e822615b@app.fastmail.com/
    Signed-off-by: Jörn-Thorben Hinz <jthinz@mailbox.tu-berlin.de>
    Reviewed-by: Willem de Bruijn <willemb@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: Implement missing SO_TIMESTAMPING_NEW cmsg support [+ + +]
Author: Thomas Lange <thomas@corelatus.se>
Date:   Thu Jan 4 09:57:44 2024 +0100

    net: Implement missing SO_TIMESTAMPING_NEW cmsg support
    
    [ Upstream commit 382a32018b74f407008615e0e831d05ed28e81cd ]
    
    Commit 9718475e6908 ("socket: Add SO_TIMESTAMPING_NEW") added the new
    socket option SO_TIMESTAMPING_NEW. However, it was never implemented in
    __sock_cmsg_send thus breaking SO_TIMESTAMPING cmsg for platforms using
    SO_TIMESTAMPING_NEW.
    
    Fixes: 9718475e6908 ("socket: Add SO_TIMESTAMPING_NEW")
    Link: https://lore.kernel.org/netdev/6a7281bf-bc4a-4f75-bb88-7011908ae471@app.fastmail.com/
    Signed-off-by: Thomas Lange <thomas@corelatus.se>
    Reviewed-by: Willem de Bruijn <willemb@google.com>
    Link: https://lore.kernel.org/r/20240104085744.49164-1-thomas@corelatus.se
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: ravb: Wait for operating mode to be applied [+ + +]
Author: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com>
Date:   Wed Jan 3 10:13:53 2024 +0200

    net: ravb: Wait for operating mode to be applied
    
    [ Upstream commit 9039cd4c61635b2d541009a7cd5e2cc052402f28 ]
    
    CSR.OPS bits specify the current operating mode and (according to
    documentation) they are updated by HW when the operating mode change
    request is processed. To comply with this check CSR.OPS before proceeding.
    
    This commit introduces ravb_set_opmode(), which does everything needed to
    set the operating mode (set CCC.OPC (and CCC.GAC, CCC.CSEL, if any) and
    wait for CSR.OPS), and calls it where needed. This should comply with all
    the HW manuals' requirements, as different manual variants specify that
    different modes need to be checked in CSR.OPS when setting CCC.OPC.
    
    If gPTP active in config mode is supported and it needs to be enabled, the
    CCC.GAC and CCC.CSEL need to be configured along with CCC.OPC in the same
    write access. For this, ravb_set_opmode() allows passing GAC and CSEL as
    part of opmode and the function updates the CCC register accordingly.
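    
    The request-then-poll pattern can be sketched against a simulated device.
    Everything here is an assumption made for illustration: struct sim_dev,
    OPS_MASK, read_csr() and set_opmode() are hypothetical, and the model
    treats the status value as equal to the requested mode, which simplifies
    the real CSR.OPS encoding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define OPS_MASK 0x0fu  /* hypothetical mask for the status field */

/* Simulated device: the status register follows the request after a delay. */
struct sim_dev {
    uint32_t ccc;       /* requested mode (control register) */
    uint32_t csr;       /* current mode (status register) */
    int latency;        /* reads remaining before the request is applied */
};

static uint32_t read_csr(struct sim_dev *dev)
{
    if (dev->latency > 0 && --dev->latency == 0)
        dev->csr = dev->ccc;
    return dev->csr;
}

/* Request a mode, then poll the status until it matches or polls run out. */
static int set_opmode(struct sim_dev *dev, uint32_t mode, int max_polls)
{
    assert(dev != NULL);
    dev->ccc = mode;
    for (int i = 0; i < max_polls; i++)
        if ((read_csr(dev) & OPS_MASK) == (mode & OPS_MASK))
            return 0;
    return -1;  /* timed out: the mode change was never observed */
}
```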
    
    Fixes: c156633f1353 ("Renesas Ethernet AVB driver proper")
    Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com>
    Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: Save and restore msg_namelen in sock_sendmsg [+ + +]
Author: Marc Dionne <marc.dionne@auristor.com>
Date:   Thu Dec 21 09:12:30 2023 -0400

    net: Save and restore msg_namelen in sock_sendmsg
    
    [ Upstream commit 01b2885d9415152bcb12ff1f7788f500a74ea0ed ]
    
    Commit 86a7e0b69bd5 ("net: prevent rewrite of msg_name in
    sock_sendmsg()") made sock_sendmsg save the incoming msg_name pointer
    and restore it before returning, to insulate the caller against
    msg_name being changed by the called code.  If the address length
    was also changed however, we may return with an inconsistent structure
    where the length doesn't match the address, and attempts to reuse it may
    lead to lost packets.
    
    For example, a kernel that doesn't have commit 1c5950fc6fe9 ("udp6: fix
    potential access to stale information") will replace a v4 mapped address
    with its ipv4 equivalent, and shorten namelen accordingly from 28 to 16.
    If the caller attempts to reuse the resulting msg structure, it will have
    the original ipv6 (v4 mapped) address but an incorrect v4 length.
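    
    A userspace sketch of the save-and-restore pattern follows. lower_send()
    is a hypothetical callee that rewrites both address fields, as described
    above, and send_preserving_name() is likewise a made-up wrapper; only
    struct msghdr and its fields come from <sys/socket.h>.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>

/* Hypothetical callee that rewrites the address fields, as a protocol might. */
static int lower_send(struct msghdr *msg)
{
    static char v4_addr[16];

    msg->msg_name = v4_addr;
    msg->msg_namelen = sizeof(v4_addr);
    return 0;
}

/* Save both the address pointer and its length, and restore both on return. */
static int send_preserving_name(struct msghdr *msg)
{
    assert(msg != NULL);

    void *saved_name = msg->msg_name;
    socklen_t saved_len = msg->msg_namelen;

    int ret = lower_send(msg);

    /* Restoring only the pointer would leave a 16-byte length behind. */
    msg->msg_name = saved_name;
    msg->msg_namelen = saved_len;
    return ret;
}
```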
    
    Fixes: 86a7e0b69bd5 ("net: prevent rewrite of msg_name in sock_sendmsg()")
    Signed-off-by: Marc Dionne <marc.dionne@auristor.com>
    Reviewed-by: Willem de Bruijn <willemb@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: sched: call tcf_ct_params_free to free params in tcf_ct_init [+ + +]
Author: Xin Long <lucien.xin@gmail.com>
Date:   Sun Nov 6 15:34:16 2022 -0500

    net: sched: call tcf_ct_params_free to free params in tcf_ct_init
    
    [ Upstream commit 1913894100ca53205f2d56091cb34b8eba1de217 ]
    
    This patch is to make the err path simple by calling tcf_ct_params_free(),
    so that it won't cause problems when more members are added into param and
    need freeing on the err path.
    
    Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
    Signed-off-by: Xin Long <lucien.xin@gmail.com>
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Stable-dep-of: 125f1c7f26ff ("net/sched: act_ct: Take per-cb reference to tcf_ct_flow_table")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

net: sched: em_text: fix possible memory leak in em_text_destroy() [+ + +]
Author: Hangyu Hua <hbh25y@gmail.com>
Date:   Thu Dec 21 10:25:31 2023 +0800

    net: sched: em_text: fix possible memory leak in em_text_destroy()
    
    [ Upstream commit 8fcb0382af6f1ef50936f1be05b8149eb2f88496 ]
    
    m->data needs to be freed when em_text_destroy is called.
    
    Fixes: d675c989ed2d ("[PKT_SCHED]: Packet classification based on textsearch (ematch)")
    Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
    Signed-off-by: Hangyu Hua <hbh25y@gmail.com>
    Reviewed-by: Simon Horman <horms@kernel.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
netfilter: flowtable: allow unidirectional rules [+ + +]
Author: Vlad Buslov <vladbu@nvidia.com>
Date:   Wed Feb 1 17:30:56 2023 +0100

    netfilter: flowtable: allow unidirectional rules
    
    [ Upstream commit 8f84780b84d645d6e35467f4a6f3236b20d7f4b2 ]
    
    Modify flow table offload to support unidirectional connections by
    extending enum nf_flow_flags with a new "NF_FLOW_HW_BIDIRECTIONAL" flag.
    Only offload the reply direction when the flag is set. This
    infrastructure change is necessary to support offloading UDP NEW
    connections in the original direction in the following patches of this
    series.
    
    Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Stable-dep-of: 125f1c7f26ff ("net/sched: act_ct: Take per-cb reference to tcf_ct_flow_table")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

netfilter: flowtable: cache info of last offload [+ + +]
Author: Vlad Buslov <vladbu@nvidia.com>
Date:   Wed Feb 1 17:30:57 2023 +0100

    netfilter: flowtable: cache info of last offload
    
    [ Upstream commit 1a441a9b8be8849957a01413a144f84932c324cb ]
    
    Modify flow table offload to cache the last ct info status that was passed
    to the driver offload callbacks by extending enum nf_flow_flags with a new
    "NF_FLOW_HW_ESTABLISHED" flag. Set the flag if ctinfo was 'established'
    during the last act_ct meta actions fill call. This infrastructure change
    is necessary to optimize promoting of UDP connections from 'new' to
    'established' in the following patches in this series.
    
    Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Stable-dep-of: 125f1c7f26ff ("net/sched: act_ct: Take per-cb reference to tcf_ct_flow_table")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

netfilter: flowtable: GC pushes back packets to classic path [+ + +]
Author: Pablo Neira Ayuso <pablo@netfilter.org>
Date:   Tue Oct 24 21:09:47 2023 +0200

    netfilter: flowtable: GC pushes back packets to classic path
    
    [ Upstream commit 735795f68b37e9bb49f642407a0d49b1631ea1c7 ]
    
    Since 41f2c7c342d3 ("net/sched: act_ct: Fix promotion of offloaded
    unreplied tuple"), flowtable GC pushes flows with IPS_SEEN_REPLY back
    to the classic path in every run, i.e. every second. This is because of
    a new check for NF_FLOW_HW_ESTABLISHED, which is specific to sched/act_ct.
    
    In Netfilter's flowtable case, NF_FLOW_HW_ESTABLISHED never gets set,
    and IPS_SEEN_REPLY is unreliable: users decide when to offload the
    flow, so the bit might only be set at a later stage.
    
    Fix it by adding a custom .gc handler that sched/act_ct can use to
    deal with its NF_FLOW_HW_ESTABLISHED bit.
    
    Fixes: 41f2c7c342d3 ("net/sched: act_ct: Fix promotion of offloaded unreplied tuple")
    Reported-by: Vladimir Smelhaus <vl.sm@email.cz>
    Reviewed-by: Paul Blakey <paulb@nvidia.com>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    Stable-dep-of: 125f1c7f26ff ("net/sched: act_ct: Take per-cb reference to tcf_ct_flow_table")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

netfilter: nf_tables: set transport offset from mac header for netdev/egress [+ + +]
Author: Pablo Neira Ayuso <pablo@netfilter.org>
Date:   Thu Dec 14 11:50:12 2023 +0100

    netfilter: nf_tables: set transport offset from mac header for netdev/egress
    
    [ Upstream commit 0ae8e4cca78781401b17721bfb72718fdf7b4912 ]
    
    Before this patch, transport offset (pkt->thoff) provides an offset
    relative to the network header. This is fine for the inet families
    because skb->data points to the network header in such case. However,
    from netdev/egress, skb->data points to the mac header (if available),
    thus, pkt->thoff is missing the mac header length.
    
    Add skb_network_offset() to the transport offset (pkt->thoff) for
    netdev, so transport header mangling works as expected. Adjust payload
    fast eval function to use skb->data now that pkt->thoff provides an
    absolute offset. This explains why users report that matching on
    egress/netdev works but payload mangling does not.
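    
    The offset arithmetic described above can be sketched as follows. This
    is an illustration only: struct pkt_view and both helpers are
    hypothetical and merely model "offset relative to the network header"
    versus "offset from the start of the data".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified view of a packet buffer. */
struct pkt_view {
    size_t network_offset; /* bytes from the data start to the IP header */
    size_t ip_hdr_len;     /* IPv4 header length in bytes */
};

/* Old behaviour: transport offset relative to the network header. */
static size_t thoff_relative(const struct pkt_view *p)
{
    return p->ip_hdr_len;
}

/* New behaviour: absolute transport offset from the start of the data. */
static size_t thoff_absolute(const struct pkt_view *p)
{
    return p->network_offset + p->ip_hdr_len;
}
```

    When the data pointer already sits at the network header, both helpers
    agree. With a 14-byte Ethernet header in front, the relative offset
    falls 14 bytes short of the transport header.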
    
    This patch implicitly fixes payload mangling for IPv4 packets in
    netdev/egress given skb_store_bits() requires an offset from skb->data
    to reach the transport header.
    
    I suspect that nft_exthdr and the trace infra were also broken from
    netdev/egress because they also take skb->data as start, and pkt->thoff
    was not correct.
    
    Note that IPv6 is fine because ipv6_find_hdr() already provides a
    transport offset starting from skb->data, which includes
    skb_network_offset().
    
    The bridge family also uses nft_set_pktinfo_ipv4_validate(), but there
    skb_network_offset() is zero, so the update in this patch does not alter
    the existing behaviour.
    
    Fixes: 42df6e1d221d ("netfilter: Introduce egress hook")
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

netfilter: nft_immediate: drop chain reference counter on error [+ + +]
Author: Pablo Neira Ayuso <pablo@netfilter.org>
Date:   Mon Jan 1 20:15:33 2024 +0100

    netfilter: nft_immediate: drop chain reference counter on error
    
    [ Upstream commit b29be0ca8e816119ccdf95cc7d7c7be9bde005f1 ]
    
    In the init path, nft_data_init() bumps the chain reference counter.
    Decrement it on error by following the error path, which calls
    nft_data_release() to restore it.
    
    Fixes: 4bedf9eee016 ("netfilter: nf_tables: fix chain binding transaction logic")
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

netfilter: use skb_ip_totlen and iph_totlen [+ + +]
Author: Xin Long <lucien.xin@gmail.com>
Date:   Sat Jan 28 10:58:34 2023 -0500

    netfilter: use skb_ip_totlen and iph_totlen
    
    [ Upstream commit a13fbf5ed5b4fc9095f12e955ca3a59b5507ff01 ]
    
    There are also quite a few places in netfilter that may process IPv4 TCP
    GSO packets; we need to replace them too.
    
    In length_mt(), we have to use u_int32_t/int to accept the skb_ip_totlen()
    return value, otherwise it may overflow and mismatch. This change will
    also help us add a selftest for IPv4 BIG TCP in the following patch.
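    
    A small sketch of why a 16-bit variable is not enough here. The helper
    names are hypothetical; the point is only that a total length above
    65535 wraps when stored in 16 bits and can then match a range it
    should not match.

```c
#include <assert.h>
#include <stdint.h>

/* Storing a length in 16 bits silently wraps for values above 65535. */
static uint16_t len_as_u16(uint32_t totlen)
{
    return (uint16_t)totlen;
}

/* Range check of the kind a length match performs: min <= len <= max. */
static int len_in_range(uint32_t len, uint32_t min, uint32_t max)
{
    return len >= min && len <= max;
}
```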
    
    Note that we don't need to replace the one in tcpmss_tg4(), as it will
    return if there is data after tcphdr in tcpmss_mangle_packet(). The
    same applies to mangle_contents() in nf_nat_helper.c: it returns false
    when skb->len + extra > 65535 in enlarge_skb().
    
    Signed-off-by: Xin Long <lucien.xin@gmail.com>
    Reviewed-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Stable-dep-of: 0ae8e4cca787 ("netfilter: nf_tables: set transport offset from mac header for netdev/egress")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
nfc: llcp_core: Hold a ref to llcp_local->dev when holding a ref to llcp_local [+ + +]
Author: Siddh Raman Pant <code@siddh.me>
Date:   Tue Dec 19 23:19:43 2023 +0530

    nfc: llcp_core: Hold a ref to llcp_local->dev when holding a ref to llcp_local
    
    [ Upstream commit c95f919567d6f1914f13350af61a1b044ac85014 ]
    
    llcp_sock_sendmsg() calls nfc_llcp_send_ui_frame() which in turn calls
    nfc_alloc_send_skb(), which accesses the nfc_dev from the llcp_sock for
    getting the headroom and tailroom needed for skb allocation.
    
    In parallel, the nfc_dev can be freed, as the refcount is decreased via
    nfc_free_device(), leading to a UAF reported by Syzkaller, which can
    be summarized as follows:
    
    (1) llcp_sock_sendmsg() -> nfc_llcp_send_ui_frame()
            -> nfc_alloc_send_skb() -> Dereference *nfc_dev
    (2) virtual_ncidev_close() -> nci_free_device() -> nfc_free_device()
            -> put_device() -> nfc_release() -> Free *nfc_dev
    
    When a reference to llcp_local is acquired, we do not acquire the same
    for the nfc_dev. This leads to freeing even when the llcp_local is in
    use, and this is the case with the UAF described above too.
    
    Thus, when we acquire a reference to llcp_local, we should acquire a
    reference to nfc_dev, and release the references appropriately later.
    
    The reference count for llcp_local is initialized in
    nfc_llcp_register_device() (which is called by nfc_register_device()).
    Thus, we should acquire a reference to nfc_dev there.
    
    nfc_unregister_device() calls nfc_llcp_unregister_device() which in
    turn calls nfc_llcp_local_put(). Thus, the reference to nfc_dev is
    appropriately released later.
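    
    The pairing rule can be sketched in user space as below. This is not
    the kernel implementation: the types, the plain int counters (no
    atomics, no kref) and the function names are hypothetical, and only
    the rule "every reference to the local object also pins the device"
    is taken from the text above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for nfc_dev and llcp_local. */
struct demo_dev { int refs; };
struct demo_local { int refs; struct demo_dev *dev; };

/* Registering the local object takes its first reference and pins the device. */
static void local_register(struct demo_local *l, struct demo_dev *d)
{
    l->dev = d;
    l->refs = 1;
    d->refs++;
}

/* Every further reference to the local object also pins the device. */
static void local_get(struct demo_local *l)
{
    l->refs++;
    l->dev->refs++;
}

/* Dropping a reference to the local object drops the matching device reference. */
static void local_put(struct demo_local *l)
{
    l->refs--;
    l->dev->refs--;
}
```

    As long as any reference to the local object is held, the device count
    stays above the owner's own reference, so the device cannot be freed
    underneath a user of the local object.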
    
    Reported-and-tested-by: syzbot+bbe84a4010eeea00982d@syzkaller.appspotmail.com
    Closes: https://syzkaller.appspot.com/bug?extid=bbe84a4010eeea00982d
    Fixes: c7aa12252f51 ("NFC: Take a reference on the LLCP local pointer when creating a socket")
    Reviewed-by: Suman Ghosh <sumang@marvell.com>
    Signed-off-by: Siddh Raman Pant <code@siddh.me>
    Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
octeontx2-af: Always configure NIX TX link credits based on max frame size [+ + +]
Author: Naveen Mamindlapalli <naveenm@marvell.com>
Date:   Tue Jan 2 15:26:43 2024 +0530

    octeontx2-af: Always configure NIX TX link credits based on max frame size
    
    [ Upstream commit a0d9528f6daf7fe8de217fa80a94d2989d2a57a7 ]
    
    Currently the NIX TX link credits are initialized based on the max frame
    size that can be transmitted on a link, but when the MTU is changed, the
    NIX TX link credits are reprogrammed by the SW based on the new MTU value.
    Since the SMQ max packet length is programmed to the max frame size by
    default, there is a chance that NIX TX may stall while sending a max
    frame sized packet on a link with insufficient credits to send the
    packet all at once. This patch avoids the stall by not changing the link
    credits dynamically when the MTU is changed.
    
    Fixes: 1c74b89171c3 ("octeontx2-af: Wait for TX link idle for credits change")
    Signed-off-by: Naveen Mamindlapalli <naveenm@marvell.com>
    Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com>
    Signed-off-by: Nithin Kumar Dabilpuram <ndabilpuram@marvell.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

octeontx2-af: Fix marking couple of structure as __packed [+ + +]
Author: Suman Ghosh <sumang@marvell.com>
Date:   Tue Dec 19 19:56:33 2023 +0530

    octeontx2-af: Fix marking couple of structure as __packed
    
    [ Upstream commit 0ee2384a5a0f3b4eeac8d10bb01a0609d245a4d1 ]
    
    A couple of structures were not marked as __packed. This patch
    marks them as __packed.
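    
    For illustration, the effect of packing on a structure layout, using
    the GCC/Clang attribute that the kernel's __packed macro expands to.
    The two structures are hypothetical and unrelated to the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Default layout: the compiler may insert padding after 'tag'. */
struct rec_padded {
    uint8_t tag;
    uint32_t value;
};

/* Packed layout: members are laid out back to back, with no padding. */
struct rec_packed {
    uint8_t tag;
    uint32_t value;
} __attribute__((packed));
```

    A structure that mirrors a fixed binary layout only matches that
    layout if no padding is inserted, which is what the attribute
    guarantees.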
    
    Fixes: 42006910b5ea ("octeontx2-af: cleanup KPU config data")
    Signed-off-by: Suman Ghosh <sumang@marvell.com>
    Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

octeontx2-af: Fix pause frame configuration [+ + +]
Author: Hariprasad Kelam <hkelam@marvell.com>
Date:   Fri Dec 8 14:57:54 2023 +0530

    octeontx2-af: Fix pause frame configuration
    
    [ Upstream commit e307b5a845c5951dabafc48d00b6424ee64716c4 ]
    
    The current implementation's default Pause Forward setting is causing
    unnecessary network traffic. This patch disables Pause Forward to
    address this issue.
    
    Fixes: 1121f6b02e7a ("octeontx2-af: Priority flow control configuration support")
    Signed-off-by: Hariprasad Kelam <hkelam@marvell.com>
    Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

octeontx2-af: Re-enable MAC TX in otx2_stop processing [+ + +]
Author: Naveen Mamindlapalli <naveenm@marvell.com>
Date:   Tue Jan 2 19:44:00 2024 +0530

    octeontx2-af: Re-enable MAC TX in otx2_stop processing
    
    [ Upstream commit 818ed8933bd17bc91a9fa8b94a898189c546fc1a ]
    
    During QoS scheduling testing with multiple strict priority flows, the
    netdev tx watchdog timeout routine is invoked when a low priority QoS
    queue doesn't get a chance to transmit packets because other high
    priority flows are fully subscribing the transmit link. The netdev
    tx watchdog timeout routine stops MAC RX and TX functionality in the
    otx2_stop() routine before cleanup of the HW TX queues, which results in
    SMQ flush errors because the packets belonging to low priority queues
    never get flushed since MAC TX is disabled. This patch fixes the issue
    by re-enabling MAC TX to ensure the packets in the HW pipeline get
    flushed properly.
    
    Fixes: a7faa68b4e7f ("octeontx2-af: Start/Stop traffic in CGX along with NPC")
    Signed-off-by: Naveen Mamindlapalli <naveenm@marvell.com>
    Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

octeontx2-af: Support variable number of lmacs [+ + +]
Author: Rakesh Babu Saladi <rsaladi2@marvell.com>
Date:   Mon Dec 5 12:35:18 2022 +0530

    octeontx2-af: Support variable number of lmacs
    
    [ Upstream commit f2e664ad503d4e5ce7c42a0862ab164331a0ef37 ]
    
    Most of the code in the CGX/RPM driver assumes that the max number of
    lmacs per MAC is always 4 and that the number of MAC blocks is also 4.
    With this assumption, the max number of interfaces supported is
    hardcoded to 16. This creates a problem as the next gen CN10KB silicon
    MAC supports 8 lmacs per MAC block.
    
    This patch solves the problem by using the "max lmac per MAC block"
    value from constant csrs and the cgx_cnt_max value, which is
    populated based on the number of MAC blocks supported by the silicon.
    
    Signed-off-by: Rakesh Babu Saladi <rsaladi2@marvell.com>
    Signed-off-by: Hariprasad Kelam <hkelam@marvell.com>
    Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com>
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Stable-dep-of: e307b5a845c5 ("octeontx2-af: Fix pause frame configuration")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
r8169: Fix PCI error on system resume [+ + +]
Author: Kai-Heng Feng <kai.heng.feng@canonical.com>
Date:   Fri Dec 22 12:34:09 2023 +0800

    r8169: Fix PCI error on system resume
    
    [ Upstream commit 9c476269bff2908a20930c58085bf0b05ebd569a ]
    
    Some r8168 NICs stop working upon system resume:
    
    [  688.051096] r8169 0000:02:00.1 enp2s0f1: rtl_ep_ocp_read_cond == 0 (loop: 10, delay: 10000).
    [  688.175131] r8169 0000:02:00.1 enp2s0f1: Link is Down
    ...
    [  691.534611] r8169 0000:02:00.1 enp2s0f1: PCI error (cmd = 0x0407, status_errs = 0x0000)
    
    Not sure if it's related, but those NICs have a BMC device at function
    0:
    02:00.0 Unassigned class [ff00]: Realtek Semiconductor Co., Ltd. Realtek RealManage BMC [10ec:816e] (rev 1a)
    
    Trial and error shows that increasing the loop wait on
    rtl_ep_ocp_read_cond to 30 eliminates the issue, so let
    rtl8168ep_driver_start() wait a bit longer.
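    
    A sketch of a bounded polling loop of this kind. The device model and
    function names are hypothetical and the delay between attempts is
    omitted; it only shows how a larger loop count lets a slow device
    succeed.

```c
#include <assert.h>

/* Hypothetical device model: becomes ready after a number of polls. */
struct fake_dev {
    int polls_until_ready;
};

/* Condition callback: returns 1 once the device reports ready. */
static int fake_dev_ready(struct fake_dev *dev)
{
    if (dev->polls_until_ready > 0) {
        dev->polls_until_ready--;
        return 0;
    }
    return 1;
}

/* Poll the condition up to max_loops times; a real driver would sleep
 * between attempts. Returns 1 on success, 0 on timeout. */
static int wait_for_ready(struct fake_dev *dev, int max_loops)
{
    for (int i = 0; i < max_loops; i++) {
        if (fake_dev_ready(dev))
            return 1;
    }
    return 0;
}
```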
    
    Fixes: e6d6ca6e1204 ("r8169: Add support for another RTL8168FP")
    Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
    Reviewed-by: Heiner Kallweit <hkallweit1@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
Revert "interconnect: qcom: sm8250: Enable sync_state" [+ + +]
Author: Amit Pundir <amit.pundir@linaro.org>
Date:   Sun Jan 7 21:27:02 2024 +0530

    Revert "interconnect: qcom: sm8250: Enable sync_state"
    
    This reverts commit 3637f6bdfe2ccd53c493836b6e43c9a73e4513b3 which is
    commit bfc7db1cb94ad664546d70212699f8cc6c539e8c upstream.
    
    This resulted in a boot regression on RB5 (sm8250), causing the device
    to hard crash into USB crash dump mode every time.
    
    Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
    Link: https://lkft.validation.linaro.org/scheduler/job/7151629#L4239
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
Revert "PCI/ASPM: Remove pcie_aspm_pm_state_change()" [+ + +]
Author: Bjorn Helgaas <bhelgaas@google.com>
Date:   Mon Jan 1 12:08:18 2024 -0600

    Revert "PCI/ASPM: Remove pcie_aspm_pm_state_change()"
    
    commit f93e71aea6c60ebff8adbd8941e678302d377869 upstream.
    
    This reverts commit 08d0cc5f34265d1a1e3031f319f594bd1970976c.
    
    Michael reported that when attempting to resume from suspend to RAM on ASUS
    mini PC PN51-BB757MDE1 (DMI model: MINIPC PN51-E1), 08d0cc5f3426
    ("PCI/ASPM: Remove pcie_aspm_pm_state_change()") caused a 12-second delay
    with no output, followed by a reboot.
    
    Workarounds include:
    
      - Reverting 08d0cc5f3426 ("PCI/ASPM: Remove pcie_aspm_pm_state_change()")
      - Booting with "pcie_aspm=off"
      - Booting with "pcie_aspm.policy=performance"
      - "echo 0 | sudo tee /sys/bus/pci/devices/0000:03:00.0/link/l1_aspm"
        before suspending
      - Connecting a USB flash drive
    
    Link: https://lore.kernel.org/r/20240102232550.1751655-1-helgaas@kernel.org
    Fixes: 08d0cc5f3426 ("PCI/ASPM: Remove pcie_aspm_pm_state_change()")
    Reported-by: Michael Schaller <michael@5challer.de>
    Link: https://lore.kernel.org/r/76c61361-b8b4-435f-a9f1-32b716763d62@5challer.de
    Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
ring-buffer: Fix 32-bit rb_time_read() race with rb_time_cmpxchg() [+ + +]
Author: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Date:   Tue Dec 12 14:30:49 2023 -0500

    ring-buffer: Fix 32-bit rb_time_read() race with rb_time_cmpxchg()
    
    [ Upstream commit dec890089bf79a4954b61482715ee2d084364856 ]
    
    The following race can cause rb_time_read() to observe a corrupted time
    stamp:
    
    rb_time_cmpxchg()
    [...]
            if (!rb_time_read_cmpxchg(&t->msb, msb, msb2))
                    return false;
            if (!rb_time_read_cmpxchg(&t->top, top, top2))
                    return false;
    <interrupted before updating bottom>
    __rb_time_read()
    [...]
            do {
                    c = local_read(&t->cnt);
                    top = local_read(&t->top);
                    bottom = local_read(&t->bottom);
                    msb = local_read(&t->msb);
            } while (c != local_read(&t->cnt));
    
            *cnt = rb_time_cnt(top);
    
            /* If top and msb counts don't match, this interrupted a write */
            if (*cnt != rb_time_cnt(msb))
                    return false;
              ^ this check fails to catch that "bottom" is still not updated.
    
    So the old "bottom" value is returned, which is wrong.
    
    Fix this by checking that all three of msb, top, and bottom 2-bit cnt
    values match.
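    
    A simplified sketch of the "all three counters must match" check. The
    word layout below (a 2-bit counter in the top bits of each 32-bit
    part) and all names are assumptions made for illustration, not the
    kernel's definitions.

```c
#include <assert.h>
#include <stdint.h>

#define CNT_SHIFT 30
#define VAL_MASK  ((1u << CNT_SHIFT) - 1)

/* Extract the 2-bit update counter stored in the top bits of a part. */
static unsigned int part_cnt(uint32_t word)
{
    return word >> CNT_SHIFT;
}

/* Build a part from a counter and a 30-bit payload. */
static uint32_t part_make(unsigned int cnt, uint32_t val)
{
    return ((uint32_t)(cnt & 3) << CNT_SHIFT) | (val & VAL_MASK);
}

/* Returns 1 and stores the joined value only if the counters of msb,
 * top and bottom all agree; returns 0 if an update was interrupted. */
static int time_read(uint32_t msb, uint32_t top, uint32_t bottom,
                     uint64_t *out)
{
    unsigned int c = part_cnt(top);

    if (c != part_cnt(msb) || c != part_cnt(bottom))
        return 0;

    *out = ((uint64_t)(msb & 0xF) << 60) |
           ((uint64_t)(top & VAL_MASK) << 30) |
           (uint64_t)(bottom & VAL_MASK);
    return 1;
}
```

    Comparing only msb and top would accept a state in which bottom still
    carries the previous counter, which is the race described above.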
    
    The reason to favor checking all three fields over requiring a specific
    update order for both rb_time_set() and rb_time_cmpxchg() is that
    checking all three fields is more robust against partial failures of
    rb_time_cmpxchg() when interrupted by a nested rb_time_set().
    
    Link: https://lore.kernel.org/lkml/20231211201324.652870-1-mathieu.desnoyers@efficios.com/
    Link: https://lore.kernel.org/linux-trace-kernel/20231212193049.680122-1-mathieu.desnoyers@efficios.com
    
    Fixes: f458a1453424e ("ring-buffer: Test last update in 32bit version of __rb_time_read()")
    Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
    Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
s390/cpumf: support user space events for counting [+ + +]
Author: Thomas Richter <tmricht@linux.ibm.com>
Date:   Fri Dec 23 11:03:32 2022 +0100

    s390/cpumf: support user space events for counting
    
    [ Upstream commit 91d5364dc673fa9cf3a5b7b30cf33c70803eb3a4 ]
    
    CPU Measurement counting facility events PROBLEM_STATE_CPU_CYCLES(32)
    and PROBLEM_STATE_INSTRUCTIONS(33) are valid events. However, the device
    driver returns error -EOPNOTSUPP when these events are to be installed.
    
    Fix this and allow installation of events PROBLEM_STATE_CPU_CYCLES,
    PROBLEM_STATE_CPU_CYCLES:u, PROBLEM_STATE_INSTRUCTIONS and
    PROBLEM_STATE_INSTRUCTIONS:u.
    Counting in kernel space only is still not supported by s390.
    
    Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
    Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
    Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
    Stable-dep-of: 09cda0a40051 ("s390/mm: add missing arch_set_page_dat() call to vmem_crst_alloc()")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
s390/mm: add missing arch_set_page_dat() call to vmem_crst_alloc() [+ + +]
Author: Heiko Carstens <hca@linux.ibm.com>
Date:   Tue Oct 17 21:07:04 2023 +0200

    s390/mm: add missing arch_set_page_dat() call to vmem_crst_alloc()
    
    [ Upstream commit 09cda0a400519b1541591c506e54c9c48e3101bf ]
    
    If the cmma no-dat feature is available all pages that are not used for
    dynamic address translation are marked as "no-dat" with the ESSA
    instruction. This information is visible to the hypervisor, so that the
    hypervisor can optimize purging of guest TLB entries. This also means that
    pages which are used for dynamic address translation must not be marked as
    "no-dat", since the hypervisor may then incorrectly not purge guest TLB
    entries.
    
    Region and segment tables allocated via vmem_crst_alloc() are incorrectly
    marked as "no-dat", as soon as slab_is_available() returns true.
    
    Such tables are allocated e.g. when kernel page tables are split, memory is
    hotplugged, or a DCSS segment is loaded.
    
    Fix this by adding the missing arch_set_page_dat() call.
    
    Cc: <stable@vger.kernel.org>
    Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
    Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
    Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
selftests: bonding: do not set port down when adding to bond [+ + +]
Author: Hangbin Liu <liuhangbin@gmail.com>
Date:   Sat Dec 23 20:59:22 2023 +0800

    selftests: bonding: do not set port down when adding to bond
    
    [ Upstream commit 61fa2493ca76fd7bb74e13f0205274f4ab0aa696 ]
    
    Similar to commit be809424659c ("selftests: bonding: do not set port down
    before adding to bond"). The bond-arp-interval-causes-panic test failed
    after commit a4abfa627c38 ("net: rtnetlink: Enslave device before bringing
    it up"), as the kernel now sets the port down _after_ adding it to the
    bond if the port is explicitly set down.
    
    Fix it by removing the link down operation when adding to bond.
    
    Fixes: 2ffd57327ff1 ("selftests: bonding: cause oops in bond_rr_gen_slave_id")
    Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
    Tested-by: Benjamin Poirier <benjamin.poirier@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

selftests: mptcp: fix fastclose with csum failure [+ + +]
Author: Paolo Abeni <pabeni@redhat.com>
Date:   Tue Nov 14 00:16:17 2023 +0100

    selftests: mptcp: fix fastclose with csum failure
    
    [ Upstream commit 7cefbe5e1dacc7236caa77e9d072423f21422fe2 ]
    
    Running the mp_join selftest manually with the following command line:
    
      ./mptcp_join.sh -z -C
    
    leads to some failures:
    
      002 fastclose server test
      # ...
      rtx                                 [fail] got 1 MP_RST[s] TX expected 0
      # ...
      rstrx                               [fail] got 1 MP_RST[s] RX expected 0
    
    The problem is really in the wrong expectations for the RST checks
    implied by the csum validation. Note that the same check is repeated
    explicitly in the same test-case, with the correct expectation, and
    passes successfully.
    
    Address the issue by explicitly setting the correct expectation for
    the failing checks.
    
    Reported-by: Xiumei Mu <xmu@redhat.com>
    Fixes: 6bf41020b72b ("selftests: mptcp: update and extend fastclose test-cases")
    Cc: stable@vger.kernel.org
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Reviewed-by: Matthieu Baerts <matttbe@kernel.org>
    Signed-off-by: Matthieu Baerts <matttbe@kernel.org>
    Link: https://lore.kernel.org/r/20231114-upstream-net-20231113-mptcp-misc-fixes-6-7-rc2-v1-5-7b9cd6a7b7f4@kernel.org
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

selftests: mptcp: set FAILING_LINKS in run_tests [+ + +]
Author: Geliang Tang <geliang.tang@linux.dev>
Date:   Fri Jun 23 10:34:09 2023 -0700

    selftests: mptcp: set FAILING_LINKS in run_tests
    
    [ Upstream commit be7e9786c9155c2942cd53b813e4723be67e07c4 ]
    
    Set FAILING_LINKS as an env var with a limited scope only when calling
    run_tests().
    
    Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
    Signed-off-by: Geliang Tang <geliang.tang@suse.com>
    Signed-off-by: Mat Martineau <martineau@kernel.org>
    Link: https://lore.kernel.org/r/20230623-send-net-next-20230623-v1-3-a883213c8ba9@kernel.org
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Stable-dep-of: 7cefbe5e1dac ("selftests: mptcp: fix fastclose with csum failure")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

selftests: secretmem: floor the memory size to the multiple of page_size [+ + +]
Author: Muhammad Usama Anjum <usama.anjum@collabora.com>
Date:   Thu Dec 14 15:19:30 2023 +0500

    selftests: secretmem: floor the memory size to the multiple of page_size
    
    [ Upstream commit 0aac13add26d546ac74c89d2883b3a5f0fbea039 ]
    
    The "locked-in-memory size" limit per process can be a non-multiple of
    page_size.  The mmap() fails if we try to allocate locked-in-memory with
    the same size as the allowed limit if it isn't a multiple of the
    page_size, because mmap() rounds up the memory size to be allocated to
    the next multiple of page_size.
    
    Fix this by flooring the length to be allocated with mmap() to the
    previous multiple of the page_size.
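    
    The rounding arithmetic, as a minimal sketch. The helper names are
    hypothetical, and the 10000-byte limit with 4096-byte pages mentioned
    below is an invented example, not a value from the report.

```c
#include <assert.h>
#include <stddef.h>

/* Round a length down to the previous multiple of page_size. */
static size_t floor_to_page(size_t len, size_t page_size)
{
    return len / page_size * page_size;
}

/* Round a length up to the next multiple of page_size, which is how a
 * mapping length is effectively treated. */
static size_t round_up_to_page(size_t len, size_t page_size)
{
    return (len + page_size - 1) / page_size * page_size;
}
```

    With a limit of 10000 bytes and 4096-byte pages, rounding up yields
    12288 bytes, which exceeds the limit, while flooring yields 8192
    bytes, which stays within it.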
    
    This was getting triggered on KernelCI regularly because of different
    ulimit settings which weren't multiples of the page_size.  Find logs
    here: https://linux.kernelci.org/test/plan/id/657654bd8e81e654fae13532/
    The bug has been present since the test was first added.
    
    Link: https://lkml.kernel.org/r/20231214101931.1155586-1-usama.anjum@collabora.com
    Fixes: 76fe17ef588a ("secretmem: test: add basic selftest for memfd_secret(2)")
    Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
    Reported-by: "kernelci.org bot" <bot@kernelci.org>
    Closes: https://linux.kernelci.org/test/plan/id/657654bd8e81e654fae13532/
    Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
    Cc: Mike Rapoport (IBM) <rppt@kernel.org>
    Cc: Shuah Khan <shuah@kernel.org>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
sfc: fix a double-free bug in efx_probe_filters [+ + +]
Author: Zhipeng Lu <alexious@zju.edu.cn>
Date:   Mon Dec 25 19:29:14 2023 +0800

    sfc: fix a double-free bug in efx_probe_filters
    
    [ Upstream commit d5a306aedba34e640b11d7026dbbafb78ee3a5f6 ]
    
    In efx_probe_filters, the channel->rps_flow_id is freed in an
    efx_for_each_channel macro when success equals 0.
    However, after the following call chain:
    
    ef100_net_open
      |-> efx_probe_filters
      |-> ef100_net_stop
            |-> efx_remove_filters
    
    The channel->rps_flow_id is freed again in the efx_for_each_channel of
    efx_remove_filters, triggering a double-free bug.
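    
    A common way to make such a release path safe to run twice is to
    clear the pointer after freeing it. The sketch below illustrates that
    idiom in user space; struct demo_channel and release_flow_ids() are
    hypothetical, and this is not claimed to be the exact upstream change.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a channel holding a flow-id table. */
struct demo_channel {
    unsigned int *rps_flow_id;
};

static int real_frees; /* number of non-NULL frees, for the demo only */

/* Free the table and clear the pointer so that a second call is a no-op. */
static void release_flow_ids(struct demo_channel *ch)
{
    if (ch->rps_flow_id)
        real_frees++;
    free(ch->rps_flow_id);  /* free(NULL) is defined to do nothing */
    ch->rps_flow_id = NULL;
}
```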
    
    Fixes: a9dc3d5612ce ("sfc_ef100: RX filter table management and related gubbins")
    Reviewed-by: Simon Horman <horms@kernel.org>
    Reviewed-by: Edward Cree <ecree.xilinx@gmail.com>
    Signed-off-by: Zhipeng Lu <alexious@zju.edu.cn>
    Link: https://lore.kernel.org/r/20231225112915.3544581-1-alexious@zju.edu.cn
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
smb3: Replace smb2pdu 1-element arrays with flex-arrays [+ + +]
Author: Kees Cook <keescook@chromium.org>
Date:   Fri Feb 17 16:24:40 2023 -0800

    smb3: Replace smb2pdu 1-element arrays with flex-arrays
    
    commit eb3e28c1e89b4984308777231887e41aa8a0151f upstream.
    
    The kernel is globally removing the ambiguous 0-length and 1-element
    arrays in favor of flexible arrays, so that we can gain both compile-time
    and run-time array bounds checking[1].
    
    Replace the trailing 1-element array with a flexible array in the
    following structures:
    
            struct smb2_err_rsp
            struct smb2_tree_connect_req
            struct smb2_negotiate_rsp
            struct smb2_sess_setup_req
            struct smb2_sess_setup_rsp
            struct smb2_read_req
            struct smb2_read_rsp
            struct smb2_write_req
            struct smb2_write_rsp
            struct smb2_query_directory_req
            struct smb2_query_directory_rsp
            struct smb2_set_info_req
            struct smb2_change_notify_rsp
            struct smb2_create_rsp
            struct smb2_query_info_req
            struct smb2_query_info_rsp
    
    Replace the trailing 1-element array with a flexible array, but leave
    the existing structure padding:
    
            struct smb2_file_all_info
            struct smb2_lock_req
    
    Adjust all related size calculations to match the changes to sizeof().
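    
    Why the size calculations need adjusting can be seen in a small
    sketch. The two structures are hypothetical and much smaller than the
    real SMB2 PDUs; they only show that sizeof() no longer counts the
    trailing element once the array becomes flexible.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Old style: trailing 1-element array, counted by sizeof(). */
struct rsp_old {
    uint16_t structure_size;
    uint8_t buffer[1];
} __attribute__((packed));

/* New style: flexible array member, not counted by sizeof(). */
struct rsp_new {
    uint16_t structure_size;
    uint8_t buffer[];
} __attribute__((packed));
```

    Code that previously wrote sizeof(struct rsp_old) - 1 to get the
    fixed header size can use sizeof(struct rsp_new) directly, and the
    offset of the buffer is unchanged.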
    
    No machine code output or .data section differences are produced after
    these changes.
    
    [1] For lots of details, see both:
        https://docs.kernel.org/process/deprecated.html#zero-length-and-one-element-arrays
        https://people.kernel.org/kees/bounded-flexible-arrays-in-c
    
    Cc: Steve French <sfrench@samba.org>
    Cc: Paulo Alcantara <pc@cjr.nz>
    Cc: Ronnie Sahlberg <lsahlber@redhat.com>
    Cc: Shyam Prasad N <sprasad@microsoft.com>
    Cc: Tom Talpey <tom@talpey.com>
    Cc: Namjae Jeon <linkinjeon@kernel.org>
    Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
    Cc: linux-cifs@vger.kernel.org
    Cc: samba-technical@lists.samba.org
    Reviewed-by: Namjae Jeon <linkinjeon@kernel.org>
    Signed-off-by: Kees Cook <keescook@chromium.org>
    Signed-off-by: Steve French <stfrench@microsoft.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
smb: client: fix missing mode bits for SMB symlinks [+ + +]
Author: Paulo Alcantara <pc@manguebit.com>
Date:   Sat Nov 25 23:55:10 2023 -0300

    smb: client: fix missing mode bits for SMB symlinks
    
    [ Upstream commit ef22bb800d967616c7638d204bc1b425beac7f5f ]
    
    When instantiating inodes for SMB symlinks, add the mode bits from
    @cifs_sb->ctx->file_mode as we already do for the other special files.
    
    Cc: stable@vger.kernel.org
    Signed-off-by: Paulo Alcantara (SUSE) <pc@manguebit.com>
    Signed-off-by: Steve French <stfrench@microsoft.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
splice, net: Add a splice_eof op to file-ops and socket-ops [+ + +]
Author: David Howells <dhowells@redhat.com>
Date:   Wed Jun 7 19:19:10 2023 +0100

    splice, net: Add a splice_eof op to file-ops and socket-ops
    
    [ Upstream commit 2bfc66850952b6921b2033b09729ec59eabbc81d ]
    
    Add an optional method, ->splice_eof(), to allow splice to indicate the
    premature termination of a splice to struct file_operations and struct
    proto_ops.
    
    This is called if sendfile() or splice() encounters all of the following
    conditions inside splice_direct_to_actor():
    
     (1) the user did not set SPLICE_F_MORE (splice only), and
    
     (2) an EOF condition occurred (->splice_read() returned 0), and
    
     (3) we haven't read enough to fulfill the request (ie. len > 0 still), and
    
     (4) we have already spliced at least one byte.
    
    A further patch will modify the behaviour of SPLICE_F_MORE to always be
    passed to the actor if either the user set it or we haven't yet read
    sufficient data to fulfill the request.
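    
    The four conditions above can be modelled in user space roughly as
    follows. This is only a sketch: struct ex_splice_state and
    ex_should_call_splice_eof() are hypothetical names, and the real check
    lives inside splice_direct_to_actor():

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical snapshot of one direct-splice loop iteration. */
struct ex_splice_state {
	bool   user_set_more; /* (1) SPLICE_F_MORE was requested by the user */
	long   last_read;     /* (2) return value of the last ->splice_read() */
	size_t remaining;     /* (3) bytes of the request still outstanding */
	size_t spliced;       /* (4) bytes spliced so far */
};

/* True only when all four conditions from the commit message hold. */
static inline bool ex_should_call_splice_eof(const struct ex_splice_state *s)
{
	assert(s != NULL);
	return !s->user_set_more && /* (1) user did not ask for more data */
	       s->last_read == 0 && /* (2) EOF was hit */
	       s->remaining > 0 &&  /* (3) request not yet fulfilled */
	       s->spliced > 0;      /* (4) at least one byte already spliced */
}
```

    If any single condition is false, for example when nothing has been
    spliced yet, the model reports that ->splice_eof() is not called.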
    
    Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
    Link: https://lore.kernel.org/r/CAHk-=wh=V579PDYvkpnTobCLGczbgxpMgGmmhqiTyE34Cpi5Gg@mail.gmail.com/
    Signed-off-by: David Howells <dhowells@redhat.com>
    Reviewed-by: Jakub Kicinski <kuba@kernel.org>
    cc: Jens Axboe <axboe@kernel.dk>
    cc: Christoph Hellwig <hch@lst.de>
    cc: Al Viro <viro@zeniv.linux.org.uk>
    cc: Matthew Wilcox <willy@infradead.org>
    cc: Jan Kara <jack@suse.cz>
    cc: Jeff Layton <jlayton@kernel.org>
    cc: David Hildenbrand <david@redhat.com>
    cc: Christian Brauner <brauner@kernel.org>
    cc: Chuck Lever <chuck.lever@oracle.com>
    cc: Boris Pismenny <borisp@nvidia.com>
    cc: John Fastabend <john.fastabend@gmail.com>
    cc: linux-mm@kvack.org
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Stable-dep-of: a0002127cd74 ("udp: move udp->no_check6_tx to udp->udp_flags")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
srcu: Fix callbacks acceleration mishandling [+ + +]
Author: Frederic Weisbecker <frederic@kernel.org>
Date:   Wed Oct 4 01:28:59 2023 +0200

    srcu: Fix callbacks acceleration mishandling
    
    [ Upstream commit 4a8e65b0c348e42107c64381e692e282900be361 ]
    
    SRCU callback acceleration might fail if the preceding callback
    advance also fails. This can happen through the following sequence:
    
    1) The RCU_WAIT_TAIL segment has callbacks (say for gp_num 8) and the
       RCU_NEXT_READY_TAIL also has callbacks (say for gp_num 12).
    
    2) The grace period for RCU_WAIT_TAIL is observed as started but not yet
       completed so rcu_seq_current() returns 4 + SRCU_STATE_SCAN1 = 5.
    
    3) This value is passed to rcu_segcblist_advance() which can't move
       any segment forward and fails.
    
    4) srcu_gp_start_if_needed() still proceeds with callback acceleration.
       But then the call to rcu_seq_snap() observes the grace period for the
       RCU_WAIT_TAIL segment (gp_num 8) as completed and the subsequent one
       for the RCU_NEXT_READY_TAIL segment as started
       (ie: 8 + SRCU_STATE_SCAN1 = 9) so it returns a snapshot of the
       next grace period, which is 16.
    
    5) The value of 16 is passed to rcu_segcblist_accelerate() but the
       freshly enqueued callback in RCU_NEXT_TAIL can't move to
       RCU_NEXT_READY_TAIL which already has callbacks for a previous grace
       period (gp_num = 12). So acceleration fails.
    
    6) Note that throughout these steps, srcu_invoke_callbacks() has not
       had a chance to run.
    
    A much worse outcome may then follow:
    
    7) Some other CPU races and starts the grace period number 16 before the
       CPU handling previous steps had a chance. Therefore srcu_gp_start()
       isn't called on the latter sdp to fix the acceleration leak from
       previous steps with a new pair of call to advance/accelerate.
    
    8) The grace period 16 completes and srcu_invoke_callbacks() is finally
       called. All the callbacks from previous grace periods (8 and 12) are
       correctly advanced and executed but callbacks in RCU_NEXT_READY_TAIL
       still remain. Then rcu_segcblist_accelerate() is called with a
       snapshot of 20.
    
    9) Since nothing started the grace period number 20, callbacks stay
       unhandled.
    
    This has been reported under real load:
    
            [3144162.608392] INFO: task kworker/136:12:252684 blocked for more
            than 122 seconds.
            [3144162.615986]       Tainted: G           O  K   5.4.203-1-tlinux4-0011.1 #1
            [3144162.623053] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
            disables this message.
            [3144162.631162] kworker/136:12  D    0 252684      2 0x90004000
            [3144162.631189] Workqueue: kvm-irqfd-cleanup irqfd_shutdown [kvm]
            [3144162.631192] Call Trace:
            [3144162.631202]  __schedule+0x2ee/0x660
            [3144162.631206]  schedule+0x33/0xa0
            [3144162.631209]  schedule_timeout+0x1c4/0x340
            [3144162.631214]  ? update_load_avg+0x82/0x660
            [3144162.631217]  ? raw_spin_rq_lock_nested+0x1f/0x30
            [3144162.631218]  wait_for_completion+0x119/0x180
            [3144162.631220]  ? wake_up_q+0x80/0x80
            [3144162.631224]  __synchronize_srcu.part.19+0x81/0xb0
            [3144162.631226]  ? __bpf_trace_rcu_utilization+0x10/0x10
            [3144162.631227]  synchronize_srcu+0x5f/0xc0
            [3144162.631236]  irqfd_shutdown+0x3c/0xb0 [kvm]
            [3144162.631239]  ? __schedule+0x2f6/0x660
            [3144162.631243]  process_one_work+0x19a/0x3a0
            [3144162.631244]  worker_thread+0x37/0x3a0
            [3144162.631247]  kthread+0x117/0x140
            [3144162.631247]  ? process_one_work+0x3a0/0x3a0
            [3144162.631248]  ? __kthread_cancel_work+0x40/0x40
            [3144162.631250]  ret_from_fork+0x1f/0x30
    
    Fix this by taking the snapshot for acceleration _before_ the read
    of the current grace period number.
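    
    A minimal user-space sketch of the sequence arithmetic behind the
    numbers quoted above, assuming a two-bit state field in the low bits
    of the counter and SRCU_STATE_SCAN1 == 1. ex_seq_snap() and
    ex_check_commit_numbers() are illustrative names, and the memory
    barrier of the real rcu_seq_snap() is left out:

```c
#include <assert.h>

/* Constants assumed to mirror the kernel's RCU sequence encoding. */
#define EX_SEQ_CTR_SHIFT  2
#define EX_SEQ_STATE_MASK ((1UL << EX_SEQ_CTR_SHIFT) - 1)
#define EX_STATE_SCAN1    1UL

/* Model of a sequence snapshot: the grace period a new callback must wait for. */
static inline unsigned long ex_seq_snap(unsigned long seq)
{
	return (seq + 2 * EX_SEQ_STATE_MASK + 1) & ~EX_SEQ_STATE_MASK;
}

/* Replays the values from the numbered steps of the commit message. */
static inline void ex_check_commit_numbers(void)
{
	unsigned long early = 4 + EX_STATE_SCAN1; /* step 2: current reads 5 */
	unsigned long late  = 8 + EX_STATE_SCAN1; /* step 4: counter is now 9 */

	assert(ex_seq_snap(early) == 12); /* snapshot matching the step 2 read */
	assert(ex_seq_snap(late) == 16);  /* snapshot actually taken in step 4 */
	assert(ex_seq_snap(16) == 20);    /* step 8: idle after grace period 16 */
}
```

    In this model, a snapshot taken together with the value read in step 2
    would have been 12, which is the grace period RCU_NEXT_READY_TAIL is
    already waiting for, while the late snapshot of 16 is the one that
    cannot be merged.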
    
    The only side effect of this solution is that callback advancing then
    happens _after_ the full barrier in rcu_seq_snap(). This is not a
    problem because that barrier only cares about:
    
    1) Ordering accesses of the update side before call_srcu() so they don't
       bleed.
    2) See all the accesses prior to the grace period of the current gp_num
    
    The only things callbacks advancing need to be ordered against are
    carried by snp locking.
    
    Reported-by: Yong He <alexyonghe@tencent.com>
    Co-developed-by: Yong He <alexyonghe@tencent.com>
    Signed-off-by: Yong He <alexyonghe@tencent.com>
    Co-developed-by: Joel Fernandes (Google) <joel@joelfernandes.org>
    Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
    Co-developed-by: Neeraj upadhyay <Neeraj.Upadhyay@amd.com>
    Signed-off-by: Neeraj upadhyay <Neeraj.Upadhyay@amd.com>
    Link: http://lore.kernel.org/CANZk6aR+CqZaqmMWrC2eRRPY12qAZnDZLwLnHZbNi=xXMB401g@mail.gmail.com
    Fixes: da915ad5cf25 ("srcu: Parallelize callback handling")
    Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
udp: annotate data-races around udp->encap_type [+ + +]
Author: Eric Dumazet <edumazet@google.com>
Date:   Tue Sep 12 09:17:28 2023 +0000

    udp: annotate data-races around udp->encap_type
    
    [ Upstream commit 70a36f571362a8de8b8c02d21ae524fc776287f2 ]
    
    syzbot/KCSAN complained about UDP_ENCAP_L2TPINUDP setsockopt() racing.
    
    Add READ_ONCE()/WRITE_ONCE() to document races on this lockless field.
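    
    A rough user-space illustration of the annotation pattern. The macros
    below are simplified stand-ins for the kernel's READ_ONCE() and
    WRITE_ONCE() (they rely on the GCC/Clang __typeof__ extension), and
    struct ex_sock with its helpers is hypothetical:

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins: force exactly one access through a volatile lvalue. */
#define EX_READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define EX_WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

/* Hypothetical miniature of a socket with a lockless encap_type field. */
struct ex_sock {
	unsigned char encap_type;
};

/* Writer side, comparable to a setsockopt() path running without a lock. */
static inline void ex_set_encap_type(struct ex_sock *sk, unsigned char type)
{
	assert(sk != NULL);
	EX_WRITE_ONCE(sk->encap_type, type);
}

/* Reader side: load once into a local and use only the local afterwards. */
static inline int ex_encap_is_set(struct ex_sock *sk)
{
	unsigned char type;

	assert(sk != NULL);
	type = EX_READ_ONCE(sk->encap_type);
	return type != 0;
}
```

    The annotations document the race and keep the compiler from tearing
    or repeating the access; they do not turn a read-modify-write sequence
    into an atomic operation.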
    
    syzbot report was:
    BUG: KCSAN: data-race in udp_lib_setsockopt / udp_lib_setsockopt
    
    read-write to 0xffff8881083603fa of 1 bytes by task 16557 on cpu 0:
    udp_lib_setsockopt+0x682/0x6c0
    udp_setsockopt+0x73/0xa0 net/ipv4/udp.c:2779
    sock_common_setsockopt+0x61/0x70 net/core/sock.c:3697
    __sys_setsockopt+0x1c9/0x230 net/socket.c:2263
    __do_sys_setsockopt net/socket.c:2274 [inline]
    __se_sys_setsockopt net/socket.c:2271 [inline]
    __x64_sys_setsockopt+0x66/0x80 net/socket.c:2271
    do_syscall_x64 arch/x86/entry/common.c:50 [inline]
    do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
    entry_SYSCALL_64_after_hwframe+0x63/0xcd
    
    read-write to 0xffff8881083603fa of 1 bytes by task 16554 on cpu 1:
    udp_lib_setsockopt+0x682/0x6c0
    udp_setsockopt+0x73/0xa0 net/ipv4/udp.c:2779
    sock_common_setsockopt+0x61/0x70 net/core/sock.c:3697
    __sys_setsockopt+0x1c9/0x230 net/socket.c:2263
    __do_sys_setsockopt net/socket.c:2274 [inline]
    __se_sys_setsockopt net/socket.c:2271 [inline]
    __x64_sys_setsockopt+0x66/0x80 net/socket.c:2271
    do_syscall_x64 arch/x86/entry/common.c:50 [inline]
    do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
    entry_SYSCALL_64_after_hwframe+0x63/0xcd
    
    value changed: 0x01 -> 0x05
    
    Reported by Kernel Concurrency Sanitizer on:
    CPU: 1 PID: 16554 Comm: syz-executor.5 Not tainted 6.5.0-rc7-syzkaller-00004-gf7757129e3de #0
    
    Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
    Reported-by: syzbot <syzkaller@googlegroups.com>
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Reviewed-by: Willem de Bruijn <willemb@google.com>
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

udp: Convert udp_sendpage() to use MSG_SPLICE_PAGES [+ + +]
Author: David Howells <dhowells@redhat.com>
Date:   Mon May 22 13:11:22 2023 +0100

    udp: Convert udp_sendpage() to use MSG_SPLICE_PAGES
    
    [ Upstream commit 7ac7c987850c3ec617c778f7bd871804dc1c648d ]
    
    Convert udp_sendpage() to use sendmsg() with MSG_SPLICE_PAGES rather than
    directly splicing in the pages itself.
    
    This allows ->sendpage() to be replaced by something that can handle
    multiple multipage folios in a single transaction.
    
    Signed-off-by: David Howells <dhowells@redhat.com>
    cc: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
    cc: David Ahern <dsahern@kernel.org>
    cc: Jens Axboe <axboe@kernel.dk>
    cc: Matthew Wilcox <willy@infradead.org>
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Stable-dep-of: a0002127cd74 ("udp: move udp->no_check6_tx to udp->udp_flags")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

udp: introduce udp->udp_flags [+ + +]
Author: Eric Dumazet <edumazet@google.com>
Date:   Tue Sep 12 09:17:21 2023 +0000

    udp: introduce udp->udp_flags
    
    [ Upstream commit 81b36803ac139827538ac5ce4028e750a3c53f53 ]
    
    According to syzbot, it is time to use proper atomic flags
    for various UDP flags.
    
    Add udp_flags field, and convert udp->corkflag to first
    bit in it.
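    
    A minimal sketch of the flag-word approach using C11 atomics in user
    space. The names (struct ex_udp_sock, EX_UDP_FLAGS_CORK and the
    ex_udp_*_bit() helpers) are illustrative assumptions, not the kernel's
    definitions:

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical bit numbers; the first bit stands for the cork flag. */
enum ex_udp_flag {
	EX_UDP_FLAGS_CORK = 0,
	EX_UDP_FLAGS_COUNT
};

struct ex_udp_sock {
	atomic_ulong udp_flags; /* one word holding all boolean flags */
};

static inline unsigned long ex_mask(unsigned int nr)
{
	assert(nr < sizeof(unsigned long) * CHAR_BIT);
	return 1UL << nr;
}

/* Atomically set one flag without disturbing its neighbours. */
static inline void ex_udp_set_bit(struct ex_udp_sock *up, unsigned int nr)
{
	atomic_fetch_or(&up->udp_flags, ex_mask(nr));
}

/* Atomically clear one flag. */
static inline void ex_udp_clear_bit(struct ex_udp_sock *up, unsigned int nr)
{
	atomic_fetch_and(&up->udp_flags, ~ex_mask(nr));
}

/* Lockless read of one flag. */
static inline bool ex_udp_test_bit(struct ex_udp_sock *up, unsigned int nr)
{
	return (atomic_load(&up->udp_flags) & ex_mask(nr)) != 0;
}
```

    Compared with adjacent C bitfields, each update here is a single atomic
    read-modify-write on the shared word, so concurrent updates to
    different flags cannot overwrite one another.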
    
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Reviewed-by: Willem de Bruijn <willemb@google.com>
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Stable-dep-of: a0002127cd74 ("udp: move udp->no_check6_tx to udp->udp_flags")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

udp: lockless UDP_ENCAP_L2TPINUDP / UDP_GRO [+ + +]
Author: Eric Dumazet <edumazet@google.com>
Date:   Tue Sep 12 09:17:27 2023 +0000

    udp: lockless UDP_ENCAP_L2TPINUDP / UDP_GRO
    
    [ Upstream commit ac9a7f4ce5dda1472e8f44096f33066c6ec1a3b4 ]
    
    Move udp->encap_enabled to udp->udp_flags.
    
    Add udp_test_and_set_bit() helper to allow lockless
    udp_tunnel_encap_enable() implementation.
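    
    A user-space sketch of how a test-and-set helper allows a one-time
    enable without a lock, written with C11 atomics. The bit number, the
    counter and the ex_* names are hypothetical and only illustrate the
    pattern:

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

#define EX_FLAG_ENCAP_ENABLED 3u /* hypothetical bit number */

/* Atomically set bit nr and report whether it was already set. */
static inline bool ex_test_and_set_bit(atomic_ulong *flags, unsigned int nr)
{
	unsigned long mask;

	assert(nr < sizeof(unsigned long) * CHAR_BIT);
	mask = 1UL << nr;
	return (atomic_fetch_or(flags, mask) & mask) != 0;
}

/*
 * One-time enable: only the caller that flips the bit from 0 to 1 performs
 * the side effect, modelled here as bumping a global enable counter.
 */
static inline bool ex_encap_enable(atomic_ulong *flags, atomic_int *enabled)
{
	if (ex_test_and_set_bit(flags, EX_FLAG_ENCAP_ENABLED))
		return false; /* somebody else already enabled it */
	atomic_fetch_add(enabled, 1);
	return true;
}
```

    Calling ex_encap_enable() twice on the same flag word bumps the counter
    only once, even if the two calls race on different CPUs.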
    
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Reviewed-by: Willem de Bruijn <willemb@google.com>
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Stable-dep-of: 70a36f571362 ("udp: annotate data-races around udp->encap_type")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

udp: move udp->accept_udp_{l4|fraglist} to udp->udp_flags [+ + +]
Author: Eric Dumazet <edumazet@google.com>
Date:   Tue Sep 12 09:17:26 2023 +0000

    udp: move udp->accept_udp_{l4|fraglist} to udp->udp_flags
    
    [ Upstream commit f5f52f0884a595ff99ab1a608643fe4025fca2d5 ]
    
    These are read locklessly, move them to udp_flags to fix data-races.
    
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Reviewed-by: Willem de Bruijn <willemb@google.com>
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Stable-dep-of: 70a36f571362 ("udp: annotate data-races around udp->encap_type")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

udp: move udp->gro_enabled to udp->udp_flags [+ + +]
Author: Eric Dumazet <edumazet@google.com>
Date:   Tue Sep 12 09:17:24 2023 +0000

    udp: move udp->gro_enabled to udp->udp_flags
    
    [ Upstream commit e1dc0615c6b08ef36414f08c011965b8fb56198b ]
    
    syzbot reported that udp->gro_enabled can be read locklessly.
    Use one atomic bit from udp->udp_flags.
    
    Fixes: e20cf8d3f1f7 ("udp: implement GRO for plain UDP sockets.")
    Reported-by: syzbot <syzkaller@googlegroups.com>
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Reviewed-by: Willem de Bruijn <willemb@google.com>
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

udp: move udp->no_check6_rx to udp->udp_flags [+ + +]
Author: Eric Dumazet <edumazet@google.com>
Date:   Tue Sep 12 09:17:23 2023 +0000

    udp: move udp->no_check6_rx to udp->udp_flags
    
    [ Upstream commit bcbc1b1de884647aa0318bf74eb7f293d72a1e40 ]
    
    syzbot reported that udp->no_check6_rx can be read locklessly.
    Use one atomic bit from udp->udp_flags.
    
    Fixes: 1c19448c9ba6 ("net: Make enabling of zero UDP6 csums more restrictive")
    Reported-by: syzbot <syzkaller@googlegroups.com>
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Reviewed-by: Willem de Bruijn <willemb@google.com>
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

udp: move udp->no_check6_tx to udp->udp_flags [+ + +]
Author: Eric Dumazet <edumazet@google.com>
Date:   Tue Sep 12 09:17:22 2023 +0000

    udp: move udp->no_check6_tx to udp->udp_flags
    
    [ Upstream commit a0002127cd746fcaa182ad3386ef6931c37f3bda ]
    
    syzbot reported that udp->no_check6_tx can be read locklessly.
    Use one atomic bit from udp->udp_flags.
    
    Fixes: 1c19448c9ba6 ("net: Make enabling of zero UDP6 csums more restrictive")
    Reported-by: syzbot <syzkaller@googlegroups.com>
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Reviewed-by: Willem de Bruijn <willemb@google.com>
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
wifi: iwlwifi: pcie: don't synchronize IRQs from IRQ [+ + +]
Author: Johannes Berg <johannes.berg@intel.com>
Date:   Fri Dec 15 11:13:34 2023 +0100

    wifi: iwlwifi: pcie: don't synchronize IRQs from IRQ
    
    [ Upstream commit 400f6ebbc175286576c7f7fddf3c347d09d12310 ]
    
    On older devices (before unified image!) we can end up calling
    stop_device from an rfkill interrupt. However, in stop_device
    we attempt to synchronize IRQs, which then of course deadlocks.
    
    Avoid this by checking the context: if running from the IRQ
    thread, don't synchronize. This wouldn't be correct on a
    new device since RSS is supported, but older devices only have
    a single interrupt/queue.
    
    Fixes: 37fb29bd1f90 ("wifi: iwlwifi: pcie: synchronize IRQs before NAPI")
    Reviewed-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
    Reviewed-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
    Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    Signed-off-by: Kalle Valo <kvalo@kernel.org>
    Link: https://msgid.link/20231215111335.59aab00baed7.Iadfe154d6248e7f9dfd69522e5429dbbd72925d7@changeid
    Signed-off-by: Sasha Levin <sashal@kernel.org>

wifi: iwlwifi: yoyo: swap cdb and jacket bits values [+ + +]
Author: Rotem Saado <rotem.saado@intel.com>
Date:   Wed Oct 4 12:36:22 2023 +0300

    wifi: iwlwifi: yoyo: swap cdb and jacket bits values
    
    [ Upstream commit 65008777b9dcd2002414ddb2c2158293a6e2fd6f ]
    
    The bits are wrong: the jacket bit should be 5 and the cdb bit 4.
    Fix it.
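    
    For illustration only, the corrected layout can be written as the
    following user-space sketch. The EX_* macro names and the decode
    helpers are hypothetical and do not reproduce the driver's own
    identifiers:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Corrected positions as stated in the commit message. */
#define EX_IS_CDB_BIT    (UINT32_C(1) << 4)
#define EX_IS_JACKET_BIT (UINT32_C(1) << 5)

static_assert((EX_IS_CDB_BIT & EX_IS_JACKET_BIT) == 0,
	      "the two flags must occupy different bits");

/* Decode the cdb flag from a raw register value. */
static inline bool ex_is_cdb(uint32_t reg)
{
	return (reg & EX_IS_CDB_BIT) != 0;
}

/* Decode the jacket flag from a raw register value. */
static inline bool ex_is_jacket(uint32_t reg)
{
	return (reg & EX_IS_JACKET_BIT) != 0;
}
```

    With the two values swapped, a register value of 0x10 would be decoded
    as jacket instead of cdb, which is the misreading this commit removes.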
    
    Fixes: 1f171f4f1437 ("iwlwifi: Add support for getting rf id with blank otp")
    Signed-off-by: Rotem Saado <rotem.saado@intel.com>
    Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
    Link: https://lore.kernel.org/r/20231004123422.356d8dacda2f.I349ab888b43a11baa2453a1d6978a6a703e422f0@changeid
    Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
x86/kprobes: fix incorrect return address calculation in kprobe_emulate_call_indirect [+ + +]
Author: Jinghao Jia <jinghao7@illinois.edu>
Date:   Tue Jan 2 17:33:45 2024 -0600

    x86/kprobes: fix incorrect return address calculation in kprobe_emulate_call_indirect
    
    commit f5d03da48d062966c94f0199d20be0b3a37a7982 upstream.
    
    kprobe_emulate_call_indirect currently uses int3_emulate_call to emulate
    indirect calls. However, int3_emulate_call always assumes the size of
    the call to be 5 bytes when calculating the return address. This is
    incorrect for register-based indirect calls in x86, which can be either
    2 or 3 bytes depending on whether a REX prefix is used. At kprobe runtime,
    the incorrect return address causes control flow to land in the wrong
    place after return -- possibly not a valid instruction boundary. This
    can lead to a panic like the following:
    
    [    7.308204][    C1] BUG: unable to handle page fault for address: 000000000002b4d8
    [    7.308883][    C1] #PF: supervisor read access in kernel mode
    [    7.309168][    C1] #PF: error_code(0x0000) - not-present page
    [    7.309461][    C1] PGD 0 P4D 0
    [    7.309652][    C1] Oops: 0000 [#1] SMP
    [    7.309929][    C1] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 6.7.0-rc5-trace-for-next #6
    [    7.310397][    C1] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.0-20220807_005459-localhost 04/01/2014
    [    7.311068][    C1] RIP: 0010:__common_interrupt+0x52/0xc0
    [    7.311349][    C1] Code: 01 00 4d 85 f6 74 39 49 81 fe 00 f0 ff ff 77 30 4c 89 f7 4d 8b 5e 68 41 ba 91 76 d8 42 45 03 53 fc 74 02 0f 0b cc ff d3 65 48 <8b> 05 30 c7 ff 7e 65 4c 89 3d 28 c7 ff 7e 5b 41 5c 41 5e 41 5f c3
    [    7.312512][    C1] RSP: 0018:ffffc900000e0fd0 EFLAGS: 00010046
    [    7.312899][    C1] RAX: 0000000000000001 RBX: 0000000000000023 RCX: 0000000000000001
    [    7.313334][    C1] RDX: 00000000000003cd RSI: 0000000000000001 RDI: ffff888100d302a4
    [    7.313702][    C1] RBP: 0000000000000001 R08: 0ef439818636191f R09: b1621ff338a3b482
    [    7.314146][    C1] R10: ffffffff81e5127b R11: ffffffff81059810 R12: 0000000000000023
    [    7.314509][    C1] R13: 0000000000000000 R14: ffff888100d30200 R15: 0000000000000000
    [    7.314951][    C1] FS:  0000000000000000(0000) GS:ffff88813bc80000(0000) knlGS:0000000000000000
    [    7.315396][    C1] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [    7.315691][    C1] CR2: 000000000002b4d8 CR3: 0000000003028003 CR4: 0000000000370ef0
    [    7.316153][    C1] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    [    7.316508][    C1] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    [    7.316948][    C1] Call Trace:
    [    7.317123][    C1]  <IRQ>
    [    7.317279][    C1]  ? __die_body+0x64/0xb0
    [    7.317482][    C1]  ? page_fault_oops+0x248/0x370
    [    7.317712][    C1]  ? __wake_up+0x96/0xb0
    [    7.317964][    C1]  ? exc_page_fault+0x62/0x130
    [    7.318211][    C1]  ? asm_exc_page_fault+0x22/0x30
    [    7.318444][    C1]  ? __cfi_native_send_call_func_single_ipi+0x10/0x10
    [    7.318860][    C1]  ? default_idle+0xb/0x10
    [    7.319063][    C1]  ? __common_interrupt+0x52/0xc0
    [    7.319330][    C1]  common_interrupt+0x78/0x90
    [    7.319546][    C1]  </IRQ>
    [    7.319679][    C1]  <TASK>
    [    7.319854][    C1]  asm_common_interrupt+0x22/0x40
    [    7.320082][    C1] RIP: 0010:default_idle+0xb/0x10
    [    7.320309][    C1] Code: 4c 01 c7 4c 29 c2 e9 72 ff ff ff cc cc cc cc 90 90 90 90 90 90 90 90 90 90 90 b8 0c 67 40 a5 66 90 0f 00 2d 09 b9 3b 00 fb f4 <fa> c3 0f 1f 00 90 90 90 90 90 90 90 90 90 90 90 b8 0c 67 40 a5 e9
    [    7.321449][    C1] RSP: 0018:ffffc9000009bee8 EFLAGS: 00000256
    [    7.321808][    C1] RAX: ffff88813bca8b68 RBX: 0000000000000001 RCX: 000000000001ef0c
    [    7.322227][    C1] RDX: 0000000000000000 RSI: 0000000000000001 RDI: 000000000001ef0c
    [    7.322656][    C1] RBP: ffffc9000009bef8 R08: 8000000000000000 R09: 00000000000008c2
    [    7.323083][    C1] R10: 0000000000000000 R11: ffffffff81058e70 R12: 0000000000000000
    [    7.323530][    C1] R13: ffff8881002b30c0 R14: 0000000000000000 R15: 0000000000000000
    [    7.323948][    C1]  ? __cfi_lapic_next_deadline+0x10/0x10
    [    7.324239][    C1]  default_idle_call+0x31/0x50
    [    7.324464][    C1]  do_idle+0xd3/0x240
    [    7.324690][    C1]  cpu_startup_entry+0x25/0x30
    [    7.324983][    C1]  start_secondary+0xb4/0xc0
    [    7.325217][    C1]  secondary_startup_64_no_verify+0x179/0x17b
    [    7.325498][    C1]  </TASK>
    [    7.325641][    C1] Modules linked in:
    [    7.325906][    C1] CR2: 000000000002b4d8
    [    7.326104][    C1] ---[ end trace 0000000000000000 ]---
    [    7.326354][    C1] RIP: 0010:__common_interrupt+0x52/0xc0
    [    7.326614][    C1] Code: 01 00 4d 85 f6 74 39 49 81 fe 00 f0 ff ff 77 30 4c 89 f7 4d 8b 5e 68 41 ba 91 76 d8 42 45 03 53 fc 74 02 0f 0b cc ff d3 65 48 <8b> 05 30 c7 ff 7e 65 4c 89 3d 28 c7 ff 7e 5b 41 5c 41 5e 41 5f c3
    [    7.327570][    C1] RSP: 0018:ffffc900000e0fd0 EFLAGS: 00010046
    [    7.327910][    C1] RAX: 0000000000000001 RBX: 0000000000000023 RCX: 0000000000000001
    [    7.328273][    C1] RDX: 00000000000003cd RSI: 0000000000000001 RDI: ffff888100d302a4
    [    7.328632][    C1] RBP: 0000000000000001 R08: 0ef439818636191f R09: b1621ff338a3b482
    [    7.329223][    C1] R10: ffffffff81e5127b R11: ffffffff81059810 R12: 0000000000000023
    [    7.329780][    C1] R13: 0000000000000000 R14: ffff888100d30200 R15: 0000000000000000
    [    7.330193][    C1] FS:  0000000000000000(0000) GS:ffff88813bc80000(0000) knlGS:0000000000000000
    [    7.330632][    C1] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [    7.331050][    C1] CR2: 000000000002b4d8 CR3: 0000000003028003 CR4: 0000000000370ef0
    [    7.331454][    C1] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    [    7.331854][    C1] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    [    7.332236][    C1] Kernel panic - not syncing: Fatal exception in interrupt
    [    7.332730][    C1] Kernel Offset: disabled
    [    7.333044][    C1] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---
    
    The relevant assembly code is (from objdump, faulting address
    highlighted):
    
    ffffffff8102ed9d:       41 ff d3                  call   *%r11
    ffffffff8102eda0:       65 48 <8b> 05 30 c7 ff    mov    %gs:0x7effc730(%rip),%rax
    
    The emulation incorrectly sets the return address to be ffffffff8102ed9d
    + 0x5 = ffffffff8102eda2, which is the 8b byte in the middle of the next
    mov. This in turn causes incorrect subsequent instruction decoding and
    eventually triggers the page fault above.
    
    Instead of invoking int3_emulate_call, perform push and jmp emulation
    directly in kprobe_emulate_call_indirect. At this point we can obtain
    the instruction size from p->ainsn.size so that we can calculate the
    correct return address.
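    
    The address arithmetic can be checked with a small user-space sketch.
    The helper names are illustrative; the addresses and the 3-byte length
    of "call *%r11" (41 ff d3) are taken from the objdump output above:

```c
#include <assert.h>
#include <stdint.h>

#define EX_CALL_REL32_SIZE 5u /* length of a direct "call rel32" */
#define EX_MAX_INSN_SIZE  15u /* architectural limit for one x86 instruction */

/* Old behaviour: always assume the probed call is 5 bytes long. */
static inline uint64_t ex_ret_addr_fixed(uint64_t call_addr)
{
	return call_addr + EX_CALL_REL32_SIZE;
}

/* Fixed behaviour: use the decoded length of the probed instruction. */
static inline uint64_t ex_ret_addr_sized(uint64_t call_addr, unsigned int size)
{
	assert(size > 0 && size <= EX_MAX_INSN_SIZE);
	return call_addr + size;
}
```

    For the call at ffffffff8102ed9d, the fixed-size variant yields
    ffffffff8102eda2, inside the following mov, while the sized variant
    with a length of 3 yields ffffffff8102eda0, the start of that mov.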
    
    Link: https://lore.kernel.org/all/20240102233345.385475-1-jinghao7@illinois.edu/
    
    Fixes: 6256e668b7af ("x86/kprobes: Use int3 instead of debug trap for single-step")
    Cc: stable@vger.kernel.org
    Signed-off-by: Jinghao Jia <jinghao7@illinois.edu>
    Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>