Author: Al Viro <viro@zeniv.linux.org.uk>
Date: Sun Apr 27 15:41:51 2025 -0400
__legitimize_mnt(): check for MNT_SYNC_UMOUNT should be under mount_lock
[ Upstream commit 250cf3693060a5f803c5f1ddc082bb06b16112a9 ]
... or we risk stealing final mntput from sync umount - raising mnt_count
after umount(2) has verified that victim is not busy, but before it
has set MNT_SYNC_UMOUNT; in that case __legitimize_mnt() doesn't see
that it's safe to quietly undo mnt_count increment and leaves dropping
the reference to caller, where it'll be a full-blown mntput().
Check under mount_lock is needed; leaving the current one done before
taking that makes no sense - it's nowhere near common enough to bother
with.
Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Xiaofei Tan <tanxiaofei@huawei.com>
Date: Wed Feb 12 14:34:08 2025 +0800
ACPI: HED: Always initialize before evged
[ Upstream commit cccf6ee090c8c133072d5d5b52ae25f3bc907a16 ]
When the HED driver is built-in, it initializes after evged because they
both are at the same initcall level, so the initialization ordering
depends on the Makefile order. However, this prevents RAS records
coming in between the evged driver initialization and the HED driver
initialization from being handled.
If the number of such RAS records is above the APEI HEST error source
number, the HEST resources may be exhausted, and that may affect
subsequent RAS error reporting.
To fix this issue, change the initcall level of HED to subsys_initcall
and prevent the driver from being built as a module by changing ACPI_HED
in Kconfig from "tristate" to "bool".
Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Link: https://patch.msgid.link/20250212063408.927666-1-tanxiaofei@huawei.com
[ rjw: Changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jeremy Linton <jeremy.linton@arm.com>
Date: Wed May 7 21:30:25 2025 -0500
ACPI: PPTT: Fix processor subtable walk
commit adfab6b39202481bb43286fff94def4953793fdb upstream.
The original PPTT code had a bug where the processor subtable length
was not correctly validated when encountering a truncated
acpi_pptt_processor node.
Commit 7ab4f0e37a0f4 ("ACPI PPTT: Fix coding mistakes in a couple of
sizeof() calls") attempted to fix this by validating the size is as
large as the acpi_pptt_processor node structure. This introduced a
regression where the last processor node in the PPTT table is ignored
if it doesn't contain any private resources. That results errors like:
ACPI PPTT: PPTT table found, but unable to locate core XX (XX)
ACPI: SPE must be homogeneous
Furthermore, it fails in a common case where the node length isn't
equal to the acpi_pptt_processor structure size, leaving the original
bug in a modified form.
Correct the regression by adjusting the loop termination conditions as
suggested by the bug reporters. An additional check performed after
the subtable node type is detected, validates the acpi_pptt_processor
node is fully contained in the PPTT table. Repeating the check in
acpi_pptt_leaf_node() is largely redundant as the node is already
known to be fully contained in the table.
The case where a final truncated node's parent property is accepted,
but the node itself is rejected should not be considered a bug.
Fixes: 7ab4f0e37a0f4 ("ACPI PPTT: Fix coding mistakes in a couple of sizeof() calls")
Reported-by: Maximilian Heyne <mheyne@amazon.de>
Closes: https://lore.kernel.org/linux-acpi/20250506-draco-taped-15f475cd@mheyne-amazon/
Reported-by: Yicong Yang <yangyicong@hisilicon.com>
Closes: https://lore.kernel.org/linux-acpi/20250507035124.28071-1-yangyicong@huawei.com/
Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
Tested-by: Yicong Yang <yangyicong@hisilicon.com>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Tested-by: Maximilian Heyne <mheyne@amazon.de>
Cc: All applicable <stable@vger.kernel.org> # 7ab4f0e37a0f4: ACPI PPTT: Fix coding mistakes ...
Link: https://patch.msgid.link/20250508023025.1301030-1-jeremy.linton@arm.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Wentao Liang <vulab@iscas.ac.cn>
Date: Wed May 14 17:24:44 2025 +0800
ALSA: es1968: Add error handling for snd_pcm_hw_constraint_pow2()
commit 9e000f1b7f31684cc5927e034360b87ac7919593 upstream.
The function snd_es1968_capture_open() calls the function
snd_pcm_hw_constraint_pow2(), but does not check its return
value. A proper implementation can be found in snd_cx25821_pcm_open().
Add error handling for snd_pcm_hw_constraint_pow2() and propagate its
error code.
Fixes: b942cf815b57 ("[ALSA] es1968 - Fix stuttering capture")
Cc: stable@vger.kernel.org # v2.6.22
Signed-off-by: Wentao Liang <vulab@iscas.ac.cn>
Link: https://patch.msgid.link/20250514092444.331-1-vulab@iscas.ac.cn
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Takashi Iwai <tiwai@suse.de>
Date: Sun Apr 27 10:10:34 2025 +0200
ALSA: hda/realtek: Add quirk for HP Spectre x360 15-df1xxx
[ Upstream commit be0c40da888840fe91b45474cb70779e6cbaf7ca ]
HP Spectre x360 15-df1xxx with SSID 13c:863e requires similar
workarounds that were applied to another HP Spectre x360 models;
it has a mute LED only, no micmute LEDs, and needs the speaker GPIO
seup.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=220054
Link: https://patch.msgid.link/20250427081035.11567-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Takashi Iwai <tiwai@suse.de>
Date: Fri May 16 10:08:16 2025 +0200
ALSA: pcm: Fix race of buffer access at PCM OSS layer
commit 93a81ca0657758b607c3f4ba889ae806be9beb73 upstream.
The PCM OSS layer tries to clear the buffer with the silence data at
initialization (or reconfiguration) of a stream with the explicit call
of snd_pcm_format_set_silence() with runtime->dma_area. But this may
lead to a UAF because the accessed runtime->dma_area might be freed
concurrently, as it's performed outside the PCM ops.
For avoiding it, move the code into the PCM core and perform it inside
the buffer access lock, so that it won't be changed during the
operation.
Reported-by: syzbot+32d4647f551007595173@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/68164d8e.050a0220.11da1b.0019.GAE@google.com
Cc: <stable@vger.kernel.org>
Link: https://patch.msgid.link/20250516080817.20068-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Geert Uytterhoeven <geert+renesas@glider.be>
Date: Tue May 13 09:31:04 2025 +0200
ALSA: sh: SND_AICA should depend on SH_DMA_API
[ Upstream commit 66e48ef6ef506c89ec1b3851c6f9f5f80b5835ff ]
If CONFIG_SH_DMA_API=n:
WARNING: unmet direct dependencies detected for G2_DMA
Depends on [n]: SH_DREAMCAST [=y] && SH_DMA_API [=n]
Selected by [y]:
- SND_AICA [=y] && SOUND [=y] && SND [=y] && SND_SUPERH [=y] && SH_DREAMCAST [=y]
SND_AICA selects G2_DMA. As the latter depends on SH_DMA_API, the
former should depend on SH_DMA_API, too.
Fixes: f477a538c14d07f8 ("sh: dma: fix kconfig dependency for G2_DMA")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202505131320.PzgTtl9H-lkp@intel.com/
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://patch.msgid.link/b90625f8a9078d0d304bafe862cbe3a3fab40082.1747121335.git.geert+renesas@glider.be
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Joachim Priesner <joachim.priesner@web.de>
Date: Mon Apr 28 07:36:06 2025 +0200
ALSA: usb-audio: Add second USB ID for Jabra Evolve 65 headset
commit 1149719442d28c96dc63cad432b5a6db7c300e1a upstream.
There seem to be multiple USB device IDs used for these;
the one I have reports as 0b0e:030c when powered on.
(When powered off, it reports as 0b0e:0311.)
Signed-off-by: Joachim Priesner <joachim.priesner@web.de>
Cc: <stable@vger.kernel.org>
Link: https://patch.msgid.link/20250428053606.9237-1-joachim.priesner@web.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Vishal Badole <Vishal.Badole@amd.com>
Date: Thu Apr 24 18:32:48 2025 +0530
amd-xgbe: Fix to ensure dependent features are toggled with RX checksum offload
commit f04dd30f1bef1ed2e74a4050af6e5e5e3869bac3 upstream.
According to the XGMAC specification, enabling features such as Layer 3
and Layer 4 Packet Filtering, Split Header and Virtualized Network support
automatically selects the IPC Full Checksum Offload Engine on the receive
side.
When RX checksum offload is disabled, these dependent features must also
be disabled to prevent abnormal behavior caused by mismatched feature
dependencies.
Ensure that toggling RX checksum offload (disabling or enabling) properly
disables or enables all dependent features, maintaining consistent and
expected behavior in the network device.
Cc: stable@vger.kernel.org
Fixes: 1a510ccf5869 ("amd-xgbe: Add support for VXLAN offload capabilities")
Signed-off-by: Vishal Badole <Vishal.Badole@amd.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250424130248.428865-1-Vishal.Badole@amd.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Ryan Roberts <ryan.roberts@arm.com>
Date: Fri Feb 21 10:12:25 2025 +0530
arm64/mm: Check PUD_TYPE_TABLE in pud_bad()
[ Upstream commit bfb1d2b9021c21891427acc86eb848ccedeb274e ]
pud_bad() is currently defined in terms of pud_table(). Although for some
configs, pud_table() is hard-coded to true i.e. when using 64K base pages
or when page table levels are less than 3.
pud_bad() is intended to check that the pud is configured correctly. Hence
let's open-code the same check that the full version of pud_table() uses
into pud_bad(). Then it always performs the check regardless of the config.
Cc: Will Deacon <will@kernel.org>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Link: https://lore.kernel.org/r/20250221044227.1145393-7-anshuman.khandual@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Diogo Ivo <diogo.ivo@tecnico.ulisboa.pt>
Date: Mon Feb 24 12:17:36 2025 +0000
arm64: tegra: p2597: Fix gpio for vdd-1v8-dis regulator
[ Upstream commit f34621f31e3be81456c903287f7e4c0609829e29 ]
According to the board schematics the enable pin of this regulator is
connected to gpio line #9 of the first instance of the TCA9539
GPIO expander, so adjust it.
Signed-off-by: Diogo Ivo <diogo.ivo@tecnico.ulisboa.pt>
Link: https://lore.kernel.org/r/20250224-diogo-gpio_exp-v1-1-80fb84ac48c6@tecnico.ulisboa.pt
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Svyatoslav Ryhel <clamor95@gmail.com>
Date: Wed Feb 26 12:56:11 2025 +0200
ARM: tegra: Switch DSI-B clock parent to PLLD on Tegra114
[ Upstream commit 2b3db788f2f614b875b257cdb079adadedc060f3 ]
PLLD is usually used as parent clock for internal video devices, like
DSI for example, while PLLD2 is used as parent for HDMI.
Signed-off-by: Svyatoslav Ryhel <clamor95@gmail.com>
Link: https://lore.kernel.org/r/20250226105615.61087-3-clamor95@gmail.com
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Takashi Iwai <tiwai@suse.de>
Date: Sun Apr 20 10:56:59 2025 +0200
ASoC: Intel: bytcr_rt5640: Add DMI quirk for Acer Aspire SW3-013
[ Upstream commit a549b927ea3f5e50b1394209b64e6e17e31d4db8 ]
Acer Aspire SW3-013 requires the very same quirk as other Acer Aspire
model for making it working.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=220011
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Link: https://patch.msgid.link/20250420085716.12095-1-tiwai@suse.de
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Martin Povišer <povik+lin@cutebit.org>
Date: Sat Feb 8 00:57:22 2025 +0000
ASoC: ops: Enforce platform maximum on initial value
[ Upstream commit 783db6851c1821d8b983ffb12b99c279ff64f2ee ]
Lower the volume if it is violating the platform maximum at its initial
value (i.e. at the time of the 'snd_soc_limit_volume' call).
Signed-off-by: Martin Povišer <povik+lin@cutebit.org>
[Cherry picked from the Asahi kernel with fixups -- broonie]
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://patch.msgid.link/20250208-asoc-volume-limit-v1-1-b98fcf4cdbad@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Dmitry Baryshkov <lumag@kernel.org>
Date: Sat Mar 27 12:28:57 2021 +0300
ASoC: q6afe-clocks: fix reprobing of the driver
commit 96fadf7e8ff49fdb74754801228942b67c3eeebd upstream.
Q6afe-clocks driver can get reprobed. For example if the APR services
are restarted after the firmware crash. However currently Q6afe-clocks
driver will oops because hw.init will get cleared during first _probe
call. Rewrite the driver to fill the clock data at runtime rather than
using big static array of clocks.
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Reviewed-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Reviewed-by: Stephen Boyd <sboyd@kernel.org>
Fixes: 520a1c396d19 ("ASoC: q6afe-clocks: add q6afe clock controller")
Link: https://lore.kernel.org/r/20210327092857.3073879-1-dmitry.baryshkov@linaro.org
Signed-off-by: Mark Brown <broonie@kernel.org>
[Minor context change fixed]
Signed-off-by: Feng Liu <Feng.Liu3@windriver.com>
Signed-off-by: He Zhe <Zhe.He@windriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Date: Wed Feb 12 02:24:38 2025 +0000
ASoC: soc-dai: check return value at snd_soc_dai_set_tdm_slot()
[ Upstream commit 7f1186a8d738661b941b298fd6d1d5725ed71428 ]
snd_soc_dai_set_tdm_slot() calls .xlate_tdm_slot_mask() or
snd_soc_xlate_tdm_slot_mask(), but didn't check its return value.
Let's check it.
This patch might break existing driver. In such case, let's makes
each func to void instead of int.
Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Link: https://patch.msgid.link/87o6z7yk61.wl-kuninori.morimoto.gx@renesas.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Hector Martin <marcan@marcan.st>
Date: Sat Feb 8 01:03:24 2025 +0000
ASoC: tas2764: Power up/down amp on mute ops
[ Upstream commit 1c3b5f37409682184669457a5bdf761268eafbe5 ]
The ASoC convention is that clocks are removed after codec mute, and
power up/down is more about top level power management. For these chips,
the "mute" state still expects a TDM clock, and yanking the clock in
this state will trigger clock errors. So, do the full
shutdown<->mute<->active transition on the mute operation, so the amp is
in software shutdown by the time the clocks are removed.
This fixes TDM clock errors when streams are stopped.
Signed-off-by: Hector Martin <marcan@marcan.st>
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://patch.msgid.link/20250208-asoc-tas2764-v1-1-dbab892a69b5@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Michael Chan <michael.chan@broadcom.com>
Date: Mon Apr 28 15:59:03 2025 -0700
bnxt_en: Fix ethtool -d byte order for 32-bit values
[ Upstream commit 02e8be5a032cae0f4ca33c6053c44d83cf4acc93 ]
For version 1 register dump that includes the PCIe stats, the existing
code incorrectly assumes that all PCIe stats are 64-bit values. Fix it
by using an array containing the starting and ending index of the 32-bit
values. The loop in bnxt_get_regs() will use the array to do proper
endian swap for the 32-bit values.
Fixes: b5d600b027eb ("bnxt_en: Add support for 'ethtool -d'")
Reviewed-by: Shruti Parab <shruti.parab@broadcom.com>
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Hangbin Liu <liuhangbin@gmail.com>
Date: Tue Feb 25 03:39:14 2025 +0000
bonding: report duplicate MAC address in all situations
[ Upstream commit 28d68d396a1cd21591e8c6d74afbde33a7ea107e ]
Normally, a bond uses the MAC address of the first added slave as the bond’s
MAC address. And the bond will set active slave’s MAC address to bond’s
address if fail_over_mac is set to none (0) or follow (2).
When the first slave is removed, the bond will still use the removed slave’s
MAC address, which can lead to a duplicate MAC address and potentially cause
issues with the switch. To avoid confusion, let's warn the user in all
situations, including when fail_over_mac is set to 2 or not in active-backup
mode.
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://patch.msgid.link/20250225033914.18617-1-liuhangbin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Viktor Malik <vmalik@redhat.com>
Date: Wed Jan 29 08:18:57 2025 +0100
bpftool: Fix readlink usage in get_fd_type
[ Upstream commit 0053f7d39d491b6138d7c526876d13885cbb65f1 ]
The `readlink(path, buf, sizeof(buf))` call reads at most sizeof(buf)
bytes and *does not* append null-terminator to buf. With respect to
that, fix two pieces in get_fd_type:
1. Change the truncation check to contain sizeof(buf) rather than
sizeof(path).
2. Append null-terminator to buf.
Reported by Coverity.
Signed-off-by: Viktor Malik <vmalik@redhat.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Quentin Monnet <qmo@kernel.org>
Link: https://lore.kernel.org/bpf/20250129071857.75182-1-vmalik@redhat.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ido Schimmel <idosch@nvidia.com>
Date: Thu May 15 11:48:48 2025 +0300
bridge: netfilter: Fix forwarding of fragmented packets
[ Upstream commit 91b6dbced0ef1d680afdd69b14fc83d50ebafaf3 ]
When netfilter defrag hooks are loaded (due to the presence of conntrack
rules, for example), fragmented packets entering the bridge will be
defragged by the bridge's pre-routing hook (br_nf_pre_routing() ->
ipv4_conntrack_defrag()).
Later on, in the bridge's post-routing hook, the defragged packet will
be fragmented again. If the size of the largest fragment is larger than
what the kernel has determined as the destination MTU (using
ip_skb_dst_mtu()), the defragged packet will be dropped.
Before commit ac6627a28dbf ("net: ipv4: Consolidate ipv4_mtu and
ip_dst_mtu_maybe_forward"), ip_skb_dst_mtu() would return dst_mtu() as
the destination MTU. Assuming the dst entry attached to the packet is
the bridge's fake rtable one, this would simply be the bridge's MTU (see
fake_mtu()).
However, after above mentioned commit, ip_skb_dst_mtu() ends up
returning the route's MTU stored in the dst entry's metrics. Ideally, in
case the dst entry is the bridge's fake rtable one, this should be the
bridge's MTU as the bridge takes care of updating this metric when its
MTU changes (see br_change_mtu()).
Unfortunately, the last operation is a no-op given the metrics attached
to the fake rtable entry are marked as read-only. Therefore,
ip_skb_dst_mtu() ends up returning 1500 (the initial MTU value) and
defragged packets are dropped during fragmentation when dealing with
large fragments and high MTU (e.g., 9k).
Fix by moving the fake rtable entry's metrics to be per-bridge (in a
similar fashion to the fake rtable entry itself) and marking them as
writable, thereby allowing MTU changes to be reflected.
Fixes: 62fa8a846d7d ("net: Implement read-only protection and COW'ing of metrics.")
Fixes: 33eb9873a283 ("bridge: initialize fake_rtable metrics")
Reported-by: Venkat Venkatsubra <venkat.x.venkatsubra@oracle.com>
Closes: https://lore.kernel.org/netdev/PH0PR10MB4504888284FF4CBA648197D0ACB82@PH0PR10MB4504.namprd10.prod.outlook.com/
Tested-by: Venkat Venkatsubra <venkat.x.venkatsubra@oracle.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://patch.msgid.link/20250515084848.727706-1-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Mark Harmstone <maharmstone@fb.com>
Date: Thu Mar 6 10:58:46 2025 +0000
btrfs: avoid linker error in btrfs_find_create_tree_block()
[ Upstream commit 7ef3cbf17d2734ca66c4ed8573be45f4e461e7ee ]
The inline function btrfs_is_testing() is hardcoded to return 0 if
CONFIG_BTRFS_FS_RUN_SANITY_TESTS is not set. Currently we're relying on
the compiler optimizing out the call to alloc_test_extent_buffer() in
btrfs_find_create_tree_block(), as it's not been defined (it's behind an
#ifdef).
Add a stub version of alloc_test_extent_buffer() to avoid linker errors
on non-standard optimization levels. This problem was seen on GCC 14
with -O0 and is helps to see symbols that would be otherwise optimized
out.
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Mark Harmstone <maharmstone@fb.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Goldwyn Rodrigues <rgoldwyn@suse.de>
Date: Fri Apr 25 09:25:06 2025 -0400
btrfs: correct the order of prelim_ref arguments in btrfs__prelim_ref
[ Upstream commit bc7e0975093567f51be8e1bdf4aa5900a3cf0b1e ]
btrfs_prelim_ref() calls the old and new reference variables in the
incorrect order. This causes a NULL pointer dereference because oldref
is passed as NULL to trace_btrfs_prelim_ref_insert().
Note, trace_btrfs_prelim_ref_insert() is being called with newref as
oldref (and oldref as NULL) on purpose in order to print out
the values of newref.
To reproduce:
echo 1 > /sys/kernel/debug/tracing/events/btrfs/btrfs_prelim_ref_insert/enable
Perform some writeback operations.
Backtrace:
BUG: kernel NULL pointer dereference, address: 0000000000000018
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 115949067 P4D 115949067 PUD 11594a067 PMD 0
Oops: Oops: 0000 [#1] SMP NOPTI
CPU: 1 UID: 0 PID: 1188 Comm: fsstress Not tainted 6.15.0-rc2-tester+ #47 PREEMPT(voluntary) 7ca2cef72d5e9c600f0c7718adb6462de8149622
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.3-2-gc13ff2cd-prebuilt.qemu.org 04/01/2014
RIP: 0010:trace_event_raw_event_btrfs__prelim_ref+0x72/0x130
Code: e8 43 81 9f ff 48 85 c0 74 78 4d 85 e4 0f 84 8f 00 00 00 49 8b 94 24 c0 06 00 00 48 8b 0a 48 89 48 08 48 8b 52 08 48 89 50 10 <49> 8b 55 18 48 89 50 18 49 8b 55 20 48 89 50 20 41 0f b6 55 28 88
RSP: 0018:ffffce44820077a0 EFLAGS: 00010286
RAX: ffff8c6b403f9014 RBX: ffff8c6b55825730 RCX: 304994edf9cf506b
RDX: d8b11eb7f0fdb699 RSI: ffff8c6b403f9010 RDI: ffff8c6b403f9010
RBP: 0000000000000001 R08: 0000000000000001 R09: 0000000000000010
R10: 00000000ffffffff R11: 0000000000000000 R12: ffff8c6b4e8fb000
R13: 0000000000000000 R14: ffffce44820077a8 R15: ffff8c6b4abd1540
FS: 00007f4dc6813740(0000) GS:ffff8c6c1d378000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000018 CR3: 000000010eb42000 CR4: 0000000000750ef0
PKRU: 55555554
Call Trace:
<TASK>
prelim_ref_insert+0x1c1/0x270
find_parent_nodes+0x12a6/0x1ee0
? __entry_text_end+0x101f06/0x101f09
? srso_alias_return_thunk+0x5/0xfbef5
? srso_alias_return_thunk+0x5/0xfbef5
? srso_alias_return_thunk+0x5/0xfbef5
? srso_alias_return_thunk+0x5/0xfbef5
btrfs_is_data_extent_shared+0x167/0x640
? fiemap_process_hole+0xd0/0x2c0
extent_fiemap+0xa5c/0xbc0
? __entry_text_end+0x101f05/0x101f09
btrfs_fiemap+0x7e/0xd0
do_vfs_ioctl+0x425/0x9d0
__x64_sys_ioctl+0x75/0xc0
Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Filipe Manana <fdmanana@suse.com>
Date: Tue Jun 18 12:15:01 2024 +0100
btrfs: don't BUG_ON() when 0 reference count at btrfs_lookup_extent_info()
commit 28cb13f29faf6290597b24b728dc3100c019356f upstream.
Instead of doing a BUG_ON() handle the error by returning -EUCLEAN,
aborting the transaction and logging an error message.
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
[Minor conflict resolved due to code context change.]
Signed-off-by: Jianqi Ren <jianqi.ren.cn@windriver.com>
Signed-off-by: He Zhe <zhe.he@windriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Filipe Manana <fdmanana@suse.com>
Date: Wed Feb 5 13:09:25 2025 +0000
btrfs: send: return -ENAMETOOLONG when attempting a path that is too long
[ Upstream commit a77749b3e21813566cea050bbb3414ae74562eba ]
When attempting to build a too long path we are currently returning
-ENOMEM, which is very odd and misleading. So update fs_path_ensure_buf()
to return -ENAMETOOLONG instead. Also, while at it, move the WARN_ON()
into the if statement's expression, as it makes it clear what is being
tested and also has the effect of adding 'unlikely' to the statement,
which allows the compiler to generate better code as this condition is
never expected to happen.
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Oliver Hartkopp <socketcan@hartkopp.net>
Date: Mon May 19 14:50:26 2025 +0200
can: bcm: add locking for bcm_op runtime updates
commit c2aba69d0c36a496ab4f2e81e9c2b271f2693fd7 upstream.
The CAN broadcast manager (CAN BCM) can send a sequence of CAN frames via
hrtimer. The content and also the length of the sequence can be changed
resp reduced at runtime where the 'currframe' counter is then set to zero.
Although this appeared to be a safe operation the updates of 'currframe'
can be triggered from user space and hrtimer context in bcm_can_tx().
Anderson Nascimento created a proof of concept that triggered a KASAN
slab-out-of-bounds read access which can be prevented with a spin_lock_bh.
At the rework of bcm_can_tx() the 'count' variable has been moved into
the protected section as this variable can be modified from both contexts
too.
Fixes: ffd980f976e7 ("[CAN]: Add broadcast manager (bcm) protocol")
Reported-by: Anderson Nascimento <anderson@allelesecurity.com>
Tested-by: Anderson Nascimento <anderson@allelesecurity.com>
Reviewed-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Link: https://patch.msgid.link/20250519125027.11900-1-socketcan@hartkopp.net
Cc: stable@vger.kernel.org
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Oliver Hartkopp <socketcan@hartkopp.net>
Date: Mon May 19 14:50:27 2025 +0200
can: bcm: add missing rcu read protection for procfs content
commit dac5e6249159ac255dad9781793dbe5908ac9ddb upstream.
When the procfs content is generated for a bcm_op which is in the process
to be removed the procfs output might show unreliable data (UAF).
As the removal of bcm_op's is already implemented with rcu handling this
patch adds the missing rcu_read_lock() and makes sure the list entries
are properly removed under rcu protection.
Fixes: f1b4e32aca08 ("can: bcm: use call_rcu() instead of costly synchronize_rcu()")
Reported-by: Anderson Nascimento <anderson@allelesecurity.com>
Suggested-by: Anderson Nascimento <anderson@allelesecurity.com>
Tested-by: Anderson Nascimento <anderson@allelesecurity.com>
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Link: https://patch.msgid.link/20250519125027.11900-2-socketcan@hartkopp.net
Cc: stable@vger.kernel.org # >= 5.4
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Date: Wed Feb 12 21:23:14 2025 +0100
can: c_can: Use of_property_present() to test existence of DT property
[ Upstream commit ab1bc2290fd8311d49b87c29f1eb123fcb581bee ]
of_property_read_bool() should be used only on boolean properties.
Cc: Rob Herring <robh@kernel.org>
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Reviewed-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Link: https://patch.msgid.link/20250212-syscon-phandle-args-can-v2-3-ac9a1253396b@linaro.org
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Oliver Hartkopp <socketcan@hartkopp.net>
Date: Tue Apr 29 09:05:55 2025 +0200
can: gw: fix RCU/BH usage in cgw_create_job()
[ Upstream commit 511e64e13d8cc72853275832e3f372607466c18c ]
As reported by Sebastian Andrzej Siewior the use of local_bh_disable()
is only feasible in uni processor systems to update the modification rules.
The usual use-case to update the modification rules is to update the data
of the modifications but not the modification types (AND/OR/XOR/SET) or
the checksum functions itself.
To omit additional memory allocations to maintain fast modification
switching times, the modification description space is doubled at gw-job
creation time so that only the reference to the active modification
description is changed under rcu protection.
Rename cgw_job::mod to cf_mod and make it a RCU pointer. Allocate in
cgw_create_job() and free it together with cgw_job in
cgw_job_free_rcu(). Update all users to dereference cgw_job::cf_mod with
a RCU accessor and if possible once.
[bigeasy: Replace mod1/mod2 from the Oliver's original patch with dynamic
allocation, use RCU annotation and accessor]
Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Closes: https://lore.kernel.org/linux-can/20231031112349.y0aLoBrz@linutronix.de/
Fixes: dd895d7f21b2 ("can: cangw: introduce optional uid to reference created routing jobs")
Tested-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://patch.msgid.link/20250429070555.cs-7b_eZ@linutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Eric Dumazet <edumazet@google.com>
Date: Mon Feb 7 11:07:06 2022 -0800
can: gw: use call_rcu() instead of costly synchronize_rcu()
[ Upstream commit 181d4447905d551cc664f1e7e796b482c1eec992 ]
Commit fb8696ab14ad ("can: gw: synchronize rcu operations
before removing gw job entry") added three synchronize_rcu() calls
to make sure one rcu grace period was observed before freeing
a "struct cgw_job" (which are tiny objects).
This should be converted to call_rcu() to avoid adding delays
in device / network dismantles.
Use the rcu_head that was already in struct cgw_job,
not yet used.
Link: https://lore.kernel.org/all/20220207190706.1499190-1-eric.dumazet@gmail.com
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Oliver Hartkopp <socketcan@hartkopp.net>
Tested-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Stable-dep-of: 511e64e13d8c ("can: gw: fix RCU/BH usage in cgw_create_job()")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Marc Kleine-Budde <mkl@pengutronix.de>
Date: Fri May 2 16:13:44 2025 +0200
can: mcp251xfd: mcp251xfd_remove(): fix order of unregistration calls
commit 84f5eb833f53ae192baed4cfb8d9eaab43481fc9 upstream.
If a driver is removed, the driver framework invokes the driver's
remove callback. A CAN driver's remove function calls
unregister_candev(), which calls net_device_ops::ndo_stop further down
in the call stack for interfaces which are in the "up" state.
With the mcp251xfd driver the removal of the module causes the
following warning:
| WARNING: CPU: 0 PID: 352 at net/core/dev.c:7342 __netif_napi_del_locked+0xc8/0xd8
as can_rx_offload_del() deletes the NAPI, while it is still active,
because the interface is still up.
To fix the warning, first unregister the network interface, which
calls net_device_ops::ndo_stop, which disables the NAPI, and then call
can_rx_offload_del().
Fixes: 55e5b97f003e ("can: mcp25xxfd: add driver for Microchip MCP25xxFD SPI CAN")
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20250502-can-rx-offload-del-v1-1-59a9b131589d@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: gaoxu <gaoxu2@honor.com>
Date: Thu Apr 17 07:30:00 2025 +0000
cgroup: Fix compilation issue due to cgroup_mutex not being exported
[ Upstream commit 87c259a7a359e73e6c52c68fcbec79988999b4e6 ]
When adding folio_memcg function call in the zram module for
Android16-6.12, the following error occurs during compilation:
ERROR: modpost: "cgroup_mutex" [../soc-repo/zram.ko] undefined!
This error is caused by the indirect call to lockdep_is_held(&cgroup_mutex)
within folio_memcg. The export setting for cgroup_mutex is controlled by
the CONFIG_PROVE_RCU macro. If CONFIG_LOCKDEP is enabled while
CONFIG_PROVE_RCU is not, this compilation error will occur.
To resolve this issue, add a parallel macro CONFIG_LOCKDEP control to
ensure cgroup_mutex is properly exported when needed.
Signed-off-by: gao xu <gaoxu2@honor.com>
Acked-by: Michal Koutný <mkoutny@suse.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ahmad Fatoum <a.fatoum@pengutronix.de>
Date: Tue Feb 18 19:26:46 2025 +0100
clk: imx8mp: inform CCF of maximum frequency of clocks
[ Upstream commit 06a61b5cb6a8638fa8823cd09b17233b29696fa2 ]
The IMX8MPCEC datasheet lists maximum frequencies allowed for different
modules. Some of these limits are universal, but some depend on
whether the SoC is operating in nominal or in overdrive mode.
The imx8mp.dtsi currently assumes overdrive mode and configures some
clocks in accordance with this. Boards wishing to make use of nominal
mode will need to override some of the clock rates manually.
As operating the clocks outside of their allowed range can lead to
difficult to debug issues, it makes sense to register the maximum rates
allowed in the driver, so the CCF can take them into account.
Reviewed-by: Peng Fan <peng.fan@nxp.com>
Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
Link: https://lore.kernel.org/r/20250218-imx8m-clk-v4-6-b7697dc2dcd0@pengutronix.de
Signed-off-by: Abel Vesa <abel.vesa@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Date: Fri Apr 4 15:31:16 2025 +0200
clocksource/i8253: Use raw_spinlock_irqsave() in clockevent_i8253_disable()
commit 94cff94634e506a4a44684bee1875d2dbf782722 upstream.
On x86 during boot, clockevent_i8253_disable() can be invoked via
x86_late_time_init -> hpet_time_init() -> pit_timer_init() which happens
with enabled interrupts.
If some of the old i8253 hardware is actually used then lockdep will notice
that i8253_lock is used in hard interrupt context. This causes lockdep to
complain because it observed the lock being acquired with interrupts
enabled and in hard interrupt context.
Make clockevent_i8253_disable() acquire the lock with
raw_spinlock_irqsave() to cure this.
[ tglx: Massage change log and use guard() ]
Fixes: c8c4076723dac ("x86/timer: Skip PIT initialization on modern chipsets")
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/all/20250404133116.p-XRWJXf@linutronix.de
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Paul Burton <paulburton@kernel.org>
Date: Wed Jan 29 13:32:47 2025 +0100
clocksource: mips-gic-timer: Enable counter when CPUs start
[ Upstream commit 3128b0a2e0cf6e07aa78e5f8cf7dd9cd59dc8174 ]
In multi-cluster MIPS I6500 systems there is a GIC in each cluster,
each with its own counter. When a cluster powers up the counter will
be stopped, with the COUNTSTOP bit set in the GIC_CONFIG register.
In single cluster systems, it has been fine to clear COUNTSTOP once
in gic_clocksource_of_init() to start the counter. In multi-cluster
systems, this will only have started the counter in the boot cluster,
and any CPUs in other clusters will find their counter stopped which
will break the GIC clock_event_device.
Resolve this by having CPUs clear the COUNTSTOP bit when they come
online, using the existing gic_starting_cpu() CPU hotplug callback. This
will allow CPUs in secondary clusters to ensure that the cluster's GIC
counter is running as expected.
Signed-off-by: Paul Burton <paulburton@kernel.org>
Signed-off-by: Chao-ying Fu <cfu@wavecomp.com>
Signed-off-by: Dragan Mladjenovic <dragan.mladjenovic@syrmia.com>
Signed-off-by: Aleksandar Rikalo <arikalo@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Serge Semin <fancer.lancer@gmail.com>
Tested-by: Gregory CLEMENT <gregory.clement@bootlin.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Christian Brauner <brauner@kernel.org>
Date: Mon Apr 14 15:55:06 2025 +0200
coredump: fix error handling for replace_fd()
commit 95c5f43181fe9c1b5e5a4bd3281c857a5259991f upstream.
The replace_fd() helper returns the file descriptor number on success
and a negative error code on failure. The current error handling in
umh_pipe_setup() only works because the file descriptor that is replaced
is zero but that's pretty volatile. Explicitly check for a negative
error code.
Link: https://lore.kernel.org/20250414-work-coredump-v2-2-685bf231f828@kernel.org
Tested-by: Luca Boccassi <luca.boccassi@gmail.com>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Christian Brauner <brauner@kernel.org>
Date: Mon Jun 2 13:16:07 2025 +0200
coredump: hand a pidfd to the usermode coredump helper
commit b5325b2a270fcaf7b2a9a0f23d422ca8a5a8bdea upstream.
Give userspace a way to instruct the kernel to install a pidfd into the
usermode helper process. This makes coredump handling a lot more
reliable for userspace. In parallel with this commit we already have
systemd adding support for this in [1].
We create a pidfs file for the coredumping process when we process the
corename pattern. When the usermode helper process is forked we then
install the pidfs file as file descriptor three into the usermode
helpers file descriptor table so it's available to the exec'd program.
Since usermode helpers are either children of the system_unbound_wq
workqueue or kthreadd we know that the file descriptor table is empty
and can thus always use three as the file descriptor number.
Note, that we'll install a pidfd for the thread-group leader even if a
subthread is calling do_coredump(). We know that task linkage hasn't
been removed due to delay_group_leader() and even if this @current isn't
the actual thread-group leader we know that the thread-group leader
cannot be reaped until @current has exited.
[brauner: This is a backport for the v5.10 series. Upstream has
significantly changed and backporting all that infra is a non-starter.
So simply backport the pidfd_prepare() helper and waste the file
descriptor we allocated. Then we minimally massage the umh coredump
setup code.]
Link: https://github.com/systemd/systemd/pull/37125 [1]
Link: https://lore.kernel.org/20250414-work-coredump-v2-3-685bf231f828@kernel.org
Tested-by: Luca Boccassi <luca.boccassi@gmail.com>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Date: Thu Feb 6 15:29:05 2025 +0100
cpuidle: menu: Avoid discarding useful information
[ Upstream commit 85975daeaa4d6ec560bfcd354fc9c08ad7f38888 ]
When giving up on making a high-confidence prediction,
get_typical_interval() always returns UINT_MAX which means that the
next idle interval prediction will be based entirely on the time till
the next timer. However, the information represented by the most
recent intervals may not be completely useless in those cases.
Namely, the largest recent idle interval is an upper bound on the
recently observed idle duration, so it is reasonable to assume that
the next idle duration is unlikely to exceed it. Moreover, this is
still true after eliminating the suspected outliers if the sample
set still under consideration is at least as large as 50% of the
maximum sample set size.
Accordingly, make get_typical_interval() return the current maximum
recent interval value in that case instead of UINT_MAX.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reported-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Tested-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Reviewed-by: Christian Loehle <christian.loehle@arm.com>
Tested-by: Christian Loehle <christian.loehle@arm.com>
Tested-by: Aboorva Devarajan <aboorvad@linux.ibm.com>
Link: https://patch.msgid.link/7770672.EvYhyI6sBW@rjwysocki.net
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ivan Pravdin <ipravdin.official@gmail.com>
Date: Sun May 18 18:41:02 2025 -0400
crypto: algif_hash - fix double free in hash_accept
commit b2df03ed4052e97126267e8c13ad4204ea6ba9b6 upstream.
If accept(2) is called on socket type algif_hash with
MSG_MORE flag set and crypto_ahash_import fails,
sk2 is freed. However, it is also freed in af_alg_release,
leading to slab-use-after-free error.
Fixes: fe869cdb89c9 ("crypto: algif_hash - User-space interface for hash operations")
Cc: <stable@vger.kernel.org>
Signed-off-by: Ivan Pravdin <ipravdin.official@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Ming-Hung Tsai <mtsai@redhat.com>
Date: Thu Mar 6 16:41:50 2025 +0800
dm cache: prevent BUG_ON by blocking retries on failed device resumes
[ Upstream commit 5da692e2262b8f81993baa9592f57d12c2703dea ]
A cache device failing to resume due to mapping errors should not be
retried, as the failure leaves a partially initialized policy object.
Repeating the resume operation risks triggering BUG_ON when reloading
cache mappings into the incomplete policy object.
Reproduce steps:
1. create a cache metadata consisting of 512 or more cache blocks,
with some mappings stored in the first array block of the mapping
array. Here we use cache_restore v1.0 to build the metadata.
cat <<EOF >> cmeta.xml
<superblock uuid="" block_size="128" nr_cache_blocks="512" \
policy="smq" hint_width="4">
<mappings>
<mapping cache_block="0" origin_block="0" dirty="false"/>
</mappings>
</superblock>
EOF
dmsetup create cmeta --table "0 8192 linear /dev/sdc 0"
cache_restore -i cmeta.xml -o /dev/mapper/cmeta --metadata-version=2
dmsetup remove cmeta
2. wipe the second array block of the mapping array to simulate
data degradations.
mapping_root=$(dd if=/dev/sdc bs=1c count=8 skip=192 \
2>/dev/null | hexdump -e '1/8 "%u\n"')
ablock=$(dd if=/dev/sdc bs=1c count=8 skip=$((4096*mapping_root+2056)) \
2>/dev/null | hexdump -e '1/8 "%u\n"')
dd if=/dev/zero of=/dev/sdc bs=4k count=1 seek=$ablock
3. try bringing up the cache device. The resume is expected to fail
due to the broken array block.
dmsetup create cmeta --table "0 8192 linear /dev/sdc 0"
dmsetup create cdata --table "0 65536 linear /dev/sdc 8192"
dmsetup create corig --table "0 524288 linear /dev/sdc 262144"
dmsetup create cache --notable
dmsetup load cache --table "0 524288 cache /dev/mapper/cmeta \
/dev/mapper/cdata /dev/mapper/corig 128 2 metadata2 writethrough smq 0"
dmsetup resume cache
4. try resuming the cache again. An unexpected BUG_ON is triggered
while loading cache mappings.
dmsetup resume cache
Kernel logs:
(snip)
------------[ cut here ]------------
kernel BUG at drivers/md/dm-cache-policy-smq.c:752!
Oops: invalid opcode: 0000 [#1] PREEMPT SMP KASAN NOPTI
CPU: 0 UID: 0 PID: 332 Comm: dmsetup Not tainted 6.13.4 #3
RIP: 0010:smq_load_mapping+0x3e5/0x570
Fix by disallowing resume operations for devices that failed the
initial attempt.
Signed-off-by: Ming-Hung Tsai <mtsai@redhat.com>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Mikulas Patocka <mpatocka@redhat.com>
Date: Tue Apr 22 21:18:33 2025 +0200
dm-integrity: fix a warning on invalid table line
commit 0a533c3e4246c29d502a7e0fba0e86d80a906b04 upstream.
If we use the 'B' mode and we have an invalit table line,
cancel_delayed_work_sync would trigger a warning. This commit avoids the
warning.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Benjamin Marzinski <bmarzins@redhat.com>
Date: Tue Apr 15 00:17:16 2025 -0400
dm: always update the array size in realloc_argv on success
commit 5a2a6c428190f945c5cbf5791f72dbea83e97f66 upstream.
realloc_argv() was only updating the array size if it was called with
old_argv already allocated. The first time it was called to create an
argv array, it would allocate the array but return the array size as
zero. dm_split_args() would think that it couldn't store any arguments
in the array and would call realloc_argv() again, causing it to
reallocate the initial slots (this time using GPF_KERNEL) and finally
return a size. Aside from being wasteful, this could cause deadlocks on
targets that need to process messages without starting new IO. Instead,
realloc_argv should always update the allocated array size on success.
Fixes: a0651926553c ("dm table: don't copy from a NULL pointer in realloc_argv()")
Cc: stable@vger.kernel.org
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Tudor Ambarus <tudor.ambarus@linaro.org>
Date: Tue May 6 11:31:50 2025 +0000
dm: fix copying after src array boundaries
commit f1aff4bc199cb92c055668caed65505e3b4d2656 upstream.
The blammed commit copied to argv the size of the reallocated argv,
instead of the size of the old_argv, thus reading and copying from
past the old_argv allocated memory.
Following BUG_ON was hit:
[ 3.038929][ T1] kernel BUG at lib/string_helpers.c:1040!
[ 3.039147][ T1] Internal error: Oops - BUG: 00000000f2000800 [#1] SMP
...
[ 3.056489][ T1] Call trace:
[ 3.056591][ T1] __fortify_panic+0x10/0x18 (P)
[ 3.056773][ T1] dm_split_args+0x20c/0x210
[ 3.056942][ T1] dm_table_add_target+0x13c/0x360
[ 3.057132][ T1] table_load+0x110/0x3ac
[ 3.057292][ T1] dm_ctl_ioctl+0x424/0x56c
[ 3.057457][ T1] __arm64_sys_ioctl+0xa8/0xec
[ 3.057634][ T1] invoke_syscall+0x58/0x10c
[ 3.057804][ T1] el0_svc_common+0xa8/0xdc
[ 3.057970][ T1] do_el0_svc+0x1c/0x28
[ 3.058123][ T1] el0_svc+0x50/0xac
[ 3.058266][ T1] el0t_64_sync_handler+0x60/0xc4
[ 3.058452][ T1] el0t_64_sync+0x1b0/0x1b4
[ 3.058620][ T1] Code: f800865e a9bf7bfd 910003fd 941f48aa (d4210000)
[ 3.058897][ T1] ---[ end trace 0000000000000000 ]---
[ 3.059083][ T1] Kernel panic - not syncing: Oops - BUG: Fatal exception
Fix it by copying the size of src, and not the size of dst, as it was.
Fixes: 5a2a6c428190 ("dm: always update the array size in realloc_argv on success")
Cc: stable@vger.kernel.org
Signed-off-by: Tudor Ambarus <tudor.ambarus@linaro.org>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Mikulas Patocka <mpatocka@redhat.com>
Date: Fri Mar 14 13:51:32 2025 +0100
dm: restrict dm device size to 2^63-512 bytes
[ Upstream commit 45fc728515c14f53f6205789de5bfd72a95af3b8 ]
The devices with size >= 2^63 bytes can't be used reliably by userspace
because the type off_t is a signed 64-bit integer.
Therefore, we limit the maximum size of a device mapper device to
2^63-512 bytes.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Marek Szyprowski <m.szyprowski@samsung.com>
Date: Tue Apr 15 09:56:59 2025 +0200
dma-mapping: avoid potential unused data compilation warning
[ Upstream commit c9b19ea63036fc537a69265acea1b18dabd1cbd3 ]
When CONFIG_NEED_DMA_MAP_STATE is not defined, dma-mapping clients might
report unused data compilation warnings for dma_unmap_*() calls
arguments. Redefine macros for those calls to let compiler to notice that
it is okay when the provided arguments are not used.
Reported-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Tested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20250415075659.428549-1-m.szyprowski@samsung.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Nathan Lynch <nathan.lynch@amd.com>
Date: Thu Apr 3 11:24:19 2025 -0500
dmaengine: Revert "dmaengine: dmatest: Fix dmatest waiting less when interrupted"
commit df180e65305f8c1e020d54bfc2132349fd693de1 upstream.
Several issues with this change:
* The analysis is flawed and it's unclear what problem is being
fixed. There is no difference between wait_event_freezable_timeout()
and wait_event_timeout() with respect to device interrupts. And of
course "the interrupt notifying the finish of an operation happens
during wait_event_freezable_timeout()" -- that's how it's supposed
to work.
* The link at the "Closes:" tag appears to be an unrelated
use-after-free in idxd.
* It introduces a regression: dmatest threads are meant to be
freezable and this change breaks that.
See discussion here:
https://lore.kernel.org/dmaengine/878qpa13fe.fsf@AUSNATLYNCH.amd.com/
Fixes: e87ca16e9911 ("dmaengine: dmatest: Fix dmatest waiting less when interrupted")
Signed-off-by: Nathan Lynch <nathan.lynch@amd.com>
Link: https://lore.kernel.org/r/20250403-dmaengine-dmatest-revert-waiting-less-v1-1-8227c5a3d7c8@amd.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Ronald Wahl <ronald.wahl@legrand.com>
Date: Mon Apr 14 19:31:13 2025 +0200
dmaengine: ti: k3-udma: Add missing locking
commit fca280992af8c2fbd511bc43f65abb4a17363f2f upstream.
Recent kernels complain about a missing lock in k3-udma.c when the lock
validator is enabled:
[ 4.128073] WARNING: CPU: 0 PID: 746 at drivers/dma/ti/../virt-dma.h:169 udma_start.isra.0+0x34/0x238
[ 4.137352] CPU: 0 UID: 0 PID: 746 Comm: kworker/0:3 Not tainted 6.12.9-arm64 #28
[ 4.144867] Hardware name: pp-v12 (DT)
[ 4.148648] Workqueue: events udma_check_tx_completion
[ 4.153841] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 4.160834] pc : udma_start.isra.0+0x34/0x238
[ 4.165227] lr : udma_start.isra.0+0x30/0x238
[ 4.169618] sp : ffffffc083cabcf0
[ 4.172963] x29: ffffffc083cabcf0 x28: 0000000000000000 x27: ffffff800001b005
[ 4.180167] x26: ffffffc0812f0000 x25: 0000000000000000 x24: 0000000000000000
[ 4.187370] x23: 0000000000000001 x22: 00000000e21eabe9 x21: ffffff8000fa0670
[ 4.194571] x20: ffffff8001b6bf00 x19: ffffff8000fa0430 x18: ffffffc083b95030
[ 4.201773] x17: 0000000000000000 x16: 00000000f0000000 x15: 0000000000000048
[ 4.208976] x14: 0000000000000048 x13: 0000000000000000 x12: 0000000000000001
[ 4.216179] x11: ffffffc08151a240 x10: 0000000000003ea1 x9 : ffffffc08046ab68
[ 4.223381] x8 : ffffffc083cabac0 x7 : ffffffc081df3718 x6 : 0000000000029fc8
[ 4.230583] x5 : ffffffc0817ee6d8 x4 : 0000000000000bc0 x3 : 0000000000000000
[ 4.237784] x2 : 0000000000000000 x1 : 00000000001fffff x0 : 0000000000000000
[ 4.244986] Call trace:
[ 4.247463] udma_start.isra.0+0x34/0x238
[ 4.251509] udma_check_tx_completion+0xd0/0xdc
[ 4.256076] process_one_work+0x244/0x3fc
[ 4.260129] process_scheduled_works+0x6c/0x74
[ 4.264610] worker_thread+0x150/0x1dc
[ 4.268398] kthread+0xd8/0xe8
[ 4.271492] ret_from_fork+0x10/0x20
[ 4.275107] irq event stamp: 220
[ 4.278363] hardirqs last enabled at (219): [<ffffffc080a27c7c>] _raw_spin_unlock_irq+0x38/0x50
[ 4.287183] hardirqs last disabled at (220): [<ffffffc080a1c154>] el1_dbg+0x24/0x50
[ 4.294879] softirqs last enabled at (182): [<ffffffc080037e68>] handle_softirqs+0x1c0/0x3cc
[ 4.303437] softirqs last disabled at (177): [<ffffffc080010170>] __do_softirq+0x1c/0x28
[ 4.311559] ---[ end trace 0000000000000000 ]---
This commit adds the missing locking.
Fixes: 25dcb5dd7b7c ("dmaengine: ti: New driver for K3 UDMA")
Cc: Peter Ujfalusi <peter.ujfalusi@gmail.com>
Cc: Vignesh Raghavendra <vigneshr@ti.com>
Cc: Vinod Koul <vkoul@kernel.org>
Cc: dmaengine@vger.kernel.org
Cc: stable@vger.kernel.org
Signed-off-by: Ronald Wahl <ronald.wahl@legrand.com>
Acked-by: Peter Ujfalusi <peter.ujfalusi@gmail.com>
Link: https://lore.kernel.org/r/20250414173113.80677-1-rwahl@gmx.de
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Yemike Abhilash Chandra <y-abhilashchandra@ti.com>
Date: Thu Apr 17 13:25:21 2025 +0530
dmaengine: ti: k3-udma: Use cap_mask directly from dma_device structure instead of a local copy
commit 8ca9590c39b69b55a8de63d2b21b0d44f523b43a upstream.
Currently, a local dma_cap_mask_t variable is used to store device
cap_mask within udma_of_xlate(). However, the DMA_PRIVATE flag in
the device cap_mask can get cleared when the last channel is released.
This can happen right after storing the cap_mask locally in
udma_of_xlate(), and subsequent dma_request_channel() can fail due to
mismatch in the cap_mask. Fix this by removing the local dma_cap_mask_t
variable and directly using the one from the dma_device structure.
Fixes: 25dcb5dd7b7c ("dmaengine: ti: New driver for K3 UDMA")
Cc: stable@vger.kernel.org
Signed-off-by: Vaishnav Achath <vaishnav.a@ti.com>
Acked-by: Peter Ujfalusi <peter.ujfalusi@gmail.com>
Reviewed-by: Udit Kumar <u-kumar1@ti.com>
Signed-off-by: Yemike Abhilash Chandra <y-abhilashchandra@ti.com>
Link: https://lore.kernel.org/r/20250417075521.623651-1-y-abhilashchandra@ti.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Al Viro <viro@zeniv.linux.org.uk>
Date: Mon Apr 28 23:56:14 2025 -0400
do_umount(): add missing barrier before refcount checks in sync case
[ Upstream commit 65781e19dcfcb4aed1167d87a3ffcc2a0c071d47 ]
do_umount() analogue of the race fixed in 119e1ef80ecf "fix
__legitimize_mnt()/mntput() race". Here we want to make sure that
if __legitimize_mnt() doesn't notice our lock_mount_hash(), we will
notice their refcount increment. Harder to hit than mntput_no_expire()
one, fortunately, and consequences are milder (sync umount acting
like umount -l on a rare race with RCU pathwalk hitting at just the
wrong time instead of use-after-free galore mntput_no_expire()
counterpart used to be hit). Still a bug...
Fixes: 48a066e72d97 ("RCU'd vfsmounts")
Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jing Su <jingsusu@didiglobal.com>
Date: Wed Mar 19 16:57:51 2025 +0800
dql: Fix dql->limit value when reset.
[ Upstream commit 3a17f23f7c36bac3a3584aaf97d3e3e0b2790396 ]
Executing dql_reset after setting a non-zero value for limit_min can
lead to an unreasonable situation where dql->limit is less than
dql->limit_min.
For instance, after setting
/sys/class/net/eth*/queues/tx-0/byte_queue_limits/limit_min,
an ifconfig down/up operation might cause the ethernet driver to call
netdev_tx_reset_queue, which in turn invokes dql_reset.
In this case, dql->limit is reset to 0 while dql->limit_min remains
non-zero value, which is unexpected. The limit should always be
greater than or equal to limit_min.
Signed-off-by: Jing Su <jingsusu@didiglobal.com>
Link: https://patch.msgid.link/Z9qHD1s/NEuQBdgH@pilot-ThinkCentre-M930t-N000
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Tom Chung <chiahsuan.chung@amd.com>
Date: Mon Jan 13 14:22:31 2025 +0800
drm/amd/display: Initial psr_version with correct setting
[ Upstream commit d8c782cac5007e68e7484d420168f12d3490def6 ]
[Why & How]
The initial setting for psr_version is not correct while
create a virtual link.
The default psr_version should be DC_PSR_VERSION_UNSUPPORTED.
Reviewed-by: Roman Li <roman.li@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Victor Lu <victorchengchi.lu@amd.com>
Date: Thu Feb 13 18:38:28 2025 -0500
drm/amdgpu: Do not program AGP BAR regs under SRIOV in gfxhub_v1_0.c
[ Upstream commit 057fef20b8401110a7bc1c2fe9d804a8a0bf0d24 ]
SRIOV VF does not have write access to AGP BAR regs.
Skip the writes to avoid a dmesg warning.
Signed-off-by: Victor Lu <victorchengchi.lu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Philip Yang <Philip.Yang@amd.com>
Date: Mon Feb 17 20:08:29 2025 -0500
drm/amdkfd: KFD release_work possible circular locking
[ Upstream commit 1b9366c601039d60546794c63fbb83ce8e53b978 ]
If waiting for gpu reset done in KFD release_work, thers is WARNING:
possible circular locking dependency detected
#2 kfd_create_process
kfd_process_mutex
flush kfd release work
#1 kfd release work
wait for amdgpu reset work
#0 amdgpu_device_gpu_reset
kgd2kfd_pre_reset
kfd_process_mutex
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock((work_completion)(&p->release_work));
lock((wq_completion)kfd_process_wq);
lock((work_completion)(&p->release_work));
lock((wq_completion)amdgpu-reset-dev);
To fix this, KFD create process move flush release work outside
kfd_process_mutex.
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Thomas Zimmermann <tzimmermann@suse.de>
Date: Fri Jan 31 10:21:08 2025 +0100
drm/ast: Find VBIOS mode from regular display size
[ Upstream commit c81202906b5cd56db403e95db3d29c9dfc8c74c1 ]
The ast driver looks up supplied display modes from an internal list of
display modes supported by the VBIOS.
Do not use the crtc_-prefixed display values from struct drm_display_mode
for looking up the VBIOS mode. The fields contain raw values that the
driver programs to hardware. They are affected by display settings like
double-scan or interlace.
Instead use the regular vdisplay and hdisplay fields for lookup. As the
programmed values can now differ from the values used for lookup, set
struct drm_display_mode.crtc_vdisplay and .crtc_hdisplay from the VBIOS
mode.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Jocelyn Falempe <jfalempe@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250131092257.115596-9-tzimmermann@suse.de
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Simona Vetter <simona.vetter@ffwll.ch>
Date: Wed Jan 8 18:24:16 2025 +0100
drm/atomic: clarify the rules around drm_atomic_state->allow_modeset
[ Upstream commit c5e3306a424b52e38ad2c28c7f3399fcd03e383d ]
msm is automagically upgrading normal commits to full modesets, and
that's a big no-no:
- for one this results in full on->off->on transitions on all these
crtc, at least if you're using the usual helpers. Which seems to be
the case, and is breaking uapi
- further even if the ctm change itself would not result in flicker,
this can hide modesets for other reasons. Which again breaks the
uapi
v2: I forgot the case of adding unrelated crtc state. Add that case
and link to the existing kerneldoc explainers. This has come up in an
irc discussion with Manasi and Ville about intel's bigjoiner mode.
Also cc everyone involved in the msm irc discussion, more people
joined after I sent out v1.
v3: Wording polish from Pekka and Thomas
Acked-by: Pekka Paalanen <pekka.paalanen@collabora.com>
Acked-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: David Airlie <airlied@gmail.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Pekka Paalanen <pekka.paalanen@collabora.com>
Cc: Rob Clark <robdclark@gmail.com>
Cc: Simon Ser <contact@emersion.fr>
Cc: Manasi Navare <navaremanasi@google.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Abhinav Kumar <quic_abhinavk@quicinc.com>
Cc: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Signed-off-by: Simona Vetter <simona.vetter@intel.com>
Signed-off-by: Simona Vetter <simona.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20250108172417.160831-1-simona.vetter@ffwll.ch
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: feijuan.li <feijuan.li@samsung.com>
Date: Wed May 14 14:35:11 2025 +0800
drm/edid: fixed the bug that hdr metadata was not reset
commit 6692dbc15e5ed40a3aa037aced65d7b8826c58cd upstream.
When DP connected to a device with HDR capability,
the hdr structure was filled.Then connected to another
sink device without hdr capability, but the hdr info
still exist.
Fixes: e85959d6cbe0 ("drm: Parse HDR metadata info from EDID")
Cc: <stable@vger.kernel.org> # v5.3+
Signed-off-by: "feijuan.li" <feijuan.li@samsung.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Link: https://lore.kernel.org/r/20250514063511.4151780-1-feijuan.li@samsung.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Jani Nikula <jani.nikula@intel.com>
Date: Thu Mar 27 14:47:39 2025 +0200
drm/i915/gvt: fix unterminated-string-initialization warning
commit 2e43ae7dd71cd9bb0d1bce1d3306bf77523feb81 upstream.
Initializing const char opregion_signature[16] = OPREGION_SIGNATURE
(which is "IntelGraphicsMem") drops the NUL termination of the
string. This is intentional, but the compiler doesn't know this.
Switch to initializing header->signature directly from the string
litaral, with sizeof destination rather than source. We don't treat the
signature as a string other than for initialization; it's really just a
blob of binary data.
Add a static assert for good measure to cross-check the sizes.
Reported-by: Kees Cook <kees@kernel.org>
Closes: https://lore.kernel.org/r/20250310222355.work.417-kees@kernel.org
Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/13934
Tested-by: Nicolas Chauvet <kwizart@gmail.com>
Tested-by: Damian Tometzki <damian@riscv-rocks.de>
Cc: stable@vger.kernel.org
Reviewed-by: Zhenyu Wang <zhenyuw.linux@gmail.com>
Link: https://lore.kernel.org/r/20250327124739.2609656-1-jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
(cherry picked from commit 4f8207469094bd04aad952258ceb9ff4c77b6bfa)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
[nathan: Move static_assert() to top of function to avoid instance of
-Wdeclaration-after-statement due to lack of b5ec6fd286df]
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Date: Mon Feb 17 16:47:58 2025 +0100
drm/mediatek: mtk_dpi: Add checks for reg_h_fre_con existence
[ Upstream commit 8c9da7cd0bbcc90ab444454fecf535320456a312 ]
In preparation for adding support for newer DPI instances which
do support direct-pin but do not have any H_FRE_CON register,
like the one found in MT8195 and MT8188, add a branch to check
if the reg_h_fre_con variable was declared in the mtk_dpi_conf
structure for the probed SoC DPI version.
As a note, this is useful specifically only for cases in which
the support_direct_pin variable is true, so mt8195-dpintf is
not affected by any issue.
Reviewed-by: CK Hu <ck.hu@mediatek.com>
Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Link: https://patchwork.kernel.org/project/dri-devel/patch/20250217154836.108895-6-angelogioacchino.delregno@collabora.com/
Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Philipp Stanner <phasta@kernel.org>
Date: Tue Apr 15 14:19:00 2025 +0200
drm/nouveau: Fix WARN_ON in nouveau_fence_context_kill()
commit bbe5679f30d7690a9b6838a583b9690ea73fe0e9 upstream.
Nouveau is mostly designed in a way that it's expected that fences only
ever get signaled through nouveau_fence_signal(). However, in at least
one other place, nouveau_fence_done(), can signal fences, too. If that
happens (race) a signaled fence remains in the pending list for a while,
until it gets removed by nouveau_fence_update().
Should nouveau_fence_context_kill() run in the meantime, this would be
a bug because the function would attempt to set an error code on an
already signaled fence.
Have nouveau_fence_context_kill() check for a fence being signaled.
Cc: stable@vger.kernel.org # v5.10+
Fixes: ea13e5abf807 ("drm/nouveau: signal pending fences when channel has been killed")
Suggested-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Philipp Stanner <phasta@kernel.org>
Link: https://lore.kernel.org/r/20250415121900.55719-3-phasta@kernel.org
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Kevin Baker <kevinb@ventureresearch.com>
Date: Mon May 5 12:02:56 2025 -0500
drm/panel: simple: Update timings for AUO G101EVN010
[ Upstream commit 7c6fa1797a725732981f2d77711c867166737719 ]
Switch to panel timings based on datasheet for the AUO G101EVN01.0
LVDS panel. Default timings were tested on the panel.
Previous mode-based timings resulted in horizontal display shift.
Signed-off-by: Kevin Baker <kevinb@ventureresearch.com>
Fixes: 4fb86404a977 ("drm/panel: simple: Add AUO G101EVN010 panel support")
Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org>
Link: https://lore.kernel.org/r/20250505170256.1385113-1-kevinb@ventureresearch.com
Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
Link: https://lore.kernel.org/r/20250505170256.1385113-1-kevinb@ventureresearch.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Zack Rusin <zack.rusin@broadcom.com>
Date: Mon Jul 22 14:41:13 2024 -0400
drm/vmwgfx: Fix a deadlock in dma buf fence polling
commit e58337100721f3cc0c7424a18730e4f39844934f upstream.
Introduce a version of the fence ops that on release doesn't remove
the fence from the pending list, and thus doesn't require a lock to
fix poll->fence wait->fence unref deadlocks.
vmwgfx overwrites the wait callback to iterate over the list of all
fences and update their status, to do that it holds a lock to prevent
the list modifcations from other threads. The fence destroy callback
both deletes the fence and removes it from the list of pending
fences, for which it holds a lock.
dma buf polling cb unrefs a fence after it's been signaled: so the poll
calls the wait, which signals the fences, which are being destroyed.
The destruction tries to acquire the lock on the pending fences list
which it can never get because it's held by the wait from which it
was called.
Old bug, but not a lot of userspace apps were using dma-buf polling
interfaces. Fix those, in particular this fixes KDE stalls/deadlock.
Signed-off-by: Zack Rusin <zack.rusin@broadcom.com>
Fixes: 2298e804e96e ("drm/vmwgfx: rework to new fence interface, v2")
Cc: Broadcom internal kernel review list <bcm-kernel-feedback-list@broadcom.com>
Cc: dri-devel@lists.freedesktop.org
Cc: <stable@vger.kernel.org> # v6.2+
Reviewed-by: Maaz Mombasawala <maaz.mombasawala@broadcom.com>
Reviewed-by: Martin Krastev <martin.krastev@broadcom.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240722184313.181318-2-zack.rusin@broadcom.com
[Minor context change fixed]
Signed-off-by: Zhi Yang <Zhi.Yang@windriver.com>
Signed-off-by: He Zhe <zhe.he@windriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Jessica Zhang <quic_jesszhan@quicinc.com>
Date: Mon Dec 16 16:43:14 2024 -0800
drm: Add valid clones check
[ Upstream commit 41b4b11da02157c7474caf41d56baae0e941d01a ]
Check that all encoders attached to a given CRTC are valid
possible_clones of each other.
Signed-off-by: Jessica Zhang <quic_jesszhan@quicinc.com>
Reviewed-by: Maxime Ripard <mripard@kernel.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20241216-concurrent-wb-v4-3-fe220297a7f0@quicinc.com
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Niravkumar L Rabara <niravkumar.l.rabara@altera.com>
Date: Fri Apr 25 07:26:40 2025 -0700
EDAC/altera: Set DDR and SDMMC interrupt mask before registration
commit 6dbe3c5418c4368e824bff6ae4889257dd544892 upstream.
Mask DDR and SDMMC in probe function to avoid spurious interrupts before
registration. Removed invalid register write to system manager.
Fixes: 1166fde93d5b ("EDAC, altera: Add Arria10 ECC memory init functions")
Signed-off-by: Niravkumar L Rabara <niravkumar.l.rabara@altera.com>
Signed-off-by: Matthew Gerlach <matthew.gerlach@altera.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Acked-by: Dinh Nguyen <dinguyen@kernel.org>
Cc: stable@kernel.org
Link: https://lore.kernel.org/20250425142640.33125-3-matthew.gerlach@altera.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Niravkumar L Rabara <niravkumar.l.rabara@altera.com>
Date: Fri Apr 25 07:26:39 2025 -0700
EDAC/altera: Test the correct error reg offset
commit 4fb7b8fceb0beebbe00712c3daf49ade0386076a upstream.
Test correct structure member, ecc_cecnt_offset, before using it.
[ bp: Massage commit message. ]
Fixes: 73bcc942f427 ("EDAC, altera: Add Arria10 EDAC support")
Signed-off-by: Niravkumar L Rabara <niravkumar.l.rabara@altera.com>
Signed-off-by: Matthew Gerlach <matthew.gerlach@altera.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Acked-by: Dinh Nguyen <dinguyen@kernel.org>
Cc: stable@kernel.org
Link: https://lore.kernel.org/20250425142640.33125-2-matthew.gerlach@altera.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Arnd Bergmann <arnd@arndb.de>
Date: Wed Jan 22 07:50:26 2025 +0100
EDAC/ie31200: work around false positive build warning
[ Upstream commit c29dfd661fe2f8d1b48c7f00590929c04b25bf40 ]
gcc-14 produces a bogus warning in some configurations:
drivers/edac/ie31200_edac.c: In function 'ie31200_probe1.isra':
drivers/edac/ie31200_edac.c:412:26: error: 'dimm_info' is used uninitialized [-Werror=uninitialized]
412 | struct dimm_data dimm_info[IE31200_CHANNELS][IE31200_DIMMS_PER_CHANNEL];
| ^~~~~~~~~
drivers/edac/ie31200_edac.c:412:26: note: 'dimm_info' declared here
412 | struct dimm_data dimm_info[IE31200_CHANNELS][IE31200_DIMMS_PER_CHANNEL];
| ^~~~~~~~~
I don't see any way the unintialized access could really happen here,
but I can see why the compiler gets confused by the two loops.
Instead, rework the two nested loops to only read the addr_decode
registers and then keep only one instance of the dimm info structure.
[Tony: Qiuxu pointed out that the "populate DIMM info" comment was left
behind in the refactor and suggested moving it. I deleted the comment
as unnecessry in front os a call to populate_dimm_info(). That seems
pretty self-describing.]
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Jason Baron <jbaron@akamai.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Link: https://lore.kernel.org/all/20250122065031.1321015-1-arnd@kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jakub Kicinski <kuba@kernel.org>
Date: Wed Feb 12 17:06:33 2025 -0800
eth: mlx4: don't try to complete XDP frames in netpoll
[ Upstream commit 8fdeafd66edaf420ea0063a1f13442fe3470fe70 ]
mlx4 doesn't support ndo_xdp_xmit / XDP_REDIRECT and wasn't
using page pool until now, so it could run XDP completions
in netpoll (NAPI budget == 0) just fine. Page pool has calling
context requirements, make sure we don't try to call it from
what is potentially HW IRQ context.
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20250213010635.1354034-3-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Christian Göttsche <cgzones@googlemail.com>
Date: Sun Mar 2 17:06:39 2025 +0100
ext4: reorder capability check last
[ Upstream commit 1b419c889c0767a5b66d0a6c566cae491f1cb0f7 ]
capable() calls refer to enabled LSMs whether to permit or deny the
request. This is relevant in connection with SELinux, where a
capability check results in a policy decision and by default a denial
message on insufficient permission is issued.
It can lead to three undesired cases:
1. A denial message is generated, even in case the operation was an
unprivileged one and thus the syscall succeeded, creating noise.
2. To avoid the noise from 1. the policy writer adds a rule to ignore
those denial messages, hiding future syscalls, where the task
performs an actual privileged operation, leading to hidden limited
functionality of that task.
3. To avoid the noise from 1. the policy writer adds a rule to permit
the task the requested capability, while it does not need it,
violating the principle of least privilege.
Signed-off-by: Christian Göttsche <cgzones@googlemail.com>
Reviewed-by: Serge Hallyn <serge@hallyn.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20250302160657.127253-2-cgoettsche@seltendoof.de
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Zsolt Kajtar <soci@c64.rulez.org>
Date: Sun Feb 2 21:33:46 2025 +0100
fbcon: Use correct erase colour for clearing in fbcon
[ Upstream commit 892c788d73fe4a94337ed092cb998c49fa8ecaf4 ]
The erase colour calculation for fbcon clearing should use get_color instead
of attr_col_ec, like everything else. The latter is similar but is not correct.
For example it's missing the depth dependent remapping and doesn't care about
blanking.
The problem can be reproduced by setting up the background colour to grey
(vt.color=0x70) and having an fbcon console set to 2bpp (4 shades of gray).
Now the background attribute should be 1 (dark gray) on the console.
If the screen is scrolled when pressing enter in a shell prompt at the bottom
line then the new line is cleared using colour 7 instead of 1. That's not
something fillrect likes (at 2bbp it expect 0-3) so the result is interesting.
This patch switches to get_color with vc_video_erase_char to determine the
erase colour from attr_col_ec. That makes the latter function redundant as
no other users were left.
Use correct erase colour for clearing in fbcon
Signed-off-by: Zsolt Kajtar <soci@c64.rulez.org>
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Zsolt Kajtar <soci@c64.rulez.org>
Date: Sat Feb 1 09:18:09 2025 +0100
fbdev: core: tileblit: Implement missing margin clearing for tileblit
[ Upstream commit 76d3ca89981354e1f85a3e0ad9ac4217d351cc72 ]
I was wondering why there's garbage at the bottom of the screen when
tile blitting is used with an odd mode like 1080, 600 or 200. Sure there's
only space for half a tile but the same area is clean when the buffer
is bitmap.
Then later I found that it's supposed to be cleaned but that's not
implemented. So I took what's in bitblit and adapted it for tileblit.
This implementation was tested for both the horizontal and vertical case,
and now does the same as what's done for bitmap buffers.
If anyone is interested to reproduce the problem then I could bet that'd
be on a S3 or Ark. Just set up a mode with an odd line count and make
sure that the virtual size covers the complete tile at the bottom. E.g.
for 600 lines that's 608 virtual lines for a 16 tall tile. Then the
bottom area should be cleaned.
For the right side it's more difficult as there the drivers won't let an
odd size happen, unless the code is modified. But once it reports back a
few pixel columns short then fbcon won't use the last column. With the
patch that column is now clean.
Btw. the virtual size should be rounded up by the driver for both axes
(not only the horizontal) so that it's dividable by the tile size.
That's a driver bug but correcting it is not in scope for this patch.
Implement missing margin clearing for tileblit
Signed-off-by: Zsolt Kajtar <soci@c64.rulez.org>
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Shixiong Ou <oushixiong@kylinos.cn>
Date: Mon Mar 10 09:54:31 2025 +0800
fbdev: fsl-diu-fb: add missing device_remove_file()
[ Upstream commit 86d16cd12efa547ed43d16ba7a782c1251c80ea8 ]
Call device_remove_file() when driver remove.
Signed-off-by: Shixiong Ou <oushixiong@kylinos.cn>
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Christian Brauner <brauner@kernel.org>
Date: Mon Mar 27 20:22:52 2023 +0200
fork: use pidfd_prepare()
commit ca7707f5430ad6b1c9cb7cee0a7f67d69328bb2d upstream.
Stop open-coding get_unused_fd_flags() and anon_inode_getfile(). That's
brittle just for keeping the flags between both calls in sync. Use the
dedicated helper.
Message-Id: <20230327-pidfd-file-api-v1-2-5c0e9a3158e4@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Kuhanh Murugasen Krishnan <kuhanh.murugasen.krishnan@intel.com>
Date: Thu Feb 13 06:12:49 2025 +0800
fpga: altera-cvp: Increase credit timeout
[ Upstream commit 0f05886a40fdc55016ba4d9ae0a9c41f8312f15b ]
Increase the timeout for SDM (Secure device manager) data credits from
20ms to 40ms. Internal stress tests running at 500 loops failed with the
current timeout of 20ms. At the start of a FPGA configuration, the CVP
host driver reads the transmit credits from SDM. It then sends bitstream
FPGA data to SDM based on the total credits. Each credit allows the
CVP host driver to send 4kBytes of data. There are situations whereby,
the SDM did not respond in time during testing.
Signed-off-by: Ang Tien Sung <tien.sung.ang@intel.com>
Signed-off-by: Kuhanh Murugasen Krishnan <kuhanh.murugasen.krishnan@intel.com>
Acked-by: Xu Yilun <yilun.xu@intel.com>
Link: https://lore.kernel.org/r/20250212221249.2715929-1-tien.sung.ang@intel.com
Signed-off-by: Xu Yilun <yilun.xu@linux.intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Milton Barrera <miltonjosue2001@gmail.com>
Date: Wed Apr 9 00:04:28 2025 -0600
HID: quirks: Add ADATA XPG alpha wireless mouse support
[ Upstream commit fa9fdeea1b7d6440c22efa6d59a769eae8bc89f1 ]
This patch adds HID_QUIRK_ALWAYS_POLL for the ADATA XPG wireless gaming mouse (USB ID 125f:7505) and its USB dongle (USB ID 125f:7506). Without this quirk, the device does not generate input events properly.
Signed-off-by: Milton Barrera <miltonjosue2001@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: junan <junan76@163.com>
Date: Thu Nov 28 10:35:18 2024 +0800
HID: usbkbd: Fix the bit shift number for LED_KANA
[ Upstream commit d73a4bfa2881a6859b384b75a414c33d4898b055 ]
Since "LED_KANA" was defined as "0x04", the shift number should be "4".
Signed-off-by: junan <junan76@163.com>
Signed-off-by: Jiri Kosina <jkosina@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Alexander Stein <alexander.stein@ew.tq-group.com>
Date: Mon Feb 10 15:59:30 2025 +0100
hwmon: (gpio-fan) Add missing mutex locks
[ Upstream commit 9fee7d19bab635f89223cc40dfd2c8797fdc4988 ]
set_fan_speed() is expected to be called with fan_data->lock being locked.
Add locking for proper synchronization.
Signed-off-by: Alexander Stein <alexander.stein@ew.tq-group.com>
Link: https://lore.kernel.org/r/20250210145934.761280-3-alexander.stein@ew.tq-group.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Andrey Vatoropin <a.vatoropin@crpt.ru>
Date: Tue Feb 4 09:54:08 2025 +0000
hwmon: (xgene-hwmon) use appropriate type for the latency value
[ Upstream commit 8df0f002827e18632dcd986f7546c1abf1953a6f ]
The expression PCC_NUM_RETRIES * pcc_chan->latency is currently being
evaluated using 32-bit arithmetic.
Since a value of type 'u64' is used to store the eventual result,
and this result is later sent to the function usecs_to_jiffies with
input parameter unsigned int, the current data type is too wide to
store the value of ctx->usecs_lat.
Change the data type of "usecs_lat" to a more suitable (narrower) type.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Signed-off-by: Andrey Vatoropin <a.vatoropin@crpt.ru>
Link: https://lore.kernel.org/r/20250204095400.95013-1-a.vatoropin@crpt.ru
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Clark Wang <xiaoning.wang@nxp.com>
Date: Mon Apr 21 14:23:41 2025 +0800
i2c: imx-lpi2c: Fix clock count when probe defers
commit b1852c5de2f2a37dd4462f7837c9e3e678f9e546 upstream.
Deferred probe with pm_runtime_put() may delay clock disable, causing
incorrect clock usage count. Use pm_runtime_put_sync() to ensure the
clock is disabled immediately.
Fixes: 13d6eb20fc79 ("i2c: imx-lpi2c: add runtime pm support")
Signed-off-by: Clark Wang <xiaoning.wang@nxp.com>
Signed-off-by: Carlos Song <carlos.song@nxp.com>
Cc: <stable@vger.kernel.org> # v4.16+
Link: https://lore.kernel.org/r/20250421062341.2471922-1-carlos.song@nxp.com
Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Vitalii Mordan <mordan@ispras.ru>
Date: Wed Feb 12 20:28:03 2025 +0300
i2c: pxa: fix call balance of i2c->clk handling routines
[ Upstream commit be7113d2e2a6f20cbee99c98d261a1fd6fd7b549 ]
If the clock i2c->clk was not enabled in i2c_pxa_probe(), it should not be
disabled in any path.
Found by Linux Verification Center (linuxtesting.org) with Klever.
Signed-off-by: Vitalii Mordan <mordan@ispras.ru>
Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
Link: https://lore.kernel.org/r/20250212172803.1422136-1-mordan@ispras.ru
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Stephan Gerhold <stephan.gerhold@kernkonzept.com>
Date: Tue Nov 28 10:48:37 2023 +0100
i2c: qup: Vote for interconnect bandwidth to DRAM
[ Upstream commit d4f35233a6345f62637463ef6e0708f44ffaa583 ]
When the I2C QUP controller is used together with a DMA engine it needs
to vote for the interconnect path to the DRAM. Otherwise it may be
unable to access the memory quickly enough.
The requested peak bandwidth is dependent on the I2C core clock.
To avoid sending votes too often the bandwidth is always requested when
a DMA transfer starts, but dropped only on runtime suspend. Runtime
suspend should only happen if no transfer is active. After resumption we
can defer the next vote until the first DMA transfer actually happens.
The implementation is largely identical to the one introduced for
spi-qup in commit ecdaa9473019 ("spi: qup: Vote for interconnect
bandwidth to DRAM") since both drivers represent the same hardware
block.
Signed-off-by: Stephan Gerhold <stephan.gerhold@kernkonzept.com>
Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
Link: https://lore.kernel.org/r/20231128-i2c-qup-dvfs-v1-3-59a0e3039111@kernkonzept.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Alexander Lobakin <alexandr.lobakin@intel.com>
Date: Mon Apr 4 18:15:09 2022 +0200
ice: arfs: fix use-after-free when freeing @rx_cpu_rmap
commit d7442f512b71fc63a99c8a801422dde4fbbf9f93 upstream.
The CI testing bots triggered the following splat:
[ 718.203054] BUG: KASAN: use-after-free in free_irq_cpu_rmap+0x53/0x80
[ 718.206349] Read of size 4 at addr ffff8881bd127e00 by task sh/20834
[ 718.212852] CPU: 28 PID: 20834 Comm: sh Kdump: loaded Tainted: G S W IOE 5.17.0-rc8_nextqueue-devqueue-02643-g23f3121aca93 #1
[ 718.219695] Hardware name: Intel Corporation S2600WFT/S2600WFT, BIOS SE5C620.86B.02.01.0012.070720200218 07/07/2020
[ 718.223418] Call Trace:
[ 718.227139]
[ 718.230783] dump_stack_lvl+0x33/0x42
[ 718.234431] print_address_description.constprop.9+0x21/0x170
[ 718.238177] ? free_irq_cpu_rmap+0x53/0x80
[ 718.241885] ? free_irq_cpu_rmap+0x53/0x80
[ 718.245539] kasan_report.cold.18+0x7f/0x11b
[ 718.249197] ? free_irq_cpu_rmap+0x53/0x80
[ 718.252852] free_irq_cpu_rmap+0x53/0x80
[ 718.256471] ice_free_cpu_rx_rmap.part.11+0x37/0x50 [ice]
[ 718.260174] ice_remove_arfs+0x5f/0x70 [ice]
[ 718.263810] ice_rebuild_arfs+0x3b/0x70 [ice]
[ 718.267419] ice_rebuild+0x39c/0xb60 [ice]
[ 718.270974] ? asm_sysvec_apic_timer_interrupt+0x12/0x20
[ 718.274472] ? ice_init_phy_user_cfg+0x360/0x360 [ice]
[ 718.278033] ? delay_tsc+0x4a/0xb0
[ 718.281513] ? preempt_count_sub+0x14/0xc0
[ 718.284984] ? delay_tsc+0x8f/0xb0
[ 718.288463] ice_do_reset+0x92/0xf0 [ice]
[ 718.292014] ice_pci_err_resume+0x91/0xf0 [ice]
[ 718.295561] pci_reset_function+0x53/0x80
<...>
[ 718.393035] Allocated by task 690:
[ 718.433497] Freed by task 20834:
[ 718.495688] Last potentially related work creation:
[ 718.568966] The buggy address belongs to the object at ffff8881bd127e00
which belongs to the cache kmalloc-96 of size 96
[ 718.574085] The buggy address is located 0 bytes inside of
96-byte region [ffff8881bd127e00, ffff8881bd127e60)
[ 718.579265] The buggy address belongs to the page:
[ 718.598905] Memory state around the buggy address:
[ 718.601809] ffff8881bd127d00: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
[ 718.604796] ffff8881bd127d80: 00 00 00 00 00 00 00 00 00 00 fc fc fc fc fc fc
[ 718.607794] >ffff8881bd127e00: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
[ 718.610811] ^
[ 718.613819] ffff8881bd127e80: 00 00 00 00 00 00 00 00 00 00 00 00 fc fc fc fc
[ 718.617107] ffff8881bd127f00: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
This is due to that free_irq_cpu_rmap() is always being called
*after* (devm_)free_irq() and thus it tries to work with IRQ descs
already freed. For example, on device reset the driver frees the
rmap right before allocating a new one (the splat above).
Make rmap creation and freeing function symmetrical with
{request,free}_irq() calls i.e. do that on ifup/ifdown instead
of device probe/remove/resume. These operations can be performed
independently from the actual device aRFS configuration.
Also, make sure ice_vsi_free_irq() clears IRQ affinity notifiers
only when aRFS is disabled -- otherwise, CPU rmap sets and clears
its own and they must not be touched manually.
Fixes: 28bf26724fdb0 ("ice: Implement aRFS")
Co-developed-by: Ivan Vecera <ivecera@redhat.com>
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
Tested-by: Ivan Vecera <ivecera@redhat.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Suraj Jitindar Singh <surajjs@amazon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Date: Wed Mar 5 12:55:34 2025 +0200
ieee802154: ca8210: Use proper setters and getters for bitwise types
[ Upstream commit 169b2262205836a5d1213ff44dca2962276bece1 ]
Sparse complains that the driver doesn't respect the bitwise types:
drivers/net/ieee802154/ca8210.c:1796:27: warning: incorrect type in assignment (different base types)
drivers/net/ieee802154/ca8210.c:1796:27: expected restricted __le16 [addressable] [assigned] [usertype] pan_id
drivers/net/ieee802154/ca8210.c:1796:27: got unsigned short [usertype]
drivers/net/ieee802154/ca8210.c:1801:25: warning: incorrect type in assignment (different base types)
drivers/net/ieee802154/ca8210.c:1801:25: expected restricted __le16 [addressable] [assigned] [usertype] pan_id
drivers/net/ieee802154/ca8210.c:1801:25: got unsigned short [usertype]
drivers/net/ieee802154/ca8210.c:1928:28: warning: incorrect type in argument 3 (different base types)
drivers/net/ieee802154/ca8210.c:1928:28: expected unsigned short [usertype] dst_pan_id
drivers/net/ieee802154/ca8210.c:1928:28: got restricted __le16 [addressable] [usertype] pan_id
Use proper setters and getters for bitwise types.
Note, in accordance with [1] the protocol is little endian.
Link: https://www.cascoda.com/wp-content/uploads/2018/11/CA-8210_datasheet_0418.pdf [1]
Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/20250305105656.2133487-2-andriy.shevchenko@linux.intel.com
Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Angelo Dureghello <adureghello@baylibre.com>
Date: Fri Apr 18 20:37:53 2025 +0200
iio: adc: ad7606: fix serial register access
commit f083f8a21cc785ebe3a33f756a3fa3660611f8db upstream.
Fix register read/write routine as per datasheet.
When reading multiple consecutive registers, only the first one is read
properly. This is due to missing chip select deassert and assert again
between first and second 16bit transfer, as shown in the datasheet
AD7606C-16, rev 0, figure 110.
Fixes: f2a22e1e172f ("iio: adc: ad7606: Add support for software mode for ad7616")
Reviewed-by: David Lechner <dlechner@baylibre.com>
Signed-off-by: Angelo Dureghello <adureghello@baylibre.com>
Link: https://patch.msgid.link/20250418-wip-bl-ad7606-fix-reg-access-v3-1-d5eeb440c738@baylibre.com
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Date: Sun Apr 13 11:34:25 2025 +0100
iio: adc: ad7768-1: Fix insufficient alignment of timestamp.
[ Upstream commit ffbc26bc91c1f1eb3dcf5d8776e74cbae21ee13a ]
On architectures where an s64 is not 64-bit aligned, this may result
insufficient alignment of the timestamp and the structure being too small.
Use aligned_s64 to force the alignment.
Fixes: a1caeebab07e ("iio: adc: ad7768-1: Fix too small buffer passed to iio_push_to_buffers_with_timestamp()") # aligned_s64 newer
Reported-by: David Lechner <dlechner@baylibre.com>
Reviewed-by: Nuno Sá <nuno.sa@analog.com>
Reviewed-by: David Lechner <dlechner@baylibre.com>
Link: https://patch.msgid.link/20250413103443.2420727-3-jic23@kernel.org
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Date: Sun Apr 13 11:34:26 2025 +0100
iio: adc: dln2: Use aligned_s64 for timestamp
[ Upstream commit 5097eaae98e53f9ab9d35801c70da819b92ca907 ]
Here the lack of marking allows the overall structure to not be
sufficiently aligned resulting in misplacement of the timestamp
in iio_push_to_buffers_with_timestamp(). Use aligned_s64 to
force the alignment on all architectures.
Fixes: 7c0299e879dd ("iio: adc: Add support for DLN2 ADC")
Reported-by: David Lechner <dlechner@baylibre.com>
Reviewed-by: Andy Shevchenko <andy@kernel.org>
Reviewed-by: Nuno Sá <nuno.sa@analog.com>
Reviewed-by: David Lechner <dlechner@baylibre.com>
Link: https://patch.msgid.link/20250413103443.2420727-4-jic23@kernel.org
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Gabriel Shahrouzi <gshahrouzi@gmail.com>
Date: Mon Apr 21 09:15:39 2025 -0400
iio: adis16201: Correct inclinometer channel resolution
commit 609bc31eca06c7408e6860d8b46311ebe45c1fef upstream.
The inclinometer channels were previously defined with 14 realbits.
However, the ADIS16201 datasheet states the resolution for these output
channels is 12 bits (Page 14, text description; Page 15, table 7).
Correct the realbits value to 12 to accurately reflect the hardware.
Fixes: f7fe1d1dd5a5 ("staging: iio: new adis16201 driver")
Cc: stable@vger.kernel.org
Signed-off-by: Gabriel Shahrouzi <gshahrouzi@gmail.com>
Reviewed-by: Marcelo Schmitt <marcelo.schmitt1@gmail.com>
Link: https://patch.msgid.link/20250421131539.912966-1-gshahrouzi@gmail.com
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: David Lechner <dlechner@baylibre.com>
Date: Thu Apr 17 11:52:37 2025 -0500
iio: chemical: sps30: use aligned_s64 for timestamp
[ Upstream commit bb49d940344bcb8e2b19e69d7ac86f567887ea9a ]
Follow the pattern of other drivers and use aligned_s64 for the
timestamp. This will ensure that the timestamp is correctly aligned on
all architectures.
Fixes: a5bf6fdd19c3 ("iio:chemical:sps30: Fix timestamp alignment")
Signed-off-by: David Lechner <dlechner@baylibre.com>
Reviewed-by: Nuno Sá <nuno.sa@analog.com>
Link: https://patch.msgid.link/20250417-iio-more-timestamp-alignment-v1-5-eafac1e22318@baylibre.com
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Silvano Seva <s.seva@4sigma.it>
Date: Tue Mar 11 09:49:47 2025 +0100
iio: imu: st_lsm6dsx: fix possible lockup in st_lsm6dsx_read_fifo
commit 159ca7f18129834b6f4c7eae67de48e96c752fc9 upstream.
Prevent st_lsm6dsx_read_fifo from falling in an infinite loop in case
pattern_len is equal to zero and the device FIFO is not empty.
Fixes: 290a6ce11d93 ("iio: imu: add support to lsm6dsx driver")
Signed-off-by: Silvano Seva <s.seva@4sigma.it>
Acked-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://patch.msgid.link/20250311085030.3593-2-s.seva@4sigma.it
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Silvano Seva <s.seva@4sigma.it>
Date: Tue Mar 11 09:49:49 2025 +0100
iio: imu: st_lsm6dsx: fix possible lockup in st_lsm6dsx_read_tagged_fifo
commit 8114ef86e2058e2554111b793596f17bee23fa15 upstream.
Prevent st_lsm6dsx_read_tagged_fifo from falling in an infinite loop in
case pattern_len is equal to zero and the device FIFO is not empty.
Fixes: 801a6e0af0c6 ("iio: imu: st_lsm6dsx: add support to LSM6DSO")
Signed-off-by: Silvano Seva <s.seva@4sigma.it>
Acked-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://patch.msgid.link/20250311085030.3593-4-s.seva@4sigma.it
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Aditya Garg <gargaditya08@live.com>
Date: Wed May 7 12:12:15 2025 -0700
Input: synaptics - enable InterTouch on Dell Precision M3800
commit a609cb4cc07aa9ab8f50466622814356c06f2c17 upstream.
Enable InterTouch mode on Dell Precision M3800 by adding "DLL060d" to
the list of SMBus-enabled variants.
Reported-by: Markus Rathgeb <maggu2810@gmail.com>
Signed-off-by: Aditya Garg <gargaditya08@live.com>
Link: https://lore.kernel.org/r/PN3PR01MB959789DD6D574E16141E5DC4B888A@PN3PR01MB9597.INDPRD01.PROD.OUTLOOK.COM
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Manuel Fombuena <fombuena@outlook.com>
Date: Wed May 7 12:05:26 2025 -0700
Input: synaptics - enable InterTouch on Dynabook Portege X30-D
commit 6d7ea0881000966607772451b789b5fb5766f11d upstream.
[ 5.989588] psmouse serio1: synaptics: Your touchpad (PNP: TOS0213 PNP0f03) says it can support a different bus. If i2c-hid and hid-rmi are not used, you might want to try setting psmouse.synaptics_intertouch to 1 and report this to linux-input@vger.kernel.org.
[ 6.039923] psmouse serio1: synaptics: Touchpad model: 1, fw: 9.32, id: 0x1e2a1, caps: 0xf00223/0x840300/0x12e800/0x52d884, board id: 3322, fw id: 2658004
The board is labelled TM3322.
Present on the Toshiba / Dynabook Portege X30-D and possibly others.
Confirmed working well with psmouse.synaptics_intertouch=1 and local build.
Signed-off-by: Manuel Fombuena <fombuena@outlook.com>
Signed-off-by: Aditya Garg <gargaditya08@live.com>
Link: https://lore.kernel.org/r/PN3PR01MB9597711E7933A08389FEC31DB888A@PN3PR01MB9597.INDPRD01.PROD.OUTLOOK.COM
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Aditya Garg <gargaditya08@live.com>
Date: Wed May 7 12:06:32 2025 -0700
Input: synaptics - enable InterTouch on Dynabook Portege X30L-G
commit 47d768b32e644b56901bb4bbbdb1feb01ea86c85 upstream.
Enable InterTouch mode on Dynabook Portege X30L-G by adding "TOS01f6" to
the list of SMBus-enabled variants.
Reported-by: Xuntao Chi <chotaotao1qaz2wsx@gmail.com>
Tested-by: Xuntao Chi <chotaotao1qaz2wsx@gmail.com>
Signed-off-by: Aditya Garg <gargaditya08@live.com>
Link: https://lore.kernel.org/r/PN3PR01MB959786E4AC797160CDA93012B888A@PN3PR01MB9597.INDPRD01.PROD.OUTLOOK.COM
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Aditya Garg <gargaditya08@live.com>
Date: Wed May 7 12:09:00 2025 -0700
Input: synaptics - enable InterTouch on TUXEDO InfinityBook Pro 14 v5
commit 2abc698ac77314e0de5b33a6d96a39c5159d88e4 upstream.
Enable InterTouch mode on TUXEDO InfinityBook Pro 14 v5 by adding
"SYN1221" to the list of SMBus-enabled variants.
Add support for InterTouch on SYN1221 by adding it to the list of
SMBus-enabled variants.
Reported-by: Matthias Eilert <kernel.hias@eilert.tech>
Tested-by: Matthias Eilert <kernel.hias@eilert.tech>
Signed-off-by: Aditya Garg <gargaditya08@live.com>
Link: https://lore.kernel.org/r/PN3PR01MB9597C033C4BC20EE2A0C4543B888A@PN3PR01MB9597.INDPRD01.PROD.OUTLOOK.COM
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Date: Wed May 7 14:52:55 2025 -0700
Input: synaptics - enable SMBus for HP Elitebook 850 G1
commit f04f03d3e99bc8f89b6af5debf07ff67d961bc23 upstream.
The kernel reports that the touchpad for this device can support
SMBus mode.
Reported-by: jt <enopatch@gmail.com>
Link: https://lore.kernel.org/r/iys5dbv3ldddsgobfkxldazxyp54kay4bozzmagga6emy45jop@2ebvuxgaui4u
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Pavel Paklov <Pavel.Paklov@cyberprotect.ru>
Date: Tue Mar 25 09:22:44 2025 +0000
iommu/amd: Fix potential buffer overflow in parse_ivrs_acpihid
commit 8dee308e4c01dea48fc104d37f92d5b58c50b96c upstream.
There is a string parsing logic error which can lead to an overflow of hid
or uid buffers. Comparing ACPIID_LEN against a total string length doesn't
take into account the lengths of individual hid and uid buffers so the
check is insufficient in some cases. For example if the length of hid
string is 4 and the length of the uid string is 260, the length of str
will be equal to ACPIID_LEN + 1 but uid string will overflow uid buffer
which size is 256.
The same applies to the hid string with length 13 and uid string with
length 250.
Check the length of hid and uid strings separately to prevent
buffer overflow.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Fixes: ca3bf5d47cec ("iommu/amd: Introduces ivrs_acpihid kernel parameter")
Cc: stable@vger.kernel.org
Signed-off-by: Pavel Paklov <Pavel.Paklov@cyberprotect.ru>
Link: https://lore.kernel.org/r/20250325092259.392844-1-Pavel.Paklov@cyberprotect.ru
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Mingcong Bai <jeffbai@aosc.io>
Date: Fri Apr 18 11:16:42 2025 +0800
iommu/vt-d: Apply quirk_iommu_igfx for 8086:0044 (QM57/QS57)
commit 2c8a7c66c90832432496616a9a3c07293f1364f3 upstream.
On the Lenovo ThinkPad X201, when Intel VT-d is enabled in the BIOS, the
kernel boots with errors related to DMAR, the graphical interface appeared
quite choppy, and the system resets erratically within a minute after it
booted:
DMAR: DRHD: handling fault status reg 3
DMAR: [DMA Write NO_PASID] Request device [00:02.0] fault addr 0xb97ff000
[fault reason 0x05] PTE Write access is not set
Upon comparing boot logs with VT-d on/off, I found that the Intel Calpella
quirk (`quirk_calpella_no_shadow_gtt()') correctly applied the igfx IOMMU
disable/quirk correctly:
pci 0000:00:00.0: DMAR: BIOS has allocated no shadow GTT; disabling IOMMU
for graphics
Whereas with VT-d on, it went into the "else" branch, which then
triggered the DMAR handling fault above:
... else if (!disable_igfx_iommu) {
/* we have to ensure the gfx device is idle before we flush */
pci_info(dev, "Disabling batched IOTLB flush on Ironlake\n");
iommu_set_dma_strict();
}
Now, this is not exactly scientific, but moving 0x0044 to quirk_iommu_igfx
seems to have fixed the aforementioned issue. Running a few `git blame'
runs on the function, I have found that the quirk was originally
introduced as a fix specific to ThinkPad X201:
commit 9eecabcb9a92 ("intel-iommu: Abort IOMMU setup for igfx if BIOS gave
no shadow GTT space")
Which was later revised twice to the "else" branch we saw above:
- 2011: commit 6fbcfb3e467a ("intel-iommu: Workaround IOTLB hang on
Ironlake GPU")
- 2024: commit ba00196ca41c ("iommu/vt-d: Decouple igfx_off from graphic
identity mapping")
I'm uncertain whether further testings on this particular laptops were
done in 2011 and (honestly I'm not sure) 2024, but I would be happy to do
some distro-specific testing if that's what would be required to verify
this patch.
P.S., I also see IDs 0x0040, 0x0062, and 0x006a listed under the same
`quirk_calpella_no_shadow_gtt()' quirk, but I'm not sure how similar these
chipsets are (if they share the same issue with VT-d or even, indeed, if
this issue is specific to a bug in the Lenovo BIOS). With regards to
0x0062, it seems to be a Centrino wireless card, but not a chipset?
I have also listed a couple (distro and kernel) bug reports below as
references (some of them are from 7-8 years ago!), as they seem to be
similar issue found on different Westmere/Ironlake, Haswell, and Broadwell
hardware setups.
Cc: stable@vger.kernel.org
Fixes: 6fbcfb3e467a ("intel-iommu: Workaround IOTLB hang on Ironlake GPU")
Fixes: ba00196ca41c ("iommu/vt-d: Decouple igfx_off from graphic identity mapping")
Link: https://groups.google.com/g/qubes-users/c/4NP4goUds2c?pli=1
Link: https://bugs.archlinux.org/task/65362
Link: https://bbs.archlinux.org/viewtopic.php?id=230323
Reported-by: Wenhao Sun <weiguangtwk@outlook.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=197029
Signed-off-by: Mingcong Bai <jeffbai@aosc.io>
Link: https://lore.kernel.org/r/20250415133330.12528-1-jeffbai@aosc.io
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Kuniyuki Iwashima <kuniyu@amazon.com>
Date: Fri Feb 7 16:24:58 2025 +0900
ip: fib_rules: Fetch net from fib_rule in fib[46]_rule_configure().
[ Upstream commit 5a1ccffd30a08f5a2428cd5fbb3ab03e8eb6c66d ]
The following patch will not set skb->sk from VRF path.
Let's fetch net from fib_rule->fr_net instead of sock_net(skb->sk)
in fib[46]_rule_configure().
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Tested-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/20250207072502.87775-5-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Kuniyuki Iwashima <kuniyu@amazon.com>
Date: Thu Feb 27 20:23:27 2025 -0800
ipv4: fib: Move fib_valid_key_len() to rtm_to_fib_config().
[ Upstream commit 254ba7e6032d3fc738050d500b0c1d8197af90ca ]
fib_valid_key_len() is called in the beginning of fib_table_insert()
or fib_table_delete() to check if the prefix length is valid.
fib_table_insert() and fib_table_delete() are called from 3 paths
- ip_rt_ioctl()
- inet_rtm_newroute() / inet_rtm_delroute()
- fib_magic()
In the first ioctl() path, rtentry_to_fib_config() checks the prefix
length with bad_mask(). Also, fib_magic() always passes the correct
prefix: 32 or ifa->ifa_prefixlen, which is already validated.
Let's move fib_valid_key_len() to the rtnetlink path, rtm_to_fib_config().
While at it, 2 direct returns in rtm_to_fib_config() are changed to
goto to match other places in the same function
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20250228042328.96624-12-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Willem de Bruijn <willemb@google.com>
Date: Thu Mar 6 22:34:09 2025 -0500
ipv6: save dontfrag in cork
[ Upstream commit a18dfa9925b9ef6107ea3aa5814ca3c704d34a8a ]
When spanning datagram construction over multiple send calls using
MSG_MORE, per datagram settings are configured on the first send.
That is when ip(6)_setup_cork stores these settings for subsequent use
in __ip(6)_append_data and others.
The only flag that escaped this was dontfrag. As a result, a datagram
could be constructed with df=0 on the first sendmsg, but df=1 on a
next. Which is what cmsg_ip.sh does in an upcoming MSG_MORE test in
the "diff" scenario.
Changing datagram conditions in the middle of constructing an skb
makes this already complex code path even more convoluted. It is here
unintentional. Bring this flag in line with expected sockopt/cmsg
behavior.
And stop passing ipc6 to __ip6_append_data, to avoid such issues
in the future. This is already the case for __ip_append_data.
inet6_cork had a 6 byte hole, so the 1B flag has no impact.
Signed-off-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20250307033620.411611-3-willemdebruijn.kernel@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Xiang wangx <wangxiang@cdjrlc.com>
Date: Thu Dec 9 21:24:53 2021 +0800
irqchip/gic-v2m: Add const to of_device_id
[ Upstream commit c10f2f8b5d8027c1ea77f777f2d16cb9043a6c09 ]
struct of_device_id should normally be const.
Signed-off-by: Xiang wangx <wangxiang@cdjrlc.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211209132453.25623-1-wangxiang@cdjrlc.com
Stable-dep-of: 3318dc299b07 ("irqchip/gic-v2m: Prevent use after free of gicv2m_get_fwnode()")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Thomas Gleixner <tglx@linutronix.de>
Date: Mon Nov 21 15:39:33 2022 +0100
irqchip/gic-v2m: Mark a few functions __init
[ Upstream commit d51a15af37ce8cf59e73de51dcdce3c9f4944974 ]
They are all part of the init sequence.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221121140048.534395323@linutronix.de
Stable-dep-of: 3318dc299b07 ("irqchip/gic-v2m: Prevent use after free of gicv2m_get_fwnode()")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Suzuki K Poulose <suzuki.poulose@arm.com>
Date: Tue Apr 22 17:16:16 2025 +0100
irqchip/gic-v2m: Prevent use after free of gicv2m_get_fwnode()
[ Upstream commit 3318dc299b072a0511d6dfd8367f3304fb6d9827 ]
With ACPI in place, gicv2m_get_fwnode() is registered with the pci
subsystem as pci_msi_get_fwnode_cb(), which may get invoked at runtime
during a PCI host bridge probe. But, the call back is wrongly marked as
__init, causing it to be freed, while being registered with the PCI
subsystem and could trigger:
Unable to handle kernel paging request at virtual address ffff8000816c0400
gicv2m_get_fwnode+0x0/0x58 (P)
pci_set_bus_msi_domain+0x74/0x88
pci_register_host_bridge+0x194/0x548
This is easily reproducible on a Juno board with ACPI boot.
Retain the function for later use.
Fixes: 0644b3daca28 ("irqchip/gic-v2m: acpi: Introducing GICv2m ACPI support")
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Nathan Chancellor <nathan@kernel.org>
Date: Tue May 6 14:02:01 2025 -0700
kbuild: Disable -Wdefault-const-init-unsafe
commit d0afcfeb9e3810ec89d1ffde1a0e36621bb75dca upstream.
A new on by default warning in clang [1] aims to flags instances where
const variables without static or thread local storage or const members
in aggregate types are not initialized because it can lead to an
indeterminate value. This is quite noisy for the kernel due to
instances originating from header files such as:
drivers/gpu/drm/i915/gt/intel_ring.h:62:2: error: default initialization of an object of type 'typeof (ring->size)' (aka 'const unsigned int') leaves the object uninitialized [-Werror,-Wdefault-const-init-var-unsafe]
62 | typecheck(typeof(ring->size), next);
| ^
include/linux/typecheck.h:10:9: note: expanded from macro 'typecheck'
10 | ({ type __dummy; \
| ^
include/net/ip.h:478:14: error: default initialization of an object of type 'typeof (rt->dst.expires)' (aka 'const unsigned long') leaves the object uninitialized [-Werror,-Wdefault-const-init-var-unsafe]
478 | if (mtu && time_before(jiffies, rt->dst.expires))
| ^
include/linux/jiffies.h:138:26: note: expanded from macro 'time_before'
138 | #define time_before(a,b) time_after(b,a)
| ^
include/linux/jiffies.h:128:3: note: expanded from macro 'time_after'
128 | (typecheck(unsigned long, a) && \
| ^
include/linux/typecheck.h:11:12: note: expanded from macro 'typecheck'
11 | typeof(x) __dummy2; \
| ^
include/linux/list.h:409:27: warning: default initialization of an object of type 'union (unnamed union at include/linux/list.h:409:27)' with const member leaves the object uninitialized [-Wdefault-const-init-field-unsafe]
409 | struct list_head *next = smp_load_acquire(&head->next);
| ^
include/asm-generic/barrier.h:176:29: note: expanded from macro 'smp_load_acquire'
176 | #define smp_load_acquire(p) __smp_load_acquire(p)
| ^
arch/arm64/include/asm/barrier.h:164:59: note: expanded from macro '__smp_load_acquire'
164 | union { __unqual_scalar_typeof(*p) __val; char __c[1]; } __u; \
| ^
include/linux/list.h:409:27: note: member '__val' declared 'const' here
crypto/scatterwalk.c:66:22: error: default initialization of an object of type 'struct scatter_walk' with const member leaves the object uninitialized [-Werror,-Wdefault-const-init-field-unsafe]
66 | struct scatter_walk walk;
| ^
include/crypto/algapi.h:112:15: note: member 'addr' declared 'const' here
112 | void *const addr;
| ^
fs/hugetlbfs/inode.c:733:24: error: default initialization of an object of type 'struct vm_area_struct' with const member leaves the object uninitialized [-Werror,-Wdefault-const-init-field-unsafe]
733 | struct vm_area_struct pseudo_vma;
| ^
include/linux/mm_types.h:803:20: note: member 'vm_flags' declared 'const' here
803 | const vm_flags_t vm_flags;
| ^
Silencing the instances from typecheck.h is difficult because '= {}' is
not available in older but supported compilers and '= {0}' would cause
warnings about a literal 0 being treated as NULL. While it might be
possible to come up with a local hack to silence the warning for
clang-21+, it may not be worth it since -Wuninitialized will still
trigger if an uninitialized const variable is actually used.
In all audited cases of the "field" variant of the warning, the members
are either not used in the particular call path, modified through other
means such as memset() / memcpy() because the containing object is not
const, or are within a union with other non-const members.
Since this warning does not appear to have a high signal to noise ratio,
just disable it.
Cc: stable@vger.kernel.org
Link: https://github.com/llvm/llvm-project/commit/576161cb6069e2c7656a8ef530727a0f4aefff30 [1]
Reported-by: Linux Kernel Functional Testing <lkft@linaro.org>
Closes: https://lore.kernel.org/CA+G9fYuNjKcxFKS_MKPRuga32XbndkLGcY-PVuoSwzv6VWbY=w@mail.gmail.com/
Reported-by: Marcus Seyfarth <m.seyfarth@gmail.com>
Closes: https://github.com/ClangBuiltLinux/linux/issues/2088
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
[nathan: Apply change to Makefile instead of scripts/Makefile.extrawarn
due to lack of e88ca24319e4 in older stable branches]
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Seyediman Seyedarab <imandevel@gmail.com>
Date: Sat Mar 1 17:21:37 2025 -0500
kbuild: fix argument parsing in scripts/config
[ Upstream commit f757f6011c92b5a01db742c39149bed9e526478f ]
The script previously assumed --file was always the first argument,
which caused issues when it appeared later. This patch updates the
parsing logic to scan all arguments to find --file, sets the config
file correctly, and resets the argument list with the remaining
commands.
It also fixes --refresh to respect --file by passing KCONFIG_CONFIG=$FN
to make oldconfig.
Signed-off-by: Seyediman Seyedarab <imandevel@gmail.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Daniel Gomez <da.gomez@samsung.com>
Date: Fri Mar 28 14:28:37 2025 +0000
kconfig: merge_config: use an empty file as initfile
[ Upstream commit a26fe287eed112b4e21e854f173c8918a6a8596d ]
The scripts/kconfig/merge_config.sh script requires an existing
$INITFILE (or the $1 argument) as a base file for merging Kconfig
fragments. However, an empty $INITFILE can serve as an initial starting
point, later referenced by the KCONFIG_ALLCONFIG Makefile variable
if -m is not used. This variable can point to any configuration file
containing preset config symbols (the merged output) as stated in
Documentation/kbuild/kconfig.rst. When -m is used $INITFILE will
contain just the merge output requiring the user to run make (i.e.
KCONFIG_ALLCONFIG=<$INITFILE> make <allnoconfig/alldefconfig> or make
olddefconfig).
Instead of failing when `$INITFILE` is missing, create an empty file and
use it as the starting point for merges.
Signed-off-by: Daniel Gomez <da.gomez@samsung.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Nandakumar Edamana <nandakumar@nandakumar.co.in>
Date: Sat Feb 22 02:31:11 2025 +0530
libbpf: Fix out-of-bound read
[ Upstream commit 236d3910117e9f97ebf75e511d8bcc950f1a4e5f ]
In `set_kcfg_value_str`, an untrusted string is accessed with the assumption
that it will be at least two characters long due to the presence of checks for
opening and closing quotes. But the check for the closing quote
(value[len - 1] != '"') misses the fact that it could be checking the opening
quote itself in case of an invalid input that consists of just the opening
quote.
This commit adds an explicit check to make sure the string is at least two
characters long.
Signed-off-by: Nandakumar Edamana <nandakumar@nandakumar.co.in>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20250221210110.3182084-1-nandakumar@nandakumar.co.in
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Robert Richter <rrichter@amd.com>
Date: Thu Mar 20 12:22:22 2025 +0100
libnvdimm/labels: Fix divide error in nd_label_data_init()
[ Upstream commit ef1d3455bbc1922f94a91ed58d3d7db440652959 ]
If a faulty CXL memory device returns a broken zero LSA size in its
memory device information (Identify Memory Device (Opcode 4000h), CXL
spec. 3.1, 8.2.9.9.1.1), a divide error occurs in the libnvdimm
driver:
Oops: divide error: 0000 [#1] PREEMPT SMP NOPTI
RIP: 0010:nd_label_data_init+0x10e/0x800 [libnvdimm]
Code and flow:
1) CXL Command 4000h returns LSA size = 0
2) config_size is assigned to zero LSA size (CXL pmem driver):
drivers/cxl/pmem.c: .config_size = mds->lsa_size,
3) max_xfer is set to zero (nvdimm driver):
drivers/nvdimm/label.c: max_xfer = min_t(size_t, ndd->nsarea.max_xfer, config_size);
4) A subsequent DIV_ROUND_UP() causes a division by zero:
drivers/nvdimm/label.c: /* Make our initial read size a multiple of max_xfer size */
drivers/nvdimm/label.c: read_size = min(DIV_ROUND_UP(read_size, max_xfer) * max_xfer,
drivers/nvdimm/label.c- config_size);
Fix this by checking the config size parameter by extending an
existing check.
Signed-off-by: Robert Richter <rrichter@amd.com>
Reviewed-by: Pankaj Gupta <pankaj.gupta@amd.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Link: https://patch.msgid.link/20250320112223.608320-1-rrichter@amd.com
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Date: Wed Jun 4 14:37:09 2025 +0200
Linux 5.10.238
Link: https://lore.kernel.org/r/20250602134307.195171844@linuxfoundation.org
Tested-by: Florian Fainelli <florian.fainelli@broadcom.com>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Ilia Gavrilov <Ilia.Gavrilov@infotecs.ru>
Date: Thu May 15 12:20:15 2025 +0000
llc: fix data loss when reading from a socket in llc_ui_recvmsg()
commit 239af1970bcb039a1551d2c438d113df0010c149 upstream.
For SOCK_STREAM sockets, if user buffer size (len) is less
than skb size (skb->len), the remaining data from skb
will be lost after calling kfree_skb().
To fix this, move the statement for partial reading
above skb deletion.
Found by InfoTeCS on behalf of Linux Verification Center (linuxtesting.org)
Fixes: 30a584d944fb ("[LLX]: SOCK_DGRAM interface fixes")
Cc: stable@vger.kernel.org
Signed-off-by: Ilia Gavrilov <Ilia.Gavrilov@infotecs.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Tudor Ambarus <tudor.ambarus@linaro.org>
Date: Mon Feb 24 08:27:13 2025 +0000
mailbox: use error ret code of of_parse_phandle_with_args()
[ Upstream commit 24fdd5074b205cfb0ef4cd0751a2d03031455929 ]
In case of error, of_parse_phandle_with_args() returns -EINVAL when the
passed index is negative, or -ENOENT when the index is for an empty
phandle. The mailbox core overwrote the error return code with a less
precise -ENODEV. Use the error returned code from
of_parse_phandle_with_args().
Signed-off-by: Tudor Ambarus <tudor.ambarus@linaro.org>
Signed-off-by: Jassi Brar <jassisinghbrar@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Markus Elfring <elfring@users.sourceforge.net>
Date: Fri Oct 4 15:50:15 2024 +0200
media: c8sectpfe: Call of_node_put(i2c_bus) only once in c8sectpfe_probe()
[ Upstream commit b773530a34df0687020520015057075f8b7b4ac4 ]
An of_node_put(i2c_bus) call was immediately used after a pointer check
for an of_find_i2c_adapter_by_node() call in this function implementation.
Thus call such a function only once instead directly before the check.
This issue was transformed by using the Coccinelle software.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Hans Verkuil <hverkuil@xs4all.nl>
Date: Mon Feb 24 14:13:24 2025 +0100
media: cx231xx: set device_caps for 417
[ Upstream commit a79efc44b51432490538a55b9753a721f7d3ea42 ]
The video_device for the MPEG encoder did not set device_caps.
Add this, otherwise the video device can't be registered (you get a
WARN_ON instead).
Not seen before since currently 417 support is disabled, but I found
this while experimenting with it.
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Sakari Ailus <sakari.ailus@linux.intel.com>
Date: Mon Dec 16 10:48:49 2024 +0200
media: v4l: Memset argument to 0 before calling get_mbus_config pad op
[ Upstream commit 91d6a99acfa5ce9f95ede775074b80f7193bd717 ]
Memset the config argument to get_mbus_config V4L2 sub-device pad
operation to zero before calling the operation. This ensures the callers
don't need to bother with it nor the implementations need to set all
fields that may not be relevant to them.
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Reviewed-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Breno Leitao <leitao@debian.org>
Date: Fri May 23 10:21:06 2025 -0700
memcg: always call cond_resched() after fn()
commit 06717a7b6c86514dbd6ab322e8083ffaa4db5712 upstream.
I am seeing soft lockup on certain machine types when a cgroup OOMs. This
is happening because killing the process in certain machine might be very
slow, which causes the soft lockup and RCU stalls. This happens usually
when the cgroup has MANY processes and memory.oom.group is set.
Example I am seeing in real production:
[462012.244552] Memory cgroup out of memory: Killed process 3370438 (crosvm) ....
....
[462037.318059] Memory cgroup out of memory: Killed process 4171372 (adb) ....
[462037.348314] watchdog: BUG: soft lockup - CPU#64 stuck for 26s! [stat_manager-ag:1618982]
....
Quick look at why this is so slow, it seems to be related to serial flush
for certain machine types. For all the crashes I saw, the target CPU was
at console_flush_all().
In the case above, there are thousands of processes in the cgroup, and it
is soft locking up before it reaches the 1024 limit in the code (which
would call the cond_resched()). So, cond_resched() in 1024 blocks is not
sufficient.
Remove the counter-based conditional rescheduling logic and call
cond_resched() unconditionally after each task iteration, after fn() is
called. This avoids the lockup independently of how slow fn() is.
Link: https://lkml.kernel.org/r/20250523-memcg_fix-v1-1-ad3eafb60477@debian.org
Fixes: ade81479c7dd ("memcg: fix soft lockup in the OOM process")
Signed-off-by: Breno Leitao <leitao@debian.org>
Suggested-by: Rik van Riel <riel@surriel.com>
Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
Cc: Michael van der Westhuizen <rmikey@meta.com>
Cc: Usama Arif <usamaarif642@gmail.com>
Cc: Pavel Begunkov <asml.silence@gmail.com>
Cc: Chen Ridong <chenridong@huawei.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Thorsten Blum <thorsten.blum@linux.dev>
Date: Sun Apr 27 13:34:24 2025 +0200
MIPS: Fix MAX_REG_OFFSET
[ Upstream commit c44572e0cc13c9afff83fd333135a0aa9b27ba26 ]
Fix MAX_REG_OFFSET to point to the last register in 'pt_regs' and not to
the marker itself, which could allow regs_get_register() to return an
invalid offset.
Fixes: 40e084a506eb ("MIPS: Add uprobes support.")
Suggested-by: Maciej W. Rozycki <macro@orcam.me.uk>
Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Paul Burton <paulburton@kernel.org>
Date: Wed Jan 29 13:32:48 2025 +0100
MIPS: pm-cps: Use per-CPU variables as per-CPU, not per-core
[ Upstream commit 00a134fc2bb4a5f8fada58cf7ff4259149691d64 ]
The pm-cps code has up until now used per-CPU variables indexed by core,
rather than CPU number, in order to share data amongst sibling CPUs (ie.
VPs/threads in a core). This works fine for single cluster systems, but
with multi-cluster systems a core number is no longer unique in the
system, leading to sharing between CPUs that are not actually siblings.
Avoid this issue by using per-CPU variables as they are more generally
used - ie. access them using CPU numbers rather than core numbers.
Sharing between siblings is then accomplished by:
- Assigning the same pointer to entries for each sibling CPU for the
nc_asm_enter & ready_count variables, which allow this by virtue of
being per-CPU pointers.
- Indexing by the first CPU set in a CPUs cpu_sibling_map in the case
of pm_barrier, for which we can't use the previous approach because
the per-CPU variable is not a pointer.
Signed-off-by: Paul Burton <paulburton@kernel.org>
Signed-off-by: Dragan Mladjenovic <dragan.mladjenovic@syrmia.com>
Signed-off-by: Aleksandar Rikalo <arikalo@gmail.com>
Tested-by: Serge Semin <fancer.lancer@gmail.com>
Tested-by: Gregory CLEMENT <gregory.clement@bootlin.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Bibo Mao <maobibo@loongson.cn>
Date: Tue Jun 9 10:54:35 2020 +0800
MIPS: Use arch specific syscall name match function
[ Upstream commit 756276ce78d5624dc814f9d99f7d16c8fd51076e ]
On MIPS system, most of the syscall function name begin with prefix
sys_. Some syscalls are special such as clone/fork, function name of
these begin with __sys_. Since scratch registers need be saved in
stack when these system calls happens.
With ftrace system call method, system call functions are declared with
SYSCALL_DEFINEx, metadata of the system call symbol name begins with
sys_. Here mips specific function arch_syscall_match_sym_name is used to
compare function name between sys_call_table[] and metadata of syscall
symbol.
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Tianyang Zhang <zhangtianyang@loongson.cn>
Date: Wed Apr 16 16:24:05 2025 +0800
mm/page_alloc.c: avoid infinite retries caused by cpuset race
commit e05741fb10c38d70bbd7ec12b23c197b6355d519 upstream.
__alloc_pages_slowpath has no change detection for ac->nodemask in the
part of retry path, while cpuset can modify it in parallel. For some
processes that set mempolicy as MPOL_BIND, this results ac->nodemask
changes, and then the should_reclaim_retry will judge based on the latest
nodemask and jump to retry, while the get_page_from_freelist only
traverses the zonelist from ac->preferred_zoneref, which selected by a
expired nodemask and may cause infinite retries in some cases
cpu 64:
__alloc_pages_slowpath {
/* ..... */
retry:
/* ac->nodemask = 0x1, ac->preferred->zone->nid = 1 */
if (alloc_flags & ALLOC_KSWAPD)
wake_all_kswapds(order, gfp_mask, ac);
/* cpu 1:
cpuset_write_resmask
update_nodemask
update_nodemasks_hier
update_tasks_nodemask
mpol_rebind_task
mpol_rebind_policy
mpol_rebind_nodemask
// mempolicy->nodes has been modified,
// which ac->nodemask point to
*/
/* ac->nodemask = 0x3, ac->preferred->zone->nid = 1 */
if (should_reclaim_retry(gfp_mask, order, ac, alloc_flags,
did_some_progress > 0, &no_progress_loops))
goto retry;
}
Simultaneously starting multiple cpuset01 from LTP can quickly reproduce
this issue on a multi node server when the maximum memory pressure is
reached and the swap is enabled
Link: https://lkml.kernel.org/r/20250416082405.20988-1-zhangtianyang@loongson.cn
Fixes: c33d6c06f60f ("mm, page_alloc: avoid looking up the first zone in a zonelist twice")
Signed-off-by: Tianyang Zhang <zhangtianyang@loongson.cn>
Reviewed-by: Suren Baghdasaryan <surenb@google.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Brendan Jackman <jackmanb@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Zi Yan <ziy@nvidia.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Erick Shepherd <erick.shepherd@ni.com>
Date: Fri Mar 14 14:50:21 2025 -0500
mmc: host: Wait for Vdd to settle on card power off
[ Upstream commit 31e75ed964582257f59156ce6a42860e1ae4cc39 ]
The SD spec version 6.0 section 6.4.1.5 requires that Vdd must be
lowered to less than 0.5V for a minimum of 1 ms when powering off a
card. Increase wait to 15 ms so that voltage has time to drain down
to 0.5V and cards can power off correctly. Issues with voltage drain
time were only observed on Apollo Lake and Bay Trail host controllers
so this fix is limited to those devices.
Signed-off-by: Erick Shepherd <erick.shepherd@ni.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Link: https://lore.kernel.org/r/20250314195021.1588090-1-erick.shepherd@ni.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ruslan Piasetskyi <ruslan.piasetskyi@gmail.com>
Date: Wed Mar 26 23:06:38 2025 +0100
mmc: renesas_sdhi: Fix error handling in renesas_sdhi_probe
commit 649b50a82f09fa44c2f7a65618e4584072145ab7 upstream.
After moving tmio_mmc_host_probe down, error handling has to be
adjusted.
Fixes: 74f45de394d9 ("mmc: renesas_sdhi: register irqs before registering controller")
Reviewed-by: Ihar Salauyou <salauyou.ihar@gmail.com>
Signed-off-by: Ruslan Piasetskyi <ruslan.piasetskyi@gmail.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Tested-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20250326220638.460083-1-ruslan.piasetskyi@gmail.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Erick Shepherd <erick.shepherd@ni.com>
Date: Tue Feb 11 15:46:45 2025 -0600
mmc: sdhci: Disable SD card clock before changing parameters
[ Upstream commit fb3bbc46c94f261b6156ee863c1b06c84cf157dc ]
Per the SD Host Controller Simplified Specification v4.20 §3.2.3, change
the SD card clock parameters only after first disabling the external card
clock. Doing this fixes a spurious clock pulse on Baytrail and Apollo Lake
SD controllers which otherwise breaks voltage switching with a specific
Swissbit SD card.
Signed-off-by: Kyle Roeschley <kyle.roeschley@ni.com>
Signed-off-by: Brad Mouring <brad.mouring@ni.com>
Signed-off-by: Erick Shepherd <erick.shepherd@ni.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Link: https://lore.kernel.org/r/20250211214645.469279-1-erick.shepherd@ni.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Dmitry Antipov <dmantipov@yandex.ru>
Date: Wed May 7 09:50:44 2025 +0300
module: ensure that kobject_put() is safe for module type kobjects
commit a6aeb739974ec73e5217c75a7c008a688d3d5cf1 upstream.
In 'lookup_or_create_module_kobject()', an internal kobject is created
using 'module_ktype'. So call to 'kobject_put()' on error handling
path causes an attempt to use an uninitialized completion pointer in
'module_kobject_release()'. In this scenario, we just want to release
kobject without an extra synchronization required for a regular module
unloading process, so adding an extra check whether 'complete()' is
actually required makes 'kobject_put()' safe.
Reported-by: syzbot+7fb8a372e1f6add936dd@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=7fb8a372e1f6add936dd
Fixes: 942e443127e9 ("module: Fix mod->mkobj.kobj potentially freed too early")
Cc: stable@vger.kernel.org
Suggested-by: Petr Pavlu <petr.pavlu@suse.com>
Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru>
Link: https://lore.kernel.org/r/20250507065044.86529-1-dmantipov@yandex.ru
Signed-off-by: Petr Pavlu <petr.pavlu@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Kees Cook <kees@kernel.org>
Date: Mon Feb 10 09:45:05 2025 -0800
net/mlx4_core: Avoid impossible mlx4_db_alloc() order value
[ Upstream commit 4a6f18f28627e121bd1f74b5fcc9f945d6dbeb1e ]
GCC can see that the value range for "order" is capped, but this leads
it to consider that it might be negative, leading to a false positive
warning (with GCC 15 with -Warray-bounds -fdiagnostics-details):
../drivers/net/ethernet/mellanox/mlx4/alloc.c:691:47: error: array subscript -1 is below array bounds of 'long unsigned int *[2]' [-Werror=array-bounds=]
691 | i = find_first_bit(pgdir->bits[o], MLX4_DB_PER_PAGE >> o);
| ~~~~~~~~~~~^~~
'mlx4_alloc_db_from_pgdir': events 1-2
691 | i = find_first_bit(pgdir->bits[o], MLX4_DB_PER_PAGE >> o); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
| | | | | (2) out of array bounds here
| (1) when the condition is evaluated to true In file included from ../drivers/net/ethernet/mellanox/mlx4/mlx4.h:53,
from ../drivers/net/ethernet/mellanox/mlx4/alloc.c:42:
../include/linux/mlx4/device.h:664:33: note: while referencing 'bits'
664 | unsigned long *bits[2];
| ^~~~
Switch the argument to unsigned int, which removes the compiler needing
to consider negative values.
Signed-off-by: Kees Cook <kees@kernel.org>
Link: https://patch.msgid.link/20250210174504.work.075-kees@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Shahar Shitrit <shshitrit@nvidia.com>
Date: Thu Feb 13 11:46:38 2025 +0200
net/mlx5: Apply rate-limiting to high temperature warning
[ Upstream commit 9dd3d5d258aceb37bdf09c8b91fa448f58ea81f0 ]
Wrap the high temperature warning in a temperature event with
a call to net_ratelimit() to prevent flooding the kernel log
with repeated warning messages when temperature exceeds the
threshold multiple times within a short duration.
Signed-off-by: Shahar Shitrit <shshitrit@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com>
Link: https://patch.msgid.link/20250213094641.226501-2-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Moshe Shemesh <moshe@nvidia.com>
Date: Wed Feb 26 14:25:40 2025 +0200
net/mlx5: Avoid report two health errors on same syndrome
[ Upstream commit b5d7b2f04ebcff740f44ef4d295b3401aeb029f4 ]
In case health counter has not increased for few polling intervals, miss
counter will reach max misses threshold and health report will be
triggered for FW health reporter. In case syndrome found on same health
poll another health report will be triggered.
Avoid two health reports on same syndrome by marking this syndrome as
already known.
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Chris Mi <cmi@nvidia.com>
Date: Wed Apr 23 11:36:11 2025 +0300
net/mlx5: E-switch, Fix error handling for enabling roce
[ Upstream commit 90538d23278a981e344d364e923162fce752afeb ]
The cited commit assumes enabling roce always succeeds. But it is
not true. Add error handling for it.
Fixes: 80f09dfc237f ("net/mlx5: Eswitch, enable RoCE loopback traffic")
Signed-off-by: Chris Mi <cmi@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Link: https://patch.msgid.link/20250423083611.324567-6-mbloch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Maor Gottlieb <maorg@nvidia.com>
Date: Wed Apr 23 11:36:08 2025 +0300
net/mlx5: E-Switch, Initialize MAC Address for Default GID
[ Upstream commit 5d1a04f347e6cbf5ffe74da409a5d71fbe8c5f19 ]
Initialize the source MAC address when creating the default GID entry.
Since this entry is used only for loopback traffic, it only needs to
be a unicast address. A zeroed-out MAC address is sufficient for this
purpose.
Without this fix, random bits would be assigned as the source address.
If these bits formed a multicast address, the firmware would return an
error, preventing the user from switching to switchdev mode:
Error: mlx5_core: Failed setting eswitch to offloads.
kernel answers: Invalid argument
Fixes: 80f09dfc237f ("net/mlx5: Eswitch, enable RoCE loopback traffic")
Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Link: https://patch.msgid.link/20250423083611.324567-3-mbloch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Alexei Lazar <alazar@nvidia.com>
Date: Sun Feb 9 12:17:15 2025 +0200
net/mlx5: Extend Ethtool loopback selftest to support non-linear SKB
[ Upstream commit 95b9606b15bb3ce1198d28d2393dd0e1f0a5f3e9 ]
Current loopback test validation ignores non-linear SKB case in
the SKB access, which can lead to failures in scenarios such as
when HW GRO is enabled.
Linearize the SKB so both cases will be handled.
Signed-off-by: Alexei Lazar <alazar@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20250209101716.112774-15-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Shahar Shitrit <shshitrit@nvidia.com>
Date: Thu Feb 13 11:46:40 2025 +0200
net/mlx5: Modify LSB bitmask in temperature event to include only the first bit
[ Upstream commit 633f16d7e07c129a36b882c05379e01ce5bdb542 ]
In the sensor_count field of the MTEWE register, bits 1-62 are
supported only for unmanaged switches, not for NICs, and bit 63
is reserved for internal use.
To prevent confusing output that may include set bits that are
not relevant to NIC sensors, we update the bitmask to retain only
the first bit, which corresponds to the sensor ASIC.
Signed-off-by: Shahar Shitrit <shshitrit@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com>
Link: https://patch.msgid.link/20250213094641.226501-4-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Wenpeng Liang <liangwenpeng@huawei.com>
Date: Thu Apr 1 21:07:42 2021 +0800
net/mlx5: Remove return statement exist at the end of void function
[ Upstream commit 9dee115bc1478b6a51f664defbc5b091985a3fd3 ]
void function return statements are not generally useful.
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Stable-dep-of: 90538d23278a ("net/mlx5: E-switch, Fix error handling for enabling roce")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: William Tu <witu@nvidia.com>
Date: Sun Feb 9 12:17:08 2025 +0200
net/mlx5e: reduce rep rxq depth to 256 for ECPF
[ Upstream commit b9cc8f9d700867aaa77aedddfea85e53d5e5d584 ]
By experiments, a single queue representor netdev consumes kernel
memory around 2.8MB, and 1.8MB out of the 2.8MB is due to page
pool for the RXQ. Scaling to a thousand representors consumes 2.8GB,
which becomes a memory pressure issue for embedded devices such as
BlueField-2 16GB / BlueField-3 32GB memory.
Since representor netdevs mostly handles miss traffic, and ideally,
most of the traffic will be offloaded, reduce the default non-uplink
rep netdev's RXQ default depth from 1024 to 256 if mdev is ecpf eswitch
manager. This saves around 1MB of memory per regular RQ,
(1024 - 256) * 2KB, allocated from page pool.
With rxq depth of 256, the netlink page pool tool reports
$./tools/net/ynl/cli.py --spec Documentation/netlink/specs/netdev.yaml \
--dump page-pool-get
{'id': 277,
'ifindex': 9,
'inflight': 128,
'inflight-mem': 786432,
'napi-id': 775}]
This is due to mtu 1500 + headroom consumes half pages, so 256 rxq
entries consumes around 128 pages (thus create a page pool with
size 128), shown above at inflight.
Note that each netdev has multiple types of RQs, including
Regular RQ, XSK, PTP, Drop, Trap RQ. Since non-uplink representor
only supports regular rq, this patch only changes the regular RQ's
default depth.
Signed-off-by: William Tu <witu@nvidia.com>
Reviewed-by: Bodong Wang <bodong@nvidia.com>
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Link: https://patch.msgid.link/20250209101716.112774-8-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: William Tu <witu@nvidia.com>
Date: Sun Feb 9 12:17:09 2025 +0200
net/mlx5e: set the tx_queue_len for pfifo_fast
[ Upstream commit a38cc5706fb9f7dc4ee3a443f61de13ce1e410ed ]
By default, the mq netdev creates a pfifo_fast qdisc. On a
system with 16 core, the pfifo_fast with 3 bands consumes
16 * 3 * 8 (size of pointer) * 1024 (default tx queue len)
= 393KB. The patch sets the tx qlen to representor default
value, 128 (1<<MLX5E_REP_PARAMS_DEF_LOG_SQ_SIZE), which
consumes 16 * 3 * 8 * 128 = 49KB, saving 344KB for each
representor at ECPF.
Signed-off-by: William Tu <witu@nvidia.com>
Reviewed-by: Daniel Jurgens <danielj@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Link: https://patch.msgid.link/20250209101716.112774-9-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jakub Kicinski <kuba@kernel.org>
Date: Thu Feb 15 06:33:46 2024 -0800
net/sched: act_mirred: don't override retval if we already lost the skb
commit 166c2c8a6a4dc2e4ceba9e10cfe81c3e469e3210 upstream.
If we're redirecting the skb, and haven't called tcf_mirred_forward(),
yet, we need to tell the core to drop the skb by setting the retcode
to SHOT. If we have called tcf_mirred_forward(), however, the skb
is out of our hands and returning SHOT will lead to UaF.
Move the retval override to the error path which actually need it.
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Fixes: e5cf1baf92cb ("act_mirred: use TC_ACT_REINSERT when possible")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
[Minor conflict resolved due to code context change.]
Signed-off-by: Jianqi Ren <jianqi.ren.cn@windriver.com>
Signed-off-by: He Zhe <zhe.he@windriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Wang Liang <wangliang74@huawei.com>
Date: Tue May 20 18:14:04 2025 +0800
net/tipc: fix slab-use-after-free Read in tipc_aead_encrypt_done
[ Upstream commit e279024617134c94fd3e37470156534d5f2b3472 ]
Syzbot reported a slab-use-after-free with the following call trace:
==================================================================
BUG: KASAN: slab-use-after-free in tipc_aead_encrypt_done+0x4bd/0x510 net/tipc/crypto.c:840
Read of size 8 at addr ffff88807a733000 by task kworker/1:0/25
Call Trace:
kasan_report+0xd9/0x110 mm/kasan/report.c:601
tipc_aead_encrypt_done+0x4bd/0x510 net/tipc/crypto.c:840
crypto_request_complete include/crypto/algapi.h:266
aead_request_complete include/crypto/internal/aead.h:85
cryptd_aead_crypt+0x3b8/0x750 crypto/cryptd.c:772
crypto_request_complete include/crypto/algapi.h:266
cryptd_queue_worker+0x131/0x200 crypto/cryptd.c:181
process_one_work+0x9fb/0x1b60 kernel/workqueue.c:3231
Allocated by task 8355:
kzalloc_noprof include/linux/slab.h:778
tipc_crypto_start+0xcc/0x9e0 net/tipc/crypto.c:1466
tipc_init_net+0x2dd/0x430 net/tipc/core.c:72
ops_init+0xb9/0x650 net/core/net_namespace.c:139
setup_net+0x435/0xb40 net/core/net_namespace.c:343
copy_net_ns+0x2f0/0x670 net/core/net_namespace.c:508
create_new_namespaces+0x3ea/0xb10 kernel/nsproxy.c:110
unshare_nsproxy_namespaces+0xc0/0x1f0 kernel/nsproxy.c:228
ksys_unshare+0x419/0x970 kernel/fork.c:3323
__do_sys_unshare kernel/fork.c:3394
Freed by task 63:
kfree+0x12a/0x3b0 mm/slub.c:4557
tipc_crypto_stop+0x23c/0x500 net/tipc/crypto.c:1539
tipc_exit_net+0x8c/0x110 net/tipc/core.c:119
ops_exit_list+0xb0/0x180 net/core/net_namespace.c:173
cleanup_net+0x5b7/0xbf0 net/core/net_namespace.c:640
process_one_work+0x9fb/0x1b60 kernel/workqueue.c:3231
After freed the tipc_crypto tx by delete namespace, tipc_aead_encrypt_done
may still visit it in cryptd_queue_worker workqueue.
I reproduce this issue by:
ip netns add ns1
ip link add veth1 type veth peer name veth2
ip link set veth1 netns ns1
ip netns exec ns1 tipc bearer enable media eth dev veth1
ip netns exec ns1 tipc node set key this_is_a_master_key master
ip netns exec ns1 tipc bearer disable media eth dev veth1
ip netns del ns1
The key of reproduction is that, simd_aead_encrypt is interrupted, leading
to crypto_simd_usable() return false. Thus, the cryptd_queue_worker is
triggered, and the tipc_crypto tx will be visited.
tipc_disc_timeout
tipc_bearer_xmit_skb
tipc_crypto_xmit
tipc_aead_encrypt
crypto_aead_encrypt
// encrypt()
simd_aead_encrypt
// crypto_simd_usable() is false
child = &ctx->cryptd_tfm->base;
simd_aead_encrypt
crypto_aead_encrypt
// encrypt()
cryptd_aead_encrypt_enqueue
cryptd_aead_enqueue
cryptd_enqueue_request
// trigger cryptd_queue_worker
queue_work_on(smp_processor_id(), cryptd_wq, &cpu_queue->work)
Fix this by holding net reference count before encrypt.
Reported-by: syzbot+55c12726619ff85ce1f6@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=55c12726619ff85ce1f6
Fixes: fc1b6d6de220 ("tipc: introduce TIPC encryption & authentication")
Signed-off-by: Wang Liang <wangliang74@huawei.com>
Link: https://patch.msgid.link/20250520101404.1341730-1-wangliang74@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Mathieu Othacehe <othacehe@gnu.org>
Date: Fri May 9 14:19:35 2025 +0200
net: cadence: macb: Fix a possible deadlock in macb_halt_tx.
[ Upstream commit c92d6089d8ad7d4d815ebcedee3f3907b539ff1f ]
There is a situation where after THALT is set high, TGO stays high as
well. Because jiffies are never updated, as we are in a context with
interrupts disabled, we never exit that loop and have a deadlock.
That deadlock was noticed on a sama5d4 device that stayed locked for days.
Use retries instead of jiffies so that the timeout really works and we do
not have a deadlock anymore.
Fixes: e86cd53afc590 ("net/macb: better manage tx errors")
Signed-off-by: Mathieu Othacehe <othacehe@gnu.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250509121935.16282-1-othacehe@gnu.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Simon Horman <horms@kernel.org>
Date: Fri Apr 25 16:50:47 2025 +0100
net: dlink: Correct endianness handling of led_mode
[ Upstream commit e7e5ae71831c44d58627a991e603845a2fed2cab ]
As it's name suggests, parse_eeprom() parses EEPROM data.
This is done by reading data, 16 bits at a time as follows:
for (i = 0; i < 128; i++)
((__le16 *) sromdata)[i] = cpu_to_le16(read_eeprom(np, i));
sromdata is at the same memory location as psrom.
And the type of psrom is a pointer to struct t_SROM.
As can be seen in the loop above, data is stored in sromdata, and thus psrom,
as 16-bit little-endian values.
However, the integer fields of t_SROM are host byte order integers.
And in the case of led_mode this leads to a little endian value
being incorrectly treated as host byte order.
Looking at rio_set_led_mode, this does appear to be a bug as that code
masks led_mode with 0x1, 0x2 and 0x8. Logic that would be effected by a
reversed byte order.
This problem would only manifest on big endian hosts.
Found by inspection while investigating a sparse warning
regarding the crc field of t_SROM.
I believe that warning is a false positive. And although I plan
to send a follow-up to use little-endian types for other the integer
fields of PSROM_t I do not believe that will involve any bug fixes.
Compile tested only.
Fixes: c3f45d322cbd ("dl2k: Add support for IP1000A-based cards")
Signed-off-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250425-dlink-led-mode-v1-1-6bae3c36e736@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jonas Gorski <jonas.gorski@gmail.com>
Date: Tue Apr 29 22:17:00 2025 +0200
net: dsa: b53: allow leaky reserved multicast
[ Upstream commit 5f93185a757ff38b36f849c659aeef368db15a68 ]
Allow reserved multicast to ignore VLAN membership so STP and other
management protocols work without a PVID VLAN configured when using a
vlan aware bridge.
Fixes: 967dd82ffc52 ("net: dsa: b53: Add support for Broadcom RoboSwitch")
Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com>
Tested-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/20250429201710.330937-2-jonas.gorski@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jonas Gorski <jonas.gorski@gmail.com>
Date: Tue Apr 29 22:17:09 2025 +0200
net: dsa: b53: fix learning on VLAN unaware bridges
[ Upstream commit 9f34ad89bcf0e6df6f8b01f1bdab211493fc66d1 ]
When VLAN filtering is off, we configure the switch to forward, but not
learn on VLAN table misses. This effectively disables learning while not
filtering.
Fix this by switching to forward and learn. Setting the learning disable
register will still control whether learning actually happens.
Fixes: dad8d7c6452b ("net: dsa: b53: Properly account for VLAN filtering")
Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com>
Tested-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/20250429201710.330937-11-jonas.gorski@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jonas Gorski <jonas.gorski@gmail.com>
Date: Tue Apr 29 22:17:04 2025 +0200
net: dsa: b53: fix VLAN ID for untagged vlan on bridge leave
[ Upstream commit a1c1901c5cc881425cc45992ab6c5418174e9e5a ]
The untagged default VLAN is added to the default vlan, which may be
one, but we modify the VLAN 0 entry on bridge leave.
Fix this to use the correct VLAN entry for the default pvid.
Fixes: fea83353177a ("net: dsa: b53: Fix default VLAN ID")
Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com>
Tested-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/20250429201710.330937-6-jonas.gorski@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Vladimir Oltean <vladimir.oltean@nxp.com>
Date: Fri May 9 14:38:16 2025 +0300
net: dsa: sja1105: discard incoming frames in BR_STATE_LISTENING
[ Upstream commit 498625a8ab2c8e1c9ab5105744310e8d6952cc01 ]
It has been reported that when under a bridge with stp_state=1, the logs
get spammed with this message:
[ 251.734607] fsl_dpaa2_eth dpni.5 eth0: Couldn't decode source port
Further debugging shows the following info associated with packets:
source_port=-1, switch_id=-1, vid=-1, vbid=1
In other words, they are data plane packets which are supposed to be
decoded by dsa_tag_8021q_find_port_by_vbid(), but the latter (correctly)
refuses to do so, because no switch port is currently in
BR_STATE_LEARNING or BR_STATE_FORWARDING - so the packet is effectively
unexpected.
The error goes away after the port progresses to BR_STATE_LEARNING in 15
seconds (the default forward_time of the bridge), because then,
dsa_tag_8021q_find_port_by_vbid() can correctly associate the data plane
packets with a plausible bridge port in a plausible STP state.
Re-reading IEEE 802.1D-1990, I see the following:
"4.4.2 Learning: (...) The Forwarding Process shall discard received
frames."
IEEE 802.1D-2004 further clarifies:
"DISABLED, BLOCKING, LISTENING, and BROKEN all correspond to the
DISCARDING port state. While those dot1dStpPortStates serve to
distinguish reasons for discarding frames, the operation of the
Forwarding and Learning processes is the same for all of them. (...)
LISTENING represents a port that the spanning tree algorithm has
selected to be part of the active topology (computing a Root Port or
Designated Port role) but is temporarily discarding frames to guard
against loops or incorrect learning."
Well, this is not what the driver does - instead it sets
mac[port].ingress = true.
To get rid of the log spam, prevent unexpected data plane packets to
be received by software by discarding them on ingress in the LISTENING
state.
In terms of blame attribution: the prints only date back to commit
d7f9787a763f ("net: dsa: tag_8021q: add support for imprecise RX based
on the VBID"). However, the settings would permit a LISTENING port to
forward to a FORWARDING port, and the standard suggests that's not OK.
Fixes: 640f763f98c2 ("net: dsa: sja1105: Add support for Spanning Tree Protocol")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://patch.msgid.link/20250509113816.2221992-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Paul Kocialkowski <paulk@sys-base.io>
Date: Mon May 19 18:49:36 2025 +0200
net: dwmac-sun8i: Use parsed internal PHY address instead of 1
[ Upstream commit 47653e4243f2b0a26372e481ca098936b51ec3a8 ]
While the MDIO address of the internal PHY on Allwinner sun8i chips is
generally 1, of_mdio_parse_addr is used to cleanly parse the address
from the device-tree instead of hardcoding it.
A commit reworking the code ditched the parsed value and hardcoded the
value 1 instead, which didn't really break anything but is more fragile
and not future-proof.
Restore the initial behavior using the parsed address returned from the
helper.
Fixes: 634db83b8265 ("net: stmmac: dwmac-sun8i: Handle integrated/external MDIOs")
Signed-off-by: Paul Kocialkowski <paulk@sys-base.io>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Corentin LABBE <clabbe.montjoie@gmail.com>
Tested-by: Corentin LABBE <clabbe.montjoie@gmail.com>
Link: https://patch.msgid.link/20250519164936.4172658-1-paulk@sys-base.io
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Alexander Sverdlin <alexander.sverdlin@gmail.com>
Date: Mon Mar 3 08:46:57 2025 +0100
net: ethernet: ti: cpsw_new: populate netdev of_node
[ Upstream commit 7ff1c88fc89688c27f773ba956f65f0c11367269 ]
So that of_find_net_device_by_node() can find CPSW ports and other DSA
switches can be stacked downstream. Tested in conjunction with KSZ8873.
Reviewed-by: Siddharth Vadapalli <s-vadapalli@ti.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Alexander Sverdlin <alexander.sverdlin@siemens.com>
Link: https://patch.msgid.link/20250303074703.1758297-1-alexander.sverdlin@siemens.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Mattias Barthel <mattias.barthel@atlascopco.com>
Date: Tue Apr 29 11:08:26 2025 +0200
net: fec: ERR007885 Workaround for conventional TX
[ Upstream commit a179aad12badc43201cbf45d1e8ed2c1383c76b9 ]
Activate TX hang workaround also in
fec_enet_txq_submit_skb() when TSO is not enabled.
Errata: ERR007885
Symptoms: NETDEV WATCHDOG: eth0 (fec): transmit queue 0 timed out
commit 37d6017b84f7 ("net: fec: Workaround for imx6sx enet tx hang when enable three queues")
There is a TDAR race condition for mutliQ when the software sets TDAR
and the UDMA clears TDAR simultaneously or in a small window (2-4 cycles).
This will cause the udma_tx and udma_tx_arbiter state machines to hang.
So, the Workaround is checking TDAR status four time, if TDAR cleared by
hardware and then write TDAR, otherwise don't set TDAR.
Fixes: 53bb20d1faba ("net: fec: add variable reg_desc_active to speed things up")
Signed-off-by: Mattias Barthel <mattias.barthel@atlascopco.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20250429090826.3101258-1-mattiasbarthel@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Felix Fietkau <nbd@nbd.name>
Date: Sat Apr 26 17:32:09 2025 +0200
net: ipv6: fix UDPv6 GSO segmentation with NAT
[ Upstream commit b936a9b8d4a585ccb6d454921c36286bfe63e01d ]
If any address or port is changed, update it in all packets and recalculate
checksum.
Fixes: 9fd1ff5d2ac7 ("udp: Support UDP fraglist GRO/GSO.")
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20250426153210.14044-1-nbd@nbd.name
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Thangaraj Samynathan <thangaraj.s@microchip.com>
Date: Tue Apr 29 10:55:27 2025 +0530
net: lan743x: Fix memleak issue when GSO enabled
[ Upstream commit 2d52e2e38b85c8b7bc00dca55c2499f46f8c8198 ]
Always map the `skb` to the LS descriptor. Previously skb was
mapped to EXT descriptor when the number of fragments is zero with
GSO enabled. Mapping the skb to EXT descriptor prevents it from
being freed, leading to a memory leak
Fixes: 23f0703c125b ("lan743x: Add main source files for new lan743x driver")
Signed-off-by: Thangaraj Samynathan <thangaraj.s@microchip.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20250429052527.10031-1-thangaraj.s@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Peter Seiderer <ps.report@gmx.net>
Date: Wed Feb 19 09:45:27 2025 +0100
net: pktgen: fix access outside of user given buffer in pktgen_thread_write()
[ Upstream commit 425e64440ad0a2f03bdaf04be0ae53dededbaa77 ]
Honour the user given buffer size for the strn_len() calls (otherwise
strn_len() will access memory outside of the user given buffer).
Signed-off-by: Peter Seiderer <ps.report@gmx.net>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250219084527.20488-8-ps.report@gmx.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Peter Seiderer <ps.report@gmx.net>
Date: Thu Feb 27 14:56:00 2025 +0100
net: pktgen: fix mpls maximum labels list parsing
[ Upstream commit 2b15a0693f70d1e8119743ee89edbfb1271b3ea8 ]
Fix mpls maximum labels list parsing up to MAX_MPLS_LABELS entries (instead
of up to MAX_MPLS_LABELS - 1).
Addresses the following:
$ echo "mpls 00000f00,00000f01,00000f02,00000f03,00000f04,00000f05,00000f06,00000f07,00000f08,00000f09,00000f0a,00000f0b,00000f0c,00000f0d,00000f0e,00000f0f" > /proc/net/pktgen/lo\@0
-bash: echo: write error: Argument list too long
Signed-off-by: Peter Seiderer <ps.report@gmx.net>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Arnd Bergmann <arnd@arndb.de>
Date: Tue Feb 25 17:33:33 2025 +0100
net: xgene-v2: remove incorrect ACPI_PTR annotation
[ Upstream commit 01358e8fe922f716c05d7864ac2213b2440026e7 ]
Building with W=1 shows a warning about xge_acpi_match being unused when
CONFIG_ACPI is disabled:
drivers/net/ethernet/apm/xgene-v2/main.c:723:36: error: unused variable 'xge_acpi_match' [-Werror,-Wunused-const-variable]
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://patch.msgid.link/20250225163341.4168238-2-arnd@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Victor Nogueira <victor@mojatatu.com>
Date: Fri Apr 25 19:07:05 2025 -0300
net_sched: drr: Fix double list add in class with netem as child qdisc
[ Upstream commit f99a3fbf023e20b626be4b0f042463d598050c9a ]
As described in Gerrard's report [1], there are use cases where a netem
child qdisc will make the parent qdisc's enqueue callback reentrant.
In the case of drr, there won't be a UAF, but the code will add the same
classifier to the list twice, which will cause memory corruption.
In addition to checking for qlen being zero, this patch checks whether the
class was already added to the active_list (cl_is_active) before adding
to the list to cover for the reentrant case.
[1] https://lore.kernel.org/netdev/CAHcdcOm+03OD2j6R0=YHKqmy=VgJ8xEOKuP6c7mSgnp-TEJJbw@mail.gmail.com/
Fixes: 37d9cf1a3ce3 ("sched: Fix detection of empty queues in child qdiscs")
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Victor Nogueira <victor@mojatatu.com>
Link: https://patch.msgid.link/20250425220710.3964791-2-victor@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Victor Nogueira <victor@mojatatu.com>
Date: Fri Apr 25 19:07:07 2025 -0300
net_sched: ets: Fix double list add in class with netem as child qdisc
[ Upstream commit 1a6d0c00fa07972384b0c308c72db091d49988b6 ]
As described in Gerrard's report [1], there are use cases where a netem
child qdisc will make the parent qdisc's enqueue callback reentrant.
In the case of ets, there won't be a UAF, but the code will add the same
classifier to the list twice, which will cause memory corruption.
In addition to checking for qlen being zero, this patch checks whether
the class was already added to the active_list (cl_is_active) before
doing the addition to cater for the reentrant case.
[1] https://lore.kernel.org/netdev/CAHcdcOm+03OD2j6R0=YHKqmy=VgJ8xEOKuP6c7mSgnp-TEJJbw@mail.gmail.com/
Fixes: 37d9cf1a3ce3 ("sched: Fix detection of empty queues in child qdiscs")
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Victor Nogueira <victor@mojatatu.com>
Link: https://patch.msgid.link/20250425220710.3964791-4-victor@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Cong Wang <xiyou.wangcong@gmail.com>
Date: Tue May 6 21:35:58 2025 -0700
net_sched: Flush gso_skb list too during ->change()
[ Upstream commit 2d3cbfd6d54a2c39ce3244f33f85c595844bd7b8 ]
Previously, when reducing a qdisc's limit via the ->change() operation, only
the main skb queue was trimmed, potentially leaving packets in the gso_skb
list. This could result in NULL pointer dereference when we only check
sch->limit against sch->q.qlen.
This patch introduces a new helper, qdisc_dequeue_internal(), which ensures
both the gso_skb list and the main queue are properly flushed when trimming
excess packets. All relevant qdiscs (codel, fq, fq_codel, fq_pie, hhf, pie)
are updated to use this helper in their ->change() routines.
Fixes: 76e3cc126bb2 ("codel: Controlled Delay AQM")
Fixes: 4b549a2ef4be ("fq_codel: Fair Queue Codel AQM")
Fixes: afe4fd062416 ("pkt_sched: fq: Fair Queue packet scheduler")
Fixes: ec97ecf1ebe4 ("net: sched: add Flow Queue PIE packet scheduler")
Fixes: 10239edf86f1 ("net-qdisc-hhf: Heavy-Hitter Filter (HHF) qdisc")
Fixes: d4b36210c2e6 ("net: pkt_sched: PIE AQM scheme")
Reported-by: Will <willsroot@protonmail.com>
Reported-by: Savy <savy@syst3mfailure.io>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Pedro Tammela <pctammela@mojatatu.com>
Date: Thu May 22 15:14:47 2025 -0300
net_sched: hfsc: Address reentrant enqueue adding class to eltree twice
commit ac9fe7dd8e730a103ae4481147395cc73492d786 upstream.
Savino says:
"We are writing to report that this recent patch
(141d34391abbb315d68556b7c67ad97885407547) [1]
can be bypassed, and a UAF can still occur when HFSC is utilized with
NETEM.
The patch only checks the cl->cl_nactive field to determine whether
it is the first insertion or not [2], but this field is only
incremented by init_vf [3].
By using HFSC_RSC (which uses init_ed) [4], it is possible to bypass the
check and insert the class twice in the eltree.
Under normal conditions, this would lead to an infinite loop in
hfsc_dequeue for the reasons we already explained in this report [5].
However, if TBF is added as root qdisc and it is configured with a
very low rate,
it can be utilized to prevent packets from being dequeued.
This behavior can be exploited to perform subsequent insertions in the
HFSC eltree and cause a UAF."
To fix both the UAF and the infinite loop, with netem as an hfsc child,
check explicitly in hfsc_enqueue whether the class is already in the eltree
whenever the HFSC_RSC flag is set.
[1] https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=141d34391abbb315d68556b7c67ad97885407547
[2] https://elixir.bootlin.com/linux/v6.15-rc5/source/net/sched/sch_hfsc.c#L1572
[3] https://elixir.bootlin.com/linux/v6.15-rc5/source/net/sched/sch_hfsc.c#L677
[4] https://elixir.bootlin.com/linux/v6.15-rc5/source/net/sched/sch_hfsc.c#L1574
[5] https://lore.kernel.org/netdev/8DuRWwfqjoRDLDmBMlIfbrsZg9Gx50DHJc1ilxsEBNe2D6NMoigR_eIRIG0LOjMc3r10nUUZtArXx4oZBIdUfZQrwjcQhdinnMis_0G7VEk=@willsroot.io/T/#u
Fixes: 37d9cf1a3ce3 ("sched: Fix detection of empty queues in child qdiscs")
Reported-by: Savino Dicanosa <savy@syst3mfailure.io>
Reported-by: William Liu <will@willsroot.io>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Tested-by: Victor Nogueira <victor@mojatatu.com>
Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Link: https://patch.msgid.link/20250522181448.1439717-2-pctammela@mojatatu.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Victor Nogueira <victor@mojatatu.com>
Date: Fri Apr 25 19:07:06 2025 -0300
net_sched: hfsc: Fix a UAF vulnerability in class with netem as child qdisc
[ Upstream commit 141d34391abbb315d68556b7c67ad97885407547 ]
As described in Gerrard's report [1], we have a UAF case when an hfsc class
has a netem child qdisc. The crux of the issue is that hfsc is assuming
that checking for cl->qdisc->q.qlen == 0 guarantees that it hasn't inserted
the class in the vttree or eltree (which is not true for the netem
duplicate case).
This patch checks the n_active class variable to make sure that the code
won't insert the class in the vttree or eltree twice, catering for the
reentrant case.
[1] https://lore.kernel.org/netdev/CAHcdcOm+03OD2j6R0=YHKqmy=VgJ8xEOKuP6c7mSgnp-TEJJbw@mail.gmail.com/
Fixes: 37d9cf1a3ce3 ("sched: Fix detection of empty queues in child qdiscs")
Reported-by: Gerrard Tai <gerrard.tai@starlabs.sg>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Victor Nogueira <victor@mojatatu.com>
Link: https://patch.msgid.link/20250425220710.3964791-3-victor@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Victor Nogueira <victor@mojatatu.com>
Date: Fri Apr 25 19:07:08 2025 -0300
net_sched: qfq: Fix double list add in class with netem as child qdisc
[ Upstream commit f139f37dcdf34b67f5bf92bc8e0f7f6b3ac63aa4 ]
As described in Gerrard's report [1], there are use cases where a netem
child qdisc will make the parent qdisc's enqueue callback reentrant.
In the case of qfq, there won't be a UAF, but the code will add the same
classifier to the list twice, which will cause memory corruption.
This patch checks whether the class was already added to the agg->active
list (cl_is_active) before doing the addition to cater for the reentrant
case.
[1] https://lore.kernel.org/netdev/CAHcdcOm+03OD2j6R0=YHKqmy=VgJ8xEOKuP6c7mSgnp-TEJJbw@mail.gmail.com/
Fixes: 37d9cf1a3ce3 ("sched: Fix detection of empty queues in child qdiscs")
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Victor Nogueira <victor@mojatatu.com>
Link: https://patch.msgid.link/20250425220710.3964791-5-victor@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Nicolas Bouchinet <nicolas.bouchinet@ssi.gouv.fr>
Date: Wed Jan 29 18:06:30 2025 +0100
netfilter: conntrack: Bound nf_conntrack sysctl writes
[ Upstream commit 8b6861390ffee6b8ed78b9395e3776c16fec6579 ]
nf_conntrack_max and nf_conntrack_expect_max sysctls were authorized to
be written any negative value, which would then be stored in the
unsigned int variables nf_conntrack_max and nf_ct_expect_max variables.
While the do_proc_dointvec_conv function is supposed to limit writing
handled by proc_dointvec proc_handler to INT_MAX. Such a negative value
being written in an unsigned int leads to a very high value, exceeding
this limit.
Moreover, the nf_conntrack_expect_max sysctl documentation specifies the
minimum value is 1.
The proc_handlers have thus been updated to proc_dointvec_minmax in
order to specify the following write bounds :
* Bound nf_conntrack_max sysctl writings between SYSCTL_ZERO
and SYSCTL_INT_MAX.
* Bound nf_conntrack_expect_max sysctl writings between SYSCTL_ONE
and SYSCTL_INT_MAX as defined in the sysctl documentation.
With this patch applied, sysctl writes outside the defined in the bound
will thus lead to a write error :
```
sysctl -w net.netfilter.nf_conntrack_expect_max=-1
sysctl: setting key "net.netfilter.nf_conntrack_expect_max": Invalid argument
```
Signed-off-by: Nicolas Bouchinet <nicolas.bouchinet@ssi.gouv.fr>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jozsef Kadlecsik <kadlec@netfilter.org>
Date: Wed May 7 17:01:59 2025 +0200
netfilter: ipset: fix region locking in hash types
[ Upstream commit 8478a729c0462273188263136880480729e9efca ]
Region locking introduced in v5.6-rc4 contained three macros to handle
the region locks: ahash_bucket_start(), ahash_bucket_end() which gave
back the start and end hash bucket values belonging to a given region
lock and ahash_region() which should give back the region lock belonging
to a given hash bucket. The latter was incorrect which can lead to a
race condition between the garbage collector and adding new elements
when a hash type of set is defined with timeouts.
Fixes: f66ee0410b1c ("netfilter: ipset: Fix "INFO: rcu detected stall in hash_xxx" reports")
Reported-by: Kota Toda <kota.toda@gmo-cybersecurity.com>
Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Florian Westphal <fw@strlen.de>
Date: Tue May 20 02:22:36 2025 +0200
netfilter: nf_tables: do not defer rule destruction via call_rcu
commit b04df3da1b5c6f6dc7cdccc37941740c078c4043 upstream.
nf_tables_chain_destroy can sleep, it can't be used from call_rcu
callbacks.
Moreover, nf_tables_rule_release() is only safe for error unwinding,
while transaction mutex is held and the to-be-desroyed rule was not
exposed to either dataplane or dumps, as it deactives+frees without
the required synchronize_rcu() in-between.
nft_rule_expr_deactivate() callbacks will change ->use counters
of other chains/sets, see e.g. nft_lookup .deactivate callback, these
must be serialized via transaction mutex.
Also add a few lockdep asserts to make this more explicit.
Calling synchronize_rcu() isn't ideal, but fixing this without is hard
and way more intrusive. As-is, we can get:
WARNING: .. net/netfilter/nf_tables_api.c:5515 nft_set_destroy+0x..
Workqueue: events nf_tables_trans_destroy_work
RIP: 0010:nft_set_destroy+0x3fe/0x5c0
Call Trace:
<TASK>
nf_tables_trans_destroy_work+0x6b7/0xad0
process_one_work+0x64a/0xce0
worker_thread+0x613/0x10d0
In case the synchronize_rcu becomes an issue, we can explore alternatives.
One way would be to allocate nft_trans_rule objects + one nft_trans_chain
object, deactivate the rules + the chain and then defer the freeing to the
nft destroy workqueue. We'd still need to keep the synchronize_rcu path as
a fallback to handle -ENOMEM corner cases though.
Reported-by: syzbot+b26935466701e56cfdc2@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/67478d92.050a0220.253251.0062.GAE@google.com/T/
Fixes: c03d278fdf35 ("netfilter: nf_tables: wait for rcu grace period on net_device removal")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Florian Westphal <fw@strlen.de>
Date: Tue May 20 02:22:34 2025 +0200
netfilter: nf_tables: pass nft_chain to destroy function, not nft_ctx
commit 8965d42bcf54d42cbc72fe34a9d0ec3f8527debd upstream.
It would be better to not store nft_ctx inside nft_trans object,
the netlink ctx strucutre is huge and most of its information is
never needed in places that use trans->ctx.
Avoid/reduce its usage if possible, no runtime behaviour change
intended.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Pablo Neira Ayuso <pablo@netfilter.org>
Date: Tue May 20 02:22:35 2025 +0200
netfilter: nf_tables: wait for rcu grace period on net_device removal
commit c03d278fdf35e73dd0ec543b9b556876b9d9a8dc upstream.
8c873e219970 ("netfilter: core: free hooks with call_rcu") removed
synchronize_net() call when unregistering basechain hook, however,
net_device removal event handler for the NFPROTO_NETDEV was not updated
to wait for RCU grace period.
Note that 835b803377f5 ("netfilter: nf_tables_netdev: unregister hooks
on net_device removal") does not remove basechain rules on device
removal, I was hinted to remove rules on net_device removal later, see
5ebe0b0eec9d ("netfilter: nf_tables: destroy basechain and rules on
netdevice removal").
Although NETDEV_UNREGISTER event is guaranteed to be handled after
synchronize_net() call, this path needs to wait for rcu grace period via
rcu callback to release basechain hooks if netns is alive because an
ongoing netlink dump could be in progress (sockets hold a reference on
the netns).
Note that nf_tables_pre_exit_net() unregisters and releases basechain
hooks but it is possible to see NETDEV_UNREGISTER at a later stage in
the netns exit path, eg. veth peer device in another netns:
cleanup_net()
default_device_exit_batch()
unregister_netdevice_many_notify()
notifier_call_chain()
nf_tables_netdev_event()
__nft_release_basechain()
In this particular case, same rule of thumb applies: if netns is alive,
then wait for rcu grace period because netlink dump in the other netns
could be in progress. Otherwise, if the other netns is going away then
no netlink dump can be in progress and basechain hooks can be released
inmediately.
While at it, turn WARN_ON() into WARN_ON_ONCE() for the basechain
validation, which should not ever happen.
Fixes: 835b803377f5 ("netfilter: nf_tables_netdev: unregister hooks on net_device removal")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Jeff Layton <jlayton@kernel.org>
Date: Thu Apr 10 16:42:03 2025 -0400
nfs: don't share pNFS DS connections between net namespaces
[ Upstream commit 6b9785dc8b13d9fb75ceec8cf4ea7ec3f3b1edbc ]
Currently, different NFS clients can share the same DS connections, even
when they are in different net namespaces. If a containerized client
creates a DS connection, another container can find and use it. When the
first client exits, the connection will close which can lead to stalls
in other clients.
Add a net namespace pointer to struct nfs4_pnfs_ds, and compare those
value to the caller's netns in _data_server_lookup_locked() when
searching for a nfs4_pnfs_ds to match.
Reported-by: Omar Sandoval <osandov@osandov.com>
Reported-by: Sargun Dillon <sargun@sargun.me>
Closes: https://lore.kernel.org/linux-nfs/Z_ArpQC_vREh_hEA@telecaster/
Tested-by: Sargun Dillon <sargun@sargun.me>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Link: https://lore.kernel.org/r/20250410-nfs-ds-netns-v2-1-f80b7979ba80@kernel.org
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Li Lingfeng <lilingfeng3@huawei.com>
Date: Thu Apr 17 15:25:08 2025 +0800
nfs: handle failure of nfs_get_lock_context in unlock path
[ Upstream commit c457dc1ec770a22636b473ce5d35614adfe97636 ]
When memory is insufficient, the allocation of nfs_lock_context in
nfs_get_lock_context() fails and returns -ENOMEM. If we mistakenly treat
an nfs4_unlockdata structure (whose l_ctx member has been set to -ENOMEM)
as valid and proceed to execute rpc_run_task(), this will trigger a NULL
pointer dereference in nfs4_locku_prepare. For example:
BUG: kernel NULL pointer dereference, address: 000000000000000c
PGD 0 P4D 0
Oops: Oops: 0000 [#1] SMP PTI
CPU: 15 UID: 0 PID: 12 Comm: kworker/u64:0 Not tainted 6.15.0-rc2-dirty #60
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-2.fc40
Workqueue: rpciod rpc_async_schedule
RIP: 0010:nfs4_locku_prepare+0x35/0xc2
Code: 89 f2 48 89 fd 48 c7 c7 68 69 ef b5 53 48 8b 8e 90 00 00 00 48 89 f3
RSP: 0018:ffffbbafc006bdb8 EFLAGS: 00010246
RAX: 000000000000004b RBX: ffff9b964fc1fa00 RCX: 0000000000000000
RDX: 0000000000000000 RSI: fffffffffffffff4 RDI: ffff9ba53fddbf40
RBP: ffff9ba539934000 R08: 0000000000000000 R09: ffffbbafc006bc38
R10: ffffffffb6b689c8 R11: 0000000000000003 R12: ffff9ba539934030
R13: 0000000000000001 R14: 0000000004248060 R15: ffffffffb56d1c30
FS: 0000000000000000(0000) GS:ffff9ba5881f0000(0000) knlGS:00000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000000000000c CR3: 000000093f244000 CR4: 00000000000006f0
Call Trace:
<TASK>
__rpc_execute+0xbc/0x480
rpc_async_schedule+0x2f/0x40
process_one_work+0x232/0x5d0
worker_thread+0x1da/0x3d0
? __pfx_worker_thread+0x10/0x10
kthread+0x10d/0x240
? __pfx_kthread+0x10/0x10
ret_from_fork+0x34/0x50
? __pfx_kthread+0x10/0x10
ret_from_fork_asm+0x1a/0x30
</TASK>
Modules linked in:
CR2: 000000000000000c
---[ end trace 0000000000000000 ]---
Free the allocated nfs4_unlockdata when nfs_get_lock_context() fails and
return NULL to terminate subsequent rpc_run_task, preventing NULL pointer
dereference.
Fixes: f30cb757f680 ("NFS: Always wait for I/O completion before unlock")
Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Link: https://lore.kernel.org/r/20250417072508.3850532-1-lilingfeng3@huawei.com
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Trond Myklebust <trond.myklebust@hammerspace.com>
Date: Sat May 10 10:50:13 2025 -0400
NFSv4/pnfs: Reset the layout state after a layoutreturn
[ Upstream commit 6d6d7f91cc8c111d40416ac9240a3bb9396c5235 ]
If there are still layout segments in the layout plh_return_lsegs list
after a layout return, we should be resetting the state to ensure they
eventually get returned as well.
Fixes: 68f744797edd ("pNFS: Do not free layout segments that are marked for return")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Trond Myklebust <trond.myklebust@hammerspace.com>
Date: Thu Mar 27 19:20:53 2025 -0400
NFSv4: Check for delegation validity in nfs_start_delegation_return_locked()
[ Upstream commit 9e8f324bd44c1fe026b582b75213de4eccfa1163 ]
Check that the delegation is still attached after taking the spin lock
in nfs_start_delegation_return_locked().
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Trond Myklebust <trond.myklebust@hammerspace.com>
Date: Mon Mar 24 20:35:33 2025 -0400
NFSv4: Treat ENETUNREACH errors as fatal for state recovery
[ Upstream commit 0af5fb5ed3d2fd9e110c6112271f022b744a849a ]
If a containerised process is killed and causes an ENETUNREACH or
ENETDOWN error to be propagated to the state manager, then mark the
nfs_client as being dead so that we don't loop in functions that are
expecting recovery to succeed.
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Michael Liang <mliang@purestorage.com>
Date: Tue Apr 29 10:42:01 2025 -0600
nvme-tcp: fix premature queue removal and I/O failover
[ Upstream commit 77e40bbce93059658aee02786a32c5c98a240a8a ]
This patch addresses a data corruption issue observed in nvme-tcp during
testing.
In an NVMe native multipath setup, when an I/O timeout occurs, all
inflight I/Os are canceled almost immediately after the kernel socket is
shut down. These canceled I/Os are reported as host path errors,
triggering a failover that succeeds on a different path.
However, at this point, the original I/O may still be outstanding in the
host's network transmission path (e.g., the NIC’s TX queue). From the
user-space app's perspective, the buffer associated with the I/O is
considered completed since they're acked on the different path and may
be reused for new I/O requests.
Because nvme-tcp enables zero-copy by default in the transmission path,
this can lead to corrupted data being sent to the original target,
ultimately causing data corruption.
We can reproduce this data corruption by injecting delay on one path and
triggering i/o timeout.
To prevent this issue, this change ensures that all inflight
transmissions are fully completed from host's perspective before
returning from queue stop. To handle concurrent I/O timeout from multiple
namespaces under the same controller, always wait in queue stop
regardless of queue's state.
This aligns with the behavior of queue stopping in other NVMe fabric
transports.
Fixes: 3f2304f8c6d6 ("nvme-tcp: add NVMe over TCP host driver")
Signed-off-by: Michael Liang <mliang@purestorage.com>
Reviewed-by: Mohamed Khalfella <mkhalfella@purestorage.com>
Reviewed-by: Randy Jennings <randyj@purestorage.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Daniel Wagner <wagi@kernel.org>
Date: Fri May 2 10:58:00 2025 +0200
nvme: unblock ctrl state transition for firmware update
[ Upstream commit 650415fca0a97472fdd79725e35152614d1aad76 ]
The original nvme subsystem design didn't have a CONNECTING state; the
state machine allowed transitions from RESETTING to LIVE directly.
With the introduction of nvme fabrics the CONNECTING state was
introduce. Over time the nvme-pci started to use the CONNECTING state as
well.
Eventually, a bug fix for the nvme-fc started to depend that the only
valid transition to LIVE was from CONNECTING. Though this change didn't
update the firmware update handler which was still depending on
RESETTING to LIVE transition.
The simplest way to address it for the time being is to switch into
CONNECTING state before going to LIVE state.
Fixes: d2fe192348f9 ("nvme: only allow entering LIVE from CONNECTING state")
Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Daniel Wagner <wagi@kernel.org>
Closes: https://lore.kernel.org/all/0134ea15-8d5f-41f7-9e9a-d7e6d82accaa@roeck-us.net
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Alistair Francis <alistair.francis@wdc.com>
Date: Wed Apr 23 16:06:21 2025 +1000
nvmet-tcp: don't restore null sk_state_change
[ Upstream commit 46d22b47df2741996af277a2838b95f130436c13 ]
queue->state_change is set as part of nvmet_tcp_set_queue_sock(), but if
the TCP connection isn't established when nvmet_tcp_set_queue_sock() is
called then queue->state_change isn't set and sock->sk->sk_state_change
isn't replaced.
As such we don't need to restore sock->sk->sk_state_change if
queue->state_change is NULL.
This avoids NULL pointer dereferences such as this:
[ 286.462026][ C0] BUG: kernel NULL pointer dereference, address: 0000000000000000
[ 286.462814][ C0] #PF: supervisor instruction fetch in kernel mode
[ 286.463796][ C0] #PF: error_code(0x0010) - not-present page
[ 286.464392][ C0] PGD 8000000140620067 P4D 8000000140620067 PUD 114201067 PMD 0
[ 286.465086][ C0] Oops: Oops: 0010 [#1] SMP KASAN PTI
[ 286.465559][ C0] CPU: 0 UID: 0 PID: 1628 Comm: nvme Not tainted 6.15.0-rc2+ #11 PREEMPT(voluntary)
[ 286.466393][ C0] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-3.fc41 04/01/2014
[ 286.467147][ C0] RIP: 0010:0x0
[ 286.467420][ C0] Code: Unable to access opcode bytes at 0xffffffffffffffd6.
[ 286.467977][ C0] RSP: 0018:ffff8883ae008580 EFLAGS: 00010246
[ 286.468425][ C0] RAX: 0000000000000000 RBX: ffff88813fd34100 RCX: ffffffffa386cc43
[ 286.469019][ C0] RDX: 1ffff11027fa68b6 RSI: 0000000000000008 RDI: ffff88813fd34100
[ 286.469545][ C0] RBP: ffff88813fd34160 R08: 0000000000000000 R09: ffffed1027fa682c
[ 286.470072][ C0] R10: ffff88813fd34167 R11: 0000000000000000 R12: ffff88813fd344c3
[ 286.470585][ C0] R13: ffff88813fd34112 R14: ffff88813fd34aec R15: ffff888132cdd268
[ 286.471070][ C0] FS: 00007fe3c04c7d80(0000) GS:ffff88840743f000(0000) knlGS:0000000000000000
[ 286.471644][ C0] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 286.472543][ C0] CR2: ffffffffffffffd6 CR3: 000000012daca000 CR4: 00000000000006f0
[ 286.473500][ C0] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 286.474467][ C0] DR3: 0000000000000000 DR6: 00000000ffff07f0 DR7: 0000000000000400
[ 286.475453][ C0] Call Trace:
[ 286.476102][ C0] <IRQ>
[ 286.476719][ C0] tcp_fin+0x2bb/0x440
[ 286.477429][ C0] tcp_data_queue+0x190f/0x4e60
[ 286.478174][ C0] ? __build_skb_around+0x234/0x330
[ 286.478940][ C0] ? rcu_is_watching+0x11/0xb0
[ 286.479659][ C0] ? __pfx_tcp_data_queue+0x10/0x10
[ 286.480431][ C0] ? tcp_try_undo_loss+0x640/0x6c0
[ 286.481196][ C0] ? seqcount_lockdep_reader_access.constprop.0+0x82/0x90
[ 286.482046][ C0] ? kvm_clock_get_cycles+0x14/0x30
[ 286.482769][ C0] ? ktime_get+0x66/0x150
[ 286.483433][ C0] ? rcu_is_watching+0x11/0xb0
[ 286.484146][ C0] tcp_rcv_established+0x6e4/0x2050
[ 286.484857][ C0] ? rcu_is_watching+0x11/0xb0
[ 286.485523][ C0] ? ipv4_dst_check+0x160/0x2b0
[ 286.486203][ C0] ? __pfx_tcp_rcv_established+0x10/0x10
[ 286.486917][ C0] ? lock_release+0x217/0x2c0
[ 286.487595][ C0] tcp_v4_do_rcv+0x4d6/0x9b0
[ 286.488279][ C0] tcp_v4_rcv+0x2af8/0x3e30
[ 286.488904][ C0] ? raw_local_deliver+0x51b/0xad0
[ 286.489551][ C0] ? rcu_is_watching+0x11/0xb0
[ 286.490198][ C0] ? __pfx_tcp_v4_rcv+0x10/0x10
[ 286.490813][ C0] ? __pfx_raw_local_deliver+0x10/0x10
[ 286.491487][ C0] ? __pfx_nf_confirm+0x10/0x10 [nf_conntrack]
[ 286.492275][ C0] ? rcu_is_watching+0x11/0xb0
[ 286.492900][ C0] ip_protocol_deliver_rcu+0x8f/0x370
[ 286.493579][ C0] ip_local_deliver_finish+0x297/0x420
[ 286.494268][ C0] ip_local_deliver+0x168/0x430
[ 286.494867][ C0] ? __pfx_ip_local_deliver+0x10/0x10
[ 286.495498][ C0] ? __pfx_ip_local_deliver_finish+0x10/0x10
[ 286.496204][ C0] ? ip_rcv_finish_core+0x19a/0x1f20
[ 286.496806][ C0] ? lock_release+0x217/0x2c0
[ 286.497414][ C0] ip_rcv+0x455/0x6e0
[ 286.497945][ C0] ? __pfx_ip_rcv+0x10/0x10
[ 286.498550][ C0] ? rcu_is_watching+0x11/0xb0
[ 286.499137][ C0] ? __pfx_ip_rcv_finish+0x10/0x10
[ 286.499763][ C0] ? lock_release+0x217/0x2c0
[ 286.500327][ C0] ? dl_scaled_delta_exec+0xd1/0x2c0
[ 286.500922][ C0] ? __pfx_ip_rcv+0x10/0x10
[ 286.501480][ C0] __netif_receive_skb_one_core+0x166/0x1b0
[ 286.502173][ C0] ? __pfx___netif_receive_skb_one_core+0x10/0x10
[ 286.502903][ C0] ? lock_acquire+0x2b2/0x310
[ 286.503487][ C0] ? process_backlog+0x372/0x1350
[ 286.504087][ C0] ? lock_release+0x217/0x2c0
[ 286.504642][ C0] process_backlog+0x3b9/0x1350
[ 286.505214][ C0] ? process_backlog+0x372/0x1350
[ 286.505779][ C0] __napi_poll.constprop.0+0xa6/0x490
[ 286.506363][ C0] net_rx_action+0x92e/0xe10
[ 286.506889][ C0] ? __pfx_net_rx_action+0x10/0x10
[ 286.507437][ C0] ? timerqueue_add+0x1f0/0x320
[ 286.507977][ C0] ? sched_clock_cpu+0x68/0x540
[ 286.508492][ C0] ? lock_acquire+0x2b2/0x310
[ 286.509043][ C0] ? kvm_sched_clock_read+0xd/0x20
[ 286.509607][ C0] ? handle_softirqs+0x1aa/0x7d0
[ 286.510187][ C0] handle_softirqs+0x1f2/0x7d0
[ 286.510754][ C0] ? __pfx_handle_softirqs+0x10/0x10
[ 286.511348][ C0] ? irqtime_account_irq+0x181/0x290
[ 286.511937][ C0] ? __dev_queue_xmit+0x85d/0x3450
[ 286.512510][ C0] do_softirq.part.0+0x89/0xc0
[ 286.513100][ C0] </IRQ>
[ 286.513548][ C0] <TASK>
[ 286.513953][ C0] __local_bh_enable_ip+0x112/0x140
[ 286.514522][ C0] ? __dev_queue_xmit+0x85d/0x3450
[ 286.515072][ C0] __dev_queue_xmit+0x872/0x3450
[ 286.515619][ C0] ? nft_do_chain+0xe16/0x15b0 [nf_tables]
[ 286.516252][ C0] ? __pfx___dev_queue_xmit+0x10/0x10
[ 286.516817][ C0] ? selinux_ip_postroute+0x43c/0xc50
[ 286.517433][ C0] ? __pfx_selinux_ip_postroute+0x10/0x10
[ 286.518061][ C0] ? rcu_is_watching+0x11/0xb0
[ 286.518606][ C0] ? ip_output+0x164/0x4a0
[ 286.519149][ C0] ? rcu_is_watching+0x11/0xb0
[ 286.519671][ C0] ? ip_finish_output2+0x17d5/0x1fb0
[ 286.520258][ C0] ip_finish_output2+0xb4b/0x1fb0
[ 286.520787][ C0] ? __pfx_ip_finish_output2+0x10/0x10
[ 286.521355][ C0] ? __ip_finish_output+0x15d/0x750
[ 286.521890][ C0] ip_output+0x164/0x4a0
[ 286.522372][ C0] ? __pfx_ip_output+0x10/0x10
[ 286.522872][ C0] ? rcu_is_watching+0x11/0xb0
[ 286.523402][ C0] ? _raw_spin_unlock_irqrestore+0x4c/0x60
[ 286.524031][ C0] ? __pfx_ip_finish_output+0x10/0x10
[ 286.524605][ C0] ? __ip_queue_xmit+0x999/0x2260
[ 286.525200][ C0] ? rcu_is_watching+0x11/0xb0
[ 286.525744][ C0] ? ipv4_dst_check+0x16a/0x2b0
[ 286.526279][ C0] ? lock_release+0x217/0x2c0
[ 286.526793][ C0] __ip_queue_xmit+0x1883/0x2260
[ 286.527324][ C0] ? __skb_clone+0x54c/0x730
[ 286.527827][ C0] __tcp_transmit_skb+0x209b/0x37a0
[ 286.528374][ C0] ? __pfx___tcp_transmit_skb+0x10/0x10
[ 286.528952][ C0] ? rcu_is_watching+0x11/0xb0
[ 286.529472][ C0] ? seqcount_lockdep_reader_access.constprop.0+0x82/0x90
[ 286.530152][ C0] ? trace_hardirqs_on+0x12/0x120
[ 286.530691][ C0] tcp_write_xmit+0xb81/0x88b0
[ 286.531224][ C0] ? mod_memcg_state+0x4d/0x60
[ 286.531736][ C0] ? rcu_is_watching+0x11/0xb0
[ 286.532253][ C0] __tcp_push_pending_frames+0x90/0x320
[ 286.532826][ C0] tcp_send_fin+0x141/0xb50
[ 286.533352][ C0] ? __pfx_tcp_send_fin+0x10/0x10
[ 286.533908][ C0] ? __local_bh_enable_ip+0xab/0x140
[ 286.534495][ C0] inet_shutdown+0x243/0x320
[ 286.535077][ C0] nvme_tcp_alloc_queue+0xb3b/0x2590 [nvme_tcp]
[ 286.535709][ C0] ? do_raw_spin_lock+0x129/0x260
[ 286.536314][ C0] ? __pfx_nvme_tcp_alloc_queue+0x10/0x10 [nvme_tcp]
[ 286.536996][ C0] ? do_raw_spin_unlock+0x54/0x1e0
[ 286.537550][ C0] ? _raw_spin_unlock+0x29/0x50
[ 286.538127][ C0] ? do_raw_spin_lock+0x129/0x260
[ 286.538664][ C0] ? __pfx_do_raw_spin_lock+0x10/0x10
[ 286.539249][ C0] ? nvme_tcp_alloc_admin_queue+0xd5/0x340 [nvme_tcp]
[ 286.539892][ C0] ? __wake_up+0x40/0x60
[ 286.540392][ C0] nvme_tcp_alloc_admin_queue+0xd5/0x340 [nvme_tcp]
[ 286.541047][ C0] ? rcu_is_watching+0x11/0xb0
[ 286.541589][ C0] nvme_tcp_setup_ctrl+0x8b/0x7a0 [nvme_tcp]
[ 286.542254][ C0] ? _raw_spin_unlock_irqrestore+0x4c/0x60
[ 286.542887][ C0] ? __pfx_nvme_tcp_setup_ctrl+0x10/0x10 [nvme_tcp]
[ 286.543568][ C0] ? trace_hardirqs_on+0x12/0x120
[ 286.544166][ C0] ? _raw_spin_unlock_irqrestore+0x35/0x60
[ 286.544792][ C0] ? nvme_change_ctrl_state+0x196/0x2e0 [nvme_core]
[ 286.545477][ C0] nvme_tcp_create_ctrl+0x839/0xb90 [nvme_tcp]
[ 286.546126][ C0] nvmf_dev_write+0x3db/0x7e0 [nvme_fabrics]
[ 286.546775][ C0] ? rw_verify_area+0x69/0x520
[ 286.547334][ C0] vfs_write+0x218/0xe90
[ 286.547854][ C0] ? do_syscall_64+0x9f/0x190
[ 286.548408][ C0] ? trace_hardirqs_on_prepare+0xdb/0x120
[ 286.549037][ C0] ? syscall_exit_to_user_mode+0x93/0x280
[ 286.549659][ C0] ? __pfx_vfs_write+0x10/0x10
[ 286.550259][ C0] ? do_syscall_64+0x9f/0x190
[ 286.550840][ C0] ? syscall_exit_to_user_mode+0x8e/0x280
[ 286.551516][ C0] ? trace_hardirqs_on_prepare+0xdb/0x120
[ 286.552180][ C0] ? syscall_exit_to_user_mode+0x93/0x280
[ 286.552834][ C0] ? ksys_read+0xf5/0x1c0
[ 286.553386][ C0] ? __pfx_ksys_read+0x10/0x10
[ 286.553964][ C0] ksys_write+0xf5/0x1c0
[ 286.554499][ C0] ? __pfx_ksys_write+0x10/0x10
[ 286.555072][ C0] ? trace_hardirqs_on_prepare+0xdb/0x120
[ 286.555698][ C0] ? syscall_exit_to_user_mode+0x93/0x280
[ 286.556319][ C0] ? do_syscall_64+0x54/0x190
[ 286.556866][ C0] do_syscall_64+0x93/0x190
[ 286.557420][ C0] ? rcu_read_unlock+0x17/0x60
[ 286.557986][ C0] ? rcu_is_watching+0x11/0xb0
[ 286.558526][ C0] ? lock_release+0x217/0x2c0
[ 286.559087][ C0] ? rcu_is_watching+0x11/0xb0
[ 286.559659][ C0] ? count_memcg_events.constprop.0+0x4a/0x60
[ 286.560476][ C0] ? exc_page_fault+0x7a/0x110
[ 286.561064][ C0] ? rcu_is_watching+0x11/0xb0
[ 286.561647][ C0] ? lock_release+0x217/0x2c0
[ 286.562257][ C0] ? do_user_addr_fault+0x171/0xa00
[ 286.562839][ C0] ? do_user_addr_fault+0x4a2/0xa00
[ 286.563453][ C0] ? irqentry_exit_to_user_mode+0x84/0x270
[ 286.564112][ C0] ? rcu_is_watching+0x11/0xb0
[ 286.564677][ C0] ? irqentry_exit_to_user_mode+0x84/0x270
[ 286.565317][ C0] ? trace_hardirqs_on_prepare+0xdb/0x120
[ 286.565922][ C0] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 286.566542][ C0] RIP: 0033:0x7fe3c05e6504
[ 286.567102][ C0] Code: c7 00 16 00 00 00 b8 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 80 3d c5 8b 10 00 00 74 13 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 c3 0f 1f 00 55 48 89 e5 48 83 ec 20 48 89
[ 286.568931][ C0] RSP: 002b:00007fff76444f58 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
[ 286.569807][ C0] RAX: ffffffffffffffda RBX: 000000003b40d930 RCX: 00007fe3c05e6504
[ 286.570621][ C0] RDX: 00000000000000cf RSI: 000000003b40d930 RDI: 0000000000000003
[ 286.571443][ C0] RBP: 0000000000000003 R08: 00000000000000cf R09: 000000003b40d930
[ 286.572246][ C0] R10: 0000000000000000 R11: 0000000000000202 R12: 000000003b40cd60
[ 286.573069][ C0] R13: 00000000000000cf R14: 00007fe3c07417f8 R15: 00007fe3c073502e
[ 286.573886][ C0] </TASK>
Closes: https://lore.kernel.org/linux-nvme/5hdonndzoqa265oq3bj6iarwtfk5dewxxjtbjvn5uqnwclpwt6@a2n6w3taxxex/
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Tested-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jan Kara <jack@suse.cz>
Date: Thu Apr 24 15:45:12 2025 +0200
ocfs2: implement handshaking with ocfs2 recovery thread
commit 8f947e0fd595951460f5a6e1ac29baa82fa02eab upstream.
We will need ocfs2 recovery thread to acknowledge transitions of
recovery_state when disabling particular types of recovery. This is
similar to what currently happens when disabling recovery completely, just
more general. Implement the handshake and use it for exit from recovery.
Link: https://lkml.kernel.org/r/20250424134515.18933-5-jack@suse.cz
Fixes: 5f530de63cfc ("ocfs2: Use s_umount for quota recovery protection")
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Heming Zhao <heming.zhao@suse.com>
Tested-by: Heming Zhao <heming.zhao@suse.com>
Acked-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Jun Piao <piaojun@huawei.com>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Murad Masimov <m.masimov@mt-integration.ru>
Cc: Shichangkuo <shi.changkuo@h3c.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Jan Kara <jack@suse.cz>
Date: Thu Apr 24 15:45:13 2025 +0200
ocfs2: stop quota recovery before disabling quotas
commit fcaf3b2683b05a9684acdebda706a12025a6927a upstream.
Currently quota recovery is synchronized with unmount using sb->s_umount
semaphore. That is however prone to deadlocks because
flush_workqueue(osb->ocfs2_wq) called from umount code can wait for quota
recovery to complete while ocfs2_finish_quota_recovery() waits for
sb->s_umount semaphore.
Grabbing of sb->s_umount semaphore in ocfs2_finish_quota_recovery() is
only needed to protect that function from disabling of quotas from
ocfs2_dismount_volume(). Handle this problem by disabling quota recovery
early during unmount in ocfs2_dismount_volume() instead so that we can
drop acquisition of sb->s_umount from ocfs2_finish_quota_recovery().
Link: https://lkml.kernel.org/r/20250424134515.18933-6-jack@suse.cz
Fixes: 5f530de63cfc ("ocfs2: Use s_umount for quota recovery protection")
Signed-off-by: Jan Kara <jack@suse.cz>
Reported-by: Shichangkuo <shi.changkuo@h3c.com>
Reported-by: Murad Masimov <m.masimov@mt-integration.ru>
Reviewed-by: Heming Zhao <heming.zhao@suse.com>
Tested-by: Heming Zhao <heming.zhao@suse.com>
Acked-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Jun Piao <piaojun@huawei.com>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Jan Kara <jack@suse.cz>
Date: Thu Apr 24 15:45:11 2025 +0200
ocfs2: switch osb->disable_recovery to enum
commit c0fb83088f0cc4ee4706e0495ee8b06f49daa716 upstream.
Patch series "ocfs2: Fix deadlocks in quota recovery", v3.
This implements another approach to fixing quota recovery deadlocks. We
avoid grabbing sb->s_umount semaphore from ocfs2_finish_quota_recovery()
and instead stop quota recovery early in ocfs2_dismount_volume().
This patch (of 3):
We will need more recovery states than just pure enable / disable to fix
deadlocks with quota recovery. Switch osb->disable_recovery to enum.
Link: https://lkml.kernel.org/r/20250424134301.1392-1-jack@suse.cz
Link: https://lkml.kernel.org/r/20250424134515.18933-4-jack@suse.cz
Fixes: 5f530de63cfc ("ocfs2: Use s_umount for quota recovery protection")
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Heming Zhao <heming.zhao@suse.com>
Tested-by: Heming Zhao <heming.zhao@suse.com>
Acked-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Jun Piao <piaojun@huawei.com>
Cc: Murad Masimov <m.masimov@mt-integration.ru>
Cc: Shichangkuo <shi.changkuo@h3c.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Sergey Shtylyov <s.shtylyov@omp.ru>
Date: Sun Apr 14 11:51:39 2024 +0300
of: module: add buffer overflow check in of_modalias()
commit cf7385cb26ac4f0ee6c7385960525ad534323252 upstream.
In of_modalias(), if the buffer happens to be too small even for the 1st
snprintf() call, the len parameter will become negative and str parameter
(if not NULL initially) will point beyond the buffer's end. Add the buffer
overflow check after the 1st snprintf() call and fix such check after the
strlen() call (accounting for the terminating NUL char).
Fixes: bc575064d688 ("of/device: use of_property_for_each_string to parse compatible strings")
Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Link: https://lore.kernel.org/r/bbfc6be0-c687-62b6-d015-5141b93f313e@omp.ru
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: "Uwe Kleine-König" <ukleinek@debian.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Eelco Chaudron <echaudro@redhat.com>
Date: Tue May 6 16:28:54 2025 +0200
openvswitch: Fix unsafe attribute parsing in output_userspace()
commit 6beb6835c1fbb3f676aebb51a5fee6b77fed9308 upstream.
This patch replaces the manual Netlink attribute iteration in
output_userspace() with nla_for_each_nested(), which ensures that only
well-formed attributes are processed.
Fixes: ccb1352e76cf ("net: Add Open vSwitch kernel components.")
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Acked-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Aaron Conole <aconole@redhat.com>
Link: https://patch.msgid.link/0bd65949df61591d9171c0dc13e42cea8941da10.1746541734.git.echaudro@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Matthew Wilcox (Oracle) <willy@infradead.org>
Date: Wed Mar 5 20:47:25 2025 +0000
orangefs: Do not truncate file size
[ Upstream commit 062e8093592fb866b8e016641a8b27feb6ac509d ]
'len' is used to store the result of i_size_read(), so making 'len'
a size_t results in truncation to 4GiB on 32-bit systems.
Signed-off-by: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Link: https://lore.kernel.org/r/20250305204734.1475264-2-willy@infradead.org
Tested-by: Mike Marshall <hubcap@omnibond.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Dominik Grzegorzek <dominik.grzegorzek@oracle.com>
Date: Sun May 18 19:45:31 2025 +0200
padata: do not leak refcount in reorder_work
commit d6ebcde6d4ecf34f8495fb30516645db3aea8993 upstream.
A recent patch that addressed a UAF introduced a reference count leak:
the parallel_data refcount is incremented unconditionally, regardless
of the return value of queue_work(). If the work item is already queued,
the incremented refcount is never decremented.
Fix this by checking the return value of queue_work() and decrementing
the refcount when necessary.
Resolves:
Unreferenced object 0xffff9d9f421e3d80 (size 192):
comm "cryptomgr_probe", pid 157, jiffies 4294694003
hex dump (first 32 bytes):
80 8b cf 41 9f 9d ff ff b8 97 e0 89 ff ff ff ff ...A............
d0 97 e0 89 ff ff ff ff 19 00 00 00 1f 88 23 00 ..............#.
backtrace (crc 838fb36):
__kmalloc_cache_noprof+0x284/0x320
padata_alloc_pd+0x20/0x1e0
padata_alloc_shell+0x3b/0xa0
0xffffffffc040a54d
cryptomgr_probe+0x43/0xc0
kthread+0xf6/0x1f0
ret_from_fork+0x2f/0x50
ret_from_fork_asm+0x1a/0x30
Fixes: dd7d37ccf6b1 ("padata: avoid UAF for reorder_work")
Cc: <stable@vger.kernel.org>
Signed-off-by: Dominik Grzegorzek <dominik.grzegorzek@oracle.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Helge Deller <deller@gmx.de>
Date: Sat May 3 18:24:01 2025 +0200
parisc: Fix double SIGFPE crash
commit de3629baf5a33af1919dec7136d643b0662e85ef upstream.
Camm noticed that on parisc a SIGFPE exception will crash an application with
a second SIGFPE in the signal handler. Dave analyzed it, and it happens
because glibc uses a double-word floating-point store to atomically update
function descriptors. As a result of lazy binding, we hit a floating-point
store in fpe_func almost immediately.
When the T bit is set, an assist exception trap occurs when when the
co-processor encounters *any* floating-point instruction except for a double
store of register %fr0. The latter cancels all pending traps. Let's fix this
by clearing the Trap (T) bit in the FP status register before returning to the
signal handler in userspace.
The issue can be reproduced with this test program:
root@parisc:~# cat fpe.c
static void fpe_func(int sig, siginfo_t *i, void *v) {
sigset_t set;
sigemptyset(&set);
sigaddset(&set, SIGFPE);
sigprocmask(SIG_UNBLOCK, &set, NULL);
printf("GOT signal %d with si_code %ld\n", sig, i->si_code);
}
int main() {
struct sigaction action = {
.sa_sigaction = fpe_func,
.sa_flags = SA_RESTART|SA_SIGINFO };
sigaction(SIGFPE, &action, 0);
feenableexcept(FE_OVERFLOW);
return printf("%lf\n",1.7976931348623158E308*1.7976931348623158E308);
}
root@parisc:~# gcc fpe.c -lm
root@parisc:~# ./a.out
Floating point exception
root@parisc:~# strace -f ./a.out
execve("./a.out", ["./a.out"], 0xf9ac7034 /* 20 vars */) = 0
getrlimit(RLIMIT_STACK, {rlim_cur=8192*1024, rlim_max=RLIM_INFINITY}) = 0
...
rt_sigaction(SIGFPE, {sa_handler=0x1110a, sa_mask=[], sa_flags=SA_RESTART|SA_SIGINFO}, NULL, 8) = 0
--- SIGFPE {si_signo=SIGFPE, si_code=FPE_FLTOVF, si_addr=0x1078f} ---
--- SIGFPE {si_signo=SIGFPE, si_code=FPE_FLTOVF, si_addr=0xf8f21237} ---
+++ killed by SIGFPE +++
Floating point exception
Signed-off-by: Helge Deller <deller@gmx.de>
Suggested-by: John David Anglin <dave.anglin@bell.net>
Reported-by: Camm Maguire <camm@maguirefamily.org>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Stanimir Varbanov <svarbanov@suse.de>
Date: Mon Feb 24 10:35:56 2025 +0200
PCI: brcmstb: Add a softdep to MIP MSI-X driver
[ Upstream commit 2294059118c550464dd8906286324d90c33b152b ]
Then the brcmstb PCIe driver and MIP MSI-X interrupt controller
drivers are built as modules there could be a race in probing.
To avoid this, add a softdep to MIP driver to guarantee that
MIP driver will be load first.
Signed-off-by: Stanimir Varbanov <svarbanov@suse.de>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Tested-by: Ivan T. Ivanov <iivanov@suse.de>
Link: https://lore.kernel.org/r/20250224083559.47645-5-svarbanov@suse.de
[kwilczynski: commit log]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Stanimir Varbanov <svarbanov@suse.de>
Date: Mon Feb 24 10:35:58 2025 +0200
PCI: brcmstb: Expand inbound window size up to 64GB
[ Upstream commit 25a98c727015638baffcfa236e3f37b70cedcf87 ]
The BCM2712 memory map can support up to 64GB of system memory, thus
expand the inbound window size in calculation helper function.
The change is safe for the currently supported SoCs that have smaller
inbound window sizes.
Signed-off-by: Stanimir Varbanov <svarbanov@suse.de>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Jim Quinlan <james.quinlan@broadcom.com>
Tested-by: Ivan T. Ivanov <iivanov@suse.de>
Link: https://lore.kernel.org/r/20250224083559.47645-7-svarbanov@suse.de
[kwilczynski: commit log]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Date: Mon Dec 16 19:56:12 2024 +0200
PCI: Fix old_size lower bound in calculate_iosize() too
[ Upstream commit ff61f380de5652e723168341480cc7adf1dd6213 ]
Commit 903534fa7d30 ("PCI: Fix resource double counting on remove &
rescan") fixed double counting of mem resources because of old_size being
applied too early.
Fix a similar counting bug on the io resource side.
Link: https://lore.kernel.org/r/20241216175632.4175-6-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Xiaochun Lee <lixc17@lenovo.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Richard Zhu <hongxing.zhu@nxp.com>
Date: Tue Nov 26 15:56:56 2024 +0800
PCI: imx6: Skip controller_id generation logic for i.MX7D
commit f068ffdd034c93f0c768acdc87d4d2d7023c1379 upstream.
The i.MX7D only has one PCIe controller, so controller_id should always be
0. The previous code is incorrect although yielding the correct result.
Fix by removing "IMX7D" from the switch case branch.
Fixes: 2d8ed461dbc9 ("PCI: imx6: Add support for i.MX8MQ")
Link: https://lore.kernel.org/r/20241126075702.4099164-5-hongxing.zhu@nxp.com
Signed-off-by: Richard Zhu <hongxing.zhu@nxp.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
[Because this switch case does more than just controller_id
logic, move the "IMX7D" case label instead of removing it entirely.]
Signed-off-by: Ryan Matthews <ryanmatthews@fastmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Ravi Bangoria <ravi.bangoria@amd.com>
Date: Wed Jan 15 05:44:33 2025 +0000
perf/amd/ibs: Fix perf_ibs_op.cnt_mask for CurCnt
[ Upstream commit 46dcf85566170d4528b842bf83ffc350d71771fa ]
IBS Op uses two counters: MaxCnt and CurCnt. MaxCnt is programmed with
the desired sample period. IBS hw generates sample when CurCnt reaches
to MaxCnt. The size of these counter used to be 20 bits but later they
were extended to 27 bits. The 7 bit extension is indicated by CPUID
Fn8000_001B_EAX[6 / OpCntExt].
perf_ibs->cnt_mask variable contains bit masks for MaxCnt and CurCnt.
But IBS driver does not set upper 7 bits of CurCnt in cnt_mask even
when OpCntExt CPUID bit is set. Fix this.
IBS driver uses cnt_mask[CurCnt] bits only while disabling an event.
Fortunately, CurCnt bits are not read from MSR while re-enabling the
event, instead MaxCnt is programmed with desired period and CurCnt is
set to 0. Hence, we did not see any issues so far.
Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/r/20250115054438.1021-5-ravi.bangoria@amd.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Robin Murphy <robin.murphy@arm.com>
Date: Mon May 12 18:11:54 2025 +0100
perf/arm-cmn: Initialise cmn->cpu earlier
commit 597704e201068db3d104de3c7a4d447ff8209127 upstream.
For all the complexity of handling affinity for CPU hotplug, what we've
apparently managed to overlook is that arm_cmn_init_irqs() has in fact
always been setting the *initial* affinity of all IRQs to CPU 0, not the
CPU we subsequently choose for event scheduling. Oh dear.
Cc: stable@vger.kernel.org
Fixes: 0ba64770a2f2 ("perf: Add Arm CMN-600 PMU driver")
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Link: https://lore.kernel.org/r/b12fccba6b5b4d2674944f59e4daad91cd63420b.1747069914.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
[ backport past NUMA changes in 5.17 ]
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Dmitry Baryshkov <lumag@kernel.org>
Date: Sun Feb 9 14:31:45 2025 +0200
phy: core: don't require set_mode() callback for phy_get_mode() to work
[ Upstream commit d58c04e305afbaa9dda7969151f06c4efe2c98b0 ]
As reported by Damon Ding, the phy_get_mode() call doesn't work as
expected unless the PHY driver has a .set_mode() call. This prompts PHY
drivers to have empty stubs for .set_mode() for the sake of being able
to get the mode.
Make .set_mode() callback truly optional and update PHY's mode even if
it there is none.
Cc: Damon Ding <damon.ding@rock-chips.com>
Link: https://lore.kernel.org/r/96f8310f-93f1-4bcb-8637-137e1159ff83@rock-chips.com
Tested-by: Damon Ding <damon.ding@rock-chips.com>
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Link: https://lore.kernel.org/r/20250209-phy-fix-set-moe-v2-1-76e248503856@linaro.org
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ma Ke <make24@iscas.ac.cn>
Date: Mon Mar 3 15:27:39 2025 +0800
phy: Fix error handling in tegra_xusb_port_init
commit b2ea5f49580c0762d17d80d8083cb89bc3acf74f upstream.
If device_add() fails, do not use device_unregister() for error
handling. device_unregister() consists two functions: device_del() and
put_device(). device_unregister() should only be called after
device_add() succeeded because device_del() undoes what device_add()
does if successful. Change device_unregister() to put_device() call
before returning from the function.
As comment of device_add() says, 'if device_add() succeeds, you should
call device_del() when you want to get rid of it. If device_add() has
not succeeded, use only put_device() to drop the reference count'.
Found by code review.
Cc: stable@vger.kernel.org
Fixes: 53d2a715c240 ("phy: Add Tegra XUSB pad controller support")
Signed-off-by: Ma Ke <make24@iscas.ac.cn>
Acked-by: Thierry Reding <treding@nvidia.com>
Link: https://lore.kernel.org/r/20250303072739.3874987-1-make24@iscas.ac.cn
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com>
Date: Wed May 7 15:50:32 2025 +0300
phy: renesas: rcar-gen3-usb2: Set timing registers only once
commit 86e70849f4b2b4597ac9f7c7931f2a363774be25 upstream.
phy-rcar-gen3-usb2 driver exports 4 PHYs. The timing registers are common
to all PHYs. There is no need to set them every time a PHY is initialized.
Set timing register only when the 1st PHY is initialized.
Fixes: f3b5a8d9b50d ("phy: rcar-gen3-usb2: Add R-Car Gen3 USB2 PHY driver")
Cc: stable@vger.kernel.org
Reviewed-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Tested-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com>
Link: https://lore.kernel.org/r/20250507125032.565017-6-claudiu.beznea.uj@bp.renesas.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Christian Brauner <brauner@kernel.org>
Date: Mon Mar 27 20:22:51 2023 +0200
pid: add pidfd_prepare()
commit 6ae930d9dbf2d093157be33428538c91966d8a9f upstream.
Add a new helper that allows to reserve a pidfd and allocates a new
pidfd file that stashes the provided struct pid. This will allow us to
remove places that either open code this function or that call
pidfd_create() but then have to call close_fd() because there are still
failure points after pidfd_create() has been called.
Reviewed-by: Jan Kara <jack@suse.cz>
Message-Id: <20230327-pidfd-file-api-v1-1-5c0e9a3158e4@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Artur Weber <aweber.kernel@gmail.com>
Date: Mon Mar 3 21:54:47 2025 +0100
pinctrl: bcm281xx: Use "unsigned int" instead of bare "unsigned"
[ Upstream commit 07b5a2a13f4704c5eae3be7277ec54ffdba45f72 ]
Replace uses of bare "unsigned" with "unsigned int" to fix checkpatch
warnings. No functional change.
Signed-off-by: Artur Weber <aweber.kernel@gmail.com>
Link: https://lore.kernel.org/20250303-bcm21664-pinctrl-v3-2-5f8b80e4ab51@gmail.com
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Valentin Caron <valentin.caron@foss.st.com>
Date: Thu Jan 16 18:00:09 2025 +0100
pinctrl: devicetree: do not goto err when probing hogs in pinctrl_dt_to_map
[ Upstream commit c98868e816209e568c9d72023ba0bc1e4d96e611 ]
Cross case in pinctrl framework make impossible to an hogged pin and
another, not hogged, used within the same device-tree node. For example
with this simplified device-tree :
&pinctrl {
pinctrl_pin_1: pinctrl-pin-1 {
pins = "dummy-pinctrl-pin";
};
};
&rtc {
pinctrl-names = "default"
pinctrl-0 = <&pinctrl_pin_1 &rtc_pin_1>
rtc_pin_1: rtc-pin-1 {
pins = "dummy-rtc-pin";
};
};
"pinctrl_pin_1" configuration is never set. This produces this path in
the code:
really_probe()
pinctrl_bind_pins()
| devm_pinctrl_get()
| pinctrl_get()
| create_pinctrl()
| pinctrl_dt_to_map()
| // Hog pin create an abort for all pins of the node
| ret = dt_to_map_one_config()
| | /* Do not defer probing of hogs (circular loop) */
| | if (np_pctldev == p->dev->of_node)
| | return -ENODEV;
| if (ret)
| goto err
|
call_driver_probe()
stm32_rtc_probe()
pinctrl_enable()
pinctrl_claim_hogs()
create_pinctrl()
for_each_maps(maps_node, i, map)
// Not hog pin is skipped
if (pctldev && strcmp(dev_name(pctldev->dev),
map->ctrl_dev_name))
continue;
At the first call of create_pinctrl() the hogged pin produces an abort to
avoid a defer of hogged pins. All other pin configurations are trashed.
At the second call, create_pinctrl is now called with pctldev parameter to
get hogs, but in this context only hogs are set. And other pins are
skipped.
To handle this, do not produce an abort in the first call of
create_pinctrl(). Classic pin configuration will be set in
pinctrl_bind_pins() context. And the hogged pin configuration will be set
in pinctrl_claim_hogs() context.
Signed-off-by: Valentin Caron <valentin.caron@foss.st.com>
Link: https://lore.kernel.org/20250116170009.2075544-1-valentin.caron@foss.st.com
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Date: Sat Mar 29 20:01:32 2025 +0100
pinctrl: meson: define the pull up/down resistor value as 60 kOhm
[ Upstream commit e56088a13708757da68ad035269d69b93ac8c389 ]
The public datasheets of the following Amlogic SoCs describe a typical
resistor value for the built-in pull up/down resistor:
- Meson8/8b/8m2: not documented
- GXBB (S905): 60 kOhm
- GXL (S905X): 60 kOhm
- GXM (S912): 60 kOhm
- G12B (S922X): 60 kOhm
- SM1 (S905D3): 60 kOhm
The public G12B and SM1 datasheets additionally state min and max
values:
- min value: 50 kOhm for both, pull-up and pull-down
- max value for the pull-up: 70 kOhm
- max value for the pull-down: 130 kOhm
Use 60 kOhm in the pinctrl-meson driver as well so it's shown in the
debugfs output. It may not be accurate for Meson8/8b/8m2 but in reality
60 kOhm is closer to the actual value than 1 Ohm.
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org>
Link: https://lore.kernel.org/20250329190132.855196-1-martin.blumenstingl@googlemail.com
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Hans de Goede <hdegoede@redhat.com>
Date: Thu May 1 15:17:02 2025 +0200
platform/x86: asus-wmi: Fix wlan_ctrl_by_user detection
[ Upstream commit bfcfe6d335a967f8ea0c1980960e6f0205b5de6e ]
The wlan_ctrl_by_user detection was introduced by commit a50bd128f28c
("asus-wmi: record wlan status while controlled by userapp").
Quoting from that commit's commit message:
"""
When you call WMIMethod(DSTS, 0x00010011) to get WLAN status, it may return
(1) 0x00050001 (On)
(2) 0x00050000 (Off)
(3) 0x00030001 (On)
(4) 0x00030000 (Off)
(5) 0x00000002 (Unknown)
(1), (2) means that the model has hardware GPIO for WLAN, you can call
WMIMethod(DEVS, 0x00010011, 1 or 0) to turn WLAN on/off.
(3), (4) means that the model doesn’t have hardware GPIO, you need to use
API or driver library to turn WLAN on/off, and call
WMIMethod(DEVS, 0x00010012, 1 or 0) to set WLAN LED status.
After you set WLAN LED status, you can see the WLAN status is changed with
WMIMethod(DSTS, 0x00010011). Because the status is recorded lastly
(ex: Windows), you can use it for synchronization.
(5) means that the model doesn’t have WLAN device.
WLAN is the ONLY special case with upper rule.
"""
The wlan_ctrl_by_user flag should be set on 0x0003000? ((3), (4) above)
return values, but the flag mistakenly also gets set on laptops with
0x0005000? ((1), (2)) return values. This is causing rfkill problems on
laptops where 0x0005000? is returned.
Fix the check to only set the wlan_ctrl_by_user flag for 0x0003000?
return values.
Fixes: a50bd128f28c ("asus-wmi: record wlan status while controlled by userapp")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=219786
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Armin Wolf <W_Armin@gmx.de>
Link: https://lore.kernel.org/r/20250501131702.103360-2-hdegoede@redhat.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Valtteri Koskivuori <vkoskiv@gmail.com>
Date: Fri May 9 21:42:49 2025 +0300
platform/x86: fujitsu-laptop: Support Lifebook S2110 hotkeys
[ Upstream commit a7e255ff9fe4d9b8b902023aaf5b7a673786bb50 ]
The S2110 has an additional set of media playback control keys enabled
by a hardware toggle button that switches the keys between "Application"
and "Player" modes. Toggling "Player" mode just shifts the scancode of
each hotkey up by 4.
Add defines for new scancodes, and a keymap and dmi id for the S2110.
Tested on a Fujitsu Lifebook S2110.
Signed-off-by: Valtteri Koskivuori <vkoskiv@gmail.com>
Acked-by: Jonathan Woithe <jwoithe@just42.net>
Link: https://lore.kernel.org/r/20250509184251.713003-1-vkoskiv@gmail.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Mark Pearson <mpearson-lenovo@squebb.ca>
Date: Fri May 16 22:33:37 2025 -0400
platform/x86: thinkpad_acpi: Ignore battery threshold change event notification
[ Upstream commit 29e4e6b4235fefa5930affb531fe449cac330a72 ]
If user modifies the battery charge threshold an ACPI event is generated.
Confirmed with Lenovo FW team this is only generated on user event. As no
action is needed, ignore the event and prevent spurious kernel logs.
Reported-by: Derek Barbosa <debarbos@redhat.com>
Closes: https://lore.kernel.org/platform-driver-x86/7e9a1c47-5d9c-4978-af20-3949d53fb5dc@app.fastmail.com/T/#m5f5b9ae31d3fbf30d7d9a9d76c15fb3502dfd903
Signed-off-by: Mark Pearson <mpearson-lenovo@squebb.ca>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Armin Wolf <W_Armin@gmx.de>
Link: https://lore.kernel.org/r/20250517023348.2962591-1-mpearson-lenovo@squebb.ca
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: John Chau <johnchau@0atlas.com>
Date: Mon May 5 01:55:13 2025 +0900
platform/x86: thinkpad_acpi: Support also NEC Lavie X1475JAS
[ Upstream commit a032f29a15412fab9f4352e0032836d51420a338 ]
Change get_thinkpad_model_data() to check for additional vendor name
"NEC" in order to support NEC Lavie X1475JAS notebook (and perhaps
more).
The reason of this works with minimal changes is because NEC Lavie
X1475JAS is a Thinkpad inside. ACPI dumps reveals its OEM ID to be
"LENOVO", BIOS version "R2PET30W" matches typical Lenovo BIOS version,
the existence of HKEY of LEN0268, with DMI fw string is "R2PHT24W".
I compiled and tested with my own machine, attached the dmesg
below as proof of work:
[ 6.288932] thinkpad_acpi: ThinkPad ACPI Extras v0.26
[ 6.288937] thinkpad_acpi: http://ibm-acpi.sf.net/
[ 6.288938] thinkpad_acpi: ThinkPad BIOS R2PET30W (1.11 ), EC R2PHT24W
[ 6.307000] thinkpad_acpi: radio switch found; radios are enabled
[ 6.307030] thinkpad_acpi: This ThinkPad has standard ACPI backlight brightness control, supported by the ACPI video driver
[ 6.307033] thinkpad_acpi: Disabling thinkpad-acpi brightness events by default...
[ 6.320322] thinkpad_acpi: rfkill switch tpacpi_bluetooth_sw: radio is unblocked
[ 6.371963] thinkpad_acpi: secondary fan control detected & enabled
[ 6.391922] thinkpad_acpi: battery 1 registered (start 0, stop 85, behaviours: 0x7)
[ 6.398375] input: ThinkPad Extra Buttons as /devices/platform/thinkpad_acpi/input/input13
Signed-off-by: John Chau <johnchau@0atlas.com>
Link: https://lore.kernel.org/r/20250504165513.295135-1-johnchau@0atlas.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Trond Myklebust <trond.myklebust@hammerspace.com>
Date: Thu Mar 20 12:45:01 2025 -0400
pNFS/flexfiles: Report ENETDOWN as a connection error
[ Upstream commit aa42add73ce9b9e3714723d385c254b75814e335 ]
If the client should see an ENETDOWN when trying to connect to the data
server, it might still be able to talk to the metadata server through
another NIC. If so, report the error.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Tested-by: Jeff Layton <jlayton@kernel.org>
Acked-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Eric Dumazet <edumazet@google.com>
Date: Sat Mar 8 17:48:17 2025 +0100
posix-timers: Add cond_resched() to posix_timer_add() search loop
[ Upstream commit 5f2909c6cd13564a07ae692a95457f52295c4f22 ]
With a large number of POSIX timers the search for a valid ID might cause a
soft lockup on PREEMPT_NONE/VOLUNTARY kernels.
Add cond_resched() to the loop to prevent that.
[ tglx: Split out from Eric's series ]
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Link: https://lore.kernel.org/all/20250214135911.2037402-2-edumazet@google.com
Link: https://lore.kernel.org/all/20250308155623.635612865@linutronix.de
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Andreas Schwab <schwab@linux-m68k.org>
Date: Mon Jan 13 18:19:09 2025 +0100
powerpc/prom_init: Fixup missing #size-cells on PowerBook6,7
[ Upstream commit 7e67ef889c9ab7246547db73d524459f47403a77 ]
Similar to the PowerMac3,1, the PowerBook6,7 is missing the #size-cells
property on the i2s node.
Depends-on: commit 045b14ca5c36 ("of: WARN on deprecated #address-cells/#size-cells handling")
Signed-off-by: Andreas Schwab <schwab@linux-m68k.org>
Acked-by: Rob Herring (Arm) <robh@kernel.org>
[maddy: added "commit" work in depends-on to avoid checkpatch error]
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/875xmizl6a.fsf@igel.home
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Abdun Nihaal <abdun.nihaal@gmail.com>
Date: Mon May 12 10:18:27 2025 +0530
qlcnic: fix memory leak in qlcnic_sriov_channel_cfg_cmd()
[ Upstream commit 9d8a99c5a7c7f4f7eca2c168a4ec254409670035 ]
In one of the error paths in qlcnic_sriov_channel_cfg_cmd(), the memory
allocated in qlcnic_sriov_alloc_bc_mbx_args() for mailbox arguments is
not freed. Fix that by jumping to the error path that frees them, by
calling qlcnic_free_mbx_args(). This was found using static analysis.
Fixes: f197a7aa6288 ("qlcnic: VF-PF communication channel implementation")
Signed-off-by: Abdun Nihaal <abdun.nihaal@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250512044829.36400-1-abdun.nihaal@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Uladzislau Rezki (Sony) <urezki@gmail.com>
Date: Wed Feb 1 16:08:07 2023 +0100
rcu/kvfree: Add kvfree_rcu_mightsleep() and kfree_rcu_mightsleep()
[ Upstream commit 608723c41cd951fb32ade2f8371e61c270816175 ]
The kvfree_rcu() and kfree_rcu() APIs are hazardous in that if you forget
the second argument, it works, but might sleep. This sleeping can be a
correctness bug from atomic contexts, and even in non-atomic contexts
it might introduce unacceptable latencies. This commit therefore adds
kvfree_rcu_mightsleep() and kfree_rcu_mightsleep(), which will replace
the single-argument kvfree_rcu() and kfree_rcu(), respectively.
This commit enables a series of commits that switch from single-argument
kvfree_rcu() and kfree_rcu() to their _mightsleep() counterparts. Once
all of these commits land, the single-argument versions will be removed.
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Stable-dep-of: 511e64e13d8c ("can: gw: fix RCU/BH usage in cgw_create_job()")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ankur Arora <ankur.a.arora@oracle.com>
Date: Thu Dec 12 20:06:52 2024 -0800
rcu: fix header guard for rcu_all_qs()
[ Upstream commit ad6b5b73ff565e88aca7a7d1286788d80c97ba71 ]
rcu_all_qs() is defined for !CONFIG_PREEMPT_RCU but the declaration
is conditioned on CONFIG_PREEMPTION.
With CONFIG_PREEMPT_LAZY, CONFIG_PREEMPTION=y does not imply
CONFIG_PREEMPT_RCU=y.
Decouple the two.
Cc: Paul E. McKenney <paulmck@kernel.org>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ankur Arora <ankur.a.arora@oracle.com>
Date: Thu Dec 12 20:06:56 2024 -0800
rcu: handle quiescent states for PREEMPT_RCU=n, PREEMPT_COUNT=y
[ Upstream commit 83b28cfe796464ebbde1cf7916c126da6d572685 ]
With PREEMPT_RCU=n, cond_resched() provides urgently needed quiescent
states for read-side critical sections via rcu_all_qs().
One reason why this was needed: lacking preempt-count, the tick
handler has no way of knowing whether it is executing in a
read-side critical section or not.
With (PREEMPT_LAZY=y, PREEMPT_DYNAMIC=n), we get (PREEMPT_COUNT=y,
PREEMPT_RCU=n). In this configuration cond_resched() is a stub and
does not provide quiescent states via rcu_all_qs().
(PREEMPT_RCU=y provides this information via rcu_read_unlock() and
its nesting counter.)
So, use the availability of preempt_count() to report quiescent states
in rcu_flavor_sched_clock_irq().
Suggested-by: Paul E. McKenney <paulmck@kernel.org>
Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Zhu Yanjun <yanjun.zhu@linux.dev>
Date: Sat Apr 12 09:57:14 2025 +0200
RDMA/rxe: Fix slab-use-after-free Read in rxe_queue_cleanup bug
[ Upstream commit f81b33582f9339d2dc17c69b92040d3650bb4bae ]
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:94 [inline]
dump_stack_lvl+0x7d/0xa0 lib/dump_stack.c:120
print_address_description mm/kasan/report.c:378 [inline]
print_report+0xcf/0x610 mm/kasan/report.c:489
kasan_report+0xb5/0xe0 mm/kasan/report.c:602
rxe_queue_cleanup+0xd0/0xe0 drivers/infiniband/sw/rxe/rxe_queue.c:195
rxe_cq_cleanup+0x3f/0x50 drivers/infiniband/sw/rxe/rxe_cq.c:132
__rxe_cleanup+0x168/0x300 drivers/infiniband/sw/rxe/rxe_pool.c:232
rxe_create_cq+0x22e/0x3a0 drivers/infiniband/sw/rxe/rxe_verbs.c:1109
create_cq+0x658/0xb90 drivers/infiniband/core/uverbs_cmd.c:1052
ib_uverbs_create_cq+0xc7/0x120 drivers/infiniband/core/uverbs_cmd.c:1095
ib_uverbs_write+0x969/0xc90 drivers/infiniband/core/uverbs_main.c:679
vfs_write fs/read_write.c:677 [inline]
vfs_write+0x26a/0xcc0 fs/read_write.c:659
ksys_write+0x1b8/0x200 fs/read_write.c:731
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xaa/0x1b0 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f
In the function rxe_create_cq, when rxe_cq_from_init fails, the function
rxe_cleanup will be called to handle the allocated resources. In fact,
some memory resources have already been freed in the function
rxe_cq_from_init. Thus, this problem will occur.
The solution is to let rxe_cleanup do all the work.
Fixes: 8700e3e7c485 ("Soft RoCE driver")
Link: https://paste.ubuntu.com/p/tJgC42wDf6/
Tested-by: liuyi <liuy22@mails.tsinghua.edu.cn>
Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
Link: https://patch.msgid.link/20250412075714.3257358-1-yanjun.zhu@linux.dev
Reviewed-by: Daisuke Matsuda <matsuda-daisuke@fujitsu.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Isaac Scott <isaac.scott@ideasonboard.com>
Date: Tue Jan 28 17:31:43 2025 +0000
regulator: ad5398: Add device tree support
[ Upstream commit 5a6a461079decea452fdcae955bccecf92e07e97 ]
Previously, the ad5398 driver used only platform_data, which is
deprecated in favour of device tree. This caused the AD5398 to fail to
probe as it could not load its init_data. If the AD5398 has a device
tree node, pull the init_data from there using
of_get_regulator_init_data.
Signed-off-by: Isaac Scott <isaac.scott@ideasonboard.com>
Acked-by: Michael Hennerich <michael.hennerich@analog.com>
Link: https://patch.msgid.link/20250128173143.959600-4-isaac.scott@ideasonboard.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Christian Hewitt <christianshewitt@gmail.com>
Date: Mon Apr 21 22:12:59 2025 +0200
Revert "drm/meson: vclk: fix calculation of 59.94 fractional rates"
[ Upstream commit f37bb5486ea536c1d61df89feeaeff3f84f0b560 ]
This reverts commit bfbc68e.
The patch does permit the offending YUV420 @ 59.94 phy_freq and
vclk_freq mode to match in calculations. It also results in all
fractional rates being unavailable for use. This was unintended
and requires the patch to be reverted.
Fixes: bfbc68e4d869 ("drm/meson: vclk: fix calculation of 59.94 fractional rates")
Cc: stable@vger.kernel.org
Signed-off-by: Christian Hewitt <christianshewitt@gmail.com>
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org>
Link: https://lore.kernel.org/r/20250421201300.778955-2-martin.blumenstingl@googlemail.com
Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
Link: https://lore.kernel.org/r/20250421201300.778955-2-martin.blumenstingl@googlemail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Alexandre Belloni <alexandre.belloni@bootlin.com>
Date: Mon Mar 3 23:37:44 2025 +0100
rtc: ds1307: stop disabling alarms on probe
[ Upstream commit dcec12617ee61beed928e889607bf37e145bf86b ]
It is a bad practice to disable alarms on probe or remove as this will
prevent alarms across reboots.
Link: https://lore.kernel.org/r/20250303223744.1135672-1-alexandre.belloni@bootlin.com
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Alexandre Belloni <alexandre.belloni@bootlin.com>
Date: Thu Mar 6 22:42:41 2025 +0100
rtc: rv3032: fix EERD location
[ Upstream commit b0f9cb4a0706b0356e84d67e48500b77b343debe ]
EERD is bit 2 in CTRL1
Link: https://lore.kernel.org/r/20250306214243.1167692-1-alexandre.belloni@bootlin.com
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Cong Wang <xiyou.wangcong@gmail.com>
Date: Sun May 18 15:20:37 2025 -0700
sch_hfsc: Fix qlen accounting bug when using peek in hfsc_enqueue()
[ Upstream commit 3f981138109f63232a5fb7165938d4c945cc1b9d ]
When enqueuing the first packet to an HFSC class, hfsc_enqueue() calls the
child qdisc's peek() operation before incrementing sch->q.qlen and
sch->qstats.backlog. If the child qdisc uses qdisc_peek_dequeued(), this may
trigger an immediate dequeue and potential packet drop. In such cases,
qdisc_tree_reduce_backlog() is called, but the HFSC qdisc's qlen and backlog
have not yet been updated, leading to inconsistent queue accounting. This
can leave an empty HFSC class in the active list, causing further
consequences like use-after-free.
This patch fixes the bug by moving the increment of sch->q.qlen and
sch->qstats.backlog before the call to the child qdisc's peek() operation.
This ensures that queue length and backlog are always accurate when packet
drops or dequeues are triggered during the peek.
Fixes: 12d0ad3be9c3 ("net/sched/sch_hfsc.c: handle corner cases where head may change invalidating calculated deadline")
Reported-by: Mingi Cho <mincho@theori.io>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250518222038.58538-2-xiyou.wangcong@gmail.com
Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Justin Tee <justin.tee@broadcom.com>
Date: Thu Jan 30 16:05:22 2025 -0800
scsi: lpfc: Handle duplicate D_IDs in ndlp search-by D_ID routine
[ Upstream commit 56c3d809b7b450379162d0b8a70bbe71ab8db706 ]
After a port swap between separate fabrics, there may be multiple nodes in
the vport's fc_nodes list with the same fabric well known address.
Duplication is temporary and eventually resolves itself after dev_loss_tmo
expires, but nameserver queries may still occur before dev_loss_tmo. This
possibly results in returning stale fabric ndlp objects. Fix by adding an
nlp_state check to ensure the ndlp search routine returns the correct newer
allocated ndlp fabric object.
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20250131000524.163662-5-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Shivasharan S <shivasharan.srikanteshwara@broadcom.com>
Date: Wed Feb 12 17:26:55 2025 -0800
scsi: mpt3sas: Send a diag reset if target reset fails
[ Upstream commit 5612d6d51ed2634a033c95de2edec7449409cbb9 ]
When an IOCTL times out and driver issues a target reset, if firmware
fails the task management elevate the recovery by issuing a diag reset to
controller.
Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com>
Link: https://lore.kernel.org/r/1739410016-27503-5-git-send-email-shivasharan.srikanteshwara@broadcom.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Kai Mäkisara <Kai.Makisara@kolumbus.fi>
Date: Tue Mar 11 13:25:15 2025 +0200
scsi: st: ERASE does not change tape location
[ Upstream commit ad77cebf97bd42c93ab4e3bffd09f2b905c1959a ]
The SCSI ERASE command erases from the current position onwards. Don't
clear the position variables.
Signed-off-by: Kai Mäkisara <Kai.Makisara@kolumbus.fi>
Link: https://lore.kernel.org/r/20250311112516.5548-3-Kai.Makisara@kolumbus.fi
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Kai Mäkisara <Kai.Makisara@kolumbus.fi>
Date: Mon Jan 20 21:49:22 2025 +0200
scsi: st: Restore some drive settings after reset
[ Upstream commit 7081dc75df79696d8322d01821c28e53416c932c ]
Some of the allowed operations put the tape into a known position to
continue operation assuming only the tape position has changed. But reset
sets partition, density and block size to drive default values. These
should be restored to the values before reset.
Normally the current block size and density are stored by the drive. If
the settings have been changed, the changed values have to be saved by the
driver across reset.
Signed-off-by: Kai Mäkisara <Kai.Makisara@kolumbus.fi>
Link: https://lore.kernel.org/r/20250120194925.44432-2-Kai.Makisara@kolumbus.fi
Reviewed-by: John Meneghini <jmeneghi@redhat.com>
Tested-by: John Meneghini <jmeneghi@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Kai Mäkisara <Kai.Makisara@kolumbus.fi>
Date: Tue Mar 11 13:25:16 2025 +0200
scsi: st: Tighten the page format heuristics with MODE SELECT
[ Upstream commit 8db816c6f176321e42254badd5c1a8df8bfcfdb4 ]
In the days when SCSI-2 was emerging, some drives did claim SCSI-2 but did
not correctly implement it. The st driver first tries MODE SELECT with the
page format bit set to set the block descriptor. If not successful, the
non-page format is tried.
The test only tests the sense code and this triggers also from illegal
parameter in the parameter list. The test is limited to "old" devices and
made more strict to remove false alarms.
Signed-off-by: Kai Mäkisara <Kai.Makisara@kolumbus.fi>
Link: https://lore.kernel.org/r/20250311112516.5548-4-Kai.Makisara@kolumbus.fi
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Mike Christie <michael.christie@oracle.com>
Date: Mon Jun 27 21:23:25 2022 -0500
scsi: target: Fix WRITE_SAME No Data Buffer crash
commit ccd3f449052449a917a3e577d8ba0368f43b8f29 upstream.
In newer version of the SBC specs, we have a NDOB bit that indicates there
is no data buffer that gets written out. If this bit is set using commands
like "sg_write_same --ndob" we will crash in target_core_iblock/file's
execute_write_same handlers when we go to access the se_cmd->t_data_sg
because its NULL.
This patch adds a check for the NDOB bit in the common WRITE SAME code
because we don't support it. And, it adds a check for zero SG elements in
each handler in case the initiator tries to send a normal WRITE SAME with
no data buffer.
Link: https://lore.kernel.org/r/20220628022325.14627-2-michael.christie@oracle.com
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Dmitry Bogdanov <d.bogdanov@yadro.com>
Date: Tue Dec 24 13:17:57 2024 +0300
scsi: target: iscsi: Fix timeout on deleted connection
[ Upstream commit 7f533cc5ee4c4436cee51dc58e81dfd9c3384418 ]
NOPIN response timer may expire on a deleted connection and crash with
such logs:
Did not receive response to NOPIN on CID: 0, failing connection for I_T Nexus (null),i,0x00023d000125,iqn.2017-01.com.iscsi.target,t,0x3d
BUG: Kernel NULL pointer dereference on read at 0x00000000
NIP strlcpy+0x8/0xb0
LR iscsit_fill_cxn_timeout_err_stats+0x5c/0xc0 [iscsi_target_mod]
Call Trace:
iscsit_handle_nopin_response_timeout+0xfc/0x120 [iscsi_target_mod]
call_timer_fn+0x58/0x1f0
run_timer_softirq+0x740/0x860
__do_softirq+0x16c/0x420
irq_exit+0x188/0x1c0
timer_interrupt+0x184/0x410
That is because nopin response timer may be re-started on nopin timer
expiration.
Stop nopin timer before stopping the nopin response timer to be sure
that no one of them will be re-started.
Signed-off-by: Dmitry Bogdanov <d.bogdanov@yadro.com>
Link: https://lore.kernel.org/r/20241224101757.32300-1-d.bogdanov@yadro.com
Reviewed-by: Maurizio Lombardi <mlombard@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Feng Tang <feng.tang@linux.alibaba.com>
Date: Wed Apr 23 18:36:45 2025 +0800
selftests/mm: compaction_test: support platform with huge mount of memory
commit ab00ddd802f80e31fc9639c652d736fe3913feae upstream.
When running mm selftest to verify mm patches, 'compaction_test' case
failed on an x86 server with 1TB memory. And the root cause is that it
has too much free memory than what the test supports.
The test case tries to allocate 100000 huge pages, which is about 200 GB
for that x86 server, and when it succeeds, it expects it's large than 1/3
of 80% of the free memory in system. This logic only works for platform
with 750 GB ( 200 / (1/3) / 80% ) or less free memory, and may raise false
alarm for others.
Fix it by changing the fixed page number to self-adjustable number
according to the real number of free memory.
Link: https://lkml.kernel.org/r/20250423103645.2758-1-feng.tang@linux.alibaba.com
Fixes: bd67d5c15cc1 ("Test compaction of mlocked memory")
Signed-off-by: Feng Tang <feng.tang@linux.alibaba.com>
Acked-by: Dev Jain <dev.jain@arm.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Tested-by: Baolin Wang <baolin.wang@inux.alibaba.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Sri Jayaramappa <sjayaram@akamai.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Konstantin Andreev <andreev@swemel.ru>
Date: Fri Jan 17 02:40:34 2025 +0300
smack: recognize ipv4 CIPSO w/o categories
[ Upstream commit a158a937d864d0034fea14913c1f09c6d5f574b8 ]
If SMACK label has CIPSO representation w/o categories, e.g.:
| # cat /smack/cipso2
| foo 10
| @ 250/2
| ...
then SMACK does not recognize such CIPSO in input ipv4 packets
and substitues '*' label instead. Audit records may look like
| lsm=SMACK fn=smack_socket_sock_rcv_skb action=denied
| subject="*" object="_" requested=w pid=0 comm="swapper/1" ...
This happens in two steps:
1) security/smack/smackfs.c`smk_set_cipso
does not clear NETLBL_SECATTR_MLS_CAT
from (struct smack_known *)skp->smk_netlabel.flags
on assigning CIPSO w/o categories:
| rcu_assign_pointer(skp->smk_netlabel.attr.mls.cat, ncats.attr.mls.cat);
| skp->smk_netlabel.attr.mls.lvl = ncats.attr.mls.lvl;
2) security/smack/smack_lsm.c`smack_from_secattr
can not match skp->smk_netlabel with input packet's
struct netlbl_lsm_secattr *sap
because sap->flags have not NETLBL_SECATTR_MLS_CAT (what is correct)
but skp->smk_netlabel.flags have (what is incorrect):
| if ((sap->flags & NETLBL_SECATTR_MLS_CAT) == 0) {
| if ((skp->smk_netlabel.flags &
| NETLBL_SECATTR_MLS_CAT) == 0)
| found = 1;
| break;
| }
This commit sets/clears NETLBL_SECATTR_MLS_CAT in
skp->smk_netlabel.flags according to the presense of CIPSO categories.
The update of smk_netlabel is not atomic, so input packets processing
still may be incorrect during short time while update proceeds.
Signed-off-by: Konstantin Andreev <andreev@swemel.ru>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Wang Zhaolong <wangzhaolong1@huawei.com>
Date: Fri May 16 17:12:55 2025 +0800
smb: client: Fix use-after-free in cifs_fill_dirent
commit a7a8fe56e932a36f43e031b398aef92341bf5ea0 upstream.
There is a race condition in the readdir concurrency process, which may
access the rsp buffer after it has been released, triggering the
following KASAN warning.
==================================================================
BUG: KASAN: slab-use-after-free in cifs_fill_dirent+0xb03/0xb60 [cifs]
Read of size 4 at addr ffff8880099b819c by task a.out/342975
CPU: 2 UID: 0 PID: 342975 Comm: a.out Not tainted 6.15.0-rc6+ #240 PREEMPT(full)
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.1-2.fc37 04/01/2014
Call Trace:
<TASK>
dump_stack_lvl+0x53/0x70
print_report+0xce/0x640
kasan_report+0xb8/0xf0
cifs_fill_dirent+0xb03/0xb60 [cifs]
cifs_readdir+0x12cb/0x3190 [cifs]
iterate_dir+0x1a1/0x520
__x64_sys_getdents+0x134/0x220
do_syscall_64+0x4b/0x110
entry_SYSCALL_64_after_hwframe+0x76/0x7e
RIP: 0033:0x7f996f64b9f9
Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89
f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01
f0 ff ff 0d f7 c3 0c 00 f7 d8 64 89 8
RSP: 002b:00007f996f53de78 EFLAGS: 00000207 ORIG_RAX: 000000000000004e
RAX: ffffffffffffffda RBX: 00007f996f53ecdc RCX: 00007f996f64b9f9
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000003
RBP: 00007f996f53dea0 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000207 R12: ffffffffffffff88
R13: 0000000000000000 R14: 00007ffc8cd9a500 R15: 00007f996f51e000
</TASK>
Allocated by task 408:
kasan_save_stack+0x20/0x40
kasan_save_track+0x14/0x30
__kasan_slab_alloc+0x6e/0x70
kmem_cache_alloc_noprof+0x117/0x3d0
mempool_alloc_noprof+0xf2/0x2c0
cifs_buf_get+0x36/0x80 [cifs]
allocate_buffers+0x1d2/0x330 [cifs]
cifs_demultiplex_thread+0x22b/0x2690 [cifs]
kthread+0x394/0x720
ret_from_fork+0x34/0x70
ret_from_fork_asm+0x1a/0x30
Freed by task 342979:
kasan_save_stack+0x20/0x40
kasan_save_track+0x14/0x30
kasan_save_free_info+0x3b/0x60
__kasan_slab_free+0x37/0x50
kmem_cache_free+0x2b8/0x500
cifs_buf_release+0x3c/0x70 [cifs]
cifs_readdir+0x1c97/0x3190 [cifs]
iterate_dir+0x1a1/0x520
__x64_sys_getdents64+0x134/0x220
do_syscall_64+0x4b/0x110
entry_SYSCALL_64_after_hwframe+0x76/0x7e
The buggy address belongs to the object at ffff8880099b8000
which belongs to the cache cifs_request of size 16588
The buggy address is located 412 bytes inside of
freed 16588-byte region [ffff8880099b8000, ffff8880099bc0cc)
The buggy address belongs to the physical page:
page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x99b8
head: order:3 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0
anon flags: 0x80000000000040(head|node=0|zone=1)
page_type: f5(slab)
raw: 0080000000000040 ffff888001e03400 0000000000000000 dead000000000001
raw: 0000000000000000 0000000000010001 00000000f5000000 0000000000000000
head: 0080000000000040 ffff888001e03400 0000000000000000 dead000000000001
head: 0000000000000000 0000000000010001 00000000f5000000 0000000000000000
head: 0080000000000003 ffffea0000266e01 00000000ffffffff 00000000ffffffff
head: ffffffffffffffff 0000000000000000 00000000ffffffff 0000000000000008
page dumped because: kasan: bad access detected
Memory state around the buggy address:
ffff8880099b8080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff8880099b8100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>ffff8880099b8180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^
ffff8880099b8200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff8880099b8280: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================
POC is available in the link [1].
The problem triggering process is as follows:
Process 1 Process 2
-----------------------------------------------------------------
cifs_readdir
/* file->private_data == NULL */
initiate_cifs_search
cifsFile = kzalloc(sizeof(struct cifsFileInfo), GFP_KERNEL);
smb2_query_dir_first ->query_dir_first()
SMB2_query_directory
SMB2_query_directory_init
cifs_send_recv
smb2_parse_query_directory
srch_inf->ntwrk_buf_start = (char *)rsp;
srch_inf->srch_entries_start = (char *)rsp + ...
srch_inf->last_entry = (char *)rsp + ...
srch_inf->smallBuf = true;
find_cifs_entry
/* if (cfile->srch_inf.ntwrk_buf_start) */
cifs_small_buf_release(cfile->srch_inf // free
cifs_readdir ->iterate_shared()
/* file->private_data != NULL */
find_cifs_entry
/* in while (...) loop */
smb2_query_dir_next ->query_dir_next()
SMB2_query_directory
SMB2_query_directory_init
cifs_send_recv
compound_send_recv
smb_send_rqst
__smb_send_rqst
rc = -ERESTARTSYS;
/* if (fatal_signal_pending()) */
goto out;
return rc
/* if (cfile->srch_inf.last_entry) */
cifs_save_resume_key()
cifs_fill_dirent // UAF
/* if (rc) */
return -ENOENT;
Fix this by ensuring the return code is checked before using pointers
from the srch_inf.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=220131 [1]
Fixes: a364bc0b37f1 ("[CIFS] fix saving of resume key before CIFSFindNext")
Cc: stable@vger.kernel.org
Reviewed-by: Paulo Alcantara (Red Hat) <pc@manguebit.com>
Signed-off-by: Wang Zhaolong <wangzhaolong1@huawei.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Wang Zhaolong <wangzhaolong1@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Wang Zhaolong <wangzhaolong1@huawei.com>
Date: Fri May 16 17:12:56 2025 +0800
smb: client: Reset all search buffer pointers when releasing buffer
commit e48f9d849bfdec276eebf782a84fd4dfbe1c14c0 upstream.
Multiple pointers in struct cifs_search_info (ntwrk_buf_start,
srch_entries_start, and last_entry) point to the same allocated buffer.
However, when freeing this buffer, only ntwrk_buf_start was set to NULL,
while the other pointers remained pointing to freed memory.
This is defensive programming to prevent potential issues with stale
pointers. While the active UAF vulnerability is fixed by the previous
patch, this change ensures consistent pointer state and more robust error
handling.
Signed-off-by: Wang Zhaolong <wangzhaolong1@huawei.com>
Cc: stable@vger.kernel.org
Reviewed-by: Paulo Alcantara (Red Hat) <pc@manguebit.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Wang Zhaolong <wangzhaolong1@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Andrew Davis <afd@ti.com>
Date: Thu Jan 23 12:17:26 2025 -0600
soc: ti: k3-socinfo: Do not use syscon helper to build regmap
[ Upstream commit a5caf03188e44388e8c618dcbe5fffad1a249385 ]
The syscon helper device_node_to_regmap() is used to fetch a regmap
registered to a device node. It also currently creates this regmap
if the node did not already have a regmap associated with it. This
should only be used on "syscon" nodes. This driver is not such a
device and instead uses device_node_to_regmap() on its own node as
a hacky way to create a regmap for itself.
This will not work going forward and so we should create our regmap
the normal way by defining our regmap_config, fetching our memory
resource, then using the normal regmap_init_mmio() function.
Signed-off-by: Andrew Davis <afd@ti.com>
Link: https://lore.kernel.org/r/20250123181726.597144-1-afd@ti.com
Signed-off-by: Nishanth Menon <nm@ti.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Geert Uytterhoeven <geert+renesas@glider.be>
Date: Fri May 2 13:10:35 2025 +0200
spi: loopback-test: Do not split 1024-byte hexdumps
[ Upstream commit a73fa3690a1f3014d6677e368dce4e70767a6ba2 ]
spi_test_print_hex_dump() prints buffers holding less than 1024 bytes in
full. Larger buffers are truncated: only the first 512 and the last 512
bytes are printed, separated by a truncation message. The latter is
confusing in case the buffer holds exactly 1024 bytes, as all data is
printed anyway.
Fix this by printing buffers holding up to and including 1024 bytes in
full.
Fixes: 84e0c4e5e2c4ef42 ("spi: add loopback test driver to allow for spi_master regression tests")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://patch.msgid.link/37ee1bc90c6554c9347040adabf04188c8f704aa.1746184171.git.geert+renesas@glider.be
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Bogdan-Gabriel Roman <bogdan-gabriel.roman@nxp.com>
Date: Thu May 22 15:51:31 2025 +0100
spi: spi-fsl-dspi: Halt the module after a new message transfer
[ Upstream commit 8a30a6d35a11ff5ccdede7d6740765685385a917 ]
The XSPI mode implementation in this driver still uses the EOQ flag to
signal the last word in a transmission and deassert the PCS signal.
However, at speeds lower than ~200kHZ, the PCS signal seems to remain
asserted even when SR[EOQF] = 1 indicates the end of a transmission.
This is a problem for target devices which require the deassertation of
the PCS signal between transfers.
Hence, this commit 'forces' the deassertation of the PCS by stopping the
module through MCR[HALT] after completing a new transfer. According to
the reference manual, the module stops or transitions from the Running
state to the Stopped state after the current frame, when any one of the
following conditions exist:
- The value of SR[EOQF] = 1.
- The chip is in Debug mode and the value of MCR[FRZ] = 1.
- The value of MCR[HALT] = 1.
This shouldn't be done if the last transfer in the message has cs_change
set.
Fixes: ea93ed4c181b ("spi: spi-fsl-dspi: Use EOQ for last word in buffer even for XSPI mode")
Signed-off-by: Bogdan-Gabriel Roman <bogdan-gabriel.roman@nxp.com>
Signed-off-by: Larisa Grigore <larisa.grigore@nxp.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Link: https://patch.msgid.link/20250522-james-nxp-spi-v2-2-bea884630cfb@linaro.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Larisa Grigore <larisa.grigore@nxp.com>
Date: Thu May 22 15:51:32 2025 +0100
spi: spi-fsl-dspi: Reset SR flags before sending a new message
[ Upstream commit 7aba292eb15389073c7f3bd7847e3862dfdf604d ]
If, in a previous transfer, the controller sends more data than expected
by the DSPI target, SR.RFDF (RX FIFO is not empty) will remain asserted.
When flushing the FIFOs at the beginning of a new transfer (writing 1
into MCR.CLR_TXF and MCR.CLR_RXF), SR.RFDF should also be cleared.
Otherwise, when running in target mode with DMA, if SR.RFDF remains
asserted, the DMA callback will be fired before the controller sends any
data.
Take this opportunity to reset all Status Register fields.
Fixes: 5ce3cc567471 ("spi: spi-fsl-dspi: Provide support for DSPI slave mode operation (Vybryd vf610)")
Signed-off-by: Larisa Grigore <larisa.grigore@nxp.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Link: https://patch.msgid.link/20250522-james-nxp-spi-v2-3-bea884630cfb@linaro.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Larisa Grigore <larisa.grigore@nxp.com>
Date: Thu May 22 15:51:30 2025 +0100
spi: spi-fsl-dspi: restrict register range for regmap access
[ Upstream commit 283ae0c65e9c592f4a1ba4f31917f5e766da7f31 ]
DSPI registers are NOT continuous, some registers are reserved and
accessing them from userspace will trigger external abort, add regmap
register access table to avoid below abort.
For example on S32G:
# cat /sys/kernel/debug/regmap/401d8000.spi/registers
Internal error: synchronous external abort: 96000210 1 PREEMPT SMP
...
Call trace:
regmap_mmio_read32le+0x24/0x48
regmap_mmio_read+0x48/0x70
_regmap_bus_reg_read+0x38/0x48
_regmap_read+0x68/0x1b0
regmap_read+0x50/0x78
regmap_read_debugfs+0x120/0x338
Fixes: 1acbdeb92c87 ("spi/fsl-dspi: Convert to use regmap and add big-endian support")
Co-developed-by: Xulin Sun <xulin.sun@windriver.com>
Signed-off-by: Xulin Sun <xulin.sun@windriver.com>
Signed-off-by: Larisa Grigore <larisa.grigore@nxp.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Link: https://patch.msgid.link/20250522-james-nxp-spi-v2-1-bea884630cfb@linaro.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Alessandro Grassi <alessandro.grassi@mailbox.org>
Date: Fri May 2 11:55:20 2025 +0200
spi: spi-sun4i: fix early activation
[ Upstream commit fb98bd0a13de2c9d96cb5c00c81b5ca118ac9d71 ]
The SPI interface is activated before the CPOL setting is applied. In
that moment, the clock idles high and CS goes low. After a short delay,
CPOL and other settings are applied, which may cause the clock to change
state and idle low. This transition is not part of a clock cycle, and it
can confuse the receiving device.
To prevent this unexpected transition, activate the interface while CPOL
and the other settings are being applied.
Signed-off-by: Alessandro Grassi <alessandro.grassi@mailbox.org>
Link: https://patch.msgid.link/20250502095520.13825-1-alessandro.grassi@mailbox.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Sean Anderson <sean.anderson@linux.dev>
Date: Thu Jan 16 17:41:30 2025 -0500
spi: zynqmp-gqspi: Always acknowledge interrupts
[ Upstream commit 89785306453ce6d949e783f6936821a0b7649ee2 ]
RXEMPTY can cause an IRQ, even though we may not do anything about it
(such as if we are waiting for more received data). We must still handle
these IRQs because we can tell they were caused by the device.
Signed-off-by: Sean Anderson <sean.anderson@linux.dev>
Link: https://patch.msgid.link/20250116224130.2684544-6-sean.anderson@linux.dev
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Gabriel Shahrouzi <gshahrouzi@gmail.com>
Date: Fri Apr 18 21:29:37 2025 -0400
staging: axis-fifo: Correct handling of tx_fifo_depth for size validation
commit 2ca34b508774aaa590fc3698a54204706ecca4ba upstream.
Remove erroneous subtraction of 4 from the total FIFO depth read from
device tree. The stored depth is for checking against total capacity,
not initial vacancy. This prevented writes near the FIFO's full size.
The check performed just before data transfer, which uses live reads of
the TDFV register to determine current vacancy, correctly handles the
initial Depth - 4 hardware state and subsequent FIFO fullness.
Fixes: 4a965c5f89de ("staging: add driver for Xilinx AXI-Stream FIFO v4.1 IP core")
Cc: stable@vger.kernel.org
Signed-off-by: Gabriel Shahrouzi <gshahrouzi@gmail.com>
Link: https://lore.kernel.org/r/20250419012937.674924-1-gshahrouzi@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Gabriel Shahrouzi <gshahrouzi@gmail.com>
Date: Fri Apr 18 20:43:06 2025 -0400
staging: axis-fifo: Remove hardware resets for user errors
commit c6e8d85fafa7193613db37da29c0e8d6e2515b13 upstream.
The axis-fifo driver performs a full hardware reset (via
reset_ip_core()) in several error paths within the read and write
functions. This reset flushes both TX and RX FIFOs and resets the
AXI-Stream links.
Allow the user to handle the error without causing hardware disruption
or data loss in other FIFO paths.
Fixes: 4a965c5f89de ("staging: add driver for Xilinx AXI-Stream FIFO v4.1 IP core")
Cc: stable@vger.kernel.org
Signed-off-by: Gabriel Shahrouzi <gshahrouzi@gmail.com>
Link: https://lore.kernel.org/r/20250419004306.669605-1-gshahrouzi@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Gabriel Shahrouzi <gshahrouzi@gmail.com>
Date: Mon Apr 14 11:40:49 2025 -0400
staging: iio: adc: ad7816: Correct conditional logic for store mode
commit 2e922956277187655ed9bedf7b5c28906e51708f upstream.
The mode setting logic in ad7816_store_mode was reversed due to
incorrect handling of the strcmp return value. strcmp returns 0 on
match, so the `if (strcmp(buf, "full"))` block executed when the
input was not "full".
This resulted in "full" setting the mode to AD7816_PD (power-down) and
other inputs setting it to AD7816_FULL.
Fix this by checking it against 0 to correctly check for "full" and
"power-down", mapping them to AD7816_FULL and AD7816_PD respectively.
Fixes: 7924425db04a ("staging: iio: adc: new driver for AD7816 devices")
Cc: stable@vger.kernel.org
Signed-off-by: Gabriel Shahrouzi <gshahrouzi@gmail.com>
Acked-by: Nuno Sá <nuno.sa@analog.com>
Link: https://lore.kernel.org/stable/20250414152920.467505-1-gshahrouzi%40gmail.com
Link: https://patch.msgid.link/20250414154050.469482-1-gshahrouzi@gmail.com
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Trond Myklebust <trond.myklebust@hammerspace.com>
Date: Mon Mar 24 19:35:01 2025 -0400
SUNRPC: rpc_clnt_set_transport() must not change the autobind setting
[ Upstream commit bf9be373b830a3e48117da5d89bb6145a575f880 ]
The autobind setting was supposed to be determined in rpc_create(),
since commit c2866763b402 ("SUNRPC: use sockaddr + size when creating
remote transport endpoints").
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Trond Myklebust <trond.myklebust@hammerspace.com>
Date: Mon Mar 24 19:05:48 2025 -0400
SUNRPC: rpcbind should never reset the port to the value '0'
[ Upstream commit 214c13e380ad7636631279f426387f9c4e3c14d9 ]
If we already had a valid port number for the RPC service, then we
should not allow the rpcbind client to set it to the invalid value '0'.
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Eric Dumazet <edumazet@google.com>
Date: Wed Mar 5 13:05:50 2025 +0000
tcp: bring back NUMA dispersion in inet_ehash_locks_alloc()
[ Upstream commit f8ece40786c9342249aa0a1b55e148ee23b2a746 ]
We have platforms with 6 NUMA nodes and 480 cpus.
inet_ehash_locks_alloc() currently allocates a single 64KB page
to hold all ehash spinlocks. This adds more pressure on a single node.
Change inet_ehash_locks_alloc() to use vmalloc() to spread
the spinlocks on all online nodes, driven by NUMA policies.
At boot time, NUMA policy is interleave=all, meaning that
tcp_hashinfo.ehash_locks gets hash dispersion on all nodes.
Tested:
lack5:~# grep inet_ehash_locks_alloc /proc/vmallocinfo
0x00000000d9aec4d1-0x00000000a828b652 69632 inet_ehash_locks_alloc+0x90/0x100 pages=16 vmalloc N0=2 N1=3 N2=3 N3=3 N4=3 N5=2
lack5:~# echo 8192 >/proc/sys/net/ipv4/tcp_child_ehash_entries
lack5:~# numactl --interleave=all unshare -n bash -c "grep inet_ehash_locks_alloc /proc/vmallocinfo"
0x000000004e99d30c-0x00000000763f3279 36864 inet_ehash_locks_alloc+0x90/0x100 pages=8 vmalloc N0=1 N1=2 N2=2 N3=1 N4=1 N5=1
0x00000000d9aec4d1-0x00000000a828b652 69632 inet_ehash_locks_alloc+0x90/0x100 pages=16 vmalloc N0=2 N1=3 N2=3 N3=3 N4=3 N5=2
lack5:~# numactl --interleave=0,5 unshare -n bash -c "grep inet_ehash_locks_alloc /proc/vmallocinfo"
0x00000000fd73a33e-0x0000000004b9a177 36864 inet_ehash_locks_alloc+0x90/0x100 pages=8 vmalloc N0=4 N5=4
0x00000000d9aec4d1-0x00000000a828b652 69632 inet_ehash_locks_alloc+0x90/0x100 pages=16 vmalloc N0=2 N1=3 N2=3 N3=3 N4=3 N5=2
lack5:~# echo 1024 >/proc/sys/net/ipv4/tcp_child_ehash_entries
lack5:~# numactl --interleave=all unshare -n bash -c "grep inet_ehash_locks_alloc /proc/vmallocinfo"
0x00000000db07d7a2-0x00000000ad697d29 8192 inet_ehash_locks_alloc+0x90/0x100 pages=1 vmalloc N2=1
0x00000000d9aec4d1-0x00000000a828b652 69632 inet_ehash_locks_alloc+0x90/0x100 pages=16 vmalloc N0=2 N1=3 N2=3 N3=3 N4=3 N5=2
Signed-off-by: Eric Dumazet <edumazet@google.com>
Tested-by: Jason Xing <kerneljasonxing@gmail.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250305130550.1865988-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ilpo Järvinen <ij@kernel.org>
Date: Wed Mar 5 23:38:41 2025 +0100
tcp: reorganize tcp_in_ack_event() and tcp_count_delivered()
[ Upstream commit 149dfb31615e22271d2525f078c95ea49bc4db24 ]
- Move tcp_count_delivered() earlier and split tcp_count_delivered_ce()
out of it
- Move tcp_in_ack_event() later
- While at it, remove the inline from tcp_in_ack_event() and let
the compiler to decide
Accurate ECN's heuristics does not know if there is going
to be ACE field based CE counter increase or not until after
rtx queue has been processed. Only then the number of ACKed
bytes/pkts is available. As CE or not affects presence of
FLAG_ECE, that information for tcp_in_ack_event is not yet
available in the old location of the call to tcp_in_ack_event().
Signed-off-by: Ilpo Järvinen <ij@kernel.org>
Signed-off-by: Chia-Yu Chang <chia-yu.chang@nokia-bell-labs.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Alice Guo <alice.guo@nxp.com>
Date: Mon Dec 9 11:48:59 2024 -0500
thermal/drivers/qoriq: Power down TMU on system suspend
[ Upstream commit 229f3feb4b0442835b27d519679168bea2de96c2 ]
Enable power-down of TMU (Thermal Management Unit) for TMU version 2 during
system suspend to save power. Save approximately 4.3mW on VDD_ANA_1P8 on
i.MX93 platforms.
Signed-off-by: Alice Guo <alice.guo@nxp.com>
Signed-off-by: Frank Li <Frank.Li@nxp.com>
Link: https://lore.kernel.org/r/20241209164859.3758906-2-Frank.Li@nxp.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ian Rogers <irogers@google.com>
Date: Tue Mar 11 14:36:23 2025 -0700
tools/build: Don't pass test log files to linker
[ Upstream commit 935e7cb5bb80106ff4f2fe39640f430134ef8cd8 ]
Separate test log files from object files. Depend on test log output
but don't pass to the linker.
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250311213628.569562-2-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Michal Suchanek <msuchanek@suse.de>
Date: Fri Apr 4 10:23:14 2025 +0200
tpm: tis: Double the timeout B to 4s
[ Upstream commit 2f661f71fda1fc0c42b7746ca5b7da529eb6b5be ]
With some Infineon chips the timeouts in tpm_tis_send_data (both B and
C) can reach up to about 2250 ms.
Timeout C is retried since
commit de9e33df7762 ("tpm, tpm_tis: Workaround failed command reception on Infineon devices")
Timeout B still needs to be extended.
The problem is most commonly encountered with context related operation
such as load context/save context. These are issued directly by the
kernel, and there is no retry logic for them.
When a filesystem is set up to use the TPM for unlocking the boot fails,
and restarting the userspace service is ineffective. This is likely
because ignoring a load context/save context result puts the real TPM
state and the TPM state expected by the kernel out of sync.
Chips known to be affected:
tpm_tis IFX1522:00: 2.0 TPM (device-id 0x1D, rev-id 54)
Description: SLB9672
Firmware Revision: 15.22
tpm_tis MSFT0101:00: 2.0 TPM (device-id 0x1B, rev-id 22)
Firmware Revision: 7.83
tpm_tis MSFT0101:00: 2.0 TPM (device-id 0x1A, rev-id 16)
Firmware Revision: 5.63
Link: https://lore.kernel.org/linux-integrity/Z5pI07m0Muapyu9w@kitsune.suse.cz/
Signed-off-by: Michal Suchanek <msuchanek@suse.de>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jeongjun Park <aha310510@gmail.com>
Date: Tue Apr 22 20:30:25 2025 +0900
tracing: Fix oob write in trace_seq_to_buffer()
commit f5178c41bb43444a6008150fe6094497135d07cb upstream.
syzbot reported this bug:
==================================================================
BUG: KASAN: slab-out-of-bounds in trace_seq_to_buffer kernel/trace/trace.c:1830 [inline]
BUG: KASAN: slab-out-of-bounds in tracing_splice_read_pipe+0x6be/0xdd0 kernel/trace/trace.c:6822
Write of size 4507 at addr ffff888032b6b000 by task syz.2.320/7260
CPU: 1 UID: 0 PID: 7260 Comm: syz.2.320 Not tainted 6.15.0-rc1-syzkaller-00301-g3bde70a2c827 #0 PREEMPT(full)
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/12/2025
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:94 [inline]
dump_stack_lvl+0x116/0x1f0 lib/dump_stack.c:120
print_address_description mm/kasan/report.c:408 [inline]
print_report+0xc3/0x670 mm/kasan/report.c:521
kasan_report+0xe0/0x110 mm/kasan/report.c:634
check_region_inline mm/kasan/generic.c:183 [inline]
kasan_check_range+0xef/0x1a0 mm/kasan/generic.c:189
__asan_memcpy+0x3c/0x60 mm/kasan/shadow.c:106
trace_seq_to_buffer kernel/trace/trace.c:1830 [inline]
tracing_splice_read_pipe+0x6be/0xdd0 kernel/trace/trace.c:6822
....
==================================================================
It has been reported that trace_seq_to_buffer() tries to copy more data
than PAGE_SIZE to buf. Therefore, to prevent this, we should use the
smaller of trace_seq_used(&iter->seq) and PAGE_SIZE as an argument.
Link: https://lore.kernel.org/20250422113026.13308-1-aha310510@gmail.com
Reported-by: syzbot+c8cd2d2c412b868263fb@syzkaller.appspotmail.com
Fixes: 3c56819b14b0 ("tracing: splice support for tracing_pipe")
Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Jeongjun Park <aha310510@gmail.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Steven Rostedt <rostedt@goodmis.org>
Date: Fri May 9 15:26:57 2025 -0400
tracing: samples: Initialize trace_array_printk() with the correct function
commit 1b0c192c92ea1fe2dcb178f84adf15fe37c3e7c8 upstream.
When using trace_array_printk() on a created instance, the correct
function to use to initialize it is:
trace_array_init_printk()
Not
trace_printk_init_buffer()
The former is a proper function to use, the latter is for initializing
trace_printk() and causes the NOTICE banner to be displayed.
Cc: stable@vger.kernel.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Divya Indi <divya.indi@oracle.com>
Link: https://lore.kernel.org/20250509152657.0f6744d9@gandalf.local.home
Fixes: 89ed42495ef4a ("tracing: Sample module to demonstrate kernel access to Ftrace instances.")
Fixes: 38ce2a9e33db6 ("tracing: Add trace_array_init_printk() to initialize instance trace_printk() buffers")
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Date: Tue Sep 3 20:59:04 2024 +0300
types: Complement the aligned types with signed 64-bit one
[ Upstream commit e4ca0e59c39442546866f3dd514a3a5956577daf ]
Some user may want to use aligned signed 64-bit type.
Provide it for them.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://patch.msgid.link/20240903180218.3640501-2-andriy.shevchenko@linux.intel.com
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Stable-dep-of: 5097eaae98e5 ("iio: adc: dln2: Use aligned_s64 for timestamp")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Masahiro Yamada <masahiroy@kernel.org>
Date: Wed May 7 16:49:33 2025 +0900
um: let 'make clean' properly clean underlying SUBARCH as well
[ Upstream commit ab09da75700e9d25c7dfbc7f7934920beb5e39b9 ]
Building the kernel with O= is affected by stale in-tree build artifacts.
So, if the source tree is not clean, Kbuild displays the following:
$ make ARCH=um O=build defconfig
make[1]: Entering directory '/.../linux/build'
***
*** The source tree is not clean, please run 'make ARCH=um mrproper'
*** in /.../linux
***
make[2]: *** [/.../linux/Makefile:673: outputmakefile] Error 1
make[1]: *** [/.../linux/Makefile:248: __sub-make] Error 2
make[1]: Leaving directory '/.../linux/build'
make: *** [Makefile:248: __sub-make] Error 2
Usually, running 'make mrproper' is sufficient for cleaning the source
tree for out-of-tree builds.
However, building UML generates build artifacts not only in arch/um/,
but also in the SUBARCH directory (i.e., arch/x86/). If in-tree stale
files remain under arch/x86/, Kbuild will reuse them instead of creating
new ones under the specified build directory.
This commit makes 'make ARCH=um clean' recurse into the SUBARCH directory.
Reported-by: Shuah Khan <skhan@linuxfoundation.org>
Closes: https://lore.kernel.org/lkml/20250502172459.14175-1-skhan@linuxfoundation.org/
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Reviewed-by: David Gow <davidgow@google.com>
Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Benjamin Berg <benjamin@sipsolutions.net>
Date: Mon Feb 24 19:18:19 2025 +0100
um: Store full CSGSFS and SS register from mcontext
[ Upstream commit cef721e0d53d2b64f2ba177c63a0dfdd7c0daf17 ]
Doing this allows using registers as retrieved from an mcontext to be
pushed to a process using PTRACE_SETREGS.
It is not entirely clear to me why CSGSFS was masked. Doing so creates
issues when using the mcontext as process state in seccomp and simply
copying the register appears to work perfectly fine for ptrace.
Signed-off-by: Benjamin Berg <benjamin@sipsolutions.net>
Link: https://patch.msgid.link/20250224181827.647129-2-benjamin@sipsolutions.net
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Tiwei Bie <tiwei.btw@antgroup.com>
Date: Fri Feb 21 12:18:55 2025 +0800
um: Update min_low_pfn to match changes in uml_reserved
[ Upstream commit e82cf3051e6193f61e03898f8dba035199064d36 ]
When uml_reserved is updated, min_low_pfn must also be updated
accordingly. Otherwise, min_low_pfn will not accurately reflect
the lowest available PFN.
Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com>
Link: https://patch.msgid.link/20250221041855.1156109-1-tiwei.btw@antgroup.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Fedor Pchelkin <pchelkin@ispras.ru>
Date: Sun Mar 16 13:26:56 2025 +0300
usb: chipidea: ci_hdrc_imx: implement usb_phy_init() error handling
[ Upstream commit 8c531e0a8c2d82509ad97c6d3a1e6217c7ed136d ]
usb_phy_init() may return an error code if e.g. its implementation fails
to prepare/enable some clocks. And properly rollback on probe error path
by calling the counterpart usb_phy_shutdown().
Found by Linux Verification Center (linuxtesting.org).
Fixes: be9cae2479f4 ("usb: chipidea: imx: Fix ULPI on imx53")
Cc: stable <stable@kernel.org>
Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru>
Acked-by: Peter Chen <peter.chen@kernel.org>
Link: https://lore.kernel.org/r/20250316102658.490340-4-pchelkin@ispras.ru
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Alexander Stein <alexander.stein@ew.tq-group.com>
Date: Tue Jun 14 14:05:22 2022 +0200
usb: chipidea: ci_hdrc_imx: use dev_err_probe()
[ Upstream commit 18171cfc3c236a1587dcad9adc27c6e781af4438 ]
Use dev_err_probe() to simplify handling errors in ci_hdrc_imx_probe()
Acked-by: Peter Chen <peter.chen@kernel.org>
Signed-off-by: Alexander Stein <alexander.stein@ew.tq-group.com>
Link: https://lore.kernel.org/r/20220614120522.1469957-1-alexander.stein@ew.tq-group.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Stable-dep-of: 8c531e0a8c2d ("usb: chipidea: ci_hdrc_imx: implement usb_phy_init() error handling")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Wayne Chang <waynec@nvidia.com>
Date: Fri Apr 18 16:12:28 2025 +0800
usb: gadget: tegra-xudc: ACK ST_RC after clearing CTRL_RUN
commit 59820fde001500c167342257650541280c622b73 upstream.
We identified a bug where the ST_RC bit in the status register was not
being acknowledged after clearing the CTRL_RUN bit in the control
register. This could lead to unexpected behavior in the USB gadget
drivers.
This patch resolves the issue by adding the necessary code to explicitly
acknowledge ST_RC after clearing CTRL_RUN based on the programming
sequence, ensuring proper state transition.
Fixes: 49db427232fe ("usb: gadget: Add UDC driver for tegra XUSB device mode controller")
Cc: stable <stable@kernel.org>
Signed-off-by: Wayne Chang <waynec@nvidia.com>
Link: https://lore.kernel.org/r/20250418081228.1194779-1-waynec@nvidia.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Jim Lin <jilin@nvidia.com>
Date: Tue Apr 22 19:40:01 2025 +0800
usb: host: tegra: Prevent host controller crash when OTG port is used
commit 732f35cf8bdfece582f6e4a9c659119036577308 upstream.
When a USB device is connected to the OTG port, the tegra_xhci_id_work()
routine transitions the PHY to host mode and calls xhci_hub_control()
with the SetPortFeature command to enable port power.
In certain cases, the XHCI controller may be in a low-power state
when this operation occurs. If xhci_hub_control() is invoked while
the controller is suspended, the PORTSC register may return 0xFFFFFFFF,
indicating a read failure. This causes xhci_hc_died() to be triggered,
leading to host controller shutdown.
Example backtrace:
[ 105.445736] Workqueue: events tegra_xhci_id_work
[ 105.445747] dump_backtrace+0x0/0x1e8
[ 105.445759] xhci_hc_died.part.48+0x40/0x270
[ 105.445769] tegra_xhci_set_port_power+0xc0/0x240
[ 105.445774] tegra_xhci_id_work+0x130/0x240
To prevent this, ensure the controller is fully resumed before
interacting with hardware registers by calling pm_runtime_get_sync()
prior to the host mode transition and xhci_hub_control().
Fixes: f836e7843036 ("usb: xhci-tegra: Add OTG support")
Cc: stable <stable@kernel.org>
Signed-off-by: Jim Lin <jilin@nvidia.com>
Signed-off-by: Wayne Chang <waynec@nvidia.com>
Link: https://lore.kernel.org/r/20250422114001.126367-1-waynec@nvidia.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: RD Babiera <rdbabiera@google.com>
Date: Thu Feb 29 00:11:02 2024 +0000
usb: typec: altmodes/displayport: create sysfs nodes as driver's default device attribute group
commit 165376f6b23e9a779850e750fb2eb06622e5a531 upstream.
The DisplayPort driver's sysfs nodes may be present to the userspace before
typec_altmode_set_drvdata() completes in dp_altmode_probe. This means that
a sysfs read can trigger a NULL pointer error by deferencing dp->hpd in
hpd_show or dp->lock in pin_assignment_show, as dev_get_drvdata() returns
NULL in those cases.
Remove manual sysfs node creation in favor of adding attribute group as
default for devices bound to the driver. The ATTRIBUTE_GROUPS() macro is
not used here otherwise the path to the sysfs nodes is no longer compliant
with the ABI.
Fixes: 0e3bb7d6894d ("usb: typec: Add driver for DisplayPort alternate mode")
Cc: stable@vger.kernel.org
Signed-off-by: RD Babiera <rdbabiera@google.com>
Link: https://lore.kernel.org/r/20240229001101.3889432-2-rdbabiera@google.com
[Minor conflict resolved due to code context change.]
Signed-off-by: Jianqi Ren <jianqi.ren.cn@windriver.com>
Signed-off-by: He Zhe <zhe.he@windriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: GONG Ruiqi <gongruiqi1@huawei.com>
Date: Tue Jan 7 09:57:50 2025 +0800
usb: typec: fix pm usage counter imbalance in ucsi_ccg_sync_control()
commit b0e525d7a22ea350e75e2aec22e47fcfafa4cacd upstream.
The error handling for the case `con_index == 0` should involve dropping
the pm usage counter, as ucsi_ccg_sync_control() gets it at the
beginning. Fix it.
Cc: stable <stable@kernel.org>
Fixes: e56aac6e5a25 ("usb: typec: fix potential array underflow in ucsi_ccg_sync_control()")
Signed-off-by: GONG Ruiqi <gongruiqi1@huawei.com>
Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Link: https://lore.kernel.org/r/20250107015750.2778646-1-gongruiqi1@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[Minor context change fixed.]
Signed-off-by: Bin Lan <bin.lan.cn@windriver.com>
Signed-off-by: He Zhe <zhe.he@windriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Dan Carpenter <dan.carpenter@linaro.org>
Date: Mon Nov 11 14:08:06 2024 +0300
usb: typec: fix potential array underflow in ucsi_ccg_sync_control()
commit e56aac6e5a25630645607b6856d4b2a17b2311a5 upstream.
The "command" variable can be controlled by the user via debugfs. The
worry is that if con_index is zero then "&uc->ucsi->connector[con_index
- 1]" would be an array underflow.
Fixes: 170a6726d0e2 ("usb: typec: ucsi: add support for separate DP altmode devices")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Link: https://lore.kernel.org/r/c69ef0b3-61b0-4dde-98dd-97b97f81d912@stanley.mountain
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ The function ucsi_ccg_sync_write() is renamed to ucsi_ccg_sync_control()
in commit 13f2ec3115c8 ("usb: typec: ucsi:simplify command sending API").
Apply this patch to ucsi_ccg_sync_write() in 6.1.y accordingly. ]
Signed-off-by: Bin Lan <bin.lan.cn@windriver.com>
Signed-off-by: He Zhe <zhe.he@windriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: RD Babiera <rdbabiera@google.com>
Date: Tue Apr 29 23:47:01 2025 +0000
usb: typec: tcpm: delay SNK_TRY_WAIT_DEBOUNCE to SRC_TRYWAIT transition
commit e918d3959b5ae0e793b8f815ce62240e10ba03a4 upstream.
This patch fixes Type-C Compliance Test TD 4.7.6 - Try.SNK DRP Connect
SNKAS.
The compliance tester moves into SNK_UNATTACHED during toggling and
expects the PUT to apply Rp after tPDDebounce of detection. If the port
is in SNK_TRY_WAIT_DEBOUNCE, it will move into SRC_TRYWAIT immediately
and apply Rp. This violates TD 4.7.5.V.3, where the tester confirms that
the PUT attaches Rp after the transitions to Unattached.SNK for
tPDDebounce.
Change the tcpm_set_state delay between SNK_TRY_WAIT_DEBOUNCE and
SRC_TRYWAIT to tPDDebounce.
Fixes: a0a3e04e6b2c ("staging: typec: tcpm: Check for Rp for tPDDebounce")
Cc: stable <stable@kernel.org>
Signed-off-by: RD Babiera <rdbabiera@google.com>
Reviewed-by: Badhri Jagan Sridharan <badhri@google.com>
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Link: https://lore.kernel.org/r/20250429234703.3748506-2-rdbabiera@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Andrei Kuchynski <akuchynski@chromium.org>
Date: Thu Apr 24 08:44:29 2025 +0000
usb: typec: ucsi: displayport: Fix NULL pointer access
commit 312d79669e71283d05c05cc49a1a31e59e3d9e0e upstream.
This patch ensures that the UCSI driver waits for all pending tasks in the
ucsi_displayport_work workqueue to finish executing before proceeding with
the partner removal.
Cc: stable <stable@kernel.org>
Fixes: af8622f6a585 ("usb: typec: ucsi: Support for DisplayPort alt mode")
Signed-off-by: Andrei Kuchynski <akuchynski@chromium.org>
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Reviewed-by: Benson Leung <bleung@chromium.org>
Link: https://lore.kernel.org/r/20250424084429.3220757-3-akuchynski@chromium.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Alexey Charkov <alchark@gmail.com>
Date: Fri Apr 25 18:11:11 2025 +0400
usb: uhci-platform: Make the clock really optional
commit a5c7973539b010874a37a0e846e62ac6f00553ba upstream.
Device tree bindings state that the clock is optional for UHCI platform
controllers, and some existing device trees don't provide those - such
as those for VIA/WonderMedia devices.
The driver however fails to probe now if no clock is provided, because
devm_clk_get returns an error pointer in such case.
Switch to devm_clk_get_optional instead, so that it could probe again
on those platforms where no clocks are given.
Cc: stable <stable@kernel.org>
Fixes: 26c502701c52 ("usb: uhci: Add clk support to uhci-platform")
Signed-off-by: Alexey Charkov <alchark@gmail.com>
Link: https://lore.kernel.org/r/20250425-uhci-clock-optional-v1-1-a1d462592f29@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Dave Penkler <dpenkler@gmail.com>
Date: Fri May 2 09:09:41 2025 +0200
usb: usbtmc: Fix erroneous generic_read ioctl return
commit 4e77d3ec7c7c0d9535ccf1138827cb9bb5480b9b upstream.
wait_event_interruptible_timeout returns a long
The return value was being assigned to an int causing an integer overflow
when the remaining jiffies > INT_MAX which resulted in random error
returns.
Use a long return value, converting to the int ioctl return only on error.
Fixes: bb99794a4792 ("usb: usbtmc: Add ioctl for vendor specific read")
Cc: stable@vger.kernel.org
Signed-off-by: Dave Penkler <dpenkler@gmail.com>
Link: https://lore.kernel.org/r/20250502070941.31819-4-dpenkler@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Dave Penkler <dpenkler@gmail.com>
Date: Fri May 2 09:09:39 2025 +0200
usb: usbtmc: Fix erroneous get_stb ioctl error returns
commit cac01bd178d6a2a23727f138d647ce1a0e8a73a1 upstream.
wait_event_interruptible_timeout returns a long
The return was being assigned to an int causing an integer overflow when
the remaining jiffies > INT_MAX resulting in random error returns.
Use a long return value and convert to int ioctl return only on error.
When the return value of wait_event_interruptible_timeout was <= INT_MAX
the number of remaining jiffies was returned which has no meaning for the
user. Return 0 on success.
Reported-by: Michael Katzmann <vk2bea@gmail.com>
Fixes: dbf3e7f654c0 ("Implement an ioctl to support the USMTMC-USB488 READ_STATUS_BYTE operation.")
Cc: stable@vger.kernel.org
Signed-off-by: Dave Penkler <dpenkler@gmail.com>
Link: https://lore.kernel.org/r/20250502070941.31819-2-dpenkler@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Dave Penkler <dpenkler@gmail.com>
Date: Fri May 2 09:09:40 2025 +0200
usb: usbtmc: Fix erroneous wait_srq ioctl return
commit a9747c9b8b59ab4207effd20eb91a890acb44e16 upstream.
wait_event_interruptible_timeout returns a long
The return was being assigned to an int causing an integer overflow when
the remaining jiffies > INT_MAX resulting in random error returns.
Use a long return value, converting to the int ioctl return only on
error.
Fixes: 739240a9f6ac ("usb: usbtmc: Add ioctl USBTMC488_IOCTL_WAIT_SRQ")
Cc: stable@vger.kernel.org
Signed-off-by: Dave Penkler <dpenkler@gmail.com>
Link: https://lore.kernel.org/r/20250502070941.31819-3-dpenkler@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Oliver Neukum <oneukum@suse.com>
Date: Wed Apr 30 15:48:10 2025 +0200
USB: usbtmc: use interruptible sleep in usbtmc_read
commit 054c5145540e5ad5b80adf23a5e3e2fc281fb8aa upstream.
usbtmc_read() calls usbtmc_generic_read()
which uses interruptible sleep, but usbtmc_read()
itself uses uninterruptble sleep for mutual exclusion
between threads. That makes no sense.
Both should use interruptible sleep.
Fixes: 5b775f672cc99 ("USB: add USB test and measurement class driver")
Cc: stable <stable@kernel.org>
Signed-off-by: Oliver Neukum <oneukum@suse.com>
Link: https://lore.kernel.org/r/20250430134810.226015-1-oneukum@suse.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Ido Schimmel <idosch@nvidia.com>
Date: Tue Feb 4 16:55:42 2025 +0200
vxlan: Annotate FDB data races
[ Upstream commit f6205f8215f12a96518ac9469ff76294ae7bd612 ]
The 'used' and 'updated' fields in the FDB entry structure can be
accessed concurrently by multiple threads, leading to reports such as
[1]. Can be reproduced using [2].
Suppress these reports by annotating these accesses using
READ_ONCE() / WRITE_ONCE().
[1]
BUG: KCSAN: data-race in vxlan_xmit / vxlan_xmit
write to 0xffff942604d263a8 of 8 bytes by task 286 on cpu 0:
vxlan_xmit+0xb29/0x2380
dev_hard_start_xmit+0x84/0x2f0
__dev_queue_xmit+0x45a/0x1650
packet_xmit+0x100/0x150
packet_sendmsg+0x2114/0x2ac0
__sys_sendto+0x318/0x330
__x64_sys_sendto+0x76/0x90
x64_sys_call+0x14e8/0x1c00
do_syscall_64+0x9e/0x1a0
entry_SYSCALL_64_after_hwframe+0x77/0x7f
read to 0xffff942604d263a8 of 8 bytes by task 287 on cpu 2:
vxlan_xmit+0xadf/0x2380
dev_hard_start_xmit+0x84/0x2f0
__dev_queue_xmit+0x45a/0x1650
packet_xmit+0x100/0x150
packet_sendmsg+0x2114/0x2ac0
__sys_sendto+0x318/0x330
__x64_sys_sendto+0x76/0x90
x64_sys_call+0x14e8/0x1c00
do_syscall_64+0x9e/0x1a0
entry_SYSCALL_64_after_hwframe+0x77/0x7f
value changed: 0x00000000fffbac6e -> 0x00000000fffbac6f
Reported by Kernel Concurrency Sanitizer on:
CPU: 2 UID: 0 PID: 287 Comm: mausezahn Not tainted 6.13.0-rc7-01544-gb4b270f11a02 #5
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-3.fc41 04/01/2014
[2]
#!/bin/bash
set +H
echo whitelist > /sys/kernel/debug/kcsan
echo !vxlan_xmit > /sys/kernel/debug/kcsan
ip link add name vx0 up type vxlan id 10010 dstport 4789 local 192.0.2.1
bridge fdb add 00:11:22:33:44:55 dev vx0 self static dst 198.51.100.1
taskset -c 0 mausezahn vx0 -a own -b 00:11:22:33:44:55 -c 0 -q &
taskset -c 2 mausezahn vx0 -a own -b 00:11:22:33:44:55 -c 0 -q &
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://patch.msgid.link/20250204145549.1216254-2-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Wentao Liang <vulab@iscas.ac.cn>
Date: Tue Apr 22 12:22:02 2025 +0800
wifi: brcm80211: fmac: Add error handling for brcmf_usb_dl_writeimage()
commit 8e089e7b585d95122c8122d732d1d5ef8f879396 upstream.
The function brcmf_usb_dl_writeimage() calls the function
brcmf_usb_dl_cmd() but dose not check its return value. The
'state.state' and the 'state.bytes' are uninitialized if the
function brcmf_usb_dl_cmd() fails. It is dangerous to use
uninitialized variables in the conditions.
Add error handling for brcmf_usb_dl_cmd() to jump to error
handling path if the brcmf_usb_dl_cmd() fails and the
'state.state' and the 'state.bytes' are uninitialized.
Improve the error message to report more detailed error
information.
Fixes: 71bb244ba2fd ("brcm80211: fmac: add USB support for bcm43235/6/8 chipsets")
Cc: stable@vger.kernel.org # v3.4+
Signed-off-by: Wentao Liang <vulab@iscas.ac.cn>
Acked-by: Arend van Spriel <arend.vanspriel@broadcom.com>
Link: https://patch.msgid.link/20250422042203.2259-1-vulab@iscas.ac.cn
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Fedor Pchelkin <pchelkin@ispras.ru>
Date: Tue May 6 14:55:39 2025 +0300
wifi: mt76: disable napi on driver removal
commit 78ab4be549533432d97ea8989d2f00b508fa68d8 upstream.
A warning on driver removal started occurring after commit 9dd05df8403b
("net: warn if NAPI instance wasn't shut down"). Disable tx napi before
deleting it in mt76_dma_cleanup().
WARNING: CPU: 4 PID: 18828 at net/core/dev.c:7288 __netif_napi_del_locked+0xf0/0x100
CPU: 4 UID: 0 PID: 18828 Comm: modprobe Not tainted 6.15.0-rc4 #4 PREEMPT(lazy)
Hardware name: ASUS System Product Name/PRIME X670E-PRO WIFI, BIOS 3035 09/05/2024
RIP: 0010:__netif_napi_del_locked+0xf0/0x100
Call Trace:
<TASK>
mt76_dma_cleanup+0x54/0x2f0 [mt76]
mt7921_pci_remove+0xd5/0x190 [mt7921e]
pci_device_remove+0x47/0xc0
device_release_driver_internal+0x19e/0x200
driver_detach+0x48/0x90
bus_remove_driver+0x6d/0xf0
pci_unregister_driver+0x2e/0xb0
__do_sys_delete_module.isra.0+0x197/0x2e0
do_syscall_64+0x7b/0x160
entry_SYSCALL_64_after_hwframe+0x76/0x7e
Tested with mt7921e but the same pattern can be actually applied to other
mt76 drivers calling mt76_dma_cleanup() during removal. Tx napi is enabled
in their *_dma_init() functions and only toggled off and on again inside
their suspend/resume/reset paths. So it should be okay to disable tx
napi in such a generic way.
Found by Linux Verification Center (linuxtesting.org).
Fixes: 2ac515a5d74f ("mt76: mt76x02: use napi polling for tx cleanup")
Cc: stable@vger.kernel.org
Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru>
Tested-by: Ming Yen Hsieh <mingyen.hsieh@mediatek.com>
Link: https://patch.msgid.link/20250506115540.19045-1-pchelkin@ispras.ru
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Bitterblue Smith <rtl8821cerfe2@gmail.com>
Date: Sun Jan 26 16:03:11 2025 +0200
wifi: rtw88: Don't use static local variable in rtw8822b_set_tx_power_index_by_rate
[ Upstream commit 00451eb3bec763f708e7e58326468c1e575e5a66 ]
Some users want to plug two identical USB devices at the same time.
This static variable could theoretically cause them to use incorrect
TX power values.
Move the variable to the caller and pass a pointer to it to
rtw8822b_set_tx_power_index_by_rate().
Signed-off-by: Bitterblue Smith <rtl8821cerfe2@gmail.com>
Acked-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Link: https://patch.msgid.link/8a60f581-0ab5-4d98-a97d-dd83b605008f@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Bitterblue Smith <rtl8821cerfe2@gmail.com>
Date: Tue Feb 4 20:37:36 2025 +0200
wifi: rtw88: Fix download_firmware_validate() for RTL8814AU
[ Upstream commit 9e8243025cc06abc975c876dffda052073207ab3 ]
After the firmware is uploaded, download_firmware_validate() checks some
bits in REG_MCUFW_CTRL to see if everything went okay. The
RTL8814AU power on sequence sets bits 13 and 12 to 2, which this
function does not expect, so it thinks the firmware upload failed.
Make download_firmware_validate() ignore bits 13 and 12.
Signed-off-by: Bitterblue Smith <rtl8821cerfe2@gmail.com>
Acked-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Link: https://patch.msgid.link/049d2887-22fc-47b7-9e59-62627cb525f8@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Bitterblue Smith <rtl8821cerfe2@gmail.com>
Date: Tue Feb 18 01:29:52 2025 +0200
wifi: rtw88: Fix rtw_desc_to_mcsrate() to handle MCS16-31
[ Upstream commit 86d04f8f991a0509e318fe886d5a1cf795736c7d ]
This function translates the rate number reported by the hardware into
something mac80211 can understand. It was ignoring the 3SS and 4SS HT
rates. Translate them too.
Also set *nss to 0 for the HT rates, just to make sure it's
initialised.
Signed-off-by: Bitterblue Smith <rtl8821cerfe2@gmail.com>
Acked-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Link: https://patch.msgid.link/d0a5a86b-4869-47f6-a5a7-01c0f987cc7f@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Bitterblue Smith <rtl8821cerfe2@gmail.com>
Date: Tue Feb 18 01:30:22 2025 +0200
wifi: rtw88: Fix rtw_init_ht_cap() for RTL8814AU
[ Upstream commit c7eea1ba05ca5b0dbf77a27cf2e1e6e2fb3c0043 ]
Set the RX mask and the highest RX rate according to the number of
spatial streams the chip can receive. For RTL8814AU that is 3.
Signed-off-by: Bitterblue Smith <rtl8821cerfe2@gmail.com>
Acked-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Link: https://patch.msgid.link/4e786f50-ed1c-4387-8b28-e6ff00e35e81@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Bitterblue Smith <rtl8821cerfe2@gmail.com>
Date: Tue Feb 18 01:30:48 2025 +0200
wifi: rtw88: Fix rtw_init_vht_cap() for RTL8814AU
[ Upstream commit 6be7544d19fcfcb729495e793bc6181f85bb8949 ]
Set the MCS maps and the highest rates according to the number of
spatial streams the chip has. For RTL8814AU that is 3.
Signed-off-by: Bitterblue Smith <rtl8821cerfe2@gmail.com>
Acked-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Link: https://patch.msgid.link/e86aa009-b5bf-4b3a-8112-ea5e3cd49465@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Breno Leitao <leitao@debian.org>
Date: Thu Oct 31 04:06:17 2024 -0700
x86/bugs: Make spectre user default depend on MITIGATION_SPECTRE_V2
[ Upstream commit 98fdaeb296f51ef08e727a7cc72e5b5c864c4f4d ]
Change the default value of spectre v2 in user mode to respect the
CONFIG_MITIGATION_SPECTRE_V2 config option.
Currently, user mode spectre v2 is set to auto
(SPECTRE_V2_USER_CMD_AUTO) by default, even if
CONFIG_MITIGATION_SPECTRE_V2 is disabled.
Set the spectre_v2 value to auto (SPECTRE_V2_USER_CMD_AUTO) if the
Spectre v2 config (CONFIG_MITIGATION_SPECTRE_V2) is enabled, otherwise
set the value to none (SPECTRE_V2_USER_CMD_NONE).
Important to say the command line argument "spectre_v2_user" overwrites
the default value in both cases.
When CONFIG_MITIGATION_SPECTRE_V2 is not set, users have the flexibility
to opt-in for specific mitigations independently. In this scenario,
setting spectre_v2= will not enable spectre_v2_user=, and command line
options spectre_v2_user and spectre_v2 are independent when
CONFIG_MITIGATION_SPECTRE_V2=n.
Signed-off-by: Breno Leitao <leitao@debian.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Acked-by: Josh Poimboeuf <jpoimboe@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: David Kaplan <David.Kaplan@amd.com>
Link: https://lore.kernel.org/r/20241031-x86_bugs_last_v2-v2-2-b7ff1dab840e@debian.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Waiman Long <longman@redhat.com>
Date: Thu Feb 6 14:18:44 2025 -0500
x86/nmi: Add an emergency handler in nmi_desc & use it in nmi_shootdown_cpus()
[ Upstream commit fe37c699ae3eed6e02ee55fbf5cb9ceb7fcfd76c ]
Depending on the type of panics, it was found that the
__register_nmi_handler() function can be called in NMI context from
nmi_shootdown_cpus() leading to a lockdep splat:
WARNING: inconsistent lock state
inconsistent {INITIAL USE} -> {IN-NMI} usage.
lock(&nmi_desc[0].lock);
<Interrupt>
lock(&nmi_desc[0].lock);
Call Trace:
_raw_spin_lock_irqsave
__register_nmi_handler
nmi_shootdown_cpus
kdump_nmi_shootdown_cpus
native_machine_crash_shutdown
__crash_kexec
In this particular case, the following panic message was printed before:
Kernel panic - not syncing: Fatal hardware error!
This message seemed to be given out from __ghes_panic() running in
NMI context.
The __register_nmi_handler() function which takes the nmi_desc lock
with irq disabled shouldn't be called from NMI context as this can
lead to deadlock.
The nmi_shootdown_cpus() function can only be invoked once. After the
first invocation, all other CPUs should be stuck in the newly added
crash_nmi_callback() and cannot respond to a second NMI.
Fix it by adding a new emergency NMI handler to the nmi_desc
structure and provide a new set_emergency_nmi_handler() helper to set
crash_nmi_callback() in any context. The new emergency handler will
preempt other handlers in the linked list. That will eliminate the need
to take any lock and serve the panic in NMI use case.
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Rik van Riel <riel@surriel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20250206191844.131700-1-longman@redhat.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Juergen Gross <jgross@suse.com>
Date: Mon Feb 10 08:43:39 2025 +0100
xen/swiotlb: relax alignment requirements
commit 85fcb57c983f423180ba6ec5d0034242da05cc54 upstream.
When mapping a buffer for DMA via .map_page or .map_sg DMA operations,
there is no need to check the machine frames to be aligned according
to the mapped areas size. All what is needed in these cases is that the
buffer is contiguous at machine level.
So carve out the alignment check from range_straddles_page_boundary()
and move it to a helper called by xen_swiotlb_alloc_coherent() and
xen_swiotlb_free_coherent() directly.
Fixes: 9f40ec84a797 ("xen/swiotlb: add alignment check for dma buffers")
Reported-by: Jan Vejvalka <jan.vejvalka@lfmotol.cuni.cz>
Tested-by: Jan Vejvalka <jan.vejvalka@lfmotol.cuni.cz>
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Harshvardhan Jha <harshvardhan.j.jha@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Frediano Ziglio <frediano.ziglio@cloud.com>
Date: Thu Feb 27 14:50:15 2025 +0000
xen: Add support for XenServer 6.1 platform device
[ Upstream commit 2356f15caefc0cc63d9cc5122641754f76ef9b25 ]
On XenServer on Windows machine a platform device with ID 2 instead of
1 is used.
This device is mainly identical to device 1 but due to some Windows
update behaviour it was decided to use a device with a different ID.
This causes compatibility issues with Linux which expects, if Xen
is detected, to find a Xen platform device (5853:0001) otherwise code
will crash due to some missing initialization (specifically grant
tables). Specifically from dmesg
RIP: 0010:gnttab_expand+0x29/0x210
Code: 90 0f 1f 44 00 00 55 31 d2 48 89 e5 41 57 41 56 41 55 41 89 fd
41 54 53 48 83 ec 10 48 8b 05 7e 9a 49 02 44 8b 35 a7 9a 49 02
<8b> 48 04 8d 44 39 ff f7 f1 45 8d 24 06 89 c3 e8 43 fe ff ff
44 39
RSP: 0000:ffffba34c01fbc88 EFLAGS: 00010086
...
The device 2 is presented by Xapi adding device specification to
Qemu command line.
Signed-off-by: Frediano Ziglio <frediano.ziglio@cloud.com>
Acked-by: Juergen Gross <jgross@suse.com>
Message-ID: <20250227145016.25350-1-frediano.ziglio@cloud.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jason Andryuk <jason.andryuk@amd.com>
Date: Tue May 6 16:44:56 2025 -0400
xenbus: Allow PVH dom0 a non-local xenstore
[ Upstream commit 90989869baae47ee2aa3bcb6f6eb9fbbe4287958 ]
Make xenbus_init() allow a non-local xenstore for a PVH dom0 - it is
currently forced to XS_LOCAL. With Hyperlaunch booting dom0 and a
xenstore stubdom, dom0 can be handled as a regular XS_HVM following the
late init path.
Ideally we'd drop the use of xen_initial_domain() and just check for the
event channel instead. However, ARM has a xen,enhanced no-xenstore
mode, where the event channel and PFN would both be 0. Retain the
xen_initial_domain() check, and use that for an additional check when
the event channel is 0.
Check the full 64bit HVM_PARAM_STORE_EVTCHN value to catch the off
chance that high bits are set for the 32bit event channel.
Signed-off-by: Jason Andryuk <jason.andryuk@amd.com>
Change-Id: I5506da42e4c6b8e85079fefb2f193c8de17c7437
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Juergen Gross <jgross@suse.com>
Message-ID: <20250506204456.5220-1-jason.andryuk@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jason Andryuk <jason.andryuk@amd.com>
Date: Tue May 6 17:09:33 2025 -0400
xenbus: Use kref to track req lifetime
commit 1f0304dfd9d217c2f8b04a9ef4b3258a66eedd27 upstream.
Marek reported seeing a NULL pointer fault in the xenbus_thread
callstack:
BUG: kernel NULL pointer dereference, address: 0000000000000000
RIP: e030:__wake_up_common+0x4c/0x180
Call Trace:
<TASK>
__wake_up_common_lock+0x82/0xd0
process_msg+0x18e/0x2f0
xenbus_thread+0x165/0x1c0
process_msg+0x18e is req->cb(req). req->cb is set to xs_wake_up(), a
thin wrapper around wake_up(), or xenbus_dev_queue_reply(). It seems
like it was xs_wake_up() in this case.
It seems like req may have woken up the xs_wait_for_reply(), which
kfree()ed the req. When xenbus_thread resumes, it faults on the zero-ed
data.
Linux Device Drivers 2nd edition states:
"Normally, a wake_up call can cause an immediate reschedule to happen,
meaning that other processes might run before wake_up returns."
... which would match the behaviour observed.
Change to keeping two krefs on each request. One for the caller, and
one for xenbus_thread. Each will kref_put() when finished, and the last
will free it.
This use of kref matches the description in
Documentation/core-api/kref.rst
Link: https://lore.kernel.org/xen-devel/ZO0WrR5J0xuwDIxW@mail-itl/
Reported-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Fixes: fd8aa9095a95 ("xen: optimize xenbus driver for multiple concurrent xenstore accesses")
Cc: stable@vger.kernel.org
Signed-off-by: Jason Andryuk <jason.andryuk@amd.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Message-ID: <20250506210935.5607-1-jason.andryuk@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Paul Chaignon <paul.chaignon@gmail.com>
Date: Wed May 7 13:31:58 2025 +0200
xfrm: Sanitize marks before insert
[ Upstream commit 0b91fda3a1f044141e1e615456ff62508c32b202 ]
Prior to this patch, the mark is sanitized (applying the state's mask to
the state's value) only on inserts when checking if a conflicting XFRM
state or policy exists.
We discovered in Cilium that this same sanitization does not occur
in the hot-path __xfrm_state_lookup. In the hot-path, the sk_buff's mark
is simply compared to the state's value:
if ((mark & x->mark.m) != x->mark.v)
continue;
Therefore, users can define unsanitized marks (ex. 0xf42/0xf00) which will
never match any packet.
This commit updates __xfrm_state_insert and xfrm_policy_insert to store
the sanitized marks, thus removing this footgun.
This has the side effect of changing the ip output, as the
returned mark will have the mask applied to it when printed.
Fixes: 3d6acfa7641f ("xfrm: SA lookups with mark")
Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
Signed-off-by: Louis DeLosSantos <louis.delos.devel@gmail.com>
Co-developed-by: Louis DeLosSantos <louis.delos.devel@gmail.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>