Author: Haoyu Lu <hechushiguitu666@gmail.com>
Date: Tue Apr 7 11:31:15 2026 +0800
ACPI: AGDI: fix missing newline in error message
[ Upstream commit b178330b67abb7293b6de28b2a49d49c83962db5 ]
Add the missing trailing newline to the dev_err() message
printed when SDEI event registration fails.
This keeps the error output as a properly terminated log line.
Fixes: a2a591fb76e6 ("ACPI: AGDI: Add driver for Arm Generic Diagnostic Dump and Reset device")
Reviewed-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Signed-off-by: Haoyu Lu <hechushiguitu666@gmail.com>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Date: Mon Feb 23 16:28:15 2026 +0100
ACPI: x86: cmos_rtc: Clean up address space handler driver
[ Upstream commit ba0b236736dde4059bdcb8e99beaa50d6e5b6e7e ]
Make multiple changes that do not alter functionality to the CMOS RTC
ACPI address space handler driver, including the following:
- Drop the unused .detach() callback from cmos_rtc_handler.
- Rename acpi_cmos_rtc_attach_handler() to acpi_cmos_rtc_attach().
- Rearrange acpi_cmos_rtc_space_handler() to reduce the number of
redundant checks and make white space follow the coding style.
- Adjust an error message in acpi_install_cmos_rtc_space_handler()
and make the white space follow the coding style.
- Rearrange acpi_remove_cmos_rtc_space_handler() and adjust an error
message in it.
No intentional functional impact.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://patch.msgid.link/5094429.31r3eYUQgx@rafael.j.wysocki
Stable-dep-of: 6cee29ad9d7e ("ACPI: x86: cmos_rtc: Improve coordination with ACPI TAD driver")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Date: Mon Feb 23 16:28:57 2026 +0100
ACPI: x86: cmos_rtc: Improve coordination with ACPI TAD driver
[ Upstream commit 6cee29ad9d7e400d39ae0b1a54447fedcb62eecd ]
If a CMOS RTC (PNP0B00/PNP0B01/PNP0B02) device coexists with an ACPI
TAD (timer and event alarm device, ACPI000E), the ACPI TAD driver will
attempt to install the CMOS RTC address space hanlder that has been
installed already and the TAD probing will fail.
Avoid that by changing acpi_install_cmos_rtc_space_handler() to return
zero and acpi_remove_cmos_rtc_space_handler() to do nothing if the CMOS
RTC address space handler has been installed already.
Fixes: 596ca52a56da ("ACPI: TAD: Install SystemCMOS address space handler for ACPI000E")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://patch.msgid.link/2415111.ElGaqSPkdT@rafael.j.wysocki
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Cássio Gabriel <cassiogabrielcontato@gmail.com>
Date: Wed Mar 25 02:24:04 2026 -0300
ALSA: core: Validate compress device numbers without dynamic minors
[ Upstream commit 796e119e9b14763be905ad0d023c71a14bc2e931 ]
Without CONFIG_SND_DYNAMIC_MINORS, ALSA reserves only two fixed minors
for compress devices on each card: comprD0 and comprD1.
snd_find_free_minor() currently computes the compress minor as
type + dev without validating dev first, so device numbers greater than
1 spill into the HWDEP minor range instead of failing registration.
ASoC passes rtd->id to snd_compress_new(), so this can happen on real
non-dynamic-minor builds.
Add a dedicated fixed-minor check for SNDRV_DEVICE_TYPE_COMPRESS in
snd_find_free_minor() and reject out-of-range device numbers with
-EINVAL before constructing the minor.
Also remove the stale TODO in compress_offload.c that still claims
multiple compress nodes are missing.
Fixes: 3eafc959b32f ("ALSA: core: add support for compressed devices")
Signed-off-by: Cássio Gabriel <cassiogabrielcontato@gmail.com>
Link: https://patch.msgid.link/20260325-alsa-compress-static-minors-v1-1-0628573bee1c@gmail.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: wangdicheng <wangdicheng@kylinos.cn>
Date: Tue Apr 28 16:04:50 2026 +0800
ALSA: hda/conexant: Fix missing error check for jack detection
[ Upstream commit b0e2333a231107adedd38c6fcfe1adc6162716fc ]
In cx_probe(), the return value of snd_hda_jack_detect_enable_callback()
is ignored. This function returns a pointer, and if it fails (e.g., due
to memory allocation failure), it returns an error pointer which must
be checked using IS_ERR().
If the registration fails, the driver continues to probe, but the jack
detection callback will not be registered. This can lead to a kernel
crash later when the driver attempts to handle jack events or accesses
the uninitialized structure.
Check the return value using IS_ERR() and propagate the error via
PTR_ERR() to the probe caller.
Fixes: 7aeb25908648 ("ALSA: hda/conexant: Fix headset auto detect fail in cx8070 and SN6140")
Signed-off-by: wangdicheng <wangdicheng@kylinos.cn>
Link: https://patch.msgid.link/20260428080450.108801-1-wangdich9700@163.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: wangdicheng <wangdicheng@kylinos.cn>
Date: Mon Jun 16 15:43:31 2025 +0800
ALSA: hda/conexant: Renaming the codec with device ID 0x1f86 and 0x1f87
[ Upstream commit 7f4c540e0859e2025675d2c5c5c6ab88eaf817e2 ]
Due to changes in the manufacturer's plan, all 0x14f11f86 will be
named CX11880, and 0x14f11f87 will be named SN6140
Signed-off-by: wangdicheng <wangdicheng@kylinos.cn>
Link: https://patch.msgid.link/20250616074331.581309-1-wangdich9700@163.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Stable-dep-of: b0e2333a2311 ("ALSA: hda/conexant: Fix missing error check for jack detection")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Kailang Yang <kailang@realtek.com>
Date: Tue Apr 14 15:44:04 2026 +0800
ALSA: hda/realtek - fixed speaker no sound update
[ Upstream commit 46c862f5419e0a86b60b9f9558d247f6084c99f9 ]
Fixed speaker has pop noise on Lenovo Thinkpad X11 Carbon Gen 12.
Fixes: 630fbc6e870e ("ALSA: hda/realtek - fixed speaker no sound")
Reported-and-tested-by: Jeremy Bethmont <jeremy.bethmont@gmail.com>
Closes: https://lore.kernel.org/CAC88DfsHrhyhy0Pn1O-z9egBvMYu=6NYgcvcC6KCgwh_-Ldkxg@mail.gmail.com
Signed-off-by: Kailang Yang <kailang@realtek.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Lei Huang <huanglei@kylinos.cn>
Date: Tue Mar 31 15:54:05 2026 +0800
ALSA: hda/realtek: fix code style (ERROR: else should follow close brace '}')
[ Upstream commit d1888bf848ade6a9e71c7ba516fd215aa1bd8d65 ]
Fix checkpatch code style errors:
ERROR: else should follow close brace '}'
#2300: FILE: sound/hda/codecs/realtek/alc269.c:2300:
+ }
+ else
Fixes: 31278997add6 ("ALSA: hda/realtek - Add headset quirk for Dell DT")
Signed-off-by: Lei Huang <huanglei@kylinos.cn>
Link: https://patch.msgid.link/20260331075405.78148-1-huanglei814@163.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Richard Fitzgerald <rf@opensource.cirrus.com>
Date: Tue Apr 28 14:05:31 2026 +0100
ALSA: hda: cs35l56: Fix uninitialized value in cs35l56_hda_read_acpi()
[ Upstream commit 90df4957a3271adf391b3432cd76a40887cf3273 ]
Eliminate the uninitialized 'nval' in cs35l56_hda_read_acpi() if a
system-specific quirk overrides processing of the dev-index property.
The value is now stored in a new 'num_amps' member of struct cs35l56_hda
so that the quirk handler can set the value.
The quirk for the Lenovo Yoga Book 9i GenX replaces the values from the
dev-index property with hardcoded indexes. So cs35l56_hda_read_acpi() would
then skip reading the property. But this left the 'nval' local variable
uninitialized when it is later passed to cirrus_scodec_get_speaker_id().
Fixes: 40b1c2f9b299 ("ALSA: hda/cs35l56: Workaround bad dev-index on Lenovo Yoga Book 9i GenX")
Reported-by: Dan Carpenter <error27@gmail.com>
Closes: https://lore.kernel.org/linux-sound/aenFesLAStjrVNy8@stanley.mountain/T/#u
Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Link: https://patch.msgid.link/20260428130531.169600-1-rf@opensource.cirrus.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Cássio Gabriel <cassiogabrielcontato@gmail.com>
Date: Fri Apr 10 00:54:32 2026 -0300
ALSA: sc6000: Keep the programmed board state in card-private data
[ Upstream commit fb79bf127ac2577b4876132da6dba768018aad4c ]
The driver may auto-select IRQ and DMA resources at probe time, but
sc6000_init_board() still derives the SC-6000 soft configuration from
the module parameter arrays. When irq=auto or dma=auto is used, the
codec is created with the selected resources while the board is
programmed with the unresolved values.
Store the mapped ports and generated SC-6000 board configuration in
card-private data, build that configuration from the live probe
results instead of the raw module parameters, and keep the probe-time
board programming in a shared helper.
This fixes the resource-programming mismatch and leaves the driver
with a stable board-state block that can be reused by suspend/resume.
Fixes: c282866101bf ("ALSA: sc6000: add support for SC-6600 and SC-7000")
Signed-off-by: Cássio Gabriel <cassiogabrielcontato@gmail.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Link: https://patch.msgid.link/20260410-alsa-sc6000-pm-v1-1-4d9e95493d26@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Panagiotis Petrakopoulos <npetrakopoulos2003@gmail.com>
Date: Mon Apr 6 01:25:48 2026 +0300
ALSA: scarlett2: Add missing sentinel initializer field
[ Upstream commit 2428cd6e8b6fa80c36db4652702ca0acd2ce3f08 ]
A "-Wmissing-field-initializers" warning was emitted when compiling the
module using the W=2 option. There is a sentinel initializer field
missing in the end of scarlett2_devices[]. Tested using a
Scarlett Solo 4th gen.
Fixes: d98cc489029d ("ALSA: scarlett2: Move USB IDs out from device_info struct")
Signed-off-by: Panagiotis Petrakopoulos <npetrakopoulos2003@gmail.com>
Link: https://patch.msgid.link/20260405222548.8903-1-npetrakopoulos2003@gmail.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Cássio Gabriel <cassiogabrielcontato@gmail.com>
Date: Thu May 7 00:40:52 2026 -0300
ALSA: usb-audio: Bound MIDI 2.0 endpoint descriptor scans
commit 918be519c7876329e1b6e2ea1c59f0b75e792dca upstream.
The USB MIDI 2.0 endpoint parser has the same descriptor walking
pattern as the legacy MIDI parser. It validates bLength against
bNumGrpTrmBlock before reading baAssoGrpTrmBlkID[], but not against the
remaining bytes in the endpoint-extra scan.
A malformed device can therefore make later baAssoGrpTrmBlkID[] reads
consume bytes past the walked descriptor.
Reject zero-length and overlong descriptors while walking endpoint
extras.
Fixes: ff49d1df79ae ("ALSA: usb-audio: USB MIDI 2.0 UMP support")
Cc: stable@vger.kernel.org
Signed-off-by: Cássio Gabriel <cassiogabrielcontato@gmail.com>
Link: https://patch.msgid.link/20260507-usb-midi-endpoint-scan-bounds-v1-2-329d7348160e@gmail.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Cássio Gabriel <cassiogabrielcontato@gmail.com>
Date: Thu May 7 00:40:51 2026 -0300
ALSA: usb-audio: Bound MIDI endpoint descriptor scans
commit d6854daa67be623860f4e1873fd3d3c275aba4ed upstream.
snd_usbmidi_get_ms_info() validates the internal MIDIStreaming endpoint
descriptor size before using baAssocJackID[], but the descriptor walker can
still return a class-specific endpoint descriptor whose bLength exceeds the
remaining bytes in the endpoint-extra scan.
That leaves later flexible-array reads bounded by bLength, but not by the
remaining bytes in the endpoint-extra scan.
Stop walking when bLength is zero or
extends past the remaining endpoint-extra scan.
Fixes: 5c6cd7021a05 ("ALSA: usb-audio: Fix case when USB MIDI interface has more than one extra endpoint descriptor")
Cc: stable@vger.kernel.org
Signed-off-by: Cássio Gabriel <cassiogabrielcontato@gmail.com>
Link: https://patch.msgid.link/20260507-usb-midi-endpoint-scan-bounds-v1-1-329d7348160e@gmail.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Takashi Iwai <tiwai@suse.de>
Date: Mon Apr 27 17:15:04 2026 +0200
ALSA: usb-audio: Fix potential leak of pd at parsing UAC3 streams
[ Upstream commit c39f0bc03f84ba64c9144c95714df1dc36150f6d ]
At parsing UAC3 streams, we allocate a PD object at each time, and
either assign or free it. But there is a case where the PD object may
be leaked; namely, in __snd_usb_parse_audio_interface() loop, when an
audioformat shares the same endpoint with others, it's put to a link
and returns from snd_usb_add_audio_stream(), but the PD is forgotten
afterwards. Overall, the treatment of PD object in the parser code is
a bit flaky, and we should be more careful about the object ownership.
This patch tries to fix the above case and improve the code a bit.
The pd object is now managed with the auto-cleanup in the loop, and
the ownership is updated when the pd object gets assigned to the
stream, which guarantees the release of the leftover object.
Fixes: 7edf3b5e6a45 ("ALSA: usb-audio: AudioStreaming Power Domain parsing")
Link: https://patch.msgid.link/20260427151508.12544-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Wentao Guan <guanwentao@uniontech.com>
Date: Mon Apr 13 17:54:59 2026 +0800
arm64/scs: Fix potential sign extension issue of advance_loc4
[ Upstream commit 4023b7424ecd5d38cc75b650d6c1bf630ef8cb40 ]
The expression (*opcode++ << 24) and exp * code_alignment_factor
may overflow signed int and becomes negative.
Fix this by casting each byte to u64 before shifting. Also fix
the misaligned break statement while we are here.
Example of the result can be seen here:
Link: https://godbolt.org/z/zhY8d3595
It maybe not a real problem, but could be a issue in future.
Fixes: d499e9627d70 ("arm64/scs: Fix handling of advance_loc4")
Signed-off-by: Wentao Guan <guanwentao@uniontech.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Christoph Hellwig <hch@lst.de>
Date: Fri Mar 27 07:16:35 2026 +0100
arm64/xor: fix conflicting attributes for xor_block_template
[ Upstream commit 675a0dd596e712404557286d0a883b54ee28e4f4 ]
Commit 2c54b423cf85 ("arm64/xor: use EOR3 instructions when available")
changes the definition to __ro_after_init instead of const, but failed to
update the external declaration in xor.h. This was not found because
xor-neon.c doesn't include <asm/xor.h>, and can't easily do that due to
current architecture of the XOR code.
Link: https://lkml.kernel.org/r/20260327061704.3707577-4-hch@lst.de
Fixes: 2c54b423cf85 ("arm64/xor: use EOR3 instructions when available")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Eric Biggers <ebiggers@kernel.org>
Tested-by: Eric Biggers <ebiggers@kernel.org>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Andreas Larsson <andreas@gaisler.com>
Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: "Borislav Petkov (AMD)" <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chris Mason <clm@fb.com>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: David Sterba <dsterba@suse.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jason A. Donenfeld <jason@zx2c4.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Li Nan <linan122@huawei.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Magnus Lindholm <linmag7@gmail.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Richard Henderson <richard.henderson@linaro.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Song Liu <song@kernel.org>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Ted Ts'o <tytso@mit.edu>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: WANG Xuerui <kernel@xen0n.name>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: James Clark <james.clark@linaro.org>
Date: Thu Mar 5 16:28:18 2026 +0000
arm64: cpufeature: Make PMUVer and PerfMon unsigned
[ Upstream commit d1dcc20bcc40efe1f1c71639376c91dafa489222 ]
On the host, this change doesn't make a difference because the fields
are defined as FTR_EXACT. However, KVM allows userspace to set these
fields for a guest and overrides the type to be FTR_LOWER_SAFE. And
while KVM used to do an unsigned comparison to validate that the new
value is lower than what the hardware provides, since the linked commit
it uses the generic sanitization framework which does a signed
comparison.
Fix it by defining these fields as unsigned. In theory, without this
fix, userspace could set a higher PMU version than the hardware supports
by providing any value with the top bit set.
Fixes: c118cead07a7 ("KVM: arm64: Use generic sanitisation for ID_(AA64)DFR0_EL1")
Signed-off-by: James Clark <james.clark@linaro.org>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Colton Lewis <coltonlewis@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Nora Schiffer <nora.schiffer@ew.tq-group.com>
Date: Mon Mar 2 09:45:48 2026 +0100
arm64: dts: freescale: imx8mp-tqma8mpql-mba8mp-ras314: fix UART1 RTS/CTS muxing
[ Upstream commit b8d785a9f360abcd6a6f8f10a2adf222f8494d66 ]
UART1 operates in DCE mode, but the RTS/CTS pins were incorrectly
configured using the DTE pinmux setting.
Correct the pinmux to match DCE mode. Switching the RTS and CTS signals
is fine for this board, as UART1 is routed to a pin header. Existing
functionality is unaffected, as RTS/CTS could never have worked with
the incorrect pinmux.
Fixes: ddabb3ce3f90 ("arm64: dts: freescale: add TQMa8MPQL on MBa8MP-RAS314")
Signed-off-by: Nora Schiffer <nora.schiffer@ew.tq-group.com>
Reviewed-by: Alexander Stein <alexander.stein@ew.tq-group.com>
Signed-off-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Francesco Dolcini <francesco.dolcini@toradex.com>
Date: Mon Jan 19 11:34:09 2026 +0100
arm64: dts: imx8-apalis: Fix LEDs name collision
[ Upstream commit 92ab53b9bb2a72581c32073755077af916eb9aee ]
Ixora boards have multiple instances of status leds, to avoid a name
collision add the function-enumerator property.
This fixes the following Linux kernel warnings:
leds-gpio leds: Led green:status renamed to green:status_1 due to name collision
leds-gpio leds: Led red:status renamed to red:status_1 due to name collision
Fixes: c083131c9021 ("arm64: dts: freescale: add apalis imx8 aka quadmax carrier board support")
Signed-off-by: Francesco Dolcini <francesco.dolcini@toradex.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Reviewed-by: Daniel Baluta <daniel.baluta@nxp.com>
Signed-off-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Peng Fan <peng.fan@nxp.com>
Date: Sun Mar 29 21:00:11 2026 +0800
arm64: dts: imx8mm-emtop-som: Correct PAD settings for PMIC_nINT
[ Upstream commit 721dec3ee9ff5231d13a412ff87df63b966d137b ]
With commit 5d0efaf47ee90 ("regulator: pca9450: Correct interrupt type"),
there might be interrupt storm for this board. Need to set PAD PUE and PU
together to make pull up work properly.
While at here, also correct interrupt type as IRQ_TYPE_LEVEL_LOW.
Fixes: cbd3ef64eb9d1 ("arm64: dts: Add support for Emtop SoM & Baseboard")
Signed-off-by: Peng Fan <peng.fan@nxp.com>
Signed-off-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Peng Fan <peng.fan@nxp.com>
Date: Sun Mar 29 21:00:13 2026 +0800
arm64: dts: imx8mm-tqma8mqml: Correct PAD settings for PMIC_nINT
[ Upstream commit 42a9f5a16328ed78a88e0498556965b6c6ec515c ]
With commit 5d0efaf47ee90 ("regulator: pca9450: Correct interrupt type"),
there might be interrupt storm for this board. Need to set PAD PUE and PU
together to make pull up work properly.
Fixes: dfcd1b6f7620e ("arm64: dts: freescale: add initial device tree for TQMa8MQML with i.MX8MM")
Signed-off-by: Peng Fan <peng.fan@nxp.com>
Signed-off-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Peng Fan <peng.fan@nxp.com>
Date: Sun Mar 29 21:00:12 2026 +0800
arm64: dts: imx8mn-tqma8mqnl: Correct PAD settings for PMIC_nINT
[ Upstream commit 0fb37990774113afd943eaa91323679388584b6d ]
With commit 5d0efaf47ee90 ("regulator: pca9450: Correct interrupt type"),
there might be interrupt storm for this board. Need to set PAD PUE and PU
together to make pull up work properly.
Fixes: 3e56e354db6d3 ("arm64: dts: freescale: add initial device tree for TQMa8MQNL with i.MX8MN")
Signed-off-by: Peng Fan <peng.fan@nxp.com>
Signed-off-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Peng Fan <peng.fan@nxp.com>
Date: Thu Mar 26 15:28:16 2026 +0800
arm64: dts: imx8mp-data-modul-edm-sbc: Correct PAD settings for PMIC_nINT
[ Upstream commit 8ff145577e93f312ff398cb950ee3bd44835f5be ]
PMIC_nINT is low level triggered, but the current PAD settings is
PE=0,PUE=0,FSEL_1_FAST_SLEW_RATE=1,SION=1. So PAD needs to be configured
as PULL UP with PULL Enable, no need SION. Correct it.
Fixes: 562d222f23f0f ("arm64: dts: imx8mp: Add support for Data Modul i.MX8M Plus eDM SBC")
Signed-off-by: Peng Fan <peng.fan@nxp.com>
Signed-off-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Peng Fan <peng.fan@nxp.com>
Date: Thu Mar 26 15:28:05 2026 +0800
arm64: dts: imx8mp-debix-model-a: Correct PAD settings for PMIC_nINT
[ Upstream commit 3b778178997aee24537b521a8cb60970bc1ce01c ]
With commit 5d0efaf47ee90 ("regulator: pca9450: Correct interrupt type"),
there is interrupt storm for i.MX8MP DEBIX Model A. Per schematic, there
is no on board PULL-UP resistors for GPIO1_IO03, so need to set PAD
PUE and PU together to make pull up work properly.
Fixes: c86d350aae68e ("arm64: dts: Add device tree for the Debix Model A Board")
Reported-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Closes: https://lore.kernel.org/all/20260323105858.GA2185714@killaraus.ideasonboard.com/
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Tested-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Peng Fan <peng.fan@nxp.com>
Signed-off-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Peng Fan <peng.fan@nxp.com>
Date: Thu Mar 26 15:28:06 2026 +0800
arm64: dts: imx8mp-debix-som-a: Correct PAD settings for PMIC_nINT
[ Upstream commit 2ea7872048a179b0ea8dadc67771961df3f0fc4a ]
With commit 5d0efaf47ee90 ("regulator: pca9450: Correct interrupt type"),
there is interrupt storm for i.MX8MP DEBIX SOM A. Need to set PAD
PUE and PU together to make pull up work properly.
Fixes: 21baf0b47f81b ("arm64: dts: freescale: Add DEBIX SOM A and SOM A I/O Board support")
Reported-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Closes: https://lore.kernel.org/all/20260323105858.GA2185714@killaraus.ideasonboard.com/
Reported-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
Closes: https://lore.kernel.org/imx/20260324194353.GB2352505@killaraus.ideasonboard.com/T/#m9a07fdc75496369a7d76d52c5e34ed140dcabfe3
Signed-off-by: Peng Fan <peng.fan@nxp.com>
Reviewed-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
Signed-off-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Peng Fan <peng.fan@nxp.com>
Date: Thu Mar 26 15:28:15 2026 +0800
arm64: dts: imx8mp-dhcom-som: Correct PAD settings for PMIC_nINT
[ Upstream commit f9ed5afc988da3e22543725e35be6addbb0497bc ]
PMIC_nINT is low level triggered, but the current PAD settings is
PE=0,PUE=0,FSEL_1_FAST_SLEW_RATE=1,SION=1. So PAD needs to be configured
as PULL UP with PULL Enable, no need SION. Correct it.
Fixes: 8d6712695bc8e ("arm64: dts: imx8mp: Add support for DH electronics i.MX8M Plus DHCOM and PDK2")
Signed-off-by: Peng Fan <peng.fan@nxp.com>
Signed-off-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Sherry Sun <sherry.sun@nxp.com>
Date: Thu Feb 5 15:34:53 2026 +0800
arm64: dts: imx8mp-evk: Enable pull select bit for PCIe regulator GPIO (M.2 W_DISABLE1)
[ Upstream commit d1e7eab6033f9885a02c4b4e8f09e34d8e9d21ab ]
The current pin configuration for MX8MP_IOMUXC_SD1_DATA4__GPIO2_IO06
sets the weak pull-up but does not enable the pull select field.
Bit 8 in the IOMUX register must be set in order for the weak pull-up
to actually take effect.
Update the pinctrl setting from 0x40 to 0x140 to enable both the pull
select and the weak pull-up, ensuring the line behaves as expected.
Fixes: d50650500064 ("arm64: dts: imx8mp-evk: Add PCIe support")
Signed-off-by: Sherry Sun <sherry.sun@nxp.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Peng Fan <peng.fan@nxp.com>
Date: Thu Mar 26 15:28:09 2026 +0800
arm64: dts: imx8mp-icore-mx8mp: Correct PAD settings for PMIC_nINT
[ Upstream commit ea8c90f5c7ceeb6657a8fe564aa7b190dce298a6 ]
With commit 5d0efaf47ee90 ("regulator: pca9450: Correct interrupt type"),
there might be interrupt storm for this board. Need to set PAD PUE and PU
together to make pull up work properly.
Fixes: eefe06b295087 ("arm64: dts: imx8mp: Add Engicam i.Core MX8M Plus SoM")
Signed-off-by: Peng Fan <peng.fan@nxp.com>
Signed-off-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Peng Fan <peng.fan@nxp.com>
Date: Thu Mar 26 15:28:07 2026 +0800
arm64: dts: imx8mp-navqp: Correct PAD settings for PMIC_nINT
[ Upstream commit 741d6ac1a2a2e0f3e2cae5eef3516cdd75119e83 ]
With commit 5d0efaf47ee90 ("regulator: pca9450: Correct interrupt type"),
there will be interrupt storm for i.MX8MP NAVQP. Per schematic, there
is no on board PULL-UP resistors for GPIO1_IO03, so need to set PAD
PUE and PU together to make pull up work properly.
Fixes: 682729a9d506d ("arm64: dts: freescale: Add device tree for Emcraft Systems NavQ+ Kit")
Signed-off-by: Peng Fan <peng.fan@nxp.com>
Signed-off-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Xu Yang <xu.yang_2@nxp.com>
Date: Tue Mar 24 19:04:58 2026 +0800
arm64: dts: imx8qm-mek: switch Type-C connector power-role to dual
[ Upstream commit e3d3d19d1c0050789a4813ce836a641a3387d916 ]
When attach to PC Type-A port, the USB device controller does not function
at all. Because it is configured as source-only and a Type-A port doesn't
support PD capability, a data role swap is impossible.
Actually, PTN5110THQ is configured for Source role only at POR, but after
POR it can operate as a DRP (Dual-Role Power). By switching the power-role
to dual, the port can operate as a sink and enter device mode when attach
to Type-A port.
Since the board design uses EN_SRC to control the 5V VBUS path and EN_SNK
to control the 12V VBUS output, to avoid outputting a higher VBUS when in
sink role, we set the operation current limit to 0mA so that SW will not
control EN_SNK at all.
Fixes: b237975b2cd58 ("arm64: dts: imx8qm-mek: add usb 3.0 and related type C nodes")
Signed-off-by: Xu Yang <xu.yang_2@nxp.com>
Signed-off-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Xu Yang <xu.yang_2@nxp.com>
Date: Tue Mar 24 19:04:59 2026 +0800
arm64: dts: imx8qxp-mek: switch Type-C connector power-role to dual
[ Upstream commit 825b8c7e1d2918d89eb378b761530d1e51dba82e ]
When attach to PC Type-A port, the USB device controller does not function
at all. Because it is configured as source-only and a Type-A port doesn't
support PD capability, a data role swap is impossible.
Actually, PTN5110THQ is configured for Source role only at POR, but after
POR it can operate as a DRP (Dual-Role Power). By switching the power-role
to dual, the port can operate as a sink and enter device mode when attach
to Type-A port.
Since the board design uses EN_SRC to control the 5V VBUS path and EN_SNK
to control the 12V VBUS output, to avoid outputting a higher VBUS when in
sink role, we set the operation current limit to 0mA so that SW will not
control EN_SNK at all.
Fixes: 2faf4ebcee2e5 ("arm64: dts: freescale: imx8qxp-mek: enable cadence usb3")
Signed-off-by: Xu Yang <xu.yang_2@nxp.com>
Signed-off-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Josua Mayer <josua@solid-run.com>
Date: Tue Mar 24 13:40:59 2026 +0100
arm64: dts: lx2160a: add sda gpio references for i2c bus recovery
[ Upstream commit 89ea0dbd701f89805499d26bd90657468c789545 ]
LX2160A pinmux is done in groups by various length bitfields within
configuration registers.
In particular i2c sda/scl pins are always configured together. Therefore
bus recovery may control both sda and scl.
When pinmux nodes and bus recovery was enabled originally for LX2160,
only the scl-gpios were added to the i2c controller nodes.
Add references to sda-gpios for each i2c controller.
Fixes: 8a1365c7bbc1 ("arm64: dts: lx2160a: add pinmux and i2c gpio to support bus recovery")
Signed-off-by: Josua Mayer <josua@solid-run.com>
Signed-off-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Josua Mayer <josua@solid-run.com>
Date: Tue Mar 24 13:40:56 2026 +0100
arm64: dts: lx2160a: change i2c0 (iic1) pinmux mask to one bit
[ Upstream commit 7a3cc49ad1fc8d063abb7f5de8f1b981b99d2978 ]
LX2160A pinmux is done in groups by various length bitfields within
configuration registers.
The first i2c bus (called IIC1 in reference manual) is configured through
field IIC1_PMUX in register RCWSR14 bit 10 which is described in the
reference manual as a single bit, unlike the other i2c buses.
Change the bitmask for the pinmux nodes from 0x7 to 0x1 to ensure only
single bit is modified.
Further change the zero in the same line to hexadecimal format for
consistency.
Align with documentation by avoiding writes to reserved bits. No functional
change, as writing the extra two reserved bits is not known to cause
issues.
Fixes: 8a1365c7bbc1 ("arm64: dts: lx2160a: add pinmux and i2c gpio to support bus recovery")
Signed-off-by: Josua Mayer <josua@solid-run.com>
Signed-off-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Josua Mayer <josua@solid-run.com>
Date: Tue Mar 24 13:41:00 2026 +0100
arm64: dts: lx2160a: change zeros to hexadecimal in pinmux nodes
[ Upstream commit 03241620d2b9915c9e3463dbc56e9eb95ad43c08 ]
Replace some stray zeros from decimal to hexadecimal format within
pinmux nodes.
No functional change intended.
Fixes: 8a1365c7bbc1 ("arm64: dts: lx2160a: add pinmux and i2c gpio to support bus recovery")
Signed-off-by: Josua Mayer <josua@solid-run.com>
Signed-off-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Josua Mayer <josua@solid-run.com>
Date: Tue Mar 24 13:41:01 2026 +0100
arm64: dts: lx2160a: complete pinmux for rcwsr12 configuration word
[ Upstream commit 284ad7064aaa1badde022785cd925af29c696b21 ]
Commit 8a1365c7bbc1 ("arm64: dts: lx2160a: add pinmux and i2c gpio to
support bus recovery") introduced pinmux nodes for lx2160 i2c
interfaces, allowing runtime change between i2c and gpio functions
implementing bus recovery.
However, the dynamic configuration area (overwrite MUX) used by the
pinctrl-single driver initially reads as zero and does not reflect the
actual hardware state set by the Reset Configuration Word (RCW) at
power-on.
Because multiple groups of pins are configured from a single 32-bit
register, the first write from the pinctrl driver unintentionally clears
all other bits to zero.
Add description for all bits of RCWSR12 register, allowing boards to
explicitly define and restore their intended hardware state.
This includes i2c, gpio, flextimer, spi, can and sdhc functions.
Other configuration words, i.e. RCWSR13 & RCWSR14 may be added in the
future for boards setting non-zero values there.
Fixes: 8a1365c7bbc1 ("arm64: dts: lx2160a: add pinmux and i2c gpio to support bus recovery")
Signed-off-by: Josua Mayer <josua@solid-run.com>
Signed-off-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Josua Mayer <josua@solid-run.com>
Date: Tue Mar 24 13:40:57 2026 +0100
arm64: dts: lx2160a: remove duplicate pinmux nodes
[ Upstream commit 325ca511ca3dda936207ce737e0afe837d45a674 ]
LX2160A pinmux is done in groups by various length bitfields within
configuration registers.
The pinmux nodes i2c7-scl-pins and i2c7-scl-gpio-pins are duplicates of
i2c6-scl-gpio and i2c6-scl-gpio-pins, writing to the same register and
bits.
These two i2c buses i2c6/i2c7 (IIC7/IIC8) are configured together in
register RCWSR13 bits 3-0.
Drop the duplicate node name and change references to the i2c6 node.
Fixes: 8a1365c7bbc1 ("arm64: dts: lx2160a: add pinmux and i2c gpio to support bus recovery")
Signed-off-by: Josua Mayer <josua@solid-run.com>
Signed-off-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Josua Mayer <josua@solid-run.com>
Date: Tue Mar 24 13:40:58 2026 +0100
arm64: dts: lx2160a: rename pinmux nodes for readability
[ Upstream commit 456eb494746afd56d3a9dc30271300136e55b96e ]
LX2160A pinmux is done in groups by various length bitfields within
configuration registers.
Each group of pins is named in the reference manual after a primary
function using soc-specific naming, e.g. IIC1 (for i2c0).
Hardware block numbering starts from zero in device-tree but one in the
reference manual.
Rename the already defined pinmux nodes originally added for changing
i2c pins between i2c and gpio functions reflecting the reference manual
name (IIC) in the node name, and the device-tree name (i2c, gpio) in the
label.
Specifically, drop the "_scl" suffix from the I2C labels because the
nodes actually configure both SDA and SCL pins together. Instead add
"_pins" suffix to avoid conflicts with I2C controller labels.
For GPIO functions, include the specific controller and pin numbers in
the label to clarify they are generic GPIOs and help spot mistakes.
No functional change intended.
Fixes: 8a1365c7bbc1 ("arm64: dts: lx2160a: add pinmux and i2c gpio to support bus recovery")
Signed-off-by: Josua Mayer <josua@solid-run.com>
Signed-off-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Gabor Juhos <j4g8y7@gmail.com>
Date: Mon Mar 30 17:25:16 2026 +0200
arm64: dts: marvell: armada-37xx: use 'usb2-phy' in USB3 controller's phy-names
[ Upstream commit 0fef19844624f8bc07651b4d26088d8940affba3 ]
Instead of the generic 'usb2-phy' name, the Armada 37xx device trees
are using a custom 'usb2-utmi-otg-phy' name for the USB2 PHY in the USB3
controller node. Since commit 53a2d95df836 ("usb: core: add phy notify
connect and disconnect"), this triggers a bug [1] in the USB core which
causes double use of the USB3 PHY.
Change the PHY name to 'usb2-phy' in the SoC and in the uDPU specific
dtsi files in order to avoid triggering the bug and also to keep the
names in line with the ones used by other platforms.
Link: https://lore.kernel.org/r/20260330-usb-avoid-usb3-phy-double-use-v1-1-d2113aecb535@gmail.com # [1]
Fixes: 53a2d95df836 ("usb: core: add phy notify connect and disconnect")
Signed-off-by: Gabor Juhos <j4g8y7@gmail.com>
Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Akari Tsuyukusa <akkun11.open@gmail.com>
Date: Thu Mar 12 13:15:28 2026 +0900
arm64: dts: mediatek: mt6795: Fix gpio-ranges pin count
[ Upstream commit c4c4823c8a5baa10b8100b01f49d7c3f4a871689 ]
The gpio-ranges in the MT6795 pinctrl node were incorrectly defined,
therefore, GPIO196 cannot be used.
Correct the range count to match the driver.
Fixes: b888886a4536 ("arm64: dts: mediatek: mt6795: Add pinctrl controller node")
Signed-off-by: Akari Tsuyukusa <akkun11.open@gmail.com>
Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Akari Tsuyukusa <akkun11.open@gmail.com>
Date: Thu Mar 12 13:15:29 2026 +0900
arm64: dts: mediatek: mt7981b: Fix gpio-ranges pin count
[ Upstream commit b62a927f4a46a7f58d88ba3d5fb6e88e1a4b4603 ]
The gpio-ranges in the MT7981B pinctrl node were incorrectly defined,
therefore, pin 56 cannot be used.
Correct the range count to match the driver.
Fixes: 62b24c7fdf0a ("arm64: dts: mediatek: mt7981: add pinctrl")
Signed-off-by: Akari Tsuyukusa <akkun11.open@gmail.com>
Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Akari Tsuyukusa <akkun11.open@gmail.com>
Date: Thu Mar 12 13:15:30 2026 +0900
arm64: dts: mediatek: mt7986a: Fix gpio-ranges pin count
[ Upstream commit 820ed0c1a13c5fafb36232538d793f99a0986ef3 ]
The gpio-ranges in the MT7986A pinctrl node were incorrectly defined,
therefore, pin 100 cannot be used.
Correct the range count to match the driver.
Fixes: c3a064a32ed9 ("arm64: dts: mediatek: add pinctrl support for mt7986a")
Signed-off-by: Akari Tsuyukusa <akkun11.open@gmail.com>
Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Nícolas F. R. A. Prado <nfraprado@collabora.com>
Date: Fri May 2 12:43:22 2025 -0400
arm64: dts: mediatek: mt8365: Describe infracfg-nao as a pure syscon
[ Upstream commit 0651c24658360706c30588cec0a12c05edb03e9a ]
The infracfg-nao register space at 0x1020e000 has different registers
than the infracfg space at 0x10001000, and most importantly, doesn't
contain any clock controls. Therefore it shouldn't use the same
compatible used for the mt8365 infracfg clocks driver:
mediatek,mt8365-infracfg. Since it currently does, probe errors are
reported in the kernel logs:
[ 0.245959] Failed to register clk ifr_pmic_tmr: -EEXIST
[ 0.245998] clk-mt8365 1020e000.infracfg: probe with driver clk-mt8365 failed with error -17
This register space is used only as a syscon for bus control by the
power domain controller, so in order to properly describe it and fix the
errors, set its compatible to a distinct compatible used exclusively as
a syscon, drop the clock-cells, and while at it rename the node to
'syscon' following the naming convention.
Fixes: 6ff945376556 ("arm64: dts: mediatek: Initial mt8365-evk support")
Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
Reviewed-by: David Lechner <dlechner@baylibre.com>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jun Yan <jerrysteve1101@gmail.com>
Date: Mon Mar 30 22:51:11 2026 +0800
arm64: dts: meson-gxl-p230: fix ethernet PHY interrupt number
[ Upstream commit 174a0ef3b33434f475c87e66f37980e39b73805a ]
Correct the interrupt number assigned to the Realtek PHY in the p230
following the same logic as commit 3106507e1004 ("ARM64: dts: meson-gxm:
fix q200 interrupt number"),as reported in [PATCH 0/2] Ethernet PHY
interrupt improvements [1].
[1] https://lore.kernel.org/all/20171202214037.17017-1-martin.blumenstingl@googlemail.com/
Fixes: b94d22d94ad2 ("ARM64: dts: meson-gx: add external PHY interrupt on some platforms")
Signed-off-by: Jun Yan <jerrysteve1101@gmail.com>
Reviewed-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Link: https://patch.msgid.link/20260330145111.115318-1-jerrysteve1101@gmail.com
Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Barnabás Czémán <barnabas.czeman@mainlining.org>
Date: Fri Jan 16 08:07:39 2026 +0100
arm64: dts: qcom: msm8953-xiaomi-daisy: fix backlight
[ Upstream commit 7131f6d909a6546329b71f2bacfdc60cb3e6020e ]
The backlight on this device is connected via 3 strings. Currently,
the DT claims only two are present, which results in visible stripes
on the display (since every third backlight string remains unconfigured).
Fix the number of strings to avoid that.
Fixes: 38d779c26395 ("arm64: dts: qcom: msm8953: Add device tree for Xiaomi Mi A2 Lite")
Signed-off-by: Barnabás Czémán <barnabas.czeman@mainlining.org>
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260116-pmi8950-wled-v3-7-e6c93de84079@mainlining.org
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Barnabás Czémán <barnabas.czeman@mainlining.org>
Date: Fri Jan 16 08:07:37 2026 +0100
arm64: dts: qcom: msm8953-xiaomi-vince: correct wled ovp value
[ Upstream commit 9e87f0eaadccc3fecdf3c3c0334e05694804b5f5 ]
PMI8950 doesn't actually support setting an OVP threshold value of
29.6 V. The closest allowed value is 29.5 V. Set that instead.
Fixes: aa17e707e04a ("arm64: dts: qcom: msm8953: Add device tree for Xiaomi Redmi 5 Plus")
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Signed-off-by: Barnabás Czémán <barnabas.czeman@mainlining.org>
Link: https://lore.kernel.org/r/20260116-pmi8950-wled-v3-5-e6c93de84079@mainlining.org
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: David Heidelberg <david@ixit.cz>
Date: Fri Mar 20 18:33:11 2026 +0100
arm64: dts: qcom: sdm845-xiaomi-beryllium: Mark l1a regulator as powered during boot
[ Upstream commit 3b0dd81eea6b7a239fce456ce4545af76f1a9715 ]
The regulator must be on, since it provides the display subsystem and
therefore the bootloader had turned it on before Linux booted.
Fixes: 77809cf74a8c ("arm64: dts: qcom: Add support for Xiaomi Poco F1 (Beryllium)")
Signed-off-by: David Heidelberg <david@ixit.cz>
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260320-beryllium-booton-v2-1-931d1be21eae@ixit.cz
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Luca Weiss <luca.weiss@fairphone.com>
Date: Thu Mar 19 09:55:00 2026 +0100
arm64: dts: qcom: sm7225-fairphone-fp4: Fix conflicting bias pinctrl
[ Upstream commit be7c1badb0b934cfe88427b1d4ec3eb9f52ba587 ]
The pinctrl nodes from sm6350.dtsi already contain a bias-* property, so
that needs to be deleted, otherwise the dtb will contain two conflicting
bias-* properties.
Reported-by: Conor Dooley <conor@kernel.org>
Closes: https://lore.kernel.org/r/20260310-maritime-silly-05e7b7e03aa6@spud/
Fixes: c4ef464b24c5 ("arm64: dts: qcom: sm7225-fairphone-fp4: Add Bluetooth")
Signed-off-by: Luca Weiss <luca.weiss@fairphone.com>
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20260319-fp4-uart1-fix-v1-1-f6b3fedef583@fairphone.com
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Alexander Koskovich <AKoskovich@pm.me>
Date: Sun Mar 8 04:26:37 2026 +0000
arm64: dts: qcom: sm8250: Add missing CPU7 3.09GHz OPP
[ Upstream commit b683730e27ba4f91986c4c92f5cb7297f1e01a6d ]
This resolves the following error seen on the ASUS ROG Phone 3:
cpu cpu7: Voltage update failed freq=3091200
cpu cpu7: failed to update OPP for freq=3091200
Fixes: 8e0e8016cb79 ("arm64: dts: qcom: sm8250: Add CPU opp tables")
Signed-off-by: Alexander Koskovich <akoskovich@pm.me>
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260307-sm8250-cpu7-opp-v1-1-435f5f6628a1@pm.me
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Vladimir Zapolskiy <vladimir.zapolskiy@linaro.org>
Date: Sat Mar 14 04:37:13 2026 +0200
arm64: dts: qcom: sm8450: Enable UHS-I SDR50 and SDR104 SD card modes
[ Upstream commit db0c5ef1abda6effdc5c85d6688fb6af2b351ae5 ]
The reported problem of some non-working UHS-I speed modes on SM8450
originates in commit 0a631a36f724 ("arm64: dts: qcom: Add device tree
for Sony Xperia 1 IV"), and then it was spread to all SM8450 powered
platforms by commit 9d561dc4e5cc ("arm64: dts: qcom: sm8450: disable
SDHCI SDR104/SDR50 on all boards").
The tests show that the rootcause of the problem was related to an
overclocking of SD cards, and it's fixed later on by commit a27ac3806b0a
("clk: qcom: gcc-sm8450: Use floor ops for SDCC RCGs").
Since then both SDR50 and SDR104 speed modes are working fine on SM8450,
tested on SM8450-HDK:
SDR50 speed mode:
mmc0: new UHS-I speed SDR50 SDHC card at address 0001
mmcblk0: mmc0:0001 00000 14.6 GiB
mmcblk0: p1
% dd if=/dev/mmcblk0p1 of=/dev/null bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 24.6254 s, 43.6 MB/s
SDR104 speed mode:
mmc0: new UHS-I speed SDR104 SDHC card at address 59b4
mmcblk0: mmc0:59b4 USDU1 28.3 GiB
mmcblk0: p1
% dd if=/dev/mmcblk0p1 of=/dev/null bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 12.3266 s, 87.1 MB/s
Remove the restrictions on SD card speed modes from the SM8450 platform
dtsi file and enable UHS-I speed modes.
Fixes: 9d561dc4e5cc ("arm64: dts: qcom: sm8450: disable SDHCI SDR104/SDR50 on all boards")
Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org>
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Signed-off-by: Vladimir Zapolskiy <vladimir.zapolskiy@linaro.org>
Link: https://lore.kernel.org/r/20260314023715.357512-5-vladimir.zapolskiy@linaro.org
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Date: Tue Mar 17 15:41:16 2026 +0100
arm64: dts: qcom: sm8450: Fix GIC_ITS range length
[ Upstream commit 14044fa192c50265bc1f636108371044bbdcf7b7 ]
Currently, the GITS_SGIR register is cut off. Fix it up.
Fixes: fc8b0b9b630d ("arm64: dts: qcom: sm8450 add ITS device tree node")
Signed-off-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org>
Reviewed-by: Abel Vesa <abel.vesa@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260317-topic-its_range_fixup-v1-3-49be8076adb1@oss.qualcomm.com
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Vladimir Zapolskiy <vladimir.zapolskiy@linaro.org>
Date: Sat Mar 14 04:37:14 2026 +0200
arm64: dts: qcom: sm8550: Enable UHS-I SDR50 and SDR104 SD card modes
[ Upstream commit 66b0f024fba0728ddce6916dce173bb1bdd4eab0 ]
The restriction on UHS-I speed modes was added to all SM8550 platforms
by copying it from SM8450 dtsi file, and due to the overclocking of SD
cards it was an actually reproducible problem. Since the latter issue
has been fixed, UHS-I speed modes are working fine on SM8550 boards,
below is the test performed on SM8550-HDK:
SDR50 speed mode:
mmc0: new UHS-I speed SDR50 SDHC card at address 0001
mmcblk0: mmc0:0001 00000 14.6 GiB
mmcblk0: p1
% dd if=/dev/mmcblk0p1 of=/dev/null bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 23.5468 s, 45.6 MB/s
SDR104 speed mode:
mmc0: new UHS-I speed SDR104 SDHC card at address 59b4
mmcblk0: mmc0:59b4 USDU1 28.3 GiB
mmcblk0: p1
% dd if=/dev/mmcblk0p1 of=/dev/null bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 11.9819 s, 89.6 MB/s
Unset the UHS-I speed mode restrictions from the SM8550 platform dtsi
file, there is no indication that the SDHC controller is broken.
Fixes: ffc50b2d3828 ("arm64: dts: qcom: Add base SM8550 dtsi")
Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org>
Signed-off-by: Vladimir Zapolskiy <vladimir.zapolskiy@linaro.org>
Link: https://lore.kernel.org/r/20260314023715.357512-6-vladimir.zapolskiy@linaro.org
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Date: Tue Mar 17 15:41:17 2026 +0100
arm64: dts: qcom: sm8550: Fix GIC_ITS range length
[ Upstream commit 357c559e386705609b6b9dc0544c420e3f91f3a0 ]
Currently, the GITS_SGIR register is cut off. Fix it up.
Fixes: ffc50b2d3828 ("arm64: dts: qcom: Add base SM8550 dtsi")
Signed-off-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org>
Reviewed-by: Abel Vesa <abel.vesa@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260317-topic-its_range_fixup-v1-4-49be8076adb1@oss.qualcomm.com
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Vladimir Zapolskiy <vladimir.zapolskiy@linaro.org>
Date: Sat Mar 14 04:37:10 2026 +0200
arm64: dts: qcom: sm8550: Fix xo clock supply of platform SD host controller
[ Upstream commit 30ac651c69bddbc83cab6d52fc5d2e03bed83282 ]
The expected frequency of SD host controller core supply clock is 19.2MHz,
while RPMH_CXO_CLK clock frequency on SM8650 platform is 38.4MHz.
Apparently the overclocked supply clock could be good enough on some
boards and even with the most of SD cards, however some low-end UHS-I
SD cards in SDR104 mode of the host controller produce I/O errors in
runtime, fortunately this problem is gone, if the "xo" clock frequency
matches the expected 19.2MHz clock rate.
Fixes: ffc50b2d3828 ("arm64: dts: qcom: Add base SM8550 dtsi")
Signed-off-by: Vladimir Zapolskiy <vladimir.zapolskiy@linaro.org>
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org>
Link: https://lore.kernel.org/r/20260314023715.357512-2-vladimir.zapolskiy@linaro.org
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Vladimir Zapolskiy <vladimir.zapolskiy@linaro.org>
Date: Sat Mar 14 04:37:15 2026 +0200
arm64: dts: qcom: sm8650: Enable UHS-I SDR50 and SDR104 SD card modes
[ Upstream commit 93f823e7d48232e62fb8fb74481696609c90244a ]
The restriction on UHS-I speed modes was added to all SM8650 platforms
by copying it from SM8450 and SM8550 dtsi files, and it was an actually
reproducible problem due to the overclocking of SD cards. Since the latter
issue has been fixed in the SM8650 GCC driver, UHS-I speed modes are
working fine on SM8650 boards, below is the test performed on SM8650-HDK:
SDR50 speed mode:
mmc0: new UHS-I speed SDR50 SDHC card at address 0001
mmcblk0: mmc0:0001 00000 14.6 GiB
mmcblk0: p1
% dd if=/dev/mmcblk0p1 of=/dev/null bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 24.8086 s, 43.3 MB/s
SDR104 speed mode:
mmc0: new UHS-I speed SDR104 SDHC card at address 59b4
mmcblk0: mmc0:59b4 USDU1 28.3 GiB
mmcblk0: p1
% dd if=/dev/mmcblk0p1 of=/dev/null bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 12.9448 s, 82.9 MB/s
Unset the UHS-I speed mode restrictions from the SM8550 platform dtsi
file, there is no indication that the SDHC controller is broken.
Fixes: 10e024671295 ("arm64: dts: qcom: sm8650: add interconnect dependent device nodes")
Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org>
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Signed-off-by: Vladimir Zapolskiy <vladimir.zapolskiy@linaro.org>
Link: https://lore.kernel.org/r/20260314023715.357512-7-vladimir.zapolskiy@linaro.org
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Date: Tue Mar 17 15:41:18 2026 +0100
arm64: dts: qcom: sm8650: Fix GIC_ITS range length
[ Upstream commit 6c8e2ca1263d0da5976418ed285eaec430e8d87f ]
Currently, the GITS_SGIR register is cut off. Fix it up.
Fixes: d2350377997f ("arm64: dts: qcom: add initial SM8650 dtsi")
Signed-off-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org>
Reviewed-by: Abel Vesa <abel.vesa@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260317-topic-its_range_fixup-v1-5-49be8076adb1@oss.qualcomm.com
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Vladimir Zapolskiy <vladimir.zapolskiy@linaro.org>
Date: Sat Mar 14 04:37:11 2026 +0200
arm64: dts: qcom: sm8650: Fix xo clock supply of SD host controller
[ Upstream commit 390903efaa057c44fd80e7d9839419c50092018e ]
The expected frequency of SD host controller core supply clock is 19.2MHz,
while RPMH_CXO_CLK clock frequency on SM8650 platform is 38.4MHz.
Apparently the overclocked supply clock could be good enough on some
boards and even with the most of SD cards, however some low-end UHS-I
SD cards in SDR104 mode of the host controller produce I/O errors in
runtime, fortunately this problem is gone, if the "xo" clock frequency
matches the expected 19.2MHz clock rate.
Fixes: 10e024671295 ("arm64: dts: qcom: sm8650: add interconnect dependent device nodes")
Signed-off-by: Vladimir Zapolskiy <vladimir.zapolskiy@linaro.org>
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org>
Link: https://lore.kernel.org/r/20260314023715.357512-3-vladimir.zapolskiy@linaro.org
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Chris Morgan <macromorgan@hotmail.com>
Date: Tue Mar 10 08:46:48 2026 -0500
arm64: dts: rockchip: Correct Fan Supply for Gameforce Ace
[ Upstream commit c7079215b7dbf88b84a95ff13982bf3dab3cfbe1 ]
Correct the regulator providing power to the PWM controlled fan.
Without this fix the fan only runs when the audio path is playing
audio (because the speaker amplifier and PWM fan share the same
regulator).
Fixes: 4e946c447a04 ("arm64: dts: rockchip: Add GameForce Ace")
Signed-off-by: Chris Morgan <macromorgan@hotmail.com>
Link: https://patch.msgid.link/20260310134648.550006-1-macroalpha82@gmail.com
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Chris Morgan <macromorgan@hotmail.com>
Date: Tue Mar 10 08:49:19 2026 -0500
arm64: dts: rockchip: Correct Joystick Axes on Gameforce Ace
[ Upstream commit c337c1b561c1c3016d30776d7dc2032ea4979334 ]
The Gameforce Ace's joystick axes were set incorrectly initially,
getting the X/Y and RX/RY axes backwards. Additionally, correct the
RY axis so that it is inverted.
All axes tested with evtest and outputting correct values.
Fixes: 4e946c447a04 ("arm64: dts: rockchip: Add GameForce Ace")
Reported-by: sydarn <sydarn@proton.me>
Signed-off-by: Chris Morgan <macromorgan@hotmail.com>
Link: https://patch.msgid.link/20260310134919.550023-1-macroalpha82@gmail.com
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ming Wang <wangming5719@gmail.com>
Date: Fri Feb 6 17:04:53 2026 +0800
arm64: dts: rockchip: Fix Bluetooth stability on LCKFB TaiShan Pi
[ Upstream commit 861a9593e10bb6ab2a492b315c8a2a3aad70ac00 ]
The AP6212 WiFi/BT module on the LCKFB TaiShan Pi (RK3566) is prone to
communication timeouts and reset failures (error -110) when operating at
3 Mbps.
This patch stabilizes the Bluetooth interface by:
1. Updating the compatible string to 'brcm,bcm43430a1-bt' to better reflect
the actual chip revision used in the AP6212 module.
2. Lowering the maximum UART baud rate from 3,000,000 to 1,500,000 bps.
Tests show that 1.5 Mbps is the reliable upper limit for this board's
UART configuration, eliminating the initialization timeouts.
Fixes: 251e5ade9ba4 ("arm64: dts: rockchip: add dts for LCKFB Taishan Pi RK3566")
Signed-off-by: Ming Wang <wangming5719@gmail.com>
Link: https://patch.msgid.link/20260206090453.1041919-1-wming126@126.com
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Heiko Stuebner <heiko.stuebner@cherry.de>
Date: Tue Feb 10 09:03:02 2026 +0100
arm64: dts: rockchip: Make Jaguar PCIe-refclk pin use pull-up config
[ Upstream commit f45d4356feeba1c8dac3414b688f59292ddfc9f9 ]
The hardware PU/PD config of the pin after reset is to pull-up and on
Jaguar this will also keep the device in reset until the driver actually
enables the pin. So restore this boot pull-up config of the pin on Jaguar
instead of setting it to pull-none.
Suggested-by: Quentin Schulz <quentin.schulz@cherry.de>
Fixes: 0ec7e1096332 ("arm64: dts: rockchip: add PCIe3 support on rk3588-jaguar")
Signed-off-by: Heiko Stuebner <heiko.stuebner@cherry.de>
Reviewed-by: Shawn Lin <shawn.lin@rock-chips.com>
Reviewed-by: Quentin Schulz <quentin.schulz@cherry.de>
Link: https://patch.msgid.link/20260210080303.680403-5-heiko@sntech.de
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Judith Mendez <jm@ti.com>
Date: Mon Feb 23 17:37:31 2026 -0600
arm64: dts: ti: k3-am62-lp-sk: Enable internal pulls for MMC0 data pins
[ Upstream commit ee2a9d9c9e6c9643fb7e45febcaedfbc038e483a ]
AM62 LP SK board does not have external pullups on MMC0 DAT1-DAT7
pins [0]. Enable internal pullups on DAT1-DAT7 considering:
- without a host-side pullup, these lines rely solely on the eMMC
device's internal pullup (R_int, 10-150K per JEDEC), which may
exceed the recommended 50K max for 1.8V VCCQ
- JEDEC JESD84-B51 Table 200 requires host-side pullups (R_DAT,
10K-100K) on all data lines to prevent bus floating
[0] https://www.ti.com/lit/zip/SPRR471
Fixes: a0b8da04153e ("arm64: dts: ti: k3-am62*: Move eMMC pinmux to top level board file")
Signed-off-by: Judith Mendez <jm@ti.com>
Reviewed-by: Moteen Shah <m-shah@ti.com>
Link: https://patch.msgid.link/20260223233731.2690472-4-jm@ti.com
Signed-off-by: Vignesh Raghavendra <vigneshr@ti.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Francesco Dolcini <francesco.dolcini@toradex.com>
Date: Tue Mar 24 10:36:57 2026 +0100
arm64: dts: ti: k3-am62-verdin: Fix SPI_1 GPIO CS pinctrl label
[ Upstream commit 944dffaec1ef0f21c203728de77b5618ed70df6e ]
Fix SPI_1_CS GPIO pinmux label, this is spi1_cs, not qspi1_io4.
There are no user of this label yet, therefore this change does not
create any compatibility issue.
Fixes: fcb335934c51 ("arm64: dts: ti: verdin-am62: Improve spi1 chip-select pinctrl")
Signed-off-by: Francesco Dolcini <francesco.dolcini@toradex.com>
Link: https://patch.msgid.link/20260324093705.26730-3-francesco@dolcini.it
Signed-off-by: Vignesh Raghavendra <vigneshr@ti.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Judith Mendez <jm@ti.com>
Date: Mon Feb 23 17:37:29 2026 -0600
arm64: dts: ti: k3-am62p5-sk: Disable MMC1 internal pulls on data pins
[ Upstream commit 6d4441be969bea89bb9702781f5dfb3a8f2a02a4 ]
AM62P SK has external 10K pullups on MMC1 DAT1-DAT3 pins [0].
Disable internal pullups on DAT1-DAT3 so that each line has a
single pullup source:
- with both pullups enabled, the effective parallel resistance on
DAT1-3 (~8.33K) drops below the 10K minimum pullup requirement
for data lines (per SD Physical Layer Specification)
- removing internal pullups makes DAT1-3 match DAT0 10K
external pullup so its consistent and within spec
- both internal and external pullups enabled equals unnecessary power
consumption
[0] https://www.ti.com/lit/zip/SPRR487
Fixes: c00504ea42c0 ("arm64: dts: ti: k3-am62p5-sk: Updates for SK EVM")
Signed-off-by: Judith Mendez <jm@ti.com>
Reviewed-by: Moteen Shah <m-shah@ti.com>
Link: https://patch.msgid.link/20260223233731.2690472-2-jm@ti.com
Signed-off-by: Vignesh Raghavendra <vigneshr@ti.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Wang Wensheng <wsw9603@163.com>
Date: Sun Apr 5 19:42:31 2026 +0800
arm64: kexec: Remove duplicate allocation for trans_pgd
[ Upstream commit ee020bf6f14094c9ae434bb37e6957a1fdad513c ]
trans_pgd would be allocated in trans_pgd_create_copy(), so remove the
duplicate allocation before calling trans_pgd_create_copy().
Fixes: 3744b5280e67 ("arm64: kexec: install a copy of the linear-map")
Signed-off-by: Wang Wensheng <wsw9603@163.com>
Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Zhaoyang Huang <zhaoyang.huang@unisoc.com>
Date: Thu Apr 30 16:58:08 2026 +0800
arm64: Reserve an extra page for early kernel mapping
[ Upstream commit 4d8e74ad4585672489da6145b3328d415f50db82 ]
The final part of [data, end) segment may overflow into the next page of
init_pg_end[1] which is the gap page before early_init_stack[2]:
[1]
crash_arm64_v9.0.1> vtop ffffffed00601000
VIRTUAL PHYSICAL
ffffffed00601000 83401000
PAGE DIRECTORY: ffffffecffd62000
PGD: ffffffecffd62da0 => 10000000833fb003
PMD: ffffff80033fb018 => 10000000833fe003
PTE: ffffff80033fe008 => 68000083401f03
PAGE: 83401000
PTE PHYSICAL FLAGS
68000083401f03 83401000 (VALID|SHARED|AF|NG|PXN|UXN)
PAGE PHYSICAL MAPPING INDEX CNT FLAGS
fffffffec00d0040 83401000 0 0 1 4000 reserved
[2]
ffffffed002c8000 (r) __pi__data
ffffffed0054e000 (d) __pi___bss_start
ffffffed005f5000 (b) __pi_init_pg_dir
ffffffed005fe000 (b) __pi_init_pg_end
ffffffed005ff000 (B) early_init_stack
ffffffed00608000 (b) __pi__end
For 4K pages, the early kernel mapping may use 2MB block entries but the
kernel segments are only 64KB aligned. Segment boundaries that fall
within a 2MB block therefore require a PTE table so that different
attributes can be applied on either side of the boundary.
KERNEL_SEGMENT_COUNT still correctly counts the five permanent kernel
VMAs registered by declare_kernel_vmas(). However, since commit
5973a62efa34 ("arm64: map [_text, _stext) virtual address range
non-executable+read-only"), the early mapper also maps [_text, _stext)
separately from [_stext, _etext). This adds one more early-only split
and can require one more page-table page than the existing
EARLY_SEGMENT_EXTRA_PAGES allowance reserves.
Increase the 4K-page early mapping allowance by one page to cover that
additional split.
Fixes: 5973a62efa34 ("arm64: map [_text, _stext) virtual address range non-executable+read-only")
Assisted-by: TRAE:GLM-5.1
Suggested-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Zhaoyang Huang <zhaoyang.huang@unisoc.com>
[catalin.marinas@arm.com: rewrote part of the commit log]
[catalin.marinas@arm.com: expanded the code comment]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Frank Li <Frank.Li@nxp.com>
Date: Wed Feb 11 18:12:55 2026 -0500
ARM: dts: imx27-eukrea: replace interrupts with interrupts-extended
[ Upstream commit 0477a6b31e2874e554e3bcfac9883684b8f8ca2d ]
The property interrupts use default interrupt controllers. But pass down
gpio<n> as phandle. Correct it by use interrupts-extended.
Fixes: d8cae888aa2bc ("ARM: dts: Add support for the cpuimx27 board from Eukrea and its baseboard")
Signed-off-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Rafał Miłecki <rafal@milecki.pl>
Date: Tue Feb 24 09:25:41 2026 +0100
ARM: dts: mediatek: mt7623: fix efuse fallback compatible
[ Upstream commit 5978ff33cc6f0988388a2830dc5cd2ea4e81f36a ]
Fix following validation error:
arch/arm/boot/dts/mediatek/mt7623a-rfb-emmc.dtb: efuse@10206000: compatible: 'oneOf' conditional failed, one must be fixed:
['mediatek,mt7623-efuse', 'mediatek,mt8173-efuse'] is too long
'mediatek,mt8173-efuse' was expected
'mediatek,efuse' was expected
from schema $id: http://devicetree.org/schemas/nvmem/mediatek,efuse.yaml#
arch/arm/boot/dts/mediatek/mt7623a-rfb-emmc.dtb: efuse@10206000: Unevaluated properties are not allowed ('compatible' was unexpected)
from schema $id: http://devicetree.org/schemas/nvmem/mediatek,efuse.yaml#
Fixes: 43c7a91b4b3a ("arm: dts: mt7623: add efuse nodes to the mt7623.dtsi file")
Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Aaro Koskinen <aaro.koskinen@iki.fi>
Date: Fri Mar 27 19:15:10 2026 +0200
ARM: OMAP1: Fix DEBUG_LL and earlyprintk on OMAP16XX
[ Upstream commit 7e74b606dd39c46d4378d6f6563f560a00ab8694 ]
On OMAP16XX, the UART enable bit shifts are written instead of the actual
bits. This breaks the boot when DEBUG_LL and earlyprintk is enabled;
the UART gets disabled and some random bits get enabled. Fix that.
Fixes: 34c86239b184 ("ARM: OMAP1: clock: Fix early UART rate issues")
Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Link: https://patch.msgid.link/aca7HnXZ-aCSJPW7@darkstar.musicnaut.iki.fi
Signed-off-by: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Date: Wed Nov 6 00:10:44 2024 +0000
ASoC: add symmetric_ prefix for dai->rate/channels/sample_bits
[ Upstream commit 1bd775da9ba919b87b2313a78d5957afc1a62dde ]
snd_soc_dai has rate/channels/sample_bits parameter, but it is only valid
if symmetry is being enforced by symmetric_xxx flag on driver.
It is very difficult to know about it from current naming, and easy to
misunderstand it. add symmetric_ prefix for it.
Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Link: https://patch.msgid.link/87zfmd8bnf.wl-kuninori.morimoto.gx@renesas.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Stable-dep-of: 07c774dd64ba ("ASoC: soc-compress: use function to clear symmetric params")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Guilherme G. Piccoli <gpiccoli@igalia.com>
Date: Thu Apr 23 15:30:58 2026 -0300
ASoC: amd: acp: Add DMI quirk for Valve Steam Deck OLED
[ Upstream commit b0f6f4ac7d5d04fe2adcdd63ed1cd1ad505b8958 ]
Commit 671dd2ffbd8b ("ASoC: amd: acp: Add new cpu dai and dailink creation for I2S BT instance")
introduced a change that "broke" Steam Deck's audio probe, in the OLED
model, as observed in the following dmesg snippet:
[...]
snd_sof_amd_vangogh 0000:04:00.5: Topology: ABI 3:26:0 Kernel ABI 3:23:1
sof_mach nau8821-max: ASoC: physical link acp-bt-codec (id 2) not exist
sof_mach nau8821-max: ASoC: topology: could not load header: -22
snd_sof_amd_vangogh 0000:04:00.5: tplg amd/sof-tplg/sof-vangogh-nau8821-max.tplg component load failed -22
snd_sof_amd_vangogh 0000:04:00.5: error: failed to load DSP topology -22
snd_sof_amd_vangogh 0000:04:00.5: ASoC error (-22): at snd_soc_component_probe() on 0000:04:00.5
sof_mach nau8821-max: ASoC: failed to instantiate card -22
sof_mach nau8821-max: error -EINVAL: Failed to register card(sof-nau8821-max)
sof_mach nau8821-max: probe with driver sof_mach failed with error -22
[...]
Notice the quotes in "broke": it's not really a bug in such commit,
but instead a problem with a topology file from Steam Deck OLED. This
was discussed to great extent in [1], and Cristian proposed a pretty
simple and functional change that resolved the issue for the Deck's
issue. That change, though, would break other devices, so it wasn't
accepted upstream. And the proper suggested solution (fix the topology)
was never implemented, so Valve's kernel (and anyone that wants to boot
the mainline on Steam Deck OLED) is carrying that fix downstream.
So, we propose hereby a different approach: a DMI quirk, as many already
present in the sound drivers, to address this issue solely on Steam Deck
OLED, not breaking other devices and as a bonus, allowing simple patch
up in case eventually the topology file gets fixed (we'd just need to
check against any DMI info reflecting that or the topology/FW versions).
The motivation of such upstream quirk is related to users that want
to test latest kernel trees on their devices and get no only non-working
sound device, but seems some games (like Ori and the Blind Forest)
can't properly work without a proper functional audio device.
Example of such report can be seen at [2].
Cc: Mark Brown <broonie@kernel.org>
Cc: Robert Beckett <bob.beckett@collabora.com>
Cc: Umang Jain <uajain@igalia.com>
Fixes: 671dd2ffbd8b ("ASoC: amd: acp: Add new cpu dai and dailink creation for I2S BT instance")
Link: https://lore.kernel.org/r/20231209205351.880797-11-cristian.ciocaltea@collabora.com/ [1]
Link: https://bugzilla.kernel.org/show_bug.cgi?id=218677 [2]
Reviewed-by: Cristian Ciocaltea <cristian.ciocaltea@collabora.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Tested-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com>
Link: https://patch.msgid.link/20260423183505.116445-1-gpiccoli@igalia.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Christian A. Ehrhardt <christian.ehrhardt@codasip.com>
Date: Tue Apr 28 21:22:49 2026 +0200
ASoC: codecs: ab8500: Fix casting of private data
[ Upstream commit a201aef1a88b675e9eb8487e27d14e2eef3cef80 ]
ab8500_filter_controls[i].private_value is initialized using
.private_value = (unsigned long)&(struct filter_control)
{.count = xcount, .min = xmin, .max = xmax}
thus it's a pointer to a struct filter_control casted to unsigned long.
So to get back that pointer .private_data must be cast back, not its
address.
Fixes: 679d7abdc754 ("ASoC: codecs: Add AB8500 codec-driver")
Signed-off-by: Christian A. Ehrhardt <christian.ehrhardt@codasip.com>
Signed-off-by: Uwe Kleine-König (The Capable Hub) <u.kleine-koenig@baylibre.com>
Link: https://patch.msgid.link/20260428192255.2294705-2-u.kleine-koenig@baylibre.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Shengjiu Wang <shengjiu.wang@nxp.com>
Date: Wed Apr 1 17:42:26 2026 +0800
ASoC: fsl_easrc: Change the type for iec958 channel status controls
[ Upstream commit 47f28a5bd154a95d5aa563dde02a801bd32ddb81 ]
Use the type SNDRV_CTL_ELEM_TYPE_IEC958 for iec958 channel status
controls, the original type will cause mixer-test to iterate all 32bit
values, which costs a lot of time. And using IEC958 type can reduce the
control numbers.
Also enable pm runtime before updating registers to make the regmap cache
data align with the value in hardware.
Fixes: 955ac624058f ("ASoC: fsl_easrc: Add EASRC ASoC CPU DAI drivers")
Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com>
Link: https://patch.msgid.link/20260401094226.2900532-12-shengjiu.wang@nxp.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Shengjiu Wang <shengjiu.wang@nxp.com>
Date: Wed Apr 1 17:42:24 2026 +0800
ASoC: fsl_easrc: Check the variable range in fsl_easrc_iec958_put_bits()
[ Upstream commit 00541b86fb578d4949cfdd6aff1f82d43fcf07af ]
Add check of input value's range in fsl_easrc_iec958_put_bits(),
otherwise the wrong value may be written from user space.
Fixes: 955ac624058f ("ASoC: fsl_easrc: Add EASRC ASoC CPU DAI drivers")
Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com>
Link: https://patch.msgid.link/20260401094226.2900532-10-shengjiu.wang@nxp.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Shengjiu Wang <shengjiu.wang@nxp.com>
Date: Wed Apr 1 17:42:25 2026 +0800
ASoC: fsl_easrc: Fix value type in fsl_easrc_iec958_get_bits()
[ Upstream commit aa21fe4a81458cf469c2615b08cbde5997dde25a ]
The value type of controls "Context 0 IEC958 Bits Per Sample" should be
integer, not enumerated, the issue is found by the mixer-test.
Fixes: 955ac624058f ("ASoC: fsl_easrc: Add EASRC ASoC CPU DAI drivers")
Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com>
Link: https://patch.msgid.link/20260401094226.2900532-11-shengjiu.wang@nxp.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Shengjiu Wang <shengjiu.wang@nxp.com>
Date: Wed Apr 1 17:42:16 2026 +0800
ASoC: fsl_micfil: Add access property for "VAD Detected"
[ Upstream commit c7661bfc7422443df394c01e069ae4e5c3a7f04c ]
Add access property SNDRV_CTL_ELEM_ACCESS_READ for control "VAD
Detected", which doesn't support put operation, otherwise there will be
issue with mixer-test.
Fixes: 29dbfeecab85 ("ASoC: fsl_micfil: Add Hardware Voice Activity Detector support")
Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com>
Link: https://patch.msgid.link/20260401094226.2900532-2-shengjiu.wang@nxp.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Shengjiu Wang <shengjiu.wang@nxp.com>
Date: Wed Apr 1 17:42:17 2026 +0800
ASoC: fsl_micfil: Fix event generation in hwvad_put_enable()
[ Upstream commit 59b9061824f2179fe133e2636203548eaba3e528 ]
ALSA controls should return 1 if the value in the control changed but the
control put operation hwvad_put_enable() only returns 0 or a negative
error code, causing ALSA to not generate any change events.
Add a suitable check in the function before updating the vad_enabled
variable.
Fixes: 29dbfeecab85 ("ASoC: fsl_micfil: Add Hardware Voice Activity Detector support")
Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com>
Link: https://patch.msgid.link/20260401094226.2900532-3-shengjiu.wang@nxp.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Shengjiu Wang <shengjiu.wang@nxp.com>
Date: Wed Apr 1 17:42:18 2026 +0800
ASoC: fsl_micfil: Fix event generation in hwvad_put_init_mode()
[ Upstream commit 7e226209906906421f0d952d7304e48fdb0adabc ]
ALSA controls should return 1 if the value in the control changed but the
control put operation hwvad_put_init_mode() only returns 0 or a negative
error code, causing ALSA to not generate any change events.
Add a suitable check in the function before updating the vad_init_mode
variable.
Fixes: 29dbfeecab85 ("ASoC: fsl_micfil: Add Hardware Voice Activity Detector support")
Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com>
Link: https://patch.msgid.link/20260401094226.2900532-4-shengjiu.wang@nxp.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Shengjiu Wang <shengjiu.wang@nxp.com>
Date: Wed Apr 1 17:42:20 2026 +0800
ASoC: fsl_micfil: Fix event generation in micfil_put_dc_remover_state()
[ Upstream commit 7d2bd35100de370dc326b250e8f6b66bee06a2f3 ]
ALSA controls should return 1 if the value in the control changed but the
control put operation micfil_put_dc_remover_state() only returns 0 or a
negative error code, causing ALSA to not generate any change events.
return the value of snd_soc_component_update_bits() directly, as it has
the capability of return check status of changed or not.
Also enable pm runtime before calling the function
snd_soc_component_update_bits() to make the regmap cache data align with
the value in hardware.
Fixes: 29dbfeecab85 ("ASoC: fsl_micfil: Add Hardware Voice Activity Detector support")
Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com>
Link: https://patch.msgid.link/20260401094226.2900532-6-shengjiu.wang@nxp.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Shengjiu Wang <shengjiu.wang@nxp.com>
Date: Wed Apr 1 17:42:21 2026 +0800
ASoC: fsl_micfil: Fix event generation in micfil_quality_set()
[ Upstream commit e5785093b1b45af7ee57d18619b2854a8aed073a ]
ALSA controls should return 1 if the value in the control changed but the
control put operation micfil_quality_set() only returns 0 or a negative
error code, causing ALSA to not generate any change events.
Add a suitable check in the function before updating the quality variable.
Also enable pm runtime before calling the function micfil_set_quality()
to make the regmap cache data align with the value in hardware.
Fixes: bea1d61d5892 ("ASoC: fsl_micfil: rework quality setting")
Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com>
Link: https://patch.msgid.link/20260401094226.2900532-7-shengjiu.wang@nxp.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Shengjiu Wang <shengjiu.wang@nxp.com>
Date: Wed Apr 1 17:42:22 2026 +0800
ASoC: fsl_xcvr: Fix event generation in fsl_xcvr_arc_mode_put()
[ Upstream commit 1b61c8103c9317a9c37fe544c2d83cee1c281149 ]
ALSA controls should return 1 if the value in the control changed but the
control put operation fsl_xcvr_arc_mode_put() only returns 0 or a negative
error code, causing ALSA to not generate any change events.
Add a suitable check in the function before updating the arc_mode
variable.
Fixes: 28564486866f ("ASoC: fsl_xcvr: Add XCVR ASoC CPU DAI driver")
Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com>
Link: https://patch.msgid.link/20260401094226.2900532-8-shengjiu.wang@nxp.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Shengjiu Wang <shengjiu.wang@nxp.com>
Date: Wed Apr 1 17:42:23 2026 +0800
ASoC: fsl_xcvr: Fix event generation in fsl_xcvr_mode_put()
[ Upstream commit 64a496ba976324615b845d60739dfcdae3d57434 ]
ALSA controls should return 1 if the value in the control changed but the
control put operation fsl_xcvr_mode_put() only returns 0 or a negative
error code, causing ALSA to not generate any change events.
Add a suitable check in the function before updating the mode variable.
Fixes: 28564486866f ("ASoC: fsl_xcvr: Add XCVR ASoC CPU DAI driver")
Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com>
Link: https://patch.msgid.link/20260401094226.2900532-9-shengjiu.wang@nxp.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Srinivas Kandagatla <srinivas.kandagatla@oss.qualcomm.com>
Date: Thu Apr 2 08:11:08 2026 +0000
ASoC: qcom: qdsp6: topology: check widget type before accessing data
[ Upstream commit d5bfdd28e0cdd45043ae6e0ac168a451d59283dc ]
Check widget type before accessing the private data, as this could a
virtual widget which is no associated with a dsp graph, container and
module. Accessing witout check could lead to incorrect memory access.
Fixes: 36ad9bf1d93d ("ASoC: qdsp6: audioreach: add topology support")
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@oss.qualcomm.com>
Link: https://patch.msgid.link/20260402081118.348071-4-srinivas.kandagatla@oss.qualcomm.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Denis Rastyogin <gerben@altlinux.org>
Date: Fri Mar 27 13:33:11 2026 +0300
ASoC: rsnd: Fix potential out-of-bounds access of component_dais[]
[ Upstream commit f9e437cddf6cf9e603bdaefe148c1f4792aaf39c ]
component_dais[RSND_MAX_COMPONENT] is initially zero-initialized
and later populated in rsnd_dai_of_node(). However, the existing boundary check:
if (i >= RSND_MAX_COMPONENT)
does not guarantee that the last valid element remains zero. As a result,
the loop can rely on component_dais[RSND_MAX_COMPONENT] being zero,
which may lead to an out-of-bounds access.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Fixes: 547b02f74e4a ("ASoC: rsnd: enable multi Component support for Audio Graph Card/Card2")
Signed-off-by: Denis Rastyogin <gerben@altlinux.org>
Acked-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Link: https://patch.msgid.link/20260327103311.459239-1-gerben@altlinux.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Date: Thu Feb 19 04:53:52 2026 +0000
ASoC: soc-compress: use function to clear symmetric params
[ Upstream commit 07c774dd64ba0c605dbf844132122e3edbdbea93 ]
Current soc-compress.c clears symmetric_rate, but it clears rate only,
not clear other symmetric_channels/sample_bits.
static int soc_compr_clean(...)
{
...
if (!snd_soc_dai_active(cpu_dai))
=> cpu_dai->symmetric_rate = 0;
if (!snd_soc_dai_active(codec_dai))
=> codec_dai->symmetric_rate = 0;
...
};
This feature was added when v3.7 kernel [1], and there was only
symmetric_rate, no symmetric_channels/sample_bits in that timing.
symmetric_channels/sample_bits were added in v3.14 [2],
but I guess it didn't notice that soc-compress.c is updating symmetric_xxx.
We are clearing symmetry_xxx by soc_pcm_set_dai_params(), but is soc-pcm.c
local function. Makes it global function and clear symmetry_xxx by it.
[1] commit 1245b7005de02 ("ASoC: add compress stream support")
[2] commit 3635bf09a89cf ("ASoC: soc-pcm: add symmetry for channels and
sample bits")
Fixes: 3635bf09a89c ("ASoC: soc-pcm: add symmetry for channels and sample bits")
Cc: Nicolin Chen <b42378@freescale.com>
Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Link: https://patch.msgid.link/87ms15e3kv.wl-kuninori.morimoto.gx@renesas.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Cássio Gabriel <cassiogabrielcontato@gmail.com>
Date: Wed Mar 25 17:05:11 2026 -0300
ASoC: SOF: compress: return the configured codec from get_params
[ Upstream commit 2c4fdd055f92a2fc8602dcd88bcea08c374b7e8b ]
The SOF compressed offload path accepts codec parameters in
sof_compr_set_params() and forwards them to firmware as
extended data in the SOF IPC stream params message.
However, sof_compr_get_params() still returns success without
filling the snd_codec structure. Since the compress core allocates
that structure zeroed and copies it back to userspace on success,
SNDRV_COMPRESS_GET_PARAMS returns an all-zero codec description
even after the stream has been configured successfully.
The stale TODO in this callback conflates get_params() with capability
discovery. Supported codec enumeration belongs in get_caps() and
get_codec_caps(). get_params() should report the current codec settings.
Cache the codec accepted by sof_compr_set_params() in the per-stream SOF
compress state and return it from sof_compr_get_params().
Fixes: 6324cf901e14 ("ASoC: SOF: compr: Add compress ops implementation")
Signed-off-by: Cássio Gabriel <cassiogabrielcontato@gmail.com>
Link: https://patch.msgid.link/20260325-sof-compr-get-params-v1-1-0758815f13c7@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ethan Tidmore <ethantidmore06@gmail.com>
Date: Tue Mar 24 12:38:30 2026 -0500
ASoC: SOF: Intel: hda: Place check before dereference
[ Upstream commit 6cbc8360f51a3df2ea16a786b262b9fe44d4c68c ]
The struct hext_stream is dereferenced before it is checked for NULL.
Although it can never be NULL due to a check prior to
hda_dsp_iccmax_stream_hw_params() being called, this change clears any
confusion regarding hext_stream possibly being NULL.
Check hext_stream for NULL and then assign its members.
Detected by Smatch:
sound/soc/sof/intel/hda-stream.c:488 hda_dsp_iccmax_stream_hw_params() warn:
variable dereferenced before check 'hext_stream' (see line 486)
Fixes: aca961f196e5d ("ASoC: SOF: Intel: hda: Add helper function to program ICCMAX stream")
Signed-off-by: Ethan Tidmore <ethantidmore06@gmail.com>
Link: https://patch.msgid.link/20260324173830.17563-1-ethantidmore06@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Daniel Baluta <daniel.baluta@nxp.com>
Date: Thu Sep 26 12:02:52 2024 +0300
ASoC: SOF: ipc3: Use standard dev_dbg API
[ Upstream commit 55c39835ee0ef94593a78f6ea808138d476f3b81 ]
Use standard dev_dbg API because it gives better debugging
information and allows dynamic control of prints.
Signed-off-by: Daniel Baluta <daniel.baluta@nxp.com>
Link: https://patch.msgid.link/20240926090252.106040-1-daniel.baluta@nxp.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Stable-dep-of: 07c774dd64ba ("ASoC: soc-compress: use function to clear symmetric params")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Sander Vanheule <sander@svanheule.net>
Date: Fri Feb 20 16:26:33 2026 +0100
ASoC: sti: Return errors from regmap_field_alloc()
[ Upstream commit 272aabef50bc3fe58edd26de000f4cdd41bdbe60 ]
When regmap_field_alloc() fails, it can return an error. Specifically,
it will return PTR_ERR(-ENOMEM) when the allocation returns a NULL
pointer. The code then uses these allocations with a simple NULL check:
if (player->clk_sel) {
// May dereference invalid pointer (-ENOMEM)
err = regmap_field_write(player->clk_sel, ...);
}
Ensure initialization fails by forwarding the errors from
regmap_field_alloc(), thus avoiding the use of the invalid pointers.
Fixes: 76c2145ded6b ("ASoC: sti: Add CPU DAI driver for playback")
Signed-off-by: Sander Vanheule <sander@svanheule.net>
Link: https://patch.msgid.link/20260220152634.480766-2-sander@svanheule.net
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Sander Vanheule <sander@svanheule.net>
Date: Fri Feb 20 16:26:34 2026 +0100
ASoC: sti: use managed regmap_field allocations
[ Upstream commit 1696fad8b259a2d46e51cd6e17e4bcdbe02279fa ]
The regmap_field objects allocated at player init are never freed and
may leak resources if the driver is removed.
Switch to devm_regmap_field_alloc() to automatically limit the lifetime
of the allocations the lifetime of the device.
Fixes: 76c2145ded6b ("ASoC: sti: Add CPU DAI driver for playback")
Signed-off-by: Sander Vanheule <sander@svanheule.net>
Link: https://patch.msgid.link/20260220152634.480766-3-sander@svanheule.net
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Igor Pylypiv <ipylypiv@google.com>
Date: Sun Apr 12 08:36:37 2026 -0700
ata: libata-scsi: fix requeue of deferred ATA PASS-THROUGH commands
[ Upstream commit 8ebf408e7d463eee02c348a3c8277b95587b710d ]
Commit 0ea84089dbf6 ("ata: libata-scsi: avoid Non-NCQ command starvation")
introduced ata_scsi_requeue_deferred_qc() to handle commands deferred
during resets or NCQ failures. This deferral logic completed commands
with DID_SOFT_ERROR to trigger a retry in the SCSI mid-layer.
However, DID_SOFT_ERROR is subject to scsi_cmd_retry_allowed() checks.
ATA PASS-THROUGH commands sent via SG_IO ioctl have scmd->allowed set
to zero. This causes the mid-layer to fail the command immediately
instead of retrying, even though the command was never actually issued
to the hardware.
Switch to DID_REQUEUE to ensure these commands are inserted back into
the request queue regardless of retry limits.
Fixes: 0ea84089dbf6 ("ata: libata-scsi: avoid Non-NCQ command starvation")
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Igor Pylypiv <ipylypiv@google.com>
Signed-off-by: Niklas Cassel <cassel@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Sergio Correia <scorreia@redhat.com>
Date: Tue May 12 14:28:59 2026 +0100
audit: enforce AUDIT_LOCKED for AUDIT_TRIM and AUDIT_MAKE_EQUIV
commit f9e1c1324b4d98d591a6f7568fdebf5cf456dfc2 upstream.
AUDIT_ADD_RULE and AUDIT_DEL_RULE correctly check for AUDIT_LOCKED
and return -EPERM, but AUDIT_TRIM and AUDIT_MAKE_EQUIV do not. This
allows a process with CAP_AUDIT_CONTROL to modify directory tree
watches and equivalence mappings even when the audit configuration
has been locked, undermining the purpose of the lock.
Add AUDIT_LOCKED checks to both commands.
Cc: stable@vger.kernel.org
Reviewed-by: Ricardo Robaina <rrobaina@redhat.com>
Assisted-by: Claude:claude-opus-4-6
Signed-off-by: Sergio Correia <scorreia@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Sergio Correia <scorreia@redhat.com>
Date: Tue May 12 14:28:33 2026 +0100
audit: fix incorrect inheritable capability in CAPSET records
commit e4a640475e43f406fdfd56d370b1f34b0cbbc18d upstream.
__audit_log_capset() records the effective capability set into the
inheritable field due to a copy-paste error. Every CAPSET audit
record therefore reports cap_pi (process inheritable) with the value
of cap_effective instead of cap_inheritable.
This silently corrupts audit data used for compliance and forensic
analysis: an attacker who modifies inheritable capabilities to
prepare for a privilege-escalating exec would have the change masked
in the audit trail.
The bug has been present since the original introduction of CAPSET
audit records in 2008.
Cc: stable@vger.kernel.org
Fixes: e68b75a027bb ("When the capset syscall is used it is not possible for audit to record the actual capbilities being added/removed. This patch adds a new record type which emits the target pid and the eff, inh, and perm cap sets.")
Reviewed-by: Ricardo Robaina <rrobaina@redhat.com>
Assisted-by: Claude:claude-opus-4-6
Signed-off-by: Sergio Correia <scorreia@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Chen Ni <nichen@iscas.ac.cn>
Date: Tue Feb 3 10:16:25 2026 +0800
backlight: sky81452-backlight: Check return value of devm_gpiod_get_optional() in sky81452_bl_parse_dt()
[ Upstream commit 797cc011ae02bda26f93d25a4442d7a1a77d84df ]
The devm_gpiod_get_optional() function may return an ERR_PTR in case of
genuine GPIO acquisition errors, not just NULL which indicates the
legitimate absence of an optional GPIO.
Add an IS_ERR() check after the call in sky81452_bl_parse_dt(). On
error, return the error code to ensure proper failure handling rather
than proceeding with invalid pointers.
Fixes: e1915eec54a6 ("backlight: sky81452: Convert to GPIO descriptors")
Signed-off-by: Chen Ni <nichen@iscas.ac.cn>
Reviewed-by: Linus Walleij <linusw@kernel.org>
Reviewed-by: Daniel Thompson (RISCstar) <danielt@kernel.org>
Link: https://patch.msgid.link/20260203021625.578678-1-nichen@iscas.ac.cn
Signed-off-by: Lee Jones <lee@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Weiming Shi <bestswngs@gmail.com>
Date: Sun Apr 26 09:53:51 2026 -0700
bareudp: fix NULL pointer dereference in bareudp_fill_metadata_dst()
[ Upstream commit aa6c6d9ee064aabfede4402fd1283424e649ca19 ]
bareudp_fill_metadata_dst() passes bareudp->sock to
udp_tunnel6_dst_lookup() in the IPv6 path without a NULL check.
The socket is only created in bareudp_open() and NULLed in
bareudp_stop(), so calling this function while the device is down
triggers a NULL dereference via sock->sk.
BUG: kernel NULL pointer dereference, address: 0000000000000018
RIP: 0010:udp_tunnel6_dst_lookup (net/ipv6/ip6_udp_tunnel.c:160)
Call Trace:
<TASK>
bareudp_fill_metadata_dst (drivers/net/bareudp.c:532)
do_execute_actions (net/openvswitch/actions.c:901)
ovs_execute_actions (net/openvswitch/actions.c:1589)
ovs_packet_cmd_execute (net/openvswitch/datapath.c:700)
genl_family_rcv_msg_doit (net/netlink/genetlink.c:1114)
genl_rcv_msg (net/netlink/genetlink.c:1209)
netlink_rcv_skb (net/netlink/af_netlink.c:2550)
</TASK>
Add a NULL check returning -ESHUTDOWN, consistent with the xmit paths
in the same driver.
Fixes: 571912c69f0e ("net: UDP tunnel encapsulation module for tunnelling different protocols like MPLS, IP, NSH etc.")
Reported-by: Xiang Mei <xmei5@asu.edu>
Signed-off-by: Weiming Shi <bestswngs@gmail.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260426165350.1663137-2-bestswngs@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jackie Liu <liuyun01@kylinos.cn>
Date: Tue Mar 31 16:50:54 2026 +0800
blk-cgroup: fix disk reference leak in blkcg_maybe_throttle_current()
[ Upstream commit 23308af722fefed00af5f238024c11710938fba3 ]
Add the missing put_disk() on the error path in
blkcg_maybe_throttle_current(). When blkcg lookup, blkg lookup, or
blkg_tryget() fails, the function jumps to the out label which only
calls rcu_read_unlock() but does not release the disk reference acquired
by blkcg_schedule_throttle() via get_device(). Since current->throttle_disk
is already set to NULL before the lookup, blkcg_exit() cannot release
this reference either, causing the disk to never be freed.
Restore the reference release that was present as blk_put_queue() in the
original code but was inadvertently dropped during the conversion from
request_queue to gendisk.
Fixes: f05837ed73d0 ("blk-cgroup: store a gendisk to throttle in struct task_struct")
Signed-off-by: Jackie Liu <liuyun01@kylinos.cn>
Acked-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://patch.msgid.link/20260331085054.46857-1-liu.yun@linux.dev
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ming Lei <ming.lei@redhat.com>
Date: Wed Mar 11 11:28:37 2026 +0800
blk-cgroup: wait for blkcg cleanup before initializing new disk
[ Upstream commit 3dbaacf6ab68f81e3375fe769a2ecdbd3ce386fd ]
When a queue is shared across disk rebind (e.g., SCSI unbind/bind), the
previous disk's blkcg state is cleaned up asynchronously via
disk_release() -> blkcg_exit_disk(). If the new disk's blkcg_init_disk()
runs before that cleanup finishes, we may overwrite q->root_blkg while
the old one is still alive, and radix_tree_insert() in blkg_create()
fails with -EEXIST because the old blkg entries still occupy the same
queue id slot in blkcg->blkg_tree. This causes the sd probe to fail
with -ENOMEM.
Fix it by waiting in blkcg_init_disk() for root_blkg to become NULL,
which indicates the previous disk's blkcg cleanup has completed.
Fixes: 1059699f87eb ("block: move blkcg initialization/destroy into disk allocation/release handler")
Cc: Yi Zhang <yi.zhang@redhat.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://patch.msgid.link/20260311032837.2368714-1-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Pauli Virtanen <pav@iki.fi>
Date: Fri Apr 24 22:24:29 2026 +0300
Bluetooth: btmtk: accept too short WMT FUNC_CTRL events
commit e3ac0d9f1a205f33a43fba3b79ef74d2f604c78b upstream.
MT7925 (USB ID 0e8d:e025) on fw version 20260106153314 sends WMT
FUNC_CTRL events that are missing the status field.
Prior to commit 006b9943b982 ("Bluetooth: btmtk: validate WMT event SKB
length before struct access") the status was read from out-of-bounds of
SKB data, which usually would result to success with
BTMTK_WMT_ON_UNDONE, although I don't know the intent here. The bounds
check added in that commit returns with error instead, producing
"Bluetooth: hci0: Failed to send wmt func ctrl (-22)" and makes the
device unusable.
Fix the regression by interpreting too short packet as status
BTMTK_WMT_ON_UNDONE, which makes the device work normally again.
Fixes: 634a4408c061 ("Bluetooth: btmtk: validate WMT event SKB length before struct access")
Signed-off-by: Pauli Virtanen <pav@iki.fi>
Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com> # MT7922 (0489:e0e2)
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Pauli Virtanen <pav@iki.fi>
Date: Sun Mar 29 16:42:59 2026 +0300
Bluetooth: fix locking in hci_conn_request_evt() with HCI_PROTO_DEFER
[ Upstream commit 5c7209a341ff2ac338b2b0375c34a307b37c9ac2 ]
When protocol sets HCI_PROTO_DEFER, hci_conn_request_evt() calls
hci_connect_cfm(conn) without hdev->lock. Generally hci_connect_cfm()
assumes it is held, and if conn is deleted concurrently -> UAF.
Only SCO and ISO set HCI_PROTO_DEFER and only for defer setup listen,
and HCI_EV_CONN_REQUEST is not generated for ISO. In the non-deferred
listening socket code paths, hci_connect_cfm(conn) is called with
hdev->lock held.
Fix by holding the lock.
Fixes: 70c464256310 ("Bluetooth: Refactor connection request handling")
Signed-off-by: Pauli Virtanen <pav@iki.fi>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jonathan Rissanen <jonathan.rissanen@axis.com>
Date: Fri Mar 27 11:47:20 2026 +0100
Bluetooth: hci_ldisc: Clear HCI_UART_PROTO_INIT on error
[ Upstream commit 68d39ea5e0adc9ecaea1ce8abd842ec972eb8718 ]
When hci_register_dev() fails in hci_uart_register_dev()
HCI_UART_PROTO_INIT is not cleared before calling hu->proto->close(hu)
and setting hu->hdev to NULL. This means incoming UART data will reach
the protocol-specific recv handler in hci_uart_tty_receive() after
resources are freed.
Clear HCI_UART_PROTO_INIT with a write lock before calling
hu->proto->close() and setting hu->hdev to NULL. The write lock ensures
all active readers have completed and no new reader can enter the
protocol recv path before resources are freed.
This allows the protocol-specific recv functions to remove the
"HCI_UART_REGISTERED" guard without risking a null pointer dereference
if hci_register_dev() fails.
Fixes: 5df5dafc171b ("Bluetooth: hci_uart: Fix another race during initialization")
Signed-off-by: Jonathan Rissanen <jonathan.rissanen@axis.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Dudu Lu <phx0fer@gmail.com>
Date: Sun Apr 5 23:47:41 2026 +0800
Bluetooth: l2cap: Add missing chan lock in l2cap_ecred_reconf_rsp
[ Upstream commit 42776497cdbc9a665b384a6dcb85f0d4bd927eab ]
l2cap_ecred_reconf_rsp() calls l2cap_chan_del() without holding
l2cap_chan_lock(). Every other l2cap_chan_del() caller in the file
acquires the lock first. A remote BLE device can send a crafted
L2CAP ECRED reconfiguration response to corrupt the channel list
while another thread is iterating it.
Add l2cap_chan_hold() and l2cap_chan_lock() before l2cap_chan_del(),
and l2cap_chan_unlock() and l2cap_chan_put() after, matching the
pattern used in l2cap_ecred_conn_rsp() and l2cap_conn_del().
Fixes: 15f02b910562 ("Bluetooth: L2CAP: Add initial code for Enhanced Credit Based Mode")
Signed-off-by: Dudu Lu <phx0fer@gmail.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Date: Mon Mar 16 14:34:13 2026 -0400
Bluetooth: L2CAP: Fix printing wrong information if SDU length exceeds MTU
[ Upstream commit 15bf35a660eb82a49f8397fc3d3acada8dae13db ]
The code was printing skb->len and sdu_len in the places where it should
be sdu_len and chan->imtu respectively to match the if conditions.
Link: https://lore.kernel.org/linux-bluetooth/20260315132013.75ab40c5@kernel.org/T/#m1418f9c82eeff8510c1beaa21cf53af20db96c06
Fixes: e1d9a6688986 ("Bluetooth: LE L2CAP: Disconnect if received packet's SDU exceeds IMTU")
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Stefan Metzmacher <metze@samba.org>
Date: Tue Apr 7 17:13:45 2026 +0200
Bluetooth: SCO: check for codecs->num_codecs == 1 before assigning to sco_pi(sk)->codec
[ Upstream commit 4e10a9ebbf081c16517cdd9366ac618bf38d7d0c ]
copy_struct_from_sockptr() fill 'buffer' in
sco_sock_setsockopt() with zeros, so there's no
real problem.
But it actually looks strange to do this,
without checking all of codecs->codecs[0]
really comes from userspace:
sco_pi(sk)->codec = codecs->codecs[0];
As only optlen < sizeof(struct bt_codecs) is checked
and codecs->num_codecs is not checked against != 1,
but only <= 1, and the space for the additional struct bt_codec
is not checked.
Note I don't understand bluetooth and I didn't do any runtime
tests with this! I just found it when debugging a problem
in copy_struct_from_sockptr().
I just added this to check the size is as expected:
BUILD_BUG_ON(struct_size(codecs, codecs, 0) != 1);
BUILD_BUG_ON(struct_size(codecs, codecs, 1) != 8);
And made sure it still compiles using this:
make CF=-D__CHECK_ENDIAN__ W=1ce C=1 net/bluetooth/sco.o
Fixes: 3e643e4efa1e ("Bluetooth: Improve setsockopt() handling of malformed user input")
Cc: Michal Luczaj <mhal@rbox.co>
Cc: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Cc: Luiz Augusto von Dentz <luiz.dentz@gmail.com>
Cc: Marcel Holtmann <marcel@holtmann.org>
Cc: David Wei <dw@davidwei.uk>
Cc: linux-bluetooth@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Puranjay Mohan <puranjay@kernel.org>
Date: Fri Apr 17 07:33:52 2026 -0700
bpf, arm32: Reject BPF-to-BPF calls and callbacks in the JIT
[ Upstream commit e1d486445af3c392628532229f7ce5f5cf7891b6 ]
The ARM32 BPF JIT does not support BPF-to-BPF function calls
(BPF_PSEUDO_CALL) or callbacks (BPF_PSEUDO_FUNC), but it does
not reject them either.
When a program with subprograms is loaded (e.g. libxdp's XDP
dispatcher uses __noinline__ subprograms, or any program using
callbacks like bpf_loop or bpf_for_each_map_elem), the verifier
invokes bpf_jit_subprogs() which calls bpf_int_jit_compile()
for each subprogram.
For BPF_PSEUDO_CALL, since ARM32 does not reject it, the JIT
silently emits code using the wrong address computation:
func = __bpf_call_base + imm
where imm is a pc-relative subprogram offset, producing a bogus
function pointer.
For BPF_PSEUDO_FUNC, the ldimm64 handler ignores src_reg and
loads the immediate as a normal 64-bit value without error.
In both cases, build_body() reports success and a JIT image is
allocated. ARM32 lacks the jit_data/extra_pass mechanism needed
for the second JIT pass in bpf_jit_subprogs(). On the second
pass, bpf_int_jit_compile() performs a full fresh compilation,
allocating a new JIT binary and overwriting prog->bpf_func. The
first allocation is never freed. bpf_jit_subprogs() then detects
the function pointer changed and aborts with -ENOTSUPP, but the
original JIT binary has already been leaked. Each program
load/unload cycle leaks one JIT binary allocation, as reported
by kmemleak:
unreferenced object 0xbf0a1000 (size 4096):
backtrace:
bpf_jit_binary_alloc+0x64/0xfc
bpf_int_jit_compile+0x14c/0x348
bpf_jit_subprogs+0x4fc/0xa60
Fix this by rejecting both BPF_PSEUDO_CALL in the BPF_CALL
handler and BPF_PSEUDO_FUNC in the BPF_LD_IMM64 handler, falling
through to the existing 'notyet' path. This causes build_body()
to fail before any JIT binary is allocated, so
bpf_int_jit_compile() returns the original program unjitted.
bpf_jit_subprogs() then sees !prog->jited and cleanly falls
back to the interpreter with no leak.
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Fixes: 1c2a088a6626 ("bpf: x64: add JIT support for multi-function programs")
Reported-by: Jonas Rebmann <jre@pengutronix.de>
Closes: https://lore.kernel.org/bpf/b63e9174-7a3d-4e22-8294-16df07a4af89@pengutronix.de
Tested-by: Jonas Rebmann <jre@pengutronix.de>
Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com>
Link: https://lore.kernel.org/r/20260417143353.838911-1-puranjay@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Daniel Borkmann <daniel@iogearbox.net>
Date: Wed Apr 15 14:14:03 2026 +0200
bpf, arm64: Fix off-by-one in check_imm signed range check
[ Upstream commit 1dd8be4ec722ce54e4cace59f3a4ba658111b3ec ]
check_imm(bits, imm) is used in the arm64 BPF JIT to verify that
a branch displacement (in arm64 instruction units) fits into the
signed N-bit immediate field of a B, B.cond or CBZ/CBNZ encoding
before it is handed to the encoder. The macro currently tests for
(imm > 0 && imm >> bits) || (imm < 0 && ~imm >> bits) which admits
values in [-2^N, 2^N) — effectively a signed (N+1)-bit range. A
signed N-bit field only holds [-2^(N-1), 2^(N-1)), so the check
admits one extra bit of range on each side.
In particular, for check_imm19(), values in [2^18, 2^19) slip past
the check but do not fit into the 19-bit signed imm19 field of
B.cond. aarch64_insn_encode_immediate() then masks the raw value
into the 19-bit field, setting bit 18 (the sign bit) and flipping
a forward branch into a backward one. Same class of issue exists
for check_imm26() and the B/BL encoding. Shift by (bits - 1)
instead of bits so the actual signed N-bit range is enforced.
Fixes: e54bcde3d69d ("arm64: eBPF JIT compiler")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Puranjay Mohan <puranjay@kernel.org>
Link: https://lore.kernel.org/r/20260415121403.639619-2-daniel@iogearbox.net
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Michal Luczaj <mhal@rbox.co>
Date: Tue Apr 14 16:13:16 2026 +0200
bpf, sockmap: Fix af_unix iter deadlock
[ Upstream commit 4d328dd695383224aa750ddee6b4ad40c0f8d205 ]
bpf_iter_unix_seq_show() may deadlock when lock_sock_fast() takes the fast
path and the iter prog attempts to update a sockmap. Which ends up spinning
at sock_map_update_elem()'s bh_lock_sock():
WARNING: possible recursive locking detected
test_progs/1393 is trying to acquire lock:
ffff88811ec25f58 (slock-AF_UNIX){+...}-{3:3}, at: sock_map_update_elem+0xdb/0x1f0
but task is already holding lock:
ffff88811ec25f58 (slock-AF_UNIX){+...}-{3:3}, at: __lock_sock_fast+0x37/0xe0
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
----
lock(slock-AF_UNIX);
lock(slock-AF_UNIX);
*** DEADLOCK ***
May be due to missing lock nesting notation
4 locks held by test_progs/1393:
#0: ffff88814b59c790 (&p->lock){+.+.}-{4:4}, at: bpf_seq_read+0x59/0x10d0
#1: ffff88811ec25fd8 (sk_lock-AF_UNIX){+.+.}-{0:0}, at: bpf_seq_read+0x42c/0x10d0
#2: ffff88811ec25f58 (slock-AF_UNIX){+...}-{3:3}, at: __lock_sock_fast+0x37/0xe0
#3: ffffffff85a6a7c0 (rcu_read_lock){....}-{1:3}, at: bpf_iter_run_prog+0x51d/0xb00
Call Trace:
dump_stack_lvl+0x5d/0x80
print_deadlock_bug.cold+0xc0/0xce
__lock_acquire+0x130f/0x2590
lock_acquire+0x14e/0x2b0
_raw_spin_lock+0x30/0x40
sock_map_update_elem+0xdb/0x1f0
bpf_prog_2d0075e5d9b721cd_dump_unix+0x55/0x4f4
bpf_iter_run_prog+0x5b9/0xb00
bpf_iter_unix_seq_show+0x1f7/0x2e0
bpf_seq_read+0x42c/0x10d0
vfs_read+0x171/0xb20
ksys_read+0xff/0x200
do_syscall_64+0x6b/0x3a0
entry_SYSCALL_64_after_hwframe+0x76/0x7e
Fixes: 2c860a43dd77 ("bpf: af_unix: Implement BPF iterator for UNIX domain socket.")
Suggested-by: Kuniyuki Iwashima <kuniyu@google.com>
Suggested-by: Martin KaFai Lau <martin.lau@linux.dev>
Signed-off-by: Michal Luczaj <mhal@rbox.co>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Reviewed-by: Jiayuan Chen <jiayuan.chen@linux.dev>
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20260414-unix-proto-update-null-ptr-deref-v4-2-2af6fe97918e@rbox.co
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Michal Luczaj <mhal@rbox.co>
Date: Tue Apr 14 16:13:18 2026 +0200
bpf, sockmap: Fix af_unix null-ptr-deref in proto update
[ Upstream commit dca38b7734d2ea00af4818ff3ae836fab33d5d5a ]
unix_stream_connect() sets sk_state (`WRITE_ONCE(sk->sk_state,
TCP_ESTABLISHED)`) _before_ it assigns a peer (`unix_peer(sk) = newsk`).
sk_state == TCP_ESTABLISHED makes sock_map_sk_state_allowed() believe that
socket is properly set up, which would include having a defined peer. IOW,
there's a window when unix_stream_bpf_update_proto() can be called on
socket which still has unix_peer(sk) == NULL.
CPU0 bpf CPU1 connect
-------- ------------
WRITE_ONCE(sk->sk_state, TCP_ESTABLISHED)
sock_map_sk_state_allowed(sk)
...
sk_pair = unix_peer(sk)
sock_hold(sk_pair)
sock_hold(newsk)
smp_mb__after_atomic()
unix_peer(sk) = newsk
BUG: kernel NULL pointer dereference, address: 0000000000000080
RIP: 0010:unix_stream_bpf_update_proto+0xa0/0x1b0
Call Trace:
sock_map_link+0x564/0x8b0
sock_map_update_common+0x6e/0x340
sock_map_update_elem_sys+0x17d/0x240
__sys_bpf+0x26db/0x3250
__x64_sys_bpf+0x21/0x30
do_syscall_64+0x6b/0x3a0
entry_SYSCALL_64_after_hwframe+0x76/0x7e
Initial idea was to move peer assignment _before_ the sk_state update[1],
but that involved an additional memory barrier, and changing the hot path
was rejected.
Then a NULL check during proto update in unix_stream_bpf_update_proto() was
considered[2], but the follow-up discussion[3] focused on the root cause,
i.e. sockmap update taking a wrong lock. Or, more specifically, missing
unix_state_lock()[4].
In the end it was concluded that teaching sockmap about the af_unix locking
would be unnecessarily complex[5].
Complexity aside, since BPF_PROG_TYPE_SCHED_CLS and BPF_PROG_TYPE_SCHED_ACT
are allowed to update sockmaps, sock_map_update_elem() taking the unix
lock, as it is currently implemented in unix_state_lock():
spin_lock(&unix_sk(s)->lock), would be problematic. unix_state_lock() taken
in a process context, followed by a softirq-context TC BPF program
attempting to take the same spinlock -- deadlock[6].
This way we circled back to the peer check idea[2].
[1]: https://lore.kernel.org/netdev/ba5c50aa-1df4-40c2-ab33-a72022c5a32e@rbox.co/
[2]: https://lore.kernel.org/netdev/20240610174906.32921-1-kuniyu@amazon.com/
[3]: https://lore.kernel.org/netdev/7603c0e6-cd5b-452b-b710-73b64bd9de26@linux.dev/
[4]: https://lore.kernel.org/netdev/CAAVpQUA+8GL_j63CaKb8hbxoL21izD58yr1NvhOhU=j+35+3og@mail.gmail.com/
[5]: https://lore.kernel.org/bpf/CAAVpQUAHijOMext28Gi10dSLuMzGYh+jK61Ujn+fZ-wvcODR2A@mail.gmail.com/
[6]: https://lore.kernel.org/bpf/dd043c69-4d03-46fe-8325-8f97101435cf@linux.dev/
Summary of scenarios where af_unix/stream connect() may race a sockmap
update:
1. connect() vs. bpf(BPF_MAP_UPDATE_ELEM), i.e. sock_map_update_elem_sys()
Implemented NULL check is sufficient. Once assigned, socket peer won't
be released until socket fd is released. And that's not an issue because
sock_map_update_elem_sys() bumps fd refcnf.
2. connect() vs BPF program doing update
Update restricted per verifier.c:may_update_sockmap() to
BPF_PROG_TYPE_TRACING/BPF_TRACE_ITER
BPF_PROG_TYPE_SOCK_OPS (bpf_sock_map_update() only)
BPF_PROG_TYPE_SOCKET_FILTER
BPF_PROG_TYPE_SCHED_CLS
BPF_PROG_TYPE_SCHED_ACT
BPF_PROG_TYPE_XDP
BPF_PROG_TYPE_SK_REUSEPORT
BPF_PROG_TYPE_FLOW_DISSECTOR
BPF_PROG_TYPE_SK_LOOKUP
Plus one more race to consider:
CPU0 bpf CPU1 connect
-------- ------------
WRITE_ONCE(sk->sk_state, TCP_ESTABLISHED)
sock_map_sk_state_allowed(sk)
sock_hold(newsk)
smp_mb__after_atomic()
unix_peer(sk) = newsk
sk_pair = unix_peer(sk)
if (unlikely(!sk_pair))
return -EINVAL;
CPU1 close
----------
skpair = unix_peer(sk);
unix_peer(sk) = NULL;
sock_put(skpair)
// use after free?
sock_hold(sk_pair)
2.1 BPF program invoking helper function bpf_sock_map_update() ->
BPF_CALL_4(bpf_sock_map_update(), ...)
Helper limited to BPF_PROG_TYPE_SOCK_OPS. Nevertheless, a unix sock
might be accessible via bpf_map_lookup_elem(). Which implies sk
already having psock, which in turn implies sk already having
sk_pair. Since sk_psock_destroy() is queued as RCU work, sk_pair
won't go away while BPF executes the update.
2.2 BPF program invoking helper function bpf_map_update_elem() ->
sock_map_update_elem()
2.2.1 Unix sock accessible to BPF prog only via sockmap lookup in
BPF_PROG_TYPE_SOCKET_FILTER, BPF_PROG_TYPE_SCHED_CLS,
BPF_PROG_TYPE_SCHED_ACT, BPF_PROG_TYPE_XDP,
BPF_PROG_TYPE_SK_REUSEPORT, BPF_PROG_TYPE_FLOW_DISSECTOR,
BPF_PROG_TYPE_SK_LOOKUP.
Pretty much the same as case 2.1.
2.2.2 Unix sock accessible to BPF program directly:
BPF_PROG_TYPE_TRACING, narrowed down to BPF_TRACE_ITER.
Sockmap iterator (sock_map_seq_ops) is safe: unix sock
residing in a sockmap means that the sock already went through
the proto update step.
Unix sock iterator (bpf_iter_unix_seq_ops), on the other hand,
gives access to socks that may still be unconnected. Which
means iterator prog can race sockmap/proto update against
connect().
BUG: KASAN: null-ptr-deref in unix_stream_bpf_update_proto+0x253/0x4d0
Write of size 4 at addr 0000000000000080 by task test_progs/3140
Call Trace:
dump_stack_lvl+0x5d/0x80
kasan_report+0xe4/0x1c0
kasan_check_range+0x125/0x200
unix_stream_bpf_update_proto+0x253/0x4d0
sock_map_link+0x71c/0xec0
sock_map_update_common+0xbc/0x600
sock_map_update_elem+0x19a/0x1f0
bpf_prog_bbbf56096cdd4f01_selective_dump_unix+0x20c/0x217
bpf_iter_run_prog+0x21e/0xae0
bpf_iter_unix_seq_show+0x1e0/0x2a0
bpf_seq_read+0x42c/0x10d0
vfs_read+0x171/0xb20
ksys_read+0xff/0x200
do_syscall_64+0xf7/0x5e0
entry_SYSCALL_64_after_hwframe+0x76/0x7e
While the introduced NULL check prevents null-ptr-deref in the
BPF program path as well, it is insufficient to guard against
a poorly timed close() leading to a use-after-free. This will
be addressed in a subsequent patch.
Fixes: c63829182c37 ("af_unix: Implement ->psock_update_sk_prot()")
Closes: https://lore.kernel.org/netdev/ba5c50aa-1df4-40c2-ab33-a72022c5a32e@rbox.co/
Reported-by: Michal Luczaj <mhal@rbox.co>
Reported-by: 钱一铭 <yimingqian591@gmail.com>
Suggested-by: Kuniyuki Iwashima <kuniyu@google.com>
Suggested-by: Martin KaFai Lau <martin.lau@linux.dev>
Signed-off-by: Michal Luczaj <mhal@rbox.co>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20260414-unix-proto-update-null-ptr-deref-v4-4-2af6fe97918e@rbox.co
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Michal Luczaj <mhal@rbox.co>
Date: Tue Apr 14 16:13:19 2026 +0200
bpf, sockmap: Take state lock for af_unix iter
[ Upstream commit 64c2f93fc3254d3bf5de4445fb732ee5c451edb6 ]
When a BPF iterator program updates a sockmap, there is a race condition in
unix_stream_bpf_update_proto() where the `peer` pointer can become stale[1]
during a state transition TCP_ESTABLISHED -> TCP_CLOSE.
CPU0 bpf CPU1 close
-------- ----------
// unix_stream_bpf_update_proto()
sk_pair = unix_peer(sk)
if (unlikely(!sk_pair))
return -EINVAL;
// unix_release_sock()
skpair = unix_peer(sk);
unix_peer(sk) = NULL;
sock_put(skpair)
sock_hold(sk_pair) // UaF
More practically, this fix guarantees that the iterator program is
consistently provided with a unix socket that remains stable during
iterator execution.
[1]:
BUG: KASAN: slab-use-after-free in unix_stream_bpf_update_proto+0x155/0x490
Write of size 4 at addr ffff8881178c9a00 by task test_progs/2231
Call Trace:
dump_stack_lvl+0x5d/0x80
print_report+0x170/0x4f3
kasan_report+0xe4/0x1c0
kasan_check_range+0x125/0x200
unix_stream_bpf_update_proto+0x155/0x490
sock_map_link+0x71c/0xec0
sock_map_update_common+0xbc/0x600
sock_map_update_elem+0x19a/0x1f0
bpf_prog_bbbf56096cdd4f01_selective_dump_unix+0x20c/0x217
bpf_iter_run_prog+0x21e/0xae0
bpf_iter_unix_seq_show+0x1e0/0x2a0
bpf_seq_read+0x42c/0x10d0
vfs_read+0x171/0xb20
ksys_read+0xff/0x200
do_syscall_64+0xf7/0x5e0
entry_SYSCALL_64_after_hwframe+0x76/0x7e
Allocated by task 2236:
kasan_save_stack+0x30/0x50
kasan_save_track+0x14/0x30
__kasan_slab_alloc+0x63/0x80
kmem_cache_alloc_noprof+0x1d5/0x680
sk_prot_alloc+0x59/0x210
sk_alloc+0x34/0x470
unix_create1+0x86/0x8a0
unix_stream_connect+0x318/0x15b0
__sys_connect+0xfd/0x130
__x64_sys_connect+0x72/0xd0
do_syscall_64+0xf7/0x5e0
entry_SYSCALL_64_after_hwframe+0x76/0x7e
Freed by task 2236:
kasan_save_stack+0x30/0x50
kasan_save_track+0x14/0x30
kasan_save_free_info+0x3b/0x70
__kasan_slab_free+0x47/0x70
kmem_cache_free+0x11c/0x590
__sk_destruct+0x432/0x6e0
unix_release_sock+0x9b3/0xf60
unix_release+0x8a/0xf0
__sock_release+0xb0/0x270
sock_close+0x18/0x20
__fput+0x36e/0xac0
fput_close_sync+0xe5/0x1a0
__x64_sys_close+0x7d/0xd0
do_syscall_64+0xf7/0x5e0
entry_SYSCALL_64_after_hwframe+0x76/0x7e
Fixes: 2c860a43dd77 ("bpf: af_unix: Implement BPF iterator for UNIX domain socket.")
Suggested-by: Kuniyuki Iwashima <kuniyu@google.com>
Signed-off-by: Michal Luczaj <mhal@rbox.co>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20260414-unix-proto-update-null-ptr-deref-v4-5-2af6fe97918e@rbox.co
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: haoyu.lu <hechushiguitu666@gmail.com>
Date: Tue Mar 24 20:27:02 2026 +0800
bpf,arc_jit: Fix missing newline in pr_err messages
[ Upstream commit b6b5e0ebd429d66ce37ae5af649a74ea1f041d92 ]
Add missing newline to pr_err messages in ARC JIT.
Fixes: f122668ddcce ("ARC: Add eBPF JIT support")
Signed-off-by: haoyu.lu <hechushiguitu666@gmail.com>
Link: https://lore.kernel.org/r/20260324122703.641-1-hechushiguitu666@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Emil Tsalapatis <emil@etsalapatis.com>
Date: Sun Apr 12 13:45:38 2026 -0400
bpf: Allow instructions with arena source and non-arena dest registers
[ Upstream commit ac61bffe91d4bda08806e12957c6d64756d042db ]
The compiler sometimes stores the result of a PTR_TO_ARENA and SCALAR
operation into the scalar register rather than the pointer register.
Relax the verifier to allow operations between a source arena register
and a destination non-arena register, marking the destination's value
as a PTR_TO_ARENA.
Signed-off-by: Emil Tsalapatis <emil@etsalapatis.com>
Acked-by: Song Liu <song@kernel.org>
Fixes: 6082b6c328b5 ("bpf: Recognize addr_space_cast instruction in the verifier.")
Link: https://lore.kernel.org/r/20260412174546.18684-2-emil@etsalapatis.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Yihan Ding <dingyihan@uniontech.com>
Date: Thu Apr 16 20:01:41 2026 +0800
bpf: allow UTF-8 literals in bpf_bprintf_prepare()
[ Upstream commit b960430ea8862ef37ce53c8bf74a8dc79d3f2404 ]
bpf_bprintf_prepare() only needs ASCII parsing for conversion
specifiers. Plain text can safely carry bytes >= 0x80, so allow
UTF-8 literals outside '%' sequences while keeping ASCII control
bytes rejected and format specifiers ASCII-only.
This keeps existing parsing rules for format directives unchanged,
while allowing helpers such as bpf_trace_printk() to emit UTF-8
literal text.
Update test_snprintf_negative() in the same commit so selftests keep
matching the new plain-text vs format-specifier split during bisection.
Fixes: 48cac3f4a96d ("bpf: Implement formatted output helpers with bstr_printf")
Signed-off-by: Yihan Ding <dingyihan@uniontech.com>
Acked-by: Paul Chaignon <paul.chaignon@gmail.com>
Link: https://lore.kernel.org/r/20260416120142.1420646-2-dingyihan@uniontech.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jiayuan Chen <jiayuan.chen@linux.dev>
Date: Tue Apr 7 20:23:33 2026 +0800
bpf: Drop task_to_inode and inet_conn_established from lsm sleepable hooks
[ Upstream commit beaf0e96b1da74549a6cabd040f9667d83b2e97e ]
bpf_lsm_task_to_inode() is called under rcu_read_lock() and
bpf_lsm_inet_conn_established() is called from softirq context, so
neither hook can be used by sleepable LSM programs.
Fixes: 423f16108c9d8 ("bpf: Augment the set of sleepable LSM hooks")
Reported-by: Quan Sun <2022090917019@std.uestc.edu.cn>
Reported-by: Yinhao Hu <dddddd@hust.edu.cn>
Reported-by: Kaiyan Mei <M202472210@hust.edu.cn>
Reported-by: Dongliang Mu <dzm91@hust.edu.cn>
Closes: https://lore.kernel.org/bpf/3ab69731-24d1-431a-a351-452aafaaf2a5@std.uestc.edu.cn/T/#u
Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
Link: https://lore.kernel.org/r/20260407122334.344072-1-jiayuan.chen@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Daniel Borkmann <daniel@iogearbox.net>
Date: Sat Apr 11 01:26:50 2026 +0200
bpf: Enforce regsafe base id consistency for BPF_ADD_CONST scalars
[ Upstream commit 2f2ec8e7730e21fc9bd49e0de9cdd58213ea24d0 ]
When regsafe() compares two scalar registers that both carry
BPF_ADD_CONST, check_scalar_ids() maps their full compound id
(aka base | BPF_ADD_CONST flag) as one idmap entry. However,
it never verifies that the underlying base ids, that is, with
the flag stripped are consistent with existing idmap mappings.
This allows construction of two verifier states where the old
state has R3 = R2 + 10 (both sharing base id A) while the current
state has R3 = R4 + 10 (base id C, unrelated to R2). The idmap
creates two independent entries: A->B (for R2) and A|flag->C|flag
(for R3), without catching that A->C conflicts with A->B. State
pruning then incorrectly succeeds.
Fix this by additionally verifying base ID mapping consistency
whenever BPF_ADD_CONST is set: after mapping the compound ids,
also invoke check_ids() on the base IDs (flag bits stripped).
This ensures that if A was already mapped to B from comparing
the source register, any ADD_CONST derivative must also derive
from B, not an unrelated C.
Fixes: 98d7ca374ba4 ("bpf: Track delta between "linked" registers.")
Reported-by: STAR Labs SG <info@starlabs.sg>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/r/20260410232651.559778-1-daniel@iogearbox.net
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Weiming Shi <bestswngs@gmail.com>
Date: Fri Apr 3 21:29:50 2026 +0800
bpf: fix end-of-list detection in cgroup_storage_get_next_key()
[ Upstream commit 5828b9e5b272ecff7cf5d345128d3de7324117f7 ]
list_next_entry() never returns NULL -- when the current element is the
last entry it wraps to the list head via container_of(). The subsequent
NULL check is therefore dead code and get_next_key() never returns
-ENOENT for the last element, instead reading storage->key from a bogus
pointer that aliases internal map fields and copying the result to
userspace.
Replace it with list_entry_is_head() so the function correctly returns
-ENOENT when there are no more entries.
Fixes: de9cbbaadba5 ("bpf: introduce cgroup storage maps")
Reported-by: Xiang Mei <xmei5@asu.edu>
Signed-off-by: Weiming Shi <bestswngs@gmail.com>
Reviewed-by: Sun Jian <sun.jian.kdev@gmail.com>
Acked-by: Paul Chaignon <paul.chaignon@gmail.com>
Link: https://lore.kernel.org/r/20260403132951.43533-2-bestswngs@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Puranjay Mohan <puranjay@kernel.org>
Date: Wed Apr 8 08:45:35 2026 -0700
bpf: fix mm lifecycle in open-coded task_vma iterator
[ Upstream commit d8e27d2d22b6e2df3a0125b8c08e9aace38c954c ]
The open-coded task_vma iterator reads task->mm locklessly and acquires
mmap_read_trylock() but never calls mmget(). If the task exits
concurrently, the mm_struct can be freed as it is not
SLAB_TYPESAFE_BY_RCU, resulting in a use-after-free.
Safely read task->mm with a trylock on alloc_lock and acquire an mm
reference. Drop the reference via bpf_iter_mmput_async() in _destroy()
and error paths. bpf_iter_mmput_async() is a local wrapper around
mmput_async() with a fallback to mmput() on !CONFIG_MMU.
Reject irqs-disabled contexts (including NMI) up front. Operations used
by _next() and _destroy() (mmap_read_unlock, bpf_iter_mmput_async)
take spinlocks with IRQs disabled (pool->lock, pi_lock). Running from
NMI or from a tracepoint that fires with those locks held could
deadlock.
A trylock on alloc_lock is used instead of the blocking task_lock()
(get_task_mm) to avoid a deadlock when a softirq BPF program iterates
a task that already holds its alloc_lock on the same CPU.
Fixes: 4ac454682158 ("bpf: Introduce task_vma open-coded iterator kfuncs")
Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
Link: https://lore.kernel.org/r/20260408154539.3832150-2-puranjay@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Mykyta Yatsenko <yatsenko@meta.com>
Date: Thu Apr 16 11:08:07 2026 -0700
bpf: Fix NULL deref in map_kptr_match_type for scalar regs
[ Upstream commit 4d0a375887ab4d49e4da1ff10f9606cab8f7c3ad ]
Commit ab6c637ad027 ("bpf: Fix a bpf_kptr_xchg() issue with local
kptr") refactored map_kptr_match_type() to branch on btf_is_kernel()
before checking base_type(). A scalar register stored into a kptr
slot has no btf, so the btf_is_kernel(reg->btf) call dereferences
NULL.
Move the base_type() != PTR_TO_BTF_ID guard before any reg->btf
access.
Fixes: ab6c637ad027 ("bpf: Fix a bpf_kptr_xchg() issue with local kptr")
Reported-by: Hiker Cl <clhiker365@gmail.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=221372
Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Acked-by: Paul Chaignon <paul.chaignon@gmail.com>
Link: https://lore.kernel.org/r/20260416-kptr_crash-v1-1-5589356584b4@meta.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Lang Xu <xulang@uniontech.com>
Date: Thu Apr 2 15:42:35 2026 +0800
bpf: Fix OOB in pcpu_init_value
[ Upstream commit 576afddfee8d1108ee299bf10f581593540d1a36 ]
An out-of-bounds read occurs when copying element from a
BPF_MAP_TYPE_CGROUP_STORAGE map to another pcpu map with the
same value_size that is not rounded up to 8 bytes.
The issue happens when:
1. A CGROUP_STORAGE map is created with value_size not aligned to
8 bytes (e.g., 4 bytes)
2. A pcpu map is created with the same value_size (e.g., 4 bytes)
3. Update element in 2 with data in 1
pcpu_init_value assumes that all sources are rounded up to 8 bytes,
and invokes copy_map_value_long to make a data copy, However, the
assumption doesn't stand since there are some cases where the source
may not be rounded up to 8 bytes, e.g., CGROUP_STORAGE, skb->data.
the verifier verifies exactly the size that the source claims, not
the size rounded up to 8 bytes by kernel, an OOB happens when the
source has only 4 bytes while the copy size(4) is rounded up to 8.
Fixes: d3bec0138bfb ("bpf: Zero-fill re-used per-cpu map element")
Reported-by: Kaiyan Mei <kaiyanm@hust.edu.cn>
Closes: https://lore.kernel.org/all/14e6c70c.6c121.19c0399d948.Coremail.kaiyanm@hust.edu.cn/
Link: https://lore.kernel.org/r/420FEEDDC768A4BE+20260402074236.2187154-1-xulang@uniontech.com
Signed-off-by: Lang Xu <xulang@uniontech.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Daniel Borkmann <daniel@iogearbox.net>
Date: Thu Apr 16 14:27:19 2026 +0200
bpf: Fix precedence bug in convert_bpf_ld_abs alignment check
[ Upstream commit e5f635edd393aeaa7cad9e42831d397e6e2e1eed ]
Fix an operator precedence issue in convert_bpf_ld_abs() where the
expression offset + ip_align % size evaluates as offset + (ip_align % size)
due to % having higher precedence than +. That latter evaluation does
not make any sense. The intended check is (offset + ip_align) % size == 0
to verify that the packet load offset is properly aligned for direct
access.
With NET_IP_ALIGN == 2, the bug causes the inline fast-path for direct
packet loads to almost never be taken on !CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
platforms. This forces nearly all cBPF BPF_LD_ABS packet loads through
the bpf_skb_load_helper slow path on the affected archs.
Fixes: e0cea7ce988c ("bpf: implement ld_abs/ld_ind in native bpf")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/r/20260416122719.661033-1-daniel@iogearbox.net
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Sechang Lim <rhkrqnwk98@gmail.com>
Date: Tue Apr 7 10:38:23 2026 +0000
bpf: Fix RCU stall in bpf_fd_array_map_clear()
[ Upstream commit 4406942e65ca128c56c67443832988873c21d2e9 ]
Add a missing cond_resched() in bpf_fd_array_map_clear() loop.
For PROG_ARRAY maps with many entries this loop calls
prog_array_map_poke_run() per entry which can be expensive, and
without yielding this can cause RCU stalls under load:
rcu: Stack dump where RCU GP kthread last ran:
CPU: 0 UID: 0 PID: 30932 Comm: kworker/0:2 Not tainted 6.14.0-13195-g967e8def1100 #2 PREEMPT(undef)
Workqueue: events prog_array_map_clear_deferred
RIP: 0010:write_comp_data+0x38/0x90 kernel/kcov.c:246
Call Trace:
<TASK>
prog_array_map_poke_run+0x77/0x380 kernel/bpf/arraymap.c:1096
__fd_array_map_delete_elem+0x197/0x310 kernel/bpf/arraymap.c:925
bpf_fd_array_map_clear kernel/bpf/arraymap.c:1000 [inline]
prog_array_map_clear_deferred+0x119/0x1b0 kernel/bpf/arraymap.c:1141
process_one_work+0x898/0x19d0 kernel/workqueue.c:3238
process_scheduled_works kernel/workqueue.c:3319 [inline]
worker_thread+0x770/0x10b0 kernel/workqueue.c:3400
kthread+0x465/0x880 kernel/kthread.c:464
ret_from_fork+0x4d/0x80 arch/x86/kernel/process.c:153
ret_from_fork_asm+0x19/0x30 arch/x86/entry/entry_64.S:245
</TASK>
Reviewed-by: Sun Jian <sun.jian.kdev@gmail.com>
Fixes: da765a2f5993 ("bpf: Add poke dependency tracking for prog array maps")
Signed-off-by: Sechang Lim <rhkrqnwk98@gmail.com>
Link: https://lore.kernel.org/r/20260407103823.3942156-1-rhkrqnwk98@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: MingTao Huang <mintaohuang@tencent.com>
Date: Thu Apr 2 20:18:50 2026 +0800
bpf: Fix stale offload->prog pointer after constant blinding
[ Upstream commit a1aa9ef47c299c5bbc30594d3c2f0589edf908e6 ]
When a dev-bound-only BPF program (BPF_F_XDP_DEV_BOUND_ONLY) undergoes
JIT compilation with constant blinding enabled (bpf_jit_harden >= 2),
bpf_jit_blind_constants() clones the program. The original prog is then
freed in bpf_jit_prog_release_other(), which updates aux->prog to point
to the surviving clone, but fails to update offload->prog.
This leaves offload->prog pointing to the freed original program. When
the network namespace is subsequently destroyed, cleanup_net() triggers
bpf_dev_bound_netdev_unregister(), which iterates ondev->progs and calls
__bpf_prog_offload_destroy(offload->prog). Accessing the freed prog
causes a page fault:
BUG: unable to handle page fault for address: ffffc900085f1038
Workqueue: netns cleanup_net
RIP: 0010:__bpf_prog_offload_destroy+0xc/0x80
Call Trace:
__bpf_offload_dev_netdev_unregister+0x257/0x350
bpf_dev_bound_netdev_unregister+0x4a/0x90
unregister_netdevice_many_notify+0x2a2/0x660
...
cleanup_net+0x21a/0x320
The test sequence that triggers this reliably is:
1. Set net.core.bpf_jit_harden=2 (echo 2 > /proc/sys/net/core/bpf_jit_harden)
2. Run xdp_metadata selftest, which creates a dev-bound-only XDP
program on a veth inside a netns (./test_progs -t xdp_metadata)
3. cleanup_net -> page fault in __bpf_prog_offload_destroy
Dev-bound-only programs are unique in that they have an offload structure
but go through the normal JIT path instead of bpf_prog_offload_compile().
This means they are subject to constant blinding's prog clone-and-replace,
while also having offload->prog that must stay in sync.
Fix this by updating offload->prog in bpf_jit_prog_release_other(),
alongside the existing aux->prog update. Both are back-pointers to
the prog that must be kept in sync when the prog is replaced.
Fixes: 2b3486bc2d23 ("bpf: Introduce device-bound XDP programs")
Signed-off-by: MingTao Huang <mintaohuang@tencent.com>
Link: https://lore.kernel.org/r/tencent_BCF692F45859CCE6C22B7B0B64827947D406@qq.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Alexei Starovoitov <ast@kernel.org>
Date: Tue Mar 24 14:59:36 2026 -0700
bpf: Fix variable length stack write over spilled pointers
[ Upstream commit 4639eb9e30ab10c7935c7c19e872facf9a94713f ]
Scrub slots if variable-offset stack write goes over spilled pointers.
Otherwise is_spilled_reg() may == true && spilled_ptr.type == NOT_INIT
and valid program is rejected by check_stack_read_fixed_off()
with obscure "invalid size of register fill" message.
Fixes: 01f810ace9ed ("bpf: Allow variable-offset stack access")
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20260324215938.81733-1-alexei.starovoitov@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Weiming Shi <bestswngs@gmail.com>
Date: Sun Apr 5 00:12:20 2026 +0800
bpf: reject negative CO-RE accessor indices in bpf_core_parse_spec()
[ Upstream commit 1c22483a2c4bbf747787f328392ca3e68619c4dc ]
CO-RE accessor strings are colon-separated indices that describe a path
from a root BTF type to a target field, e.g. "0:1:2" walks through
nested struct members. bpf_core_parse_spec() parses each component with
sscanf("%d"), so negative values like -1 are silently accepted. The
subsequent bounds checks (access_idx >= btf_vlen(t)) only guard the
upper bound and always pass for negative values because C integer
promotion converts the __u16 btf_vlen result to int, making the
comparison (int)(-1) >= (int)(N) false for any positive N.
When -1 reaches btf_member_bit_offset() it gets cast to u32 0xffffffff,
producing an out-of-bounds read far past the members array. A crafted
BPF program with a negative CO-RE accessor on any struct that exists in
vmlinux BTF (e.g. task_struct) crashes the kernel deterministically
during BPF_PROG_LOAD on any system with CONFIG_DEBUG_INFO_BTF=y
(default on major distributions). The bug is reachable with CAP_BPF:
BUG: unable to handle page fault for address: ffffed11818b6626
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
Oops: Oops: 0000 [#1] SMP KASAN NOPTI
CPU: 0 UID: 0 PID: 85 Comm: poc Not tainted 7.0.0-rc6 #18 PREEMPT(full)
RIP: 0010:bpf_core_parse_spec (tools/lib/bpf/relo_core.c:354)
RAX: 00000000ffffffff
Call Trace:
<TASK>
bpf_core_calc_relo_insn (tools/lib/bpf/relo_core.c:1321)
bpf_core_apply (kernel/bpf/btf.c:9507)
check_core_relo (kernel/bpf/verifier.c:19475)
bpf_check (kernel/bpf/verifier.c:26031)
bpf_prog_load (kernel/bpf/syscall.c:3089)
__sys_bpf (kernel/bpf/syscall.c:6228)
</TASK>
CO-RE accessor indices are inherently non-negative (struct member index,
array element index, or enumerator index), so reject them immediately
after parsing.
Fixes: ddc7c3042614 ("libbpf: implement BPF CO-RE offset relocation algorithm")
Reported-by: Xiang Mei <xmei5@asu.edu>
Signed-off-by: Weiming Shi <bestswngs@gmail.com>
Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com>
Acked-by: Paul Chaignon <paul.chaignon@gmail.com>
Link: https://lore.kernel.org/r/20260404161221.961828-2-bestswngs@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Sun Jian <sun.jian.kdev@gmail.com>
Date: Wed Apr 8 11:46:22 2026 +0800
bpf: reject short IPv4/IPv6 inputs in bpf_prog_test_run_skb
[ Upstream commit 12bec2bd4b76d81c5d3996bd14ec1b7f4d983747 ]
bpf_prog_test_run_skb() calls eth_type_trans() first and then uses
skb->protocol to initialize sk family and address fields for the test
run.
For IPv4 and IPv6 packets, it may access ip_hdr(skb) or ipv6_hdr(skb)
even when the provided test input only contains an Ethernet header.
Reject the input earlier if the Ethernet frame carries IPv4/IPv6
EtherType but the L3 header is too short.
Fold the IPv4/IPv6 header length checks into the existing protocol
switch and return -EINVAL before accessing the network headers.
Fixes: fa5cb548ced6 ("bpf: Setup socket family and addresses in bpf_prog_test_run_skb")
Reported-by: syzbot+619b9ef527f510a57cfc@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=619b9ef527f510a57cfc
Signed-off-by: Sun Jian <sun.jian.kdev@gmail.com>
Link: https://lore.kernel.org/r/20260408034623.180320-2-sun.jian.kdev@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Puranjay Mohan <puranjay@kernel.org>
Date: Tue Feb 3 08:51:00 2026 -0800
bpf: Relax scalar id equivalence for state pruning
[ Upstream commit b0388bafa4949bd30af7b3be5ee415f2a25ac014 ]
Scalar register IDs are used by the verifier to track relationships
between registers and enable bounds propagation across those
relationships. Once an ID becomes singular (i.e. only a single
register/stack slot carries it), it can no longer contribute to bounds
propagation and effectively becomes stale. The previous commit makes the
verifier clear such ids before caching the state.
When comparing the current and cached states for pruning, these stale
IDs can cause technically equivalent states to be considered different
and thus prevent pruning.
For example, in the selftest added in the next commit, two registers -
r6 and r7 are not linked to any other registers and get cached with
id=0, in the current state, they are both linked to each other with
id=A. Before this commit, check_scalar_ids would give temporary ids to
r6 and r7 (say tid1 and tid2) and then check_ids() would map tid1->A,
and when it would see tid2->A, it would not consider these state
equivalent.
Relax scalar ID equivalence by treating rold->id == 0 as "independent":
if the old state did not rely on any ID relationships for a register,
then any ID/linking present in the current state only adds constraints
and is always safe to accept for pruning. Implement this by returning
true immediately in check_scalar_ids() when old_id == 0.
Maintain correctness for the opposite direction (old_id != 0 && cur_id
== 0) by still allocating a temporary ID for cur_id == 0. This avoids
incorrectly allowing multiple independent current registers (id==0) to
satisfy a single linked old ID during mapping.
Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
Link: https://lore.kernel.org/r/20260203165102.2302462-5-puranjay@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Stable-dep-of: 2f2ec8e7730e ("bpf: Enforce regsafe base id consistency for BPF_ADD_CONST scalars")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Puranjay Mohan <puranjay@kernel.org>
Date: Wed Apr 8 08:45:37 2026 -0700
bpf: return VMA snapshot from task_vma iterator
[ Upstream commit 4cbee026db54cad39c39db4d356100cb133412b3 ]
Holding the per-VMA lock across the BPF program body creates a lock
ordering problem when helpers acquire locks that depend on mmap_lock:
vm_lock -> i_rwsem -> mmap_lock -> vm_lock
Snapshot the VMA under the per-VMA lock in _next() via memcpy(), then
drop the lock before returning. The BPF program accesses only the
snapshot.
The verifier only trusts vm_mm and vm_file pointers (see
BTF_TYPE_SAFE_TRUSTED_OR_NULL in verifier.c). vm_file is reference-
counted with get_file() under the lock and released via fput() on the
next iteration or in _destroy(). vm_mm is already correct because
lock_vma_under_rcu() verifies vma->vm_mm == mm. All other pointers
are left as-is by memcpy() since the verifier treats them as untrusted.
Fixes: 4ac454682158 ("bpf: Introduce task_vma open-coded iterator kfuncs")
Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Mykyta Yatsenko <yatsenko@meta.com>
Link: https://lore.kernel.org/r/20260408154539.3832150-4-puranjay@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Puranjay Mohan <puranjay@kernel.org>
Date: Wed Apr 8 08:45:36 2026 -0700
bpf: switch task_vma iterator from mmap_lock to per-VMA locks
[ Upstream commit bee9ef4a40a277bf401be43d39ba7f7f063cf39c ]
The open-coded task_vma iterator holds mmap_lock for the entire duration
of iteration, increasing contention on this highly contended lock.
Switch to per-VMA locking. Find the next VMA via an RCU-protected maple
tree walk and lock it with lock_vma_under_rcu(). lock_next_vma() is not
used because its fallback takes mmap_read_lock(), and the iterator must
work in non-sleepable contexts.
lock_vma_under_rcu() is a point lookup (mas_walk) that finds the VMA
containing a given address but cannot iterate across gaps. An
RCU-protected vma_next() walk (mas_find) first locates the next VMA's
vm_start to pass to lock_vma_under_rcu().
Between the RCU walk and the lock, the VMA may be removed, shrunk, or
write-locked. On failure, advance past it using vm_end from the RCU
walk. Because the VMA slab is SLAB_TYPESAFE_BY_RCU, vm_end may be
stale; fall back to PAGE_SIZE advancement when it does not make forward
progress. Concurrent VMA insertions at addresses already passed by the
iterator are not detected.
CONFIG_PER_VMA_LOCK is required; return -EOPNOTSUPP without it.
Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
Link: https://lore.kernel.org/r/20260408154539.3832150-3-puranjay@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Stable-dep-of: 4cbee026db54 ("bpf: return VMA snapshot from task_vma iterator")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Feng Yang <yangfeng@kylinos.cn>
Date: Wed Mar 4 17:44:28 2026 +0800
bpf: test_run: Fix the null pointer dereference issue in bpf_lwt_xmit_push_encap
[ Upstream commit 972787479ee73006fddb5e59ab5c8e733810ff42 ]
The bpf_lwt_xmit_push_encap helper needs to access skb_dst(skb)->dev to
calculate the needed headroom:
err = skb_cow_head(skb,
len + LL_RESERVED_SPACE(skb_dst(skb)->dev));
But skb->_skb_refdst may not be initialized when the skb is set up by
bpf_prog_test_run_skb function. Executing bpf_lwt_push_ip_encap function
in this scenario will trigger null pointer dereference, causing a kernel
crash as Yinhao reported:
[ 105.186365] BUG: kernel NULL pointer dereference, address: 0000000000000000
[ 105.186382] #PF: supervisor read access in kernel mode
[ 105.186388] #PF: error_code(0x0000) - not-present page
[ 105.186393] PGD 121d3d067 P4D 121d3d067 PUD 106c83067 PMD 0
[ 105.186404] Oops: 0000 [#1] PREEMPT SMP NOPTI
[ 105.186412] CPU: 3 PID: 3250 Comm: poc Kdump: loaded Not tainted 6.19.0-rc5 #1
[ 105.186423] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
[ 105.186427] RIP: 0010:bpf_lwt_push_ip_encap+0x1eb/0x520
[ 105.186443] Code: 0f 84 de 01 00 00 0f b7 4a 04 66 85 c9 0f 85 47 01 00 00 31 c0 5b 5d 41 5c 41 5d 41 5e c3 cc cc cc cc 48 8b 73 58 48 83 e6 fe <48> 8b 36 0f b7 be ec 00 00 00 0f b7 b6 e6 00 00 00 01 fe 83 e6 f0
[ 105.186449] RSP: 0018:ffffbb0e0387bc50 EFLAGS: 00010246
[ 105.186455] RAX: 000000000000004e RBX: ffff94c74e036500 RCX: ffff94c74874da00
[ 105.186460] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff94c74e036500
[ 105.186463] RBP: 0000000000000001 R08: 0000000000000002 R09: 0000000000000000
[ 105.186467] R10: ffffbb0e0387bd50 R11: 0000000000000000 R12: ffffbb0e0387bc98
[ 105.186471] R13: 0000000000000014 R14: 0000000000000000 R15: 0000000000000002
[ 105.186484] FS: 00007f166aa4d680(0000) GS:ffff94c8b7780000(0000) knlGS:0000000000000000
[ 105.186490] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 105.186494] CR2: 0000000000000000 CR3: 000000015eade001 CR4: 0000000000770ee0
[ 105.186499] PKRU: 55555554
[ 105.186502] Call Trace:
[ 105.186507] <TASK>
[ 105.186513] bpf_lwt_xmit_push_encap+0x2b/0x40
[ 105.186522] bpf_prog_a75eaad51e517912+0x41/0x49
[ 105.186536] ? kvm_clock_get_cycles+0x18/0x30
[ 105.186547] ? ktime_get+0x3c/0xa0
[ 105.186554] bpf_test_run+0x195/0x320
[ 105.186563] ? bpf_test_run+0x10f/0x320
[ 105.186579] bpf_prog_test_run_skb+0x2f5/0x4f0
[ 105.186590] __sys_bpf+0x69c/0xa40
[ 105.186603] __x64_sys_bpf+0x1e/0x30
[ 105.186611] do_syscall_64+0x59/0x110
[ 105.186620] entry_SYSCALL_64_after_hwframe+0x76/0xe0
[ 105.186649] RIP: 0033:0x7f166a97455d
Temporarily add the setting of skb->_skb_refdst before bpf_test_run to resolve the issue.
Fixes: 52f278774e79 ("bpf: implement BPF_LWT_ENCAP_IP mode in bpf_lwt_push_encap")
Reported-by: Yinhao Hu <dddddd@hust.edu.cn>
Reported-by: Kaiyan Mei <M202472210@hust.edu.cn>
Closes: https://groups.google.com/g/hust-os-kernel-patches/c/8-a0kPpBW2s
Signed-off-by: Yun Lu <luyun@kylinos.cn>
Signed-off-by: Feng Yang <yangfeng@kylinos.cn>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Tested-by: syzbot@syzkaller.appspotmail.com
Link: https://patch.msgid.link/20260304094429.168521-2-yangfeng59949@163.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: David Carlier <devnexen@gmail.com>
Date: Fri Mar 20 07:26:45 2026 +0000
bpf: Use RCU-safe iteration in dev_map_redirect_multi() SKB path
[ Upstream commit 8ed82f807bb09d2c8455aaa665f2c6cb17bc6a19 ]
The DEVMAP_HASH branch in dev_map_redirect_multi() uses
hlist_for_each_entry_safe() to iterate hash buckets, but this function
runs under RCU protection (called from xdp_do_generic_redirect_map()
in softirq context). Concurrent writers (__dev_map_hash_update_elem,
dev_map_hash_delete_elem) modify the list using RCU primitives
(hlist_add_head_rcu, hlist_del_rcu).
hlist_for_each_entry_safe() performs plain pointer dereferences without
rcu_dereference(), missing the acquire barrier needed to pair with
writers' rcu_assign_pointer(). On weakly-ordered architectures (ARM64,
POWER), a reader can observe a partially-constructed node. It also
defeats CONFIG_PROVE_RCU lockdep validation and KCSAN data-race
detection.
Replace with hlist_for_each_entry_rcu() using rcu_read_lock_bh_held()
as the lockdep condition, consistent with the rcu_dereference_check()
used in the DEVMAP (non-hash) branch of the same functions. Also fix
the same incorrect lockdep_is_held(&dtab->index_lock) condition in
dev_map_enqueue_multi(), where the lock is not held either.
Fixes: e624d4ed4aa8 ("xdp: Extend xdp_redirect_map with broadcast support")
Signed-off-by: David Carlier <devnexen@gmail.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://patch.msgid.link/20260320072645.16731-1-devnexen@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Puranjay Mohan <puranjay@kernel.org>
Date: Fri Apr 17 08:21:33 2026 -0700
bpf: Validate node_id in arena_alloc_pages()
[ Upstream commit 2845989f2ebaf7848e4eccf9a779daf3156ea0a5 ]
arena_alloc_pages() accepts a plain int node_id and forwards it through
the entire allocation chain without any bounds checking.
Validate node_id before passing it down the allocation chain in
arena_alloc_pages().
Fixes: 317460317a02 ("bpf: Introduce bpf_arena.")
Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com>
Link: https://lore.kernel.org/r/20260417152135.1383754-1-puranjay@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Qu Wenruo <wqu@suse.com>
Date: Sun May 17 09:52:07 2026 -0400
btrfs: do not mark inode incompressible after inline attempt fails
[ Upstream commit 2e0e3716c7b6f8d71df2fbe709b922e54700f71b ]
[BUG]
The following sequence will set the file with nocompress flag:
# mkfs.btrfs -f $dev
# mount $dev $mnt -o max_inline=4,compress
# xfs_io -f -c "pwrite 0 2k" -c sync $mnt/foobar
The inode will have NOCOMPRESS flag, even if the content itself (all 0xcd)
can still be compressed very well:
item 4 key (257 INODE_ITEM 0) itemoff 15879 itemsize 160
generation 9 transid 10 size 2097152 nbytes 1052672
block group 0 mode 100600 links 1 uid 0 gid 0 rdev 0
sequence 257 flags 0x8(NOCOMPRESS)
Please note that, this behavior is there even before commit 59615e2c1f63
("btrfs: reject single block sized compression early").
[CAUSE]
At compress_file_range(), after btrfs_compress_folios() call, we try
making an inlined extent by calling cow_file_range_inline().
But cow_file_range_inline() calls can_cow_file_range_inline() which has
more accurate checks on if the range can be inlined.
One of the user configurable conditions is the "max_inline=" mount
option. If that value is set low (like the example, 4 bytes, which
cannot store any header), or the compressed content is just slightly
larger than 2K (the default value, meaning a 50% compression ratio),
cow_file_range_inline() will return 1 immediately.
And since we're here only to try inline the compressed data, the range
is no larger than a single fs block.
Thus compression is never going to make it a win, we fall back to
marking the inode incompressible unavoidably.
[FIX]
Just add an extra check after inline attempt, so that if the inline
attempt failed, do not set the nocompress flag.
As there is no way to remove that flag, and the default 50% compression
ratio is way too strict for the whole inode.
CC: stable@vger.kernel.org # 6.12+
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Filipe Manana <fdmanana@suse.com>
Date: Mon Mar 23 15:50:13 2026 +0000
btrfs: fix deadlock between reflink and transaction commit when using flushoncommit
[ Upstream commit b48c980b6a7e409050bb3067165db31cc6205e3e ]
When using the flushoncommit mount option, we can have a deadlock between
a transaction commit and a reflink operation that copied an inline extent
to an offset beyond the current i_size of the destination node.
The deadlock happens like this:
1) Task A clones an inline extent from inode X to an offset of inode Y
that is beyond Y's current i_size. This means we copied the inline
extent's data to a folio of inode Y that is beyond its EOF, using a
call to copy_inline_to_page();
2) Task B starts a transaction commit and calls
btrfs_start_delalloc_flush() to flush delalloc;
3) The delalloc flushing sees the new dirty folio of inode Y and when it
attempts to flush it, it ends up at extent_writepage() and sees that
the offset of the folio is beyond the i_size of inode Y, so it attempts
to invalidate the folio by calling folio_invalidate(), which ends up at
btrfs' folio invalidate callback - btrfs_invalidate_folio(). There it
tries to lock the folio's range in inode Y's extent io tree, but it
blocks since it's currently locked by task A - during a reflink we lock
the inodes and the source and destination ranges after flushing all
delalloc and waiting for ordered extent completion - after that we
don't expect to have dirty folios in the ranges, the exception is if
we have to copy an inline extent's data (because the destination offset
is not zero);
4) Task A then attempts to start a transaction to update the inode item,
and then it's blocked since the current transaction is in the
TRANS_STATE_COMMIT_START state. Therefore task A has to wait for the
current transaction to become unblocked (its state >=
TRANS_STATE_UNBLOCKED).
So task A is waiting for the transaction commit done by task B, and
the later waiting on the extent lock of inode Y that is currently
held by task A.
Syzbot recently reported this with the following stack traces:
INFO: task kworker/u8:7:1053 blocked for more than 143 seconds.
Not tainted syzkaller #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:kworker/u8:7 state:D stack:23520 pid:1053 tgid:1053 ppid:2 task_flags:0x4208060 flags:0x00080000
Workqueue: writeback wb_workfn (flush-btrfs-46)
Call Trace:
<TASK>
context_switch kernel/sched/core.c:5298 [inline]
__schedule+0x1553/0x5240 kernel/sched/core.c:6911
__schedule_loop kernel/sched/core.c:6993 [inline]
schedule+0x164/0x360 kernel/sched/core.c:7008
wait_extent_bit fs/btrfs/extent-io-tree.c:811 [inline]
btrfs_lock_extent_bits+0x59c/0x700 fs/btrfs/extent-io-tree.c:1914
btrfs_lock_extent fs/btrfs/extent-io-tree.h:152 [inline]
btrfs_invalidate_folio+0x43d/0xc40 fs/btrfs/inode.c:7704
extent_writepage fs/btrfs/extent_io.c:1852 [inline]
extent_write_cache_pages fs/btrfs/extent_io.c:2580 [inline]
btrfs_writepages+0x12ff/0x2440 fs/btrfs/extent_io.c:2713
do_writepages+0x32e/0x550 mm/page-writeback.c:2554
__writeback_single_inode+0x133/0x11a0 fs/fs-writeback.c:1750
writeback_sb_inodes+0x995/0x19d0 fs/fs-writeback.c:2042
wb_writeback+0x456/0xb70 fs/fs-writeback.c:2227
wb_do_writeback fs/fs-writeback.c:2374 [inline]
wb_workfn+0x41a/0xf60 fs/fs-writeback.c:2414
process_one_work kernel/workqueue.c:3276 [inline]
process_scheduled_works+0xb6e/0x18c0 kernel/workqueue.c:3359
worker_thread+0xa53/0xfc0 kernel/workqueue.c:3440
kthread+0x388/0x470 kernel/kthread.c:436
ret_from_fork+0x51e/0xb90 arch/x86/kernel/process.c:158
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
</TASK>
INFO: task syz.4.64:6910 blocked for more than 143 seconds.
Not tainted syzkaller #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:syz.4.64 state:D stack:22752 pid:6910 tgid:6905 ppid:5944 task_flags:0x400140 flags:0x00080002
Call Trace:
<TASK>
context_switch kernel/sched/core.c:5298 [inline]
__schedule+0x1553/0x5240 kernel/sched/core.c:6911
__schedule_loop kernel/sched/core.c:6993 [inline]
schedule+0x164/0x360 kernel/sched/core.c:7008
wait_current_trans+0x39f/0x590 fs/btrfs/transaction.c:535
start_transaction+0x6a7/0x1650 fs/btrfs/transaction.c:705
clone_copy_inline_extent fs/btrfs/reflink.c:299 [inline]
btrfs_clone+0x128a/0x24d0 fs/btrfs/reflink.c:529
btrfs_clone_files+0x271/0x3f0 fs/btrfs/reflink.c:750
btrfs_remap_file_range+0x76b/0x1320 fs/btrfs/reflink.c:903
vfs_copy_file_range+0xda7/0x1390 fs/read_write.c:1600
__do_sys_copy_file_range fs/read_write.c:1683 [inline]
__se_sys_copy_file_range+0x2fb/0x480 fs/read_write.c:1650
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0x14d/0xf80 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f5f73afc799
RSP: 002b:00007f5f7315e028 EFLAGS: 00000246 ORIG_RAX: 0000000000000146
RAX: ffffffffffffffda RBX: 00007f5f73d75fa0 RCX: 00007f5f73afc799
RDX: 0000000000000005 RSI: 0000000000000000 RDI: 0000000000000005
RBP: 00007f5f73b92c99 R08: 0000000000000863 R09: 0000000000000000
R10: 00002000000000c0 R11: 0000000000000246 R12: 0000000000000000
R13: 00007f5f73d76038 R14: 00007f5f73d75fa0 R15: 00007fff138a5068
</TASK>
INFO: task syz.4.64:6975 blocked for more than 143 seconds.
Not tainted syzkaller #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:syz.4.64 state:D stack:24736 pid:6975 tgid:6905 ppid:5944 task_flags:0x400040 flags:0x00080002
Call Trace:
<TASK>
context_switch kernel/sched/core.c:5298 [inline]
__schedule+0x1553/0x5240 kernel/sched/core.c:6911
__schedule_loop kernel/sched/core.c:6993 [inline]
schedule+0x164/0x360 kernel/sched/core.c:7008
wb_wait_for_completion+0x3e8/0x790 fs/fs-writeback.c:227
__writeback_inodes_sb_nr+0x24c/0x2d0 fs/fs-writeback.c:2838
try_to_writeback_inodes_sb+0x9a/0xc0 fs/fs-writeback.c:2886
btrfs_start_delalloc_flush fs/btrfs/transaction.c:2175 [inline]
btrfs_commit_transaction+0x82e/0x31a0 fs/btrfs/transaction.c:2364
btrfs_ioctl+0xca7/0xd00 fs/btrfs/ioctl.c:5206
vfs_ioctl fs/ioctl.c:51 [inline]
__do_sys_ioctl fs/ioctl.c:597 [inline]
__se_sys_ioctl+0xff/0x170 fs/ioctl.c:583
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0x14d/0xf80 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f5f73afc799
RSP: 002b:00007f5f7313d028 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007f5f73d76090 RCX: 00007f5f73afc799
RDX: 0000000000000000 RSI: 0000000000009408 RDI: 0000000000000004
RBP: 00007f5f73b92c99 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007f5f73d76128 R14: 00007f5f73d76090 R15: 00007fff138a5068
</TASK>
Fix this by updating the i_size of the destination inode of a reflink
operation after we copy an inline extent's data to an offset beyond the
i_size and before attempting to start a transaction to update the inode's
item.
Reported-by: syzbot+63056bf627663701bbbf@syzkaller.appspotmail.com
Link: https://lore.kernel.org/linux-btrfs/69bba3fe.050a0220.227207.002f.GAE@google.com/
Fixes: 05a5a7621ce6 ("Btrfs: implement full reflink support for inline extents")
Reviewed-by: Boris Burkov <boris@bur.io>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Mark Harmstone <mark@harmstone.com>
Date: Thu Apr 16 18:15:23 2026 +0100
btrfs: fix double-decrement of bytes_may_use in submit_one_async_extent()
[ Upstream commit 82323b1a7088b7a5c3e528a5d634bff447fa286f ]
submit_one_async_extent() calls btrfs_reserve_extent(), which decrements
bytes_may_use. If the call btrfs_create_io_em() fails, we jump to
out_free_reserve, which calls extent_clear_unlock_delalloc().
Because we're specifying EXTENT_DO_ACCOUNTING, i.e.
EXTENT_CLEAR_META_RESV | EXTENT_CLEAR_DATA_RESV, this decreases
bytes_may_use again. This can lead to problems later on, as an initial
write can fail only for the writeback to silently ENOSPC.
Fix this by replacing EXTENT_DO_ACCOUNTING with EXTENT_CLEAR_META_RESV.
This parallels a4fe134fc1d8eb ("btrfs: fix a double release on reserved
extents in cow_one_range()"), which is the same fix in cow_one_range().
Fixes: 151a41bc46df ("Btrfs: fix what bits we clear when erroring out from delalloc")
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Mark Harmstone <mark@harmstone.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Filipe Manana <fdmanana@suse.com>
Date: Sat May 16 15:09:18 2026 -0400
btrfs: fix missing last_unlink_trans update when removing a directory
[ Upstream commit 999757231c49376cd1a37308d2c8c4c9932571e1 ]
When removing a directory we are not updating its last_unlink_trans field,
which can result in incorrect fsync behaviour in case some one fsyncs the
directory after it was removed because it's holding a file descriptor on
it.
Example scenario:
mkdir /mnt/dir1
mkdir /mnt/dir1/dir2
mkdir /mnt/dir3
sync -f /mnt
# Do some change to the directory and fsync it.
chmod 700 /mnt/dir1
xfs_io -c fsync /mnt/dir1
# Move dir2 out of dir1 so that dir1 becomes empty.
mv /mnt/dir1/dir2 /mnt/dir3/
open fd on /mnt/dir1
call rmdir(2) on path "/mnt/dir1"
fsync fd
<trigger power failure>
When attempting to mount the filesystem, the log replay will fail with
an -EIO error and dmesg/syslog has the following:
[445771.626482] BTRFS info (device dm-0): first mount of filesystem 0368bbea-6c5e-44b5-b409-09abe496e650
[445771.626486] BTRFS info (device dm-0): using crc32c checksum algorithm
[445771.627912] BTRFS info (device dm-0): start tree-log replay
[445771.628335] page: refcount:2 mapcount:0 mapping:0000000061443ddc index:0x1d00 pfn:0x7072a5
[445771.629453] memcg:ffff89f400351b00
[445771.629892] aops:btree_aops [btrfs] ino:1
[445771.630737] flags: 0x17fffc00000402a(uptodate|lru|private|writeback|node=0|zone=2|lastcpupid=0x1ffff)
[445771.632359] raw: 017fffc00000402a fffff47284d950c8 fffff472907b7c08 ffff89f458e412b8
[445771.633713] raw: 0000000000001d00 ffff89f6c51d1a90 00000002ffffffff ffff89f400351b00
[445771.635029] page dumped because: eb page dump
[445771.635825] BTRFS critical (device dm-0): corrupt leaf: root=5 block=30408704 slot=10 ino=258, invalid nlink: has 2 expect no more than 1 for dir
[445771.638088] BTRFS info (device dm-0): leaf 30408704 gen 10 total ptrs 17 free space 14878 owner 5
[445771.638091] BTRFS info (device dm-0): refs 4 lock_owner 0 current 3581087
[445771.638094] item 0 key (256 INODE_ITEM 0) itemoff 16123 itemsize 160
[445771.638097] inode generation 3 transid 9 size 16 nbytes 16384
[445771.638098] block group 0 mode 40755 links 1 uid 0 gid 0
[445771.638100] rdev 0 sequence 2 flags 0x0
[445771.638102] atime 1775744884.0
[445771.660056] ctime 1775744885.645502983
[445771.660058] mtime 1775744885.645502983
[445771.660060] otime 1775744884.0
[445771.660062] item 1 key (256 INODE_REF 256) itemoff 16111 itemsize 12
[445771.660064] index 0 name_len 2
[445771.660066] item 2 key (256 DIR_ITEM 1843588421) itemoff 16077 itemsize 34
[445771.660068] location key (259 1 0) type 2
[445771.660070] transid 9 data_len 0 name_len 4
[445771.660075] item 3 key (256 DIR_ITEM 2363071922) itemoff 16043 itemsize 34
[445771.660076] location key (257 1 0) type 2
[445771.660077] transid 9 data_len 0 name_len 4
[445771.660078] item 4 key (256 DIR_INDEX 2) itemoff 16009 itemsize 34
[445771.660079] location key (257 1 0) type 2
[445771.660080] transid 9 data_len 0 name_len 4
[445771.660081] item 5 key (256 DIR_INDEX 3) itemoff 15975 itemsize 34
[445771.660082] location key (259 1 0) type 2
[445771.660083] transid 9 data_len 0 name_len 4
[445771.660084] item 6 key (257 INODE_ITEM 0) itemoff 15815 itemsize 160
[445771.660086] inode generation 9 transid 9 size 8 nbytes 0
[445771.660087] block group 0 mode 40777 links 1 uid 0 gid 0
[445771.660088] rdev 0 sequence 2 flags 0x0
[445771.660089] atime 1775744885.641174097
[445771.660090] ctime 1775744885.645502983
[445771.660091] mtime 1775744885.645502983
[445771.660105] otime 1775744885.641174097
[445771.660106] item 7 key (257 INODE_REF 256) itemoff 15801 itemsize 14
[445771.660107] index 2 name_len 4
[445771.660108] item 8 key (257 DIR_ITEM 2676584006) itemoff 15767 itemsize 34
[445771.660109] location key (258 1 0) type 2
[445771.660110] transid 9 data_len 0 name_len 4
[445771.660111] item 9 key (257 DIR_INDEX 2) itemoff 15733 itemsize 34
[445771.660112] location key (258 1 0) type 2
[445771.660113] transid 9 data_len 0 name_len 4
[445771.660114] item 10 key (258 INODE_ITEM 0) itemoff 15573 itemsize 160
[445771.660115] inode generation 9 transid 10 size 0 nbytes 0
[445771.660116] block group 0 mode 40755 links 2 uid 0 gid 0
[445771.660117] rdev 0 sequence 0 flags 0x0
[445771.660118] atime 1775744885.645502983
[445771.660119] ctime 1775744885.645502983
[445771.660120] mtime 1775744885.645502983
[445771.660121] otime 1775744885.645502983
[445771.660122] item 11 key (258 INODE_REF 257) itemoff 15559 itemsize 14
[445771.660123] index 2 name_len 4
[445771.660124] item 12 key (258 INODE_REF 259) itemoff 15545 itemsize 14
[445771.660125] index 2 name_len 4
[445771.660126] item 13 key (259 INODE_ITEM 0) itemoff 15385 itemsize 160
[445771.660127] inode generation 9 transid 10 size 8 nbytes 0
[445771.660128] block group 0 mode 40755 links 1 uid 0 gid 0
[445771.660129] rdev 0 sequence 1 flags 0x0
[445771.660130] atime 1775744885.645502983
[445771.660130] ctime 1775744885.645502983
[445771.660131] mtime 1775744885.645502983
[445771.660132] otime 1775744885.645502983
[445771.660133] item 14 key (259 INODE_REF 256) itemoff 15371 itemsize 14
[445771.660134] index 3 name_len 4
[445771.660135] item 15 key (259 DIR_ITEM 2676584006) itemoff 15337 itemsize 34
[445771.660136] location key (258 1 0) type 2
[445771.660137] transid 10 data_len 0 name_len 4
[445771.660138] item 16 key (259 DIR_INDEX 2) itemoff 15303 itemsize 34
[445771.660139] location key (258 1 0) type 2
[445771.660140] transid 10 data_len 0 name_len 4
[445771.660144] BTRFS error (device dm-0): block=30408704 write time tree block corruption detected
[445771.661650] ------------[ cut here ]------------
[445771.662358] WARNING: fs/btrfs/disk-io.c:326 at btree_csum_one_bio+0x217/0x230 [btrfs], CPU#8: mount/3581087
[445771.663588] Modules linked in: btrfs f2fs xfs (...)
[445771.671229] CPU: 8 UID: 0 PID: 3581087 Comm: mount Tainted: G W 7.0.0-rc6-btrfs-next-230+ #2 PREEMPT(full)
[445771.672575] Tainted: [W]=WARN
[445771.672987] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.2-0-gea1b7a073390-prebuilt.qemu.org 04/01/2014
[445771.674460] RIP: 0010:btree_csum_one_bio+0x217/0x230 [btrfs]
[445771.675222] Code: 89 44 24 (...)
[445771.677364] RSP: 0018:ffffd23882247660 EFLAGS: 00010246
[445771.678029] RAX: 0000000000000000 RBX: ffff89f6c51d1a90 RCX: 0000000000000000
[445771.678975] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff89f406020000
[445771.679983] RBP: ffff89f821204000 R08: 0000000000000000 R09: 00000000ffefffff
[445771.680905] R10: ffffd23882247448 R11: 0000000000000003 R12: ffffd23882247668
[445771.681978] R13: ffff89f458e40fc0 R14: ffff89f737f4f500 R15: ffff89f737f4f500
[445771.682912] FS: 00007f0447a98840(0000) GS:ffff89fb9771d000(0000) knlGS:0000000000000000
[445771.684393] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[445771.685230] CR2: 00007f0447bf1330 CR3: 000000017cb02002 CR4: 0000000000370ef0
[445771.686273] Call Trace:
[445771.686646] <TASK>
[445771.686969] btrfs_submit_bbio+0x83f/0x860 [btrfs]
[445771.687750] ? write_one_eb+0x28f/0x340 [btrfs]
[445771.688428] btree_writepages+0x2e3/0x550 [btrfs]
[445771.689180] ? kmem_cache_alloc_noprof+0x12a/0x490
[445771.689963] ? alloc_extent_state+0x19/0x120 [btrfs]
[445771.690801] ? kmem_cache_free+0x135/0x380
[445771.691328] ? preempt_count_add+0x69/0xa0
[445771.691831] ? set_extent_bit+0x252/0x8e0 [btrfs]
[445771.692468] ? xas_load+0x9/0xc0
[445771.692873] ? xas_find+0x14d/0x1a0
[445771.693304] do_writepages+0xc6/0x160
[445771.693756] filemap_writeback+0xb8/0xe0
[445771.694274] btrfs_write_marked_extents+0x61/0x170 [btrfs]
[445771.694999] btrfs_write_and_wait_transaction+0x4e/0xc0 [btrfs]
[445771.695818] btrfs_commit_transaction+0x5c8/0xd10 [btrfs]
[445771.696530] ? kmem_cache_free+0x135/0x380
[445771.697120] ? release_extent_buffer+0x34/0x160 [btrfs]
[445771.697786] btrfs_recover_log_trees+0x7be/0x7e0 [btrfs]
[445771.698525] ? __pfx_replay_one_buffer+0x10/0x10 [btrfs]
[445771.699206] open_ctree+0x11e5/0x1810 [btrfs]
[445771.699776] btrfs_get_tree.cold+0xb/0x162 [btrfs]
[445771.700463] ? fscontext_read+0x165/0x180
[445771.701146] ? rw_verify_area+0x50/0x180
[445771.701866] vfs_get_tree+0x25/0xd0
[445771.702491] vfs_cmd_create+0x59/0xe0
[445771.703125] __do_sys_fsconfig+0x303/0x610
[445771.703603] do_syscall_64+0xe9/0xf20
[445771.703974] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[445771.704700] RIP: 0033:0x7f0447cbd4aa
[445771.705108] Code: 73 01 c3 (...)
[445771.707263] RSP: 002b:00007ffc4e528318 EFLAGS: 00000246 ORIG_RAX: 00000000000001af
[445771.708107] RAX: ffffffffffffffda RBX: 00005561585d8c20 RCX: 00007f0447cbd4aa
[445771.708931] RDX: 0000000000000000 RSI: 0000000000000006 RDI: 0000000000000003
[445771.709744] RBP: 00005561585d9120 R08: 0000000000000000 R09: 0000000000000000
[445771.710674] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
[445771.711477] R13: 00007f0447e4f580 R14: 00007f0447e5126c R15: 00007f0447e36a23
[445771.712277] </TASK>
[445771.712541] ---[ end trace 0000000000000000 ]---
[445771.713382] BTRFS error (device dm-0): error while writing out transaction: -5
[445771.714679] BTRFS warning (device dm-0): Skipping commit of aborted transaction.
[445771.715562] BTRFS error (device dm-0 state A): Transaction aborted (error -5)
[445771.716459] BTRFS: error (device dm-0 state A) in cleanup_transaction:2068: errno=-5 IO failure
[445771.717936] BTRFS error (device dm-0 state EA): failed to recover log trees with error: -5
[445771.719681] BTRFS error (device dm-0 state EA): open_ctree failed: -5
The problem is that such a fsync should have result in a fallback to a
transaction commit, but that did not happen because through the
btrfs_rmdir() we never update the directory's last_unlink_trans field.
Any inode that had a link removed must have its last_unlink_trans updated
to the ID of transaction used for the operation, otherwise fsync and log
replay will not work correctly.
btrfs_rmdir() calls btrfs_unlink_inode() and through that call chain we
never call btrfs_record_unlink_dir() in order to update last_unlink_trans.
However btrfs_unlink(), which is used for unlinking regular files, calls
btrfs_record_unlink_dir() and then calls btrfs_unlink_inode(). So fix
this by moving the call to btrfs_record_unlink_dir() from btrfs_unlink()
to btrfs_unlink_inode().
A test case for fstests will follow soon.
Reported-by: Slava0135 <slava.kovalevskiy.2014@gmail.com>
Link: https://lore.kernel.org/linux-btrfs/CAAJYhww5ov62Hm+n+tmhcL-e_4cBobg+OWogKjOJxVUXivC=MQ@mail.gmail.com/
CC: stable@vger.kernel.org
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: David Sterba <dsterba@suse.com>
Date: Tue Feb 18 01:31:38 2025 +0100
btrfs: pass struct btrfs_inode to clone_copy_inline_extent()
[ Upstream commit 65a66afd1ee5b2770fde296663baa0f79af56bc7 ]
Pass a struct btrfs_inode to clone_copy_inline_extent() as it's an
internal interface, allowing to remove some use of BTRFS_I.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Stable-dep-of: b48c980b6a7e ("btrfs: fix deadlock between reflink and transaction commit when using flushoncommit")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Filipe Manana <fdmanana@suse.com>
Date: Sat May 16 15:09:17 2026 -0400
btrfs: use btrfs inodes in btrfs_rmdir() to avoid so much usage of BTRFS_I()
[ Upstream commit 98060e1611177ddc842601a58258876ab435fdbf ]
Almost everywhere we want to use a btrfs inode and therefore we have a
lot of calls to BTRFS_I(), making the code more verbose. Instead use btrfs
inode local variables to avoid so much use of BTRFS_I().
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Stable-dep-of: 999757231c49 ("btrfs: fix missing last_unlink_trans update when removing a directory")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Filipe Manana <fdmanana@suse.com>
Date: Sat May 16 15:09:16 2026 -0400
btrfs: use inode already stored in local variable at btrfs_rmdir()
[ Upstream commit 9f82a4ed34d870b5719f9b95f7da4f74d3325a6f ]
There's no need to call d_inode(dentry) when calling btrfs_unlink_inode()
since we have already stored that in a local inode variable. So just use
the local variable to make the code less verbose.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Stable-dep-of: 999757231c49 ("btrfs: fix missing last_unlink_trans update when removing a directory")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Danilo Krummrich <dakr@kernel.org>
Date: Tue Mar 24 01:59:06 2026 +0100
bus: fsl-mc: use generic driver_override infrastructure
[ Upstream commit 6c8dfb0362732bf1e4829867a2a5239fedc592d0 ]
When a driver is probed through __driver_attach(), the bus' match()
callback is called without the device lock held, thus accessing the
driver_override field without a lock, which can cause a UAF.
Fix this by using the driver-core driver_override infrastructure taking
care of proper locking internally.
Note that calling match() from __driver_attach() without the device lock
held is intentional. [1]
Tested-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Acked-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Acked-by: Christophe Leroy (CS GROUP) <chleroy@kernel.org>
Link: https://lore.kernel.org/driver-core/DGRGTIRHA62X.3RY09D9SOK77P@kernel.org/ [1]
Reported-by: Gui-Dong Han <hanguidong02@gmail.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220789
Fixes: 1f86a00c1159 ("bus/fsl-mc: add support for 'driver_override' in the mc-bus")
Link: https://patch.msgid.link/20260324005919.2408620-3-dakr@kernel.org
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Gatien Chevallier <gatien.chevallier@foss.st.com>
Date: Thu Jan 29 13:56:17 2026 +0100
bus: rifsc: fix RIF configuration check for peripherals
[ Upstream commit d5ce3b4e951bc41a6ce877c8500bb4fe42146669 ]
Peripheral holding CID0 cannot be accessed, remove this completely
incorrect check. While there, fix and simplify the semaphore checking
that should be performed when the CID filtering is enabled.
Fixes: a18208457253 ("bus: rifsc: introduce RIFSC firewall controller driver")
Signed-off-by: Gatien Chevallier <gatien.chevallier@foss.st.com>
Link: https://lore.kernel.org/r/20260129-fix_cid_check_rifsc-v1-1-ef280ccf764d@foss.st.com
Signed-off-by: Alexandre Torgue <alexandre.torgue@foss.st.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Daan De Meyer <daan@amutable.com>
Date: Mon Apr 27 22:01:39 2026 +0100
cdrom, scsi: sr: propagate read-only status to block layer via set_disk_ro()
[ Upstream commit 0898a817621a2f0cddca8122d9b974003fe5036d ]
The cdrom core never calls set_disk_ro() for a registered device, so
BLKROGET on a CD-ROM device always returns 0 (writable), even when the
drive has no write capabilities and writes will inevitably fail. This
causes problems for userspace that relies on BLKROGET to determine
whether a block device is read-only. For example, systemd's loop device
setup uses BLKROGET to decide whether to create a loop device with
LO_FLAGS_READ_ONLY. Without the read-only flag, writes pass through the
loop device to the CD-ROM and fail with I/O errors. systemd-fsck
similarly checks BLKROGET to decide whether to run fsck in no-repair
mode (-n).
The write-capability bits in cdi->mask come from two different sources:
CDC_DVD_RAM and CDC_CD_RW are populated by the driver from the MODE
SENSE capabilities page (page 0x2A) before register_cdrom() is called,
while CDC_MRW_W and CDC_RAM require the MMC GET CONFIGURATION command
and were only probed by cdrom_open_write() at device open time. This
meant that any attempt to compute the writable state from the full
mask at probe time was incorrect, because the GET CONFIGURATION bits
were still unset (and cdi->mask is initialized such that capabilities
are assumed present).
Fix this by factoring the GET CONFIGURATION probing out of
cdrom_open_write() into a new exported helper,
cdrom_probe_write_features(), and having sr call it from sr_probe()
right after get_capabilities() has populated the MODE SENSE bits.
register_cdrom() then calls set_disk_ro() based on the full
write-capability mask (CDC_DVD_RAM | CDC_MRW_W | CDC_RAM | CDC_CD_RW)
so the block layer reflects the drive's actual write support. The
feature queries used (CDF_MRW and CDF_RWRT via GET CONFIGURATION with
RT=00) report drive-level capabilities that are persistent across
media, so a single probe before register_cdrom() is sufficient and the
redundant probe at open time is dropped.
With set_disk_ro() now accurate, the long-vestigial cd->writeable flag
in sr can go: get_capabilities() used to set cd->writeable based on
the same four mask bits, but because CDC_MRW_W and CDC_RAM default to
"capability present" in cdi->mask and aren't touched by MODE SENSE,
the condition that gated cd->writeable was always true, making it
unconditionally 1. Replace the corresponding gate in sr_init_command()
with get_disk_ro(cd->disk), which turns a previously no-op check into
a real one and also catches kernel-internal bio writers that bypass
blkdev_write_iter()'s bdev_read_only() check.
The sd driver (SCSI disks) does not have this problem because it
checks the MODE SENSE Write Protect bit and calls set_disk_ro()
accordingly. The sr driver cannot use the same approach because the
MMC specification does not define the WP bit in the MODE SENSE
device-specific parameter byte for CD-ROM devices.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Daan De Meyer <daan@amutable.com>
Reviewed-by: Phillip Potter <phil@philpotter.co.uk>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Phillip Potter <phil@philpotter.co.uk>
Link: https://patch.msgid.link/20260427210139.1400-2-phil@philpotter.co.uk
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
Date: Thu Apr 9 12:26:02 2026 -0700
ceph: fix a buffer leak in __ceph_setxattr()
commit 5d3cc36b4e77a27ce7b686b7c59c7072bcb3fa8e upstream.
The old_blob in __ceph_setxattr() can store
ci->i_xattrs.prealloc_blob value during the retry.
However, it is never called the ceph_buffer_put()
for the old_blob object. This patch fixes the issue of
the buffer leak.
Cc: stable@vger.kernel.org
Signed-off-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
Reviewed-by: Alex Markuze <amarkuze@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
Date: Thu Apr 9 12:43:40 2026 -0700
ceph: fix BUG_ON in __ceph_build_xattrs_blob() due to stale blob size
commit 0c22d9511cbde746622f8e4c11aaa63fe76d45f9 upstream.
The generic/642 test-case can reproduce the kernel crash:
[40243.605254] ------------[ cut here ]------------
[40243.605956] kernel BUG at fs/ceph/xattr.c:918!
[40243.607142] Oops: invalid opcode: 0000 [#1] SMP PTI
[40243.608067] CPU: 7 UID: 0 PID: 498762 Comm: kworker/7:1 Not tainted 7.0.0-rc7+ #3 PREEMPT(full)
[40243.609700] Hardware name: QEMU Ubuntu 25.10 PC v2 (i440FX + PIIX, + 10.1 machine, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
[40243.611820] Workqueue: ceph-msgr ceph_con_workfn
[40243.612715] RIP: 0010:__ceph_build_xattrs_blob+0x1b8/0x1e0
[40243.613731] Code: 0f 84 82 fe ff ff e9 cf 8e 56 ff 48 8d 65 e8 31 c0 5b 41 5c 41 5d 5d 31 d2 31 c9 31 f6 31 ff 45 31 c0 45 31 c9 c3 cc cc cc cc <0f> 0b 4c 8b 62 08 41 8b 85 24 07 00 00 49 83 c4 04 41 89 44 24 fc
[40243.616888] RSP: 0018:ffffcc80c4d4b688 EFLAGS: 00010287
[40243.617773] RAX: 0000000000010026 RBX: 0000000000000001 RCX: 0000000000000000
[40243.618928] RDX: ffff8a773798dee0 RSI: 0000000000000000 RDI: 0000000000000000
[40243.620158] RBP: ffffcc80c4d4b6a0 R08: 0000000000000000 R09: 0000000000000000
[40243.621573] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8a75f3b58000
[40243.622907] R13: ffff8a75f3b58000 R14: 0000000000000080 R15: 000000000000bffd
[40243.624054] FS: 0000000000000000(0000) GS:ffff8a787d1b4000(0000) knlGS:0000000000000000
[40243.625331] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[40243.626269] CR2: 000072f390b623c0 CR3: 000000011c02a003 CR4: 0000000000372ef0
[40243.627408] Call Trace:
[40243.627839] <TASK>
[40243.628188] __prep_cap+0x3fd/0x4a0
[40243.628789] ? do_raw_spin_unlock+0x4e/0xe0
[40243.629474] ceph_check_caps+0x46a/0xc80
[40243.630094] ? __lock_acquire+0x4a2/0x2650
[40243.630773] ? find_held_lock+0x31/0x90
[40243.631347] ? handle_cap_grant+0x79f/0x1060
[40243.632068] ? lock_release+0xd9/0x300
[40243.632696] ? __mutex_unlock_slowpath+0x3e/0x340
[40243.633429] ? lock_release+0xd9/0x300
[40243.634052] handle_cap_grant+0xcf6/0x1060
[40243.634745] ceph_handle_caps+0x122b/0x2110
[40243.635415] mds_dispatch+0x5bd/0x2160
[40243.636034] ? ceph_con_process_message+0x65/0x190
[40243.636828] ? lock_release+0xd9/0x300
[40243.637431] ceph_con_process_message+0x7a/0x190
[40243.638184] ? kfree+0x311/0x4f0
[40243.638749] ? kfree+0x311/0x4f0
[40243.639268] process_message+0x16/0x1a0
[40243.639915] ? sg_free_table+0x39/0x90
[40243.640572] ceph_con_v2_try_read+0xf58/0x2120
[40243.641255] ? lock_acquire+0xc8/0x300
[40243.641863] ceph_con_workfn+0x151/0x820
[40243.642493] process_one_work+0x22f/0x630
[40243.643093] ? process_one_work+0x254/0x630
[40243.643770] worker_thread+0x1e2/0x400
[40243.644332] ? __pfx_worker_thread+0x10/0x10
[40243.645020] kthread+0x109/0x140
[40243.645560] ? __pfx_kthread+0x10/0x10
[40243.646125] ret_from_fork+0x3f8/0x480
[40243.646752] ? __pfx_kthread+0x10/0x10
[40243.647316] ? __pfx_kthread+0x10/0x10
[40243.647919] ret_from_fork_asm+0x1a/0x30
[40243.648556] </TASK>
[40243.648902] Modules linked in: overlay hctr2 libpolyval chacha libchacha adiantum libnh libpoly1305 essiv intel_rapl_msr intel_rapl_common intel_uncore_frequency_common skx_edac_common nfit kvm_intel kvm irqbypass joydev ghash_clmulni_intel aesni_intel rapl input_leds mac_hid psmouse vga16fb serio_raw vgastate floppy i2c_piix4 pata_acpi bochs qemu_fw_cfg i2c_smbus sch_fq_codel rbd dm_crypt msr parport_pc ppdev lp parport efi_pstore
[40243.654766] ---[ end trace 0000000000000000 ]---
Commit d93231a6bc8a ("ceph: prevent a client from exceeding the MDS
maximum xattr size") moved the required_blob_size computation to before
the __build_xattrs() call, introducing a race.
__build_xattrs() releases and reacquires i_ceph_lock during execution.
In that window, handle_cap_grant() may update i_xattrs.blob with a
newer MDS-provided blob and bump i_xattrs.version. When
__build_xattrs() detects that index_version < version, it destroys and
rebuilds the entire xattr rb-tree from the new blob, potentially
increasing count, names_size, and vals_size.
The prealloc_blob size check that follows still uses the stale
required_blob_size computed before the rebuild, so it passes even when
prealloc_blob is too small for the now-larger tree. After __set_xattr()
adds one more xattr on top, __ceph_build_xattrs_blob() is called from
the cap flush path and hits:
BUG_ON(need > ci->i_xattrs.prealloc_blob->alloc_len);
Fix this by recomputing required_blob_size after __build_xattrs()
returns, using the current tree state. Also re-validate against
m_max_xattr_size to fall back to the sync path if the rebuilt tree now
exceeds the MDS limit.
Cc: stable@vger.kernel.org
Fixes: d93231a6bc8a ("ceph: prevent a client from exceeding the MDS maximum xattr size")
Signed-off-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
Reviewed-by: Alex Markuze <amarkuze@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: cuitao <cuitao@kylinos.cn>
Date: Tue Apr 14 09:53:27 2026 +0800
cgroup/rdma: fix integer overflow in rdmacg_try_charge()
[ Upstream commit c802f460dd485c1332b5a35e7adcfb2bc22536a2 ]
The expression `rpool->resources[index].usage + 1` is computed in int
arithmetic before being assigned to s64 variable `new`. When usage equals
INT_MAX (the default "max" value), the addition overflows to INT_MIN.
This negative value then passes the `new > max` check incorrectly,
allowing a charge that should be rejected and corrupting usage to
negative.
Fix by casting usage to s64 before the addition so the arithmetic is
done in 64-bit.
Fixes: 39d3e7584a68 ("rdmacg: Added rdma cgroup controller")
Signed-off-by: cuitao <cuitao@kylinos.cn>
Reviewed-by: Michal Koutný <mkoutny@suse.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Petr Malat <oss@malat.biz>
Date: Thu Apr 23 04:48:26 2026 -0500
cgroup: Increment nr_dying_subsys_* from rmdir context
[ Upstream commit 13e786b64bd3fd81c7eb22aa32bf8305c32f2ccf ]
Incrementing nr_dying_subsys_* in offline_css(), which is executed by
cgroup_offline_wq worker, leads to a race where user can see the value
to be 0 if he reads cgroup.stat after calling rmdir and before the worker
executes. This makes the user wrongly expect resources released by the
removed cgroup to be available for a new assignment.
Increment nr_dying_subsys_* from kill_css(), which is called from the
cgroup_rmdir() context.
Fixes: ab0312526867 ("cgroup: Show # of subsystem CSSes in cgroup.stat")
Signed-off-by: Petr Malat <oss@malat.biz>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Sebastian Krzyszkowiak <sebastian.krzyszkowiak@puri.sm>
Date: Wed Jan 28 00:47:21 2026 +0100
clk: imx8mq: Correct the CSI PHY sels
[ Upstream commit d16f57caa78776e6e8a88b96cb2597797b376138 ]
According to i.MX 8M Quad Reference Manual (Section 5.1.2 Table 5-1)
MIPI_CSI1_PHY_REF_CLK_ROOT and MIPI_CSI2_PHY_REF_CLK_ROOT have
SYSTEM_PLL2_DIV3 available as their second source, which corresponds
to sys2_pll_333m rather than sys2_pll_125m.
Fixes: b80522040cd3 ("clk: imx: Add clock driver for i.MX8MQ CCM")
Signed-off-by: Sebastian Krzyszkowiak <sebastian.krzyszkowiak@puri.sm>
Reviewed-by: Peng Fan <peng.fan@nxp.com>
Link: https://patch.msgid.link/20260128-imx8mq-csi-clk-v1-1-ac028ed26e8c@puri.sm
Signed-off-by: Abel Vesa <abel.vesa@oss.qualcomm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Felix Gu <ustc.gu@gmail.com>
Date: Tue Feb 3 22:07:58 2026 +0800
clk: imx: imx6q: Fix device node reference leak in of_assigned_ldb_sels()
[ Upstream commit 9faf207208951460f3f7eefbc112246c8d28ff1b ]
The function of_assigned_ldb_sels() calls of_parse_phandle_with_args()
but never calls of_node_put() to release the reference, causing a memory
leak.
Fix this by adding proper cleanup calls on all exit paths.
Fixes: 5d283b083800 ("clk: imx6: Fix procedure to switch the parent of LDB_DI_CLK")
Signed-off-by: Felix Gu <ustc.gu@gmail.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Reviewed-by: Peng Fan <peng.fan@nxp.com>
Link: https://patch.msgid.link/20260203-clk-imx6q-v3-2-6cd2696bb371@gmail.com
Signed-off-by: Abel Vesa <abel.vesa@oss.qualcomm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Felix Gu <ustc.gu@gmail.com>
Date: Tue Feb 3 22:07:57 2026 +0800
clk: imx: imx6q: Fix device node reference leak in pll6_bypassed()
[ Upstream commit 4b84d496c804b470124cd3a08e928df6801d8eae ]
The function pll6_bypassed() calls of_parse_phandle_with_args()
but never calls of_node_put() to release the reference, causing
a memory leak.
Fix this by adding proper cleanup calls on all exit paths.
Fixes: 3cc48976e9763 ("clk: imx6q: handle ENET PLL bypass")
Signed-off-by: Felix Gu <ustc.gu@gmail.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Reviewed-by: Peng Fan <peng.fan@nxp.com>
Link: https://patch.msgid.link/20260203-clk-imx6q-v3-1-6cd2696bb371@gmail.com
Signed-off-by: Abel Vesa <abel.vesa@oss.qualcomm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Date: Tue Jan 20 12:19:26 2026 +0100
clk: qcom: dispcc-sc7180: Add missing MDSS resets
[ Upstream commit b0bc6011c5499bdfddd0390262bfa13dce1eff74 ]
The MDSS resets have so far been left undescribed. Fix that.
Fixes: dd3d06622138 ("clk: qcom: Add display clock controller driver for SC7180")
Signed-off-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Reviewed-by: Taniya Das <taniya.das@oss.qualcomm.com>
Tested-by: Val Packett <val@packett.cool> # sc7180-ecs-liva-qc710
Link: https://lore.kernel.org/r/20260120-topic-7180_dispcc_bcr-v1-2-0b1b442156c3@oss.qualcomm.com
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: White Lewis <liu224806@gmail.com>
Date: Tue Mar 3 19:55:50 2026 +0800
clk: qcom: dispcc-sc8280xp: remove CLK_SET_RATE_PARENT from byte_div_clk_src dividers
[ Upstream commit 0b151a6307205eb867250985a910a88787cbf12e ]
The four byte_div_clk_src dividers (disp{0,1}_cc_mdss_byte{0,1}_div_clk_src)
had CLK_SET_RATE_PARENT set. When the DSI driver calls clk_set_rate() on
byte_intf_clk, the rate-change propagates through the divider up to the
parent PLL (byte_clk_src), halving the byte clock rate.
A simiar issue had been also encountered on SM8750.
b8501febdc51 ("clk: qcom: dispcc-sm8750: Drop incorrect CLK_SET_RATE_PARENT on byte intf parent").
Likewise, remove CLK_SET_RATE_PARENT from all four byte divider clocks
so that clk_set_rate() on the divider adjusts only the divider ratio,
leaving the parent PLL untouched.
Fixes: 4a66e76fdb6d ("clk: qcom: Add SC8280XP display clock controller")
Signed-off-by: White Lewis <liu224806@gmail.com>
[pengyu: reword]
Signed-off-by: Pengyu Luo <mitltlatltl@gmail.com>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260303115550.9279-1-mitltlatltl@gmail.com
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Date: Wed Mar 4 14:48:30 2026 +0100
clk: qcom: dispcc-sm4450: Fix DSI byte clock rate setting
[ Upstream commit 7bc48fcdf9e77bf68ef04af015d50df2a9acac00 ]
The clock tree for byte_clk_src is as follows:
┌──────byte0_clk_src─────┐
│ │
byte0_clk byte0_div_clk_src
│
byte0_intf_clk
If both of its direct children have CLK_SET_RATE_PARENT with different
requests, byte0_clk_src (and its parent) will be reconfigured. In this
case, byte0_intf should strictly follow the rate of byte0_clk (with
some adjustments based on PHY mode).
Remove CLK_SET_RATE_PARENT from byte0_div_clk_src to avoid this issue.
Fixes: 76f05f1ec766 ("clk: qcom: Add DISPCC driver support for SM4450")
Signed-off-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260304-topic-dsi_byte_fixup-v1-4-b79b29f83176@oss.qualcomm.com
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Val Packett <val@packett.cool>
Date: Thu Mar 12 08:12:13 2026 -0300
clk: qcom: dispcc-sm8250: Enable parents for pixel clocks
[ Upstream commit acf7a91d0b0e9e3ef374944021de62062125b7e4 ]
Add CLK_OPS_PARENT_ENABLE to MDSS pixel clock sources to ensure parent
clocks are enabled during clock operations, preventing potential
stability issues during display configuration.
Fixes: 80a18f4a8567 ("clk: qcom: Add display clock controller driver for SM8150 and SM8250")
Signed-off-by: Val Packett <val@packett.cool>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260312112321.370983-9-val@packett.cool
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Val Packett <val@packett.cool>
Date: Thu Mar 12 08:12:12 2026 -0300
clk: qcom: dispcc-sm8250: Use shared ops on the mdss vsync clk
[ Upstream commit 8c522da70f0c2e5148c4c13ccb1c64cca57a6fdb ]
mdss_gdsc can get stuck on boot due to RCGs being left on from last boot.
As a fix, commit 01a0a6cc8cfd ("clk: qcom: Park shared RCGs upon
registration") introduced a callback to ensure the RCG is off upon init.
However, the fix depends on all shared RCGs being marked as such in code.
For SM8150/SC8180X/SM8250 the MDSS vsync clock was using regular ops,
unlike the same clock in the SC7180 code. This was causing display to
frequently fail to initialize after rebooting on the Surface Pro X.
Fix by using shared ops for this clock.
Fixes: 80a18f4a8567 ("clk: qcom: Add display clock controller driver for SM8150 and SM8250")
Signed-off-by: Val Packett <val@packett.cool>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260312112321.370983-8-val@packett.cool
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Date: Mon Jan 12 04:12:23 2026 +0200
clk: qcom: dispcc-sm8450: use RCG2 ops for DPTX1 AUX clock source
[ Upstream commit 141af1be817c42c7f1e1605348d4b1983d319bea ]
The clk_dp_ops are supposed to be used for DP-related clocks with a
proper MND divier. Use standard RCG2 ops for dptx1_aux_clk_src, the same
as all other DPTX AUX clocks in this driver.
Fixes: 16fb89f92ec4 ("clk: qcom: Add support for Display Clock Controller on SM8450")
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Reviewed-by: Abel Vesa <abel.vesa@oss.qualcomm.com>
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Reviewed-by: Taniya Das <taniya.das@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260112-dp-aux-clks-v1-2-456b0c11b069@oss.qualcomm.com
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Val Packett <val@packett.cool>
Date: Thu Mar 12 08:12:07 2026 -0300
clk: qcom: gcc-sc8180x: Add missing GDSCs
[ Upstream commit 3565741eb985a8a7cc6656eb33496195468cb99e ]
There are 5 more GDSCs that we were ignoring and not putting to sleep,
which are listed in downstream DTS. Add them.
Fixes: 4433594bbe5d ("clk: qcom: gcc: Add global clock controller driver for SC8180x")
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Signed-off-by: Val Packett <val@packett.cool>
Link: https://lore.kernel.org/r/20260312112321.370983-3-val@packett.cool
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Val Packett <val@packett.cool>
Date: Thu Mar 12 08:12:09 2026 -0300
clk: qcom: gcc-sc8180x: Use retention for PCIe power domains
[ Upstream commit ccb92c78b42edd26225b4d5920847dfee3e1b093 ]
As the PCIe host controller driver does not yet support dealing with the
loss of state during suspend, use retention for relevant GDSCs.
This fixes the link not surviving upon resume:
nvme 0002:01:00.0: Unable to change power state from D3cold to D0, device inaccessible
nvme nvme0: controller is down; will reset: CSTS=0xffffffff, PCI_STATUS read failed (134)
nvme 0002:01:00.0: Unable to change power state from D3cold to D0, device inaccessible
nvme nvme0: Disabling device after reset failure: -19
Fixes: 4433594bbe5d ("clk: qcom: gcc: Add global clock controller driver for SC8180x")
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Signed-off-by: Val Packett <val@packett.cool>
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>
Link: https://lore.kernel.org/r/20260312112321.370983-5-val@packett.cool
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Val Packett <val@packett.cool>
Date: Thu Mar 12 08:12:08 2026 -0300
clk: qcom: gcc-sc8180x: Use retention for USB power domains
[ Upstream commit 25bc96f26cd6c19dde13a0b9859183e531d6fbfc ]
The USB subsystem does not expect to lose its state on suspend:
xhci-hcd xhci-hcd.0.auto: xHC error in resume, USBSTS 0x401, Reinit
usb usb1: root hub lost power or was reset
(The reinitialization usually succeeds, but it does slow down resume.)
To maintain state during suspend, the relevant GDSCs need to stay in
retention mode, like they do on other similar SoCs. Change the mode to
PWRSTS_RET_ON to fix.
Fixes: 4433594bbe5d ("clk: qcom: gcc: Add global clock controller driver for SC8180x")
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Signed-off-by: Val Packett <val@packett.cool>
Link: https://lore.kernel.org/r/20260312112321.370983-4-val@packett.cool
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jagadeesh Kona <jagadeesh.kona@oss.qualcomm.com>
Date: Fri Mar 27 20:36:46 2026 +0530
clk: qcom: gcc-x1e80100: Keep GCC USB QTB clock always ON
[ Upstream commit 05566ebcc0cd170bd4f50c907ee3ed8e106251e3 ]
In Hamoa, SMMU invalidation requires the GCC_AGGRE_USB_NOC_AXI_CLK
to be on for the USB QTB to be functional. This is currently
explicitly enabled by the DWC3 glue driver, so an invalidation
happening while the USB controller is suspended will fault.
Solve this by voting for the GCC MMU USB QTB clock.
Fixes: 161b7c401f4b ("clk: qcom: Add Global Clock controller (GCC) driver for X1E80100")
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Signed-off-by: Jagadeesh Kona <jagadeesh.kona@oss.qualcomm.com>
Reviewed-by: Taniya Das <taniya.das@oss.qualcomm.com>
Reviewed-by: Abel Vesa <abel.vesa@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260327-hamoa-usb-qtb-clk-always-on-v2-1-7d8a406e650f@oss.qualcomm.com
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Arnd Bergmann <arnd@arndb.de>
Date: Fri Mar 20 16:18:49 2026 +0100
clk: qoriq: avoid format string warning
[ Upstream commit 096abbb6682ee031a0f5ce9f4c71ead9fa63d31e ]
clang-22 warns about the use of non-variadic format arguments passed into
snprintf():
drivers/clk/clk-qoriq.c:925:39: error: diagnostic behavior may be improved by adding the
'format(printf, 7, 8)' attribute to the declaration of 'create_mux_common' [-Werror,-Wmissing-format-attribute]
910 | static struct clk * __init create_mux_common(struct clockgen *cg,
| __attribute__((format(printf, 7, 8)))
911 | struct mux_hwclock *hwc,
912 | const struct clk_ops *ops,
913 | unsigned long min_rate,
914 | unsigned long max_rate,
915 | unsigned long pct80_rate,
916 | const char *fmt, int idx)
917 | {
918 | struct clk_init_data init = {};
919 | struct clk *clk;
920 | const struct clockgen_pll_div *div;
921 | const char *parent_names[NUM_MUX_PARENTS];
922 | char name[32];
923 | int i, j;
924 |
925 | snprintf(name, sizeof(name), fmt, idx);
| ^
drivers/clk/clk-qoriq.c:910:28: note: 'create_mux_common' declared here
910 | static struct clk * __init create_mux_common(struct clockgen *cg,
Rework this to pass the 'int idx' as a varargs argument, allowing the
format string to be verified at the caller location.
Fixes: 0dfc86b3173f ("clk: qoriq: Move chip-specific knowledge into driver")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Kees Cook <kees@kernel.org>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Brian Masney <bmasney@redhat.com>
Date: Mon Mar 30 10:32:37 2026 -0400
clk: visconti: pll: initialize clk_init_data to zero
[ Upstream commit 1603cbb64173a0e9fa7500f2a686f4aa011c58b9 ]
Sashiko reported the following:
> The struct clk_init_data init is declared on the stack without being
> fully zero-initialized. While fields like name, flags, parent_names,
> num_parents, and ops are explicitly assigned, the parent_data and
> parent_hws fields are left containing stack garbage.
clk_core_populate_parent_map() currently prefers the parent names over
the parent data and hws, so this isn't a problem at the moment. If that
ordering ever changed in the future, then this could lead to some
unexpected crashes. Let's just go ahead and make sure that the struct
clk_init_data is initialized to zero as a good practice.
Fixes: b4cbe606dc367 ("clk: visconti: Add support common clock driver and reset driver")
Link: https://sashiko.dev/#/patchset/20260326042317.122536-1-rosenp%40gmail.com
Signed-off-by: Brian Masney <bmasney@redhat.com>
Reviewed-by: Benoît Monin <benoit.monin@bootlin.com>
Reviewed-by: Nobuhiro Iwamatsu <nobuhiro.iwamatsu.x90@mail.toshiba>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Geert Uytterhoeven <geert+renesas@glider.be>
Date: Thu Mar 5 11:11:16 2026 +0100
clk: xgene: Fix mapping leak in xgene_pllclk_init()
[ Upstream commit f520a492e07bc6718e26cfb7543ab4cadd8bb0e2 ]
If xgene_register_clk_pll() fails, the mapped register block is never
unmapped.
Fixes: 308964caeebc45eb ("clk: Add APM X-Gene SoC clock driver")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Brian Masney <bmasney@redhat.com>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Herbert Xu <herbert@gondor.apana.org.au>
Date: Tue May 5 17:02:45 2026 +0800
crypto: af_alg - Cap AEAD AD length to 0x80000000
commit e4c06479d7059888adf2f22bc1ebcf053bf691a2 upstream.
In order to prevent arithmetic overflows when checking the TX
buffer size, cap the associated data length to 0x80000000.
Reported-by: Yiming Qian <yimingqian591@gmail.com>
Fixes: 400c40cf78da ("crypto: algif - add AEAD support")
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Thorsten Blum <thorsten.blum@linux.dev>
Date: Mon Jan 26 18:47:03 2026 +0100
crypto: atmel - Use unregister_{aeads,ahashes,skciphers}
[ Upstream commit 2ffc1ef4e826f0c3274f9ff5eb42bc70a5571afd ]
Replace multiple for loops with calls to crypto_unregister_aeads(),
crypto_unregister_ahashes(), and crypto_unregister_skciphers().
Remove the definition of atmel_tdes_unregister_algs() because it is
equivalent to calling crypto_unregister_skciphers() directly, and the
function parameter 'struct atmel_tdes_dev *' is unused anyway.
Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Stable-dep-of: 57a13941c0bb ("crypto: atmel-aes - guard unregister on error in atmel_aes_register_algs")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Thorsten Blum <thorsten.blum@linux.dev>
Date: Wed Mar 11 12:39:28 2026 +0100
crypto: atmel-aes - guard unregister on error in atmel_aes_register_algs
[ Upstream commit 57a13941c0bb06ae24e3b34672d7b6f2172b253f ]
Ensure the device supports XTS and GCM with 'has_xts' and 'has_gcm'
before unregistering algorithms when XTS or authenc registration fails,
which would trigger a WARN in crypto_unregister_alg().
Currently, with the capabilities defined in atmel_aes_get_cap(), this
bug cannot happen because all devices that support XTS and authenc also
support GCM, but the error handling should still be correct regardless
of hardware capabilities.
Fixes: d52db5188a87 ("crypto: atmel-aes - add support to the XTS mode")
Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Paul Moses <p@1g4.org>
Date: Wed Apr 1 03:07:49 2026 -0500
crypto: ccp - copy IV using skcipher ivsize
[ Upstream commit a7a1f3cdd64d8a165d9b8c9e9ad7fb46ac19dfc4 ]
AF_ALG rfc3686-ctr-aes-ccp requests pass an 8-byte IV to the driver.
ccp_aes_complete() restores AES_BLOCK_SIZE bytes into the caller's IV
buffer while RFC3686 skciphers expose an 8-byte IV, so the restore
overruns the provided buffer.
Use crypto_skcipher_ivsize() to copy only the algorithm's IV length.
Fixes: 2b789435d7f3 ("crypto: ccp - CCP AES crypto API support")
Signed-off-by: Paul Moses <p@1g4.org>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Haixin Xu <jerryxucs@gmail.com>
Date: Mon Mar 30 15:23:46 2026 +0800
crypto: jitterentropy - replace long-held spinlock with mutex
[ Upstream commit 01d798e9feb30212952d4e992801ba6bd6a82351 ]
jent_kcapi_random() serializes the shared jitterentropy state, but it
currently holds a spinlock across the jent_read_entropy() call. That
path performs expensive jitter collection and SHA3 conditioning, so
parallel readers can trigger stalls as contending waiters spin for
the same lock.
To prevent non-preemptible lock hold, replace rng->jent_lock with a
mutex so contended readers sleep instead of spinning on a shared lock
held across expensive entropy generation.
Fixes: bb5530e40824 ("crypto: jitterentropy - add jitterentropy RNG")
Reported-by: Yifan Wu <yifanwucs@gmail.com>
Reported-by: Juefei Pu <tomapufckgml@gmail.com>
Reported-by: Yuan Tan <yuantan098@gmail.com>
Suggested-by: Xin Liu <bird@lzu.edu.cn>
Signed-off-by: Haixin Xu <jerryxucs@gmail.com>
Reviewed-by: Stephan Mueller <smueller@chronox.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ahsan Atta <ahsan.atta@intel.com>
Date: Tue Mar 24 11:12:34 2026 +0000
crypto: qat - disable 420xx AE cluster when lead engine is fused off
[ Upstream commit f216e0f2d1787e662bb6662c9c522185aa3b855a ]
The get_ae_mask() function only disables individual engines based on
the fuse register, but engines are organized in clusters of 4. If the
lead engine of a cluster is fused off, the entire cluster must be
disabled.
Replace the single bitmask inversion with explicit test_bit() checks
on the lead engine of each group, disabling the full ADF_AE_GROUP
when the lead bit is set.
Signed-off-by: Ahsan Atta <ahsan.atta@intel.com>
Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Fixes: fcf60f4bcf54 ("crypto: qat - add support for 420xx devices")
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ahsan Atta <ahsan.atta@intel.com>
Date: Tue Mar 24 11:11:12 2026 +0000
crypto: qat - disable 4xxx AE cluster when lead engine is fused off
[ Upstream commit b260d53561dd69b29505222ec44cf386ac2c2ca6 ]
The get_ae_mask() function only disables individual engines based on
the fuse register, but engines are organized in clusters of 4. If the
lead engine of a cluster is fused off, the entire cluster must be
disabled.
Replace the single bitmask inversion with explicit test_bit() checks
on the lead engine of each group, disabling the full ADF_AE_GROUP
when the lead bit is set.
Signed-off-by: Ahsan Atta <ahsan.atta@intel.com>
Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Fixes: 8c8268166e834 ("crypto: qat - add qat_4xxx driver")
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Date: Tue Mar 24 18:17:23 2026 +0000
crypto: qat - fix type mismatch in RAS sysfs show functions
[ Upstream commit ec23d75c4b77ae42af0777ea59599b1d4f611371 ]
ADF_RAS_ERR_CTR_READ() expands to atomic_read(), which returns int.
The local variable 'counter' was declared as 'unsigned long', causing
a type mismatch on the assignment. The format specifier '%ld' was
consequently wrong in two ways: wrong length modifier and wrong
signedness.
Use int to match the return type of atomic_read() and update the
format specifier to '%d' accordingly.
Fixes: 532d7f6bc458 ("crypto: qat - add error counters")
Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Reviewed-by: Ahsan Atta <ahsan.atta@intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Suman Kumar Chakraborty <suman.kumar.chakraborty@intel.com>
Date: Wed Mar 12 11:39:38 2025 +0000
crypto: qat - introduce fuse array
[ Upstream commit f3bda3b9b69cb193968ba781446ff18c734f0276 ]
Change the representation of fuses in the accelerator device
structure from a single value to an array.
This allows the structure to accommodate additional fuses that
are required for future generations of QAT hardware.
This does not introduce any functional changes.
Signed-off-by: Suman Kumar Chakraborty <suman.kumar.chakraborty@intel.com>
Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Stable-dep-of: b260d53561dd ("crypto: qat - disable 4xxx AE cluster when lead engine is fused off")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Date: Sat Mar 28 22:29:46 2026 +0000
crypto: qat - use swab32 macro
[ Upstream commit 35ecb77ae0749a2f1b04872c9978d9d7ddbbeb79 ]
Replace __builtin_bswap32() with swab32 in icp_qat_hw_20_comp.h to fix
the following build errors on architectures without native byte-swap
support:
alpha-linux-ld: drivers/crypto/intel/qat/qat_common/adf_gen4_hw_data.o: in function `adf_gen4_build_decomp_block':
drivers/crypto/intel/qat/qat_common/icp_qat_hw_20_comp.h:141:(.text+0xeec): undefined reference to `__bswapsi2'
alpha-linux-ld: drivers/crypto/intel/qat/qat_common/icp_qat_hw_20_comp.h:141:(.text+0xef8): undefined reference to `__bswapsi2'
alpha-linux-ld: drivers/crypto/intel/qat/qat_common/adf_gen4_hw_data.o: in function `adf_gen4_build_comp_block':
drivers/crypto/intel/qat/qat_common/icp_qat_hw_20_comp.h:57:(.text+0xf64): undefined reference to `__bswapsi2'
alpha-linux-ld: drivers/crypto/intel/qat/qat_common/icp_qat_hw_20_comp.h:57:(.text+0xf7c): undefined reference to `__bswapsi2'
Fixes: 5b14b2b307e4 ("crypto: qat - enable deflate for QAT GEN4")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202603290259.Ig9kDOmI-lkp@intel.com/
Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: T Pratham <t-pratham@ti.com>
Date: Wed Apr 15 20:06:58 2026 +0530
crypto: sa2ul - Fix AEAD fallback algorithm names
[ Upstream commit 8451ab6ad686ffdcdf9ddadaa446a79ab48e5590 ]
For authenc AEAD algorithms, sa2ul is trying to register very specific
-ce version as a fallback. This causes registration failure on SoCs
which do not have ARMv8-CE enabled/available. Change the fallback
algorithm from the specific driver name to generic algorithm name so
that the kernel can allocate any available fallback.
Fixes: d2c8ac187fc92 ("crypto: sa2ul - Add AEAD algorithm support")
Signed-off-by: T Pratham <t-pratham@ti.com>
Reviewed-by: Manorit Chawdhry <m-chawdhry@ti.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Herbert Xu <herbert@gondor.apana.org.au>
Date: Tue Mar 10 18:28:29 2026 +0900
crypto: tegra - Disable softirqs before finalizing request
[ Upstream commit 2aeec9af775fb53aa086419b953302c6f4ad4984 ]
Softirqs must be disabled when calling the finalization fucntion on
a request.
Reported-by: Guangwu Zhang <guazhang@redhat.com>
Fixes: 0880bb3b00c8 ("crypto: tegra - Add Tegra Security Engine driver")
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Akhil R <akhilrajeev@nvidia.com>
Date: Mon Feb 24 14:46:03 2025 +0530
crypto: tegra - finalize crypto req on error
[ Upstream commit 1e245948ca0c252f561792fabb45de5518301d97 ]
Call the crypto finalize function before exiting *do_one_req() functions.
This allows the driver to take up further requests even if the previous
one fails.
Fixes: 0880bb3b00c8 ("crypto: tegra - Add Tegra Security Engine driver")
Signed-off-by: Akhil R <akhilrajeev@nvidia.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Stable-dep-of: 2aeec9af775f ("crypto: tegra - Disable softirqs before finalizing request")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Akhil R <akhilrajeev@nvidia.com>
Date: Mon Feb 24 14:46:09 2025 +0530
crypto: tegra - Reserve keyslots to allocate dynamically
[ Upstream commit b157e7a228aee9b48c2de05129476b822aa7956d ]
The HW supports only storing 15 keys at a time. This limits the number
of tfms that can work without failutes. Reserve keyslots to solve this
and use the reserved ones during the encryption/decryption operation.
This allow users to have the capability of hardware protected keys
and faster operations if there are limited number of tfms while not
halting the operation if there are more tfms.
Fixes: 0880bb3b00c8 ("crypto: tegra - Add Tegra Security Engine driver")
Signed-off-by: Akhil R <akhilrajeev@nvidia.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Stable-dep-of: 2aeec9af775f ("crypto: tegra - Disable softirqs before finalizing request")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Akhil R <akhilrajeev@nvidia.com>
Date: Mon Feb 24 14:46:05 2025 +0530
crypto: tegra - Transfer HASH init function to crypto engine
[ Upstream commit 97ee15ea101629d2ffe21d3c5dc03b8d8be43603 ]
Ahash init() function was called asynchronous to the crypto engine queue.
This could corrupt the request context if there is any ongoing operation
for the same request. Queue the init function as well to the crypto
engine queue so that this scenario can be avoided.
Fixes: 0880bb3b00c8 ("crypto: tegra - Add Tegra Security Engine driver")
Signed-off-by: Akhil R <akhilrajeev@nvidia.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Stable-dep-of: 2aeec9af775f ("crypto: tegra - Disable softirqs before finalizing request")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Li Ming <ming.li@zohomail.com>
Date: Sat Mar 14 15:06:33 2026 +0800
cxl/pci: Check memdev driver binding status in cxl_reset_done()
[ Upstream commit e8069c66d09309579e53567be8ddfa6ccb2f452a ]
cxl_reset_done() accesses the endpoint of the corresponding CXL memdev
without endpoint validity checking. By default, cxlmd->endpoint is
initialized to -ENXIO, if cxl_reset_done() is triggered after the
corresponding CXL memdev probing failed, this results in access to an
invalid endpoint.
CXL subsystem can always check CXL memdev driver binding status to
confirm its endpoint validity. So adding the CXL memdev driver checking
inside cxl_reset_done() to avoid accessing an invalid endpoint.
Fixes: 934edcd436dc ("cxl: Add post-reset warning if reset results in loss of previously committed HDM decoders")
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Li Ming <ming.li@zohomail.com>
Link: https://patch.msgid.link/20260314-fix_access_endpoint_without_drv_check-v2-4-4c09edf2e1db@zohomail.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Gui-Dong Han <hanguidong02@gmail.com>
Date: Mon Mar 23 16:58:44 2026 +0800
debugfs: check for NULL pointer in debugfs_create_str()
[ Upstream commit 31de83980d3764d784f79ff1bc93c42b324f4013 ]
Passing a NULL pointer to debugfs_create_str() leads to a NULL pointer
dereference when the debugfs file is read. Following upstream
discussions, forbid the creation of debugfs string files with NULL
pointers. Add a WARN_ON() to expose offending callers and return early.
Fixes: 9af0440ec86e ("debugfs: Implement debugfs_create_str()")
Reported-by: yangshiguang <yangshiguang@xiaomi.com>
Closes: https://lore.kernel.org/lkml/2025122221-gag-malt-75ba@gregkh/
Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Gui-Dong Han <hanguidong02@gmail.com>
Link: https://patch.msgid.link/20260323085930.88894-2-hanguidong02@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Gui-Dong Han <hanguidong02@gmail.com>
Date: Mon Mar 23 16:58:45 2026 +0800
debugfs: fix placement of EXPORT_SYMBOL_GPL for debugfs_create_str()
[ Upstream commit 4afc929c0f74c4f22b055a82b371d50586da58ca ]
The EXPORT_SYMBOL_GPL() for debugfs_create_str was placed incorrectly
away from the function definition. Move it immediately below the
debugfs_create_str() function where it belongs.
Fixes: d60b59b96795 ("debugfs: Export debugfs_create_str symbol")
Signed-off-by: Gui-Dong Han <hanguidong02@gmail.com>
Link: https://patch.msgid.link/20260323085930.88894-3-hanguidong02@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Danilo Krummrich <dakr@kernel.org>
Date: Tue Feb 3 00:48:14 2026 +0100
devres: fix missing node debug info in devm_krealloc()
[ Upstream commit f813ec9e84b4d0ca81ec1da94ab07bfb4a29266c ]
Fix missing call to set_node_dbginfo() for new devres nodes created by
devm_krealloc().
Fixes: f82485722e5d ("devres: provide devm_krealloc()")
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://patch.msgid.link/20260202235210.55176-2-dakr@kernel.org
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ming-Hung Tsai <mtsai@redhat.com>
Date: Wed Mar 4 19:56:28 2026 +0800
dm cache metadata: fix memory leak on metadata abort retry
[ Upstream commit 044ca491d4086dc5bf233e9fcb71db52df32f633 ]
When failing to acquire the root_lock in dm_cache_metadata_abort because
the block_manager is read-only, the temporary block_manager created
outside the root_lock is not properly released, causing a memory leak.
Reproduce steps:
This can be reproduced by reloading a new table while the metadata
is read-only. While the second call to dm_cache_metadata_abort is
caused by lack of support for table preload in dm-cache, mentioned
in commit 9b1cc9f251af ("dm cache: share cache-metadata object across
inactive and active DM tables"), it exposes the memory leak in
dm_cache_metadata_abort when the function is called multiple times.
Specifically, dm-cache fails to sync the new cache object's mode during
preresume, creating the reproducer condition.
This issue could also occur through concurrent metadata_operation_failed
calls due to races in cache mode updates, but the table preload scenario
below provides a reliable reproducer.
1. Create a cache device with some faulty trailing metadata blocks
dmsetup create cmeta <<EOF
0 200 linear /dev/sdc 0
200 7992 error
EOF
dmsetup create cdata --table "0 131072 linear /dev/sdc 8192"
dmsetup create corig --table "0 262144 linear /dev/sdc 262144"
dd if=/dev/zero of=/dev/mapper/cmeta bs=4k count=1 oflag=direct
dmsetup create cache --table "0 131072 cache /dev/mapper/cmeta \
/dev/mapper/cdata /dev/mapper/corig 128 1 writethrough smq 0"
2. Suspend and resume the cache to start a new metadata transaction and
trigger metadata io errors on the next metadata commit.
dmsetup suspend cache
dmsetup resume cache
3. Write to the cache device to update metadata
fio --filename=/dev/mapper/cache --name test --rw=randwrite --bs=4k \
--randrepeat=0 --direct=1 --size 64k
4. Preload the same table
dmsetup reload cache --table "$(dmsetup table cache)"
5. Resume the new table. This triggers the memory leak.
dmsetup suspend cache
dmsetup resume cache
kmemleak logs:
<snip>
unreferenced object 0xffff8880080c2010 (size 16):
comm "dmsetup", pid 132, jiffies 4294982580
hex dump (first 16 bytes):
00 38 b9 07 80 88 ff ff 6a 6b 6b 6b 6b 6b 6b a5 ...
backtrace (crc 3118f31c):
kmemleak_alloc+0x28/0x40
__kmalloc_cache_noprof+0x3d9/0x510
dm_block_manager_create+0x51/0x140
dm_cache_metadata_abort+0x85/0x320
metadata_operation_failed+0x103/0x1e0
cache_preresume+0xacd/0xe70
dm_table_resume_targets+0xd3/0x320
__dm_resume+0x1b/0xf0
dm_resume+0x127/0x170
<snip>
Fixes: 352b837a5541 ("dm cache: Fix ABBA deadlock between shrink_slab and dm_cache_metadata_abort")
Signed-off-by: Ming-Hung Tsai <mtsai@redhat.com>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ming-Hung Tsai <mtsai@redhat.com>
Date: Mon Feb 9 15:54:08 2026 +0800
dm cache policy smq: fix missing locks in invalidating cache blocks
[ Upstream commit 2d1f7b65f5deedd2e6b09fdc6ea27f8375f24b45 ]
In passthrough mode, the policy invalidate_mapping operation is called
simultaneously from multiple workers, thus it should be protected by a
lock. Otherwise, we might end up with data races on the allocated blocks
counter, or even use-after-free issues with internal data structures
when doing concurrent writes.
Note that the existing FIXME in smq_invalidate_mapping() doesn't affect
passthrough mode since migration tasks don't exist there, but would need
attention if supporting fast device shrinking via suspend/resume without
target reloading.
Reproduce steps:
1. Create a cache device consisting of 1024 cache entries
dmsetup create cmeta --table "0 8192 linear /dev/sdc 0"
dmsetup create cdata --table "0 131072 linear /dev/sdc 8192"
dmsetup create corig --table "0 262144 linear /dev/sdc 262144"
dd if=/dev/zero of=/dev/mapper/cmeta bs=4k count=1 oflag=direct
dmsetup create cache --table "0 262144 cache /dev/mapper/cmeta \
/dev/mapper/cdata /dev/mapper/corig 128 2 metadata2 writethrough smq 0"
2. Populate the cache, and record the number of cached blocks
fio --name=populate --filename=/dev/mapper/cache --rw=randwrite --bs=4k \
--size=64m --direct=1
nr_cached=$(dmsetup status cache | awk '{split($7, a, "/"); print a[1]}')
3. Reload the cache into passthrough mode
dmsetup suspend cache
dmsetup reload cache --table "0 262144 cache /dev/mapper/cmeta \
/dev/mapper/cdata /dev/mapper/corig 128 2 metadata2 passthrough smq 0"
dmsetup resume cache
4. Write to the passthrough cache. By setting multiple jobs with I/O
size equal to the cache block size, cache blocks are invalidated
concurrently from different workers.
fio --filename=/dev/mapper/cache --name=test --rw=randwrite --bs=64k \
--direct=1 --numjobs=2 --randrepeat=0 --size=64m
5. Check if demoted matches cached block count. These numbers should
match but may differ due to the data race.
nr_demoted=$(dmsetup status cache | awk '{print $12}')
echo "$nr_cached, $nr_demoted"
Fixes: b29d4986d0da ("dm cache: significant rework to leverage dm-bio-prison-v2")
Signed-off-by: Ming-Hung Tsai <mtsai@redhat.com>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ming-Hung Tsai <mtsai@redhat.com>
Date: Mon Feb 9 15:54:09 2026 +0800
dm cache: fix concurrent write failure in passthrough mode
[ Upstream commit e4f66341779d0cf4c83c74793753a84094286d9e ]
When bio prison cell lock acquisition fails due to concurrent writes to
the same block in passthrough mode, dm-cache incorrectly returns an I/O
error instead of properly handling the concurrency. This can occur in
both process and workqueue contexts when invalidate_lock() is called for
exclusive access to a data block. Fix this by deferring the write bios
to ensure proper block device behavior.
Reproduce steps:
1. Create a cache device
dmsetup create cmeta --table "0 8192 linear /dev/sdc 0"
dmsetup create cdata --table "0 131072 linear /dev/sdc 8192"
dmsetup create corig --table "0 262144 linear /dev/sdc 262144"
dd if=/dev/zero of=/dev/mapper/cmeta bs=4k count=1 oflag=direct
dmsetup create cache --table "0 262144 cache /dev/mapper/cmeta \
/dev/mapper/cdata /dev/mapper/corig 128 2 metadata2 writethrough smq 0"
2. Promote the first data block into cache
fio --filename=/dev/mapper/cache --name=populate --rw=write --bs=4k \
--direct=1 --size=64k
3. Reload the cache into passthrough mode
dmsetup suspend cache
dmsetup reload cache --table "0 262144 cache /dev/mapper/cmeta \
/dev/mapper/cdata /dev/mapper/corig 128 2 metadata2 passthrough smq 0"
dmsetup resume cache
4. Write to the first cached block concurrently. Sometimes one of the
processes will receive I/O errors.
fio --filename=/dev/mapper/cache --name test --rw=randwrite --bs=4k \
--randrepeat=0 --direct=1 --numjobs=2 --size 64k
<snip>
fio-3.41
fio: io_u error on file /dev/mapper/cache: Input/output error: write offset=4096, buflen=4096
fio: pid=106, err=5/file:io_u.c:2008, func=io_u error, error=Input/output error
test: (groupid=0, jobs=1): err= 0: pid=105
test: (groupid=0, jobs=1): err= 5 (file:io_u.c:2008, func=io_u error, error=Input/output error): pid=106
<snip>
Fixes: b29d4986d0da ("dm cache: significant rework to leverage dm-bio-prison-v2")
Signed-off-by: Ming-Hung Tsai <mtsai@redhat.com>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ming-Hung Tsai <mtsai@redhat.com>
Date: Mon Feb 9 15:54:10 2026 +0800
dm cache: fix dirty mapping checking in passthrough mode switching
[ Upstream commit 322586745bd1a0e5f3559fd1635fdeb4dbd1d6b8 ]
As mentioned in commit 9b1cc9f251af ("dm cache: share cache-metadata
object across inactive and active DM tables"), dm-cache assumed table
reload occurs after suspension, while LVM's table preload breaks this
assumption. The dirty mapping check for passthrough mode was designed
around this assumption and is performed during table creation, causing
the check to fail with preload while metadata updates are ongoing. This
risks loading dirty mappings into passthrough mode, resulting in data
loss.
Reproduce steps:
1. Create a writeback cache with zero migration_threshold to produce
dirty mappings
dmsetup create cmeta --table "0 8192 linear /dev/sdc 0"
dmsetup create cdata --table "0 131072 linear /dev/sdc 8192"
dmsetup create corig --table "0 262144 linear /dev/sdc 262144"
dd if=/dev/zero of=/dev/mapper/cmeta bs=4k count=1 oflag=direct
dmsetup create cache --table "0 262144 cache /dev/mapper/cmeta \
/dev/mapper/cdata /dev/mapper/corig 128 2 metadata2 writeback smq \
2 migration_threshold 0"
2. Preload a table in passthrough mode
dmsetup reload cache --table "0 262144 cache /dev/mapper/cmeta \
/dev/mapper/cdata /dev/mapper/corig 128 2 metadata2 passthrough smq 0"
3. Write to the first cache block to make it dirty
fio --filename=/dev/mapper/cache --name=populate --rw=write --bs=4k \
--direct=1 --size=64k
4. Resume the inactive table. Now it's possible to load the dirty block
into passthrough mode.
dmsetup resume cache
Fix by moving the checks to the preresume phase to support table
preloading. Also remove the unused function dm_cache_metadata_all_clean.
Fixes: 2ee57d587357 ("dm cache: add passthrough mode")
Signed-off-by: Ming-Hung Tsai <mtsai@redhat.com>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ming-Hung Tsai <mtsai@redhat.com>
Date: Fri Apr 10 21:08:01 2026 +0800
dm cache: fix missing return in invalidate_committed's error path
[ Upstream commit 8c0ee19db81f0fa1ff25fd75b22b17c0cc2acde3 ]
In passthrough mode, dm-cache defers write submission until after
metadata commit completes via the invalidate_committed() continuation.
On commit error, invalidate_committed() calls invalidate_complete() to
end the bio and free the migration struct, after which it should return
immediately.
The patch 4ca8b8bd952d ("dm cache: fix write hang in passthrough mode")
omitted this early return, causing execution to fall through into the
success path on error. This results in use-after-free on the migration
struct in the subsequent calls.
Fix by adding the missing return after the invalidate_complete() call.
Fixes: 4ca8b8bd952d ("dm cache: fix write hang in passthrough mode")
Reported-by: Dan Carpenter <error27@gmail.com>
Closes: https://lore.kernel.org/dm-devel/adjMq6T5RRjv_uxM@stanley.mountain/
Signed-off-by: Ming-Hung Tsai <mtsai@redhat.com>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ming-Hung Tsai <mtsai@redhat.com>
Date: Mon Feb 9 15:54:05 2026 +0800
dm cache: fix null-deref with concurrent writes in passthrough mode
[ Upstream commit 7d1f98d668ee34c1d15bdc0420fdd062f24a27c0 ]
In passthrough mode, when dm-cache starts to invalidate a cache
entry and bio prison cell lock fails due to concurrent write to
the same cached block, mg->cell remains NULL. The error path in
invalidate_complete() attempts to unlock and free the cell
unconditionally, causing a NULL pointer dereference:
KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
CPU: 0 UID: 0 PID: 134 Comm: fio Not tainted 6.19.0-rc7 #3 PREEMPT
RIP: 0010:dm_cell_unlock_v2+0x3f/0x210
<snip>
Call Trace:
invalidate_complete+0xef/0x430
map_bio+0x130f/0x1a10
cache_map+0x320/0x6b0
__map_bio+0x458/0x510
dm_submit_bio+0x40e/0x16d0
__submit_bio+0x419/0x870
<snip>
Reproduce steps:
1. Create a cache device
dmsetup create cmeta --table "0 8192 linear /dev/sdc 0"
dmsetup create cdata --table "0 131072 linear /dev/sdc 8192"
dmsetup create corig --table "0 262144 linear /dev/sdc 262144"
dd if=/dev/zero of=/dev/mapper/cmeta bs=4k count=1 oflag=direct
dmsetup create cache --table "0 262144 cache /dev/mapper/cmeta \
/dev/mapper/cdata /dev/mapper/corig 128 2 metadata2 writethrough smq 0"
2. Promote the first data block into cache
fio --filename=/dev/mapper/cache --name=populate --rw=write --bs=4k \
--direct=1 --size=64k
3. Reload the cache into passthrough mode
dmsetup suspend cache
dmsetup reload cache --table "0 262144 cache /dev/mapper/cmeta \
/dev/mapper/cdata /dev/mapper/corig 128 2 metadata2 passthrough smq 0"
dmsetup resume cache
4. Write to the first cached block concurrently
fio --filename=/dev/mapper/cache --name test --rw=randwrite --bs=4k \
--randrepeat=0 --direct=1 --numjobs=2 --size 64k
Fix by checking if mg->cell is valid before attempting to unlock it.
Fixes: b29d4986d0da ("dm cache: significant rework to leverage dm-bio-prison-v2")
Signed-off-by: Ming-Hung Tsai <mtsai@redhat.com>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ming-Hung Tsai <mtsai@redhat.com>
Date: Mon Feb 9 15:54:07 2026 +0800
dm cache: fix write hang in passthrough mode
[ Upstream commit 4ca8b8bd952df7c3ccdc68af9bd3419d0839a04b ]
The invalidate_remove() function has incomplete logic for handling write
hit bios after cache invalidation. It sets up the remapping for the
overwrite_bio but then drops it immediately without submission, causing
write operations to hang.
Fix by adding a new invalidate_committed() continuation that submits
the remapped writes to the cache origin after metadata commit completes,
while using the overwrite_endio hook to ensure proper completion
sequencing. This maintains existing coherency. Also improve error
handling in invalidate_complete() to preserve the original error status
instead of using bio_io_error() unconditionally.
Fixes: b29d4986d0da ("dm cache: significant rework to leverage dm-bio-prison-v2")
Signed-off-by: Ming-Hung Tsai <mtsai@redhat.com>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ming-Hung Tsai <mtsai@redhat.com>
Date: Mon Feb 9 15:54:06 2026 +0800
dm cache: fix write path cache coherency in passthrough mode
[ Upstream commit 0c5eef0aad508231d8e43ff8392692925e131b68 ]
In passthrough mode, dm-cache defers write bio submission until cache
invalidation completes to maintain existing coherency, requiring the
target map function to return DM_MAPIO_SUBMITTED. The current map_bio()
returns DM_MAPIO_REMAPPED, violating the required ordering constraint.
Reproduce steps:
1. Create a cache device
dmsetup create cmeta --table "0 8192 linear /dev/sdc 0"
dmsetup create cdata --table "0 131072 linear /dev/sdc 8192"
dmsetup create corig --table "0 262144 linear /dev/sdc 262144"
dd if=/dev/zero of=/dev/mapper/cmeta bs=4k count=1 oflag=direct
dmsetup create cache --table "0 262144 cache /dev/mapper/cmeta \
/dev/mapper/cdata /dev/mapper/corig 128 2 metadata2 writethrough smq 0"
2. Promote the first data block into the cache
fio --filename=/dev/mapper/cache --name=populate --rw=write --bs=4k \
--direct=1 --size=64k
3. Reload the cache into passthrough mode
dmsetup suspend cache
dmsetup reload cache --table "0 262144 cache /dev/mapper/cmeta \
/dev/mapper/cdata /dev/mapper/corig 128 2 metadata2 passthrough smq 0"
dmsetup resume cache
4. Write to the first data block, and check io ordering using ftrace
echo 1 > /sys/kernel/debug/tracing/events/block/block_bio_queue/enable
echo 1 > /sys/kernel/debug/tracing/events/block/block_bio_complete/enable
echo 1 > /sys/kernel/debug/tracing/events/block/block_rq_complete/enable
fio --filename=/dev/mapper/cache --name=test --rw=write --bs=64k \
--direct=1 --size 64k
5. ftrace logs show that write operations to the cache origin (252:2)
and metadata operations (252:0) are unsynchronized: the origin write
occurs before metadata commit.
<snip>
fio-146 [000] ..... 420.139562: block_bio_queue: 252,3 WS 0 + 128 [fio]
fio-146 [000] ..... 420.149395: block_bio_queue: 252,2 WS 0 + 128 [fio]
fio-146 [000] ..... 420.149763: block_bio_queue: 8,32 WS 262144 + 128 [fio]
fio-146 [000] dNh1. 420.151446: block_rq_complete: 8,32 WS () 262144 + 128 be,0,4 [0]
fio-146 [000] dNh1. 420.152731: block_bio_complete: 252,2 WS 0 + 128 [0]
fio-146 [000] dNh1. 420.154229: block_bio_complete: 252,3 WS 0 + 128 [0]
kworker/0:0-9 [000] ..... 420.160530: block_bio_queue: 252,0 W 408 + 8 [kworker/0:0]
kworker/0:0-9 [000] ..... 420.161641: block_bio_queue: 8,32 W 408 + 8 [kworker/0:0]
kworker/0:0-9 [000] ..... 420.162533: block_bio_queue: 252,0 W 416 + 8 [kworker/0:0]
kworker/0:0-9 [000] ..... 420.162821: block_bio_queue: 8,32 W 416 + 8 [kworker/0:0]
<snip>
Fixes: b29d4986d0da ("dm cache: significant rework to leverage dm-bio-prison-v2")
Signed-off-by: Ming-Hung Tsai <mtsai@redhat.com>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ming-Hung Tsai <mtsai@redhat.com>
Date: Thu Mar 6 16:41:51 2025 +0800
dm cache: support shrinking the origin device
[ Upstream commit c2662b1544cbd8ea3181381bb899b8e681dfedc7 ]
This patch introduces formal support for shrinking the cache origin by
reducing the cache target length via table reloads. Cache blocks mapped
beyond the new target length must be clean and are invalidated during
preresume. If any dirty blocks exist in the area being removed, the
preresume operation fails without setting the NEEDS_CHECK flag in
superblock, and the resume ioctl returns EFBIG. The cache device remains
suspended until a table reload with target length that fits existing
mappings is performed.
Without this patch, reducing the cache target length could result in
io errors (RHBZ: 2134334), out-of-bounds memory access to the discard
bitset, and security concerns regarding data leakage.
Verification steps:
1. create a cache metadata with some cached blocks mapped to the tail
of the origin device. Here we use cache_restore v1.0 to build a
metadata with one clean block mapped to the last origin block.
cat <<EOF >> cmeta.xml
<superblock uuid="" block_size="128" nr_cache_blocks="512" \
policy="smq" hint_width="4">
<mappings>
<mapping cache_block="0" origin_block="4095" dirty="false"/>
</mappings>
</superblock>
EOF
dmsetup create cmeta --table "0 8192 linear /dev/sdc 0"
cache_restore -i cmeta.xml -o /dev/mapper/cmeta --metadata-version=2
dmsetup remove cmeta
2. bring up the cache whilst shrinking the cache origin by one block:
dmsetup create cmeta --table "0 8192 linear /dev/sdc 0"
dmsetup create cdata --table "0 65536 linear /dev/sdc 8192"
dmsetup create corig --table "0 524160 linear /dev/sdc 262144"
dmsetup create cache --table "0 524160 cache /dev/mapper/cmeta \
/dev/mapper/cdata /dev/mapper/corig 128 2 metadata2 writethrough smq 0"
3. check the number of cached data blocks via dmsetup status. It is
expected to be zero.
dmsetup status cache | cut -d ' ' -f 7
In addition to the script above, this patch can be verified using the
"cache/resize" tests in dmtest-python:
./dmtest run --rx cache/resize/shrink_origin --result-set default
Signed-off-by: Ming-Hung Tsai <mtsai@redhat.com>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Stable-dep-of: 322586745bd1 ("dm cache: fix dirty mapping checking in passthrough mode switching")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Guillaume Gonnet <ggonnet.linux@gmail.com>
Date: Tue Mar 17 22:32:28 2026 +0100
dm init: ensure device probing has finished in dm-mod.waitfor=
[ Upstream commit 99a2312f69805f4ba92d98a757625e0300a747ab ]
The early_lookup_bdev() function returns successfully when the disk
device is present but not necessarily its partitions. In this situation,
dm_early_create() fails as the partition block device does not exist
yet.
In my case, this phenomenon occurs quite often because the device is
an SD card with slow reading times, on which kernel takes time to
enumerate available partitions.
Fortunately, the underlying device is back to "probing" state while
enumerating partitions. Waiting for all probing to end is enough to fix
this issue.
That's also the reason why this problem never occurs with rootwait=
parameter: the while loop inside wait_for_root() explicitly waits for
probing to be done and then the function calls async_synchronize_full().
These lines were omitted in 035641b, even though the commit says it's
based on the rootwait logic...
Anyway, calling wait_for_device_probe() after our while loop does the
job (it both waits for probing and calls async_synchronize_full).
Fixes: 035641b01e72 ("dm init: add dm-mod.waitfor to wait for asynchronously probed block devices")
Signed-off-by: Guillaume Gonnet <ggonnet.linux@gmail.com>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Junrui Luo <moonafterrain@outlook.com>
Date: Thu Mar 5 20:05:48 2026 +0800
dm log: fix out-of-bounds write due to region_count overflow
[ Upstream commit c20e36b7631d83e7535877f08af8b0af72c44b1a ]
The local variable region_count in create_log_context() is declared as
unsigned int (32-bit), but dm_sector_div_up() returns sector_t (64-bit).
When a device-mapper target has a sufficiently large ti->len with a small
region_size, the division result can exceed UINT_MAX. The truncated
value is then used to calculate bitset_size, causing clean_bits,
sync_bits, and recovering_bits to be allocated far smaller than needed
for the actual number of regions.
Subsequent log operations (log_set_bit, log_clear_bit, log_test_bit) use
region indices derived from the full untruncated region space, causing
out-of-bounds writes to kernel heap memory allocated by vmalloc.
This can be reproduced by creating a mirror target whose region_count
overflows 32 bits:
dmsetup create bigzero --table '0 8589934594 zero'
dmsetup create mymirror --table '0 8589934594 mirror \
core 2 2 nosync 2 /dev/mapper/bigzero 0 \
/dev/mapper/bigzero 0'
The status output confirms the truncation (sync_count=1 instead of
4294967297, because 0x100000001 was truncated to 1):
$ dmsetup status mymirror
0 8589934594 mirror 2 254:1 254:1 1/4294967297 ...
This leads to a kernel crash in core_in_sync:
BUG: scheduling while atomic: (udev-worker)/9150/0x00000000
RIP: 0010:core_in_sync+0x14/0x30 [dm_log]
CR2: 0000000000000008
Fixing recursive fault but reboot is needed!
Fix by widening the local region_count to sector_t and adding an
explicit overflow check before the value is assigned to lc->region_count.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Reported-by: Yuhao Jiang <danisjiang@gmail.com>
Signed-off-by: Junrui Luo <moonafterrain@outlook.com>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Khairul Anuar Romli <karom.9560@gmail.com>
Date: Mon Feb 2 14:02:19 2026 +0800
dmaengine: dw-axi-dmac: Remove unnecessary return statement from void function
[ Upstream commit 48278a72fce8a8d30efaedeb206c9c3f05c1eb3f ]
checkpatch.pl --strict reports a WARNING in dw-axi-dmac-platform.c:
WARNING: void function return statements are not generally useful
FILE: drivers/dma/dw-axi-dmac/dw-axi-dmac-platform.c
According to Linux kernel coding style [Documentation/process/
coding-style.rst], explicit "return;" statements at the end of void
functions are redundant and should be omitted. The function will
automatically return upon reaching the closing brace, so the extra
statement adds unnecessary clutter without functional benefit.
This patch removes the superfluous "return;" statement in
dw_axi_dma_set_hw_channel() to comply with kernel coding standards and
eliminate the checkpatch warning.
Fixes: 32286e279385 ("dmaengine: dw-axi-dmac: Remove free slot check algorithm in dw_axi_dma_set_hw_channel")
Signed-off-by: Khairul Anuar Romli <karom.9560@gmail.com>
Link: https://patch.msgid.link/20260202060224.12616-4-karom.9560@gmail.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Frank Li <Frank.Li@nxp.com>
Date: Wed Feb 25 16:41:38 2026 -0500
dmaengine: mxs-dma: Fix missing return value from of_dma_controller_register()
[ Upstream commit ab2bf6d4c0a0152907b18d25c1b118ea5ea779df ]
Propagate the return value of of_dma_controller_register() in probe()
instead of ignoring it.
Fixes: a580b8c5429a6 ("dmaengine: mxs-dma: add dma support for i.MX23/28")
Signed-off-by: Frank Li <Frank.Li@nxp.com>
Link: https://patch.msgid.link/20260225-mxsdma-module-v3-2-8f798b13baa6@nxp.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jane Chu <jane.chu@oracle.com>
Date: Mon Mar 2 13:10:15 2026 -0700
Documentation: fix a hugetlbfs reservation statement
[ Upstream commit 7a197d346a44384a1a858a98ef03766840e561d4 ]
Documentation/mm/hugetlbfs_reserv.rst has
if (resv_needed <= (resv_huge_pages - free_huge_pages))
resv_huge_pages += resv_needed;
which describes this code in gather_surplus_pages()
needed = (h->resv_huge_pages + delta) - h->free_huge_pages;
if (needed <= 0) {
h->resv_huge_pages += delta;
return 0;
}
which means if there are enough free hugepages to account for the new
reservation, simply update the global reservation count without
further action.
But the description is backwards, it should be
if (resv_needed <= (free_huge_pages - resv_huge_pages))
instead.
Link: https://lkml.kernel.org/r/20260302201015.1824798-1-jane.chu@oracle.com
Fixes: 70bc0dc578b3 ("Documentation: vm, add hugetlbfs reservation overview")
Signed-off-by: Jane Chu <jane.chu@oracle.com>
Cc: David Hildenbrand <david@kernel.org>
Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Cai Xinchen <caixinchen1@huawei.com>
Date: Thu Mar 12 06:59:06 2026 +0000
dpaa2: add independent dependencies for FSL_DPAA2_SWITCH
[ Upstream commit 12589892f41c4c645c80ef9f036f7451a6045624 ]
Since the commit 84cba72956fd ("dpaa2-switch: integrate
the MAC endpoint support") included dpaa2-mac.o in the driver,
but it didn't select PCS_LYNX, PHYLINK and FSL_XGMAC_MDIO. it
will lead to link error, such as
undefined reference to `phylink_ethtool_ksettings_set'
undefined reference to `lynx_pcs_create_fwnode'
And the same reason as the commit d2624e70a2f53 ("dpaa2-eth: select
XGMAC_MDIO for MDIO bus support"), enable the FSL_XGMAC_MDIO Kconfig
option in order to have MDIO access to internal and external PHYs.
Because dpaa2-switch uses fsl_mc_driver APIs, add depends on FSL_MC_BUS
&& FSL_MC_DPIO as FSL_DPAA2_SWITCH do.
FSL_XGMAC_MDIO and FSL_MC_BUS depend on OF, thus the dependence of
FSL_MC_BUS can satisfy FSL_XGMAC_MDIO's OF requirement.
Fixes: 84cba72956fd ("dpaa2-switch: integrate the MAC endpoint support")
Suggested-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: Cai Xinchen <caixinchen1@huawei.com>
Link: https://patch.msgid.link/20260312065907.476663-2-caixinchen1@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Cai Xinchen <caixinchen1@huawei.com>
Date: Thu Mar 12 06:59:07 2026 +0000
dpaa2: compile dpaa2 even CONFIG_FSL_DPAA2_ETH=n
[ Upstream commit 97daf00745f7f9f261b0e91418de6e79d7826c36 ]
CONFIG_FSL_DPAA2_ETH and CONFIG_FSL_DPAA2_SWITCH are not
associated, but the compilation of FSL_DPAA2_SWITCH depends on
the compilation of the dpaa2 folder. The files controlled by
CONFIG_FSL_DPAA2_SWITCH in the dpaa2 folder are not controlled
by CONFIG_FSL_DPAA2_ETH, except for the files controlled by
CONFIG_FSL_DPAA2_SWITCH. Therefore, removing the restriction will
not affect the compilation of the files in the directory.
Fixes: f48298d3fbfaa ("staging: dpaa2-switch: move the driver out of staging")
Suggested-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: Cai Xinchen <caixinchen1@huawei.com>
Link: https://patch.msgid.link/20260312065907.476663-3-caixinchen1@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Bart Van Assche <bvanassche@acm.org>
Date: Thu Mar 26 14:40:54 2026 -0700
drbd: Balance RCU calls in drbd_adm_dump_devices()
[ Upstream commit 2b31e86387e60b3689339f0f0fbb4d3623d9d494 ]
Make drbd_adm_dump_devices() call rcu_read_lock() before
rcu_read_unlock() is called. This has been detected by the Clang
thread-safety analyzer.
Tested-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Andreas Gruenbacher <agruenba@redhat.com>
Fixes: a55bbd375d18 ("drbd: Backport the "status" command")
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Link: https://patch.msgid.link/20260326214054.284593-1-bvanassche@acm.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Timur Kristóf <timur.kristof@gmail.com>
Date: Tue Apr 28 13:40:41 2026 +0200
drm/amd/display: Allow DCE link encoder without AUX registers
[ Upstream commit ac27e3f99035f132f23bc0409d0e57f11f054c70 ]
Allow constructing the DCE link encoder without DDC,
which means the AUX registers array will be NULL.
This is necessary to support embedded connectors without DDC.
Fixes: 4562236b3bc0 ("drm/amd/dc: Add dc display driver (v2)")
Link: https://gitlab.freedesktop.org/drm/amd/-/work_items/5192
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 87f30b101af62590faf6020d106da07efdda199b)
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Timur Kristóf <timur.kristof@gmail.com>
Date: Tue Apr 28 13:40:44 2026 +0200
drm/amd/display: Read EDID from VBIOS embedded panel info
[ Upstream commit 9ea16f64189bf7b6ba50fc7f0325b3c1f836d105 ]
Some board manufacturers hardcode the EDID for the embedded
panel in the VBIOS. This EDID should be used when the panel
doesn't have a DDC.
For reference, see the legacy non-DC display code:
amdgpu_atombios_encoder_get_lcd_info()
This is necessary to support embedded connectors without DDC.
Fixes: 4562236b3bc0 ("drm/amd/dc: Add dc display driver (v2)")
Link: https://gitlab.freedesktop.org/drm/amd/-/work_items/5192
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit eb105e63b474c11ef6a84a1c6b18100d851ff364)
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Timur Kristóf <timur.kristof@gmail.com>
Date: Sun Mar 29 18:03:03 2026 +0200
drm/amd/pm/ci: Clear EnabledForActivity field for memory levels
[ Upstream commit 5facfd4c4c67e8500116ffec0d9da35d92b9c787 ]
Follow what radeon did and what amdgpu does for other GPUs with SMU7.
Fixes: 9f4b35411cfe ("drm/amd/powerplay: add CI asics support to smumgr (v3)")
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Timur Kristóf <timur.kristof@gmail.com>
Date: Sun Mar 29 18:02:59 2026 +0200
drm/amd/pm/ci: Disable MCLK DPM on problematic CI ASICs
[ Upstream commit 9851f29cb06c09f7dad3867d8b0feec3fc71b6c8 ]
There are two known cases where MCLK DPM can causes issues:
Radeon R9 M380 found in iMac computers from 2015.
The SMU in this GPU just hangs as soon as we send it the
PPSMC_MSG_MCLKDPM_Enable command, even when MCLK switching is
disabled, and even when we only populate one MCLK DPM level.
Apply workaround to all devices with the same subsystem ID.
Radeon R7 260X due to old memory controller microcode.
We only flash the MC ucode when it isn't set up by the VBIOS,
therefore there is no way to make sure that it has the correct
ucode version.
I verified that this patch fixes the SMU hang on the R9 M380
which would previously fail to boot. This also fixes the UVD
initialization error on that GPU which happened because the
SMU couldn't ungate the UVD after it hung.
Fixes: 86457c3b21cb ("drm/amd/powerplay: Add support for CI asics to hwmgr")
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Timur Kristóf <timur.kristof@gmail.com>
Date: Sun Mar 29 18:03:04 2026 +0200
drm/amd/pm/ci: Fill DW8 fields from SMC
[ Upstream commit baf28ec5795c077406d6f52b8ad39e614153bce6 ]
In ci_populate_dw8() we currently just read a value from the SMU
and then throw it away. Instead of throwing away the value,
we should use it to fill other fields in DW8 (like radeon).
Otherwise the value of the other fiels is just cleared when
we copy this data to the SMU later.
Fixes: 9f4b35411cfe ("drm/amd/powerplay: add CI asics support to smumgr (v3)")
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Timur Kristóf <timur.kristof@gmail.com>
Date: Sun Mar 29 18:03:02 2026 +0200
drm/amd/pm/ci: Fix powertune defaults for Hawaii 0x67B0
[ Upstream commit d784759c07924280f3c313f205fc48eb62d7cb71 ]
There is no AMD GPU with the ID 0x66B0, this looks like a typo.
It should be 0x67B0 which is actually part of the PCI ID list,
and should use the Hawaii XT powertune defaults according to
the old radeon driver.
Fixes: 9f4b35411cfe ("drm/amd/powerplay: add CI asics support to smumgr (v3)")
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Timur Kristóf <timur.kristof@gmail.com>
Date: Sun Mar 29 18:02:58 2026 +0200
drm/amd/pm/ci: Use highest MCLK on CI when MCLK DPM is disabled
[ Upstream commit 894f0d34d66cb47fe718fe2ae5c18729d22c5218 ]
When MCLK DPM is disabled for any reason, populate the MCLK
table with the highest MCLK DPM level, so that the ASIC can
use the highest possible memory clock to get good performance
even when MCLK DPM is disabled.
Fixes: 9f4b35411cfe ("drm/amd/powerplay: add CI asics support to smumgr (v3)")
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Timur Kristóf <timur.kristof@gmail.com>
Date: Sun Mar 29 18:03:05 2026 +0200
drm/amd/pm/smu7: Add SCLK cap for quirky Hawaii board
[ Upstream commit 4724bc5b8d78c34b993594f9406135408ccb312a ]
On a specific Radeon R9 390X board, the GPU can "randomly" hang
while gaming. Initially I thought this was a RADV bug and tried
to work around this in Mesa:
commit 8ea08747b86b ("radv: Mitigate GPU hang on Hawaii in Dota 2 and RotTR")
However, I got some feedback from other users who are reporting
that the above mitigation causes a significant performance
regression for them, and they didn't experience the hang on their
GPU in the first place.
After some further investigation, it turns out that the problem
is that the highest SCLK DPM level on this board isn't stable.
Lowering SCLK to 1040 MHz (from 1070 MHz) works around the issue,
and has a negligible impact on performance compared to the Mesa
patch. (Note that increasing the voltage can also work around it,
but we felt that lowering the SCLK is the safer option.)
To solve the above issue, add an "sclk_cap" field to smu7_hwmgr
and set this field for the affected board. The capped SCLK value
correctly appears on the sysfs interface and shows up in GUI
tools such as LACT.
Fixes: 9f4b35411cfe ("drm/amd/powerplay: add CI asics support to smumgr (v3)")
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Timur Kristóf <timur.kristof@gmail.com>
Date: Sun Mar 29 18:03:00 2026 +0200
drm/amd/pm/smu7: Fix SMU7 voltage dependency on display clock
[ Upstream commit 0138610c14130425be53423b35336561829965e0 ]
The DCE (display controller engine) requires a minimum voltage
in order to function correctly, depending on which clock level
it currently uses.
Add a new table that contains display clock frequency levels
and the corresponding required voltages. The clock frequency
levels are taken from DC (and the old radeon driver's voltage
dependency table for CI in cases where its values were lower).
The voltage levels are taken from the following function:
phm_initializa_dynamic_state_adjustment_rule_settings().
Furthermore, in case of CI, call smu7_patch_vddc() on the new
table to account for leakage voltage (like in radeon).
Use the display clock value from amd_pp_display_configuration
to look up the voltage level needed by the DCE. Send the
voltage to the SMU via the PPSMC_MSG_VddC_Request command.
The previous implementation of this feature was non-functional
because it relied on a "dal_power_level" field which was never
assigned; and it was not at all implemented for CI ASICs.
I verified this on a Radeon R9 M380 which previously booted to
a black screen with DC enabled (default since Linux 6.19), but
now works correctly.
Fixes: 599a7e9fe1b6 ("drm/amd/powerplay: implement smu7 hwmgr to manager asics with smu ip version 7.")
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Alex Deucher <alexander.deucher@amd.com>
Date: Thu Feb 19 18:18:28 2026 -0500
drm/amdgpu/gfx10: look at the right prop for gfx queue priority
[ Upstream commit 355d96cdec5c61fd83f7eb54f1a28e38809645d6 ]
Look at hqd_queue_priority rather than hqd_pipe_priority.
In practice, it didn't matter as both were always set for
kernel queues, but that will change in the future.
Fixes: b07d1d73b09e ("drm/amd/amdgpu: Enable high priority gfx queue")
Reviewed-by:Jesse Zhang <jesse.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Alex Deucher <alexander.deucher@amd.com>
Date: Thu Feb 19 18:20:27 2026 -0500
drm/amdgpu/gfx11: look at the right prop for gfx queue priority
[ Upstream commit f9a4e81bcbd04e6f967d851f9fe69d8bb3cc08b3 ]
Look at hqd_queue_priority rather than hqd_pipe_priority.
In practice, it didn't matter as both were always set for
kernel queues, but that will change in the future.
Fixes: 2e216b1e6ba2 ("drm/amdgpu/gfx11: handle priority setup for gfx pipe1")
Reviewed-by:Jesse Zhang <jesse.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Timur Kristóf <timur.kristof@gmail.com>
Date: Sat Apr 18 23:49:33 2026 +0200
drm/amdgpu/gfx6: Support harvested SI chips with disabled TCCs (v2)
[ Upstream commit fe2b84f9228e2a0903221a4d0d8c350b018e9c0c ]
This commit fixes amdgpu to work on the Radeon HD 7870 XT
which has never worked with the Linux open source drivers before.
Some boards have "harvested" chips, meaning that some parts of
the chip are disabled and fused, and it's sold for cheaper and
under a different marketing name.
On a harvested chip, any of the following can be disabled:
- CUs (Compute Units)
- RBs (Render Backend, aka. ROP)
- Memory channels (ie. the chip has a lower bandwidth)
- TCCs (ie. less L2 cache)
Handle chips with harvested TCCs by patching the registers
that configure how TCCs are mapped.
If some TCCs are disabled, we need to make sure that
the disabled TCCs are not used, and the remaining TCCs
are used optimally.
TCP_CHAN_STEER_LO/HI control which TCC is used by TCP channels.
TCP_ADDR_CONFIG.NUM_TCC_BANKS controls how many channels are used.
Note that the TCC configuration is highly relevant to performance.
Suboptimal configuration (eg. CHAN_STEER=0) can significantly
reduce gaming performance.
For optimal performance:
- Rely on the CHAN_STEER from the golden registers table,
only skip disabled TCCs but keep the mapping order.
- Limit NUM_TCC_BANKS to number of active TCCs to avoid thrashing,
which performs better than using the same TCC twice.
v2:
- Also consider CGTS_USER_TCC_DISABLE for disabled TCCs.
Link: https://bugs.freedesktop.org/show_bug.cgi?id=60879
Closes: https://gitlab.freedesktop.org/drm/amd/-/work_items/2664
Fixes: 2cd46ad22383 ("drm/amdgpu: add graphic pipeline implementation for si v8")
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 00218d15528fab9f6b31241fe5904eea4fcaa30d)
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Timur Kristóf <timur.kristof@gmail.com>
Date: Sat Apr 18 23:49:30 2026 +0200
drm/amdgpu/gmc: Fix AMDGPU_GART_PLACEMENT_LOW to not overlap with VRAM
[ Upstream commit 36d65da7570bf72ce28504fa9a81abfc728e6d96 ]
When the GART placement is set to AMDGPU_GART_PLACEMENT_LOW:
Make sure that GART does not overlap with VRAM when
VRAM is configured to be in the low address space.
Solve this according to the following logic:
- When GART fits before VRAM, use zero address for GART
- Otherwise, put GART after the end of VRAM, aligned to 4 GiB
Previously, I had assumed this was not possible
so it was OK to not handle it, but now we got a report
from a user who has a board that is configured this way.
Fixes: 917f91d8d8e8 ("drm/amdgpu/gmc: add a way to force a particular placement for GART")
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 3d9de5d86a1658cadb311461b001eb1df67263ad)
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Yinjie Yao <yinjie.yao@amd.com>
Date: Mon Apr 27 11:46:10 2026 -0400
drm/amdgpu/jpeg: set no_user_fence for JPEG v2.0 ring
[ Upstream commit e5f612dc91650561fe2b5b76dd6d2898ec9ad480 ]
JPEG rings do not support 64-bit user fence writes, reject CS
submissions with user fences.
Fixes: 6ac27241106b ("drm/amdgpu: add JPEG v2.0 function supports")
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Yinjie Yao <yinjie.yao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 96179da0c6b059eb31706a0abe8dd6381c533143)
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Yinjie Yao <yinjie.yao@amd.com>
Date: Mon Apr 27 11:46:10 2026 -0400
drm/amdgpu/jpeg: set no_user_fence for JPEG v2.5 ring
[ Upstream commit 79405e774ede411c6b47ed41c651e40b92de64a2 ]
JPEG rings do not support 64-bit user fence writes, reject CS
submissions with user fences.
Fixes: 14f43e8f88c5 ("drm/amdgpu: move JPEG2.5 out from VCN2.5")
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Yinjie Yao <yinjie.yao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 3216a7f4e2642bda5fd14f57586e835ae9202587)
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Yinjie Yao <yinjie.yao@amd.com>
Date: Mon Apr 27 11:46:10 2026 -0400
drm/amdgpu/jpeg: set no_user_fence for JPEG v3.0 ring
[ Upstream commit a2baf12eec41f246689e6a3f8619af1200031576 ]
JPEG rings do not support 64-bit user fence writes, reject CS
submissions with user fences.
Fixes: dfd57dbf44dd ("drm/amdgpu: add JPEG3.0 support for Sienna_Cichlid")
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Yinjie Yao <yinjie.yao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 4d7d774f100efb5089c86a1fb8c5bf47c63fc9ef)
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Yinjie Yao <yinjie.yao@amd.com>
Date: Mon Apr 27 11:46:11 2026 -0400
drm/amdgpu/jpeg: set no_user_fence for JPEG v4.0 ring
[ Upstream commit e7e90b5839aeb8805ec83bb4da610b8dab8e184d ]
JPEG rings do not support 64-bit user fence writes, reject CS
submissions with user fences.
Fixes: b13111de32a9 ("drm/amdgpu/jpeg: add jpeg support for VCN4_0_0")
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Yinjie Yao <yinjie.yao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 8d0cac9478a3f046279c657d6a2545de49ae675a)
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Yinjie Yao <yinjie.yao@amd.com>
Date: Mon Apr 27 11:46:11 2026 -0400
drm/amdgpu/jpeg: set no_user_fence for JPEG v4.0.3 ring
[ Upstream commit 83e37c0987ca92f9e87789b46dd311dcf5a4a6c8 ]
JPEG rings do not support 64-bit user fence writes, reject CS
submissions with user fences.
Fixes: e684e654eba9 ("drm/amdgpu/jpeg: add jpeg support for VCN4_0_3")
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Yinjie Yao <yinjie.yao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 2f6afc97d259d530f4f86c7743efbc573a8da927)
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Yinjie Yao <yinjie.yao@amd.com>
Date: Mon Apr 27 11:46:11 2026 -0400
drm/amdgpu/jpeg: set no_user_fence for JPEG v4.0.5 ring
[ Upstream commit b65b7f3f3c18f797f81a2af7c97e2079900ad6db ]
JPEG rings do not support 64-bit user fence writes, reject CS
submissions with user fences.
Fixes: 8f98a715da8e ("drm/amdgpu/jpeg: add jpeg support for VCN4_0_5")
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Yinjie Yao <yinjie.yao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit f05d0a4f21fc720116d6e238f23308b199891058)
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Yinjie Yao <yinjie.yao@amd.com>
Date: Mon Apr 27 11:46:11 2026 -0400
drm/amdgpu/jpeg: set no_user_fence for JPEG v5.0.0 ring
[ Upstream commit ea7c61c5f895e8f9ea0ffffa180498ef9c740152 ]
JPEG rings do not support 64-bit user fence writes, reject CS
submissions with user fences.
Fixes: dfad65c65728 ("drm/amdgpu: Add JPEG5 support")
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Yinjie Yao <yinjie.yao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 0f43893d3cd478fa57836697525b338817c9c23d)
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Timur Kristóf <timur.kristof@gmail.com>
Date: Sat Apr 18 23:49:31 2026 +0200
drm/amdgpu/uvd3.1: Don't validate the firmware when already validated
[ Upstream commit 13e4cf116dbf7a1fb8123a59bea2c098f30d3736 ]
UVD 3.1 firmware validation seems to always fail after
attempting it when it had already been validated.
(This works similarly with the VCE 1.0 as well.)
Don't attempt repeating the validation when it's already done.
This caused issues in situations when the system isn't able
to suspend the GPU properly and so the GPU isn't actually
powered down. Then amdgpu would fail when calling the IP
block resume function.
Closes: https://gitlab.freedesktop.org/drm/amd/-/work_items/2887
Fixes: bb7978111dd3 ("drm/amdgpu: fix SI UVD firmware validate resume fail")
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 889a2cfd889c4a4dd9d0c89ce9a8e60b78be71dd)
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Timur Kristóf <timur.kristof@gmail.com>
Date: Sun Mar 29 18:03:06 2026 +0200
drm/amdgpu/uvd4.2: Don't initialize UVD 4.2 when DPM is disabled
[ Upstream commit 8b3e8fa6d7bdab292447a43f70532db437d5d4f5 ]
UVD 4.2 doesn't work at all when DPM is disabled because
the SMU is responsible for ungating it. So, Linux fails
to boot with CIK GPUs when using the amdgpu.dpm=0 parameter.
Fix this by returning -ENOENT from uvd_v4_2_early_init()
when amdgpu_dpm isn't enabled.
Note: amdgpu.dpm=0 is often suggested as a workaround
for issues and is useful for debugging.
Fixes: a2e73f56fa62 ("drm/amdgpu: Add support for CIK parts")
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Yinjie Yao <yinjie.yao@amd.com>
Date: Mon Apr 27 11:45:35 2026 -0400
drm/amdgpu/vcn: set no_user_fence for VCN v2.0 enc/dec rings
[ Upstream commit 8d80b293b41fcb5e9396db93e788b0f4ebcbafb7 ]
VCN encoder and decoder rings do not support 64-bit user fence writes,
reject CS submissions with user fences.
Fixes: 1b61de45dfaf ("drm/amdgpu: add initial VCN2.0 support (v2)")
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Yinjie Yao <yinjie.yao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit e2b5499fca55f1a32960a311bbb62e35891eaf73)
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Yinjie Yao <yinjie.yao@amd.com>
Date: Mon Apr 27 11:45:35 2026 -0400
drm/amdgpu/vcn: set no_user_fence for VCN v2.5 enc/dec rings
[ Upstream commit 4f317863a3ab212a027d8c8c3cc3af4e3fb95704 ]
VCN encoder and decoder rings do not support 64-bit user fence writes,
reject CS submissions with user fences.
Fixes: 28c17d72072b ("drm/amdgpu: add VCN2.5 basic supports")
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Yinjie Yao <yinjie.yao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit efc9dd5590894109bce9a0bfe1fa5592dd6b20b1)
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Yinjie Yao <yinjie.yao@amd.com>
Date: Mon Apr 27 11:45:35 2026 -0400
drm/amdgpu/vcn: set no_user_fence for VCN v3.0 enc/dec rings
[ Upstream commit f1e5a6660d7cbf006079126d9babbf0ccf538c6b ]
VCN encoder and decoder rings do not support 64-bit user fence writes,
reject CS submissions with user fences.
Fixes: cf14826cdfb5 ("drm/amdgpu: add VCN3.0 support for Sienna_Cichlid")
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Yinjie Yao <yinjie.yao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 663bed3c7b8b9a7624b0d95d300ddae034ad0614)
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Yinjie Yao <yinjie.yao@amd.com>
Date: Mon Apr 27 11:45:36 2026 -0400
drm/amdgpu/vcn: set no_user_fence for VCN v4.0.3 enc ring
[ Upstream commit 4532b52b34e4e4310386e6fdf6a643368599f522 ]
VCN encoder and decoder rings do not support 64-bit user fence writes,
reject CS submissions with user fences.
Fixes: b889ef4ac988 ("drm/amdgpu/vcn: add vcn support for VCN4_0_3")
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Yinjie Yao <yinjie.yao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit ff1a5a125c5a70c328806b9bc01d7d942cf3f9aa)
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Yinjie Yao <yinjie.yao@amd.com>
Date: Mon Apr 27 11:45:36 2026 -0400
drm/amdgpu/vcn: set no_user_fence for VCN v4.0.5 enc ring
[ Upstream commit 589a254bf3e88204c8402b9cbccd5e23a0af990f ]
VCN encoder and decoder rings do not support 64-bit user fence writes,
reject CS submissions with user fences.
Fixes: 547aad32edac ("drm/amdgpu: add VCN4 ip block support")
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Yinjie Yao <yinjie.yao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 084d94ac93707bdda07efb5cee786f632de4219b)
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Yinjie Yao <yinjie.yao@amd.com>
Date: Mon Apr 27 11:45:36 2026 -0400
drm/amdgpu/vcn: set no_user_fence for VCN v5.0.0 enc ring
[ Upstream commit 8cae0ce77de492d7c31c1532a2e80c0c6e7e58cb ]
VCN encoder and decoder rings do not support 64-bit user fence writes,
reject CS submissions with user fences.
Fixes: b6d1a0632051 ("drm/amdgpu: add VCN_5_0_0 IP block support")
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Yinjie Yao <yinjie.yao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 49b1fbbb5a071197ee71e2d70959b1cb29bdc317)
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Sunil Khatri <sunil.khatri@amd.com>
Date: Tue Sep 24 18:16:29 2024 +0530
drm/amdgpu: add amdgpu_device reference in ip block
[ Upstream commit 37b993225d37744f2a62bf67074a76a6cb7b8b98 ]
To handle amdgpu_device reference for different GPUs
we add it's reference in each ip block which can be
used to differentiate between difference gpu devices.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Suggested-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Stable-dep-of: 8b3e8fa6d7bd ("drm/amdgpu/uvd4.2: Don't initialize UVD 4.2 when DPM is disabled")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Date: Thu Mar 12 19:29:54 2026 +0530
drm/amdgpu: Add default case in DVI mode validation
[ Upstream commit e6020a55b8e364d15eac27f9c788e13114eec6b7 ]
amdgpu_connector_dvi_mode_valid() assigns max_digital_pixel_clock_khz
based on connector_object_id using a switch statement that lacks a
default case.
In practice this code path should never be hit because the existing
cases already cover all digital connector types that this function is
used for. This is also legacy display code which is not used for new
hardware.
Add a default case returning MODE_BAD to make the switch exhaustive and
silence the static analyzer smatch error. The new branch is effectively
defensive and should never be reached during normal operation.
Fixes: 585b2f685c56 ("drm/amdgpu: Respect max pixel clock for HDMI and DVI-D (v2)")
Cc: Dan Carpenter <dan.carpenter@linaro.org>
Cc: Timur Kristóf <timur.kristof@gmail.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Christian König <christian.koenig@amd.com>
Date: Fri Apr 17 15:52:45 2026 +0200
drm/amdgpu: fix AMDGPU_INFO_READ_MMR_REG
[ Upstream commit 0ef196a208385b7d7da79f411c161b04e97283e2 ]
There were multiple issues in that code.
First of all the order between the reset semaphore and the mm_lock was
wrong (e.g. copy_to_user) was called while holding the lock.
Then we allocated memory while holding the reset semaphore which is also
a pretty big bug and can deadlock.
Then we used down_read_trylock() instead of waiting for the reset to
finish.
Signed-off-by: Christian König <christian.koenig@amd.com>
Fixes: 9e823f307074 ("drm/amdgpu: Block MMR_READ IOCTL in reset")
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 361b6e6b303d4b691f6c5974d3eaab67ca6dd90e)
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Alexandre Demers <alexandre.f.demers@gmail.com>
Date: Thu Feb 27 00:05:04 2025 -0500
drm/amdgpu: fix spelling typos
[ Upstream commit ce43abd7ec9464cf954f90e1c69e11768b02fa0a ]
Found some typos while exploring amdgpu code.
Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Stable-dep-of: 13e4cf116dbf ("drm/amdgpu/uvd3.1: Don't validate the firmware when already validated")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Sunil Khatri <sunil.khatri@amd.com>
Date: Tue Sep 24 21:30:17 2024 +0530
drm/amdgpu: update the handle ptr in dump_ip_state
[ Upstream commit fa73462dc0482644416c2a2ee042c11d93a89663 ]
Update the ptr handle to amdgpu_ip_block ptr in all
the functions.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Stable-dep-of: 8b3e8fa6d7bd ("drm/amdgpu/uvd4.2: Don't initialize UVD 4.2 when DPM is disabled")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Sunil Khatri <sunil.khatri@amd.com>
Date: Wed Sep 25 16:59:51 2024 +0530
drm/amdgpu: update the handle ptr in early_init
[ Upstream commit 146b085eadd2ce405e67492a80d6e767748d5642 ]
update the handle ptr to amdgpu_ip_block ptr
for all functions pointers on early_init.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Stable-dep-of: 8b3e8fa6d7bd ("drm/amdgpu/uvd4.2: Don't initialize UVD 4.2 when DPM is disabled")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jayesh Choudhary <j-choudhary@ti.com>
Date: Tue Dec 9 17:33:28 2025 +0530
drm/bridge: cadence: cdns-mhdp8546-core: Add mode_valid hook to drm_bridge_funcs
[ Upstream commit 6dbff34016052b099558b76632e4983e2df13fed ]
Add cdns_mhdp_bridge_mode_valid() to check if specific mode is valid for
this bridge or not. In the legacy usecase with
!DRM_BRIDGE_ATTACH_NO_CONNECTOR we were using the hook from
drm_connector_helper_funcs but with DRM_BRIDGE_ATTACH_NO_CONNECTOR
we need to have mode_valid() in drm_bridge_funcs.
Without this patch, when using DRM_BRIDGE_ATTACH_NO_CONNECTOR
flag, the cdns_mhdp_bandwidth_ok() function would not be called
during mode validation, potentially allowing modes that exceed
the bridge's bandwidth capabilities to be incorrectly marked as
valid.
Fixes: c932ced6b585 ("drm/tidss: Update encoder/bridge chain connect model")
Reviewed-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
Signed-off-by: Jayesh Choudhary <j-choudhary@ti.com>
Signed-off-by: Harikrishna Shenoy <h-shenoy@ti.com>
Reviewed-by: Luca Ceresoli <luca.ceresoli@bootlin.com>
Link: https://patch.msgid.link/20251209120332.3559893-3-h-shenoy@ti.com
Signed-off-by: Luca Ceresoli <luca.ceresoli@bootlin.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Harikrishna Shenoy <h-shenoy@ti.com>
Date: Tue Dec 9 17:33:29 2025 +0530
drm/bridge: cadence: cdns-mhdp8546-core: Handle HDCP state in bridge atomic check
[ Upstream commit 4a8edd658489ec2a3d7e20482fa9e8d366153d8d ]
Now that we have DRM_BRIDGE_ATTACH_NO_CONNECTOR framework, handle the
HDCP state change in bridge atomic check as well to enable correct
functioning for HDCP in both DRM_BRIDGE_ATTACH_NO_CONNECTOR and
!DRM_BRIDGE_ATTACH_NO_CONNECTOR case.
Without this patch, when using DRM_BRIDGE_ATTACH_NO_CONNECTOR flag, HDCP
state changes would not be properly handled during atomic commits,
potentially leading to HDCP authentication failures or incorrect
protection status for content requiring HDCP encryption.
Fixes: 6a3608eae6d33 ("drm: bridge: cdns-mhdp8546: Enable HDCP")
Signed-off-by: Harikrishna Shenoy <h-shenoy@ti.com>
Reviewed-by: Luca Ceresoli <luca.ceresoli@bootlin.com>
Reviewed-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
Link: https://patch.msgid.link/20251209120332.3559893-4-h-shenoy@ti.com
Signed-off-by: Luca Ceresoli <luca.ceresoli@bootlin.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jayesh Choudhary <j-choudhary@ti.com>
Date: Tue Dec 9 17:33:27 2025 +0530
drm/bridge: cadence: cdns-mhdp8546-core: Set the mhdp connector earlier in atomic_enable()
[ Upstream commit 43d6508ddbf9fb974fbc359a033154f78c9d4c8b ]
In case if we get errors in cdns_mhdp_link_up() or cdns_mhdp_reg_read()
in atomic_enable, we will go to cdns_mhdp_modeset_retry_fn() and will hit
NULL pointer while trying to access the mutex. We need the connector to
be set before that. Unlike in legacy cases with flag
!DRM_BRIDGE_ATTACH_NO_CONNECTOR, we do not have connector initialised
in bridge_attach(), so add the mhdp->connector_ptr in device structure
to handle both cases with DRM_BRIDGE_ATTACH_NO_CONNECTOR and
!DRM_BRIDGE_ATTACH_NO_CONNECTOR, set it in atomic_enable() earlier to
avoid possible NULL pointer dereference in recovery paths like
modeset_retry_fn() with the DRM_BRIDGE_ATTACH_NO_CONNECTOR flag set.
Fixes: c932ced6b585 ("drm/tidss: Update encoder/bridge chain connect model")
Signed-off-by: Jayesh Choudhary <j-choudhary@ti.com>
Signed-off-by: Harikrishna Shenoy <h-shenoy@ti.com>
Reviewed-by: Luca Ceresoli <luca.ceresoli@bootlin.com>
Reviewed-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
Link: https://patch.msgid.link/20251209120332.3559893-2-h-shenoy@ti.com
Signed-off-by: Luca Ceresoli <luca.ceresoli@bootlin.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Johan Hovold <johan@kernel.org>
Date: Fri May 8 16:44:44 2026 +0200
drm/gma500/oaktrail_hdmi: fix i2c adapter leak on setup
commit 950953f774b3f69da6f413e045ef075e1f3da2df upstream.
Make sure to drop the reference taken to the I2C adapter (and its
module) when setting up HDMI to allow the adapter to be deregistered.
Fixes: 1b082ccf5901 ("gma500: Add Oaktrail support")
Cc: stable@vger.kernel.org # 3.3
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Patrik Jakobsson <patrik.r.jakobsson@gmail.com>
Link: https://patch.msgid.link/20260508144446.59722-2-johan@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Johan Hovold <johan@kernel.org>
Date: Fri May 8 16:44:45 2026 +0200
drm/gma500/oaktrail_lvds: fix hang on init failure
commit 657a091ab6d01d0091b77660c75cfed573c9a53e upstream.
The LVDS init code looks up an I2C adapter using i2c_get_adapter() and
tries to read the EDID before falling back to allocating and registering
its own adapter.
The error handling does not separate these cases so on a late init
failure it will try to deregister and free also an adapter that had
previously been registered. Since i2c_get_adapter() takes another
reference to the adapter, deregistration hangs indefinitely while
waiting for the reference to be released.
Fix this by only destroying adapters allocated during LVDS init on
errors.
Fixes: a57ebfc0b4da ("drm/gma500: Make oaktrail lvds use ddc adapter from drm_connector")
Cc: stable@vger.kernel.org # 6.0
Cc: Patrik Jakobsson <patrik.r.jakobsson@gmail.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Patrik Jakobsson <patrik.r.jakobsson@gmail.com>
Link: https://patch.msgid.link/20260508144446.59722-3-johan@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Johan Hovold <johan@kernel.org>
Date: Fri May 8 16:44:46 2026 +0200
drm/gma500/oaktrail_lvds: fix i2c adapter leaks on init
commit 84d1c9b416d54afe760ca4c378bd95c89261254c upstream.
The LVDS init code looks up an I2C adapter using i2c_get_adapter() and
tries to read the EDID before falling back to allocating and registering
its own adapter.
Make sure to drop the references taken by i2c_get_adapter() when falling
back to allocating an adapter as well as on late errors to allow the
looked up adapter to be deregistered.
Fixes: 1b082ccf5901 ("gma500: Add Oaktrail support")
Cc: stable@vger.kernel.org # 3.3
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Patrik Jakobsson <patrik.r.jakobsson@gmail.com>
Link: https://patch.msgid.link/20260508144446.59722-4-johan@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Chaitanya Kumar Borah <chaitanya.kumar.borah@intel.com>
Date: Tue May 5 14:39:20 2026 +0530
drm/i915/dp: Fix VSC dynamic range signaling for RGB formats
commit 1ae15b6c7965d137eef21f2cc7d367b29cb88369 upstream.
For RGB, set dynamic_range to CTA or VESA based on
crtc_state->limited_color_range so sinks apply correct
quantization. YCbCr remains limited (CTA) range.
(DP v1.4, Table 5-1)
v2:
- Added Reported-by and Tested-by tags
v3:
- Add back YCbCr comment(Suraj)
Cc: stable@vger.kernel.org #v5.8+
Reported-by: DeepChirp <DeepChirp@outlook.com>
Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/work_items/15874
Tested-by: DeepChirp <DeepChirp@outlook.com>
Fixes: 9799c4c3b76e ("drm/i915/dp: Add compute routine for DP VSC SDP")
Assisted-by: GitHub-Copilot:GPT-5.4
Signed-off-by: Chaitanya Kumar Borah <chaitanya.kumar.borah@intel.com>
Reviewed-by: Suraj Kandpal <suraj.kandpal@intel.com>
Signed-off-by: Suraj Kandpal <suraj.kandpal@intel.com>
Link: https://patch.msgid.link/20260505090920.2479112-1-chaitanya.kumar.borah@intel.com
(cherry picked from commit 38e10ddae6f8d42a2e8437fcd25a1cac51106c64)
Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Ville Syrjälä <ville.syrjala@linux.intel.com>
Date: Tue Mar 24 15:48:38 2026 +0200
drm/i915/wm: Verify the correct plane DDB entry
[ Upstream commit a97c88a176b6b8d116f4d3f508f3bd02bc77b462 ]
Actually verify the DDB entry for the plane we're looking
at instead of always verifying the cursor DDB.
Fixes: 7d4561722c3b ("drm/i915: Tweak plane ddb allocation tracking")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patch.msgid.link/20260324134843.2364-5-ville.syrjala@linux.intel.com
Reviewed-by: Vinod Govindapillai <vinod.govindapillai@intel.com>
(cherry picked from commit f002f7c7439de18117a31ca84dc87a59719c3dd6)
Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ville Syrjälä <ville.syrjala@linux.intel.com>
Date: Thu Oct 31 17:56:41 2024 +0200
drm/i915: Relocate the SKL wm sanitation code
[ Upstream commit e31e8681d29c5c35aa070ca6323c6b95ecf0db99 ]
In order to add more MBUS sanitation into the code we'll want
to reuse a bunch of the code that performs the MBUS/related
hardware programming. Currently that code comes after the
main skl_wm_get_hw_state_and_sanitize() entrypoint. In order
to avoid annoying forward declarations relocate the
skl_wm_get_hw_state_and_sanitize() and related stuff nearer to
the end of the file.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241031155646.15165-2-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Stable-dep-of: a97c88a176b6 ("drm/i915/wm: Verify the correct plane DDB entry")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Sebastian Brzezinka <sebastian.brzezinka@intel.com>
Date: Thu Apr 16 13:31:18 2026 +0200
drm/i915: skip __i915_request_skip() for already signaled requests
commit 4cfe4c0efbdcde742a47813180cc69b132d7598e upstream.
After a GPU reset the HWSP is zeroed, so previously completed
requests appear incomplete. If such a request is picked up during
reset_rewind() and marked guilty, i915_request_set_error_once()
returns early (fence already signaled), leaving fence.error without
a fatal error code. The subsequent __i915_request_skip() then hits:
```
GEM_BUG_ON(!fatal_error(rq->fence.error))
```
Fixes a kernel BUG observed on Sandy Bridge (Gen6) during
heartbeat-triggered engine resets.
```
kernel BUG at drivers/gpu/drm/i915/i915_request.c:556!
RIP: __i915_request_skip+0x15e/0x1d0 [i915]
...
__i915_request_reset+0x212/0xa70 [i915]
reset_rewind+0xe4/0x280 [i915]
intel_gt_reset+0x30d/0x5b0 [i915]
heartbeat+0x516/0x530 [i915]
```
Guard __i915_request_skip() with i915_request_signaled(), if the
fence is already signaled, the ring content is committed and there
is nothing left to skip.
Fixes: 36e191f0644b ("drm/i915: Apply i915_request_skip() on submission")
Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/work_items/13729
Signed-off-by: Sebastian Brzezinka <sebastian.brzezinka@intel.com>
Cc: stable@vger.kernel.org # v5.7+
Reviewed-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://lore.kernel.org/r/fe76921d35b6ae85aa651822726d0d9815aa5362.1776339012.git.sebastian.brzezinka@intel.com
(cherry picked from commit 5ba54393dcd7adf75a9f39f5a933b1538349cad5)
Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Alexandru Dadu <alexandru.dadu@imgtec.com>
Date: Mon Mar 23 20:31:29 2026 +0200
drm/imagination: Switch reset_reason fields from enum to u32
[ Upstream commit d2f83a6cd598bf413f1acf34153bd1d71023fbab ]
Update the reset_reason fwif structure fields from enum to u32 to remove
any ambiguity from the interface (enum is not a fixed size thus is unfit
for the purpose of the data type).
Fixes: a26f067feac1f ("drm/imagination: Add FWIF headers")
Signed-off-by: Alexandru Dadu <alexandru.dadu@imgtec.com>
Reviewed-by: Matt Coster <matt.coster@imgtec.com>
Link: https://patch.msgid.link/20260323-b4-firmware-context-reset-notification-handling-v3-2-1a66049a9a65@imgtec.com
Signed-off-by: Matt Coster <matt.coster@imgtec.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Alexander Konyukhov <Alexander.Konyukhov@kaspersky.com>
Date: Tue Feb 3 16:48:46 2026 +0300
drm/komeda: fix integer overflow in AFBC framebuffer size check
[ Upstream commit 779ec12c85c9e4547519e3903a371a3b26a289de ]
The AFBC framebuffer size validation calculates the minimum required
buffer size by adding the AFBC payload size to the framebuffer offset.
This addition is performed without checking for integer overflow.
If the addition oveflows, the size check may incorrectly succed and
allow userspace to provide an undersized drm_gem_object, potentially
leading to out-of-bounds memory access.
Add usage of check_add_overflow() to safely compute the minimum
required size and reject the framebuffer if an overflow is detected.
This makes the AFBC size validation more robust against malformed.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Fixes: 65ad2392dd6d ("drm/komeda: Added AFBC support for komeda driver")
Signed-off-by: Alexander Konyukhov <Alexander.Konyukhov@kaspersky.com>
Acked-by: Liviu Dudau <liviu.dudau@arm.com>
Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
Link: https://lore.kernel.org/r/20260203134907.1587067-1-Alexander.Konyukhov@kaspersky.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Myeonghun Pak <mhun512@gmail.com>
Date: Wed May 13 15:57:00 2026 +0900
drm/loongson: Use managed KMS polling
commit 0a9c56dd387605d17dabeedd9fdd2c4c1d0bab7b upstream.
lsdc_pci_probe() initializes KMS polling before setting up vblank support,
requesting the IRQ and registering the DRM device. If any of those later
steps fails, probe returns without finalizing polling. The driver also
never finalizes polling on regular removal.
Use drmm_kms_helper_poll_init() so polling is tied to the DRM device
lifetime and automatically finalized on probe failure and device removal.
This issue was identified during our ongoing static-analysis research while
reviewing kernel code.
Fixes: f39db26c5428 ("drm: Add kms driver for loongson display controller")
Cc: stable@vger.kernel.org
Co-developed-by: Ijae Kim <ae878000@gmail.com>
Signed-off-by: Ijae Kim <ae878000@gmail.com>
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
Acked-by: Jianmin Lv <lvjianmin@loongson.cn>
Reviewed-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Myeonghun Pak <mhun512@gmail.com>
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patch.msgid.link/20260513065706.23803-1-mhun512@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Connor Abbott <cwabbott0@gmail.com>
Date: Wed Mar 25 16:58:37 2026 -0400
drm/msm/a6xx: Fix dumping A650+ debugbus blocks
[ Upstream commit cc83f71c9be0715fe93b963ffa9767d5d84354ed ]
These should be appended after the existing debugbus blocks, instead of
replacing them.
Fixes: 1e05bba5e2b8 ("drm/msm/a6xx: Update a6xx gpu coredump")
Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
Patchwork: https://patchwork.freedesktop.org/patch/714270/
Message-ID: <20260325-drm-msm-a650-debugbus-v1-1-dfbf358890a7@gmail.com>
Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Rob Clark <robin.clark@oss.qualcomm.com>
Date: Wed Mar 25 11:40:42 2026 -0700
drm/msm/a6xx: Fix HLSQ register dumping
[ Upstream commit c289a6db9ba6cb974f0317da142e4f665d589566 ]
Fix the bitfield offset of HLSQ_READ_SEL state-type bitfield. Otherwise
we are always reading TP state when we wanted SP or HLSQ state.
Reported-by: Connor Abbott <cwabbott0@gmail.com>
Suggested-by: Connor Abbott <cwabbott0@gmail.com>
Fixes: 1707add81551 ("drm/msm/a6xx: Add a6xx gpu state")
Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
Patchwork: https://patchwork.freedesktop.org/patch/714236/
Message-ID: <20260325184043.1259312-1-robin.clark@oss.qualcomm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Akhil P Oommen <akhilpo@oss.qualcomm.com>
Date: Fri Mar 27 05:43:50 2026 +0530
drm/msm/a6xx: Use barriers while updating HFI Q headers
[ Upstream commit dc78b35d5ec09d1b0b8a937e6e640d2c5a030915 ]
To avoid harmful compiler optimizations and IO reordering in the HW, use
barriers and READ/WRITE_ONCE helpers as necessary while accessing the HFI
queue index variables.
Fixes: 4b565ca5a2cb ("drm/msm: Add A6XX device support")
Signed-off-by: Akhil P Oommen <akhilpo@oss.qualcomm.com>
Patchwork: https://patchwork.freedesktop.org/patch/714653/
Message-ID: <20260327-a8xx-gpu-batch2-v2-1-2b53c38d2101@oss.qualcomm.com>
Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Yuanjie Yang <yuanjie.yang@oss.qualcomm.com>
Date: Mon Mar 9 14:37:20 2026 +0800
drm/msm/dpu: fix mismatch between power and frequency
[ Upstream commit bc1dccc518cc5ab5140fba06c27e7188e0ed342b ]
During DPU runtime suspend, calling dev_pm_opp_set_rate(dev, 0) drops
the MMCX rail to MIN_SVS while the core clock frequency remains at its
original (highest) rate. When runtime resume re-enables the clock, this
may result in a mismatch between the rail voltage and the clock rate.
For example, in the DPU bind path, the sequence could be:
cpu0: dev_sync_state -> rpmhpd_sync_state
cpu1: dpu_kms_hw_init
timeline 0 ------------------------------------------------> t
After rpmhpd_sync_state, the voltage performance is no longer guaranteed
to stay at the highest level. During dpu_kms_hw_init, calling
dev_pm_opp_set_rate(dev, 0) drops the voltage, causing the MMCX rail to
fall to MIN_SVS while the core clock is still at its maximum frequency.
When the power is re-enabled, only the clock is enabled, leading to a
situation where the MMCX rail is at MIN_SVS but the core clock is at its
highest rate. In this state, the rail cannot sustain the clock rate,
which may cause instability or system crash.
Remove the call to dev_pm_opp_set_rate(dev, 0) from dpu_runtime_suspend
to ensure the correct vote is restored when DPU resumes.
Fixes: b0530eb11913 ("drm/msm/dpu: Use OPP API to set clk/perf state")
Signed-off-by: Yuanjie Yang <yuanjie.yang@oss.qualcomm.com>
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Patchwork: https://patchwork.freedesktop.org/patch/710077/
Link: https://lore.kernel.org/r/20260309063720.13572-1-yuanjie.yang@oss.qualcomm.com
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Pengyu Luo <mitltlatltl@gmail.com>
Date: Mon Mar 9 18:02:53 2026 +0800
drm/msm/dsi: add the missing parameter description
[ Upstream commit 958adefc4c0fddee3b12269da5dd7cb49bac953f ]
Add a description for is_bonded_dsi in dsi_adjust_pclk_for_compression
to match the existing kernel-doc comment.
Fixes: e4eb11b34d6c ("drm/msm/dsi: fix pclk rate calculation for bonded dsi")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202603080314.XeqyRZ7A-lkp@intel.com/
Signed-off-by: Pengyu Luo <mitltlatltl@gmail.com>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Patchwork: https://patchwork.freedesktop.org/patch/710112/
Link: https://lore.kernel.org/r/20260309100254.877801-1-mitltlatltl@gmail.com
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Pengyu Luo <mitltlatltl@gmail.com>
Date: Sat Mar 7 19:12:48 2026 +0800
drm/msm/dsi: fix bits_per_pclk
[ Upstream commit 2d51cfb77daa30b10bc68c403f8ace35783d2922 ]
mipi_dsi_pixel_format_to_bpp return dst bpp not src bpp, dst bpp may
not be the uncompressed data size. use src bpc * 3 to get src bpp,
this aligns with pclk rate calculation.
Fixes: ac47870fd795 ("drm/msm/dsi: fix hdisplay calculation when programming dsi registers")
Signed-off-by: Pengyu Luo <mitltlatltl@gmail.com>
Patchwork: https://patchwork.freedesktop.org/patch/709916/
Link: https://lore.kernel.org/r/20260307111250.105772-1-mitltlatltl@gmail.com
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Pengyu Luo <mitltlatltl@gmail.com>
Date: Sat Mar 7 19:12:49 2026 +0800
drm/msm/dsi: fix hdisplay calculation for CMD mode panel
[ Upstream commit 82159db4371f5cef56444ebd0b8f96e2a6d709ff ]
Commit ac47870fd795 ("drm/msm/dsi: fix hdisplay calculation when
programming dsi registers") incorrecly broke hdisplay calculation for
CMD mode by specifying incorrect number of bytes per transfer, fix it.
Fixes: ac47870fd795 ("drm/msm/dsi: fix hdisplay calculation when programming dsi registers")
Signed-off-by: Pengyu Luo <mitltlatltl@gmail.com>
Patchwork: https://patchwork.freedesktop.org/patch/709917/
Link: https://lore.kernel.org/r/20260307111250.105772-2-mitltlatltl@gmail.com
[DB: fixed commit message]
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Alexander Koskovich <akoskovich@pm.me>
Date: Tue Mar 24 11:48:27 2026 +0000
drm/msm/dsi: rename MSM8998 DSI version from V2_2_0 to V2_0_0
[ Upstream commit 913a709dea0eff9c7b2e9470f8c8594b9a0114ab ]
The MSM8998 DSI controller is v2.0.0 as stated in commit 7b8c9e203039
("drm/msm/dsi: Add support for MSM8998 DSI controller"). The value was
always correct just the name was wrong.
Rename and reorder to maintain version sorting.
Fixes: 7b8c9e203039 ("drm/msm/dsi: Add support for MSM8998 DSI controller")
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Signed-off-by: Alexander Koskovich <akoskovich@pm.me>
Patchwork: https://patchwork.freedesktop.org/patch/713717/
Link: https://lore.kernel.org/r/20260324-dsi-rgb101010-support-v5-3-ff6afc904115@pm.me
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Rob Clark <robin.clark@oss.qualcomm.com>
Date: Wed Mar 25 11:41:05 2026 -0700
drm/msm/shrinker: Fix can_block() logic
[ Upstream commit df0f439e3926817cf577ca6272aad68468ff7624 ]
The intention here was to allow blocking if DIRECT_RECLAIM or if called
from kswapd and KSWAPD_RECLAIM is set.
Reported by Claude code review: https://lore.gitlab.freedesktop.org/drm-ai-reviews/review-patch9-20260309151119.290217-10-boris.brezillon@collabora.com/ on a panthor patch which had copied similar logic.
Reported-by: Boris Brezillon <boris.brezillon@collabora.com>
Fixes: 7860d720a84c ("drm/msm: Fix build break with recent mm tree")
Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Patchwork: https://patchwork.freedesktop.org/patch/714238/
Message-ID: <20260325184106.1259528-1-robin.clark@oss.qualcomm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Date: Mon Mar 23 03:21:49 2026 +0200
drm/panel: sharp-ls043t1le01: make use of prepare_prev_first
[ Upstream commit c222177d7c7e1b2e0433d9e47ec2da7015345d50 ]
The DSI link must be powered up to let panel driver to talk to the panel
during prepare() callback execution. Set the prepare_prev_first flag to
guarantee this.
Fixes: 9e15123eca79 ("drm/msm/dsi: Stop unconditionally powering up DSI hosts at modeset")
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
Link: https://patch.msgid.link/20260323-panel-fix-v1-1-9f12b09161e8@oss.qualcomm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Sebastian Reichel <sebastian.reichel@collabora.com>
Date: Tue Feb 17 16:25:26 2026 +0200
drm/panel: simple: Correct G190EAN01 prepare timing
[ Upstream commit f1080f82570b797598c1ba7e9c800ae9e94aafc6 ]
The prepare timing specified by the G190EAN01 datasheet should be
between 30 and 50 ms. Considering it might take some time for the
LVDS encoder to enable the signal, we should only wait the min.
required time in the panel driver and not the max. allowed time.
Fixes: 2f7b832fc992 ("drm/panel: simple: Add support for AUO G190EAN01 panel")
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
Signed-off-by: Ian Ray <ian.ray@gehealthcare.com>
Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org>
Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
Link: https://patch.msgid.link/20260217142528.68613-1-ian.ray@gehealthcare.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Gyeyoung Baek <gye976@gmail.com>
Date: Sun Apr 19 16:17:16 2026 +0900
drm/panfrost: Fix wait_bo ioctl leaking positive return from dma_resv_wait_timeout()
commit 459d75523b71c0ec254d153d8850d0b7008af396 upstream.
dma_resv_wait_timeout() returns a positive 'remaining jiffies' value
on success, 0 on timeout, and -errno on failure.
panfrost_ioctl_wait_bo() returns this 'long' result from an int-typed
ioctl handler, so positive values reach userspace as bogus errors.
Explicitly set ret to 0 on the success path.
Fixes: f3ba91228e8e ("drm/panfrost: Add initial panfrost driver")
Cc: stable@vger.kernel.org
Signed-off-by: Gyeyoung Baek <gye976@gmail.com>
Reviewed-by: Adrián Larumbe <adrian.larumbe@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Link: https://patch.msgid.link/fe33f82fded7be1c18e2e0eb2db451d5a738cf39.1776581974.git.gye976@gmail.com
Signed-off-by: Steven Price <steven.price@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Ethan Tidmore <ethantidmore06@gmail.com>
Date: Mon Feb 16 19:48:01 2026 -0600
drm/sun4i: backend: fix error pointer dereference
[ Upstream commit 06277983eca4a31d3c2114fa33d99a6e82484b11 ]
The function drm_atomic_get_plane_state() can return an error pointer
and is not checked for it. Add error pointer check.
Detected by Smatch:
drivers/gpu/drm/sun4i/sun4i_backend.c:496 sun4i_backend_atomic_check() error:
'plane_state' dereferencing possible ERR_PTR()
Fixes: 96180dde23b79 ("drm/sun4i: backend: Add a custom atomic_check for the frontend")
Signed-off-by: Ethan Tidmore <ethantidmore06@gmail.com>
Reviewed-by: Chen-Yu Tsai <wens@kernel.org>
Link: https://patch.msgid.link/20260217014801.60760-1-ethantidmore06@gmail.com
Signed-off-by: Chen-Yu Tsai <wens@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ethan Tidmore <ethantidmore06@gmail.com>
Date: Thu Feb 26 10:38:36 2026 -0600
drm/sun4i: Fix resource leaks
[ Upstream commit 127367ad2e0f4870de60c6d719ae82ecf68d674c ]
Three clocks are not being released in devm_regmap_init_mmio() error
path.
Add proper goto and set ret to the error code.
Fixes: 8270249fbeaf0 ("drm/sun4i: backend: Create regmap after access is possible")
Signed-off-by: Ethan Tidmore <ethantidmore06@gmail.com>
Reviewed-by: Jernej Skrabec <jernej.skrabec@gmail.com>
Link: https://patch.msgid.link/20260226163836.10335-1-ethantidmore06@gmail.com
Signed-off-by: Chen-Yu Tsai <wens@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Yuho Choi <dbgh9129@gmail.com>
Date: Sun Apr 19 20:25:13 2026 -0400
drm/sysfb: ofdrm: fix PCI device reference leaks
[ Upstream commit 4aa8110000b0d215deef8eed283565dd0c1def88 ]
display_get_pci_dev_of() gets a referenced PCI device via
pci_get_device(). Drop that reference when pci_enable_device() fails and
release it during the managed teardown path after pci_disable_device().
Without that, ofdrm leaks the pci_dev reference on both the error path
and the normal cleanup path.
Fixes: c8a17756c425 ("drm/ofdrm: Add ofdrm for Open Firmware framebuffers")
Co-developed-by: Myeonghun Pak <mhun512@gmail.com>
Signed-off-by: Myeonghun Pak <mhun512@gmail.com>
Co-developed-by: Ijae Kim <ae878000@gmail.com>
Signed-off-by: Ijae Kim <ae878000@gmail.com>
Co-developed-by: Taegyu Kim <tmk5904@psu.edu>
Signed-off-by: Taegyu Kim <tmk5904@psu.edu>
Signed-off-by: Yuho Choi <dbgh9129@gmail.com>
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patch.msgid.link/20260420002513.216-1-dbgh9129@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Maíra Canal <mcanal@igalia.com>
Date: Fri Mar 6 08:30:33 2026 -0300
drm/v3d: Handle error from drm_sched_entity_init()
[ Upstream commit 8cf1bec37b27846ad3169744c9f1a89a06dcb3fa ]
drm_sched_entity_init() can fail but its return value is currently being
ignored in v3d_open(). Check the return value and properly unwind
on failure by destroying any already-initialized scheduler entities.
Fixes: 57692c94dcbe ("drm/v3d: Introduce a new DRM driver for Broadcom V3D V3.x+")
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Link: https://patch.msgid.link/20260306-v3d-reset-locking-improv-v3-1-49864fe00692@igalia.com
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ashutosh Desai <ashutoshdesai993@gmail.com>
Date: Fri May 15 17:58:09 2026 -0300
drm/v3d: Reject empty multisync extension to prevent infinite loop
v3d_get_extensions() walks a userspace-provided singly-linked list of
ioctl extensions without any bound on the chain length. A local user
can craft a self-referential extension (ext->next == &ext) with zero
in_sync_count and out_sync_count, which bypasses the existing duplicate-
extension guard:
if (se->in_sync_count || se->out_sync_count)
return -EINVAL;
The guard never fires because v3d_get_multisync_post_deps() returns
immediately when count is zero, leaving both fields at zero on every
iteration. The result is an infinite loop in kernel context, blocking
the calling thread and pegging a CPU core indefinitely.
Fix this by rejecting a multisync extension where both in_sync_count
and out_sync_count are zero in v3d_get_multisync_submit_deps(). An
empty multisync carries no synchronization information and serves no
useful purpose, so returning -EINVAL for such an extension is the
correct defense against this attack vector.
Fixes: e4165ae8304e ("drm/v3d: add multiple syncobjs support")
Cc: stable@vger.kernel.org
Signed-off-by: Ashutosh Desai <ashutoshdesai993@gmail.com>
Link: https://patch.msgid.link/20260415050000.3816128-1-ashutoshdesai993@gmail.com
Signed-off-by: Maíra Canal <mcanal@igalia.com>
(cherry picked from commit fb44d589bf3148e13452185a6e772a7efbf2d684)
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Matt Roper <matthew.d.roper@intel.com>
Date: Wed Apr 8 15:27:44 2026 -0700
drm/xe/debugfs: Correct printing of register whitelist ranges
[ Upstream commit 03f2499c51dffce611b065b2894406beb9f2ebe0 ]
The register-save-restore debugfs prints whitelist entries as offset
ranges. E.g.,
REG[0x39319c-0x39319f]: allow read access
for a single dword-sized register. However the GENMASK value used to
set the lower bits to '1' for the upper bound of the whitelist range
incorrectly included one more bit than it should have, causing the
whitelist ranges to sometimes appear twice as large as they really were.
For example,
REG[0x6210-0x6217]: allow rw access
was also intended to be a single dword-sized register whitelist (with a
range 0x6210-0x6213) but was printed incorrectly as a qword-sized range
because one too many bits was flipped on. Similar 'off by one' logic
was applied when printing 4-dword register ranges and 64-dword register
ranges as well.
Correct the GENMASK logic to print these ranges in debugfs correctly.
No impact outside of correcting the misleading debugfs output.
Fixes: d855d2246ea6 ("drm/xe: Print whitelist while applying")
Reviewed-by: Stuart Summers <stuart.summers@intel.com>
Link: https://patch.msgid.link/20260408-regsr_wl_range-v1-1-e9a28c8b4264@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
(cherry picked from commit 1a2a722ff96749734a5585dfe7f0bea7719caa8b)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Matthew Auld <matthew.auld@intel.com>
Date: Fri May 8 11:26:36 2026 +0100
drm/xe/dma-buf: handle empty bo and UAF races
commit 981bedbbe61364fcc3a3b87ebaf648a66cd07108 upstream.
There look to be some nasty races here when triggering the
invalidate_mappings hook:
1) We do xe_bo_alloc() followed by the attach, before the actual full bo
init step in xe_dma_buf_init_obj(). However the bo is visible on the
attachments list after the attach. This is bad since exporter driver,
say amdgpu, can at any time call back into our invalidate_mappings hook,
with an empty/bogus bo, leading to potential bugs/crashes.
2) Similar to 1) but here we get a UAF, when the invalidate_mappings
hook is triggered. For example, we get as far as xe_bo_init_locked()
but this fails in some way. But here the bo will be freed on error, but
we still have it attached from dma-buf pov, so if the
invalidate_mappings is now triggered then the bo we access is gone and
we trigger UAF and more bugs/crashes.
To fix this, move the attach step until after we actually have a fully
set up buffer object. Note that the bo is not published to userspace
until later, so not sure what the comment "Don't publish the bo
until we have a valid attachment", is referring to.
We have at least two different customers reporting hitting a NULL ptr
deref in evict_flags when importing something from amdgpu, followed by
triggering the evict flow. Hit rate is also pretty low, which would
hint at some kind of race, so something like 1) or 2) might explain
this.
v2:
- Shuffle the order of the ops slightly (no functional change)
- Improve the comment to better explain the ordering (Matt B)
Assisted-by: Gemini:gemini-3 #debug
Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/work_items/7903
Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/work_items/4055
Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: <stable@vger.kernel.org> # v6.8+
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Acked-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patch.msgid.link/20260508102635.149172-3-matthew.auld@intel.com
(cherry picked from commit af1f2ad0c59fe4e2f924c526f66e968289d77971)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Shuicheng Lin <shuicheng.lin@intel.com>
Date: Fri Apr 17 16:33:08 2026 +0000
drm/xe/gsc: Fix BO leak on error in query_compatibility_version()
[ Upstream commit 3762d6c36549accea7068c4a175483fafdd03657 ]
When xe_gsc_read_out_header() fails, query_compatibility_version()
returns directly instead of jumping to the out_bo label. This skips
the xe_bo_unpin_map_no_vm() call, leaving the BO pinned and mapped
with no remaining reference to free it.
Fix by using goto out_bo so the error path properly cleans up the BO,
consistent with the other error handling in the same function.
Fixes: 0881cbe04077 ("drm/xe/gsc: Query GSC compatibility version")
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patch.msgid.link/20260417163308.3416147-1-shuicheng.lin@intel.com
Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com>
(cherry picked from commit 8de86d0a843c32ca9d36864bdb92f0376a830bce)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Shuicheng Lin <shuicheng.lin@intel.com>
Date: Wed Apr 8 02:06:47 2026 +0000
drm/xe: Fix error cleanup in xe_exec_queue_create_ioctl()
[ Upstream commit f3cc22d4df3ed58439ea7e21daa54c3608e03b78 ]
Two error handling issues exist in xe_exec_queue_create_ioctl():
1. When xe_hw_engine_group_add_exec_queue() fails, the error path jumps
to put_exec_queue which skips xe_exec_queue_kill(). If the VM is in
preempt fence mode, xe_vm_add_compute_exec_queue() has already added
the queue to the VM's compute exec queue list. Skipping the kill
leaves the queue on that list, leading to a dangling pointer after
the queue is freed.
2. When xa_alloc() fails after xe_hw_engine_group_add_exec_queue() has
succeeded, the error path does not call
xe_hw_engine_group_del_exec_queue() to remove the queue from the hw
engine group list. The queue is then freed while still linked into
the hw engine group, causing a use-after-free.
Fix both by:
- Changing the xe_hw_engine_group_add_exec_queue() failure path to jump
to kill_exec_queue so that xe_exec_queue_kill() properly removes the
queue from the VM's compute list.
- Adding a del_hw_engine_group label before kill_exec_queue for the
xa_alloc() failure path, which removes the queue from the hw engine
group before proceeding with the rest of the cleanup.
Fixes: 7970cb36966c ("'drm/xe/hw_engine_group: Register hw engine group's exec queues")
Cc: Francois Dugast <francois.dugast@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Assisted-by: Claude:claude-opus-4.6
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20260408020647.3397933-1-shuicheng.lin@intel.com
Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com>
(cherry picked from commit 37c831f401746a45d510b312b0ed7a77b1e06ec8)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Date: Tue Jan 20 12:19:25 2026 +0100
dt-bindings: clock: qcom,dispcc-sc7180: Define MDSS resets
[ Upstream commit fc6e29d42872680dca017f2e5169eefe971f8d89 ]
The MDSS resets have so far been left undescribed. Fix that.
Fixes: 75616da71291 ("dt-bindings: clock: Introduce QCOM sc7180 display clock bindings")
Signed-off-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Reviewed-by: Taniya Das <taniya.das@oss.qualcomm.com>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com>
Tested-by: Val Packett <val@packett.cool> # sc7180-ecs-liva-qc710
Link: https://lore.kernel.org/r/20260120-topic-7180_dispcc_bcr-v1-1-0b1b442156c3@oss.qualcomm.com
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Stable-dep-of: b0bc6011c549 ("clk: qcom: dispcc-sc7180: Add missing MDSS resets")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Val Packett <val@packett.cool>
Date: Thu Mar 12 08:12:06 2026 -0300
dt-bindings: clock: qcom,gcc-sc8180x: Add missing GDSCs
[ Upstream commit 76404ffbf07f28a5ec04748e18fce3dac2e78ef6 ]
There are 5 more GDSCs that we were ignoring and not putting to sleep,
which are listed in downstream DTS. Add them.
Signed-off-by: Val Packett <val@packett.cool>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260312112321.370983-2-val@packett.cool
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Stable-dep-of: 3565741eb985 ("clk: qcom: gcc-sc8180x: Add missing GDSCs")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Geert Uytterhoeven <geert+renesas@glider.be>
Date: Fri Mar 6 11:26:20 2026 +0100
dt-bindings: interrupt-controller: arm,gic-v3: Fix EPPI range
[ Upstream commit 15cfc8984defc17e5e4de1f58db7b993240fcbda ]
According to the "Arm Generic Interrupt Controller (GIC) Architecture
Specification, v3 and v4", revision H.b[1], there can be only 64
Extended PPI interrupts.
[1] https://developer.arm.com/documentation/ihi0069/hb/
Fixes: 4b049063e0bcbfd3 ("dt-bindings: interrupt-controller: arm,gic-v3: Describe EPPI range support")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Brain-farted-by: Marc Zyngier <maz@kernel.org>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://patch.msgid.link/3e49a63c6b2b6ee48e3737adee87781f9c136c5f.1772792753.git.geert+renesas@glider.be
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Josua Mayer <josua@solid-run.com>
Date: Thu Apr 9 14:34:33 2026 +0200
dt-bindings: net: dsa: nxp,sja1105: make spi-cpol optional for sja1110
[ Upstream commit 600f01dc4bd0c736b3ffea9f7976136d8bf1b136 ]
Currently, the binding requires 'spi-cpha' for SJA1105 and 'spi-cpol'
for SJA1110.
However, the SJA1110 supports both SPI modes 0 and 2. Mode 2
(cpha=0, cpol=1) is used by the NXP LX2160 Bluebox 3.
On the SolidRun i.MX8DXL HummingBoard Telematics, mode 0 is stable,
while forcing mode 2 introduces CRC errors especially during bursts.
Drop the requirement on spi-cpol for SJA1110.
Fixes: af2eab1a8243 ("dt-bindings: net: nxp,sja1105: document spi-cpol/cpha")
Signed-off-by: Josua Mayer <josua@solid-run.com>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://patch.msgid.link/20260409-imx8dxl-sr-som-v2-1-83ff20629ba0@solid-run.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Matt Vollrath <tactii@gmail.com>
Date: Thu Apr 16 17:53:36 2026 -0700
e1000e: Unroll PTP in probe error handling
[ Upstream commit aa3f7fe409350857c25d050482a2eef2cfd69b58 ]
If probe fails after registering the PTP clock and its delayed work,
these resources must be released.
This was not an issue until a 2016 fix moved the e1000e_ptp_init() call
before the jump to err_register.
Fixes: aa524b66c5ef ("e1000e: don't modify SYSTIM registers during SIOCSHWTSTAMP ioctl")
Signed-off-by: Matt Vollrath <tactii@gmail.com>
Tested-by: Avigail Dahan <avigailx.dahan@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20260416-iwl-net-submission-2026-04-14-v2-12-686c33c9828d@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Thomas Huth <thuth@redhat.com>
Date: Fri Apr 10 17:46:37 2026 +0200
efi/capsule-loader: fix incorrect sizeof in phys array reallocation
[ Upstream commit 48a428215782321b56956974f23593e40ce84b7a ]
The krealloc() call for cap_info->phys in __efi_capsule_setup_info() uses
sizeof(phys_addr_t *) instead of sizeof(phys_addr_t), which might be
causing an undersized allocation.
The allocation is also inconsistent with the initial array allocation in
efi_capsule_open() that allocates one entry with sizeof(phys_addr_t),
and the efi_capsule_write() function that stores phys_addr_t values (not
pointers) via page_to_phys().
On 64-bit systems where sizeof(phys_addr_t) == sizeof(phys_addr_t *), this
goes unnoticed. On 32-bit systems with PAE where phys_addr_t is 64-bit but
pointers are 32-bit, this allocates half the required space, which might
lead to a heap buffer overflow when storing physical addresses.
This is similar to the bug fixed in commit fccfa646ef36 ("efi/capsule-loader:
fix incorrect allocation size") which fixed the same issue at the initial
allocation site.
Fixes: f24c4d478013 ("efi/capsule-loader: Reinstate virtual capsule mapping")
Assisted-by: Claude:claude-sonnet-4-5
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Gao Xiang <xiang@kernel.org>
Date: Mon Mar 10 17:54:57 2025 +0800
erofs: add encoded extent on-disk definition
[ Upstream commit efb2aef569b35b415c232c4e9fdecd0e540e1f60 ]
Previously, EROFS provided both (non-)compact compressed indexes to
keep necessary hints for each logical block, enabling O(1) random
indexing. This approach was originally designed for small compression
units (e.g., 4KiB), where compressed data is strictly block-aligned via
fixed-sized output compression.
However, EROFS now supports big pclusters up to 1MiB and many users use
large configurations to minimize image sizes. For such configurations,
the total number of extents decreases significantly (e.g., only 1,024
extents for a 1GiB file using 1MiB pclusters), then runtime metadata
overhead becomes negligible compared to data I/O and decoding costs.
Additionally, some popular compression algorithm (mainly Zstd) still
lacks native fixed-sized output compression support (although it's
planned by their authors). Instead of just waiting for compressor
improvements, let's adopt byte-oriented extents, allowing these
compressors to retain their current methods.
For example, it speeds up Zstd compression a lot:
Processor: Intel(R) Xeon(R) Platinum 8163 CPU @ 2.50GHz * 96
Dataset: enwik9
Build time Size Type Command Line
3m52.339s 266653696 FO -C524288 -zzstd,22
3m48.549s 266174464 FO -E48bit -C524288 -zzstd,22
0m12.821s 272134144 FI -E48bit -C1048576 --max-extent-bytes=1048576 -zzstd,22
0m14.528s 248987648 FO -C1048576 -zlzma,9
0m14.605s 248504320 FO -E48bit -C1048576 -zlzma,9
Encoded extents are structured as an array of `struct z_erofs_extent`,
sorted by logical address in ascending order:
__le32 plen // encoded length, algorithm id and flags
__le32 pstart_lo // physical offset LSB
__le32 pstart_hi // physical offset MSB
__le32 lstart_lo // logical offset
__le32 lstart_hi // logical offset MSB
..
Note that prefixed reduced records can be used to minimize metadata for
specific cases (e.g. lstart less than 32 bits, then 32 to 16 bytes).
If the logical lengths of all encoded extents are the same, 4-byte
(plen) and 8-byte (plen, pstart_lo) records can be used. Or, 16-byte
(plen .. lstart_lo) and 32-byte full records have to be used instead.
If 16-byte and 32-byte records are used, the total number of extents
is kept in `struct z_erofs_map_header`, and binary search can be
applied on them. Note that `eytzinger order` is not considerd because
data sequential access is important.
If 4-byte records are used, 8-byte start physical offset is between
`struct z_erofs_map_header` and the `plen` array.
In addition, 64-bit physical offsets can be applied with new encoded
extent format to match full 48-bit block addressing.
Remove redundant comments around `struct z_erofs_lcluster_index` too.
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Acked-by: Chao Yu <chao@kernel.org>
Link: https://lore.kernel.org/r/20250310095459.2620647-8-hsiangkao@linux.alibaba.com
Stable-dep-of: 2d8c7edcb661 ("erofs: unify lcn as u64 for 32-bit platforms")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Gao Xiang <xiang@kernel.org>
Date: Fri Oct 17 15:05:38 2025 +0800
erofs: avoid infinite loops due to corrupted subpage compact indexes
[ Upstream commit e13d315ae077bb7c3c6027cc292401bc0f4ec683 ]
Robert reported an infinite loop observed by two crafted images.
The root cause is that `clusterofs` can be larger than `lclustersize`
for !NONHEAD `lclusters` in corrupted subpage compact indexes, e.g.:
blocksize = lclustersize = 512 lcn = 6 clusterofs = 515
Move the corresponding check for full compress indexes to
`z_erofs_load_lcluster_from_disk()` to also cover subpage compact
compress indexes.
It also fixes the position of `m->type >= Z_EROFS_LCLUSTER_TYPE_MAX`
check, since it should be placed right after
`z_erofs_load_{compact,full}_lcluster()`.
Fixes: 8d2517aaeea3 ("erofs: fix up compacted indexes for block size < 4096")
Fixes: 1a5223c182fd ("erofs: do sanity check on m->type in z_erofs_load_compact_lcluster()")
Reported-by: Robert Morris <rtm@csail.mit.edu>
Closes: https://lore.kernel.org/r/35167.1760645886@localhost
Reviewed-by: Hongbo Li <lihongbo22@huawei.com>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Stable-dep-of: 2d8c7edcb661 ("erofs: unify lcn as u64 for 32-bit platforms")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Chao Yu <chao@kernel.org>
Date: Tue Jul 8 19:09:28 2025 +0800
erofs: do sanity check on m->type in z_erofs_load_compact_lcluster()
[ Upstream commit 1a5223c182fdb3bb3c0ca85cec101c740f685ab6 ]
All below functions will do sanity check on m->type, let's move sanity
check to z_erofs_load_compact_lcluster() for cleanup.
- z_erofs_map_blocks_fo
- z_erofs_get_extent_compressedlen
- z_erofs_get_extent_decompressedlen
- z_erofs_extent_lookback
Reviewed-by: Hongbo Li <lihongbo22@huawei.com>
Signed-off-by: Chao Yu <chao@kernel.org>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20250708110928.3110375-1-chao@kernel.org
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Stable-dep-of: 2d8c7edcb661 ("erofs: unify lcn as u64 for 32-bit platforms")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Gao Xiang <xiang@kernel.org>
Date: Mon Apr 20 18:11:42 2026 +0800
erofs: unify lcn as u64 for 32-bit platforms
[ Upstream commit 2d8c7edcb661812249469f4a5b62e9339118846f ]
As sashiko reported [1], `lcn` was typed as `unsigned long` (or
`unsigned int` sometimes), which is only 32 bits wide on 32-bit
platforms, which causes `(lcn << lclusterbits)` to be truncated
at 4 GiB.
In order to consolidate the logic, just use `u64` consistently
around the codebase.
[1] https://sashiko.dev/r/20260420034612.1899973-1-hsiangkao%40linux.alibaba.com
Fixes: 152a333a5895 ("staging: erofs: add compacted compression indexes support")
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Mohsin Bashir <hmohsin@meta.com>
Date: Tue Apr 7 17:24:15 2026 -0700
eth: fbnic: Use wake instead of start
[ Upstream commit 12ff2a4aee6c86746623d5aed24389dbf6dffded ]
fbnic_up() calls netif_tx_start_all_queues(), which only clears
__QUEUE_STATE_DRV_XOFF. If qdisc backlog has accumulated on any TX
queue before the reconfiguration (e.g. ring resize via ethtool -G),
start does not call __netif_schedule() to kick the qdisc, so the
pending backlog is never drained and the queue stalls.
Switch to netif_tx_wake_all_queues(), which clears DRV_XOFF and also
calls __netif_schedule() on every queue, ensuring any backlog that
built up before the down/up cycle is promptly dequeued.
Fixes: bc6107771bb4 ("eth: fbnic: Allocate a netdevice and napi vectors with queues")
Signed-off-by: Mohsin Bashir <hmohsin@meta.com>
Link: https://patch.msgid.link/20260408002415.2963915-1-mohsin.bashr@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: David Carlier <devnexen@gmail.com>
Date: Fri May 15 11:55:40 2026 -0400
eventfs: Use list_add_tail_rcu() for SRCU-protected children list
[ Upstream commit f67950b2887fa10df50c4317a1fe98a65bc6875b ]
Commit d2603279c7d6 ("eventfs: Use list_del_rcu() for SRCU protected
list variable") converted the removal side to pair with the
list_for_each_entry_srcu() walker in eventfs_iterate(). The insertion
in eventfs_create_dir() was left as a plain list_add_tail(), which on
weakly-ordered architectures can expose a new entry to the SRCU reader
before its list pointers and fields are observable.
Use list_add_tail_rcu() so the publication pairs with the existing
list_del_rcu() and list_for_each_entry_srcu().
Fixes: 43aa6f97c2d0 ("eventfs: Get rid of dentry pointers without refcounts")
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20260418152251.199343-1-devnexen@gmail.com
Signed-off-by: David Carlier <devnexen@gmail.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
[ adapted scoped_guard(mutex, &eventfs_mutex) block to explicit mutex_lock()/mutex_unlock() pair ]
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Ye Bin <yebin10@huawei.com>
Date: Mon Mar 30 21:30:35 2026 +0800
ext4: fix possible null-ptr-deref in mbt_kunit_exit()
[ Upstream commit 22f53f08d9eb837ce69b1a07641d414aac8d045f ]
There's issue as follows:
# test_new_blocks_simple: failed to initialize: -12
KASAN: null-ptr-deref in range [0x0000000000000638-0x000000000000063f]
Tainted: [E]=UNSIGNED_MODULE, [N]=TEST
RIP: 0010:mbt_kunit_exit+0x5e/0x3e0 [ext4_test]
Call Trace:
<TASK>
kunit_try_run_case_cleanup+0xbc/0x100 [kunit]
kunit_generic_run_threadfn_adapter+0x89/0x100 [kunit]
kthread+0x408/0x540
ret_from_fork+0xa76/0xdf0
ret_from_fork_asm+0x1a/0x30
If mbt_kunit_init() init testcase failed will lead to null-ptr-deref.
So add test if 'sb' is inited success in mbt_kunit_exit().
Fixes: 7c9fa399a369 ("ext4: add first unit test for ext4_mb_new_blocks_simple in mballoc")
Signed-off-by: Ye Bin <yebin10@huawei.com>
Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Reviewed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
Link: https://patch.msgid.link/20260330133035.287842-6-yebin@huaweicloud.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Chao Yu <chao@kernel.org>
Date: Tue May 19 08:41:56 2026 -0400
f2fs: fix false alarm of lockdep on cp_global_sem lock
[ Upstream commit 6a5e3de9c2bb0b691d16789a5d19e9276a09b308 ]
lockdep reported a potential deadlock:
a) TCMU device removal context:
- call del_gendisk() to get q->q_usage_counter
- call start_flush_work() to get work_completion of wb->dwork
b) f2fs writeback context:
- in wb_workfn(), which holds work_completion of wb->dwork
- call f2fs_balance_fs() to get sbi->gc_lock
c) f2fs vfs_write context:
- call f2fs_gc() to get sbi->gc_lock
- call f2fs_write_checkpoint() to get sbi->cp_global_sem
d) f2fs mount context:
- call recover_fsync_data() to get sbi->cp_global_sem
- call f2fs_check_and_fix_write_pointer() to call blkdev_report_zones()
that goes down to blk_mq_alloc_request and get q->q_usage_counter
Original callstack is in Closes tag.
However, I think this is a false alarm due to before mount returns
successfully (context d), we can not access file therein via vfs_write
(context c).
Let's introduce per-sb cp_global_sem_key, and assign the key for
cp_global_sem, so that lockdep can recognize cp_global_sem from
different super block correctly.
A lot of work are done by Shin'ichiro Kawasaki, thanks a lot for
the work.
Fixes: c426d99127b1 ("f2fs: Check write pointer consistency of open zones")
Cc: stable@kernel.org
Reported-and-tested-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Closes: https://lore.kernel.org/linux-f2fs-devel/20260218125237.3340441-1-shinichiro.kawasaki@wdc.com
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
[ adapted context to use `init_f2fs_rwsem()` instead of the not-yet-backported `init_f2fs_rwsem_trace()` macro ]
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Yongpeng Yang <yangyongpeng@xiaomi.com>
Date: Tue May 19 08:41:45 2026 -0400
f2fs: fix incorrect file address mapping when inline inode is unwritten
[ Upstream commit 68a0178981a0f493295afa29f8880246e561494c ]
When `fileinfo->fi_flags` does not have the `FIEMAP_FLAG_SYNC` bit set
and inline data has not been persisted yet, the physical address of the
extent is calculated incorrectly for unwritten inline inodes.
root@vm:/mnt/f2fs# dd if=/dev/zero of=data.3k bs=3k count=1
root@vm:/mnt/f2fs# f2fs_io fiemap 0 100 data.3k
Fiemap: offset = 0 len = 100
logical addr. physical addr. length flags
0 0000000000000000 00000ffffffff16c 0000000000000c00 00000301
This patch fixes the issue by checking if the inode's address is valid.
If the inline inode is unwritten, set the physical address to 0 and
mark the extent with `FIEMAP_EXTENT_UNKNOWN | FIEMAP_EXTENT_DELALLOC`
flags.
Cc: stable@kernel.org
Fixes: 67f8cf3cee6f ("f2fs: support fiemap for inline_data")
Signed-off-by: Yongpeng Yang <yangyongpeng@xiaomi.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
[ renamed `ifolio` to `ipage` in `inline_data_addr()` and `F2FS_INODE()` calls ]
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Yongpeng Yang <yangyongpeng@xiaomi.com>
Date: Fri Apr 10 23:05:39 2026 +0800
f2fs: protect extension_list reading with sb_lock in f2fs_sbi_show()
[ Upstream commit 5909bedbed38c558bee7cb6758ceedf9bc3a9194 ]
In f2fs_sbi_show(), the extension_list, extension_count and
hot_ext_count are read without holding sbi->sb_lock. If a concurrent
sysfs store modifies the extension list via f2fs_update_extension_list(),
the show path may read inconsistent count and array contents, potentially
leading to out-of-bounds access or displaying stale data.
Fix this by holding sb_lock around the entire extension list read
and format operation.
Fixes: b6a06cbbb5f7 ("f2fs: support hot file extension")
Signed-off-by: Yongpeng Yang <yangyongpeng@xiaomi.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ondrej Mosnacek <omosnace@redhat.com>
Date: Mon Feb 16 16:06:25 2026 +0100
fanotify: call fanotify_events_supported() before path_permission() and security_path_notify()
[ Upstream commit 66052a768d4726a31e939b5ac902f2b0b452c8d5 ]
The latter trigger LSM (e.g. SELinux) checks, which will log a denial
when permission is denied, so it's better to do them after validity
checks to avoid logging a denial when the operation would fail anyway.
Fixes: 0b3b094ac9a7 ("fanotify: Disallow permission events for proc filesystem")
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Paul Moore <paul@paul-moore.com>
Link: https://patch.msgid.link/20260216150625.793013-3-omosnace@redhat.com
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Date: Fri Mar 20 15:36:46 2026 +0100
fbdev: matroxfb: Mark variable with __maybe_unused to avoid W=1 build break
[ Upstream commit caf6144053b4e1c815aa56afb54745a176f999df ]
Clang is not happy about set but unused variable:
drivers/video/fbdev/matrox/g450_pll.c:412:18: error: variable 'mnp' set but not used
412 | unsigned int mnp;
| ^
1 error generated.
Since the commit 7b987887f97b ("video: fbdev: matroxfb: remove dead code
and set but not used variable") the 'mnp' became unused, but eliminating
that code might have side-effects. The question here is what should we do
with 'mnp'? The easiest way out is just mark it with __maybe_unused which
will shut the compiler up and won't change any possible IO flow. So does
this change.
A dive into the history of the driver:
The problem was revealed when the #if 0 guarded code along with unused
pixel_vco variable was removed. That code was introduced in the original
commit 213d22146d1f ("[PATCH] (1/3) matroxfb for 2.5.3"). And then guarded
in the commit 705e41f82988 ("matroxfb DVI updates: Handle DVI output on
G450/G550. Powerdown unused portions of G450/G550 DAC. Split G450/G550 DAC
from older DAC1064 handling. Modify PLL setting when both CRTCs use same
pixel clocks.").
NOTE: The two commits mentioned above pre-date Git era and available in
history.git repository for archaeological purposes.
Even without that guard the modern compilers may see that the pixel_vco
wasn't ever used and seems a leftover after some debug or review made
25 years ago.
The g450_mnp2vco() doesn't have any IO and as Jason said doesn't seem
to have any side effects either than some unneeded CPU processing during
runtime. I agree that's unlikely that timeout (or heating up the CPU) has
any effect on the HW (GPU/display) functionality.
Fixes: 7b987887f97b ("video: fbdev: matroxfb: remove dead code and set but not used variable")
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Yuho Choi <dbgh9129@gmail.com>
Date: Sun Apr 19 21:01:18 2026 -0400
fbdev: offb: fix PCI device reference leak on probe failure
[ Upstream commit 869b93ba04088713596e68453c1146f52f713290 ]
offb_init_nodriver() gets a referenced PCI device with pci_get_device().
If pci_enable_device() fails, the function returns without dropping that
reference.
Release the PCI device reference before returning from the
pci_enable_device() failure path.
Fixes: 5bda8f7b5468 ("video: fbdev: offb: Call pci_enable_device() before using the PCI VGA device")
Co-developed-by: Myeonghun Pak <mhun512@gmail.com>
Signed-off-by: Myeonghun Pak <mhun512@gmail.com>
Co-developed-by: Ijae Kim <ae878000@gmail.com>
Signed-off-by: Ijae Kim <ae878000@gmail.com>
Co-developed-by: Taegyu Kim <tmk5904@psu.edu>
Signed-off-by: Taegyu Kim <tmk5904@psu.edu>
Signed-off-by: Yuho Choi <dbgh9129@gmail.com>
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Al Viro <viro@zeniv.linux.org.uk>
Date: Fri Jul 19 20:17:58 2024 -0400
fdget(), trivial conversions
[ Upstream commit 6348be02eead77bdd1562154ed6b3296ad3b3750 ]
fdget() is the first thing done in scope, all matching fdput() are
immediately followed by leaving the scope.
Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Stable-dep-of: 66052a768d47 ("fanotify: call fanotify_events_supported() before path_permission() and security_path_notify()")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Sebastian Ene <sebastianene@google.com>
Date: Thu Apr 2 11:39:39 2026 +0000
firmware: arm_ffa: Use the correct buffer size during RXTX_MAP
[ Upstream commit 83210251fd70d5f96bcdc8911e15f7411a6b2463 ]
Don't use the discovered buffer size from an FFA_FEATURES call directly
since we can run on a system that has the PAGE_SIZE larger than the
returned size which makes the alloc_pages_exact for the buffer to be
rounded up.
Fixes: 61824feae5c0 ("firmware: arm_ffa: Fetch the Rx/Tx buffer size using ffa_features()")
Signed-off-by: Sebastian Ene <sebastianene@google.com>
Link: https://patch.msgid.link/20260402113939.930221-1-sebastianene@google.com
Signed-off-by: Sudeep Holla <sudeep.holla@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Mario Limonciello (AMD) <superm1@kernel.org>
Date: Sat Mar 7 08:10:20 2026 -0600
firmware: dmi: Correct an indexing error in dmi.h
[ Upstream commit c064abc68e009d2cc18416e7132d9c25e03125b6 ]
The entries later in enum dmi_entry_type don't match the SMBIOS
specification¹.
The entry for type 33: `64-Bit Memory Error Information` is not present and
thus the index for all later entries is incorrect.
Add it.
Also, add missing entry types 43-46, while at it.
¹ Search for "System Management BIOS (SMBIOS) Reference Specification"
[ bp: Drop the flaky SMBIOS spec URL. ]
Fixes: 93c890dbe5287 ("firmware: Add DMI entry types to the headers")
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Jean Delvare <jdelvare@suse.de>
Reviewed-by: Yazen Ghannam <yazen.ghannam@amd.com>
Link: https://patch.msgid.link/20260307141024.819807-2-superm1@kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Bae Yeonju <iwasbaeyz@gmail.com>
Date: Sat Mar 21 13:45:02 2026 +0900
fs/adfs: validate nzones in adfs_validate_bblk()
[ Upstream commit dd9d3e16c2d5fa166e13dce07413be51f42c8f5d ]
Reject ADFS disc records with a zero zone count during boot block
validation, before the disc record is used.
When nzones is 0, adfs_read_map() passes it to kmalloc_array(0, ...)
which returns ZERO_SIZE_PTR, and adfs_map_layout() then writes to
dm[-1], causing an out-of-bounds write before the allocated buffer.
adfs_validate_dr0() already rejects nzones != 1 for old-format
images. Add the equivalent check to adfs_validate_bblk() for
new-format images so that a crafted image with nzones == 0 is
rejected at probe time.
Found by syzkaller.
Fixes: f6f14a0d71b0 ("fs/adfs: map: move map-specific sb initialisation to map.c")
Signed-off-by: Bae Yeonju <iwasbaeyz@gmail.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: HyungJung Joo <jhj140711@gmail.com>
Date: Tue Mar 17 14:45:56 2026 +0900
fs/mbcache: cancel shrink work before destroying the cache
[ Upstream commit d227786ab1119669df4dc333a61510c52047cce4 ]
mb_cache_destroy() calls shrinker_free() and then frees all cache
entries and the cache itself, but it does not cancel the pending
c_shrink_work work item first.
If mb_cache_entry_create() schedules c_shrink_work via schedule_work()
and the work item is still pending or running when mb_cache_destroy()
runs, mb_cache_shrink_worker() will access the cache after its memory
has been freed, causing a use-after-free.
This is only reachable by a privileged user (root or CAP_SYS_ADMIN)
who can trigger the last put of a mounted ext2/ext4/ocfs2 filesystem.
Cancel the work item with cancel_work_sync() before calling
shrinker_free(), ensuring the worker has finished and will not be
rescheduled before the cache is torn down.
Fixes: c2f3140fe2ec ("mbcache2: limit cache size")
Signed-off-by: Hyungjung Joo <jhj140711@gmail.com>
Link: https://patch.msgid.link/20260317054556.1821600-1-jhj140711@gmail.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Pengpeng Hou <pengpeng@iscas.ac.cn>
Date: Fri Mar 27 14:19:55 2026 +0800
fs/ntfs3: terminate the cached volume label after UTF-8 conversion
[ Upstream commit a6cd43fe9b083fa23fe1595666d5738856cb261a ]
ntfs_fill_super() loads the on-disk volume label with utf16s_to_utf8s()
and stores the result in sbi->volume.label. The converted label is later
exposed through ntfs3_label_show() using %s, but utf16s_to_utf8s() only
returns the number of bytes written and does not add a trailing NUL.
If the converted label fills the entire fixed buffer,
ntfs3_label_show() can read past the end of sbi->volume.label while
looking for a terminator.
Terminate the cached label explicitly after a successful conversion and
clamp the exact-full case to the last byte of the buffer.
Fixes: 82cae269cfa9 ("fs/ntfs3: Add initialization of super block")
Signed-off-by: Pengpeng Hou <pengpeng@iscas.ac.cn>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: HyungJung Joo <jhj140711@gmail.com>
Date: Tue Mar 17 14:48:27 2026 +0900
fs/omfs: reject s_sys_blocksize smaller than OMFS_DIR_START
[ Upstream commit 0621c385fda1376e967f37ccd534c26c3e511d14 ]
omfs_fill_super() rejects oversized s_sys_blocksize values (> PAGE_SIZE),
but it does not reject values smaller than OMFS_DIR_START (0x1b8 = 440).
Later, omfs_make_empty() uses
sbi->s_sys_blocksize - OMFS_DIR_START
as the length argument to memset(). Since s_sys_blocksize is u32,
a crafted filesystem image with s_sys_blocksize < OMFS_DIR_START causes
an unsigned underflow there, wrapping to a value near 2^32. That drives
a ~4 GiB memset() from bh->b_data + OMFS_DIR_START and overwrites kernel
memory far beyond the backing block buffer.
Add the corresponding lower-bound check alongside the existing upper-bound
check in omfs_fill_super(), so that malformed images are rejected during
superblock validation before any filesystem data is processed.
Fixes: a3ab7155ea21 ("omfs: add directory routines")
Signed-off-by: Hyungjung Joo <jhj140711@gmail.com>
Link: https://patch.msgid.link/20260317054827.1822061-1-jhj140711@gmail.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Amir Goldstein <amir73il@gmail.com>
Date: Mon Apr 20 14:58:00 2026 +0200
fsnotify: fix inode reference leak in fsnotify_recalc_mask()
[ Upstream commit 4aca914ac152f5d055ddcb36704d1e539ac08977 ]
fsnotify_recalc_mask() fails to handle the return value of
__fsnotify_recalc_mask(), which may return an inode pointer that needs
to be released via fsnotify_drop_object() when the connector's HAS_IREF
flag transitions from set to cleared.
This manifests as a hung task with the following call trace:
INFO: task umount:1234 blocked for more than 120 seconds.
Call Trace:
__schedule
schedule
fsnotify_sb_delete
generic_shutdown_super
kill_anon_super
cleanup_mnt
task_work_run
do_exit
do_group_exit
The race window that triggers the iref leak:
Thread A (adding mark) Thread B (removing mark)
────────────────────── ────────────────────────
fsnotify_add_mark_locked():
fsnotify_add_mark_list():
spin_lock(conn->lock)
add mark_B(evictable) to list
spin_unlock(conn->lock)
return
/* ---- gap: no lock held ---- */
fsnotify_detach_mark(mark_A):
spin_lock(mark_A->lock)
clear ATTACHED flag on mark_A
spin_unlock(mark_A->lock)
fsnotify_put_mark(mark_A)
fsnotify_recalc_mask():
spin_lock(conn->lock)
__fsnotify_recalc_mask():
/* mark_A skipped: ATTACHED cleared */
/* only mark_B(evictable) remains */
want_iref = false
has_iref = true /* not yet cleared */
-> HAS_IREF transitions true -> false
-> returns inode pointer
spin_unlock(conn->lock)
/* BUG: return value discarded!
* iput() and fsnotify_put_sb_watched_objects()
* are never called */
Fix this by deferring the transition true -> false of HAS_IREF flag from
fsnotify_recalc_mask() (Thread A) to fsnotify_put_mark() (thread B).
Fixes: c3638b5b1374 ("fsnotify: allow adding an inode mark without pinning inode")
Signed-off-by: Xin Yin <yinxin.x@bytedance.com>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Link: https://patch.msgid.link/CAOQ4uxiPsbHb0o5voUKyPFMvBsDkG914FYDcs4C5UpBMNm0Vcg@mail.gmail.com
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Date: Tue Apr 28 12:34:25 2026 +0200
futex: Prevent lockup in requeue-PI during signal/ timeout wakeup
[ Upstream commit bc7304f3ae20972d11db6e0b1b541c63feda5f05 ]
During wait-requeue-pi (task A) and requeue-PI (task B) the following
race can happen:
Task A Task B
futex_wait_requeue_pi()
futex_setup_timer()
futex_do_wait()
futex_requeue()
CLASS(hb, hb1)(&key1);
CLASS(hb, hb2)(&key2);
*timeout*
futex_requeue_pi_wakeup_sync()
requeue_state = Q_REQUEUE_PI_IGNORE
*blocks on hb->lock*
futex_proxy_trylock_atomic()
futex_requeue_pi_prepare()
Q_REQUEUE_PI_IGNORE => -EAGAIN
double_unlock_hb(hb1, hb2)
*retry*
Task B acquires both hb locks and attempts to acquire the PI-lock of the
top most waiter (task B). Task A is leaving early due to a signal/
timeout and started removing itself from the queue. It updates its
requeue_state but can not remove it from the list because this requires
the hb lock which is owned by task B.
Usually task A is able to swoop the lock after task B unlocked it.
However if task B is of higher priority then task A may not be able to
wake up in time and acquire the lock before task B gets it again.
Especially on a UP system where A is never scheduled.
As a result task A blocks on the lock and task B busy loops, trying to
make progress but live locks the system instead. Tragic.
This can be fixed by removing the top most waiter from the list in this
case. This allows task B to grab the next top waiter (if any) in the
next iteration and make progress.
Remove the top most waiter if futex_requeue_pi_prepare() fails.
Let the waiter conditionally remove itself from the list in
handle_early_requeue_pi_wakeup().
Fixes: 07d91ef510fb1 ("futex: Prevent requeue_pi() lock nesting issue on RT")
Reported-by: Moritz Klammler <Moritz.Klammler@ferchau.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Link: https://patch.msgid.link/20260428103425.dywXyPd3@linutronix.de
Closes: https://lore.kernel.org/all/VE1PR06MB6894BE61C173D802365BE19DFF4CA@VE1PR06MB6894.eurprd06.prod.outlook.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Andreas Gruenbacher <agruenba@redhat.com>
Date: Tue Mar 31 06:13:42 2026 +0200
gfs2: add some missing log locking
[ Upstream commit fe2c8d051150b90b3ccb85f89e3b1d636cb88ec8 ]
Function gfs2_logd() calls the log flushing functions gfs2_ail1_start(),
gfs2_ail1_wait(), and gfs2_ail1_empty() without holding sdp->sd_log_flush_lock,
but these functions require exclusion against concurrent transactions.
To fix that, add a non-locking __gfs2_log_flush() function. Then, in
gfs2_logd(), take sdp->sd_log_flush_lock before calling the above mentioned log
flushing functions and __gfs2_log_flush().
Fixes: 5e4c7632aae1c ("gfs2: Issue revokes more intelligently")
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Andreas Gruenbacher <agruenba@redhat.com>
Date: Mon Feb 23 12:04:05 2026 +0100
gfs2: Call unlock_new_inode before d_instantiate
[ Upstream commit 2ff7cf7e0640ff071ebc5c7e3dc2df024a7c91e6 ]
As Neil Brown describes in detail in the link referenced below, new
inodes must be unlocked before they can be instantiated.
An even better fix is to use d_instantiate_new(), which combines
d_instantiate() and unlock_new_inode().
Fixes: 3d36e57ff768 ("gfs2: gfs2_create_inode rework")
Reported-by: syzbot+0ea5108a1f5fb4fcc2d8@syzkaller.appspotmail.com
Link: https://lore.kernel.org/linux-fsdevel/177153754005.8396.8777398743501764194@noble.neil.brown.name/
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Andreas Gruenbacher <agruenba@redhat.com>
Date: Tue Apr 7 12:14:30 2026 +0200
gfs2: prevent NULL pointer dereference during unmount
[ Upstream commit 74b4dbb946060a3233604d91859a9abd3708141d ]
When flushing out outstanding glock work during an unmount, gfs2_log_flush()
can be called when sdp->sd_jdesc has already been deallocated and sdp->sd_jdesc
is NULL. Commit 35264909e9d1 ("gfs2: Fix NULL pointer dereference in
gfs2_log_flush") added a check for that to gfs2_log_flush() itself, but it
missed the sdp->sd_jdesc dereference in gfs2_log_release(). Fix that.
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <error27@gmail.com>
Closes: https://lore.kernel.org/r/202604071139.HNJiCaAi-lkp@intel.com/
Fixes: 35264909e9d1 ("gfs2: Fix NULL pointer dereference in gfs2_log_flush")
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Denis Benato <denis.benato@linux.dev>
Date: Sat Feb 28 20:10:09 2026 +0100
HID: asus: do not abort probe when not necessary
[ Upstream commit 7253091766ded0fd81fe8d8be9b8b835495b06e8 ]
In order to avoid dereferencing a NULL pointer asus_probe is aborted early
and control of some asus devices is transferred over hid-generic after
erroring out even when such NULL dereference cannot happen: only early
abort when the NULL dereference can happen.
Also make the code shorter and more adherent to coding standards
removing square brackets enclosing single-line if-else statements.
Fixes: d3af6ca9a8c3 ("HID: asus: fix UAF via HID_CLAIMED_INPUT validation")
Signed-off-by: Denis Benato <denis.benato@linux.dev>
Signed-off-by: Jiri Kosina <jkosina@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Denis Benato <denis.benato@linux.dev>
Date: Sat Feb 28 20:10:07 2026 +0100
HID: asus: make asus_resume adhere to linux kernel coding standards
[ Upstream commit 51d33b42b8ae23da92819d28439fdd5636c45186 ]
Linux kernel coding standars requires functions opening brackets to be in
a newline: move the opening bracket of asus_resume in its own line.
Fixes: 546edbd26cff ("HID: hid-asus: reset the backlight brightness level on resume")
Signed-off-by: Denis Benato <denis.benato@linux.dev>
Signed-off-by: Jiri Kosina <jkosina@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Oliver Neukum <oneukum@suse.com>
Date: Tue Mar 24 15:24:54 2026 +0100
HID: usbhid: fix deadlock in hid_post_reset()
[ Upstream commit 8df2c1b47ee3cd50fd454f75c7a7e2ae8a6adf72 ]
You can build a USB device that includes a HID component
and a storage or UAS component. The components can be reset
only together. That means that hid_pre_reset() and hid_post_reset()
are in the block IO error handling. Hence no memory allocation
used in them may do block IO because the IO can deadlock
on the mutex held while resetting a device and calling the
interface drivers.
Use GFP_NOIO for all allocations in them.
Fixes: dc3c78e434690 ("HID: usbhid: Check HID report descriptor contents after device reset")
Signed-off-by: Oliver Neukum <oneukum@suse.com>
Signed-off-by: Jiri Kosina <jkosina@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Peter Zijlstra <peterz@infradead.org>
Date: Tue Feb 24 17:35:37 2026 +0100
hrtimer: Avoid pointless reprogramming in __hrtimer_start_range_ns()
[ Upstream commit d19ff16c11db38f3ee179d72751fb9b340174330 ]
Much like hrtimer_reprogram(), skip programming if the cpu_base is running
the hrtimer interrupt.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Juri Lelli <juri.lelli@redhat.com>
Reviewed-by: Thomas Gleixner <tglx@kernel.org>
Link: https://patch.msgid.link/20260224163429.069535561@kernel.org
Stable-dep-of: f2e388a019e4 ("hrtimer: Reduce trace noise in hrtimer_start()")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Thomas Gleixner <tglx@kernel.org>
Date: Tue Feb 24 17:36:59 2026 +0100
hrtimer: Reduce trace noise in hrtimer_start()
[ Upstream commit f2e388a019e4cf83a15883a3d1f1384298e9a6aa ]
hrtimer_start() when invoked with an already armed timer traces like:
<comm>-.. [032] d.h2. 5.002263: hrtimer_cancel: hrtimer= ....
<comm>-.. [032] d.h1. 5.002263: hrtimer_start: hrtimer= ....
Which is incorrect as the timer doesn't get canceled. Just the expiry time
changes. The internal dequeue operation which is required for that is not
really interesting for trace analysis. But it makes it tedious to keep real
cancellations and the above case apart.
Remove the cancel tracing in hrtimer_start() and add a 'was_armed'
indicator to the hrtimer start tracepoint, which clearly indicates what the
state of the hrtimer is when hrtimer_start() is invoked:
<comm>-.. [032] d.h1. 6.200103: hrtimer_start: hrtimer= .... was_armed=0
<comm>-.. [032] d.h1. 6.200558: hrtimer_start: hrtimer= .... was_armed=1
Fixes: c6a2a1770245 ("hrtimer: Add tracepoint for hrtimers")
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20260224163430.208491877@kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Richard Clark <richard.xnu.clark@gmail.com>
Date: Tue Dec 24 15:57:03 2024 +0800
hrtimers: Update the return type of enqueue_hrtimer()
[ Upstream commit da7100d3bf7d6f5c49ef493ea963766898e9b069 ]
The return type should be 'bool' instead of 'int' according to the calling
context in the kernel, and its internal implementation, i.e. :
return timerqueue_add();
which is a bool-return function.
[ tglx: Adjust function arguments ]
Signed-off-by: Richard Clark <richard.xnu.clark@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/Z2ppT7me13dtxm1a@MBC02GN1V4Q05P
Stable-dep-of: f2e388a019e4 ("hrtimer: Reduce trace noise in hrtimer_start()")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Francesco Lavra <flavra@baylibre.com>
Date: Wed Nov 26 11:46:18 2025 +0100
hte: tegra194: remove Kconfig dependency on Tegra194 SoC
[ Upstream commit 92dfd92f747698352b256cd9ddd7497bb7ebe9c8 ]
This driver runs also on other Tegra SoCs (e.g. Tegra234).
Replace Kconfig dependency on Tegra194 with more generic dependency on
Tegra, and amend the Kconfig help text to reflect the fact that this
driver works on SoCs other than Tegra194.
Fixes: b003fb5c9df8 ("hte: Add Tegra234 provider")
Signed-off-by: Francesco Lavra <flavra@baylibre.com>
Acked-by: Dipen Patel <dipenp@nvidia.com>
Signed-off-by: Dipen Patel <dipenp@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Billy Tsai <billy_tsai@aspeedtech.com>
Date: Mon Mar 9 10:33:24 2026 +0800
hwmon: (aspeed-g6-pwm-tach): remove redundant driver remove callback
[ Upstream commit 46fef8583daa1bf78fda7eaa523c64d4440322ac ]
Drops the remove callback as it only asserts reset and the probe already
registers a devres action (devm_add_action_or_reset()) to call
aspeed_pwm_tach_reset_assert().
Fixes: 7e1449cd15d1 ("hwmon: (aspeed-g6-pwm-tacho): Support for ASPEED g6 PWM/Fan tach")
Signed-off-by: Billy Tsai <billy_tsai@aspeedtech.com>
Link: https://lore.kernel.org/r/20260309-pwm_fixes-v2-1-ca9768e70470@aspeedtech.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Date: Thu Oct 17 17:59:01 2024 +0200
hwmon: Switch back to struct platform_driver::remove()
[ Upstream commit 6126f7bb6075d0af577e55bf7e2cbbcc272f520b ]
After commit 0edb555a65d1 ("platform: Make platform_driver::remove()
return void") .remove() is (again) the right callback to implement for
platform drivers.
Convert all platform drivers below drivers/hwmonto use .remove(), with
the eventual goal to drop struct platform_driver::remove_new(). As
.remove() and .remove_new() have the same prototypes, conversion is done
by just changing the structure member name in the driver initializer.
While touching these files, make indention of the struct initializer
consistent in several files.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Message-ID: <20241017155900.137357-2-u.kleine-koenig@baylibre.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Stable-dep-of: 46fef8583daa ("hwmon: (aspeed-g6-pwm-tach): remove redundant driver remove callback")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Felix Gu <ustc.gu@gmail.com>
Date: Sat Apr 4 18:32:30 2026 +0800
i3c: dw: Fix memory leak in dw_i3c_master_i3c_xfers()
[ Upstream commit 256cc1f1305a8e5dcadf8ca208d04a3acadd26f1 ]
The dw_i3c_master_i3c_xfers() function allocates memory for the xfer
structure using dw_i3c_master_alloc_xfer(). If pm_runtime_resume_and_get()
fails, the function returns without freeing the allocated xfer, resulting
in a memory leak.
Since dw_i3c_master_free_xfer() is a thin wrapper around kfree(), use
the __free(kfree) cleanup attribute to handle the free automatically on
all exit paths.
Fixes: 62fe9d06f570 ("i3c: dw: Add power management support")
Signed-off-by: Felix Gu <ustc.gu@gmail.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Link: https://patch.msgid.link/20260404-dw-i3c-2-v3-1-8f7d146549c1@gmail.com
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Felix Gu <ustc.gu@gmail.com>
Date: Fri Mar 20 22:18:02 2026 +0800
i3c: master: dw-i3c: Fix missing reset assertion in remove() callback
[ Upstream commit bef1eef667186cedb0bc6d152464acb3c97d5f72 ]
The reset line acquired during probe is currently left deasserted when
the driver is unbound.
Switch to devm_reset_control_get_optional_exclusive_deasserted() to
ensure the reset is automatically re-asserted by the devres core when
the driver is removed.
Fixes: 62fe9d06f570 ("i3c: dw: Add power management support")
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Signed-off-by: Felix Gu <ustc.gu@gmail.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Link: https://patch.msgid.link/20260320-dw-i3c-v3-1-477040c2e3f5@gmail.com
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Billy Tsai <billy_tsai@aspeedtech.com>
Date: Tue Apr 7 16:53:23 2026 +0800
i3c: mipi-i3c-hci: fix IBI payload length calculation for final status
[ Upstream commit d35a6db887eeae7c57b719521e39d64f929c6dc3 ]
In DMA mode, the IBI status descriptor encodes the payload using
CHUNKS (number of chunks) and DATA_LENGTH (valid bytes in the last
chunk). All preceding chunks are implicitly full-sized.
The current code accumulates full chunk sizes for non-final status
descriptors, but for the final status descriptor it only adds
DATA_LENGTH. This ignores the contribution of the preceding full
chunks described by the same final status entry.
As a result, the computed IBI payload length is truncated whenever
the final status spans multiple chunks. For example, with a chunk
size of 4 bytes, CHUNKS=2 and DATA_LENGTH=1 should result in a total
payload size of 5 bytes, but the current code reports only 1 byte.
Fix the calculation by adding the size of (CHUNKS - 1) full chunks
plus DATA_LENGTH for the last chunk.
Fixes: 9ad9a52cce28 ("i3c/master: introduce the mipi-i3c-hci driver")
Signed-off-by: Billy Tsai <billy_tsai@aspeedtech.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Link: https://patch.msgid.link/20260407-i3c-hci-dma-v2-1-a583187b9d22@aspeedtech.com
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Matt Vollrath <tactii@gmail.com>
Date: Wed May 6 14:48:11 2026 -0700
i40e: Cleanup PTP pins on probe failure
commit 678b713ece1e853f11e670a84cb887c35e1381b7 upstream.
PTP pin structs are allocated early in probe, but never cleaned up.
Fix this by calling i40e_ptp_free_pins in the error path.
To support this, i40e_ptp_free_pins is added to the header and
pin_config is correctly nullified after being freed.
This has been an issue since i40e_ptp_alloc_pins was introduced.
Fixes: 1050713026a08 ("i40e: add support for PTP external synchronization clock")
Reported-by: Kohei Enju <kohei@enjuk.jp>
Cc: stable@vger.kernel.org
Signed-off-by: Matt Vollrath <tactii@gmail.com>
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Reviewed-by: Kohei Enju <kohei@enjuk.jp>
Tested-by: Sunitha Mekala <sunithax.d.mekala@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20260506-jk-iwl-net-2026-05-04-v2-2-a5ea4dc837a9@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Kohei Enju <kohei@enjuk.jp>
Date: Thu Apr 16 17:53:33 2026 -0700
i40e: don't advertise IFF_SUPP_NOFCS
[ Upstream commit a24162f18825684ad04e3a5d0531f8a50d679347 ]
i40e advertises IFF_SUPP_NOFCS, allowing users to use the SO_NOFCS
socket option. However, this option is silently ignored, as the driver
does not check skb->no_fcs, and always enables FCS insertion offload.
Fix this by removing the advertisement of IFF_SUPP_NOFCS.
This behavior can be reproduced with a simple AF_PACKET socket:
import socket
s = socket.socket(socket.AF_PACKET, socket.SOCK_RAW)
s.setsockopt(socket.SOL_SOCKET, 43, 1) # SO_NOFCS
s.bind(("eth0", 0))
s.send(b'\xff' * 64)
Previously, send() succeeds but the driver ignores SO_NOFCS.
With this change, send() fails with -EPROTONOSUPPORT, as expected.
Fixes: 41c445ff0f48 ("i40e: main driver core")
Signed-off-by: Kohei Enju <kohei@enjuk.jp>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Tested-by: Sunitha Mekala <sunithax.d.mekala@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20260416-iwl-net-submission-2026-04-14-v2-9-686c33c9828d@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Petr Oros <poros@redhat.com>
Date: Mon Apr 27 22:22:16 2026 -0700
iavf: add VIRTCHNL_OP_ADD_VLAN to success completion handler
[ Upstream commit 34d33313b52eeac3a97ad2e3176d523ec70d9283 ]
The V1 ADD_VLAN opcode had no success handler; filters sent via V1
stayed in ADDING state permanently. Add a fallthrough case so V1
filters also transition ADDING -> ACTIVE on PF confirmation.
Critically, add an `if (v_retval) break` guard: the error switch in
iavf_virtchnl_completion() does NOT return after handling errors,
it falls through to the success switch. Without this guard, a
PF-rejected ADD would incorrectly mark ADDING filters as ACTIVE,
creating a driver/HW mismatch where the driver believes the filter
is installed but the PF never accepted it.
For V2, this is harmless: iavf_vlan_add_reject() in the error
block already kfree'd all ADDING filters, so the success handler
finds nothing to transition.
Fixes: 968996c070ef ("iavf: Fix VLAN_V2 addition/rejection")
Signed-off-by: Petr Oros <poros@redhat.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20260427-jk-iwl-net-petr-oros-fixes-v1-4-cdcb48303fd8@intel.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Petr Oros <poros@redhat.com>
Date: Mon Apr 27 22:22:13 2026 -0700
iavf: rename IAVF_VLAN_IS_NEW to IAVF_VLAN_ADDING
[ Upstream commit 70d62b669f1f9080a25278fc90b64309f4ae8959 ]
Rename the IAVF_VLAN_IS_NEW state to IAVF_VLAN_ADDING to better
describe what the state represents: an ADD request has been sent to
the PF and is waiting for a response.
This is a pure rename with no behavioral change, preparing for a
cleanup of the VLAN filter state machine.
Signed-off-by: Petr Oros <poros@redhat.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20260427-jk-iwl-net-petr-oros-fixes-v1-1-cdcb48303fd8@intel.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Stable-dep-of: f2ce65b9b917 ("iavf: stop removing VLAN filters from PF on interface down")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Petr Oros <poros@redhat.com>
Date: Mon Apr 27 22:22:14 2026 -0700
iavf: stop removing VLAN filters from PF on interface down
[ Upstream commit f2ce65b9b917474a1a6ce68d357e15fac2aca0f2 ]
When a VF goes down, the driver currently sends DEL_VLAN to the PF for
every VLAN filter (ACTIVE -> DISABLE -> send DEL -> INACTIVE), then
re-adds them all on UP (INACTIVE -> ADD -> send ADD -> ADDING ->
ACTIVE). This round-trip is unnecessary because:
1. The PF disables the VF's queues via VIRTCHNL_OP_DISABLE_QUEUES,
which already prevents all RX/TX traffic regardless of VLAN filter
state.
2. The VLAN filters remaining in PF HW while the VF is down is
harmless - packets matching those filters have nowhere to go with
queues disabled.
3. The DEL+ADD cycle during down/up creates race windows where the
VLAN filter list is incomplete. With spoofcheck enabled, the PF
enables TX VLAN filtering on the first non-zero VLAN add, blocking
traffic for any VLANs not yet re-added.
Remove the entire DISABLE/INACTIVE state machinery:
- Remove IAVF_VLAN_DISABLE and IAVF_VLAN_INACTIVE enum values
- Remove iavf_restore_filters() and its call from iavf_open()
- Remove VLAN filter handling from iavf_clear_mac_vlan_filters(),
rename it to iavf_clear_mac_filters()
- Remove DEL_VLAN_FILTER scheduling from iavf_down()
- Remove all DISABLE/INACTIVE handling from iavf_del_vlans()
VLAN filters now stay ACTIVE across down/up cycles. Only explicit
user removal (ndo_vlan_rx_kill_vid) or PF/VF reset triggers VLAN
filter deletion/re-addition.
Fixes: ed1f5b58ea01 ("i40evf: remove VLAN filters on close")
Signed-off-by: Petr Oros <poros@redhat.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20260427-jk-iwl-net-petr-oros-fixes-v1-2-cdcb48303fd8@intel.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Petr Oros <poros@redhat.com>
Date: Mon Apr 27 22:22:15 2026 -0700
iavf: wait for PF confirmation before removing VLAN filters
[ Upstream commit bbcbe4ed70dea948849549af7edf44bd42bbd695 ]
The VLAN filter DELETE path was asymmetric with the ADD path: ADD
waits for PF confirmation (ADD -> ADDING -> ACTIVE), but DELETE
immediately frees the filter struct after sending the DEL message
without waiting for the PF response.
This is problematic because:
- If the PF rejects the DEL, the filter remains in HW but the driver
has already freed the tracking structure, losing sync.
- Race conditions between DEL pending and other operations
(add, reset) cannot be properly resolved if the filter struct
is already gone.
Add IAVF_VLAN_REMOVING state to make the DELETE path symmetric:
REMOVE -> REMOVING (send DEL) -> PF confirms -> kfree
-> PF rejects -> ACTIVE
In iavf_del_vlans(), transition filters from REMOVE to REMOVING
instead of immediately freeing them. The new DEL completion handler
in iavf_virtchnl_completion() frees filters on success or reverts
them to ACTIVE on error.
Update iavf_add_vlan() to handle the REMOVING state: if a DEL is
pending and the user re-adds the same VLAN, queue it for ADD so
it gets re-programmed after the PF processes the DEL.
The !VLAN_FILTERING_ALLOWED early-exit path still frees filters
directly since no PF message is sent in that case.
Also update iavf_del_vlan() to skip filters already in REMOVING
state: DEL has been sent to PF and the completion handler will
free the filter when PF confirms. Without this guard, the sequence
DEL(pending) -> user-del -> second DEL could cause the PF to return
an error for the second DEL (filter already gone), causing the
completion handler to incorrectly revert a deleted filter back to
ACTIVE.
Fixes: 968996c070ef ("iavf: Fix VLAN_V2 addition/rejection")
Signed-off-by: Petr Oros <poros@redhat.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20260427-jk-iwl-net-petr-oros-fixes-v1-3-cdcb48303fd8@intel.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Michal Schmidt <mschmidt@redhat.com>
Date: Thu Apr 16 17:53:28 2026 -0700
ice: fix double-free of tx_buf skb
[ Upstream commit 1a303baa715e6b78d6a406aaf335f87ff35acfcd ]
If ice_tso() or ice_tx_csum() fail, the error path in
ice_xmit_frame_ring() frees the skb, but the 'first' tx_buf still points
to it and is marked as valid (ICE_TX_BUF_SKB).
'next_to_use' remains unchanged, so the potential problem will
likely fix itself when the next packet is transmitted and the tx_buf
gets overwritten. But if there is no next packet and the interface is
brought down instead, ice_clean_tx_ring() -> ice_unmap_and_free_tx_buf()
will find the tx_buf and free the skb for the second time.
The fix is to reset the tx_buf type to ICE_TX_BUF_EMPTY in the error
path, so that ice_unmap_and_free_tx_buf().
Move the initialization of 'first' up, to ensure it's already valid in
case we hit the linearization error path.
The bug was spotted by AI while I had it looking for something else.
It also proposed an initial version of the patch.
I reproduced the bug and tested the fix by adding code to inject
failures, on a build with KASAN.
I looked for similar bugs in related Intel drivers and did not find any.
Fixes: d76a60ba7afb ("ice: Add support for VLANs and offloads")
Assisted-by: Claude:claude-4.6-opus-high Cursor
Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20260416-iwl-net-submission-2026-04-14-v2-4-686c33c9828d@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Paul Greenwalt <paul.greenwalt@intel.com>
Date: Thu Apr 16 17:53:30 2026 -0700
ice: fix ICE_AQ_LINK_SPEED_M for 200G
[ Upstream commit 4a3a940059e98539de293a6e36e464094c2e875b ]
When setting PHY configuration during driver initialization, 200G link
speed is not being advertised even when the PHY is capable. This is
because the get PHY capabilities link speed response is being masked by
ICE_AQ_LINK_SPEED_M, which does not include the 200G link speed bit.
ICE_AQ_LINK_SPEED_200GB is defined as BIT(11), but the mask 0x7FF only
covers bits 0-10. Fix ICE_AQ_LINK_SPEED_M to use GENMASK(11, 0) so
that it covers all defined link speed bits including 200G.
Fixes: 24407a01e57c ("ice: Add 200G speed/phy type use")
Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com>
Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Sunitha Mekala <sunithax.d.mekala@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20260416-iwl-net-submission-2026-04-14-v2-6-686c33c9828d@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jacob Keller <jacob.e.keller@intel.com>
Date: Mon Apr 20 17:51:28 2026 -0700
ice: fix ice_ptp_read_tx_hwtstamp_status_eth56g
[ Upstream commit 1f75dbc53f68f0fb2acd99f92315e426a3d0b446 ]
The ice_ptp_read_tx_hwtstamp_status_eth56g function calls
ice_read_phy_eth56g with a PHY index. However the function actually expects
a port index. This causes the function to read the wrong PHY_PTP_INT_STATUS
registers, and effectively makes the status wrong for the second set of
ports from 4 to 7.
The ice_read_phy_eth56g function uses the provided port index to determine
which PHY device to read. We could refactor the entire chain to take a PHY
index, but this would impact many code sites. Instead, multiply the PHY
index by the number of ports, so that we read from the first port of each
PHY.
Fixes: 7cab44f1c35f ("ice: Introduce ETH56G PHY model for E825C products")
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Reviewed-by: Petr Oros <poros@redhat.com>
Tested-by: Sunitha Mekala <sunithax.d.mekala@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20260420-jk-iwl-net-2026-04-20-ptp-e825c-phy-interrupt-fixes-v1-4-bc2240f42251@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Petr Oros <poros@redhat.com>
Date: Mon Apr 27 22:22:17 2026 -0700
ice: fix NULL pointer dereference in ice_reset_all_vfs()
[ Upstream commit 54ef02487914c24170c7e1c061e45212dc55365e ]
ice_reset_all_vfs() ignores the return value of ice_vf_rebuild_vsi().
When the VSI rebuild fails (e.g. during NVM firmware update via
nvmupdate64e), ice_vsi_rebuild() tears down the VSI on its error path,
leaving txq_map and rxq_map as NULL. The subsequent unconditional call
to ice_vf_post_vsi_rebuild() leads to a NULL pointer dereference in
ice_ena_vf_q_mappings() when it accesses vsi->txq_map[0].
The single-VF reset path in ice_reset_vf() already handles this
correctly by checking the return value of ice_vf_reconfig_vsi() and
skipping ice_vf_post_vsi_rebuild() on failure.
Apply the same pattern to ice_reset_all_vfs(): check the return value
of ice_vf_rebuild_vsi() and skip ice_vf_post_vsi_rebuild() and
ice_eswitch_attach_vf() on failure. The VF is left safely disabled
(ICE_VF_STATE_INIT not set, VFGEN_RSTAT not set to VFACTIVE) and can
be recovered via a VFLR triggered by a PCI reset of the VF
(sysfs reset or driver rebind).
Note that this patch does not prevent the VF VSI rebuild from failing
during NVM update — the underlying cause is firmware being in a
transitional state while the EMP reset is processed, which can cause
Admin Queue commands (ice_add_vsi, ice_cfg_vsi_lan) to fail. This
patch only prevents the subsequent NULL pointer dereference that
crashes the kernel when the rebuild does fail.
crash> bt
PID: 50795 TASK: ff34c9ee708dc680 CPU: 1 COMMAND: "kworker/u512:5"
#0 [ff72159bcfe5bb50] machine_kexec at ffffffffaa8850ee
#1 [ff72159bcfe5bba8] __crash_kexec at ffffffffaaa15fba
#2 [ff72159bcfe5bc68] crash_kexec at ffffffffaaa16540
#3 [ff72159bcfe5bc70] oops_end at ffffffffaa837eda
#4 [ff72159bcfe5bc90] page_fault_oops at ffffffffaa893997
#5 [ff72159bcfe5bce8] exc_page_fault at ffffffffab528595
#6 [ff72159bcfe5bd10] asm_exc_page_fault at ffffffffab600bb2
[exception RIP: ice_ena_vf_q_mappings+0x79]
RIP: ffffffffc0a85b29 RSP: ff72159bcfe5bdc8 RFLAGS: 00010206
RAX: 00000000000f0000 RBX: ff34c9efc9c00000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000010 RDI: ff34c9efc9c00000
RBP: ff34c9efc27d4828 R8: 0000000000000093 R9: 0000000000000040
R10: ff34c9efc27d4828 R11: 0000000000000040 R12: 0000000000100000
R13: 0000000000000010 R14: R15:
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
#7 [ff72159bcfe5bdf8] ice_sriov_post_vsi_rebuild at ffffffffc0a85e2e [ice]
#8 [ff72159bcfe5be08] ice_reset_all_vfs at ffffffffc0a920b4 [ice]
#9 [ff72159bcfe5be48] ice_service_task at ffffffffc0a31519 [ice]
#10 [ff72159bcfe5be88] process_one_work at ffffffffaa93dca4
#11 [ff72159bcfe5bec8] worker_thread at ffffffffaa93e9de
#12 [ff72159bcfe5bf18] kthread at ffffffffaa946663
#13 [ff72159bcfe5bf50] ret_from_fork at ffffffffaa8086b9
The panic occurs attempting to dereference the NULL pointer in RDX at
ice_sriov.c:294, which loads vsi->txq_map (offset 0x4b8 in ice_vsi).
The faulting VSI is an allocated slab object but not fully initialized
after a failed ice_vsi_rebuild():
crash> struct ice_vsi 0xff34c9efc27d4828
netdev = 0x0,
rx_rings = 0x0,
tx_rings = 0x0,
q_vectors = 0x0,
txq_map = 0x0,
rxq_map = 0x0,
alloc_txq = 0x10,
num_txq = 0x10,
alloc_rxq = 0x10,
num_rxq = 0x10,
The nvmupdate64e process was performing NVM firmware update:
crash> bt 0xff34c9edd1a30000
PID: 49858 TASK: ff34c9edd1a30000 CPU: 1 COMMAND: "nvmupdate64e"
#0 [ff72159bcd617618] __schedule at ffffffffab5333f8
#4 [ff72159bcd617750] ice_sq_send_cmd at ffffffffc0a35347 [ice]
#5 [ff72159bcd6177a8] ice_sq_send_cmd_retry at ffffffffc0a35b47 [ice]
#6 [ff72159bcd617810] ice_aq_send_cmd at ffffffffc0a38018 [ice]
#7 [ff72159bcd617848] ice_aq_read_nvm at ffffffffc0a40254 [ice]
#8 [ff72159bcd6178b8] ice_read_flat_nvm at ffffffffc0a4034c [ice]
#9 [ff72159bcd617918] ice_devlink_nvm_snapshot at ffffffffc0a6ffa5 [ice]
dmesg:
ice 0000:13:00.0: firmware recommends not updating fw.mgmt, as it
may result in a downgrade. continuing anyways
ice 0000:13:00.1: ice_init_nvm failed -5
ice 0000:13:00.1: Rebuild failed, unload and reload driver
Fixes: 12bb018c538c ("ice: Refactor VF reset")
Signed-off-by: Petr Oros <poros@redhat.com>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20260427-jk-iwl-net-petr-oros-fixes-v1-5-cdcb48303fd8@intel.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Grzegorz Nitka <grzegorz.nitka@intel.com>
Date: Mon Apr 20 17:51:25 2026 -0700
ice: fix timestamp interrupt configuration for E825C
[ Upstream commit c0a575a801a2040eb1e0db54b488f8c548c8458a ]
The E825C ice_phy_cfg_intr_eth56g() function is responsible for programming
the PHY interrupt for a given port. This function writes to the
PHY_REG_TS_INT_CONFIG register of the port. The register is responsible for
configuring whether the port interrupt logic is enabled, as well as
programming the threshold of waiting timestamps that will trigger an
interrupt from this port.
This threshold value must not be programmed to zero while the interrupt is
enabled. Doing so puts the port in a misconfigured state where the PHY
timestamp interrupt for the quad of connected ports will become stuck.
This occurs, because a threshold of zero results in the timestamp interrupt
status for the port becoming stuck high. The four ports in the connected
quad have their timestamp status indicators muxed together. A new interrupt
cannot be generated until the timestamp status indicators return low for
all four ports.
Normally, the timestamp status for a port will clear once there are fewer
timestamps in that ports timestamp memory bank than the threshold. A
threshold of zero makes this impossible, so the timestamp status for the
port does not clear.
The ice driver never intentionally programs the threshold to zero, indeed
the driver always programs it to a value of 1, intending to get an
interrupt immediately as soon as even a single packet is waiting for a
timestamp.
However, there is a subtle flaw in the programming logic in the
ice_phy_cfg_intr_eth56g() function. Due to the way that the hardware
handles enabling the PHY interrupt. If the threshold value is modified at
the same time as the interrupt is enabled, the HW PHY state machine might
enable the interrupt before the new threshold value is actually updated.
This leaves a potential race condition caused by the hardware logic where
a PHY timestamp interrupt might be triggered before the non-zero threshold
is written, resulting in the PHY timestamp logic becoming stuck.
Once the PHY timestamp status is stuck high, it will remain stuck even
after attempting to reprogram the PHY block by changing its threshold or
disabling the interrupt. Even a typical PF or CORE reset will not reset the
particular block of the PHY that becomes stuck. Even a warm power cycle is
not guaranteed to cause the PHY block to reset, and a cold power cycle is
required.
Prevent this by always writing the PHY_REG_TS_INT_CONFIG in two stages.
First write the threshold value with the interrupt disabled, and only write
the enable bit after the threshold has been programmed. When disabling the
interrupt, leave the threshold unchanged. Additionally, re-read the
register after writing it to guarantee that the write to the PHY has been
flushed upon exit of the function.
While we're modifying this function implementation, explicitly reject
programming a threshold of 0 when enabling the interrupt. No caller does
this today, but the consequences of doing so are significant. An explicit
rejection in the code makes this clear.
Fixes: 7cab44f1c35f ("ice: Introduce ETH56G PHY model for E825C products")
Signed-off-by: Grzegorz Nitka <grzegorz.nitka@intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Reviewed-by: Petr Oros <poros@redhat.com>
Tested-by: Sunitha Mekala <sunithax.d.mekala@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20260420-jk-iwl-net-2026-04-20-ptp-e825c-phy-interrupt-fixes-v1-1-bc2240f42251@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Alice Mikityanska <alice@isovalent.com>
Date: Thu Feb 5 15:39:20 2026 +0200
ice: Remove jumbo_remove step from TX path
[ Upstream commit 8b76102c5e00d1f090e0c31d17b060c76d8fa859 ]
Now that the kernel doesn't insert HBH for BIG TCP IPv6 packets, remove
unnecessary steps from the ice TX path, that used to check and remove
HBH.
Signed-off-by: Alice Mikityanska <alice@isovalent.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260205133925.526371-8-alice.kernel@fastmail.im
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Stable-dep-of: 1a303baa715e ("ice: fix double-free of tx_buf skb")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Grzegorz Nitka <grzegorz.nitka@intel.com>
Date: Thu Apr 16 17:53:26 2026 -0700
ice: update PCS latency settings for E825 10G/25Gb modes
[ Upstream commit 05567e4052732d70c7ff9655217b3d14d25f639a ]
Update MAC Rx/Tx offset registers settings (PHY_MAC_[RX|TX]_OFFSET
registers) with the data obtained with the latest research. It applies
to PCS latency settings for the following speeds/modes:
* 10Gb NO-FEC
- TX latency changed from 71.25 ns to 73 ns
- RX latency changed from -25.6 ns to -28 ns
* 25Gb NO-FEC
- TX latency changed from 28.17 ns to 33 ns
- RX latency changed from -12.45 ns to -12 ns
* 25Gb RS-FEC
- TX latency changed from 64.5 ns to 69 ns
- RX latency changed from -3.6 ns to -3 ns
The original data came from simulation and pre-production hardware.
The new data measures the actual delays and as such is more accurate.
Fixes: 7cab44f1c35f ("ice: Introduce ETH56G PHY model for E825C products")
Co-developed-by: Zoltan Fodor <zoltan.fodor@intel.com>
Signed-off-by: Zoltan Fodor <zoltan.fodor@intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Grzegorz Nitka <grzegorz.nitka@intel.com>
Tested-by: Sunitha Mekala <sunithax.d.mekala@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20260416-iwl-net-submission-2026-04-14-v2-2-686c33c9828d@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Daniel Hodges <hodgesd@meta.com>
Date: Sat Jan 31 18:40:15 2026 -0800
ima: check return value of crypto_shash_final() in boot aggregate
[ Upstream commit 870819434c8dfcc3158033b66e7851b81bb17e21 ]
The return value of crypto_shash_final() is not checked in
ima_calc_boot_aggregate_tfm(). If the hash finalization fails, the
function returns success and a corrupted boot aggregate digest could
be used for IMA measurements.
Capture the return value and propagate any error to the caller.
Fixes: 76bb28f6126f ("ima: use new crypto_shash API instead of old crypto_hash")
Signed-off-by: Daniel Hodges <hodgesd@meta.com>
Reviewed-by: Roberto Sassu <roberto.sassu@huawei.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Dmitry Safonov <0x7f454c46@gmail.com>
Date: Tue Mar 10 17:40:39 2026 +0000
ima_fs: Correctly create securityfs files for unsupported hash algos
[ Upstream commit d7bd8cf0b348d3edae7bee33e74a32b21668b181 ]
ima_tpm_chip->allocated_banks[i].crypto_id is initialized to
HASH_ALGO__LAST if the TPM algorithm is not supported. However there
are places relying on the algorithm to be valid because it is accessed
by hash_algo_name[].
On 6.12.40 I observe the following read out-of-bounds in hash_algo_name:
==================================================================
BUG: KASAN: global-out-of-bounds in create_securityfs_measurement_lists+0x396/0x440
Read of size 8 at addr ffffffff83e18138 by task swapper/0/1
CPU: 4 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.12.40 #3
Call Trace:
<TASK>
dump_stack_lvl+0x61/0x90
print_report+0xc4/0x580
? kasan_addr_to_slab+0x26/0x80
? create_securityfs_measurement_lists+0x396/0x440
kasan_report+0xc2/0x100
? create_securityfs_measurement_lists+0x396/0x440
create_securityfs_measurement_lists+0x396/0x440
ima_fs_init+0xa3/0x300
ima_init+0x7d/0xd0
init_ima+0x28/0x100
do_one_initcall+0xa6/0x3e0
kernel_init_freeable+0x455/0x740
kernel_init+0x24/0x1d0
ret_from_fork+0x38/0x80
ret_from_fork_asm+0x11/0x20
</TASK>
The buggy address belongs to the variable:
hash_algo_name+0xb8/0x420
Memory state around the buggy address:
ffffffff83e18000: 00 01 f9 f9 f9 f9 f9 f9 00 01 f9 f9 f9 f9 f9 f9
ffffffff83e18080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>ffffffff83e18100: 00 00 00 00 00 00 00 f9 f9 f9 f9 f9 00 05 f9 f9
^
ffffffff83e18180: f9 f9 f9 f9 00 00 00 00 00 00 00 04 f9 f9 f9 f9
ffffffff83e18200: 00 00 00 00 00 00 00 00 04 f9 f9 f9 f9 f9 f9 f9
==================================================================
Seems like the TPM chip supports sha3_256, which isn't yet in
tpm_algorithms:
tpm tpm0: TPM with unsupported bank algorithm 0x0027
That's TPM_ALG_SHA3_256 == 0x0027 from "Trusted Platform Module 2.0
Library Part 2: Structures", page 51 [1].
See also the related U-Boot algorithms update [2].
Thus solve the problem by creating a file name with "_tpm_alg_<ID>"
postfix if the crypto algorithm isn't initialized.
This is how it looks on the test machine (patch ported to v6.12 release):
# ls -1 /sys/kernel/security/ima/
ascii_runtime_measurements
ascii_runtime_measurements_tpm_alg_27
ascii_runtime_measurements_sha1
ascii_runtime_measurements_sha256
binary_runtime_measurements
binary_runtime_measurements_tpm_alg_27
binary_runtime_measurements_sha1
binary_runtime_measurements_sha256
policy
runtime_measurements_count
violations
[1]: https://trustedcomputinggroup.org/wp-content/uploads/Trusted-Platform-Module-2.0-Library-Part-2-Version-184_pub.pdf
[2]: https://lists.denx.de/pipermail/u-boot/2024-July/558835.html
Fixes: 9fa8e7625008 ("ima: add crypto agility support for template-hash algorithm")
Signed-off-by: Dmitry Safonov <dima@arista.com>
Cc: Enrico Bravi <enrico.bravi@polito.it>
Cc: Silvia Sisinni <silvia.sisinni@polito.it>
Cc: Roberto Sassu <roberto.sassu@huawei.com>
Cc: Mimi Zohar <zohar@linux.ibm.com>
Reviewed-by: Roberto Sassu <roberto.sassu@huawei.com>
Tested-by: Roberto Sassu <roberto.sassu@huawei.com>
Link: https://github.com/linux-integrity/linux/issues/14
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Al Viro <viro@zeniv.linux.org.uk>
Date: Mon May 13 23:41:51 2024 -0600
ima_fs: don't bother with removal of files in directory we'll be removing
[ Upstream commit 22260a99d791163f7697a240dfc48e4e5a91ecfe ]
removal of parent takes all children out
Acked-by: Mimi Zohar <zohar@linux.ibm.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Stable-dep-of: d7bd8cf0b348 ("ima_fs: Correctly create securityfs files for unsupported hash algos")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Al Viro <viro@zeniv.linux.org.uk>
Date: Mon Mar 10 12:30:20 2025 -0400
ima_fs: get rid of lookup-by-dentry stuff
[ Upstream commit d15ffbbf4d32a9007c4a339a9fecac90ce30432a ]
lookup_template_data_hash_algo() machinery is used to locate the
matching ima_algo_array[] element at read time; securityfs
allows to stash that into inode->i_private at object creation
time, so there's no need to bother
Acked-by: Mimi Zohar <zohar@linux.ibm.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Stable-dep-of: d7bd8cf0b348 ("ima_fs: Correctly create securityfs files for unsupported hash algos")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Nicholas Carlini <nicholas@carlini.com>
Date: Mon May 11 18:02:16 2026 +0000
io-wq: check that the predecessor is hashed in io_wq_remove_pending()
commit d6a2d7b04b5a093021a7a0e2e69e9d5237dfa8cc upstream.
io_wq_remove_pending() needs to fix up wq->hash_tail[] if the cancelled
work was the tail of its hash bucket. When doing this, it checks whether
the preceding entry in acct->work_list has the same hash value, but
never checks that the predecessor is hashed at all. io_get_work_hash()
is simply atomic_read(&work->flags) >> IO_WQ_HASH_SHIFT, and the hash
bits are never set for non-hashed work, so it returns 0. Thus, when a
hashed bucket-0 work is cancelled while a non-hashed work is its list
predecessor, the check spuriously passes and a pointer to the non-hashed
io_kiocb is stored in wq->hash_tail[0].
Because non-hashed work is dequeued via the fast path in
io_get_next_work(), which never touches hash_tail[], the stale pointer
is never cleared. Therefore, after the non-hashed io_kiocb completes and
is freed back to req_cachep, wq->hash_tail[0] is a dangling pointer. The
io_wq is per-task (tctx->io_wq) and survives ring open/close, so the
dangling pointer persists for the lifetime of the task; the next hashed
bucket-0 enqueue dereferences it in io_wq_insert_work() and
wq_list_add_after() writes through freed memory.
Add the missing io_wq_is_hashed() check so a non-hashed predecessor
never inherits a hash_tail[] slot.
Cc: stable@vger.kernel.org
Fixes: 204361a77f40 ("io-wq: fix hang after cancelling pending hashed work")
Signed-off-by: Nicholas Carlini <nicholas@carlini.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Pavel Begunkov <asml.silence@gmail.com>
Date: Tue May 13 18:26:47 2025 +0100
io_uring/kbuf: use mem_is_zero()
commit 1724849072854a66861d461b298b04612702d685 upstream.
Make use of mem_is_zero() for reserved fields checking.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/11fe27b7a831329bcdb4ea087317ef123ba7c171.1747150490.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Vasant Hegde <vasant.hegde@amd.com>
Date: Wed Oct 30 06:35:53 2024 +0000
iommu/amd: Convert dev_data lock from spinlock to mutex
[ Upstream commit e843aedbeb82b17a5fe6172449bff133fc8b68a1 ]
Currently in attach device path it takes dev_data->spinlock. But as per
design attach device path can sleep. Also if device is PRI capable then
it adds device to IOMMU fault handler queue which takes mutex. Hence
currently PRI enablement is done outside dev_data lock.
Covert dev_data lock from spinlock to mutex so that it follows the
design and also PRI enablement can be done properly.
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Joerg Roedel <jroedel@suse.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20241030063556.6104-10-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Stable-dep-of: faad224fe0f0 ("iommu/amd: Fix clone_alias() to use the original device's devid")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Vasant Hegde <vasant.hegde@amd.com>
Date: Wed Oct 30 06:35:50 2024 +0000
iommu/amd: Do not detach devices in domain free path
[ Upstream commit 07bbd660dbd6ff03907d9ddbdfe9deabbd18ac4d ]
All devices attached to a protection domain must be freed before
calling domain free. Hence do not try to free devices in domain
free path. Continue to throw warning if pdom->dev_list is not empty
so that any potential issues can be fixed.
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Joerg Roedel <jroedel@suse.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20241030063556.6104-7-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Stable-dep-of: faad224fe0f0 ("iommu/amd: Fix clone_alias() to use the original device's devid")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Vasant Hegde <vasant.hegde@amd.com>
Date: Wed Apr 1 08:00:17 2026 +0000
iommu/amd: Fix clone_alias() to use the original device's devid
[ Upstream commit faad224fe0f0857a04ff2eb3c90f0de57f47d0f3 ]
Currently clone_alias() assumes first argument (pdev) is always the
original device pointer. This function is called by
pci_for_each_dma_alias() which based on topology decides to send
original or alias device details in first argument.
This meant that the source devid used to look up and copy the DTE
may be incorrect, leading to wrong or stale DTE entries being
propagated to alias device.
Fix this by passing the original pdev as the opaque data argument to
both the direct clone_alias() call and pci_for_each_dma_alias(). Inside
clone_alias(), retrieve the original device from data and compute devid
from it.
Fixes: 3332364e4ebc ("iommu/amd: Support multiple PCI DMA aliases in device table")
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Date: Mon Nov 18 05:49:34 2024 +0000
iommu/amd: Introduce helper function get_dte256()
[ Upstream commit a2ce608a1eb65c2af99c58b63eae557165a0da87 ]
And use it in clone_alias() along with update_dte256().
Also use get_dte256() in dump_dte_entry().
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Link: https://lore.kernel.org/r/20241118054937.5203-7-suravee.suthikulpanit@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Stable-dep-of: faad224fe0f0 ("iommu/amd: Fix clone_alias() to use the original device's devid")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Date: Mon Nov 18 05:49:32 2024 +0000
iommu/amd: Introduce helper function to update 256-bit DTE
[ Upstream commit 8b3f78733814b180089a400743b6f19d118aec62 ]
The current implementation does not follow 128-bit write requirement
to update DTE as specified in the AMD I/O Virtualization Techonology
(IOMMU) Specification.
Therefore, modify the struct dev_table_entry to contain union of u128 data
array, and introduce a helper functions update_dte256() to update DTE using
two 128-bit cmpxchg operations to update 256-bit DTE with the modified
structure, and take into account the DTE[V, GV] bits when programming
the DTE to ensure proper order of DTE programming and flushing.
In addition, introduce a per-DTE spin_lock struct dev_data.dte_lock to
provide synchronization when updating the DTE to prevent cmpxchg128
failure.
Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Suggested-by: Uros Bizjak <ubizjak@gmail.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Uros Bizjak <ubizjak@gmail.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Link: https://lore.kernel.org/r/20241118054937.5203-5-suravee.suthikulpanit@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Stable-dep-of: faad224fe0f0 ("iommu/amd: Fix clone_alias() to use the original device's devid")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jason Gunthorpe <jgg@ziepe.ca>
Date: Thu Dec 5 20:13:41 2024 -0400
iommu/amd: Put list_add/del(dev_data) back under the domain->lock
[ Upstream commit 4a552f7890f0870f6d9fd4fbc6c05cea7bfd4503 ]
The list domain->dev_list is protected by the domain->lock spinlock.
Any iteration, addition or removal must be under the lock.
Move the list_del() up into the critical section. pdom_is_sva_capable(),
and destroy_gcr3_table() do not interact with the list element.
Wrap the list_add() in a lock, it would make more sense if this was under
the same critical section as adjusting the refcounts earlier, but that
requires more complications.
Fixes: d6b47dec3684 ("iommu/amd: Reduce domain lock scope in attach device path")
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Vasant Hegde <vasant.hegde@amd.com>
Link: https://lore.kernel.org/r/1-v1-3b9edcf8067d+3975-amd_dev_list_locking_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Vasant Hegde <vasant.hegde@amd.com>
Date: Wed Oct 30 06:35:52 2024 +0000
iommu/amd: Rearrange attach device code
[ Upstream commit 4b18ef8491b06e353e8801705092cc292582cb7a ]
attach_device() is just holding lock and calling do_attach(). There is
not need to have another function. Just move do_attach() code to
attach_device(). Similarly move do_detach() code to detach_device().
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Joerg Roedel <jroedel@suse.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20241030063556.6104-9-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Stable-dep-of: faad224fe0f0 ("iommu/amd: Fix clone_alias() to use the original device's devid")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Vasant Hegde <vasant.hegde@amd.com>
Date: Wed Oct 30 06:35:51 2024 +0000
iommu/amd: Reduce domain lock scope in attach device path
[ Upstream commit d6b47dec368400a62d2b9d44c8e136fc15eac72c ]
Currently attach device path takes protection domain lock followed by
dev_data lock. Most of the operations in this function is specific to
device data except pdom_attach_iommu() where it updates protection
domain structure. Hence reduce the scope of protection domain lock.
Note that this changes the locking order. Now it takes device lock
before taking doamin lock (group->mutex -> dev_data->lock ->
pdom->lock). dev_data->lock is used only in device attachment path.
So changing order is fine. It will not create any issue.
Finally move numa node assignment to pdom_attach_iommu().
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20241030063556.6104-8-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Stable-dep-of: faad224fe0f0 ("iommu/amd: Fix clone_alias() to use the original device's devid")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Vasant Hegde <vasant.hegde@amd.com>
Date: Wed Oct 30 06:35:47 2024 +0000
iommu/amd: Remove protection_domain.dev_cnt variable
[ Upstream commit 743a4bae9fa1480e5f6837f6a55be918d6cd0e16 ]
protection_domain->dev_list tracks list of attached devices to
domain. We can use list_* functions on dev_list to get device count.
Hence remove 'dev_cnt' variable.
No functional change intended.
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Reviewed-by: Joerg Roedel <jroedel@suse.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20241030063556.6104-4-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Stable-dep-of: faad224fe0f0 ("iommu/amd: Fix clone_alias() to use the original device's devid")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Vasant Hegde <vasant.hegde@amd.com>
Date: Wed Oct 30 06:35:54 2024 +0000
iommu/amd: Reorder attach device code
[ Upstream commit 0b136493d3ffa1358783dcf5b9f866ceef2ff122 ]
Ideally in attach device path, it should take dev_data lock before
making changes to device data including IOPF enablement. So far dev_data
was using spinlock and it was hitting lock order issue when it tries to
enable IOPF. Hence Commit 526606b0a199 ("iommu/amd: Fix Invalid wait
context issue") moved IOPF enablement outside dev_data->lock.
Previous patch converted dev_data lock to mutex. Now its safe to call
amd_iommu_iopf_add_device() with dev_data->mutex. Hence move back PCI
device capability enablement (ATS, PRI, PASID) and IOPF enablement code
inside the lock. Also in attach_device(), update 'dev_data->domain' at
the end so that error handling becomes simple.
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20241030063556.6104-11-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Stable-dep-of: 4a552f7890f0 ("iommu/amd: Put list_add/del(dev_data) back under the domain->lock")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Vasant Hegde <vasant.hegde@amd.com>
Date: Wed Oct 30 06:35:48 2024 +0000
iommu/amd: xarray to track protection_domain->iommu list
[ Upstream commit d16041124de1dea4389b5e6b330657f34f8c0492 ]
Use xarray to track IOMMU attached to protection domain instead of
static array of MAX_IOMMUS. Also add lockdep assertion.
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Joerg Roedel <jroedel@suse.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20241030063556.6104-5-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Stable-dep-of: faad224fe0f0 ("iommu/amd: Fix clone_alias() to use the original device's devid")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Nicolin Chen <nicolinc@nvidia.com>
Date: Thu Mar 12 17:36:34 2026 -0700
iommu/tegra241-cmdqv: Set supports_cmd op in tegra241_vcmdq_hw_init()
[ Upstream commit 803e41f36d227022ab9bbe780c82283fd4713b2e ]
vintf->hyp_own is finalized in tegra241_vintf_hw_init(). On the other hand,
tegra241_vcmdq_alloc_smmu_cmdq() is called via an init_structures callback,
which is earlier than tegra241_vintf_hw_init().
This results in the supports_cmd op always being set to the guest function,
although this doesn't break any functionality nor have some noticeable perf
impact since non-invalidation commands are not issued in the perf sensitive
context.
Fix this by moving supports_cmd to tegra241_vcmdq_hw_init().
After this change,
- For a guest kernel, this will be a status quo
- For a host kernel, non-invalidation commands will be issued to VCMDQ(s)
Fixes: a9d40285bdef ("iommu/tegra241-cmdqv: Limit CMDs for VCMDQs of a guest owned VINTF")
Reported-by: Eric Auger <eric.auger@redhat.com>
Reported-by: Shameer Kolothum <skolothumtho@nvidia.com>
Closes: https://lore.kernel.org/qemu-devel/CH3PR12MB754836BEE54E39B30C7210C0AB44A@CH3PR12MB7548.namprd12.prod.outlook.com/
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Shameer Kolothum <skolothumtho@nvidia.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Naval Alcalá <ari@naval.cat>
Date: Sat May 9 10:43:44 2026 +0800
iommu/vt-d: Disable DMAR for Intel Q35 IGFX
commit 2cda2e10dc8343ae01eae9e999a876b7e7d37861 upstream.
Intel Q35 integrated graphics (8086:29b2) exhibits broken DMAR
behaviour similar to other G4x/GM45 devices for which DMAR is
already disabled via quirks.
When DMAR is enabled, the system may hard lock up during boot or
early device initialization, requiring a reset.
Add the missing PCI ID to the existing quirk list to disable
DMAR for this device.
Fixes: 1f76249cc3be ("iommu/vt-d: Declare Broadwell igfx dmar support snafu")
Cc: stable@vger.kernel.org
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=201185
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=216064
Signed-off-by: Naval Alcalá <ari@naval.cat>
Link: https://lore.kernel.org/r/20260410161622.13549-1-ari@naval.cat
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Zhenzhong Duan <zhenzhong.duan@intel.com>
Date: Fri May 15 11:37:08 2026 -0400
iommufd: Fix return value of iommufd_fault_fops_write()
[ Upstream commit aaca2aa92785a6ab8e3183e7184bca447a99cd76 ]
copy_from_user() may return number of bytes failed to copy, we should
not pass over this number to user space to cheat that write() succeed.
Instead, -EFAULT should be returned.
Link: https://patch.msgid.link/r/20260330030755.12856-1-zhenzhong.duan@intel.com
Cc: stable@vger.kernel.org
Fixes: 07838f7fd529 ("iommufd: Add iommufd fault object")
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Pranjal Shrivastava <praan@google.com>
Reviewed-by: Shuai Xue <xueshuai@linux.alibaba.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
[ applied identical hunk to drivers/iommu/iommufd/fault.c ]
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Jacob Pan <jacob.pan@linux.microsoft.com>
Date: Fri Feb 13 10:36:36 2026 -0800
iommufd: vfio compatibility extension check for noiommu mode
[ Upstream commit 7147ec874ea08c322d779d8eba28946e294ed1f3 ]
VFIO_CHECK_EXTENSION should return false for TYPE1_IOMMU variants when
in NO-IOMMU mode and IOMMUFD compat container is set. This change makes
the behavior match VFIO_CONTAINER in noiommu mode. It also prevents
userspace from incorrectly attempting to use TYPE1 IOMMU operations
in a no-iommu context.
Fixes: d624d6652a65 ("iommufd: vfio container FD ioctl compatibility")
Link: https://patch.msgid.link/r/20260213183636.3340-1-jacob.pan@linux.microsoft.com
Signed-off-by: Jacob Pan <jacob.pan@linux.microsoft.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jian Zhang <zhangjian.3032@bytedance.com>
Date: Fri Apr 3 17:06:01 2026 +0800
ipmi: ssif_bmc: change log level to dbg in irq callback
[ Upstream commit c9c99b7b7051eb7121b3224bfce181fb023b0269 ]
Long-running tests indicate that this logging can occasionally disrupt
timing and lead to request/response corruption.
Irq handler need to be executed as fast as possible,
most I2C slave IRQ implementations are byte-level, logging here
can significantly affect transfer behavior and timing. It is recommended
to use dev_dbg() for these messages.
Fixes: dd2bc5cc9e25 ("ipmi: ssif_bmc: Add SSIF BMC driver")
Signed-off-by: Jian Zhang <zhangjian.3032@bytedance.com>
Message-ID: <20260403090603.3988423-4-zhangjian.3032@bytedance.com>
Signed-off-by: Corey Minyard <corey@minyard.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jian Zhang <zhangjian.3032@bytedance.com>
Date: Fri Apr 3 17:06:00 2026 +0800
ipmi: ssif_bmc: fix message desynchronization after truncated response
[ Upstream commit 1d38e849adb6851ee280aa1a1d687b2181549a66 ]
A truncated response, caused by host power-off, or other conditions,
can lead to message desynchronization.
Raw trace data (STOP loss scenario, add state transition comment):
1. T-1: Read response phase (SSIF_RES_SENDING)
8271.955342 WR_RCV [03] <- Read polling cmd
8271.955348 RD_REQ [04] <== SSIF_RES_SENDING <- start sending response
8271.955436 RD_PRO [b4]
8271.955527 RD_PRO [00]
8271.955618 RD_PRO [c1]
8271.955707 RD_PRO [00]
8271.955814 RD_PRO [ad] <== SSIF_RES_SENDING <- last byte
<- !! STOP lost (truncated response)
2. T: New Write request arrives, BMC still in SSIF_RES_SENDING
8271.967973 WR_REQ [] <== SSIF_RES_SENDING >> SSIF_ABORTING <- log: unexpected WR_REQ in RES_SENDING
8271.968447 WR_RCV [02] <== SSIF_ABORTING <- do nothing
8271.968452 WR_RCV [02] <== SSIF_ABORTING <- do nothing
8271.968454 WR_RCV [18] <== SSIF_ABORTING <- do nothing
8271.968456 WR_RCV [01] <== SSIF_ABORTING <- do nothing
8271.968458 WR_RCV [66] <== SSIF_ABORTING <- do nothing
8271.978714 STOP [] <== SSIF_ABORTING >> SSIF_READY <- log: unexpected SLAVE STOP in state=SSIF_ABORTING
3. T+1: Next Read polling, treated as a fresh transaction
8271.979125 WR_REQ [] <== SSIF_READY >> SSIF_START
8271.979326 WR_RCV [03] <== SSIF_START >> SSIF_SMBUS_CMD <- smbus_cmd=0x03
8271.979331 RD_REQ [04] <== SSIF_RES_SENDING <- sending response
8271.979427 RD_PRO [b4] <- !! this is T's stale response -> desynchronization
When in SSIF_ABORTING state, a newly arrived command should still be
handled to avoid dropping the request or causing message
desynchronization.
Fixes: dd2bc5cc9e25 ("ipmi: ssif_bmc: Add SSIF BMC driver")
Signed-off-by: Jian Zhang <zhangjian.3032@bytedance.com>
Message-ID: <20260403090603.3988423-3-zhangjian.3032@bytedance.com>
Signed-off-by: Corey Minyard <corey@minyard.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jian Zhang <zhangjian.3032@bytedance.com>
Date: Fri Apr 3 17:05:59 2026 +0800
ipmi: ssif_bmc: fix missing check for copy_to_user() partial failure
[ Upstream commit ea641be7a4faee4351f9c5ed6b188e1bbf5586a6 ]
copy_to_user() returns the number of bytes that could not be copied,
with a non-zero value indicating a partial or complete failure. The
current code only checks for negative return values and treats all
non-negative results as success.
Treating any positive return value from copy_to_user() as
an error and returning -EFAULT.
Fixes: dd2bc5cc9e25 ("ipmi: ssif_bmc: Add SSIF BMC driver")
Signed-off-by: Jian Zhang <zhangjian.3032@bytedance.com>
Message-ID: <20260403090603.3988423-2-zhangjian.3032@bytedance.com>
Signed-off-by: Corey Minyard <corey@minyard.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Alok Tiwari <alok.a.tiwari@oracle.com>
Date: Sun Sep 7 12:25:32 2025 -0700
ipv4: udp: fix typos in comments
[ Upstream commit d436b5abba4f80e968b3ff83be4363c7aedcc799 ]
Correct typos in ipv4/udp.c comments for clarity:
"Encapulation" -> "Encapsulation"
"measureable" -> "measurable"
"tacking care" -> "taking care"
No functional changes.
Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250907192535.3610686-1-alok.a.tiwari@oracle.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Stable-dep-of: b80a95ccf160 ("udp: Force compute_score to always inline")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Eric Dumazet <edumazet@google.com>
Date: Thu Apr 16 10:35:05 2026 +0000
ipv6: fix possible UAF in icmpv6_rcv()
[ Upstream commit f996edd7615e686ada141b7f3395025729ff8ccb ]
Caching saddr and daddr before pskb_pull() is problematic
since skb->head can change.
Remove these temporary variables:
- We only access &ipv6_hdr(skb)->saddr and &ipv6_hdr(skb)->daddr
when net_dbg_ratelimited() is called in the slow path.
- Avoid potential future misuse after pskb_pull() call.
Fixes: 4b3418fba0fe ("ipv6: icmp: include addresses in debug messages")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Fernando Fernandez Mancera <fmancera@suse.de>
Reviewed-by: Joe Damato <joe@dama.to>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/20260416103505.2380753-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Alok Tiwari <alok.a.tiwari@oracle.com>
Date: Tue Sep 9 05:26:07 2025 -0700
ipv6: udp: fix typos in comments
[ Upstream commit ac36dea3bc85c2cde87e490736708032328dfbdc ]
Correct typos in ipv6/udp.c comments:
"execeeds" -> "exceeds"
"tacking care" -> "taking care"
"measureable" -> "measurable"
No functional changes.
Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250909122611.3711859-1-alok.a.tiwari@oracle.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Stable-dep-of: b80a95ccf160 ("udp: Force compute_score to always inline")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Yingnan Zhang <342144303@qq.com>
Date: Wed Apr 15 22:40:29 2026 +0800
ipvs: fix MTU check for GSO packets in tunnel mode
[ Upstream commit 67bf42cae41d847fd6e5749eb68278ca5d748b25 ]
Currently, IPVS skips MTU checks for GSO packets by excluding them with
the !skb_is_gso(skb) condition. This creates problems when IPVS tunnel
mode encapsulates GSO packets with IPIP headers.
The issue manifests in two ways:
1. MTU violation after encapsulation:
When a GSO packet passes through IPVS tunnel mode, the original MTU
check is bypassed. After adding the IPIP tunnel header, the packet
size may exceed the outgoing interface MTU, leading to unexpected
fragmentation at the IP layer.
2. Fragmentation with problematic IP IDs:
When net.ipv4.vs.pmtu_disc=1 and a GSO packet with multiple segments
is fragmented after encapsulation, each segment gets a sequentially
incremented IP ID (0, 1, 2, ...). This happens because:
a) The GSO packet bypasses MTU check and gets encapsulated
b) At __ip_finish_output, the oversized GSO packet is split into
separate SKBs (one per segment), with IP IDs incrementing
c) Each SKB is then fragmented again based on the actual MTU
This sequential IP ID allocation differs from the expected behavior
and can cause issues with fragment reassembly and packet tracking.
Fix this by properly validating GSO packets using
skb_gso_validate_network_len(). This function correctly validates
whether the GSO segments will fit within the MTU after segmentation. If
validation fails, send an ICMP Fragmentation Needed message to enable
proper PMTU discovery.
Fixes: 4cdd34084d53 ("netfilter: nf_conntrack_ipv6: improve fragmentation handling")
Signed-off-by: Yingnan Zhang <342144303@qq.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Brian Masney <bmasney@redhat.com>
Date: Sun Feb 22 18:43:44 2026 -0500
irqchip/irq-pic32-evic: Address warning related to wrong printf() formatter
[ Upstream commit 86be659415b0ddefebc3120e309091aa215a9064 ]
This driver is currently only build on 32 bit MIPS systems. When building
it on x86_64, the following warning occurs:
drivers/irqchip/irq-pic32-evic.c: In function ‘pic32_ext_irq_of_init’:
./include/linux/kern_levels.h:5:25: error: format ‘%d’ expects argument of type
‘int’, but argument 2 has type ‘long unsigned int’ [-Werror=format=]
Update the printf() formatter in preparation for allowing this driver to
be compiled on all architectures.
Fixes: aaa8666ada780 ("IRQCHIP: irq-pic32-evic: Add support for PIC32 interrupt controller")
Signed-off-by: Brian Masney <bmasney@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Link: https://patch.msgid.link/20260222-irqchip-pic32-v1-1-37f50d1f14af@redhat.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Yong-Xuan Wang <yongxuan.wang@sifive.com>
Date: Fri May 8 02:31:21 2026 -0700
irqchip/riscv-imsic: Clear interrupt move state during CPU offlining
commit cefafbd561402b0fe6447449364a30315b9b1570 upstream.
Affinity changes of IMSIC interrupts have to be careful to not lose an
interrupt in the process. Each vector keeps track of an affinity change in
progress with two pointers in struct imsic_vector.
imsic_vector::move_prev points to the previous CPU target data and
imsic_vector::move_next to the designated new CPU target data.
imsic_vector::move_prev on the new CPU can only be cleared after the
previous CPU has cleared imsic_vector::move_next, which ususally happens in
__imsic_remote_sync().
In case of CPU hot-unplug __imsic_remote_sync() is not invoked because the
CPU is already marked offline. That means imsic_vector::move_prev becomes
stale until the CPU is onlined again.
The stale pointer prevents further affinity changes for the affected
interrupts.
Solve this by clearing the imsic_vector::move_prev pointers in the CPU
hotplug offline path.
[ tglx: Replace word salad in change log ]
Fixes: 0f67911e821c ("irqchip/riscv-imsic: Separate next and previous pointers in IMSIC vector")
Signed-off-by: Yong-Xuan Wang <yongxuan.wang@sifive.com>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20260508-imsic-v2-1-e9f08dd46cf5@sifive.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Mathias Krause <minipli@grsecurity.net>
Date: Thu Apr 2 16:51:16 2026 +0200
kbuild: builddeb - avoid recompiles for non-cross-compiles
[ Upstream commit 2452dcf4d740effff5aa71b7f6529ee8c04fd8f6 ]
Commit e2c318225ac1 ("kbuild: deb-pkg: add
pkg.linux-upstream.nokernelheaders build profile") changed how
install-extmod-build gets called, making it always rebuild the host
programs below scripts/ if HOSTCC wasn't specified with its full triplet
on the make command line. That is, apparently, needed to fix up commit
f1d87664b82a ("kbuild: cross-compile linux-headers package when
possible") for cross-compiles. However, in the much more common case of
non-cross-compile builds this will lead to unnecessary rebuilding of
host tools including gcc plugins. This, in turn, will lead to a full
kernel rebuild on the next 'make bindeb-pkg' which is unfortunate.
Avoid that by only triggering the rebuild of host tools for actual
cross-compile builds.
Signed-off-by: Mathias Krause <minipli@grsecurity.net>
Fixes: e2c318225ac1 ("kbuild: deb-pkg: add pkg.linux-upstream.nokernelheaders build profile")
Cc: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Nicolas Schier <nsc@kernel.org>
Link: https://patch.msgid.link/20260402145116.1010901-1-minipli@grsecurity.net
Signed-off-by: Nicolas Schier <nsc@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: DaeMyung Kang <charsyam@gmail.com>
Date: Sun Apr 19 20:02:55 2026 +0900
ksmbd: destroy async_ida in ksmbd_conn_free()
[ Upstream commit b32c8db48212a34998c36d0bbc05b29d5c407ef5 ]
When per-connection async_ida was converted from a dynamically
allocated ksmbd_ida to an embedded struct ida, ksmbd_ida_free() was
removed from the connection teardown path but no matching
ida_destroy() was added. The connection is therefore freed with the
IDA's backing xarray still intact.
The kernel IDA API expects ida_init() and ida_destroy() to be paired
over an object's lifetime, so add the missing cleanup before the
connection is freed.
No leak has been observed in testing; this is a pairing fix to match
the IDA lifetime rules, not a response to a reproduced regression.
Fixes: d40012a83f87 ("cifsd: declare ida statically")
Signed-off-by: DaeMyung Kang <charsyam@gmail.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: DaeMyung Kang <charsyam@gmail.com>
Date: Sun Apr 19 20:02:54 2026 +0900
ksmbd: destroy tree_conn_ida in ksmbd_session_destroy()
[ Upstream commit c049ee14eb4343b69b6f7755563f961f5e153423 ]
When per-session tree_conn_ida was converted from a dynamically
allocated ksmbd_ida to an embedded struct ida, ksmbd_ida_free() was
removed from ksmbd_session_destroy() but no matching ida_destroy()
was added. The session is therefore freed with the IDA's backing
xarray still intact.
The kernel IDA API expects ida_init() and ida_destroy() to be paired
over an object's lifetime, so add the missing cleanup before the
enclosing session is freed.
Also move ida_init() to right after the session is allocated so that
it is always paired with the destroy call even on the early error
paths of __session_create() (ksmbd_init_file_table() or
__init_smb2_session() failures), both of which jump to the error
label and invoke ksmbd_session_destroy() on a partially initialised
session.
No leak has been observed in testing; this is a pairing fix to match
the IDA lifetime rules, not a response to a reproduced regression.
Fixes: d40012a83f87 ("cifsd: declare ida statically")
Signed-off-by: DaeMyung Kang <charsyam@gmail.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: DaeMyung Kang <charsyam@gmail.com>
Date: Tue Apr 21 03:45:11 2026 +0900
ksmbd: fix durable fd leak on ClientGUID mismatch in durable v2 open
[ Upstream commit 804054d19886ac6628883d82410f6ee42a818664 ]
ksmbd_lookup_fd_cguid() returns a ksmbd_file with its refcount
incremented via ksmbd_fp_get(). parse_durable_handle_context() in
the DURABLE_REQ_V2 case properly releases this reference on every
path inside the ClientGUID-match branch, either by calling
ksmbd_put_durable_fd() or by transferring ownership to dh_info->fp
for a successful reconnect. However, when an entry exists in the
global file table with the same CreateGuid but a different
ClientGUID, the code simply falls through to the new-open path
without dropping the reference obtained from ksmbd_lookup_fd_cguid().
Per MS-SMB2 section 3.3.5.9.10 ("Handling the
SMB2_CREATE_DURABLE_HANDLE_REQUEST_V2 Create Context"), the server
MUST locate an Open whose Open.CreateGuid matches the request's
CreateGuid AND whose Open.ClientGuid matches the ClientGuid of the
connection that received the request. If no such Open is found, the
server MUST continue with the normal open execution phase. A
CreateGuid hit with a ClientGUID mismatch is therefore the
"Open not found" case: proceeding with a new open is correct, but
the reference obtained purely as a side effect of the lookup must
not be leaked.
Repeated requests that hit this mismatch pin global_ft entries,
prevent __ksmbd_close_fd() from ever running for the corresponding
files, and defeat the durable scavenger, leading to long-lived
resource leaks.
Release the reference in the mismatch path and clear dh_info->fp so
subsequent logic does not mistake a non-matching lookup result for
a reconnect target.
Fixes: c8efcc786146 ("ksmbd: add support for durable handles v1/v2")
Signed-off-by: DaeMyung Kang <charsyam@gmail.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Joshua Klinesmith <joshuaklinesmith@gmail.com>
Date: Mon Apr 6 22:31:12 2026 -0400
ksmbd: fix use-after-free from async crypto on Qualcomm crypto engine
[ Upstream commit 3e298897f41c61450c2e7a4f457e8b2485eb35b3 ]
ksmbd_crypt_message() sets a NULL completion callback on AEAD requests
and does not handle the -EINPROGRESS return code from async hardware
crypto engines like the Qualcomm Crypto Engine (QCE). When QCE returns
-EINPROGRESS, ksmbd treats it as an error and immediately frees the
request while the hardware DMA operation is still in flight. The DMA
completion callback then dereferences freed memory, causing a NULL
pointer crash:
pc : qce_skcipher_done+0x24/0x174
lr : vchan_complete+0x230/0x27c
...
el1h_64_irq+0x68/0x6c
ksmbd_free_work_struct+0x20/0x118 [ksmbd]
ksmbd_exit_file_cache+0x694/0xa4c [ksmbd]
Use the standard crypto_wait_req() pattern with crypto_req_done() as
the completion callback, matching the approach used by the SMB client
in fs/smb/client/smb2ops.c. This properly handles both synchronous
engines (immediate return) and async engines (-EINPROGRESS followed
by callback notification).
Fixes: e2f34481b24d ("cifsd: add server-side procedures for SMB3")
Link: https://github.com/openwrt/openwrt/issues/21822
Signed-off-by: Joshua Klinesmith <joshuaklinesmith@gmail.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Hyunwoo Kim <imv4bel@gmail.com>
Date: Tue Apr 21 00:31:47 2026 +0900
ksmbd: scope conn->binding slowpath to bound sessions only
[ Upstream commit b0da97c034b6107d14e537e212d4ce8b22109a58 ]
When the binding SESSION_SETUP sets conn->binding = true, the flag stays
set after the call so that the global session lookup in
ksmbd_session_lookup_all() can find the session, which was not added to
conn->sessions. Because the flag is connection-wide, the global lookup
path will also resolve any other session by id if asked.
Tighten the global lookup so that the returned session must have this
connection registered in its channel xarray (sess->ksmbd_chann_list).
The channel entry is installed by the existing binding_session path in
ntlm_authenticate()/krb5_authenticate() when a SESSION_SETUP completes
successfully, so this condition is a strict equivalent of "this
connection has been accepted as a channel of this session". Connections
that have not bound to a given session cannot reach it via the global
table.
The existing conn->binding gate for entering the slowpath is preserved
so that non-binding connections keep the fast-path-only behavior, and
the session->state check is unchanged.
Fixes: f5a544e3bab7 ("ksmbd: add support for SMB3 multichannel")
Signed-off-by: Hyunwoo Kim <imv4bel@gmail.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ricardo B. Marlière <rbm@suse.com>
Date: Sat Mar 7 19:07:56 2026 -0300
ktest: Avoid undef warning when WARNINGS_FILE is unset
[ Upstream commit 057854f8a595160656fe77ed7bf0d2403724b915 ]
check_buildlog() probes $warnings_file with -f even when WARNINGS_FILE is
not configured. Perl warns about the uninitialized value and adds noise to
the test log, which can hide the output we actually care about.
Check that WARNINGS_FILE is defined before testing whether the file exists.
Cc: John Hawley <warthog9@eaglescrag.net>
Cc: Andrea Righi <arighi@nvidia.com>
Cc: Marcos Paulo de Souza <mpdesouza@suse.com>
Cc: Matthieu Baerts <matttbe@kernel.org>
Cc: Fernando Fernandez Mancera <fmancera@suse.de>
Cc: Pedro Falcato <pfalcato@suse.de>
Link: https://patch.msgid.link/20260307-ktest-fixes-v1-1-565d412f4925@suse.com
Fixes: 4283b169abfb ("ktest: Add make_warnings_file and process full warnings")
Signed-off-by: Ricardo B. Marlière <rbm@suse.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ricardo B. Marlière <rbm@suse.com>
Date: Sat Mar 7 19:07:59 2026 -0300
ktest: Honor empty per-test option overrides
[ Upstream commit a2de57a3c8192dcd67cccaff6c341b93748d799b ]
A per-test override can clear an inherited default option by assigning an
empty value, but __set_test_option() still used option_defined() to decide
whether a per-test key existed. That turned an empty per-test assignment
back into "fall back to the default", so tests still could not clear
inherited settings.
For example:
DEFAULTS
(...)
LOG_FILE = /tmp/ktest-empty-override.log
CLEAR_LOG = 1
ADD_CONFIG = /tmp/.config
TEST_START
TEST_TYPE = build
BUILD_TYPE = nobuild
ADD_CONFIG =
This would run the test with ADD_CONFIG[1] = /tmp/.config
Fix by checking whether the per-test key exists before falling back. If it
does exist but is empty, treat it as unset for that test and stop the
fallback chain there.
Cc: John Hawley <warthog9@eaglescrag.net>
Cc: Andrea Righi <arighi@nvidia.com>
Cc: Marcos Paulo de Souza <mpdesouza@suse.com>
Cc: Matthieu Baerts <matttbe@kernel.org>
Cc: Fernando Fernandez Mancera <fmancera@suse.de>
Cc: Pedro Falcato <pfalcato@suse.de>
Link: https://patch.msgid.link/20260307-ktest-fixes-v1-4-565d412f4925@suse.com
Fixes: 22c37a9ac49d ("ktest: Allow tests to undefine default options")
Signed-off-by: Ricardo B. Marlière <rbm@suse.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ricardo B. Marlière <rbm@suse.com>
Date: Sat Mar 7 19:08:03 2026 -0300
ktest: Run POST_KTEST hooks on failure and cancellation
[ Upstream commit bc6e165a452da909cef0efbc286e6695624db372 ]
PRE_KTEST can be useful for setting up the environment and POST_KTEST to
tear it down, however POST_KTEST only runs on the normal end-of-run path.
It is skipped when ktest exits through dodie() or cancel_test(). Final
cleanup hooks are skipped.
Factor the final hook execution into run_post_ktest(), call it from the
normal exit path and from the early exit paths, and guard it so the hook
runs at most once.
Cc: John Hawley <warthog9@eaglescrag.net>
Cc: Andrea Righi <arighi@nvidia.com>
Cc: Marcos Paulo de Souza <mpdesouza@suse.com>
Cc: Matthieu Baerts <matttbe@kernel.org>
Cc: Fernando Fernandez Mancera <fmancera@suse.de>
Cc: Pedro Falcato <pfalcato@suse.de>
Link: https://patch.msgid.link/20260307-ktest-fixes-v1-8-565d412f4925@suse.com
Fixes: 921ed4c7208e ("ktest: Add PRE/POST_KTEST and TEST options")
Signed-off-by: Ricardo B. Marlière <rbm@suse.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Aaron Sacks <contact@xchglabs.com>
Date: Tue May 12 02:07:42 2026 -0400
KVM: Reject wrapped offset in kvm_reset_dirty_gfn()
commit 577a8d3bae0531f0e5ccfac919cd8192f920a804 upstream.
kvm_reset_dirty_gfn() guards the gfn range with
if (!memslot || (offset + __fls(mask)) >= memslot->npages)
return;
but offset is u64 and the addition is unchecked. The check can be
silently bypassed by a u64 wrap.
The dirty ring backing those entries is MAP_SHARED at
KVM_DIRTY_LOG_PAGE_OFFSET of the vcpu fd, so the VMM can rewrite the
slot and offset fields of any entry between when the kernel pushes
them and when KVM_RESET_DIRTY_RINGS consumes them. On reset,
kvm_dirty_ring_reset() re-reads the values via READ_ONCE() and feeds
them straight back into this check; only the flags handshake is
treated as the handover, the slot/offset payload is taken on trust.
Crafting two entries
entry[i].offset = 0xffffffffffffffc1
entry[i+1].offset = 0
makes the coalescing loop in kvm_dirty_ring_reset() compute
delta = (s64)(0 - 0xffffffffffffffc1) = 63
which falls in [0, BITS_PER_LONG), so it folds entry[i+1] into the
existing mask by setting bit 63. The trailing kvm_reset_dirty_gfn()
call then sees offset = 0xffffffffffffffc1 and __fls(mask) = 63;
the sum is 0 in u64 and the bounds check passes.
That offset propagates into kvm_arch_mmu_enable_log_dirty_pt_masked()
unchanged. On the legacy MMU path -- kvm_memslots_have_rmaps() ==
true, i.e. shadow paging, any VM that has allocated shadow roots, or
a write-tracked slot -- it reaches gfn_to_rmap(), which indexes
slot->arch.rmap[0][] with a near-U64_MAX gfn. That is an
out-of-bounds load of a kvm_rmap_head, followed by a conditional
clear of PT_WRITABLE_MASK in whatever the loaded pointer points at.
The path is reachable from any process holding /dev/kvm.
Range-check offset on its own first, so the addition cannot wrap.
memslot->npages is bounded well below U64_MAX, so once offset <
npages holds, offset + __fls(mask) (with __fls(mask) < BITS_PER_LONG)
stays in range.
Fixes: fb04a1eddb1a ("KVM: X86: Implement ring-based dirty memory tracking")
Cc: stable@vger.kernel.org
Signed-off-by: Aaron Sacks <contact@xchglabs.com>
Link: https://patch.msgid.link/20260512060742.1628959-1-contact@xchglabs.com/
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Junrui Luo <moonafterrain@outlook.com>
Date: Wed Apr 15 17:26:55 2026 +0800
KVM: s390: pci: fix GAIT table indexing due to double-scaling pointer arithmetic
commit 16d990a15491cf76cd6eef0846e1b4100e63261a upstream.
kvm_s390_pci_aif_enable(), kvm_s390_pci_aif_disable(), and
aen_host_forward() index the GAIT by manually multiplying the index
with sizeof(struct zpci_gaite).
Since aift->gait is already a struct zpci_gaite pointer, this
double-scales the offset, accessing element aisb*16 instead of aisb.
This causes out-of-bounds accesses when aisb >= 32 (with
ZPCI_NR_DEVICES=512)
Fix by removing the erroneous sizeof multiplication.
Fixes: 3c5a1b6f0a18 ("KVM: s390: pci: provide routines for enabling/disabling interrupt forwarding")
Fixes: 73f91b004321 ("KVM: s390: pci: enable host forwarding of Adapter Event Notifications")
Reported-by: Yuhao Jiang <danisjiang@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Junrui Luo <moonafterrain@outlook.com>
Reviewed-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>
Tested-by: Matthew Rosato <mjrosato@linux.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Qiang Ma <maqianga@uniontech.com>
Date: Tue May 12 09:53:13 2026 +0800
KVM: x86: Fix Xen hypercall tracepoint argument assignment
commit 2b72f1674e427c56e3772c5ccf785fdda2138820 upstream.
TRACE_EVENT(kvm_xen_hypercall) stores a5 in __entry->a4 instead of
__entry->a5.
That overwrites the recorded a4 argument and leaves a5 unset in the
trace entry. Fix the typo so both arguments are captured correctly.
Signed-off-by: Qiang Ma <maqianga@uniontech.com>
Link: https://patch.msgid.link/20260512015313.1685784-1-maqianga@uniontech.com/
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Chen Ni <nichen@iscas.ac.cn>
Date: Thu Feb 26 11:30:48 2026 +0800
leds: lgm-sso: Remove duplicate assignments for priv->mmap
[ Upstream commit 7186d0330c3f3e86de577687a82f4ebd96dcb5ac ]
Remove duplicate assignment of priv->mmap in intel_sso_led_probe().
Fixes: fba8a6f2263b ("leds: lgm-sso: Fix clock handling")
Signed-off-by: Chen Ni <nichen@iscas.ac.cn>
Link: https://patch.msgid.link/20260226033048.3715915-1-nichen@iscas.ac.cn
Signed-off-by: Lee Jones <lee@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Geert Uytterhoeven <geert+renesas@glider.be>
Date: Tue Mar 31 17:21:43 2026 +0200
lib/hexdump: print_hex_dump_bytes() calls print_hex_dump_debug()
[ Upstream commit 36776b7f8a8955b4e75b5d490a75fee0c7a2a7ef ]
print_hex_dump_bytes() claims to be a simple wrapper around
print_hex_dump(), but it actally calls print_hex_dump_debug(), which
means no output is printed if (dynamic) DEBUG is disabled.
Update the documentation to match the implementation.
Fixes: 091cb0994edd20d6 ("lib/hexdump: make print_hex_dump_bytes() a nop on !DEBUG builds")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Link: https://patch.msgid.link/3d5c3069fd9102ecaf81d044b750cd613eb72a08.1774970392.git.geert+renesas@glider.be
Signed-off-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Raphael Zimmer <raphael.zimmer@tu-ilmenau.de>
Date: Tue May 12 18:16:40 2026 +0200
libceph: Fix potential null-ptr-deref in decode_choose_args()
commit 28b0a2ab8c82d0bbdeb8013029c67c978ce6e4bf upstream.
A message of type CEPH_MSG_OSD_MAP contains an OSD map that itself
contains a CRUSH map. When decoding this CRUSH map in crush_decode(), an
array of max_buckets CRUSH buckets is decoded, where some indices may
not refer to actual buckets and are therefore set to NULL. The received
CRUSH map may optionally contain choose_args that get decoded in
decode_choose_args(). When decoding a crush_choose_arg_map, a series of
choose_args for different buckets is decoded, with the bucket_index
being read from the incoming message. It is only checked that the bucket
index does not exceed max_buckets, but not that it doesn't point to an
index with a NULL bucket. If a (potentially corrupted) message contains
a crush_choose_arg_map including such a bucket_index, a null pointer
dereference may occur in the subsequent processing when attempting to
access the bucket with the given index.
This patch fixes the issue by extending the affected check. Now, it is
only attempted to access the bucket if it is not NULL.
Cc: stable@vger.kernel.org
Signed-off-by: Raphael Zimmer <raphael.zimmer@tu-ilmenau.de>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Raphael Zimmer <raphael.zimmer@tu-ilmenau.de>
Date: Wed Apr 22 10:47:13 2026 +0200
libceph: Fix potential out-of-bounds access in crush_decode()
commit 4c79fc2d598694bda845b46229c9d48b65042970 upstream.
A message of type CEPH_MSG_OSD_MAP containing a crush map with at least
one bucket has two fields holding the bucket algorithm. If the values
in these two fields differ, an out-of-bounds access can occur. This is
the case because the first algorithm field (alg) is used to allocate
the correct amount of memory for a bucket of this type, while the second
algorithm field inside the bucket (b->alg) is used in the subsequent
processing.
This patch fixes the issue by adding a check that compares alg and
b->alg and aborts the processing in case they differ. Furthermore,
b->alg is set to 0 in this case, because the destruction of the crush
map also uses this field to determine the bucket type, which can again
result in an out-of-bounds access when trying to free the memory pointed
to by the fields of the bucket. To correctly free the memory allocated
for the bucket in such a case, the corresponding call to kfree is moved
from the algorithm-specific crush_destroy_bucket functions to the
generic crush_destroy_bucket().
Cc: stable@vger.kernel.org
Signed-off-by: Raphael Zimmer <raphael.zimmer@tu-ilmenau.de>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Raphael Zimmer <raphael.zimmer@tu-ilmenau.de>
Date: Tue May 5 11:08:12 2026 +0200
libceph: Fix potential out-of-bounds access in osdmap_decode()
commit 35d0ed82d03e5ee77ea4f31f20e29562a7721649 upstream.
When decoding osd_state and osd_weight from an incoming osdmap in
osdmap_decode(), both are decoded for each osd, i.e., map->max_osd
times. The ceph_decode_need() check only accounts for
sizeof(*map->osd_weight) once. This can potentially result in an
out-of-bounds memory access if the incoming message is corrupted such
that the max_osd value exceeds the actual content of the osdmap message.
This patch fixes the issue by changing the corresponding part in the
ceph_decode_need() check to account for
map->max_osd*sizeof(*map->osd_weight).
Cc: stable@vger.kernel.org
Fixes: dcbc919a5dc8 ("libceph: switch osdmap decoding to use ceph_decode_entity_addr")
Signed-off-by: Raphael Zimmer <raphael.zimmer@tu-ilmenau.de>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Raphael Zimmer <raphael.zimmer@tu-ilmenau.de>
Date: Tue May 12 09:29:30 2026 +0200
libceph: handle rbtree insertion error in decode_choose_args()
commit d289478cfc0bcf81c7914200d6abdcb78bd04ded upstream.
A message of type CEPH_MSG_OSD_MAP contains an OSD map that itself
contains a CRUSH map. The received CRUSH map may optionally contain
choose_args that get decoded in decode_choose_args(). In this function,
num_choose_arg_maps is read from the message, and a corresponding number
of crush_choose_arg_maps gets decoded afterwards. Each
crush_choose_arg_map has a choose_args_index, which serves as the key
when inserting it into the choose_args rbtree of the decoded crush_map.
If a (potentially corrupted) message contains two crush_choose_arg_maps
with the same index, the assertion in insert_choose_arg_map() triggers a
kernel BUG when trying to insert the second crush_choose_arg_map.
This patch fixes the issue by switching to the non-asserting rbtree
insertion function and rejecting the message if the insertion fails.
[ idryomov: changelog ]
Cc: stable@vger.kernel.org
Signed-off-by: Raphael Zimmer <raphael.zimmer@tu-ilmenau.de>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Date: Sat May 23 13:05:02 2026 +0200
Linux 6.12.91
Link: https://lore.kernel.org/r/20260520162111.222830634@linuxfoundation.org
Tested-by: Brett A C Sheffield <bacs@librecast.net>
Tested-by: Francesco Dolcini <francesco.dolcini@toradex.com>
Tested-by: Pavel Machek (CIP) <pavel@nabladev.com>
Tested-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Bart Van Assche <bvanassche@acm.org>
Date: Fri Mar 13 10:15:07 2026 -0700
locking: Fix rwlock support in <linux/spinlock_up.h>
[ Upstream commit 756a0e011cfca0b45a48464aa25b05d9a9c2fb0b ]
Architecture support for rwlocks must be available whether or not
CONFIG_DEBUG_SPINLOCK has been defined. Move the definitions of the
arch_{read,write}_{lock,trylock,unlock}() macros such that these become
visbile if CONFIG_DEBUG_SPINLOCK=n.
This patch prepares for converting do_raw_{read,write}_trylock() into
inline functions. Without this patch that conversion triggers a build
failure for UP architectures, e.g. arm-ep93xx. I used the following
kernel configuration to build the kernel for that architecture:
CONFIG_ARCH_MULTIPLATFORM=y
CONFIG_ARCH_MULTI_V7=n
CONFIG_ATAGS=y
CONFIG_MMU=y
CONFIG_ARCH_MULTI_V4T=y
CONFIG_CPU_LITTLE_ENDIAN=y
CONFIG_ARCH_EP93XX=y
Fixes: fb1c8f93d869 ("[PATCH] spinlock consolidation")
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20260313171510.230998-2-bvanassche@acm.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Xianglai Li <lixianglai@loongson.cn>
Date: Sun May 17 17:18:55 2026 +0800
LoongArch: KVM: Compile switch.S directly into the kernel
commit 5203012fa6045aac4b69d4e7c212e16dcf38ef10 upstream.
If we directly compile the switch.S file into the kernel, the address of
the kvm_exc_entry function will definitely be within the DMW memory area.
Therefore, we will no longer need to perform a copy relocation of the
kvm_exc_entry.
So this patch compiles switch.S directly into the kernel, and then remove
the copy relocation execution logic for the kvm_exc_entry function.
Signed-off-by: Xianglai Li <lixianglai@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Daan De Meyer <daan.j.demeyer@gmail.com>
Date: Tue Mar 31 10:51:28 2026 +0000
loop: fix partition scan race between udev and loop_reread_partitions()
[ Upstream commit 267ec4d7223a783f029a980f41b93c39b17996da ]
When LOOP_CONFIGURE is called with LO_FLAGS_PARTSCAN, the following
sequence occurs:
1. disk_force_media_change() sets GD_NEED_PART_SCAN
2. Uevent suppression is lifted and a KOBJ_CHANGE uevent is sent
3. loop_global_unlock() releases the lock
4. loop_reread_partitions() calls bdev_disk_changed() to scan
There is a race between steps 2 and 4: when udev receives the uevent
and opens the device before loop_reread_partitions() runs,
blkdev_get_whole() in bdev.c sees GD_NEED_PART_SCAN set and calls
bdev_disk_changed() for a first scan. Then loop_reread_partitions()
does a second scan. The open_mutex serializes these two scans, but
does not prevent both from running.
The second scan in bdev_disk_changed() drops all partition devices
from the first scan (via blk_drop_partitions()) before re-adding
them, causing partition block devices to briefly disappear. This
breaks any systemd unit with BindsTo= on the partition device: systemd
observes the device going dead, fails the dependent units, and does
not retry them when the device reappears.
Fix this by removing the GD_NEED_PART_SCAN set from
disk_force_media_change() entirely. None of the current callers need
the lazy on-open partition scan triggered by this flag:
- floppy: sets GENHD_FL_NO_PART, so disk_has_partscan() is always
false and GD_NEED_PART_SCAN has no effect.
- loop (loop_configure, loop_change_fd): when LO_FLAGS_PARTSCAN is
set, loop_reread_partitions() performs an explicit scan. When not
set, GD_SUPPRESS_PART_SCAN prevents the lazy scan path.
- loop (__loop_clr_fd): calls bdev_disk_changed() explicitly if
LO_FLAGS_PARTSCAN is set.
- nbd (nbd_clear_sock_ioctl): capacity is set to zero immediately
after; nbd manages GD_NEED_PART_SCAN explicitly elsewhere.
With GD_NEED_PART_SCAN no longer set by disk_force_media_change(),
udev opening the loop device after the uevent no longer triggers a
redundant scan in blkdev_get_whole(), and only the single explicit
scan from loop_reread_partitions() runs.
A regression test for this bug has been submitted to blktests:
https://github.com/linux-blktests/blktests/pull/240.
Fixes: 9f65c489b68d ("loop: raise media_change event")
Signed-off-by: Daan De Meyer <daan@amutable.com>
Acked-by: Christian Brauner <brauner@kernel.org>
Link: https://patch.msgid.link/20260331105130.1077599-1-daan@amutable.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Eric Dumazet <edumazet@google.com>
Date: Wed Apr 1 10:38:08 2026 +0000
macvlan: annotate data-races around port->bc_queue_len_used
[ Upstream commit 1ef5789d9906df3771c99b7f413caaf2bf473ca5 ]
port->bc_queue_len_used is read and written locklessly,
add READ_ONCE()/WRITE_ONCE() annotations.
While WRITE_ONCE() in macvlan_fill_info() is not yet needed,
it is a prereq for future RTNL avoidance.
Fixes: d4bff72c8401 ("macvlan: Support for high multicast packet rate")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260401103809.3038139-2-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Dudu Lu <phx0fer@gmail.com>
Date: Mon Apr 13 16:53:49 2026 +0800
macvlan: fix macvlan_get_size() not reserving space for IFLA_MACVLAN_BC_CUTOFF
[ Upstream commit fa92a77b0ed4d5f11a71665a232ac5a54a4b055d ]
macvlan_get_size() does not account for IFLA_MACVLAN_BC_CUTOFF, but
macvlan_fill_info() conditionally includes it when port->bc_cutoff != 1.
This causes nla_put_s32() to fail with -EMSGSIZE when the netlink skb
runs out of space, triggering a WARN_ON in rtnetlink and preventing the
interface from being dumped.
The bug can be reproduced with:
ip link add macvlan0 link eth0 type macvlan mode bridge
ip link set macvlan0 type macvlan bc_cutoff 0
ip -d link show macvlan0 # fails with -EMSGSIZE
The bc_cutoff feature was added in commit 954d1fa1ac93 ("macvlan: Add
netlink attribute for broadcast cutoff"), which added the nla_put_s32()
call in macvlan_fill_info() but missed adding the corresponding
nla_total_size(4) in macvlan_get_size(). A follow-up commit
55cef78c244d ("macvlan: add forgotten nla_policy for
IFLA_MACVLAN_BC_CUTOFF") fixed the missing nla_policy entry but still
did not fix the size calculation.
Fixes: 954d1fa1ac93 ("macvlan: Add netlink attribute for broadcast cutoff")
Signed-off-by: Dudu Lu <phx0fer@gmail.com>
Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260413085349.73977-1-phx0fer@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Wolfram Sang <wsa+renesas@sang-engineering.com>
Date: Mon Apr 13 12:42:38 2026 +0200
mailbox: add sanity check for channel array
[ Upstream commit c1aad75595fb67edc7fda8af249d3b886efa1be9 ]
Fail gracefully if there is no channel array attached to the mailbox
controller. Otherwise the later dereference will cause an OOPS which
might not be seen because mailbox controllers might instantiate very
early. Remove the comment explaining the obvious while here.
Fixes: 2b6d83e2b8b7 ("mailbox: Introduce framework for mailbox")
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Jassi Brar <jassisinghbrar@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Wolfram Sang <wsa+renesas@sang-engineering.com>
Date: Fri Apr 17 09:42:34 2026 +0200
mailbox: mailbox-test: don't free the reused channel
[ Upstream commit 88ebadbf0deefdaccdab868b44ff70a0a257f473 ]
The RX channel can be aliased to the TX channel if it has a different
MMIO. This special case needs to be handled when freeing the channels
otherwise a double-free occurs.
Fixes: 8ea4484d0c2b ("mailbox: Add generic mechanism for testing Mailbox Controllers")
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Jassi Brar <jassisinghbrar@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Wolfram Sang <wsa+renesas@sang-engineering.com>
Date: Fri Apr 10 14:53:00 2026 +0200
mailbox: mailbox-test: free channels on probe error
[ Upstream commit c02053a9055d5fdfd32432287cca8958db1d5bc5 ]
On probe error, free the previously obtained channels. This not only
prevents a leak, but also UAF scenarios because the client structure
will be removed nonetheless because it was allocated with devm.
Link: https://sashiko.dev/#/patchset/20260327151217.5327-2-wsa%2Brenesas%40sang-engineering.com
Fixes: 8ea4484d0c2b ("mailbox: Add generic mechanism for testing Mailbox Controllers")
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Jassi Brar <jassisinghbrar@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Wolfram Sang <wsa+renesas@sang-engineering.com>
Date: Fri Apr 17 09:42:35 2026 +0200
mailbox: mailbox-test: initialize struct earlier
[ Upstream commit bbcf9af68bfedb3d9cc3c7eae62f5c844d8b78b9 ]
The waitqueue must be initialized before the debugfs files are created
because from that time, requests from userspace can already be made.
Similarily, drvdata and spinlock needs to be initialized before we
request the channel, otherwise dangling irqs might run into problems
like a NULL pointer exception.
Fixes: 8ea4484d0c2b ("mailbox: Add generic mechanism for testing Mailbox Controllers")
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Jassi Brar <jassisinghbrar@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Wolfram Sang <wsa+renesas@sang-engineering.com>
Date: Fri Apr 17 09:42:36 2026 +0200
mailbox: mailbox-test: make data_ready a per-instance variable
[ Upstream commit 6e937f4e769e60947909e3525965f0137b9039e8 ]
While not the default case, multiple tests can be run simultaneously.
Then, data_ready being a global variable will be overwritten and the
per-instance lock will not help. Turn the global variable into a
per-instance one to avoid this problem.
Fixes: e339c80af95e ("mailbox: mailbox-test: don't rely on rx_buffer content to signal data ready")
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Jassi Brar <jassisinghbrar@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jason-JH Lin <jason-jh.lin@mediatek.com>
Date: Mon Mar 23 17:07:11 2026 +0800
mailbox: mtk-cmdq: Fix CURR and END addr for task insert case
[ Upstream commit d2591db9c8ef19fbb4d24ed15e0c6edfa6bc7917 ]
Fix CURR and END address calculation for inserting a cmdq task into the
task list by using cmdq_reg_shift_addr() for proper address converting.
This ensures both CURR and END addresses are set correctly when
enabling the thread.
Fixes: a195c7ccfb7a ("mailbox: mtk-cmdq: Refine DMA address handling for the command buffer")
Signed-off-by: Jason-JH Lin <jason-jh.lin@mediatek.com>
Signed-off-by: Jassi Brar <jassisinghbrar@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Xiao Ni <xni@redhat.com>
Date: Thu Mar 5 09:18:33 2026 +0800
md/raid1: fix the comparing region of interval tree
[ Upstream commit de3544d2e5ea99064498de3c21ba490155864657 ]
Interval tree uses [start, end] as a region which stores in the tree.
In raid1, it uses the wrong end value. For example:
bio(A,B) is too big and needs to be split to bio1(A,C-1), bio2(C,B).
The region of bio1 is [A,C] and the region of bio2 is [C,B]. So bio1 and
bio2 overlap which is not right.
Fix this problem by using right end value of the region.
Fixes: d0d2d8ba0494 ("md/raid1: introduce wait_for_serialization")
Signed-off-by: Xiao Ni <xni@redhat.com>
Link: https://lore.kernel.org/linux-raid/20260305011839.5118-2-xni@redhat.com/
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Yu Kuai <yukuai@fnnas.com>
Date: Fri Mar 27 22:07:29 2026 +0800
md: wake raid456 reshape waiters before suspend
[ Upstream commit cf86bb53b9c92354904a328e947a05ffbfdd1840 ]
During raid456 reshape, direct IO across the reshape position can sleep
in raid5_make_request() waiting for reshape progress while still
holding an active_io reference. If userspace then freezes reshape and
writes md/suspend_lo or md/suspend_hi, mddev_suspend() kills active_io
and waits for all in-flight IO to drain.
This can deadlock: the IO needs reshape progress to continue, but the
reshape thread is already frozen, so the active_io reference is never
dropped and suspend never completes.
raid5_prepare_suspend() already wakes wait_for_reshape for dm-raid. Do
the same for normal md suspend when reshape is already interrupted, so
waiting raid456 IO can abort, drop its reference, and let suspend
finish.
The mdadm test tests/25raid456-reshape-deadlock reproduces the hang.
Fixes: 714d20150ed8 ("md: add new helpers to suspend/resume array")
Link: https://lore.kernel.org/linux-raid/20260327140729.2030564-1-yukuai@fnnas.com/
Signed-off-by: Yu Kuai <yukuai@fnnas.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Vladimir Zapolskiy <vladimir.zapolskiy@linaro.org>
Date: Thu Feb 26 15:37:34 2026 +0200
media: i2c: og01a1b: Fix V4L2 subdevice data initialization on probe
[ Upstream commit 535b7f106991c7d8f0e5b8e1769bfb8b1ce9d3d6 ]
It's necessary to finalize the camera sensor subdevice initialization on
driver probe and clean V4L2 subdevice data up on error paths and driver
removal.
The change fixes a previously reported by v4l2-compliance issue of
the failed VIDIOC_(UN)SUBSCRIBE_EVENT/DQEVENT test:
fail: v4l2-test-controls.cpp(1104): subscribe event for control 'User Controls' failed
Fixes: 472377febf84 ("media: Add a driver for the og01a1b camera sensor")
Signed-off-by: Vladimir Zapolskiy <vladimir.zapolskiy@linaro.org>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Date: Wed Aug 13 00:45:28 2025 +0300
media: i2c: og01a1b: Replace client->dev usage
[ Upstream commit 4d58f671944a16314f03d3c0c40ee69058ca02c9 ]
The driver needs to access the struct device in many places, and
retrieves it from the i2c_client itself retrieved with
v4l2_get_subdevdata(). Store it as a pointer in struct og01a1b and
access it from there instead, to simplify the driver.
While at it, fix a mistake in the sort order of include statements.
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Reviewed-by: Mehdi Djait <mehdi.djait@linux.intel.com>
Signed-off-by: Hans Verkuil <hverkuil+cisco@kernel.org>
Stable-dep-of: 535b7f106991 ("media: i2c: og01a1b: Fix V4L2 subdevice data initialization on probe")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Mikko Perttunen <mperttunen@nvidia.com>
Date: Mon Jan 26 15:50:42 2026 +0900
memory: tegra124-emc: Fix dll_change check
[ Upstream commit 9597ab9a8296ab337e6820f8a717ff621078b632 ]
The code checking whether the specified memory timing enables DLL
in the EMRS register was reversed. DLL is enabled if bit A0 is low.
Fix the check.
Fixes: 73a7f0a90641 ("memory: tegra: Add EMC (external memory controller) driver")
Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Link: https://patch.msgid.link/20260126-fix-emc-dllchange-v1-1-47ad3bb63262@nvidia.com
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Mikko Perttunen <mperttunen@nvidia.com>
Date: Mon Jan 26 15:50:43 2026 +0900
memory: tegra30-emc: Fix dll_change check
[ Upstream commit 0a93f2355cf4922ad2399dbef5ea1049fef116d4 ]
The code checking whether the specified memory timing enables DLL
in the EMRS register was reversed. DLL is enabled if bit A0 is low.
Fix the check.
Fixes: e34212c75a68 ("memory: tegra: Introduce Tegra30 EMC driver")
Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Link: https://patch.msgid.link/20260126-fix-emc-dllchange-v1-2-47ad3bb63262@nvidia.com
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Abdun Nihaal <nihaal@cse.iitm.ac.in>
Date: Tue Jan 20 15:56:20 2026 +0530
mfd: mc13xxx-core: Fix memory leak in mc13xxx_add_subdevice_pdata()
[ Upstream commit a5a65a7fb2f7796bbe492cd6be59c92cb64377d1 ]
The memory allocated for cell.name using kmemdup() is not freed when
mfd_add_devices() fails. Fix that by using devm_kmemdup().
Fixes: 8e00593557c3 ("mfd: Add mc13892 support to mc13xxx")
Signed-off-by: Abdun Nihaal <nihaal@cse.iitm.ac.in>
Link: https://patch.msgid.link/20260120102622.66921-1-nihaal@cse.iitm.ac.in
Signed-off-by: Lee Jones <lee@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Petr Pavlu <petr.pavlu@suse.com>
Date: Fri Mar 13 14:48:02 2026 +0100
module: Fix freeing of charp module parameters when CONFIG_SYSFS=n
[ Upstream commit deffe1edba626d474fef38007c03646ca5876a0e ]
When setting a charp module parameter, the param_set_charp() function
allocates memory to store a copy of the input value. Later, when the module
is potentially unloaded, the destroy_params() function is called to free
this allocated memory.
However, destroy_params() is available only when CONFIG_SYSFS=y, otherwise
only a dummy variant is present. In the unlikely case that the kernel is
configured with CONFIG_MODULES=y and CONFIG_SYSFS=n, this results in
a memory leak of charp values when a module is unloaded.
Fix this issue by making destroy_params() always available when
CONFIG_MODULES=y. Rename the function to module_destroy_params() to clarify
that it is intended for use by the module loader.
Fixes: e180a6b7759a ("param: fix charp parameters set via sysfs")
Signed-off-by: Petr Pavlu <petr.pavlu@suse.com>
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Paolo Abeni <pabeni@redhat.com>
Date: Mon May 18 07:48:21 2026 -0400
mptcp: drop __mptcp_fastopen_gen_msk_ackseq()
[ Upstream commit f03afb3aeb9d81f6c5ab728a61a040012923e3b3 ]
When we will move the whole RX path under the msk socket lock, updating
the already queued skb for passive fastopen socket at 3rd ack time will
be extremely painful and race prone
The map_seq for already enqueued skbs is used only to allow correct
coalescing with later data; preventing collapsing to the first skb of
a fastopen connect we can completely remove the
__mptcp_fastopen_gen_msk_ackseq() helper.
Before dropping this helper, a new item had to be added to the
mptcp_skb_cb structure. Because this item will be frequently tested in
the fast path -- almost on every packet -- and because there is free
space there, a single byte is used instead of a bitfield. This micro
optimisation slightly reduces the number of CPU operations to do the
associated check.
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250218-net-next-mptcp-rx-path-refactor-v1-2-4a47d90d7998@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Stable-dep-of: 6254a16d6f0c ("mptcp: fix rx timestamp corruption on fastopen")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Paolo Abeni <pabeni@redhat.com>
Date: Mon May 18 07:48:22 2026 -0400
mptcp: fix rx timestamp corruption on fastopen
[ Upstream commit 6254a16d6f0c672e3809ca5d7c9a28a55d71f764 ]
The skb cb offset containing the timestamp presence flag is cleared
before loading such information. Cache such value before MPTCP CB
initialization.
Fixes: 36b122baf6a8 ("mptcp: add subflow_v(4,6)_send_synack()")
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20260501-net-mptcp-misc-fixes-7-1-rc3-v1-3-b70118df778e@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Date: Mon May 18 09:46:19 2026 -0400
mptcp: pm: ADD_ADDR rtx: fix potential data-race
[ Upstream commit 5cd6e0ad79d2615264f63929f8b457ad97ae550d ]
This mptcp_pm_add_timer() helper is executed as a timer callback in
softirq context. To avoid any data races, the socket lock needs to be
held with bh_lock_sock().
If the socket is in use, retry again soon after, similar to what is done
with the keepalive timer.
Fixes: 00cfd77b9063 ("mptcp: retransmit ADD_ADDR when timeout")
Cc: stable@vger.kernel.org
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20260505-net-mptcp-pm-fixes-7-1-rc3-v1-3-fca8091060a4@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
[ applied hunk to `net/mptcp/pm_netlink.c` instead of `net/mptcp/pm.c` ]
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Date: Mon May 18 21:23:11 2026 -0400
mptcp: pm: ADD_ADDR rtx: resched blocked ADD_ADDR quicker
[ Upstream commit 3cf12492891c4b5ff54dda404a2de4ec54c9e1b5 ]
When an ADD_ADDR needs to be retransmitted and another one has already
been prepared -- e.g. multiple ADD_ADDRs have been sent in a row and
need to be retransmitted later -- this additional retransmission will
need to wait.
In this case, the timer was reset to TCP_RTO_MAX / 8, which is ~15
seconds. This delay is unnecessary long: it should just be rescheduled
at the next opportunity, e.g. after the retransmission timeout.
Without this modification, some issues can be seen from time to time in
the selftests when multiple ADD_ADDRs are sent, and the host takes time
to process them, e.g. the "signal addresses, ADD_ADDR timeout" MPTCP
Join selftest, especially with a debug kernel config.
Note that on older kernels, 'timeout' is not available. It should be
enough to replace it by one second (HZ).
Fixes: 00cfd77b9063 ("mptcp: retransmit ADD_ADDR when timeout")
Cc: stable@vger.kernel.org
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20260505-net-mptcp-pm-fixes-7-1-rc3-v1-6-fca8091060a4@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
[ replaced `TCP_RTO_MAX / 8` with `HZ` ]
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Date: Mon May 18 09:05:24 2026 -0400
mptcp: pm: kernel: correctly retransmit ADD_ADDR ID 0
[ Upstream commit b12014d2d36eaed4e4bec5f1ac7e91110eeb100d ]
When adding the ADD_ADDR to the list, the address including the IP, port
and ID are copied. On the other hand, when the endpoint corresponds to
the one from the initial subflow, the ID is set to 0, as specified by
the MPTCP protocol.
The issue is that the ID was reset after having copied the ID in the
ADD_ADDR entry. So the retransmission was done, but using a different ID
than the initial one.
Fixes: 8b8ed1b429f8 ("mptcp: pm: reuse ID 0 after delete and re-add")
Cc: stable@vger.kernel.org
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20260505-net-mptcp-pm-fixes-7-1-rc3-v1-1-fca8091060a4@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
[ applied to net/mptcp/pm_netlink.c instead of upstream's pm_kernel.c ]
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Date: Mon May 18 07:48:40 2026 -0400
mptcp: pm: prio: skip closed subflows
[ Upstream commit 166b78344031bf7ac9f55cb5282776cfd85f220e ]
When sending an MP_PRIO, closed subflows need to be skipped.
This fixes the case where the initial subflow got closed, re-opened
later, then an MP_PRIO is needed for the same local address.
Note that explicit MP_PRIO cannot be sent during the 3WHS, so it is fine
to use __mptcp_subflow_active().
Fixes: 067065422fcd ("mptcp: add the outgoing MP_PRIO support")
Cc: stable@vger.kernel.org
Fixes: b29fcfb54cd7 ("mptcp: full disconnect implementation")
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20260505-net-mptcp-pm-fixes-7-1-rc3-v1-9-fca8091060a4@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
[ applied to renamed function `mptcp_pm_nl_mp_prio_send_ack()` in `pm_netlink.c` ]
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Cosmin Tanislav <cosmin-gabriel.tanislav.xa@renesas.com>
Date: Wed Mar 11 17:39:57 2026 +0200
mtd: parsers: ofpart: call of_node_get() for dedicated subpartitions
[ Upstream commit e882626c1747653f1f01ea9d12e278e613b11d0f ]
In order to parse sub-partitions, add_mtd_partitions() calls
parse_mtd_partitions() for all previously found partitions.
Each partition will end up being passed to parse_fixed_partitions(), and
its of_node will be treated as the ofpart_node.
Commit 7cce81df7d26 ("mtd: parsers: ofpart: fix OF node refcount leak in
parse_fixed_partitions()") added of_node_put() calls for ofpart_node on
all exit paths.
In the case where the partition passed to parse_fixed_partitions() has a
parent, it is treated as a dedicated partitions node, and of_node_put()
is wrongly called for it, even if of_node_get() was not called
explicitly.
On repeated bind / unbinds of the MTD, the extra of_node_put() ends up
decrementing the refcount down to 0, which should never happen,
resulting in the following error:
OF: ERROR: of_node_release() detected bad of_node_put() on
/soc/spi@80007000/flash@0/partitions/partition@0
Call of_node_get() to balance the call to of_node_put() done for
dedicated partitions nodes.
Fixes: 7cce81df7d26 ("mtd: parsers: ofpart: fix OF node refcount leak in parse_fixed_partitions()")
Signed-off-by: Cosmin Tanislav <cosmin-gabriel.tanislav.xa@renesas.com>
Tested-by: Tommaso Merciai <tommaso.merciai.xr@bp.renesas.com>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Cosmin Tanislav <cosmin-gabriel.tanislav.xa@renesas.com>
Date: Wed Mar 11 17:39:56 2026 +0200
mtd: parsers: ofpart: call of_node_put() only in ofpart_fail path
[ Upstream commit 0c87dea1aab86116211cb37387c404c9e9231c39 ]
ofpart_none can only be reached after the for_each_child_of_node() loop
finishes. for_each_child_of_node() correctly calls of_node_put() for all
device nodes it iterates over as long as we don't break or jump out of
the loop.
Calling of_node_put() inside the ofpart_none path will wrongly decrement
the ref count of the last node in the for_each_child_of_node() loop.
Move the call to of_node_put() under the ofpart_fail label to fix this.
Fixes: ebd5a74db74e ("mtd: ofpart: Check availability of reg property instead of name property")
Signed-off-by: Cosmin Tanislav <cosmin-gabriel.tanislav.xa@renesas.com>
Tested-by: Tommaso Merciai <tommaso.merciai.xr@bp.renesas.com>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Chen Ni <nichen@iscas.ac.cn>
Date: Fri Feb 27 09:43:36 2026 +0800
mtd: physmap_of_gemini: Fix disabled pinctrl state check
[ Upstream commit b7c0982184b0661f5b1b805f3a56f1bd3757b63e ]
The condition for checking the disabled pinctrl state incorrectly checks
gf->enabled_state instead of gf->disabled_state. This causes misleading
error messages and could lead to incorrect behavior when only one of the
pinctrl states is defined.
Fix the condition to properly check gf->disabled_state.
Fixes: 9d3b5086f6d4 ("mtd: physmap_of_gemini: Handle pin control")
Signed-off-by: Chen Ni <nichen@iscas.ac.cn>
Reviewed-by: Linus Walleij <linusw@kernel.org>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Richard Genoud <richard.genoud@bootlin.com>
Date: Tue Mar 17 15:24:30 2026 +0100
mtd: rawnand: sunxi: fix sunxi_nfc_hw_ecc_read_extra_oob
[ Upstream commit 848c13996c55fe4ea6bf5acc3ce6c8c5c944b5f6 ]
When dumping the OOB, the bytes at the end where actually copied from
the beginning of the OOB instead of current_offset.
That leads to something like:
OOB: ff ff ff ff ff ff ff ff ea 19 00 3a 83 db aa 8d
OOB: 99 09 c8 9a 90 36 35 7d aa 15 13 07 3d 97 b2 a4
OOB: a8 bb 19 b3 07 e9 f6 25 52 d7 1a 23 e2 7e 0a e4
OOB: 52 8a 09 d2 1a 86 3d cf b4 99 43 13 d3 90 33 0b
OOB: ff ff ff ff ff ff ff ff ea 19 00 3a 83 db aa 8d
OOB: 99 09 c8 9a 90 36 35 7d aa 15 13 07 3d 97 b2 a4
OOB: a8 bb 19 b3 07 e9 f6 25 52 d7 1a 23 e2 7e 0a e4
OOB: 52 8a 09 d2 1a 86 3d cf b4 99 43 13 d3 90 33 0b
instead of:
OOB: ff ff ff ff ff ff ff ff ea 19 00 3a 83 db aa 8d
OOB: 99 09 c8 9a 90 36 35 7d aa 15 13 07 3d 97 b2 a4
OOB: a8 bb 19 b3 07 e9 f6 25 52 d7 1a 23 e2 7e 0a e4
OOB: 52 8a 09 d2 1a 86 3d cf b4 99 43 13 d3 90 33 0b
OOB: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
OOB: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
OOB: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
OOB: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
(example with BCH16, user data [8,0], no scrambling)
*cur_off (offset from the beginning of the page) was compared to offset
(offset from the beginning of the OOB), and then, the
nand_change_read_column_op() sets the current position to the beginning
of the OOB instead of OOB+offset
Fixes: 15d6f118285f ("mtd: rawnand: sunxi: Stop supporting ECC_HW_SYNDROME mode")
Reviewed-by: Jernej Skrabec <jernej.skrabec@gmail.com>
Signed-off-by: Richard Genoud <richard.genoud@bootlin.com>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Haibo Chen <haibo.chen@nxp.com>
Date: Mon Dec 8 17:14:14 2025 +0800
mtd: spi-nor: core: correct the op.dummy.nbytes when check read operations
[ Upstream commit 756564a536ecd8c9d33edd89f0647a91a0b03587 ]
When check read operation, need to setting the op.dummy.nbytes based
on current read operation rather than the nor->read_proto.
Fixes: 0e30f47232ab ("mtd: spi-nor: add support for DTR protocol")
Signed-off-by: Haibo Chen <haibo.chen@nxp.com>
Reviewed-by: Pratyush Yadav <pratyush@kernel.org>
Signed-off-by: Pratyush Yadav (Google) <pratyush@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Takahiro Kuwano <Takahiro.Kuwano@infineon.com>
Date: Wed Nov 5 16:47:59 2025 +0900
mtd: spi-nor: sfdp: introduce smpt_map_id fixup hook
[ Upstream commit f74de390557bf2bcc5dca4a357b41c0701d3f76e ]
Certain chips have inconsistent Sector Map Parameter Table (SMPT) data,
which leads to the wrong map ID being identified, causing failures to
detect the correct sector map.
To fix this, introduce smpt_map_id() into the struct spi_nor_fixups.
This function will be called after the initial SMPT-based detection,
allowing chip-specific logic to correct the map ID.
Infineon S25FS512S needs this fixup as it has inconsistency between map
ID definition and configuration register value actually obtained.
Co-developed-by: Marek Vasut <marek.vasut+renesas@mailbox.org>
Signed-off-by: Marek Vasut <marek.vasut+renesas@mailbox.org>
Reviewed-by: Tudor Ambarus <tudor.ambarus@linaro.org>
Tested-by: Marek Vasut <marek.vasut+renesas@mailbox.org> # S25FS512S
Signed-off-by: Takahiro Kuwano <Takahiro.Kuwano@infineon.com>
Reviewed-by: Tudor Ambarus <tudor.ambarus@linaro.org>>
Signed-off-by: Pratyush Yadav <pratyush@kernel.org>
Stable-dep-of: 3620d67b4849 ("mtd: spi-nor: update spi_nor_fixups::post_sfdp() documentation")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Takahiro Kuwano <Takahiro.Kuwano@infineon.com>
Date: Wed Nov 5 16:47:58 2025 +0900
mtd: spi-nor: sfdp: introduce smpt_read_dummy fixup hook
[ Upstream commit 653f6def567c81f37302f9591ffd54df3e2a11eb ]
SMPT contains config detection info that describes opcode, address, and
dummy cycles to read sector map config. The dummy cycles parameter can
be SMPT_CMD_READ_DUMMY_IS_VARIABLE and in that case nor->read_dummy
(initialized as 0) is used. In Infineon flash chips, Read Any Register
command with variable dummy cycle is defined in SMPT. S25Hx/S28Hx flash
has 0 dummy cycle by default to read volatile regiters and
nor->read_dummy can work. S25FS-S flash has 8 dummy cycles so we need a
hook that can fix dummy cycles with actually used value.
Inroduce smpt_read_dummy() in struct spi_nor_fixups. It is called when
the dummy cycle field in SMPT config detection is 'varialble'.
Reviewed-by: Tudor Ambarus <tudor.ambarus@linaro.org>
Tested-by: Marek Vasut <marek.vasut+renesas@mailbox.org> # S25FS512S
Signed-off-by: Takahiro Kuwano <Takahiro.Kuwano@infineon.com>
Signed-off-by: Pratyush Yadav <pratyush@kernel.org>
Stable-dep-of: 3620d67b4849 ("mtd: spi-nor: update spi_nor_fixups::post_sfdp() documentation")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Shiji Yang <yangshiji66@outlook.com>
Date: Wed Jan 28 20:42:56 2026 +0800
mtd: spi-nor: swp: check SR_TB flag when getting tb_mask
[ Upstream commit 94645aa41bf9ecb87c2ce78b1c3405bfb6074a37 ]
When the chip does not support top/bottom block protect, the tb_mask
must be set to 0, otherwise SR1 bit5 will be unexpectedly modified.
Signed-off-by: Shiji Yang <yangshiji66@outlook.com>
Fixes: 3dd8012a8eeb ("mtd: spi-nor: add TB (Top/Bottom) protect support")
Reviewed-by: Michael Walle <mwalle@kernel.org>
Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com>
Signed-off-by: Pratyush Yadav (Google) <pratyush@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jonas Gorski <jonas.gorski@gmail.com>
Date: Thu Dec 18 10:54:30 2025 +0100
mtd: spi-nor: update spi_nor_fixups::post_sfdp() documentation
[ Upstream commit 3620d67b48493c6252bbc873dc88dde81641d56b ]
After commit 5273cc6df984 ("mtd: spi-nor: core: Call
spi_nor_post_sfdp_fixups() only when SFDP is defined")
spi_nor_post_sfdp_fixups() isn't called anymore if no SFDP is detected.
Update the documentation accordingly.
Fixes: 5273cc6df984 ("mtd: spi-nor: core: Call spi_nor_post_sfdp_fixups() only when SFDP is defined")
Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com>
Reviewed-by: Pratyush Yadav <pratyush@kernel.org>
Signed-off-by: Pratyush Yadav (Google) <pratyush@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Florian Westphal <fw@strlen.de>
Date: Fri Apr 24 16:58:38 2026 +0200
neigh: let neigh_xmit take skb ownership
[ Upstream commit 4438113be604ee67a7bf4f81da6e1cca41332ce4 ]
neigh_xmit always releases the skb, except when no neighbour table is
found. But even the first added user of neigh_xmit (mpls) relied on
neigh_xmit to release the skb (or queue it for tx).
sashiko reported:
If neigh_xmit() is called with an uninitialized neighbor table (for
example, NEIGH_ND_TABLE when IPv6 is disabled), it returns -EAFNOSUPPORT
and bypasses its internal out_kfree_skb error path. Because the return
value of neigh_xmit() is ignored here, does this leak the SKB?
Assume full ownership and remove the last code path that doesn't
xmit or free skb.
Fixes: 4fd3d7d9e868 ("neigh: Add helper function neigh_xmit")
Signed-off-by: Florian Westphal <fw@strlen.de>
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/20260424145843.74055-1-fw@strlen.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jiayuan Chen <jiayuan.chen@linux.dev>
Date: Sat Apr 11 08:55:19 2026 +0800
net, bpf: fix null-ptr-deref in xdp_master_redirect() for down master
[ Upstream commit 1921f91298d1388a0bb9db8f83800c998b649cb3 ]
syzkaller reported a kernel panic in bond_rr_gen_slave_id() reached via
xdp_master_redirect(). Full decoded trace:
https://syzkaller.appspot.com/bug?extid=80e046b8da2820b6ba73
bond_rr_gen_slave_id() dereferences bond->rr_tx_counter, a per-CPU
counter that bonding only allocates in bond_open() when the mode is
round-robin. If the bond device was never brought up, rr_tx_counter
stays NULL.
The XDP redirect path can still reach that code on a bond that was
never opened: bpf_master_redirect_enabled_key is a global static key,
so as soon as any bond device has native XDP attached, the
XDP_TX -> xdp_master_redirect() interception is enabled for every
slave system-wide. The path xdp_master_redirect() ->
bond_xdp_get_xmit_slave() -> bond_xdp_xmit_roundrobin_slave_get() ->
bond_rr_gen_slave_id() then runs against a bond that has no
rr_tx_counter and crashes.
Fix this in the generic xdp_master_redirect() by refusing to call into
the master's ->ndo_xdp_get_xmit_slave() when the master device is not
up. IFF_UP is only set after ->ndo_open() has successfully returned,
so this reliably excludes masters whose XDP state has not been fully
initialized. Drop the frame with XDP_ABORTED so the exception is
visible via trace_xdp_exception() rather than silently falling through.
This is not specific to bonding: any current or future master that
defers XDP state allocation to ->ndo_open() is protected.
Fixes: 879af96ffd72 ("net, core: Add support for XDP redirection to slave device")
Reported-by: syzbot+80e046b8da2820b6ba73@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/698f84c6.a70a0220.2c38d7.00cc.GAE@google.com/T/
Suggested-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
Link: https://patch.msgid.link/20260411005524.201200-2-jiayuan.chen@linux.dev
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Uday Shankar <ushankar@purestorage.com>
Date: Wed Mar 12 13:51:46 2025 -0600
net, treewide: define and use MAC_ADDR_STR_LEN
[ Upstream commit 6d6c1ba7824022528dbe3e283fafbd0775424128 ]
There are a few places in the tree which compute the length of the
string representation of a MAC address as 3 * ETH_ALEN - 1. Define a
constant for this and use it where relevant. No functionality changes
are expected.
Signed-off-by: Uday Shankar <ushankar@purestorage.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Reviewed-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Simon Horman <horms@verge.net.au>
Link: https://patch.msgid.link/20250312-netconsole-v6-1-3437933e79b8@purestorage.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Stable-dep-of: 3bc179bc7146 ("netpoll: fix IPv6 local-address corruption")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Prathamesh Deshpande <prathameshdeshpande7@gmail.com>
Date: Wed Apr 15 01:49:37 2026 +0100
net/mlx5: Fix HCA caps leak on notifier init failure
[ Upstream commit d03fc81a57956248383efec99967d0ae627390a8 ]
mlx5_mdev_init() allocates HCA caps via mlx5_hca_caps_alloc() before
calling mlx5_notifiers_init(). If notifier initialization fails, the
error path jumps to err_hca_caps and skips mlx5_hca_caps_free(), leaking
allocated caps.
Add a dedicated unwind label for notifier-init failure that frees HCA
caps before continuing the existing cleanup sequence.
Fixes: b6b03097f982 ("net/mlx5: Initialize events outside devlink lock")
Signed-off-by: Prathamesh Deshpande <prathameshdeshpande7@gmail.com>
Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20260415005022.34764-1-prathameshdeshpande7@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Gal Pressman <gal@nvidia.com>
Date: Thu Apr 9 23:28:51 2026 +0300
net/mlx5e: Fix features not applied during netdev registration
[ Upstream commit 9994ad4df82d64e57135c0f0906897685f5a9e87 ]
mlx5e_fix_features() returns early when the netdevice is not present.
This is correct during profile transitions where priv is cleared, but it
also incorrectly blocks feature fixups during register_netdev(), when
the device is also not yet present.
It is not trivial to distinguish between both cases as we cannot use
priv to carry state, and in both cases reg_state == NETREG_REGISTERED.
Force a netdev features update after register_netdev() completes, where
the device is present and fix_features() can actually work.
This is not a pretty solution, as it results in an additional features
update call (register_netdevice() already calls
__netdev_update_features() internally), but it is the simplest,
cleanest, and most robust way I found to fix this issue after multiple
attempts.
This fixes an issue on systems where CQE compression is enabled by
default, RXHASH remains enabled after registration despite the two
features being mutually exclusive.
Fixes: ab4b01bfdaa6 ("net/mlx5e: Verify dev is present for fix features ndo")
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20260409202852.158059-2-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Gal Pressman <gal@nvidia.com>
Date: Thu Apr 9 23:28:52 2026 +0300
net/mlx5e: IPsec, fix ASO poll timeout with read_poll_timeout_atomic()
[ Upstream commit edccdd1eb94712da97a6ce71123ec27890add754 ]
The do-while poll loop uses jiffies for its timeout:
expires = jiffies + msecs_to_jiffies(10);
jiffies is sampled at an arbitrary point within the current tick, so the
first partial tick contributes anywhere from a full tick down to nearly
zero real time. For small msecs_to_jiffies() results this is
significant, the effective poll window can be much shorter than the
requested 10ms, and in the worst case the loop exits after a single
iteration (e.g., when HZ=100), well before the device has delivered the
CQE.
Replace the loop with read_poll_timeout_atomic(), which counts elapsed
time via udelay() accounting rather than jiffies, guaranteeing the full
poll window regardless of HZ.
Additionally, read_poll_timeout_atomic() executes the poll operation one
more time after the timeout has expired, giving the CQE a final chance
to be detected. The old do-while loop could exit without a final poll if
the timeout expired during the udelay() between iterations.
Fixes: 76e463f6508b ("net/mlx5e: Overcome slow response for first IPsec ASO WQE")
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Jianbo Liu <jianbol@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20260409202852.158059-3-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Håkon Bugge <haakon.bugge@oracle.com>
Date: Wed Apr 8 01:04:19 2026 -0700
net/rds: Optimize rds_ib_laddr_check
[ Upstream commit 236f718ac885965fa886440b9898dfae185c9733 ]
rds_ib_laddr_check() creates a CM_ID and attempts to bind the address
in question to it. This in order to qualify the allegedly local
address as a usable IB/RoCE address.
In the field, ExaWatcher runs rds-ping to all ports in the fabric from
all local ports. This using all active ToS'es. In a full rack system,
we have 14 cell servers and eight db servers. Typically, 6 ToS'es are
used. This implies 528 rds-ping invocations per ExaWatcher's "RDSinfo"
interval.
Adding to this, each rds-ping invocation creates eight sockets and
binds the local address to them:
socket(AF_RDS, SOCK_SEQPACKET, 0) = 3
bind(3, {sa_family=AF_INET, sin_port=htons(0),
sin_addr=inet_addr("192.168.36.2")}, 16) = 0
socket(AF_RDS, SOCK_SEQPACKET, 0) = 4
bind(4, {sa_family=AF_INET, sin_port=htons(0),
sin_addr=inet_addr("192.168.36.2")}, 16) = 0
socket(AF_RDS, SOCK_SEQPACKET, 0) = 5
bind(5, {sa_family=AF_INET, sin_port=htons(0),
sin_addr=inet_addr("192.168.36.2")}, 16) = 0
socket(AF_RDS, SOCK_SEQPACKET, 0) = 6
bind(6, {sa_family=AF_INET, sin_port=htons(0),
sin_addr=inet_addr("192.168.36.2")}, 16) = 0
socket(AF_RDS, SOCK_SEQPACKET, 0) = 7
bind(7, {sa_family=AF_INET, sin_port=htons(0),
sin_addr=inet_addr("192.168.36.2")}, 16) = 0
socket(AF_RDS, SOCK_SEQPACKET, 0) = 8
bind(8, {sa_family=AF_INET, sin_port=htons(0),
sin_addr=inet_addr("192.168.36.2")}, 16) = 0
socket(AF_RDS, SOCK_SEQPACKET, 0) = 9
bind(9, {sa_family=AF_INET, sin_port=htons(0),
sin_addr=inet_addr("192.168.36.2")}, 16) = 0
socket(AF_RDS, SOCK_SEQPACKET, 0) = 10
bind(10, {sa_family=AF_INET, sin_port=htons(0),
sin_addr=inet_addr("192.168.36.2")}, 16) = 0
So, at every interval ExaWatcher executes rds-ping's, 4224 CM_IDs are
allocated, considering this full-rack system. After the a CM_ID has
been allocated, rdma_bind_addr() is called, with the port number being
zero. This implies that the CMA will attempt to search for an un-used
ephemeral port. Simplified, the algorithm is to start at a random
position in the available port space, and then if needed, iterate
until an un-used port is found.
The book-keeping of used ports uses the idr system, which again uses
slab to allocate new struct idr_layer's. The size is 2092 bytes and
slab tries to reduce the wasted space. Hence, it chooses an order:3
allocation, for which 15 idr_layer structs will fit and only 1388
bytes are wasted per the 32KiB order:3 chunk.
Although this order:3 allocation seems like a good space/speed
trade-off, it does not resonate well with how it used by the CMA. The
combination of the randomized starting point in the port space (which
has close to zero spatial locality) and the close proximity in time of
the 4224 invocations of the rds-ping's, creates a memory hog for
order:3 allocations.
These costly allocations may need reclaims and/or compaction. At
worst, they may fail and produce a stack trace such as (from uek4):
[<ffffffff811a72d5>] __inc_zone_page_state+0x35/0x40
[<ffffffff811c2e97>] page_add_file_rmap+0x57/0x60
[<ffffffffa37ca1df>] remove_migration_pte+0x3f/0x3c0 [ksplice_6cn872bt_vmlinux_new]
[<ffffffff811c3de8>] rmap_walk+0xd8/0x340
[<ffffffff811e8860>] remove_migration_ptes+0x40/0x50
[<ffffffff811ea83c>] migrate_pages+0x3ec/0x890
[<ffffffff811afa0d>] compact_zone+0x32d/0x9a0
[<ffffffff811b00ed>] compact_zone_order+0x6d/0x90
[<ffffffff811b03b2>] try_to_compact_pages+0x102/0x270
[<ffffffff81190e56>] __alloc_pages_direct_compact+0x46/0x100
[<ffffffff8119165b>] __alloc_pages_nodemask+0x74b/0xaa0
[<ffffffff811d8411>] alloc_pages_current+0x91/0x110
[<ffffffff811e3b0b>] new_slab+0x38b/0x480
[<ffffffffa41323c7>] __slab_alloc+0x3b7/0x4a0 [ksplice_s0dk66a8_vmlinux_new]
[<ffffffff811e42ab>] kmem_cache_alloc+0x1fb/0x250
[<ffffffff8131fdd6>] idr_layer_alloc+0x36/0x90
[<ffffffff8132029c>] idr_get_empty_slot+0x28c/0x3d0
[<ffffffff813204ad>] idr_alloc+0x4d/0xf0
[<ffffffffa051727d>] cma_alloc_port+0x4d/0xa0 [rdma_cm]
[<ffffffffa0517cbe>] rdma_bind_addr+0x2ae/0x5b0 [rdma_cm]
[<ffffffffa09d8083>] rds_ib_laddr_check+0x83/0x2c0 [ksplice_6l2xst5i_rds_rdma_new]
[<ffffffffa05f892b>] rds_trans_get_preferred+0x5b/0xa0 [rds]
[<ffffffffa05f09f2>] rds_bind+0x212/0x280 [rds]
[<ffffffff815b4016>] SYSC_bind+0xe6/0x120
[<ffffffff815b4d3e>] SyS_bind+0xe/0x10
[<ffffffff816b031a>] system_call_fastpath+0x18/0xd4
To avoid these excessive calls to rdma_bind_addr(), we optimize
rds_ib_laddr_check() by simply checking if the address in question has
been used before. The rds_rdma module keeps track of addresses
associated with IB devices, and the function rds_ib_get_device() is
used to determine if the address already has been qualified as a valid
local address. If not found, we call the legacy rds_ib_laddr_check(),
now renamed to rds_ib_laddr_check_cm().
Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com>
Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com>
Signed-off-by: Gerd Rausch <gerd.rausch@oracle.com>
Signed-off-by: Allison Henderson <achender@kernel.org>
Link: https://patch.msgid.link/20260408080420.540032-2-achender@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Stable-dep-of: ebf71dd4aff4 ("net/rds: Restrict use of RDS/IB to the initial network namespace")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Allison Henderson <achender@kernel.org>
Date: Tue May 5 16:43:36 2026 -0700
net/rds: reset op_nents when zerocopy page pin fails
commit e174929793195e0cd6a4adb0cad731b39f9019b4 upstream.
When iov_iter_get_pages2() fails in rds_message_zcopy_from_user(),
the pinned pages are released with put_page(), and
rm->data.op_mmp_znotifier is cleared. But we fail to properly
clear rm->data.op_nents.
Later when rds_message_purge() is called from rds_sendmsg() the
cleanup loop iterates over the incorrectly non zero number of
op_nents and frees them again.
Fix this by properly resetting op_nents when it should be in
rds_message_zcopy_from_user().
Fixes: 0cebaccef3ac ("rds: zerocopy Tx support.")
Signed-off-by: Allison Henderson <achender@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20260505234336.2132721-1-achender@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Greg Jumper <greg.jumper@oracle.com>
Date: Wed Apr 8 01:04:20 2026 -0700
net/rds: Restrict use of RDS/IB to the initial network namespace
[ Upstream commit ebf71dd4aff46e8e421d455db3e231ba43d2fa8a ]
Prevent using RDS/IB in network namespaces other than the initial one.
The existing RDS/IB code will not work properly in non-initial network
namespaces.
Fixes: d5a8ac28a7ff ("RDS-TCP: Make RDS-TCP work correctly when it is set up in a netns other than init_net")
Reported-by: syzbot+da8e060735ae02c8f3d1@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=da8e060735ae02c8f3d1
Signed-off-by: Greg Jumper <greg.jumper@oracle.com>
Signed-off-by: Allison Henderson <achender@kernel.org>
Link: https://patch.msgid.link/20260408080420.540032-3-achender@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Michael Bommarito <michael.bommarito@gmail.com>
Date: Sat Apr 18 10:10:47 2026 -0400
net/rds: zero per-item info buffer before handing it to visitors
[ Upstream commit c88eb7e8d8397a8c1db59c425332c5a30b2a1682 ]
rds_for_each_conn_info() and rds_walk_conn_path_info() both hand a
caller-allocated on-stack u64 buffer to a per-connection visitor and
then copy the full item_len bytes back to user space via
rds_info_copy() regardless of how much of the buffer the visitor
actually wrote.
rds_ib_conn_info_visitor() and rds6_ib_conn_info_visitor() only
write a subset of their output struct when the underlying
rds_connection is not in state RDS_CONN_UP (src/dst addr, tos, sl
and the two GIDs via explicit memsets). Several u32 fields
(max_send_wr, max_recv_wr, max_send_sge, rdma_mr_max, rdma_mr_size,
cache_allocs) and the 2-byte alignment hole between sl and
cache_allocs remain as whatever stack contents preceded the visitor
call and are then memcpy_to_user()'d out to user space.
struct rds_info_rdma_connection and struct rds6_info_rdma_connection
are the only rds_info_* structs in include/uapi/linux/rds.h that are
not marked __attribute__((packed)), so they have a real alignment
hole. The other info visitors (rds_conn_info_visitor,
rds6_conn_info_visitor, rds_tcp_tc_info, ...) write all fields of
their packed output struct today and are not known to be vulnerable,
but a future visitor that adds a conditional write-path would have
the same bug.
Reproduction on a kernel built without CONFIG_INIT_STACK_ALL_ZERO=y:
a local unprivileged user opens AF_RDS, sets SO_RDS_TRANSPORT=IB,
binds to a local address on an RDMA-capable netdev (rxe soft-RoCE on
any netdev is sufficient), sendto()'s any peer on the same subnet
(fails cleanly but installs an rds_connection in the global hash in
RDS_CONN_CONNECTING), then calls getsockopt(SOL_RDS,
RDS_INFO_IB_CONNECTIONS). The returned 68-byte item contains 26
bytes of stack garbage including kernel text/data pointers:
0..7 0a 63 00 01 0a 63 00 02 src=10.99.0.1 dst=10.99.0.2
8..39 00 ... gids (memset-zeroed)
40..47 e0 92 a3 81 ff ff ff ff kernel pointer (max_send_wr)
48..55 7f 37 b5 81 ff ff ff ff kernel pointer (rdma_mr_max)
56..59 01 00 08 00 rdma_mr_size (garbage)
60..61 00 00 tos, sl
62..63 00 00 alignment padding
64..67 18 00 00 00 cache_allocs (garbage)
Fix by zeroing the per-item buffer in both rds_for_each_conn_info()
and rds_walk_conn_path_info() before invoking the visitor. This
covers the IPv4/IPv6 IB visitors and hardens all current and future
visitors against the same class of bug.
No functional change for visitors that fully populate their output.
Changes in v2:
- retarget at the net tree (subject prefix "[PATCH net v2]",
net/rds: prefix in the title)
- pick up Reviewed-by tags from Sharath Srinivasan and
Allison Henderson
Fixes: ec16227e1414 ("RDS/IB: Infiniband transport")
Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com>
Reviewed-by: Sharath Srinivasan <sharath.srinivasan@oracle.com>
Reviewed-by: Allison Henderson <achender@kernel.org>
Assisted-by: Claude:claude-opus-4-7
Link: https://patch.msgid.link/20260418141047.3398203-1-michael.bommarito@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jamal Hadi Salim <jhs@mojatatu.com>
Date: Fri Apr 10 07:16:27 2026 -0400
net/sched: act_ct: Only release RCU read lock after ct_ft
[ Upstream commit f462dca0c8415bf0058d0ffa476354c4476d0f09 ]
When looking up a flow table in act_ct in tcf_ct_flow_table_get(),
rhashtable_lookup_fast() internally opens and closes an RCU read critical
section before returning ct_ft.
The tcf_ct_flow_table_cleanup_work() can complete before refcount_inc_not_zero()
is invoked on the returned ct_ft resulting in a UAF on the already freed ct_ft
object. This vulnerability can lead to privilege escalation.
Analysis from zdi-disclosures@trendmicro.com:
When initializing act_ct, tcf_ct_init() is called, which internally triggers
tcf_ct_flow_table_get().
static int tcf_ct_flow_table_get(struct net *net, struct tcf_ct_params *params)
{
struct zones_ht_key key = { .net = net, .zone = params->zone };
struct tcf_ct_flow_table *ct_ft;
int err = -ENOMEM;
mutex_lock(&zones_mutex);
ct_ft = rhashtable_lookup_fast(&zones_ht, &key, zones_params); // [1]
if (ct_ft && refcount_inc_not_zero(&ct_ft->ref)) // [2]
goto out_unlock;
...
}
static __always_inline void *rhashtable_lookup_fast(
struct rhashtable *ht, const void *key,
const struct rhashtable_params params)
{
void *obj;
rcu_read_lock();
obj = rhashtable_lookup(ht, key, params);
rcu_read_unlock();
return obj;
}
At [1], rhashtable_lookup_fast() looks up and returns the corresponding ct_ft
from zones_ht . The lookup is performed within an RCU read critical section
through rcu_read_lock() / rcu_read_unlock(), which prevents the object from
being freed. However, at the point of function return, rcu_read_unlock() has
already been called, and there is nothing preventing ct_ft from being freed
before reaching refcount_inc_not_zero(&ct_ft->ref) at [2]. This interval becomes
the race window, during which ct_ft can be freed.
Free Process:
tcf_ct_flow_table_put() is executed through the path tcf_ct_cleanup() call_rcu()
tcf_ct_params_free_rcu() tcf_ct_params_free() tcf_ct_flow_table_put().
static void tcf_ct_flow_table_put(struct tcf_ct_flow_table *ct_ft)
{
if (refcount_dec_and_test(&ct_ft->ref)) {
rhashtable_remove_fast(&zones_ht, &ct_ft->node, zones_params);
INIT_RCU_WORK(&ct_ft->rwork, tcf_ct_flow_table_cleanup_work); // [3]
queue_rcu_work(act_ct_wq, &ct_ft->rwork);
}
}
At [3], tcf_ct_flow_table_cleanup_work() is scheduled as RCU work
static void tcf_ct_flow_table_cleanup_work(struct work_struct *work)
{
struct tcf_ct_flow_table *ct_ft;
struct flow_block *block;
ct_ft = container_of(to_rcu_work(work), struct tcf_ct_flow_table,
rwork);
nf_flow_table_free(&ct_ft->nf_ft);
block = &ct_ft->nf_ft.flow_block;
down_write(&ct_ft->nf_ft.flow_block_lock);
WARN_ON(!list_empty(&block->cb_list));
up_write(&ct_ft->nf_ft.flow_block_lock);
kfree(ct_ft); // [4]
module_put(THIS_MODULE);
}
tcf_ct_flow_table_cleanup_work() frees ct_ft at [4]. When this function executes
between [1] and [2], UAF occurs.
This race condition has a very short race window, making it generally
difficult to trigger. Therefore, to trigger the vulnerability an msleep(100) was
inserted after[1]
Fixes: 138470a9b2cc2 ("net/sched: act_ct: fix lockdep splat in tcf_ct_flow_table_get")
Reported-by: zdi-disclosures@trendmicro.com
Tested-by: Victor Nogueira <victor@mojatatu.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Link: https://patch.msgid.link/20260410111627.46611-1-jhs@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Dudu Lu <phx0fer@gmail.com>
Date: Mon Apr 13 16:49:27 2026 +0800
net/sched: act_mirred: fix wrong device for mac_header_xmit check in tcf_blockcast_redir
[ Upstream commit 4510d140524ca7d6e772db962e013f26f09a63b1 ]
In tcf_blockcast_redir(), when iterating block ports to redirect
packets to multiple devices, the mac_header_xmit flag is queried
from the wrong device. The loop sends to dev_prev but queries
dev_is_mac_header_xmit(dev) — which is the NEXT device in the
iteration, not the one being sent to.
This causes tcf_mirred_to_dev() to make incorrect decisions about
whether to push or pull the MAC header. When the block contains
mixed device types (e.g., an ethernet veth and a tunnel device),
intermediate devices get the wrong mac_header_xmit flag, leading to
skb header corruption. In the worst case, skb_push_rcsum with an
incorrect mac_len can exhaust headroom and panic.
The last device in the loop is handled correctly (line 365-366 uses
dev_is_mac_header_xmit(dev_prev)), confirming this is a copy-paste
oversight for the intermediate devices.
Fix by using dev_prev instead of dev for the mac_header_xmit query,
consistent with the device actually being sent to.
Fixes: 42f39036cda8 ("net/sched: act_mirred: Allow mirred to block")
Signed-off-by: Dudu Lu <phx0fer@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20260413084927.71353-1-phx0fer@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Paolo Abeni <pabeni@redhat.com>
Date: Wed Apr 29 09:39:11 2026 +0200
net/sched: cls_flower: revert unintended changes
[ Upstream commit 1e01abec856593e02cd69fd95b784c10dd46880c ]
While applying the blamed commit 4ca07b9239bd ("net: mctp i2c: check
length before marking flow active"), I unintentionally included
unrelated and unacceptable changes.
Revert them.
Fixes: 4ca07b9239bd ("net: mctp i2c: check length before marking flow active")
Reported-by: Jeremy Kerr <jk@codeconstruct.com.au>
Closes: https://lore.kernel.org/netdev/bd8704fe0bd53e278add5cde4873256656623e2e.camel@codeconstruct.com.au/
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Link: https://patch.msgid.link/043026a53ff84da88b17648c4b0d17f0331749cb.1777447863.git.pabeni@redhat.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Stephen Hemminger <stephen@networkplumber.org>
Date: Fri Apr 17 20:19:44 2026 -0700
net/sched: netem: check for negative latency and jitter
[ Upstream commit 90be9fedb218ee95a1cf59050d1306fbfb0e8b87 ]
Reject requests with negative latency or jitter.
A negative value added to current timestamp (u64) wraps
to an enormous time_to_send, disabling dequeue.
The original UAPI used u32 for these values; the conversion to 64-bit
time values via TCA_NETEM_LATENCY64 and TCA_NETEM_JITTER64
allowed signed values to reach the kernel without validation.
Jitter is already silently clamped by an abs() in netem_change();
that abs() can be removed in a follow-up once this rejection is in
place.
Fixes: 99803171ef04 ("netem: add uapi to express delay and jitter in nanoseconds")
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20260418032027.900913-7-stephen@networkplumber.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Stephen Hemminger <stephen@networkplumber.org>
Date: Fri Apr 17 20:19:39 2026 -0700
net/sched: netem: fix probability gaps in 4-state loss model
[ Upstream commit 732b463449fd0ef90acd13cda68eab1c91adb00c ]
The 4-state Markov chain in loss_4state() has gaps at the boundaries
between transition probability ranges. The comparisons use:
if (rnd < a4)
else if (a4 < rnd && rnd < a1 + a4)
When rnd equals a boundary value exactly, neither branch matches and
no state transition occurs. The redundant lower-bound check (a4 < rnd)
is already implied by being in the else branch.
Remove the unnecessary lower-bound comparisons so the ranges are
contiguous and every random value produces a transition, matching
the GI (General and Intuitive) loss model specification.
This bug goes back to original implementation of this model.
Fixes: 661b79725fea ("netem: revised correlated loss generator")
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20260418032027.900913-2-stephen@networkplumber.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Stephen Hemminger <stephen@networkplumber.org>
Date: Fri Apr 17 20:19:40 2026 -0700
net/sched: netem: fix queue limit check to include reordered packets
[ Upstream commit 4185701fcce6b426b6c3630b25330dddd9c47b0d ]
The queue limit check in netem_enqueue() uses q->t_len which only
counts packets in the internal tfifo. Packets placed in sch->q by
the reorder path (__qdisc_enqueue_head) are not counted, allowing
the total queue occupancy to exceed sch->limit under reordering.
Include sch->q.qlen in the limit check.
Fixes: f8d4bc455047 ("net/sched: netem: account for backlog updates from child qdisc")
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20260418032027.900913-3-stephen@networkplumber.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Stephen Hemminger <stephen@networkplumber.org>
Date: Fri Apr 17 20:19:43 2026 -0700
net/sched: netem: fix slot delay calculation overflow
[ Upstream commit 51e94e1e2fef351c74d69eb53666df808d26af95 ]
get_slot_next() computes a random delay between min_delay and
max_delay using:
get_random_u32() * (max_delay - min_delay) >> 32
This overflows signed 64-bit arithmetic when the delay range exceeds
approximately 2.1 seconds (2^31 nanoseconds), producing a negative
result that effectively disables slot-based pacing. This is a
realistic configuration for WAN emulation (e.g., slot 1s 5s).
Use mul_u64_u32_shr() which handles the widening multiply without
overflow.
Fixes: 0a9fe5c375b5 ("netem: slotting with non-uniform distribution")
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20260418032027.900913-6-stephen@networkplumber.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Stephen Hemminger <stephen@networkplumber.org>
Date: Fri Apr 17 20:19:41 2026 -0700
net/sched: netem: only reseed PRNG when seed is explicitly provided
[ Upstream commit 986afaf809940577224a99c3a08d97a15eb37e93 ]
netem_change() unconditionally reseeds the PRNG on every tc change
command. If TCA_NETEM_PRNG_SEED is not specified, a new random seed
is generated, destroying reproducibility for users who set a
deterministic seed on a previous change.
Move the initial random seed generation to netem_init() and only
reseed in netem_change() when TCA_NETEM_PRNG_SEED is explicitly
provided by the user.
Fixes: 4072d97ddc44 ("netem: add prng attribute to netem_sched_data")
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20260418032027.900913-4-stephen@networkplumber.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Stephen Hemminger <stephen@networkplumber.org>
Date: Fri Apr 17 20:19:42 2026 -0700
net/sched: netem: validate slot configuration
[ Upstream commit 01801c359a74737b9b1aa28568b60374d857241a ]
Reject slot configurations that have no defensible meaning:
- negative min_delay or max_delay
- min_delay greater than max_delay
- negative dist_delay or dist_jitter
- negative max_packets or max_bytes
Negative or out-of-order delays underflow in get_slot_next(),
producing garbage intervals. Negative limits trip the per-slot
accounting (packets_left/bytes_left <= 0) on the first packet of
every slot, defeating the rate-limiting half of the slot feature.
Note that dist_jitter has been silently coerced to its absolute
value by get_slot() since the feature was introduced; rejecting
negatives here converts that silent coercion into -EINVAL. The
abs() can be removed in a follow-up.
Fixes: 836af83b54e3 ("netem: support delivering packets in delayed time slots")
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20260418032027.900913-5-stephen@networkplumber.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Eric Dumazet <edumazet@google.com>
Date: Mon Apr 27 08:36:06 2026 +0000
net/sched: sch_cake: annotate data-races in cake_dump_stats() (V)
[ Upstream commit a6c95b833dc17e84d16a8ac0f40fd0931616a52d ]
cake_dump_stats() runs without qdisc spinlock being held.
In this final patch, I add READ_ONCE()/WRITE_ONCE() annotations
for cparams.target and cparams.interval.
Fixes: 046f6fd5daef ("sched: Add Common Applications Kept Enhanced (cake) qdisc")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: "Toke Høiland-Jørgensen" <toke@toke.dk>
Link: https://patch.msgid.link/20260427083606.459355-6-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Dudu Lu <phx0fer@gmail.com>
Date: Mon Apr 13 19:00:41 2026 +0800
net/sched: sch_cake: fix NAT destination port not being updated in cake_update_flowkeys
[ Upstream commit f9e40664706927d7ae22a448a3383e23c38a4c0b ]
cake_update_flowkeys() is supposed to update the flow dissector keys
with the NAT-translated addresses and ports from conntrack, so that
CAKE's per-flow fairness correctly identifies post-NAT flows as
belonging to the same connection.
For the source port, this works correctly:
keys->ports.src = port;
But for the destination port, the assignment is reversed:
port = keys->ports.dst;
This means the NAT destination port is never updated in the flow keys.
As a result, when multiple connections are NATed to the same destination,
CAKE treats them as separate flows because the original (pre-NAT)
destination ports differ. This breaks CAKE's NAT-aware flow isolation
when using the "nat" mode.
The bug was introduced in commit b0c19ed6088a ("sch_cake: Take advantage
of skb->hash where appropriate") which refactored the original direct
assignment into a compare-and-conditionally-update pattern, but wrote
the destination port update backwards.
Fix by reversing the assignment direction to match the source port
pattern.
Fixes: b0c19ed6088a ("sch_cake: Take advantage of skb->hash where appropriate")
Signed-off-by: Dudu Lu <phx0fer@gmail.com>
Acked-by: Toke Høiland-Jørgensen <toke@toke.dk>
Link: https://patch.msgid.link/20260413110041.44704-1-phx0fer@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Eric Dumazet <edumazet@google.com>
Date: Thu Apr 23 06:28:39 2026 +0000
net/sched: sch_choke: annotate data-races in choke_dump_stats()
[ Upstream commit d3aeb889dcbd78e95f500d383799a23d949796e0 ]
choke_dump_stats() only runs with RTNL held.
It reads fields that can be changed in qdisc fast path.
Add READ_ONCE()/WRITE_ONCE() annotations.
Fixes: edb09eb17ed8 ("net: sched: do not acquire qdisc spinlock in qdisc/class stats dump")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com>
Link: https://patch.msgid.link/20260423062839.2524324-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Eric Dumazet <edumazet@google.com>
Date: Tue Apr 21 14:25:09 2026 +0000
net/sched: sch_fq_codel: remove data-races from fq_codel_dump_stats()
[ Upstream commit bbfaa73ea6871db03dc05d7f05f00557a8981f25 ]
fq_codel_dump_stats() acquires the qdisc spinlock a bit too late.
Move this acquisition before we fill st.qdisc_stats with live data.
Fixes: edb09eb17ed8 ("net: sched: do not acquire qdisc spinlock in qdisc/class stats dump")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com>
Link: https://patch.msgid.link/20260421142509.3967231-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Eric Dumazet <edumazet@google.com>
Date: Thu Apr 23 06:35:27 2026 +0000
net/sched: sch_fq_pie: annotate data-races in fq_pie_dump_stats()
[ Upstream commit 59b145771c7982cfe9020d4e9e22da92d6b5ae31 ]
fq_codel_dump_stats() acquires the qdisc spinlock a bit too late.
Move this acquisition before we fill tc_fq_pie_xstats with live data.
Alternative would be to add READ_ONCE() and WRITE_ONCE() annotations,
but the spinlock is needed anyway to scan q->new_flows and q->old_flows.
Fixes: ec97ecf1ebe4 ("net: sched: add Flow Queue PIE packet scheduler")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com>
Link: https://patch.msgid.link/20260423063527.2568262-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Eric Dumazet <edumazet@google.com>
Date: Tue Apr 21 14:29:44 2026 +0000
net/sched: sch_pie: annotate data-races in pie_dump_stats()
[ Upstream commit 5154561d9b119f781249f8e845fecf059b38b483 ]
pie_dump_stats() only runs with RTNL held,
reading fields that can be changed in qdisc fast path.
Add READ_ONCE()/WRITE_ONCE() annotations.
Alternative would be to acquire the qdisc spinlock, but our long-term
goal is to make qdisc dump operations lockless as much as we can.
tc_pie_xstats fields don't need to be latched atomically,
otherwise this bug would have been caught earlier.
Fixes: edb09eb17ed8 ("net: sched: do not acquire qdisc spinlock in qdisc/class stats dump")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com>
Link: https://patch.msgid.link/20260421142944.4009941-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Eric Dumazet <edumazet@google.com>
Date: Thu Apr 30 08:00:56 2026 +0000
net/sched: sch_pie: annotate more data-races in pie_dump_stats()
[ Upstream commit 6d4106e8df94c0c52cf3ca6a6a0d01567fb3844e ]
My prior patch missed few READ_ONCE()/WRITE_ONCE() annotations.
Fixes: 5154561d9b11 ("net/sched: sch_pie: annotate data-races in pie_dump_stats()")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260430080056.35104-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Eric Dumazet <edumazet@google.com>
Date: Tue Apr 21 14:23:09 2026 +0000
net/sched: sch_red: annotate data-races in red_dump_stats()
[ Upstream commit a8f5192809caf636d05ba47c144f282cfd0e3839 ]
red_dump_stats() only runs with RTNL held,
reading fields that can be changed in qdisc fast path.
Add READ_ONCE()/WRITE_ONCE() annotations.
Alternative would be to acquire the qdisc spinlock, but our long-term
goal is to make qdisc dump operations lockless as much as we can.
tc_red_xstats fields don't need to be latched atomically,
otherwise this bug would have been caught earlier.
Fixes: edb09eb17ed8 ("net: sched: do not acquire qdisc spinlock in qdisc/class stats dump")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com>
Link: https://patch.msgid.link/20260421142309.3964322-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Eric Dumazet <edumazet@google.com>
Date: Tue Apr 21 14:16:55 2026 +0000
net/sched: sch_sfb: annotate data-races in sfb_dump_stats()
[ Upstream commit 1ada03fdef82d3d7d2edb9dcd3acc91917675e48 ]
sfb_dump_stats() only runs with RTNL held,
reading fields that can be changed in qdisc fast path.
Add READ_ONCE()/WRITE_ONCE() annotations.
Alternative would be to acquire the qdisc spinlock, but our long-term
goal is to make qdisc dump operations lockless as much as we can.
tc_sfb_xstats fields don't need to be latched atomically,
otherwise this bug would have been caught earlier.
Fixes: edb09eb17ed8 ("net: sched: do not acquire qdisc spinlock in qdisc/class stats dump")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260421141655.3953721-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Weiming Shi <bestswngs@gmail.com>
Date: Thu Apr 23 00:19:58 2026 +0800
net/sched: taprio: fix NULL pointer dereference in class dump
[ Upstream commit 3d07ca5c0fae311226f737963984bd94bb159a87 ]
When a TAPRIO child qdisc is deleted via RTM_DELQDISC, taprio_graft()
is called with new == NULL and stores NULL into q->qdiscs[cl - 1].
Subsequent RTM_GETTCLASS dump operations walk all classes via
taprio_walk() and call taprio_dump_class(), which calls taprio_leaf()
returning the NULL pointer, then dereferences it to read child->handle,
causing a kernel NULL pointer dereference.
The bug is reachable with namespace-scoped CAP_NET_ADMIN on any kernel
with CONFIG_NET_SCH_TAPRIO enabled. On systems with unprivileged user
namespaces enabled, an unprivileged local user can trigger a kernel
panic by creating a taprio qdisc inside a new network namespace,
grafting an explicit child qdisc, deleting it, and requesting a class
dump. The RTM_GETTCLASS dump itself requires no capability.
Oops: general protection fault, probably for non-canonical address 0xdffffc0000000007: 0000 [#1] SMP KASAN NOPTI
KASAN: null-ptr-deref in range [0x0000000000000038-0x000000000000003f]
RIP: 0010:taprio_dump_class (net/sched/sch_taprio.c:2478)
Call Trace:
<TASK>
tc_fill_tclass (net/sched/sch_api.c:1966)
qdisc_class_dump (net/sched/sch_api.c:2326)
taprio_walk (net/sched/sch_taprio.c:2514)
tc_dump_tclass_qdisc (net/sched/sch_api.c:2352)
tc_dump_tclass_root (net/sched/sch_api.c:2370)
tc_dump_tclass (net/sched/sch_api.c:2431)
rtnl_dumpit (net/core/rtnetlink.c:6864)
netlink_dump (net/netlink/af_netlink.c:2325)
rtnetlink_rcv_msg (net/core/rtnetlink.c:6959)
netlink_rcv_skb (net/netlink/af_netlink.c:2550)
</TASK>
Fix this by substituting &noop_qdisc when new is NULL in
taprio_graft(), a common pattern used by other qdiscs (e.g.,
multiq_graft()) to ensure the q->qdiscs[] slots are never NULL.
This makes control-plane dump paths safe without requiring individual
NULL checks.
Since the data-plane paths (taprio_enqueue and taprio_dequeue_from_txq)
previously had explicit NULL guards that would drop/skip the packet
cleanly, update those checks to test for &noop_qdisc instead. Without
this, packets would reach taprio_enqueue_one() which increments the root
qdisc's qlen and backlog before calling the child's enqueue; noop_qdisc
drops the packet but those counters are never rolled back, permanently
inflating the root qdisc's statistics.
After this change *old can be a valid qdisc, NULL, or &noop_qdisc.
Only call qdisc_put(*old) in the first case to avoid decreasing
noop_qdisc's refcount, which was never increased.
Fixes: 665338b2a7a0 ("net/sched: taprio: dump class stats for the actual q->qdiscs[]")
Reported-by: Xiang Mei <xmei5@asu.edu>
Signed-off-by: Weiming Shi <bestswngs@gmail.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Tested-by: Weiming Shi <bestswngs@gmail.com>
Link: https://patch.msgid.link/20260422161958.2517539-3-bestswngs@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Date: Fri Apr 10 18:57:57 2026 -0700
net/sched: taprio: fix use-after-free in advance_sched() on schedule switch
[ Upstream commit 105425b1969c5affe532713cfac1c0b320d7ac2b ]
In advance_sched(), when should_change_schedules() returns true,
switch_schedules() is called to promote the admin schedule to oper.
switch_schedules() queues the old oper schedule for RCU freeing via
call_rcu(), but 'next' still points into an entry of the old oper
schedule. The subsequent 'next->end_time = end_time' and
rcu_assign_pointer(q->current_entry, next) are use-after-free.
Fix this by selecting 'next' from the new oper schedule immediately
after switch_schedules(), and using its pre-calculated end_time.
setup_first_end_time() sets the first entry's end_time to
base_time + interval when the schedule is installed, so the value
is already correct.
The deleted 'end_time = sched_base_time(admin)' assignment was also
harmful independently: it would overwrite the new first entry's
pre-calculated end_time with just base_time.
Fixes: a3d43c0d56f1 ("taprio: Add support adding an admin schedule")
Reported-by: Junxi Qian <qjx1298677004@gmail.com>
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Al Viro <viro@zeniv.linux.org.uk>
Date: Sat May 25 23:32:20 2024 -0400
net/socket.c: switch to CLASS(fd)
[ Upstream commit 53c0a58beb60b76e105a61aae518fd780eec03d9 ]
The important part in sockfd_lookup_light() is avoiding needless
file refcount operations, not the marginal reduction of the register
pressure from not keeping a struct file pointer in the caller.
Switch to use fdget()/fdpu(); with sane use of CLASS(fd) we can
get a better code generation...
Would be nice if somebody tested it on networking test suites
(including benchmarks)...
sockfd_lookup_light() does fdget(), uses sock_from_file() to
get the associated socket and returns the struct socket reference to
the caller, along with "do we need to fput()" flag. No matching fdput(),
the caller does its equivalent manually, using the fact that sock->file
points to the struct file the socket has come from.
Get rid of that - have the callers do fdget()/fdput() and
use sock_from_file() directly. That kills sockfd_lookup_light()
and fput_light() (no users left).
What's more, we can get rid of explicit fdget()/fdput() by
switching to CLASS(fd, ...) - code generation does not suffer, since
now fdput() inserted on "descriptor is not opened" failure exit
is recognized to be a no-op by compiler.
[folded a fix for braino in do_recvmmsg() caught by Simon Horman]
Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Stable-dep-of: 66052a768d47 ("fanotify: call fanotify_events_supported() before path_permission() and security_path_notify()")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Lorenzo Bianconi <lorenzo@kernel.org>
Date: Wed Apr 8 20:26:56 2026 +0200
net: airoha: Add missing RX_CPU_IDX() configuration in airoha_qdma_cleanup_rx_queue()
[ Upstream commit 656121b155030086b01cfce9bd31b0c925ee6860 ]
When the descriptor index written in REG_RX_CPU_IDX() is equal to the one
stored in REG_RX_DMA_IDX(), the hw will stop since the QDMA RX ring is
empty.
Add missing REG_RX_CPU_IDX() configuration in airoha_qdma_cleanup_rx_queue
routine during QDMA RX ring cleanup.
Fixes: 514aac359987 ("net: airoha: Add missing cleanup bits in airoha_qdma_cleanup_rx_queue()")
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://patch.msgid.link/20260408-airoha-cpu-idx-airoha_qdma_cleanup_rx_queue-v1-1-8efa64844308@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Lorenzo Bianconi <lorenzo@kernel.org>
Date: Sat Oct 12 11:01:11 2024 +0200
net: airoha: Implement BQL support
[ Upstream commit 1d304174106c93ce05f6088813ad7203b3eb381a ]
Introduce BQL support in the airoha_eth driver reporting to the kernel
info about tx hw DMA queues in order to avoid bufferbloat and keep the
latency small.
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://patch.msgid.link/20241012-en7581-bql-v2-1-4deb4efdb60b@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Stable-dep-of: 656121b15503 ("net: airoha: Add missing RX_CPU_IDX() configuration in airoha_qdma_cleanup_rx_queue()")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Lorenzo Bianconi <lorenzo@kernel.org>
Date: Mon Apr 20 10:07:47 2026 +0200
net: airoha: Move ndesc initialization at end of airoha_qdma_init_rx_queue()
[ Upstream commit 379050947a1828826ad7ea50c95245a56929b35a ]
If queue entry or DMA descriptor list allocation fails in
airoha_qdma_init_rx_queue routine, airoha_qdma_cleanup() will trigger a
NULL pointer dereference running netif_napi_del() for RX queue NAPIs
since netif_napi_add() has never been executed to this particular RX NAPI.
The issue is due to the early ndesc initialization in
airoha_qdma_init_rx_queue() since airoha_qdma_cleanup() relies on ndesc
value to check if the queue is properly initialized. Fix the issue moving
ndesc initialization at end of airoha_qdma_init_tx routine.
Move page_pool allocation after descriptor list allocation in order to
avoid memory leaks if desc allocation fails.
Fixes: 23020f049327 ("net: airoha: Introduce ethernet support for EN7581 SoC")
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://patch.msgid.link/20260420-airoha_qdma_init_rx_queue-fix-v2-1-d99347e5c18d@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Zoran Ilievski <goodboy@rexbytes.com>
Date: Mon May 11 08:40:02 2026 +0200
net: atlantic: preserve PCI wake-from-D3 on shutdown when WOL enabled
commit 2c308cf34284420963607d677d576a2b4124d8bd upstream.
The shutdown handler aq_pci_shutdown() unconditionally calls
pci_wake_from_d3(pdev, false), clearing the PCI PME_En bit even when
wake-on-LAN has been configured. While aq_nic_shutdown() correctly
programs the NIC firmware via aq_nic_set_power() to listen for magic
packets, the PCI subsystem will not propagate the resulting PME wake
event from D3, so the system never wakes after poweroff.
WOL from suspend (S3) is unaffected because aq_suspend_common() does
not touch pci_wake_from_d3() and relies on the PM core's wake
configuration via device_may_wakeup().
This affects all atlantic-supported NICs (AQC107/108/111/112/113);
users have reported that WOL works if the atlantic driver is never
loaded, but breaks once it has run its shutdown path.
Pass the configured WOL state to pci_wake_from_d3() instead of a
literal false, so the PCI PME_En bit is preserved when the user has
armed WOL via ethtool.
Fixes: 90869ddfefeb ("net: aquantia: Implement pci shutdown callback")
Cc: stable@vger.kernel.org
Signed-off-by: Zoran Ilievski <goodboy@rexbytes.com>
Reviewed-by: Sukhdeep Singh <sukhdeeps@marvell.com>
Link: https://patch.msgid.link/20260511064002.1857-1-goodboy@rexbytes.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Doug Berger <opendmb@gmail.com>
Date: Thu Mar 6 11:26:30 2025 -0800
net: bcmgenet: add bcmgenet_has_* helpers
[ Upstream commit 07c1a756a50b1180a085ab61819a388bbb906a95 ]
Introduce helper functions to indicate whether the driver should
make use of a particular feature that it supports. These helpers
abstract the implementation of how the feature availability is
encoded.
Signed-off-by: Doug Berger <opendmb@gmail.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/20250306192643.2383632-3-opendmb@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Stable-dep-of: 5393b2b5bee2 ("net: bcmgenet: fix racing timeout handler")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Justin Chen <justin.chen@broadcom.com>
Date: Mon Apr 6 10:57:55 2026 -0700
net: bcmgenet: fix leaking free_bds
[ Upstream commit 3f3168300efb839028328d720ab3962f91d6a0d0 ]
While reclaiming the tx queue we fast forward the write pointer to
drop any data in flight. These dropped frames are not added back
to the pool of free bds. We also need to tell the netdev that we
are dropping said data.
Fixes: f1bacae8b655 ("net: bcmgenet: support reclaiming unsent Tx packets")
Signed-off-by: Justin Chen <justin.chen@broadcom.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Nicolai Buchwitz <nb@tipi-net.de>
Tested-by: Nicolai Buchwitz <nb@tipi-net.de>
Link: https://patch.msgid.link/20260406175756.134567-3-justin.chen@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Justin Chen <justin.chen@broadcom.com>
Date: Mon Apr 6 10:57:54 2026 -0700
net: bcmgenet: fix off-by-one in bcmgenet_put_txcb
[ Upstream commit 57f3f53d2c9c5a9e133596e2f7bc1c50688a6d38 ]
The write_ptr points to the next open tx_cb. We want to return the
tx_cb that gets rewinded, so we must rewind the pointer first then
return the tx_cb that it points to. That way the txcb can be correctly
cleaned up.
Fixes: 876dbadd53a7 ("net: bcmgenet: Fix unmapping of fragments in bcmgenet_xmit()")
Signed-off-by: Justin Chen <justin.chen@broadcom.com>
Reviewed-by: Nicolai Buchwitz <nb@tipi-net.de>
Link: https://patch.msgid.link/20260406175756.134567-2-justin.chen@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Justin Chen <justin.chen@broadcom.com>
Date: Mon Apr 6 10:57:56 2026 -0700
net: bcmgenet: fix racing timeout handler
[ Upstream commit 5393b2b5bee2ac51a0043dc7f4ac3475f053d08d ]
The bcmgenet_timeout handler tries to take down all tx queues when
a single queue times out. This is over zealous and causes many race
conditions with queues that are still chugging along. Instead lets
only restart the timed out queue.
Fixes: 13ea657806cf ("net: bcmgenet: improve TX timeout")
Signed-off-by: Justin Chen <justin.chen@broadcom.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Nicolai Buchwitz <nb@tipi-net.de>
Tested-by: Nicolai Buchwitz <nb@tipi-net.de>
Link: https://patch.msgid.link/20260406175756.134567-4-justin.chen@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ryo Takakura <ryotkkr98@gmail.com>
Date: Wed Jul 2 18:24:17 2025 +0900
net: bcmgenet: Initialize u64 stats seq counter
[ Upstream commit ffc2c8c4a714df53a715827d6334ab9474424f6a ]
Initialize u64 stats as it uses seq counter on 32bit machines
as suggested by lockdep below.
[ 1.830953][ T1] INFO: trying to register non-static key.
[ 1.830993][ T1] The code is fine but needs lockdep annotation, or maybe
[ 1.831027][ T1] you didn't initialize this object before use?
[ 1.831057][ T1] turning off the locking correctness validator.
[ 1.831090][ T1] CPU: 1 UID: 0 PID: 1 Comm: swapper/0 Tainted: G W 6.16.0-rc2-v7l+ #1 PREEMPT
[ 1.831097][ T1] Tainted: [W]=WARN
[ 1.831099][ T1] Hardware name: BCM2711
[ 1.831101][ T1] Call trace:
[ 1.831104][ T1] unwind_backtrace from show_stack+0x18/0x1c
[ 1.831120][ T1] show_stack from dump_stack_lvl+0x8c/0xcc
[ 1.831129][ T1] dump_stack_lvl from register_lock_class+0x9e8/0x9fc
[ 1.831141][ T1] register_lock_class from __lock_acquire+0x420/0x22c0
[ 1.831154][ T1] __lock_acquire from lock_acquire+0x130/0x3f8
[ 1.831166][ T1] lock_acquire from bcmgenet_get_stats64+0x4a4/0x4c8
[ 1.831176][ T1] bcmgenet_get_stats64 from dev_get_stats+0x4c/0x408
[ 1.831184][ T1] dev_get_stats from rtnl_fill_stats+0x38/0x120
[ 1.831193][ T1] rtnl_fill_stats from rtnl_fill_ifinfo+0x7f8/0x1890
[ 1.831203][ T1] rtnl_fill_ifinfo from rtmsg_ifinfo_build_skb+0xd0/0x138
[ 1.831214][ T1] rtmsg_ifinfo_build_skb from rtmsg_ifinfo+0x48/0x8c
[ 1.831225][ T1] rtmsg_ifinfo from register_netdevice+0x8c0/0x95c
[ 1.831237][ T1] register_netdevice from register_netdev+0x28/0x40
[ 1.831247][ T1] register_netdev from bcmgenet_probe+0x690/0x6bc
[ 1.831255][ T1] bcmgenet_probe from platform_probe+0x64/0xbc
[ 1.831263][ T1] platform_probe from really_probe+0xd0/0x2d4
[ 1.831269][ T1] really_probe from __driver_probe_device+0x90/0x1a4
[ 1.831273][ T1] __driver_probe_device from driver_probe_device+0x38/0x11c
[ 1.831278][ T1] driver_probe_device from __driver_attach+0x9c/0x18c
[ 1.831282][ T1] __driver_attach from bus_for_each_dev+0x84/0xd4
[ 1.831291][ T1] bus_for_each_dev from bus_add_driver+0xd4/0x1f4
[ 1.831303][ T1] bus_add_driver from driver_register+0x88/0x120
[ 1.831312][ T1] driver_register from do_one_initcall+0x78/0x360
[ 1.831320][ T1] do_one_initcall from kernel_init_freeable+0x2bc/0x314
[ 1.831331][ T1] kernel_init_freeable from kernel_init+0x1c/0x144
[ 1.831339][ T1] kernel_init from ret_from_fork+0x14/0x20
[ 1.831344][ T1] Exception stack(0xf082dfb0 to 0xf082dff8)
[ 1.831349][ T1] dfa0: 00000000 00000000 00000000 00000000
[ 1.831353][ T1] dfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[ 1.831356][ T1] dfe0: 00000000 00000000 00000000 00000000 00000013 00000000
Fixes: 59aa6e3072aa ("net: bcmgenet: switch to use 64bit statistics")
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Signed-off-by: Ryo Takakura <ryotkkr98@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250702092417.46486-1-ryotkkr98@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Doug Berger <opendmb@gmail.com>
Date: Thu Mar 6 11:26:34 2025 -0800
net: bcmgenet: move DESC_INDEX flow to ring 0
[ Upstream commit 3b5d4f5a820d362dd46472542b2e961fb1f93515 ]
The default transmit and receive packet handling is moved from
the DESC_INDEX (i.e. 16) descriptor rings to the Ring 0 queues.
This saves a fair amount of special case code by unifying the
handling.
A default dummy filter is enabled in the Hardware Filter Block
to route default receive packets to Ring 0.
Signed-off-by: Doug Berger <opendmb@gmail.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/20250306192643.2383632-7-opendmb@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Stable-dep-of: 5393b2b5bee2 ("net: bcmgenet: fix racing timeout handler")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Doug Berger <opendmb@gmail.com>
Date: Thu Mar 6 11:26:39 2025 -0800
net: bcmgenet: support reclaiming unsent Tx packets
[ Upstream commit f1bacae8b655163dcbc3c54b9e714ef1a8986d7b ]
When disabling the transmitter any outstanding packets can now
be reclaimed by bcmgenet_tx_reclaim_all() rather than by the
bcmgenet_fini_dma() function.
Signed-off-by: Doug Berger <opendmb@gmail.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/20250306192643.2383632-12-opendmb@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Stable-dep-of: 5393b2b5bee2 ("net: bcmgenet: fix racing timeout handler")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Zak Kemble <zakkemble@gmail.com>
Date: Mon May 19 12:32:55 2025 +0100
net: bcmgenet: switch to use 64bit statistics
[ Upstream commit 59aa6e3072aa7e51e9040e8c342d0c0825c5f48f ]
Update the driver to use ndo_get_stats64, rtnl_link_stats64 and
u64_stats_t counters for statistics.
Signed-off-by: Zak Kemble <zakkemble@gmail.com>
Tested-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/20250519113257.1031-2-zakkemble@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Stable-dep-of: 5393b2b5bee2 ("net: bcmgenet: fix racing timeout handler")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Mieczyslaw Nalewaj <namiltd@yahoo.com>
Date: Sun Apr 19 21:37:07 2026 +0200
net: dsa: realtek: rtl8365mb: fix mode mask calculation
[ Upstream commit 0c078021d3861966614d5e594ee03587f0c9e74d ]
The RTL8365MB_DIGITAL_INTERFACE_SELECT_MODE_MASK macro was shifting
the 4-bit mask (0xF) by only (_extint % 2) bits instead of
(_extint % 2) * 4. This caused the mask to overlap with the adjacent
nibble when configuring odd-numbered external interfaces, selecting
the wrong bits entirely.
Align the shift calculation with the existing ...MODE_OFFSET macro.
Fixes: 4af2950c50c8 ("net: dsa: realtek-smi: add rtl8365mb subdriver for RTL8365MB-VC")
Signed-off-by: Abdulkader Alrezej <alrazj.abdulkader@gmail.com>
Signed-off-by: Mieczyslaw Nalewaj <namiltd@yahoo.com>
Reviewed-by: Luiz Angelo Daros de Luca <luizluca@gmail.com>
Link: https://patch.msgid.link/400a6387-a444-4576-af6d-26be5410bce3@yahoo.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Mashiro Chen <mashiro.chen@mailbox.org>
Date: Wed Apr 8 01:31:01 2026 +0800
net: hamradio: 6pack: fix uninit-value in sixpack_receive_buf
[ Upstream commit bf9a38803b2626b01cc769aaf13485d8650f576f ]
sixpack_receive_buf() does not properly skip bytes with TTY error flags.
The while loop iterates through the flags buffer but never advances the
data pointer (cp), and passes the original count (including error bytes)
to sixpack_decode(). This causes sixpack_decode() to process bytes that
should have been skipped due to TTY errors. The TTY layer does not
guarantee that cp[i] holds a meaningful value when fp[i] is set, so
passing those positions to sixpack_decode() results in KMSAN reporting
an uninit-value read.
Fix this by processing bytes one at a time, advancing cp on each
iteration, and only passing valid (non-error) bytes to sixpack_decode().
This matches the pattern used by slip_receive_buf() and
mkiss_receive_buf() for the same purpose.
Reported-by: syzbot+ecdb8c9878a81eb21e54@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=ecdb8c9878a81eb21e54
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Mashiro Chen <mashiro.chen@mailbox.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20260407173101.107352-1-mashiro.chen@mailbox.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Luca Weiss <luca.weiss@fairphone.com>
Date: Thu Apr 9 10:13:32 2026 +0200
net: ipa: Fix decoding EV_PER_EE for IPA v5.0+
[ Upstream commit 1335b903cf2e8aeaca87fd665683384c731ec941 ]
Initially 'reg' and 'val' are assigned from HW_PARAM_2.
But since IPA v5.0+ takes EV_PER_EE from HW_PARAM_4 (instead of
NUM_EV_PER_EE from HW_PARAM_2), we not only need to re-assign 'reg' but
also read the register value of that register into 'val' so that
reg_decode() works on the correct value.
Fixes: f651334e1ef5 ("net: ipa: add HW_PARAM_4 GSI register")
Link: https://sashiko.dev/#/patchset/20260403-milos-ipa-v1-0-01e9e4e03d3e%40fairphone.com?part=2
Signed-off-by: Luca Weiss <luca.weiss@fairphone.com>
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Link: https://patch.msgid.link/20260409-ipa-fixes-v1-2-a817c30678ac@fairphone.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Luca Weiss <luca.weiss@fairphone.com>
Date: Thu Apr 9 10:13:31 2026 +0200
net: ipa: Fix programming of QTIME_TIMESTAMP_CFG
[ Upstream commit de08f9585692813bd41ee654fca0487664c4de30 ]
The 'val' variable gets overwritten multiple times, discarding previous
values. Looking at the git log shows these should be combined with |=
instead.
Fixes: 9265a4f0f0b4 ("net: ipa: define even more IPA register fields")
Link: https://sashiko.dev/#/patchset/20260403-milos-ipa-v1-0-01e9e4e03d3e%40fairphone.com?part=4
Signed-off-by: Luca Weiss <luca.weiss@fairphone.com>
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Link: https://patch.msgid.link/20260409-ipa-fixes-v1-1-a817c30678ac@fairphone.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: William A. Kennington III <william@wkennington.com>
Date: Thu Apr 23 00:46:52 2026 -0700
net: mctp i2c: check length before marking flow active
[ Upstream commit 4ca07b9239bd0478ae586632a2ed72be37ed8407 ]
Currently, mctp_i2c_get_tx_flow_state() is called before the packet length
sanity check. This function marks a new flow as active in the MCTP core.
If the sanity check fails, mctp_i2c_xmit() returns early without calling
mctp_i2c_lock_nest(). This results in a mismatched locking state: the
flow is active, but the I2C bus lock was never acquired for it.
When the flow is later released, mctp_i2c_release_flow() will see the
active state and queue an unlock marker. The TX thread will then
decrement midev->i2c_lock_count from 0, causing it to underflow to -1.
This underflow permanently breaks the driver's locking logic, allowing
future transmissions to occur without holding the I2C bus lock, leading
to bus collisions and potential hardware hangs.
Move the mctp_i2c_get_tx_flow_state() call to after the length sanity
check to ensure we only transition the flow state if we are actually
going to proceed with the transmission and locking.
Fixes: f5b8abf9fc3d ("mctp i2c: MCTP I2C binding driver")
Signed-off-by: William A. Kennington III <william@wkennington.com>
Acked-by: Jeremy Kerr <jk@codeconstruct.com.au>
Link: https://patch.msgid.link/20260423074741.201460-1-william@wkennington.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Pavel Begunkov <asml.silence@gmail.com>
Date: Tue Feb 4 13:56:15 2025 -0800
net: page_pool: create hooks for custom memory providers
[ Upstream commit 57afb483015768903029c8336ee287f4b03c1235 ]
A spin off from the original page pool memory providers patch by Jakub,
which allows extending page pools with custom allocators. One of such
providers is devmem TCP, and the other is io_uring zerocopy added in
following patches.
Link: https://lore.kernel.org/netdev/20230707183935.997267-7-kuba@kernel.org/
Co-developed-by: Jakub Kicinski <kuba@kernel.org> # initial mp proposal
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: David Wei <dw@davidwei.uk>
Link: https://patch.msgid.link/20250204215622.695511-5-dw@davidwei.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Stable-dep-of: 5ef343614db7 ("page_pool: fix memory-provider leak in page_pool_create_percpu() error path")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Heiko Schocher <hs@nabladev.com>
Date: Sat Apr 25 05:13:39 2026 +0200
net: phy: dp83869: fix setting CLK_O_SEL field.
[ Upstream commit 46f74a3f7d57d9cc0110b09cbc8163fa0a01afa2 ]
Table 7-121 in datasheet says we have to set register 0xc6
to value 0x10 before CLK_O_SEL can be modified. No more infos
about this field found in datasheet. With this fix, setting
of CLK_O_SEL field in IO_MUX_CFG register worked through dts
property "ti,clk-output-sel" on a DP83869HMRGZR.
Signed-off-by: Heiko Schocher <hs@nabladev.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Fixes: 01db923e8377 ("net: phy: dp83869: Add TI dp83869 phy")
Link: https://patch.msgid.link/20260425031339.3318-1-hs@nabladev.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Charles Perry <charles.perry@microchip.com>
Date: Thu Apr 9 06:36:54 2026 -0700
net: phy: fix a return path in get_phy_c45_ids()
[ Upstream commit 6f533abe7bbad2eef1e42c639b6bb9dad2b02362 ]
The return value of phy_c45_probe_present() is stored in "ret", not
"phy_reg", fix this. "phy_reg" always has a positive value if we reach
this return path (since it would have returned earlier otherwise), which
means that the original goal of the patch of not considering -ENODEV
fatal wasn't achieved.
Fixes: 17b447539408 ("net: phy: c45 scanning: Don't consider -ENODEV fatal")
Signed-off-by: Charles Perry <charles.perry@microchip.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/20260409133654.3203336-1-charles.perry@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Maxime Chevallier <maxime.chevallier@bootlin.com>
Date: Fri Apr 10 19:10:20 2026 +0200
net: phy: qcom: at803x: Use the correct bit to disable extended next page
[ Upstream commit e7a62edd34b1b4bc5f979988efc2f81c075733fd ]
As noted in the blamed commit, the AR8035 and other PHYs from this
family advertise the Extended Next Page support by default, which may be
understood by some partners as this PHY being multi-gig capable.
The fix is to disable XNP advertising, which is done by setting bit 12
of the Auto-Negotiation Advertisement Register (MII_ADVERTISE).
The blamed commit incorrectly uses MDIO_AN_CTRL1_XNP, which is bit 13 as per
802.3 : 45.2.7.1 AN control register (Register 7.0)
BIT 12 in MII_ADVERTISE is wrapped by ADVERTISE_RESV, used by some
drivers such as the aquantia one. 802.3 Clause 28 defines bit 12 as
Extended Next Page ability, at least in recent versions of the standard.
Let's add a define for it and use it in the at803x driver.
Fixes: 3c51fa5d2afe ("net: phy: ar803x: disable extended next page bit")
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20260410171021.1277138-1-maxime.chevallier@bootlin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: William Bowling <vakzz@zellic.io>
Date: Wed May 13 04:16:35 2026 +0000
net: skbuff: preserve shared-frag marker during coalescing
commit f84eca5817390257cef78013d0112481c503b4a3 upstream.
skb_try_coalesce() can attach paged frags from @from to @to. If @from
has SKBFL_SHARED_FRAG set, the resulting @to skb can contain the same
externally-owned or page-cache-backed frags, but the shared-frag marker
is currently lost.
That breaks the invariant relied on by later in-place writers. In
particular, ESP input checks skb_has_shared_frag() before deciding
whether an uncloned nonlinear skb can skip skb_cow_data(). If TCP
receive coalescing has moved shared frags into an unmarked skb, ESP can
see skb_has_shared_frag() as false and decrypt in place over page-cache
backed frags.
Propagate SKBFL_SHARED_FRAG when skb_try_coalesce() transfers paged
frags. The tailroom copy path does not need the marker because it copies
bytes into @to's linear data rather than transferring frag descriptors.
Fixes: cef401de7be8 ("net: fix possible wrong checksum generation")
Fixes: f4c50a4034e6 ("xfrm: esp: avoid in-place decrypt on shared skb frags")
Signed-off-by: William Bowling <vakzz@zellic.io>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Tested-by: Jiayuan Chen <jiayuan.chen@linux.dev>
Link: https://patch.msgid.link/20260513041635.1289541-1-vakzz@zellic.io
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Hyunwoo Kim <imv4bel@gmail.com>
Date: Sat May 16 07:28:53 2026 +0900
net: skbuff: propagate shared-frag marker through frag-transfer helpers
commit 48f6a5356a33dd78e7144ae1faef95ffc990aae0 upstream.
Two frag-transfer helpers (__pskb_copy_fclone() and skb_shift()) fail
to propagate the SKBFL_SHARED_FRAG bit in skb_shinfo()->flags when
moving frags from source to destination. __pskb_copy_fclone() defers
the rest of the shinfo metadata to skb_copy_header() after copying
frag descriptors, but that helper only carries over gso_{size,segs,
type} and never touches skb_shinfo()->flags; skb_shift() moves frag
descriptors directly and leaves flags untouched. As a result, the
destination skb keeps a reference to the same externally-owned or
page-cache-backed pages while reporting skb_has_shared_frag() as
false.
The mismatch is harmful in any in-place writer that uses
skb_has_shared_frag() to decide whether shared pages must be detoured
through skb_cow_data(). ESP input is one such writer (esp4.c,
esp6.c), and a single nft 'dup to <local>' rule -- or any other
nf_dup_ipv4() / xt_TEE caller -- is enough to land a pskb_copy()'d
skb in esp_input() with the marker stripped, letting an unprivileged
user write into the page cache of a root-owned read-only file via
authencesn-ESN stray writes.
Set SKBFL_SHARED_FRAG on the destination whenever frag descriptors
were actually moved from the source. skb_copy() and skb_copy_expand()
share skb_copy_header() too but linearize all paged data into freshly
allocated head storage and emerge with nr_frags == 0, so
skb_has_shared_frag() returns false on its own; they need no change.
The same omission exists in skb_gro_receive() and skb_gro_receive_list().
The former moves the incoming skb's frag descriptors into the
accumulator's last sub-skb via two paths (a direct frag-move loop and
the head_frag + memcpy path); the latter chains the incoming skb whole
onto p's frag_list. Downstream skb_segment() reads only
skb_shinfo(p)->flags, and skb_segment_list() reuses each sub-skb's
shinfo as the nskb -- both p and lp must carry the marker.
The same omission also exists in tcp_clone_payload(), which builds an
MTU probe skb by moving frag descriptors from skbs on sk_write_queue
into a freshly allocated nskb. The helper falls into the same family
and warrants the same fix for consistency; no TCP TX-side in-place
writer is currently known to reach a user page through this gap, but
a future consumer depending on the marker would regress silently.
The same omission exists in skb_segment(): the per-iteration flag
merge takes only head_skb's flag, and the inner switch that rebinds
frag_skb to list_skb on head_skb-frags exhaustion does not fold the
new frag_skb's flag into nskb. Fold frag_skb's flag at both sites
so segments drawing frags from frag_list members carry the marker.
Fixes: cef401de7be8 ("net: fix possible wrong checksum generation")
Fixes: f4c50a4034e6 ("xfrm: esp: avoid in-place decrypt on shared skb frags")
Suggested-by: Sabrina Dubroca <sd@queasysnail.net>
Suggested-by: Sultan Alsawaf <sultan@kerneltoast.com>
Suggested-by: Ben Hutchings <ben@decadent.org.uk>
Suggested-by: Lin Ma <malin89@huawei.com>
Suggested-by: Jingguo Tan <tanjingguo@huawei.com>
Suggested-by: Aaron Esau <aaron1esau@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Hyunwoo Kim <imv4bel@gmail.com>
Tested-by: Rajat Gupta <rajat.gupta@oss.qualcomm.com>
Link: https://patch.msgid.link/ageeJfJHwgzmKXbh@v4bel
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Jakub Kicinski <kuba@kernel.org>
Date: Tue Apr 28 16:15:59 2026 -0700
net: tls: fix strparser anchor skb leak on offload RX setup failure
[ Upstream commit 58689498ca3384851145a754dbb1d8ed1cf9fb54 ]
When tls_set_device_offload_rx() fails at tls_dev_add(), the error path
calls tls_sw_free_resources_rx() to clean up the SW context that was
initialized by tls_set_sw_offload(). This function calls
tls_sw_release_resources_rx() (which stops the strparser via
tls_strp_stop()) and tls_sw_free_ctx_rx() (which kfrees the context),
but never frees the anchor skb that was allocated by alloc_skb(0) in
tls_strp_init().
Note that tls_sw_free_resources_rx() is exclusively used for this
"failed to start offload" code path, there's no other caller.
The leak did not exist before commit 84c61fe1a75b ("tls: rx: do not use
the standard strparser"), because the standard strparser doesn't try
to pre-allocate an skb.
The normal close path in tls_sk_proto_close() handles cleanup by calling
tls_sw_strparser_done() (which calls tls_strp_done()) after dropping
the socket lock, because tls_strp_done() does cancel_work_sync() and
the strparser work handler takes the socket lock.
Fixes: 84c61fe1a75b ("tls: rx: do not use the standard strparser")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Link: https://patch.msgid.link/20260428231559.1358502-1-kuba@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Zhan Jun <zhanjun@uniontech.com>
Date: Thu Apr 23 08:49:12 2026 +0800
net: usb: rtl8150: fix use-after-free in rtl8150_start_xmit()
[ Upstream commit 23f0e34c64acba15cad4d23e50f41f533da195fa ]
syzbot reported a KASAN slab-use-after-free read in rtl8150_start_xmit()
when accessing skb->len for tx statistics after usb_submit_urb() has
been called:
BUG: KASAN: slab-use-after-free in rtl8150_start_xmit+0x71f/0x760
drivers/net/usb/rtl8150.c:712
Read of size 4 at addr ffff88810eb7a930 by task kworker/0:4/5226
The URB completion handler write_bulk_callback() frees the skb via
dev_kfree_skb_irq(dev->tx_skb). The URB may complete on another CPU
in softirq context before usb_submit_urb() returns in the submitter,
so by the time the submitter reads skb->len the skb has already been
queued to the per-CPU completion_queue and freed by net_tx_action():
CPU A (xmit) CPU B (USB completion softirq)
------------ ------------------------------
dev->tx_skb = skb;
usb_submit_urb() --+
|-------> write_bulk_callback()
| dev_kfree_skb_irq(dev->tx_skb)
| net_tx_action()
| napi_skb_cache_put() <-- free
netdev->stats.tx_bytes |
+= skb->len; <-- UAF read
Fix it by caching skb->len before submitting the URB and using the
cached value when updating the tx_bytes counter.
The pre-existing tx_bytes semantics are preserved: the counter tracks
the original frame length (skb->len), not the ETH_ZLEN/USB-alignment
padded "count" value that is handed to the device. Changing that
would be a user-visible accounting change and is out of scope for
this UAF fix.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Reported-by: syzbot+3f46c095ac0ca048cb71@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/69e69ee7.050a0220.24bfd3.002b.GAE@google.com/
Closes: https://syzkaller.appspot.com/bug?extid=3f46c095ac0ca048cb71
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Zhan Jun <zhanjun@uniontech.com>
Link: https://patch.msgid.link/809895186B866C10+20260423004913.136655-1-zhangdandan@uniontech.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Morduan Zang <zhangdandan@uniontech.com>
Date: Fri Apr 24 09:55:17 2026 +0800
net: usb: rtl8150: free skb on usb_submit_urb() failure in xmit
[ Upstream commit adbe2cdf75461891e50dbe11896ac78e9af1f874 ]
When rtl8150_start_xmit() fails to submit the tx URB, the URB is never
handed to the USB core and write_bulk_callback() will not run. The
driver returns NETDEV_TX_OK, which tells the networking stack that the
skb has been consumed, but nothing actually frees the skb on this
error path:
dev->tx_skb = skb;
...
if ((res = usb_submit_urb(dev->tx_urb, GFP_ATOMIC))) {
...
/* no kfree_skb here */
}
return NETDEV_TX_OK;
This leaks the skb on every submit failure and also leaves dev->tx_skb
pointing at memory that the driver itself may later free, which is
fragile.
Free the skb with dev_kfree_skb_any() in the error path and clear
dev->tx_skb so no stale pointer is left behind.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Morduan Zang <zhangdandan@uniontech.com>
Link: https://patch.msgid.link/E7D3E1C013C5A859+20260424015517.9574-1-zhangdandan@uniontech.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Eric Dumazet <edumazet@google.com>
Date: Tue Apr 21 14:33:49 2026 +0000
net_sched: sch_hhf: annotate data-races in hhf_dump_stats()
[ Upstream commit a6edf2cd4156b71e07258876b7626692e158f7e8 ]
hhf_dump_stats() only runs with RTNL held,
reading fields that can be changed in qdisc fast path.
Add READ_ONCE()/WRITE_ONCE() annotations.
Fixes: edb09eb17ed8 ("net: sched: do not acquire qdisc spinlock in qdisc/class stats dump")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com>
Link: https://patch.msgid.link/20260421143349.4052215-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Uday Shankar <ushankar@purestorage.com>
Date: Wed Mar 12 13:51:47 2025 -0600
netconsole: allow selection of egress interface via MAC address
[ Upstream commit f8a10bed32f5fbede13a5f22fdc4ab8740ea213a ]
Currently, netconsole has two methods of configuration - module
parameter and configfs. The former interface allows for netconsole
activation earlier during boot (by specifying the module parameter on
the kernel command line), so it is preferred for debugging issues which
arise before userspace is up/the configfs interface can be used. The
module parameter syntax requires specifying the egress interface name.
This requirement makes it hard to use for a couple reasons:
- The egress interface name can be hard or impossible to predict. For
example, installing a new network card in a system can change the
interface names assigned by the kernel.
- When constructing the module parameter, one may have trouble
determining the original (kernel-assigned) name of the interface
(which is the name that should be given to netconsole) if some stable
interface naming scheme is in effect. A human can usually look at
kernel logs to determine the original name, but this is very painful
if automation is constructing the parameter.
For these reasons, allow selection of the egress interface via MAC
address when configuring netconsole using the module parameter. Update
the netconsole documentation with an example of the new syntax.
Selection of egress interface by MAC address via configfs is far less
interesting (since when this interface can be used, one should be able
to easily convert between MAC address and interface name), so it is left
unimplemented.
Signed-off-by: Uday Shankar <ushankar@purestorage.com>
Reviewed-by: Breno Leitao <leitao@debian.org>
Tested-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250312-netconsole-v6-2-3437933e79b8@purestorage.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Stable-dep-of: 3bc179bc7146 ("netpoll: fix IPv6 local-address corruption")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Breno Leitao <leitao@debian.org>
Date: Mon Apr 27 07:30:37 2026 -0700
netconsole: propagate device name truncation in dev_name_store()
[ Upstream commit 92ceb7bff62c2606f664c204750eca0b85d44112 ]
dev_name_store() calls strscpy(nt->np.dev_name, buf, IFNAMSIZ) without
checking the return value. If userspace writes an interface name longer
than IFNAMSIZ - 1, strscpy() silently truncates and returns -E2BIG, but
the function ignores it and reports a fully successful write back to
userspace.
If a real interface happens to match the truncated name, netconsole will
bind to the wrong device on the next enable, sending kernel logs and
panic output to an unintended network segment with no indication to
userspace that anything was rewritten.
Reject writes whose length cannot fit in nt->np.dev_name up front:
if (count >= IFNAMSIZ)
return -ENAMETOOLONG;
This is not a big deal of a problem, but, it is still the correct
approach.
Fixes: 0bcc1816188e57 ("[NET] netconsole: Support dynamic reconfiguration using configfs")
Signed-off-by: Breno Leitao <leitao@debian.org>
Link: https://patch.msgid.link/20260427-netconsole_ai_fixes-v2-3-59965f29d9cc@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Nikola Z. Ivanov <zlatistiv@gmail.com>
Date: Sun Apr 26 23:14:34 2026 +0300
netdevsim: zero initialize struct iphdr in dummy sk_buff
[ Upstream commit 35eaa6d8d6c2ee65e96f507add856e0eacf24591 ]
Syzbot reports a KMSAN uninit-value originating from
nsim_dev_trap_skb_build, with the allocation also
being performed in the same function.
Fix this by calling skb_put_zero instead of skb_put to
guarantee zero initialization of the whole IP header.
Closes: https://syzkaller.appspot.com/bug?extid=23d7fcd204e3837866ff
Fixes: da58f90f11f5 ("netdevsim: Add devlink-trap support")
Signed-off-by: Nikola Z. Ivanov <zlatistiv@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260426201434.742030-1-zlatistiv@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Pablo Neira Ayuso <pablo@netfilter.org>
Date: Mon Apr 20 23:15:32 2026 +0200
netfilter: arp_tables: fix IEEE1394 ARP payload parsing
[ Upstream commit 1e8e3f449b1e73b73a843257635b9c50f0cc0f0a ]
Weiming Shi says:
"arp_packet_match() unconditionally parses the ARP payload assuming two
hardware addresses are present (source and target). However,
IPv4-over-IEEE1394 ARP (RFC 2734) omits the target hardware address
field, and arp_hdr_len() already accounts for this by returning a
shorter length for ARPHRD_IEEE1394 devices.
As a result, on IEEE1394 interfaces arp_packet_match() advances past a
nonexistent target hardware address and reads the wrong bytes for both
the target device address comparison and the target IP address. This
causes arptables rules to match against garbage data, leading to
incorrect filtering decisions: packets that should be accepted may be
dropped and vice versa.
The ARP stack in net/ipv4/arp.c (arp_create and arp_process) already
handles this correctly by skipping the target hardware address for
ARPHRD_IEEE1394. Apply the same pattern to arp_packet_match()."
Mangle the original patch to always return 0 (no match) in case user
matches on the target hardware address which is never present in
IEEE1394.
Note that this returns 0 (no match) for either normal and inverse match
because matching in the target hardware address in ARPHRD_IEEE1394 has
never been supported by arptables. This is intentional, matching on the
target hardware address should never evaluate true for ARPHRD_IEEE1394.
Moreover, adjust arpt_mangle to drop the packet too as AI suggests:
In arpt_mangle, the logic assumes a standard ARP layout. Because
IEEE1394 (FireWire) omits the target hardware address, the linear
pointer arithmetic miscalculates the offset for the target IP address.
This causes mangling operations to write to the wrong location, leading
to packet corruption. To ensure safety, this patch drops packets
(NF_DROP) when mangling is requested for these fields on IEEE1394
devices, as the current implementation cannot correctly map the FireWire
ARP payload.
This omits both mangling target hardware and IP address. Even if IP
address mangling should be possible in IEEE1394, this would require
to adjust arpt_mangle offset calculation, which has never been
supported.
Based on patch from Weiming Shi <bestswngs@gmail.com>.
Fixes: 6752c8db8e0c ("firewire net, ipv4 arp: Extend hardware address and remove driver-level packet inspection.")
Reported-by: Xiang Mei <xmei5@asu.edu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Florian Westphal <fw@strlen.de>
Date: Tue Apr 14 19:13:46 2026 +0200
netfilter: conntrack: remove sprintf usage
[ Upstream commit 6e7066bdb481a87fe88c4fa563e348c03b2d373d ]
Replace it with scnprintf, the buffer sizes are expected to be large enough
to hold the result, no need for snprintf+overflow check.
Increase buffer size in mangle_content_len() while at it.
BUG: KASAN: stack-out-of-bounds in vsnprintf+0xea5/0x1270
Write of size 1 at addr [..]
vsnprintf+0xea5/0x1270
sprintf+0xb1/0xe0
mangle_content_len+0x1ac/0x280
nf_nat_sdp_session+0x1cc/0x240
process_sdp+0x8f8/0xb80
process_invite_request+0x108/0x2b0
process_sip_msg+0x5da/0xf50
sip_help_tcp+0x45e/0x780
nf_confirm+0x34d/0x990
[..]
Fixes: 9fafcd7b2032 ("[NETFILTER]: nf_conntrack/nf_nat: add SIP helper port")
Reported-by: Yiming Qian <yimingqian591@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Florian Westphal <fw@strlen.de>
Date: Thu Apr 23 02:19:11 2026 +0200
netfilter: nf_conntrack_sip: don't use simple_strtoul
[ Upstream commit 8cf6809cddcbe301aedfc6b51bcd4944d45795f6 ]
Replace unsafe port parsing in epaddr_len(), ct_sip_parse_header_uri(),
and ct_sip_parse_request() with a new sip_parse_port() helper that
validates each digit against the buffer limit, eliminating the use of
simple_strtoul() which assumes NUL-terminated strings.
The previous code dereferenced pointers without bounds checks after
sip_parse_addr() and relied on simple_strtoul() on non-NUL-terminated
skb data. A port that reaches the buffer limit without a trailing
character is also rejected as malformed.
Also get rid of all simple_strtoul() usage in conntrack, prefer a
stricter version instead. There are intentional changes:
- Bail out if number is > UINT_MAX and indicate a failure, same for
too long sequences.
While we do accept 05535 as port 5535, we will not accept e.g.
'sip:10.0.0.1:005060'. While its syntactically valid under RFC 3261,
we should restrict this to not waste cycles when presented with
malformed packets with 64k '0' characters.
- Force base 10 in ct_sip_parse_numerical_param(). This is used to fetch
'expire=' and 'rports='; both are expected to use base-10.
- In nf_nat_sip.c, only accept the parsed value if its within the 1k-64k
range.
- epaddr_len now returns 0 if the port is invalid, as it already does
for invalid ip addresses. This is intentional. nf_conntrack_sip
performs lots of guesswork to find the right parts of the message
to parse. Being stricter could break existing setups.
Connection tracking helpers are designed to allow traffic to
pass, not to block it.
Based on an earlier patch from Jenny Guanni Qu <qguanni@gmail.com>.
Fixes: 05e3ced297fe ("[NETFILTER]: nf_conntrack_sip: introduce SIP-URI parsing helper")
Reported-by: Klaudia Kloc <klaudia@vidocsecurity.com>
Reported-by: Dawid Moczadło <dawid@vidocsecurity.com>
Reported-by: Jenny Guanni Qu <qguanni@gmail.com>.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Li Xiasong <lixiasong1@huawei.com>
Date: Thu May 7 22:04:22 2026 +0800
netfilter: nf_conntrack_sip: get helper before allocating expectation
commit eb6317739b1ea3ab28791e1f91b24781905fa815 upstream.
process_register_request() allocates an expectation and then checks
whether a conntrack helper is available. If helper lookup fails, the
function returns early and the allocated expectation is left behind.
Reorder the code to fetch and validate helper before calling
nf_ct_expect_alloc(). This keeps the logic simpler and removes the leak
path while preserving existing behavior.
Fixes: e14575fa7529 ("netfilter: nf_conntrack: use rcu accessors where needed")
Cc: stable@vger.kernel.org
Signed-off-by: Li Xiasong <lixiasong1@huawei.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Pablo Neira Ayuso <pablo@netfilter.org>
Date: Tue May 19 15:55:04 2026 +0800
netfilter: nf_tables: unconditionally bump set->nelems before insertion
[ Upstream commit def602e498a4f951da95c95b1b8ce8ae68aa733a ]
In case that the set is full, a new element gets published then removed
without waiting for the RCU grace period, while RCU reader can be
walking over it already.
To address this issue, add the element transaction even if set is full,
but toggle the set_full flag to report -ENFILE so the abort path safely
unwinds the set to its previous state.
As for element updates, decrement set->nelems to restore it.
A simpler fix is to call synchronize_rcu() in the error path.
However, with a large batch adding elements to already maxed-out set,
this could cause noticeable slowdown of such batches.
Fixes: 35d0ac9070ef ("netfilter: nf_tables: fix set->nelems counting with no NLM_F_EXCL")
Reported-by: Inseo An <y0un9sa@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Florian Westphal <fw@strlen.de>
[ Minor conflict resolved. ]
Signed-off-by: Li hongliang <1468888505@139.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Xiang Mei <xmei5@asu.edu>
Date: Tue Apr 14 15:14:01 2026 -0700
netfilter: nfnetlink_osf: fix divide-by-zero in OSF_WSS_MODULO
[ Upstream commit 2195574dc6d9017d32ac346987e12659f931d932 ]
nf_osf_match_one() computes ctx->window % f->wss.val in the
OSF_WSS_MODULO branch with no guard for f->wss.val == 0. A
CAP_NET_ADMIN user can add such a fingerprint via nfnetlink; a
subsequent matching TCP SYN divides by zero and panics the kernel.
Reject the bogus fingerprint in nfnl_osf_add_callback() above the
per-option for-loop. f->wss is per-fingerprint, not per-option, so
the check must run regardless of f->opt_num (including 0). Also
reject wss.wc >= OSF_WSS_MAX; nf_osf_match_one() already treats that
as "should not happen".
Crash:
Oops: divide error: 0000 [#1] SMP KASAN NOPTI
RIP: 0010:nf_osf_match_one (net/netfilter/nfnetlink_osf.c:98)
Call Trace:
<IRQ>
nf_osf_match (net/netfilter/nfnetlink_osf.c:220)
xt_osf_match_packet (net/netfilter/xt_osf.c:32)
ipt_do_table (net/ipv4/netfilter/ip_tables.c:348)
nf_hook_slow (net/netfilter/core.c:622)
ip_local_deliver (net/ipv4/ip_input.c:265)
ip_rcv (include/linux/skbuff.h:1162)
__netif_receive_skb_one_core (net/core/dev.c:6181)
process_backlog (net/core/dev.c:6642)
__napi_poll (net/core/dev.c:7710)
net_rx_action (net/core/dev.c:7945)
handle_softirqs (kernel/softirq.c:622)
Fixes: 11eeef41d5f6 ("netfilter: passive OS fingerprint xtables match")
Reported-by: Weiming Shi <bestswngs@gmail.com>
Suggested-by: Florian Westphal <fw@strlen.de>
Suggested-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Xiang Mei <xmei5@asu.edu>
Reviewed-by: Fernando Fernandez Mancera <fmancera@suse.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Fernando Fernandez Mancera <fmancera@suse.de>
Date: Fri Apr 17 18:20:56 2026 +0200
netfilter: nfnetlink_osf: fix out-of-bounds read on option matching
[ Upstream commit f5ca450087c3baf3651055e7a6de92600f827af3 ]
In nf_osf_match(), the nf_osf_hdr_ctx structure is initialized once
and passed by reference to nf_osf_match_one() for each fingerprint
checked. During TCP option parsing, nf_osf_match_one() advances the
shared ctx->optp pointer.
If a fingerprint perfectly matches, the function returns early without
restoring ctx->optp to its initial state. If the user has configured
NF_OSF_LOGLEVEL_ALL, the loop continues to the next fingerprint.
However, because ctx->optp was not restored, the next call to
nf_osf_match_one() starts parsing from the end of the options buffer.
This causes subsequent matches to read garbage data and fail
immediately, making it impossible to log more than one match or logging
incorrect matches.
Instead of using a shared ctx->optp pointer, pass the context as a
constant pointer and use a local pointer (optp) for TCP option
traversal. This makes nf_osf_match_one() strictly stateless from the
caller's perspective, ensuring every fingerprint check starts at the
correct option offset.
Fixes: 1a6a0951fc00 ("netfilter: nfnetlink_osf: add missing fmatch check")
Suggested-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de>
Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Fernando Fernandez Mancera <fmancera@suse.de>
Date: Fri Apr 17 18:20:57 2026 +0200
netfilter: nfnetlink_osf: fix potential NULL dereference in ttl check
[ Upstream commit 711987ba281fd806322a7cd244e98e2a81903114 ]
The nf_osf_ttl() function accessed skb->dev to perform a local interface
address lookup without verifying that the device pointer was valid.
Additionally, the implementation utilized an in_dev_for_each_ifa_rcu
loop to match the packet source address against local interface
addresses. It assumed that packets from the same subnet should not see a
decrement on the initial TTL. A packet might appear it is from the same
subnet but it actually isn't especially in modern environments with
containers and virtual switching.
Remove the device dereference and interface loop. Replace the logic with
a switch statement that evaluates the TTL according to the ttl_check.
Fixes: 11eeef41d5f6 ("netfilter: passive OS fingerprint xtables match")
Reported-by: Kito Xu (veritas501) <hxzene@gmail.com>
Closes: https://lore.kernel.org/netfilter-devel/20260414074556.2512750-1-hxzene@gmail.com/
Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de>
Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Li Xiasong <lixiasong1@huawei.com>
Date: Thu May 7 22:04:23 2026 +0800
netfilter: nft_ct: fix missing expect put in obj eval
commit 19f94b6fee75b3ef7fbc06f3745b9a771a8a19a4 upstream.
nft_ct_expect_obj_eval() allocates an expectation and may call
nf_ct_expect_related(), but never drops its local reference.
Add nf_ct_expect_put(exp) before return to balance allocation.
Fixes: 857b46027d6f ("netfilter: nft_ct: add ct expectations support")
Cc: stable@vger.kernel.org
Signed-off-by: Li Xiasong <lixiasong1@huawei.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Florian Westphal <fw@strlen.de>
Date: Thu Apr 9 13:30:41 2026 +0200
netfilter: nft_fwd_netdev: check ttl/hl before forwarding
[ Upstream commit 1dfd95bdf4d18d263aa8fad06bfb9f4d9c992b18 ]
Drop packets if their ttl/hl is too small for forwarding.
Fixes: d32de98ea70f ("netfilter: nft_fwd_netdev: allow to forward packets via neighbour layer")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Pablo Neira Ayuso <pablo@netfilter.org>
Date: Tue Apr 14 13:06:38 2026 +0200
netfilter: nft_osf: restrict it to ipv4
[ Upstream commit b336fdbb7103fb1484e1dcb6741151d4b5a41e35 ]
This expression only supports for ipv4, restrict it.
Fixes: b96af92d6eaf ("netfilter: nf_tables: implement Passive OS fingerprint module in nft_osf")
Acked-by: Florian Westphal <fw@strlen.de>
Reviewed-by: Fernando Fernandez Mancera <fmancera@suse.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Xin Long <lucien.xin@gmail.com>
Date: Sun Apr 26 10:46:40 2026 -0400
netfilter: skip recording stale or retransmitted INIT
[ Upstream commit 576a5d2bad4814c881a829576b1261b9b8159d2b ]
An INIT whose init_tag matches the peer's vtag does not provide new state
information. It indicates either:
- a stale INIT (after INIT-ACK has already been seen on the same side), or
- a retransmitted INIT (after INIT has already been recorded on the same
side).
In both cases, the INIT must not update ct->proto.sctp.init[] state, since
it does not advance the handshake tracking and may otherwise corrupt
INIT/INIT-ACK validation logic.
Allow INIT processing only when the conntrack entry is newly created
(SCTP_CONNTRACK_NONE), or when the init_tag differs from the stored peer
vtag.
Note it skips the check for the ct with old_state SCTP_CONNTRACK_NONE in
nf_conntrack_sctp_packet(), as it is just created in sctp_new() where it
set ct->proto.sctp.vtag[IP_CT_DIR_REPLY] = ih->init_tag.
Fixes: 9fb9cbb1082d ("[NETFILTER]: Add nf_conntrack subsystem.")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Florian Westphal <fw@strlen.de>
Link: https://patch.msgid.link/ee56c3e416452b2a40589a2a85245ac2ad5e9f4b.1777214801.git.lucien.xin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jiexun Wang <wangjiexun2025@gmail.com>
Date: Fri Apr 17 20:25:06 2026 +0800
netfilter: xt_policy: fix strict mode inbound policy matching
[ Upstream commit 4b2b4d7d4e203c92db8966b163edfacb1f0e1e29 ]
match_policy_in() walks sec_path entries from the last transform to the
first one, but strict policy matching needs to consume info->pol[] in
the same forward order as the rule layout.
Derive the strict-match policy position from the number of transforms
already consumed so that multi-element inbound rules are matched
consistently.
Fixes: c4b885139203 ("[NETFILTER]: x_tables: replace IPv4/IPv6 policy match by address family independant version")
Reported-by: Yuan Tan <yuantan098@gmail.com>
Reported-by: Yifan Wu <yifanwucs@gmail.com>
Reported-by: Juefei Pu <tomapufckgml@gmail.com>
Reported-by: Xin Liu <bird@lzu.edu.cn>
Signed-off-by: Jiexun Wang <wangjiexun2025@gmail.com>
Signed-off-by: Ren Wei <n05ec@lzu.edu.cn>
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Florian Westphal <fw@strlen.de>
Date: Sat Apr 4 12:12:59 2026 +0200
netfilter: xt_socket: enable defrag after all other checks
[ Upstream commit 542be3fa5aff54210a02954c38f07e53ea9bdafd ]
Originally this did not matter because defrag was enabled once per netns
and only disabled again on netns dismantle. When this got changed I should
have adjusted checkentry to not leave defrag enabled on error.
Fixes: de8c12110a13 ("netfilter: disable defrag once its no longer needed")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Pablo Neira Ayuso <pablo@netfilter.org>
Date: Wed Apr 15 12:21:00 2026 +0200
netfilter: xtables: restrict several matches to inet family
[ Upstream commit b6fe26f86a1649f84e057f3f15605b08eda15497 ]
This is a partial revert of:
commit ab4f21e6fb1c ("netfilter: xtables: use NFPROTO_UNSPEC in more extensions")
to allow ipv4 and ipv6 only.
- xt_mac
- xt_owner
- xt_physdev
These extensions are not used by ebtables in userspace.
Moreover, xt_realm is only for ipv4, since dst->tclassid is ipv4
specific.
Fixes: ab4f21e6fb1c ("netfilter: xtables: use NFPROTO_UNSPEC in more extensions")
Reported-by: "Kito Xu (veritas501)" <hxzene@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Paulo Alcantara <pc@manguebit.org>
Date: Tue May 12 13:33:46 2026 +0100
netfs: fix error handling in netfs_extract_user_iter()
commit 0aad5704c6b4d14007d4eab15883e8524e4310f4 upstream.
In netfs_extract_user_iter(), if iov_iter_extract_pages() failed to
extract user pages, bail out on -ENOMEM, otherwise return the error
code only if @npages == 0, allowing short DIO reads and writes to be
issued.
This fixes mmapstress02 from LTP tests against CIFS.
Fixes: 85dd2c8ff368 ("netfs: Add a function to extract a UBUF or IOVEC into a BVEC iterator")
Reported-by: Xiaoli Feng <xifeng@redhat.com>
Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Link: https://patch.msgid.link/20260512123404.719402-10-dhowells@redhat.com
Cc: netfs@lists.linux.dev
Cc: stable@vger.kernel.org
Cc: linux-cifs@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: David Howells <dhowells@redhat.com>
Date: Tue May 12 13:33:45 2026 +0100
netfs: Fix potential uninitialised var in netfs_extract_user_iter()
commit 7e3d8db899d54af39fafb2eb3392b0cdae9973b5 upstream.
In netfs_extract_user_iter(), if it's given a zero-length iterator, it will
fall through the loop without setting ret, and so the error handling
behaviour will be undefined, depending on whether ret happens to be
negative. The value of ret then propagates back up the callstack.
Fix this by presetting ret to 0.
Fixes: 85dd2c8ff368 ("netfs: Add a function to extract a UBUF or IOVEC into a BVEC iterator")
Closes: https://sashiko.dev/#/patchset/20260414082004.3756080-1-dhowells%40redhat.com
Signed-off-by: David Howells <dhowells@redhat.com>
Link: https://patch.msgid.link/20260512123404.719402-9-dhowells@redhat.com
cc: Paulo Alcantara <pc@manguebit.org>
cc: Matthew Wilcox <willy@infradead.org>
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Breno Leitao <leitao@debian.org>
Date: Wed Jun 18 02:32:45 2025 -0700
netpoll: Extract carrier wait function
[ Upstream commit 76d30b51e818064e02917ce6328fb2c8adce5c87 ]
Extract the carrier waiting logic into a dedicated helper function
netpoll_wait_carrier() to improve code readability and reduce
duplication in netpoll_setup().
Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250618-netpoll_ip_ref-v1-1-c2ac00fe558f@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Stable-dep-of: 3bc179bc7146 ("netpoll: fix IPv6 local-address corruption")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Breno Leitao <leitao@debian.org>
Date: Wed Jun 18 02:32:46 2025 -0700
netpoll: extract IPv4 address retrieval into helper function
[ Upstream commit 3699f992e8c22d3ce54d2c1a5774e2c49028f99c ]
Move the IPv4 address retrieval logic from netpoll_setup() into a
separate netpoll_take_ipv4() function to improve code organization
and readability. This change consolidates the IPv4-specific logic
and error handling into a dedicated function while maintaining
the same functionality.
No functional changes.
Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250618-netpoll_ip_ref-v1-2-c2ac00fe558f@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Stable-dep-of: 3bc179bc7146 ("netpoll: fix IPv6 local-address corruption")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Breno Leitao <leitao@debian.org>
Date: Wed Jun 18 02:32:47 2025 -0700
netpoll: Extract IPv6 address retrieval function
[ Upstream commit 6ad7969a361cbec5822285fb39203678ff462b64 ]
Extract the IPv6 address retrieval logic from netpoll_setup() into
a dedicated helper function netpoll_take_ipv6() to improve code
organization and readability.
The function handles obtaining the local IPv6 address from the
network device, including proper address type matching between
local and remote addresses (link-local vs global), and includes
appropriate error handling when IPv6 is not supported or no
suitable address is available.
Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250618-netpoll_ip_ref-v1-3-c2ac00fe558f@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Stable-dep-of: 76b93a810757 ("netpoll: pass buffer size to egress_dev() to avoid MAC truncation")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Breno Leitao <leitao@debian.org>
Date: Fri Apr 24 08:31:16 2026 -0700
netpoll: fix IPv6 local-address corruption
[ Upstream commit 3bc179bc7146c26c9dff75d2943d10528274e301 ]
netpoll_setup() decides whether to auto-populate the local source
address by testing np->local_ip.ip, which only inspects the first 4
bytes of the union inet_addr storage.
For an IPv6 netpoll whose caller-supplied local address has a zero
high-32 bits (::1, ::<suffix>, IPv4-mapped ::ffff:a.b.c.d, etc.), this
misdetects the address as unset (which they are not, but the first
4 bytes are empty), calls netpoll_take_ipv6() and overwrites it with
whatever matching link-local/global address the device happens to expose
first.
Introduce a helper netpoll_local_ip_unset() that picks the correct
family-aware test (ipv6_addr_any() for IPv6, !.ip for IPv4) and use it
from netpoll_setup().
Reproducer is something like:
echo "::2" > local_ip
echo 1 > enabled
cat local_ip
# before this fix: 2001:db8::1 (caller-supplied ::2 was clobbered)
# after this fix: ::2
Fixes: b7394d2429c1 ("netpoll: prepare for ipv6")
Signed-off-by: Breno Leitao <leitao@debian.org>
Link: https://patch.msgid.link/20260424-netpoll_fix-v1-1-3a55348c625f@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Breno Leitao <leitao@debian.org>
Date: Fri May 1 02:58:41 2026 -0700
netpoll: pass buffer size to egress_dev() to avoid MAC truncation
[ Upstream commit 76b93a8107574006b25495664304ea9237494d70 ]
egress_dev() formats np->dev_mac via snprintf() but receives buf as
a bare char *, so it cannot derive the buffer size from the pointer. The
size argument was hardcoded to MAC_ADDR_STR_LEN (3 * ETH_ALEN - 1 = 17),
which is silly wrong in two ways:
1) misleading kernel log output on the MAC-selected target path
(np->dev_name[0] == '\0'); for example "aa:bb:cc:dd:ee:ff doesn't
exist, aborting" was logged as "aa:bb:cc:dd:ee:f doesn't exist,
aborting".
2) the second argument of snprintf is the size of the buffer, not the
size of what you want to write.
Add a bufsz parameter to egress_dev() and pass sizeof(buf) from each
caller, matching the standard snprintf() idiom and removing the
hardcoded size from the helper.
Every caller already declares "char buf[MAC_ADDR_STR_LEN + 1]" so the
formatted MAC continues to fit.
Tested by booting with
netconsole=6665@/aa:bb:cc:dd:ee:ff,6666@10.0.0.1/00:11:22:33:44:55
on a kernel without a matching device. Pre-fix dmesg shows
"aa:bb:cc:dd:ee:f doesn't exist, aborting"; post-fix shows the full
"aa:bb:cc:dd:ee:ff doesn't exist, aborting".
Fixes: f8a10bed32f5 ("netconsole: allow selection of egress interface via MAC address")
Cc: stable@vger.kernel.org
Signed-off-by: Breno Leitao <leitao@debian.org>
Link: https://patch.msgid.link/20260501-netpoll_snprintf_fix-v1-1-84b0566e6597@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jiayuan Chen <jiayuan.chen@linux.dev>
Date: Mon Apr 13 19:45:19 2026 +0800
nexthop: fix IPv6 route referencing IPv4 nexthop
[ Upstream commit 29c95185ba32b621fbc3800fb86e7dc3edf5c2be ]
syzbot reported a panic [1] [2].
When an IPv6 nexthop is replaced with an IPv4 nexthop, the has_v4 flag
of all groups containing this nexthop is not updated. This is because
nh_group_v4_update is only called when replacing AF_INET to AF_INET6,
but the reverse direction (AF_INET6 to AF_INET) is missed.
This allows a stale has_v4=false to bypass fib6_check_nexthop, causing
IPv6 routes to be attached to groups that effectively contain only AF_INET
members. Subsequent route lookups then call nexthop_fib6_nh() which
returns NULL for the AF_INET member, leading to a NULL pointer
dereference.
Fix by calling nh_group_v4_update whenever the family changes, not just
AF_INET to AF_INET6.
Reproducer:
# AF_INET6 blackhole
ip -6 nexthop add id 1 blackhole
# group with has_v4=false
ip nexthop add id 100 group 1
# replace with AF_INET (no -6), has_v4 stays false
ip nexthop replace id 1 blackhole
# pass stale has_v4 check
ip -6 route add 2001:db8::/64 nhid 100
# panic
ping -6 2001:db8::1
[1] https://syzkaller.appspot.com/bug?id=e17283eb2f8dcf3dd9b47fe6f67a95f71faadad0
[2] https://syzkaller.appspot.com/bug?id=8699b6ae54c9f35837d925686208402949e12ef3
Fixes: 7bf4796dd099 ("nexthops: add support for replace")
Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20260413114522.147784-1-jiayuan.chen@linux.dev
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Paul Geurts <paul.geurts@prodrive-technologies.com>
Date: Wed Apr 22 12:09:30 2026 +0200
NFC: trf7970a: Ignore antenna noise when checking for RF field
[ Upstream commit a9bc28aa4e64320668131349436a650bf42591a5 ]
The main channel Received Signal Strength Indicator (RSSI) measurement
is used to determine whether an RF field is present or not. RSSI != 0
is interpreted as an RF Field is present. This does not take RF noise
and measurement inaccuracy into account, and results in false positives
in the field.
Define a noise level and make sure the RF field is only interpreted as
present when the RSSI is above the noise level.
Fixes: 851ee3cbf850 ("NFC: trf7970a: Don't turn on RF if there is already an RF field")
Signed-off-by: Paul Geurts <paul.geurts@prodrive-technologies.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com>
Reviewed-by: Mark Greer <mgreer@animalcreek.com>
Link: https://patch.msgid.link/20260422100930.581237-1-paul.geurts@prodrive-technologies.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Alexey Kodanev <aleksei.kodanev@bell-sw.com>
Date: Wed Apr 22 16:05:36 2026 +0000
nfp: fix swapped arguments in nfp_encode_basic_qdr() calls
[ Upstream commit 4078c5611d7585548b249377ebd60c272e410490 ]
There is a mismatch between the passed arguments and the actual
nfp_encode_basic_qdr() function parameter names:
static int nfp_encode_basic_qdr(u64 addr, int dest_island, int cpp_tgt,
int mode, bool addr40, int isld1,
int isld0)
{
...
But "dest_island" and "cpp_tgt" are swapped at every call-site.
For example:
return nfp_encode_basic_qdr(*addr, cpp_tgt, dest_island,
mode, addr40, isld1, isld0);
As a result, nfp_encode_basic_qdr() receives "dest_island" as CPP target
type, which is always NFP_CPP_TARGET_QDR(2) for these calls, and "cpp_tgt"
as the destination island ID, which can accidentally match or be outside
the valid NFP_CPP_TARGET_* types (e.g. '-1' for any destination).
Since code already worked for years, also add extra pr_warn() to error
paths in nfp_encode_basic_qdr() to help identify any potential address
verification failures.
Detected using the static analysis tool - Svace.
Fixes: 4cb584e0ee7d ("nfp: add CPP access core")
Signed-off-by: Alexey Kodanev <aleksei.kodanev@bell-sw.com>
Link: https://patch.msgid.link/20260422160536.61855-1-aleksei.kodanev@bell-sw.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Date: Wed Feb 4 21:21:49 2026 +0100
nfs/blocklayout: Fix compilation error (`make W=1`) in bl_write_pagelist()
[ Upstream commit f83c8dda456ce4863f346aa26d88efa276eda35d ]
Clang compiler is not happy about set but unused variable
(when dprintk() is no-op):
.../blocklayout/blocklayout.c:384:9: error: variable 'count' set but not used [-Werror,-Wunused-but-set-variable]
Remove a leftover from the previous cleanup.
Fixes: 3a6fd1f004fc ("pnfs/blocklayout: remove read-modify-write handling in bl_write_pagelist")
Acked-by: Anna Schumaker <anna.schumkaer@oracle.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Deepanshu Kartikey <kartikey406@gmail.com>
Date: Wed Apr 1 02:52:09 2026 +0900
nilfs2: reject zero bd_oblocknr in nilfs_ioctl_mark_blocks_dirty()
[ Upstream commit be3e5d10643d3be1cbac9d9939f220a99253f980 ]
nilfs_ioctl_mark_blocks_dirty() uses bd_oblocknr to detect dead blocks
by comparing it with the current block number bd_blocknr. If they differ,
the block is considered dead and skipped.
However, bd_oblocknr should never be 0 since block 0 typically stores the
primary superblock and is never a valid GC target block. A corrupted ioctl
request with bd_oblocknr set to 0 causes the comparison to incorrectly
match when the lookup returns -ENOENT and sets bd_blocknr to 0, bypassing
the dead block check and calling nilfs_bmap_mark() on a non-existent
block. This causes nilfs_btree_do_lookup() to return -ENOENT, triggering
the WARN_ON(ret == -ENOENT).
Fix this by rejecting ioctl requests with bd_oblocknr set to 0 at the
beginning of each iteration.
[ryusuke: slightly modified the commit message and comments for accuracy]
Fixes: 7942b919f732 ("nilfs2: ioctl operations")
Reported-by: syzbot+98a040252119df0506f8@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=98a040252119df0506f8
Suggested-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Signed-off-by: Deepanshu Kartikey <Kartikey406@gmail.com>
Reported-by: syzbot+466a45fcfb0562f5b9a0@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=466a45fcfb0562f5b9a0
Cc: Junjie Cao <junjie.cao@linux.dev>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Al Viro <viro@zeniv.linux.org.uk>
Date: Mon May 18 12:21:30 2026 +0800
ntfs: ->d_compare() must not block
[ Upstream commit ca2a04e84af79596e5cd9cfe697d5122ec39c8ce ]
... so don't use __getname() there. Switch it (and ntfs_d_hash(), while
we are at it) to kmalloc(PATH_MAX, GFP_NOWAIT). Yes, ntfs_d_hash()
almost certainly can do with smaller allocations, but let ntfs folks
deal with that - keep the allocation size as-is for now.
Stop abusing names_cachep in ntfs, period - various uses of that thing
in there have nothing to do with pathnames; just use k[mz]alloc() and
be done with that. For now let's keep sizes as-in, but AFAICS none of
the users actually want PATH_MAX.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Li hongliang <1468888505@139.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Keith Busch <kbusch@kernel.org>
Date: Tue Apr 21 09:14:02 2026 -0700
nvme-pci: fix missed admin queue sq doorbell write
[ Upstream commit 1cc4cdae2a3b7730d462d69e30f213fd2efe7807 ]
We can batch admin commands submitted through io_uring_cmd passthrough,
which means bd->last may be false and skips the doorbell write to
aggregate multiple commands per write. If a subsequent command can't be
dispatched for whatever reason, we have to provide the blk-mq ops'
commit_rqs callback in order to ensure we properly update the doorbell.
Fixes: 58e5bdeb9c2b ("nvme: enable uring-passthrough for admin commands")
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Kanchan Joshi <joshi.k@samsung.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Maurizio Lombardi <mlombard@redhat.com>
Date: Mon Mar 16 15:39:35 2026 +0100
nvmet-tcp: propagate nvmet_tcp_build_pdu_iovec() errors to its callers
[ Upstream commit ea8e356acb165cb1fd75537a52e1f66e5e76c538 ]
Currently, when nvmet_tcp_build_pdu_iovec() detects an out-of-bounds
PDU length or offset, it triggers nvmet_tcp_fatal_error(cmd->queue)
and returns early. However, because the function returns void, the
callers are entirely unaware that a fatal error has occurred and
that the cmd->recv_msg.msg_iter was left uninitialized.
Callers such as nvmet_tcp_handle_h2c_data_pdu() proceed to blindly
overwrite the queue state with queue->rcv_state = NVMET_TCP_RECV_DATA
Consequently, the socket receiving loop may attempt to read incoming
network data into the uninitialized iterator.
Fix this by shifting the error handling responsibility to the callers.
Fixes: 52a0a9854934 ("nvmet-tcp: add bounds checks in nvmet_tcp_build_pdu_iovec")
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Yunje Shin <ioerts@kookmin.ac.kr>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Junrui Luo <moonafterrain@outlook.com>
Date: Sat Mar 7 15:21:09 2026 +0800
ocfs2/dlm: fix off-by-one in dlm_match_regions() region comparison
[ Upstream commit 01b61e8dda9b0fdb0d4cda43de25f4e390554d7b ]
The local-vs-remote region comparison loop uses '<=' instead of '<',
causing it to read one entry past the valid range of qr_regions. The
other loops in the same function correctly use '<'.
Fix the loop condition to use '<' for consistency and correctness.
Link: https://lkml.kernel.org/r/SYBPR01MB78813DA26B50EC5E01F00566AF7BA@SYBPR01MB7881.ausprd01.prod.outlook.com
Fixes: ea2034416b54 ("ocfs2/dlm: Add message DLM_QUERY_REGION")
Signed-off-by: Junrui Luo <moonafterrain@outlook.com>
Reported-by: Yuhao Jiang <danisjiang@gmail.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Jun Piao <piaojun@huawei.com>
Cc: Heming Zhao <heming.zhao@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Junrui Luo <moonafterrain@outlook.com>
Date: Sat Mar 7 15:21:08 2026 +0800
ocfs2/dlm: validate qr_numregions in dlm_match_regions()
[ Upstream commit 7ab3fbb01bc6d79091bc375e5235d360cd9b78be ]
Patch series "ocfs2/dlm: fix two bugs in dlm_match_regions()".
In dlm_match_regions(), the qr_numregions field from a DLM_QUERY_REGION
network message is used to drive loops over the qr_regions buffer without
sufficient validation. This series fixes two issues:
- Patch 1 adds a bounds check to reject messages where qr_numregions
exceeds O2NM_MAX_REGIONS. The o2net layer only validates message
byte length; it does not constrain field values, so a crafted message
can set qr_numregions up to 255 and trigger out-of-bounds reads past
the 1024-byte qr_regions buffer.
- Patch 2 fixes an off-by-one in the local-vs-remote comparison loop,
which uses '<=' instead of '<', reading one entry past the valid range
even when qr_numregions is within bounds.
This patch (of 2):
The qr_numregions field from a DLM_QUERY_REGION network message is used
directly as loop bounds in dlm_match_regions() without checking against
O2NM_MAX_REGIONS. Since qr_regions is sized for at most O2NM_MAX_REGIONS
(32) entries, a crafted message with qr_numregions > 32 causes
out-of-bounds reads past the qr_regions buffer.
Add a bounds check for qr_numregions before entering the loops.
Link: https://lkml.kernel.org/r/SYBPR01MB7881A334D02ACEE5E0645801AF7BA@SYBPR01MB7881.ausprd01.prod.outlook.com
Link: https://lkml.kernel.org/r/SYBPR01MB788166F524AD04E262E174BEAF7BA@SYBPR01MB7881.ausprd01.prod.outlook.com
Fixes: ea2034416b54 ("ocfs2/dlm: Add message DLM_QUERY_REGION")
Signed-off-by: Junrui Luo <moonafterrain@outlook.com>
Reported-by: Yuhao Jiang <danisjiang@gmail.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Jun Piao <piaojun@huawei.com>
Cc: Heming Zhao <heming.zhao@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: ZhengYuan Huang <gality369@gmail.com>
Date: Fri Apr 10 12:03:39 2026 +0800
ocfs2: fix listxattr handling when the buffer is full
[ Upstream commit d12f558e6200b3f47dbef9331ed6d115d2410e59 ]
[BUG]
If an OCFS2 inode has both inline and block-based xattrs, listxattr()
can return a size larger than the caller's buffer when the inline names
consume that buffer exactly.
kernel BUG at mm/usercopy.c:102!
Oops: invalid opcode: 0000 [#1] SMP KASAN NOPTI
RIP: 0010:usercopy_abort+0xb7/0xd0 mm/usercopy.c:102
Call Trace:
__check_heap_object+0xe3/0x120 mm/slub.c:8243
check_heap_object mm/usercopy.c:196 [inline]
__check_object_size mm/usercopy.c:250 [inline]
__check_object_size+0x5c5/0x780 mm/usercopy.c:215
check_object_size include/linux/ucopysize.h:22 [inline]
check_copy_size include/linux/ucopysize.h:59 [inline]
copy_to_user include/linux/uaccess.h:219 [inline]
listxattr+0xb0/0x170 fs/xattr.c:926
filename_listxattr fs/xattr.c:958 [inline]
path_listxattrat+0x137/0x320 fs/xattr.c:988
__do_sys_listxattr fs/xattr.c:1001 [inline]
__se_sys_listxattr fs/xattr.c:998 [inline]
__x64_sys_listxattr+0x7f/0xd0 fs/xattr.c:998
...
[CAUSE]
Commit 936b8834366e ("ocfs2: Refactor xattr list and remove
ocfs2_xattr_handler().") replaced the old per-handler list accounting
with ocfs2_xattr_list_entry(), but it kept using size == 0 to detect
probe mode.
That assumption stops being true once ocfs2_listxattr() finishes the
inline-xattr pass. If the inline names fill the caller buffer exactly,
the block-xattr pass runs with a non-NULL buffer and a remaining size of
zero. ocfs2_xattr_list_entry() then skips the bounds check, keeps
counting block names, and returns a positive size larger than the
supplied buffer.
[FIX]
Detect probe mode by testing whether the destination buffer pointer is
NULL instead of whether the remaining size is zero.
That restores the pre-refactor behavior and matches the OCFS2 getxattr
helpers. Once the remaining buffer reaches zero while more names are
left, the block-xattr pass now returns -ERANGE instead of reporting a
size larger than the allocated list buffer.
Link: https://lkml.kernel.org/r/20260410040339.3837162-1-gality369@gmail.com
Fixes: 936b8834366e ("ocfs2: Refactor xattr list and remove ocfs2_xattr_handler().")
Signed-off-by: ZhengYuan Huang <gality369@gmail.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Jun Piao <piaojun@huawei.com>
Cc: Heming Zhao <heming.zhao@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: ZhengYuan Huang <gality369@gmail.com>
Date: Fri Apr 10 11:42:20 2026 +0800
ocfs2: validate bg_bits during freefrag scan
[ Upstream commit 8f687eeed3da3012152b0f9473f578869de0cd7b ]
[BUG]
A crafted filesystem can trigger an out-of-bounds bitmap walk when
OCFS2_IOC_INFO is issued with OCFS2_INFO_FL_NON_COHERENT.
BUG: KASAN: use-after-free in instrument_atomic_read include/linux/instrumented.h:68 [inline]
BUG: KASAN: use-after-free in _test_bit include/asm-generic/bitops/instrumented-non-atomic.h:141 [inline]
BUG: KASAN: use-after-free in test_bit_le include/asm-generic/bitops/le.h:21 [inline]
BUG: KASAN: use-after-free in ocfs2_info_freefrag_scan_chain fs/ocfs2/ioctl.c:495 [inline]
BUG: KASAN: use-after-free in ocfs2_info_freefrag_scan_bitmap fs/ocfs2/ioctl.c:588 [inline]
BUG: KASAN: use-after-free in ocfs2_info_handle_freefrag fs/ocfs2/ioctl.c:662 [inline]
BUG: KASAN: use-after-free in ocfs2_info_handle_request+0x1c66/0x3370 fs/ocfs2/ioctl.c:754
Read of size 8 at addr ffff888031bce000 by task syz.0.636/1435
Call Trace:
__dump_stack lib/dump_stack.c:94 [inline]
dump_stack_lvl+0xbe/0x130 lib/dump_stack.c:120
print_address_description mm/kasan/report.c:378 [inline]
print_report+0xd1/0x650 mm/kasan/report.c:482
kasan_report+0xfb/0x140 mm/kasan/report.c:595
check_region_inline mm/kasan/generic.c:186 [inline]
kasan_check_range+0x11c/0x200 mm/kasan/generic.c:200
__kasan_check_read+0x11/0x20 mm/kasan/shadow.c:31
instrument_atomic_read include/linux/instrumented.h:68 [inline]
_test_bit include/asm-generic/bitops/instrumented-non-atomic.h:141 [inline]
test_bit_le include/asm-generic/bitops/le.h:21 [inline]
ocfs2_info_freefrag_scan_chain fs/ocfs2/ioctl.c:495 [inline]
ocfs2_info_freefrag_scan_bitmap fs/ocfs2/ioctl.c:588 [inline]
ocfs2_info_handle_freefrag fs/ocfs2/ioctl.c:662 [inline]
ocfs2_info_handle_request+0x1c66/0x3370 fs/ocfs2/ioctl.c:754
ocfs2_info_handle+0x18d/0x2a0 fs/ocfs2/ioctl.c:828
ocfs2_ioctl+0x632/0x6e0 fs/ocfs2/ioctl.c:913
vfs_ioctl fs/ioctl.c:51 [inline]
__do_sys_ioctl fs/ioctl.c:597 [inline]
__se_sys_ioctl fs/ioctl.c:583 [inline]
__x64_sys_ioctl+0x197/0x1e0 fs/ioctl.c:583
...
[CAUSE]
ocfs2_info_freefrag_scan_chain() uses on-disk bg_bits directly as the
bitmap scan limit. The coherent path reads group descriptors through
ocfs2_read_group_descriptor(), which validates the descriptor before
use. The non-coherent path uses ocfs2_read_blocks_sync() instead and
skips that validation, so an impossible bg_bits value can drive the
bitmap walk past the end of the block.
[FIX]
Compute the bitmap capacity from the filesystem format with
ocfs2_group_bitmap_size(), report descriptors whose bg_bits exceeds
that limit, and clamp the scan to the computed capacity. This keeps the
freefrag report going while avoiding reads beyond the buffer.
Link: https://lkml.kernel.org/r/20260410034220.3825769-1-gality369@gmail.com
Fixes: d24a10b9f8ed ("Ocfs2: Add a new code 'OCFS2_INFO_FREEFRAG' for o2info ioctl.")
Signed-off-by: ZhengYuan Huang <gality369@gmail.com>
Reviewed-by: Heming Zhao <heming.zhao@suse.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Jun Piao <piaojun@huawei.com>
Cc: Heming Zhao <heming.zhao@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: ZhengYuan Huang <gality369@gmail.com>
Date: Fri Apr 10 10:02:08 2026 +0800
ocfs2: validate group add input before caching
[ Upstream commit 70b672833f4025341c11b22c7f83778a5cd611bc ]
[BUG]
OCFS2_IOC_GROUP_ADD can trigger a BUG_ON in
ocfs2_set_new_buffer_uptodate():
kernel BUG at fs/ocfs2/uptodate.c:509!
Oops: invalid opcode: 0000 [#1] SMP KASAN NOPTI
RIP: 0010:ocfs2_set_new_buffer_uptodate+0x194/0x1e0 fs/ocfs2/uptodate.c:509
Code: ffffe88f 42b9fe4c 89e64889 dfe8b4df
Call Trace:
ocfs2_group_add+0x3f1/0x1510 fs/ocfs2/resize.c:507
ocfs2_ioctl+0x309/0x6e0 fs/ocfs2/ioctl.c:887
vfs_ioctl fs/ioctl.c:51 [inline]
__do_sys_ioctl fs/ioctl.c:597 [inline]
__se_sys_ioctl fs/ioctl.c:583 [inline]
__x64_sys_ioctl+0x197/0x1e0 fs/ioctl.c:583
x64_sys_call+0x1144/0x26a0 arch/x86/include/generated/asm/syscalls_64.h:17
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0x93/0xf80 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x76/0x7e
RIP: 0033:0x7bbfb55a966d
[CAUSE]
ocfs2_group_add() calls ocfs2_set_new_buffer_uptodate() on a
user-controlled group block before ocfs2_verify_group_and_input()
validates that block number. That helper is only valid for newly
allocated metadata and asserts that the block is not already present in
the chosen metadata cache. The code also uses INODE_CACHE(inode) even
though the group descriptor belongs to main_bm_inode and later journal
accesses use that cache context instead.
[FIX]
Validate the on-disk group descriptor before caching it, then add it to
the metadata cache tracked by INODE_CACHE(main_bm_inode). Keep the
validation failure path separate from the later cleanup path so we only
remove the buffer from that cache after it has actually been inserted.
This keeps the group buffer lifetime consistent across validation,
journaling, and cleanup.
Link: https://lkml.kernel.org/r/20260410020209.3786348-1-gality369@gmail.com
Fixes: 7909f2bf8353 ("[PATCH 2/2] ocfs2: Implement group add for online resize")
Signed-off-by: ZhengYuan Huang <gality369@gmail.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Jun Piao <piaojun@huawei.com>
Cc: Heming Zhao <heming.zhao@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Weiming Shi <bestswngs@gmail.com>
Date: Wed Apr 15 19:46:54 2026 -0700
openvswitch: cap upcall PID array size and pre-size vport replies
[ Upstream commit 2091c6aa0df6aba47deb5c8ab232b1cb60af3519 ]
The vport netlink reply helpers allocate a fixed-size skb with
nlmsg_new(NLMSG_DEFAULT_SIZE, ...) but serialize the full upcall PID
array via ovs_vport_get_upcall_portids(). Since
ovs_vport_set_upcall_portids() accepts any non-zero multiple of
sizeof(u32) with no upper bound, a CAP_NET_ADMIN user can install a PID
array large enough to overflow the reply buffer, causing nla_put() to
fail with -EMSGSIZE and hitting BUG_ON(err < 0). On systems with
unprivileged user namespaces enabled (e.g., Ubuntu default), this is
reachable via unshare -Urn since OVS vport mutation operations use
GENL_UNS_ADMIN_PERM.
kernel BUG at net/openvswitch/datapath.c:2414!
Oops: invalid opcode: 0000 [#1] SMP KASAN NOPTI
CPU: 1 UID: 0 PID: 65 Comm: poc Not tainted 7.0.0-rc7-00195-geb216e422044 #1
RIP: 0010:ovs_vport_cmd_set+0x34c/0x400
Call Trace:
<TASK>
genl_family_rcv_msg_doit (net/netlink/genetlink.c:1116)
genl_rcv_msg (net/netlink/genetlink.c:1194)
netlink_rcv_skb (net/netlink/af_netlink.c:2550)
genl_rcv (net/netlink/genetlink.c:1219)
netlink_unicast (net/netlink/af_netlink.c:1344)
netlink_sendmsg (net/netlink/af_netlink.c:1894)
__sys_sendto (net/socket.c:2206)
__x64_sys_sendto (net/socket.c:2209)
do_syscall_64 (arch/x86/entry/syscall_64.c:63)
entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)
</TASK>
Kernel panic - not syncing: Fatal exception
Reject attempts to set more PIDs than nr_cpu_ids in
ovs_vport_set_upcall_portids(), and pre-compute the worst-case reply
size in ovs_vport_cmd_msg_size() based on that bound, similar to the
existing ovs_dp_cmd_msg_size(). nr_cpu_ids matches the cap already
used by the per-CPU dispatch configuration on the datapath side
(ovs_dp_cmd_fill_info() serialises at most nr_cpu_ids PIDs), so the
two sides stay consistent.
Fixes: 5cd667b0a456 ("openvswitch: Allow each vport to have an array of 'port_id's.")
Reported-by: Xiang Mei <xmei5@asu.edu>
Assisted-by: Claude:claude-opus-4-6
Signed-off-by: Weiming Shi <bestswngs@gmail.com>
Reviewed-by: Ilya Maximets <i.maximets@ovn.org>
Link: https://patch.msgid.link/20260416024653.153456-2-bestswngs@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Daniel Jordan <daniel.m.jordan@oracle.com>
Date: Fri Mar 13 11:24:33 2026 -0400
padata: Put CPU offline callback in ONLINE section to allow failure
[ Upstream commit c8c4a2972f83c8b68ff03b43cecdb898939ff851 ]
syzbot reported the following warning:
DEAD callback error for CPU1
WARNING: kernel/cpu.c:1463 at _cpu_down+0x759/0x1020 kernel/cpu.c:1463, CPU#0: syz.0.1960/14614
at commit 4ae12d8bd9a8 ("Merge tag 'kbuild-fixes-7.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kbuild/linux")
which tglx traced to padata_cpu_dead() given it's the only
sub-CPUHP_TEARDOWN_CPU callback that returns an error.
Failure isn't allowed in hotplug states before CPUHP_TEARDOWN_CPU
so move the CPU offline callback to the ONLINE section where failure is
possible.
Fixes: 894c9ef9780c ("padata: validate cpumask without removed CPU during offline")
Reported-by: syzbot+123e1b70473ce213f3af@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/69af0a05.050a0220.310d8.002f.GAE@google.com/
Debugged-by: Thomas Gleixner <tglx@kernel.org>
Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Chuyi Zhou <zhouchuyi@bytedance.com>
Date: Thu Feb 26 16:07:03 2026 +0800
padata: Remove cpu online check from cpu add and removal
[ Upstream commit 73117ea6470dca787f70f33c001f9faf437a1c0b ]
During the CPU offline process, the dying CPU is cleared from the
cpu_online_mask in takedown_cpu(). After this step, various CPUHP_*_DEAD
callbacks are executed to perform cleanup jobs for the dead CPU, so this
cpu online check in padata_cpu_dead() is unnecessary.
Similarly, when executing padata_cpu_online() during the
CPUHP_AP_ONLINE_DYN phase, the CPU has already been set in the
cpu_online_mask, the action even occurs earlier than the
CPUHP_AP_ONLINE_IDLE stage.
Remove this unnecessary cpu online check in __padata_add_cpu() and
__padata_remove_cpu().
Signed-off-by: Chuyi Zhou <zhouchuyi@bytedance.com>
Acked-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Stable-dep-of: c8c4a2972f83 ("padata: Put CPU offline callback in ONLINE section to allow failure")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Mina Almasry <almasrymina@google.com>
Date: Thu Aug 21 03:03:46 2025 +0000
page_pool: fix incorrect mp_ops error handling
[ Upstream commit abadf0ff63be488dc502ecfc9f622929a21b7117 ]
Minor fix to the memory provider error handling, we should be jumping to
free_ptr_ring in this error case rather than returning directly.
Found by code-inspection.
Cc: skhawaja@google.com
Fixes: b400f4b87430 ("page_pool: Set `dma_sync` to false for devmem memory provider")
Signed-off-by: Mina Almasry <almasrymina@google.com>
Reviewed-by: Samiullah Khawaja <skhawaja@google.com>
Link: https://patch.msgid.link/20250821030349.705244-1-almasrymina@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Hasan Basbunar <basbunarhasan@gmail.com>
Date: Tue Apr 28 19:07:39 2026 +0200
page_pool: fix memory-provider leak in page_pool_create_percpu() error path
[ Upstream commit 5ef343614db766acdc01c56d66e780a1b43c6ac6 ]
When page_pool_create_percpu() fails on page_pool_list(), it falls
through to its err_uninit: label, which calls page_pool_uninit().
At that point page_pool_init() has already taken two references
when the user requested PP_FLAG_ALLOW_UNREADABLE_NETMEM:
pool->mp_ops->init(pool)
static_branch_inc(&page_pool_mem_providers);
Neither is undone by page_pool_uninit(); both are only undone by
__page_pool_destroy() (success-side teardown). The error path
therefore leaks the per-provider reference taken by mp_ops->init
(io_zcrx_ifq->refs in the io_uring zcrx provider, the dmabuf
binding refcount in the devmem provider) plus one increment of
the page_pool_mem_providers static branch on every failure of
xa_alloc_cyclic() inside page_pool_list().
The leaked io_zcrx_ifq->refs in turn pins everything
io_zcrx_ifq_free() would release on cleanup: ifq->user (uid),
ifq->mm_account (mmdrop), ifq->dev (device refcount),
ifq->netdev_tracker (netdev refcount), and the rbuf region.
The leaked static branch increment forces all subsequent
page_pool_alloc_netmems() and page_pool_return_page() callers to
take the slow mp_ops branch for the lifetime of the kernel.
Reachable via the io_uring zcrx path:
io_uring_register(IORING_REGISTER_ZCRX_IFQ) /* CAP_NET_ADMIN */
-> __io_uring_register
-> io_register_zcrx
-> zcrx_register_netdev
-> netif_mp_open_rxq
-> driver ndo_queue_mem_alloc
-> page_pool_create_percpu
-> page_pool_init succeeds (mp_ops->init runs, branch++)
-> page_pool_list fails (xa_alloc_cyclic -ENOMEM)
-> goto err_uninit <-- leak
The same shape applies to the devmem dmabuf provider via
mp_dmabuf_devmem_init()/mp_dmabuf_devmem_destroy().
Restore the cleanup symmetry by moving the mp_ops->destroy() and
static_branch_dec() calls out of __page_pool_destroy() and into
page_pool_uninit(), so page_pool_uninit() is again the strict
inverse of page_pool_init(). page_pool_uninit() has only two
callers (the err_uninit: path and __page_pool_destroy()), so this
preserves the single-call invariant on the success path while
fixing the err path. The error path of page_pool_init() itself
still skips the mp_ops cleanup correctly: mp_ops->init is the
last action that takes a reference before page_pool_init() returns
0, so when it returns an error neither the refcount nor the static
branch has been touched.
Triggering the bug requires xa_alloc_cyclic() to fail with -ENOMEM,
which under normal GFP_KERNEL retry behaviour is rare. It is
deterministic under CONFIG_FAULT_INJECTION with fail_page_alloc /
xa fault injection, or under sustained memory pressure. The leak
is silent: there is no warning, and the released kernel build
continues running with a permanently-incremented static branch.
Fixes: 0f9214046893 ("memory-provider: dmabuf devmem memory provider")
Signed-off-by: Hasan Basbunar <basbunarhasan@gmail.com>
Link: https://patch.msgid.link/20260428170739.34881-1-basbunarhasan@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Samiullah Khawaja <skhawaja@google.com>
Date: Wed Dec 11 21:20:30 2024 +0000
page_pool: Set `dma_sync` to false for devmem memory provider
[ Upstream commit b400f4b87430c105d92550cee5a72aea01fdf3d6 ]
Move the `dma_map` and `dma_sync` checks to `page_pool_init` to make
them generic. Set dma_sync to false for devmem memory provider because
the dma_sync APIs should not be used for dma_buf backed devmem memory
provider.
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Signed-off-by: Samiullah Khawaja <skhawaja@google.com>
Signed-off-by: Mina Almasry <almasrymina@google.com>
Link: https://patch.msgid.link/20241211212033.1684197-4-almasrymina@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Stable-dep-of: 5ef343614db7 ("page_pool: fix memory-provider leak in page_pool_create_percpu() error path")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Petr Pavlu <petr.pavlu@suse.com>
Date: Tue Aug 19 14:12:09 2025 +0200
params: Replace __modinit with __init_or_module
[ Upstream commit 3cb0c3bdea5388519bc1bf575dca6421b133302b ]
Remove the custom __modinit macro from kernel/params.c and instead use the
common __init_or_module macro from include/linux/module.h. Both provide the
same functionality.
Signed-off-by: Petr Pavlu <petr.pavlu@suse.com>
Reviewed-by: Aaron Tomlin <atomlin@atomlin.com>
Reviewed-by: Daniel Gomez <da.gomez@samsung.com>
Reviewed-by: Sami Tolvanen <samitolvanen@google.com>
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Stable-dep-of: deffe1edba62 ("module: Fix freeing of charp module parameters when CONFIG_SYSFS=n")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Richard Cheng <icheng@nvidia.com>
Date: Thu Apr 2 17:38:50 2026 +0800
PCI/NPEM: Set LED_HW_PLUGGABLE for hotplug-capable ports
[ Upstream commit 16d021c878dca22532c984668c9e8cf4722d6a49 ]
NPEM registers LED classdevs on PCI endpoint that may be behind
hotplug-capable ports. During hot-removal, led_classdev_unregister() calls
led_set_brightness(LED_OFF) which leads to a PCI config read to a
disconnected device, which fails and returns -ENODEV (topology details in
msgid.link below):
leds 0003:01:00.0:enclosure:ok: Setting an LED's brightness failed (-19)
The LED core already suppresses this for devices with LED_HW_PLUGGABLE set,
but NPEM never sets it. Add the flag since NPEM LEDs are on hot-pluggable
hardware by nature.
Fixes: 4e893545ef87 ("PCI/NPEM: Add Native PCIe Enclosure Management support")
Signed-off-by: Richard Cheng <icheng@nvidia.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Lukas Wunner <lukas@wunner.de>
Acked-by: Kai-Heng Feng <kaihengf@nvidia.com>
Link: https://patch.msgid.link/20260402093850.23075-1-icheng@nvidia.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Manikanta Maddireddy <mmaddireddy@nvidia.com>
Date: Wed Mar 25 00:37:53 2026 +0530
PCI: dwc: Apply ECRC workaround to DesignWare 5.00a as well
[ Upstream commit 40805f32dceadebb7381d911003100bec7b8cd51 ]
The ECRC (TLP digest) workaround was originally added for DesignWare
version 4.90a. Tegra234 SoC has 5.00a DWC HW version, which has the same
ATU TD override behaviour, so apply the workaround for 5.00a too.
Fixes: a54e19073718 ("PCI: tegra194: Add Tegra234 PCIe support")
Signed-off-by: Manikanta Maddireddy <mmaddireddy@nvidia.com>
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Reviewed-by: Jon Hunter <jonathanh@nvidia.com>
Reviewed-by: Vidya Sagar <vidyas@nvidia.com>
Link: https://patch.msgid.link/20260324190755.1094879-13-mmaddireddy@nvidia.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Aksh Garg <a-garg7@ti.com>
Date: Tue Feb 24 14:08:16 2026 +0530
PCI: dwc: ep: Fix MSI-X Table Size configuration in dw_pcie_ep_set_msix()
[ Upstream commit 271d0b1f058ae9815e75233d04b23e3558c3e4f4 ]
In dw_pcie_ep_set_msix(), while updating the MSI-X Table Size value for
individual functions, Message Control register is read from the passed
function number register space using dw_pcie_ep_readw_dbi(), but always
written back to the Function 0's register space using dw_pcie_writew_dbi().
This causes incorrect MSI-X configuration for the rest of the functions,
other than Function 0.
Fix this by using dw_pcie_ep_writew_dbi() to write to the correct
function's register space, matching the read operation.
Fixes: 70fa02ca1446 ("PCI: dwc: Add dw_pcie_ep_{read,write}_dbi[2] helpers")
Signed-off-by: Aksh Garg <a-garg7@ti.com>
[mani: commit log]
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Reviewed-by: Niklas Cassel <cassel@kernel.org>
Link: https://patch.msgid.link/20260224083817.916782-2-a-garg7@ti.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Richard Zhu <hongxing.zhu@nxp.com>
Date: Wed Oct 15 11:04:25 2025 +0800
PCI: dwc: Invoke post_init in dw_pcie_resume_noirq()
[ Upstream commit c577ce2881f9c76892de5ffc1a122e3ef427ecee ]
In some SoCs like i.MX95, CLKREQ# is pulled low by the controller driver
before link up. After link up, if the 'supports-clkreq' property is
specified in DT, the driver will release CLKREQ# so that it can go high and
the endpoint can pull it low whenever required i.e., during exit from L1
Substates.
Hence, at the end of dw_pcie_resume_noirq(), invoke the '.post_init()'
callback if exists to perform the above mentioned action.
Signed-off-by: Richard Zhu <hongxing.zhu@nxp.com>
[mani: reworded description]
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Link: https://patch.msgid.link/20251015030428.2980427-9-hongxing.zhu@nxp.com
Stable-dep-of: edb5ca3262e2 ("PCI: dwc: Perform cleanup in the error path of dw_pcie_resume_noirq()")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com>
Date: Thu Feb 26 19:09:51 2026 +0530
PCI: dwc: Perform cleanup in the error path of dw_pcie_resume_noirq()
[ Upstream commit edb5ca3262e2255cf938a5948709d3472d4871ad ]
If the dw_pcie_resume_noirq() API fails, it just returns the errno without
doing cleanup in the error path, leading to resource leak.
So perform cleanup in the error path.
Fixes: 4774faf854f5 ("PCI: dwc: Implement generic suspend/resume functionality")
Reported-by: Senchuan Zhang <zhangsenchuan@eswincomputing.com>
Closes: https://lore.kernel.org/linux-pci/78296255.3869.19c8eb694d6.Coremail.zhangsenchuan@eswincomputing.com
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com>
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Link: https://patch.msgid.link/20260226133951.296743-1-mani@kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Koichiro Den <den@valinux.co.jp>
Date: Fri Mar 6 00:10:50 2026 +0900
PCI: dwc: rcar-gen4: Change EPC BAR alignment to 4K as per the documentation
[ Upstream commit 13f55a7ca773c731a1e645934c1ae48577f48785 ]
R-Car S4 Series (R8A779F[4-7]*) EP controller uses a 4K minimum iATU region
size (CX_ATU_MIN_REGION_SIZE = 4K) as per R19UH0161EJ0130 Rev.1.30. Also,
the controller itself can only be configured in the range 4 KB to 64 KB, so
the current 1 MB alignment requirement is incorrect.
Hence, change the alignment to the min size 4K as per the documentation.
This also fixes needless unusability of BAR4 on this platform when the
target address is fixed, such as for doorbell targets.
Fixes: e311b3834dfa ("PCI: rcar-gen4: Add endpoint mode support")
Signed-off-by: Koichiro Den <den@valinux.co.jp>
[mani: commit log]
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Niklas Cassel <cassel@kernel.org>
Link: https://patch.msgid.link/20260305151050.1834007-1-den@valinux.co.jp
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Gerd Bayer <gbayer@linux.ibm.com>
Date: Mon Mar 30 15:09:45 2026 +0200
PCI: Enable AtomicOps only if Root Port supports them
[ Upstream commit 1ae8c4ce157037e266184064a182af9ef9af278b ]
When inspecting the config space of a Connect-X physical function in an
s390 system after it was initialized by the mlx5_core device driver, we
found the function to be enabled to request AtomicOps despite the Root Port
lacking support for completing them:
00:00.1 Ethernet controller: Mellanox Technologies MT2894 Family [ConnectX-6 Lx]
Subsystem: Mellanox Technologies Device 0002
DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-
AtomicOpsCtl: ReqEn+
On s390 and many virtualized guests, the Endpoint is visible but the Root
Port is not. In this case, pci_enable_atomic_ops_to_root() previously
enabled AtomicOps in the Endpoint even though it can't tell whether the
Root Port supports them as a completer.
Change pci_enable_atomic_ops_to_root() to fail if there's no Root Port or
the Root Port doesn't support AtomicOps.
Fixes: 430a23689dea ("PCI: Add pci_enable_atomic_ops_to_root()")
Reported-by: Alexander Schmidt <alexs@linux.ibm.com>
Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com>
[bhelgaas: commit log, check RP first to simplify flow]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20260330-fix_pciatops-v7-2-f601818417e8@linux.ibm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Niklas Cassel <cassel@kernel.org>
Date: Wed May 14 09:43:19 2025 +0200
PCI: endpoint: Align pci_epc_set_msix(), pci_epc_ops::set_msix() nr_irqs encoding
[ Upstream commit de0321bcc5fdd83631f0c2a6fdebfe0ad4e23449 ]
The kdoc for pci_epc_set_msix() says:
"Invoke to set the required number of MSI-X interrupts."
The kdoc for the callback pci_epc_ops->set_msix() says:
"ops to set the requested number of MSI-X interrupts in the MSI-X
capability register"
pci_epc_ops::set_msix() does however expect the parameter 'interrupts' to
be in the encoding as defined by the Table Size field. Nowhere in the
kdoc does it say that the number of interrupts should be in Table Size
encoding.
It is very confusing that the API pci_epc_set_msix() and the callback
function pci_epc_ops::set_msix() both take a parameter named 'interrupts',
but they expect completely different encodings.
Clean up the API and the callback function to have the same semantics,
i.e. the parameter represents the number of interrupts, regardless of the
internal encoding of that value.
Also rename the parameter 'interrupts' to 'nr_irqs', in both the wrapper
function and the callback function, such that the name is unambiguous.
[bhelgaas: more specific subject]
Signed-off-by: Niklas Cassel <cassel@kernel.org>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable+noautosel@kernel.org # this is simply a cleanup
Link: https://patch.msgid.link/20250514074313.283156-14-cassel@kernel.org
Stable-dep-of: 271d0b1f058a ("PCI: dwc: ep: Fix MSI-X Table Size configuration in dw_pcie_ep_set_msix()")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Samiullah Khawaja <skhawaja@google.com>
Date: Tue May 5 23:43:27 2026 +0000
PCI: Initialize temporary device in new_id_store()
[ Upstream commit f45a49a2380a47332817b7248c61a0ebbc6f0d00 ]
When setting new_id of a PCI device driver using sysfs a lockdep splat
occurs. This is because new_id_store() builds a temporary pci_dev for
pci_match_device(), which calls device_match_driver_override(). That
depends on the driver_override.lock added by cb3d1049f4ea ("driver core:
generalize driver_override in struct device").
The new driver_override.lock was not initialized in the temporary pci_dev,
resulting in this lockdep splat.
Initialize the temporary pci_dev to fix this.
Repro:
Build with CONFIG_LOCKDEP=y, boot with QEMU, and add a new ID:
# echo "8086 10f5" > /sys/bus/pci/drivers/e1000e/new_id
INFO: trying to register non-static key.
The code is fine but needs lockdep annotation, or maybe
you didn't initialize this object before use?
turning off the locking correctness validator.
CPU: 2 UID: 0 PID: 177 Comm: liveupdate-iomm Not tainted 7.0.0+ #9 PREEMPT(full)
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
Call Trace:
<TASK>
dump_stack_lvl+0x5d/0x80
register_lock_class+0x77e/0x790
lock_acquire+0xbf/0x2e0
pci_match_device+0x24/0x180
new_id_store+0x189/0x1d0
kernfs_fop_write_iter+0x14f/0x210
vfs_write+0x263/0x5e0
ksys_write+0x79/0xf0
do_syscall_64+0x117/0xf80
Fixes: 10a4206a2401 ("PCI: use generic driver_override infrastructure")
Fixes: 8895d3bcb8ba ("PCI: Fail new_id for vendor/device values already built into driver")
Signed-off-by: Samiullah Khawaja <skhawaja@google.com>
[bhelgaas: add commit log details and repro, trim backtrace]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Danilo Krummrich <dakr@kernel.org>
Link: https://patch.msgid.link/20260505234327.716630-1-skhawaja@google.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Chen-Yu Tsai <wenst@chromium.org>
Date: Tue Mar 24 17:35:41 2026 +0800
PCI: mediatek-gen3: Prevent leaking IRQ domains when IRQ not found
[ Upstream commit 5573c44cb3fd01a9f62d569ae9ac870ef5f0e0ba ]
In mtk_pcie_setup_irq(), the IRQ domains are allocated before the
controller's IRQ is fetched. If the latter fails, the function
directly returns an error, without cleaning up the allocated domains.
Hence, reverse the order so that the IRQ domains are allocated after the
controller's IRQ is found.
This was flagged by Sashiko during a review of "[PATCH v6 0/7] PCI:
mediatek-gen3: add power control support".
Fixes: 814cceebba9b ("PCI: mediatek-gen3: Add INTx support")
Signed-off-by: Chen-Yu Tsai <wenst@chromium.org>
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Link: https://sashiko.dev/#/patchset/20260324052002.4072430-1-wenst%40chromium.org
Link: https://patch.msgid.link/20260324093542.18523-1-wenst@chromium.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Krishna Chaitanya Chundru <krishna.chundru@oss.qualcomm.com>
Date: Sat Mar 14 07:26:34 2026 +0530
PCI: qcom: Advertise Hotplug Slot Capability with no Command Completion support
[ Upstream commit 33a76fc3c3e61386524479b99f35423bd3d9a895 ]
Qcom PCIe Root Ports advertise hotplug capability in hardware, but do not
support hotplug command completion. As a result, the hotplug commands
issued by the pciehp driver never gets completion notification, leading to
repeated timeout warnings and multi-second delays during boot and
suspend/resume.
Commit a54db86ddc153 ("PCI: qcom: Do not advertise hotplug capability for
IPs v2.7.0 and v1.9.0") mistakenly assumed that the Root Ports doesn't
support Hotplug due to timeouts and disabled the Hotplug functionality
altogether. But the Root Ports does support reporting Hotplug events like
DL_Up/Down events.
So to fix the command completion timeout issues, just set the No Command
Completed Support (NCCS) bit and enable Hotplug in Slot Capability field
back.
Fixes: a54db86ddc153 ("PCI: qcom: Do not advertise hotplug capability for IPs v2.7.0 and v1.9.0")
Signed-off-by: Krishna Chaitanya Chundru <krishna.chundru@oss.qualcomm.com>
[mani: renamed function, commit log and added comment]
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Tested-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> # Hamoa CRD, tunneled link
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Link: https://patch.msgid.link/20260314-hotplug-v1-1-96ac87d93867@oss.qualcomm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Vidya Sagar <vidyas@nvidia.com>
Date: Wed Mar 25 00:37:50 2026 +0530
PCI: tegra194: Allow system suspend when the Endpoint link is not up
[ Upstream commit c76f8eae7d4695b1176c4ea5eb93c17e16a20272 ]
Host software initiates the L2 sequence. PCIe link is kept in L2 state
during suspend. If Endpoint mode is enabled and the link is up, the
software cannot proceed with suspend. However, when the PCIe Endpoint
driver is probed, but the PCIe link is not up, Tegra can go into suspend
state. So, allow system to suspend in this case.
Fixes: de2bbf2b71bb ("PCI: tegra194: Don't allow suspend when Tegra PCIe is in EP mode")
Signed-off-by: Vidya Sagar <vidyas@nvidia.com>
Signed-off-by: Manikanta Maddireddy <mmaddireddy@nvidia.com>
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Reviewed-by: Jon Hunter <jonathanh@nvidia.com>
Reviewed-by: Vidya Sagar <vidyas@nvidia.com>
Link: https://patch.msgid.link/20260324190755.1094879-10-mmaddireddy@nvidia.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Vidya Sagar <vidyas@nvidia.com>
Date: Wed Mar 25 00:37:48 2026 +0530
PCI: tegra194: Disable direct speed change for Endpoint mode
[ Upstream commit 976f6763f57970388bcd7118931f33f447916927 ]
Pre-silicon simulation showed the controller operating in Endpoint mode
initiating link speed change after completing Secondary Bus Reset. Ideally,
the Root Port or the Switch Downstream Port should initiate the link speed
change post SBR, not the Endpoint.
So, as per the hardware team recommendation, disable direct speed change
for the Endpoint mode to prevent it from initiating speed change after the
physical layer link is up at Gen1, leaving speed change ownership with the
host.
Fixes: c57247f940e8 ("PCI: tegra: Add support for PCIe endpoint mode in Tegra194")
Signed-off-by: Vidya Sagar <vidyas@nvidia.com>
Signed-off-by: Manikanta Maddireddy <mmaddireddy@nvidia.com>
[mani: commit log]
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Reviewed-by: Jon Hunter <jonathanh@nvidia.com>
Reviewed-by: Vidya Sagar <vidyas@nvidia.com>
Link: https://patch.msgid.link/20260324190755.1094879-8-mmaddireddy@nvidia.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Manikanta Maddireddy <mmaddireddy@nvidia.com>
Date: Wed Mar 25 00:37:44 2026 +0530
PCI: tegra194: Disable LTSSM after transition to Detect on surprise link down
[ Upstream commit 9fa0c242f8d7acf1b124d4462d18f4023573ac1c ]
After the link reaches a Detect-related LTSSM state, disable LTSSM so it
does not keep toggling between Polling and Detect. Do this by polling for
the Detect state first, then clearing APPL_CTRL_LTSSM_EN in both
tegra_pcie_dw_pme_turnoff() and pex_ep_event_pex_rst_assert().
Fixes: 56e15a238d92 ("PCI: tegra: Add Tegra194 PCIe support")
Signed-off-by: Vidya Sagar <vidyas@nvidia.com>
Signed-off-by: Manikanta Maddireddy <mmaddireddy@nvidia.com>
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Reviewed-by: Jon Hunter <jonathanh@nvidia.com>
Link: https://patch.msgid.link/20260324190755.1094879-4-mmaddireddy@nvidia.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Manikanta Maddireddy <mmaddireddy@nvidia.com>
Date: Wed Mar 25 00:37:46 2026 +0530
PCI: tegra194: Disable PERST# IRQ only in Endpoint mode
[ Upstream commit 40658a31b6e134169c648041efc84944c4c71dcd ]
The PERST# GPIO interrupt is only registered when the controller is
operating in Endpoint mode. In Root Port mode, the PERST# GPIO is
configured as an output to control downstream devices, and no interrupt is
registered for it.
Currently, tegra_pcie_dw_stop_link() unconditionally calls disable_irq()
on pex_rst_irq, which causes issues in Root Port mode where this IRQ is
not registered.
Fix this by only disabling the PERST# IRQ when operating in Endpoint mode,
where the interrupt is actually registered and used to detect PERST#
assertion/deassertion from the host.
Fixes: c57247f940e8 ("PCI: tegra: Add support for PCIe endpoint mode in Tegra194")
Signed-off-by: Manikanta Maddireddy <mmaddireddy@nvidia.com>
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Reviewed-by: Jon Hunter <jonathanh@nvidia.com>
Reviewed-by: Vidya Sagar <vidyas@nvidia.com>
Link: https://patch.msgid.link/20260324190755.1094879-6-mmaddireddy@nvidia.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Vidya Sagar <vidyas@nvidia.com>
Date: Wed Mar 25 00:37:45 2026 +0530
PCI: tegra194: Don't force the device into the D0 state before L2
[ Upstream commit 71d9f67701e1affc82d18ca88ae798c5361beddf ]
As per PCIe CEM r6.0, sec 2.3, the PCIe Endpoint device should be in D3cold
to assert WAKE# pin. The previous workaround that forced downstream devices
to D0 before taking the link to L2 cited PCIe r4.0, sec 5.2, "Link State
Power Management"; however, that spec does not explicitly require putting
the device into D0 and only indicates that power removal may be initiated
without transitioning to D3hot.
Remove the D0 workaround so that Endpoint devices can use wake
functionality (WAKE# from D3). With some Endpoints the link may not enter
L2 when they remain in D3, but the Root Port continues with the usual flow
after PME timeout, so there is no functional issue.
Fixes: 56e15a238d92 ("PCI: tegra: Add Tegra194 PCIe support")
Signed-off-by: Vidya Sagar <vidyas@nvidia.com>
Signed-off-by: Manikanta Maddireddy <mmaddireddy@nvidia.com>
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Reviewed-by: Vidya Sagar <vidyas@nvidia.com>
Reviewed-by: Jon Hunter <jonathanh@nvidia.com>
Link: https://patch.msgid.link/20260324190755.1094879-5-mmaddireddy@nvidia.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Manikanta Maddireddy <mmaddireddy@nvidia.com>
Date: Wed Mar 25 00:37:55 2026 +0530
PCI: tegra194: Fix CBB timeout caused by DBI access before core power-on
[ Upstream commit 34b3eef48d980cd37b876e128bbf314f69fb5d70 ]
When PERST# is deasserted twice (assert -> deassert -> assert -> deassert),
a CBB (Control Backbone) timeout occurs at DBI register offset 0x8bc
(PCIE_MISC_CONTROL_1_OFF). This happens because pci_epc_deinit_notify()
and dw_pcie_ep_cleanup() are called before reset_control_deassert() powers
on the controller core.
The call chain that causes the timeout:
pex_ep_event_pex_rst_deassert()
pci_epc_deinit_notify()
pci_epf_test_epc_deinit()
pci_epf_test_clear_bar()
pci_epc_clear_bar()
dw_pcie_ep_clear_bar()
__dw_pcie_ep_reset_bar()
dw_pcie_dbi_ro_wr_en() <- Accesses 0x8bc DBI register
reset_control_deassert(pcie->core_rst) <- Core powered on HERE
The DBI registers, including PCIE_MISC_CONTROL_1_OFF (0x8bc), are only
accessible after the controller core is powered on via
reset_control_deassert(pcie->core_rst). Accessing them before this point
results in a CBB timeout because the hardware is not yet operational.
Fix this by moving pci_epc_deinit_notify() and dw_pcie_ep_cleanup() to
after reset_control_deassert(pcie->core_rst), ensuring the controller is
fully powered on before any DBI register accesses occur.
Fixes: 40e2125381dc ("PCI: tegra194: Move controller cleanups to pex_ep_event_pex_rst_deassert()")
Signed-off-by: Manikanta Maddireddy <mmaddireddy@nvidia.com>
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Reviewed-by: Jon Hunter <jonathanh@nvidia.com>
Reviewed-by: Vidya Sagar <vidyas@nvidia.com>
Link: https://patch.msgid.link/20260324190755.1094879-15-mmaddireddy@nvidia.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Vidya Sagar <vidyas@nvidia.com>
Date: Wed Mar 25 00:37:42 2026 +0530
PCI: tegra194: Fix polling delay for L2 state
[ Upstream commit adaffed907f14f954096555665ad6af2ae724d83 ]
As per PCIe r7.0, sec 5.3.3.2.1, after sending PME_Turn_Off message, Root
Port should wait for 1-10 msec for PME_TO_Ack message. Currently, driver is
polling for 10 msec with 1 usec delay which is aggressive. Use existing
macro PCIE_PME_TO_L2_TIMEOUT_US to poll for 10 msec with 1 msec delay.
Since this function is used in non-atomic context only, use non-atomic poll
function.
Fixes: 56e15a238d92 ("PCI: tegra: Add Tegra194 PCIe support")
Signed-off-by: Vidya Sagar <vidyas@nvidia.com>
Signed-off-by: Manikanta Maddireddy <mmaddireddy@nvidia.com>
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Reviewed-by: Jon Hunter <jonathanh@nvidia.com>
Link: https://patch.msgid.link/20260324190755.1094879-2-mmaddireddy@nvidia.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Vidya Sagar <vidyas@nvidia.com>
Date: Wed Mar 25 00:37:51 2026 +0530
PCI: tegra194: Free up Endpoint resources during remove()
[ Upstream commit 8870f02f7868209eb9bdc5dc53540a6262cf9227 ]
Free up the resources during remove() that were acquired by the DesignWare
driver for the Endpoint mode during probe().
Fixes: bb617cbd8151 ("PCI: tegra194: Clean up the exit path for Endpoint mode")
Signed-off-by: Vidya Sagar <vidyas@nvidia.com>
Signed-off-by: Manikanta Maddireddy <mmaddireddy@nvidia.com>
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Reviewed-by: Jon Hunter <jonathanh@nvidia.com>
Reviewed-by: Vidya Sagar <vidyas@nvidia.com>
Link: https://patch.msgid.link/20260324190755.1094879-11-mmaddireddy@nvidia.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Manikanta Maddireddy <mmaddireddy@nvidia.com>
Date: Wed Mar 25 00:37:43 2026 +0530
PCI: tegra194: Increase LTSSM poll time on surprise link down
[ Upstream commit 74dd8efe4d6cead433162147333af989a568aac7 ]
On surprise link down, LTSSM state transits from L0 -> Recovery.RcvrLock ->
Recovery.RcvrSpeed -> Gen1 Recovery.RcvrLock -> Detect. Recovery.RcvrLock
and Recovery.RcvrSpeed transit times are 24 ms and 48 ms respectively, so
the total time from L0 to Detect is ~96 ms. Increase the poll timeout to
120 ms to account for this.
While at it, add LTSSM state defines for Detect-related states and use them
in the poll condition. Use readl_poll_timeout() instead of
readl_poll_timeout_atomic() in tegra_pcie_dw_pme_turnoff() since that path
runs in non-atomic context.
Fixes: 56e15a238d92 ("PCI: tegra: Add Tegra194 PCIe support")
Signed-off-by: Vidya Sagar <vidyas@nvidia.com>
Signed-off-by: Manikanta Maddireddy <mmaddireddy@nvidia.com>
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Reviewed-by: Jon Hunter <jonathanh@nvidia.com>
Link: https://patch.msgid.link/20260324190755.1094879-3-mmaddireddy@nvidia.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com>
Date: Mon Sep 22 13:40:57 2025 +0530
PCI: tegra194: Rename 'root_bus' to 'root_port_bus' in tegra_pcie_downstream_dev_to_D0()
[ Upstream commit e1bd928479fb1fa60e9034b0fdb1ab9f3fa92f33 ]
In tegra_pcie_downstream_dev_to_D0(), PCI devices are transitioned to D0
state. For iterating over the devices, first the downstream bus of the Root
Port is searched from the root bus. But the name of the variable that holds
the Root Port downstream bus is named as 'root_bus', which is wrong.
Rename the variable to 'root_port_bus'. Also, move the comment on 'bringing
the devices to D0' to where the state is set exactly.
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com>
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20250922081057.15209-1-mani@kernel.org
Stable-dep-of: 71d9f67701e1 ("PCI: tegra194: Don't force the device into the D0 state before L2")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Vidya Sagar <vidyas@nvidia.com>
Date: Wed Mar 25 00:37:49 2026 +0530
PCI: tegra194: Set LTR message request before PCIe link up in Endpoint mode
[ Upstream commit b256493bf8cacf0e524bf4c10b5c4901d0c6cefe ]
LTR message should be sent as soon as the Root Port enables LTR in the
Endpoint mode. So set snoop and no-snoop LTR timing and LTR message request
before the PCIe link comes up, so that the LTR message is sent upstream as
soon as LTR is enabled.
Without programming these values, the Endpoint would send latencies of 0 to
the host, which will be inaccurate.
Fixes: c57247f940e8 ("PCI: tegra: Add support for PCIe endpoint mode in Tegra194")
Signed-off-by: Vidya Sagar <vidyas@nvidia.com>
Signed-off-by: Manikanta Maddireddy <mmaddireddy@nvidia.com>
[mani: commit log]
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jon Hunter <jonathanh@nvidia.com>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Link: https://patch.msgid.link/20260324190755.1094879-9-mmaddireddy@nvidia.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Vidya Sagar <vidyas@nvidia.com>
Date: Wed Mar 25 00:37:47 2026 +0530
PCI: tegra194: Use devm_gpiod_get_optional() to parse "nvidia,refclk-select"
[ Upstream commit f62bc7917de1374dce86a852ffba8baf9cb7a56a ]
The GPIO DT property "nvidia,refclk-select", to select the PCIe reference
clock is optional. Use devm_gpiod_get_optional() to get it.
Fixes: c57247f940e8 ("PCI: tegra: Add support for PCIe endpoint mode in Tegra194")
Signed-off-by: Vidya Sagar <vidyas@nvidia.com>
Signed-off-by: Manikanta Maddireddy <mmaddireddy@nvidia.com>
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Reviewed-by: Jon Hunter <jonathanh@nvidia.com>
Reviewed-by: Vidya Sagar <vidyas@nvidia.com>
Link: https://patch.msgid.link/20260324190755.1094879-7-mmaddireddy@nvidia.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Manikanta Maddireddy <mmaddireddy@nvidia.com>
Date: Wed Mar 25 00:37:52 2026 +0530
PCI: tegra194: Use DWC IP core version
[ Upstream commit ea60ca067f0f098043610c96a915d162113c1aac ]
Tegra194 PCIe driver used custom version numbers to detect Tegra194 and
Tegra234 IPs. With version detect logic added, version check results in
mismatch warnings:
tegra194-pcie 14100000.pcie: Versions don't match (0000562a != 3536322a)
Use HW version numbers which match to PORT_LOGIC.PCIE_VERSION_OFF in
Tegra194 driver to avoid these kernel warnings.
Fixes: a54e19073718 ("PCI: tegra194: Add Tegra234 PCIe support")
Signed-off-by: Manikanta Maddireddy <mmaddireddy@nvidia.com>
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Reviewed-by: Jon Hunter <jonathanh@nvidia.com>
Reviewed-by: Vidya Sagar <vidyas@nvidia.com>
Link: https://patch.msgid.link/20260324190755.1094879-12-mmaddireddy@nvidia.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Danilo Krummrich <dakr@kernel.org>
Date: Tue Mar 24 01:59:09 2026 +0100
PCI: use generic driver_override infrastructure
[ Upstream commit 10a4206a24013be4d558d476010cbf2eb4c9fa64 ]
When a driver is probed through __driver_attach(), the bus' match()
callback is called without the device lock held, thus accessing the
driver_override field without a lock, which can cause a UAF.
Fix this by using the driver-core driver_override infrastructure taking
care of proper locking internally.
Note that calling match() from __driver_attach() without the device lock
held is intentional. [1]
Link: https://lore.kernel.org/driver-core/DGRGTIRHA62X.3RY09D9SOK77P@kernel.org/ [1]
Reported-by: Gui-Dong Han <hanguidong02@gmail.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220789
Fixes: 782a985d7af2 ("PCI: Introduce new device binding path using pci_dev.driver_override")
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Alex Williamson <alex@shazbot.org>
Tested-by: Gui-Dong Han <hanguidong02@gmail.com>
Reviewed-by: Gui-Dong Han <hanguidong02@gmail.com>
Link: https://patch.msgid.link/20260324005919.2408620-6-dakr@kernel.org
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: René Rebe <rene@exactco.de>
Date: Wed Nov 26 17:42:56 2025 +0100
PCMCIA: Fix garbled log messages for KERN_CONT
[ Upstream commit bfeaa6814bd3f9a1f6d525b3b35a03b9a0368961 ]
For years the PCMCIA info messages are messed up by superfluous
newlines. While f2e6cf76751d ("pcmcia: Convert dev_printk to
dev_<level>") converted the code to pr_cont(), dev_info enforces a \n
via vprintk_store setting LOG_NEWLINE, breaking subsequent pr_cont.
Fix by logging the device name manually to allow pr_cont to work for
more readable and not \n distorted logs.
Fixes: f2e6cf76751d ("pcmcia: Convert dev_printk to dev_<level>")
Signed-off-by: René Rebe <rene@exactco.de>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ian Rogers <irogers@google.com>
Date: Thu Mar 12 15:31:31 2026 -0700
perf branch: Avoid incrementing NULL
[ Upstream commit c969a9d7bbf46f983c4a48566b3b2f7340b02296 ]
If the entry is NULL the value is meaningless so early return NULL to
avoid an increment of NULL. This was happening in calls from
has_stitched_lbr when running the "perf record LBR tests". The return
value isn't used in that case, so returning NULL as no effect.
Fixes: 42bbabed09ce ("perf tools: Add hw_idx in struct branch_stack")
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ian Rogers <irogers@google.com>
Date: Fri Apr 3 23:05:52 2026 -0700
perf cgroup: Update metric leader in evlist__expand_cgroup
[ Upstream commit c9ef786c0970991578397043f1c819229e2b7197 ]
When the evlist is expanded the metric leader wasn't being updated. As
the original evsel is deleted this creates a use-after-free in
stat-shadow's prepare_metric. This was detected running the "perf stat
--bpf-counters --for-each-cgroup test" with sanitizers.
The change itself puts the copied evsel into the priv field (known
unused because of evsel__clone use) and then in a second pass over the
list updates the copied values using the priv pointer.
Fixes: d1c5a0e86a4e ("perf stat: Add --for-each-cgroup option")
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Sun Jian <sun.jian.kdev@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ian Rogers <irogers@google.com>
Date: Thu Sep 26 15:48:32 2024 +0100
perf evsel: Add alternate_hw_config and use in evsel__match
[ Upstream commit 22a4db3c36034e2b034c5b88414680857fc59cf4 ]
There are cases where we want to match events like instructions and
cycles with legacy hardware values, in particular in stat-shadow's
hard coded metrics. An evsel's name isn't a good point of reference as
it gets altered, strstr would be too imprecise and re-parsing the
event from its name is silly. Instead, hold the legacy hardware event
name, determined during parsing, in the evsel for this matching case.
Inline evsel__match2 that is only used in builtin-diff.
Acked-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Cc: Yang Jihong <yangjihong@bytedance.com>
Cc: Dominique Martinet <asmadeus@codewreck.org>
Cc: Colin Ian King <colin.i.king@gmail.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Yunseong Kim <yskelg@gmail.com>
Cc: Ze Gao <zegao2021@gmail.com>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Yang Li <yang.lee@linux.alibaba.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: ak@linux.intel.com
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Sun Haiyong <sunhaiyong@loongson.cn>
Cc: John Garry <john.g.garry@oracle.com>
Link: https://lore.kernel.org/r/20240926144851.245903-2-james.clark@linaro.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Stable-dep-of: c9ef786c0970 ("perf cgroup: Update metric leader in evlist__expand_cgroup")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Leo Yan <leo.yan@arm.com>
Date: Thu Apr 2 17:04:47 2026 +0100
perf expr: Return -EINVAL for syntax error in expr__find_ids()
[ Upstream commit 3a61fd866ef9aaa1d3158b460f852b74a2df07f4 ]
expr__find_ids() propagates the parser return value directly. For syntax
errors, the parser can return a positive value, but callers treat it as
success, e.g., for below case on Arm64 platform:
metric expr 100 * (STALL_SLOT_BACKEND / (CPU_CYCLES * #slots) - BR_MIS_PRED * 3 / CPU_CYCLES) for backend_bound
parsing metric: 100 * (STALL_SLOT_BACKEND / (CPU_CYCLES * #slots) - BR_MIS_PRED * 3 / CPU_CYCLES)
Failure to read '#slots' literal: #slots = nan
syntax error
Convert positive parser returns in expr__find_ids() to -EINVAL, as a
result, the error value will be respected by callers.
Before:
perf stat -C 5
Failure to read '#slots'Failure to read '#slots'Failure to read '#slots'Failure to read '#slots'Segmentation fault
After:
perf stat -C 5
Failure to read '#slots'Cannot find metric or group `Default'
Fixes: ded80bda8bc9 ("perf expr: Migrate expr ids table to a hashmap")
Signed-off-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ian Rogers <irogers@google.com>
Date: Thu Mar 19 16:33:48 2026 -0700
perf lock: Fix option value type in parse_max_stack
[ Upstream commit cfaade34b52aa1ec553044255702c4b31b57c005 ]
The value is a void* and the address of an int, max_stack_depth, is
set up in the perf lock options. The parse_max_stack function treats
the int* as a long*, make this more correct by declaring the value to
be an int*.
Fixes: 0a277b622670 ("perf lock contention: Check --max-stack option")
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ian Rogers <irogers@google.com>
Date: Tue Apr 7 19:08:38 2026 -0700
perf maps: Fix copy_from that can break sorted by name order
[ Upstream commit f552b132e4d5248715828e7e5c2bf7889bf05b2e ]
When an parent is copied into a child the name array is populated in
address not name order. Make sure the name array isn't flagged as sorted.
Fixes: 659ad3492b91 ("perf maps: Switch from rbtree to lazily sorted array for addresses")
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ian Rogers <irogers@google.com>
Date: Mon Nov 18 17:16:41 2024 -0800
perf python: Add parse_events function
[ Upstream commit f081defccd934a8db309c90a61178e4f2eef386c ]
Add basic parse_events function that takes a string and returns an
evlist. As the python evlist is embedded in a pyrf_evlist, and the
evsels are embedded in pyrf_evsels, copy the parsed data into those
structs and update evsel__clone to enable this.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Colin Ian King <colin.i.king@gmail.com>
Cc: Dapeng Mi <dapeng1.mi@linux.intel.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ilya Leoshkevich <iii@linux.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Veronika Molnarova <vmolnaro@redhat.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Link: https://lore.kernel.org/r/20241119011644.971342-20-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Stable-dep-of: c9ef786c0970 ("perf cgroup: Update metric leader in evlist__expand_cgroup")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ian Rogers <irogers@google.com>
Date: Thu Mar 19 16:33:49 2026 -0700
perf stat: Fix opt->value type for parse_cache_level
[ Upstream commit 44311ae84ad9177fb311aee856027861c22f17b2 ]
Commit f5803651b4a4 ("perf stat: Choose the most disaggregate command
line option") changed aggregation option handling for `perf stat` but
not `perf stat report` leading to parse_cache_level being passed a
struct in the `perf stat` case but erroneously an aggr_mode enum value
for `perf stat report`. Change the `perf stat report` aggregation
handling to use the same opt_aggr_mode as `perf stat`. Also, just pass
the boolean for consistency with other boolean argument handling.
Fixes: f5803651b4a4 ("perf stat: Choose the most disaggregate command line option")
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ian Rogers <irogers@google.com>
Date: Tue Oct 1 20:20:07 2024 -0700
perf tool_pmu: Factor tool events into their own PMU
[ Upstream commit 240505b2d0adcdc8fd018117e88dc27b09734735 ]
Rather than treat tool events as a special kind of event, create a
tool only PMU where the events/aliases match the existing
duration_time, user_time and system_time events. Remove special
parsing and printing support for the tool events, but add function
calls for when PMU functions are called on a tool_pmu.
Move the tool PMU code in evsel into tool_pmu.c to better encapsulate
the tool event behavior in that file.
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20241002032016.333748-5-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Stable-dep-of: c9ef786c0970 ("perf cgroup: Update metric leader in evlist__expand_cgroup")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ian Rogers <irogers@google.com>
Date: Tue Apr 22 22:03:58 2025 -0700
perf tool_pmu: Fix aggregation on duration_time
[ Upstream commit 68cb1567439fa325ba980f3b5b67f95d3953eafd ]
evsel__count_has_error() fails counters when the enabled or running time
are 0. The duration_time event reads 0 when the cpu_map_idx != 0 to
avoid aggregating time over CPUs. Change the enable and running time
to always have a ratio of 100% so that evsel__count_has_error won't
fail.
Before:
```
$ sudo /tmp/perf/perf stat --per-core -a -M UNCORE_FREQ sleep 1
Performance counter stats for 'system wide':
S0-D0-C0 1 2,615,819,485 UNC_CLOCK.SOCKET # 2.61 UNCORE_FREQ
S0-D0-C0 2 <not counted> duration_time
1.002111784 seconds time elapsed
```
After:
```
$ perf stat --per-core -a -M UNCORE_FREQ sleep 1
Performance counter stats for 'system wide':
S0-D0-C0 1 758,160,296 UNC_CLOCK.SOCKET # 0.76 UNCORE_FREQ
S0-D0-C0 2 1,003,438,246 duration_time
1.002486017 seconds time elapsed
```
Note: the metric reads the value a different way and isn't impacted.
Fixes: 240505b2d0adcdc8 ("perf tool_pmu: Factor tool events into their own PMU")
Reported-by: Stephane Eranian <eranian@google.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Link: https://lore.kernel.org/r/20250423050358.94310-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Chuck Lever <chuck.lever@oracle.com>
Date: Mon Mar 23 11:58:04 2026 -0400
perf tools: Fix module symbol resolution for non-zero .text sh_addr
[ Upstream commit 9a82bfde4775b7a87cd1a7e791f46f83ae442848 ]
When perf resolves symbols from kernel module ELF files (ET_REL),
it converts symbol addresses to file offsets so that sample IPs
can be matched to the correct symbol. The conversion adjusts each
symbol's st_value:
sym->st_value -= shdr->sh_addr - shdr->sh_offset;
For vmlinux (ET_EXEC), st_value is a virtual address and sh_addr
is the section's virtual base, so subtracting sh_addr and adding
sh_offset correctly yields a file offset.
For kernel modules (ET_REL), st_value is a section-relative
offset. The module loader ignores sh_addr entirely and places
symbols at module_base + st_value. Converting to file offset
requires only adding sh_offset; subtracting sh_addr introduces an
error equal to sh_addr bytes.
When .text has sh_addr == 0 -- the historical norm for simple
modules -- both formulas produce the same result and the bug is
latent. As modules gain more metadata sections before .text (.note,
.static_call.text, etc.), the linker assigns .text a non-zero
sh_addr, exposing the defect. For example, nfsd.ko on this kernel
has sh_addr=0xa80, kvm-intel.ko has sh_addr=0x1e90.
The effect is that all .text symbols in affected modules
shift by sh_addr bytes relative to sample IPs, causing perf
report to attribute samples to incorrect, nearby symbols. This
was observed as 13% of LLC-load-miss samples misattributed
to nfsd_file_get_dio_attrs when the actual hot function was
nfsd_cache_lookup, approximately 0xa80 bytes away in the symbol
table.
Use the existing dso__rel() flag (already set for ET_REL modules)
to select the correct adjustment: add sh_offset for ET_REL,
subtract (sh_addr - sh_offset) for ET_EXEC/ET_DYN.
Fixes: 0131c4ec794a ("perf tools: Make it possible to read object code from kernel modules")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Arnaldo Carvalho de Melo <acme@redhat.com>
Date: Wed Apr 8 14:31:57 2026 -0300
perf util: Kill die() prototype, dead for a long time
[ Upstream commit e5cce1b9c82fbd48e2f1f7a25a9fad8ee228176f ]
In fef2a735167a827a ("perf tools: Kill die()") the die() function was
removed, but not the prototype in util.h, now when building with
LIBPERL=1, during a 'make -C tools/perf build-test' routine test, it is
failing as perl likes die() calls and then this clashes with this
remnant, remove it.
Fixes: fef2a735167a827a ("perf tools: Kill die()")
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Mike Leach <mike.leach@arm.com>
Date: Wed Mar 18 10:36:39 2026 +0000
perf: tools: cs-etm: Fix print issue for Coresight debug in ETE/TRBE trace
[ Upstream commit 6c478e7b3eba3f387a2d6c749e3e3ee0f8ad1c53 ]
Building perf with CORESIGHT=1 and the optional CSTRACE_RAW=1 enables
additional debug printing of raw trace data when using command:-
perf report --dump.
This raw trace prints the CoreSight formatted trace frames, which may be
used to investigate suspected issues with trace quality / corruption /
decode.
These frames are not present in ETE + TRBE trace.
This fix removes the unnecessary call to print these frames.
This fix also rationalises implementation - original code had helper
function that unnecessarily repeated initialisation calls that had
already been made.
Due to an addtional fault with the OpenCSD library, this call when ETE/TRBE
are being decoded will cause a segfault in perf. This fix also prevents
that problem for perf using older (<= 1.8.0 version) OpenCSD libraries.
Fixes: 68ffe3902898 ("perf tools: Add decoder mechanic to support dumping trace data")
Reported-by: Leo Yan <leo.yan@arm.com>
Signed-off-by: Mike Leach <mike.leach@arm.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Yu-Chun Lin <eleanor.lin@realtek.com>
Date: Fri Mar 20 23:15:06 2026 +0800
pinctrl: abx500: Fix type of 'argument' variable
[ Upstream commit 34006f77890d050e6d80cbee365b5d703c1140b4 ]
The argument variable is assigned the return value of
pinconf_to_config_argument(), which returns a u32. Change its type from
enum pin_config_param to unsigned int to correctly store the configuration
argument.
Fixes: 03b054e9696c ("pinctrl: Pass all configs to driver on pin_config_set()")
Signed-off-by: Yu-Chun Lin <eleanor15x@gmail.com>
Signed-off-by: Linus Walleij <linusw@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Date: Fri Feb 27 17:43:35 2026 +0100
pinctrl: cy8c95x0: Avoid returning positive values to user space
[ Upstream commit 5ad32c3607cf241a1a2680cabd64cbcd757227aa ]
When probe fails due to unclear interrupt status register, it returns
a positive number instead of the proper error code. Fix this accordingly.
Fixes: e6cbbe42944d ("pinctrl: Add Cypress cy8c95x0 support")
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/r/202602271847.vVWkqLBD-lkp@intel.com/
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Linus Walleij <linusw@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Date: Mon Feb 23 19:06:53 2026 +0100
pinctrl: cy8c95x0: remove duplicate error message
[ Upstream commit 970dacb3b9f0fedbbbcfd7dbf1f4f22340b3f359 ]
The pin control core is covered to report any error via message.
The devm_request_threaded_irq() already prints an error message.
Remove the duplicates.
While at it, drop the info message as the same information about
an IRQ in use can be retrieved differently.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Linus Walleij <linusw@kernel.org>
Stable-dep-of: 5ad32c3607cf ("pinctrl: cy8c95x0: Avoid returning positive values to user space")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Date: Mon Feb 23 19:06:54 2026 +0100
pinctrl: cy8c95x0: Unify messages with help of dev_err_probe()
[ Upstream commit 014884732095b982412d13d3220c3fe8483b9b3e ]
Unify error messages that might appear during probe phase by
switching to use dev_err_probe().
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Linus Walleij <linusw@kernel.org>
Stable-dep-of: 5ad32c3607cf ("pinctrl: cy8c95x0: Avoid returning positive values to user space")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ethan Tidmore <ethantidmore06@gmail.com>
Date: Fri Feb 27 15:56:23 2026 -0600
pinctrl: pinctrl-pic32: Fix resource leak
[ Upstream commit fe5560688f3ba98364c7de7b4f8dc240ffd1ff75 ]
Fix three possible resource leaks by using the devres version of
clk_prepare_enable(). Also, update error message accordingly.
Detected by Smatch:
drivers/pinctrl/pinctrl-pic32.c:2211 pic32_pinctrl_probe() warn:
'pctl->clk' from clk_prepare_enable() not released on lines: 2208.
drivers/pinctrl/pinctrl-pic32.c:2274 pic32_gpio_probe() warn:
'bank->clk' from clk_prepare_enable() not released on lines: 2264,2272.
Fixes: 2ba384e6c3810 ("pinctrl: pinctrl-pic32: Add PIC32 pin control driver")
Signed-off-by: Ethan Tidmore <ethantidmore06@gmail.com>
Signed-off-by: Linus Walleij <linusw@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Yu-Chun Lin <eleanor.lin@realtek.com>
Date: Tue Mar 17 19:54:03 2026 +0800
pinctrl: realtek: Fix function signature for config argument
[ Upstream commit 1f5451844786ed203605528dca9e5d84ed378160 ]
The argument originates from pinconf_to_config_argument(), which returns a
u32. Therefore, the arg parameter should be an unsigned int instead of enum
pin_config_param.
Fixes: e99ce78030db ("pinctrl: realtek: Add common pinctrl driver for Realtek DHC RTD SoCs")
Signed-off-by: Yu-Chun Lin <eleanor.lin@realtek.com>
Signed-off-by: Linus Walleij <linusw@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Biju Das <biju.das.jz@bp.renesas.com>
Date: Thu Mar 26 16:24:51 2026 +0000
pinctrl: renesas: rzg2l: Fix save/restore of {IOLH,IEN,PUPD,SMT} registers
[ Upstream commit d9a60e367919752a1d398ebeba667f1e200fae1e ]
The rzg2l_pinctrl_pm_setup_regs() handles save/restore of
{IOLH,IEN,PUPD,SMT} registers during s2ram, but only for ports where all
pins share the same pincfg. Extend the code to also support ports with
variable pincfg per pin, so that {IOLH,IEN,PUPD,SMT} registers are
correctly saved and restored for all pins.
Fixes: 254203f9a94c ("pinctrl: renesas: rzg2l: Add suspend/resume support")
Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://patch.msgid.link/20260326162459.101414-1-biju.das.jz@bp.renesas.com
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Date: Thu Mar 5 20:47:03 2026 +0100
platform/chrome: chromeos_tbmc: Drop wakeup source on remove
[ Upstream commit 5d441a4bc93642ed6f41da87327a39946b4e1455 ]
The wakeup source added by device_init_wakeup() in chromeos_tbmc_add()
needs to be dropped during driver removal, so add a .remove() callback
to the driver for this purpose.
Fixes: 0144c00ed86b ("platform/chrome: chromeos_tbmc: Report wake events")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://lore.kernel.org/r/6151957.MhkbZ0Pkbq@rafael.j.wysocki
Signed-off-by: Tzung-Bi Shih <tzungbi@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Date: Wed Mar 4 19:54:08 2026 +0100
platform/surface: surfacepro3_button: Drop wakeup source on remove
[ Upstream commit 1410a228ab2d36fe2b383415a632ae12048d4f3a ]
The wakeup source added by device_init_wakeup() in surface_button_add()
needs to be dropped during driver removal, so update the driver to do
that.
Fixes: 19351f340765 ("platform/x86: surfacepro3: Support for wakeup from suspend-to-idle")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://patch.msgid.link/4368848.1IzOArtZ34@rafael.j.wysocki
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Danilo Krummrich <dakr@kernel.org>
Date: Tue Mar 24 01:59:10 2026 +0100
platform/wmi: use generic driver_override infrastructure
[ Upstream commit 8a700b1fc94df4d847a04f14ebc7f8532592b367 ]
When a driver is probed through __driver_attach(), the bus' match()
callback is called without the device lock held, thus accessing the
driver_override field without a lock, which can cause a UAF.
Fix this by using the driver-core driver_override infrastructure taking
care of proper locking internally.
Note that calling match() from __driver_attach() without the device lock
held is intentional. [1]
Link: https://lore.kernel.org/driver-core/DGRGTIRHA62X.3RY09D9SOK77P@kernel.org/ [1]
Reported-by: Gui-Dong Han <hanguidong02@gmail.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220789
Fixes: 12046f8c77e0 ("platform/x86: wmi: Add driver_override support")
Reviewed-by: Armin Wolf <W_Armin@gmx.de>
Acked-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Link: https://patch.msgid.link/20260324005919.2408620-7-dakr@kernel.org
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Denis Benato <denis.benato@linux.dev>
Date: Mon Mar 2 18:44:30 2026 +0100
platform/x86: asus-wmi: adjust screenpad power/brightness handling
[ Upstream commit 130d29c5627cd50e786e926ad7ef66322c5a0c09 ]
Fix illogical screen off control by hardcoding 0 and 1 depending on the
requested brightness and also do not rely on the last screenpad power
state to issue screen brightness commands.
Fixes: 2c97d3e55b70 ("platform/x86: asus-wmi: add support for ASUS screenpad")
Signed-off-by: Denis Benato <denis.benato@linux.dev>
Signed-off-by: Luke Jones <luke@ljones.dev>
Link: https://patch.msgid.link/20260302174431.349816-2-denis.benato@linux.dev
Link: https://patch.msgid.link/20260326231154.856729-2-ethantidmore06@gmail.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Denis Benato <denis.benato@linux.dev>
Date: Mon Mar 2 18:44:31 2026 +0100
platform/x86: asus-wmi: fix screenpad brightness range
[ Upstream commit 8d95d1f4aa5c76202b0833a70998769384612488 ]
Fix screenpad brightness range being too limited without reason:
testing this patch on a Zenbook Duo showed the hardware minimum not being
too low, therefore allow the user to configure the entire range, and
expose to userspace the hardware brightness range and value.
Fixes: 2c97d3e55b70 ("platform/x86: asus-wmi: add support for ASUS screenpad")
Signed-off-by: Denis Benato <denis.benato@linux.dev>
Signed-off-by: Luke Jones <luke@ljones.dev>
Link: https://patch.msgid.link/20260302174431.349816-3-denis.benato@linux.dev
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Pengpeng Hou <pengpeng@iscas.ac.cn>
Date: Wed Apr 8 08:38:21 2026 +0800
platform/x86: dell-wmi-sysman: bound enumeration string aggregation
[ Upstream commit 3c34471c26abc52a37f5ad90949e2e4b8027eb14 ]
populate_enum_data() aggregates firmware-provided value-modifier
and possible-value strings into fixed 512-byte struct members.
The current code bounds each individual source string but then
appends every string and separator with raw strcat() and no
remaining-space check.
Switch the aggregation loops to a bounded append helper and
reject enumeration packages whose combined strings do not fit
in the destination buffers.
Fixes: e8a60aa7404b ("platform/x86: Introduce support for Systems Management Driver over WMI for Dell Systems")
Signed-off-by: Pengpeng Hou <pengpeng@iscas.ac.cn>
Link: https://patch.msgid.link/20260408084501.1-dell-wmi-sysman-v2-pengpeng@iscas.ac.cn
[ij: add include]
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Fedor Pchelkin <pchelkin@ispras.ru>
Date: Fri Apr 3 16:42:39 2026 +0300
platform/x86: dell_rbu: avoid uninit value usage in packet_size_write()
[ Upstream commit f8fd138c2363c0e2d3235c32bfb4fb5c6474e4ae ]
Ensure the temp value has been properly parsed from the user-provided
buffer and initialized to be used in later operations. While at it,
prefer a convenient kstrtoul() helper.
Found by Linux Verification Center (linuxtesting.org) with Svace static
analysis tool.
Fixes: ad6ce87e5bd4 ("[PATCH] dell_rbu: changes in packet update mechanism")
Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru>
Link: https://patch.msgid.link/20260403134240.604837-1-pchelkin@ispras.ru
[ij: add include]
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Date: Fri Mar 20 11:31:54 2026 +0100
platform/x86: panasonic-laptop: Fix OPTD notifier registration and cleanup
[ Upstream commit 8baeff2c1d33dad8572216c6ad3a7425852507d4 ]
An ACPI notify handler is leaked if device_create_file() returns an
error in acpi_pcc_hotkey_add().
Also, it is pointless to call pcc_unregister_optd_notifier() in
acpi_pcc_hotkey_remove() if pcc->platform is NULL and it is better
to arrange the cleanup code in that function in the same order as
the rollback code in acpi_pcc_hotkey_add().
Address the above by placing the pcc_register_optd_notifier() call in
acpi_pcc_hotkey_add() after the device_create_file() return value
check and placing the pcc_unregister_optd_notifier() call in
acpi_pcc_hotkey_remove() right before the device_remove_file() call.
Fixes: d5a81d8e864b ("platform/x86: panasonic-laptop: Add support for optical driver power in Y and W series")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://patch.msgid.link/2411055.ElGaqSPkdT@rafael.j.wysocki
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Date: Sun Feb 1 12:48:59 2026 +0200
PM: domains: De-constify fields in struct dev_pm_domain_attach_data
[ Upstream commit 1877d3f258cbb57d64e275754fb9b18b089ce72d ]
It doesn't really make sense to keep u32 fields to be marked as const.
Having the const fields prevents their modification in the driver. Instead
the whole struct can be defined as const, if it is constant.
Fixes: 161e16a5e50a ("PM: domains: Add helper functions to attach/detach multiple PM domains")
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Felix Gu <ustc.gu@gmail.com>
Date: Wed Jan 21 22:17:17 2026 +0800
pmdomain: imx: scu-pd: Fix device_node reference leak during ->probe()
[ Upstream commit c8e9b6a55702be6c6d034e973d519c52c3848415 ]
When calling of_parse_phandle_with_args(), the caller is responsible
to call of_node_put() to release the reference of device node.
In imx_sc_pd_get_console_rsrc(), it does not release the reference.
Fixes: 893cfb99734f ("firmware: imx: scu-pd: do not power off console domain")
Signed-off-by: Felix Gu <ustc.gu@gmail.com>
Reviewed-by: Peng Fan <peng.fan@nxp.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Felix Gu <gu_0233@qq.com>
Date: Fri Jan 16 20:27:47 2026 +0800
pmdomain: ti: omap_prm: Fix a reference leak on device node
[ Upstream commit 44c28e1c52764fef6dd1c1ada3a248728812e67f ]
When calling of_parse_phandle_with_args(), the caller is responsible
to call of_node_put() to release the reference of device node.
In omap_prm_domain_attach_dev, it does not release the reference.
Fixes: 58cbff023bfa ("soc: ti: omap-prm: Add basic power domain support")
Signed-off-by: Felix Gu <gu_0233@qq.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Sourabh Jain <sourabhjain@linux.ibm.com>
Date: Thu Mar 12 14:00:49 2026 +0530
powerpc/crash: fix backup region offset update to elfcorehdr
[ Upstream commit 789335cacdf37da93bb7c70322dff8c7e82881df ]
update_backup_region_phdr() in file_load_64.c iterates over all the
program headers in the kdump kernel’s elfcorehdr and updates the
p_offset of the program header whose physical address starts at 0.
However, the loop logic is incorrect because the program header pointer
is not updated during iteration. Since elfcorehdr typically contains
PT_NOTE entries first, the PT_LOAD program header with physical address
0 is never reached. As a result, its p_offset is not updated to point to
the backup region.
Because of this behavior, the capture kernel exports the first 64 KB of
the crashed kernel’s memory at offset 0, even though that memory
actually lives in the backup region. When a crash happens, purgatory
copies the first 64 KB of the crashed kernel’s memory into the backup
region so the capture kernel can safely use it.
This has not caused problems so far because the first 64 KB is usually
identical in both the crashed and capture kernels. However, this is
just an assumption and is not guaranteed to always hold true.
Fix update_backup_region_phdr() to correctly update the p_offset of the
program header with a starting physical address of 0 by correcting the
logic used to iterate over the program headers.
Fixes: cb350c1f1f86 ("powerpc/kexec_file: Prepare elfcore header for crashing kernel")
Reviewed-by: Aditya Gupta <adityag@linux.ibm.com>
Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
Reviewed-by: Hari Bathini <hbathini@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20260312083051.1935737-2-sourabhjain@linux.ibm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Sourabh Jain <sourabhjain@linux.ibm.com>
Date: Thu Mar 12 14:00:50 2026 +0530
powerpc/crash: Update backup region offset in elfcorehdr on memory hotplug
[ Upstream commit f53b24d1fa263f56155213eabab734c18d884aff ]
When elfcorehdr is prepared for kdump, the program header representing
the first 64 KB of memory is expected to have its offset point to the
backup region. This is required because purgatory copies the first 64 KB
of the crashed kernel memory to this backup region following a kernel
crash. This allows the capture kernel to use the first 64 KB of memory
to place the exception vectors and other required data.
When elfcorehdr is recreated due to memory hotplug, the offset of
the program header representing the first 64 KB is not updated.
As a result, the capture kernel exports the first 64 KB at offset
0, even though the data actually resides in the backup region.
Fix this by calling sync_backup_region_phdr() to update the program
header offset in the elfcorehdr created during memory hotplug.
sync_backup_region_phdr() works for images loaded via the
kexec_file_load syscall. However, it does not work for kexec_load,
because image->arch.backup_start is not initialized in that case.
So introduce machine_kexec_post_load() to process the elfcorehdr
prepared by kexec-tools and initialize image->arch.backup_start for
kdump images loaded via kexec_load syscall.
Rename update_backup_region_phdr() to sync_backup_region_phdr() and
extend it to synchronize the backup region offset between the kdump
image and the ELF core header. The helper now supports updating either
the kdump image from the ELF program header or updating the ELF program
header from the kdump image, avoiding code duplication.
Define ARCH_HAS_KIMAGE_ARCH and struct kimage_arch when
CONFIG_KEXEC_FILE or CONFIG_CRASH_DUMP is enabled so that
kimage->arch.backup_start is available with the kexec_load system call.
This patch depends on the patch titled
"powerpc/crash: fix backup region offset update to elfcorehdr".
Fixes: 849599b702ef ("powerpc/crash: add crash memory hotplug support")
Reviewed-by: Aditya Gupta <adityag@linux.ibm.com>
Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20260312083051.1935737-3-sourabhjain@linux.ibm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ma Ke <make24@iscas.ac.cn>
Date: Sun Nov 16 10:44:11 2025 +0800
powerpc/warp: Fix error handling in pika_dtm_thread
commit 108d7f951271cbd36ca36efc5e5d106966f5180c upstream.
pika_dtm_thread() acquires client through of_find_i2c_device_by_node()
but fails to release it in error handling path. This could result in a
reference count leak, preventing proper cleanup and potentially
leading to resource exhaustion. Add put_device() to release the
reference in the error handling path.
Found by code review.
Cc: stable@vger.kernel.org
Fixes: 3984114f0562 ("powerpc/warp: Platform fix for i2c change")
Signed-off-by: Ma Ke <make24@iscas.ac.cn>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20251116024411.21968-1-make24@iscas.ac.cn
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Taegu Ha <hataegu0826@gmail.com>
Date: Thu Apr 9 16:11:15 2026 +0900
ppp: require CAP_NET_ADMIN in target netns for unattached ioctls
[ Upstream commit 2bb6379416fd19f44c3423a00bfd8626259f6067 ]
/dev/ppp open is currently authorized against file->f_cred->user_ns,
while unattached administrative ioctls operate on current->nsproxy->net_ns.
As a result, a local unprivileged user can create a new user namespace
with CLONE_NEWUSER, gain CAP_NET_ADMIN only in that new user namespace,
and still issue PPPIOCNEWUNIT, PPPIOCATTACH, or PPPIOCATTCHAN against
an inherited network namespace.
Require CAP_NET_ADMIN in the user namespace that owns the target network
namespace before handling unattached PPP administrative ioctls.
This preserves normal pppd operation in the network namespace it is
actually privileged in, while rejecting the userns-only inherited-netns
case.
Fixes: 273ec51dd7ce ("net: ppp_generic - introduce net-namespace functionality v2")
Signed-off-by: Taegu Ha <hataegu0826@gmail.com>
Link: https://patch.msgid.link/20260409071117.4354-1-hataegu0826@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Qingfang Deng <qingfang.deng@linux.dev>
Date: Wed Apr 15 10:24:51 2026 +0800
pppoe: drop PFC frames
[ Upstream commit cc1ff87bce1ccd38410ab10960f576dcd17db679 ]
RFC 2516 Section 7 states that Protocol Field Compression (PFC) is NOT
RECOMMENDED for PPPoE. In practice, pppd does not support negotiating
PFC for PPPoE sessions, and the current PPPoE driver assumes an
uncompressed (2-byte) protocol field. However, the generic PPP layer
function ppp_input() is not aware of the negotiation result, and still
accepts PFC frames.
If a peer with a broken implementation or an attacker sends a frame with
a compressed (1-byte) protocol field, the subsequent PPP payload is
shifted by one byte. This causes the network header to be 4-byte
misaligned, which may trigger unaligned access exceptions on some
architectures.
To reduce the attack surface, drop PPPoE PFC frames. Introduce
ppp_skb_is_compressed_proto() helper function to be used in both
ppp_generic.c and pppoe.c to avoid open-coding.
Fixes: 7fb1b8ca8fa1 ("ppp: Move PFC decompression to PPP generic layer")
Signed-off-by: Qingfang Deng <qingfang.deng@linux.dev>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20260415022456.141758-2-qingfang.deng@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Cole Leavitt <cole@unwrap.rs>
Date: Wed Feb 25 16:54:06 2026 -0700
pstore/ram: fix resource leak when ioremap() fails
[ Upstream commit 2ddb69f686ef7a621645e97fc7329c50edf5d0e5 ]
In persistent_ram_iomap(), ioremap() or ioremap_wc() may return NULL on
failure. Currently, if this happens, the function returns NULL without
releasing the memory region acquired by request_mem_region().
This leads to a resource leak where the memory region remains reserved
but unusable.
Additionally, the caller persistent_ram_buffer_map() handles NULL
correctly by returning -ENOMEM, but without this check, a NULL return
combined with request_mem_region() succeeding leaves resources in an
inconsistent state.
This is the ioremap() counterpart to commit 05363abc7625 ("pstore:
ram_core: fix incorrect success return when vmap() fails") which fixed
a similar issue in the vmap() path.
Fixes: 404a6043385d ("staging: android: persistent_ram: handle reserving and mapping memory")
Signed-off-by: Cole Leavitt <cole@unwrap.rs>
Link: https://patch.msgid.link/20260225235406.11790-1-cole@unwrap.rs
Signed-off-by: Kees Cook <kees@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Sangyun Kim <sangyun.kim@snu.ac.kr>
Date: Sun Apr 19 17:08:38 2026 +0900
pwm: atmel-tcb: Cache clock rates and mark chip as atomic
[ Upstream commit 68637b68afcc3cb4d56aca14a3a1d1b47b879369 ]
atmel_tcb_pwm_apply() holds tcbpwmc->lock as a spinlock via
guard(spinlock)() and then calls atmel_tcb_pwm_config(), which calls
clk_get_rate() twice. clk_get_rate() acquires clk_prepare_lock (a
mutex), so this is a sleep-in-atomic-context violation.
On CONFIG_DEBUG_ATOMIC_SLEEP kernels every pwm_apply_state() that
enables or reconfigures the PWM triggers a "BUG: sleeping function
called from invalid context" warning.
Acquire exclusive control over the clock rates with
clk_rate_exclusive_get() at probe time and cache the rates in struct
atmel_tcb_pwm_chip, then read the cached rates from
atmel_tcb_pwm_config(). This keeps the spinlock-based mutual exclusion
introduced in commit 37f7707077f5 ("pwm: atmel-tcb: Fix race condition
and convert to guards") and removes the sleeping calls from the atomic
section.
With no sleeping calls left in .apply() and the regmap-mmio bus already
running with fast_io=true, also mark the chip as atomic so consumers
can use pwm_apply_atomic() from atomic context.
Fixes: 37f7707077f5 ("pwm: atmel-tcb: Fix race condition and convert to guards")
Signed-off-by: Sangyun Kim <sangyun.kim@snu.ac.kr>
Link: https://patch.msgid.link/20260419080838.3192357-1-sangyun.kim@snu.ac.kr
[ukleinek: Ensure .clk is enabled before calling clk_get_rate on it.]
Signed-off-by: Uwe Kleine-König <ukleinek@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jan Kara <jack@suse.cz>
Date: Fri Feb 27 14:22:16 2026 +0100
quota: Fix race of dquot_scan_active() with quota deactivation
[ Upstream commit e93ab401da4b2e2c1b8ef2424de2f238d51c8b2d ]
dquot_scan_active() can race with quota deactivation in
quota_release_workfn() like:
CPU0 (quota_release_workfn) CPU1 (dquot_scan_active)
============================== ==============================
spin_lock(&dq_list_lock);
list_replace_init(
&releasing_dquots, &rls_head);
/* dquot X on rls_head,
dq_count == 0,
DQ_ACTIVE_B still set */
spin_unlock(&dq_list_lock);
synchronize_srcu(&dquot_srcu);
spin_lock(&dq_list_lock);
list_for_each_entry(dquot,
&inuse_list, dq_inuse) {
/* finds dquot X */
dquot_active(X) -> true
atomic_inc(&X->dq_count);
}
spin_unlock(&dq_list_lock);
spin_lock(&dq_list_lock);
dquot = list_first_entry(&rls_head);
WARN_ON_ONCE(atomic_read(&dquot->dq_count));
The problem is not only a cosmetic one as under memory pressure the
caller of dquot_scan_active() can end up working on freed dquot.
Fix the problem by making sure the dquot is removed from releasing list
when we acquire a reference to it.
Fixes: 869b6ea1609f ("quota: Fix slow quotaoff")
Reported-by: Sam Sun <samsun1006219@gmail.com>
Link: https://lore.kernel.org/all/CAEkJfYPTt3uP1vAYnQ5V2ZWn5O9PLhhGi5HbOcAzyP9vbXyjeg@mail.gmail.com
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Chih Kai Hsu <hsu.chih.kai@realtek.com>
Date: Thu Mar 26 15:39:23 2026 +0800
r8152: fix incorrect register write to USB_UPHY_XTAL
[ Upstream commit 48afd5124fd6129c46fd12cb06155384b1c4a0c4 ]
The old code used ocp_write_byte() to clear the OOBS_POLLING bit
(BIT(8)) in the USB_UPHY_XTAL register, but this doesn't correctly
clear a bit in the upper byte of the 16-bit register.
Fix this by using ocp_write_word() instead.
Fixes: 195aae321c82 ("r8152: support new chips")
Signed-off-by: Chih Kai Hsu <hsu.chih.kai@realtek.com>
Reviewed-by: Hayes Wang <hayeswang@realtek.com>
Link: https://patch.msgid.link/20260326073925.32976-454-nic_swsd@realtek.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Florian Westphal <fw@strlen.de>
Date: Mon Mar 30 14:27:39 2026 +0200
RDMA/core: Prefer NLA_NUL_STRING
[ Upstream commit 6ed3d14fc45d3da6025e7fe4a6a09066856698e2 ]
These attributes are evaluated as c-string (passed to strcmp), but
NLA_STRING doesn't check for the presence of a \0 terminator.
Either this needs to switch to nla_strcmp() and needs to adjust printf fmt
specifier to not use plain %s, or this needs to use NLA_NUL_STRING.
As the code has been this way for long time, it seems to me that userspace
does include the terminating nul, even tough its not enforced so far, and
thus NLA_NUL_STRING use is the simpler solution.
Fixes: 30dc5e63d6a5 ("RDMA/core: Add support for iWARP Port Mapper user space service")
Link: https://patch.msgid.link/r/20260330122742.13315-1-fw@strlen.de
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Jason Gunthorpe <jgg@ziepe.ca>
Date: Sun May 17 21:23:44 2026 -0400
RDMA/mana: Remove user triggerable WARN_ON() in mana_ib_create_qp_rss()
[ Upstream commit 159f2efabc89d3f931d38f2d35876535d4abf0a3 ]
Sashiko points out that the user can specify WQs sharing the same CQ as a
part of the uAPI and this will trigger the WARN_ON() then go on to corrupt
the kernel.
Just reject it outright and fail the QP creation.
Cc: stable@vger.kernel.org
Fixes: c15d7802a424 ("RDMA/mana_ib: Add CQ interrupt support for RAW QP")
Link: https://sashiko.dev/#/patchset/0-v2-1c49eeb88c48%2B91-rdma_udata_rep_jgg%40nvidia.com?part=1
Link: https://patch.msgid.link/r/5-v1-41f3135e5565+9d2-rdma_ai_fixes1_jgg@nvidia.com
Reviewed-by: Long Li <longli@microsoft.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
[ adjusted context ]
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Tim Michals <tcmichals@yahoo.com>
Date: Wed Feb 4 12:27:30 2026 -0800
remoteproc: xlnx: Fix sram property parsing
[ Upstream commit d116bccf6f1c199b27c9ebdf07cc3cfe868f919c ]
As per sram bindings, "sram" property can be list of phandles.
When more than one sram phandles are listed, driver can't parse second
phandle's address correctly. Because, phandle index is passed to the API
instead of offset of address from reg property which is always 0 as per
sram.yaml bindings. Fix it by passing 0 to the API instead of sram
phandle index.
Fixes: 77fcdf51b8ca ("remoteproc: xlnx: Add sram support")
Signed-off-by: Tim Michals <tcmichals@yahoo.com>
Signed-off-by: Tanmay Shah <tanmay.shah@amd.com>
Link: https://lore.kernel.org/r/20260204202730.3729984-1-tanmay.shah@amd.com
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Philipp Zabel <p.zabel@pengutronix.de>
Date: Wed Sep 25 18:40:10 2024 +0200
reset: Add devres helpers to request pre-deasserted reset controls
[ Upstream commit d872bed85036f5e60c66b0dd0994346b4ea6470c ]
Add devres helpers
- devm_reset_control_bulk_get_exclusive_deasserted
- devm_reset_control_bulk_get_optional_exclusive_deasserted
- devm_reset_control_bulk_get_optional_shared_deasserted
- devm_reset_control_bulk_get_shared_deasserted
- devm_reset_control_get_exclusive_deasserted
- devm_reset_control_get_optional_exclusive_deasserted
- devm_reset_control_get_optional_shared_deasserted
- devm_reset_control_get_shared_deasserted
to request and immediately deassert reset controls. During cleanup,
reset_control_assert() will be called automatically on the returned
reset controls.
Acked-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Link: https://lore.kernel.org/r/20240925-reset-get-deasserted-v2-2-b3601bbd0458@pengutronix.de
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Stable-dep-of: bef1eef66718 ("i3c: master: dw-i3c: Fix missing reset assertion in remove() callback")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Philipp Zabel <p.zabel@pengutronix.de>
Date: Wed Sep 25 18:40:09 2024 +0200
reset: replace boolean parameters with flags parameter
[ Upstream commit dad35f7d2fc14e446669d4cab100597a6798eae5 ]
Introduce enum reset_control_flags and replace the list of boolean
parameters to the internal reset_control_get functions with a single
flags parameter, before adding more boolean options.
The separate boolean parameters have been shown to be error prone in
the past. See for example commit a57f68ddc886 ("reset: Fix devm bulk
optional exclusive control getter").
Acked-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Link: https://lore.kernel.org/r/20240925-reset-get-deasserted-v2-1-b3601bbd0458@pengutronix.de
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Stable-dep-of: bef1eef66718 ("i3c: master: dw-i3c: Fix missing reset assertion in remove() callback")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Mario Limonciello <mario.limonciello@amd.com>
Date: Mon May 4 18:01:37 2026 -0500
Revert "ACPI: CPPC: Adjust debug messages in amd_set_max_freq_ratio() to warn"
commit db5dadb562cabb6da49959b473ed0d9645b6f2da upstream.
Some older systems don't support CPPC in the firmware and this just makes
noise for them when booting. Drop back to debug.
This reverts commit 21fb59ab4b9767085f4fe1edbdbe3177fbb9ec97.
Fixes: 21fb59ab4b976 ("ACPI: CPPC: Adjust debug messages in amd_set_max_freq_ratio() to warn")
Suggested-by: Kim Phillips <kim.phillips@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Tested-by: Kim Phillips <kim.phillips@amd.com>
Cc: All applicable <stable@vger.kernel.org>
Link: https://patch.msgid.link/20260504230141.484743-2-mario.limonciello@amd.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Anthony Pighin (Nokia) <anthony.pighin@nokia.com>
Date: Tue Nov 25 18:00:10 2025 +0000
rtc: abx80x: Disable alarm feature if no interrupt attached
[ Upstream commit 0fedce7244e4b85c049ce579c87e298a1b0b811d ]
Commit 795cda8338ea ("rtc: interface: Fix long-standing race when setting
alarm") exposed an issue where the rtc-abx80x driver does not clear the
alarm feature bit, but instead relies on the set_alarm operation to return
invalid.
For example, when a RTC_UIE_ON ioctl is handled, it should abort at the
feature validation. Instead, it proceeds to the rtc_timer_enqueue(),
which used to return an error from the set_alarm call. However,
following the race condition handling, which likely should not be
discarding predecing errors, a success condition is returned to the
ioctl() caller. This results in (for example):
hwclock: select() to /dev/rtc0 to wait for clock tick timed out
Notwithstanding the validity of the race condition handling, if an interrupt
wasn't specified, or could not be attached, the driver should clear the
alarm feature bit.
Fixes: 718a820a303c ("rtc: abx80x: add alarm support")
Signed-off-by: Anthony Pighin <anthony.pighin@nokia.com>
Link: https://patch.msgid.link/BN0PR08MB69510928028C933749F4139383D1A@BN0PR08MB6951.namprd08.prod.outlook.com
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ilya Leoshkevich <iii@linux.ibm.com>
Date: Fri Mar 13 18:46:25 2026 +0100
s390/bpf: Zero-extend bpf prog return values and kfunc arguments
[ Upstream commit 202e42e4aa890172366354b233c42c73107a3f59 ]
s390x ABI requires callers to zero-extend unsigned arguments and
sign-extend signed arguments, and callees to zero-extend unsigned
return values and sign-extend signed return values.
s390 BPF JIT currently implements only sign extension. Fix this
omission and implement zero extension too.
Fixes: 528eb2cb87bc ("s390/bpf: Implement arch_prepare_bpf_trampoline()")
Reported-by: Hari Bathini <hbathini@linux.ibm.com>
Closes: https://lore.kernel.org/bpf/20260312080113.843408-1-hbathini@linux.ibm.com/
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Tested-by: Ihor Solodrai <ihor.solodrai@linux.dev>
Link: https://lore.kernel.org/r/20260313174807.581826-1-iii@linux.ibm.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Danilo Krummrich <dakr@kernel.org>
Date: Tue Mar 24 01:59:13 2026 +0100
s390/cio: use generic driver_override infrastructure
[ Upstream commit ac4d8bb6e2e13e8684a76ea48d13ebaaaf5c24c4 ]
When a driver is probed through __driver_attach(), the bus' match()
callback is called without the device lock held, thus accessing the
driver_override field without a lock, which can cause a UAF.
Fix this by using the driver-core driver_override infrastructure taking
care of proper locking internally.
Note that calling match() from __driver_attach() without the device lock
held is intentional. [1]
Link: https://lore.kernel.org/driver-core/DGRGTIRHA62X.3RY09D9SOK77P@kernel.org/ [1]
Reported-by: Gui-Dong Han <hanguidong02@gmail.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220789
Fixes: ebc3d1791503 ("s390/cio: introduce driver_override on the css bus")
Reviewed-by: Vineeth Vijayan <vneethv@linux.ibm.com>
Link: https://patch.msgid.link/20260324005919.2408620-10-dakr@kernel.org
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Zicheng Qu <quzicheng@huawei.com>
Date: Fri Apr 24 07:11:13 2026 +0000
sched/fair: Clear rel_deadline when initializing forked entities
[ Upstream commit 3da56dc063cd77b9c0b40add930767fab4e389f3 ]
A yield-triggered crash can happen when a newly forked sched_entity
enters the fair class with se->rel_deadline unexpectedly set.
The failing sequence is:
1. A task is forked while se->rel_deadline is still set.
2. __sched_fork() initializes vruntime, vlag and other sched_entity
state, but does not clear rel_deadline.
3. On the first enqueue, enqueue_entity() calls place_entity().
4. Because se->rel_deadline is set, place_entity() treats se->deadline
as a relative deadline and converts it to an absolute deadline by
adding the current vruntime.
5. However, the forked entity's deadline is not a valid inherited
relative deadline for this new scheduling instance, so the conversion
produces an abnormally large deadline.
6. If the task later calls sched_yield(), yield_task_fair() advances
se->vruntime to se->deadline.
7. The inflated vruntime is then used by the following enqueue path,
where the vruntime-derived key can overflow when multiplied by the
entity weight.
8. This corrupts cfs_rq->sum_w_vruntime, breaks EEVDF eligibility
calculation, and can eventually make all entities appear ineligible.
pick_next_entity() may then return NULL unexpectedly, leading to a
later NULL dereference.
A captured trace shows the effect clearly. Before yield, the entity's
vruntime was around:
9834017729983308
After yield_task_fair() executed:
se->vruntime = se->deadline
the vruntime jumped to:
19668035460670230
and the deadline was later advanced further to:
19668035463470230
This shows that the deadline had already become abnormally large before
yield_task_fair() copied it into vruntime.
rel_deadline is only meaningful when se->deadline really carries a
relative deadline that still needs to be placed against vruntime. A
freshly forked sched_entity should not inherit or retain this state.
Clear se->rel_deadline in __sched_fork(), together with the other
sched_entity runtime state, so that the first enqueue does not interpret
the new entity's deadline as a stale relative deadline.
Fixes: 82e9d0456e06 ("sched/fair: Avoid re-setting virtual deadline on 'migrations'")
Analyzed-by: Hui Tang <tanghui20@huawei.com>
Analyzed-by: Zhang Qiao <zhangqiao22@huawei.com>
Signed-off-by: Zicheng Qu <quzicheng@huawei.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20260424071113.1199600-1-quzicheng@huawei.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Tejun Heo <tj@kernel.org>
Date: Sun May 17 23:28:00 2026 -0400
sched_ext: Guard scx_dsq_move() against NULL kit->dsq after failed iter_new
[ Upstream commit 4fda9f0e7c950da4fe03cedeb2ac818edf5d03e9 ]
bpf_iter_scx_dsq_new() clears kit->dsq on failure and
bpf_iter_scx_dsq_{next,destroy}() guard against that. scx_dsq_move() doesn't -
it dereferences kit->dsq immediately, so a BPF program that calls
scx_bpf_dsq_move[_vtime]() after a failed iter_new oopses the kernel.
Return false if kit->dsq is NULL.
Fixes: 4c30f5ce4f7a ("sched_ext: Implement scx_bpf_dispatch[_vtime]_from_dsq()")
Cc: stable@vger.kernel.org # v6.12+
Reported-by: Chris Mason <clm@meta.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Andrea Righi <arighi@nvidia.com>
[ dropped the `struct scx_sched *sch` declaration and `sch = src_dsq->sched` line ]
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Yang Erkun <yangerkun@huawei.com>
Date: Tue Jan 27 14:20:42 2026 +0800
scsi: sg: Fix sysctl sg-big-buff register during sg_init()
[ Upstream commit 3033c471aaf675254efaa0da431e95d91a104b41 ]
Commit 26d1c80fd61e ("scsi/sg: move sg-big-buff sysctl to scsi/sg.c") made
a mistake. sysctl sg-big-buff was not created because the call to
register_sg_sysctls() was placed on the wrong code path.
Fixes: 26d1c80fd61e ("scsi/sg: move sg-big-buff sysctl to scsi/sg.c")
Signed-off-by: Yang Erkun <yangerkun@huawei.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Link: https://patch.msgid.link/20260127062044.3034148-2-yangerkun@huawei.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Yang Erkun <yangerkun@huawei.com>
Date: Tue Jan 27 14:20:43 2026 +0800
scsi: sg: Resolve soft lockup issue when opening /dev/sgX
[ Upstream commit d06a310b45e153872033dd0cf19d5a2279121099 ]
The parameter def_reserved_size defines the default buffer size reserved
for each Sg_fd and should be restricted to a range between 0 and 1,048,576
(see https://tldp.org/HOWTO/SCSI-Generic-HOWTO/proc.html). Although the
function sg_proc_write_dressz enforces this limit, it is possible to bypass
it by directly modifying the module parameter as shown below, which then
causes a soft lockup:
echo -1 > /sys/module/sg/parameters/def_reserved_size
exec 4<> /dev/sg0
watchdog: BUG: soft lockup - CPU#5 stuck for 26 seconds! [bash:537]
Modules loaded:
CPU: 5 UID: 0 PID: 537 Command: bash, kernel version 6.19.0-rc3+ #134,
PREEMPT disabled
Hardware: QEMU Standard PC (i440FX + PIIX, 1996), BIOS version
1.16.1-2.fc37 dated 04/01/2014
...
Call Trace:
sg_build_reserve+0x5c/0xa0
sg_add_sfp+0x168/0x270
sg_open+0x16e/0x340
chrdev_open+0xbe/0x230
do_dentry_open+0x175/0x480
vfs_open+0x34/0xf0
do_open+0x265/0x3d0
path_openat+0x110/0x290
do_filp_open+0xc3/0x170
do_sys_openat2+0x71/0xe0
__x64_sys_openat+0x6d/0xa0
do_syscall_64+0x62/0x310
entry_SYSCALL_64_after_hwframe+0x76/0x7e
The fix is to use module_param_cb to validate and reject invalid values
assigned to def_reserved_size.
Fixes: 6460e75a104d ("[SCSI] sg: fixes for large page_size")
Signed-off-by: Yang Erkun <yangerkun@huawei.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Link: https://patch.msgid.link/20260127062044.3034148-3-yangerkun@huawei.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Junrui Luo <moonafterrain@outlook.com>
Date: Wed Mar 4 23:42:58 2026 +0800
scsi: target: core: Fix integer overflow in UNMAP bounds check
[ Upstream commit 2bf2d65f76697820dbc4227d13866293576dd90a ]
sbc_execute_unmap() checks LBA + range does not exceed the device capacity,
but does not guard against LBA + range wrapping around on 64-bit overflow.
Add an overflow check matching the pattern already used for WRITE_SAME in
the same file.
Fixes: 86d7182985d2 ("target: Add sbc_execute_unmap() helper")
Reported-by: Yuhao Jiang <danisjiang@gmail.com>
Signed-off-by: Junrui Luo <moonafterrain@outlook.com>
Link: https://patch.msgid.link/SYBPR01MB7881593C61AD52C69FBDB0BDAF7CA@SYBPR01MB7881.ausprd01.prod.outlook.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Xin Long <lucien.xin@gmail.com>
Date: Sun Apr 26 10:46:41 2026 -0400
sctp: discard stale INIT after handshake completion
[ Upstream commit 8a92cb475ca90d84db769e4d4383e631ace0d6e5 ]
After an association reaches ESTABLISHED, the peer’s init_tag is already
known from the handshake. Any subsequent INIT with the same init_tag is
not a valid restart, but a delayed or duplicate INIT.
Drop such INIT chunks in sctp_sf_do_unexpected_init() instead of
processing them as new association attempts.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Link: https://patch.msgid.link/5788c76c1ee122a3ed00189e88dcf9df1fba226c.1777214801.git.lucien.xin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Xin Long <lucien.xin@gmail.com>
Date: Sun Apr 12 14:13:51 2026 -0400
sctp: fix missing encap_port propagation for GSO fragments
[ Upstream commit bf6f95ae3b8b2638c0e1d6d802d50983ce5d0f45 ]
encap_port in SCTP_INPUT_CB(skb) is used by sctp_vtag_verify() for
SCTP-over-UDP processing. In the GSO case, it is only set on the head
skb, while fragment skbs leave it 0.
This results in fragment skbs seeing encap_port == 0, breaking
SCTP-over-UDP connections.
Fix it by propagating encap_port from the head skb cb when initializing
fragment skbs in sctp_inq_pop().
Fixes: 046c052b475e ("sctp: enable udp tunneling socks")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Link: https://patch.msgid.link/ea65ed61b3598d8b4940f0170b9aa1762307e6c3.1776017631.git.lucien.xin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Michael Bommarito <michael.bommarito@gmail.com>
Date: Wed Apr 15 23:19:03 2026 -0400
sctp: fix OOB write to userspace in sctp_getsockopt_peer_auth_chunks
[ Upstream commit 0cf004ffb61cd32d140531c3a84afe975f9fc7ea ]
sctp_getsockopt_peer_auth_chunks() checks that the caller's optval
buffer is large enough for the peer AUTH chunk list with
if (len < num_chunks)
return -EINVAL;
but then writes num_chunks bytes to p->gauth_chunks, which lives
at offset offsetof(struct sctp_authchunks, gauth_chunks) == 8
inside optval. The check is missing the sizeof(struct
sctp_authchunks) = 8-byte header. When the caller supplies
len == num_chunks (for any num_chunks > 0) the test passes but
copy_to_user() writes sizeof(struct sctp_authchunks) = 8 bytes
past the declared buffer.
The sibling function sctp_getsockopt_local_auth_chunks() at the
next line already has the correct check:
if (len < sizeof(struct sctp_authchunks) + num_chunks)
return -EINVAL;
Align the peer variant with its sibling.
Reproducer confirms on v7.0-13-generic: an unprivileged userspace
caller that opens a loopback SCTP association with AUTH enabled,
queries num_chunks with a short optval, then issues the real
getsockopt with len == num_chunks and sentinel bytes painted past
the buffer observes those sentinel bytes overwritten with the
peer's AUTH chunk type. The bytes written are under the peer's
control but land in the caller's own userspace; this is not a
kernel memory corruption, but it is a kernel-side contract
violation that can silently corrupt adjacent userspace data.
Fixes: 65b07e5d0d09 ("[SCTP]: API updates to suport SCTP-AUTH extensions.")
Assisted-by: Claude:claude-opus-4-6
Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com>
Acked-by: Xin Long <lucien.xin@gmail.com>
Link: https://patch.msgid.link/20260416031903.1447072-1-michael.bommarito@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Waiman Long <longman@redhat.com>
Date: Wed Mar 11 16:05:26 2026 -0400
selftest: memcg: skip memcg_sock test if address family not supported
[ Upstream commit 2d028f3e4bbbfd448928a8d3d2814b0b04c214f4 ]
The test_memcg_sock test in memcontrol.c sets up an IPv6 socket and send
data over it to consume memory and verify that memory.stat.sock and
memory.current values are close.
On systems where IPv6 isn't enabled or not configured to support
SOCK_STREAM, the test_memcg_sock test always fails. When the socket()
call fails, there is no way we can test the memory consumption and verify
the above claim. I believe it is better to just skip the test in this
case instead of reporting a test failure hinting that there may be
something wrong with the memcg code.
Link: https://lkml.kernel.org/r/20260311200526.885899-1-longman@redhat.com
Fixes: 5f8f019380b8 ("selftests: cgroup/memcontrol: add basic test for socket accounting")
Signed-off-by: Waiman Long <longman@redhat.com>
Acked-by: Michal Koutný <mkoutny@suse.com>
Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Michal Koutný <mkoutny@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Eduard Zingerman <eddyz87@gmail.com>
Date: Sat Apr 11 00:33:44 2026 -0700
selftests/bpf: fix __jited_unpriv tag name
[ Upstream commit cdd54fe98c00549264a92613af6bb0e9a5fd0d1c ]
__jited_unpriv was using "test_jited=" as its tag name, same as the
priv variant __jited. Fix by using "test_jited_unpriv=".
Fixes: 7d743e4c759c ("selftests/bpf: __jited test tag to check disassembly after jit")
Acked-by: Ihor Solodrai <ihor.solodrai@linux.dev>
Reviewed-by: Puranjay Mohan <puranjay@kernel.org>
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20260410-selftests-global-tags-ordering-v2-1-c566ec9781bf@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: AnishMulay <anishm7030@gmail.com>
Date: Wed Feb 18 11:39:41 2026 -0500
selftests/mm: skip migration tests if NUMA is unavailable
[ Upstream commit 54218f10dfbe88c8e41c744fd45a756cde60b8c4 ]
Currently, the migration test asserts that numa_available() returns 0. On
systems where NUMA is not available (returning -1), such as certain ARM64
configurations or single-node systems, this assertion fails and crashes
the test.
Update the test to check the return value of numa_available(). If it is
less than 0, skip the test gracefully instead of failing.
This aligns the behavior with other MM selftests (like rmap) that skip
when NUMA support is missing.
Link: https://lkml.kernel.org/r/20260218163941.13499-1-anishm7030@gmail.com
Fixes: 0c2d08728470 ("mm: add selftests for migration entries")
Signed-off-by: AnishMulay <anishm7030@gmail.com>
Reviewed-by: SeongJae Park <sj@kernel.org>
Reviewed-by: Dev Jain <dev.jain@arm.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Tested-by: Sayali Patil <sayalip@linux.ibm.com>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Amit Machhiwal <amachhiw@linux.ibm.com>
Date: Fri Mar 13 22:24:26 2026 +0530
selftests/powerpc: Suppress -Wmaybe-uninitialized with GCC 15
[ Upstream commit 6e65886fceb23605eff952d6b1975737b4c4b154 ]
GCC 15 reports the below false positive '-Wmaybe-uninitialized' warning
in vphn_unpack_associativity() when building the powerpc selftests.
# make -C tools/testing/selftests TARGETS="powerpc"
[...]
CC test-vphn
In file included from test-vphn.c:3:
In function ‘vphn_unpack_associativity’,
inlined from ‘test_one’ at test-vphn.c:371:2,
inlined from ‘test_vphn’ at test-vphn.c:399:9:
test-vphn.c:10:33: error: ‘be_packed’ may be used uninitialized [-Werror=maybe-uninitialized]
10 | #define be16_to_cpup(x) bswap_16(*x)
| ^~~~~~~~
vphn.c:42:27: note: in expansion of macro ‘be16_to_cpup’
42 | u16 new = be16_to_cpup(field++);
| ^~~~~~~~~~~~
In file included from test-vphn.c:19:
vphn.c: In function ‘test_vphn’:
vphn.c:27:16: note: ‘be_packed’ declared here
27 | __be64 be_packed[VPHN_REGISTER_COUNT];
| ^~~~~~~~~
cc1: all warnings being treated as errors
When vphn_unpack_associativity() is called from hcall_vphn() in kernel
the error is not seen while building vphn.c during kernel compilation.
This is because the top level Makefile includes '-fno-strict-aliasing'
flag always.
The issue here is that GCC 15 emits '-Wmaybe-uninitialized' due to type
punning between __be64[] and __b16* when accessing the buffer via
be16_to_cpup(). The underlying object is fully initialized but GCC 15
fails to track the aliasing due to the strict aliasing violation here.
Please refer [1] and [2]. This results in a false positive warning which
is promoted to an error under '-Werror'. This problem is not seen when
the compilation is performed with GCC 13 and 14. An issue [1] has also
been created on GCC bugzilla.
The selftest compiles fine with '-fno-strict-aliasing'. Since this GCC
flag is used to compile vphn.c in kernel too, the same flag should be
used to build vphn tests when compiling vphn.c in the selftest as well.
Fix this by including '-fno-strict-aliasing' during vphn.c compilation
in the selftest. This keeps the build working while limiting the scope
of the suppression to building vphn tests.
[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=124427
[2] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99768
Fixes: 58dae82843f5 ("selftests/powerpc: Add test for VPHN")
Reviewed-by: Vaibhav Jain <vaibhav@linux.ibm.com>
Signed-off-by: Amit Machhiwal <amachhiw@linux.ibm.com>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20260313165426.43259-1-amachhiw@linux.ibm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: David Carlier <devnexen@gmail.com>
Date: Fri Mar 13 05:17:55 2026 +0000
selftests/sched_ext: Add missing error check for exit__load()
[ Upstream commit 1d02346fec8d13b05e54296ddc6ae29b7e1067df ]
exit__load(skel) was called without checking its return value.
Every other test in the suite wraps the load call with
SCX_FAIL_IF(). Add the missing check to be consistent with the
rest of the test suite.
Fixes: a5db7817af78 ("sched_ext: Add selftests")
Signed-off-by: David Carlier <devnexen@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Florian Westphal <fw@strlen.de>
Date: Fri Apr 10 00:45:02 2026 +0200
selftests: netfilter: nft_tproxy.sh: adjust to socat changes
[ Upstream commit 61119542663cac70898aef532eb57ee41ea9b477 ]
Like e65d8b6f3092 ("selftests: drv-net: adjust to socat changes") we
need to add shut-none for this test too.
The extra 0-packet can trigger a second (unexpected) reply from the server.
Fixes: 7e37e0eacd22 ("selftests: netfilter: nft_tproxy.sh: add tcp tests")
Reported-by: Jakub Kicinski <kuba@kernel.org>
Closes: https://lore.kernel.org/netdev/20260408152432.24b8ad0d@kernel.org/
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Florian Westphal <fw@strlen.de>
Link: https://patch.msgid.link/20260409224506.27072-1-fw@strlen.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Dan Carpenter <error27@gmail.com>
Date: Wed Apr 29 09:48:17 2026 +0300
sfc: fix error code in efx_devlink_info_running_versions()
[ Upstream commit 051ffb001b8a232cfa6e72f38bb5f51c4270a60b ]
Return -EIO if efx_mcdi_rpc() doesn't return enough space.
Fixes: 14743ddd2495 ("sfc: add devlink info support for ef100")
Signed-off-by: Dan Carpenter <error27@gmail.com>
Reviewed-by: Edward Cree <ecree.xilinx@gmail.com>
Link: https://patch.msgid.link/afGpsbLRHL4_H0KS@stanley.mountain
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Weiming Shi <bestswngs@gmail.com>
Date: Thu Apr 16 18:01:51 2026 +0800
slip: bound decode() reads against the compressed packet length
[ Upstream commit 4c1367a2d7aad643a6f87c6931b13cc1a25e8ca7 ]
slhc_uncompress() parses a VJ-compressed TCP header by advancing a
pointer through the packet via decode() and pull16(). Neither helper
bounds-checks against isize, and decode() masks its return with
& 0xffff so it can never return the -1 that callers test for -- those
error paths are dead code.
A short compressed frame whose change byte requests optional fields
lets decode() read past the end of the packet. The over-read bytes
are folded into the cached cstate and reflected into subsequent
reconstructed packets.
Make decode() and pull16() take the packet end pointer and return -1
when exhausted. Add a bounds check before the TCP-checksum read.
The existing == -1 tests now do what they were always meant to.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Reported-by: Simon Horman <horms@kernel.org>
Closes: https://lore.kernel.org/netdev/20260414134126.758795-2-horms@kernel.org/
Signed-off-by: Weiming Shi <bestswngs@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20260416100147.531855-5-bestswngs@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Weiming Shi <bestswngs@gmail.com>
Date: Thu Apr 16 04:41:31 2026 +0800
slip: reject VJ receive packets on instances with no rstate array
[ Upstream commit e76607442d5b73e1ba6768f501ef815bb58c2c0e ]
slhc_init() accepts rslots == 0 as a valid configuration, with the
documented meaning of 'no receive compression'. In that case the
allocation loop in slhc_init() is skipped, so comp->rstate stays
NULL and comp->rslot_limit stays 0 (from the kzalloc of struct
slcompress).
The receive helpers do not defend against that configuration.
slhc_uncompress() dereferences comp->rstate[x] when the VJ header
carries an explicit connection ID, and slhc_remember() later assigns
cs = &comp->rstate[...] after only comparing the packet's slot number
to comp->rslot_limit. Because rslot_limit is 0, slot 0 passes the
range check, and the code dereferences a NULL rstate.
The configuration is reachable in-tree through PPP. PPPIOCSMAXCID
stores its argument in a signed int, and (val >> 16) uses arithmetic
shift. Passing 0xffff0000 therefore sign-extends to -1, so val2 + 1
is 0 and ppp_generic.c ends up calling slhc_init(0, 1). Because
/dev/ppp open is gated by ns_capable(CAP_NET_ADMIN), the whole path
is reachable from an unprivileged user namespace. Once the malformed
VJ state is installed, any inbound VJ-compressed or VJ-uncompressed
frame that selects slot 0 crashes the kernel in softirq context:
Oops: general protection fault, probably for non-canonical
address 0xdffffc0000000000: 0000 [#1] SMP KASAN NOPTI
KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
RIP: 0010:slhc_uncompress (drivers/net/slip/slhc.c:519)
Call Trace:
<TASK>
ppp_receive_nonmp_frame (drivers/net/ppp/ppp_generic.c:2466)
ppp_input (drivers/net/ppp/ppp_generic.c:2359)
ppp_async_process (drivers/net/ppp/ppp_async.c:492)
tasklet_action_common (kernel/softirq.c:926)
handle_softirqs (kernel/softirq.c:623)
run_ksoftirqd (kernel/softirq.c:1055)
smpboot_thread_fn (kernel/smpboot.c:160)
kthread (kernel/kthread.c:436)
ret_from_fork (arch/x86/kernel/process.c:164)
</TASK>
Reject the receive side on such instances instead of touching rstate.
slhc_uncompress() falls through to its existing 'bad' label, which
bumps sls_i_error and enters the toss state. slhc_remember() mirrors
that with an explicit sls_i_error increment followed by slhc_toss();
the sls_i_runt counter is not used here because a missing rstate is
an internal configuration state, not a runt packet.
The transmit path is unaffected: the only in-tree caller that picks
rslots from userspace (ppp_generic.c) still supplies tslots >= 1, and
slip.c always calls slhc_init(16, 16), so comp->tstate remains valid
and slhc_compress() continues to work.
Fixes: 4ab42d78e37a ("ppp, slip: Validate VJ compression slot parameters completely")
Reported-by: Xiang Mei <xmei5@asu.edu>
Signed-off-by: Weiming Shi <bestswngs@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20260415204130.258866-2-bestswngs@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ye Bin <yebin10@huawei.com>
Date: Thu May 14 21:14:18 2026 +0800
smb/client: fix possible infinite loop and oob read in symlink_data()
commit 7d9a7f1f96cd617ee9e75bb22217c709038e26b8 upstream.
On 32-bit architectures, the infinite loop is as follows:
len = p->ErrorDataLength == 0xfffffff8
u8 *next = p->ErrorContextData + len
next == p
On 32-bit architectures, the out-of-bounds read is as follows:
len = p->ErrorDataLength == 0xfffffff0
u8 *next = p->ErrorContextData + len
next == (u8 *)p - 8
Reported-by: ChenXiaoSong <chenxiaosong@kylinos.cn>
Fixes: 76894f3e2f71 ("cifs: improve symlink handling for smb2+")
Cc: stable@vger.kernel.org
Signed-off-by: Ye Bin <yebin10@huawei.com>
Reviewed-by: ChenXiaoSong <chenxiaosong@kylinos.cn>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Liang Jie <liangjie@lixiang.com>
Date: Mon May 18 16:14:00 2026 +0800
smb: client: correctly handle ErrorContextData as a flexible array
[ Upstream commit 215b7f9ecb8d7c14d56febdcdd246f3579c32aba ]
The `smb2_symlink_err_rsp` structure was previously defined with
`ErrorContextData` as a single `__u8` byte. However, the `ErrorContextData`
field is intended to be a variable-length array based on `ErrorDataLength`.
This mismatch leads to incorrect pointer arithmetic and potential memory
access issues when processing error contexts.
Updates the `ErrorContextData` field to be a flexible array
(`__u8 ErrorContextData[]`). Additionally, it modifies the corresponding
casts in the `symlink_data()` function to properly handle the flexible
array, ensuring correct memory calculations and data handling.
These changes improve the robustness of SMB2 symlink error processing.
Signed-off-by: Liang Jie <liangjie@lixiang.com>
Suggested-by: Tom Talpey <tom@talpey.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Alva Lan <alvalan9@foxmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Date: Mon May 18 16:14:01 2026 +0800
smb: client: fix OOB reads parsing symlink error response
[ Upstream commit 3df690bba28edec865cf7190be10708ad0ddd67e ]
When a CREATE returns STATUS_STOPPED_ON_SYMLINK, smb2_check_message()
returns success without any length validation, leaving the symlink
parsers as the only defense against an untrusted server.
symlink_data() walks SMB 3.1.1 error contexts with the loop test "p <
end", but reads p->ErrorId at offset 4 and p->ErrorDataLength at offset
0. When the server-controlled ErrorDataLength advances p to within 1-7
bytes of end, the next iteration will read past it. When the matching
context is found, sym->SymLinkErrorTag is read at offset 4 from
p->ErrorContextData with no check that the symlink header itself fits.
smb2_parse_symlink_response() then bounds-checks the substitute name
using SMB2_SYMLINK_STRUCT_SIZE as the offset of PathBuffer from
iov_base. That value is computed as sizeof(smb2_err_rsp) +
sizeof(smb2_symlink_err_rsp), which is correct only when
ErrorContextCount == 0.
With at least one error context the symlink data sits 8 bytes deeper,
and each skipped non-matching context shifts it further by 8 +
ALIGN(ErrorDataLength, 8). The check is too short, allowing the
substitute name read to run past iov_len. The out-of-bound heap bytes
are UTF-16-decoded into the symlink target and returned to userspace via
readlink(2).
Fix this all up by making the loops test require the full context header
to fit, rejecting sym if its header runs past end, and bound the
substitute name against the actual position of sym->PathBuffer rather
than a fixed offset.
Because sub_offs and sub_len are 16bits, the pointer math will not
overflow here with the new greater-than.
Cc: Ronnie Sahlberg <ronniesahlberg@gmail.com>
Cc: Shyam Prasad N <sprasad@microsoft.com>
Cc: Tom Talpey <tom@talpey.com>
Cc: Bharath SM <bharathsm@microsoft.com>
Cc: linux-cifs@vger.kernel.org
Cc: samba-technical@lists.samba.org
Cc: stable <stable@kernel.org>
Reviewed-by: Paulo Alcantara (Red Hat) <pc@manguebit.org>
Assisted-by: gregkh_clanker_t1000
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Alva Lan <alvalan9@foxmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Piyush Sachdeva <s.piyush1024@gmail.com>
Date: Sat May 16 15:09:46 2026 -0400
smb: client: Use FullSessionKey for AES-256 encryption key derivation
[ Upstream commit 5be7a0cef3229fb3b63a07c0d289daf752545424 ]
When Kerberos authentication is used with AES-256 encryption (AES-256-CCM
or AES-256-GCM), the SMB3 encryption and decryption keys must be derived
using the full session key (Session.FullSessionKey) rather than just the
first 16 bytes (Session.SessionKey).
Per MS-SMB2 section 3.2.5.3.1, when Connection.Dialect is "3.1.1" and
Connection.CipherId is AES-256-CCM or AES-256-GCM, Session.FullSessionKey
must be set to the full cryptographic key from the GSS authentication
context. The encryption and decryption key derivation (SMBC2SCipherKey,
SMBS2CCipherKey) must use this FullSessionKey as the KDF input. The
signing key derivation continues to use Session.SessionKey (first 16
bytes) in all cases.
Previously, generate_key() hardcoded SMB2_NTLMV2_SESSKEY_SIZE (16) as the
HMAC-SHA256 key input length for all derivations. When Kerberos with
AES-256 provides a 32-byte session key, the KDF for encryption/decryption
was using only the first 16 bytes, producing keys that did not match the
server's, causing mount failures with sec=krb5 and require_gcm_256=1.
Add a full_key_size parameter to generate_key() and pass the appropriate
size from generate_smb3signingkey():
- Signing: always SMB2_NTLMV2_SESSKEY_SIZE (16 bytes)
- Encryption/Decryption: ses->auth_key.len when AES-256, otherwise 16
Also fix cifs_dump_full_key() to report the actual session key length for
AES-256 instead of hardcoded CIFS_SESS_KEY_SIZE, so that userspace tools
like Wireshark receive the correct key for decryption.
Cc: <stable@vger.kernel.org>
Reviewed-by: Bharath SM <bharathsm@microsoft.com>
Signed-off-by: Piyush Sachdeva <psachdeva@microsoft.com>
Signed-off-by: Piyush Sachdeva <s.piyush1024@gmail.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
[ adapted upstream's void/hmac_sha256_init_usingrawkey-based generate_key() to 6.12's int-return crypto_shash_* form while threading full_key_size through all callers. ]
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Sumit Gupta <sumitg@nvidia.com>
Date: Wed Jan 21 15:42:03 2026 +0530
soc/tegra: cbb: Set ERD on resume for err interrupt
[ Upstream commit b6ff71c5d1d4ad858ddf6f39394d169c96689596 ]
Set the Error Response Disable (ERD) bit to mask SError responses
and use interrupt-based error reporting. When the ERD bit is set,
inband error responses to the initiator via SError are suppressed,
and fabric errors are reported via an interrupt instead.
The register is set during boot but the info is lost during system
suspend and needs to be set again on resume.
Fixes: fc2f151d2314 ("soc/tegra: cbb: Add driver for Tegra234 CBB 2.0")
Signed-off-by: Sumit Gupta <sumitg@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Alok Tiwari <alok.a.tiwari@oracle.com>
Date: Sun Mar 29 12:53:23 2026 -0700
soc: qcom: aoss: compare against normalized cooling state
[ Upstream commit cd3c4670db3ffe997be9548c7a9db3952563cf14 ]
qmp_cdev_set_cur_state() normalizes the requested state to a boolean
(cdev_state = !!state). The existing early-return check compares
qmp_cdev->state == state, which can be wrong if state is non-boolean
(any non-zero value). Compare qmp_cdev->state against cdev_state instead,
so the check matches the effective state and avoids redundant updates.
Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com>
Fixes: 05589b30b21a ("soc: qcom: Extend AOSS QMP driver to support resources that are used to wake up the SoC.")
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260329195333.1478090-1-alok.a.tiwari@oracle.com
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Alok Tiwari <alok.a.tiwari@oracle.com>
Date: Mon Mar 30 02:51:11 2026 -0700
soc: qcom: llcc: fix v1 SB syndrome register offset
[ Upstream commit 24e7625df5ce065393249b78930781be593bc381 ]
The llcc_v1_edac_reg_offset table uses 0x2304c for trp_ecc_sb_err_syn0,
which is inconsistent with the surrounding TRP ECC registers (0x2034x)
and with llcc_v2_1_edac_reg_offset, where trp_ecc_sb_err_syn0 is 0x2034c
adjacent to trp_ecc_error_status0/1 at 0x20344/0x20348.
Use 0x2034c for llcc v1 so the SB syndrome register follows the expected
+0x4 progression from trp_ecc_error_status1. This fixes EDAC reading the
wrong register for SB syndrome reporting.
Fixes: c13d7d261e36 ("soc: qcom: llcc: Pass LLCC version based register offsets to EDAC driver")
Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com>
Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260330095118.2657362-1-alok.a.tiwari@oracle.com
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Date: Mon Mar 23 03:20:57 2026 +0200
soc: qcom: ocmem: make the core clock optional
[ Upstream commit e8a61c51417c679d1a599fb36695e9d3b8d95514 ]
OCMEM's core clock (aka RPM bus 2 clock) is being handled internally by
the interconnect driver. Corresponding clock has been dropped from the
SMD RPM clock driver. The users of the ocmem will vote on the ocmemnoc
interconnect paths, making sure that ocmem is on. Make the clock
optional, keeping it for compatibility with older DT.
Fixes: d6edc31f3a68 ("clk: qcom: smd-rpm: Separate out interconnect bus clocks")
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260323-ocmem-v1-1-ad9bcae44763@oss.qualcomm.com
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Date: Mon Mar 23 03:20:58 2026 +0200
soc: qcom: ocmem: register reasons for probe deferrals
[ Upstream commit 9dfd69cd89cd6afa4723be9098979abeef3bb8c6 ]
Instead of printing messages to the dmesg, let the message be recorded
as a reason for the OCMEM client deferral.
Fixes: 88c1e9404f1d ("soc: qcom: add OCMEM driver")
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Reviewed-by: Brian Masney <bmasney@redhat.com>
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260323-ocmem-v1-2-ad9bcae44763@oss.qualcomm.com
[bjorn: s/ERR_PTR(dev_err_probe)/dev_err_ptr_probe/
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Date: Mon Mar 23 03:20:59 2026 +0200
soc: qcom: ocmem: return -EPROBE_DEFER is ocmem is not available
[ Upstream commit 91b59009c7d48b58dbc50fecb27f2ad20749a05a ]
If OCMEM is declared in DT, it is expected that it is present and
handled by the driver. The GPU driver will ignore -ENODEV error, which
typically means that OCMEM isn't defined in DT. Let ocmem return
-EPROBE_DEFER if it supposed to be used, but it is not probed (yet).
Fixes: 88c1e9404f1d ("soc: qcom: add OCMEM driver")
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20260323-ocmem-v1-3-ad9bcae44763@oss.qualcomm.com
[bjorn: s/ERR_PTR(dev_err_probe)/dev_err_ptr_probe/
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Cole Leavitt <cole@unwrap.rs>
Date: Wed Feb 18 11:02:10 2026 -0700
soundwire: bus: demote UNATTACHED state warnings to dev_dbg()
[ Upstream commit 2c96956fe764f8224f9ec93b2a9160a578949a7a ]
The dev_warn() messages in sdw_handle_slave_status() for UNATTACHED
transitions were added in commit d1b328557058 ("soundwire: bus: add
dev_warn() messages to track UNATTACHED devices") to debug attachment
failures with dynamic debug enabled.
These warnings fire during normal operation -- for example when a codec
driver triggers a hardware reset after firmware download, causing the
device to momentarily go UNATTACHED before re-attaching -- producing
misleading noise on every boot.
Demote the messages to dev_dbg() so they remain available via dynamic
debug for diagnosing real attachment failures without alarming users
during expected initialization sequences.
Fixes: d1b328557058 ("soundwire: bus: add dev_warn() messages to track UNATTACHED devices")
Signed-off-by: Cole Leavitt <cole@unwrap.rs>
Reviewed-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Link: https://patch.msgid.link/20260218180210.9263-1-cole@unwrap.rs
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Richard Fitzgerald <rf@opensource.cirrus.com>
Date: Tue Mar 10 11:31:33 2026 +0000
soundwire: cadence: Clear message complete before signaling waiting thread
[ Upstream commit cbfea84f820962c3c5394ff06e7e9344c96bf761 ]
Clear the CDNS_MCP_INT_RX_WL interrupt before signaling completion.
This is to prevent the potential race where:
- The main thread is scheduled immediately the completion is signaled,
and starts a new message
- The RX_WL IRQ for this new message happens before sdw_cdns_irq() has
been re-scheduled.
- When sdw_cdns_irq() is re-scheduled it clears the new RX_WL interrupt.
MAIN THREAD | IRQ THREAD
|
_cdns_xfer_msg() |
{ |
write data to FIFO |
wait_for_completion_timeout() |
<BLOCKED> | <---- RX_WL IRQ
| sdw_cdns_irq()
| {
| signal completion
<== RESCHEDULE <==
Handle message completion |
} |
|
Start new message |
_cdns_xfer_msg() |
{ |
write data to FIFO |
wait_for_completion_timeout() |
<BLOCKED> | <---- RX_WL IRQ
==> RESCHEDULE ==>
| // New RX_WL IRQ is cleared before
| // it has been handled.
| clear CDNS_MCP_INTSTAT
| return IRQ_HANDLED;
| }
Before this change, this error message was sometimes seen on kernels
that have large amounts of debugging enabled:
SCP Msg trf timed out
This error indicates that the completion has not been signalled after
500ms.
Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Fixes: 956baa1992f9 ("soundwire: cdns: Add sdw_master_ops and IO transfer support")
Reported-by: Norman Bintang <normanbt@google.com>
Closes: https://issuetracker.google.com/issues/477099834
Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.dev>
Link: https://patch.msgid.link/20260310113133.1707288-1-rf@opensource.cirrus.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Gui-Dong Han <hanguidong02@gmail.com>
Date: Mon Mar 23 16:58:46 2026 +0800
soundwire: debugfs: initialize firmware_file to empty string
[ Upstream commit 7215e4552f31e53595eae56a834f7e286beecccc ]
Passing NULL to debugfs_create_str() causes a NULL pointer dereference,
and creating debugfs nodes with NULL string pointers is no longer
permitted.
Additionally, firmware_file is a global pointer. Previously, adding every
new slave blindly overwrote it with NULL.
Fix these issues by initializing firmware_file to an allocated empty
string once in the subsystem init path (sdw_debugfs_init), and freeing
it in the exit path. Existing driver code handles empty strings
correctly.
Fixes: fe46d2a4301d ("soundwire: debugfs: add interface to read/write commands")
Reported-by: yangshiguang <yangshiguang@xiaomi.com>
Closes: https://lore.kernel.org/lkml/17647e4c.d461.19b46144a4e.Coremail.yangshiguang1011@163.com/
Signed-off-by: Gui-Dong Han <hanguidong02@gmail.com>
Link: https://patch.msgid.link/20260323085930.88894-4-hanguidong02@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Date: Thu Mar 6 15:07:21 2025 +0100
sparc/vdso: Always reject undefined references during linking
[ Upstream commit 652262975db421767ada3f05b926854bbb357759 ]
Instead of using a custom script to detect and fail on undefined
references, use --no-undefined for all VDSO linker invocations.
Drop the now unused checkundef.sh script.
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Andreas Larsson <andreas@gaisler.com>
Link: https://lore.kernel.org/r/20250306-vdso-checkundef-v2-2-a26cc315fd73@linutronix.de
Stable-dep-of: acc4f131d5d5 ("sparc64: vdso: Link with -z noexecstack")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Date: Wed Mar 4 08:49:01 2026 +0100
sparc64: vdso: Link with -z noexecstack
[ Upstream commit acc4f131d5d57c2aa89db914aeb6f7bb0ab4eb4a ]
The vDSO stack does not need to be executable. Prevent the linker from
creating executable. For more background see commit ffcf9c5700e4 ("x86:
link vdso and boot with -z noexecstack --no-warn-rwx-segments").
Also prevent the following warning from the linker:
sparc64-linux-ld: warning: arch/sparc/vdso/vdso-note.o: missing .note.GNU-stack section implies executable stack
sparc64-linux-ld: NOTE: This behaviour is deprecated and will be removed in a future version of the linker
Fixes: 9a08862a5d2e ("vDSO for sparc")
Suggested-by: Arnd Bergmann <arnd@kernel.org>
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Tested-by: Andreas Larsson <andreas@gaisler.com>
Reviewed-by: Andreas Larsson <andreas@gaisler.com>
Acked-by: Andreas Larsson <andreas@gaisler.com>
Link: https://lore.kernel.org/lkml/20250707144726.4008707-1-arnd@kernel.org/
Link: https://patch.msgid.link/20260304-vdso-sparc64-generic-2-v6-4-d8eb3b0e1410@linutronix.de
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Felix Gu <ustc.gu@gmail.com>
Date: Wed Mar 4 20:47:21 2026 +0800
spi: fsl-qspi: Use reinit_completion() for repeated operations
[ Upstream commit 981b080a79724738882b0af1c5bb7ade30d94f24 ]
The driver currently calls init_completion() during every spi_mem_op.
Tchnically it may work, but it's not the recommended pattern.
According to the kernel documentation: Calling init_completion() on
the same completion object twice is most likely a bug as it
re-initializes the queue to an empty queue and enqueued tasks could
get "lost" - use reinit_completion() in that case, but be aware of
other races.
So moves the initial initialization to probe function and uses
reinit_completion() for subsequent operations.
Fixes: 84d043185dbe ("spi: Add a driver for the Freescale/NXP QuadSPI controller")
Signed-off-by: Felix Gu <ustc.gu@gmail.com>
Reviewed-by: Haibo Chen <haibo.chen@nxp.com>
Link: https://patch.msgid.link/20260304-spi-nxp-v2-3-cd7d7726a27e@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Pei Xiao <xiaopei01@kylinos.cn>
Date: Thu Mar 19 11:06:41 2026 +0800
spi: hisi-kunpeng: prevent infinite while() loop in hisi_spi_flush_fifo
[ Upstream commit 9f61daf2c2debe9f5cf4e1a4471e56a89a6fe45a ]
The hisi_spi_flush_fifo()'s inner while loop that lacks any timeout
mechanism. Maybe the hardware never becomes empty, the loop will spin
forever, causing the CPU to hang.
Fix this by adding a inner_limit based on loops_per_jiffy. The inner loop
now exits after approximately one jiffy if the FIFO remains non-empty, logs
a ratelimited warning, and breaks out of the outer loop. Additionally, add
a cpu_relax() inside the busy loop to improve power efficiency.
Fixes: c770d8631e18 ("spi: Add HiSilicon SPI Controller Driver for Kunpeng SoCs")
Signed-off-by: Pei Xiao <xiaopei01@kylinos.cn>
Link: https://patch.msgid.link/d834ce28172886bfaeb9c8ca00cfd9bf1c65d5a1.1773889292.git.xiaopei01@kylinos.cn
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Pei Xiao <xiaopei01@kylinos.cn>
Date: Tue Apr 7 15:26:59 2026 +0800
spi: mtk-snfi: unregister ECC engine on probe failure and remove() callback
[ Upstream commit ab00febad191d7a4400aa1c3468279fb508258d4 ]
mtk_snand_probe() registers the on-host NAND ECC engine, but teardown was
missing from both probe unwind and remove-time cleanup. Add a devm cleanup
action after successful registration so
nand_ecc_unregister_on_host_hw_engine() runs automatically on probe
failures and during device removal.
Fixes: 764f1b748164 ("spi: add driver for MTK SPI NAND Flash Interface")
Signed-off-by: Pei Xiao <xiaopei01@kylinos.cn>
Link: https://patch.msgid.link/20263f885f1a9c9d559f95275298cd6de4b11ed5.1775546401.git.xiaopei01@kylinos.cn
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Felix Gu <ustc.gu@gmail.com>
Date: Wed Mar 4 20:47:20 2026 +0800
spi: nxp-fspi: Use reinit_completion() for repeated operations
[ Upstream commit 68c8c93fdb0de7e528dc3dfb1d17eb0f652259b8 ]
The driver currently calls init_completion() during every spi_mem_op.
Tchnically it may work, but it's not the recommended pattern.
According to the kernel documentation: Calling init_completion() on
the same completion object twice is most likely a bug as it
re-initializes the queue to an empty queue and enqueued tasks
could get "lost" - use reinit_completion() in that case, but be
aware of other races.
So moves the initial initialization to probe function and uses
reinit_completion() for subsequent operations.
Fixes: a5356aef6a90 ("spi: spi-mem: Add driver for NXP FlexSPI controller")
Signed-off-by: Felix Gu <ustc.gu@gmail.com>
Reviewed-by: Haibo Chen <haibo.chen@nxp.com>
Link: https://patch.msgid.link/20260304-spi-nxp-v2-2-cd7d7726a27e@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: John Madieu <john.madieu@gmail.com>
Date: Sat Apr 25 09:29:34 2026 +0000
spi: rockchip: Read ISR, not IMR, to detect cs-inactive IRQ
[ Upstream commit b4683a239a409d65f88052f5630c748a8ba070cd ]
rockchip_spi_isr() decides whether the current interrupt was the
cs-inactive event by reading IMR:
if (rs->cs_inactive &&
readl_relaxed(rs->regs + ROCKCHIP_SPI_IMR) & INT_CS_INACTIVE)
ctlr->target_abort(ctlr);
IMR is the interrupt mask register: it tells which sources are enabled,
not which one fired. In the PIO path, rockchip_spi_prepare_irq() enables
both INT_RF_FULL and INT_CS_INACTIVE in IMR when rs->cs_inactive is true:
if (rs->cs_inactive)
writel_relaxed(INT_RF_FULL | INT_CS_INACTIVE,
rs->regs + ROCKCHIP_SPI_IMR);
so the IMR check is always true once cs_inactive is enabled, and every
PIO interrupt - including normal RF_FULL completions - is dispatched to
ctlr->target_abort(), aborting the transfer. The bug is reachable on
ROCKCHIP_SPI_VER2_TYPE2 in target mode with a DMA-capable controller
when the transfer is short enough to fall back to PIO
(rockchip_spi_can_dma() returns false below fifo_len).
Read ISR (which is RISR masked by IMR) so the check actually reflects
which interrupt fired, and parenthesise the expression for clarity while
at it.
Fixes: 869f2c94db92 ("spi: rockchip: Stop spi slave dma receiver when cs inactive")
Signed-off-by: John Madieu <john.madieu@gmail.com>
Link: https://patch.msgid.link/20260425092936.2590132-2-john.madieu@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Johan Hovold <johan@kernel.org>
Date: Wed May 20 10:27:26 2026 -0400
spi: sifive: fix controller deregistration
[ Upstream commit 0f25236694a2854627c1597465a071e6bb6fe572 ]
Make sure to deregister the controller before disabling underlying
resources like interrupts during driver unbind.
Note that clocks were also disabled before the recent commit
140039c23aca ("spi: sifive: Simplify clock handling with
devm_clk_get_enabled()").
Fixes: 484a9a68d669 ("spi: sifive: Add driver for the SiFive SPI controller")
Cc: stable@vger.kernel.org # 5.1
Cc: Yash Shah <yash.shah@sifive.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Link: https://patch.msgid.link/20260410081757.503099-15-johan@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Pei Xiao <xiaopei01@kylinos.cn>
Date: Wed May 20 10:27:25 2026 -0400
spi: sifive: Simplify clock handling with devm_clk_get_enabled()
[ Upstream commit 140039c23aca067b9ff0242e3c0ce96276bb95f3 ]
Replace devm_clk_get() followed by clk_prepare_enable() with
devm_clk_get_enabled() for the bus clock. This reduces boilerplate code
and error handling, as the managed API automatically disables the clock
when the device is removed or if probe fails.
Remove the now-unnecessary clk_disable_unprepare() calls from the probe
error path and the remove callback. Adjust the error handling to use the
existing put_host label.
Signed-off-by: Pei Xiao <xiaopei01@kylinos.cn>
Link: https://patch.msgid.link/73d0d8ecb4e1af5a558d6a7866c0f886d94fe3d1.1773885292.git.xiaopei01@kylinos.cn
Signed-off-by: Mark Brown <broonie@kernel.org>
Stable-dep-of: 0f25236694a2 ("spi: sifive: fix controller deregistration")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Haibo Chen <haibo.chen@nxp.com>
Date: Mon Apr 28 18:06:44 2025 +0800
spi: spi-nxp-fspi: enable runtime pm for fspi
[ Upstream commit 97be4b919a609fc8c4bd1118502b5d26cc2f77c4 ]
Enable the runtime PM in fspi driver.
Also for system PM, On some board like i.MX8ULP-EVK board,
after system suspend, IOMUX module will lost power, so all
the pinctrl setting will lost when system resume back, need
driver to save/restore the pinctrl setting.
Signed-off-by: Han Xu <han.xu@nxp.com>
Signed-off-by: Haibo Chen <haibo.chen@nxp.com>
Link: https://patch.msgid.link/20250428-flexspipatch-v3-2-61d5e8f591bc@nxp.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Stable-dep-of: 68c8c93fdb0d ("spi: nxp-fspi: Use reinit_completion() for repeated operations")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Eric Dumazet <edumazet@google.com>
Date: Thu Apr 16 20:03:07 2026 +0000
tcp: add data-race annotations around tp->data_segs_out and tp->total_retrans
[ Upstream commit 21e92a38cfd891538598ba8f805e0165a820d532 ]
tcp_get_timestamping_opt_stats() intentionally runs lockless, we must
add READ_ONCE() and WRITE_ONCE() annotations to keep KCSAN happy.
Fixes: 7e98102f4897 ("tcp: record pkts sent and retransmistted")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260416200319.3608680-3-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Eric Dumazet <edumazet@google.com>
Date: Thu Apr 16 20:03:11 2026 +0000
tcp: add data-race annotations for TCP_NLA_SNDQ_SIZE
[ Upstream commit 124199444de467767175a9004e1574dc42523e62 ]
tcp_get_timestamping_opt_stats() intentionally runs lockless, we must
add READ_ONCE() and WRITE_ONCE() annotations to keep KCSAN happy.
Fixes: 87ecc95d81d9 ("tcp: add send queue size stat in SCM_TIMESTAMPING_OPT_STATS")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260416200319.3608680-7-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Eric Dumazet <edumazet@google.com>
Date: Thu Apr 16 20:03:18 2026 +0000
tcp: annotate data-races around (tp->write_seq - tp->snd_nxt)
[ Upstream commit 3a63b3d160560ef51e43fb4c880a5cde8078053c ]
tcp_get_timestamping_opt_stats() intentionally runs lockless, we must
add READ_ONCE() annotations to keep KCSAN happy.
WRITE_ONCE() annotations are already present.
Fixes: e08ab0b377a1 ("tcp: add bytes not sent to SCM_TIMESTAMPING_OPT_STATS")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260416200319.3608680-14-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Eric Dumazet <edumazet@google.com>
Date: Thu Apr 16 20:03:13 2026 +0000
tcp: annotate data-races around tp->bytes_retrans
[ Upstream commit 5efc7b9f7cbd43401f1af81d3d7f2be00f93390d ]
tcp_get_timestamping_opt_stats() intentionally runs lockless, we must
add READ_ONCE() and WRITE_ONCE() annotations to keep KCSAN happy.
Fixes: fb31c9b9f6c8 ("tcp: add data bytes retransmitted stats")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260416200319.3608680-9-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Eric Dumazet <edumazet@google.com>
Date: Thu Apr 16 20:03:12 2026 +0000
tcp: annotate data-races around tp->bytes_sent
[ Upstream commit ee43e957ce2ec77b2ec47fef28f3c0df6ab01a31 ]
tcp_get_timestamping_opt_stats() intentionally runs lockless, we must
add READ_ONCE() and WRITE_ONCE() annotations to keep KCSAN happy.
Fixes: ba113c3aa79a ("tcp: add data bytes sent stats")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260416200319.3608680-8-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Eric Dumazet <edumazet@google.com>
Date: Thu Apr 16 20:03:14 2026 +0000
tcp: annotate data-races around tp->dsack_dups
[ Upstream commit a984705ca88b976bf1087978fd98b7f3993da88c ]
tcp_get_timestamping_opt_stats() intentionally runs lockless, we must
add READ_ONCE() and WRITE_ONCE() annotations to keep KCSAN happy.
Fixes: 7e10b6554ff2 ("tcp: add dsack blocks received stats")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260416200319.3608680-10-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Eric Dumazet <edumazet@google.com>
Date: Thu Apr 16 20:03:19 2026 +0000
tcp: annotate data-races around tp->plb_rehash
[ Upstream commit 9e89b9d03a2d2e30dcca166d5af52f9a8eceab25 ]
tcp_get_timestamping_opt_stats() intentionally runs lockless, we must
add READ_ONCE() and WRITE_ONCE() annotations to keep KCSAN happy.
Fixes: 29c1c44646ae ("tcp: add u32 counter in tcp_sock and an SNMP counter for PLB")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260416200319.3608680-15-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Kuniyuki Iwashima <kuniyu@google.com>
Date: Fri Apr 10 23:53:27 2026 +0000
tcp: Don't set treq->req_usec_ts in cookie_tcp_reqsk_init().
[ Upstream commit c058bbf05b1197c33df7204842665bd8bc70b3a8 ]
Commit de5626b95e13 ("tcp: Factorise cookie-independent fields
initialisation in cookie_v[46]_check().") miscategorised
tcp_rsk(req)->req_usec_ts init to cookie_tcp_reqsk_init(),
which is used by both BPF/non-BPF SYN cookie reqsk.
Rather, it should have been moved to cookie_tcp_reqsk_alloc() by
commit 8e7bab6b9652 ("tcp: Factorise cookie-dependent fields
initialisation in cookie_v[46]_check()") so that only non-BPF SYN
cookie sets tcp_rsk(req)->req_usec_ts to false.
Let's move the initialisation to cookie_tcp_reqsk_alloc() to
respect bpf_tcp_req_attrs.usec_ts_ok.
Fixes: e472f88891ab ("bpf: tcp: Support arbitrary SYN Cookie.")
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20260410235328.1773449-1-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Altan Hacigumus <ahacigu.linux@gmail.com>
Date: Thu Apr 23 18:46:38 2026 -0700
tcp: make probe0 timer handle expired user timeout
[ Upstream commit 2b9f6f7065d4cfb65ba19126e0b35ac4544c3f3a ]
tcp_clamp_probe0_to_user_timeout() computes remaining time in jiffies
using subtraction with an unsigned lvalue. If elapsed probing time
exceeds the configured TCP_USER_TIMEOUT, the underflow yields a large
value.
This ends up re-arming the probe timer for a full backoff interval
instead of expiring immediately, delaying connection teardown beyond
the configured timeout.
Fix this by preventing underflow so user-set timeout expiration is
handled correctly without extending the probe timer.
Fixes: 344db93ae3ee ("tcp: make TCP_USER_TIMEOUT accurate for zero window probes")
Link: https://lore.kernel.org/r/20260414013634.43997-1-ahacigu.linux@gmail.com
Signed-off-by: Altan Hacigumus <ahacigu.linux@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260424014639.54110-1-ahacigu.linux@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Gopi Krishna Menon <krishnagopi487@gmail.com>
Date: Fri Mar 27 14:35:24 2026 +0530
thermal/drivers/spear: Fix error condition for reading st,thermal-flags
[ Upstream commit da2c4f332a0504d9c284e7626a561d343c8d6f57 ]
of_property_read_u32 returns 0 on success. The current check returns
-EINVAL if the property is read successfully.
Fix the check by removing ! from of_property_read_u32
Fixes: b9c7aff481f1 ("drivers/thermal/spear_thermal.c: add Device Tree probing capability")
Signed-off-by: Gopi Krishna Menon <krishnagopi487@gmail.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@kernel.org>
Suggested-by: Daniel Baluta <daniel.baluta@nxp.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Link: https://patch.msgid.link/20260327090526.59330-1-krishnagopi487@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Lee Jones <lee@kernel.org>
Date: Tue Apr 21 13:45:26 2026 +0100
tipc: fix double-free in tipc_buf_append()
[ Upstream commit d293ca716e7d5dffdaecaf6b9b2f857a33dc3d3a ]
tipc_msg_validate() can potentially reallocate the skb it is validating,
freeing the old one. In tipc_buf_append(), it was being called with a
pointer to a local variable which was a copy of the caller's skb
pointer.
If the skb was reallocated and validation subsequently failed, the error
handling path would free the original skb pointer, which had already
been freed, leading to double-free.
Fix this by checking if head now points to a newly allocated reassembled
skb. If it does, reassign *headbuf for later freeing operations.
Fixes: d618d09a68e4 ("tipc: enforce valid ratio between skb truesize and contents")
Suggested-by: Tung Nguyen <tung.quang.nguyen@est.tech>
Signed-off-by: Lee Jones <lee@kernel.org>
Reviewed-by: Tung Nguyen <tung.quang.nguyen@est.tech>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Breno Leitao <leitao@debian.org>
Date: Mon Apr 20 06:25:09 2026 -0700
tracing: branch: Fix inverted check on stat tracer registration
[ Upstream commit 3b75dd76e64a04771861bb5647951c264919e563 ]
init_annotated_branch_stats() and all_annotated_branch_stats() check the
return value of register_stat_tracer() with "if (!ret)", but
register_stat_tracer() returns 0 on success and a negative errno on
failure. The inverted check causes the warning to be printed on every
successful registration, e.g.:
Warning: could not register annotated branches stats
while leaving real failures silent. The initcall also returned a
hard-coded 1 instead of the actual error.
Invert the check and propagate ret so that the warning fires on real
errors and the initcall reports the correct status.
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: https://patch.msgid.link/20260420-tracing-v1-1-d8f4cd0d6af1@debian.org
Fixes: 002bb86d8d42 ("tracing/ftrace: separate events tracing and stats tracing engine")
Signed-off-by: Breno Leitao <leitao@debian.org>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Pengpeng Hou <pengpeng@iscas.ac.cn>
Date: Wed Apr 1 19:22:23 2026 +0800
tracing: Rebuild full_name on each hist_field_name() call
[ Upstream commit 5ec1d1e97de134beed3a5b08235a60fc1c51af96 ]
hist_field_name() uses a static MAX_FILTER_STR_VAL buffer for fully
qualified variable-reference names, but it currently appends into that
buffer with strcat() without rebuilding it first. As a result, repeated
calls append a new "system.event.field" name onto the previous one,
which can eventually run past the end of full_name.
Build the name with snprintf() on each call and return NULL if the fully
qualified name does not fit in MAX_FILTER_STR_VAL.
Link: https://patch.msgid.link/20260401112224.85582-1-pengpeng@iscas.ac.cn
Fixes: 067fe038e70f ("tracing: Add variable reference handling to hist triggers")
Reviewed-by: Tom Zanussi <zanussi@kernel.org>
Tested-by: Tom Zanussi <zanussi@kernel.org>
Signed-off-by: Pengpeng Hou <pengpeng@iscas.ac.cn>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Randy Dunlap <rdunlap@infradead.org>
Date: Thu Jan 29 23:29:37 2026 -0800
tty: hvc_iucv: fix off-by-one in number of supported devices
[ Upstream commit f2a880e802ad12d1e38039d1334fb1475d0f5241 ]
MAX_HVC_IUCV_LINES == HVC_ALLOC_TTY_ADAPTERS == 8.
This is the number of entries in:
static struct hvc_iucv_private *hvc_iucv_table[MAX_HVC_IUCV_LINES];
Sometimes hvc_iucv_table[] is limited by:
(a) if (num > hvc_iucv_devices) // for error detection
or
(b) for (i = 0; i < hvc_iucv_devices; i++) // in 2 places
(so these 2 don't agree; second one appears to be correct to me.)
hvc_iucv_devices can be 0..8. This is a counter.
(c) if (hvc_iucv_devices > MAX_HVC_IUCV_LINES)
If hvc_iucv_devices == 8, (a) allows the code to access hvc_iucv_table[8].
Oops.
Fixes: 44a01d5ba8a4 ("[S390] s390/hvc_console: z/VM IUCV hypervisor console support")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Link: https://patch.msgid.link/20260130072939.1535869-1-rdunlap@infradead.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Thomas Bogendoerfer <tbogendoerfer@suse.de>
Date: Thu Apr 2 12:21:53 2026 +0200
tty: serial: ip22zilog: Fix section mispatch warning
[ Upstream commit a1a81aef99e853dec84241d701fbf587d713eb5b ]
ip22zilog_prepare() is now called by driver probe routine, so it
shouldn't be in the __init section any longer.
Fixes: 3fc36ae6abd2 ("tty: serial: ip22zilog: Use platform device for probing")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202604020945.c9jAvCPs-lkp@intel.com/
Signed-off-by: Thomas Bogendoerfer <tbogendoerfer@suse.de>
Link: https://patch.msgid.link/20260402102154.136620-1-tbogendoerfer@suse.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Gabriel Krisman Bertazi <krisman@suse.de>
Date: Fri Apr 10 11:59:36 2026 -0400
udp: Force compute_score to always inline
[ Upstream commit b80a95ccf1604a882bb153c45ccb4056e44c8edb ]
Back in 2024 I reported a 7-12% regression on an iperf3 UDP loopback
thoughput test that we traced to the extra overhead of calling
compute_score on two places, introduced by commit f0ea27e7bfe1 ("udp:
re-score reuseport groups when connected sockets are present"). At the
time, I pointed out the overhead was caused by the multiple calls,
associated with cpu-specific mitigations, and merged commit
50aee97d1511 ("udp: Avoid call to compute_score on multiple sites") to
jump back explicitly, to force the rescore call in a single place.
Recently though, we got another regression report against a newer distro
version, which a team colleague traced back to the same root-cause.
Turns out that once we updated to gcc-13, the compiler got smart enough
to unroll the loop, undoing my previous mitigation. Let's bite the
bullet and __always_inline compute_score on both ipv4 and ipv6 to
prevent gcc from de-optimizing it again in the future. These functions
are only called in two places each, udpX_lib_lookup1 and
udpX_lib_lookup2, so the extra size shouldn't be a problem and it is hot
enough to be very visible in profilings. In fact, with gcc13, forcing
the inline will prevent gcc from unrolling the fix from commit
50aee97d1511, so we don't end up increasing udpX_lib_lookup2 at all.
I haven't recollected the results myself, as I don't have access to the
machine at the moment. But the same colleague reported 4.67%
inprovement with this patch in the loopback benchmark, solving the
regression report within noise margins.
Eric Dumazet reported no size change to vmlinux when built with clang.
I report the same also with gcc-13:
scripts/bloat-o-meter vmlinux vmlinux-inline
add/remove: 0/2 grow/shrink: 4/0 up/down: 616/-416 (200)
Function old new delta
udp6_lib_lookup2 762 949 +187
__udp6_lib_lookup 810 975 +165
udp4_lib_lookup2 757 906 +149
__udp4_lib_lookup 871 986 +115
__pfx_compute_score 32 - -32
compute_score 384 - -384
Total: Before=35011784, After=35011984, chg +0.00%
Fixes: 50aee97d1511 ("udp: Avoid call to compute_score on multiple sites")
Reviewed-by: Eric Dumazet <edumazet@google.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Gabriel Krisman Bertazi <krisman@suse.de>
Link: https://patch.msgid.link/20260410155936.654915-1-krisman@suse.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Michal Grzedzicki <mge@meta.com>
Date: Fri Feb 13 11:39:59 2026 -0800
unshare: fix nsproxy leak in ksys_unshare() on set_cred_ucounts() failure
[ Upstream commit a98621a0f187a934c115dcfe79a49520ae892111 ]
When set_cred_ucounts() fails in ksys_unshare() new_nsproxy is leaked.
Let's call put_nsproxy() if that happens.
Link: https://lkml.kernel.org/r/20260213193959.2556730-1-mge@meta.com
Fixes: 905ae01c4ae2 ("Add a reference to ucounts for each cred")
Signed-off-by: Michal Grzedzicki <mge@meta.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Alexey Gladkov (Intel) <legion@kernel.org>
Cc: Ben Segall <bsegall@google.com>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Kees Cook <kees@kernel.org>
Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com>
Cc: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Valentin Schneider <vschneid@redhat.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Vlastimil Babka <vbabka@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Kohei Enju <kohei@enjuk.jp>
Date: Wed Apr 22 02:30:24 2026 +0000
vhost_net: fix sleeping with preempt-disabled in vhost_net_busy_poll()
[ Upstream commit e08a9fac5cf8c3fecf4755e7e3ac059f78b8f83d ]
syzbot reported "sleeping function called from invalid context" in
vhost_net_busy_poll().
Commit 030881372460 ("vhost_net: basic polling support") introduced a
busy-poll loop and preempt_{disable,enable}() around it, where each
iteration calls a sleepable function inside the loop.
The purpose of disabling preemption was to keep local_clock()-based
timeout accounting on a single CPU, rather than as a requirement of
busy-poll itself:
https://lore.kernel.org/1448435489-5949-4-git-send-email-jasowang@redhat.com
From this perspective, migrate_disable() is sufficient here, so replace
preempt_disable() with migrate_disable(), avoiding sleepable accesses
from a preempt-disabled context.
Fixes: 030881372460 ("vhost_net: basic polling support")
Tested-by: syzbot+6985cb8e543ea90ba8ee@syzkaller.appspotmail.com
Reported-by: syzbot+6985cb8e543ea90ba8ee@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/69e6a414.050a0220.24bfd3.002d.GAE@google.com/T/
Signed-off-by: Kohei Enju <kohei@enjuk.jp>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Akihiko Odaki <akihiko.odaki@daynix.com>
Date: Fri Mar 21 15:48:33 2025 +0900
virtio_net: Fix endian with virtio_net_ctrl_rss
[ Upstream commit 97841341e302eac13d54eb5e968570b5626196a7 ]
Mark the fields of struct virtio_net_ctrl_rss as little endian as
they are in struct virtio_net_rss_config, which it follows.
Fixes: c7114b1249fa ("drivers/net/virtio_net: Added basic RSS support.")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Link: https://patch.msgid.link/20250321-virtio-v2-2-33afb8f4640b@daynix.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Stable-dep-of: 3bc06da858ef ("virtio_net: sync rss_trailer.max_tx_vq on queue_pairs change via VQ_PAIRS_SET")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Akihiko Odaki <akihiko.odaki@daynix.com>
Date: Fri Mar 21 15:48:32 2025 +0900
virtio_net: Split struct virtio_net_rss_config
[ Upstream commit 976c2696b71da376d42e63ca3802eb2aafc164eb ]
struct virtio_net_rss_config was less useful in actual code because of a
flexible array placed in the middle. Add new structures that split it
into two to avoid having a flexible array in the middle.
Suggested-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Link: https://patch.msgid.link/20250321-virtio-v2-1-33afb8f4640b@daynix.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Stable-dep-of: 3bc06da858ef ("virtio_net: sync rss_trailer.max_tx_vq on queue_pairs change via VQ_PAIRS_SET")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Brett Creeley <brett.creeley@amd.com>
Date: Thu Apr 16 14:21:21 2026 -0700
virtio_net: sync rss_trailer.max_tx_vq on queue_pairs change via VQ_PAIRS_SET
[ Upstream commit 3bc06da858ef17cfe94b49efc0d9713727012835 ]
When netif_is_rxfh_configured() is true (i.e., the user has explicitly
configured the RSS indirection table), virtnet_set_queues() skips the
RSS update path and falls through to the VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET
command to change the number of queue pairs. However, it does not update
vi->rss_trailer.max_tx_vq to reflect the new queue_pairs value.
This causes a mismatch between vi->curr_queue_pairs and
vi->rss_trailer.max_tx_vq. Any subsequent RSS reconfiguration (e.g.,
via ethtool -X) calls virtnet_commit_rss_command(), which sends the
stale max_tx_vq to the device, silently reverting the queue count.
Reproduction:
1. User configured RSS
ethtool -X eth0 equal 8
2. VQ_PAIRS_SET path; max_tx_vq stays 16
ethtool -L eth0 combined 12
3. RSS commit uses max_tx_vq=16 instead of 12
ethtool -X eth0 equal 4
Fix this by updating vi->rss_trailer.max_tx_vq after a successful
VQ_PAIRS_SET command when RSS is enabled, keeping it in sync with
curr_queue_pairs.
Fixes: 50bfcaedd78e ("virtio_net: Update rss when set queue")
Signed-off-by: Brett Creeley <brett.creeley@amd.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Link: https://patch.msgid.link/20260416212121.29073-1-brett.creeley@amd.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Akihiko Odaki <akihiko.odaki@daynix.com>
Date: Fri Mar 21 15:48:34 2025 +0900
virtio_net: Use new RSS config structs
[ Upstream commit ed3100e90d0d120a045a551b85eb43cf2527e885 ]
The new RSS configuration structures allow easily constructing data for
VIRTIO_NET_CTRL_MQ_RSS_CONFIG as they strictly follow the order of data
for the command.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Link: https://patch.msgid.link/20250321-virtio-v2-3-33afb8f4640b@daynix.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Stable-dep-of: 3bc06da858ef ("virtio_net: sync rss_trailer.max_tx_vq on queue_pairs change via VQ_PAIRS_SET")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ido Schimmel <idosch@nvidia.com>
Date: Thu Apr 23 09:36:07 2026 +0300
vrf: Fix a potential NPD when removing a port from a VRF
[ Upstream commit 2674d603a9e6970463b2b9ebcf8e31e90beae169 ]
RCU readers that identified a net device as a VRF port using
netif_is_l3_slave() assume that a subsequent call to
netdev_master_upper_dev_get_rcu() will return a VRF device. They then
continue to dereference its l3mdev operations.
This assumption is not always correct and can result in a NPD [1]. There
is no RCU synchronization when removing a port from a VRF, so it is
possible for an RCU reader to see a new master device (e.g., a bridge)
that does not have l3mdev operations.
Fix by adding RCU synchronization after clearing the IFF_L3MDEV_SLAVE
flag. Skip this synchronization when a net device is removed from a VRF
as part of its deletion and when the VRF device itself is deleted. In
the latter case an RCU grace period will pass by the time RTNL is
released.
[1]
BUG: kernel NULL pointer dereference, address: 0000000000000000
[...]
RIP: 0010:l3mdev_fib_table_rcu (net/l3mdev/l3mdev.c:181)
[...]
Call Trace:
<TASK>
l3mdev_fib_table_by_index (net/l3mdev/l3mdev.c:201 net/l3mdev/l3mdev.c:189)
__inet_bind (net/ipv4/af_inet.c:499 (discriminator 3))
inet_bind_sk (net/ipv4/af_inet.c:469)
__sys_bind (./include/linux/file.h:62 (discriminator 1) ./include/linux/file.h:83 (discriminator 1) net/socket.c:1951 (discriminator 1))
__x64_sys_bind (net/socket.c:1969 (discriminator 1) net/socket.c:1967 (discriminator 1) net/socket.c:1967 (discriminator 1))
do_syscall_64 (arch/x86/entry/syscall_64.c:63 (discriminator 1) arch/x86/entry/syscall_64.c:94 (discriminator 1))
entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)
Fixes: fdeea7be88b1 ("net: vrf: Set slave's private flag before linking")
Reported-by: Haoze Xie <royenheart@gmail.com>
Reported-by: Yifan Wu <yifanwucs@gmail.com>
Reported-by: Juefei Pu <tomapufckgml@gmail.com>
Reported-by: Yuan Tan <yuantan098@gmail.com>
Closes: https://lore.kernel.org/netdev/20260419145332.3988923-1-n05ec@lzu.edu.cn/
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20260423063607.1208202-1-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Baochen Qiang <baochen.qiang@oss.qualcomm.com>
Date: Wed Mar 25 11:05:01 2026 +0800
wifi: ath10k: fix station lookup failure during disconnect
[ Upstream commit 9a34a59c6086ae731a06b3e61b0951feef758648 ]
Recent commit [1] moved station statistics collection to an earlier stage
of the disconnect flow. With this change in place, ath10k fails to resolve
the station entry when handling a peer stats event triggered during
disconnect, resulting in log messages such as:
wlp58s0: deauthenticating from 74:1a:e0:e7:b4:c8 by local choice (Reason: 3=DEAUTH_LEAVING)
ath10k_pci 0000:3a:00.0: not found station for peer stats
ath10k_pci 0000:3a:00.0: failed to parse stats info tlv: -22
The failure occurs because ath10k relies on ieee80211_find_sta_by_ifaddr()
for station lookup. That function uses local->sta_hash, but by the time
the peer stats request is triggered during disconnect, mac80211 has
already removed the station from that hash table, leading to lookup
failure.
Before commit [1], this issue was not visible because the transition from
IEEE80211_STA_NONE to IEEE80211_STA_NOTEXIST prevented ath10k from sending
a peer stats request at all: ath10k_mac_sta_get_peer_stats_info() would
fail early to find the peer and skip requesting statistics.
Fix this by switching the lookup path to ath10k_peer_find(), which queries
ath10k's internal peer table. At the point where the firmware emits the
peer stats event, the peer entry is still present in the driver's list,
ensuring lookup succeeds.
Tested-on: QCA6174 hw3.2 PCI WLAN.RM.4.4.1-00309-QCARMSWPZ-1
Fixes: a203dbeeca15 ("wifi: mac80211: collect station statistics earlier when disconnect") # [1]
Reported-by: Paul Menzel <pmenzel@molgen.mpg.de>
Closes: https://lore.kernel.org/ath10k/57671b89-ec9f-4e6c-992c-45eb8e75929c@molgen.mpg.de
Signed-off-by: Baochen Qiang <baochen.qiang@oss.qualcomm.com>
Reviewed-by: Rameshkumar Sundaram <rameshkumar.sundaram@oss.qualcomm.com>
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Tested-by: Paul Menzel <pmenzel@molgen.mpg.de>
Link: https://patch.msgid.link/20260325-ath10k-station-lookup-failure-v1-1-2e0c970f25d5@oss.qualcomm.com
Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ethan Tidmore <ethantidmore06@gmail.com>
Date: Mon Feb 16 20:30:43 2026 -0600
wifi: brcmfmac: Fix error pointer dereference
[ Upstream commit dd8592fc6007a451c3e4b9025de365e39de8178a ]
The function brcmf_chip_add_core() can return an error pointer and is
not checked. Add checks for error pointer.
Detected by Smatch:
drivers/net/wireless/broadcom/brcm80211/brcmfmac/chip.c:1010 brcmf_chip_recognition() error:
'core' dereferencing possible ERR_PTR()
drivers/net/wireless/broadcom/brcm80211/brcmfmac/chip.c:1013 brcmf_chip_recognition() error:
'core' dereferencing possible ERR_PTR()
drivers/net/wireless/broadcom/brcm80211/brcmfmac/chip.c:1016 brcmf_chip_recognition() error:
'core' dereferencing possible ERR_PTR()
drivers/net/wireless/broadcom/brcm80211/brcmfmac/chip.c:1019 brcmf_chip_recognition() error:
'core' dereferencing possible ERR_PTR()
drivers/net/wireless/broadcom/brcm80211/brcmfmac/chip.c:1022 brcmf_chip_recognition() error:
'core' dereferencing possible ERR_PTR()
Fixes: cb7cf7be9eba7 ("brcmfmac: make chip related functions host interface independent")
Signed-off-by: Ethan Tidmore <ethantidmore06@gmail.com>
Acked-by: Arend van Spriel <arend.vanspriel@broadcom.com>
Link: https://patch.msgid.link/20260217023043.73631-1-ethantidmore06@gmail.com
[add missing wifi: prefix]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Nicolas Escande <nico.escande@gmail.com>
Date: Fri Mar 27 11:02:56 2026 +0100
wifi: mac80211: handle VHT EXT NSS in ieee80211_determine_our_sta_mode()
[ Upstream commit b5b8e295973083abf823fb66647a7c702a8db8a7 ]
A station which has a NSS ratio on the number of streams it is capable of
in 160MHz VHT operation is supposed to use the 'Extended NSS BW Support'
as defined by section '9.4.2.156.2 VHT Capabilities Information field'.
This was missing in ieee80211_determine_our_sta_mode() and so we would
wrongfully downgrade our bandwidth when connecting to an AP that supported
160MHz with messages such as:
[ 37.638346] wlan1: AP XX:XX:XX:XX:XX:XX changed bandwidth in assoc response, new used config is 5280.000 MHz, width 3 (5290.000/0 MHz)
Fixes: 310c8387c638 ("wifi: mac80211: clean up connection process")
Signed-off-by: Nicolas Escande <nico.escande@gmail.com>
Link: https://patch.msgid.link/20260327100256.3101348-1-nico.escande@gmail.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ryder Lee <ryder.lee@mediatek.com>
Date: Wed Jan 21 09:41:56 2026 -0800
wifi: mt76: mt7615: fix use_cts_prot support
[ Upstream commit 1974a67d9b65c29a0a9426e32e8cd8c056de48b7 ]
Driver should not directly write WTBL to prevent overwritten issues.
With this fix, when driver needs to adjust its behavior for compatibility,
especially concerning older 11g/n devices, by enabling or disabling CTS
protection frames, often for hidden SSIDs or to manage legacy clients.
Fixes: e34235ccc5e3 ("wifi: mt76: mt7615: enable use_cts_prot support")
Signed-off-by: Ryder Lee <ryder.lee@mediatek.com>
Link: https://patch.msgid.link/edb87088b0111b32fafc6c4179f54a5286dd37d8.1768879119.git.ryder.lee@mediatek.com
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Duoming Zhou <duoming@zju.edu.cn>
Date: Fri Jan 30 22:57:59 2026 +0800
wifi: mt76: mt7915: fix use-after-free bugs in mt7915_mac_dump_work()
[ Upstream commit 1146d0946b5358fad24812bd39d68f31cd40cc34 ]
When the mt7915 pci chip is detaching, the mt7915_crash_data is
released in mt7915_coredump_unregister(). However, the work item
dump_work may still be running or pending, leading to UAF bugs
when the already freed crash_data is dereferenced again in
mt7915_mac_dump_work().
The race condition can occur as follows:
CPU 0 (removal path) | CPU 1 (workqueue)
mt7915_pci_remove() | mt7915_sys_recovery_set()
mt7915_unregister_device() | mt7915_reset()
mt7915_coredump_unregister() | queue_work()
vfree(dev->coredump.crash_data) | mt7915_mac_dump_work()
| crash_data-> // UAF
Fix this by ensuring dump_work is properly canceled before
the crash_data is deallocated. Add cancel_work_sync() in
mt7915_unregister_device() to synchronize with any pending
or executing dump work.
Fixes: 4dbcb9125cc3 ("wifi: mt76: mt7915: enable coredump support")
Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
Link: https://patch.msgid.link/20260130145759.84272-1-duoming@zju.edu.cn
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ryder Lee <ryder.lee@mediatek.com>
Date: Wed Jan 21 09:41:57 2026 -0800
wifi: mt76: mt7915: fix use_cts_prot support
[ Upstream commit 8b2c26562b95c6397e132d21f2bd3d73aaee0c0a ]
With this fix, when driver needs to adjust its behavior for compatibility,
especially concerning older 11g/n devices, by enabling or disabling CTS
protection frames, often for hidden SSIDs or to manage legacy clients.
Fixes: 150b91419d3d ("wifi: mt76: mt7915: enable use_cts_prot support")
Signed-off-by: Ryder Lee <ryder.lee@mediatek.com>
Link: https://patch.msgid.link/eb8db4d0bf1c89b7486e89facb788ae3e510dd8b.1768879119.git.ryder.lee@mediatek.com
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Michael Lo <michael.lo@mediatek.com>
Date: Wed Feb 11 17:50:25 2026 +0800
wifi: mt76: mt7921: fix 6GHz regulatory update on connection
[ Upstream commit 3dc0c40d7806c72cfe88cf4e1e2650c1673f9db4 ]
Call mt7921_regd_update() instead of mt7921_mcu_set_clc() when setting
the 6GHz power type after connection, so that regulatory limits and SAR
power are also applied.
Fixes: 51ba0e3a15eb ("wifi: mt76: mt7921: add 6GHz power type support for clc")
Signed-off-by: Michael Lo <michael.lo@mediatek.com>
Link: https://patch.msgid.link/20260211095025.2415624-1-leon.yen@mediatek.com
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Rory Little <rory@candelatech.com>
Date: Wed Sep 3 17:07:11 2025 -0700
wifi: mt76: mt7921: Place upper limit on station AID
[ Upstream commit 4d0bf21e3e20619d51d06c0c36207aabab8b712c ]
Any station configured with an AID over 20 causes a firmware crash.
This situation occurred in our testing using an AP interface on 7922
hardware, with a modified hostapd, sourced from Mediatek's OpenWRT
feeds.
In stock hostapd, station AIDs begin counting at 1, and this
configuration is prevented with an upper limit on associated stations.
However, the modified hostapd began allocation at 65, which caused the
firmware to crash. This fix does not allow these AIDs to work, but will
prevent the firmware crash.
This crash was only seen on IFTYPE_AP interfaces, and the fix does not
appear to have an effect on IFTYPE_STATION behavior.
Fixes: 5c14a5f944b9 ("mt76: mt7921: introduce mt7921e support")
Signed-off-by: Rory Little <rory@candelatech.com>
Link: https://patch.msgid.link/20250904000711.3033860-1-rory@candelatech.com
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Sean Wang <sean.wang@mediatek.com>
Date: Mon Dec 15 18:59:30 2025 -0600
wifi: mt76: mt7921: Reset ampdu_state state in case of failure in mt76_connac2_tx_check_aggr()
[ Upstream commit 53ffffeb9624ffab6d9a3b1da8635a23f1172b5e ]
Reset ampdu_state if ieee80211_start_tx_ba_session() fails in
mt76_connac2_tx_check_aggr(), otherwise the driver may incorrectly
assume aggregation is active and skip future BA setup attempts.
Fixes: 163f4d22c118 ("mt76: mt7921: add MAC support")
Signed-off-by: Sean Wang <sean.wang@mediatek.com>
Link: https://patch.msgid.link/20251216005930.9412-1-sean.wang@kernel.org
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Leon Yen <leon.yen@mediatek.com>
Date: Thu Dec 11 20:38:36 2025 +0800
wifi: mt76: mt7925: Fix incorrect MLO mode in firmware control
[ Upstream commit 1695f662329faa07c860c73453c097823852df28 ]
The selection of MLO mode should depend on the capabilities of the STA
rather than those of the peer AP to avoid compatibility issues with
certain APs, such as Xiaomi BE5000 WiFi7 router.
Fixes: 69acd6d910b0c ("wifi: mt76: mt7925: add mt7925_change_vif_links")
Signed-off-by: Leon Yen <leon.yen@mediatek.com>
Link: https://patch.msgid.link/20251211123836.4169436-1-leon.yen@mediatek.com
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ming Yen Hsieh <mingyen.hsieh@mediatek.com>
Date: Thu Sep 4 11:06:48 2025 +0800
wifi: mt76: mt7925: prevent NULL pointer dereference in mt7925_tx_check_aggr()
[ Upstream commit 83ae3a18ba957257b4c406273d2da2caeea2b439 ]
Move the NULL check for 'sta' before dereferencing it to prevent a
possible crash.
Fixes: 44eb173bdd4f ("wifi: mt76: mt7925: add link handling in mt7925_txwi_free")
Signed-off-by: Ming Yen Hsieh <mingyen.hsieh@mediatek.com>
Link: https://patch.msgid.link/20250904030649.655436-4-mingyen.hsieh@mediatek.com
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Ming Yen Hsieh <mingyen.hsieh@mediatek.com>
Date: Thu Sep 4 11:06:47 2025 +0800
wifi: mt76: mt7925: prevent NULL vif dereference in mt7925_mac_write_txwi
[ Upstream commit 962eb04e67552be406c906c83099c1d736aae3b6 ]
Check for a NULL `vif` before accessing `ieee80211_vif_is_mld(vif)` to
avoid a potential kernel panic in scenarios where `vif` might not be
initialized.
Fixes: ebb1406813c6 ("wifi: mt76: mt7925: add link handling to txwi")
Signed-off-by: Ming Yen Hsieh <mingyen.hsieh@mediatek.com>
Link: https://patch.msgid.link/20250904030649.655436-3-mingyen.hsieh@mediatek.com
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Alok Tiwari <alok.a.tiwari@oracle.com>
Date: Mon Oct 13 02:08:24 2025 -0700
wifi: mt76: mt7996: fix FCS error flag check in RX descriptor
[ Upstream commit d8db56142e531f060c938fa0b5175ed6c8cabb11 ]
The mt7996 driver currently checks the MT_RXD3_NORMAL_FCS_ERR bit in
rxd1 whereas other Connac3-based drivers(mt7925) correctly check this
bit in rxd3.
Since the MT_RXD3_NORMAL_FCS_ERR bit is defined in the fourth RX
descriptor word (rxd3), update mt7996 to use the proper descriptor
field. This change aligns mt7996 with mt7925 and the rest of the
Connac3 family.
Fixes: 98686cd21624 ("wifi: mt76: mt7996: add driver for MediaTek Wi-Fi 7 (802.11be) devices")
Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Link: https://patch.msgid.link/20251013090826.753992-1-alok.a.tiwari@oracle.com
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: StanleyYP Wang <StanleyYP.Wang@mediatek.com>
Date: Tue Feb 3 23:55:30 2026 +0800
wifi: mt76: mt7996: fix struct mt7996_mcu_uni_event
[ Upstream commit efbd5bf395f4e6b45a87f3835d4c2e28170c77c5 ]
The cid field is defined as a two-byte value in the firmware.
Fixes: 98686cd21624 ("wifi: mt76: mt7996: add driver for MediaTek Wi-Fi 7 (802.11be) devices")
Signed-off-by: StanleyYP Wang <StanleyYP.Wang@mediatek.com>
Signed-off-by: Shayne Chen <shayne.chen@mediatek.com>
Link: https://patch.msgid.link/20260203155532.1098290-2-shayne.chen@mediatek.com
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Duoming Zhou <duoming@zju.edu.cn>
Date: Sat Jan 31 10:47:31 2026 +0800
wifi: mt76: mt7996: fix use-after-free bugs in mt7996_mac_dump_work()
[ Upstream commit c8f62f73bbced3a79894655bdb0b625462d956fc ]
When the mt7996 pci chip is detaching, the mt7996_crash_data is
released in mt7996_coredump_unregister(). However, the work item
dump_work may still be running or pending, leading to UAF bugs
when the already freed crash_data is dereferenced again in
mt7996_mac_dump_work().
The race condition can occur as follows:
CPU 0 (removal path) | CPU 1 (workqueue)
mt7996_pci_remove() | mt7996_sys_recovery_set()
mt7996_unregister_device() | mt7996_reset()
mt7996_coredump_unregister() | queue_work()
vfree(dev->coredump.crash_data) | mt7996_mac_dump_work()
| crash_data-> // UAF
Fix this by ensuring dump_work is properly canceled before
the crash_data is deallocated. Add cancel_work_sync() in
mt7996_unregister_device() to synchronize with any pending
or executing dump work.
Fixes: 878161d5d4a4 ("wifi: mt76: mt7996: enable coredump support")
Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
Link: https://patch.msgid.link/20260131024731.18741-1-duoming@zju.edu.cn
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Zilin Guan <zilin@seu.edu.cn>
Date: Mon Jan 19 09:26:25 2026 +0000
wifi: mwifiex: Fix memory leak in mwifiex_11n_aggregate_pkt()
[ Upstream commit 990a73dec3fdc145fef6c827c29205437d533ece ]
In mwifiex_11n_aggregate_pkt(), skb_aggr is allocated via
mwifiex_alloc_dma_align_buf(). If mwifiex_is_ralist_valid() returns false,
the function currently returns -1 immediately without freeing the
previously allocated skb_aggr, causing a memory leak.
Since skb_aggr has not yet been queued via skb_queue_tail(), no other
references to this memory exist. Therefore, it has to be freed locally
before returning the error.
Fix this by calling mwifiex_write_data_complete() to free skb_aggr before
returning the error status.
Compile tested only. Issue found using a prototype static analysis tool
and code review.
Fixes: 5e6e3a92b9a4 ("wireless: mwifiex: initial commit for Marvell mwifiex driver")
Signed-off-by: Zilin Guan <zilin@seu.edu.cn>
Reviewed-by: Jeff Chen <jeff.chen_1@nxp.com>
Link: https://patch.msgid.link/20260119092625.1349934-1-zilin@seu.edu.cn
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Duoming Zhou <duoming@zju.edu.cn>
Date: Mon Feb 23 12:55:22 2026 +0800
wifi: rtlwifi: pci: fix possible use-after-free caused by unfinished irq_prepare_bcn_tasklet
[ Upstream commit 039cd522dc70151da13329a5e3ae19b1736f468a ]
The irq_prepare_bcn_tasklet is initialized in rtl_pci_init() and
scheduled when RTL_IMR_BCNINT interrupt is triggered by hardware.
But it is never killed in rtl_pci_deinit(). When the rtlwifi card
probe fails or is being detached, the ieee80211_hw is deallocated.
However, irq_prepare_bcn_tasklet may still be running or pending,
leading to use-after-free when the freed ieee80211_hw is accessed
in _rtl_pci_prepare_bcn_tasklet().
Similar to irq_tasklet, add tasklet_kill() in rtl_pci_deinit() to
ensure that irq_prepare_bcn_tasklet is properly terminated before
the ieee80211_hw is released.
The issue was identified through static analysis.
Fixes: 0c8173385e54 ("rtl8192ce: Add new driver")
Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
Acked-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Link: https://patch.msgid.link/20260223045522.48377-1-duoming@zju.edu.cn
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Alexey Velichayshiy <a.velichayshiy@ispras.ru>
Date: Mon Mar 23 17:05:53 2026 +0300
wifi: rtw89: phy: fix uninitialized variable access in rtw89_phy_cfo_set_crystal_cap()
[ Upstream commit 047cddf88c611e616d49a00311d4722e46286234 ]
In the rtw89_phy_cfo_set_crystal_cap() function, for chips other than
RTL8852A/RTL8851B, the values read by rtw89_mac_read_xtal_si() are
stored into the local variables sc_xi_val and sc_xo_val. If either
read fails, these variables remain uninitialized, they are later
used to update cfo->crystal_cap and in debug print statements. This
can lead to undefined behavior.
Fix the issue by initializing sc_xi_val and sc_xo_val to zero,
like is implemented in vendor driver.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Fixes: 8379fa611536 ("rtw89: 8852c: add write/read crystal function in CFO tracking")
Signed-off-by: Alexey Velichayshiy <a.velichayshiy@ispras.ru>
Acked-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Link: https://patch.msgid.link/20260323140613.1615574-1-a.velichayshiy@ispras.ru
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Breno Leitao <leitao@debian.org>
Date: Fri May 8 09:22:03 2026 -0700
workqueue: Fix wq->cpu_pwq leak in alloc_and_link_pwqs() WQ_UNBOUND path
commit 0143033dc22cdff912cfc13419f5db92fea3b4cb upstream.
For WQ_UNBOUND workqueues, alloc_and_link_pwqs() allocates wq->cpu_pwq
via alloc_percpu() and then calls apply_workqueue_attrs_locked(). On
failure it returns the error directly, bypassing the enomem: label
which holds the only free_percpu(wq->cpu_pwq) in this function.
The caller's error path kfree()s wq without touching wq->cpu_pwq,
leaking one percpu pointer table (nr_cpu_ids * sizeof(void *) bytes) per
failed call.
If kmemleak is enabled, we can see:
unreferenced object (percpu) 0xc0fffa5b121048 (size 8):
comm "insmod", pid 776, jiffies 4294682844
backtrace (crc 0):
pcpu_alloc_noprof+0x665/0xac0
__alloc_workqueue+0x33f/0xa20
alloc_workqueue_noprof+0x60/0x100
Route the error through the existing enomem: cleanup and any error
before this one.
Cc: stable@kernel.org
Fixes: 636b927eba5b ("workqueue: Make unbound workqueues to use per-cpu pool_workqueues")
Signed-off-by: Breno Leitao <leitao@debian.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Author: Thomas Weißschuh <linux@weissschuh.net>
Date: Mon Oct 13 12:40:21 2025 +0200
x86/um/vdso: Drop VDSO64-y from Makefile
[ Upstream commit 3c9b904f9033fb250db72d258bbdec791dc89405 ]
This symbol is unnecessary, remove it.
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Link: https://patch.msgid.link/20251013-uml-vdso-cleanup-v1-4-a079c7adcc69@weissschuh.net
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Stable-dep-of: d1895c15fc7d ("x86/um: fix vDSO installation")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Author: Thomas Weißschuh <linux@weissschuh.net>
Date: Wed Mar 18 22:03:26 2026 +0100
x86/um: fix vDSO installation
[ Upstream commit d1895c15fc7d90a615bc8c455feb02acaf08ef1e ]
The generic vDSO installation logic used by 'make vdso_install' requires
that $(vdso-install-y) is defined by the top-level architecture Makefile
and that it contains a path relative to the root of the tree.
For UML neither of these is satisfied.
Move the definition of $(vdso-install-y) to a place which is included by
the arch/um/Makefile and use the full relative path.
Fixes: f1c2bb8b9964 ("um: implement a x86_64 vDSO")
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Link: https://patch.msgid.link/20260318-um-vdso-install-v1-1-26a4ca5c4210@weissschuh.net
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>