List of changes in Linux 6.3.6

 
ARM: dts: imx6ull-dhcor: Set and limit the mode for PMIC buck 1, 2 and 3 [+ + +]
Author: Christoph Niedermaier <cniedermaier@dh-electronics.com>
Date:   Tue May 2 13:14:24 2023 +0200

    ARM: dts: imx6ull-dhcor: Set and limit the mode for PMIC buck 1, 2 and 3
    
    [ Upstream commit 892943d7729bbfb2edeed9e323eba9a5cec21c49 ]
    
    According to Renesas Electronics (formerly Dialog Semiconductor), the
    standard AUTO mode of the PMIC DA9061 can lead to stability problems
    depending on the hardware revision. It is recommended to set a defined
    mode such as PFM or PWM permanently. So set and limit the mode for
    buck 1, 2 and 3 to a fixed one.
    
    Fixes: 611b6c891e40 ("ARM: dts: imx6ull-dhcom: Add DH electronics DHCOM i.MX6ULL SoM and PDK2 board")
    Signed-off-by: Christoph Niedermaier <cniedermaier@dh-electronics.com>
    Reviewed-by: Marek Vasut <marex@denx.de>
    Signed-off-by: Shawn Guo <shawnguo@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
ASoC: Intel: avs: Fix module lookup [+ + +]
Author: Amadeusz Sławiński <amadeuszx.slawinski@linux.intel.com>
Date:   Fri May 19 22:17:05 2023 +0200

    ASoC: Intel: avs: Fix module lookup
    
    [ Upstream commit ff04437f6dcd138b50483afc7b313f016020ce8f ]
    
    When changing the value of a kcontrol, the FW module to which the data
    should be sent needs to be found. Currently this is done in an improper
    way; fix it. Change the function name to indicate that it looks only
    for the volume module.
    
    This allows the volume to be changed at runtime, instead of only
    changing the init value.
    
    Fixes: be2b81b519d7 ("ASoC: Intel: avs: Parse control tuples")
    Reviewed-by: Cezary Rojewski <cezary.rojewski@intel.com>
    Signed-off-by: Amadeusz Sławiński <amadeuszx.slawinski@linux.intel.com>
    Link: https://lore.kernel.org/r/20230519201711.4073845-2-amadeuszx.slawinski@linux.intel.com
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
blk-mq: fix race condition in active queue accounting [+ + +]
Author: Tian Lan <tian.lan@twosigma.com>
Date:   Mon May 22 17:05:55 2023 -0400

    blk-mq: fix race condition in active queue accounting
    
    [ Upstream commit 3e94d54e83cafd2b562bb6d15bb2f72d76200fb5 ]
    
    If multiple CPUs are sharing the same hardware queue, it can cause a
    leak in the active queue counter tracking when __blk_mq_tag_busy()
    is executed simultaneously.
    
    Fixes: ee78ec1077d3 ("blk-mq: blk_mq_tag_busy is no need to return a value")
    Signed-off-by: Tian Lan <tian.lan@twosigma.com>
    Reviewed-by: Ming Lei <ming.lei@redhat.com>
    Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
    Reviewed-by: John Garry <john.g.garry@oracle.com>
    Link: https://lore.kernel.org/r/20230522210555.794134-1-tilan7663@gmail.com
    Signed-off-by: Jens Axboe <axboe@kernel.dk>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
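
    A minimal sketch of the pattern behind the fix, with hypothetical names
    standing in for the real blk-mq structures:
    
      #include <linux/atomic.h>
      #include <linux/bitops.h>
    
      /* Hypothetical stand-ins for the real blk-mq structures. */
      struct example_hctx {
              unsigned long state;        /* per-hctx flag bits  */
              atomic_t active_queues;     /* shared busy counter */
      };
      #define EXAMPLE_STATE_ACTIVE 0
    
      static void example_tag_busy(struct example_hctx *hctx)
      {
              /*
               * test_and_set_bit() is atomic: of two CPUs racing here,
               * only the one that actually flips the bit increments the
               * counter. A separate "test_bit() then set_bit()" pair
               * lets both CPUs pass the test and leaks an increment.
               */
              if (!test_and_set_bit(EXAMPLE_STATE_ACTIVE, &hctx->state))
                      atomic_inc(&hctx->active_queues);
      }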

 
blk-wbt: fix that wbt can't be disabled by default [+ + +]
Author: Yu Kuai <yukuai3@huawei.com>
Date:   Mon May 22 20:18:54 2023 +0800

    blk-wbt: fix that wbt can't be disabled by default
    
    [ Upstream commit 8a2b20a997a3779ae9fcae268f2959eb82ec05a1 ]
    
    Commit b11d31ae01e6 ("blk-wbt: remove unnecessary check in
    wbt_enable_default()") removed the checking of CONFIG_BLK_WBT_MQ by
    mistake, which is used to control whether wbt is enabled by default.
    
    Fix the problem by adding the check back. This patch also does a
    little cleanup to make the related code more readable.
    
    Fixes: b11d31ae01e6 ("blk-wbt: remove unnecessary check in wbt_enable_default()")
    Reported-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
    Link: https://lore.kernel.org/lkml/CAKXUXMzfKq_J9nKHGyr5P5rvUETY4B-fxoQD4sO+NYjFOfVtZA@mail.gmail.com/t/
    Signed-off-by: Yu Kuai <yukuai3@huawei.com>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Link: https://lore.kernel.org/r/20230522121854.2928880-1-yukuai1@huaweicloud.com
    Signed-off-by: Jens Axboe <axboe@kernel.dk>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
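
    A minimal sketch of the restored gate, assuming a hypothetical helper
    name; IS_ENABLED() resolves the config option at compile time:
    
      #include <linux/kconfig.h>
      #include <linux/types.h>
    
      /* Hypothetical helper: wbt may only be enabled by default when
       * the kernel was built with CONFIG_BLK_WBT_MQ. */
      static bool example_wbt_enable_by_default(void)
      {
              return IS_ENABLED(CONFIG_BLK_WBT_MQ);
      }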

 
bluetooth: Add cmd validity checks at the start of hci_sock_ioctl() [+ + +]
Author: Ruihan Li <lrh2000@pku.edu.cn>
Date:   Sun Apr 16 16:02:51 2023 +0800

    bluetooth: Add cmd validity checks at the start of hci_sock_ioctl()
    
    commit 000c2fa2c144c499c881a101819cf1936a1f7cf2 upstream.
    
    Previously, channel open messages were always sent to monitors on the first
    ioctl() call for unbound HCI sockets, even if the command and arguments
    were completely invalid. This can leave an exploitable hole with the abuse
    of invalid ioctl calls.
    
    This commit hardens the ioctl processing logic by first checking if the
    command is valid, and immediately returning with an ENOIOCTLCMD error code
    if it is not. This ensures that ioctl calls with invalid commands are free
    of side effects, and increases the difficulty of further exploitation by
    forcing exploitation to find a way to pass a valid command first.
    
    Signed-off-by: Ruihan Li <lrh2000@pku.edu.cn>
    Co-developed-by: Marcel Holtmann <marcel@holtmann.org>
    Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
    Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
    Signed-off-by: Dragos-Marian Panait <dragos.panait@windriver.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
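
    A sketch of the hardening pattern, using a hypothetical helper and only
    a subset of the real HCI ioctl command list:
    
      #include <linux/errno.h>
      #include <net/bluetooth/hci_sock.h>
    
      /* Hypothetical validity check over (a subset of) known commands. */
      static bool example_hci_cmd_is_valid(unsigned int cmd)
      {
              switch (cmd) {
              case HCIGETDEVLIST:
              case HCIGETDEVINFO:
              case HCIDEVUP:
              case HCIDEVDOWN:
                      /* ... remaining known commands ... */
                      return true;
              default:
                      return false;
              }
      }
    
      static int example_hci_sock_ioctl(unsigned int cmd, unsigned long arg)
      {
              /* Reject unknown commands before any side effects, such
               * as emitting channel-open messages to monitors. */
              if (!example_hci_cmd_is_valid(cmd))
                      return -ENOIOCTLCMD;
              /* ... normal ioctl processing ... */
              return 0;
      }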

 
bpf, sockmap: Convert schedule_work into delayed_work [+ + +]
Author: John Fastabend <john.fastabend@gmail.com>
Date:   Mon May 22 19:56:06 2023 -0700

    bpf, sockmap: Convert schedule_work into delayed_work
    
    [ Upstream commit 29173d07f79883ac94f5570294f98af3d4287382 ]
    
    Sk_buffs are fed into sockmap verdict programs either from a strparser
    (when the user might want to decide how framing of skb is done by attaching
    another parser program) or directly through tcp_read_sock. The
    tcp_read_sock is the preferred method for performance when the BPF logic is
    a stream parser.
    
    The flow for Cilium's common use case with a stream parser is,
    
     tcp_read_sock()
      sk_psock_verdict_recv
        ret = bpf_prog_run_pin_on_cpu()
        sk_psock_verdict_apply(sock, skb, ret)
         // if system is under memory pressure or app is slow we may
         // need to queue skb. Do this queuing through ingress_skb and
         // then kick timer to wake up handler
         skb_queue_tail(ingress_skb, skb)
         schedule_work(work);
    
    The work queue is wired up to sk_psock_backlog(). This will then walk the
    ingress_skb skb list that holds our sk_buffs that could not be handled,
    but should be OK to run at some later point. However, it's possible that
    the workqueue doing this work still hits an error when sending the skb.
    When this happens the skbuff is requeued on a temporary 'state' struct
    kept with the workqueue. This is necessary because it's possible to
    partially send an skbuff before hitting an error, and we need to know how
    and where to restart when the workqueue runs next.
    
    Now for the trouble: we don't rekick the workqueue. This can cause a
    stall where the skbuff we just cached on the state variable might never
    be sent. This happens when it's the last packet in a flow and no further
    packets come along that would cause the system to kick the workqueue from
    that side.
    
    To fix this we could do a simple schedule_work(), but while under memory
    pressure it makes sense to back off some instead of continuing to retry
    repeatedly. So instead convert schedule_work to schedule_delayed_work and
    add backoff logic to reschedule from the backlog queue on errors. It's
    not obvious though what a good backoff is, so use '1'.
    
    While testing we observed some flakes running the NGINX compliance test
    with sockmap; we attributed these failed tests to this bug and the
    subsequent issues.
    
    From on-list discussion: this commit
    
     bec217197b41("skmsg: Schedule psock work if the cached skb exists on the psock")
    
    was intended to address a similar race, but had a couple of cases it
    missed. Most obviously, it only accounted for receiving traffic on the
    local socket, so if redirecting into another socket we could still get
    an sk_buff stuck here. Next, it missed the case where copied=0 in the
    recv() handler, and then we wouldn't kick the scheduler. Also, it's
    sub-optimal to require userspace to kick the internal mechanisms of
    sockmap to wake it up and copy data to user. It results in an extra
    syscall and requires the app to actually handle the EAGAIN correctly.
    
    Fixes: 04919bed948dc ("tcp: Introduce tcp_read_skb()")
    Signed-off-by: John Fastabend <john.fastabend@gmail.com>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Tested-by: William Findlay <will@isovalent.com>
    Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
    Link: https://lore.kernel.org/bpf/20230523025618.113937-3-john.fastabend@gmail.com
    Signed-off-by: Sasha Levin <sashal@kernel.org>
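
    A minimal sketch of the conversion described above, with illustrative
    names rather than the upstream psock fields:
    
      #include <linux/workqueue.h>
    
      struct example_psock {
              struct delayed_work work;   /* was: struct work_struct */
      };
    
      bool example_try_send(struct example_psock *psock); /* hypothetical */
    
      static void example_backlog(struct work_struct *w)
      {
              struct example_psock *psock =
                      container_of(w, struct example_psock, work.work);
    
              /* On a send error, back off by one jiffy instead of
               * spinning: reschedule ourselves from the backlog. */
              if (!example_try_send(psock))
                      schedule_delayed_work(&psock->work, 1);
      }
    
      static void example_kick(struct example_psock *psock)
      {
              /* Enqueue path: a delay of 0 behaves like the old
               * schedule_work() and runs as soon as possible. */
              schedule_delayed_work(&psock->work, 0);
      }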

bpf, sockmap: Handle fin correctly [+ + +]
Author: John Fastabend <john.fastabend@gmail.com>
Date:   Mon May 22 19:56:09 2023 -0700

    bpf, sockmap: Handle fin correctly
    
    [ Upstream commit 901546fd8f9ca4b5c481ce00928ab425ce9aacc0 ]
    
    The sockmap code is returning EAGAIN after a FIN packet is received and no
    more data is on the receive queue. Correct behavior is to return 0 to the
    user and the user can then close the socket. The EAGAIN causes many apps
    to retry, which masks the problem. Eventually the socket is evicted from
    the sockmap because it's released from sockmap sock free handling. The
    issue creates a delay and can cause some errors on the application side.
    
    To fix this, check on the sk_msg_recvmsg side: if the length is zero and
    the FIN flag is set, then return zero. A selftest will be added to check
    this condition.
    
    Fixes: 04919bed948dc ("tcp: Introduce tcp_read_skb()")
    Signed-off-by: John Fastabend <john.fastabend@gmail.com>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Tested-by: William Findlay <will@isovalent.com>
    Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
    Link: https://lore.kernel.org/bpf/20230523025618.113937-6-john.fastabend@gmail.com
    Signed-off-by: Sasha Levin <sashal@kernel.org>
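
    A sketch of the check, under the assumption that the FIN state is
    visible as RCV_SHUTDOWN on the socket (illustrative, not the exact
    upstream hunk):
    
      #include <net/sock.h>
    
      static int example_recv_return(struct sock *sk, int copied)
      {
              /* No data and FIN seen: report EOF so the application
               * can close, instead of a misleading -EAGAIN. */
              if (copied == 0 && (sk->sk_shutdown & RCV_SHUTDOWN))
                      return 0;
              if (copied == 0)
                      return -EAGAIN;     /* genuinely no data yet */
              return copied;
      }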

bpf, sockmap: Improved check for empty queue [+ + +]
Author: John Fastabend <john.fastabend@gmail.com>
Date:   Mon May 22 19:56:08 2023 -0700

    bpf, sockmap: Improved check for empty queue
    
    [ Upstream commit 405df89dd52cbcd69a3cd7d9a10d64de38f854b2 ]
    
    We noticed some rare sk_buffs were stepping past the queue when the system
    was under memory pressure. The general theory is to skip enqueueing
    sk_buffs when it's not necessary, which is the normal case with a system
    that is properly provisioned for the task: no memory pressure and enough
    cpu assigned.
    
    But, if we can't allocate memory due to an ENOMEM error when enqueueing
    the sk_buff into the sockmap receive queue, we push it onto a delayed
    workqueue to retry later. When a new sk_buff is received we then check
    if that queue is empty. However, there is a problem with simply checking
    the queue length. When an sk_buff is being processed from the ingress
    queue but not yet on the sockmap msg receive queue, it's possible to also
    recv an sk_buff through the normal path. It will check the ingress queue,
    which is zero, and then skip ahead of the pkt being processed.
    
    Previously we used the sock lock from both contexts, which made the
    problem harder to hit, but not impossible.
    
    To fix this, instead of popping the skb from the queue entirely, we peek
    the skb from the queue and do the copy there. This ensures checks of the
    queue length are non-zero while the skb is being processed. Then, finally,
    when the entire skb has been copied to the user space queue or another
    socket, we pop it off the queue. This way the queue length check allows
    bypassing the queue only after the list has been completely processed.
    
    To reproduce the issue we ran the NGINX compliance test with sockmap
    running and observed some flakes in our testing that we attributed to
    this issue.
    
    Fixes: 04919bed948dc ("tcp: Introduce tcp_read_skb()")
    Suggested-by: Jakub Sitnicki <jakub@cloudflare.com>
    Signed-off-by: John Fastabend <john.fastabend@gmail.com>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Tested-by: William Findlay <will@isovalent.com>
    Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
    Link: https://lore.kernel.org/bpf/20230523025618.113937-5-john.fastabend@gmail.com
    Signed-off-by: Sasha Levin <sashal@kernel.org>
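
    A sketch of the peek-then-unlink pattern (hypothetical copy helper;
    assumes the caller serializes access to the list):
    
      #include <linux/skbuff.h>
    
      bool example_copy_out(struct sk_buff *skb); /* hypothetical */
    
      static void example_drain(struct sk_buff_head *list)
      {
              struct sk_buff *skb;
    
              while ((skb = skb_peek(list)) != NULL) {
                      /* skb stays on the list while being copied, so a
                       * concurrent queue-length check stays non-zero */
                      if (!example_copy_out(skb))
                              break;      /* retry later, skb kept */
                      /* fully consumed: only now remove and free it */
                      __skb_unlink(skb, list);
                      kfree_skb(skb);
              }
      }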

bpf, sockmap: Incorrectly handling copied_seq [+ + +]
Author: John Fastabend <john.fastabend@gmail.com>
Date:   Mon May 22 19:56:12 2023 -0700

    bpf, sockmap: Incorrectly handling copied_seq
    
    [ Upstream commit e5c6de5fa025882babf89cecbed80acf49b987fa ]
    
    The read_skb() logic is incrementing tcp->copied_seq, which is used,
    among other things, for calculating how many outstanding bytes can be
    read by the application. This results in application errors: if the
    application does an ioctl(FIONREAD) we return zero, because this is
    calculated from the copied_seq value.
    
    To fix this we move tcp->copied_seq accounting into the recv handler so
    that we update these when the recvmsg() hook is called and data is in
    fact copied into user buffers. This gives an accurate FIONREAD value
    as expected and improves ACK handling. Before, we were calling
    tcp_rcv_space_adjust(), which would update the 'number of bytes copied
    to user in last RTT', which is wrong for programs returning SK_PASS. The
    bytes are only copied to the user when recvmsg is handled.
    
    Doing the fix for recvmsg is straightforward, but fixing redirect and
    SK_DROP pkts is a bit trickier. Build a tcp_psock_eat() helper and then
    call this from skmsg handlers. This fixes another issue where a broken
    socket with a BPF program doing a resubmit could hang the receiver. This
    happened because although read_skb() consumed the skb through sock_drop()
    it did not update the copied_seq. Now if a single recv socket is
    redirecting to many sockets (for example for load balancing) the receiver
    sk will be hung even though we might expect it to continue. The hang
    comes from not updating the copied_seq numbers and the memory pressure
    resulting from that.
    
    We have a slight layering problem of calling tcp_eat_skb even if it's not
    a TCP socket. To fix this we could refactor and create per-type receiver
    handlers. I decided this is more work than we want in the fix and we
    already have some small tweaks depending on the caller that use the
    helper skb_bpf_strparser(). So we extend that a bit and always set
    the strparser bit when it is in use, and then we can gate the
    seq_copied updates on this.
    
    Fixes: 04919bed948dc ("tcp: Introduce tcp_read_skb()")
    Signed-off-by: John Fastabend <john.fastabend@gmail.com>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
    Link: https://lore.kernel.org/bpf/20230523025618.113937-9-john.fastabend@gmail.com
    Signed-off-by: Sasha Levin <sashal@kernel.org>
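
    A sketch of the accounting move; the helper name mirrors the
    tcp_psock_eat() idea from the text but is illustrative here:
    
      #include <net/tcp.h>
    
      /* Advance copied_seq only when data actually reaches the user
       * (i.e. from the recvmsg path), so FIONREAD derived from it
       * stays accurate and rcv space tracking sees real copies. */
      static void example_tcp_eat(struct sock *sk, int copied)
      {
              struct tcp_sock *tp = tcp_sk(sk);
    
              WRITE_ONCE(tp->copied_seq, tp->copied_seq + copied);
              tcp_rcv_space_adjust(sk);
      }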

bpf, sockmap: Pass skb ownership through read_skb [+ + +]
Author: John Fastabend <john.fastabend@gmail.com>
Date:   Mon May 22 19:56:05 2023 -0700

    bpf, sockmap: Pass skb ownership through read_skb
    
    [ Upstream commit 78fa0d61d97a728d306b0c23d353c0e340756437 ]
    
    The read_skb hook calls consume_skb() now, but this means that if the
    recv_actor program wants to use the skb it needs to inc the ref cnt
    so that the consume_skb() doesn't kfree the sk_buff.
    
    This is problematic because in some error cases under memory pressure
    we may need to linearize the sk_buff from sk_psock_skb_ingress_enqueue().
    Then we get this,
    
     skb_linearize()
       __pskb_pull_tail()
         pskb_expand_head()
           BUG_ON(skb_shared(skb))
    
    Because we incremented the users refcnt from sk_psock_verdict_recv() we
    hit the BUG_ON() with refcnt > 1 and trip it.
    
    To fix this, let's simply pass ownership of the sk_buff through the
    read_skb call. Then we can drop the consume from the read_skb handlers
    and assume the verdict recv does any required kfree.
    
    Bug found while testing in our CI which runs in VMs that hit memory
    constraints rather regularly. William tested TCP read_skb handlers.
    
    [  106.536188] ------------[ cut here ]------------
    [  106.536197] kernel BUG at net/core/skbuff.c:1693!
    [  106.536479] invalid opcode: 0000 [#1] PREEMPT SMP PTI
    [  106.536726] CPU: 3 PID: 1495 Comm: curl Not tainted 5.19.0-rc5 #1
    [  106.537023] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ArchLinux 1.16.0-1 04/01/2014
    [  106.537467] RIP: 0010:pskb_expand_head+0x269/0x330
    [  106.538585] RSP: 0018:ffffc90000138b68 EFLAGS: 00010202
    [  106.538839] RAX: 000000000000003f RBX: ffff8881048940e8 RCX: 0000000000000a20
    [  106.539186] RDX: 0000000000000002 RSI: 0000000000000000 RDI: ffff8881048940e8
    [  106.539529] RBP: ffffc90000138be8 R08: 00000000e161fd1a R09: 0000000000000000
    [  106.539877] R10: 0000000000000018 R11: 0000000000000000 R12: ffff8881048940e8
    [  106.540222] R13: 0000000000000003 R14: 0000000000000000 R15: ffff8881048940e8
    [  106.540568] FS:  00007f277dde9f00(0000) GS:ffff88813bd80000(0000) knlGS:0000000000000000
    [  106.540954] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [  106.541227] CR2: 00007f277eeede64 CR3: 000000000ad3e000 CR4: 00000000000006e0
    [  106.541569] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    [  106.541915] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    [  106.542255] Call Trace:
    [  106.542383]  <IRQ>
    [  106.542487]  __pskb_pull_tail+0x4b/0x3e0
    [  106.542681]  skb_ensure_writable+0x85/0xa0
    [  106.542882]  sk_skb_pull_data+0x18/0x20
    [  106.543084]  bpf_prog_b517a65a242018b0_bpf_skskb_http_verdict+0x3a9/0x4aa9
    [  106.543536]  ? migrate_disable+0x66/0x80
    [  106.543871]  sk_psock_verdict_recv+0xe2/0x310
    [  106.544258]  ? sk_psock_write_space+0x1f0/0x1f0
    [  106.544561]  tcp_read_skb+0x7b/0x120
    [  106.544740]  tcp_data_queue+0x904/0xee0
    [  106.544931]  tcp_rcv_established+0x212/0x7c0
    [  106.545142]  tcp_v4_do_rcv+0x174/0x2a0
    [  106.545326]  tcp_v4_rcv+0xe70/0xf60
    [  106.545500]  ip_protocol_deliver_rcu+0x48/0x290
    [  106.545744]  ip_local_deliver_finish+0xa7/0x150
    
    Fixes: 04919bed948dc ("tcp: Introduce tcp_read_skb()")
    Reported-by: William Findlay <will@isovalent.com>
    Signed-off-by: John Fastabend <john.fastabend@gmail.com>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Tested-by: William Findlay <will@isovalent.com>
    Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
    Link: https://lore.kernel.org/bpf/20230523025618.113937-2-john.fastabend@gmail.com
    Signed-off-by: Sasha Levin <sashal@kernel.org>
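
    A sketch of the ownership handoff (hypothetical dequeue helper):
    read_skb stops consuming the skb itself, so the reference count stays
    at one and skb_linearize() no longer trips BUG_ON(skb_shared(skb)):
    
      #include <net/sock.h>
    
      struct sk_buff *example_dequeue(struct sock *sk); /* hypothetical */
    
      static int example_read_skb(struct sock *sk,
                                  int (*recv_actor)(struct sock *,
                                                    struct sk_buff *))
      {
              struct sk_buff *skb = example_dequeue(sk);
    
              if (!skb)
                      return 0;
              /* Ownership moves to recv_actor: no consume_skb() here,
               * the verdict path frees or queues the skb itself. */
              return recv_actor(sk, skb);
      }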

bpf, sockmap: Reschedule is now done through backlog [+ + +]
Author: John Fastabend <john.fastabend@gmail.com>
Date:   Mon May 22 19:56:07 2023 -0700

    bpf, sockmap: Reschedule is now done through backlog
    
    [ Upstream commit bce22552f92ea7c577f49839b8e8f7d29afaf880 ]
    
    Now that the backlog manages the reschedule() logic correctly we can drop
    the partial fix to reschedule from the recvmsg hook.
    
    Rescheduling on the recvmsg hook was added to address a corner case where
    we still had data in the backlog state but had nothing to kick it and
    reschedule the backlog worker to run and finish copying data out of the
    state. This had a couple of limitations: first, it required user space to
    kick it, introducing an unnecessary EBUSY and retry. Second, it only
    handled the ingress case; egress redirects would still be hung.
    
    With the correct fix, pushing the reschedule logic down to where the
    enomem error occurs, we can drop this fix.
    
    Fixes: bec217197b412 ("skmsg: Schedule psock work if the cached skb exists on the psock")
    Signed-off-by: John Fastabend <john.fastabend@gmail.com>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
    Link: https://lore.kernel.org/bpf/20230523025618.113937-4-john.fastabend@gmail.com
    Signed-off-by: Sasha Levin <sashal@kernel.org>

bpf, sockmap: TCP data stall on recv before accept [+ + +]
Author: John Fastabend <john.fastabend@gmail.com>
Date:   Mon May 22 19:56:10 2023 -0700

    bpf, sockmap: TCP data stall on recv before accept
    
    [ Upstream commit ea444185a6bf7da4dd0df1598ee953e4f7174858 ]
    
    A common mechanism to put a TCP socket into the sockmap is to hook the
    BPF_SOCK_OPS_{ACTIVE_PASSIVE}_ESTABLISHED_CB event with a BPF program
    that can map the socket info to the correct BPF verdict parser. When
    the user adds the socket to the map the psock is created and the new
    ops are assigned to ensure the verdict program will 'see' the sk_buffs
    as they arrive.
    
    Part of this process hooks the sk_data_ready op with a BPF specific
    handler to wake up the BPF verdict program when data is ready to read.
    The logic is simple enough (posted here for easy reading)
    
     static void sk_psock_verdict_data_ready(struct sock *sk)
     {
            struct socket *sock = sk->sk_socket;
    
            if (unlikely(!sock || !sock->ops || !sock->ops->read_skb))
                    return;
            sock->ops->read_skb(sk, sk_psock_verdict_recv);
     }
    
    The oversight here is that sk->sk_socket is not assigned until the
    application accept()s the new socket. However, it's entirely OK for the
    peer application to do a connect() followed immediately by sends. The
    socket on the receiver is sitting on the backlog queue of the listening
    socket until it's accepted and the data is queued up. If the socket is
    never accepted, or the application is slow to accept, the session will
    eventually hit data limits and be rate limited. But, importantly for BPF
    sockmap hooks, when this data is received the TCP stack does the
    sk_data_ready() call, but the read_skb() for this data is never called
    because sk_socket is missing. The data sits on the sk_receive_queue.
    
    Then once the socket is accepted, if we never receive more data from the
    peer there will be no further sk_data_ready calls and all the data
    is still on the sk_receive_queue(). Then the user calls recvmsg after
    accept(), and for TCP sockets in sockmap we use the
    tcp_bpf_recvmsg_parser() handler. The handler checks for data in the
    sk_msg ingress queue, expecting that the BPF program has already run from
    the sk_data_ready hook and enqueued the data as needed. So we are stuck.
    
    To fix this, do an unlikely check in the recvmsg handler for data on the
    sk_receive_queue, and if it exists wake up data_ready. We have the sock
    locked in both read_skb and recvmsg, so this should avoid having multiple
    runners.
    
    Fixes: 04919bed948dc ("tcp: Introduce tcp_read_skb()")
    Signed-off-by: John Fastabend <john.fastabend@gmail.com>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
    Link: https://lore.kernel.org/bpf/20230523025618.113937-7-john.fastabend@gmail.com
    Signed-off-by: Sasha Levin <sashal@kernel.org>
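
    A sketch of the recvmsg-side kick (illustrative placement): if data
    arrived before accept(), re-run sk_data_ready so the verdict program
    finally sees it:
    
      #include <net/sock.h>
    
      static void example_recvmsg_prologue(struct sock *sk)
      {
              /* Data queued before accept() never triggered read_skb()
               * because sk_socket was still unset; replay the wakeup. */
              if (unlikely(!skb_queue_empty_lockless(&sk->sk_receive_queue)))
                      sk->sk_data_ready(sk);
      }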

bpf, sockmap: Wake up polling after data copy [+ + +]
Author: John Fastabend <john.fastabend@gmail.com>
Date:   Mon May 22 19:56:11 2023 -0700

    bpf, sockmap: Wake up polling after data copy
    
    [ Upstream commit 6df7f764cd3cf5a03a4a47b23be47e57e41fcd85 ]
    
    When the TCP stack has data ready to read, sk_data_ready() is called.
    Sockmap overwrites this with its own handler to call into the BPF verdict
    program. But the original TCP socket had sock_def_readable, which would
    additionally wake up any user space waiters with sk_wake_async().
    
    Sockmap saved the callback when the socket was created, so call the saved
    data ready callback and then we can wake up any epoll() logic waiting
    on the read.
    
    Note we gate the call on 'copied >= 0' to account for returning 0 when a
    FIN is received, because we need to wake up the user for this as well so
    they can do the recvmsg() -> 0 and detect the shutdown.
    
    Fixes: 04919bed948dc ("tcp: Introduce tcp_read_skb()")
    Signed-off-by: John Fastabend <john.fastabend@gmail.com>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
    Link: https://lore.kernel.org/bpf/20230523025618.113937-8-john.fastabend@gmail.com
    Signed-off-by: Sasha Levin <sashal@kernel.org>
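
    A sketch of calling the saved callback (illustrative names);
    sock_def_readable internally does the sk_wake_async() work that epoll
    waiters rely on:
    
      #include <net/sock.h>
    
      static void example_verdict_data_ready(struct sock *sk, int copied,
                                             void (*saved)(struct sock *))
      {
              /* copied == 0 means FIN: the user must still be woken so
               * recvmsg() can return 0 and report the shutdown. */
              if (copied >= 0 && saved)
                      saved(sk);
      }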

 
bpf: netdev: init the offload table earlier [+ + +]
Author: Jakub Kicinski <kuba@kernel.org>
Date:   Fri May 5 14:58:36 2023 -0700

    bpf: netdev: init the offload table earlier
    
    [ Upstream commit e1505c1cc8d527fcc5bcaf9c1ad82eed817e3e10 ]
    
    Some netdevices may get unregistered before late_initcall(), so we
    have to move the hashtable init earlier.
    
    Fixes: f1fc43d03946 ("bpf: Move offload initialization into late_initcall")
    Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217399
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Acked-by: Stanislav Fomichev <sdf@google.com>
    Link: https://lore.kernel.org/r/20230505215836.491485-1-kuba@kernel.org
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
coresight: perf: Release Coresight path when alloc trace id failed [+ + +]
Author: Ruidong Tian <tianruidong@linux.alibaba.com>
Date:   Tue Apr 25 11:24:16 2023 +0800

    coresight: perf: Release Coresight path when alloc trace id failed
    
    [ Upstream commit 04ac7f98b92181179ea84439642493f3826d04a2 ]
    
    The error handler for etm_setup_aux() cannot release the coresight path
    because the cpu mask was cleared when coresight_trace_id_get_cpu_id()
    failed.
    
    Call the coresight_release_path() function explicitly when trace id
    allocation fails.
    
    Fixes: 4ff1fdb4125c4 ("coresight: perf: traceid: Add perf ID allocation and notifiers")
    Signed-off-by: Ruidong Tian <tianruidong@linux.alibaba.com>
    Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
    Link: https://lore.kernel.org/r/20230425032416.125542-1-tianruidong@linux.alibaba.com
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
cpufreq: amd-pstate: Add ->fast_switch() callback [+ + +]
Author: Gautham R. Shenoy <gautham.shenoy@amd.com>
Date:   Wed May 17 16:28:15 2023 +0000

    cpufreq: amd-pstate: Add ->fast_switch() callback
    
    commit 4badf2eb1e986bdbf34dd2f5d4c979553a86fe54 upstream.
    
    Schedutil normally calls the adjust_perf callback for drivers with the
    adjust_perf callback available and the fast_switch_possible flag set.
    However, when frequency invariance is disabled, schedutil falls back to
    invoking fast_switch. So there is a chance of a kernel crash if this
    function pointer is not set. To protect against this scenario, add a
    fast_switch callback to the amd_pstate driver.
    
    Fixes: 1d215f0319c2 ("cpufreq: amd-pstate: Add fast switch function for AMD P-State")
    Signed-off-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
    Signed-off-by: Wyes Karny <wyes.karny@amd.com>
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
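
    A sketch of the driver-side change, assuming hypothetical callback
    implementations; both hooks live in struct cpufreq_driver, and schedutil
    picks between them based on frequency invariance:
    
      #include <linux/cpufreq.h>
    
      static unsigned int example_fast_switch(struct cpufreq_policy *policy,
                                              unsigned int target_freq)
      {
              /* program the hardware synchronously and report the
               * frequency actually granted */
              return target_freq;
      }
    
      static void example_adjust_perf(unsigned int cpu,
                                      unsigned long min_perf,
                                      unsigned long target_perf,
                                      unsigned long capacity)
      {
              /* perf-based fast path used when invariance is on */
      }
    
      static struct cpufreq_driver example_driver = {
              .adjust_perf = example_adjust_perf,
              .fast_switch = example_fast_switch, /* the added hook */
      };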

cpufreq: amd-pstate: Remove fast_switch_possible flag from active driver [+ + +]
Author: Wyes Karny <wyes.karny@amd.com>
Date:   Wed May 17 16:28:16 2023 +0000

    cpufreq: amd-pstate: Remove fast_switch_possible flag from active driver
    
    [ Upstream commit 249b62c448de7117c18531d626aed6e153cdfd75 ]
    
    The amd_pstate active mode driver is only compatible with static
    governors. Therefore it doesn't need the fast_switch functionality.
    Remove the fast_switch_possible flag from the amd_pstate active mode
    driver.
    
    Fixes: ffa5096a7c33 ("cpufreq: amd-pstate: implement Pstate EPP support for the AMD processors")
    Signed-off-by: Wyes Karny <wyes.karny@amd.com>
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

cpufreq: amd-pstate: Update policy->cur in amd_pstate_adjust_perf() [+ + +]
Author: Wyes Karny <wyes.karny@amd.com>
Date:   Thu May 18 05:58:19 2023 +0000

    cpufreq: amd-pstate: Update policy->cur in amd_pstate_adjust_perf()
    
    commit 3bf8c6307bad5c0cc09cde982e146d847859b651 upstream.
    
    The driver should update policy->cur after updating the frequency.
    Currently amd_pstate doesn't update policy->cur when `adjust_perf`
    is used, which causes /proc/cpuinfo to show the wrong cpu frequency.
    Fix this by updating policy->cur with the correct frequency value in
    the adjust_perf function callback.
    
    - Before the fix: (setting min freq to 1.5 GHz)
    
    [root@amd]# cat /proc/cpuinfo | grep "cpu MHz" | sort | uniq --count
          1 cpu MHz         : 1777.016
          1 cpu MHz         : 1797.160
          1 cpu MHz         : 1797.270
        189 cpu MHz         : 400.000
    
    - After the fix: (setting min freq to 1.5 GHz)
    
    [root@amd]# cat /proc/cpuinfo | grep "cpu MHz" | sort | uniq --count
          1 cpu MHz         : 1753.353
          1 cpu MHz         : 1756.838
          1 cpu MHz         : 1776.466
          1 cpu MHz         : 1776.873
          1 cpu MHz         : 1777.308
          1 cpu MHz         : 1779.900
        183 cpu MHz         : 1805.231
          1 cpu MHz         : 1956.815
          1 cpu MHz         : 2246.203
          1 cpu MHz         : 2259.984
    
    Fixes: 1d215f0319c2 ("cpufreq: amd-pstate: Add fast switch function for AMD P-State")
    Signed-off-by: Wyes Karny <wyes.karny@amd.com>
    [ rjw: Subject edits ]
    Cc: 5.17+ <stable@vger.kernel.org> # 5.17+
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
cxl/port: Fix NULL pointer access in devm_cxl_add_port() [+ + +]
Author: Robert Richter <rrichter@amd.com>
Date:   Fri May 19 23:54:35 2023 +0200

    cxl/port: Fix NULL pointer access in devm_cxl_add_port()
    
    [ Upstream commit a70fc4ed20a6118837b0aecbbf789074935f473b ]
    
    In devm_cxl_add_port() the port creation may fail and its associated
    pointer does not contain a valid address. During error message
    generation this invalid port address is used. Fix that wrong address
    access.
    
    Fixes: f3cd264c4ec1 ("cxl: Unify debug messages when calling devm_cxl_add_port()")
    Signed-off-by: Robert Richter <rrichter@amd.com>
    Reviewed-by: Dave Jiang <dave.jiang@intel.com>
    Link: https://lore.kernel.org/r/20230519215436.3394532-1-rrichter@amd.com
    Signed-off-by: Dan Williams <dan.j.williams@intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
drm/i915: Disable DPLLs before disconnecting the TC PHY [+ + +]
Author: Imre Deak <imre.deak@intel.com>
Date:   Thu Mar 23 16:20:33 2023 +0200

    drm/i915: Disable DPLLs before disconnecting the TC PHY
    
    [ Upstream commit b108bdd0e22a402bd3e4a6391acbb6aefad31a9e ]
    
    Bspec requires disabling the DPLLs on TC ports before disconnecting the
    port's PHY. Add a post_pll_disable encoder hook and move the call to
    disconnect the port's PHY from the post_disable hook to the new hook.
    
    Reviewed-by: Mika Kahola <mika.kahola@intel.com>
    Signed-off-by: Imre Deak <imre.deak@intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20230323142035.1432621-28-imre.deak@intel.com
    Stable-dep-of: 45dfbd992923 ("drm/i915: Fix PIPEDMC disabling for a bigjoiner configuration")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/i915: Fix PIPEDMC disabling for a bigjoiner configuration [+ + +]
Author: Imre Deak <imre.deak@intel.com>
Date:   Wed May 10 13:31:18 2023 +0300

    drm/i915: Fix PIPEDMC disabling for a bigjoiner configuration
    
    [ Upstream commit 45dfbd992923f4df174db4e23b96fca7e30d73e2 ]
    
    For a bigjoiner configuration display->crtc_disable() will be called
    first for the slave CRTCs and then for the master CRTC. However, slave
    CRTCs will actually be disabled only after the master CRTC is disabled
    (from the encoder disable hooks called with the master CRTC state).
    Hence the slave PIPEDMCs can be disabled only after the master CRTC is
    disabled; make this so.
    
    intel_encoders_post_pll_disable() must be called only for the master
    CRTC, as for the other two encoder disable hooks. While at it, fix this
    up as well. This didn't cause a problem, since
    intel_encoders_post_pll_disable() will call the corresponding hook only
    for an encoder/connector connected to the given CRTC, and slave
    CRTCs will have no associated encoder/connector.
    
    Fixes: 3af2ff0840be ("drm/i915: Enable a PIPEDMC whenever its corresponding pipe is enabled")
    Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
    Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
    Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
    Signed-off-by: Imre Deak <imre.deak@intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20230510103131.1618266-2-imre.deak@intel.com
    (cherry picked from commit 7eeef32719f6af935a1554813e6bc206446339cd)
    Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/i915: Move shared DPLL disabling into CRTC disable hook [+ + +]
Author: Imre Deak <imre.deak@intel.com>
Date:   Thu Mar 23 16:20:32 2023 +0200

    drm/i915: Move shared DPLL disabling into CRTC disable hook
    
    [ Upstream commit 3acac2d06a7e0f0b182b86b25bb8a2e9b3300406 ]
    
    The spec requires disabling the PLL on TC ports before disconnecting the
    port's PHY. Prepare for that by moving the PLL disabling to the CRTC
    disable hook, while disconnecting the PHY will be moved to the
    post_pll_disable() encoder hook in the next patch.
    
    v2: Move the call from intel_crtc_disable_noatomic() as well.
    
    Reviewed-by: Mika Kahola <mika.kahola@intel.com> # v1
    Signed-off-by: Imre Deak <imre.deak@intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20230323142035.1432621-27-imre.deak@intel.com
    Stable-dep-of: 45dfbd992923 ("drm/i915: Fix PIPEDMC disabling for a bigjoiner configuration")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
firmware: arm_ffa: Fix usage of partition info get count flag [+ + +]
Author: Sudeep Holla <sudeep.holla@arm.com>
Date:   Thu Apr 20 16:06:02 2023 +0100

    firmware: arm_ffa: Fix usage of partition info get count flag
    
    [ Upstream commit c6e045361a27ecd4fac6413164e0d091d80eee99 ]
    
    Commit bb1be7498500 ("firmware: arm_ffa: Add v1.1 get_partition_info support")
    adds support to discover the UUIDs of the partitions, or to just fetch
    the partition count, using the PARTITION_INFO_GET_RETURN_COUNT_ONLY flag.
    
    However, the commit doesn't handle the fact that an older version doesn't
    understand the flag, which must be MBZ there; this results in the firmware
    returning an invalid parameter error, which in turn makes the driver
    probe fail. That is incorrect.
    
    Limit the usage of the PARTITION_INFO_GET_RETURN_COUNT_ONLY flag to
    versions above v1.0 (i.e. v1.1 and onwards), which fixes the issue.
    
    Fixes: bb1be7498500 ("firmware: arm_ffa: Add v1.1 get_partition_info support")
    Reported-by: Jens Wiklander <jens.wiklander@linaro.org>
    Reported-by: Marc Bonnici <marc.bonnici@arm.com>
    Tested-by: Jens Wiklander <jens.wiklander@linaro.org>
    Reviewed-by: Jens Wiklander <jens.wiklander@linaro.org>
    Link: https://lore.kernel.org/r/20230419-ffa_fixes_6-4-v2-2-d9108e43a176@arm.com
    Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
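
    A sketch of the version gate with local stand-in names (the real driver
    keeps its own defines):
    
      #include <linux/bits.h>
      #include <linux/types.h>
    
      #define EXAMPLE_RETURN_COUNT_ONLY BIT(0)
    
      static u32 example_part_info_flags(u16 major, u16 minor)
      {
              /* v1.0 treats the flags field as MBZ; only v1.1 and
               * onwards understand the count-only request */
              if (major == 1 && minor == 0)
                      return 0;
              return EXAMPLE_RETURN_COUNT_ONLY;
      }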

firmware: arm_scmi: Fix incorrect alloc_workqueue() invocation [+ + +]
Author: Tejun Heo <tj@kernel.org>
Date:   Thu Apr 20 09:33:49 2023 -1000

    firmware: arm_scmi: Fix incorrect alloc_workqueue() invocation
    
    [ Upstream commit 44e8d5ad2dc01529eb1316b1521f24ac4aac8eaf ]
    
    scmi_xfer_raw_worker_init() is specifying a flag, WQ_SYSFS, as @max_active.
    Fix it by OR'ing WQ_SYSFS into @flags, so that it actually enables the
    sysfs interface, and by using 0 for @max_active to get the default
    setting.
    
    Signed-off-by: Tejun Heo <tj@kernel.org>
    Fixes: 3c3d818a9317 ("firmware: arm_scmi: Add core raw transmission support")
    Link: https://lore.kernel.org/r/ZEGTnajiQm7mkkZS@slm.duckdns.org
    Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
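
    A before/after sketch of the call; alloc_workqueue() takes
    (name_fmt, flags, max_active, ...), so a flag in the third slot silently
    becomes a bogus concurrency limit:
    
      #include <linux/errno.h>
      #include <linux/workqueue.h>
    
      static struct workqueue_struct *example_wq;
    
      static int example_init(void)
      {
              /* Wrong: WQ_SYSFS lands in @max_active, @flags is 0:
               *   example_wq = alloc_workqueue("example-wq", 0, WQ_SYSFS);
               * Right: WQ_SYSFS OR'ed into @flags, 0 = default limit.
               */
              example_wq = alloc_workqueue("example-wq", WQ_SYSFS, 0);
              return example_wq ? 0 : -ENOMEM;
      }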
 
gpio-f7188x: fix chip name and pin count on Nuvoton chip [+ + +]
Author: Henning Schild <henning.schild@siemens.com>
Date:   Thu Apr 27 17:20:55 2023 +0200

    gpio-f7188x: fix chip name and pin count on Nuvoton chip
    
    [ Upstream commit 3002b8642f016d7fe3ff56240dacea1075f6b877 ]
    
    In fact, the device with chip id 0xD283 is called NCT6126D, and that is
    the chip id the Nuvoton code was written for. Correct that name to avoid
    confusion, because an NCT6116D does in fact exist as well, but it has
    another chip id and is currently not supported.
    
    A look at the spec also revealed that GPIO group7 in fact has 8 pins,
    so correct the pin count in that group as well.
    
    Fixes: d0918a84aff0 ("gpio-f7188x: Add GPIO support for Nuvoton NCT6116")
    Reported-by: Xing Tong Wu <xingtong.wu@siemens.com>
    Signed-off-by: Henning Schild <henning.schild@siemens.com>
    Acked-by: Simon Guinot <simon.guinot@sequanux.org>
    Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
gpiolib: fix allocation of mixed dynamic/static GPIOs [+ + +]
Author: Andreas Kemnade <andreas@kemnade.info>
Date:   Thu May 4 08:04:21 2023 +0200

    gpiolib: fix allocation of mixed dynamic/static GPIOs
    
    [ Upstream commit 7dd3d9bd873f138675cb727eaa51a498d99f0e89 ]
    
    If both statically and dynamically allocated GPIOs are present, dynamic
    allocation pollutes the numberspace used by static allocation, causing
    static allocation to fail.
    Enforce dynamic allocation above GPIO_DYNAMIC_BASE.
    
    Seen on a GTA04 when omap-gpio (static) and twl-gpio (dynamic)
    raced:
    [some successful registrations of omap_gpio instances]
    [    2.553833] twl4030_gpio twl4030-gpio: gpio (irq 145) chaining IRQs 161..178
    [    2.561401] gpiochip_find_base: found new base at 160
    [    2.564392] gpio gpiochip5: (twl4030): added GPIO chardev (254:5)
    [    2.564544] gpio gpiochip5: registered GPIOs 160 to 177 on twl4030
    [...]
    [    2.692169] omap-gpmc 6e000000.gpmc: GPMC revision 5.0
    [    2.697357] gpmc_mem_init: disabling cs 0 mapped at 0x0-0x1000000
    [    2.703643] gpiochip_find_base: found new base at 178
    [    2.704376] gpio gpiochip6: (omap-gpmc): added GPIO chardev (254:6)
    [    2.704589] gpio gpiochip6: registered GPIOs 178 to 181 on omap-gpmc
    [...]
    [    2.840393] gpio gpiochip7: Static allocation of GPIO base is deprecated, use dynamic allocation.
    [    2.849365] gpio gpiochip7: (gpio-160-191): GPIO integer space overlap, cannot add chip
    [    2.857513] gpiochip_add_data_with_key: GPIOs 160..191 (gpio-160-191) failed to register, -16
    [    2.866149] omap_gpio 48310000.gpio: error -EBUSY: Could not register gpio chip
    
    On that device it is fixed invasively by
    commit 92bf78b33b0b4 ("gpio: omap: use dynamic allocation of base")
    but let's also fix that for devices where there is still
    a mixture of static and dynamic allocation.
    
    Fixes: 7b61212f2a07 ("gpiolib: Get rid of ARCH_NR_GPIOS")
    Signed-off-by: Andreas Kemnade <andreas@kemnade.info>
    Reviewed-by: <christophe.leroy@csgroup.eu>
    Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
    Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
    Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
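
    A sketch of the enforcement with a local stand-in for GPIO_DYNAMIC_BASE
    (512 in current gpiolib) and a hypothetical range-search helper:
    
      #define EXAMPLE_DYNAMIC_BASE 512  /* stand-in for GPIO_DYNAMIC_BASE */
    
      int example_first_free_range(int start, int ngpio); /* hypothetical */
    
      static int example_find_new_base(int start, int ngpio)
      {
              /* Dynamic bases never drop below the boundary, leaving
               * 0..511 to legacy static registrations like omap-gpio. */
              if (start < EXAMPLE_DYNAMIC_BASE)
                      start = EXAMPLE_DYNAMIC_BASE;
              return example_first_free_range(start, ngpio);
      }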

 
Linux: Linux 6.3.6 [+ + +]
Author: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Date:   Mon Jun 5 09:29:47 2023 +0200

    Linux 6.3.6
    
    Link: https://lore.kernel.org/r/20230601131938.702671708@linuxfoundation.org
    Tested-by: Ronald Warsow <rwarsow@gmx.de>
    Tested-by: Shuah Khan <skhan@linuxfoundation.org>
    Tested-by: Florian Fainelli <florian.fainelli@broadcom.com>
    Tested-by: Ron Economos <re@w6rz.net>
    Tested-by: Conor Dooley <conor.dooley@microchip.com>
    Tested-by: Jon Hunter <jonathanh@nvidia.com>
    Tested-by: Bagas Sanjaya <bagasdotme@gmail.com>
    Tested-by: Linux Kernel Functional Testing <lkft@linaro.org>
    Tested-by: Markus Reichelt <lkt+2023@mareichelt.com>
    Tested-by: Justin M. Forbes <jforbes@fedoraproject.org>
    Tested-by: Guenter Roeck <linux@roeck-us.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

 
net/mlx5: E-switch, Devcom, sync devcom events and devcom comp register [+ + +]
Author: Shay Drory <shayd@nvidia.com>
Date:   Mon Feb 6 11:52:02 2023 +0200

    net/mlx5: E-switch, Devcom, sync devcom events and devcom comp register
    
    [ Upstream commit 8c253dfc89efde6b5faddf9e7400e5d17884e042 ]
    
    devcom events are sent to all registered components. Following the
    cited patch, it is possible for two components, e.g. two eswitches,
    to send devcom events while both components are registered. This
    means the eswitch layer will do double un/pairing, which is double
    allocation and freeing of resources, even though only one un/pairing
    is needed. Flow example:
    
            cpu0                                    cpu1
            ----                                    ----
    
     mlx5_devlink_eswitch_mode_set(dev0)
      esw_offloads_devcom_init()
       mlx5_devcom_register_component(esw0)
                                             mlx5_devlink_eswitch_mode_set(dev1)
                                              esw_offloads_devcom_init()
                                               mlx5_devcom_register_component(esw1)
                                               mlx5_devcom_send_event()
       mlx5_devcom_send_event()
    
    Hence, check whether the eswitches are already un/paired before
    freeing/allocating resources.
    
    Fixes: 09b278462f16 ("net: devlink: enable parallel ops on netlink interface")
    Signed-off-by: Shay Drory <shayd@nvidia.com>
    Reviewed-by: Mark Bloch <mbloch@nvidia.com>
    Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
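
    A sketch of the guard (illustrative field): tracking pairing state makes
    the second, duplicate event a no-op:
    
      struct example_esw {
              bool paired;    /* illustrative pairing state */
      };
    
      static int example_devcom_event(struct example_esw *esw, bool pair)
      {
              /* A duplicate PAIR/UNPAIR from the other eswitch must not
               * allocate or free the shared resources a second time. */
              if (esw->paired == pair)
                      return 0;
              esw->paired = pair;
              /* ... do the actual un/pairing work ... */
              return 0;
      }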

 
net/mlx5e: TC, Fix using eswitch mapping in nic mode [+ + +]
Author: Paul Blakey <paulb@nvidia.com>
Date:   Wed Apr 26 16:04:48 2023 +0300

    net/mlx5e: TC, Fix using eswitch mapping in nic mode
    
    [ Upstream commit dfa1e46d6093831b9d49f0f350227a1d13644a2f ]
    
    The cited patch uses the eswitch object mapping pool while in nic mode,
    where it isn't initialized. This results in the trace below [0].
    
    Fix that by using either the nic or the eswitch object mapping pool,
    depending on whether eswitch is enabled.
    
    [0]:
    [  826.446057] ==================================================================
    [  826.446729] BUG: KASAN: slab-use-after-free in mlx5_add_flow_rules+0x30/0x490 [mlx5_core]
    [  826.447515] Read of size 8 at addr ffff888194485830 by task tc/6233
    
    [  826.448243] CPU: 16 PID: 6233 Comm: tc Tainted: G        W          6.3.0-rc6+ #1
    [  826.448890] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
    [  826.449785] Call Trace:
    [  826.450052]  <TASK>
    [  826.450302]  dump_stack_lvl+0x33/0x50
    [  826.450650]  print_report+0xc2/0x610
    [  826.450998]  ? __virt_addr_valid+0xb1/0x130
    [  826.451385]  ? mlx5_add_flow_rules+0x30/0x490 [mlx5_core]
    [  826.451935]  kasan_report+0xae/0xe0
    [  826.452276]  ? mlx5_add_flow_rules+0x30/0x490 [mlx5_core]
    [  826.452829]  mlx5_add_flow_rules+0x30/0x490 [mlx5_core]
    [  826.453368]  ? __kmalloc_node+0x5a/0x120
    [  826.453733]  esw_add_restore_rule+0x20f/0x270 [mlx5_core]
    [  826.454288]  ? mlx5_eswitch_add_send_to_vport_meta_rule+0x260/0x260 [mlx5_core]
    [  826.455011]  ? mutex_unlock+0x80/0xd0
    [  826.455361]  ? __mutex_unlock_slowpath.constprop.0+0x210/0x210
    [  826.455862]  ? mapping_add+0x2cb/0x440 [mlx5_core]
    [  826.456425]  mlx5e_tc_action_miss_mapping_get+0x139/0x180 [mlx5_core]
    [  826.457058]  ? mlx5e_tc_update_skb_nic+0xb0/0xb0 [mlx5_core]
    [  826.457636]  ? __kasan_kmalloc+0x77/0x90
    [  826.458000]  ? __kmalloc+0x57/0x120
    [  826.458336]  mlx5_tc_ct_flow_offload+0x325/0xe40 [mlx5_core]
    [  826.458916]  ? ct_kernel_enter.constprop.0+0x48/0xa0
    [  826.459360]  ? mlx5_tc_ct_parse_action+0xf0/0xf0 [mlx5_core]
    [  826.459933]  ? mlx5e_mod_hdr_attach+0x491/0x520 [mlx5_core]
    [  826.460507]  ? mlx5e_mod_hdr_get+0x12/0x20 [mlx5_core]
    [  826.461046]  ? mlx5e_tc_attach_mod_hdr+0x154/0x170 [mlx5_core]
    [  826.461635]  mlx5e_configure_flower+0x969/0x2110 [mlx5_core]
    [  826.462217]  ? _raw_spin_lock_bh+0x85/0xe0
    [  826.462597]  ? __mlx5e_add_fdb_flow+0x750/0x750 [mlx5_core]
    [  826.463163]  ? kasan_save_stack+0x2e/0x40
    [  826.463534]  ? down_read+0x115/0x1b0
    [  826.463878]  ? down_write_killable+0x110/0x110
    [  826.464288]  ? tc_setup_action.part.0+0x9f/0x3b0
    [  826.464701]  ? mlx5e_is_uplink_rep+0x4c/0x90 [mlx5_core]
    [  826.465253]  ? mlx5e_tc_reoffload_flows_work+0x130/0x130 [mlx5_core]
    [  826.465878]  tc_setup_cb_add+0x112/0x250
    [  826.466247]  fl_hw_replace_filter+0x230/0x310 [cls_flower]
    [  826.466724]  ? fl_hw_destroy_filter+0x1a0/0x1a0 [cls_flower]
    [  826.467212]  fl_change+0x14e1/0x2030 [cls_flower]
    [  826.467636]  ? sock_def_readable+0x89/0x120
    [  826.468019]  ? fl_tmplt_create+0x2d0/0x2d0 [cls_flower]
    [  826.468509]  ? kasan_unpoison+0x23/0x50
    [  826.468873]  ? get_random_u16+0x180/0x180
    [  826.469244]  ? __radix_tree_lookup+0x2b/0x130
    [  826.469640]  ? fl_get+0x7b/0x140 [cls_flower]
    [  826.470042]  ? fl_mask_put+0x200/0x200 [cls_flower]
    [  826.470478]  ? __mutex_unlock_slowpath.constprop.0+0x210/0x210
    [  826.470973]  ? fl_tmplt_create+0x2d0/0x2d0 [cls_flower]
    [  826.471427]  tc_new_tfilter+0x644/0x1050
    [  826.471795]  ? tc_get_tfilter+0x860/0x860
    [  826.472170]  ? __thaw_task+0x130/0x130
    [  826.472525]  ? arch_stack_walk+0x98/0xf0
    [  826.472892]  ? cap_capable+0x9f/0xd0
    [  826.473235]  ? security_capable+0x47/0x60
    [  826.473608]  rtnetlink_rcv_msg+0x1d5/0x550
    [  826.473985]  ? rtnl_calcit.isra.0+0x1f0/0x1f0
    [  826.474383]  ? __stack_depot_save+0x35/0x4c0
    [  826.474779]  ? kasan_save_stack+0x2e/0x40
    [  826.475149]  ? kasan_save_stack+0x1e/0x40
    [  826.475518]  ? __kasan_record_aux_stack+0x9f/0xb0
    [  826.475939]  ? task_work_add+0x77/0x1c0
    [  826.476305]  netlink_rcv_skb+0xe0/0x210
    [  826.476661]  ? rtnl_calcit.isra.0+0x1f0/0x1f0
    [  826.477057]  ? netlink_ack+0x7c0/0x7c0
    [  826.477412]  ? rhashtable_jhash2+0xef/0x150
    [  826.477796]  ? _copy_from_iter+0x105/0x770
    [  826.484386]  netlink_unicast+0x346/0x490
    [  826.484755]  ? netlink_attachskb+0x400/0x400
    [  826.485145]  ? kernel_text_address+0xc2/0xd0
    [  826.485535]  netlink_sendmsg+0x3b0/0x6c0
    [  826.485902]  ? kernel_text_address+0xc2/0xd0
    [  826.486296]  ? netlink_unicast+0x490/0x490
    [  826.486671]  ? iovec_from_user.part.0+0x7a/0x1a0
    [  826.487083]  ? netlink_unicast+0x490/0x490
    [  826.487461]  sock_sendmsg+0x73/0xc0
    [  826.487803]  ____sys_sendmsg+0x364/0x380
    [  826.488186]  ? import_iovec+0x7/0x10
    [  826.488531]  ? kernel_sendmsg+0x30/0x30
    [  826.488893]  ? __copy_msghdr+0x180/0x180
    [  826.489258]  ? kasan_save_stack+0x2e/0x40
    [  826.489629]  ? kasan_save_stack+0x1e/0x40
    [  826.490002]  ? __kasan_record_aux_stack+0x9f/0xb0
    [  826.490424]  ? __call_rcu_common.constprop.0+0x46/0x580
    [  826.490876]  ___sys_sendmsg+0xdf/0x140
    [  826.491231]  ? copy_msghdr_from_user+0x110/0x110
    [  826.491649]  ? fget_raw+0x120/0x120
    [  826.491988]  ? ___sys_recvmsg+0xd9/0x130
    [  826.492355]  ? folio_batch_add_and_move+0x80/0xa0
    [  826.492776]  ? _raw_spin_lock+0x7a/0xd0
    [  826.493137]  ? _raw_spin_lock+0x7a/0xd0
    [  826.493500]  ? _raw_read_lock_irq+0x30/0x30
    [  826.493880]  ? kasan_set_track+0x21/0x30
    [  826.494249]  ? kasan_save_free_info+0x2a/0x40
    [  826.494650]  ? do_sys_openat2+0xff/0x270
    [  826.495016]  ? __fget_light+0x1b5/0x200
    [  826.495377]  ? __virt_addr_valid+0xb1/0x130
    [  826.495763]  __sys_sendmsg+0xb2/0x130
    [  826.496118]  ? __sys_sendmsg_sock+0x20/0x20
    [  826.496501]  ? __x64_sys_rseq+0x2e0/0x2e0
    [  826.496874]  ? do_user_addr_fault+0x276/0x820
    [  826.497273]  ? fpregs_assert_state_consistent+0x52/0x60
    [  826.497727]  ? exit_to_user_mode_prepare+0x30/0x120
    [  826.498158]  do_syscall_64+0x3d/0x90
    [  826.498502]  entry_SYSCALL_64_after_hwframe+0x46/0xb0
    [  826.498949] RIP: 0033:0x7f9b67f4f887
    [  826.499294] Code: 0a 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b9 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 89 54 24 1c 48 89 74 24 10
    [  826.500742] RSP: 002b:00007fff5d1a5498 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
    [  826.501395] RAX: ffffffffffffffda RBX: 0000000064413ce6 RCX: 00007f9b67f4f887
    [  826.501975] RDX: 0000000000000000 RSI: 00007fff5d1a5500 RDI: 0000000000000003
    [  826.502556] RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000001
    [  826.503135] R10: 00007f9b67e08708 R11: 0000000000000246 R12: 0000000000000001
    [  826.503714] R13: 0000000000000001 R14: 00007fff5d1a9800 R15: 0000000000485400
    [  826.504304]  </TASK>
    
    [  826.504753] Allocated by task 3764:
    [  826.505090]  kasan_save_stack+0x1e/0x40
    [  826.505453]  kasan_set_track+0x21/0x30
    [  826.505810]  __kasan_kmalloc+0x77/0x90
    [  826.506164]  __mlx5_create_flow_table+0x16d/0xbb0 [mlx5_core]
    [  826.506742]  esw_offloads_enable+0x60d/0xfb0 [mlx5_core]
    [  826.507292]  mlx5_eswitch_enable_locked+0x4d3/0x680 [mlx5_core]
    [  826.507885]  mlx5_devlink_eswitch_mode_set+0x2a3/0x580 [mlx5_core]
    [  826.508513]  devlink_nl_cmd_eswitch_set_doit+0xdf/0x1f0
    [  826.508969]  genl_family_rcv_msg_doit.isra.0+0x146/0x1c0
    [  826.509427]  genl_rcv_msg+0x28d/0x3e0
    [  826.509772]  netlink_rcv_skb+0xe0/0x210
    [  826.510133]  genl_rcv+0x24/0x40
    [  826.510448]  netlink_unicast+0x346/0x490
    [  826.510810]  netlink_sendmsg+0x3b0/0x6c0
    [  826.511179]  sock_sendmsg+0x73/0xc0
    [  826.511519]  __sys_sendto+0x18d/0x220
    [  826.511867]  __x64_sys_sendto+0x72/0x80
    [  826.512232]  do_syscall_64+0x3d/0x90
    [  826.512576]  entry_SYSCALL_64_after_hwframe+0x46/0xb0
    
    [  826.513220] Freed by task 5674:
    [  826.513535]  kasan_save_stack+0x1e/0x40
    [  826.513893]  kasan_set_track+0x21/0x30
    [  826.514245]  kasan_save_free_info+0x2a/0x40
    [  826.514629]  ____kasan_slab_free+0x11a/0x1b0
    [  826.515021]  __kmem_cache_free+0x14d/0x280
    [  826.515399]  tree_put_node+0x109/0x1c0 [mlx5_core]
    [  826.515907]  mlx5_destroy_flow_table+0x119/0x630 [mlx5_core]
    [  826.516481]  esw_offloads_steering_cleanup+0xe7/0x150 [mlx5_core]
    [  826.517084]  esw_offloads_disable+0xe0/0x160 [mlx5_core]
    [  826.517632]  mlx5_eswitch_disable_locked+0x26c/0x290 [mlx5_core]
    [  826.518225]  mlx5_devlink_eswitch_mode_set+0x128/0x580 [mlx5_core]
    [  826.518834]  devlink_nl_cmd_eswitch_set_doit+0xdf/0x1f0
    [  826.519286]  genl_family_rcv_msg_doit.isra.0+0x146/0x1c0
    [  826.519748]  genl_rcv_msg+0x28d/0x3e0
    [  826.520101]  netlink_rcv_skb+0xe0/0x210
    [  826.520458]  genl_rcv+0x24/0x40
    [  826.520771]  netlink_unicast+0x346/0x490
    [  826.521137]  netlink_sendmsg+0x3b0/0x6c0
    [  826.521505]  sock_sendmsg+0x73/0xc0
    [  826.521842]  __sys_sendto+0x18d/0x220
    [  826.522191]  __x64_sys_sendto+0x72/0x80
    [  826.522554]  do_syscall_64+0x3d/0x90
    [  826.522894]  entry_SYSCALL_64_after_hwframe+0x46/0xb0
    
    [  826.523540] Last potentially related work creation:
    [  826.523969]  kasan_save_stack+0x1e/0x40
    [  826.524331]  __kasan_record_aux_stack+0x9f/0xb0
    [  826.524739]  insert_work+0x30/0x130
    [  826.525078]  __queue_work+0x34b/0x690
    [  826.525426]  queue_work_on+0x48/0x50
    [  826.525766]  __rhashtable_remove_fast_one+0x4af/0x4d0 [mlx5_core]
    [  826.526365]  del_sw_flow_group+0x1b5/0x270 [mlx5_core]
    [  826.526898]  tree_put_node+0x109/0x1c0 [mlx5_core]
    [  826.527407]  esw_offloads_steering_cleanup+0xd3/0x150 [mlx5_core]
    [  826.528009]  esw_offloads_disable+0xe0/0x160 [mlx5_core]
    [  826.528616]  mlx5_eswitch_disable_locked+0x26c/0x290 [mlx5_core]
    [  826.529218]  mlx5_devlink_eswitch_mode_set+0x128/0x580 [mlx5_core]
    [  826.529823]  devlink_nl_cmd_eswitch_set_doit+0xdf/0x1f0
    [  826.530276]  genl_family_rcv_msg_doit.isra.0+0x146/0x1c0
    [  826.530733]  genl_rcv_msg+0x28d/0x3e0
    [  826.531079]  netlink_rcv_skb+0xe0/0x210
    [  826.531439]  genl_rcv+0x24/0x40
    [  826.531755]  netlink_unicast+0x346/0x490
    [  826.532123]  netlink_sendmsg+0x3b0/0x6c0
    [  826.532487]  sock_sendmsg+0x73/0xc0
    [  826.532825]  __sys_sendto+0x18d/0x220
    [  826.533175]  __x64_sys_sendto+0x72/0x80
    [  826.533533]  do_syscall_64+0x3d/0x90
    [  826.533877]  entry_SYSCALL_64_after_hwframe+0x46/0xb0
    
    [  826.534521] The buggy address belongs to the object at ffff888194485800
                    which belongs to the cache kmalloc-512 of size 512
    [  826.535506] The buggy address is located 48 bytes inside of
                    freed 512-byte region [ffff888194485800, ffff888194485a00)
    
    [  826.536666] The buggy address belongs to the physical page:
    [  826.537138] page:00000000d75841dd refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x194480
    [  826.537915] head:00000000d75841dd order:3 entire_mapcount:0 nr_pages_mapped:0 pincount:0
    [  826.538595] flags: 0x200000000010200(slab|head|node=0|zone=2)
    [  826.539089] raw: 0200000000010200 ffff888100042c80 ffffea0004523800 dead000000000002
    [  826.539755] raw: 0000000000000000 0000000000200020 00000001ffffffff 0000000000000000
    [  826.540417] page dumped because: kasan: bad access detected
    
    [  826.541095] Memory state around the buggy address:
    [  826.541519]  ffff888194485700: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
    [  826.542149]  ffff888194485780: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
    [  826.542773] >ffff888194485800: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
    [  826.543400]                                      ^
    [  826.543822]  ffff888194485880: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
    [  826.544452]  ffff888194485900: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
    [  826.545079] ==================================================================
    
    Fixes: 6702782845a5 ("net/mlx5e: TC, Set CT miss to the specific ct action instance")
    Signed-off-by: Paul Blakey <paulb@nvidia.com>
    Reviewed-by: Vlad Buslov <vladbu@nvidia.com>
    Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
net: fec: add dma_wmb to ensure correct descriptor values [+ + +]
Author: Shenwei Wang <shenwei.wang@nxp.com>
Date:   Thu May 18 10:02:02 2023 -0500

    net: fec: add dma_wmb to ensure correct descriptor values
    
    [ Upstream commit 9025944fddfed5966c8f102f1fe921ab3aee2c12 ]
    
    Two dma_wmb() are added in the XDP TX path to ensure proper ordering of
    descriptor and buffer updates:
    1. A dma_wmb() is added after updating the last BD to make sure
       the updates to rest of the descriptor are visible before
       transferring ownership to FEC.
    2. A dma_wmb() is also added after updating the bdp to ensure these
       updates are visible before updating txq->bd.cur.
    3. Start the xmit of the frame immediately right after configuring the
       tx descriptor.
    
    Fixes: 6d6b39f180b8 ("net: fec: add initial XDP support")
    Signed-off-by: Shenwei Wang <shenwei.wang@nxp.com>
    Reviewed-by: Wei Fang <wei.fang@nxp.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
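
    A rough sketch of the barrier placement described above (an
    illustration, not the upstream diff; the cbd_* field names follow
    the mainline fec driver but are not taken from this patch):

        /* Fill in the buffer descriptor fields first. */
        bdp->cbd_bufaddr = cpu_to_fec32(dma_addr);
        bdp->cbd_datlen = cpu_to_fec16(frame_len);

        /* 1. Make the descriptor contents visible before the status
         *    word hands ownership of the BD to the FEC hardware.
         */
        dma_wmb();
        bdp->cbd_sc = cpu_to_fec16(status | BD_ENET_TX_READY);

        /* 2. Make the ownership transfer visible before advancing the
         *    producer index.
         */
        dma_wmb();
        txq->bd.cur = bdp;

        /* 3. Kick off transmission immediately afterwards. */
        writel(0, txq->bd.reg_desc_active);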

net: phy: mscc: enable VSC8501/2 RGMII RX clock [+ + +]
Author: David Epping <david.epping@missinglinkelectronics.com>
Date:   Tue May 23 17:31:08 2023 +0200

    net: phy: mscc: enable VSC8501/2 RGMII RX clock
    
    [ Upstream commit 71460c9ec5c743e9ffffca3c874d66267c36345e ]
    
    By default the VSC8501 and VSC8502 RGMII/GMII/MII RX_CLK output is
    disabled. To allow packet forwarding towards the MAC it needs to be
    enabled.
    
    For other PHYs supported by this driver the clock output is enabled
    by default.
    
    Fixes: d3169863310d ("net: phy: mscc: add support for VSC8502")
    Signed-off-by: David Epping <david.epping@missinglinkelectronics.com>
    Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
    Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
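
    The gist, as a hedged sketch (the register and bit names here are
    assumptions patterned on the mscc driver, not verified against the
    actual patch): clear the bit that gates the RX_CLK output when the
    PHY is configured.

        /* Drive the RGMII RX clock output, which is disabled by
         * default on VSC8501/VSC8502. Names are illustrative.
         */
        ret = phy_modify(phydev, VSC8502_RGMII_CNTL,
                         VSC8502_RGMII_RX_CLK_DISABLE, 0);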

 
netfilter: ctnetlink: Support offloaded conntrack entry deletion [+ + +]
Author: Paul Blakey <paulb@nvidia.com>
Date:   Wed Mar 22 09:35:32 2023 +0200

    netfilter: ctnetlink: Support offloaded conntrack entry deletion
    
    commit 9b7c68b3911aef84afa4cbfc31bce20f10570d51 upstream.
    
    Currently, offloaded conntrack entries (flows) can only be deleted
    after they are removed from offload, which happens either on timeout,
    on a TCP state change, or on tc ct rule deletion. This can cause
    issues for users wishing to manually delete or flush existing entries.
    
    Support deletion of offloaded conntrack entries.
    
    Example usage:
     # Delete all offloaded (and non offloaded) conntrack entries
     # whose source address is 1.2.3.4
     $ conntrack -D -s 1.2.3.4
     # Delete all entries
     $ conntrack -F
    
    Signed-off-by: Paul Blakey <paulb@nvidia.com>
    Reviewed-by: Simon Horman <simon.horman@corigine.com>
    Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
    Signed-off-by: Florian Westphal <fw@strlen.de>
    Cc: Demi Marie Obenour <demi@invisiblethingslab.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
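
    A simplified sketch of the behavior change (hypothetical code
    capturing only the gist of the commit message, not the actual diff):
    deletion requests no longer bail out on offloaded entries.

        /* Before: ctnetlink rejected offloaded entries outright. */
        if (test_bit(IPS_OFFLOAD_BIT, &ct->status))
                return -EBUSY;

        /* After: fall through and delete; the offloaded flow is torn
         * down as part of the normal conntrack deletion path.
         */
        nf_ct_delete(ct, portid, report);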

 
platform/x86/amd/pmf: Fix CnQF and auto-mode after resume [+ + +]
Author: Mario Limonciello <mario.limonciello@amd.com>
Date:   Fri May 12 20:14:08 2023 -0500

    platform/x86/amd/pmf: Fix CnQF and auto-mode after resume
    
    [ Upstream commit b54147fa374dbeadcb01b1762db1a793e06e37de ]
    
    After a suspend/resume cycle there is an error message and auto-mode
    or CnQF stops working.
    
    [ 5741.447511] amd-pmf AMDI0100:00: SMU cmd failed. err: 0xff
    [ 5741.447523] amd-pmf AMDI0100:00: AMD_PMF_REGISTER_RESPONSE:ff
    [ 5741.447527] amd-pmf AMDI0100:00: AMD_PMF_REGISTER_ARGUMENT:7
    [ 5741.447531] amd-pmf AMDI0100:00: AMD_PMF_REGISTER_MESSAGE:16
    [ 5741.447540] amd-pmf AMDI0100:00: [AUTO_MODE] avg power: 0 mW mode: QUIET
    
    This is because the DRAM address used for accessing the metrics table
    needs to be refreshed after a suspend/resume cycle. Add a resume
    callback to reset this again.
    
    Fixes: 1a409b35c995 ("platform/x86/amd/pmf: Get performance metrics from PMFW")
    Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
    Link: https://lore.kernel.org/r/20230513011408.958-1-mario.limonciello@amd.com
    Reviewed-by: Hans de Goede <hdegoede@redhat.com>
    Signed-off-by: Hans de Goede <hdegoede@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
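
    The shape of the fix, as a sketch (the helper name and the pm-ops
    wiring are assumptions, not the literal patch):

        /* Re-program the metrics-table DRAM address, which the PMFW
         * loses across suspend.
         */
        static int amd_pmf_resume_handler(struct device *dev)
        {
                struct amd_pmf_dev *pdev = dev_get_drvdata(dev);

                return amd_pmf_set_dram_addr(pdev); /* hypothetical helper */
        }

        static SIMPLE_DEV_PM_OPS(amd_pmf_pm, NULL, amd_pmf_resume_handler);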

 
power: supply: rt9467: Fix passing zero to 'dev_err_probe' [+ + +]
Author: ChiaEn Wu <chiaen_wu@richtek.com>
Date:   Fri May 12 13:44:23 2023 +0800

    power: supply: rt9467: Fix passing zero to 'dev_err_probe'
    
    [ Upstream commit bc97139ff13598fa5becf6b582ef99ab428c03ef ]
    
    Fix passing zero to 'dev_err_probe()' in 'rt9467_request_interrupt()'.
    
    Fixes: 6f7f70e3a8dd ("power: supply: rt9467: Add Richtek RT9467 charger driver")
    Reported-by: kernel test robot <lkp@intel.com>
    Reported-by: Dan Carpenter <error27@gmail.com>
    Link: https://lore.kernel.org/r/202305111228.bHLWU6bq-lkp@intel.com/
    Signed-off-by: ChiaEn Wu <chiaen_wu@richtek.com>
    Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
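
    The bug class, as a sketch (illustrative, not the driver code):
    dev_err_probe() expects a negative errno as its second argument, so
    a zero "error" value has to be mapped to a real errno first.

        ret = platform_get_irq_byname(pdev, irq_name);
        if (ret <= 0)
                /* Map the zero case to a proper errno instead of
                 * passing 0 (i.e. success) to dev_err_probe().
                 */
                return dev_err_probe(dev, ret ?: -EINVAL,
                                     "failed to get %s irq\n", irq_name);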

 
Revert "net/mlx5: Expose steering dropped packets counter" [+ + +]
Author: Maher Sanalla <msanalla@nvidia.com>
Date:   Mon Mar 20 19:43:27 2023 +0200

    Revert "net/mlx5: Expose steering dropped packets counter"
    
    [ Upstream commit e267b8a52ca5d5e8434929a5e9f5574aed141024 ]
    
    This reverts commit 4fe1b3a5f8fe2fdcedcaba9561e5b0ae5cb1d15b, which
    exposes the steering dropped packets counter via debugfs. The upcoming
    series will expose the counter via a devlink health reporter instead
    of debugfs.
    
    Signed-off-by: Maher Sanalla <msanalla@nvidia.com>
    Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    Stable-dep-of: 8c253dfc89ef ("net/mlx5: E-switch, Devcom, sync devcom events and devcom comp register")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

Revert "net/mlx5: Expose vnic diagnostic counters for eswitch managed vports" [+ + +]
Author: Maher Sanalla <msanalla@nvidia.com>
Date:   Mon Mar 20 19:43:47 2023 +0200

    Revert "net/mlx5: Expose vnic diagnostic counters for eswitch managed vports"
    
    [ Upstream commit 0a431418f685e100c45ff150efaf4a5afa6f1982 ]
    
    This reverts commit 606e6a72e29dff9e3341c4cc9b554420e4793f401, which
    exposes the vnic diagnostic counters via debugfs. Instead, the
    upcoming series will expose the same counters through a devlink
    health reporter.
    
    Signed-off-by: Maher Sanalla <msanalla@nvidia.com>
    Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    Stable-dep-of: 8c253dfc89ef ("net/mlx5: E-switch, Devcom, sync devcom events and devcom comp register")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

 
selftests/bpf: Fix pkg-config call building sign-file [+ + +]
Author: Jeremy Sowden <jeremy@azazel.net>
Date:   Wed Apr 26 22:50:32 2023 +0100

    selftests/bpf: Fix pkg-config call building sign-file
    
    [ Upstream commit 5f5486b620cd43b16a1787ef92b9bc21bd72ef2e ]
    
    When building sign-file, the call to get the CFLAGS for libcrypto is
    missing white-space between `pkg-config` and `--cflags`:
    
      $(shell $(HOSTPKG_CONFIG)--cflags libcrypto 2> /dev/null)
    
    Removing the redirection of stderr, we see:
    
      $ make -C tools/testing/selftests/bpf sign-file
      make: Entering directory '[...]/tools/testing/selftests/bpf'
      make: pkg-config--cflags: No such file or directory
        SIGN-FILE sign-file
      make: Leaving directory '[...]/tools/testing/selftests/bpf'
    
    Add the missing space.
    
    Fixes: fc97590668ae ("selftests/bpf: Add test for bpf_verify_pkcs7_signature() kfunc")
    Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
    Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    Reviewed-by: Roberto Sassu <roberto.sassu@huawei.com>
    Link: https://lore.kernel.org/bpf/20230426215032.415792-1-jeremy@azazel.net
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
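
    With the space restored, the call reads:

      $(shell $(HOSTPKG_CONFIG) --cflags libcrypto 2> /dev/null)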

 
spi: spi-geni-qcom: Select FIFO mode for chip select [+ + +]
Author: Vijaya Krishna Nivarthi <quic_vnivarth@quicinc.com>
Date:   Tue May 9 15:31:36 2023 +0530

    spi: spi-geni-qcom: Select FIFO mode for chip select
    
    [ Upstream commit 4c329f5da7cfa366bacfda1328a025dd38951317 ]
    
    The SPI geni driver switches between FIFO and DMA modes based on xfer
    length. FIFO mode relies on the M_CMD_DONE_EN interrupt for completion,
    while DMA mode relies on XX_DMA_DONE.
    During dynamic switching, if FIFO mode is chosen, FIFO-related
    interrupts are enabled and DMA-related interrupts are disabled, and
    vice versa.
    Chip select shares the M_CMD_DONE_EN interrupt with FIFO to check
    completion. Now, if a chip select operation is preceded by a DMA xfer,
    the M_CMD_DONE_EN interrupt would have been disabled, and hence it
    will never be received, resulting in a timeout.

    For chip select, in addition to setting the xfer mode to FIFO, call
    select_mode() with FIFO so that the required interrupts are enabled.
    
    Fixes: e5f0dfa78ac7 ("spi: spi-geni-qcom: Add support for SE DMA mode")
    Suggested-by: Praveen Talari <quic_ptalari@quicinc.com>
    Signed-off-by: Vijaya Krishna Nivarthi <quic_vnivarth@quicinc.com>
    Reviewed-by: Douglas Anderson <dianders@chromium.org>
    Link: https://lore.kernel.org/r/1683626496-9685-1-git-send-email-quic_vnivarth@quicinc.com
    Signed-off-by: Mark Brown <broonie@kernel.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
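
    The gist, as a sketch (based on the commit message; the exact call
    site in the patch is not reproduced here):

        /* Chip-select completion is signalled via M_CMD_DONE_EN, which
         * is only enabled in FIFO mode - so force FIFO mode (and its
         * interrupt set) before issuing the CS command.
         */
        mas->cur_xfer_mode = GENI_SE_FIFO;
        geni_se_select_mode(&mas->se, GENI_SE_FIFO);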

 
tls: rx: device: fix checking decryption status [+ + +]
Author: Jakub Kicinski <kuba@kernel.org>
Date:   Tue May 16 18:50:36 2023 -0700

    tls: rx: device: fix checking decryption status
    
    [ Upstream commit b3a03b540e3cf62a255213d084d76d71c02793d5 ]
    
    skb->len covers the entire skb, including the frag_list.
    In fact we're guaranteed that rxm->full_len <= skb->len,
    so since the change under Fixes we were not checking the decrypt
    status of any skb but the first.
    
    Note that the skb_pagelen() added here may feel a bit costly,
    but it's removed by subsequent fixes, anyway.
    
    Reported-by: Tariq Toukan <tariqt@nvidia.com>
    Fixes: 86b259f6f888 ("tls: rx: device: bound the frag walk")
    Tested-by: Shai Amiram <samiram@nvidia.com>
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Reviewed-by: Simon Horman <simon.horman@corigine.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
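
    A sketch of the distinction (illustrative, not the literal diff):
    skb->len includes the frag_list, so it cannot tell whether a record
    spills past the directly-attached data; skb_pagelen() can.

        /* skb_pagelen() = linear data + page frags, excluding the
         * frag_list. If the record extends beyond that, its tail lives
         * in the frag_list and each skb there must have its decryption
         * status checked as well.
         */
        if (rxm->offset + rxm->full_len > skb_pagelen(skb))
                record_spans_frag_list = true;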

tls: rx: strp: don't use GFP_KERNEL in softirq context [+ + +]
Author: Jakub Kicinski <kuba@kernel.org>
Date:   Tue May 16 18:50:42 2023 -0700

    tls: rx: strp: don't use GFP_KERNEL in softirq context
    
    [ Upstream commit 74836ec828fe17b63f2006fdbf53311d691396bf ]
    
    When the receive buffer is small, or the TCP rx queue looks too
    complicated to bother using directly, we allocate a new skb and
    copy data into it.
    
    We already use sk->sk_allocation... but nothing actually
    sets it to GFP_ATOMIC on the ->sk_data_ready() path.
    
    Users of HW offload are far more likely to experience problems
    due to scheduling while atomic. "Copy mode" is very rarely
    triggered with SW crypto.
    
    Fixes: 84c61fe1a75b ("tls: rx: do not use the standard strparser")
    Tested-by: Shai Amiram <samiram@nvidia.com>
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Reviewed-by: Simon Horman <simon.horman@corigine.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
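
    The general pattern, as a sketch (not the literal patch): anything
    allocating on the ->sk_data_ready() path runs in softirq context and
    must use GFP_ATOMIC rather than the GFP_KERNEL default carried in
    sk->sk_allocation.

        static void tls_strp_data_ready_sketch(struct sock *sk)
        {
                gfp_t saved = sk->sk_allocation;

                /* softirq context: sleeping allocations are not allowed */
                sk->sk_allocation = GFP_ATOMIC;
                /* ... allocate/copy skbs for the TLS strparser ... */
                sk->sk_allocation = saved;
        }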

tls: rx: strp: factor out copying skb data [+ + +]
Author: Jakub Kicinski <kuba@kernel.org>
Date:   Tue May 16 18:50:40 2023 -0700

    tls: rx: strp: factor out copying skb data
    
    [ Upstream commit c1c607b1e5d5477d82ca6a86a05a4f10907b33ee ]
    
    We'll need to copy input skbs individually in the next patch.
    Factor that code out (without assuming we're copying a full record).
    
    Tested-by: Shai Amiram <samiram@nvidia.com>
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Reviewed-by: Simon Horman <simon.horman@corigine.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Stable-dep-of: eca9bfafee3a ("tls: rx: strp: preserve decryption status of skbs when needed")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

tls: rx: strp: fix determining record length in copy mode [+ + +]
Author: Jakub Kicinski <kuba@kernel.org>
Date:   Tue May 16 18:50:39 2023 -0700

    tls: rx: strp: fix determining record length in copy mode
    
    [ Upstream commit 8b0c0dc9fbbd01e58a573a41c38885f9e4c17696 ]
    
    We call tls_rx_msg_size(skb) before doing skb->len += chunk.
    So the tls_rx_msg_size() code will see the old skb->len, most
    likely leading to an over-read.

    Worst case, we will over-read an entire record; the next iteration
    will try to trim the skb but may end up turning the frag len negative
    or discarding the subsequent record (since we already told TCP we've
    read it during the previous read, but now we'll trim it out of
    the skb).
    
    Fixes: 84c61fe1a75b ("tls: rx: do not use the standard strparser")
    Tested-by: Shai Amiram <samiram@nvidia.com>
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Reviewed-by: Simon Horman <simon.horman@corigine.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
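
    The ordering fix, as a sketch (simplified from the commit message,
    using its one-argument spelling of tls_rx_msg_size()): account the
    newly read chunk into the skb before asking for the record length.

        /* Buggy order computed the record length from the old skb->len,
         * over-reading up to a full record; add the chunk first.
         */
        skb->len += chunk;
        skb->data_len += chunk;
        sz = tls_rx_msg_size(skb);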

tls: rx: strp: force mixed decrypted records into copy mode [+ + +]
Author: Jakub Kicinski <kuba@kernel.org>
Date:   Tue May 16 18:50:38 2023 -0700

    tls: rx: strp: force mixed decrypted records into copy mode
    
    [ Upstream commit 14c4be92ebb3e36e392aa9dd8f314038a9f96f3c ]
    
    If a record is partially decrypted we'll have to CoW it anyway,
    so go into copy mode and allocate a writable skb right away.

    This will make the subsequent fix simpler, because we won't have to
    teach tls_strp_msg_make_copy() how to copy skbs while preserving
    the decrypt status.
    
    Tested-by: Shai Amiram <samiram@nvidia.com>
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Reviewed-by: Simon Horman <simon.horman@corigine.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Stable-dep-of: eca9bfafee3a ("tls: rx: strp: preserve decryption status of skbs when needed")
    Signed-off-by: Sasha Levin <sashal@kernel.org>

tls: rx: strp: preserve decryption status of skbs when needed [+ + +]
Author: Jakub Kicinski <kuba@kernel.org>
Date:   Tue May 16 18:50:41 2023 -0700

    tls: rx: strp: preserve decryption status of skbs when needed
    
    [ Upstream commit eca9bfafee3a0487e59c59201ae14c7594ba940a ]
    
    When the receive buffer is small we try to copy the data out of
    TCP into a skb maintained by TLS to prevent the connection from
    stalling. Unfortunately, if a single record is made up of a mix
    of decrypted and non-decrypted skbs, combining them into a single
    skb leads to loss of the decryption status, resulting in decryption
    errors or data corruption.

    Similarly, when trying to use the TCP receive queue directly, we need
    to make sure that all the skbs within the record have the same
    status. If we don't, the mixed status will be detected correctly,
    but we'll CoW the anchor, again collapsing it into a single paged
    skb without the decrypted status preserved. So the "fixup" code will
    not know which parts of the skb to re-encrypt.
    
    Fixes: 84c61fe1a75b ("tls: rx: do not use the standard strparser")
    Tested-by: Shai Amiram <samiram@nvidia.com>
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Reviewed-by: Simon Horman <simon.horman@corigine.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

tls: rx: strp: set the skb->len of detached / CoW'ed skbs [+ + +]
Author: Jakub Kicinski <kuba@kernel.org>
Date:   Tue May 16 18:50:37 2023 -0700

    tls: rx: strp: set the skb->len of detached / CoW'ed skbs
    
    [ Upstream commit 210620ae44a83f25220450bbfcc22e6fe986b25f ]
    
    alloc_skb_with_frags() fills in page frag sizes but does not
    set skb->len and skb->data_len. Set those correctly, otherwise
    device offload will most likely generate an empty skb and
    hit the BUG() at the end of __skb_nsg().
    
    Fixes: 84c61fe1a75b ("tls: rx: do not use the standard strparser")
    Tested-by: Shai Amiram <samiram@nvidia.com>
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Reviewed-by: Simon Horman <simon.horman@corigine.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
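
    The fix in miniature (a sketch; variable names are illustrative):

        skb = alloc_skb_with_frags(0, full_len, 0, &err, gfp);
        if (!skb)
                return NULL;

        /* alloc_skb_with_frags() sizes the page frags but leaves the
         * skb length fields at zero - set them explicitly, or device
         * offload sees an "empty" skb and trips BUG() in __skb_nsg().
         */
        skb->len = full_len;
        skb->data_len = full_len;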

 
vfio/type1: check pfn valid before converting to struct page [+ + +]
Author: Yan Zhao <yan.y.zhao@intel.com>
Date:   Fri May 19 14:58:43 2023 +0800

    vfio/type1: check pfn valid before converting to struct page
    
    [ Upstream commit 4752354af71043e6fd72ef5490ed6da39e6cab4a ]
    
    Check that the physical PFN is valid before converting it to a struct
    page pointer to be returned to the caller of vfio_pin_pages().
    
    vfio_pin_pages() pins user pages with contiguous IOVA.
    If the IOVA of a user page to be pinned belongs to a vma with vm_flags
    VM_PFNMAP, pin_user_pages_remote() will return -EFAULT without
    returning a struct page address for this PFN. This is because this
    kind of PFN (e.g. an MMIO PFN) usually has no valid struct page
    associated with it. Upon this error, vaddr_get_pfns() will obtain the
    physical PFN directly.
    
    While vfio_pin_pages() previously returned PFN arrays directly to the
    caller, after commit
    34a255e67615 ("vfio: Replace phys_pfn with pages for vfio_pin_pages()"),
    PFNs are converted to "struct page *" unconditionally, and therefore
    the returned "struct page *" array may contain invalid struct page
    addresses.
    
    Given that current in-tree users of vfio_pin_pages() only expect
    "struct page *" to be returned, check PFN validity and return -EINVAL
    to make the caller aware that the IOVAs to be pinned contain PFNs that
    cannot be returned in the "struct page *" array. That way, the caller
    will not consume the returned pointer (e.g. test PageReserved()) and
    errors like "supervisor read access in kernel mode" are avoided.
    
    Fixes: 34a255e67615 ("vfio: Replace phys_pfn with pages for vfio_pin_pages()")
    Cc: Sean Christopherson <seanjc@google.com>
    Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
    Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
    Reviewed-by: Sean Christopherson <seanjc@google.com>
    Link: https://lore.kernel.org/r/20230519065843.10653-1-yan.y.zhao@intel.com
    Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
    Signed-off-by: Sasha Levin <sashal@kernel.org>
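
    The check in miniature (a sketch of the pattern the commit message
    describes, not the exact diff):

        /* A VM_PFNMAP pfn (e.g. MMIO) has no struct page behind it;
         * converting it anyway would hand the caller a bogus pointer.
         */
        if (!pfn_valid(phys_pfn))
                return -EINVAL;

        pages[i] = pfn_to_page(phys_pfn);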