Export limit exceeded: 340122 CVEs match your query. Please refine your search to export 10,000 CVEs or fewer.
Search
Search Results (340122 CVEs found)
| CVE | Vendors | Products | Updated | CVSS v3.1 |
|---|---|---|---|---|
| CVE-2026-23280 | 1 Linux | 1 Linux Kernel | 2026-03-25 | N/A |
| In the Linux kernel, the following vulnerability has been resolved: accel/amdxdna: Prevent ubuf size overflow The ubuf size calculation may overflow, resulting in an undersized allocation and possible memory corruption. Use check_add_overflow() helpers to validate the size calculation before allocation. | ||||
| CVE-2026-23279 | 1 Linux | 1 Linux Kernel | 2026-03-25 | N/A |
| In the Linux kernel, the following vulnerability has been resolved: wifi: mac80211: fix NULL pointer dereference in mesh_rx_csa_frame() In mesh_rx_csa_frame(), elems->mesh_chansw_params_ie is dereferenced at lines 1638 and 1642 without a prior NULL check: ifmsh->chsw_ttl = elems->mesh_chansw_params_ie->mesh_ttl; ... pre_value = le16_to_cpu(elems->mesh_chansw_params_ie->mesh_pre_value); The mesh_matches_local() check above only validates the Mesh ID, Mesh Configuration, and Supported Rates IEs. It does not verify the presence of the Mesh Channel Switch Parameters IE (element ID 118). When a received CSA action frame omits that IE, ieee802_11_parse_elems() leaves elems->mesh_chansw_params_ie as NULL, and the unconditional dereference causes a kernel NULL pointer dereference. A remote mesh peer with an established peer link (PLINK_ESTAB) can trigger this by sending a crafted SPECTRUM_MGMT/CHL_SWITCH action frame that includes a matching Mesh ID and Mesh Configuration IE but omits the Mesh Channel Switch Parameters IE. No authentication beyond the default open mesh peering is required. Crash confirmed on kernel 6.17.0-5-generic via mac80211_hwsim: BUG: kernel NULL pointer dereference, address: 0000000000000000 Oops: Oops: 0000 [#1] SMP NOPTI RIP: 0010:ieee80211_mesh_rx_queued_mgmt+0x143/0x2a0 [mac80211] CR2: 0000000000000000 Fix by adding a NULL check for mesh_chansw_params_ie after mesh_matches_local() returns, consistent with how other optional IEs are guarded throughout the mesh code. The bug has been present since v3.13 (released 2014-01-19). | ||||
| CVE-2026-23278 | 1 Linux | 1 Linux Kernel | 2026-03-25 | 5.5 Medium |
| In the Linux kernel, the following vulnerability has been resolved: netfilter: nf_tables: always walk all pending catchall elements During transaction processing we might have more than one catchall element: 1 live catchall element and 1 pending element that is coming as part of the new batch. If the map holding the catchall elements is also going away, its required to toggle all catchall elements and not just the first viable candidate. Otherwise, we get: WARNING: ./include/net/netfilter/nf_tables.h:1281 at nft_data_release+0xb7/0xe0 [nf_tables], CPU#2: nft/1404 RIP: 0010:nft_data_release+0xb7/0xe0 [nf_tables] [..] __nft_set_elem_destroy+0x106/0x380 [nf_tables] nf_tables_abort_release+0x348/0x8d0 [nf_tables] nf_tables_abort+0xcf2/0x3ac0 [nf_tables] nfnetlink_rcv_batch+0x9c9/0x20e0 [..] | ||||
| CVE-2026-23277 | 1 Linux | 1 Linux Kernel | 2026-03-25 | 5.5 Medium |
| In the Linux kernel, the following vulnerability has been resolved: net/sched: teql: fix NULL pointer dereference in iptunnel_xmit on TEQL slave xmit teql_master_xmit() calls netdev_start_xmit(skb, slave) to transmit through slave devices, but does not update skb->dev to the slave device beforehand. When a gretap tunnel is a TEQL slave, the transmit path reaches iptunnel_xmit() which saves dev = skb->dev (still pointing to teql0 master) and later calls iptunnel_xmit_stats(dev, pkt_len). This function does: get_cpu_ptr(dev->tstats) Since teql_master_setup() does not set dev->pcpu_stat_type to NETDEV_PCPU_STAT_TSTATS, the core network stack never allocates tstats for teql0, so dev->tstats is NULL. get_cpu_ptr(NULL) computes NULL + __per_cpu_offset[cpu], resulting in a page fault. BUG: unable to handle page fault for address: ffff8880e6659018 #PF: supervisor write access in kernel mode #PF: error_code(0x0002) - not-present page PGD 68bc067 P4D 68bc067 PUD 0 Oops: Oops: 0002 [#1] SMP KASAN PTI RIP: 0010:iptunnel_xmit (./include/net/ip_tunnels.h:664 net/ipv4/ip_tunnel_core.c:89) Call Trace: <TASK> ip_tunnel_xmit (net/ipv4/ip_tunnel.c:847) __gre_xmit (net/ipv4/ip_gre.c:478) gre_tap_xmit (net/ipv4/ip_gre.c:779) teql_master_xmit (net/sched/sch_teql.c:319) dev_hard_start_xmit (net/core/dev.c:3887) sch_direct_xmit (net/sched/sch_generic.c:347) __dev_queue_xmit (net/core/dev.c:4802) neigh_direct_output (net/core/neighbour.c:1660) ip_finish_output2 (net/ipv4/ip_output.c:237) __ip_finish_output.part.0 (net/ipv4/ip_output.c:315) ip_mc_output (net/ipv4/ip_output.c:369) ip_send_skb (net/ipv4/ip_output.c:1508) udp_send_skb (net/ipv4/udp.c:1195) udp_sendmsg (net/ipv4/udp.c:1485) inet_sendmsg (net/ipv4/af_inet.c:859) __sys_sendto (net/socket.c:2206) Fix this by setting skb->dev = slave before calling netdev_start_xmit(), so that tunnel xmit functions see the correct slave device with properly allocated tstats. | ||||
| CVE-2026-23276 | 1 Linux | 1 Linux Kernel | 2026-03-25 | 5.5 Medium |
| In the Linux kernel, the following vulnerability has been resolved: net: add xmit recursion limit to tunnel xmit functions Tunnel xmit functions (iptunnel_xmit, ip6tunnel_xmit) lack their own recursion limit. When a bond device in broadcast mode has GRE tap interfaces as slaves, and those GRE tunnels route back through the bond, multicast/broadcast traffic triggers infinite recursion between bond_xmit_broadcast() and ip_tunnel_xmit()/ip6_tnl_xmit(), causing kernel stack overflow. The existing XMIT_RECURSION_LIMIT (8) in the no-qdisc path is not sufficient because tunnel recursion involves route lookups and full IP output, consuming much more stack per level. Use a lower limit of 4 (IP_TUNNEL_RECURSION_LIMIT) to prevent overflow. Add recursion detection using dev_xmit_recursion helpers directly in iptunnel_xmit() and ip6tunnel_xmit() to cover all IPv4/IPv6 tunnel paths including UDP encapsulated tunnels (VXLAN, Geneve, etc.). Move dev_xmit_recursion helpers from net/core/dev.h to public header include/linux/netdevice.h so they can be used by tunnel code. BUG: KASAN: stack-out-of-bounds in blake2s.constprop.0+0xe7/0x160 Write of size 32 at addr ffff88810033fed0 by task kworker/0:1/11 Workqueue: mld mld_ifc_work Call Trace: <TASK> __build_flow_key.constprop.0 (net/ipv4/route.c:515) ip_rt_update_pmtu (net/ipv4/route.c:1073) iptunnel_xmit (net/ipv4/ip_tunnel_core.c:84) ip_tunnel_xmit (net/ipv4/ip_tunnel.c:847) gre_tap_xmit (net/ipv4/ip_gre.c:779) dev_hard_start_xmit (net/core/dev.c:3887) sch_direct_xmit (net/sched/sch_generic.c:347) __dev_queue_xmit (net/core/dev.c:4802) bond_dev_queue_xmit (drivers/net/bonding/bond_main.c:312) bond_xmit_broadcast (drivers/net/bonding/bond_main.c:5279) bond_start_xmit (drivers/net/bonding/bond_main.c:5530) dev_hard_start_xmit (net/core/dev.c:3887) __dev_queue_xmit (net/core/dev.c:4841) ip_finish_output2 (net/ipv4/ip_output.c:237) ip_output (net/ipv4/ip_output.c:438) iptunnel_xmit (net/ipv4/ip_tunnel_core.c:86) gre_tap_xmit (net/ipv4/ip_gre.c:779) dev_hard_start_xmit (net/core/dev.c:3887) sch_direct_xmit (net/sched/sch_generic.c:347) __dev_queue_xmit (net/core/dev.c:4802) bond_dev_queue_xmit (drivers/net/bonding/bond_main.c:312) bond_xmit_broadcast (drivers/net/bonding/bond_main.c:5279) bond_start_xmit (drivers/net/bonding/bond_main.c:5530) dev_hard_start_xmit (net/core/dev.c:3887) __dev_queue_xmit (net/core/dev.c:4841) ip_finish_output2 (net/ipv4/ip_output.c:237) ip_output (net/ipv4/ip_output.c:438) iptunnel_xmit (net/ipv4/ip_tunnel_core.c:86) ip_tunnel_xmit (net/ipv4/ip_tunnel.c:847) gre_tap_xmit (net/ipv4/ip_gre.c:779) dev_hard_start_xmit (net/core/dev.c:3887) sch_direct_xmit (net/sched/sch_generic.c:347) __dev_queue_xmit (net/core/dev.c:4802) bond_dev_queue_xmit (drivers/net/bonding/bond_main.c:312) bond_xmit_broadcast (drivers/net/bonding/bond_main.c:5279) bond_start_xmit (drivers/net/bonding/bond_main.c:5530) dev_hard_start_xmit (net/core/dev.c:3887) __dev_queue_xmit (net/core/dev.c:4841) mld_sendpack mld_ifc_work process_one_work worker_thread </TASK> | ||||
| CVE-2026-23274 | 1 Linux | 1 Linux Kernel | 2026-03-25 | 5.5 Medium |
| In the Linux kernel, the following vulnerability has been resolved: netfilter: xt_IDLETIMER: reject rev0 reuse of ALARM timer labels IDLETIMER revision 0 rules reuse existing timers by label and always call mod_timer() on timer->timer. If the label was created first by revision 1 with XT_IDLETIMER_ALARM, the object uses alarm timer semantics and timer->timer is never initialized. Reusing that object from revision 0 causes mod_timer() on an uninitialized timer_list, triggering debugobjects warnings and possible panic when panic_on_warn=1. Fix this by rejecting revision 0 rule insertion when an existing timer with the same label is of ALARM type. | ||||
| CVE-2026-23271 | 1 Linux | 1 Linux Kernel | 2026-03-25 | 5.5 Medium |
| In the Linux kernel, the following vulnerability has been resolved: perf: Fix __perf_event_overflow() vs perf_remove_from_context() race Make sure that __perf_event_overflow() runs with IRQs disabled for all possible callchains. Specifically the software events can end up running it with only preemption disabled. This opens up a race vs perf_event_exit_event() and friends that will go and free various things the overflow path expects to be present, like the BPF program. | ||||
| CVE-2026-23270 | 1 Linux | 1 Linux Kernel | 2026-03-25 | 7.0 High |
| In the Linux kernel, the following vulnerability has been resolved: net/sched: Only allow act_ct to bind to clsact/ingress qdiscs and shared blocks As Paolo said earlier [1]: "Since the blamed commit below, classify can return TC_ACT_CONSUMED while the current skb being held by the defragmentation engine. As reported by GangMin Kim, if such packet is that may cause a UaF when the defrag engine later on tries to tuch again such packet." act_ct was never meant to be used in the egress path, however some users are attaching it to egress today [2]. Attempting to reach a middle ground, we noticed that, while most qdiscs are not handling TC_ACT_CONSUMED, clsact/ingress qdiscs are. With that in mind, we address the issue by only allowing act_ct to bind to clsact/ingress qdiscs and shared blocks. That way it's still possible to attach act_ct to egress (albeit only with clsact). [1] https://lore.kernel.org/netdev/674b8cbfc385c6f37fb29a1de08d8fe5c2b0fbee.1771321118.git.pabeni@redhat.com/ [2] https://lore.kernel.org/netdev/cc6bfb4a-4a2b-42d8-b9ce-7ef6644fb22b@ovn.org/ | ||||
| CVE-2026-23269 | 1 Linux | 1 Linux Kernel | 2026-03-25 | N/A |
| In the Linux kernel, the following vulnerability has been resolved: apparmor: validate DFA start states are in bounds in unpack_pdb Start states are read from untrusted data and used as indexes into the DFA state tables. The aa_dfa_next() function call in unpack_pdb() will access dfa->tables[YYTD_ID_BASE][start], and if the start state exceeds the number of states in the DFA, this results in an out-of-bound read. ================================================================== BUG: KASAN: slab-out-of-bounds in aa_dfa_next+0x2a1/0x360 Read of size 4 at addr ffff88811956fb90 by task su/1097 ... Reject policies with out-of-bounds start states during unpacking to prevent the issue. | ||||
| CVE-2026-23268 | 1 Linux | 1 Linux Kernel | 2026-03-25 | N/A |
| In the Linux kernel, the following vulnerability has been resolved: apparmor: fix unprivileged local user can do privileged policy management An unprivileged local user can load, replace, and remove profiles by opening the apparmorfs interfaces, via a confused deputy attack, by passing the opened fd to a privileged process, and getting the privileged process to write to the interface. This does require a privileged target that can be manipulated to do the write for the unprivileged process, but once such access is achieved full policy management is possible and all the possible implications that implies: removing confinement, DoS of system or target applications by denying all execution, by-passing the unprivileged user namespace restriction, to exploiting kernel bugs for a local privilege escalation. The policy management interface can not have its permissions simply changed from 0666 to 0600 because non-root processes need to be able to load policy to different policy namespaces. Instead ensure the task writing the interface has privileges that are a subset of the task that opened the interface. This is already done via policy for confined processes, but unconfined can delegate access to the opened fd, by-passing the usual policy check. | ||||
| CVE-2026-23253 | 1 Linux | 1 Linux Kernel | 2026-03-25 | 5.5 Medium |
| In the Linux kernel, the following vulnerability has been resolved: media: dvb-core: fix wrong reinitialization of ringbuffer on reopen dvb_dvr_open() calls dvb_ringbuffer_init() when a new reader opens the DVR device. dvb_ringbuffer_init() calls init_waitqueue_head(), which reinitializes the waitqueue list head to empty. Since dmxdev->dvr_buffer.queue is a shared waitqueue (all opens of the same DVR device share it), this orphans any existing waitqueue entries from io_uring poll or epoll, leaving them with stale prev/next pointers while the list head is reset to {self, self}. The waitqueue and spinlock in dvr_buffer are already properly initialized once in dvb_dmxdev_init(). The open path only needs to reset the buffer data pointer, size, and read/write positions. Replace the dvb_ringbuffer_init() call in dvb_dvr_open() with direct assignment of data/size and a call to dvb_ringbuffer_reset(), which properly resets pread, pwrite, and error with correct memory ordering without touching the waitqueue or spinlock. | ||||
| CVE-2026-23252 | 1 Linux | 1 Linux Kernel | 2026-03-25 | 5.5 Medium |
| In the Linux kernel, the following vulnerability has been resolved: xfs: get rid of the xchk_xfile_*_descr calls The xchk_xfile_*_descr macros call kasprintf, which can fail to allocate memory if the formatted string is larger than 16 bytes (or whatever the nofail guarantees are nowadays). Some of them could easily exceed that, and Jiaming Zhang found a few places where that can happen with syzbot. The descriptions are debugging aids and aren't required to be unique, so let's just pass in static strings and eliminate this path to failure. Note this patch touches a number of commits, most of which were merged between 6.6 and 6.14. | ||||
| CVE-2026-23246 | 1 Linux | 1 Linux Kernel | 2026-03-25 | 5.9 Medium |
| In the Linux kernel, the following vulnerability has been resolved: wifi: mac80211: bounds-check link_id in ieee80211_ml_reconfiguration link_id is taken from the ML Reconfiguration element (control & 0x000f), so it can be 0..15. link_removal_timeout[] has IEEE80211_MLD_MAX_NUM_LINKS (15) elements, so index 15 is out-of-bounds. Skip subelements with link_id >= IEEE80211_MLD_MAX_NUM_LINKS to avoid a stack out-of-bounds write. | ||||
| CVE-2026-23245 | 1 Linux | 1 Linux Kernel | 2026-03-25 | N/A |
| In the Linux kernel, the following vulnerability has been resolved: net/sched: act_gate: snapshot parameters with RCU on replace The gate action can be replaced while the hrtimer callback or dump path is walking the schedule list. Convert the parameters to an RCU-protected snapshot and swap updates under tcf_lock, freeing the previous snapshot via call_rcu(). When REPLACE omits the entry list, preserve the existing schedule so the effective state is unchanged. | ||||
| CVE-2026-23244 | 1 Linux | 1 Linux Kernel | 2026-03-25 | 5.5 Medium |
| In the Linux kernel, the following vulnerability has been resolved: nvme: fix memory allocation in nvme_pr_read_keys() nvme_pr_read_keys() takes num_keys from userspace and uses it to calculate the allocation size for rse via struct_size(). The upper limit is PR_KEYS_MAX (64K). A malicious or buggy userspace can pass a large num_keys value that results in a 4MB allocation attempt at most, causing a warning in the page allocator when the order exceeds MAX_PAGE_ORDER. To fix this, use kvzalloc() instead of kzalloc(). This bug has the same reasoning and fix with the patch below: https://lore.kernel.org/linux-block/20251212013510.3576091-1-kartikey406@gmail.com/ Warning log: WARNING: mm/page_alloc.c:5216 at __alloc_frozen_pages_noprof+0x5aa/0x2300 mm/page_alloc.c:5216, CPU#1: syz-executor117/272 Modules linked in: CPU: 1 UID: 0 PID: 272 Comm: syz-executor117 Not tainted 6.19.0 #1 PREEMPT(voluntary) Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014 RIP: 0010:__alloc_frozen_pages_noprof+0x5aa/0x2300 mm/page_alloc.c:5216 Code: ff 83 bd a8 fe ff ff 0a 0f 86 69 fb ff ff 0f b6 1d f9 f9 c4 04 80 fb 01 0f 87 3b 76 30 ff 83 e3 01 75 09 c6 05 e4 f9 c4 04 01 <0f> 0b 48 c7 85 70 fe ff ff 00 00 00 00 e9 8f fd ff ff 31 c0 e9 0d RSP: 0018:ffffc90000fcf450 EFLAGS: 00010246 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 1ffff920001f9ea0 RDX: 0000000000000000 RSI: 000000000000000b RDI: 0000000000040dc0 RBP: ffffc90000fcf648 R08: ffff88800b6c3380 R09: 0000000000000001 R10: ffffc90000fcf840 R11: ffff88807ffad280 R12: 0000000000000000 R13: 0000000000040dc0 R14: 0000000000000001 R15: ffffc90000fcf620 FS: 0000555565db33c0(0000) GS:ffff8880be26c000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000000002000000c CR3: 0000000003b72000 CR4: 00000000000006f0 Call Trace: <TASK> alloc_pages_mpol+0x236/0x4d0 mm/mempolicy.c:2486 alloc_frozen_pages_noprof+0x149/0x180 mm/mempolicy.c:2557 ___kmalloc_large_node+0x10c/0x140 mm/slub.c:5598 __kmalloc_large_node_noprof+0x25/0xc0 mm/slub.c:5629 __do_kmalloc_node mm/slub.c:5645 [inline] __kmalloc_noprof+0x483/0x6f0 mm/slub.c:5669 kmalloc_noprof include/linux/slab.h:961 [inline] kzalloc_noprof include/linux/slab.h:1094 [inline] nvme_pr_read_keys+0x8f/0x4c0 drivers/nvme/host/pr.c:245 blkdev_pr_read_keys block/ioctl.c:456 [inline] blkdev_common_ioctl+0x1b71/0x29b0 block/ioctl.c:730 blkdev_ioctl+0x299/0x700 block/ioctl.c:786 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:597 [inline] __se_sys_ioctl fs/ioctl.c:583 [inline] __x64_sys_ioctl+0x1bf/0x220 fs/ioctl.c:583 x64_sys_call+0x1280/0x21b0 mnt/fuzznvme_1/fuzznvme/linux-build/v6.19/./arch/x86/include/generated/asm/syscalls_64.h:17 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0x71/0x330 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x76/0x7e RIP: 0033:0x7fb893d3108d Code: 28 c3 e8 46 1e 00 00 66 0f 1f 44 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007ffff61f2f38 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 RAX: ffffffffffffffda RBX: 00007ffff61f3138 RCX: 00007fb893d3108d RDX: 0000000020000040 RSI: 00000000c01070ce RDI: 0000000000000003 RBP: 0000000000000001 R08: 0000000000000000 R09: 00007ffff61f3138 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001 R13: 00007ffff61f3128 R14: 00007fb893dae530 R15: 0000000000000001 </TASK> | ||||
| CVE-2026-23227 | 1 Linux | 1 Linux Kernel | 2026-03-25 | 7.8 High |
| In the Linux kernel, the following vulnerability has been resolved: drm/exynos: vidi: use ctx->lock to protect struct vidi_context member variables related to memory alloc/free Exynos Virtual Display driver performs memory alloc/free operations without lock protection, which easily causes concurrency problem. For example, use-after-free can occur in race scenario like this: ``` CPU0 CPU1 CPU2 ---- ---- ---- vidi_connection_ioctl() if (vidi->connection) // true drm_edid = drm_edid_alloc(); // alloc drm_edid ... ctx->raw_edid = drm_edid; ... drm_mode_getconnector() drm_helper_probe_single_connector_modes() vidi_get_modes() if (ctx->raw_edid) // true drm_edid_dup(ctx->raw_edid); if (!drm_edid) // false ... vidi_connection_ioctl() if (vidi->connection) // false drm_edid_free(ctx->raw_edid); // free drm_edid ... drm_edid_alloc(drm_edid->edid) kmemdup(edid); // UAF!! ... ``` To prevent these vulns, at least in vidi_context, member variables related to memory alloc/free should be protected with ctx->lock. | ||||
| CVE-2026-23204 | 1 Linux | 1 Linux Kernel | 2026-03-25 | 7.1 High |
| In the Linux kernel, the following vulnerability has been resolved: net/sched: cls_u32: use skb_header_pointer_careful() skb_header_pointer() does not fully validate negative @offset values. Use skb_header_pointer_careful() instead. GangMin Kim provided a report and a repro fooling u32_classify(): BUG: KASAN: slab-out-of-bounds in u32_classify+0x1180/0x11b0 net/sched/cls_u32.c:221 | ||||
| CVE-2026-23157 | 1 Linux | 1 Linux Kernel | 2026-03-25 | 5.5 Medium |
| In the Linux kernel, the following vulnerability has been resolved: btrfs: do not strictly require dirty metadata threshold for metadata writepages [BUG] There is an internal report that over 1000 processes are waiting at the io_schedule_timeout() of balance_dirty_pages(), causing a system hang and trigger a kernel coredump. The kernel is v6.4 kernel based, but the root problem still applies to any upstream kernel before v6.18. [CAUSE] From Jan Kara for his wisdom on the dirty page balance behavior first. This cgroup dirty limit was what was actually playing the role here because the cgroup had only a small amount of memory and so the dirty limit for it was something like 16MB. Dirty throttling is responsible for enforcing that nobody can dirty (significantly) more dirty memory than there's dirty limit. Thus when a task is dirtying pages it periodically enters into balance_dirty_pages() and we let it sleep there to slow down the dirtying. When the system is over dirty limit already (either globally or within a cgroup of the running task), we will not let the task exit from balance_dirty_pages() until the number of dirty pages drops below the limit. So in this particular case, as I already mentioned, there was a cgroup with relatively small amount of memory and as a result with dirty limit set at 16MB. A task from that cgroup has dirtied about 28MB worth of pages in btrfs btree inode and these were practically the only dirty pages in that cgroup. So that means the only way to reduce the dirty pages of that cgroup is to writeback the dirty pages of btrfs btree inode, and only after that those processes can exit balance_dirty_pages(). Now back to the btrfs part, btree_writepages() is responsible for writing back dirty btree inode pages. The problem here is, there is a btrfs internal threshold that if the btree inode's dirty bytes are below the 32M threshold, it will not do any writeback. This behavior is to batch as much metadata as possible so we won't write back those tree blocks and then later re-COW them again for another modification. This internal 32MiB is higher than the existing dirty page size (28MiB), meaning no writeback will happen, causing a deadlock between btrfs and cgroup: - Btrfs doesn't want to write back btree inode until more dirty pages - Cgroup/MM doesn't want more dirty pages for btrfs btree inode Thus any process touching that btree inode is put into sleep until the number of dirty pages is reduced. Thanks Jan Kara a lot for the analysis of the root cause. [ENHANCEMENT] Since kernel commit b55102826d7d ("btrfs: set AS_KERNEL_FILE on the btree_inode"), btrfs btree inode pages will only be charged to the root cgroup which should have a much larger limit than btrfs' 32MiB threshold. So it should not affect newer kernels. But for all current LTS kernels, they are all affected by this problem, and backporting the whole AS_KERNEL_FILE may not be a good idea. Even for newer kernels I still think it's a good idea to get rid of the internal threshold at btree_writepages(), since for most cases cgroup/MM has a better view of full system memory usage than btrfs' fixed threshold. For internal callers using btrfs_btree_balance_dirty() since that function is already doing internal threshold check, we don't need to bother them. But for external callers of btree_writepages(), just respect their requests and write back whatever they want, ignoring the internal btrfs threshold to avoid such deadlock on btree inode dirty page balancing. | ||||
| CVE-2026-23154 | 1 Linux | 1 Linux Kernel | 2026-03-25 | 5.5 Medium |
| In the Linux kernel, the following vulnerability has been resolved: net: fix segmentation of forwarding fraglist GRO This patch enhances GSO segment handling by properly checking the SKB_GSO_DODGY flag for frag_list GSO packets, addressing low throughput issues observed when a station accesses IPv4 servers via hotspots with an IPv6-only upstream interface. Specifically, it fixes a bug in GSO segmentation when forwarding GRO packets containing a frag_list. The function skb_segment_list cannot correctly process GRO skbs that have been converted by XLAT, since XLAT only translates the header of the head skb. Consequently, skbs in the frag_list may remain untranslated, resulting in protocol inconsistencies and reduced throughput. To address this, the patch explicitly sets the SKB_GSO_DODGY flag for GSO packets in XLAT's IPv4/IPv6 protocol translation helpers (bpf_skb_proto_4_to_6 and bpf_skb_proto_6_to_4). This marks GSO packets as potentially modified after protocol translation. As a result, GSO segmentation will avoid using skb_segment_list and instead falls back to skb_segment for packets with the SKB_GSO_DODGY flag. This ensures that only safe and fully translated frag_list packets are processed by skb_segment_list, resolving protocol inconsistencies and improving throughput when forwarding GRO packets converted by XLAT. | ||||
| CVE-2026-23141 | 1 Linux | 1 Linux Kernel | 2026-03-25 | 5.5 Medium |
| In the Linux kernel, the following vulnerability has been resolved: btrfs: send: check for inline extents in range_is_hole_in_parent() Before accessing the disk_bytenr field of a file extent item we need to check if we are dealing with an inline extent. This is because for inline extents their data starts at the offset of the disk_bytenr field. So accessing the disk_bytenr means we are accessing inline data or in case the inline data is less than 8 bytes we can actually cause an invalid memory access if this inline extent item is the first item in the leaf or access metadata from other items. | ||||