Linux: Vulnerabilities in nf_tables cause privilege escalation, information leak

From:   David Bouman
Subject:   [oss-security] Linux kernel: CVE-2022-1015,CVE-2022-1016 in nf_tables cause privilege escalation, information leak
Date:   Mon, 28 Mar 2022 20:28:21 +0200

Hello list,

I'm reporting two Linux kernel vulnerabilities in the nf_tables 
component of the netfilter subsystem that I discovered.

CVE-2022-1015 concerns an out of bounds access in nf_tables 
expression evaluation due to insufficient validation of user register 
indices. It leads to local privilege escalation, for example by 
overwriting a stack return address OOB with a crafted nft_expr_payload.

CVE-2022-1015 is exploitable starting from commit 345023b0db3 
("netfilter: nftables: add nft_parse_register_store() and use it"), 
v5.12, and has been fixed in commit 6e1acfa387b9 ("netfilter: nf_tables: 
validate registers coming from userspace.").

The bug has been present since commit 49499c3e6e18 ("netfilter: 
nf_tables: switch registers to 32 bit addressing"), but to my knowledge 
was not exploitable until v5.12.

CVE-2022-1016 concerns uninitialized stack data in the nft_do_chain 
routine. CVE-2022-1016 is exploitable starting from commit 96518518cc41 
(initial merge of nf_tables), v3.13-rc1, and has been fixed in commit 
4c905f6740a3 ("netfilter: nf_tables: initialize registers in 

I will be releasing a detailed blog post and exploit code for both 
vulnerabilities in a few days.

Root cause CVE-2022-1016: (it's the shortest, so I will start with it)

The nft_do_chain routine in net/netfilter/nf_tables_core.c does not 
initialize the register data that nf_tables expressions can read from 
and write to. These expressions inherently have side effects that can 
be used to determine the register data, which can contain kernel image 
pointers, module pointers, and allocation pointers depending on the code 
path taken to end up at nft_do_chain.

unsigned int
nft_do_chain(struct nft_pktinfo *pkt, void *priv)
{
	const struct nft_chain *chain = priv, *basechain = chain;
	const struct net *net = nft_net(pkt);
	struct nft_rule *const *rules;
	const struct nft_rule *rule;
	const struct nft_expr *expr, *last;
	struct nft_regs regs;	/* never initialized */
	/* ... */
	bool genbit = READ_ONCE(net->nft.gencursor);
	struct nft_traceinfo info;

	if (static_branch_unlikely(&nft_trace_enabled))
		nft_trace_init(&info, pkt, &regs.verdict, basechain);
	if (genbit)
		/* ... */

	for (; *rules ; rules++) {
		nft_rule_for_each_expr(expr, last, rule) {
			if (expr->ops == &nft_cmp_fast_ops)
				nft_cmp_fast_eval(expr, &regs);
			else if (expr->ops == &nft_bitwise_fast_ops)
				nft_bitwise_fast_eval(expr, &regs);
			else if (expr->ops != &nft_payload_fast_ops ||
				 !nft_payload_fast_eval(expr, &regs, pkt))
				expr_call_ops_eval(expr, &regs, pkt);
			/* ... */
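
For illustration only, here is a minimal userspace sketch of the general remedy for this class of leak: give every register a defined value before any expression can read it. The `leakfix_*` names and the struct layout are assumptions invented for this sketch (only the 0x50-byte register area size is taken from this report), and it is not the upstream patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Size of the register area as quoted in this report (0x50 bytes). */
#define LEAKFIX_REG_BYTES 0x50

/* Simplified stand-in for the kernel's register file (assumed layout). */
struct leakfix_regs {
	uint32_t data[LEAKFIX_REG_BYTES / sizeof(uint32_t)];
};

/* Hypothetical helper: zero the whole register file so that stale stack
 * contents can never be observed through a register read. */
static inline void leakfix_regs_init(struct leakfix_regs *regs)
{
	assert(regs != NULL);
	memset(regs, 0, sizeof(*regs));
}

/* Returns 1 if every register reads back as zero, 0 otherwise. */
static inline int leakfix_regs_all_zero(const struct leakfix_regs *regs)
{
	size_t i;

	assert(regs != NULL);
	for (i = 0; i < sizeof(regs->data) / sizeof(regs->data[0]); i++)
		if (regs->data[i] != 0)
			return 0;
	return 1;
}
```

Calling `leakfix_regs_init()` once at the top of the evaluation loop would be enough in this model; the cost is one small `memset` per chain traversal.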

Root cause CVE-2022-1015:

(below is pasted from my original report)

Hi, I'm mailing to report a vulnerability I discovered in the nf_tables 
component of the netfilter subsystem. The vulnerability gives an 
attacker a powerful primitive that can be used to both read from and 
write to relative stack data. This can lead to arbitrary code execution 
by an attacker.

In order for an unprivileged attacker to exploit this issue, 
unprivileged user- and network namespaces access is required 
(CLONE_NEWUSER | CLONE_NEWNET). The bug relies on a compiler 
optimization that introduces behavior that the maintainer did not 
account for, and apparently only occurs on kernels with 
`CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE=y`. I successfully exploited the bug 
on x86_64 kernel version 5.16-rc3, but I believe this vulnerability 
exists across different kernel versions and architectures (more on this 
later).

Without further ado:

The bug resides in `linux/net/netfilter/nf_tables_api.c`, in the 
`nft_validate_register_store` and `nft_validate_register_load` routines. 
These routines are used to check whether nft expression parameters supplied 
by the user are sound and won't cause OOB stack accesses when evaluating 
the expression.

From my 5.16-rc3 kernel source 
(d58071a8a76d779eedab38033ae4c821c30295a5: Linux 5.16-rc3):


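static int nft_validate_register_store(const struct nft_ctx *ctx,
				       enum nft_registers reg,
				       const struct nft_data *data,
				       enum nft_data_types type,
				       unsigned int len)
{
	int err;

	switch (reg) {
	/* ... */
	default:
		if (reg < NFT_REG_1 * NFT_REG_SIZE / NFT_REG32_SIZE)
			return -EINVAL;
		if (len == 0)
			return -EINVAL;
		if (reg * NFT_REG32_SIZE + len >
		    sizeof_field(struct nft_regs, data))
			return -ERANGE;

		if (data != NULL && type != NFT_DATA_VALUE)
			return -EINVAL;
		return 0;
	}
}

static int nft_validate_register_load(enum nft_registers reg, unsigned int len)
{
	if (reg < NFT_REG_1 * NFT_REG_SIZE / NFT_REG32_SIZE)
		return -EINVAL;
	if (len == 0)
		return -EINVAL;
	if (reg * NFT_REG32_SIZE + len > sizeof_field(struct nft_regs, data))
		return -ERANGE;

	return 0;
}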

The issue lies in the fact that `enum nft_registers reg` is not 
guaranteed to be only a single byte. As per the C89 specification, 
Enumeration constants: `An identifier declared as an enumeration 
constant has type int.`.

Effectively this means that the compiler is free to emit code that 
operates on `reg` as if it were a 32-bit value. If this is the case (and 
it is on the kernel I tested), a user can forge an expression register 
value that will overflow upon multiplication with `NFT_REG32_SIZE` (4) 
and, upon addition with `len`, will be a value smaller than 
`sizeof_field(struct nft_regs, data)` (0x50). Once this check passes, 
the least significant byte of `reg` can still contain a value that will 
index outside of the bounds of the `struct nft_regs regs` that it will 
later be used with.

Take for example a `reg` value of `0xfffffff8` and a `len` value of 
`0x40`. The expression `reg * 4 + len` will then result in `0xffffffe0 + 
0x40 = 0x20`, which is lower than `0x50`. As a result, a value of 
`0xf8` is accepted as a valid index, and is subsequently assigned to a 
register value in the expression info structs.
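
The arithmetic in this example can be reproduced in userspace. The following is a small sketch with made-up `wrap_*` names; it assumes, as the text describes, that the check runs on a 32-bit value while only the least significant byte is kept as the register index afterwards.

```c
#include <assert.h>
#include <stdint.h>

#define WRAP_REG32_SIZE 4u     /* NFT_REG32_SIZE, per the report */
#define WRAP_REGS_BYTES 0x50u  /* sizeof_field(struct nft_regs, data), per the report */

/* End offset as the check computes it: unsigned 32-bit arithmetic wraps
 * modulo 2^32. */
static inline uint32_t wrap_checked_end(uint32_t reg, uint32_t len)
{
	return reg * WRAP_REG32_SIZE + len;
}

/* Assumption: after validation only the low byte survives as the index. */
static inline uint8_t wrap_stored_index(uint32_t reg)
{
	return (uint8_t)reg;
}

/* Byte offset into the register area that the stored index selects. */
static inline uint32_t wrap_real_offset(uint32_t reg)
{
	return (uint32_t)wrap_stored_index(reg) * WRAP_REG32_SIZE;
}

/* Self-check reproducing the worked example from the text. */
static inline void wrap_worked_example(void)
{
	assert(wrap_checked_end(0xfffffff8u, 0x40u) == 0x20u); /* passes the 0x50 bound */
	assert(wrap_stored_index(0xfffffff8u) == 0xf8);        /* index that is kept */
	assert(wrap_real_offset(0xfffffff8u) > WRAP_REGS_BYTES); /* far outside the area */
}
```

In this model the check sees an end offset of `0x20`, while the stored index `0xf8` selects byte offset `0x3e0`, well past the `0x50`-byte register area.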

Here is a snippet of the x86_64 assembly code that these functions may 
compile to:
Disassembly of section .text:

0000000000002ed0 <nft_validate_register_store>:
    2ed0: e8 00 00 00 00       callq  2ed5
    2ed5: 55                   push   %rbp
    2ed6: 48 89 e5             mov    %rsp,%rbp
    2ed9: 41 54                push   %r12
    2edb: 85 f6                test   %esi,%esi
    2edd: 75 2b                jne    2f0a
    2edf: 81 f9 00 ff ff ff    cmp    $0xffffff00,%ecx
    2ee5: 75 49                jne    2f30
    2ee7: 45 31 e4             xor    %r12d,%r12d
    2eea: 48 85 d2             test   %rdx,%rdx
    2eed: 74 3a                je     2f29
    2eef: 8b 02                mov    (%rdx),%eax
    2ef1: 83 c0 04             add    $0x4,%eax
    2ef4: 83 f8 01             cmp    $0x1,%eax
    2ef7: 77 30                ja     2f29
    2ef9: 48 8b 72 08          mov    0x8(%rdx),%rsi
    2efd: e8 7e da ff ff       callq  980
    2f02: 85 c0                test   %eax,%eax
    2f04: 44 0f 4e e0          cmovle %eax,%r12d
    2f08: eb 1f                jmp    2f29
    2f0a: 83 fe 03             cmp    $0x3,%esi
    2f0d: 76 21                jbe    2f30
    2f0f: 45 85 c0             test   %r8d,%r8d
    2f12: 74 1c                je     2f30
    2f14: 41 8d 04 b0          lea    (%r8,%rsi,4),%eax
    2f18: 83 f8 50             cmp    $0x50,%eax
    2f1b: 77 1b                ja     2f38
    2f1d: 48 85 d2             test   %rdx,%rdx
    2f20: 74 04                je     2f26
    2f22: 85 c9                test   %ecx,%ecx
    2f24: 75 0a                jne    2f30
    2f26: 45 31 e4             xor    %r12d,%r12d
    2f29: 44 89 e0             mov    %r12d,%eax
    2f2c: 41 5c                pop    %r12
    2f2e: 5d                   pop    %rbp
    2f2f: c3                   retq
    2f30: 41 bc ea ff ff ff    mov    $0xffffffea,%r12d
    2f36: eb f1                jmp    2f29
    2f38: 41 bc de ff ff ff    mov    $0xffffffde,%r12d
    2f3e: eb e9                jmp    2f29


The `lea` instruction at `2f14` multiplies `%rsi` (reg) by 4 and adds 
`%r8` (len) to it, keeping only the low 32 bits of the result in `%eax`.
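
To make the difference concrete, the sketch below models the check as the disassembly performs it and contrasts it with a hypothetical stricter variant that bounds each operand before combining them, so that no intermediate value can wrap. The `sketch_*` names are invented for this example, the constants are taken from the listing above, and the stricter variant is not the upstream fix.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define SKETCH_REG32_SIZE 4u
#define SKETCH_REGS_BYTES 0x50u
#define SKETCH_FIRST_REG  4u  /* lowest accepted index (cmp $0x3 / jbe above) */

/* Model of the check as compiled: 32-bit arithmetic that may wrap. */
static inline int sketch_check_wrapping(uint32_t reg, uint32_t len)
{
	if (reg < SKETCH_FIRST_REG)
		return -EINVAL;
	if (len == 0)
		return -EINVAL;
	if (reg * SKETCH_REG32_SIZE + len > SKETCH_REGS_BYTES)
		return -ERANGE;
	return 0;
}

/* Hypothetical stricter variant: reject oversized operands first, then
 * compare len against the space that remains after reg. */
static inline int sketch_check_strict(uint32_t reg, uint32_t len)
{
	const uint32_t max_regs = SKETCH_REGS_BYTES / SKETCH_REG32_SIZE;

	if (reg < SKETCH_FIRST_REG || len == 0)
		return -EINVAL;
	if (reg >= max_regs || len > SKETCH_REGS_BYTES)
		return -ERANGE;
	if (len > SKETCH_REGS_BYTES - reg * SKETCH_REG32_SIZE)
		return -ERANGE;

	/* Cannot wrap any more: reg < 20 and len <= 0x50 at this point. */
	assert(reg * SKETCH_REG32_SIZE + len <= SKETCH_REGS_BYTES);
	return 0;
}
```

With `reg = 0xfffffff8` and `len = 0x40`, the wrapping model returns 0 while the stricter variant returns `-ERANGE`; for in-range inputs such as `reg = 4`, `len = 0x40` both agree.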

I created a working local privilege escalation exploit by using such an 
out of bounds index to copy stack data to the actual register area 
(declared in nf_tables_core.c:nft_do_chain). Then, I wrote a few nft 
rules that drop or accept packets depending on whether the targeted byte 
is greater than the constant comparand in the rule or not. This way I 
could create a binary search procedure that determines the value of 
the leaked byte by registering whether the packet was dropped or not. 
This results in a kernel address leak.

Finally, I used an nft payload expression to write arbitrary data 
supplied in a packet to the stack in order to overwrite a return address 
and execute a ROP chain.

An alternative exploitation technique could be to overwrite the verdict 
register (including its chain pointer) with arbitrary values, as you can 
now obtain a register index of 0 in the same manner.


David Bouman
