| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit ec7cb62d18d854ea09df8b7194e7e710985f8b9a upstream.
DCCP doesn't purge timewait sockets on network namespace shutdown.
So, after net namespace destroyed we could still have an active timer
which will trigger use after free in tw_timer_handler():
BUG: KASAN: use-after-free in tw_timer_handler+0x4a/0xa0 at addr ffff88010e0d1e10
Read of size 8 by task swapper/1/0
Call Trace:
__asan_load8+0x54/0x90
tw_timer_handler+0x4a/0xa0
call_timer_fn+0x127/0x480
expire_timers+0x1db/0x2e0
run_timer_softirq+0x12f/0x2a0
__do_softirq+0x105/0x5b4
irq_exit+0xdd/0xf0
smp_apic_timer_interrupt+0x57/0x70
apic_timer_interrupt+0x90/0xa0
Object at ffff88010e0d1bc0, in cache net_namespace size: 6848
Allocated:
save_stack_trace+0x1b/0x20
kasan_kmalloc+0xee/0x180
kasan_slab_alloc+0x12/0x20
kmem_cache_alloc+0x134/0x310
copy_net_ns+0x8d/0x280
create_new_namespaces+0x23f/0x340
unshare_nsproxy_namespaces+0x75/0xf0
SyS_unshare+0x299/0x4f0
entry_SYSCALL_64_fastpath+0x18/0xad
Freed:
save_stack_trace+0x1b/0x20
kasan_slab_free+0xae/0x180
kmem_cache_free+0xb4/0x350
net_drop_ns+0x3f/0x50
cleanup_net+0x3df/0x450
process_one_work+0x419/0xbb0
worker_thread+0x92/0x850
kthread+0x192/0x1e0
ret_from_fork+0x2e/0x40
Add .exit_batch hook to dccp_v4_ops()/dccp_v6_ops() which will purge
timewait sockets on net namespace destruction and prevent above issue.
Fixes: f2bf415cfed7 ("mib: add net to NET_ADD_STATS_BH")
Change-Id: I092f047f2ae2c13f4610512047c5a75833e165d6
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.16: pass twdr parameter to inet_twsk_purge()
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit 45caeaa5ac0b4b11784ac6f932c0ad4c6b67cda0 upstream.
As Eric Dumazet pointed out this also needs to be fixed in IPv6.
v2: Contains the IPv6 tcp/Ipv6 dccp patches as well.
We have seen a few incidents lately where a dst_enty has been freed
with a dangling TCP socket reference (sk->sk_dst_cache) pointing to that
dst_entry. If the conditions/timings are right a crash then ensues when the
freed dst_entry is referenced later on. A Common crashing back trace is:
#8 [] page_fault at ffffffff8163e648
[exception RIP: __tcp_ack_snd_check+74]
.
.
#9 [] tcp_rcv_established at ffffffff81580b64
#10 [] tcp_v4_do_rcv at ffffffff8158b54a
#11 [] tcp_v4_rcv at ffffffff8158cd02
#12 [] ip_local_deliver_finish at ffffffff815668f4
#13 [] ip_local_deliver at ffffffff81566bd9
#14 [] ip_rcv_finish at ffffffff8156656d
#15 [] ip_rcv at ffffffff81566f06
#16 [] __netif_receive_skb_core at ffffffff8152b3a2
#17 [] __netif_receive_skb at ffffffff8152b608
#18 [] netif_receive_skb at ffffffff8152b690
#19 [] vmxnet3_rq_rx_complete at ffffffffa015eeaf [vmxnet3]
#20 [] vmxnet3_poll_rx_only at ffffffffa015f32a [vmxnet3]
#21 [] net_rx_action at ffffffff8152bac2
#22 [] __do_softirq at ffffffff81084b4f
#23 [] call_softirq at ffffffff8164845c
#24 [] do_softirq at ffffffff81016fc5
#25 [] irq_exit at ffffffff81084ee5
#26 [] do_IRQ at ffffffff81648ff8
Of course it may happen with other NIC drivers as well.
It's found the freed dst_entry here:
224 static bool tcp_in_quickack_mode(struct sock *sk)â©
225 {â©
226 â¹ const struct inet_connection_sock *icsk = inet_csk(sk);â©
227 â¹ const struct dst_entry *dst = __sk_dst_get(sk);â©
228 â©
229 â¹ return (dst && dst_metric(dst, RTAX_QUICKACK)) ||â©
230 â¹ â¹ (icsk->icsk_ack.quick && !icsk->icsk_ack.pingpong);â©
231 }â©
But there are other backtraces attributed to the same freed dst_entry in
netfilter code as well.
All the vmcores showed 2 significant clues:
- Remote hosts behind the default gateway had always been redirected to a
different gateway. A rtable/dst_entry will be added for that host. Making
more dst_entrys with lower reference counts. Making this more probable.
- All vmcores showed a postitive LockDroppedIcmps value, e.g:
LockDroppedIcmps 267
A closer look at the tcp_v4_err() handler revealed that do_redirect() will run
regardless of whether user space has the socket locked. This can result in a
race condition where the same dst_entry cached in sk->sk_dst_entry can be
decremented twice for the same socket via:
do_redirect()->__sk_dst_check()-> dst_release().
Which leads to the dst_entry being prematurely freed with another socket
pointing to it via sk->sk_dst_cache and a subsequent crash.
To fix this skip do_redirect() if usespace has the socket locked. Instead let
the redirect take place later when user space does not have the socket
locked.
The dccp/IPv6 code is very similar in this respect, so fixing it there too.
As Eric Garver pointed out the following commit now invalidates routes. Which
can set the dst->obsolete flag so that ipv4_dst_check() returns null and
triggers the dst_release().
Fixes: ceb3320610d6 ("ipv4: Kill routes during PMTU/redirect updates.")
Cc: Eric Garver <egarver@redhat.com>
Cc: Hannes Sowa <hsowa@redhat.com>
Signed-off-by: Jon Maxwell <jmaxwell37@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commit 6706a97fec963d6cb3f7fc2978ec1427b4651214 upstream.
dccp_v4_err() does not use pskb_may_pull() and might access garbage.
We only need 4 bytes at the beginning of the DCCP header, like TCP,
so the 8 bytes pulled in icmp_socket_deliver() are more than enough.
This patch might allow to process more ICMP messages, as some routers
are still limiting the size of reflected bytes to 28 (RFC 792), instead
of extended lengths (RFC 1812 4.3.2.3)
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
|
| |
|