Commit message  [Author, Age, Files, Lines]
...
* staging: Remove the Android alarm-dev driver  [John Stultz, 2016-11-17, 4 files, -652/+0]
    The functionality provided by the Android alarm-dev driver should now
    be present in the timerfd interface (thanks to Greg Hackmann and Todd
    Poynor).

    As of Lollipop, AOSP can make use of the timerfd if alarm-dev is not
    present (though a fixup for setting the rtc time if rtc0 isn't the
    backing for _ALARM clockids has been applied post-Lollipop).

    Thus, we should be able to remove alarm-dev from staging.

    Cc: Greg Hackmann <ghackmann@google.com>
    Cc: Elliott Hughes <enh@google.com>
    Cc: Todd Poynor <toddpoynor@google.com>
    Cc: Android Kernel Team <kernel-team@android.com>
    Signed-off-by: John Stultz <john.stultz@linaro.org>
    Acked-by: Mark Salyzyn <salyzyn@google.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Change-Id: Ia905d4b809cc1614ddde01ccb791fc56ac292fa7
* staging: Remove the Android logger driver  [John Stultz, 2016-11-17, 3 files, -1395/+0]
    With the release of Lollipop, Android no longer requires the logger
    driver.

    There are three patches which the Android devs still need before
    they drop logger on all their devices:
    [PATCH v4 1/5] pstores: use scnprintf
    [PATCH v2 2/5] pstore: remove superfluous memory size check
    [PATCH 3/5] pstore: handle zero-sized prz in series
    [PATCH v4 4/5] pstore: add pmsg
    [PATCH 5/5] pstore: selinux: add security in-core xattr support for pstore and debugfs

    But these seem to have been acked and are hopefully queued for
    upstream.

    So this patch removes the logger driver from staging.

    Cc: Rom Lemarchand <romlem@google.com>
    Cc: Mark Salyzyn <salyzyn@google.com>
    Cc: Kees Cook <keescook@chromium.org>
    Cc: Android Kernel Team <kernel-team@android.com>
    Signed-off-by: John Stultz <john.stultz@linaro.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Bug: 13505761
    Change-Id: I21b6897f01871851e05b6eb53c7c08a1cb597e3e
    (cherry picked from commit e26c1fa7e7ab4a06242d9fce5368b05e412812e1)
* android: drivers: workaround debugfs race in binder  [Riley Andrews, 2016-11-17, 1 file, -17/+9]
    If a /d/binder/proc/[pid] entry is kept open after Linux has torn down
    the associated process, binder_proc_show can dereference an invalid
    binder_proc that has been stashed in the debugfs inode. Validate that
    the binder_proc ptr passed into binder_proc_show has not been freed by
    looking for it within the global process list whilst the global lock
    is held. If the ptr is not valid, print nothing.

    Bug 19587483
    Change-Id: Ice878c171db51ef9a4879c2f9299a2deb873d255
    Signed-off-by: Riley Andrews <riandrews@android.com>
* ipv6: clean up anycast when an interface is destroyed  [Sabrina Dubroca, 2016-11-17, 3 files, -4/+27]
    If we try to rmmod the driver for an interface while sockets with
    setsockopt(JOIN_ANYCAST) are alive, some refcounts aren't cleaned up
    and we get stuck on:

      unregister_netdevice: waiting for ens3 to become free. Usage count = 1

    If we LEAVE_ANYCAST/close everything before rmmod'ing, there is no
    problem.

    We need to perform a cleanup similar to the one for multicast in
    addrconf_ifdown(how == 1).

    BUG: 18902601
    Bug: 19100303
    Change-Id: I6d51aed5755eb5738fcba91950e7773a1c985d2e
    Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
    Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Patrick Tjin <pattjin@google.com>
* KEYS: Fix handling of stored error in a negatively instantiated user key  [David Howells, 2016-11-17, 3 files, -2/+10]
    If a user key gets negatively instantiated, an error code is cached in
    the payload area. A negatively instantiated key may then be positively
    instantiated by updating it with valid data. However, the ->update key
    type method must be aware that the error code may be there.

    The following may be used to trigger the bug in the user key type:

        keyctl request2 user user "" @u
        keyctl add user user "a" @u

    which manifests itself as:

    BUG: unable to handle kernel paging request at 00000000ffffff8a
    IP: [<ffffffff810a376f>] __call_rcu.constprop.76+0x1f/0x280 kernel/rcu/tree.c:3046
    PGD 7cc30067 PUD 0
    Oops: 0002 [#1] SMP
    Modules linked in:
    CPU: 3 PID: 2644 Comm: a.out Not tainted 4.3.0+ #49
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
    task: ffff88003ddea700 ti: ffff88003dd88000 task.ti: ffff88003dd88000
    RIP: 0010:[<ffffffff810a376f>] [<ffffffff810a376f>] __call_rcu.constprop.76+0x1f/0x280
     [<ffffffff810a376f>] __call_rcu.constprop.76+0x1f/0x280 kernel/rcu/tree.c:3046
    RSP: 0018:ffff88003dd8bdb0 EFLAGS: 00010246
    RAX: 00000000ffffff82 RBX: 0000000000000000 RCX: 0000000000000001
    RDX: ffffffff81e3fe40 RSI: 0000000000000000 RDI: 00000000ffffff82
    RBP: ffff88003dd8bde0 R08: ffff88007d2d2da0 R09: 0000000000000000
    R10: 0000000000000000 R11: ffff88003e8073c0 R12: 00000000ffffff82
    R13: ffff88003dd8be68 R14: ffff88007d027600 R15: ffff88003ddea700
    FS: 0000000000b92880(0063) GS:ffff88007fd00000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
    CR2: 00000000ffffff8a CR3: 000000007cc5f000 CR4: 00000000000006e0
    Stack:
     ffff88003dd8bdf0 ffffffff81160a8a 0000000000000000 00000000ffffff82
     ffff88003dd8be68 ffff88007d027600 ffff88003dd8bdf0 ffffffff810a39e5
     ffff88003dd8be20 ffffffff812a31ab ffff88007d027600 ffff88007d027620
    Call Trace:
     [<ffffffff810a39e5>] kfree_call_rcu+0x15/0x20 kernel/rcu/tree.c:3136
     [<ffffffff812a31ab>] user_update+0x8b/0xb0 security/keys/user_defined.c:129
     [< inline >] __key_update security/keys/key.c:730
     [<ffffffff8129e5c1>] key_create_or_update+0x291/0x440 security/keys/key.c:908
     [< inline >] SYSC_add_key security/keys/keyctl.c:125
     [<ffffffff8129fc21>] SyS_add_key+0x101/0x1e0 security/keys/keyctl.c:60
     [<ffffffff8185f617>] entry_SYSCALL_64_fastpath+0x12/0x6a arch/x86/entry/entry_64.S:185

    Note the error code (-ENOKEY) in EDX.

    A similar bug can be tripped by:

        keyctl request2 trusted user "" @u
        keyctl add trusted user "a" @u

    This should also affect encrypted keys - but that has to be correctly
    parameterised or it will fail with EINVAL before getting to the bit
    that crashes.

    Change-Id: I171d566f431c56208e1fe279f466d2d399a9ac7c
    Reported-by: Dmitry Vyukov <dvyukov@google.com>
    Signed-off-by: David Howells <dhowells@redhat.com>
    Acked-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
    Signed-off-by: James Morris <james.l.morris@oracle.com>
* net: add length argument to skb_copy_and_csum_datagram_iovec  [Sabrina Dubroca, 2016-11-17, 7 files, -7/+14]
    Without this length argument, we can read past the end of the iovec in
    memcpy_toiovec because we have no way of knowing the total length of
    the iovec's buffers.

    This is needed for stable kernels where 89c22d8c3b27 ("net: Fix skb
    csum races when peeking") has been backported but that don't have the
    ioviter conversion, which is almost all the stable trees <= 3.18.

    This also fixes a kernel crash for NFS servers when the client uses
    -onfsvers=3,proto=udp to mount the export.

    Change-Id: I1865e3d7a1faee42a5008a9ad58c4d3323ea4bab
    Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
    Reviewed-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
    (cherry picked from commit 1644c6f70701fea6b3f8bbe3130d5633a5ec14f0)
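The role of the extra length argument can be sketched in userspace. This is only an illustration under stated assumptions, not the kernel code: `copy_packet_bounded`, `capacity`, and `example_bounded_copy` are hypothetical names, and `capacity` stands in for the total size of the iovec's buffers that the caller now passes down.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/uio.h>

/*
 * Userspace sketch, not the kernel function: copy a packet into the
 * buffers described by iov. `capacity` is the total size of those
 * buffers as supplied by the caller (hypothetical parameter name).
 * Without it, a walk driven only by pkt_len would run past the last
 * iovec entry whenever the packet is larger than the buffers.
 */
size_t copy_packet_bounded(struct iovec *iov, size_t capacity,
                           const void *pkt, size_t pkt_len)
{
    const unsigned char *src = pkt;
    size_t todo = pkt_len < capacity ? pkt_len : capacity;
    size_t copied = 0;

    assert(iov != NULL && pkt != NULL);

    while (copied < todo) {
        size_t chunk = iov->iov_len;

        if (chunk > todo - copied)
            chunk = todo - copied;
        memcpy(iov->iov_base, src + copied, chunk);
        copied += chunk;
        iov++;
    }
    return copied;
}

/* Usage: a 12-byte packet copied into two 4-byte buffers stops at 8 bytes. */
void example_bounded_copy(void)
{
    char a[4], b[4];
    struct iovec iov[2] = {
        { .iov_base = a, .iov_len = sizeof a },
        { .iov_base = b, .iov_len = sizeof b },
    };

    assert(copy_packet_bounded(iov, 8, "0123456789AB", 12) == 8);
    assert(memcmp(b, "4567", 4) == 0);
}
```

The sketch trusts the caller to pass a `capacity` no larger than the sum of the `iov_len` fields, which is the same contract the commit message describes.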
* mm: reorder can_do_mlock to fix audit denial  [Jeff Vander Stoep, 2016-11-17, 1 file, -2/+2]
    A userspace call to mmap(MAP_LOCKED) may result in the successful
    locking of memory while also producing a confusing audit log denial.
    can_do_mlock checks capable and rlimit. If either of these returns
    true, can_do_mlock returns true. The capable check leads to an LSM
    hook used by AppArmor and SELinux, which produce the audit denial.
    Reordering so rlimit is checked first eliminates the denial on
    success, only recording a denial when the lock is unsuccessful as a
    result of the denial.

    Change-Id: Ic6e724554a7d566768a594917f160ab5b732108e
    Signed-off-by: Jeff Vander Stoep <jeffv@google.com>
    Acked-by: Nick Kralevich <nnk@google.com>
    Cc: Jeff Vander Stoep <jeffv@google.com>
    Cc: Sasha Levin <sasha.levin@oracle.com>
    Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
    Cc: Rik van Riel <riel@redhat.com>
    Cc: Vlastimil Babka <vbabka@suse.cz>
    Cc: Paul Cassella <cassella@cray.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
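The effect of the reordering can be modelled in a few lines. This is a sketch under assumptions: `struct task_model` and every function below are hypothetical stand-ins, with `audit_denials` counting the denials an LSM hook would have logged when the capability check fails.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the state the two checks look at. */
struct task_model {
    bool has_cap_ipc_lock;       /* would the capability check pass? */
    unsigned long memlock_limit; /* models the RLIMIT_MEMLOCK value */
    int audit_denials;           /* denials an LSM would have logged */
};

/* A failed capability check is what produces the audit record. */
static bool capable_ipc_lock(struct task_model *t)
{
    if (!t->has_cap_ipc_lock)
        t->audit_denials++;
    return t->has_cap_ipc_lock;
}

/* Old order: capability first, so a denial is logged even on success. */
bool can_do_mlock_old(struct task_model *t)
{
    return capable_ipc_lock(t) || t->memlock_limit != 0;
}

/* New order: rlimit first, capability only consulted if rlimit is zero. */
bool can_do_mlock_new(struct task_model *t)
{
    return t->memlock_limit != 0 || capable_ipc_lock(t);
}

/* Usage: same answer, but only the old order logs a spurious denial. */
void example_mlock_order(void)
{
    struct task_model old_t = { false, 65536, 0 };
    struct task_model new_t = { false, 65536, 0 };

    assert(can_do_mlock_old(&old_t) && old_t.audit_denials == 1);
    assert(can_do_mlock_new(&new_t) && new_t.audit_denials == 0);
}
```

When the rlimit is zero and the capability is missing, both orders fail and both log a denial, which matches the behaviour the commit message wants to keep.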
* net: fix iterating over hashtable in tcp_nuke_addr()  [Dmitry Torokhov, 2016-11-17, 1 file, -1/+1]
    The actual size of the tcp hashinfo table is tcp_hashinfo.ehash_mask + 1
    so we need to adjust the loop accordingly to get the sockets hashed
    into the last bucket.

    Change-Id: I796b3c7b4a1a7fa35fba9e5192a4a403eb6e17de
    Signed-off-by: Dmitry Torokhov <dtor@google.com>
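A small sketch of the off-by-one, assuming the one-line change turned a `<` loop bound into `<=` (the log does not show the diff, so that is an assumption). The function names and the plain array of per-bucket counts are hypothetical.

```c
#include <assert.h>

/*
 * The table has ehash_mask + 1 buckets. Looping while b < ehash_mask
 * stops one bucket short, so sockets in the last bucket are never seen.
 */
unsigned int count_sockets_buggy(const unsigned int *buckets,
                                 unsigned int ehash_mask)
{
    unsigned int b, n = 0;

    for (b = 0; b < ehash_mask; b++)
        n += buckets[b];
    return n;
}

/* Looping while b <= ehash_mask visits every bucket, including the last. */
unsigned int count_sockets_fixed(const unsigned int *buckets,
                                 unsigned int ehash_mask)
{
    unsigned int b, n = 0;

    for (b = 0; b <= ehash_mask; b++)
        n += buckets[b];
    return n;
}

/* Usage: with mask 7 there are 8 buckets; the last one holds 2 sockets. */
void example_last_bucket(void)
{
    unsigned int buckets[8] = { 1, 0, 0, 0, 0, 0, 0, 2 };

    assert(count_sockets_buggy(buckets, 7) == 1);
    assert(count_sockets_fixed(buckets, 7) == 3);
}
```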
* bluetooth: Validate socket address length in sco_sock_bind().  [David S. Miller, 2016-11-17, 1 file, -1/+1]
    Change-Id: I890640975f1af64f71947b6a1820249e08f6375b
    Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv6: addrconf: validate new MTU before applying it  [Marcelo Leitner, 2016-11-17, 1 file, -1/+16]
    Currently we don't check if the new MTU is valid or not and this
    allows one to configure an MTU smaller than the minimum allowed by
    RFCs or even bigger than the interface's own MTU, which is a problem
    as it may lead to packet drops.

    If you have a daemon like NetworkManager running, this may be
    exploited by remote attackers by forging RA packets with an invalid
    MTU, possibly leading to a DoS. (NetworkManager currently only
    validates for values too small, but not for too big ones.)

    The fix is just to make sure the new value is valid. That is, between
    IPV6_MIN_MTU and the interface's MTU.

    Note that a similar check is already performed at
    ndisc_router_discovery(), for when the kernel itself parses the RA.

    Change-Id: I2a10453ba3ef7709bf8f943ac85bd44368c23311
    Signed-off-by: Marcelo Ricardo Leitner <mleitner@redhat.com>
    Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
    Signed-off-by: David S. Miller <davem@davemloft.net>
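The range check described above fits in one function. A minimal sketch, assuming a free-standing helper named `validate_ipv6_mtu` (hypothetical) and the IPv6 minimum MTU of 1280 bytes; the real patch applies the check inside the addrconf sysctl handler, which is not shown in this log.

```c
#include <assert.h>
#include <errno.h>

#define IPV6_MIN_MTU 1280 /* minimum link MTU required for IPv6 */

/*
 * Accept new_mtu only if it lies between the IPv6 minimum and the
 * MTU of the underlying interface, inclusive on both ends.
 */
int validate_ipv6_mtu(unsigned int new_mtu, unsigned int dev_mtu)
{
    if (new_mtu < IPV6_MIN_MTU || new_mtu > dev_mtu)
        return -EINVAL;
    return 0;
}

/* Usage: both out-of-range directions are rejected. */
void example_mtu_check(void)
{
    assert(validate_ipv6_mtu(1400, 1500) == 0);
    assert(validate_ipv6_mtu(1279, 1500) == -EINVAL); /* too small */
    assert(validate_ipv6_mtu(9000, 1500) == -EINVAL); /* above the device MTU */
}
```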
* msm: null pointer dereferencing  [Wish Wu, 2016-11-17, 2 files, -2/+9]
    Prevent unintended kernel NULL pointer dereferencing.

    Original code:
        hlist_del_rcu(&event->hlist_entry);

    Fix: add a pointer check:
        if (!hlist_unhashed(&p_event->hlist_entry))
            hlist_del_rcu(&p_event->hlist_entry);

    Bug: 25364034
    Change-Id: Ieda6d8f4bb567827fa6c7709e9e729905c6c3882
    Signed-off-by: Yuan Lin <yualin@google.com>
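Why the unhashed check avoids a NULL dereference can be seen in a userspace model of the list. This is a sketch, not the kernel implementation: the names mirror the kernel's hlist helpers, but `hlist_del_unchecked` and `hlist_del_if_hashed` are hypothetical, and the delete here resets the node instead of poisoning it as the RCU variant does.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal userspace model of a kernel-style hlist. */
struct hlist_node { struct hlist_node *next, **pprev; };
struct hlist_head { struct hlist_node *first; };

/* A node that was never added (or was reset) has pprev == NULL. */
int hlist_unhashed(const struct hlist_node *n)
{
    return n->pprev == NULL;
}

void hlist_add_head(struct hlist_node *n, struct hlist_head *h)
{
    n->next = h->first;
    if (h->first)
        h->first->pprev = &n->next;
    h->first = n;
    n->pprev = &h->first;
}

/* Writes through n->pprev, so it faults if the node is unhashed. */
static void hlist_del_unchecked(struct hlist_node *n)
{
    *n->pprev = n->next;
    if (n->next)
        n->next->pprev = n->pprev;
    n->next = NULL;
    n->pprev = NULL;
}

/* The guarded form from the fix: skip nodes that are not on a list. */
int hlist_del_if_hashed(struct hlist_node *n)
{
    if (hlist_unhashed(n))
        return 0;
    hlist_del_unchecked(n);
    return 1;
}

/* Usage: deleting a never-added node is a no-op instead of a crash. */
void example_guarded_delete(void)
{
    struct hlist_head head = { NULL };
    struct hlist_node n = { NULL, NULL };

    assert(hlist_del_if_hashed(&n) == 0);
    hlist_add_head(&n, &head);
    assert(hlist_del_if_hashed(&n) == 1);
    assert(head.first == NULL);
}
```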
* mnt: Fail collect_mounts when applied to unmounted mounts  [Eric W. Biederman, 2016-11-17, 1 file, -2/+5]
    The only users of collect_mounts are in audit_tree.c.

    In audit_trim_trees and audit_add_tree_rule the path passed into
    collect_mounts is generated from kern_path passed an audit_tree
    pathname which is guaranteed to be an absolute path. In those cases
    collect_mounts is obviously intended to work on mounted paths and if a
    race results in paths that are unmounted when collect_mounts is called
    it is reasonable to fail early.

    The paths passed into audit_tag_tree don't have the absolute path
    check. But they are used to play with fsnotify and otherwise interact
    with the audit_trees, so again operating only on mounted paths appears
    reasonable.

    Avoid having to worry about what happens when we try and audit
    unmounted filesystems by restricting collect_mounts to mounts that
    appear in the mount tree.

    Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
* net: add validation for the socket syscall protocol argument  [Hannes Frederic Sowa, 2016-11-17, 3 files, -0/+9]
    郭永刚 reported that one could simply crash the kernel as root by
    using a simple program:

        int socket_fd;
        struct sockaddr_in addr;
        addr.sin_port = 0;
        addr.sin_addr.s_addr = INADDR_ANY;
        addr.sin_family = 10;

        socket_fd = socket(10,3,0x40000000);
        connect(socket_fd , &addr,16);

    AF_INET, AF_INET6 sockets actually only support 8-bit protocol
    identifiers. inet_sock's skc_protocol field thus is sized accordingly,
    thus larger protocol identifiers simply cut off the higher bits and
    store a zero in the protocol fields.

    This could lead to e.g. a NULL function pointer dereference because,
    as a result of the cut off, inet_num is zero and we call down to
    inet_autobind, which is NULL for raw sockets.

    kernel: Call Trace:
    kernel: [<ffffffff816db90e>] ? inet_autobind+0x2e/0x70
    kernel: [<ffffffff816db9a4>] inet_dgram_connect+0x54/0x80
    kernel: [<ffffffff81645069>] SYSC_connect+0xd9/0x110
    kernel: [<ffffffff810ac51b>] ? ptrace_notify+0x5b/0x80
    kernel: [<ffffffff810236d8>] ? syscall_trace_enter_phase2+0x108/0x200
    kernel: [<ffffffff81645e0e>] SyS_connect+0xe/0x10
    kernel: [<ffffffff81779515>] tracesys_phase2+0x84/0x89

    I found no particular commit which introduced this problem.

    Change-Id: I653fad90da54908144cc8916c2dccb1fa6f14eed
    CVE: CVE-2015-8543
    Cc: Cong Wang <cwang@twopensource.com>
    Reported-by: 郭永刚 <guoyonggang@360.cn>
    Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>
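The truncation described above is easy to reproduce outside the kernel. A minimal sketch, assuming an 8-bit protocol field and an upper bound of 256; `stored_protocol`, `check_protocol`, and `PROTOCOL_LIMIT` are hypothetical names, not the kernel's.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define PROTOCOL_LIMIT 256 /* an 8-bit field can hold 0..255 */

/* What an 8-bit field keeps of the value passed to socket(). */
uint8_t stored_protocol(int protocol)
{
    return (uint8_t)protocol;
}

/* Reject anything that would not survive the narrowing unchanged. */
int check_protocol(int protocol)
{
    if (protocol < 0 || protocol >= PROTOCOL_LIMIT)
        return -EINVAL;
    return 0;
}

/* Usage: 0x40000000 silently becomes 0 unless it is rejected first. */
void example_protocol_check(void)
{
    assert(stored_protocol(0x40000000) == 0);
    assert(check_protocol(0x40000000) == -EINVAL);
    assert(check_protocol(17) == 0);
}
```

Validating before the narrowing store is the design point: once the value has been truncated to zero there is nothing left to check.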
* netfilter: x_tables: check for size overflow  [Florian Westphal, 2016-11-17, 1 file, -0/+4]
    Ben Hawkes says:
     integer overflow in xt_alloc_table_info, which on 32-bit systems can
     lead to small structure allocation and a copy_from_user based heap
     corruption.

    Change-Id: I13c554c630651a37e3f6a195e9a5f40cddcb29a1
    Reported-by: Ben Hawkes <hawkes@google.com>
    Signed-off-by: Florian Westphal <fw@strlen.de>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
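The class of bug named above, a header size added to a caller-supplied payload size, can be guarded with a wrap-around check. This is a sketch under assumptions: `struct table_info_hdr` and `table_alloc_size` are hypothetical, and the log does not show the four lines the patch actually adds.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fixed-size header that precedes the variable payload. */
struct table_info_hdr {
    unsigned int size;
    unsigned int number;
};

/*
 * Compute header + payload. Unsigned addition wraps on overflow, so a
 * sum smaller than the header alone means the true size did not fit
 * and the allocation would have been far too small.
 */
int table_alloc_size(size_t payload, size_t *total)
{
    size_t sz = sizeof(struct table_info_hdr) + payload;

    if (sz < sizeof(struct table_info_hdr))
        return -1;
    *total = sz;
    return 0;
}

/* Usage: a huge payload is refused instead of wrapping to a tiny size. */
void example_size_overflow(void)
{
    size_t total = 0;

    assert(table_alloc_size(64, &total) == 0);
    assert(total == sizeof(struct table_info_hdr) + 64);
    assert(table_alloc_size(SIZE_MAX, &total) == -1);
}
```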
* ipv4: Don't do expensive useless work during inetdev destroy.  [David S. Miller, 2016-11-17, 3 files, -2/+18]
    When an inetdev is destroyed, every address assigned to the interface
    is removed. And in this scenario we do two pointless things which can
    be very expensive if the number of assigned interfaces is large:

    1) Address promotion. We are deleting all addresses, so there is no
       point in doing this.

    2) A full nf conntrack table purge for every address. We only need to
       do this once, as is already caught by the existing
       masq_dev_notifier so masq_inet_event() can skip this.

    Change-Id: I4b2a3ed665543728451c21465fb90ec89f739135
    Reported-by: Solar Designer <solar@openwall.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Tested-by: Cyrill Gorcunov <gorcunov@openvz.org>
* tcp: fix use after free in tcp_xmit_retransmit_queue()  [Eric Dumazet, 2016-11-17, 1 file, -0/+2]
    When tcp_sendmsg() allocates a fresh and empty skb, it puts it at the
    tail of the write queue using tcp_add_write_queue_tail()

    Then it attempts to copy user data into this fresh skb.

    If the copy fails, we undo the work and remove the fresh skb.

    Unfortunately, this undo lacks the change done to tp->highest_sack and
    we can leave a dangling pointer (to a freed skb)

    Later, tcp_xmit_retransmit_queue() can dereference this pointer and
    access freed memory. For regular kernels where memory is not unmapped,
    this might cause SACK bugs because tcp_highest_sack_seq() is buggy,
    returning garbage instead of tp->snd_nxt, but with various debug
    features like CONFIG_DEBUG_PAGEALLOC, this can crash the kernel.

    This bug was found by Marco Grassi thanks to syzkaller.

    Change-Id: I264f97d30d0a623011d9ee811c63fa0e0c2149a2
    Fixes: 6859d49475d4 ("[TCP]: Abstract tp->highest_sack accessing & point to next skb")
    Reported-by: Marco Grassi <marco.gra@gmail.com>
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
    Cc: Yuchung Cheng <ycheng@google.com>
    Cc: Neal Cardwell <ncardwell@google.com>
    Acked-by: Neal Cardwell <ncardwell@google.com>
    Reviewed-by: Cong Wang <xiyou.wangcong@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
* KEYS: Fix short sprintf buffer in /proc/keys show function  [David Howells, 2016-11-17, 1 file, -1/+1]
    Fix a short sprintf buffer in proc_keys_show(). If the gcc stack
    protector is turned on, this can cause a panic due to stack
    corruption.

    The problem is that xbuf[] is not big enough to hold a 64-bit timeout
    rendered as weeks:

        (gdb) p 0xffffffffffffffffULL/(60*60*24*7)
        $2 = 30500568904943

    That's 14 chars plus NUL, not 11 chars plus NUL.

    Expand the buffer to 16 chars.

    I think the unpatched code apparently works if the stack-protector is
    not enabled because on a 32-bit machine the buffer won't be overflowed
    and on a 64-bit machine there's a 64-bit aligned pointer at one side
    and an int that isn't checked again on the other side.

    The panic incurred looks something like:

    Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: ffffffff81352ebe
    CPU: 0 PID: 1692 Comm: reproducer Not tainted 4.7.2-201.fc24.x86_64 #1
    Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
     0000000000000086 00000000fbbd2679 ffff8800a044bc00 ffffffff813d941f
     ffffffff81a28d58 ffff8800a044bc98 ffff8800a044bc88 ffffffff811b2cb6
     ffff880000000010 ffff8800a044bc98 ffff8800a044bc30 00000000fbbd2679
    Call Trace:
     [<ffffffff813d941f>] dump_stack+0x63/0x84
     [<ffffffff811b2cb6>] panic+0xde/0x22a
     [<ffffffff81352ebe>] ? proc_keys_show+0x3ce/0x3d0
     [<ffffffff8109f7f9>] __stack_chk_fail+0x19/0x30
     [<ffffffff81352ebe>] proc_keys_show+0x3ce/0x3d0
     [<ffffffff81350410>] ? key_validate+0x50/0x50
     [<ffffffff8134db30>] ? key_default_cmp+0x20/0x20
     [<ffffffff8126b31c>] seq_read+0x2cc/0x390
     [<ffffffff812b6b12>] proc_reg_read+0x42/0x70
     [<ffffffff81244fc7>] __vfs_read+0x37/0x150
     [<ffffffff81357020>] ? security_file_permission+0xa0/0xc0
     [<ffffffff81246156>] vfs_read+0x96/0x130
     [<ffffffff81247635>] SyS_read+0x55/0xc0
     [<ffffffff817eb872>] entry_SYSCALL_64_fastpath+0x1a/0xa4

    Change-Id: I0787d5a38c730ecb75d3c08f28f0ab36295d59e7
    Reported-by: Ondrej Kozina <okozina@redhat.com>
    Signed-off-by: David Howells <dhowells@redhat.com>
    Tested-by: Ondrej Kozina <okozina@redhat.com>
    (cherry picked from commit 4e2e424f973fb174c0bf7750660e6cecb9acd68a)
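The arithmetic in the commit message can be checked with a bounded formatter. A minimal sketch: `format_timeout_weeks` is a hypothetical helper that prints only the digits, and it uses snprintf so that a too-small buffer truncates instead of overflowing the stack the way the unbounded call did.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define SECONDS_PER_WEEK (60ULL * 60 * 24 * 7)

/*
 * Render a timeout in seconds as whole weeks. The return value is the
 * number of characters the full result needs, excluding the NUL, even
 * when the buffer is too small to hold it.
 */
int format_timeout_weeks(char *buf, size_t size, unsigned long long secs)
{
    return snprintf(buf, size, "%llu", secs / SECONDS_PER_WEEK);
}

/* Usage: the largest 64-bit timeout needs 14 digits plus the NUL. */
void example_weeks_buffer(void)
{
    char small[12]; /* room for 11 chars plus NUL, as before the fix */
    char big[16];   /* the expanded size */

    assert(format_timeout_weeks(small, sizeof small, ~0ULL) == 14);
    assert(strlen(small) == 11); /* truncated, not overflowed */
    assert(format_timeout_weeks(big, sizeof big, ~0ULL) == 14);
    assert(strcmp(big, "30500568904943") == 0);
}
```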
* Revert "security/selinux: force permissive"  [fire855, 2016-11-17, 1 file, -2/+0]
    This reverts commit d466c7e32a7d74f0e14b018a92d949f95268db32.
* Revert "hand-pick: mediatek:remove unnecessary sido call flow"  [Moyster, 2016-11-11, 1 file, -0/+12]
    This reverts commit 9a7858491639342b5d3c8d496d3b9370d2330591.
* wlan: WiFi Direct CTS fixsdragonpt2016-11-116-11/+66
    Cylen Yao <cylen.yao@mediatek.com>

    Details:
    1. WiFi Direct CTS tests fail because the supplicant and the driver
       can get out of sync in the following cases:
       1.1 The supplicant requests a channel when doing a P2P listen, but
           the driver/firmware has not yet switched to the target channel
           when the supplicant gets the remain-on-channel credit back from
           the driver's remain-on-channel API. This leaves the supplicant
           and the driver out of sync, so the supplicant randomly fails
           to enter the listen state.
       1.2 The supplicant and the driver also get out of sync on
           management frame TX: the supplicant moves on to other tasks
           as soon as the driver's mgmt_tx API returns, but the driver
           has not actually transmitted the frame yet. In the extreme
           case, the driver drops the second management frame if the
           previous one has not been transmitted yet, as in the group
           owner test case.
* tty: Properly fix memleak of alloc_pid  [dragonpt, 2016-11-11, 1 file, -2/+2]
    Cylen Yao <cylen.yao@mediatek.com>
    bug: 7845126

    The MT67x2 memleak is due to an unreleased pid->count, which is
    modified in get_pid() (pid->count++) and put_pid() (pid->count--).

    The race condition is as follows:

    task[dumpsys]                          task[adbd]
    in disassociate_ctty()                 in tty_signal_session_leader()
    -----------------------                -------------------------
    tty = get_current_tty();
    // tty is not NULL
    ...
    spin_lock_irq(&current->sighand->siglock);
    put_pid(current->signal->tty_old_pgrp);
    current->signal->tty_old_pgrp = NULL;
    spin_unlock_irq(&current->sighand->siglock);

                                           spin_lock_irq(&p->sighand->siglock);
                                           ...
                                           p->signal->tty = NULL;
                                           ...
                                           spin_unlock_irq(&p->sighand->siglock);

    tty = get_current_tty();
    // tty NULL, goto else branch by accident.
    if (tty) {
        ...
        put_pid(tty_session);
        put_pid(tty_pgrp);
        ...
    } else {
        print msg
    }

    In task[dumpsys], in disassociate_ctty(), tty is set to NULL by
    task[adbd] in tty_signal_session_leader(), so it takes the else branch
    and skips put_pid(), causing the memleak.

    Moving spin_unlock(sighand->siglock) after get_current_tty() avoids
    the race and fixes the memleak.
* KBUILD_CFLAGS: Some more tuning  [cm, 2016-11-11, 1 file, -1/+1]
    use -ftree-vectorize
* KBUILD_CFLAGS: Some tuning  [cm, 2016-11-11, 1 file, -1/+1]
    use Cortex-A53 and ARM platform-appropriate optimization flags
* Fix "Security Vulnerability - kernel info leak of wifi driver"  [cm, 2016-11-11, 1 file, -7/+14]
* nf: IDLETIMER: Adds the uid field in the msg  [Ruchi Kandoi, 2016-11-11, 1 file, -5/+32]
    Message notifications contain an additional uid field. This field
    represents the uid that was responsible for waking the radio, and
    hence it is present only in notifications stating that the radio is
    now active.

    Change-Id: I18fc73eada512e370d7ab24fc9f890845037b729
    Signed-off-by: Ruchi Kandoi <kandoiruchi@google.com>
    Bug: 20264396
* stackprotector: Introduce CONFIG_CC_STACKPROTECTOR_STRONG  [Kees Cook, 2016-11-11, 3 files, -4/+69]
    This changes the stack protector config option into a choice of
    "None", "Regular", and "Strong":

       CONFIG_CC_STACKPROTECTOR_NONE
       CONFIG_CC_STACKPROTECTOR_REGULAR
       CONFIG_CC_STACKPROTECTOR_STRONG

    "Regular" means the old CONFIG_CC_STACKPROTECTOR=y option.

    "Strong" is a new mode introduced by this patch. With "Strong" the
    kernel is built with -fstack-protector-strong (available in
    gcc 4.9 and later). This option increases the coverage of the stack
    protector without the heavy performance hit of -fstack-protector-all.

    For reference, the stack protector options available in gcc are:

    -fstack-protector-all:
      Adds the stack-canary saving prefix and stack-canary checking
      suffix to _all_ function entry and exit. Results in substantial
      use of stack space for saving the canary for deep stack users
      (e.g. historically xfs), and measurable (though shockingly still
      low) performance hit due to all the saving/checking. Really not
      suitable for sane systems, and was entirely removed as an option
      from the kernel many years ago.

    -fstack-protector:
      Adds the canary save/check to functions that define an 8
      (--param=ssp-buffer-size=N, N=8 by default) or more byte local
      char array. Traditionally, stack overflows happened with
      string-based manipulations, so this was a way to find those
      functions. Very few total functions actually get the canary; no
      measurable performance or size overhead.

    -fstack-protector-strong
      Adds the canary for a wider set of functions, since it's not
      just those with strings that have ultimately been vulnerable to
      stack-busting. With this superset, more functions end up with a
      canary, but it still remains small compared to all functions
      with only a small change in performance. Based on the original
      design document, a function gets the canary when it contains any
      of:

        - local variable's address used as part of the right hand side
          of an assignment or function argument
        - local variable is an array (or union containing an array),
          regardless of array type or length
        - uses register local variables

      https://docs.google.com/a/google.com/document/d/1xXBH6rRZue4f296vGt9YQcuLVQHeE516stHwt8M9xyU

    Find below a comparison of "size" and "objdump" output when built with
    gcc-4.9 in three configurations:

      - defconfig
            11430641 kernel text size
            36110 function bodies

      - defconfig + CONFIG_CC_STACKPROTECTOR_REGULAR
            11468490 kernel text size (+0.33%)
            1015 of 36110 functions are stack-protected (2.81%)

      - defconfig + CONFIG_CC_STACKPROTECTOR_STRONG via this patch
            11692790 kernel text size (+2.24%)
            7401 of 36110 functions are stack-protected (20.5%)

    With -strong, ARM's compressed boot code now triggers stack
    protection, so a static guard was added. Since this is only used
    during decompression and was never used before, the exposure
    here is very small. Once it switches to the full kernel, the
    stack guard is back to normal.

    Chrome OS has been using -fstack-protector-strong for its kernel
    builds for the last 8 months with no problems.

    Signed-off-by: Kees Cook <keescook@chromium.org>
    Cc: Arjan van de Ven <arjan@linux.intel.com>
    Cc: Michal Marek <mmarek@suse.cz>
    Cc: Russell King <linux@arm.linux.org.uk>
    Cc: Ralf Baechle <ralf@linux-mips.org>
    Cc: Paul Mundt <lethal@linux-sh.org>
    Cc: James Hogan <james.hogan@imgtec.com>
    Cc: Stephen Rothwell <sfr@canb.auug.org.au>
    Cc: Shawn Guo <shawn.guo@linaro.org>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Cc: Andrew Morton <akpm@linux-foundation.org>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: linux-arm-kernel@lists.infradead.org
    Cc: linux-mips@linux-mips.org
    Cc: linux-arch@vger.kernel.org
    Link: http://lkml.kernel.org/r/1387481759-14535-3-git-send-email-keescook@chromium.org
    [ Improved the changelog and descriptions some more. ]
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
    Git-commit: 8779657d29c0ebcc0c94ede4df2f497baf1b563f
    Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
    Change-Id: I0c53785c54b9c2bedd6134fb959b59d1d1afb0ef
    Signed-off-by: David Brown <davidb@codeaurora.org>
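The three selection rules quoted in the commit message can be illustrated with ordinary functions. This is a hedged sketch: the function names are hypothetical, and the comments state which mode would be expected to instrument each function according to those rules, not what a particular compiler version actually emits after optimization.

```c
#include <assert.h>
#include <string.h>

/* No arrays and no address-taken locals: only -all would cover this. */
int add(int a, int b)
{
    return a + b;
}

/*
 * Local char array of 8 bytes or more: expected to get a canary with
 * plain -fstack-protector, and therefore with -strong as well.
 */
size_t copy_name(const char *src)
{
    char buf[16];

    strncpy(buf, src, sizeof buf - 1);
    buf[sizeof buf - 1] = '\0';
    return strlen(buf);
}

/*
 * Small non-char array: outside the regular rule, but -strong covers
 * local arrays regardless of type or length.
 */
int sum_pair(int a, int b)
{
    int v[2];

    v[0] = a;
    v[1] = b;
    return v[0] + v[1];
}

static void store(int *out, int value)
{
    *out = value;
}

/* A local's address passed as a function argument: a -strong case. */
int via_pointer(int value)
{
    int local;

    store(&local, value);
    return local;
}

/* Usage: the functions behave normally; only the prologue and epilogue differ. */
void example_canary_candidates(void)
{
    assert(add(2, 3) == 5);
    assert(copy_name("binder") == 6);
    assert(sum_pair(1, 2) == 3);
    assert(via_pointer(7) == 7);
}
```

Building the file once with -fstack-protector and once with -fstack-protector-strong and comparing the disassembly is one way to see which functions gained a stack check on a given toolchain.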
* stackprotector: Unify the HAVE_CC_STACKPROTECTOR logic between architectures  [Kees Cook, 2016-11-11, 9 files, -46/+43]
    Instead of duplicating the CC_STACKPROTECTOR Kconfig and
    Makefile logic in each architecture, switch to using
    HAVE_CC_STACKPROTECTOR and keep everything in one place. This
    retains the x86-specific bug verification scripts.

    Signed-off-by: Kees Cook <keescook@chromium.org>
    Cc: Arjan van de Ven <arjan@linux.intel.com>
    Cc: Michal Marek <mmarek@suse.cz>
    Cc: Russell King <linux@arm.linux.org.uk>
    Cc: Ralf Baechle <ralf@linux-mips.org>
    Cc: Paul Mundt <lethal@linux-sh.org>
    Cc: James Hogan <james.hogan@imgtec.com>
    Cc: Stephen Rothwell <sfr@canb.auug.org.au>
    Cc: Shawn Guo <shawn.guo@linaro.org>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Cc: Andrew Morton <akpm@linux-foundation.org>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: linux-arm-kernel@lists.infradead.org
    Cc: linux-mips@linux-mips.org
    Cc: linux-arch@vger.kernel.org
    Link: http://lkml.kernel.org/r/1387481759-14535-2-git-send-email-keescook@chromium.org
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
    [davidb@codeaurora.org: Simple Kconfig merge resolution]
    Git-commit: 19952a92037e752f9d3bbbad552d596f9a56e146
    Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
    Change-Id: I6e430de3c79306724e90ea1178f242145c39f059
    Signed-off-by: David Brown <davidb@codeaurora.org>

    Conflicts:
        arch/x86/Kconfig
* defconfig: enable CC_STACKPROTECTOR-STRONG  [Moyster, 2016-11-11, 1 file, -0/+4]
* hardcode LINUX_COMPILE_BY and COMPILE_HOST to linux@linux-user  [Moyster, 2016-11-08, 1 file, -0/+11]
* defconfig : So wrong...  [Moyster, 2016-11-07, 1 file, -2/+2]
    Used to be, I could make the pieces fit
    Break the edges, force fit all of this
    How could I ever be so wrong?
    At our base, we are doomed once we begin
    Kinda makes you wonder, "What's the sense?"
    How could I ever be so wrong?
* regen defconfig  [Moyster, 2016-11-07, 1 file, -14/+6]
* defconfig: don't build old ext3  [Moyster, 2016-11-07, 1 file, -3/+3]
* Regen Defconfig: Remove Deprecated FS  [scafroglia93, 2016-11-07, 1 file, -3/+3]
* proc: much faster /proc/vmstat  [Francisco Franco, 2016-11-07, 1 file, -1/+3]
    Every current KDE system has process named ksysguardd polling files
    below once in several seconds:

        $ strace -e trace=open -p $(pidof ksysguardd)
        Process 1812 attached
        open("/etc/mtab", O_RDONLY|O_CLOEXEC) = 8
        open("/etc/mtab", O_RDONLY|O_CLOEXEC) = 8
        open("/proc/net/dev", O_RDONLY) = 8
        open("/proc/net/wireless", O_RDONLY) = -1 ENOENT (No such file or directory)
        open("/proc/stat", O_RDONLY) = 8
        open("/proc/vmstat", O_RDONLY) = 8

    Hell knows what it is doing but speed up reading /proc/vmstat by 33%!

    Benchmark is open+read+close 1.000.000 times.

    BEFORE
    $ perf stat -r 10 taskset -c 3 ./proc-vmstat

     Performance counter stats for 'taskset -c 3 ./proc-vmstat' (10 runs):

      13146.768464      task-clock (msec)         #    0.960 CPUs utilized            ( +-  0.60% )
                15      context-switches          #    0.001 K/sec                    ( +-  1.41% )
                 1      cpu-migrations            #    0.000 K/sec                    ( +- 11.11% )
               104      page-faults               #    0.008 K/sec                    ( +-  0.57% )
    45,489,799,349      cycles                    #    3.460 GHz                      ( +-  0.03% )
     9,970,175,743      stalled-cycles-frontend   #   21.92% frontend cycles idle     ( +-  0.10% )
     2,800,298,015      stalled-cycles-backend    #    6.16% backend cycles idle      ( +-  0.32% )
    79,241,190,850      instructions              #    1.74  insn per cycle
                                                  #    0.13  stalled cycles per insn  ( +-  0.00% )
    17,616,096,146      branches                  # 1339.956 M/sec                    ( +-  0.00% )
       176,106,232      branch-misses             #    1.00% of all branches          ( +-  0.18% )

      13.691078109 seconds time elapsed                                          ( +-  0.03% )
      ^^^^^^^^^^^^

    AFTER
    $ perf stat -r 10 taskset -c 3 ./proc-vmstat

     Performance counter stats for 'taskset -c 3 ./proc-vmstat' (10 runs):

       8688.353749      task-clock (msec)         #    0.950 CPUs utilized            ( +-  1.25% )
                10      context-switches          #    0.001 K/sec                    ( +-  2.13% )
                 1      cpu-migrations            #    0.000 K/sec
               104      page-faults               #    0.012 K/sec                    ( +-  0.56% )
    30,384,010,730      cycles                    #    3.497 GHz                      ( +-  0.07% )
    12,296,259,407      stalled-cycles-frontend   #   40.47% frontend cycles idle     ( +-  0.13% )
     3,370,668,651      stalled-cycles-backend    #   11.09% backend cycles idle      ( +-  0.69% )
    28,969,052,879      instructions              #    0.95  insn per cycle
                                                  #    0.42  stalled cycles per insn  ( +-  0.01% )
     6,308,245,891      branches                  #  726.058 M/sec                    ( +-  0.00% )
       214,685,502      branch-misses             #    3.40% of all branches          ( +-  0.26% )

       9.146081052 seconds time elapsed                                          ( +-  0.07% )
       ^^^^^^^^^^^

    vsnprintf() is slow because:

    1. format_decode() is busy looking for format specifier: 2 branches
       per character (not in this case, but in others)

    2. approximately million branches while parsing format mini language
       and everywhere

    3. just look at what string() does

    /proc/vmstat is good case because most of its content are strings

    Link: http://lkml.kernel.org/r/20160806125455.GA1187@p183.telecom.by
    Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
    Cc: Joe Perches <joe@perches.com>
    Cc: Andi Kleen <andi@firstfloor.org>
    Cc: Al Viro <viro@zeniv.linux.org.uk>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Francisco Franco <franciscofranco.1990@gmail.com>
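The general idea behind the speed-up, avoiding the format-string parser for output that is mostly fixed text, can be sketched in userspace. This is an illustration under assumptions and not the kernel change itself: `put_decimal` and `emit_stat_line` are hypothetical helpers, and the caller is assumed to supply a buffer large enough for the name, a space, up to 20 digits, a newline, and the NUL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Write v in decimal at dst (no NUL) and return the digit count. */
static size_t put_decimal(char *dst, unsigned long long v)
{
    char tmp[20]; /* a 64-bit value has at most 20 decimal digits */
    size_t n = 0, i;

    do {
        tmp[n++] = (char)('0' + v % 10);
        v /= 10;
    } while (v != 0);

    for (i = 0; i < n; i++)
        dst[i] = tmp[n - 1 - i];
    return n;
}

/*
 * Emit "name value\n" with plain copies instead of a format string.
 * Returns the length written, excluding the terminating NUL.
 */
size_t emit_stat_line(char *dst, const char *name, unsigned long long value)
{
    size_t len = strlen(name);

    memcpy(dst, name, len);
    dst[len++] = ' ';
    len += put_decimal(dst + len, value);
    dst[len++] = '\n';
    dst[len] = '\0';
    return len;
}

/* Usage: the output matches what the printf-style path would produce. */
void example_stat_line(void)
{
    char fast[64], slow[64];

    emit_stat_line(fast, "nr_free_pages", 12345ULL);
    snprintf(slow, sizeof slow, "%s %llu\n", "nr_free_pages", 12345ULL);
    assert(strcmp(fast, slow) == 0);
}
```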
* arm64: kill off the libgcc dependency  [Francisco Franco, 2016-11-07, 1 file, -3/+0]
    The arm64 kernel builds fine without the libgcc. Actually it should
    not be used at all in the kernel. The following are the reasons
    indicated by Russell King:

      Although libgcc is part of the compiler, libgcc is built with the
      expectation that it will be running in userland - it expects to
      link to a libc. That's why you can't build libgcc without having
      the glibc headers around.

      [...]

      Meanwhile, having the kernel build the compiler support functions
      that it needs ensures that (a) we know what compiler support
      functions are being used, (b) we know the implementation of those
      support functions are sane for use in the kernel, (c) we can build
      them with appropriate compiler flags for best performance, and (d)
      we remove an unnecessary dependency on the build toolchain.

    Signed-off-by: Kevin Hao <haokexin@gmail.com>
    Acked-by: Will Deacon <will.deacon@arm.com>
    Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
    (cherry picked from commit d67703a)
    Signed-off-by: Alex Shi <alex.shi@linaro.org>
    Signed-off-by: Francisco Franco <franciscofranco.1990@gmail.com>
* proc: Remove verifiedbootstate flag from /proc/cmdline  [Sultanxda, 2016-11-07, 1 file, -1/+27]
    Userspace parses this and sets the ro.boot.verifiedbootstate prop
    according to the value that this flag has. When
    ro.boot.verifiedbootstate is not 'green', SafetyNet is tripped and
    fails the CTS test.

    Hide verifiedbootstate from /proc/cmdline in order to fix the failed
    SafetyNet CTS check.

    Signed-off-by: Sultanxda <sultanxda@gmail.com>
    Signed-off-by: Francisco Franco <franciscofranco.1990@gmail.com>
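Dropping one `key=value` token from a space-separated command line is a short string exercise. A minimal userspace sketch, assuming the flag appears as `androidboot.verifiedbootstate=<value>`; the helper name `strip_cmdline_flag` is hypothetical and the log does not show how the patch itself does it.

```c
#include <assert.h>
#include <string.h>

/*
 * Remove every space-separated token that starts with `key` from
 * `cmdline`, editing the string in place.
 */
void strip_cmdline_flag(char *cmdline, const char *key)
{
    size_t klen = strlen(key);
    char *p = cmdline;

    if (klen == 0)
        return;

    while ((p = strstr(p, key)) != NULL) {
        char *end;

        /* Only match at the start of a token, not inside another one. */
        if (p != cmdline && p[-1] != ' ') {
            p += klen;
            continue;
        }

        end = strchr(p, ' ');
        if (end) {
            /* Pull the rest of the line over the token and its space. */
            memmove(p, end + 1, strlen(end + 1) + 1);
        } else {
            /* Last token: cut it off along with the space before it. */
            if (p != cmdline)
                p--;
            *p = '\0';
        }
    }
}

/* Usage: the flag disappears and the neighbouring tokens are kept. */
void example_strip_flag(void)
{
    char line[] = "console=ttyS0 androidboot.verifiedbootstate=orange root=/dev/sda1";

    strip_cmdline_flag(line, "androidboot.verifiedbootstate=");
    assert(strcmp(line, "console=ttyS0 root=/dev/sda1") == 0);
}
```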
* defconfig: enable dummy0 interfaceMoyster2016-11-071-1/+1
|
* defconfigMoyster2016-11-071-2/+3
|
* ping: fix null pointer exception (seen in speedtest app)DerTeufel2016-11-071-4/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [30409.811801]<5> (5)[29555:ping]SELinux: security field of sock is null!! [30409.811808]<5> (5)[29555:ping]SELinux: security field of sock is null!! [30409.811817]<5> (5)[29555:ping]SELinux: security field of sock is null!! [30409.811823]<5> (5)[29555:ping]SELinux: security field of sock is null!! [30409.811833]<5> (5)[29555:ping]SELinux: security field of sock is null!! [30409.811839]<5> (5)[29555:ping]SELinux: security field of sock is null!! [30409.811848]<5> (5)[29555:ping]SELinux: security field of sock is null!! [30409.811855]<5> (5)[29555:ping]SELinux: security field of sock is null!! [30409.811869]<5> (5)[29555:ping]Unable to handle kernel paging request at virtual address 5f37d1ba1e0fb303 [30409.811878]<5> (5)[29555:ping]pgd = ffffffc00fa8c000 [30409.811884][5f37d1ba1e0fb303] *pgd=0000000000000000 [30409.811893]<5> (5)[29555:ping][KERN Warning] ERROR/WARN forces debug_lock off! 
[30409.811899]<5> (5)[29555:ping][KERN Warning] check backtrace: [30409.811910]<5> (5)[29555:ping]CPU: 5 PID: 29555 Comm: ping Tainted: G W 3.10.90-DragonDevil_Jiayu.de #4 [30409.811918]<5> (5)[29555:ping]Call trace: [30409.811933]<5> (5)[29555:ping][<ffffffc000088aec>] dump_backtrace+0x0/0x16c [30409.811943]<5> (5)[29555:ping][<ffffffc000088c68>] show_stack+0x10/0x1c [30409.811956]<5> (5)[29555:ping][<ffffffc0009bff20>] dump_stack+0x1c/0x28 [30409.811967]<5> (5)[29555:ping][<ffffffc0002fb9f0>] debug_locks_off+0x44/0x5c [30409.811978]<5> (5)[29555:ping][<ffffffc000099f10>] oops_enter+0xc/0x28 [30409.811988]<5> (5)[29555:ping][<ffffffc000088c9c>] die+0x28/0x1d8 [30409.811998]<5> (5)[29555:ping][<ffffffc0009bda84>] __do_kernel_fault.part.5+0x70/0x84 [30409.812009]<5> (5)[29555:ping][<ffffffc0000942c4>] do_bad_area+0x90/0x94 [30409.812019]<5> (5)[29555:ping][<ffffffc000094310>] do_translation_fault+0x30/0x4c [30409.812028]<5> (5)[29555:ping][<ffffffc0000813f8>] do_mem_abort+0x38/0x9c [30409.812036]<5> (5)[29555:ping]Exception stack(0xffffffc0876cf8f0 to 0xffffffc0876cfac4) [30409.812046]<5> (5)[29555:ping]f8e0: 876cfb28 ffffffc0 876cc000 ffffffc0 [30409.812056]<5> (5)[29555:ping]f900: 876cfab0 ffffffc0 002a0018 ffffffc0 00df2008 ffffffc0 00df2008 ffffffc0 [30409.812066]<5> (5)[29555:ping]f920: 876cf930 ffffffc0 009cd040 ffffffc0 876cf940 ffffffc0 000c3cdc ffffffc0 [30409.812076]<5> (5)[29555:ping]f940: 876cf960 ffffffc0 0009b974 ffffffc0 00df2008 ffffffc0 00df1000 ffffffc0 [30409.812086]<5> (5)[29555:ping]f960: 876cf970 ffffffc0 009cd000 ffffffc0 876cf980 ffffffc0 0009c0dc ffffffc0 [30409.812096]<5> (5)[29555:ping]f980: 876cfa20 ffffffc0 009bdd38 ffffffc0 00000000 00000000 8ec6c9c0 ffffffc0 [30409.812106]<5> (5)[29555:ping]f9a0: 00000002 00000000 876cc000 ffffffc0 002a0144 ffffffc0 876cfbc0 ffffffc0 [30409.812116]<5> (5)[29555:ping]f9c0: 876cfd50 ffffffc0 00000000 00000000 000000d4 00000000 00000004 00000000 [30409.812125]<5> (5)[29555:ping]f9e0: 86544000 00000055 
000003e8 00000000 0000000a 00000000 000003e8 00000000 [30409.812135]<5> (5)[29555:ping]fa00: 00000001 00000000 000000c0 00000000 00838260 ffffffc0 9c13b0c4 0000007f [30409.812145]<5> (5)[29555:ping]fa20: 00000001 00000000 876cfb28 ffffffc0 1e0fb303 5f37d1ba 00000194 00000000 [30409.812155]<5> (5)[29555:ping]fa40: 00000000 00000000 876cfe88 ffffffc0 876cfd50 ffffffc0 876cfdd0 ffffffc0 [30409.812165]<5> (5)[29555:ping]fa60: ebec47e8 0000007f 000000c0 00000000 876cfe88 ffffffc0 876cfab0 ffffffc0 [30409.812174]<5> (5)[29555:ping]fa80: 002a000c ffffffc0 876cfab0 ffffffc0 002a0018 ffffffc0 60000145 00000000 [30409.812184]<5> (5)[29555:ping]faa0: 876cfb28 ffffffc0 00000001 00000000 876cfb60 ffffffc0 002a0164 ffffffc0 [30409.812191]<5> (5)[29555:ping]fac0: 876cfc38 [30409.812200]<5> (5)[29555:ping][<ffffffc000083c58>] el1_da+0x1c/0x88 [30409.812213]<5> (5)[29555:ping][<ffffffc0002a0160>] selinux_socket_recvmsg+0x1c/0x28 [30409.812225]<5> (5)[29555:ping][<ffffffc00029bfbc>] security_socket_recvmsg+0x14/0x20 [30409.812237]<5> (5)[29555:ping][<ffffffc0008347f0>] sock_recvmsg+0x74/0xf4 [30409.812248]<5> (5)[29555:ping][<ffffffc000834d18>] ___sys_recvmsg+0xcc/0x220 [30409.812259]<5> (5)[29555:ping][<ffffffc000838218>] __sys_recvmsg+0x3c/0x84 [30409.812270]<5> (5)[29555:ping][<ffffffc00083826c>] SyS_recvmsg+0xc/0x20 [30409.812278]<5>-(5)[29555:ping]Internal error: Oops: 96000004 [#1] PREEMPT SMP [30409.812284]disable aee kernel api
* ping: Fix race in free in receive pathsubashab@codeaurora.org2016-11-071-2/+8
| [ Upstream commit fc752f1f43c1c038a2c6ae58cc739ebb5953ccb0 ]
|
| An exception is seen in the ICMP ping receive path where the skb
| destructor sock_rfree() tries to access a freed socket. This happens
| because ping_rcv() releases the socket reference with sock_put(), which
| internally frees the socket. Later, icmp_rcv() tries to free the skb;
| as part of this the skb destructor is called, which leads to a kernel
| panic because the socket was already freed in ping_rcv().
|
| -->|exception
| -007|sk_mem_uncharge
| -007|sock_rfree
| -008|skb_release_head_state
| -009|skb_release_all
| -009|__kfree_skb
| -010|kfree_skb
| -011|icmp_rcv
| -012|ip_local_deliver_finish
|
| Fix this incorrect free by cloning this skb and processing the cloned
| skb instead.
|
| This patch was suggested by Eric Dumazet.
|
| Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
| Cc: Eric Dumazet <edumazet@google.com>
| Signed-off-by: Eric Dumazet <edumazet@google.com>
| Signed-off-by: David S. Miller <davem@davemloft.net>
| Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* tcp: make challenge acks less predictableCharles (Chas) Williams2016-11-071-5/+7
| commit 75ff39ccc1bd5d3c455b6822ab09e533c551f758 upstream.
|
| From: Eric Dumazet <edumazet@google.com>
|
| Yue Cao claims that the current host rate limiting of challenge ACKs
| (RFC 5961) could leak enough information to allow a patient attacker to
| hijack TCP sessions. He will soon provide details in an academic paper.
|
| This patch increases the default limit from 100 to 1000, and adds some
| randomization so that the attacker can no longer hijack sessions without
| spending a considerable number of probes.
|
| Based on initial analysis and patch from Linus.
|
| Note that we also have per socket rate limiting, so it is tempting to
| remove the host limit in the future.
|
| v2: randomize the count of challenge acks per second, not the period.
|
| Fixes: 282f23c6ee34 ("tcp: implement RFC 5961 3.2")
| Reported-by: Yue Cao <ycao009@ucr.edu>
| Signed-off-by: Eric Dumazet <edumazet@google.com>
| Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
| Cc: Yuchung Cheng <ycheng@google.com>
| Cc: Neal Cardwell <ncardwell@google.com>
| Acked-by: Neal Cardwell <ncardwell@google.com>
| Acked-by: Yuchung Cheng <ycheng@google.com>
| Signed-off-by: David S. Miller <davem@davemloft.net>
| [ ciwillia: backport to 3.10-stable ]
| Signed-off-by: Chas Williams <ciwillia@brocade.com>
| Signed-off-by: Willy Tarreau <w@1wt.eu>
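Note: the "randomize the count per second, not the period" idea can be sketched as a small userspace model. This is an illustration under assumptions, not the kernel patch itself: struct challenge_state and challenge_ack_allowed() are hypothetical names, and the random value is passed in by the caller so the behaviour is easy to inspect.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical host-wide limiter state. */
struct challenge_state {
    uint32_t timestamp;  /* second in which the budget was last refilled */
    uint32_t count;      /* challenge ACKs still allowed in that second */
};

/*
 * Return true if a challenge ACK may be sent at time `now` (in seconds).
 * On the first call in a new second the budget is refilled with
 * (limit + 1) / 2 plus a random value in [0, limit), so with limit = 1000
 * the budget lies in [500, 1499] and an observer cannot rely on a fixed
 * per-second count.
 */
bool challenge_ack_allowed(struct challenge_state *st, uint32_t now,
                           uint32_t limit, uint32_t rnd)
{
    if (now != st->timestamp) {
        uint32_t half = (limit + 1) / 2;

        st->timestamp = now;
        st->count = half + (limit ? rnd % limit : 0);
    }
    if (st->count == 0)
        return false;   /* budget for this second is used up */
    st->count--;
    return true;
}
```

Keeping the period fixed at one second and varying only the budget means the limiter still resets on a predictable clock, while the number of ACKs an attacker can count within that second no longer reveals a shared, known limit.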
* cleanup: delete 3.10.101 patch from /Moyster2016-11-071-1493/+0
|
* cleanup infosMoyster2016-11-074-110/+9
| | | | readme update
* Linux 3.10.104Willy Tarreau2016-11-071-1/+1
|
* mm: remove gup_flags FOLL_WRITE games from __get_user_pages()Linus Torvalds2016-11-072-2/+13
| commit 19be0eaffa3ac7d8eb6784ad9bdbc7d67ed8e619 upstream.
|
| This is an ancient bug that was actually attempted to be fixed once
| (badly) by me eleven years ago in commit 4ceb5db9757a ("Fix
| get_user_pages() race for write access") but that was then undone due to
| problems on s390 by commit f33ea7f404e5 ("fix get_user_pages bug").
|
| In the meantime, the s390 situation has long been fixed, and we can now
| fix it by checking the pte_dirty() bit properly (and do it better). The
| s390 dirty bit was implemented in abf09bed3cce ("s390/mm: implement
| software dirty bits") which made it into v3.9. Earlier kernels will
| have to look at the page state itself.
|
| Also, the VM has become more scalable, and what used to be a purely
| theoretical race back then has become easier to trigger.
|
| To fix it, we introduce a new internal FOLL_COW flag to mark the "yes,
| we already did a COW" rather than play racy games with FOLL_WRITE that
| is very fundamental, and then use the pte dirty flag to validate that
| the FOLL_COW flag is still valid.
|
| Reported-and-tested-by: Phil "not Paul" Oester <kernel@linuxace.com>
| Acked-by: Hugh Dickins <hughd@google.com>
| Reviewed-by: Michal Hocko <mhocko@suse.com>
| Cc: Andy Lutomirski <luto@kernel.org>
| Cc: Kees Cook <keescook@chromium.org>
| Cc: Oleg Nesterov <oleg@redhat.com>
| Cc: Willy Tarreau <w@1wt.eu>
| Cc: Nick Piggin <npiggin@gmail.com>
| Cc: Greg Thelen <gthelen@google.com>
| Cc: stable@vger.kernel.org
| Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| [wt: s/gup.c/memory.c; s/follow_page_pte/follow_page_mask;
|      s/faultin_page/__get_user_page]
| Signed-off-by: Willy Tarreau <w@1wt.eu>
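Note: the check described in the last paragraph of the message can be modelled in a few lines of userspace C. This is a sketch under assumptions: struct fake_pte stands in for a real page table entry, and the flag values are illustrative constants for this example only.

```c
#include <stdbool.h>

/* Illustrative flag values for this sketch. */
#define FOLL_WRITE 0x01u
#define FOLL_FORCE 0x10u
#define FOLL_COW   0x4000u

/* Stand-in for a page table entry: only the two bits that matter here. */
struct fake_pte {
    bool writable;
    bool dirty;
};

/*
 * A write may follow the pte if it is writable, or if this is a forced
 * write for which a COW was already performed (FOLL_COW) and the pte is
 * still dirty, which shows the COWed page has not been replaced since.
 */
bool can_follow_write_pte(struct fake_pte pte, unsigned int flags)
{
    return pte.writable ||
           ((flags & FOLL_FORCE) && (flags & FOLL_COW) && pte.dirty);
}
```

The point of the design is that FOLL_WRITE is never cleared: the "COW already happened" state lives in its own flag, and the dirty bit is what validates that state on each retry.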
* xen-netback: ref count shared ringsWei Liu2016-11-073-2/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ... so that we can make sure the rings are not freed until all SKBs in internal queues are consumed. 1. The VM is receiving packets through bonding + bridge + netback + netfront. 2. For some unknown reason at least one packet remains in the rx queue and is not delivered to the domU immediately by netback. 3. The VM finishes shutting down. 4. The shared ring between dom0 and domU is freed. 5. then xen-netback continues processing the pending requests and tries to put the packet into the now already released shared ring. > XXXlan0: port 9(vif26.0) entered disabled state > BUG: unable to handle kernel paging request at ffffc900108641d8 > IP: [<ffffffffa04147dc>] xen_netbk_rx_action+0x18b/0x6f0 [xen_netback] > PGD 57e20067 PUD 57e21067 PMD 571a7067 PTE 0 > Oops: 0000 [#1] SMP > ... > CPU: 0 PID: 12587 Comm: netback/0 Not tainted 3.10.0-ucs58-amd64 #1 Debian 3.10.11-1.58.201405060908 > Hardware name: FUJITSU PRIMERGY BX620 S6/D3051, BIOS 080015 Rev.3C78.3051 07/22/2011 > task: ffff880004b067c0 ti: ffff8800561ec000 task.ti: ffff8800561ec000 > RIP: e030:[<ffffffffa04147dc>] [<ffffffffa04147dc>] xen_netbk_rx_action+0x18b/0x6f0 [xen_netback] > RSP: e02b:ffff8800561edce8 EFLAGS: 00010202 > RAX: ffffc900104adac0 RBX: ffff8800541e95c0 RCX: ffffc90010864000 > RDX: 000000000000003b RSI: 0000000000000000 RDI: ffff880040014380 > RBP: ffff8800570e6800 R08: 0000000000000000 R09: ffff880004799800 > R10: ffffffff813ca115 R11: ffff88005e4fdb08 R12: ffff880054e6f800 > R13: ffff8800561edd58 R14: ffffc900104a1000 R15: 0000000000000000 > FS: 00007f19a54a8700(0000) GS:ffff88005da00000(0000) knlGS:0000000000000000 > CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b > CR2: ffffc900108641d8 CR3: 0000000054cb3000 CR4: 0000000000002660 > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > DR3: 0000000000000000 DR6: 
00000000ffff0ff0 DR7: 0000000000000400 > Stack: > ffff880004b06ba0 0000000000000000 ffff88005da13ec0 ffff88005da13ec0 > 0000000004b067c0 ffffc900104a8ac0 ffffc900104a1020 000000005da13ec0 > 0000000000000000 0000000000000001 ffffc900104a8ac0 ffffc900104adac0 > Call Trace: > [<ffffffff813ca32d>] ? _raw_spin_lock_irqsave+0x11/0x2f > [<ffffffffa0416033>] ? xen_netbk_kthread+0x174/0x841 [xen_netback] > [<ffffffff8105d373>] ? wake_up_bit+0x20/0x20 > [<ffffffffa0415ebf>] ? xen_netbk_tx_build_gops+0xce8/0xce8 [xen_netback] > [<ffffffff8105cd73>] ? kthread_freezable_should_stop+0x56/0x56 > [<ffffffffa0415ebf>] ? xen_netbk_tx_build_gops+0xce8/0xce8 [xen_netback] > [<ffffffff8105ce1e>] ? kthread+0xab/0xb3 > [<ffffffff81003638>] ? xen_end_context_switch+0xe/0x1c > [<ffffffff8105cd73>] ? kthread_freezable_should_stop+0x56/0x56 > [<ffffffff813cfbfc>] ? ret_from_fork+0x7c/0xb0 > [<ffffffff8105cd73>] ? kthread_freezable_should_stop+0x56/0x56 > Code: 8b b3 d0 00 00 00 48 8b bb d8 00 00 00 0f b7 74 37 02 89 70 08 eb 07 c7 40 08 00 00 00 00 89 d2 c7 40 04 00 00 00 00 48 83 c2 08 <0f> b7 34 d1 89 30 c7 44 24 60 00 00 00 00 8b 44 d1 04 89 44 24 > RIP [<ffffffffa04147dc>] xen_netbk_rx_action+0x18b/0x6f0 [xen_netback] > RSP <ffff8800561edce8> > CR2: ffffc900108641d8 Track the shared ring buffer being unmapped and drop those packets. Ref-count the rings as followed: map -> set to 1 start_xmit -> inc when queueing SKB to internal queue rx_action -> dec after finishing processing a SKB unmap -> dec and wait to be 0 Note that this is different from ref counting the vif structure itself. Currently only guest Rx path is taken care of because that's where the bug surfaced. This bug doesn't exist in kernel >=3.12 as multi-queue support was added there. 
Link: <https://lists.xenproject.org/archives/html/xen-devel/2014-06/msg00818.html> Signed-off-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: Philipp Hahn <hahn@univention.de> Cc: David Vrabel <david.vrabel@citrix.com> Tested-by: Philipp Hahn <hahn@univention.de> Signed-off-by: Willy Tarreau <w@1wt.eu>
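Note: the ref-counting scheme listed in the message (map, start_xmit, rx_action, unmap) can be modelled as below. This is a single-threaded sketch with hypothetical names, not the driver code; a real driver would need atomic operations and would sleep in unmap until the count reaches zero rather than return a flag.

```c
#include <stdbool.h>

/* Hypothetical model of a shared ring with a reference count. */
struct ring {
    int  refs;     /* 1 for the mapping itself, +1 per queued SKB */
    bool mapped;   /* false once unmap has started */
};

/* map -> set to 1 */
void ring_map(struct ring *r)
{
    r->refs = 1;
    r->mapped = true;
}

/* start_xmit -> inc when queueing; drop the packet if unmapping began */
bool ring_start_xmit(struct ring *r)
{
    if (!r->mapped)
        return false;   /* caller drops the packet */
    r->refs++;
    return true;
}

/* rx_action -> dec after finishing processing one SKB */
void ring_rx_done(struct ring *r)
{
    r->refs--;
}

/* unmap -> dec; returns true once no queued SKB still needs the ring */
bool ring_unmap(struct ring *r)
{
    r->mapped = false;
    r->refs--;
    return r->refs == 0;
}

/* The ring memory may only be freed when nothing references it. */
bool ring_can_free(const struct ring *r)
{
    return r->refs == 0;
}
```

In this model, a packet queued before unmap keeps the count above zero, so the ring outlives the shutdown until rx processing has consumed that packet.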
* security: let security modules use PTRACE_MODE_* with bitmasksJann Horn2016-11-071-2/+2
| commit 3dfb7d8cdbc7ea0c2970450e60818bb3eefbad69 upstream.
|
| It looks like smack and yama weren't aware that the ptrace mode
| can have flags ORed into it - PTRACE_MODE_NOAUDIT until now, but
| only for /proc/$pid/stat, and with the PTRACE_MODE_*CREDS patch,
| all modes have flags ORed into them.
|
| Signed-off-by: Jann Horn <jann@thejh.net>
| Acked-by: Kees Cook <keescook@chromium.org>
| Acked-by: Casey Schaufler <casey@schaufler-ca.com>
| Cc: Oleg Nesterov <oleg@redhat.com>
| Cc: Ingo Molnar <mingo@redhat.com>
| Cc: James Morris <james.l.morris@oracle.com>
| Cc: "Serge E. Hallyn" <serge.hallyn@ubuntu.com>
| Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
| Cc: Andy Lutomirski <luto@kernel.org>
| Cc: Al Viro <viro@zeniv.linux.org.uk>
| Cc: "Eric W. Biederman" <ebiederm@xmission.com>
| Cc: Willy Tarreau <w@1wt.eu>
| Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| [wt: no smk_ptrace_mode() in 3.10]
| Signed-off-by: Willy Tarreau <w@1wt.eu>
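Note: the class of bug fixed here is comparing a mode for equality when extra flags may be ORed into it. The sketch below shows the difference; the helper names are hypothetical and the constants are defined locally for the example.

```c
#include <stdbool.h>

/* Mode bits, defined locally for this sketch. */
#define PTRACE_MODE_READ    0x01u
#define PTRACE_MODE_ATTACH  0x02u
#define PTRACE_MODE_NOAUDIT 0x04u

/* Fragile: fails as soon as any extra flag is ORed into the mode. */
bool is_attach_by_equality(unsigned int mode)
{
    return mode == PTRACE_MODE_ATTACH;
}

/* Robust: tests only the bit of interest and ignores other flags. */
bool is_attach_by_mask(unsigned int mode)
{
    return (mode & PTRACE_MODE_ATTACH) != 0;
}
```

With mode set to PTRACE_MODE_ATTACH | PTRACE_MODE_NOAUDIT, the equality test misclassifies the request while the mask test still recognises it as an attach.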
* MIPS: KVM: Check for pfn noslot caseJames Hogan2016-11-071-1/+1
| commit ba913e4f72fc9cfd03dad968dfb110eb49211d80 upstream.
|
| When mapping a page into the guest we error check using is_error_pfn(),
| however this doesn't detect a value of KVM_PFN_NOSLOT, indicating an
| error HVA for the page. This can only happen on MIPS right now due to
| unusual memslot management (e.g. being moved / removed / resized), or
| with an Enhanced Virtual Memory (EVA) configuration where the default
| KVM_HVA_ERR_* and kvm_is_error_hva() definitions are unsuitable (fixed
| in a later patch). This case will be treated as a pfn of zero, mapping
| the first page of physical memory into the guest.
|
| It would appear the MIPS KVM port wasn't updated prior to being merged
| (in v3.10) to take commit 81c52c56e2b4 ("KVM: do not treat noslot pfn as
| a error pfn") into account (merged v3.8), which converted a bunch of
| is_error_pfn() calls to is_error_noslot_pfn(). Switch to using
| is_error_noslot_pfn() instead to catch this case properly.
|
| Fixes: 858dd5d45733 ("KVM/MIPS32: MMU/TLB operations for the Guest.")
| Signed-off-by: James Hogan <james.hogan@imgtec.com>
| Cc: Paolo Bonzini <pbonzini@redhat.com>
| Cc: "Radim Krčmář" <rkrcmar@redhat.com>
| Cc: Ralf Baechle <ralf@linux-mips.org>
| Cc: linux-mips@linux-mips.org
| Cc: kvm@vger.kernel.org
| Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| [james.hogan@imgtec.com: Backport to v3.16.y]
| Signed-off-by: James Hogan <james.hogan@imgtec.com>
| Signed-off-by: Willy Tarreau <w@1wt.eu>
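Note: the difference between the two checks can be shown with a userspace model. The mask values below are assumptions chosen for this sketch (modelled on the generic KVM header, but not verified against this tree); the point is only that the "noslot" value lives in a bit that the narrower error mask does not cover.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed encodings for this sketch. */
#define KVM_PFN_ERR_MASK        (0x7ffULL << 52)
#define KVM_PFN_ERR_NOSLOT_MASK (0xfffULL << 52)
#define KVM_PFN_NOSLOT          (0x1ULL << 63)

/* Detects error pfns only; a noslot pfn slips through. */
bool is_error_pfn(uint64_t pfn)
{
    return (pfn & KVM_PFN_ERR_MASK) != 0;
}

/* Detects both error pfns and the noslot pfn. */
bool is_error_noslot_pfn(uint64_t pfn)
{
    return (pfn & KVM_PFN_ERR_NOSLOT_MASK) != 0;
}
```

Under these assumptions, a caller that only checks is_error_pfn() would treat KVM_PFN_NOSLOT as a usable pfn, which is the situation the message describes.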
* mm: thp: fix SMP race condition between THP page fault and MADV_DONTNEEDAndrea Arcangeli2016-11-071-2/+12
| commit ad33bb04b2a6cee6c1f99fabb15cddbf93ff0433 upstream.
|
| pmd_trans_unstable()/pmd_none_or_trans_huge_or_clear_bad() were
| introduced to locklessly (but atomically) detect when a pmd is a regular
| (stable) pmd or when the pmd is unstable and can infinitely transition
| between pmd_none() and pmd_trans_huge() from under us, while only
| holding the mmap_sem for reading (not for writing).
|
| While holding the mmap_sem only for reading, MADV_DONTNEED can run from
| under us and so before we can assume the pmd to be a regular stable pmd
| we need to compare it against pmd_none() and pmd_trans_huge() in an
| atomic way, with pmd_trans_unstable(). The old pmd_trans_huge() left a
| tiny window for a race.
|
| Useful applications are unlikely to notice the difference as doing
| MADV_DONTNEED concurrently with a page fault would lead to undefined
| behavior.
|
| [js] 3.12 backport: no pmd_devmap in 3.12 yet.
|
| [akpm@linux-foundation.org: tidy up comment grammar/layout]
| Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
| Reported-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
| Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| Cc: Vlastimil Babka <vbabka@suse.cz>
| Signed-off-by: Jiri Slaby <jslaby@suse.cz>
| Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
| Signed-off-by: Willy Tarreau <w@1wt.eu>
* ACPI / sysfs: fix error code in get_status()Dan Carpenter2016-11-071-4/+3
| commit f18ebc211e259d4f591e39e74b2aa2de226c9a1d upstream.
|
| The problem with ornamental, do-nothing gotos is that they lead to
| "forgot to set the error code" bugs. We should be returning -EINVAL
| here but we don't. It leads to an uninitialized variable in
| counter_show():
|
|   drivers/acpi/sysfs.c:603 counter_show()
|   error: uninitialized symbol 'status'.
|
| Fixes: 1c8fce27e275 (ACPI: introduce drivers/acpi/sysfs.c)
| Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
| Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
| Signed-off-by: Willy Tarreau <w@1wt.eu>
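Note: the "forgot to set the error code" pattern can be reproduced in a small standalone example. The functions and the counter table below are hypothetical and only mirror the shape of the bug, not the actual ACPI code.

```c
#include <errno.h>

#define NUM_COUNTERS 4  /* hypothetical table size */

static const int counter_status[NUM_COUNTERS] = { 1, 0, 1, 1 };

/*
 * Buggy shape: the error path jumps to a do-nothing label without
 * setting an error code, so the caller sees success while *status
 * was never written.
 */
int get_status_buggy(unsigned int index, int *status)
{
    int result = 0;

    if (index >= NUM_COUNTERS)
        goto end;

    *status = counter_status[index];
end:
    return result;
}

/* Fixed shape: return the error directly, no label needed. */
int get_status_fixed(unsigned int index, int *status)
{
    if (index >= NUM_COUNTERS)
        return -EINVAL;

    *status = counter_status[index];
    return 0;
}
```

A caller that checks the return value of the buggy version for an out-of-range index would go on to read a status it never received, which corresponds to the uninitialized-symbol warning quoted in the message.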