path: root/include
Commit message (Author, Age, Files, Lines changed)
* zram: pass gfp from zcomp frontend to backend (Minchan Kim, 2017-09-25, 1 file, -1/+1)

  Each zcomp backend uses its own gfp flags, but that is pointless because
  the context they can be called in is driven by the upper layer (i.e. the
  zcomp frontend). Moreover, the zcomp frontend can call them in different
  contexts. In one context (zram init) the allocation should try hard to
  succeed; in another (allocating further streams to accelerate I/O speed)
  it is merely optional. So let's pass gfp down from the driver (i.e. the
  zcomp frontend), following the normal MM convention.

  [sergey.senozhatsky@gmail.com: add missing __vmalloc zero and highmem gfps]
  Signed-off-by: Minchan Kim <minchan@kernel.org>
  Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
  Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
  Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
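  A minimal sketch of the resulting convention; the names and signatures
  below are assumptions for illustration, not the exact zram code. The
  frontend picks the gfp mask for its context, the backend just forwards it:

    /* Hypothetical backend/frontend names; sketch of the gfp-passing pattern. */
    static void *lzo_create(gfp_t flags)
    {
            return kzalloc(LZO1X_MEM_COMPRESS, flags);
    }

    static struct zcomp_strm *zcomp_strm_alloc(struct zcomp *comp, gfp_t flags)
    {
            struct zcomp_strm *zstrm = kmalloc(sizeof(*zstrm), flags);

            if (!zstrm)
                    return NULL;
            /* the backend no longer hardcodes its own gfp mask */
            zstrm->private = comp->backend->create(flags);
            if (!zstrm->private) {
                    kfree(zstrm);
                    return NULL;
            }
            return zstrm;
    }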
* zram: check comp algorithm availability earlier (Sergey Senozhatsky, 2017-09-25, 1 file, -0/+1)

  Improvement idea by Marcin Jabrzyk.

  comp_algorithm_store() silently accepts any supplied algorithm name,
  because zram performs the algorithm availability check later, during the
  device configuration phase in disksize_store(), and emits the following
  error:

    "zram: Cannot initialise %s compressing backend"

  This error line is somewhat generic and, besides, can also indicate a
  failed attempt to allocate the compression backend's working buffers.

  Add an algorithm availability check to comp_algorithm_store():

    echo lzz > /sys/block/zram0/comp_algorithm
    -bash: echo: write error: Invalid argument

  Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
  Reported-by: Marcin Jabrzyk <m.jabrzyk@samsung.com>
  Acked-by: Minchan Kim <minchan@kernel.org>
  Cc: Nitin Gupta <ngupta@vflare.org>
  Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
  Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
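  A sketch of the earlier check; the zcomp_available_algorithm() helper and
  the zram field names are assumptions inferred from the behavior above:

    static ssize_t comp_algorithm_store(struct device *dev,
                                        struct device_attribute *attr,
                                        const char *buf, size_t len)
    {
            struct zram *zram = dev_to_zram(dev);

            if (!zcomp_available_algorithm(buf))
                    return -EINVAL; /* "echo lzz > comp_algorithm" fails here */

            /* record the name for the later device configuration phase */
            strlcpy(zram->compressor, buf, sizeof(zram->compressor));
            return len;
    }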
* block: zram: Backport from Linux 4.1 (Sultan Qasim Khan, 2017-09-25, 3 files, -0/+19)

  Change-Id: I23f6f75979077992298d848efd79a6efc0d776bd
* Revert zram updates to merge 4.1 drivers (Mister Oyster, 2017-09-25, 1 file, -1/+1)

  Revert "zram: do not use copy_page with non-page aligned address"
  This reverts commit ed3e8707d2e19d6da506d8ab298e68e79b6621f2.

  Revert "zram: sym permissions -> octal perm (checkpath warnings)"
  This reverts commit 920095f4566b901834f9b41395968b739b402d4c.

  Revert "zram: fix indents/warnings from checkpath"
  This reverts commit 0a2fdee5446969c8c70bbdc9f8fde93eb1d47327.

  Revert "UPSTREAM: zram/zcomp: do not zero out zcomp private pages"
  This reverts commit d13c0c08323df29367affc7b7623d9d2d0ccfbb2.

  Revert "UPSTREAM: zram: pass gfp from zcomp frontend to backend"
  This reverts commit 6d22d73c07a0f2ffe706e88c302d52371ad29206.

  Revert "UPSTREAM: zram: try vmalloc() after kmalloc()"
  This reverts commit e6af82ad8a5599a783e9850aca8f1b32fc1f93f4.

  Revert "UPSTREAM: zram/zcomp: use GFP_NOIO to allocate streams"
  This reverts commit 38e34f1f6f1c9ee9c7f3958fcb35e72174337690.

  Revert "zram: Fix a wrong return after merged new LZ4 version"
  This reverts commit 7832ce6d8a006747a4c27840b4f7e7d3c12f0dbb.

  Revert "zram: change usage of LZ4 to work with new LZ4 version"
  This reverts commit 56622e86d4356054aad833aa8547992fdb76e4e3.

  Revert "zram: avoid lockdep splat by revalidate_disk"
  This reverts commit 149cadf4d8043f55a0d92cacc4b3d3d9cfb75148.

  Revert "zram: revalidate disk after capacity change"
  This reverts commit 270bdcb8d33f5c4769edab61f33f2fe43c8636f8.
* mm: zsmalloc: backport from Linux 4.1 (Sultan Qasim Khan, 2017-09-25, 1 file, -2/+3)

  Change-Id: I3960e31f889d643e87b99fe7a88a1e0ca402d6cd
* mm/zpool: implement common zpool api to zbud/zsmalloc (Dan Streetman, 2017-09-25, 1 file, -0/+106)

  Add the zpool api. zpool provides an interface for memory storage,
  typically of compressed memory. Users can select which backend to use;
  currently the only implementations are zbud, a low-density implementation
  with up to two compressed pages per storage page, and zsmalloc, a
  higher-density implementation with multiple compressed pages per storage
  page.

  Change-Id: Ie29da7d16f2f92a0fce1753eaae5629e168684c6
  Signed-off-by: Dan Streetman <ddstreet@ieee.org>
  Tested-by: Seth Jennings <sjennings@variantweb.net>
  Cc: Minchan Kim <minchan@kernel.org>
  Cc: Nitin Gupta <ngupta@vflare.org>
  Cc: Weijie Yang <weijie.yang@samsung.com>
  Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
  Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
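  A short usage sketch of the api as described (the ops callbacks and error
  handling are elided; the flag choices here are assumptions):

    #include <linux/zpool.h>

    /* Sketch: create a zbud-backed pool, store one object, map it, free it. */
    static int zpool_demo(void)
    {
            struct zpool *pool;
            unsigned long handle;
            char *buf;

            pool = zpool_create_pool("zbud", GFP_KERNEL, NULL);
            if (!pool)
                    return -ENOMEM;

            if (zpool_malloc(pool, 64, GFP_KERNEL, &handle)) {
                    zpool_destroy_pool(pool);
                    return -ENOMEM;
            }

            buf = zpool_map_handle(pool, handle, ZPOOL_MM_RW);
            memset(buf, 0xaa, 64);          /* compressed payload would go here */
            zpool_unmap_handle(pool, handle);

            zpool_free(pool, handle);
            zpool_destroy_pool(pool);
            return 0;
    }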
* seq_file: introduce seq_setwidth() and seq_pad() (Tetsuo Handa, 2017-09-23, 1 file, -0/+15)

  There are several users who want to know the number of bytes written by
  seq_*() for alignment purposes. Currently they use the %n format to learn
  it, because seq_*() returns 0 on success. This patch introduces
  seq_setwidth() and seq_pad(), allowing them to align without using the %n
  format.

  Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
  Signed-off-by: Kees Cook <keescook@chromium.org>
  Cc: Joe Perches <joe@perches.com>
  Cc: David Miller <davem@davemloft.net>
  Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
  Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  Git-commit: 839cc2a94cc3665bafe32203c2f095f4dd470a80
  Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
  CRs-fixed: 665291
  Change-Id: I727d9af5ed320d717295c9d0f82c88623fb181c1
  Signed-off-by: David Brown <davidb@codeaurora.org>
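  The intended pattern, sketched with hypothetical field names: reserve a
  width, print the variable-length part, then pad out to the width.

    static int demo_show(struct seq_file *seq, void *v)
    {
            const char *name = "eth0";      /* hypothetical values */
            int value = 42;

            seq_setwidth(seq, 31);          /* record should occupy 31 chars */
            seq_printf(seq, "%s: %d", name, value);
            seq_pad(seq, '\n');             /* pad with spaces, then emit '\n' */
            return 0;
    }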
* misc: replace __FUNCTION__ by __func__ (Moyster, 2017-09-23, 3 files, -3/+3)

  Result of:

    git grep -l '__FUNCTION__' | xargs sed -i 's/__FUNCTION__/__func__/g'
* kernel.h: remove ancient __FUNCTION__ hack (Rasmus Villemoes, 2017-09-23, 2 files, -6/+3)

  __FUNCTION__ hasn't been treated as a string literal since gcc 3.4, so
  this only helps people who only test-compile using 3.3 (compiler-gcc3.h
  barks at anything older than that). Besides, there are almost no
  occurrences of __FUNCTION__ left in the tree.

  [akpm@linux-foundation.org: convert remaining __FUNCTION__ references]
  Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
  Cc: Michal Nazarewicz <mina86@mina86.com>
  Cc: Joe Perches <joe@perches.com>
  Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
  Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  Signed-off-by: Moyster <oysterized@gmail.com>
* fsync: cleanup (Mister Oyster, 2017-09-22, 1 file, -0/+6)

  Signed-off-by: Mister Oyster <oysterized@gmail.com>
* binder: fix redefine of READ/WRITE_ONCE in compiler.h (Mister Oyster, 2017-09-16, 1 file, -2/+0)
* binder: make FIFO inheritance a per-context option (Tim Murray, 2017-09-16, 1 file, -0/+1)

  Add a new ioctl to binder to control whether FIFO inheritance should
  happen. In particular, hwbinder should inherit FIFO priority from
  callers, but standard binder threads should not.

  Test: boots
  Bug: 36516194
  Signed-off-by: Tim Murray <timmurray@google.com>
  Change-Id: I8100c4364b7d15d1bf00a8ca5c286e4d4b23ce85
* drivers: merged Android Binder from 4.9 (Lukas0610, 2017-09-16, 3 files, -13/+100)

  Change-Id: I857ef86b2d502293fb8c37398383dceaa21dd29f
  Signed-off-by: Mister Oyster <oysterized@gmail.com>
* lib: vsprintf: whitelist stack traces (Dave Weinstein, 2017-09-14, 1 file, -1/+1)

  Use the %pP functionality to explicitly allow kernel pointers to be
  logged for stack traces.

  BUG: 30368199
  Change-Id: I495915465565293e9e4da5aa28fbd1d14538d99b
  Signed-off-by: Dave Weinstein <olorin@google.com>
* usb: gadget: f_fs: Increase EP_ALLOC ioctl number (Jerry Zhang, 2017-09-14, 1 file, -1/+1)

  Prevent conflict with possible new upstream ioctls before this one is
  itself upstreamed.

  Test: None
  Change-Id: I10cbc01c25f920a626ea7559e8ca80ee08865333
  Signed-off-by: Jerry Zhang <zhangjerry@google.com>
* usb: gadget: f_fs: Add ioctl for allocating endpoint buffers. (Jerry Zhang, 2017-09-14, 1 file, -0/+5)

  This creates an ioctl named FUNCTIONFS_ENDPOINT_ALLOC, which preallocates
  a buffer of a given size for an endpoint. Any reads/writes on that
  endpoint below that size will use the preallocated buffer instead of
  allocating their own. If the endpoint is not active, the buffer will not
  be allocated until it becomes active.

  Change-Id: I4da517620ed913161ea9e21a31f6b92c9a012b44
  Signed-off-by: Jerry Zhang <zhangjerry@google.com>
* usb: gadget: f_fs: add ioctl returning ep descriptor (Robert Baldyga, 2017-09-14, 1 file, -0/+6)

  This patch introduces an ioctl named FUNCTIONFS_ENDPOINT_DESC, which
  returns the endpoint descriptor to userspace. It works only if the
  function is active.

  Signed-off-by: Robert Baldyga <r.baldyga@samsung.com>
  Acked-by: Michal Nazarewicz <mina86@mina86.com>
  Signed-off-by: Felipe Balbi <balbi@ti.com>
  Signed-off-by: Jerry Zhang <zhangjerry@google.com>
  Change-Id: I55987bf0c6744327f7763b567b5a2b39c50d18e6
* llists: move llist_reverse_order from raid5 to llist.c (Christoph Hellwig, 2017-09-11, 1 file, -0/+2)

  Make this useful helper available for other users.

  Signed-off-by: Christoph Hellwig <hch@lst.de>
  Reviewed-by: Jan Kara <jack@suse.cz>
  Cc: Jens Axboe <axboe@kernel.dk>
  Cc: Neil Brown <neilb@suse.de>
  Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
  Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

  Conflicts:
  	drivers/md/raid5.c

  Change-Id: Ibfc31e7289ffe9bda511c88543bc2deb70a4691b
  Signed-off-by: Francisco Franco <franciscofranco.1990@gmail.com>
* mm: Fix incorrect type conversion for size during dma allocation (Maggie White, 2017-09-11, 1 file, -2/+2)

  This was found during a userspace fuzzing test when a large size
  allocation was made from ion:

  [<ffffffc00008a098>] show_stack+0x10/0x1c
  [<ffffffc00119c390>] dump_stack+0x74/0xc8
  [<ffffffc00020d9a0>] kasan_report_error+0x2b0/0x408
  [<ffffffc00020dbd4>] kasan_report+0x34/0x40
  [<ffffffc00020cfec>] __asan_storeN+0x15c/0x168
  [<ffffffc00020d228>] memset+0x20/0x44
  [<ffffffc00009b730>] __dma_alloc_coherent+0x114/0x18c
  [<ffffffc00009c6e8>] __dma_alloc_noncoherent+0xbc/0x19c
  [<ffffffc000c2b3e0>] ion_cma_allocate+0x178/0x2f0
  [<ffffffc000c2b750>] ion_secure_cma_allocate+0xdc/0x190
  [<ffffffc000c250dc>] ion_alloc+0x264/0xb88
  [<ffffffc000c25e94>] ion_ioctl+0x1f4/0x480
  [<ffffffc00022f650>] do_vfs_ioctl+0x67c/0x764
  [<ffffffc00022f790>] SyS_ioctl+0x58/0x8c

  Bug: 38195738
  Signed-off-by: Rohit Vaswani <rvaswani@codeaurora.org>
  Signed-off-by: Maggie White <maggiewhite@google.com>
  Change-Id: I6b1a0a3eaec10500cd4e73290efad4023bc83da5
* sched: Implement smarter wake-affine logic (Michael Wang, 2017-09-11, 1 file, -0/+3)

  The wake-affine scheduler feature currently always tries to pull the
  wakee close to the waker. In theory this should be beneficial if the
  waker's CPU caches hot data for the wakee, and it's also beneficial in
  the extreme ping-pong, high context-switch-rate case.

  Testing shows it can benefit hackbench up to 15%.

  However, the feature is somewhat blind, from which some workloads such
  as pgbench suffer. It's also time-consuming algorithmically. Testing
  shows it can damage pgbench up to 50% - far more than the benefit it
  brings in the best case.

  So wake-affine should be smarter, and it should realize when to stop its
  thankless effort at trying to find a suitable CPU to wake on.

  This patch introduces 'wakee_flips', which will be increased each time
  the task flips (switches) its wakee target. A high 'wakee_flips' value
  means the task has more than one wakee, and the bigger the number, the
  higher the wakeup frequency.

  Now, when making the decision on whether to pull or not, pay attention
  to the wakee: pulling a task with a high 'wakee_flips' may benefit the
  wakee, but it also implies that the waker will face competition later -
  how severe depends on the story behind 'wakee_flips' - so the waker
  suffers. Furthermore, if the waker also has a high 'wakee_flips', that
  implies multiple tasks rely on it; the waker's higher latency would then
  damage all of them, so pulling the wakee seems to be a bad deal.

  Thus, as 'waker->wakee_flips / wakee->wakee_flips' becomes higher and
  higher, the cost of pulling becomes worse and worse. The patch therefore
  helps the wake-affine feature to stop its pulling work when:

  	wakee->wakee_flips > factor &&
  	waker->wakee_flips > (factor * wakee->wakee_flips)

  The 'factor' here is the number of CPUs in the current CPU's NUMA node,
  so a bigger node will lead to more pulling since the trial becomes more
  severe.

  After applying the patch, pgbench shows up to 40% improvement and no
  regressions. Tested with a 12-cpu x86 server and tip 3.10.0-rc7. The
  percentages in the final column highlight the areas with the biggest
  wins; all other areas improved as well:

  	pgbench             base                   smart

  	| db_size | clients |  tps  |        |  tps  |
  	+---------+---------+-------+        +-------+
  	| 22 MB   |       1 | 10598 |        | 10796 |
  	| 22 MB   |       2 | 21257 |        | 21336 |
  	| 22 MB   |       4 | 41386 |        | 41622 |
  	| 22 MB   |       8 | 51253 |        | 57932 |
  	| 22 MB   |      12 | 48570 |        | 54000 |
  	| 22 MB   |      16 | 46748 |        | 55982 | +19.75%
  	| 22 MB   |      24 | 44346 |        | 55847 | +25.93%
  	| 22 MB   |      32 | 43460 |        | 54614 | +25.66%
  	| 7484 MB |       1 |  8951 |        |  9193 |
  	| 7484 MB |       2 | 19233 |        | 19240 |
  	| 7484 MB |       4 | 37239 |        | 37302 |
  	| 7484 MB |       8 | 46087 |        | 50018 |
  	| 7484 MB |      12 | 42054 |        | 48763 |
  	| 7484 MB |      16 | 40765 |        | 51633 | +26.66%
  	| 7484 MB |      24 | 37651 |        | 52377 | +39.11%
  	| 7484 MB |      32 | 37056 |        | 51108 | +37.92%
  	| 15 GB   |       1 |  8845 |        |  9104 |
  	| 15 GB   |       2 | 19094 |        | 19162 |
  	| 15 GB   |       4 | 36979 |        | 36983 |
  	| 15 GB   |       8 | 46087 |        | 49977 |
  	| 15 GB   |      12 | 41901 |        | 48591 |
  	| 15 GB   |      16 | 40147 |        | 50651 | +26.16%
  	| 15 GB   |      24 | 37250 |        | 52365 | +40.58%
  	| 15 GB   |      32 | 36470 |        | 50015 | +37.14%

  Signed-off-by: Michael Wang <wangyun@linux.vnet.ibm.com>
  Cc: Mike Galbraith <efault@gmx.de>
  Signed-off-by: Peter Zijlstra <peterz@infradead.org>
  Link: http://lkml.kernel.org/r/51D50057.9000809@linux.vnet.ibm.com
  [ Improved the changelog. ]
  Signed-off-by: Ingo Molnar <mingo@kernel.org>
  Change-Id: I70018a00435ea795121b70a576d74bbbd00b7464
  Signed-off-by: Paul Reioux <reioux@gmail.com>
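  A sketch of the resulting check in C (the upstream helper lives in
  kernel/sched/fair.c; this rendering is simplified and the exact factor
  computation here is an assumption):

    /* Return 1 when the affine pull should be skipped ("wake wide"). */
    static int wake_wide(struct task_struct *p)
    {
            /* factor: CPUs in this NUMA node; bigger nodes allow more pulling */
            int factor = nr_cpus_node(cpu_to_node(smp_processor_id()));

            /* the wakee flips targets often enough to look multi-wakee */
            if (p->wakee_flips > factor) {
                    /* ...and the waker serves far more wakees than the wakee:
                     * pulling would hurt everyone who depends on the waker */
                    if (current->wakee_flips > (factor * p->wakee_flips))
                            return 1;
            }
            return 0;       /* otherwise keep trying the affine wakeup */
    }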
* kthread: Backport queuing_blocked() (Petr Mladek, 2017-09-04, 1 file, -0/+12)

  This patch backports the queuing_blocked() function from Linux mainline
  and places it into the kthread header so it is accessible everywhere.

  Signed-off-by: Alex Naidis <alex.naidis@linux.com>
  Signed-off-by: Joe Maples <joe@frap129.org>
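  For reference, the mainline helper being backported is essentially the
  following (treat details as approximate for this tree):

    /*
     * Returns true when the work could not be queued at the moment.
     * It happens when it is already pending in a worker list
     * or when it is being cancelled.
     */
    static inline bool queuing_blocked(struct kthread_worker *worker,
                                       struct kthread_work *work)
    {
            lockdep_assert_held(&worker->lock);

            return !list_empty(&work->node) || work->canceling;
    }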
* iio: Add iio_push_to_buffers_with_timestamp() helper (Lars-Peter Clausen, 2017-08-31, 1 file, -0/+25)

  Drivers using software buffers often store the timestamp in their data
  buffer before calling iio_push_to_buffers() with that data buffer.
  Storing the timestamp in the buffer usually involves some ugly pointer
  arithmetic. This patch adds a new helper function called
  iio_push_to_buffers_with_timestamp(), which is similar to
  iio_push_to_buffers() but takes an additional timestamp parameter. The
  function helps to hide the ugliness in one central place instead of
  exposing it in every driver.

  If timestamps are enabled for the IIO device,
  iio_push_to_buffers_with_timestamp() will store the timestamp as the last
  element in the buffer before passing the buffer on to
  iio_push_to_buffers(). The buffer needs to be large enough to hold the
  timestamp in this case. If timestamps are disabled,
  iio_push_to_buffers_with_timestamp() will behave just like
  iio_push_to_buffers().

  Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
  Cc: Oleksandr Kravchenko <o.v.kravchenko@globallogic.com>
  Cc: Josh Wu <josh.wu@atmel.com>
  Cc: Denis Ciocca <denis.ciocca@gmail.com>
  Cc: Manuel Stahl <manuel.stahl@iis.fraunhofer.de>
  Cc: Ge Gao <ggao@invensense.com>
  Cc: Peter Meerwald <pmeerw@pmeerw.net>
  Cc: Jacek Anaszewski <j.anaszewski@samsung.com>
  Cc: Fabio Estevam <fabio.estevam@freescale.com>
  Cc: Marek Vasut <marex@denx.de>
  Signed-off-by: Jonathan Cameron <jic23@kernel.org>
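  The helper is essentially the following inline (per the upstream commit;
  details may differ slightly in this tree):

    static inline int iio_push_to_buffers_with_timestamp(struct iio_dev *indio_dev,
            void *data, int64_t timestamp)
    {
            if (indio_dev->scan_timestamp) {
                    /* the timestamp occupies the last 64-bit slot of the scan */
                    size_t ts_offset = indio_dev->scan_bytes / sizeof(int64_t) - 1;
                    ((int64_t *)data)[ts_offset] = timestamp;
            }

            return iio_push_to_buffers(indio_dev, data);
    }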
* crypto: xts - consolidate sanity check for keys (Stephan Mueller, 2017-08-31, 1 file, -0/+27)

  The patch centralizes the XTS key check logic into the service function
  xts_check_key, which is invoked from the different XTS implementations.
  With this, the XTS implementations on ARM, ARM64, PPC and S390 now have a
  sanity check for the XTS keys similar to the other arches.

  In addition, this service function gained a check to ensure that the key
  != the tweak key, which is mandated by FIPS 140-2 IG A.9. As the check is
  not present in the standards defining XTS, it is only enforced in FIPS
  mode of the kernel.

  Signed-off-by: Stephan Mueller <smueller@chronox.de>
  Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
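  The centralized check is essentially the following (per the upstream
  commit; flag handling may differ slightly in this tree):

    static inline int xts_check_key(struct crypto_tfm *tfm,
                                    const u8 *key, unsigned int keylen)
    {
            u32 *flags = &tfm->crt_flags;

            /* the key is two equal-size keys concatenated, so it must be even */
            if (keylen % 2) {
                    *flags |= CRYPTO_TFM_RES_BAD_KEY_LEN;
                    return -EINVAL;
            }

            /* FIPS 140-2 IG A.9: the AES key and the tweak key must differ */
            if (fips_enabled &&
                !crypto_memneq(key, key + (keylen / 2), keylen / 2)) {
                    *flags |= CRYPTO_TFM_RES_WEAK_KEY;
                    return -EINVAL;
            }

            return 0;
    }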
* BACKPORT: FROMLIST: pids: make task_tgid_nr_ns() safe (Oleg Nesterov, 2017-08-31, 2 files, -8/+13)

  This was reported many times, and was even mentioned in commit
  52ee2dfdd4f5 "pids: refactor vnr/nr_ns helpers to make them safe", but
  somehow nobody bothered to fix the obvious problem: task_tgid_nr_ns() is
  not safe because task->group_leader points to nowhere after the exiting
  task passes exit_notify(); rcu_read_lock() can not help.

  We really need to change __unhash_process() to nullify group_leader,
  parent, and real_parent, but this needs some cleanups. Until then we can
  turn task_tgid_nr_ns() into another user of __task_pid_nr_ns() and fix
  the problem.

  Reported-by: Troy Kensinger <tkensinger@google.com>
  Signed-off-by: Oleg Nesterov <oleg@redhat.com>
  Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
  (url: https://patchwork.kernel.org/patch/9913055/)
  Bug: 31495866
  Change-Id: I5e67b02a77e805f71fa3a787249f13c1310f02e2
* compiler-intel.h: Remove duplicate definition (Pranith Kumar, 2017-08-31, 1 file, -3/+0)

  barrier is already defined as __memory_barrier in compiler.h. Remove this
  unnecessary redefinition.

  Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
  Link: http://lkml.kernel.org/r/CAJhHMCAnYPy0%2BqD-1KBnJPLt3XgAjdR12j%2BySSnPgmZcpbE7HQ@mail.gmail.com
  Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
* UPSTREAM: crypto: crypto_memneq - add equality testing of memory regions w/o timing leaks (James Yonan, 2017-08-24, 1 file, -1/+17)

  (cherry picked from commit 5fd53819a37e243f368376d70260873448b83df8 in 3.10.y)
  commit 6bf37e5aa90f18baf5acf4874bca505dd667c37f upstream.

  When comparing MAC hashes, AEAD authentication tags, or other hash values
  in the context of authentication or integrity checking, it is important
  not to leak timing information to a potential attacker, i.e. when
  communication happens over a network.

  Bytewise memory comparisons (such as memcmp) are usually optimized so
  that they return a nonzero value as soon as a mismatch is found. E.g, on
  x86_64/i5 for 512 bytes this can be ~50 cyc for a full mismatch and up to
  ~850 cyc for a full match (cold). This early-return behavior can leak
  timing information as a side channel, allowing an attacker to iteratively
  guess the correct result.

  This patch adds a new method crypto_memneq ("memory not equal to each
  other") to the crypto API that compares memory areas of the same length
  in roughly "constant time" (cache misses could change the timing, but
  since they don't reveal information about the content of the strings
  being compared, they are effectively benign). Iow, best and worst case
  behaviour take the same amount of time to complete (in contrast to
  memcmp).

  Note that crypto_memneq (unlike memcmp) can only be used to test for
  equality or inequality, NOT for lexicographical order. This, however, is
  not an issue for its use-cases within the crypto API.

  We tried to locate all of the places in the crypto API where memcmp was
  being used for authentication or integrity checking, and convert them
  over to crypto_memneq.

  crypto_memneq is declared noinline, placed in its own source file, and
  compiled with optimizations that might increase code size disabled ("Os")
  because a smart compiler (or LTO) might notice that the return value is
  always compared against zero/nonzero, and might then reintroduce the same
  early-return optimization that we are trying to avoid. Using #pragma or
  __attribute__ optimization annotations of the code for disabling
  optimization was avoided as it seems to be considered broken or
  unmaintained for a long time in GCC [1]. Therefore, we work around that
  by specifying the compile flag for memneq.o directly in the Makefile. We
  found that this seems to be most appropriate.

  As we use ("Os"), this patch also provides a loop-free "fast-path" for
  frequently used 16 byte digests. Similarly to kernel library string
  functions, leave an option for future even further optimized architecture
  specific assembler implementations.

  This was a joint work of James Yonan and Daniel Borkmann. Also thanks for
  feedback from Florian Weimer on this and earlier proposals [2].

  [1] http://gcc.gnu.org/ml/gcc/2012-07/msg00211.html
  [2] https://lkml.org/lkml/2013/2/10/131

  Signed-off-by: James Yonan <james@openvpn.net>
  Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
  Cc: Florian Weimer <fw@deneb.enyo.de>
  Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
  Cc: Jason A. Donenfeld <Jason@zx2c4.com>
  Signed-off-by: Willy Tarreau <w@1wt.eu>
  Signed-off-by: Suren Baghdasaryan <surenb@google.com>
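  The core of the generic byte path is a loop that ORs all byte differences
  together, so the runtime does not depend on where (or whether) the
  buffers differ. A sketch, slightly simplified from the upstream code:

    /* Tests equality only, never ordering; returns 0 iff the regions match. */
    noinline unsigned long __crypto_memneq_generic(const void *a,
                                                   const void *b, size_t size)
    {
            unsigned long neq = 0;

            while (size > 0) {
                    neq |= *(unsigned char *)a ^ *(unsigned char *)b;
                    a += 1;
                    b += 1;
                    size -= 1;
            }
            return neq;
    }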
* pstore: Allow prz to control need for locking (Joel Fernandes, 2017-08-20, 1 file, -1/+9)

  In preparation for not locking at all for certain buffers depending on
  whether there's contention, make locking optional depending on how the
  prz was initialized.

  Signed-off-by: Joel Fernandes <joelaf@google.com>
  [kees: moved locking flag into prz instead of via caller arguments]
  Signed-off-by: Kees Cook <keescook@chromium.org>
  Signed-off-by: Joe Maples <joe@frap129.org>
* pstore: Make spinlock per zone instead of global (Joel Fernandes, 2017-08-20, 1 file, -0/+1)

  Currently pstore has a global spinlock for all zones. Since the zones are
  independent and modify different areas of memory, there's no need for a
  global lock, so use a per-zone lock as introduced here. Also, when
  ramoops's ftrace use-case gains a FTRACE_PER_CPU flag later, which splits
  the ftrace memory area into a single zone per CPU, it will eliminate the
  need for locking. In preparation for this, make the locking optional.

  Signed-off-by: Joel Fernandes <joelaf@google.com>
  [kees: updated commit message]
  Signed-off-by: Kees Cook <keescook@chromium.org>
  Signed-off-by: Joe Maples <joe@frap129.org>
* mm: make migrate_page_move_mapping() take an extra_count parameter (Benjamin LaHaise, 2017-07-21, 1 file, -2/+3)

  Needed for f2fs upstream. Referenced from:
  https://github.com/torvalds/linux/commit/8e321fefb0e60bae4e2a28d20fc4fa30758d27c6#diff-8e2530775024feb6361f8a93e833d3c1R342
* crypto: LLVMLinux: Add macro to remove use of VLAIS in crypto code (Behan Webster, 2017-07-21, 1 file, -0/+5)

  Add a macro which replaces the use of a Variable Length Array In Struct
  (VLAIS) with a C99-compliant equivalent. This macro instead allocates the
  appropriate amount of memory using a char array. The new code can be
  compiled with both gcc and clang.

  struct shash_desc contains a flexible array member ctx declared with
  CRYPTO_MINALIGN_ATTR, so sizeof(struct shash_desc) aligns the beginning
  of the array declared after struct shash_desc with long long. No trailing
  padding is required because it is not a struct type that can be used in
  an array.

  The CRYPTO_MINALIGN_ATTR is required so that desc is aligned with long
  long, as would be the case for a struct containing a member with
  CRYPTO_MINALIGN_ATTR. If you want to get to the ctx at the end of the
  shash_desc as before, you can do so using shash_desc_ctx(shash).

  Signed-off-by: Behan Webster <behanw@converseincode.com>
  Reviewed-by: Mark Charlebois <charlebm@gmail.com>
  Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
  Cc: Michał Mirosław <mirqus@gmail.com>
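  The macro in question is essentially SHASH_DESC_ON_STACK (shown per the
  upstream commit; treat exact details as approximate for this tree):

    #define SHASH_DESC_ON_STACK(shash, ctx)                           \
            char __##shash##_desc[sizeof(struct shash_desc) +         \
                    crypto_shash_descsize(ctx)] CRYPTO_MINALIGN_ATTR; \
            struct shash_desc *shash = (struct shash_desc *)__##shash##_desc

  Typical use replaces the old VLAIS declaration:

    SHASH_DESC_ON_STACK(desc, tfm);     /* char array sized for tfm's ctx */
    desc->tfm = tfm;
    /* ... crypto_shash_digest(desc, data, len, out); ... */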
* fscrypt: add support for AES-128-CBC (Daniel Walter, 2017-07-21, 2 files, -6/+12)

  fscrypt provides facilities to use different encryption algorithms which
  are selectable by userspace when setting the encryption policy.
  Currently, only AES-256-XTS for file contents and AES-256-CBC-CTS for
  file names are implemented. This is a clear case of the kernel offering
  the mechanism while userspace selects the policy, similar to what
  dm-crypt and ecryptfs have.

  This patch adds support for using AES-128-CBC for file contents and
  AES-128-CBC-CTS for file name encryption. To mitigate watermarking
  attacks, IVs are generated using the ESSIV algorithm. While AES-CBC is
  actually slightly less secure than AES-XTS from a security point of view,
  there is more widespread hardware support. Using AES-CBC gives us
  acceptable performance while still providing a moderate level of security
  for persistent storage. Especially low-powered embedded devices with
  crypto accelerators such as CAAM or CESA often only support AES-CBC.
  Since using AES-CBC over AES-XTS is basically thought of as a last
  resort, we use AES-128-CBC over AES-256-CBC since it has fewer encryption
  rounds and yields noticeably better performance starting from a file size
  of just a few kB.

  Signed-off-by: Daniel Walter <dwalter@sigma-star.at>
  [david@sigma-star.at: addressed review comments]
  Signed-off-by: David Gstir <david@sigma-star.at>
  Reviewed-by: Eric Biggers <ebiggers@google.com>
  Signed-off-by: Theodore Ts'o <tytso@mit.edu>

  Conflicts:
  	fs/crypto/crypto.c
  	fs/crypto/fscrypt_private.h
  	fs/crypto/keyinfo.c
* fscrypt: inline fscrypt_free_filename() (Eric Biggers, 2017-07-21, 1 file, -1/+6)

  fscrypt_free_filename() only needs to do a kfree() of crypto_buf.name,
  which works well as an inline function. We can skip setting the various
  pointers to NULL, since no user cares about it (the name is always freed
  just before it goes out of scope).

  Signed-off-by: Eric Biggers <ebiggers@google.com>
  Reviewed-by: David Gstir <david@sigma-star.at>
  Signed-off-by: Theodore Ts'o <tytso@mit.edu>
* f2fs: split bio cache (Jaegeuk Kim, 2017-07-21, 1 file, -1/+10)

  Split the DATA/NODE-type bio cache according to temperature, so write IOs
  with the same temperature can be merged in the corresponding bio cache as
  much as possible; otherwise, write IOs of different temperatures
  submitted into one bio cache will always cause the bio to be split.

  Signed-off-by: Chao Yu <yuchao0@huawei.com>
  Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

  Conflicts:
  	include/trace/events/f2fs.h
* f2fs: remove unnecessary read cases in merged IO flow (Jaegeuk Kim, 2017-07-21, 1 file, -1/+1)

  The merged IO flow doesn't need to care about read IOs.

  f2fs_submit_merged_bio      -> f2fs_submit_merged_write
  f2fs_submit_merged_bios     -> f2fs_submit_merged_writes
  f2fs_submit_merged_bio_cond -> f2fs_submit_merged_write_cond

  Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* mm: larger stack guard gap, between vmas (Hugh Dickins, 2017-07-04, 1 file, -28/+25)

  commit 1be7107fbe18eed3e319a6c3e83c78254b693acb upstream.

  The stack guard page is a useful feature to reduce the risk of the stack
  smashing into a different mapping. We have been using a single-page gap,
  which is sufficient to prevent having the stack adjacent to a different
  mapping. But this seems to be insufficient in the light of the stack
  usage in userspace. E.g. glibc uses as large as 64kB alloca() in many
  commonly used functions. Others use constructs like gid_t
  buffer[NGROUPS_MAX], which is 256kB, or stack strings with
  MAX_ARG_STRLEN.

  This will become especially dangerous for suid binaries and the default
  no-limit stack size limit, because those applications can be tricked into
  consuming a large portion of the stack, and a single glibc call could
  jump over the guard page. These attacks are not theoretical,
  unfortunately.

  Make those attacks less probable by increasing the stack guard gap to 1MB
  (on systems with 4k pages; but make it depend on the page size, because
  systems with larger base pages might cap stack allocations in PAGE_SIZE
  units), which should cover larger alloca() and VLA stack allocations. It
  is obviously not a full fix because the problem is somehow inherent, but
  it should reduce the attack space a lot.

  One could argue that the gap size should be configurable from userspace,
  but that can be done later, when somebody finds that the new 1MB is wrong
  for some special-case applications. For now, add a kernel command line
  option (stack_guard_gap) to specify the stack gap size (in page units).

  Implementation-wise, first delete all the old code for the stack guard
  page: because although we could get away with accounting one extra page
  in a stack vma, accounting a larger gap can break userspace - case in
  point, a program run with "ulimit -S -v 20000" failed when the 1MB gap
  was counted for RLIMIT_AS; similar problems could come with RLIMIT_MLOCK
  and strict non-overcommit mode.

  Instead of keeping the gap inside the stack vma, maintain the stack guard
  gap as a gap between vmas: using vm_start_gap() in place of vm_start (or
  vm_end_gap() in place of vm_end if VM_GROWSUP) in just those few places
  which need to respect the gap - mainly arch_get_unmapped_area(), and the
  vma tree's subtree_gap support for that.

  Original-patch-by: Oleg Nesterov <oleg@redhat.com>
  Original-patch-by: Michal Hocko <mhocko@suse.com>
  Signed-off-by: Hugh Dickins <hughd@google.com>
  [wt: backport to 4.11: adjust context]
  [wt: backport to 4.9: adjust context ; kernel doc was not in admin-guide]
  [wt: backport to 4.4: adjust context ; drop ppc hugetlb_radix changes]
  [wt: backport to 3.18: adjust context ; no FOLL_POPULATE ; s390 uses generic arch_get_unmapped_area()]
  [wt: backport to 3.16: adjust context]
  [wt: backport to 3.10: adjust context ; code logic in PARISC's arch_get_unmapped_area() wasn't found ; code inserted into expand_upwards() and expand_downwards() runs under anon_vma lock ; changes for gup.c:faultin_page go to memory.c:__get_user_pages() ; included Hugh Dickins' fixes]
  Signed-off-by: Willy Tarreau <w@1wt.eu>
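  The key new helpers are the gap-aware bounds (from the upstream commit):

    static inline unsigned long vm_start_gap(struct vm_area_struct *vma)
    {
            unsigned long vm_start = vma->vm_start;

            if (vma->vm_flags & VM_GROWSDOWN) {
                    vm_start -= stack_guard_gap;
                    if (vm_start > vma->vm_start)   /* clamp on underflow */
                            vm_start = 0;
            }
            return vm_start;
    }

    static inline unsigned long vm_end_gap(struct vm_area_struct *vma)
    {
            unsigned long vm_end = vma->vm_end;

            if (vma->vm_flags & VM_GROWSUP) {
                    vm_end += stack_guard_gap;
                    if (vm_end < vma->vm_end)       /* clamp on overflow */
                            vm_end = -PAGE_SIZE;
            }
            return vm_end;
    }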
* uksm: remove Mtk aksm & uksm (because it's fugly) (Mister Oyster, 2017-07-04, 6 files, -282/+18)

  Revert "KSM: mediatek: implement Adaptive KSM"
  Revert "mm: uksm: fix maybe-uninitialized warning"
  Revert "UKSM: Add Governors for Higher CPU usage (HighCPU) for more merging, and low cpu usage (Battery) for less battery drain"
  Revert "uksm: use deferrable timer"
  Revert "mm: limit UKSM sleep time instead of failing"
  Revert "uksm: Fix warning"
  Revert "uksm: clean up and remove some (no)inlines"
  Revert "uksm: modify ema logic and tidy up"
  Revert "uksm: enhancements and cleanups"
  Revert "uksm: squashed fixups"
  Revert "UKSM: cast variable as const"
  Revert "UKSM: remove U64_MAX definition"
  Revert "add uksm 0.1.2.3 for v3.10 .ge.46.patch"
* uapi: fix linux/packet_diag.h userspace compilation error (Dmitry V. Levin, 2017-07-04, 1 file, -1/+1)

  commit 745cb7f8a5de0805cade3de3991b7a95317c7c73 upstream.

  Replace MAX_ADDR_LEN with its numeric value to fix the following
  linux/packet_diag.h userspace compilation error:

    /usr/include/linux/packet_diag.h:67:17: error: 'MAX_ADDR_LEN' undeclared here (not in a function)
      __u8 pdmc_addr[MAX_ADDR_LEN];

  This is not the first case in the UAPI where the numeric value of
  MAX_ADDR_LEN is used instead of the symbolic one; uapi/linux/if_link.h
  already does the same:

    $ grep MAX_ADDR_LEN include/uapi/linux/if_link.h
    	__u8 mac[32]; /* MAX_ADDR_LEN */

  There are no UAPI headers besides these two that use MAX_ADDR_LEN.

  Signed-off-by: Dmitry V. Levin <ldv@altlinux.org>
  Acked-by: Pavel Emelyanov <xemul@virtuozzo.com>
  Signed-off-by: David S. Miller <davem@davemloft.net>
  Signed-off-by: Willy Tarreau <w@1wt.eu>
* nfs: Don't increment lock sequence ID after NFS4ERR_MOVED (Chuck Lever, 2017-07-04, 1 file, -1/+2)

  commit 059aa734824165507c65fd30a55ff000afd14983 upstream.

  Xuan Qi reports that the Linux NFSv4 client failed to lock a file that
  was migrated. The steps he observed on the wire:

  1. The client sent a LOCK request to the source server
  2. The source server replied NFS4ERR_MOVED
  3. The client switched to the destination server
  4. The client sent the same LOCK request to the destination server with
     a bumped lock sequence ID
  5. The destination server rejected the LOCK request with
     NFS4ERR_BAD_SEQID

  RFC 3530 section 8.1.5 provides a list of NFS errors which do not bump a
  lock sequence ID. However, RFC 3530 is now obsoleted by RFC 7530. In RFC
  7530 section 9.1.7, this list has been updated by the addition of
  NFS4ERR_MOVED.

  Reported-by: Xuan Qi <xuan.qi@oracle.com>
  Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
  Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
  Signed-off-by: Willy Tarreau <w@1wt.eu>
* can: raw: raw_setsockopt: limit number of can_filter that can be set (Marc Kleine-Budde, 2017-07-04, 1 file, -0/+1)

  commit 332b05ca7a438f857c61a3c21a88489a21532364 upstream.

  This patch adds a check to limit the number of can_filters that can be
  set via setsockopt on CAN_RAW sockets. Otherwise allocations > MAX_ORDER
  are not prevented, resulting in a warning.

  Reference: https://lkml.org/lkml/2016/12/2/230

  Reported-by: Andrey Konovalov <andreyknvl@google.com>
  Tested-by: Andrey Konovalov <andreyknvl@google.com>
  Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
  Signed-off-by: Willy Tarreau <w@1wt.eu>
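  A sketch of the added bound (the 512 limit matches the upstream define;
  the helper wrapper here is hypothetical - upstream performs the check
  inline in raw_setsockopt()):

    /* maximum number of can_filter set via setsockopt() */
    #define CAN_RAW_FILTER_MAX 512

    static int raw_check_filter_len(int optlen)
    {
            if (optlen % sizeof(struct can_filter) != 0)
                    return -EINVAL;

            if (optlen > CAN_RAW_FILTER_MAX * sizeof(struct can_filter))
                    return -EINVAL;

            return 0;       /* safe to copy the filters in */
    }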
* binder: merge aosp-common/3.10 binder drivers (uptodate) (Mister Oyster, 2017-06-18, 1 file, -3/+112)
* KVM: kvm_io_bus_unregister_dev() should never fail (David Hildenbrand, 2017-06-17, 1 file, -2/+2)

  commit 90db10434b163e46da413d34db8d0e77404cc645 upstream.

  No caller currently checks the return value of
  kvm_io_bus_unregister_dev(). This is evil, as all callers silently go on
  freeing their device. A stale reference will remain in the io_bus, and it
  will get used again when the iobus gets torn down on kvm_destroy_vm() -
  leading to use-after-free errors.

  There is nothing the callers could do, except retrying over and over
  again. So let's simply remove the bus altogether, print an error, and
  make sure no one can access this broken bus again (returning -ENOMEM on
  any attempt to access it).

  Fixes: e93f8a0f821e ("KVM: convert io_bus to SRCU")
  Reported-by: Dmitry Vyukov <dvyukov@google.com>
  Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
  Signed-off-by: David Hildenbrand <david@redhat.com>
  Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  [wt: no kvm_io_bus_read_cookie in 3.10, slightly different constructs]
  Signed-off-by: Willy Tarreau <w@1wt.eu>
* kvm: exclude ioeventfd from counting kvm_io_range limit (Amos Kong, 2017-06-17, 1 file, -1/+2)

  commit 6ea34c9b78c10289846db0abeebd6b84d5aca084 upstream.

  We can easily reach the 1000 limit by starting a VM with a couple hundred
  I/O devices (multifunction=on). The hardcoded limit has already been
  adjusted 3 times (6 ~ 200 ~ 300 ~ 1000).

  In userspace, we already have the maximum file descriptor limit to bound
  the ioeventfd count. But kvm_io_bus devices are also used for pit, pic,
  ioapic, and coalesced_mmio; those can't be limited by the maximum file
  descriptor.

  Currently only ioeventfds take too many kvm_io_bus devices, so just
  exclude them from counting against the kvm_io_range limit. Also fixed one
  indentation issue in kvm_host.h.

  Signed-off-by: Amos Kong <akong@redhat.com>
  Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
  Signed-off-by: Gleb Natapov <gleb@redhat.com>
  [wt: next patch depends on this one]
  Signed-off-by: Willy Tarreau <w@1wt.eu>
* can: Fix kernel panic at security_sock_rcv_skb (Eric Dumazet, 2017-06-17, 1 file, -4/+3)

  commit f1712c73714088a7252d276a57126d56c7d37e64 upstream.

  Zhang Yanmin reported crashes [1] and provided a patch adding a
  synchronize_rcu() call in can_rx_unregister().

  The main problem seems to be that the sockets themselves are not RCU
  protected. If CAN uses RCU for delivery, then sockets should be freed
  only after one RCU grace period. Recent kernels could use
  sock_set_flag(sk, SOCK_RCU_FREE), but let's ease stable backports with
  the following fix instead.

  [1]
  BUG: unable to handle kernel NULL pointer dereference at (null)
  IP: [<ffffffff81495e25>] selinux_socket_sock_rcv_skb+0x65/0x2a0

  Call Trace:
  <IRQ>
  [<ffffffff81485d8c>] security_sock_rcv_skb+0x4c/0x60
  [<ffffffff81d55771>] sk_filter+0x41/0x210
  [<ffffffff81d12913>] sock_queue_rcv_skb+0x53/0x3a0
  [<ffffffff81f0a2b3>] raw_rcv+0x2a3/0x3c0
  [<ffffffff81f06eab>] can_rcv_filter+0x12b/0x370
  [<ffffffff81f07af9>] can_receive+0xd9/0x120
  [<ffffffff81f07beb>] can_rcv+0xab/0x100
  [<ffffffff81d362ac>] __netif_receive_skb_core+0xd8c/0x11f0
  [<ffffffff81d36734>] __netif_receive_skb+0x24/0xb0
  [<ffffffff81d37f67>] process_backlog+0x127/0x280
  [<ffffffff81d36f7b>] net_rx_action+0x33b/0x4f0
  [<ffffffff810c88d4>] __do_softirq+0x184/0x440
  [<ffffffff81f9e86c>] do_softirq_own_stack+0x1c/0x30
  <EOI>
  [<ffffffff810c76fb>] do_softirq.part.18+0x3b/0x40
  [<ffffffff810c8bed>] do_softirq+0x1d/0x20
  [<ffffffff81d30085>] netif_rx_ni+0xe5/0x110
  [<ffffffff8199cc87>] slcan_receive_buf+0x507/0x520
  [<ffffffff8167ef7c>] flush_to_ldisc+0x21c/0x230
  [<ffffffff810e3baf>] process_one_work+0x24f/0x670
  [<ffffffff810e44ed>] worker_thread+0x9d/0x6f0
  [<ffffffff810e4450>] ? rescuer_thread+0x480/0x480
  [<ffffffff810ebafc>] kthread+0x12c/0x150
  [<ffffffff81f9ccef>] ret_from_fork+0x3f/0x70

  Reported-by: Zhang Yanmin <yanmin.zhang@intel.com>
  Signed-off-by: Eric Dumazet <edumazet@google.com>
  Acked-by: Oliver Hartkopp <socketcan@hartkopp.net>
  Signed-off-by: David S. Miller <davem@davemloft.net>
  Signed-off-by: Willy Tarreau <w@1wt.eu>
* ACPI / PNP: Reserve ACPI resources at the fs_initcall_sync stage (Rafael J. Wysocki, 2017-06-17, 1 file, -10/+0)

  commit 0294112ee3135fbd15eaa70015af8283642dd970 upstream.

  This effectively reverts the following three commits:

    7bc10388ccdd ACPI / resources: free memory on error in add_region_before()
    0f1b414d1907 ACPI / PNP: Avoid conflicting resource reservations
    b9a5e5e18fbf ACPI / init: Fix the ordering of acpi_reserve_resources()

  (commit b9a5e5e18fbf introduced regressions, some of which, but not all,
  were addressed by commit 0f1b414d1907, and commit 7bc10388ccdd was a
  fixup on top of the latter) and causes ACPI fixed hardware resources to
  be reserved at the fs_initcall_sync stage of system initialization.

  The story is as follows. First, a boot regression was reported due to an
  apparent resource reservation ordering change after a commit that
  shouldn't lead to such changes. Investigation led to the conclusion that
  the problem happened because acpi_reserve_resources() was executed at the
  device_initcall() stage of system initialization, which wasn't strictly
  ordered with respect to driver initialization (and with respect to the
  initialization of the pcieport driver in particular), so a random change
  causing the device initcalls to be run in a different order might break
  things.

  The response to that was to attempt to run acpi_reserve_resources() as
  soon as we knew that ACPI would be in use (commit b9a5e5e18fbf). However,
  that turned out to be too early, because it caused resource reservations
  made by the PNP system driver to fail on at least one system, and that
  failure was addressed by commit 0f1b414d1907. That fix still turned out
  to be insufficient, though, because calling acpi_reserve_resources()
  before the fs_initcall stage of system initialization caused a boot
  regression on the eCAFE EC-800-H20G/S netbook. That meant we could only
  call acpi_reserve_resources() at the fs_initcall initialization stage or
  later, but then we might just as well call it after the PNP
  initialization, in which case commit 0f1b414d1907 wouldn't be necessary
  any more.

  For this reason, the changes made by commit 0f1b414d1907 are reverted
  (along with a memory leak fixup on top of that commit), the changes made
  by commit b9a5e5e18fbf that went too far are reverted too, and
  acpi_reserve_resources() is changed into fs_initcall_sync, which will
  cause it to be executed after the PNP subsystem initialization (which is
  an fs_initcall) and before device initcalls (including the pcieport
  driver initialization), which should avoid the initial issue.

  Link: https://bugzilla.kernel.org/show_bug.cgi?id=100581
  Link: http://marc.info/?t=143092384600002&r=1&w=2
  Link: https://bugzilla.kernel.org/show_bug.cgi?id=99831
  Link: http://marc.info/?t=143389402600001&r=1&w=2
  Fixes: b9a5e5e18fbf "ACPI / init: Fix the ordering of acpi_reserve_resources()"
  Reported-by: Roland Dreier <roland@purestorage.com>
  Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
  Signed-off-by: Willy Tarreau <w@1wt.eu>
* ACPI / PNP: Avoid conflicting resource reservations (Rafael J. Wysocki, 2017-06-17, 1 file, -0/+10)

  commit 0f1b414d190724617eb1cdd615592fa8cd9d0b50 upstream.

  Commit b9a5e5e18fbf "ACPI / init: Fix the ordering of
  acpi_reserve_resources()" overlooked the fact that the memory and/or I/O
  regions reserved by acpi_reserve_resources() may conflict with those
  reserved by the PNP "system" driver.

  If that conflict actually takes place, it causes the reservations made by
  the "system" driver to fail, while before commit b9a5e5e18fbf all
  reservations made by it and by acpi_reserve_resources() would be
  successful. In turn, that allows the resources that haven't been reserved
  by the "system" driver to be used by others (e.g. PCI), which sometimes
  leads to functional problems (up to and including boot failures).

  To fix that issue, introduce a common resource reservation routine,
  acpi_reserve_region(), to be used by both acpi_reserve_resources() and
  the "system" driver, that will track all resources reserved by it and
  avoid making conflicting requests.

  Link: https://bugzilla.kernel.org/show_bug.cgi?id=99831
  Link: http://marc.info/?t=143389402600001&r=1&w=2
  Fixes: b9a5e5e18fbf "ACPI / init: Fix the ordering of acpi_reserve_resources()"
  Reported-by: Roland Dreier <roland@purestorage.com>
  Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
  Signed-off-by: Willy Tarreau <w@1wt.eu>
* locking/static_keys: Add static_key_{en,dis}able() helpers (Peter Zijlstra, 2017-06-17, 1 file, -0/+16)

  commit e33886b38cc82a9fc3b2d655dfc7f50467594138 upstream.

  Add two helpers to make it easier to treat the refcount as boolean.

  [js] do not involve WARN_ON_ONCE as it causes build failures

  Suggested-by: Jason Baron <jasonbaron0@gmail.com>
  Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
  Cc: Andrew Morton <akpm@linux-foundation.org>
  Cc: Linus Torvalds <torvalds@linux-foundation.org>
  Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
  Cc: Peter Zijlstra <peterz@infradead.org>
  Cc: Thomas Gleixner <tglx@linutronix.de>
  Cc: linux-kernel@vger.kernel.org
  Signed-off-by: Ingo Molnar <mingo@kernel.org>
  Signed-off-by: Jiri Slaby <jslaby@suse.cz>
  [wt: only backported for use in next fix ; s/static_key_count(key)/atomic_read(&key->enabled)/]
  Signed-off-by: Willy Tarreau <w@1wt.eu>
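  With the backport notes above applied (WARN_ON_ONCE dropped per [js],
  static_key_count() open-coded as atomic_read() per [wt]), the helpers
  look essentially like this:

    static inline void static_key_enable(struct static_key *key)
    {
            int count = atomic_read(&key->enabled);

            if (!count)
                    static_key_slow_inc(key);
    }

    static inline void static_key_disable(struct static_key *key)
    {
            int count = atomic_read(&key->enabled);

            if (count)
                    static_key_slow_dec(key);
    }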
* tracing: Add #undef to fix compile error (Rik van Riel, 2017-06-17, 1 file, -0/+1)

  commit bf7165cfa23695c51998231c4efa080fe1d3548d upstream.

  There are several trace include files that define TRACE_INCLUDE_FILE.
  Include several of them in the same .c file (as I currently have in some
  code I am working on), and the compile will blow up with:

    warning: "TRACE_INCLUDE_FILE" redefined
    #define TRACE_INCLUDE_FILE syscalls

  Every other include file in include/trace/events/ avoids that issue by
  having a #undef TRACE_INCLUDE_FILE before the #define; syscalls.h should
  have one, too.

  Link: http://lkml.kernel.org/r/20160928225554.13bd7ac6@annuminas.surriel.com
  Fixes: b8007ef74222 ("tracing: Separate raw syscall from syscall tracer")
  Signed-off-by: Rik van Riel <riel@redhat.com>
  Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
  Signed-off-by: Willy Tarreau <w@1wt.eu>
* nlm: Ensure callback code also checks that the files match (Trond Myklebust, 2017-06-17, 1 file, -1/+2)

  commit 251af29c320d86071664f02c76f0d063a19fefdf upstream.

  It is not sufficient to just check that the lock pids match when granting
  a callback; we also need to ensure that we're granting the callback on
  the right file.

  Reported-by: Pankaj Singh <psingh.ait@gmail.com>
  Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
  Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
  Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
  Signed-off-by: Willy Tarreau <w@1wt.eu>
* RDMA/core: Fix incorrect structure packing for booleans (Jason Gunthorpe, 2017-06-17, 1 file, -3/+3)

  commit 55efcfcd7776165b294f8b5cd6e05ca00ec89b7c upstream.

  The RDMA core uses ib_pack() to convert from unpacked CPU structs to
  on-the-wire bitpacked structs. This process requires that 1-bit fields
  are declared as u8 in the unpacked struct; otherwise the packing process
  does not read the value properly and the packed result is wired to 0.
  Several places wrongly used int.

  Crucially, this means the kernel has never set reversible correctly in
  the path record request: it has always asked for irreversible paths even
  if the ULP requests otherwise. When the kernel is used with an SM that
  supports this feature, it completely breaks communication management if
  reversible paths are not properly requested. The only reason this ever
  worked is because opensm ignores the reversible bit.

  Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
  Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
  Signed-off-by: Doug Ledford <dledford@redhat.com>
  Signed-off-by: Willy Tarreau <w@1wt.eu>
* netlabel: out of bound access in cipso_v4_validate() (Eric Dumazet, 2017-06-17, 1 file, -0/+4)

  commit d71b7896886345c53ef1d84bda2bc758554f5d61 upstream.

  syzkaller found another out-of-bound access in ip_options_compile(), or
  more exactly in cipso_v4_validate().

  Fixes: 20e2a8648596 ("cipso: handle CIPSO options correctly when NetLabel is disabled")
  Fixes: 446fda4f2682 ("[NetLabel]: CIPSOv4 engine")
  Signed-off-by: Eric Dumazet <edumazet@google.com>
  Reported-by: Dmitry Vyukov <dvyukov@google.com>
  Cc: Paul Moore <paul@paul-moore.com>
  Acked-by: Paul Moore <paul@paul-moore.com>
  Signed-off-by: David S. Miller <davem@davemloft.net>
  Signed-off-by: Willy Tarreau <w@1wt.eu>