aboutsummaryrefslogtreecommitdiff
Commit message (Collapse)AuthorAgeFilesLines
...
* mm: limit UKSM sleep time instead of failingRyan Pennucci2017-04-111-1/+4
|
* uksm: Fix warningJoe Maples2017-04-111-1/+1
| | | | Signed-off-by: Joe Maples <joe@frap129.org>
* uksm: clean up and remove some (no)inlinesRyan Pennucci2017-04-111-13/+7
|
* uksm: modify ema logic and tidy upRyan Pennucci2017-04-111-57/+41
|
* uksm: enhancements and cleanupsRyan Pennucci2017-04-111-69/+169
| | | | | | | | | | | | - Replace the old throttle multiplier with a divisor - Fix up the logic for saved_pages_to_scan - Fix a couple memory leaks and assorted tiny bugs - Add a cpu_scales tuner to better configure rung CPU usage Signed-off-by: Joe Maples <joe@frap129.org> Conflicts: mm/uksm.c
* uksm: squashed fixupsRyan Pennucci2017-04-111-121/+96
| | | | | | | | | | | | | | | | | | | | | | uksm: cleanups and scheduling changes - Fix deprecated function warnings - Use nice 15 instead of 5 - Use fixed periods instead of adjusting - Rewrite pages_to_scan logic - Modify kthread behavior to freeze more aggressively uksm: fixups - enforce usage limit across all rungs - mark uksm_do_scan hot - frob a few tuning knobs uksm: back off during load Signed-off-by: Joe Maples <joe@frap129.org> Conflicts: mm/uksm.c
* UKSM: cast variable as constNathan Chancellor2017-04-111-1/+1
| | | | | | | | | | | | gcc 4.9 produces this error: mm/uksm.c: In function 'is_full_zero': mm/uksm.c:169:23: warning: initialization discards 'const' qualifier from pointer target type error, forbidden warning: uksm.c:169 Casting *src as a constant fixes this. Could also remove the const from the parameter. Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
* UKSM: remove U64_MAX definitionNathan Chancellor2017-04-111-1/+0
| | | | | | This is already defined in include/linux/kernel.h, causing a compilation error; UKSM has not been updated for the later versions of 3.10 Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
* mm,ksm: add __GFP_HIGH to the allocation in alloc_stable_node()zhong jiang2017-04-111-1/+6
| | | | | | | | | | | | | | | | | | | | According to Hugh's suggestion, alloc_stable_node() with GFP_KERNEL can in rare cases cause a hung task warning. At present, if alloc_stable_node() allocation fails, two break_cows may want to allocate a couple of pages, and the issue will come up when free memory is under pressure. We fix it by adding __GFP_HIGH to GFP, to grant access to memory reserves, increasing the likelihood of allocation success. [akpm@linux-foundation.org: tweak comment] Link: http://lkml.kernel.org/r/1474354484-58233-1-git-send-email-zhongjiang@huawei.com Signed-off-by: zhong jiang <zhongjiang@huawei.com> Suggested-by: Hugh Dickins <hughd@google.com> Acked-by: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* mm,ksm: fix endless looping in allocating memory when ksm enablezhong jiang2017-04-111-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I hit the following hung task when runing a OOM LTP test case with 4.1 kernel. Call trace: [<ffffffc000086a88>] __switch_to+0x74/0x8c [<ffffffc000a1bae0>] __schedule+0x23c/0x7bc [<ffffffc000a1c09c>] schedule+0x3c/0x94 [<ffffffc000a1eb84>] rwsem_down_write_failed+0x214/0x350 [<ffffffc000a1e32c>] down_write+0x64/0x80 [<ffffffc00021f794>] __ksm_exit+0x90/0x19c [<ffffffc0000be650>] mmput+0x118/0x11c [<ffffffc0000c3ec4>] do_exit+0x2dc/0xa74 [<ffffffc0000c46f8>] do_group_exit+0x4c/0xe4 [<ffffffc0000d0f34>] get_signal+0x444/0x5e0 [<ffffffc000089fcc>] do_signal+0x1d8/0x450 [<ffffffc00008a35c>] do_notify_resume+0x70/0x78 The oom victim cannot terminate because it needs to take mmap_sem for write while the lock is held by ksmd for read which loops in the page allocator ksm_do_scan scan_get_next_rmap_item down_read get_next_rmap_item alloc_rmap_item #ksmd will loop permanently. There is no way forward because the oom victim cannot release any memory in 4.1 based kernel. Since 4.6 we have the oom reaper which would solve this problem because it would release the memory asynchronously. Nevertheless we can relax alloc_rmap_item requirements and use __GFP_NORETRY because the allocation failure is acceptable as ksm_do_scan would just retry later after the lock got dropped. Such a patch would be also easy to backport to older stable kernels which do not have oom_reaper. While we are at it add GFP_NOWARN so the admin doesn't have to be alarmed by the allocation failure. Link: http://lkml.kernel.org/r/1474165570-44398-1-git-send-email-zhongjiang@huawei.com Signed-off-by: zhong jiang <zhongjiang@huawei.com> Suggested-by: Hugh Dickins <hughd@google.com> Suggested-by: Michal Hocko <mhocko@suse.cz> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Hugh Dickins <hughd@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* ksm: fix conflict between mmput and scan_get_next_rmap_itemZhou Chengming2017-04-111-5/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A concurrency issue about KSM in the function scan_get_next_rmap_item. task A (ksmd): |task B (the mm's task): | mm = slot->mm; | down_read(&mm->mmap_sem); | | ... | | spin_lock(&ksm_mmlist_lock); | | ksm_scan.mm_slot go to the next slot; | | spin_unlock(&ksm_mmlist_lock); | |mmput() -> | ksm_exit(): | |spin_lock(&ksm_mmlist_lock); |if (mm_slot && ksm_scan.mm_slot != mm_slot) { | if (!mm_slot->rmap_list) { | easy_to_free = 1; | ... | |if (easy_to_free) { | mmdrop(mm); | ... | |So this mm_struct may be freed in the mmput(). | up_read(&mm->mmap_sem); | As we can see above, the ksmd thread may access a mm_struct that already been freed to the kmem_cache. Suppose a fork will get this mm_struct from the kmem_cache, the ksmd thread then call up_read(&mm->mmap_sem), will cause mmap_sem.count to become -1. As suggested by Andrea Arcangeli, unmerge_and_remove_all_rmap_items has the same SMP race condition, so fix it too. My prev fix in function scan_get_next_rmap_item will introduce a different SMP race condition, so just invert the up_read/spin_unlock order as Andrea Arcangeli said. Link: http://lkml.kernel.org/r/1462708815-31301-1-git-send-email-zhouchengming1@huawei.com Signed-off-by: Zhou Chengming <zhouchengming1@huawei.com> Suggested-by: Andrea Arcangeli <aarcange@redhat.com> Reviewed-by: Andrea Arcangeli <aarcange@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Geliang Tang <geliangtang@163.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Hanjun Guo <guohanjun@huawei.com> Cc: Ding Tianhong <dingtianhong@huawei.com> Cc: Li Bin <huawei.libin@huawei.com> Cc: Zhen Lei <thunder.leizhen@huawei.com> Cc: Xishi Qiu <qiuxishi@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* mm/ksm.c: mark stable page dirtyMinchan Kim2017-04-111-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The MADV_FREE patchset changes page reclaim to simply free a clean anonymous page with no dirty ptes, instead of swapping it out; but KSM uses clean write-protected ptes to reference the stable ksm page. So be sure to mark that page dirty, so it's never mistakenly discarded. [hughd@google.com: adjusted comments] Signed-off-by: Minchan Kim <minchan@kernel.org> Acked-by: Hugh Dickins <hughd@google.com> Cc: "James E.J. Bottomley" <jejb@parisc-linux.org> Cc: "Kirill A. Shutemov" <kirill@shutemov.name> Cc: Shaohua Li <shli@kernel.org> Cc: <yalin.wang2010@gmail.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chen Gang <gang.chen.5i5j@gmail.com> Cc: Chris Zankel <chris@zankel.net> Cc: Daniel Micay <danielmicay@gmail.com> Cc: Darrick J. Wong <darrick.wong@oracle.com> Cc: David S. Miller <davem@davemloft.net> Cc: Helge Deller <deller@gmx.de> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Jason Evans <je@fb.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Kirill A. Shutemov <kirill@shutemov.name> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mika Penttil <mika.penttila@nextfour.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Richard Henderson <rth@twiddle.net> Cc: Rik van Riel <riel@redhat.com> Cc: Roland Dreier <roland@kernel.org> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Shaohua Li <shli@kernel.org> Cc: Will Deacon <will.deacon@arm.com> Cc: Wu Fengguang <fengguang.wu@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* mm/ksm.c: use list_for_each_entry_safeGeliang Tang2017-04-111-13/+7
| | | | | | | | | | Use list_for_each_entry_safe() instead of list_for_each_safe() to simplify the code. Signed-off-by: Geliang Tang <geliangtang@163.com> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* ksm: unstable_tree_search_insert error checking cleanupAndrea Arcangeli2017-04-111-2/+3
| | | | | | | | | | | | | | | get_mergeable_page() can only return NULL (also in case of errors) or the pinned mergeable page. It can't return an error different than NULL. This optimizes away the unnecessary error check. Add a return after the "out:" label in the callee to make it more readable. Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Acked-by: Hugh Dickins <hughd@google.com> Cc: Petr Holasek <pholasek@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* ksm: use find_mergeable_vma in try_to_merge_with_ksm_pageAndrea Arcangeli2017-04-111-6/+2
| | | | | | | | | | | | | | Doing the VM_MERGEABLE check after the page == kpage check won't provide any meaningful benefit. The !vma->anon_vma check of find_mergeable_vma is the only superfluous bit in using find_mergeable_vma because the !PageAnon check of try_to_merge_one_page() implicitly checks for that, but it still looks cleaner to share the same find_mergeable_vma(). Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Acked-by: Hugh Dickins <hughd@google.com> Cc: Petr Holasek <pholasek@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* ksm: use the helper method to do the hlist_empty checkAndrea Arcangeli2017-04-111-1/+1
| | | | | | | | | | | | This just uses the helper function to cleanup the assumption on the hlist_node internals. Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Acked-by: Hugh Dickins <hughd@google.com> Cc: Petr Holasek <pholasek@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Joe Maples <joe@frap129.org>
* ksm: don't fail stable tree lookups if walking over stale stable_nodesAndrea Arcangeli2017-04-111-5/+27
| | | | | | | | | | | | | | | | | | | | | | | The stable_nodes can become stale at any time if the underlying pages gets freed. The stable_node gets collected and removed from the stable rbtree if that is detected during the rbtree lookups. Don't fail the lookup if running into stale stable_nodes, just restart the lookup after collecting the stale stable_nodes. Otherwise the CPU spent in the preparation stage is wasted and the lookup must be repeated at the next loop potentially failing a second time in a second stale stable_node. If we don't prune aggressively we delay the merging of the unstable node candidates and at the same time we delay the freeing of the stale stable_nodes. Keeping stale stable_nodes around wastes memory and it can't provide any benefit. Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Acked-by: Hugh Dickins <hughd@google.com> Cc: Petr Holasek <pholasek@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Joe Maples <joe@frap129.org>
* ksm: add cond_resched() to the rmap_walksAndrea Arcangeli2017-04-112-0/+5
| | | | | | | | | | | | | | | While at it add it to the file and anon walks too. Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Acked-by: Hugh Dickins <hughd@google.com> Cc: Petr Holasek <pholasek@redhat.com> Acked-by: Davidlohr Bueso <dbueso@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Joe Maples <joe@frap129.org> Conflicts: mm/rmap.c
* ksm: Don't bug on pmd_trans_hugeJoe Maples2017-04-111-1/+0
| | | | Signed-off-by: Joe Maples <joe@frap129.org>
* crypto: lz4,lz4hc - fix decompressionKOVACS Krisztian2017-04-112-2/+2
| | | | | | | | | | | | | | | | | | | The lz4 library has two functions for decompression, with slightly different signatures and behaviour. The lz4_decompress_crypto() function seemed to be using the one that assumes that the decompressed length is known in advance. This patch switches to the other decompression function and makes sure that the length of the decompressed output is properly returned to the caller. The same issue was present in the lz4hc algorithm. Coincidentally, this change also makes very basic lz4 and lz4hc compression tests in testmgr pass. Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: add lz4 Cryptographic APIChanho Min2017-04-114-1/+230
| | | | | | | | | | | | | | | | Add support for lz4 and lz4hc compression algorithm using the lib/lz4/* codebase. [akpm@linux-foundation.org: fix warnings] Signed-off-by: Chanho Min <chanho.min@lge.com> Cc: "Darrick J. Wong" <djwong@us.ibm.com> Cc: Bob Pearson <rpearson@systemfabricworks.com> Cc: Richard Weinberger <richard@nod.at> Cc: Herbert Xu <herbert@gondor.hengli.com.au> Cc: Yann Collet <yann.collet.73@gmail.com> Cc: Kyungsik Lee <kyungsik.lee@lge.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* lib: lz4: Set ARM_EFFICIENT_UNALIGNED_ACCESSChristopher Fries2017-04-111-0/+1
| | | | | | | | | | | | | | | | | | Set ARM_EFFICIENT_UNALIGNED_ACCESS to improve performance in lz4 compression and decompression. On msm8x26 cortex-a7, LZO LZ4 LZ4 w/ UA decompress (bs=4k) 121.21 115.52 148.7 LZO LZ4 LZ4 w/ UA compress (bs=4k) 37.5 34.5 44.8 ported from http://gerrit.mot.com/#/c/697567/ Change-Id: I95db669cdf542767b356e9c5bc2d6a272fb111d4 Signed-off-by: Jiangli Yuan <jlyuan@motorola.com>
* lib/sort: Add 64 bit swap functionDaniel Wagner2017-04-111-2/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In case the call side is not providing a swap function, we either use a 32 bit or a generic swap function. When swapping around pointers on 64 bit architectures falling back to use the generic swap function seems like an unnecessary waste. There at least 9 users ('sort' is of difficult to grep for) of sort() and all of them use the sort function without a customized swap function. Furthermore, they are all using pointers to swap around: arch/x86/kernel/e820.c:sanitize_e820_map() arch/x86/mm/extable.c:sort_extable() drivers/acpi/fan.c:acpi_fan_get_fps() fs/btrfs/super.c:btrfs_descending_sort_devices() fs/xfs/libxfs/xfs_dir2_block.c:xfs_dir2_sf_to_block() kernel/range.c:clean_sort_range() mm/memcontrol.c:__mem_cgroup_usage_register_event() sound/pci/hda/hda_auto_parser.c:snd_hda_parse_pin_defcfg() sound/pci/hda/hda_auto_parser.c:sort_pins_by_sequence() Obviously, we could improve the swap for other sizes as well but this is overkill at this point. A simple test shows sorting a 400 element array (try to stay in one page) with either with u32_swap() or u64_swap() show that the theory actually works. This test was done on a x86_64 (Intel Xeon E5-4610) machine. - swap_32: NumSamples = 100; Min = 48.00; Max = 49.00 Mean = 48.320000; Variance = 0.217600; SD = 0.466476; Median 48.000000 each * represents a count of 1 48.0000 - 48.1000 [ 68]: ******************************************************************** 48.1000 - 48.2000 [ 0]: 48.2000 - 48.3000 [ 0]: 48.3000 - 48.4000 [ 0]: 48.4000 - 48.5000 [ 0]: 48.5000 - 48.6000 [ 0]: 48.6000 - 48.7000 [ 0]: 48.7000 - 48.8000 [ 0]: 48.8000 - 48.9000 [ 0]: 48.9000 - 49.0000 [ 32]: ******************************** - swap_64: NumSamples = 100; Min = 44.00; Max = 63.00 Mean = 48.250000; Variance = 18.687500; SD = 4.322904; Median 47.000000 each * represents a count of 1 44.0000 - 45.9000 [ 15]: *************** 45.9000 - 47.8000 [ 37]: ************************************* 47.8000 - 49.7000 [ 39]: *************************************** 49.7000 - 51.6000 [ 0]: 51.6000 - 53.5000 [ 0]: 53.5000 - 55.4000 [ 0]: 55.4000 - 57.3000 [ 0]: 57.3000 - 59.2000 [ 1]: * 59.2000 - 61.1000 [ 3]: *** 61.1000 - 63.0000 [ 5]: ***** - swap_72: NumSamples = 100; Min = 53.00; Max = 71.00 Mean = 55.070000; Variance = 21.565100; SD = 4.643824; Median 53.000000 each * represents a count of 1 53.0000 - 54.8000 [ 73]: ************************************************************************* 54.8000 - 56.6000 [ 9]: ********* 56.6000 - 58.4000 [ 9]: ********* 58.4000 - 60.2000 [ 0]: 60.2000 - 62.0000 [ 0]: 62.0000 - 63.8000 [ 0]: 63.8000 - 65.6000 [ 0]: 65.6000 - 67.4000 [ 1]: * 67.4000 - 69.2000 [ 4]: **** 69.2000 - 71.0000 [ 4]: **** - test program: static int cmp_32(const void *a, const void *b) { u32 l = *(u32 *)a; u32 r = *(u32 *)b; if (l < r) return -1; if (l > r) return 1; return 0; } static int cmp_64(const void *a, const void *b) { u64 l = *(u64 *)a; u64 r = *(u64 *)b; if (l < r) return -1; if (l > r) return 1; return 0; } static int cmp_72(const void *a, const void *b) { u32 l = get_unaligned((u32 *) a); u32 r = get_unaligned((u32 *) b); if (l < r) return -1; if (l > r) return 1; return 0; } static void init_array32(void *array) { u32 *a = array; int i; a[0] = 3821; for (i = 1; i < ARRAY_ELEMENTS; i++) a[i] = next_pseudo_random32(a[i-1]); } static void init_array64(void *array) { u64 *a = array; int i; a[0] = 3821; for (i = 1; i < ARRAY_ELEMENTS; i++) a[i] = next_pseudo_random32(a[i-1]); } static void init_array72(void *array) { char *p; u32 v; int i; v = 3821; for (i = 0; i < ARRAY_ELEMENTS; i++) { p = (char *)array + (i * 9); put_unaligned(v, (u32*) p); v = next_pseudo_random32(v); } } static void sort_test(void (*init)(void *array), int (*cmp) (const void *, const void *), void *array, size_t size) { ktime_t start, stop; int i; for (i = 0; i < 10000; i++) { init(array); local_irq_disable(); start = ktime_get(); sort(array, ARRAY_ELEMENTS, size, cmp, NULL); stop = ktime_get(); local_irq_enable(); if (i > 10000 - 101) pr_info("%lld\n", ktime_to_us(ktime_sub(stop, start))); } } static void *create_array(size_t size) { void *array; array = kmalloc(ARRAY_ELEMENTS * size, GFP_KERNEL); if (!array) return NULL; return array; } static int perform_test(size_t size) { void *array; array = create_array(size); if (!array) return -ENOMEM; pr_info("test element size %d bytes\n", (int)size); switch (size) { case 4: sort_test(init_array32, cmp_32, array, size); break; case 8: sort_test(init_array64, cmp_64, array, size); break; case 9: sort_test(init_array72, cmp_72, array, size); break; } kfree(array); return 0; } static int __init sort_tests_init(void) { int err; err = perform_test(sizeof(u32)); if (err) return err; err = perform_test(sizeof(u64)); if (err) return err; err = perform_test(sizeof(u64)+1); if (err) return err; return 0; } static void __exit sort_tests_exit(void) { } module_init(sort_tests_init); module_exit(sort_tests_exit); MODULE_LICENSE("GPL v2"); MODULE_AUTHOR("Daniel Wagner"); MODULE_DESCRIPTION("sort perfomance tests"); Signed-off-by: Daniel Wagner <daniel.wagner@bmw-carit.de> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* lib/sort.c: move include inside #if 0Rasmus Villemoes2017-04-111-1/+1
| | | | | | | | | | | The sort function and its helpers don't do memory allocation, so the slab.h include is redundant. Move it inside the #if 0 protecting the self-test, similar to how it is done in lib/list_sort.c. This removes over 450 lines from the generated dependency file. Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* lib/sort.c: use simpler includesRasmus Villemoes2017-04-111-2/+2
| | | | | | | | | | | | sort.c doesn't use facilities from kernel.h, but does use some types defined in linux/types.h. Include the latter directly instead of relying on some other header doing it. Similarly, include linux/export.h directly instead of through module.h. This removes 80 lines from the dependency file .sort.o.cmd. Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* lib: introduce upper case hex ascii helpersAndre Naujoks2017-04-112-0/+13
| | | | | | | | | To be able to use the hex ascii functions in case sensitive environments the array hex_asc_upper[] and the needed functions for hex_byte_pack_upper() are introduced. Signed-off-by: Andre Naujoks <nautsch2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* ALSA: timer: fix NULL pointer dereference on memory allocation failureVegard Nossum2017-04-111-2/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 8ddc05638ee42b18ba4fe99b5fb647fa3ad20456 upstream. I hit this with syzkaller: kasan: CONFIG_KASAN_INLINE enabled kasan: GPF could be caused by NULL-ptr deref or user memory access general protection fault: 0000 [#1] PREEMPT SMP KASAN CPU: 0 PID: 1327 Comm: a.out Not tainted 4.8.0-rc2+ #190 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2014 task: ffff88011278d600 task.stack: ffff8801120c0000 RIP: 0010:[<ffffffff82c8ba07>] [<ffffffff82c8ba07>] snd_hrtimer_start+0x77/0x100 RSP: 0018:ffff8801120c7a60 EFLAGS: 00010006 RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 0000000000000007 RDX: 0000000000000009 RSI: 1ffff10023483091 RDI: 0000000000000048 RBP: ffff8801120c7a78 R08: ffff88011a5cf768 R09: ffff88011a5ba790 R10: 0000000000000002 R11: ffffed00234b9ef1 R12: ffff880114843980 R13: ffffffff84213c00 R14: ffff880114843ab0 R15: 0000000000000286 FS: 00007f72958f3700(0000) GS:ffff88011aa00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000603001 CR3: 00000001126ab000 CR4: 00000000000006f0 Stack: ffff880114843980 ffff880111eb2dc0 ffff880114843a34 ffff8801120c7ad0 ffffffff82c81ab1 0000000000000000 ffffffff842138e0 0000000100000000 ffff880111eb2dd0 ffff880111eb2dc0 0000000000000001 ffff880111eb2dc0 Call Trace: [<ffffffff82c81ab1>] snd_timer_start1+0x331/0x670 [<ffffffff82c85bfd>] snd_timer_start+0x5d/0xa0 [<ffffffff82c8795e>] snd_timer_user_ioctl+0x88e/0x2830 [<ffffffff8159f3a0>] ? __follow_pte.isra.49+0x430/0x430 [<ffffffff82c870d0>] ? snd_timer_pause+0x80/0x80 [<ffffffff815a26fa>] ? do_wp_page+0x3aa/0x1c90 [<ffffffff8132762f>] ? put_prev_entity+0x108f/0x21a0 [<ffffffff82c870d0>] ? snd_timer_pause+0x80/0x80 [<ffffffff816b0733>] do_vfs_ioctl+0x193/0x1050 [<ffffffff813510af>] ? cpuacct_account_field+0x12f/0x1a0 [<ffffffff816b05a0>] ? ioctl_preallocate+0x200/0x200 [<ffffffff81002f2f>] ? syscall_trace_enter+0x3cf/0xdb0 [<ffffffff815045ba>] ? __context_tracking_exit.part.4+0x9a/0x1e0 [<ffffffff81002b60>] ? exit_to_usermode_loop+0x190/0x190 [<ffffffff82001a97>] ? check_preemption_disabled+0x37/0x1e0 [<ffffffff81d93889>] ? security_file_ioctl+0x89/0xb0 [<ffffffff816b167f>] SyS_ioctl+0x8f/0xc0 [<ffffffff816b15f0>] ? do_vfs_ioctl+0x1050/0x1050 [<ffffffff81005524>] do_syscall_64+0x1c4/0x4e0 [<ffffffff83c32b2a>] entry_SYSCALL64_slow_path+0x25/0x25 Code: c7 c7 c4 b9 c8 82 48 89 d9 4c 89 ee e8 63 88 7f fe e8 7e 46 7b fe 48 8d 7b 48 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <0f> b6 04 02 84 c0 74 04 84 c0 7e 65 80 7b 48 00 74 0e e8 52 46 RIP [<ffffffff82c8ba07>] snd_hrtimer_start+0x77/0x100 RSP <ffff8801120c7a60> ---[ end trace 5955b08db7f2b029 ]--- This can happen if snd_hrtimer_open() fails to allocate memory and returns an error, which is currently not checked by snd_timer_open(): ioctl(SNDRV_TIMER_IOCTL_SELECT) - snd_timer_user_tselect() - snd_timer_close() - snd_hrtimer_close() - (struct snd_timer *) t->private_data = NULL - snd_timer_open() - snd_hrtimer_open() - kzalloc() fails; t->private_data is still NULL ioctl(SNDRV_TIMER_IOCTL_START) - snd_timer_user_start() - snd_timer_start() - snd_timer_start1() - snd_hrtimer_start() - t->private_data == NULL // boom [js] no put_device in 3.12 yet Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
* ALSA: timer: fix division by zero after SNDRV_TIMER_IOCTL_CONTINUEVegard Nossum2017-04-111-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 6b760bb2c63a9e322c0e4a0b5daf335ad93d5a33 upstream. I got this: divide error: 0000 [#1] PREEMPT SMP KASAN CPU: 1 PID: 1327 Comm: a.out Not tainted 4.8.0-rc2+ #189 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2014 task: ffff8801120a9580 task.stack: ffff8801120b0000 RIP: 0010:[<ffffffff82c8bd9a>] [<ffffffff82c8bd9a>] snd_hrtimer_callback+0x1da/0x3f0 RSP: 0018:ffff88011aa87da8 EFLAGS: 00010006 RAX: 0000000000004f76 RBX: ffff880112655e88 RCX: 0000000000000000 RDX: 0000000000000000 RSI: ffff880112655ea0 RDI: 0000000000000001 RBP: ffff88011aa87e00 R08: ffff88013fff905c R09: ffff88013fff9048 R10: ffff88013fff9050 R11: 00000001050a7b8c R12: ffff880114778a00 R13: ffff880114778ab4 R14: ffff880114778b30 R15: 0000000000000000 FS: 00007f071647c700(0000) GS:ffff88011aa80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000603001 CR3: 0000000112021000 CR4: 00000000000006e0 Stack: 0000000000000000 ffff880114778ab8 ffff880112655ea0 0000000000004f76 ffff880112655ec8 ffff880112655e80 ffff880112655e88 ffff88011aa98fc0 00000000b97ccf2b dffffc0000000000 ffff88011aa98fc0 ffff88011aa87ef0 Call Trace: <IRQ> [<ffffffff813abce7>] __hrtimer_run_queues+0x347/0xa00 [<ffffffff82c8bbc0>] ? snd_hrtimer_close+0x130/0x130 [<ffffffff813ab9a0>] ? retrigger_next_event+0x1b0/0x1b0 [<ffffffff813ae1a6>] ? hrtimer_interrupt+0x136/0x4b0 [<ffffffff813ae220>] hrtimer_interrupt+0x1b0/0x4b0 [<ffffffff8120f91e>] local_apic_timer_interrupt+0x6e/0xf0 [<ffffffff81227ad3>] ? kvm_guest_apic_eoi_write+0x13/0xc0 [<ffffffff83c35086>] smp_apic_timer_interrupt+0x76/0xa0 [<ffffffff83c3416c>] apic_timer_interrupt+0x8c/0xa0 <EOI> [<ffffffff83c3239c>] ? _raw_spin_unlock_irqrestore+0x2c/0x60 [<ffffffff82c8185d>] snd_timer_start1+0xdd/0x670 [<ffffffff82c87015>] snd_timer_continue+0x45/0x80 [<ffffffff82c88100>] snd_timer_user_ioctl+0x1030/0x2830 [<ffffffff8159f3a0>] ? __follow_pte.isra.49+0x430/0x430 [<ffffffff82c870d0>] ? snd_timer_pause+0x80/0x80 [<ffffffff815a26fa>] ? do_wp_page+0x3aa/0x1c90 [<ffffffff815aa4f8>] ? handle_mm_fault+0xbc8/0x27f0 [<ffffffff815a9930>] ? __pmd_alloc+0x370/0x370 [<ffffffff82c870d0>] ? snd_timer_pause+0x80/0x80 [<ffffffff816b0733>] do_vfs_ioctl+0x193/0x1050 [<ffffffff816b05a0>] ? ioctl_preallocate+0x200/0x200 [<ffffffff81002f2f>] ? syscall_trace_enter+0x3cf/0xdb0 [<ffffffff815045ba>] ? __context_tracking_exit.part.4+0x9a/0x1e0 [<ffffffff81002b60>] ? exit_to_usermode_loop+0x190/0x190 [<ffffffff82001a97>] ? check_preemption_disabled+0x37/0x1e0 [<ffffffff81d93889>] ? security_file_ioctl+0x89/0xb0 [<ffffffff816b167f>] SyS_ioctl+0x8f/0xc0 [<ffffffff816b15f0>] ? do_vfs_ioctl+0x1050/0x1050 [<ffffffff81005524>] do_syscall_64+0x1c4/0x4e0 [<ffffffff83c32b2a>] entry_SYSCALL64_slow_path+0x25/0x25 Code: e8 fc 42 7b fe 8b 0d 06 8a 50 03 49 0f af cf 48 85 c9 0f 88 7c 01 00 00 48 89 4d a8 e8 e0 42 7b fe 48 8b 45 c0 48 8b 4d a8 48 99 <48> f7 f9 49 01 c7 e8 cb 42 7b fe 48 8b 55 d0 48 b8 00 00 00 00 RIP [<ffffffff82c8bd9a>] snd_hrtimer_callback+0x1da/0x3f0 RSP <ffff88011aa87da8> ---[ end trace 6aa380f756a21074 ]--- The problem happens when you call ioctl(SNDRV_TIMER_IOCTL_CONTINUE) on a completely new/unused timer -- it will have ->sticks == 0, which causes a divide by 0 in snd_hrtimer_callback(). Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
* ALSA: timer: fix NULL pointer dereference in read()/ioctl() raceVegard Nossum2017-04-111-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 11749e086b2766cccf6217a527ef5c5604ba069c upstream. I got this with syzkaller: ================================================================== BUG: KASAN: null-ptr-deref on address 0000000000000020 Read of size 32 by task syz-executor/22519 CPU: 1 PID: 22519 Comm: syz-executor Not tainted 4.8.0-rc2+ #169 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2 014 0000000000000001 ffff880111a17a00 ffffffff81f9f141 ffff880111a17a90 ffff880111a17c50 ffff880114584a58 ffff880114584a10 ffff880111a17a80 ffffffff8161fe3f ffff880100000000 ffff880118d74a48 ffff880118d74a68 Call Trace: [<ffffffff81f9f141>] dump_stack+0x83/0xb2 [<ffffffff8161fe3f>] kasan_report_error+0x41f/0x4c0 [<ffffffff8161ff74>] kasan_report+0x34/0x40 [<ffffffff82c84b54>] ? snd_timer_user_read+0x554/0x790 [<ffffffff8161e79e>] check_memory_region+0x13e/0x1a0 [<ffffffff8161e9c1>] kasan_check_read+0x11/0x20 [<ffffffff82c84b54>] snd_timer_user_read+0x554/0x790 [<ffffffff82c84600>] ? snd_timer_user_info_compat.isra.5+0x2b0/0x2b0 [<ffffffff817d0831>] ? proc_fault_inject_write+0x1c1/0x250 [<ffffffff817d0670>] ? next_tgid+0x2a0/0x2a0 [<ffffffff8127c278>] ? do_group_exit+0x108/0x330 [<ffffffff8174653a>] ? fsnotify+0x72a/0xca0 [<ffffffff81674dfe>] __vfs_read+0x10e/0x550 [<ffffffff82c84600>] ? snd_timer_user_info_compat.isra.5+0x2b0/0x2b0 [<ffffffff81674cf0>] ? do_sendfile+0xc50/0xc50 [<ffffffff81745e10>] ? __fsnotify_update_child_dentry_flags+0x60/0x60 [<ffffffff8143fec6>] ? kcov_ioctl+0x56/0x190 [<ffffffff81e5ada2>] ? common_file_perm+0x2e2/0x380 [<ffffffff81746b0e>] ? __fsnotify_parent+0x5e/0x2b0 [<ffffffff81d93536>] ? security_file_permission+0x86/0x1e0 [<ffffffff816728f5>] ? rw_verify_area+0xe5/0x2b0 [<ffffffff81675355>] vfs_read+0x115/0x330 [<ffffffff81676371>] SyS_read+0xd1/0x1a0 [<ffffffff816762a0>] ? vfs_write+0x4b0/0x4b0 [<ffffffff82001c2c>] ? __this_cpu_preempt_check+0x1c/0x20 [<ffffffff8150455a>] ? __context_tracking_exit.part.4+0x3a/0x1e0 [<ffffffff816762a0>] ? vfs_write+0x4b0/0x4b0 [<ffffffff81005524>] do_syscall_64+0x1c4/0x4e0 [<ffffffff810052fc>] ? syscall_return_slowpath+0x16c/0x1d0 [<ffffffff83c3276a>] entry_SYSCALL64_slow_path+0x25/0x25 ================================================================== There are a couple of problems that I can see: - ioctl(SNDRV_TIMER_IOCTL_SELECT), which potentially sets tu->queue/tu->tqueue to NULL on memory allocation failure, so read() would get a NULL pointer dereference like the above splat - the same ioctl() can free tu->queue/to->tqueue which means read() could potentially see (and dereference) the freed pointer We can fix both by taking the ioctl_lock mutex when dereferencing ->queue/->tqueue, since that's always held over all the ioctl() code. Just looking at the code I find it likely that there are more problems here such as tu->qhead pointing outside the buffer if the size is changed concurrently using SNDRV_TIMER_IOCTL_PARAMS. [js] unlock in fail paths Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
* trace: net: use %pK for kernel pointersmukesh agrawal2017-04-111-2/+2
| | | | | | | | | | | | | | | We want to use network trace events in production builds, to help diagnose Wifi problems. However, we don't want to expose raw kernel pointers in such builds. Change the format specifier for the skbaddr field, so that, if kptr_restrict is enabled, the pointers will be reported as 0. Bug: 30090733 Change-Id: Ic4bd583d37af6637343601feca875ee24479ddff Signed-off-by: mukesh agrawal <quiche@google.com>
* fix memory leaks in tracing_buffers_splice_read()Al Viro2017-04-111-8/+9
| | | | | | | commit 1ae2293dd6d2f5c823cf97e60b70d03631cd622f upstream. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
* zlib: clean up some dead codeSergey Senozhatsky2017-04-113-393/+0
| | | | | | | | | | | | | | | | | | | Cleanup unused `if 0'-ed functions, which have been dead since 2006 (commits 87c2ce3b9305 ("lib/zlib*: cleanups") by Adrian Bunk and 4f3865fb57a0 ("zlib_inflate: Upgrade library code to a recent version") by Richard Purdie): - zlib_deflateSetDictionary - zlib_deflateParams - zlib_deflateCopy - zlib_inflateSync - zlib_syncsearch - zlib_inflateSetDictionary - zlib_inflatePrime Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* lib: zlib_inflage: fixed potential buffer overflowPaul Taysom2017-04-111-2/+3
| | | | | | | | | | | | | | | smatch error from arm build: arch/arm/boot/compressed/../../../../lib/zlib_inflate/inftrees.c:240 zlib_inflate_table() error: buffer overflow 'count' 16 <= 16 Because min is later assigned to len in zlib_inflate_table, by switching the tests around, min always stays in bounds. BUG=chromium:237705 TEST=FEATURES=test emerge-$B chromeos-kernel-next Signed-off-by: Paul Taysom <taysom@chromium.org> Reviewed-on: https://gerrit.chromium.org/gerrit/59426 Reviewed-by: Sonny Rao <sonnyrao@chromium.org>
* lib/crc7: Shift crc7() output left 1 bitGeorge Spelvin2017-04-117-49/+53
| | | | | | | | | | | | | | This eliminates a 1-bit left shift in every single caller, and makes the inner loop of the CRC computation more efficient. Renamed crc7 to crc7_be (big-endian) since the interface changed. Also purged #include <linux/crc7.h> from files that don't use it at all. Signed-off-by: George Spelvin <linux@horizon.com> Reviewed-by: Pavel Machek <pavel@ucw.cz> Acked-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: John W. Linville <linville@tuxdriver.com>
* lib: crc32: Add some additional __pure annotationsGeorge Spelvin2017-04-112-4/+4
| | | | | | | | In case they help the compiler. Signed-off-by: George Spelvin <linux@horizon.com> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* lib: crc32: Mark test data __initconstGeorge Spelvin2017-04-111-2/+2
| | | | | | | | So it gets discarded after the selftest. Signed-off-by: George Spelvin <linux@horizon.com> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* lib: crc32: Greatly shrink CRC combining codeGeorge Spelvin2017-04-112-79/+82
| | | | | | | | | | | | | | | | | | There's no need for a full 32x32 matrix, when rows before the last are just shifted copies of the rows after them. There's still room for improvement (especially on X86 processors with CRC32 and PCLMUL instructions), but this is a large step in the right direction [which is in particular useful for its current user, namely SCTP checksumming over multiple skb frags[] entries, i.e. in IPVS balancing when other CRC32 offloads are not available]. The internal primitive is now called crc32_generic_shift and takes one less argument; the XOR with crc2 is done in inline wrappers. Signed-off-by: George Spelvin <linux@horizon.com> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* lib/crc32.c: remove unnecessary __constantFabian Frederick2017-04-111-2/+2
| | | | | | | | | Use cpu_to_le32 instead of __constant_cpu_to_le32. Signed-off-by: Fabian Frederick <fabf@skynet.be> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* lib: crc32: reduce number of cases for crc32{, c}_combineDaniel Borkmann2017-04-111-2/+2
| | | | | | | | | | | We can safely reduce the number of test cases by a tenth. There is no particular need to run as many as we're running now for crc32{,c}_combine, that gives us still ~8000 tests we're doing if people run kernels with crc selftests enabled which is perfectly fine. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* lib: crc32: conditionally resched when running testcasesDaniel Borkmann2017-04-111-0/+3
| | | | | | | | | | | | | Fengguang reports that when crc32 selftests are running on startup, on some e.g. 32bit systems, we can get a CPU stall like "INFO: rcu_sched self-detected stall on CPU { 0} (t=2101 jiffies g=4294967081 c=4294967080 q=41)". As this is not intended, add a cond_resched() at the end of a test case to fix it. Introduced by efba721f63 ("lib: crc32: add test cases for crc32{, c}_combine routines"). Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* lib: crc32: add test cases for crc32{, c}_combine routinesDaniel Borkmann2017-04-111-0/+72
| | | | | | | | | | | | | | | | | | | | We already have 100 test cases for crcs itself, so split the test buffer with a-prio known checksums, and test crc of two blocks against crc of the whole block for the same results. Output/result with CONFIG_CRC32_SELFTEST=y: [ 2.687095] crc32: CRC_LE_BITS = 64, CRC_BE BITS = 64 [ 2.687097] crc32: self tests passed, processed 225944 bytes in 278177 nsec [ 2.687383] crc32c: CRC_LE_BITS = 64 [ 2.687385] crc32c: self tests passed, processed 225944 bytes in 141708 nsec [ 7.336771] crc32_combine: 113072 self tests passed [ 12.050479] crc32c_combine: 113072 self tests passed [ 17.633089] alg: No test for crc32 (crc32-pclmul) Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Cc: linux-kernel@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
* lib: crc32: add functionality to combine two crc32{, c}s in GF(2)Daniel Borkmann2017-04-112-0/+121
| | | | | | | | | | | | | | | | This patch adds a combinator to merge two or more crc32{,c}s into a new one. This is useful for checksum computations of fragmented skbs that use crc32/crc32c as checksums. The arithmetics for combining both in the GF(2) was taken and slightly modified from zlib. Only passing two crcs is insufficient as two crcs and the length of the second piece is needed for merging. The code is made generic, so that only polynomials need to be passed for crc32_le resp. crc32c_le. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Cc: linux-kernel@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
* lib: crc32: clean up spacing in test casesDaniel Borkmann2017-04-111-200/+100
| | | | | | | | | | This is nothing more but a whitepace cleanup, as 80 chars is not a hard but soft limit, and otherwise makes the test cases array really look ugly. So fix it up. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Cc: linux-kernel@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
* lib/crc32: update the comments of crc32_{be,le}_generic()Gu Zheng2017-04-111-6/+11
| | | | | | | [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Optimize ARM memset and memzero functionsHarm Hanemaaijer2017-04-113-184/+250
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The ARM memset and memzero functions are optimized with lower overhead for small requests, generation of more 16-bit Thumb2 instructions when compiled in Thumb2 mode, and configurable destination alignment before the main block-copying loop. The new compile-time constant MEMSET_WRITE_ALIGN_BYTES is introduced in assembler.h, to augment the CALGN macro that previously regulated 32-byte write alignment but was only used on the Feroceon platform. MEMSET_WRITE_ALIGN_BYTES can have values of 0 (no write alignment), 8, or 32. Apart from Feroceon, memset write alignment of 32 bytes appears to benefit the armv6 platform, while the armv7 platform seems benefit from alignment to 8 bytes for memset/memzero. The CALGN macro is renamed to MEMSET_CALGN for memset and memzero; the original CALGN macro is reserved for the memcpy family of functions (memcpy, copy_from_user, copy_to_user) currently implemented in copy_template.S, and the associated compile time constant WRITE_ALIGN_BYTES defines the write alignment for memcpy-related functions that will be utilized in subsequent contributions. Because the current CALGN implementation in copy_template.S only implements 32-byte write alignment and is broken on Thumb2, it is only enabled when WRITE_ALIGN_BYTES is equal to 32 and Thumb2 mode is not enabled. Finally, memset and memzero now include a directive to enable unified ARM assembler syntax. Signed-off-by: Harm Hanemaaijer <fgenfb@yahoo.com>
* Optimize copy_page for modern ARM platformsHarm Hanemaaijer2017-04-113-24/+78
| | | | | | | | | | | | | | | | | | | | | | | | | | The existing implementation of copy_page for ARM appears to be optimized for older platforms. Benchmark testing in a sandbox environment shows suboptimal performance on modern platforms like armv6 and armv7, with speed-ups ranging from 10% (Cortex A8) to 80% (armv6 used in Raspberry Pi) being achievable. This commit optimizes copy_page and introduces the new compile-time constant PREFETCH_DISTANCE, defined in cache.h, which when multiplied by L1_CACHE_BYTES is equal to the offset used for prefetches performed with the PLD instruction. For platforms where L1_CACHE_BYTES is 32 (armv5 and armv6), copy_page processes 32 bytes at a time while doing one prefetch per iteration, while for armv7 (with L1_CACHE_BYTES equal to 64), 64 bytes are processed at at time with one prefetch per iteration. When no preload instruction is available (platforms earlier than armv5), no preload instructions are generated and 32 bytes are processed at at time. To facilitate specifying instructions for architectures with no preload instruction, the NO_PLD macro is added to assembler.h, augmenting the PLD macro. Signed-off-by: Harm Hanemaaijer <fgenfb@yahoo.com> Signed-off-by: RyTek <rytek1128@outlook.com>
* Rename ARM assembler push/pull macrosHarm Hanemaaijer2017-04-117-206/+206
| | | | | | | | | | | | | | | | | | The ARM assembler library functions use a macro called "push" that along with a macro called "pull" is used to shift bytes around in a word in an endian-independent way. However, the modern unified ARM assembler syntax also defines the instruction "push" to push data onto the stack, which has specific encodings in the Thumb2 instruction set. For prevent possible conflicts going forward, and to allow the use of the more transparent "push" instruction along with the modern unified assembler syntax, this patch renames all occurrences of the "push" macro to "pushbits", as well as renaming the macro argument, when also called "push", to "pushshift". For consistency, the macro called "pull" with its argument name "pull" are also renamed to "pullbits" and "pullshift", respectively. Signed-off-by: Harm Hanemaaijer <fgenfb@yahoo.com>
* Speed up console framebuffer imageblit functionHarm Hanemaaijer2017-04-111-5/+147
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Especially on platforms with a slower CPU but a relatively high framebuffer fill bandwidth, like current ARM devices, the existing console monochrome imageblit function used to draw console text is suboptimal for common pixel depths such as 16bpp and 32bpp. The existing code is quite general and can deal with several pixel depths. By creating special case functions for 16bpp and 32bpp, by far the most common pixel formats used on modern systems, a significant speed-up is attained which can be readily felt on ARM-based devices like the Raspberry Pi and the Allwinner platform, but should help any platform using the fb layer. The special case functions allow constant folding, eliminating a number of instructions including divide operations, and allow the use of an unrolled loop, eliminating instructions with a variable shift size, reducing source memory access instructions, and eliminating excessive branching. These unrolled loops also allow much better code optimization by the C compiler. The code that selects which optimized variant is used is also simplified, eliminating integer divide instructions. The speed-up, measured by timing 'cat file.txt' in the console, varies between 40% and 70%, when testing on the Raspberry Pi and Allwinner ARM-based platforms, depending on font size and the pixel depth, with the greater benefit for 32bpp. Signed-off-by: Harm Hanemaaijer <fgenfb@yahoo.com>
* lib/string.c: improve strrchr()Rasmus Villemoes2017-04-111-6/+6
| | | | | | | | | | | Instead of potentially passing over the string twice in case c is not found, just keep track of the last occurrence. According to bloat-o-meter, this also cuts the generated code by a third (54 vs 36 bytes). Oh, and we get rid of those 7-space indented lines. Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* fs/seq_file: fix out-of-bounds readVegard Nossum2017-04-111-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 088bf2ff5d12e2e32ee52a4024fec26e582f44d3 upstream. seq_read() is a nasty piece of work, not to mention buggy. It has (I think) an old bug which allows unprivileged userspace to read beyond the end of m->buf. I was getting these: BUG: KASAN: slab-out-of-bounds in seq_read+0xcd2/0x1480 at addr ffff880116889880 Read of size 2713 by task trinity-c2/1329 CPU: 2 PID: 1329 Comm: trinity-c2 Not tainted 4.8.0-rc1+ #96 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2014 Call Trace: kasan_object_err+0x1c/0x80 kasan_report_error+0x2cb/0x7e0 kasan_report+0x4e/0x80 check_memory_region+0x13e/0x1a0 kasan_check_read+0x11/0x20 seq_read+0xcd2/0x1480 proc_reg_read+0x10b/0x260 do_loop_readv_writev.part.5+0x140/0x2c0 do_readv_writev+0x589/0x860 vfs_readv+0x7b/0xd0 do_readv+0xd8/0x2c0 SyS_readv+0xb/0x10 do_syscall_64+0x1b3/0x4b0 entry_SYSCALL64_slow_path+0x25/0x25 Object at ffff880116889100, in cache kmalloc-4096 size: 4096 Allocated: PID = 1329 save_stack_trace+0x26/0x80 save_stack+0x46/0xd0 kasan_kmalloc+0xad/0xe0 __kmalloc+0x1aa/0x4a0 seq_buf_alloc+0x35/0x40 seq_read+0x7d8/0x1480 proc_reg_read+0x10b/0x260 do_loop_readv_writev.part.5+0x140/0x2c0 do_readv_writev+0x589/0x860 vfs_readv+0x7b/0xd0 do_readv+0xd8/0x2c0 SyS_readv+0xb/0x10 do_syscall_64+0x1b3/0x4b0 return_from_SYSCALL_64+0x0/0x6a Freed: PID = 0 (stack is not available) Memory state around the buggy address: ffff88011688a000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ffff88011688a080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 >ffff88011688a100: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ^ ffff88011688a180: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff88011688a200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ================================================================== Disabling lock debugging due to kernel taint This seems to be the same thing that Dave Jones was seeing here: https://lkml.org/lkml/2016/8/12/334 There are multiple issues here: 1) If we enter the function with a non-empty buffer, there is an attempt to flush it. But it was not clearing m->from after doing so, which means that if we try to do this flush twice in a row without any call to traverse() in between, we are going to be reading from the wrong place -- the splat above, fixed by this patch. 2) If there's a short write to userspace because of page faults, the buffer may already contain multiple lines (i.e. pos has advanced by more than 1), but we don't save the progress that was made so the next call will output what we've already returned previously. Since that is a much less serious issue (and I have a headache after staring at seq_read() for the past 8 hours), I'll leave that for now. Link: http://lkml.kernel.org/r/1471447270-32093-1-git-send-email-vegard.nossum@oracle.com Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com> Reported-by: Dave Jones <davej@codemonkey.org.uk> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Jiri Slaby <jslaby@suse.cz>