aboutsummaryrefslogtreecommitdiff
Commit message (Collapse)AuthorAgeFilesLines
...
* BACKPORT: selinux: restrict kernel module loadingJeff Vander Stoep2017-04-252-1/+34
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Backport notes: Backport uses kernel_module_from_file not kernel_read_file hook. kernel_read_file replaced kernel_module_from_file in the 4.6 kernel. There are no inode_security_() helper functions (also introduced in 4.6) so the inode lookup is done using the file_inode() helper which is standard for kernel version < 4.6. (Cherry picked from commit 61d612ea731e57dc510472fb746b55cdc017f371) Utilize existing kernel_read_file hook on kernel module load. Add module_load permission to the system class. Enforces restrictions on kernel module origin when calling the finit_module syscall. The hook checks that source type has permission module_load for the target type. Example for finit_module: allow foo bar_file:system module_load; Similarly restrictions are enforced on kernel module loading when calling the init_module syscall. The hook checks that source type has permission module_load with itself as the target object because the kernel module is sourced from the calling process. Example for init_module: allow foo foo:system module_load; Bug: 27824855 Change-Id: I64bf3bd1ab2dc735321160642dc6bbfa996f8068 Signed-off-by: Jeff Vander Stoep <jeffv@google.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
* mlog: use round_jiffies in mlog timerAnmin Hsu2017-04-251-1/+1
| | | | | | | | | | | | | | | | [Detail] CPU0 off [Solution] use round_jiffies in mlog timer [Feature] Others MTK-Commit-Id: b39445fa9c9c93fbd6bb660011c242688af44f0b Change-Id: I2b209908f2a7493e1760faafe5bd394969445996 Signed-off-by: mtk10008 <tehsin.lin@mediatek.com> CR-Id: ALPS02298339
* mlog: Fixed stack overflowAnmin Hsu2017-04-251-1/+1
| | | | | | | | | | | | | | | | [Detail] Function sprintf does not check the length of buffer. [Solution] Using snprintf prevent stack overflow. [Feature] Monkey Test MTK-Commit-Id: 699464af6ac730e4edd21773b02aa5e1f6dc9403 Change-Id: I238b71ac9966b1967f4c93ffeb29a7c88d441193 Signed-off-by: mtk10008 <tehsin.lin@mediatek.com> CR-Id: ALPS02316340
* AEE: kernel driver memory leak riskAnmin Hsu2017-04-251-1/+1
| | | | | | | | | | | | | | [Detail] It has memory leak risk while call aee_kernel_dal_api(). [Solution] Modify aee_kernel_dal_api() with kfree() system call. [Feature] Memory Optimization MTK-Commit-Id: 09f75c2a0814049e8285693b5b5c715efe1298f7 Change-Id: I9902927084839175bb72e746c481b0d969d819d6 Signed-off-by: Zhiyong Wang <zhiyong.wang@mediatek.com> CR-Id: ALPS02312652
* BACKPORT: [UPSTREAM] ext2: convert to mbcache2Jan Kara2017-04-254-100/+92
| | | | | | | | | | | | | | (Cherry-pick from commit be0726d33cb8f411945884664924bed3cb8c70ee) The conversion is generally straightforward. We convert filesystem from a global cache to per-fs one. Similarly to ext4 the tricky part is that xattr block corresponding to found mbcache entry can get freed before we get buffer lock for that block. So we have to check whether the entry is still valid after getting the buffer lock. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Bug: 32461228
* defconfig: disable ntfs and let fuse handle it insteadMister Oyster2017-04-251-1/+1
|
* Android: sdcardfs: Don't complain in fixup_lower_ownershipDaniel Rosenberg2017-04-251-1/+1
| | | | | | | | | Not all filesystems support changing the owner of a file. We shouldn't complain if it doesn't happen. Signed-off-by: Daniel Rosenberg <drosen@google.com> Bug: 37488099 Change-Id: I403e44ab7230f176e6df82f6adb4e5c82ce57f33
* ANDROID: sdcardfs: ->iget fixesDaniel Rosenberg2017-04-251-8/+6
| | | | | | | | | | | | | | Adapted from wrapfs commit 8c49eaa0sb9c ("Wrapfs: ->iget fixes") Change where we igrab/iput to ensure we always hold a valid lower_inode. Return ENOMEM (not EACCES) if iget5_locked returns NULL. Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu> Signed-off-by: Daniel Rosenberg <drosen@google.com> Bug: 35766959 Change-Id: Id8d4e0c0cbc685a0a77685ce73c923e9a3ddc094
* Android: sdcardfs: Change cache GID valueDaniel Rosenberg2017-04-252-3/+5
| | | | | | Change-Id: Ieb955dd26493da26a458bc20fbbe75bca32b094f Signed-off-by: Daniel Rosenberg <drosen@google.com> Bug: 37193650
* UPSTREAM: char: lack of bool string made CONFIG_DEVPORT always onMax Bires2017-04-251-1/+4
| | | | | | | | | | | | | | | | | (cherry pick from commit f2cfa58b136e4b06a9b9db7af5ef62fbb5992f62) Without a bool string present, using "# CONFIG_DEVPORT is not set" in defconfig files would not actually unset devport. This esnured that /dev/port was always on, but there are reasons a user may wish to disable it (smaller kernel, attack surface reduction) if it's not being used. Adding a message here in order to make this user visible. Signed-off-by: Max Bires <jbires@google.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Bug: 37210310 Bug: 36604779 Change-Id: Ib1e947526f6c6f7cdf6389923287631056f32c36
* UPSTREAM: char: Drop bogus dependency of DEVPORT on !M68KGeert Uytterhoeven2017-04-251-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | (cherry pick from commit 309124e2648d668a0c23539c5078815660a4a850) According to full-history-linux commit d3794f4fa7c3edc3 ("[PATCH] M68k update (part 25)"), port operations are allowed on m68k if CONFIG_ISA is defined. However, commit 153dcc54df826d2f ("[PATCH] mem driver: fix conditional on isa i/o support") accidentally changed an "||" into an "&&", disabling it completely on m68k. This logic was retained when introducing the DEVPORT symbol in commit 4f911d64e04a44c4 ("Make /dev/port conditional on config symbol"). Drop the bogus dependency on !M68K to fix this. Fixes: 153dcc54df826d2f ("[PATCH] mem driver: fix conditional on isa i/o support") Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Tested-by: Al Stone <ahs3@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Bug: 37210310 Bug: 36604779 Change-Id: I9139bd8a5a6e9e39c2e428bde23a7d9be07e2f91
* BACKPORT: [UPSTREAM] mbcache2: reimplement mbcacheJan Kara2017-04-253-1/+410
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | (Cherry-pick from commit f9a61eb4e2471c56a63cd804c7474128138c38ac) Original mbcache was designed to have more features than what ext? filesystems ended up using. It supported entry being in more hashes, it had a home-grown rwlocking of each entry, and one cache could cache entries from multiple filesystems. This genericity also resulted in more complex locking, larger cache entries, and generally more code complexity. This is reimplementation of the mbcache functionality to exactly fit the purpose ext? filesystems use it for. Cache entries are now considerably smaller (7 instead of 13 longs), the code is considerably smaller as well (414 vs 913 lines of code), and IMO also simpler. The new code is also much more lightweight. I have measured the speed using artificial xattr-bench benchmark, which spawns P processes, each process sets xattr for F different files, and the value of xattr is randomly chosen from a pool of V values. Averages of runtimes for 5 runs for various combinations of parameters are below. The first value in each cell is old mbache, the second value is the new mbcache. V=10 F\P 1 2 4 8 16 32 64 10 0.158,0.157 0.208,0.196 0.500,0.277 0.798,0.400 3.258,0.584 13.807,1.047 61.339,2.803 100 0.172,0.167 0.279,0.222 0.520,0.275 0.825,0.341 2.981,0.505 12.022,1.202 44.641,2.943 1000 0.185,0.174 0.297,0.239 0.445,0.283 0.767,0.340 2.329,0.480 6.342,1.198 16.440,3.888 V=100 F\P 1 2 4 8 16 32 64 10 0.162,0.153 0.200,0.186 0.362,0.257 0.671,0.496 1.433,0.943 3.801,1.345 7.938,2.501 100 0.153,0.160 0.221,0.199 0.404,0.264 0.945,0.379 1.556,0.485 3.761,1.156 7.901,2.484 1000 0.215,0.191 0.303,0.246 0.471,0.288 0.960,0.347 1.647,0.479 3.916,1.176 8.058,3.160 V=1000 F\P 1 2 4 8 16 32 64 10 0.151,0.129 0.210,0.163 0.326,0.245 0.685,0.521 1.284,0.859 3.087,2.251 6.451,4.801 100 0.154,0.153 0.211,0.191 0.276,0.282 0.687,0.506 1.202,0.877 3.259,1.954 8.738,2.887 1000 0.145,0.179 0.202,0.222 0.449,0.319 0.899,0.333 1.577,0.524 4.221,1.240 9.782,3.579 V=10000 F\P 1 2 4 8 16 32 64 10 0.161,0.154 0.198,0.190 0.296,0.256 0.662,0.480 1.192,0.818 2.989,2.200 6.362,4.746 100 0.176,0.174 0.236,0.203 0.326,0.255 0.696,0.511 1.183,0.855 4.205,3.444 19.510,17.760 1000 0.199,0.183 0.240,0.227 1.159,1.014 2.286,2.154 6.023,6.039 ---,10.933 ---,36.620 V=100000 F\P 1 2 4 8 16 32 64 10 0.171,0.162 0.204,0.198 0.285,0.230 0.692,0.500 1.225,0.881 2.990,2.243 6.379,4.771 100 0.151,0.171 0.220,0.210 0.295,0.255 0.720,0.518 1.226,0.844 3.423,2.831 19.234,17.544 1000 0.192,0.189 0.249,0.225 1.162,1.043 2.257,2.093 5.853,4.997 ---,10.399 ---,32.198 We see that the new code is faster in pretty much all the cases and starting from 4 processes there are significant gains with the new code resulting in upto 20-times shorter runtimes. Also for large numbers of cached entries all values for the old code could not be measured as the kernel started hitting softlockups and died before the test completed. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
* BACKPORT: [UPSTREAM] mm: new shrinker APIDave Chinner2017-04-252-29/+69
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | (Cherry-pick from commit 24f7c6b981fb70084757382da464ea85d72af300) The current shrinker callout API uses an a single shrinker call for multiple functions. To determine the function, a special magical value is passed in a parameter to change the behaviour. This complicates the implementation and return value specification for the different behaviours. Separate the two different behaviours into separate operations, one to return a count of freeable objects in the cache, and another to scan a certain number of objects in the cache for freeing. In defining these new operations, ensure the return values and resultant behaviours are clearly defined and documented. Modify shrink_slab() to use the new API and implement the callouts for all the existing shrinkers. Signed-off-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Glauber Costa <glommer@parallels.com> Acked-by: Mel Gorman <mgorman@suse.de> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Cc: Arve Hjønnevåg <arve@android.com> Cc: Carlos Maiolino <cmaiolino@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Chuck Lever <chuck.lever@oracle.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: David Rientjes <rientjes@google.com> Cc: Gleb Natapov <gleb@redhat.com> Cc: Greg Thelen <gthelen@google.com> Cc: J. Bruce Fields <bfields@redhat.com> Cc: Jan Kara <jack@suse.cz> Cc: Jerome Glisse <jglisse@redhat.com> Cc: John Stultz <john.stultz@linaro.org> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Kent Overstreet <koverstreet@google.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Thomas Hellstrom <thellstrom@vmware.com> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* defconfig: enable ikconfigMister Oyster2017-04-251-1/+3
|
* defconfig: regen and update recovery_defconfigMister Oyster2017-04-251-114/+154
|
* defconfig: split user & debug defconfigMister Oyster2017-04-252-143/+199
|
* net: Fix maybe-uninitialized variablesChristopher N. Hesse2017-04-252-1/+11
| | | | Change-Id: I83202d1362a1d01fbd5be6c23f2f47fe60efcb61
* net/packet: fix overflow in check for tp_frame_nrAndrey Konovalov2017-04-251-0/+2
| | | | | | | | | | | | | | When calculating rb->frames_per_block * req->tp_block_nr the result can overflow. Add a check that tp_block_size * tp_block_nr <= UINT_MAX. Since frames_per_block <= tp_block_size, the expression would never overflow. Change-Id: I3598423e621275aa1d890b80bcf9018929087d90 Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Acked-by: Eric Dumazet <edumazet@google.com>
* net/packet: fix overflow in check for tp_reserveAndrey Konovalov2017-04-251-0/+2
| | | | | | | | | | When calculating po->tp_hdrlen + po->tp_reserve the result can overflow. Fix by checking that tp_reserve <= INT_MAX on assign. Change-Id: I6a4ea0cbe87cfd3db0979896c9bf9b3c626ec1d6 Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Acked-by: Eric Dumazet <edumazet@google.com>
* cred/userns: define current_user_ns() as a functionArnd Bergmann2017-04-252-3/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | The current_user_ns() macro currently returns &init_user_ns when user namespaces are disabled, and that causes several warnings when building with gcc-6.0 in code that compares the result of the macro to &init_user_ns itself: fs/xfs/xfs_ioctl.c: In function 'xfs_ioctl_setattr_check_projid': fs/xfs/xfs_ioctl.c:1249:22: error: self-comparison always evaluates to true [-Werror=tautological-compare] if (current_user_ns() == &init_user_ns) This is a legitimate warning in principle, but here it isn't really helpful, so I'm reprasing the definition in a way that shuts up the warning. Apparently gcc only warns when comparing identical literals, but it can figure out that the result of an inline function can be identical to a constant expression in order to optimize a condition yet not warn about the fact that the condition is known at compile time. This is exactly what we want here, and it looks reasonable because we generally prefer inline functions over macros anyway. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Cc: David Howells <dhowells@redhat.com> Cc: Yaowei Bai <baiyaowei@cmss.chinamobile.com> Cc: James Morris <james.l.morris@oracle.com> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* drivers: mtk: conn_md: backport from 3.18Mister Oyster2017-04-2511-160/+65
|
* sched: Disabled Gentle Fair Sleepers for better UI performanceCristoforo Cataldo2017-04-251-1/+1
|
* xt_qtaguid: fix a race condition in if_tag_stat_updateliping.zhang2017-04-171-3/+4
| | | | | | | | | Miss a lock protection in if_tag_stat_update while doing get_iface_entry. So if one CPU is doing iface_stat_create while another CPU is doing if_tag_stat_update, race will happened. Change-Id: Ib8d98e542f4e385685499f5b7bb7354f08654a75 Signed-off-by: Liping Zhang <liping.zhang@spreadtrum.com>
* rbtree: add postorder iteration functionsCody P Schafer2017-04-172-0/+44
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Postorder iteration yields all of a node's children prior to yielding the node itself, and this particular implementation also avoids examining the leaf links in a node after that node has been yielded. In what I expect will be its most common usage, postorder iteration allows the deletion of every node in an rbtree without modifying the rbtree nodes (no _requirement_ that they be nulled) while avoiding referencing child nodes after they have been "deleted" (most commonly, freed). I have only updated zswap to use this functionality at this point, but numerous bits of code (most notably in the filesystem drivers) use a hand rolled postorder iteration that NULLs child links as it traverses the tree. Each of those instances could be replaced with this common implementation. 1 & 2 add rbtree postorder iteration functions. 3 adds testing of the iteration to the rbtree runtime tests 4 allows building the rbtree runtime tests as builtins 5 updates zswap. This patch: Add postorder iteration functions for rbtree. These are useful for safely freeing an entire rbtree without modifying the tree at all. Change-Id: I5d0f2db0b5bcb57da9c7fa1c5f34b8686db8dcc9 Signed-off-by: Cody P Schafer <cody@linux.vnet.ibm.com> Reviewed-by: Seth Jennings <sjenning@linux.vnet.ibm.com> Cc: David Woodhouse <David.Woodhouse@intel.com> Cc: Rik van Riel <riel@redhat.com> Cc: Michel Lespinasse <walken@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Theodore Ts'o <tytso@google.com>
* rbtree: add rbtree_postorder_for_each_entry_safe() helperCody P Schafer2017-04-171-0/+18
| | | | | | | | | | | | | | | | | Because deletion (of the entire tree) is a relatively common use of the rbtree_postorder iteration, and because doing it safely means fiddling with temporary storage, provide a helper to simplify postorder rbtree iteration. Change-Id: Ifb89570a13fe7f3f480aa48f4281c21d99e28094 Signed-off-by: Cody P Schafer <cody@linux.vnet.ibm.com> Reviewed-by: Seth Jennings <sjenning@linux.vnet.ibm.com> Cc: David Woodhouse <David.Woodhouse@intel.com> Cc: Rik van Riel <riel@redhat.com> Cc: Michel Lespinasse <walken@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Theodore Ts'o <tytso@google.com>
* staging: android: ashmem: convert range macros to inlinesGuillaume Tucker2017-04-171-4/+8
| | | | | | | | | | Convert range_size and range_on_lru macros to inline functions to fix checkpatch check: CHECK: Macro argument reuse 'range' - possible side-effects? Signed-off-by: Guillaume Tucker <guillaume.tucker@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* drivers:lmk: Fix null pointer issueHong-Mei Li2017-04-171-1/+1
| | | | | | | | | | | | | | | | On some race, the tsk that lmk is using may be deleted from the RB tree by other thread, and rb_next would return a NULL if we use this tsk to get next. For this case, we need to skip this round of shrink and wait for the next turn. Otherwise, tsk would trigger NULL pointer panic. Change-Id: If28d9a2d3160177f682c08f62421c20eb0cb5e81 Signed-off-by: Hong-Mei Li <a21834@motorola.com> Reviewed-on: http://gerrit.mot.com/729547 SME-Granted: SME Approvals Granted SLTApproved: Slta Waiver <sltawvr@motorola.com> Tested-by: Jira Key <jirakey@motorola.com> Reviewed-by: Yi-Wei Zhao <gbjc64@motorola.com> Submit-Approved: Jira Key <jirakey@motorola.com>
* drivers:lmk: Fix double delete issueHong-Mei Li2017-04-172-1/+7
| | | | | | | | | | | | | | | | | someone may change a process's oom_score_adj by proc fs, even though the process has exited. In that case, the task was deleted from the rb tree already, and the redundant deleting would trigger rb_erase panic finally. In this patch, we make sure to clear the node after deteting and check its empty status before rb_erase. Change-Id: I26098ca3350f111e94567f9e65ec3dce413197aa Signed-off-by: Hong-Mei Li <a21834@motorola.com> Reviewed-on: http://gerrit.mot.com/727760 SME-Granted: SME Approvals Granted SLTApproved: Slta Waiver <sltawvr@motorola.com> Tested-by: Jira Key <jirakey@motorola.com> Reviewed-by: Sheng-Zhe Zhao <a18689@motorola.com> Submit-Approved: Jira Key <jirakey@motorola.com>
* Fix NULL pointer dereference in tcp_nuke_addr.Lorenzo Colitti2017-04-171-2/+5
| | | | | | | | | | | | tcp_nuke addr only grabs the bottom half socket lock, but not the userspace socket lock. This allows a userspace program to call close() while the socket is running, which causes a NULL pointer dereference in inet_put_port. Bug: 23663111 Bug: 24072792 Change-Id: Iecb63af68c2db4764c74785153d1c9054f76b94f Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
* FROMLIST: 9p: fix a potential acl leakCong Wang2017-04-171-0/+2
| | | | | | | | | | | | | | | | | | | (https://lkml.org/lkml/2016/12/13/579) posix_acl_update_mode() could possibly clear 'acl', if so we leak the memory pointed by 'acl'. Save this pointer before calling posix_acl_update_mode() and release the memory if 'acl' really gets cleared. Reported-by: Mark Salyzyn <salyzyn@android.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Greg Kurz <groug@kaod.org> Cc: Eric Van Hensbergen <ericvh@gmail.com> Cc: Ron Minnich <rminnich@sandia.gov> Cc: Latchesar Ionkov <lucho@ionkov.net> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Bug: 32458736 Change-Id: Ia78da401e6fd1bfd569653bd2cd0ebd3f9c737a0
* vfs: add setattr2 fix mergeMister Oyster2017-04-173-3/+3
|
* block_dev: don't test bdev->bd_contains when it is not stableNeilBrown2017-04-171-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [ Upstream commit bcc7f5b4bee8e327689a4d994022765855c807ff ] bdev->bd_contains is not stable before calling __blkdev_get(). When __blkdev_get() is called on a parition with ->bd_openers == 0 it sets bdev->bd_contains = bdev; which is not correct for a partition. After a call to __blkdev_get() succeeds, ->bd_openers will be > 0 and then ->bd_contains is stable. When FMODE_EXCL is used, blkdev_get() calls bd_start_claiming() -> bd_prepare_to_claim() -> bd_may_claim() This call happens before __blkdev_get() is called, so ->bd_contains is not stable. So bd_may_claim() cannot safely use ->bd_contains. It currently tries to use it, and this can lead to a BUG_ON(). This happens when a whole device is already open with a bd_holder (in use by dm in my particular example) and two threads race to open a partition of that device for the first time, one opening with O_EXCL and one without. The thread that doesn't use O_EXCL gets through blkdev_get() to __blkdev_get(), gains the ->bd_mutex, and sets bdev->bd_contains = bdev; Immediately thereafter the other thread, using FMODE_EXCL, calls bd_start_claiming() from blkdev_get(). This should fail because the whole device has a holder, but because bdev->bd_contains == bdev bd_may_claim() incorrectly reports success. This thread continues and blocks on bd_mutex. The first thread then sets bdev->bd_contains correctly and drops the mutex. The thread using FMODE_EXCL then continues and when it calls bd_may_claim() again in: BUG_ON(!bd_may_claim(bdev, whole, holder)); The BUG_ON fires. Fix this by removing the dependency on ->bd_contains in bd_may_claim(). As bd_may_claim() has direct access to the whole device, it can simply test if the target bdev is the whole device. Fixes: 6b4517a7913a ("block: implement bd_claiming and claiming block") Cc: stable@vger.kernel.org (v2.6.35+) Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Jens Axboe <axboe@fb.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
* ext4: return -ENOMEM instead of successDan Carpenter2017-04-171-1/+3
| | | | | | | | | | | | [ Upstream commit 578620f451f836389424833f1454eeeb2ffc9e9f ] We should set the error code if kzalloc() fails. Fixes: 67cf5b09a46f ("ext4: add the basic function for inline data support") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
* f2fs: s/fi/inode/gAlbert I2017-04-161-1/+1
| | | | | | | * Causing build error when f2fs is enabled Change-Id: Icde63105fc7291f148e76427adf1104108cd03fd Signed-off-by: Albert I <krascgq@outlook.co.id>
* BACKPORT: posix_acl: Clear SGID bit when setting file permissionsJan Kara2017-04-1615-106/+89
| | | | | | | | | | | | | | | | | | | | | | | (cherry pick from commit 073931017b49d9458aa351605b43a7e34598caef) When file permissions are modified via chmod(2) and the user is not in the owning group or capable of CAP_FSETID, the setgid bit is cleared in inode_change_ok(). Setting a POSIX ACL via setxattr(2) sets the file permissions as well as the new ACL, but doesn't clear the setgid bit in a similar way; this allows to bypass the check in chmod(2). Fix that. NB: conflicts resolution included extending the change to all visible users of the near deprecated function posix_acl_equiv_mode replaced with posix_acl_update_mode. We did not resolve the ACL leak in this CL, require additional upstream fixes. References: CVE-2016-7097 Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Bug: 32458736 Change-Id: I19591ad452cc825ac282b3cfd2daaa72aa9a1ac1
* ext4: validate s_first_meta_bg at mount timeEryu Guan2017-04-161-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Ralf Spenneberg reported that he hit a kernel crash when mounting a modified ext4 image. And it turns out that kernel crashed when calculating fs overhead (ext4_calculate_overhead()), this is because the image has very large s_first_meta_bg (debug code shows it's 842150400), and ext4 overruns the memory in count_overhead() when setting bitmap buffer, which is PAGE_SIZE. ext4_calculate_overhead(): buf = get_zeroed_page(GFP_NOFS); <=== PAGE_SIZE buffer blks = count_overhead(sb, i, buf); count_overhead(): for (j = ext4_bg_num_gdb(sb, grp); j > 0; j--) { <=== j = 842150400 ext4_set_bit(EXT4_B2C(sbi, s++), buf); <=== buffer overrun count++; } This can be reproduced easily for me by this script: #!/bin/bash rm -f fs.img mkdir -p /mnt/ext4 fallocate -l 16M fs.img mke2fs -t ext4 -O bigalloc,meta_bg,^resize_inode -F fs.img debugfs -w -R "ssv first_meta_bg 842150400" fs.img mount -o loop fs.img /mnt/ext4 Fix it by validating s_first_meta_bg first at mount time, and refusing to mount if its value exceeds the largest possible meta_bg number. Change-Id: I33846fe46efe52ff8ebd4e6275c9c7d4ae1aea33 Reported-by: Ralf Spenneberg <ralf@os-t.de> Signed-off-by: Eryu Guan <guaneryu@gmail.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Andreas Dilger <adilger@dilger.ca>
* crypto: sha512 - implement base layer for SHA-512Ard Biesheuvel2017-04-162-1/+132
| | | | | | | | | | | | | | | | | | | To reduce the number of copies of boilerplate code throughout the tree, this patch implements generic glue for the SHA-512 algorithm. This allows a specific arch or hardware implementation to only implement the special handling that it needs. The users need to supply an implementation of void (sha512_block_fn)(struct sha512_state *sst, u8 const *src, int blocks) and pass it to the SHA-512 base functions. For easy casting between the prototype above and existing block functions that take a 'u64 state[]' as their first argument, the 'state' member of struct sha512_state is moved to the base of the struct. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: sha256 - implement base layer for SHA-256Ard Biesheuvel2017-04-162-1/+129
| | | | | | | | | | | | | | | | | | | To reduce the number of copies of boilerplate code throughout the tree, this patch implements generic glue for the SHA-256 algorithm. This allows a specific arch or hardware implementation to only implement the special handling that it needs. The users need to supply an implementation of void (sha256_block_fn)(struct sha256_state *sst, u8 const *src, int blocks) and pass it to the SHA-256 base functions. For easy casting between the prototype above and existing block functions that take a 'u32 state[]' as their first argument, the 'state' member of struct sha256_state is moved to the base of the struct. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: arm64/sha2 - add generated .S files to .gitignoreArd Biesheuvel2017-04-161-0/+2
| | | | | | | | | Add the files that are generated by the recently merged OpenSSL SHA-256/512 implementation to .gitignore so Git disregards them when showing untracked files. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: arm64/sha2 - integrate OpenSSL implementations of SHA256/SHA512Ard Biesheuvel2017-04-167-0/+4228
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | This integrates both the accelerated scalar and the NEON implementations of SHA-224/256 as well as SHA-384/512 from the OpenSSL project. Relative performance compared to the respective generic C versions: | SHA256-scalar | SHA256-NEON* | SHA512 | ------------+-----------------+--------------+----------+ Cortex-A53 | 1.63x | 1.63x | 2.34x | Cortex-A57 | 1.43x | 1.59x | 1.95x | Cortex-A73 | 1.26x | 1.56x | ? | The core crypto code was authored by Andy Polyakov of the OpenSSL project, in collaboration with whom the upstream code was adapted so that this module can be built from the same version of sha512-armv8.pl. The version in this patch was taken from OpenSSL commit 32bbb62ea634 ("sha/asm/sha512-armv8.pl: fix big-endian support in __KERNEL__ case.") * The core SHA algorithm is fundamentally sequential, but there is a secondary transformation involved, called the schedule update, which can be performed independently. The NEON version of SHA-224/SHA-256 only implements this part of the algorithm using NEON instructions, the sequential part is always done using scalar instructions. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* ext4: don't save the error information if the block device is read-onlyTheodore Ts'o2017-04-161-0/+2
| | | | | Google-Bug-Id: 20939131 Signed-off-by: Theodore Ts'o <tytso@mit.edu>
* defconfig: disable some scsi stuffMister Oyster2017-04-161-3/+7
|
* libceph: introduce ceph_crypt() for in-place en/decryptionIlya Dryomov2017-04-162-0/+89
| | | | | | | | | | | | | | | | | | | | | | | | Starting with 4.9, kernel stacks may be vmalloced and therefore not guaranteed to be physically contiguous; the new CONFIG_VMAP_STACK option is enabled by default on x86. This makes it invalid to use on-stack buffers with the crypto scatterlist API, as sg_set_buf() expects a logical address and won't work with vmalloced addresses. There isn't a different (e.g. kvec-based) crypto API we could switch net/ceph/crypto.c to and the current scatterlist.h API isn't getting updated to accommodate this use case. Allocating a new header and padding for each operation is a non-starter, so do the en/decryption in-place on a single pre-assembled (header + data + padding) heap buffer. This is explicitly supported by the crypto API: "... the caller may provide the same scatter/gather list for the plaintext and cipher text. After the completion of the cipher operation, the plaintext data is replaced with the ciphertext data in case of an encryption and vice versa for a decryption." Change-Id: I554cae76340899bb6f00e2e06230f3a27da186a1 Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Sage Weil <sage@redhat.com>
* fscrypt: remove broken support for detecting keyring key revocationEric Biggers2017-04-164-57/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Filesystem encryption ostensibly supported revoking a keyring key that had been used to "unlock" encrypted files, causing those files to become "locked" again. This was, however, buggy for several reasons, the most severe of which was that when key revocation happened to be detected for an inode, its fscrypt_info was immediately freed, even while other threads could be using it for encryption or decryption concurrently. This could be exploited to crash the kernel or worse. This patch fixes the use-after-free by removing the code which detects the keyring key having been revoked, invalidated, or expired. Instead, an encrypted inode that is "unlocked" now simply remains unlocked until it is evicted from memory. Note that this is no worse than the case for block device-level encryption, e.g. dm-crypt, and it still remains possible for a privileged user to evict unused pages, inodes, and dentries by running 'sync; echo 3 > /proc/sys/vm/drop_caches', or by simply unmounting the filesystem. In fact, one of those actions was already needed anyway for key revocation to work even somewhat sanely. This change is not expected to break any applications. In the future I'd like to implement a real API for fscrypt key revocation that interacts sanely with ongoing filesystem operations --- waiting for existing operations to complete and blocking new operations, and invalidating and sanitizing key material and plaintext from the VFS caches. But this is a hard problem, and for now this bug must be fixed. This bug affected almost all versions of ext4, f2fs, and ubifs encryption, and it was potentially reachable in any kernel configured with encryption support (CONFIG_EXT4_ENCRYPTION=y, CONFIG_EXT4_FS_ENCRYPTION=y, CONFIG_F2FS_FS_ENCRYPTION=y, or CONFIG_UBIFS_FS_ENCRYPTION=y). Note that older kernels did not use the shared fs/crypto/ code, but due to the potential security implications of this bug, it may still be worthwhile to backport this fix to them. Fixes: b7236e21d55f ("ext4 crypto: reorganize how we store keys in the inode") Change-Id: I0e4e34307102ed48d4c228c97b7d551d3cc564f9 Cc: stable@vger.kernel.org # v4.2+ Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Acked-by: Michael Halcrow <mhalcrow@google.com>
* UPSTREAM: net/packet: fix overflow in check for priv area sizeAndrey Konovalov2017-04-161-2/+2
| | | | | | | | | | | | | | | | | | | Subtracting tp_sizeof_priv from tp_block_size and casting to int to check whether one is less then the other doesn't always work (both of them are unsigned ints). Compare them as is instead. Also cast tp_sizeof_priv to u64 before using BLK_PLUS_PRIV, as it can overflow inside BLK_PLUS_PRIV otherwise. Bug: 36725304 Upstream commit: 2b6867c2ce76c596676bec7d2d525af525fdc6e2 Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Change-Id: I46bfbaf5f4a5d80f10ddce731a3030f191de4b28
* BACKPORT: UPSTREAM: selinux: fix off-by-one in setprocattrStephen Smalley2017-04-161-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | SELinux tries to support setting/clearing of /proc/pid/attr attributes from the shell by ignoring terminating newlines and treating an attribute value that begins with a NUL or newline as an attempt to clear the attribute. However, the test for clearing attributes has always been wrong; it has an off-by-one error, and this could further lead to reading past the end of the allocated buffer since commit bb646cdb12e75d82258c2f2e7746d5952d3e321a ("proc_pid_attr_write(): switch to memdup_user()"). Fix the off-by-one error. Even with this fix, setting and clearing /proc/pid/attr attributes from the shell is not straightforward since the interface does not support multiple write() calls (so shells that write the value and newline separately will set and then immediately clear the attribute, requiring use of echo -n to set the attribute), whereas trying to use echo -n "" to clear the attribute causes the shell to skip the write() call altogether since POSIX says that a zero-length write causes no side effects. Thus, one must use echo -n to set and echo without -n to clear, as in the following example: $ echo -n unconfined_u:object_r:user_home_t:s0 > /proc/$$/attr/fscreate $ cat /proc/$$/attr/fscreate unconfined_u:object_r:user_home_t:s0 $ echo "" > /proc/$$/attr/fscreate $ cat /proc/$$/attr/fscreate Note the use of /proc/$$ rather than /proc/self, as otherwise the cat command will read its own attribute value, not that of the shell. There are no users of this facility to my knowledge; possibly we should just get rid of it. UPDATE: Upon further investigation it appears that a local process with the process:setfscreate permission can cause a kernel panic as a result of this bug. This patch fixes CVE-2017-2618. Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> [PM: added the update about CVE-2017-2618 to the commit description] Cc: stable@vger.kernel.org # 3.5: d6ea83ec6864e Signed-off-by: Paul Moore <paul@paul-moore.com> Signed-off-by: James Morris <james.l.morris@oracle.com> Bug: 35136920 Upstream commit: 0c461cb727d146c9ef2d3e86214f498b78b7d125 Change-Id: I41ba111bd79f6f60306316eb6de0425b1b7d8d19
* sdcardfs: limit stacking depthAndrew Chant2017-04-161-0/+7
| | | | | | | | Limit filesystem stacking to prevent stack overflow. Bug: 32761463 Change-Id: I8b1462b9c0d6c7f00cf110724ffb17e7f307c51e Signed-off-by: Andrew Chant <achant@google.com>
* fs: limit filesystem stacking depthMiklos Szeredi2017-04-162-0/+18
| | | | | | | | | | | | | | | Add a simple read-only counter to super_block that indicates how deep this is in the stack of filesystems. Previously ecryptfs was the only stackable filesystem and it explicitly disallowed multiple layers of itself. Overlayfs, however, can be stacked recursively and also may be stacked on top of ecryptfs or vice versa. To limit the kernel stack usage we must limit the depth of the filesystem stack. Initially the limit is set to 2. Change-Id: I91549cf876ed11a4265487f6b2d980b459399f9d Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
* ip6_gre: fix ip6gre_err() invalid readsEric Dumazet2017-04-161-19/+21
| | | | | | | | | | | | | | | | | | | | | | Andrey Konovalov reported out of bound accesses in ip6gre_err() If GRE flags contains GRE_KEY, the following expression *(((__be32 *)p) + (grehlen / 4) - 1) accesses data ~40 bytes after the expected point, since grehlen includes the size of IPv6 headers. Let's use a "struct gre_base_hdr *greh" pointer to make this code more readable. p[1] becomes greh->protocol. grhlen is the GRE header length. Fixes: c12b395a4664 ("gre: Support GRE over IPv6") Change-Id: I53521c4c78af847e1ba5381199213fa0c5a01e5c Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* BACKPORT: udf: Check path length when reading symlinkJan Kara2017-04-165-20/+50
| | | | | | | | | | | | | Symlink reading code does not check whether the resulting path fits into the page provided by the generic code. This isn't as easy as just checking the symlink size because of various encoding conversions we perform on path. So we have to check whether there is still enough space in the buffer on the fly. Upstream commit: 0e5cc9a40ada6046e6bc3bdfcd0c0d7e4b706b14 CC: stable@vger.kernel.org Reported-by: Carl Henrik Lunde <chlunde@ping.uio.no> Signed-off-by: Jan Kara <jack@suse.cz>