xavi/android_kernel_m2note - Unnamed repository; edit this file 'description' to name the repository.

	Commit message (Collapse)	Author	Age	Files	Lines
*	ext4: only look at the bg_flags field if it is valid	Theodore Ts'o	2019-05-02	1	-1/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	commit 8844618d8aa7a9973e7b527d038a2a589665002c upstream. The bg_flags field in the block group descripts is only valid if the uninit_bg or metadata_csum feature is enabled. We were not consistently looking at this field; fix this. Also block group #0 must never have uninitialized allocation bitmaps, or need to be zeroed, since that's where the root inode, and other special inodes are set up. Check for these conditions and mark the file system as corrupted if they are detected. This addresses CVE-2018-10876. https://bugzilla.kernel.org/show_bug.cgi?id=199403 Bug: 116406122 Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org [bwh: Backported to 3.16: - ext4_read_block_bitmap_nowait() and ext4_read_inode_bitmap() return a pointer (NULL on error) instead of an error code - Open-code sb_rdonly() - Adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> [ghackmann@google.com: forward-port to 3.18: adjust context] Signed-off-by: Greg Hackmann <ghackmann@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Change-Id: I11ed41d9fa916662f7eb010854ce4aaaf23ad99a
*	ext4: add more inode number paranoia checks	Theodore Ts'o	2018-12-02	1	-0/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	commit c37e9e013469521d9adb932d17a1795c139b36db upstream. If there is a directory entry pointing to a system inode (such as a journal inode), complain and declare the file system to be corrupted. Also, if the superblock's first inode number field is too small, refuse to mount the file system. This addresses CVE-2018-10882. https://bugzilla.kernel.org/show_bug.cgi?id=200069 Signed-off-by: Theodore Ts'o <tytso@mit.edu> [bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Greg Hackmann <ghackmann@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
*	ext4: fix false negatives and false positives in ext4_check_descriptors()	Theodore Ts'o	2018-12-02	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	commit 44de022c4382541cebdd6de4465d1f4f465ff1dd upstream. Ext4_check_descriptors() was getting called before s_gdb_count was initialized. So for file systems w/o the meta_bg feature, allocation bitmaps could overlap the block group descriptors and ext4 wouldn't notice. For file systems with the meta_bg feature enabled, there was a fencepost error which would cause the ext4_check_descriptors() to incorrectly believe that the block allocation bitmap overlaps with the block group descriptor blocks, and it would reject the mount. Fix both of these problems. Signed-off-by: Theodore Ts'o <tytso@mit.edu> [bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Greg Hackmann <ghackmann@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
*	ext4: fix check to prevent initializing reserved inodes	Theodore Ts'o	2018-12-02	1	-7/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	commit 5012284700775a4e6e3fbe7eac4c543c4874b559 upstream. Commit 8844618d8aa7: "ext4: only look at the bg_flags field if it is valid" will complain if block group zero does not have the EXT4_BG_INODE_ZEROED flag set. Unfortunately, this is not correct, since a freshly created file system has this flag cleared. It gets almost immediately after the file system is mounted read-write --- but the following somewhat unlikely sequence will end up triggering a false positive report of a corrupted file system: mkfs.ext4 /dev/vdc mount -o ro /dev/vdc /vdc mount -o remount,rw /dev/vdc Instead, when initializing the inode table for block group zero, test to make sure that itable_unused count is not too large, since that is the case that will result in some or all of the reserved inodes getting cleared. This fixes the failures reported by Eric Whiteney when running generic/230 and generic/231 in the the nojournal test case. Fixes: 8844618d8aa7 ("ext4: only look at the bg_flags field if it is valid") Reported-by: Eric Whitney <enwlinux@gmail.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> [bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Greg Hackmann <ghackmann@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
*	ext4: only look at the bg_flags field if it is valid	Theodore Ts'o	2018-12-02	1	-1/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	commit 8844618d8aa7a9973e7b527d038a2a589665002c upstream. The bg_flags field in the block group descripts is only valid if the uninit_bg or metadata_csum feature is enabled. We were not consistently looking at this field; fix this. Also block group #0 must never have uninitialized allocation bitmaps, or need to be zeroed, since that's where the root inode, and other special inodes are set up. Check for these conditions and mark the file system as corrupted if they are detected. This addresses CVE-2018-10876. https://bugzilla.kernel.org/show_bug.cgi?id=199403 Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org [bwh: Backported to 3.16: - ext4_read_block_bitmap_nowait() and ext4_read_inode_bitmap() return a pointer (NULL on error) instead of an error code - Open-code sb_rdonly() - Adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> [ghackmann@google.com: forward-port to 3.18: adjust context] Signed-off-by: Greg Hackmann <ghackmann@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
*	ext4: recalucate superblock checksum after updating free blocks/inodes	Theodore Ts'o	2018-12-02	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	commit 4274f516d4bc50648a4d97e4f67ecbd7b65cde4a upstream. When mounting the superblock, ext4_fill_super() calculates the free blocks and free inodes and stores them in the superblock. It's not strictly necessary, since we don't use them any more, but it's nice to keep them roughly aligned to reality. Since it's not critical for file system correctness, the code doesn't call ext4_commit_super(). The problem is that it's in ext4_commit_super() that we recalculate the superblock checksum. So if we're not going to call ext4_commit_super(), we need to call ext4_superblock_csum_set() to make sure the superblock checksum is consistent. Most of the time, this doesn't matter, since we end up calling ext4_commit_super() very soon thereafter, and definitely by the time the file system is unmounted. However, it doesn't work in this sequence: mke2fs -Fq -t ext4 /dev/vdc 128M mount /dev/vdc /vdc cp xfstests/git-versions /vdc godown /vdc umount /vdc mount /dev/vdc tune2fs -l /dev/vdc With this commit, the "tune2fs -l" no longer fails. Reported-by: Chengguang Xu <cgxu519@gmx.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
*	ext4: add more mount time checks of the superblock	Theodore Ts'o	2018-12-02	1	-11/+26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	commit bfe0a5f47ada40d7984de67e59a7d3390b9b9ecc upstream. The kernel's ext4 mount-time checks were more permissive than e2fsprogs's libext2fs checks when opening a file system. The superblock is considered too insane for debugfs or e2fsck to operate on it, the kernel has no business trying to mount it. This will make file system fuzzing tools work harder, but the failure cases that they find will be more useful and be easier to evaluate. Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
*	ext4: make sure bitmaps and the inode table don't overlap with bg descriptors	Theodore Ts'o	2018-12-02	1	-0/+25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	commit 77260807d1170a8cf35dbb06e07461a655f67eee upstream. It's really bad when the allocation bitmaps and the inode table overlap with the block group descriptors, since it causes random corruption of the bg descriptors. So we really want to head those off at the pass. https://bugzilla.kernel.org/show_bug.cgi?id=199865 Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
*	ext4: don't allow r/w mounts if metadata blocks overlap the superblock	Theodore Ts'o	2018-12-01	1	-0/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	commit 18db4b4e6fc31eda838dd1c1296d67dbcb3dc957 upstream. If some metadata block, such as an allocation bitmap, overlaps the superblock, it's very likely that if the file system is mounted read/write, the results will not be pretty. So disallow r/w mounts for file systems corrupted in this particular way. Backport notes: 3.18.y is missing bc98a42c1f7d ("VFS: Convert sb->s_flags & MS_RDONLY to sb_rdonly(sb)") and e462ec50cb5f ("VFS: Differentiate mount flags (MS_*) from internal superblock flags") so we simply use the sb MS_RDONLY check from pre bc98a42c1f7d in place of the sb_rdonly function used in the upstream variant of the patch. Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org Signed-off-by: Harsh Shandilya <harsh@prjkt.io> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
*	ext4: fix possible leak of sbi->s_group_desc_leak in error path	Theodore Ts'o	2018-12-01	1	-0/+8
\| \| \| \| \| \| \| \| \| \| \| \| \|	partial merge commit 9e463084cdb22e0b56b2dfbc50461020409a5fd3 upstream. Fixes: bfe0a5f47ada ("ext4: add more mount time checks of the superblock") Reported-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org # 4.18 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Moyster <oysterized@gmail.com>
*	Replace <asm/uaccess.h> with <linux/uaccess.h> globally	Linus Torvalds	2018-11-29	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This was entirely automated, using the script by Al: PATT='^[[:blank:]]#[[:blank:]]include[[:blank:]]*<asm/uaccess.h>' sed -i -e "s!$PATT!#include <linux/uaccess.h>!" \ $(git grep -l "$PATT"\|grep -v ^include/linux/uaccess.h) to do the replacement at the end of the merge window. Requested-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Moyster <oysterized@gmail.com>
*	ext4: convert to mbcache2	Jan Kara	2017-12-31	1	-2/+5
\| \| \| \| \| \| \| \| \| \| \|	The conversion is generally straightforward. The only tricky part is that xattr block corresponding to found mbcache entry can get freed before we get buffer lock for that block. So we have to check whether the entry is still valid after getting buffer lock. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Joe Maples <joe@frap129.org>
*	ext4: validate s_reserved_gdt_blocks on mount	Theodore Ts'o	2017-12-31	1	-0/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If s_reserved_gdt_blocks is extremely large, it's possible for ext4_init_block_bitmap(), which is called when ext4 sets up an uninitialized block bitmap, to corrupt random kernel memory. Add the same checks which e2fsck has --- it must never be larger than blocksize / sizeof(__u32) --- and then add a backup check in ext4_init_block_bitmap() in case the superblock gets modified after the file system is mounted. Reported-by: Vegard Nossum <vegard.nossum@oracle.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org [@nathanchance: fixed conflicts] Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: Joe Maples <joe@frap129.org>
*	ext4: ignore quota mount options if the quota feature is enabled	Theodore Ts'o	2017-12-31	1	-11/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously, ext4 would fail the mount if the file system had the quota feature enabled and quota mount options (used for the older quota setups) were present. This broke xfstests, since xfs silently ignores the usrquote and grpquota mount options if they are specified. This commit changes things so that we are consistent with xfs; having the mount options specified is harmless, so no sense break users by forbidding them. Cc: stable@vger.kernel.org Signed-off-by: Theodore Ts'o <tytso@mit.edu> [@nathanchance: fixed conflicts] Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: Joe Maples <joe@frap129.org>
*	ext4: don't save the error information if the block device is read-only	Theodore Ts'o	2017-12-31	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	.... because otherwise a corrupted root file file system (caused by flash corruption, etc.) will result in the device becoming bricked after an attempted OTA update --- which can only be recovered by a custom recovery image manually flashed onto the device (or a warantee return). Cherry picked from android-common 3.10 branch: 00a6e4dfa BUG=20939131 Change-Id: I7d5fe852c30efc35aa39fba7f708a22c034a1839 Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: "Theodore Ts'o" <tytso@google.com>
*	ext4: reject journal options for ext2 mounts	Carlos Maiolino	2017-12-31	1	-3/+3
\| \| \| \| \| \| \| \| \| \|	There is no reason to allow ext2 filesystems be mounted with journal mount options. So, this patch adds them to the MOPT_NO_EXT2 mount options list. Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Joe Maples <joe@frap129.org>
*	fs/ext4/super.c: use strreplace() in ext4_fill_super()	Rasmus Villemoes	2017-12-31	1	-3/+1
\| \| \| \| \| \| \| \| \| \|	This makes a very large function a little smaller. Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Joe Maples <joe@frap129.org>
*	ext4: move procfs registration code to fs/ext4/sysfs.c	Theodore Ts'o	2017-12-31	1	-38/+3
\| \| \| \| \| \| \| \| \| \|	This allows us to refactor the procfs code, which saves a bit of compiled space. More importantly it isolates most of the procfs support code into a single file, so it's easier to #ifdef it out if the proc file system has been disabled. Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Joe Maples <joe@frap129.org>
*	ext4: move sysfs code from super.c to fs/ext4/sysfs.c	Theodore Ts'o	2017-12-31	1	-409/+17
\| \| \| \| \| \| \| \| \|	Also statically allocate the ext4_kset and ext4_feat objects, since we only need exactly one of each, and it's simpler and less code if we drop the dynamic allocation and deallocation when it's not needed. Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Joe Maples <joe@frap129.org>
*	ext4: improve ext4lazyinit scalability	Dmitry Monakhov	2017-12-31	1	-10/+31
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	ext4lazyinit is a global thread. This thread performs itable initalization under li_list_mtx mutex. It basically does the following: ext4_lazyinit_thread ->mutex_lock(&eli->li_list_mtx); ->ext4_run_li_request(elr) ->ext4_init_inode_table-> Do a lot of IO if the list is large And when new mount/umount arrive they have to block on ->li_list_mtx because lazy_thread holds it during full walk procedure. ext4_fill_super ->ext4_register_li_request ->mutex_lock(&ext4_li_info->li_list_mtx); ->list_add(&elr->lr_request, &ext4_li_info >li_request_list); In my case mount takes 40minutes on server with 36 * 4Tb HDD. Common user may face this in case of very slow dev ( /dev/mmcblkXXX) Even more. If one of filesystems was frozen lazyinit_thread will simply block on sb_start_write() so other mount/umount will be stuck forever. This patch changes logic like follows: - grab ->s_umount read sem before processing new li_request. After that it is safe to drop li_list_mtx because all callers of li_remove_request are holding ->s_umount for write. - li_thread skips frozen SB's Locking order: Mh KOrder is asserted by umount path like follows: s_umount ->li_list_mtx so the only way to to grab ->s_mount inside li_thread is via down_read_trylock xfstests:ext4/023 #PSBM-49658 Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Joe Maples <joe@frap129.org>
*	ext4: explicit mount options parsing cleanup	Dmitry Monakhov	2017-12-31	1	-2/+6
\| \| \| \| \| \| \| \| \| \|	Currently MOPT_EXPLICIT treated as EXPLICIT_DELALLOC which may be changed in future. Let's fix it now. Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Pranav Vashi <neobuddy89@gmail.com> Signed-off-by: Joe Maples <joe@frap129.org>
*	ext4: don't manipulate recovery flag when freezing no-journal fs	Eric Sandeen	2017-12-31	1	-4/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	[ Upstream commit c642dc9e1aaed953597e7092d7df329e6234096e ] At some point along this sequence of changes: f6e63f9 ext4: fold ext4_nojournal_sops into ext4_sops bb04457 ext4: support freezing ext2 (nojournal) file systems 9ca9238 ext4: Use separate super_operations structure for no_journal filesystems ext4 started setting needs_recovery on filesystems without journals when they are unfrozen. This makes no sense, and in fact confuses blkid to the point where it doesn't recognize the filesystem at all. (freeze ext2; unfreeze ext2; run blkid; see no output; run dumpe2fs, see needs_recovery set on fs w/ no journal). To fix this, don't manipulate the INCOMPAT_RECOVER feature on filesystems without journals. Reported-by: Stu Mark <smark@datto.com> Reviewed-by: Jan Kara <jack@suse.com> Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: Pranav Vashi <neobuddy89@gmail.com> Signed-off-by: Joe Maples <joe@frap129.org>
*	ext4: return EROFS if device is r/o and journal replay is needed	Theodore Ts'o	2017-07-04	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \|	commit 4753d8a24d4588657bc0a4cd66d4e282dff15c8c upstream. If the file system requires journal recovery, and the device is read-ony, return EROFS to the mount system call. This allows xfstests generic/050 to pass. Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Willy Tarreau <w@1wt.eu>
*	ext4: preserve the needs_recovery flag when the journal is aborted	Theodore Ts'o	2017-07-04	1	-2/+4
\| \| \| \| \| \| \| \| \| \| \| \|	commit 97abd7d4b5d9c48ec15c425485f054e1c15e591b upstream. If the journal is aborted, the needs_recovery feature flag should not be removed. Otherwise, it's the journal might not get replayed and this could lead to more data getting lost. Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Jiri Slaby <jslaby@suse.cz> Signed-off-by: Willy Tarreau <w@1wt.eu>
*	ext4: fix in-superblock mount options processing	Theodore Ts'o	2017-07-04	1	-15/+23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	commit 5aee0f8a3f42c94c5012f1673420aee96315925a upstream. Fix a large number of problems with how we handle mount options in the superblock. For one, if the string in the superblock is long enough that it is not null terminated, we could run off the end of the string and try to interpret superblocks fields as characters. It's unlikely this will cause a security problem, but it could result in an invalid parse. Also, parse_options is destructive to the string, so in some cases if there is a comma-separated string, it would be modified in the superblock. (Fortunately it only happens on file systems with a 1k block size.) Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Willy Tarreau <w@1wt.eu>
*	ext4: avoid modifying checksum fields directly during checksum verification	Daeho Jeong	2017-05-29	1	-9/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	commit b47820edd1634dc1208f9212b7ecfb4230610a23 upstream. We temporally change checksum fields in buffers of some types of metadata into '0' for verifying the checksum values. By doing this without locking the buffer, some metadata's checksums, which are being committed or written back to the storage, could be damaged. In our test, several metadata blocks were found with damaged metadata checksum value during recovery process. When we only verify the checksum value, we have to avoid modifying checksum fields directly. Signed-off-by: Daeho Jeong <daeho.jeong@samsung.com> Signed-off-by: Youngjin Gil <youngjin.gil@samsung.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Willy Tarreau <w@1wt.eu>
*	ext4: validate that metadata blocks do not overlap superblock	Theodore Ts'o	2017-05-29	1	-1/+17
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	commit 829fa70dddadf9dd041d62b82cd7cea63943899d upstream. A number of fuzzing failures seem to be caused by allocation bitmaps or other metadata blocks being pointed at the superblock. This can cause kernel BUG or WARNings once the superblock is overwritten, so validate the group descriptor blocks to make sure this doesn't happen. Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Willy Tarreau <w@1wt.eu>
*	ext4: sanity check the block and cluster size at mount time	Theodore Ts'o	2017-05-29	1	-1/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	[ Upstream commit 9e47a4c9fc58032ee135bf76516809c7624b1551 ] If the block size or cluster size is insane, reject the mount. This is important for security reasons (although we shouldn't be just depending on this check). Ref: http://www.securityfocus.com/archive/1/539661 Ref: https://bugzilla.redhat.com/show_bug.cgi?id=1332506 Reported-by: Borislav Petkov <bp@alien8.de> Reported-by: Nikolay Borisov <kernel@kyup.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
*	ext4: use more strict checks for inodes_per_block on mount	Theodore Ts'o	2017-05-29	1	-9/+6
\| \| \| \| \| \| \| \| \| \| \| \|	[ Upstream commit cd6bb35bf7f6d7d922509bf50265383a0ceabe96 ] Centralize the checks for inodes_per_block and be more strict to make sure the inodes_per_block_group can't end up being zero. Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Andreas Dilger <adilger@dilger.ca> Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
*	ext4: add sanity checking to count_overhead()	Theodore Ts'o	2017-05-29	1	-3/+8
\| \| \| \| \| \| \| \| \| \| \| \| \|	[ Upstream commit c48ae41bafe31e9a66d8be2ced4e42a6b57fa814 ] The commit "ext4: sanity check the block and cluster size at mount time" should prevent any problems, but in case the superblock is modified while the file system is mounted, add an extra safety check to make sure we won't overrun the allocated buffer. Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
*	ext4: short-cut orphan cleanup on error	Vegard Nossum	2017-05-29	1	-0/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	commit c65d5c6c81a1f27dec5f627f67840726fcd146de upstream. If we encounter a filesystem error during orphan cleanup, we should stop. Otherwise, we may end up in an infinite loop where the same inode is processed again and again. EXT4-fs (loop0): warning: checktime reached, running e2fsck is recommended EXT4-fs error (device loop0): ext4_mb_generate_buddy:758: group 2, block bitmap and bg descriptor inconsistent: 6117 vs 0 free clusters Aborting journal on device loop0-8. EXT4-fs (loop0): Remounting filesystem read-only EXT4-fs error (device loop0) in ext4_free_blocks:4895: Journal has aborted EXT4-fs error (device loop0) in ext4_do_update_inode:4893: Journal has aborted EXT4-fs error (device loop0) in ext4_do_update_inode:4893: Journal has aborted EXT4-fs error (device loop0) in ext4_ext_remove_space:3068: IO failure EXT4-fs error (device loop0) in ext4_ext_truncate:4667: Journal has aborted EXT4-fs error (device loop0) in ext4_orphan_del:2927: Journal has aborted EXT4-fs error (device loop0) in ext4_do_update_inode:4893: Journal has aborted EXT4-fs (loop0): Inode 16 (00000000618192a0): orphan list check failed! [...] EXT4-fs (loop0): Inode 16 (0000000061819748): orphan list check failed! [...] EXT4-fs (loop0): Inode 16 (0000000061819bf0): orphan list check failed! [...] See-also: c9eb13a9105 ("ext4: fix hang when processing corrupted orphaned inode list") Cc: Jan Kara <jack@suse.cz> Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org Signed-off-by: Willy Tarreau <w@1wt.eu>
*	ext4: add lockdep annotations for i_data_sem	Theodore Ts'o	2017-05-29	1	-2/+23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	commit daf647d2dd58cec59570d7698a45b98e580f2076 upstream. With the internal Quota feature, mke2fs creates empty quota inodes and quota usage tracking is enabled as soon as the file system is mounted. Since quotacheck is no longer preallocating all of the blocks in the quota inode that are likely needed to be written to, we are now seeing a lockdep false positive caused by needing to allocate a quota block from inside ext4_map_blocks(), while holding i_data_sem for a data inode. This results in this complaint: Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&ei->i_data_sem); lock(&s->s_dquot.dqio_mutex); lock(&ei->i_data_sem); lock(&s->s_dquot.dqio_mutex); Google-Bug-Id: 27907753 Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Willy Tarreau <w@1wt.eu>
*	ext4, jbd2: ensure entering into panic after recording an error in superblock	Daeho Jeong	2017-05-29	1	-2/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	commit 4327ba52afd03fc4b5afa0ee1d774c9c5b0e85c5 upstream. If a EXT4 filesystem utilizes JBD2 journaling and an error occurs, the journaling will be aborted first and the error number will be recorded into JBD2 superblock and, finally, the system will enter into the panic state in "errors=panic" option. But, in the rare case, this sequence is little twisted like the below figure and it will happen that the system enters into panic state, which means the system reset in mobile environment, before completion of recording an error in the journal superblock. In this case, e2fsck cannot recognize that the filesystem failure occurred in the previous run and the corruption wouldn't be fixed. Task A Task B ext4_handle_error() -> jbd2_journal_abort() -> __journal_abort_soft() -> __jbd2_journal_abort_hard() \| -> journal->j_flags \|= JBD2_ABORT; \| \| __ext4_abort() \| -> jbd2_journal_abort() \| \| -> __journal_abort_soft() \| \| -> if (journal->j_flags & JBD2_ABORT) \| \| return; \| -> panic() \| -> jbd2_journal_update_sb_errno() Tested-by: Hobin Woo <hobin.woo@samsung.com> Signed-off-by: Daeho Jeong <daeho.jeong@samsung.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
*	ext4: call sync_blockdev() before invalidate_bdev() in put_super()	Theodore Ts'o	2017-05-29	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	commit 89d96a6f8e6491f24fc8f99fd6ae66820e85c6c1 upstream. Normally all of the buffers will have been forced out to disk before we call invalidate_bdev(), but there will be some cases, where a file system operation was aborted due to an ext4_error(), where there may still be some dirty buffers in the buffer cache for the device. So try to force them out to memory before calling invalidate_bdev(). This fixes a warning triggered by generic/081: WARNING: CPU: 1 PID: 3473 at /usr/projects/linux/ext4/fs/block_dev.c:56 __blkdev_put+0xb5/0x16f() Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
*	ext4: set lazytime on remount if MS_LAZYTIME is set by mount	Theodore Ts'o	2017-05-29	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \|	Newer versions of mount parse the lazytime feature and pass it to the mount system call via the flags field in the mount system call, removing the lazytime string from the mount options list. So we need to check for the presence of MS_LAZYTIME and set it in sb->s_flags in order for this flag to be set on a remount. Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org
*	ext4: add optimization for the lazytime mount option	Theodore Ts'o	2017-05-29	1	-0/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add an optimization for the MS_LAZYTIME mount option so that we will opportunistically write out any inodes with the I_DIRTY_TIME flag set in a particular inode table block when we need to update some inode in that inode table block anyway. Also add some temporary code so that we can set the lazytime mount option without needing a modified /sbin/mount program which can set MS_LAZYTIME. We can eventually make this go away once util-linux has added support. Google-Bug-Id: 18297052 Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
*	fs: push sync_filesystem() down to the file system's remount_fs()	Theodore Ts'o	2017-05-29	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously, the no-op "mount -o mount /dev/xxx" operation when the file system is already mounted read-write causes an implied, unconditional syncfs(). This seems pretty stupid, and it's certainly documented or guaraunteed to do this, nor is it particularly useful, except in the case where the file system was mounted rw and is getting remounted read-only. However, it's possible that there might be some file systems that are actually depending on this behavior. In most file systems, it's probably fine to only call sync_filesystem() when transitioning from read-write to read-only, and there are some file systems where this is not needed at all (for example, for a pseudo-filesystem or something like romfs). Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: linux-fsdevel@vger.kernel.org Cc: Christoph Hellwig <hch@infradead.org> Cc: Artem Bityutskiy <dedekind1@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Evgeniy Dushistov <dushistov@mail.ru> Cc: Jan Kara <jack@suse.cz> Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Cc: Anders Larsen <al@alarsen.net> Cc: Phillip Lougher <phillip@squashfs.org.uk> Cc: Kees Cook <keescook@chromium.org> Cc: Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz> Cc: Petr Vandrovec <petr@vandrovec.name> Cc: xfs@oss.sgi.com Cc: linux-btrfs@vger.kernel.org Cc: linux-cifs@vger.kernel.org Cc: samba-technical@lists.samba.org Cc: codalist@coda.cs.cmu.edu Cc: linux-ext4@vger.kernel.org Cc: linux-f2fs-devel@lists.sourceforge.net Cc: fuse-devel@lists.sourceforge.net Cc: cluster-devel@redhat.com Cc: linux-mtd@lists.infradead.org Cc: jfs-discussion@lists.sourceforge.net Cc: linux-nfs@vger.kernel.org Cc: linux-nilfs@vger.kernel.org Cc: linux-ntfs-dev@lists.sourceforge.net Cc: ocfs2-devel@oss.oracle.com Cc: reiserfs-devel@vger.kernel.org Change-Id: Ie6fc68d845b0d327f56e4da91a8a9ba0673e5d5e
*	UPSTREAM: ext4: fix fencepost in s_first_meta_bg validation	Theodore Ts'o	2017-05-27	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	(cherry-picked from commit 2ba3e6e8afc9b6188b471f27cf2b5e3cf34e7af2) It is OK for s_first_meta_bg to be equal to the number of block group descriptor blocks. (It rarely happens, but it shouldn't cause any problems.) https://bugzilla.kernel.org/show_bug.cgi?id=194567 Fixes: 3a4b77cd47bb837b8557595ec7425f281f2ca1fe Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org Change-Id: Ib414feb50f88dcd42dc846429b81df6c72b28136
*	BACKPORT: ext4: validate s_first_meta_bg at mount time	Eryu Guan	2017-05-27	1	-0/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	(Cherry-picked from commit 3a4b77cd47bb837b8557595ec7425f281f2ca1fe) Ralf Spenneberg reported that he hit a kernel crash when mounting a modified ext4 image. And it turns out that kernel crashed when calculating fs overhead (ext4_calculate_overhead()), this is because the image has very large s_first_meta_bg (debug code shows it's 842150400), and ext4 overruns the memory in count_overhead() when setting bitmap buffer, which is PAGE_SIZE. ext4_calculate_overhead(): buf = get_zeroed_page(GFP_NOFS); <=== PAGE_SIZE buffer blks = count_overhead(sb, i, buf); count_overhead(): for (j = ext4_bg_num_gdb(sb, grp); j > 0; j--) { <=== j = 842150400 ext4_set_bit(EXT4_B2C(sbi, s++), buf); <=== buffer overrun count++; } This can be reproduced easily for me by this script: #!/bin/bash rm -f fs.img mkdir -p /mnt/ext4 fallocate -l 16M fs.img mke2fs -t ext4 -O bigalloc,meta_bg,^resize_inode -F fs.img debugfs -w -R "ssv first_meta_bg 842150400" fs.img mount -o loop fs.img /mnt/ext4 Fix it by validating s_first_meta_bg first at mount time, and refusing to mount if its value exceeds the largest possible meta_bg number. Reported-by: Ralf Spenneberg <ralf@os-t.de> Signed-off-by: Eryu Guan <guaneryu@gmail.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Andreas Dilger <adilger@dilger.ca> Change-Id: I252fda33d116b044a3e710b79bdd0c7ce2870145
*	ext4 crypto: release crypto resource on module exit	Chao Yu	2017-05-27	1	-0/+1
\| \| \| \| \| \| \| \| \|	Crypto resource should be released when ext4 module exits, otherwise it will cause memory leak. Change-Id: Ie298e73bd766768707a7af440691ce2f418f5acc Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
*	ext4 crypto: use per-inode tfm structure	Theodore Ts'o	2017-05-27	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	As suggested by Herbert Xu, we shouldn't allocate a new tfm each time we read or write a page. Instead we can use a single tfm hanging off the inode's crypt_info structure for all of our encryption needs for that inode, since the tfm can be used by multiple crypto requests in parallel. Also use cmpxchg() to avoid races that could result in crypt_info structure getting doubly allocated or doubly freed. Change-Id: I4ae5c07d0e5d99ec1e26eeb49d833c4a284d9a5f Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Theodore Ts'o <tytso@google.com>
*	ext4: clean up superblock encryption mode fields	Theodore Ts'o	2017-05-27	1	-5/+0
\| \| \| \| \| \| \| \| \| \| \| \| \|	The superblock fields s_file_encryption_mode and s_dir_encryption_mode are vestigal, so remove them as a cleanup. While we're at it, allow file systems with both encryption and inline_data enabled at the same time to work correctly. We can't have encrypted inodes with inline data, but there's no reason to prohibit unencrypted inodes from using the inline data feature. Change-Id: Ia90b7e24bcf9ebabef529b710d70bd8ba71a17a4 Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: "Theodore Ts'o" <tytso@google.com>
*	ext4 crypto: reorganize how we store keys in the inode	Theodore Ts'o	2017-05-27	1	-2/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is a pretty massive patch which does a number of different things: 1) The per-inode encryption information is now stored in an allocated data structure, ext4_crypt_info, instead of directly in the node. This reduces the size usage of an in-memory inode when it is not using encryption. 2) We drop the ext4_fname_crypto_ctx entirely, and use the per-inode encryption structure instead. This remove an unnecessary memory allocation and free for the fname_crypto_ctx as well as allowing us to reuse the ctfm in a directory for multiple lookups and file creations. 3) We also cache the inode's policy information in the ext4_crypt_info structure so we don't have to continually read it out of the extended attributes. 4) We now keep the keyring key in the inode's encryption structure instead of releasing it after we are done using it to derive the per-inode key. This allows us to test to see if the key has been revoked; if it has, we prevent the use of the derived key and free it. 5) When an inode is released (or when the derived key is freed), we will use memset_explicit() to zero out the derived key, so it's not left hanging around in memory. This implies that when a user logs out, it is important to first revoke the key, and then unlink it, and then finally, to use "echo 3 > /proc/sys/vm/drop_caches" to release any decrypted pages and dcache entries from the system caches. 6) All this, and we also shrink the number of lines of code by around 100. :-) Change-Id: I948f7844d425c0ce616f800446ecb0b6bea686f8 Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Theodore Ts'o <tytso@google.com>
*	ext4 crypto: separate kernel and userspace structure for the key	Theodore Ts'o	2017-05-27	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Use struct ext4_encryption_key only for the master key passed via the kernel keyring. For internal kernel space users, we now use struct ext4_crypt_info. This will allow us to put information from the policy structure so we can cache it and avoid needing to constantly looking up the extended attribute. We will do this in a spearate patch. This patch is mostly mechnical to make it easier for patch review. Change-Id: I208472675d0550df5f60b3b58652a9a1b434caed Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Theodore Ts'o <tytso@google.com>
*	ext4 crypto: enable encryption feature flag	Theodore Ts'o	2017-05-27	1	-1/+28
\| \| \| \| \| \| \| \| \| \|	Also add the test dummy encryption mode flag so we can more easily test the encryption patches using xfstests. Change-Id: Iaae44110ab5870e5da60aca76197828f0ebc139b Signed-off-by: Michael Halcrow <mhalcrow@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Theodore Ts'o <tytso@google.com>
*	ext4 crypto: add ext4 encryption facilities	Michael Halcrow	2017-05-27	1	-0/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	On encrypt, we will re-assign the buffer_heads to point to a bounce page rather than the control_page (which is the original page to write that contains the plaintext). The block I/O occurs against the bounce page. On write completion, we re-assign the buffer_heads to the original plaintext page. On decrypt, we will attach a read completion callback to the bio struct. This read completion will decrypt the read contents in-place prior to setting the page up-to-date. The current encryption mode, AES-256-XTS, lacks cryptographic integrity. AES-256-GCM is in-plan, but we will need to devise a mechanism for handling the integrity data. Change-Id: I6e0569c9f19a82c75f4b545ad04ff7fdd1908d74 Signed-off-by: Michael Halcrow <mhalcrow@google.com> Signed-off-by: Ildar Muslukhov <ildarm@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Theodore Ts'o <tytso@google.com>
*	ext4: use old percpu_counter_init() function signature	Theodore Ts'o	2017-05-27	1	-7/+4
\| \| \| \| \| \|	Older kernels don't pass in a gfp_flags argument. Signed-off-by: Theodore Ts'o <tytso@mit.edu>
*	BACKPORT: ext4 from 3.18 to mtk-3.10	Mister Oyster	2017-05-27	1	-366/+464
\|
*	UPSTREAM: ext4: fix fencepost in s_first_meta_bg validation	Theodore Ts'o	2017-04-26	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	(cherry-picked from commit 2ba3e6e8afc9b6188b471f27cf2b5e3cf34e7af2) It is OK for s_first_meta_bg to be equal to the number of block group descriptor blocks. (It rarely happens, but it shouldn't cause any problems.) https://bugzilla.kernel.org/show_bug.cgi?id=194567 Fixes: 3a4b77cd47bb837b8557595ec7425f281f2ca1fe Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org Change-Id: Ib414feb50f88dcd42dc846429b81df6c72b28136
*	ext4: validate s_first_meta_bg at mount time	Eryu Guan	2017-04-16	1	-0/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Ralf Spenneberg reported that he hit a kernel crash when mounting a modified ext4 image. And it turns out that kernel crashed when calculating fs overhead (ext4_calculate_overhead()), this is because the image has very large s_first_meta_bg (debug code shows it's 842150400), and ext4 overruns the memory in count_overhead() when setting bitmap buffer, which is PAGE_SIZE. ext4_calculate_overhead(): buf = get_zeroed_page(GFP_NOFS); <=== PAGE_SIZE buffer blks = count_overhead(sb, i, buf); count_overhead(): for (j = ext4_bg_num_gdb(sb, grp); j > 0; j--) { <=== j = 842150400 ext4_set_bit(EXT4_B2C(sbi, s++), buf); <=== buffer overrun count++; } This can be reproduced easily for me by this script: #!/bin/bash rm -f fs.img mkdir -p /mnt/ext4 fallocate -l 16M fs.img mke2fs -t ext4 -O bigalloc,meta_bg,^resize_inode -F fs.img debugfs -w -R "ssv first_meta_bg 842150400" fs.img mount -o loop fs.img /mnt/ext4 Fix it by validating s_first_meta_bg first at mount time, and refusing to mount if its value exceeds the largest possible meta_bg number. Change-Id: I33846fe46efe52ff8ebd4e6275c9c7d4ae1aea33 Reported-by: Ralf Spenneberg <ralf@os-t.de> Signed-off-by: Eryu Guan <guaneryu@gmail.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Andreas Dilger <adilger@dilger.ca>