path: root/fs
Commit message | Author | Age | Files | Lines
...
* Revert "f2fs: reuse nids more aggressively"Chao Yu2017-12-061-4/+0
  Commit 268344664603 ("f2fs: reuse nids more aggressively") tries to reuse nids as much as possible, in order to mitigate producing obsolete node pages in the page cache. But actually, before we reuse a nid and its related node page cache, we always invalidate that node page, so there will not be any obsolete node pages in the cache.

  Let's just revert the previous commit, so that nm_i::next_scan_nid increases in ascending order, which makes it easier for __build_free_nids to traverse all NAT pages, so that the free nid bitmap cache can be enabled as soon as possible.

  Signed-off-by: Chao Yu <yuchao0@huawei.com>
  Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* Revert "f2fs: node segment is prior to data segment selected victim"Yunlong Song2017-12-061-11/+1
| | | | | | | | | | | | | | | | | | | | | This reverts commit b9cd20619e359d199b755543474c3d853c8e3415. That patch causes much fewer node segments (which can be used for SSR) than before, and in the corner case (e.g. create and delete *.txt files in one same directory, there will be very few node segments but many data segments), if the reserved free segments are all used up during gc, then the write_checkpoint can still flush dentry pages to data ssr segments, but will probably fail to flush node pages to node ssr segments, since there are not enough node ssr segments left (the left ones are all full). So revert this patch to give a fair chance to let node segments remain for SSR, which provides more robustness for corner cases. Conflicts: fs/f2fs/gc.c Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* AIO: Don't plug the I/O queue in do_io_submit()Dave Kleikamp2017-11-191-5/+0
| | | | | | | | | | | | | | | | | Asynchronous I/O latency to a solid-state disk greatly increased between the 2.6.32 and 3.0 kernels. By removing the plug from do_io_submit(), we observed a 34% improvement in the I/O latency. Unfortunately, at this level, we don't know if the request is to a rotating disk or not. Change-Id: I7101df956473ed9fd5dcff18e473dd93b688a5c1 Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com> Cc: linux-aio@kvack.org Cc: Chris Mason <chris.mason@oracle.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jeff Moyer <jmoyer@redhat.com>
* btrfs: prevent to set invalid default subvolidsatoru takeuchi2017-11-061-0/+4
  commit 6d6d282932d1a609e60dc4467677e0e863682f57 upstream.

  `btrfs sub set-default` succeeds even when it is given an ID that does not correspond to any fs/file tree. Once such a bad ID is set on a filesystem, we can't mount that filesystem without specifying the `subvol` or `subvolid` mount option.

  Fixes: 6ef5ed0d386b ("Btrfs: add ioctl and incompat flag to set the default mount subvol")
  Cc: <stable@vger.kernel.org>
  Signed-off-by: Satoru Takeuchi <satoru.takeuchi@gmail.com>
  Reviewed-by: Qu Wenruo <quwenruo.btrfs@gmx.com>
  Reviewed-by: David Sterba <dsterba@suse.com>
  Signed-off-by: David Sterba <dsterba@suse.com>
  Signed-off-by: Willy Tarreau <w@1wt.eu>
* udf: Fix deadlock between writeback and udf_setsize()Jan Kara2017-11-061-2/+2
| | | | | | | | | | | | | | | | | | | | commit f2e95355891153f66d4156bf3a142c6489cd78c6 upstream. udf_setsize() called truncate_setsize() with i_data_sem held. Thus truncate_pagecache() called from truncate_setsize() could lock a page under i_data_sem which can deadlock as page lock ranks below i_data_sem - e. g. writeback can hold page lock and try to acquire i_data_sem to map a block. Fix the problem by moving truncate_setsize() calls from under i_data_sem. It is safe for us to change i_size without holding i_data_sem as all the places that depend on i_size being stable already hold inode_lock. CC: stable@vger.kernel.org Fixes: 7e49b6f2480cb9a9e7322a91592e56a5c85361f5 Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Willy Tarreau <w@1wt.eu>
* ext4: avoid deadlock when expanding inode sizeJan Kara2017-11-062-8/+13
| | | | | | | | | | | | | | | | | | | commit 2e81a4eeedcaa66e35f58b81e0755b87057ce392 upstream. When we need to move xattrs into external xattr block, we call ext4_xattr_block_set() from ext4_expand_extra_isize_ea(). That may end up calling ext4_mark_inode_dirty() again which will recurse back into the inode expansion code leading to deadlocks. Protect from recursion using EXT4_STATE_NO_EXPAND inode flag and move its management into ext4_expand_extra_isize_ea() since its manipulation is safe there (due to xattr_sem) from possible races with ext4_xattr_set_handle() which plays with it as well. CC: stable@vger.kernel.org # 4.4.x Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Willy Tarreau <w@1wt.eu>
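The recursion guard described above can be sketched in user-space C. This is only an illustration under assumptions: `demo_inode`, `demo_expand` and `demo_mark_dirty` are hypothetical names, the boolean stands in for EXT4_STATE_NO_EXPAND, and the locking that the real code relies on (xattr_sem) is left out.

```c
#include <stdbool.h>

/* Hypothetical stand-in for an inode carrying a "no expand" state bit. */
struct demo_inode {
    bool no_expand;     /* models EXT4_STATE_NO_EXPAND */
    int  expand_calls;  /* how many times the expansion body really ran */
};

static int demo_mark_dirty(struct demo_inode *inode);

/* Models the expansion step, which may dirty the inode again. */
static int demo_expand(struct demo_inode *inode)
{
    if (inode->no_expand)
        return 0;               /* already expanding: refuse to recurse */

    inode->no_expand = true;    /* set the guard before calling out */
    inode->expand_calls++;
    demo_mark_dirty(inode);     /* would recurse forever without the guard */
    inode->no_expand = false;   /* clear the guard on the way out */
    return 0;
}

/* Models marking the inode dirty, which re-enters the expansion code. */
static int demo_mark_dirty(struct demo_inode *inode)
{
    return demo_expand(inode);
}
```

Setting and clearing the flag inside the expansion function itself mirrors the message's point that the flag is managed in one place.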
* ext4: in ext4_seek_{hole,data}, return -ENXIO for negative offsetsDarrick J. Wong2017-11-061-2/+2
| | | | | | | | | | | | | | commit 1bd8d6cd3e413d64e543ec3e69ff43e75a1cf1ea upstream. In the ext4 implementations of SEEK_HOLE and SEEK_DATA, make sure we return -ENXIO for negative offsets instead of banging around inside the extent code and returning -EFSCORRUPTED. Reported-by: Mateusz S <muttdini@gmail.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org # 4.6 Signed-off-by: Willy Tarreau <w@1wt.eu>
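A minimal sketch of the kind of up-front check the message describes, assuming a hypothetical helper `demo_check_seek_offset()` (not an ext4 function) that rejects a negative or past-EOF offset before any extent lookup is attempted:

```c
#include <errno.h>

/*
 * Hypothetical helper: validate a SEEK_HOLE/SEEK_DATA start offset against
 * the file size. Returns the offset unchanged when it is usable, or -ENXIO
 * when it is negative or at/after end of file.
 */
static long long demo_check_seek_offset(long long offset, long long isize)
{
    if (offset < 0 || offset >= isize)
        return -ENXIO;
    return offset;
}
```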
* ext4: fix SEEK_HOLEJan Kara2017-11-061-36/+14
  commit 7d95eddf313c88b24f99d4ca9c2411a4b82fef33 upstream.

  Currently, the SEEK_HOLE implementation in ext4 may both return that there's a hole at some offset although that offset already has data, and skip some holes during a search for the next hole. The first problem is demonstrated by:

  xfs_io -c "falloc 0 256k" -c "pwrite 0 56k" -c "seek -h 0" file
  wrote 57344/57344 bytes at offset 0
  56 KiB, 14 ops; 0.0000 sec (2.054 GiB/sec and 538461.5385 ops/sec)
  Whence  Result
  HOLE    0

  Where we can see that SEEK_HOLE wrongly returned offset 0 as containing a hole although we have written data there. The second problem can be demonstrated by:

  xfs_io -c "falloc 0 256k" -c "pwrite 0 56k" -c "pwrite 128k 8k" -c "seek -h 0" file
  wrote 57344/57344 bytes at offset 0
  56 KiB, 14 ops; 0.0000 sec (1.978 GiB/sec and 518518.5185 ops/sec)
  wrote 8192/8192 bytes at offset 131072
  8 KiB, 2 ops; 0.0000 sec (2 GiB/sec and 500000.0000 ops/sec)
  Whence  Result
  HOLE    139264

  Where we can see that the hole at offsets 56k..128k has been ignored by the SEEK_HOLE call.

  The underlying problem is in ext4_find_unwritten_pgoff(), which is just buggy. In some cases it fails to update the returned offset when it finds a hole (when no pages are found or when the first found page has a higher index than expected); in other cases the conditions for detecting a hole are just missing (we fail to detect a situation where the indices of returned pages are not contiguous). Fix ext4_find_unwritten_pgoff() to properly detect non-contiguous page indices, and also handle all cases where we got fewer pages than expected in one place.

  Fixes: c8c0df241cc2719b1262e627f999638411934f60
  CC: Zheng Liu <wenqing.lz@taobao.com>
  Signed-off-by: Jan Kara <jack@suse.cz>
  Signed-off-by: Theodore Ts'o <tytso@mit.edu>
  Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  Signed-off-by: Willy Tarreau <w@1wt.eu>
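The three hole conditions named in the message (no pages found, first page later than expected, non-contiguous indices) can be modelled with a small user-space function. This is a simplified sketch, not ext4_find_unwritten_pgoff(): `demo_first_hole()` is a hypothetical name, and the input is assumed to be a sorted array of page indices that hold data.

```c
#include <stddef.h>

/*
 * Simplified model: idx[] holds the sorted indices of pages that contain
 * data. Returns the index of the first page in [start, end) with no data,
 * or end if the whole range is covered.
 */
static unsigned long demo_first_hole(const unsigned long *idx, size_t n,
                                     unsigned long start, unsigned long end)
{
    unsigned long expected = start;

    for (size_t i = 0; i < n && expected < end; i++) {
        if (idx[i] < expected)
            continue;       /* page lies before the range of interest */
        if (idx[i] > expected)
            break;          /* gap: indices are not contiguous */
        expected++;         /* page is exactly where we expected it */
    }
    /* Also covers n == 0: expected is still start, so start is the hole. */
    return expected < end ? expected : end;
}
```

Assuming 4 KiB pages, the second xfs_io case corresponds to data in pages 0..13 and 32..33; the model then reports page 14 (byte offset 57344) as the first hole.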
* ext4: keep existing extra fields when inode expandsKonstantin Khlebnikov2017-11-061-2/+3
| | | | | | | | | | | | | commit 887a9730614727c4fff7cb756711b190593fc1df upstream. ext4_expand_extra_isize() should clear only space between old and new size. Fixes: 6dd4ee7cab7e # v2.6.23 Cc: stable@vger.kernel.org Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Willy Tarreau <w@1wt.eu>
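The rule stated above, clearing only the bytes between the old and the new size, can be sketched as follows. `demo_expand_extra()` is a hypothetical helper operating on a plain byte buffer, not the ext4 function itself.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper: grow the used part of an "extra fields" area from
 * old_size to new_size bytes. Only the newly exposed bytes are zeroed, so
 * fields already stored in [0, old_size) keep their values.
 */
static void demo_expand_extra(unsigned char *extra, size_t old_size,
                              size_t new_size)
{
    assert(new_size >= old_size);
    memset(extra + old_size, 0, new_size - old_size);
}
```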
* FS-Cache: fix dereference of NULL user_key_payloadEric Biggers2017-11-061-0/+7
| | | | | | | | | | | | | | | | | | | | | commit d124b2c53c7bee6569d2a2d0b18b4a1afde00134 upstream. When the file /proc/fs/fscache/objects (available with CONFIG_FSCACHE_OBJECT_LIST=y) is opened, we request a user key with description "fscache:objlist", then access its payload. However, a revoked key has a NULL payload, and we failed to check for this. request_key() *does* skip revoked keys, but there is still a window where the key can be revoked before we access its payload. Fix it by checking for a NULL payload, treating it like a key which was already revoked at the time it was requested. Fixes: 4fbf4291aa15 ("FS-Cache: Allow the current state of all objects to be dumped") Reviewed-by: James Morris <james.l.morris@oracle.com> Cc: <stable@vger.kernel.org> [v2.6.32+] Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Willy Tarreau <w@1wt.eu>
* direct-io: Prevent NULL pointer access in submit_page_sectionAndreas Gruenbacher2017-11-061-1/+2
| | | | | | | | | | | | | | | | | commit 899f0429c7d3eed886406cd72182bee3b96aa1f9 upstream. In the code added to function submit_page_section by commit b1058b981, sdio->bio can currently be NULL when calling dio_bio_submit. This then leads to a NULL pointer access in dio_bio_submit, so check for a NULL bio in submit_page_section before trying to submit it instead. Fixes xfstest generic/250 on gfs2. Cc: stable@vger.kernel.org # v3.10+ Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Willy Tarreau <w@1wt.eu>
* fuse: initialize the flock flag in fuse_file on allocationMateusz Jurczyk2017-11-061-1/+1
| | | | | | | | | | | | | | | | | | | commit 68227c03cba84a24faf8a7277d2b1a03c8959c2c upstream. Before the patch, the flock flag could remain uninitialized for the lifespan of the fuse_file allocation. Unless set to true in fuse_file_flock(), it would remain in an indeterminate state until read in an if statement in fuse_release_common(). This could consequently lead to taking an unexpected branch in the code. The bug was discovered by a runtime instrumentation designed to detect use of uninitialized memory in the kernel. Signed-off-by: Mateusz Jurczyk <mjurczyk@google.com> Fixes: 37fb3a30b462 ("fuse: fix flock") Cc: <stable@vger.kernel.org> # v3.1+ Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Willy Tarreau <w@1wt.eu>
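The log does not show how the fix initializes the flag. One common way to rule out this class of bug is a zeroing allocation, sketched here in user-space C with `calloc()`; `demo_file` and `demo_file_alloc()` are hypothetical names, and the struct is a stand-in rather than the real fuse_file layout.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a per-open-file structure. */
struct demo_file {
    unsigned long long fh;
    bool flock;     /* must start out false */
};

/* Allocate a zero-filled demo_file so that no field is left indeterminate. */
static struct demo_file *demo_file_alloc(void)
{
    return calloc(1, sizeof(struct demo_file));
}
```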
* leak in O_DIRECT readv past the EOFAl Viro2017-11-061-0/+1
  In all versions from 2.5.62 to 3.15, on each iteration through the loop by iovec array in do_blockdev_direct_IO() we used to do this:

      sdio.head = 0;
      sdio.tail = 0;
      ...
      retval = do_direct_IO(dio, &sdio, &map_bh);
      if (retval) {
              dio_cleanup(dio, &sdio);
              break;
      }

  with another dio_cleanup() done after the loop, catching the situation when retval had been 0.

  Consider the situation when e.g. the 3rd iovec in a 4-iovec array passed to readv() has crossed the EOF. do_direct_IO() returns 0 and buggers off *without* exhausting the page array. The loop proceeds to the next iovec without calling dio_cleanup() and resets sdio.head and sdio.tail. That reset of sdio.{head,tail} has prevented the eventual dio_cleanup() from seeing anything, and the page references end up leaking.

  Commit 7b2c99d15559 (new helper: iov_iter_get_pages()) in 3.16 had eliminated the loop by iovec array, along with the sdio.head and sdio.tail resets. Backporting that is too much work - the minimal fix is simply to make sure that the only case when do_direct_IO() buggers off early without returning non-zero will not skip dio_cleanup().

  The fix applies to all versions from 2.5.62 to 3.15.

  Reported-and-tested-by: Venki Pallipadi <venki@cohesity.com>
  Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
  Signed-off-by: Willy Tarreau <w@1wt.eu>
* sdcardfs: fix space leakAndrea Arcangeli2017-10-211-0/+6
| | | | | | | | | | | | | | | | | Don't keep looked up dentries around because there's no notification when the lowerfs unlinks an inode, to collect the upper dentry cache so the lower inode reference can be released and the storage space released. This effectively disables the dcache lookups but it should still be much faster than fuse sdcardfs. The alternative is to add a notification mechanism to keep two separate dcache layers in sync which isn't trivial, or stop ever touching the lower fs and remove that path.replace from VolumeInfo.java. Change-Id: I211bd676834126f6f65b3d09ebe951d0375d7985 Signed-off-by: Kevin F. Haggerty <haggertk@lineageos.org>
* FROMLIST: f2fs: expose some sectors to user in inline data or dentry caseJaegeuk Kim2017-10-161-0/+5
  (from https://patchwork.kernel.org/patch/10005409/)

  If there's some data written through inline data or dentry, we need to show st_blocks. This fixes reporting zero blocks even though a small amount of data has been written.

  Bug: 67651285
  Bug: 67600404
  Change-Id: I9ad5cb137eb627b9fd22740d2ab98e0221433c95
  Cc: stable@vger.kernel.org
  Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: fix potential panic during fstrimChao Yu2017-10-143-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As Ju Hyung Park reported: "When 'fstrim' is called for manual trim, a BUG() can be triggered randomly with this patch. I'm seeing this issue on both x86 Desktop and arm64 Android phone. On x86 Desktop, this was caused during Ubuntu boot-up. I have a cronjob installed which calls 'fstrim -v /' during boot. On arm64 Android, this was caused during GC looping with 1ms gc_min_sleep_time & gc_max_sleep_time." Root cause of this issue is that f2fs_wait_discard_bios can only be used by f2fs_put_super, because during put_super there must be no other referrers, so it can ignore discard entry's reference count when removing the entry, otherwise in other caller we will hit bug_on in __remove_discard_cmd as there may be other issuer added reference count in discard entry. Thread A Thread B - issue_discard_thread - f2fs_ioc_fitrim - f2fs_trim_fs - f2fs_wait_discard_bios - __issue_discard_cmd - __submit_discard_cmd - __wait_discard_cmd - dc->ref++ - __wait_one_discard_bio - __wait_discard_cmd - __remove_discard_cmd - f2fs_bug_on(sbi, dc->ref) Fixes: 969d1b180d987c2be02de890d0fff0f66a0e80de Reported-by: Ju Hyung Park <qkrwngud825@gmail.com> Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: Fix merge errorsjollaman9992017-10-043-4/+6
| | | | Signed-off-by: Mister Oyster <oysterized@gmail.com>
* f2fs: hurry up to issue discard after io interruptionChao Yu2017-10-041-2/+15
  Once we encounter an I/O interruption while issuing discards, we delay for a long time before the next round. But if the system is I/O idle during that time, we may lose the opportunity to issue discards. So this patch hurries up to issue discards after an I/O interruption.

  Besides, this patch also fixes issuing discards accurately at the assigned rate.

  Signed-off-by: Chao Yu <yuchao0@huawei.com>
  Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: fix to show correct discard_granularity in sysfsChao Yu2017-10-041-0/+2
  Fix the incorrect display below when reading the discard_granularity sysfs node.

  $ cat /sys/fs/f2fs/<device>/discard_granularity
  $ 16
  $ echo 32 > /sys/fs/f2fs/<device>/discard_granularity
  $ cat /sys/fs/f2fs/<device>/discard_granularity
  $ 16

  Signed-off-by: Chao Yu <yuchao0@huawei.com>
  Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: detect dirty inode in evict_inodeChao Yu2017-10-041-0/+3
| | | | | | | | Add a bugon in f2fs_evict_inode to detect inconsistent status between inode cache and related node page cache. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: clear radix tree dirty tag of pages whose dirty flag is clearedDaeho Jeong2017-10-042-0/+14
  In a scenario like writing out the first dirty page of the inode as inline data, we only cleared the dirty flags of the pages, but didn't clear the dirty tags of those pages in the radix tree. If we don't clear the dirty tags of the pages in the radix tree, the inodes which contain the pages will be marked with I_DIRTY_PAGES again and again, and writepages() for the inodes will be invoked in every writeback period. As a result, nothing will be done in any of those writepages() calls, and they will just consume CPU time to no purpose.

  Signed-off-by: Daeho Jeong <daeho.jeong@samsung.com>
  Reviewed-by: Chao Yu <yuchao0@huawei.com>
  Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: speed up gc_urgent mode with SSRJaegeuk Kim2017-10-043-13/+16
| | | | | | | This patch activates SSR in gc_urgent mode. Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: better to wait for fstrim completionJaegeuk Kim2017-10-041-1/+6
  In Android, we'd better wait for fstrim completion instead of issuing the discard commands asynchronously.

  Reviewed-by: Chao Yu <yuchao0@huawei.com>
  Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: avoid race in between read xattr & write xattrYunlei He2017-10-043-0/+8
  Thread A:                                   Thread B:
  -f2fs_getxattr
     -lookup_all_xattrs
        -xnid = F2FS_I(inode)->i_xattr_nid;
                                              -f2fs_setxattr
                                                 -__f2fs_setxattr
                                                    -write_all_xattrs
                                                       -truncate_xattr_node
                                                          ...  ...
                                                    -write_checkpoint
                                                          ...  ...
                                                    -alloc_nid   <- nid reuse
           -get_node_page
              -f2fs_bug_on  <- nid != node_footer->nid

  A rw_sem is needed to avoid the race.

  Signed-off-by: Yunlei He <heyunlei@huawei.com>
  Reviewed-by: Chao Yu <yuchao0@huawei.com>
  Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: make get_lock_data_page to handle encrypted inodeJaegeuk Kim2017-10-041-58/+51
| | | | | | | | This patch refactors get_lock_data_page() to handle encryption case directly. In order to do that, it introduces common f2fs_submit_page_read(). Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: use generic terms used for encrypted block managementJaegeuk Kim2017-10-045-12/+15
| | | | | | | | This patch renames functions regarding to buffer management via META_MAPPING used for encrypted blocks especially. We can actually use them in generic way. Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: introduce f2fs_encrypted_file for clean-upJaegeuk Kim2017-10-045-10/+14
| | | | | | | | This patch replaces (f2fs_encrypted_inode() && S_ISREG()) with f2fs_encrypted_file(), which gives no functional change. Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* Revert "f2fs: add a new function get_ssr_cost"Yunlong Song2017-10-041-10/+1
| | | | | | | | | | | This reverts commit b7b7c4cf1c9ef0272a65f1480457cbfdadcda19d. se->ckpt_valid_blocks will never be smaller than se->valid_blocks, so just remove get_ssr_cost. Signed-off-by: Yunlong Song <yunlong.song@huawei.com> Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: constify super_operationsArvind Yadav2017-10-041-1/+1
| | | | | | | | | | | super_operations are not supposed to change at runtime. "struct super_block" working with super_operations provided by <linux/fs.h> work with const super_operations. So mark the non-const structs as const Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: fix to wake up all sleeping flusherChao Yu2017-10-041-2/+21
  In the scenario of remount_ro vs. flush, after flush_thread exits in ->remount_fs, the flusher only cleans up the global issue_list, without waking up the flushers waiting on that list, which leaves the related user threads hanging.

  In order to fix this issue, this patch enables the flusher to take charge of the issue_flush thread's work: it executes the merged flush command and wakes up all sleeping flushers.

  Fixes: 5eba8c5d1fb3 ("f2fs: fix to access nullified flush_cmd_control pointer")
  Signed-off-by: Chao Yu <yuchao0@huawei.com>
  Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: avoid race in between atomic_read & atomic_incChao Yu2017-10-041-3/+1
  Previously, we would miss merging flush commands during fsync due to the race condition below:

  Thread A                        Thread B                        Thread C
  - f2fs_issue_flush
   - atomic_read(&issing_flush)
                                  - f2fs_issue_flush
                                   - atomic_read(&issing_flush)
                                                                  - f2fs_issue_flush
                                                                   - atomic_read(&issing_flush)
   - atomic_inc(&issing_flush)
                                   - atomic_inc(&issing_flush)
                                                                   - atomic_inc(&issing_flush)
   - submit_flush_wait
                                   - submit_flush_wait
                                                                   - submit_flush_wait

  It needs to use atomic_inc_return instead to avoid such a race.

  Signed-off-by: Chao Yu <yuchao0@huawei.com>
  Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
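The difference between "read, then increment" and a single increment-and-return can be sketched with C11 atomics. This is a user-space analogy, not the f2fs code: `demo_flush_enter()` and `demo_flush_leave()` are hypothetical names, and `atomic_fetch_add(...) + 1` plays the role of the kernel's atomic_inc_return().

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Returns true only for the caller that moved the counter from 0 to 1.
 * Because the increment and the read of the new value are one atomic
 * operation, two concurrent callers can never both observe "I am first",
 * which is what a separate atomic read followed by an increment allows.
 */
static bool demo_flush_enter(atomic_int *issuing)
{
    return atomic_fetch_add(issuing, 1) + 1 == 1;
}

/* Every caller of demo_flush_enter() drops its count when it is done. */
static void demo_flush_leave(atomic_int *issuing)
{
    atomic_fetch_sub(issuing, 1);
}
```

In this sketch, a caller that gets `true` would submit the flush directly, while the others would queue their request so it can be merged.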
* f2fs: remove unneeded parameter of change_cursegChao Yu2017-10-041-10/+8
| | | | | | | | | | allocate_segment_by_default is the only caller of change_curseg passing @reuse with 'false', but commit 763bfe1bc575 ("f2fs: remove reusing any prefree segments") removes the calling, after that, @reuse in change_curseg always be true, so, let's clean up the unneeded parameter. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: update i_flags correctlyChao Yu2017-10-041-0/+3
  f2fs enables hash-indexed directories by default, so we need to set FS_INDEX_FL in inode::i_flags during directory creation, in order to show the correct status of the inode in lsattr:

  Before:
  ------------------- /mnt/f2fs/dir/

  After:
  -----------I------- /mnt/f2fs/dir/

  Signed-off-by: Chao Yu <yuchao0@huawei.com>
  Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: don't check inode's checksum if it was dirtied or writebackedJaegeuk Kim2017-10-042-2/+3
| | | | | | | | If another thread already made the page dirtied or writebacked, we must avoid to verify checksum. If we got an error, we need to remove its uptodate as well. Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: don't need to update inode checksum for recoveryJaegeuk Kim2017-10-041-2/+0
| | | | | | | | This patch fixes "f2fs: support inode checksum". The recovered inode page will be rewritten with valid checksum. Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: trigger fdatasync for non-atomic_write fileChao Yu2017-10-041-1/+1
| | | | | | | | Sqlite only cares about synchronization of file data instead of other data unrelated attribute of inode, so in commit flow, call fdatasync is enough. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: fix to avoid race in between aio and gcChao Yu2017-10-041-0/+3
| | | | | | | | | | | We won't wait DIO synchronously when doing AIO, so there will be potential IO reorder in between AIO and GC, which will cause data corruption. This patch adds inode_dio_wait to serialize aio and data GC to avoid this issue. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: wake up discard_thread iff there is a candidateJaegeuk Kim2017-10-043-7/+27
| | | | | | | This patch fixes to avoid needless wake ups. Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: return error when accessing insane flie offsetJaegeuk Kim2017-10-041-1/+5
| | | | | | | If file offset is insane, we have to return error instead of kernel panic. Reported-by: Eric Zhang <followme999@163.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: trigger normal fsync for non-atomic_write fileChao Yu2017-10-041-1/+1
| | | | | | | | | | If file was not opened with atomic write mode, but user uses atomic write ioctl to fsync datas, in the flow, we should not fsync that file with atomic write mode. Fixes: 608514deba38 ("f2fs: set fsync mark only for the last dnode") Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: clear FI_HOT_DATA correctlyChao Yu2017-10-042-0/+3
| | | | | | | | | | This patch fixes to clear FI_HOT_DATA correctly in below path: - error handling in f2fs_ioc_start_atomic_write - after commit atomic write in f2fs_ioc_commit_atomic_write - after drop atomic write in drop_inmem_pages Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: fix out-of-order execution in f2fs_issue_flushChao Yu2017-10-041-1/+4
| | | | | | | | | | In f2fs_issue_flush, due to out-of-order execution of CPU, wake_up can be called before we insert issue_list, result in long latency of wait_for_completion. Fix this by adding smp_mb() to force the order of related codes. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
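The ordering requirement can be sketched with C11 atomics, where a sequentially consistent fence stands in for smp_mb(). Everything here is a hypothetical user-space model: `demo_queue` and `demo_submit()` are invented names, a counter replaces the issue list, and a flag replaces the wait queue check.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model of a submit queue with one possible sleeper. */
struct demo_queue {
    atomic_int  pending;        /* models entries on the issue list */
    atomic_bool waiter_asleep;  /* models "is anyone on the wait queue?" */
    int         wakeups;        /* counts wake-up calls, for illustration */
};

static void demo_submit(struct demo_queue *q)
{
    /* Step 1: publish the new entry. */
    atomic_fetch_add_explicit(&q->pending, 1, memory_order_relaxed);

    /*
     * Step 2: full fence, so the entry is visible before we look for a
     * sleeper. Without it, the check below may be reordered ahead of the
     * insertion.
     */
    atomic_thread_fence(memory_order_seq_cst);

    /* Step 3: wake the consumer only if it is actually sleeping. */
    if (atomic_load_explicit(&q->waiter_asleep, memory_order_relaxed))
        q->wakeups++;
}
```

A single-threaded run only shows the call order; the fence matters when the consumer runs concurrently on another CPU.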
* f2fs: issue discard commands if gc_urgent is setJaegeuk Kim2017-10-042-1/+10
| | | | | | It's time to issue all the discard commands, if user sets the idle time. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: remove unused function overprovision_sectionsYunlong Song2017-10-041-5/+0
| | | | | | Signed-off-by: Yunlong Song <yunlong.song@huawei.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: check hot_data for roll-forward recoveryJaegeuk Kim2017-10-041-1/+1
| | | | | | | | | We need to check HOT_DATA to truncate any previous data block when doing roll-forward recovery. Cc: <stable@vger.kernel.org> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: add tracepoint for f2fs_gcChao Yu2017-10-041-14/+36
| | | | | | | This patch adds tracepoint for f2fs_gc. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: retry to revoke atomic commit in -ENOMEM caseChao Yu2017-10-041-2/+8
| | | | | | | | During atomic committing, if we encounter -ENOMEM in revoke path, it's better to give a chance to retry revoking. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
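A retry-on-ENOMEM loop might look like the sketch below. The names `demo_retry_on_enomem()` and `demo_fail_n_times()` are hypothetical, and the bounded retry count is an assumption made for the example; the log does not say how many times, or with what delay, the kernel retries.

```c
#include <errno.h>

typedef int (*demo_op_fn)(void *ctx);

/*
 * Call op() until it returns something other than -ENOMEM, at most
 * max_tries times. Returns the last result, or -ENOMEM if max_tries <= 0.
 */
static int demo_retry_on_enomem(demo_op_fn op, void *ctx, int max_tries)
{
    int err = -ENOMEM;

    for (int i = 0; i < max_tries; i++) {
        err = op(ctx);
        if (err != -ENOMEM)
            break;
        /* A real implementation would wait or yield here before retrying. */
    }
    return err;
}

/* Example operation: fails with -ENOMEM while *ctx is still positive. */
static int demo_fail_n_times(void *ctx)
{
    int *left = ctx;

    if (*left > 0) {
        (*left)--;
        return -ENOMEM;
    }
    return 0;
}
```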
* f2fs: let fill_super handle roll-forward errorsJaegeuk Kim2017-10-041-2/+0
  If we set CP_ERROR_FLAG on a roll-forward error, f2fs can no longer proceed with any IOs due to f2fs_cp_error(). But, for example, if some stale data is involved in the roll-forward process, we can get -ENOENT, leaving the fs stuck. If we get any error, let fill_super set SBI_NEED_FSCK and try to recover back to a stable point.

  Cc: <stable@vger.kernel.org>
  Reviewed-by: Chao Yu <yuchao0@huawei.com>
  Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: merge equivalent flags F2FS_GET_BLOCK_[READ|DIO]Qiuyang Sun2017-10-043-10/+11
  Currently, the two flags F2FS_GET_BLOCK_[READ|DIO] are totally equivalent and can be used interchangeably in all scenarios they are involved in. Neither of the flags is referenced in f2fs_map_blocks(), making them both the default case. To remove the ambiguity, this patch merges both flags into F2FS_GET_BLOCK_DEFAULT, and introduces an enum for all distinct flags.

  Signed-off-by: Qiuyang Sun <sunqiuyang@huawei.com>
  Reviewed-by: Chao Yu <yuchao0@huawei.com>
  Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: support journalled quotaChao Yu2017-10-044-32/+404
  This patch enables f2fs to accept quota information through mount options:
  - {usr,grp,prj}jquota=<quota file path>
  - jqfmt=<quota type>

  Then, in the ->mount flow, we can recover the quota file during log replay; this is how journalled quota is supported.

  Signed-off-by: Chao Yu <yuchao0@huawei.com>
  Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>