aboutsummaryrefslogtreecommitdiff
Commit message (Collapse)AuthorAgeFilesLines
...
* f2fs: check return value of write_checkpoint during fstrimChao Yu2017-04-131-0/+2
| | | | | | | | During fstrim, if one of multiple write_checkpoint failed, break off and return error number to caller. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: fix to do f2fs_balance_fs in f2fs_map_blocks correctlyChao Yu2017-04-131-0/+1
| | | | | | | | If we preallocate blocks with f2fs_reserve_blocks in f2fs_map_blocks, we should call f2fs_balance_fs for checking and reclaiming space, fix it. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: avoid unneeded loop in build_sit_entriesChao Yu2017-04-131-16/+32
| | | | | | | | | | | | | | | When building each sit entry in cache, firstly, we will load it from sit page, and then check all entries in sit journal, if there is one updated entry in journal, cover cached entry with the journaled one. Actually, most of check operation is unneeded since we only need to update cached entries with journaled entries in batch, so changing the flow as below for more efficient: 1. load all sit entries into cache from sit pages; 2. update sit entries with journal. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: clean up foreground GC flowChao Yu2017-04-131-11/+8
| | | | | | | | | This patch changes to check valid block number of one GCed section directly instead of checking the number in all segments of section one by one in order to clean up codes of foreground GC. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: set dirty state for filesystem only when updating meta dataChao Yu2017-04-131-0/+4
| | | | | | | | | | | | We don't guarantee integrity of user data after checkpoint, since we only guarantee meta data integrity for data consistency of filesystem. Due to above reason, we only need to set fs as dirty when meta data is updated, so that we can skip writing checkpoint in some case of non-meta data is updated. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: skip new checkpoint when doing fstrim without fs changeYunlei He2017-04-131-0/+11
| | | | | | | | This patch enables to do fstrim without checkpoint, if there is no fs change. Signed-off-by: Yunlei He <heyunlei@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: add discard info to sys entry of f2fs statusYunlei He2017-04-132-3/+14
| | | | | | | This patch add discard block count to sys entry of f2fs status Signed-off-by: Yunlei He <heyunlei@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: reduce batch size of fstrimJaegeuk Kim2017-04-131-1/+1
| | | | | | This is to reduce the batch size of fstrim to avoid long latency. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: do not use discard_map for hard disksJaegeuk Kim2017-04-133-10/+29
| | | | | | We don't need to keep discard_map, if disk does not support discard command. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: not allow to write illegal blkaddrYunlei He2017-04-131-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | we came across an error as below: [build_nat_area_bitmap:1710] nid[0x 1718] addr[0x 1c18ddc] ino[0x 1718] [build_nat_area_bitmap:1710] nid[0x 1719] addr[0x 1c193d5] ino[0x 1719] [build_nat_area_bitmap:1710] nid[0x 171a] addr[0x 1c1736e] ino[0x 171a] [build_nat_area_bitmap:1710] nid[0x 171b] addr[0x 58b3ee8f] ino[0x815f92ed] [build_nat_area_bitmap:1710] nid[0x 171c] addr[0x fcdc94b] ino[0x49366377] [build_nat_area_bitmap:1710] nid[0x 171d] addr[0x 7cd2facf] ino[0xb3c55300] [build_nat_area_bitmap:1710] nid[0x 171e] addr[0x bd4e25d0] ino[0x77c34c09] ... ... [build_nat_area_bitmap:1710] nid[0x 1718] addr[0x 1c18ddc] ino[0x 1718] [build_nat_area_bitmap:1710] nid[0x 1719] addr[0x 1c193d5] ino[0x 1719] [build_nat_area_bitmap:1710] nid[0x 171a] addr[0x 1c1736e] ino[0x 171a] [build_nat_area_bitmap:1710] nid[0x 171b] addr[0x 58b3ee8f] ino[0x815f92ed] [build_nat_area_bitmap:1710] nid[0x 171c] addr[0x fcdc94b] ino[0x49366377] [build_nat_area_bitmap:1710] nid[0x 171d] addr[0x 7cd2facf] ino[0xb3c55300] [build_nat_area_bitmap:1710] nid[0x 171e] addr[0x bd4e25d0] ino[0x77c34c09] One nat block may be stepped by a data block, so this patch forbid to write if the blkaddr is illegal Signed-off-by: Yunlei He <heyunlei@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* Revert "f2fs: move i_size_write in f2fs_write_end"Chao Yu2017-04-131-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reverts commit a2ee0a300344a6da76186129b078113354fe13d2. When testing with generic/032 of xfstest suit, failure message will be reported as below: generic/032 8s ... [failed, exit status 1] - output mismatch (see results/generic/032.out.bad) --- tests/generic/032.out 2015-01-11 16:52:27.643681072 +0800 +++ results/generic/032.out.bad 2016-08-06 13:44:43.861330500 +0800 @@ -1,5 +1,5 @@ QA output created by 032 -100 iterations -0000000 cdcd cdcd cdcd cdcd cdcd cdcd cdcd cdcd -* -0100000 +1: [768..775]: unwritten +Unwritten extents found! ... (Run 'diff -u tests/generic/032.out results/generic/032.out.bad' to see the entire diff) Ran: generic/032 Failures: generic/032 Failed 1 of 1 tests In write_end(), we should update i_size of inode before unlock page, otherwise, we will lose newly updated data in following race condition. Thread A Thread B - write_end - unlock page - writepages - lock_page - writepage if page is out-of-range of file size, we will skip writting the page. - update i_size Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: fix build for v3.10Alexander Gordeev2017-04-131-5/+5
| | | | | | | kvfree() first appears in v3.15. f2fs_kvfree() should be used in the backport. Signed-off-by: Alexander Gordeev <alex@gordick.net> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: adjust other changesJaegeuk Kim2017-04-138-43/+61
| | | | | | | | | | This patch changes: - d_inode - file_dentry - inode_nohighmem - ... Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: get victim segment again after new cpYunlei He2017-04-131-2/+5
| | | | | | | | | | | | | | | | | | | | | | | | Previous selected segment may become free after write_checkpoint, if we do garbage collect on this segment, and then new_curseg happen to reuse it, it may cause f2fs_bug_on as below. panic+0x154/0x29c do_garbage_collect+0x15c/0xaf4 f2fs_gc+0x2dc/0x444 f2fs_balance_fs.part.22+0xcc/0x14c f2fs_balance_fs+0x28/0x34 f2fs_map_blocks+0x5ec/0x790 f2fs_preallocate_blocks+0xe0/0x100 f2fs_file_write_iter+0x64/0x11c new_sync_write+0xac/0x11c vfs_write+0x144/0x1e4 SyS_write+0x60/0xc0 Here, maybe we check sit and ssa type during reset_curseg. So, we check segment is stale or not, and select a new victim to avoid this. Signed-off-by: Yunlei He <heyunlei@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: handle error case with f2fs_bug_onJaegeuk Kim2017-04-131-0/+2
| | | | | | | It's enough to show BUG or WARN by f2fs_bug_on for error case. Then, we don't need to remain corrupted filesystem. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: avoid data race when deciding checkpoin in f2fs_sync_fileJaegeuk Kim2017-04-131-2/+11
| | | | | | | | | | | | | | | | | | | | | | | When fs utilization is almost full, f2fs_sync_file should do checkpoint if there is not enough space for roll-forward later. (i.e. space_for_roll_forward) So, currently we have no lock for sbi->alloc_valid_block_count, resulting in race condition. In rare case, we can get -ENOSPC when doing roll-forward which triggers if (is_valid_blkaddr(sbi, dest, META_POR)) { if (src == NULL_ADDR) { err = reserve_new_block(&dn); f2fs_bug_on(sbi, err); ... } ... } in do_recover_data. So, this patch avoids that situation in advance. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: support an ioctl to move a range of data blocksJaegeuk Kim2017-04-132-0/+142
| | | | | | | | | | This patch implements moving a range of data blocks from source file to destination file. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Conflicts: fs/f2fs/file.c
* f2fs: fix to report error number of f2fs_find_entryChao Yu2017-04-134-15/+34
| | | | | | | | | | | This patch fixes to report the right error number of f2fs_find_entry to its caller. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Conflicts: fs/f2fs/namei.c
* f2fs: avoid memory allocation failure due to a long lengthJaegeuk Kim2017-04-131-18/+28
| | | | | | We need to avoid ENOMEM due to unexpected long length. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: reset default idle interval valueChao Yu2017-04-131-1/+1
| | | | | | | | | | | | | The default value of idle interval is 2 mins, but for most time when screen shutdown, there are still operations during the 2 mins interval, and gc's sleep time is about 30 secs to 60 secs, so there is almost no chance for GC thread to do garbage collecting. Set default value of idle interval value from 2 mins to 5 secs for fixing. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: use blk_plug in all the possible pathsJaegeuk Kim2017-04-136-2/+29
| | | | | | | | | | | | | This patch reverts 19a5f5e2ef37 (f2fs: drop any block plugging), and adds blk_plug in write paths additionally. The main reason is that blk_start_plug can be used to wake up from low-power mode before submitting further bios. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Conflicts: fs/f2fs/file.c
* f2fs: fix to avoid data update racing between GC and DIOChao Yu2017-04-134-0/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Datas in file can be operated by GC and DIO simultaneously, so we will face race case as below: For write case: Thread A Thread B - generic_file_direct_write - invalidate_inode_pages2_range - f2fs_direct_IO - do_blockdev_direct_IO - do_direct_IO - get_more_blocks - f2fs_gc - do_garbage_collect - gc_data_segment - move_data_page - do_write_data_page migrate data block to new block address - dio_bio_submit update user data to old block address For read case: Thread A Thread B - generic_file_direct_write - invalidate_inode_pages2_range - f2fs_direct_IO - do_blockdev_direct_IO - do_direct_IO - get_more_blocks - f2fs_balance_fs - f2fs_gc - do_garbage_collect - gc_data_segment - move_data_page - do_write_data_page migrate data block to new block address - write_checkpoint - do_checkpoint - clear_prefree_segments - f2fs_issue_discard discard old block adress - dio_bio_submit update user buffer from obsolete block address In order to fix this, for one file, we should let DIO and GC getting exclusion against with each other. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Conflicts: fs/f2fs/data.c
* f2fs: add maximum prefree segmentsJaegeuk Kim2017-04-132-0/+4
| | | | | | | | In 1TB storage, we need to admit 22841 prefree segments, which can consume too much segments. This patch sets 8GB in max. prefree segments in that case. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: disable extent_cache for fcollapse/finsert inodesJaegeuk Kim2017-04-133-0/+19
| | | | | | | | | This reduces the elapsed time to do xfstests/generic/017. Before: 458 s After: 390 s Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: refactor __exchange_data_block for speed upJaegeuk Kim2017-04-132-66/+171
| | | | | | | | | This reduces the elapsed time to do xfstests/generic/017. Before: 715 s After: 458 s Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: fix ERR_PTR returned by bioJaegeuk Kim2017-04-131-1/+3
| | | | | | | | This is to fix wrong error pointer handling flow reported by Dan. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: avoid mark_inode_dirtyJaegeuk Kim2017-04-1310-40/+58
| | | | | | Let's check inode's dirtiness before calling mark_inode_dirty. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: move i_size_write in f2fs_write_endJaegeuk Kim2017-04-131-1/+1
| | | | | | We don't need to do i_size_write under page lock. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: fix to avoid redundant discard during fstrimChao Yu2017-04-131-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | With below test steps, f2fs will issue redundant discard when doing fstrim, the reason is that we issue discards for both prefree segments and consecutive freed region user wants to trim, part regions they covered are overlapped, here, we change to do not to issue any discards for prefree segments in trimmed range. 1. mount -t f2fs -o discard /dev/zram0 /mnt/f2fs 2. fstrim -o 0 -l 3221225472 -m 2097152 -v /mnt/f2fs/ 3. dd if=/dev/zero of=/mnt/f2fs/a bs=2M count=1 4. dd if=/dev/zero of=/mnt/f2fs/b bs=1M count=1 5. sync 6. rm /mnt/f2fs/a /mnt/f2fs/b 7. fstrim -o 0 -l 3221225472 -m 2097152 -v /mnt/f2fs/ Before: <...>-5428 [001] ...1 9511.052125: f2fs_issue_discard: dev = (251,0), blkstart = 0x2200, blklen = 0x200 <...>-5428 [001] ...1 9511.052787: f2fs_issue_discard: dev = (251,0), blkstart = 0x2200, blklen = 0x300 After: <...>-6764 [000] ...1 9720.382504: f2fs_issue_discard: dev = (251,0), blkstart = 0x2200, blklen = 0x300 Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: avoid mismatching block range for discardYunlei He2017-04-131-0/+4
| | | | | | | | | This patch skip discard block range smaller than trim_minlen, and can not be merged by neighbour Signed-off-by: Yunlei He <heyunlei@huawei.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: fix incorrect f_bfree calculation in ->statfsChao Yu2017-04-131-1/+1
| | | | | | | | | | | As manual described, f_bfree indicates total free blocks in fs, in f2fs, it includes two parts: visible free blocks and over-provision blocks. This patch corrrects the calculation. fsblkcnt_t f_bfree; /* free blocks in fs */ Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: skip to check the block address of node pageJaegeuk Kim2017-04-131-3/+3
| | | | | | If the node page is up-to-date, it should be alive. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: shrink critical region in spin_lockJaegeuk Kim2017-04-131-15/+8
| | | | | | This patch shrinks the critical region in spin_lock. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: call SetPageUptodate if neededJaegeuk Kim2017-04-135-15/+30
| | | | | | | | | | SetPageUptodate() issues memory barrier, resulting in performance degrdation. Let's avoid that. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Conflicts: fs/f2fs/data.c
* f2fs: introduce f2fs_set_page_dirty_nobufferJaegeuk Kim2017-04-134-3/+33
| | | | | | | | | | | This patch adds f2fs_set_page_dirty_nobuffer() copied from __set_page_dirty_buffer. When appending 4KB blocks in f2fs on pmem with multiple cores, this improves the overall performance. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Conflicts: fs/f2fs/f2fs.h
* f2fs: remove unnecessary goto statementTiezhu Yang2017-04-131-2/+2
| | | | | | | | | When base_addr is NULL, there is no need to call kzfree, it should return -ENOMEM directly. Additionally, it is better to initialize variable 'error' with 0. Signed-off-by: Tiezhu Yang <kernelpatch@126.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: add nodiscard mount optionChao Yu2017-04-132-3/+17
| | | | | | | | | | This patch adds 'nodiscard' mount option. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Conflicts: Documentation/filesystems/f2fs.txt
* f2fs: fix to redirty page if fail to gc data pageChao Yu2017-04-131-1/+12
| | | | | | | | If we fail to move data page during foreground GC, we should give another chance to writeback that page which was set dirty previously by writer. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: fix to detect truncation prior rather than EIO during readChao Yu2017-04-133-13/+13
| | | | | | | | | | | | | In procedure of synchonized read, after sending out the read request, reader will try to lock the page for waiting device to finish the read jobs and unlock the page, but meanwhile, truncater will race with reader, so after reader get lock of the page, it should check page's mapping to detect whether someone has truncated the page in advance, then reader has the chance to do the retry if truncation was done, otherwise read can be failed due to previous condition check. Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: fix to avoid reading out encrypted data in page cacheChao Yu2017-04-131-43/+47
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For encrypted inode, if user overwrites data of the inode, f2fs will read encrypted data into page cache, and then do the decryption. However reader can race with overwriter, and it will see encrypted data which has not been decrypted by overwriter yet. Fix it by moving decrypting work to background and keep page non-uptodated until data is decrypted. Thread A Thread B - f2fs_file_write_iter - __generic_file_write_iter - generic_perform_write - f2fs_write_begin - f2fs_submit_page_bio - generic_file_read_iter - do_generic_file_read - lock_page_killable - unlock_page - copy_page_to_iter hit the encrypted data in updated page - lock_page - fscrypt_decrypt_page Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Conflicts: fs/f2fs/data.c
* f2fs: avoid latency-critical readahead of node pagesJaegeuk Kim2017-04-132-2/+2
| | | | | | | The f2fs_map_blocks is very related to the performance, so let's avoid any latency to read ahead node pages. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: avoid writing node/metapages during writesJaegeuk Kim2017-04-131-3/+3
| | | | | | Let's keep more node/meta pages in run time. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: produce more nids and reduce readahead natsJaegeuk Kim2017-04-136-8/+18
| | | | | | | | | The readahead nat pages are more likely to be reclaimed quickly, so it'd better to gather more free nids in advance. And, let's keep some free nids as much as possible. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: detect host-managed SMR by feature flagJaegeuk Kim2017-04-134-8/+35
| | | | | | | If mkfs.f2fs gives a feature flag for host-managed SMR, we can set mode=lfs by default. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: call update_inode_page for orphan inodesJaegeuk Kim2017-04-136-24/+8
| | | | | | | | | Let's store orphan inode pages right away. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Conflicts: fs/f2fs/namei.c
* f2fs: report error for f2fs_parent_dirJaegeuk Kim2017-04-131-2/+3
| | | | | | | | | If there is no dentry, we can report its error correctly. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Conflicts: fs/f2fs/namei.c
* f2fs: find parent dentry correctlySheng Yong2017-04-133-36/+3
| | | | | | | | | | | | | | If dotdot directory is corrupted, its slot may be ocupied by another file. In this case, dentry[1] is not the parent directory. Rename and cross-rename will update the inode in dentry[1] incorrectly. This patch finds dotdot dentry by name. Signed-off-by: Sheng Yong <shengyong1@huawei.com> [Jaegeuk Kim: remove wron bug_on] Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Conflicts: fs/f2fs/f2fs.h
* f2fs: fix deadlock in add_link failureJaegeuk Kim2017-04-131-3/+0
| | | | | | | | | | | | | | | | | | | | | mkdir sync_dirty_inode - init_inode_metadata - lock_page(node) - make_empty_dir - filemap_fdatawrite() - do_writepages - lock_page(data) - write_page(data) - lock_page(node) - f2fs_init_acl - error - truncate_inode_pages - lock_page(data) So, we don't need to truncate data pages in this error case, which will be done by f2fs_evict_inode. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: introduce mode=lfs mount optionJaegeuk Kim2017-04-139-4/+74
| | | | | | | | | This mount option is to enable original log-structured filesystem forcefully. So, there should be no random writes for main area. Especially, this supports host-managed SMR device. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: skip clean segment for gcJaegeuk Kim2017-04-131-0/+4
| | | | | | | If a segment in a section is clean or prefreed, we don't need to get its summary and do gc. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>